
Acuity

Sharpness of perception, keenness of mind.

A fully local AI assistant platform built for precision, depth, and extensibility — no cloud, no compromise.

Next.js React Tailwind CSS llama.cpp License: PolyForm-NC-1.0.0


Acuity’s "New Session" page


Overview

Acuity bridges the gap between a polished consumer chat interface and a powerful developer-centric local AI orchestrator. It runs entirely on your machine — ensuring absolute privacy, zero cloud latency, and full control over your data and models.

Under the hood, Acuity orchestrates llama.cpp via a robust child-process manager, uses SQLite (WAL mode) for fast local state, and integrates LanceDB for native, local-first Retrieval-Augmented Generation (RAG). It's designed to be a serious alternative to proprietary tools — built open, built local.


Features

  • 🔒 100% Local & Private — No cloud, no telemetry. Your models, your data, your rules.
  • 👁️ Native Multimodal Support — Automatic detection and binding of vision projectors (mmproj) for seamless image analysis.
  • 📚 Built-in RAG & Long-Term Memory — LanceDB vector storage automatically indexes conversations for semantic search. The AI autonomously learns and persists your preferences across sessions.
  • 🛠️ Extensible Tooling via AcuitySDK
    • Sandboxed Python Execution — Safe data analysis and math, right in the chat.
    • Web Browsing — DuckDuckGo integration with smart page extraction.
    • Custom JS/TS Tools — Write, save, and hot-reload your own tools directly in the UI using the Monaco Editor and the unified AcuitySDK.
  • 🔀 Non-Linear Conversations — In-place message editing, versioning, and branching. Never lose a train of thought.
  • ⚙️ Dynamic Prompting — Inject real-time context with slots like {{datetime}}, {{memory}}, and {{semantic_context}}.
  • 🎨 Polished UI — Built with Tailwind v4, featuring custom typography, smart interruptible auto-scroll, and dynamic theming (Dark, Light, OLED).
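To give a feel for the dynamic prompting slots above, here is a minimal sketch of template substitution. The slot names come from the feature list; the resolver map and function names are hypothetical, not Acuity's internal API.

```javascript
// Replace {{slot}} placeholders in a prompt template with live values.
// Unknown slots are left untouched rather than erased.
function renderPrompt(template, resolvers) {
  return template.replace(/\{\{(\w+)\}\}/g, (match, name) =>
    name in resolvers ? String(resolvers[name]()) : match
  );
}

// Illustrative resolvers; a real host would pull these from the system
// clock, the memory store, and the vector database respectively.
const resolvers = {
  datetime: () => new Date().toISOString(),
  memory: () => "User prefers concise answers.",
};

const prompt = renderPrompt("Now: {{datetime}}. Known: {{memory}}", resolvers);
```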

Getting Started

Prerequisites

  • Node.js v20+
  • llama.cpp binaries compiled for your system (llama-server / llama-server.exe)
  • At least one GGUF model (e.g., Llama 3, Mistral, Phi, Qwen)

Hardware Requirements

Acuity itself is lightweight. Performance depends on the model you load:

| Setup                | Minimum           | Recommended              |
| -------------------- | ----------------- | ------------------------ |
| RAM (CPU inference)  | 16 GB             | 32 GB+                   |
| VRAM (GPU inference) | 6 GB (7B Q4)      | 12 GB+ (13B+)            |
| Storage              | 5 GB + model size | SSD strongly recommended |

GPU offloading via llama.cpp is supported and strongly recommended for any model above 7B.

Installation

  1. Clone the repository:

    git clone https://github.com/Biscotto58/Acuity.git
    cd Acuity
  2. Install dependencies:

    npm install
  3. Start the development server:

    npm run dev
  4. Initial Configuration:

    • Open http://localhost:3000 in your browser.
    • Go to Settings → Server and point Acuity to your llama.cpp binary and models directory.
    • Go to Settings → AI and select your default inference and embedding models.

The Acuity SDK

Custom tools are written directly in the browser using the Monaco Editor. They hot-reload instantly and run in a sandboxed Node.js VM. Every tool receives the AcuitySDK — giving it direct access to the LLM, the vector database, and long-term memory.

module.exports = {
  name: "my_custom_tool",
  uiDescription: "Example Tool",
  iconName: "Wrench",
  isAutonomous: false,
  schema: {
    name: "my_custom_tool",
    description: "Instructions for the LLM on when to use this tool.",
    parameters: {
      type: "object",
      properties: {
        query: { type: "string", description: "What to process" }
      },
      required: ["query"]
    }
  },
  execute: async (args, toolSettings, acuity) => {
    // Generate an embedding
    const vector = await acuity.vector.getEmbedding(args.query);

    // Ask the LLM a sub-query
    const analysis = await acuity.llm.chat([
      { role: "user", content: `Analyze this: ${args.query}` }
    ]);

    // Persist a preference to long-term memory
    acuity.memory.savePreference("last_analyzed", args.query);

    return `Analysis complete: ${analysis}`;
  }
};

Architecture

| Layer           | Tech                                                                                           |
| --------------- | ---------------------------------------------------------------------------------------------- |
| Frontend        | Next.js 16 App Router, React 19, Tailwind CSS v4                                               |
| Backend         | Next.js API Routes (proxy + orchestrator)                                                      |
| Process Manager | Custom Node.js event emitter managing llama.cpp lifecycle, port allocation, and health polling |
| Relational DB   | better-sqlite3 in WAL mode (sessions, messages, settings)                                      |
| Vector DB       | @lancedb/lancedb (embeddings, semantic search, memory)                                         |
| Model Parsing   | Custom GGUF metadata reader extracting context size and model type without loading weights     |
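For context on the last row: the GGUF container begins with a small fixed header (magic `GGUF`, a format version, tensor and metadata key/value counts), which is why properties like context size and architecture can be read without touching the weights. A minimal header check over a raw buffer (a sketch following the published GGUF layout, not Acuity's reader):

```javascript
// Parse the fixed GGUF preamble from the first 24 bytes of a file.
// Layout (little-endian): 4-byte magic "GGUF", uint32 version,
// uint64 tensor count, uint64 metadata key/value count.
// The metadata KV section that follows holds fields such as the
// architecture name and context length.
function readGgufHeader(buf) {
  if (buf.length < 24 || buf.toString("ascii", 0, 4) !== "GGUF") {
    throw new Error("not a GGUF file");
  }
  return {
    version: buf.readUInt32LE(4),
    tensorCount: buf.readBigUInt64LE(8),
    metadataKvCount: buf.readBigUInt64LE(16),
  };
}
```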

Roadmap

There is no fixed roadmap. Acuity is built iteratively based on what's actually useful.

Have a feature idea or found something missing? Open an issue or start a discussion — suggestions from the community are always welcome and considered. What gets implemented is ultimately my call, but good ideas get built.


Contributing

Contributions are welcome. If you want to add a core tool, improve the RAG pipeline, optimize the UI, or fix something that's been bugging you — go for it.

  1. Fork the project
  2. Create a feature branch (git checkout -b feature/your-feature)
  3. Commit your changes (git commit -m 'Add your feature')
  4. Push and open a Pull Request

If you're unsure whether something fits the project's direction, open an issue first to discuss it.


Disclaimer

Acuity provides a platform for running AI models and executing tools locally on your machine. You are solely responsible for any tools you run, code you execute, and actions taken by the AI on your behalf. This includes — but is not limited to — custom tools written via the AcuitySDK, web browsing actions, file system access, and any Python code executed within the sandboxed environment.

The author(s) of Acuity assume no liability for any damage, data loss, security breaches, or unintended consequences resulting from the use of this software or any tools run through it. Use at your own risk.


License

Distributed under the PolyForm Noncommercial License 1.0.0. Free for personal, educational, and non-commercial use. See LICENSE for details.
