PenceAI is a self-hosted, local-first AI agent platform built with TypeScript. It brings together multi-provider LLM access, short- and long-term memory management, graph-assisted context retrieval, cognitive memory patterns, a web interface, a gateway layer, and automated tests in a single codebase.
This repository is designed as a practical engineering foundation for AI agent experiments and product-oriented development. It combines chat-based agent execution, memory extraction, semantic recall, conversation summarization, background tasks, and observable retrieval flows inside one cohesive system.
The primary goal is to provide more than a simple chatbot. PenceAI is structured as an agent runtime that can retain conversational context, distinguish between memory types, shift into more deliberate reasoning when needed, and behave more consistently over time.
Core capabilities include:
- Agent runtime and tool-calling loop
- Episodic and semantic memory separation
- SQLite-based memory and conversation storage
- Embedding-powered semantic search and graph relationships
- Cognitive signals such as cognitive load, priming, and spreading activation
- Experimental decision mechanisms including reconsolidation and dual-process routing
- WebSocket-based web UI and gateway server
- Jest-based test infrastructure
- [NEW] Containerized Deployments via Docker and Docker Compose
- End-to-end TypeScript architecture: Agent, gateway, memory, router, web, and test layers share one language and type system.
- Agent runtime + tool loop: `AgentRuntime` manages reasoning, tool calls, observations, and response generation in a unified flow.
- Cognitive memory layer: `MemoryManager` coordinates conversation history, long-term memory, retrieval orchestration, and maintenance routines.
- Episodic / semantic memory separation: Memories are classified not only by content but also by their functional role.
- GraphRAG (Graph-based Retrieval Augmented Generation): `src/memory/graphRAG/` provides graph-aware retrieval with PageRank scoring, community detection, community summarization, and deterministic RAG patterns (Evaluation Gate, Phrase Bonus Scoring) for high-reliability memory recall.
- MCP (Model Context Protocol) integration: `src/agent/mcp/` implements an extensible tool ecosystem with 18+ modules, including a marketplace, security layer, event bus, and unified registry.
- Docker Ready: A built-in multi-stage `Dockerfile` and `docker-compose.yml` deploy on Windows, Mac, or Linux without OS-level C++ compilation issues.
- Background job queue: Persistent workflows support memory maintenance, embedding backfill, summarization, and deeper extraction tasks.
- Web interface + gateway: An HTTP/WebSocket server works together with a React-based client with React Query for data fetching and state management.
- Multi-provider LLM integration: Adapters are available for OpenAI, Anthropic, Groq, Mistral, Ollama, NVIDIA, GitHub, and other providers.
- Observability & cost tracking: Custom local metrics system provides token usage tracking and cost estimation natively across all 8 providers without external dependencies.
- Multi-channel support: Telegram, Discord, and WhatsApp channel integrations for broader accessibility.
- Token usage analytics: Real-time cost calculation with provider/model-specific pricing via `costCalculator.ts`.
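To make the "spreading activation" signal in the list above more concrete, here is a minimal sketch of the general technique over a small in-memory graph. It is not taken from the PenceAI source: the `MemoryEdge` type, the `spreadActivation` function, and the `decay` and `maxHops` parameters are all hypothetical names chosen for illustration.

```typescript
// Hypothetical types and names; this illustrates the technique only.
type MemoryEdge = { from: string; to: string; weight: number };

function spreadActivation(
  edges: MemoryEdge[],
  seeds: string[],
  decay = 0.5,
  maxHops = 2,
): Map<string, number> {
  // Build an undirected adjacency list from the edge list.
  const neighbors = new Map<string, Array<{ id: string; weight: number }>>();
  const link = (a: string, b: string, weight: number): void => {
    const list = neighbors.get(a) ?? [];
    list.push({ id: b, weight });
    neighbors.set(a, list);
  };
  edges.forEach((e) => {
    link(e.from, e.to, e.weight);
    link(e.to, e.from, e.weight);
  });

  // Seed nodes start fully activated.
  const activation = new Map<string, number>();
  let frontier = new Map<string, number>();
  seeds.forEach((s) => {
    activation.set(s, 1);
    frontier.set(s, 1);
  });

  // Each hop passes a decayed, weight-scaled share of energy to neighbors.
  for (let hop = 0; hop < maxHops; hop++) {
    const next = new Map<string, number>();
    frontier.forEach((energy, node) => {
      (neighbors.get(node) ?? []).forEach(({ id, weight }) => {
        next.set(id, (next.get(id) ?? 0) + energy * weight * decay);
      });
    });
    next.forEach((energy, id) => {
      activation.set(id, (activation.get(id) ?? 0) + energy);
    });
    frontier = next;
  }
  return activation;
}

const exampleActivation = spreadActivation(
  [
    { from: "coffee", to: "morning", weight: 1 },
    { from: "morning", to: "routine", weight: 0.5 },
  ],
  ["coffee"],
);
```

In this example, `morning` ends up more activated than `routine` because it is one hop closer to the seed and connected by a stronger edge. A retrieval layer could use such scores to rank related memories.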
- Language: TypeScript
- Runtime: Node.js
- Server: Express + WebSocket
- Database: SQLite / `better-sqlite3`
- Vectors: `sqlite-vec` for embedding storage
- Infrastructure: Docker & Docker Compose
- MCP: `@modelcontextprotocol/sdk` for Model Context Protocol
- Observability: Built-in local metrics and tracing system
- Embeddings: `@xenova/transformers` (ONNX) + provider-backed embedding layers
- Frontend: React + Vite
- State Management: Zustand + React Query
- Testing: Jest + Playwright + Testing Library
- Logging: Pino
| Requirement | Docker | Manual |
|---|---|---|
| Node.js ≥ 22 | Not needed on host | Required |
| npm | Not needed on host | Required |
| Python 3 + C++ build tools | Not needed on host | Required (for better-sqlite3, sqlite-vec) |
| Docker | Required | Not needed |
Windows users: Install Visual Studio Build Tools (C++ workload) before `npm install` if you see native compilation errors.
Linux users: `sudo apt install build-essential python3` (Debian/Ubuntu) or equivalent.
The easiest way to get started — the setup script handles everything automatically:
| OS | Command |
|---|---|
| Windows | scripts\setup.ps1 |
| Linux / macOS | bash scripts/setup.sh |
```bash
git clone <repo-url> && cd PenceAI

# Windows (PowerShell)
scripts\setup.ps1

# Linux / macOS
bash scripts/setup.sh
```
The setup script will:
- Check Node.js ≥ 22 is installed
- Install all dependencies (root + frontend)
- Create your `.env` file from `.env.example`
- Prompt you to choose an LLM provider and enter your API key
- Build the project (TypeScript + Vite frontend)
- Show you how to start the application
If you prefer Docker (no Node.js needed on host), see Method 1 below.
Using Docker avoids native C++ compilation issues (better-sqlite3, sqlite-vec) and provides an isolated runtime that works the same on every OS.
```bash
# 1. Clone the repository
git clone <repo-url> && cd PenceAI

# 2. Create your .env from the example
cp .env.example .env

# 3. Edit .env: at minimum, set an LLM API key
#    Example: OPENAI_API_KEY=sk-...
nano .env   # or use any editor

# 4. Build and start
docker compose up -d --build
```
Access the dashboard at http://localhost:3001.
The database is persistently stored in ./data on the host, so it survives container restarts.
Common Docker commands:
```bash
docker compose up -d --build   # Build & start
docker compose down            # Stop & remove
docker compose logs -f         # Follow logs
docker compose restart         # Restart
```
Connection troubleshooting: If you can't reach http://localhost:3001, make sure `HOST=0.0.0.0` is set in your `.env` file. The default is `0.0.0.0` (listens on all interfaces). Setting `HOST=localhost` inside Docker binds the server to the container's loopback interface, so the published port is unreachable from the host.
```bash
# 1. Clone the repository
git clone <repo-url> && cd PenceAI

# 2. Install root dependencies (includes devDependencies needed for build)
npm install

# 3. Install frontend dependencies
cd src/web/react-app && npm install && cd ../..

# 4. Create your .env from the example
cp .env.example .env

# 5. Edit .env: at minimum, set an LLM API key
nano .env

# 6a. Development (hot-reload backend + frontend):
npm run dev

# 6b. Or production build + run:
npm run build
npm start
```
Development mode starts both the backend (port 3001) and the frontend dev server (port 5173) concurrently. The frontend proxies `/api` and `/ws` requests to the backend automatically.
Production mode serves the pre-built frontend from dist/web/public on port 3001.
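For readers unfamiliar with dev-server proxying, the sketch below shows roughly what a Vite proxy setup for the ports described above could look like. It is an assumption about the shape of the configuration, not a copy of the project's actual `vite.config.ts`. The object name `devServerSketch` is hypothetical, and in a real config file this object would be passed to Vite's `defineConfig()`.

```typescript
// Illustrative only: the real PenceAI Vite config may differ.
const devServerSketch = {
  server: {
    port: 5173, // frontend dev server
    proxy: {
      // Forward REST calls to the backend on port 3001.
      "/api": { target: "http://localhost:3001", changeOrigin: true },
      // Forward WebSocket upgrades; `ws: true` enables WebSocket proxying.
      "/ws": { target: "ws://localhost:3001", ws: true },
    },
  },
};
```

With a setup like this, the browser only talks to port 5173 during development, which avoids cross-origin issues between the frontend and the backend.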
Create your `.env` file by copying `.env.example`:
```bash
cp .env.example .env
```
At least one LLM API key must be set. `DEFAULT_LLM_PROVIDER` determines which provider is used:
| Variable | Description |
|---|---|
| `OPENAI_API_KEY` | OpenAI (default provider) |
| `ANTHROPIC_API_KEY` | Anthropic (Claude) |
| `GROQ_API_KEY` | Groq |
| `MISTRAL_API_KEY` | Mistral |
| `MINIMAX_API_KEY` | MiniMax |
| `NVIDIA_API_KEY` | NVIDIA |
| `GITHUB_TOKEN` | GitHub Models |
| `OLLAMA_BASE_URL` | Local Ollama (default: `http://localhost:11434`) |
| Variable | Default | Description |
|---|---|---|
| `HOST` | `0.0.0.0` | Server bind address (`0.0.0.0` for all interfaces) |
| `PORT` | `3001` | Server port |
| `DB_PATH` | `./data/penceai.db` | SQLite database path |
| `DEFAULT_LLM_PROVIDER` | `openai` | Active LLM provider |
| `DEFAULT_LLM_MODEL` | `gpt-4o` | Default model name |
| `EMBEDDING_PROVIDER` | `openai` | Embedding provider (`openai`, `minimax`, `voyage`, `none`) |
| `EMBEDDING_MODEL` | `text-embedding-3-small` | Embedding model |
| `LOG_LEVEL` | `info` | Logging level (`debug`, `info`, `error`) |
| `DASHBOARD_PASSWORD` | (not set) | Password protect the web dashboard |
Important: Never commit real API keys or passwords to the repository.
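As an illustration of how these variables fit together, the following sketch resolves the documented defaults and checks that the selected provider has its key set. This is an assumption about one reasonable way to validate the environment, not the project's actual configuration loader. `EnvSource`, `ServerConfigSketch`, `PROVIDER_KEY`, and `loadServerConfig` are hypothetical names, and treating Ollama as needing no key is also an assumption.

```typescript
// Hypothetical helper: names and validation rules are illustrative.
type EnvSource = Record<string, string | undefined>;

interface ServerConfigSketch {
  host: string;
  port: number;
  dbPath: string;
  provider: string;
  model: string;
}

// Which variable holds the credential for each provider (from the table above).
const PROVIDER_KEY: Record<string, string> = {
  openai: "OPENAI_API_KEY",
  anthropic: "ANTHROPIC_API_KEY",
  groq: "GROQ_API_KEY",
  mistral: "MISTRAL_API_KEY",
  minimax: "MINIMAX_API_KEY",
  nvidia: "NVIDIA_API_KEY",
  github: "GITHUB_TOKEN",
};

function loadServerConfig(env: EnvSource): ServerConfigSketch {
  const provider = env.DEFAULT_LLM_PROVIDER ?? "openai";

  // Assumption: a local Ollama server needs a base URL but no API key.
  if (provider !== "ollama") {
    const keyName = PROVIDER_KEY[provider];
    if (!keyName) throw new Error(`Unknown provider: ${provider}`);
    if (!env[keyName]) throw new Error(`${keyName} must be set for provider "${provider}"`);
  }

  const port = Number(env.PORT ?? "3001");
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${env.PORT}`);
  }

  return {
    host: env.HOST ?? "0.0.0.0",
    port,
    dbPath: env.DB_PATH ?? "./data/penceai.db",
    provider,
    model: env.DEFAULT_LLM_MODEL ?? "gpt-4o",
  };
}
```

In a Node.js process, a function like this would typically be called once at startup with `process.env`, so that a missing key fails fast instead of surfacing on the first LLM call.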
Full variable list
- `PORT`: Server port (default: 3001)
- `HOST`: Bind address (default: 0.0.0.0)
- `DB_PATH`: SQLite database file path (default: ./data/penceai.db)

- `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, `GROQ_API_KEY`, `MISTRAL_API_KEY`, `MINIMAX_API_KEY`, `NVIDIA_API_KEY`, `GITHUB_TOKEN`
- `DEFAULT_LLM_PROVIDER`: One of `openai`, `anthropic`, `ollama`, `minimax`, `github`, `groq`, `mistral`, `nvidia`
- `DEFAULT_LLM_MODEL`: Model name (default: `gpt-4o`)
- `OLLAMA_BASE_URL`: Ollama server URL (default: `http://localhost:11434`)
- `ENABLE_OLLAMA_TOOLS`: Enable Ollama tool calling (default: false)
- `ENABLE_NVIDIA_TOOLS`: Enable NVIDIA tool calling (default: false)

- `EMBEDDING_PROVIDER`: `openai`, `minimax`, `voyage`, `none` (default: `openai`)
- `EMBEDDING_MODEL`: Embedding model (default: `text-embedding-3-small`)
- `VOYAGE_API_KEY`: Voyage API key

- `TELEGRAM_BOT_TOKEN`, `TELEGRAM_ALLOWED_USERS`
- `DISCORD_BOT_TOKEN`, `DISCORD_ALLOWED_USERS`
- `WHATSAPP_ENABLED`

- `ALLOW_SHELL_EXECUTION`: Enable shell command execution (default: false)
- `FS_ROOT_DIR`: Root directory for file operations
- `DASHBOARD_PASSWORD`: Password for web dashboard
- `BRAVE_SEARCH_API_KEY`: Brave Search API key
- `SENSITIVE_PATHS`: Comma-separated protected paths

- `SYSTEM_PROMPT`: Custom system prompt override
- `AUTONOMOUS_STEP_LIMIT`: Max autonomous reasoning steps (default: 5)
- `MEMORY_DECAY_THRESHOLD`: Memory decay days (default: 30)
- `SEMANTIC_SEARCH_THRESHOLD`: Similarity threshold (default: 0.7)
- `LOG_LEVEL`: `debug`, `info`, `error` (default: `info`)
- `DEFAULT_USER_NAME`: Default user display name

- `ENABLE_MCP`: Enable MCP (default: true)
- `MCP_SERVERS`: JSON array of MCP server configs
- `MCP_TIMEOUT`: Timeout in ms (default: 30000)
- `MCP_MAX_CONCURRENT`: Max parallel MCP calls (default: 5)
- `MCP_LOGGING`: Enable MCP logging (default: true)

- `ENABLE_HOOKS`: Enable hooks (default: true)
- `HOOK_SECURITY_MONITOR`: Path traversal & secret detection (default: true)
- `HOOK_OUTPUT_SANITIZER`: API key masking (default: true)
- `HOOK_CONSOLE_LOG_DETECTOR`: `ask`, `approve`, `block` (default: ask)
- `HOOK_OBSERVATION_CAPTURE`: Log tool calls (default: true)
- `HOOK_DEV_SERVER_BLOCKER`: Block dev server commands (default: true)
- `HOOK_CONTEXT_BUDGET_GUARD`: Compaction enforcement (default: true)
- `HOOK_SESSION_SUMMARY`: Session end metrics (default: true)

- `AGENTIC_RAG_ENABLED`: Enable agentic RAG (default: true)
- `AGENTIC_RAG_MAX_HOPS`: Multi-hop retrieval depth, 1-5 (default: 3)
- `AGENTIC_RAG_DECISION_CONFIDENCE`: Minimum confidence (default: 0.5)
- `AGENTIC_RAG_CRITIQUE_RELEVANCE_FLOOR`: (default: 0.5)
- `AGENTIC_RAG_CRITIQUE_COMPLETENESS_FLOOR`: (default: 0.3)
- `AGENTIC_RAG_VERIFICATION_SUPPORT_FLOOR`: (default: 0.6)
- `AGENTIC_RAG_VERIFICATION_UTILITY_FLOOR`: 1-5 (default: 2)
- `AGENTIC_RAG_MAX_REGENERATIONS`: 0-3 (default: 1)
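Several of the variables above are range-limited integers (for example `AGENTIC_RAG_MAX_HOPS`, 1-5) or structured values (`MCP_SERVERS`, a JSON array). The sketch below shows one possible way to parse them safely. Whether PenceAI clamps or rejects out-of-range values is not stated in this document, so clamping here is an assumption. `EnvVars`, `readBoundedInt`, and `readMcpServers` are hypothetical helper names.

```typescript
// Hypothetical helpers; clamping (rather than rejecting) is an assumption.
type EnvVars = Record<string, string | undefined>;

function readBoundedInt(
  env: EnvVars,
  name: string,
  fallback: number,
  min: number,
  max: number,
): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value)) throw new Error(`${name} must be an integer`);
  // Keep the value inside the documented range.
  return Math.min(max, Math.max(min, value));
}

function readMcpServers(env: EnvVars): unknown[] {
  const raw = env.MCP_SERVERS;
  if (!raw) return [];
  const parsed: unknown = JSON.parse(raw); // throws on malformed JSON
  if (!Array.isArray(parsed)) throw new Error("MCP_SERVERS must be a JSON array");
  return parsed;
}

// Usage: a value above the documented range is clamped to the maximum.
const exampleVars: EnvVars = { AGENTIC_RAG_MAX_HOPS: "9" };
const exampleMaxHops = readBoundedInt(exampleVars, "AGENTIC_RAG_MAX_HOPS", 3, 1, 5);
```

Returning `unknown[]` for the server list keeps the sketch honest: the shape of each server entry is not described here, so further validation is left to the caller.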
| Command | Description |
|---|---|
| `scripts\setup.ps1` (Win) / `bash scripts/setup.sh` (Unix) | One-command setup wizard |
| `scripts\start.ps1` (Win) / `bash scripts/start.sh` (Unix) | Start production server |
| `npm run dev` | Development mode (backend + frontend with hot-reload) |
| `npm run dev:backend-only` | Backend only with hot-reload |
| `npm run build` | Production build (TypeScript + Vite) |
| `npm start` | Start production server (requires `npm run build` first) |
| `npm run cli` | Interactive CLI |
| `npm run maintenance` | Maintenance CLI |
src/agent/runtime.ts manages reasoning, tool calls, observations, and writes results back to conversation history.
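The general shape of such a reason, act, observe loop can be sketched as follows. This is not the code in `src/agent/runtime.ts`. All types and names (`ToolCall`, `ModelStep`, `Turn`, `runAgentLoop`) are hypothetical, and the loop is synchronous for brevity, whereas a real runtime would await the model and the tools.

```typescript
// Hypothetical types; a real runtime would be async and far richer.
type ToolCall = { tool: string; input: string };
type ModelStep =
  | { kind: "tool"; call: ToolCall }
  | { kind: "answer"; text: string };
type Turn = { role: "user" | "assistant" | "tool"; content: string };

function runAgentLoop(
  userMessage: string,
  model: (history: Turn[]) => ModelStep,
  tools: Record<string, (input: string) => string>,
  stepLimit = 5, // mirrors the documented AUTONOMOUS_STEP_LIMIT default
): Turn[] {
  const history: Turn[] = [{ role: "user", content: userMessage }];

  for (let step = 0; step < stepLimit; step++) {
    const next = model(history);

    // A final answer ends the loop and is written back to history.
    if (next.kind === "answer") {
      history.push({ role: "assistant", content: next.text });
      return history;
    }

    // Otherwise run the requested tool and record the observation.
    const tool = tools[next.call.tool];
    const observation = tool
      ? tool(next.call.input)
      : `Unknown tool: ${next.call.tool}`;
    history.push({ role: "tool", content: observation });
  }

  // The step limit stops the agent from looping indefinitely.
  history.push({ role: "assistant", content: "Step limit reached before a final answer." });
  return history;
}
```

Recording unknown-tool errors as observations, rather than throwing, gives the model a chance to correct itself on the next step.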
src/memory/manager/index.ts is the center of the memory system combining short-term, long-term, and semantic relationships.
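To illustrate the semantic recall side of this, the sketch below ranks stored memories by cosine similarity to a query embedding and keeps those at or above a threshold (0.7, matching the documented `SEMANTIC_SEARCH_THRESHOLD` default). It works on in-memory arrays purely for illustration. The actual project stores vectors in `sqlite-vec`, and `StoredMemory`, `cosineSimilarity`, and `recallMemories` are hypothetical names.

```typescript
// Hypothetical in-memory sketch; the real system queries sqlite-vec.
type StoredMemory = { text: string; embedding: number[] };

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so treat it as unrelated.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function recallMemories(
  query: number[],
  memories: StoredMemory[],
  threshold = 0.7,
): StoredMemory[] {
  return memories
    .map((memory) => ({ memory, score: cosineSimilarity(query, memory.embedding) }))
    .filter((scored) => scored.score >= threshold)
    .sort((x, y) => y.score - x.score) // most similar first
    .map((scored) => scored.memory);
}
```

Raising the threshold makes recall more conservative: fewer memories are returned, but they are more closely related to the query.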
src/gateway/index.ts is the main application entry point (Express & WebSockets).
The React-based client lives under src/web/react-app and uses React Query for data fetching.
PenceAI uses a built-in local metrics system for observability. Operations that are automatically traced and stored locally include:
- LLM calls and token usage across all 8 providers
- Cost calculation based on provider/model pricing
- Agent reasoning, memory retrieval, tool executions, and latency metrics
No external API keys are required for observability, ensuring your data remains fully local.
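The arithmetic behind per-call cost estimation is simple, and a minimal sketch is shown below. It does not reproduce `costCalculator.ts`. The `TokenPricing` type, the `EXAMPLE_PRICING` table, and `estimateCostUsd` are hypothetical, and the prices are placeholder numbers rather than any provider's real rates.

```typescript
// Hypothetical price table: placeholder values, not real provider pricing.
type TokenPricing = { inputPerMillion: number; outputPerMillion: number };

const EXAMPLE_PRICING: Record<string, TokenPricing> = {
  "example-provider/example-model": { inputPerMillion: 2, outputPerMillion: 8 },
};

function estimateCostUsd(
  modelKey: string,
  inputTokens: number,
  outputTokens: number,
  pricing: Record<string, TokenPricing> = EXAMPLE_PRICING,
): number | null {
  const price = pricing[modelKey];
  // Unknown models yield no estimate instead of a misleading zero.
  if (!price) return null;
  return (
    (inputTokens * price.inputPerMillion + outputTokens * price.outputPerMillion) /
    1_000_000
  );
}

// 500k input tokens and 250k output tokens at the placeholder rates.
const exampleCost = estimateCostUsd("example-provider/example-model", 500_000, 250_000);
```

Returning `null` for an unpriced model lets a dashboard distinguish "free" from "unknown", which matters for local providers that have no per-token price.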
The project is under active development. Notable implemented areas include:
- A working gateway and web chat flow
- GraphRAG module with shadow-mode testing
- MCP integration with marketplace
- Docker Compose containerization for reliable deployment
- Multi-channel support (Telegram, Discord, WhatsApp)
- Detailed observability dashboard
Roadmap highlights:
- GraphRAG production rollout (moving out of shadow-mode)
- Hardening MCP Security and sandboxing
- Enhancing Web UI debug panels
- Adding stronger authentication and rate limiting
This repository provides a strong base for both product development and applied AI / cognitive systems research. Contributions should pay particular attention to memory safety, backward compatibility, test coverage, and public repository hygiene.
PenceAI is an experimental but serious engineering foundation for building an agent architecture that remembers, relates information, recalls context with varying levels of attention, and behaves more consistently over time.