MA is the standalone AI development agent for NekoCore OS. It builds, researches, writes code, manages projects, runs recurring tasks, and maintains its own memory — all from a browser GUI or terminal CLI.
MA runs as a self-contained Node.js server with zero npm dependencies.
```sh
# 1. Start the server
node MA-Server.js

# 2. Open the GUI
# → http://localhost:3850

# 3. Configure your LLM (click ⚙ in the GUI, or use the CLI)
node MA-cli.js
```

On first launch, MA copies `ma-config.example.json` → `MA-Config/ma-config.json`. Edit it or configure via the GUI.
| Feature | Description |
|---|---|
| Multi-LLM Support | OpenRouter, Ollama, OpenAI-compatible endpoints |
| Task Engine | 8 task types with planning → execution → summary pipeline |
| Workspace Tools | File read/write/list/delete/move/mkdir — sandboxed to MA-workspace/ |
| Command Execution | Sandboxed shell with configurable whitelist (30+ defaults) |
| Web Search & Fetch | Search the web, fetch & extract page text |
| Memory System | Episodic + semantic memory with keyword search |
| Knowledge Base | 9 reference docs loaded on-demand by topic |
| Project Archives | Persistent project state with open/close/status lifecycle |
| Agent Catalog | 6 specialist agents (code-reviewer, senior-coder, etc.) |
| Blueprint System | Task-type-specific execution guides for plan/execute/summarize phases |
| Slash Commands | 25 commands for health, memory, knowledge, projects, config, pulses, chores, models |
| File Context | Auto-detects file paths in chat and reads them for context |
| Drag & Drop | Drop files into the GUI chat — content sent to MA as context |
| Ollama Integration | Browse local models, pull new ones, auto-fill maxTokens from model info |
| Intelligent Model Routing | Evaluates tasks, selects the best model from user roster, local-first, learns from results |
| Model Performance Tracking | Records model grades per task type/language, avoids poor performers, promotes good ones |
| Token Budget | Tracks context usage, reserves response budget, shows usage bar (up to 1M tokens) |
| Auto Self-Review | Reads back written files to verify completeness |
| History Compression | Compresses older chat turns to fit long conversations in context |
| Continuation | Graceful stop/continue when hitting token limits |
| Pulse Engine | Timer-driven recurring tasks: health scans, chore execution |
| Chores System | Repeating tasks delegated to agents, graded by MA |
| Health Scanner | 20-file integrity check with critical/warning reporting |
| User Guide | Built-in HTML user guide accessible from the GUI (? button) |
```
MA/
├── MA-Server.js                HTTP server (port 3850)
├── MA-cli.js                   Terminal CLI
├── MA-server/                  Core modules (15 files)
│   ├── MA-core.js              Bootstrap, state, chat orchestration
│   ├── MA-llm.js               LLM calling (OpenRouter / Ollama / model management)
│   ├── MA-tasks.js             Intent classifier + task runner
│   ├── MA-pulse.js             Pulse engine (timers, health scans, chores)
│   ├── MA-model-router.js      Intelligent model selection + performance tracking
│   ├── MA-workspace-tools.js   Tool execution engine
│   ├── MA-cmd-executor.js      Sandboxed shell + whitelist
│   ├── MA-web-fetch.js         Web search / fetch
│   ├── MA-memory.js            Memory store (episodic/semantic)
│   ├── MA-project-archive.js   Project lifecycle
│   ├── MA-agents.js            Agent catalog
│   ├── MA-health.js            System health scanner
│   ├── MA-rake.js              RAKE keyword extraction
│   ├── MA-bm25.js              BM25 search scoring
│   └── MA-yake.js              YAKE keyword extraction
├── MA-client/                  Browser GUI
│   └── MA-index.html           Single-file SPA
├── MA-Config/                  Runtime config (gitignored)
├── MA-entity/                  Entity definitions + agent roster
├── MA-knowledge/               Reference documentation (9 docs)
├── MA-blueprints/              Task execution guides
│   ├── core/                   5 core blueprints
│   ├── modules/                8 task-type blueprints
│   ├── nekocore/               NekoCore build blueprint (5 parts)
│   └── rem-system/             REM System build blueprint (6 layers)
├── MA-workspace/               Sandboxed project workspace
│   ├── rem-system/             REM System Core (23 modules, 205 tests)
│   └── nekocore/               NekoCore Cognitive Mind (97 modules, 176 tests)
├── MA-logs/                    Pulse logs (health scans, chore results)
└── MA-scripts/                 Utility scripts
```
Edit MA-Config/ma-config.json or use the GUI settings panel (⚙):
```json
{
  "type": "openrouter",
  "endpoint": "https://openrouter.ai/api/v1/chat/completions",
  "apiKey": "sk-or-...",
  "model": "anthropic/claude-sonnet-4",
  "maxTokens": 12288
}
```

| Field | Values | Default |
|---|---|---|
| `type` | `openrouter`, `ollama` | — |
| `endpoint` | API URL | — |
| `apiKey` | Your key (blank for Ollama) | — |
| `model` | Model identifier | — |
| `maxTokens` | 1024–1000000 | 12288 |
```json
{
  "type": "ollama",
  "endpoint": "http://localhost:11434",
  "apiKey": "",
  "model": "llama3.1:8b",
  "maxTokens": 8192
}
```

When Ollama is selected in the GUI, the model field becomes a dropdown populated from your local Ollama instance. Selecting a model auto-fills maxTokens from the model's context length. You can also pull new models directly from the settings panel.
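A rough sketch of how the maxTokens auto-fill could work. It assumes Ollama's `/api/show` response carries a `model_info` map with an `<architecture>.context_length` key (as in recent Ollama versions); the helper name and the sample payload are illustrative, not MA's actual code.

```javascript
// Sketch: derive a context length from an Ollama /api/show-style response.
// Assumes model_info keys like "llama.context_length" (architecture-prefixed).
function contextLengthFromShow(showResponse) {
  const info = showResponse.model_info || {};
  for (const [key, value] of Object.entries(info)) {
    if (key.endsWith('.context_length')) return value;
  }
  return null; // unknown; fall back to a manually configured maxTokens
}

// Sample shape (assumed, abbreviated):
const sample = {
  model_info: { 'general.architecture': 'llama', 'llama.context_length': 131072 }
};
console.log(contextLengthFromShow(sample)); // → 131072
```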
MA can route tasks to different models based on job requirements. Configure a roster of available models in MA-Config/model-roster.json or via /models add:
```json
{
  "models": [
    {
      "id": "ollama/llama3.1:8b",
      "provider": "ollama",
      "model": "llama3.1:8b",
      "endpoint": "http://localhost:11434",
      "contextWindow": 131072,
      "tier": "local",
      "strengths": ["python", "javascript"],
      "weaknesses": ["rust"]
    }
  ]
}
```

MA evaluates each task's complexity, language, and context needs, then selects the best model:
- Local models first — always prefers free local models when they can handle the job
- Performance learning — tracks model grades (A–F) per task type and language
- Strength/weakness matching — avoids models with known weaknesses for the task
- Tier escalation — only uses premium models for complex/architect-level work
- Cost efficiency — prefers cheaper models when quality is comparable
Use /models research <name> to have MA research a model's capabilities via the LLM.
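The selection rules above can be sketched roughly as follows. This is an illustration, not `MA-model-router.js` itself: the `grade` field and the scoring weights are invented for the example.

```javascript
// Sketch of local-first routing: skip models with a known weakness for the
// task's language, prefer local-tier models, then strengths, then grades.
const GRADE_RANK = { A: 5, B: 4, C: 3, D: 2, F: 1 };

function pickModel(roster, task) {
  const candidates = roster.filter(m => !(m.weaknesses || []).includes(task.language));
  const score = m =>
    (m.tier === 'local' ? 100 : 0) +                         // local-first
    ((m.strengths || []).includes(task.language) ? 10 : 0) + // strength match
    (GRADE_RANK[m.grade] || 3);                              // learned grade
  return candidates.sort((a, b) => score(b) - score(a))[0] || null;
}

const roster = [
  { id: 'ollama/llama3.1:8b', tier: 'local', strengths: ['python'], weaknesses: ['rust'], grade: 'B' },
  { id: 'openrouter/claude', tier: 'premium', strengths: ['rust'], grade: 'A' }
];
pickModel(roster, { language: 'python' }); // local model wins
pickModel(roster, { language: 'rust' });   // local model excluded, premium chosen
```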
MA can execute only whitelisted commands. Manage the whitelist via:

- GUI: Settings → Command Whitelist tab
- Slash commands: `/whitelist`, `/whitelist add`, `/whitelist remove`, `/whitelist reset`
- File: `MA-Config/cmd-whitelist.json`

The default whitelist includes `cargo`, `rustc`, `python`, `node`, `npm`, `gcc`, `go`, `git`, `cat`, `grep`, and more. Dangerous binaries (`rm`, `curl`, `bash`, `powershell`, etc.) are always blocked.
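A minimal sketch of the gate this implies (illustrative, not the actual `MA-cmd-executor.js` logic; the block list here is abbreviated):

```javascript
// Sketch: hard-block dangerous binaries, then require the binary to be
// on the whitelist. Only the first token of the command is checked here.
const ALWAYS_BLOCKED = new Set(['rm', 'curl', 'bash', 'powershell']);

function isAllowed(cmd, whitelist) {
  const binary = cmd.trim().split(/\s+/)[0];
  if (ALWAYS_BLOCKED.has(binary)) return false; // block list wins over whitelist
  return whitelist.includes(binary);            // otherwise whitelist-only
}

const whitelist = ['cargo', 'node', 'git', 'grep'];
isAllowed('git status', whitelist);    // true
isAllowed('rm -rf /', whitelist);      // false: always blocked
isAllowed('wget http://x', whitelist); // false: not whitelisted
```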
| Port | Purpose |
|---|---|
| 3850 | Default |
| 3851–3860 | Fallback range if default is busy |
MA uses smart port management: if port 3850 is occupied, the server identifies what's running, prompts you, and starts on the next available port. Background launches (e.g. from the process manager) auto-resolve without prompting.
| Endpoint | Method | Body | Description |
|---|---|---|---|
| `/api/chat` | POST | `{ message, history?, attachments? }` | Chat / run tasks |
| `/api/config` | GET | — | Get config status |
| `/api/config` | POST | `{ type, endpoint, apiKey, model, maxTokens }` | Set config |
| `/api/entity` | GET | — | Get entity info |
| `/api/health` | GET | — | System health scan |
| `/api/commands` | GET | — | List available slash commands |
| `/api/slash` | POST | `{ command }` | Execute slash command |
| `/api/whitelist` | GET | — | Get command whitelist |
| `/api/whitelist/add` | POST | `{ binary, subcommands? }` | Add to whitelist |
| `/api/whitelist/remove` | POST | `{ binary }` | Remove from whitelist |
| `/api/whitelist/reset` | POST | `{}` | Reset to defaults |
| `/api/ollama/models` | GET | `?endpoint=...` | List local Ollama models |
| `/api/ollama/show` | POST | `{ endpoint?, model }` | Get model info (context length, etc.) |
| `/api/ollama/pull` | POST | `{ endpoint?, model }` | Pull a model from Ollama |
| `/api/pulse/status` | GET | — | Pulse timer status + config |
| `/api/pulse/config` | POST | `{ healthScan?, choreCheck? }` | Update pulse config |
| `/api/pulse/start` | POST | — | Start all pulses |
| `/api/pulse/stop` | POST | — | Stop all pulses |
| `/api/pulse/logs` | GET | `?type=health&lines=50` | Read pulse logs |
| `/api/chores` | GET | — | List all chores |
| `/api/chores/add` | POST | `{ name, description?, assignTo?, intervalMs? }` | Add a chore |
| `/api/chores/update` | POST | `{ id, ...fields }` | Update a chore |
| `/api/chores/remove` | POST | `{ id }` | Remove a chore |
| `/api/models/roster` | GET | — | List model roster |
| `/api/models/add` | POST | `{ provider, model, endpoint, ... }` | Add model to roster |
| `/api/models/update` | POST | `{ id, ...fields }` | Update a roster model |
| `/api/models/remove` | POST | `{ id }` | Remove model from roster |
| `/api/models/route` | POST | `{ message, taskType?, agentRole? }` | Test model routing for a job |
| `/api/models/performance` | GET | — | All model performance records |
| `/api/models/research` | POST | `{ model }` | Research model capabilities via LLM |
| `/api/memory/search` | GET | `?query=...&limit=5` | Search memories |
| `/api/memory/store` | POST | `{ type, content, meta }` | Store a memory |
| `/api/memory/stats` | GET | — | Memory statistics |
| `/api/memory/ingest` | POST | `{ filePath }` | Ingest file to memory |
MA uses these tools via `[TOOL:name {json}]` blocks in LLM output. Params are validated with Zod schemas.
| Tool | Usage | Description |
|---|---|---|
| `ws_list` | `[TOOL:ws_list {"path":"dir/"}]` | List directory |
| `ws_read` | `[TOOL:ws_read {"path":"file"}]` | Read file (≤32KB) |
| `ws_write` | `[TOOL:ws_write {"path":"file"}]content[/TOOL]` | Write file |
| `ws_append` | `[TOOL:ws_append {"path":"file"}]content[/TOOL]` | Append to file |
| `ws_delete` | `[TOOL:ws_delete {"path":"file"}]` | Delete file/folder |
| `ws_mkdir` | `[TOOL:ws_mkdir {"path":"dir/"}]` | Create directory |
| `ws_move` | `[TOOL:ws_move {"src":"old","dst":"new"}]` | Move/rename file |
| `web_search` | `[TOOL:web_search {"query":"search"}]` | Web search |
| `web_fetch` | `[TOOL:web_fetch {"url":"https://..."}]` | Fetch page text |
| `cmd_run` | `[TOOL:cmd_run {"cmd":"command"}]` | Run shell command |

All file tools are sandboxed to `MA-workspace/`. Command execution is sandboxed via the whitelist.
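A rough sketch of how such blocks can be pulled out of LLM output. This is illustrative: MA's real parser must also handle the `[/TOOL]` bodies used by `ws_write`/`ws_append`, which this regex ignores.

```javascript
// Sketch: extract [TOOL:name {json}] calls from a model's text output.
function parseToolCalls(text) {
  const calls = [];
  const re = /\[TOOL:(\w+)\s+(\{[^\]]*\})\]/g;
  let m;
  while ((m = re.exec(text)) !== null) {
    try {
      calls.push({ tool: m[1], params: JSON.parse(m[2]) });
    } catch {
      // skip blocks with malformed JSON params
    }
  }
  return calls;
}

const out = parseToolCalls('Sure. [TOOL:ws_list {"path":"src/"}] Done.');
// → [{ tool: 'ws_list', params: { path: 'src/' } }]
```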
MA maintains persistent memory across sessions using a flat-file storage system with full text indexing. Every conversation, task, and insight is automatically stored and becomes searchable context for future interactions.
| Type | Storage | Purpose | Persistence |
|---|---|---|---|
| Episodic | `MA-entity/entity_ma/memories/episodic/` | Individual conversation events, tasks completed, interactions | ✓ Full |
| Semantic | `MA-entity/entity_ma/memories/semantic/` | Abstracted knowledge, patterns, insights extracted from episodes | ✓ Full |
| Chat History | `MA-Config/chat-history.json` | Full chat transcript (user messages + MA responses) | ✓ Full |
Each memory record is a folder containing:

- `record.json` — Metadata (id, type, topics, importance, decay, timestamps, access history)
- `semantic.txt` — Plain-text content (what gets loaded into LLM context)
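An illustrative `record.json` (the field list follows the description above; the exact timestamp and counter field names, and all values, are invented for this example):

```json
{
  "id": "mem_xxx_yyy",
  "type": "episodic",
  "topics": ["async", "debugging"],
  "importance": 0.7,
  "decay": 0.01,
  "createdAt": "2025-01-15T10:30:00Z",
  "lastAccessed": "2025-02-01T09:12:00Z",
  "accessCount": 3
}
```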
```
MA-entity/entity_ma/
├── memories/
│   ├── episodic/               ← Conversation events (~500 bytes–50 KB per memory)
│   │   ├── mem_xxx_yyy/
│   │   │   ├── record.json
│   │   │   └── semantic.txt
│   │   └── mem_aaa_bbb/
│   │       └── ...
│   ├── semantic/               ← Extracted knowledge (auto-compressed)
│   │   └── mem_ccc_ddd/
│   │       └── ...
│   └── index/
│       ├── memoryIndex.json    ← Fast topic-to-memory lookup tables
│       └── topicIndex.json     ← Topic frequency tracking
└── index/
    └── memoryIndex.json        ← Global index for all memories
```
MA's memory system has no hard storage limit and scales linearly:
- Per-entity capacity: Depends on available disk space. Typical usage is ~50–200 MB per 10,000 memories.
- Indexing: In-memory lookup uses three indexes (topic-to-memory, topic counts, recency). Indexes are cached and persisted to disk.
- Search time: O(1) topic lookup + O(n) BM25 scoring over matching candidates, capped at configurable limit (default 10 results).
- Retrieval: Automatic pagination via the `limit` parameter; by default returns the top 10 most relevant memories per query.
Typical memory volume per user:
- First week: ~50–100 episodic memories (5–10 per day)
- First month: ~400–800 memories across episodic + semantic
- After 1 year: ~5,000–15,000 memories (manageable, searches remain <100ms)
When you ask MA something, it automatically searches stored memories using:

1. Keyword Extraction — Your message is analyzed with RAKE (Rapid Automatic Keyword Extraction) and YAKE (Yet Another Keyword Extractor) to pull out key topics and phrases.
2. Index Lookup — MA checks the topic index for matching memories in O(1) time.
3. BM25 Scoring — Each candidate memory is ranked by:
   - Relevance (45%): How well the memory's topics match your query
   - Importance (35%): Manually weighted importance score (0.0–1.0) assigned when the memory was stored
   - Recency Decay (20%): Older memories decay naturally over time (1% per day for standard memories, with a minimum floor of 0.1 so nothing is ever fully forgotten)
4. Access Tracking — Every retrieved memory gets an updated access count and timestamp. Frequently used memories are candidates for consolidation into semantic knowledge.

Result: The most relevant memories appear first; you get consistent context retrieval without manual tagging.
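The weighted formula reads as code roughly like this. It is a sketch under one reading of "1% per day" (multiplicative decay); MA's actual field names and decay curve may differ.

```javascript
// Sketch: 45% relevance + 35% importance + 20% recency,
// where recency decays 1% per day and is floored at 0.1.
function memoryScore(mem, relevance, now = Date.now()) {
  const ageDays = (now - mem.createdAt) / 86_400_000;
  const recency = Math.max(0.1, Math.pow(0.99, ageDays)); // floor: never forgotten
  return 0.45 * relevance + 0.35 * mem.importance + 0.20 * recency;
}

const fresh = { importance: 0.5, createdAt: Date.now() };
const old   = { importance: 0.5, createdAt: Date.now() - 365 * 86_400_000 };
memoryScore(fresh, 0.8) > memoryScore(old, 0.8); // true: recency breaks the tie
```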
MA remembers everything from previous conversations:
- Start a new chat session → automatic memory search for relevant prior work
- Mention a past project → MA finds all related memories and loads context
- Reference your preferences → MA recalls them from semantic memory even months later
- Return after days/weeks → Full chat history available plus all insights from intervening work
Example: If you ask "How did we solve the async bug last month?" MA will:
- Search episodic memories for "async bug" + related terms
- Retrieve the original debugging session
- Return the solution code + decision rationale
- Surface any follow-up notes or related errors
| Endpoint | Method | Purpose |
|---|---|---|
| `/api/memory/search` | GET | Search all memories by query; returns ranked results with scores |
| `/api/memory/store` | POST | Manually store an episodic or semantic memory (rarely needed — MA auto-stores) |
| `/api/memory/stats` | GET | Get counts: episodic + semantic, total memory size, index health |
| `/api/memory/ingest` | POST | Ingest a file (project archive, codebase, documentation) as chunked semantic memories |
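The chunking idea behind `/api/memory/ingest` can be sketched as fixed-size windows with a little overlap, so each chunk becomes its own searchable memory. The size and overlap values here are invented, not MA's actual settings.

```javascript
// Sketch: split text into overlapping fixed-size chunks for indexing.
function chunkText(text, size = 2000, overlap = 200) {
  const chunks = [];
  for (let start = 0; start < text.length; start += size - overlap) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // last window reached the end
  }
  return chunks;
}

chunkText('a'.repeat(5000)).length; // → 3 chunks of ≤2000 chars
```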
Example stats output:
```json
{
  "episodic": 342,
  "semantic": 47,
  "total": 389,
  "indexHealth": "valid",
  "diskUsage": "12.4 MB",
  "topicsTracked": 156,
  "avgAccessCount": 2.3
}
```

- Search before starting — Use `/memory stats` or `/memory search <topic>` to understand what MA already knows about a subject.
- Tag important findings — When storing manual memories via the API, set `importance: 0.7+` for insights you'll want to prioritize in future searches.
- Ingest documentation — Use `/memory ingest <file>` to load project READMEs, architecture docs, or codebase snapshots. MA chunks them automatically and indexes all topics.
- Review consolidation — As episodic memories age, MA automatically compresses related ones into semantic knowledge. Check `/memory stats` to see the semantic knowledge growing.
- Reset if needed — Use `node MA-Reset-All.js` to clear all memories (wipes `MA-entity/entity_ma/memories/` completely). Chat history is stored separately and can be reset independently.
```sh
node -e "const h=require('./MA-server/MA-health');console.log(h.formatReport(h.scan()))"
```

Reports file count, critical errors, and warnings. Checks JS syntax, JSON validity, and HTML tag balance.
Projects built using MA:
- NekoCore OS — Cognitive WebOS built with Memory Architect
MA v1.0 — Part of NekoCore OS.
MIT License - See LICENSE for details.