voardwalker-code/MA-Memory-Architect

MA v1.0 — Memory Architect

MA is the standalone AI development agent for NekoCore OS. It builds, researches, writes code, manages projects, runs recurring tasks, and maintains its own memory — all from a browser GUI or terminal CLI.

MA runs as a self-contained Node.js server with zero npm dependencies.


Quick Start

# 1. Start the server
node MA-Server.js

# 2. Open the GUI
# → http://localhost:3850

# 3. Configure your LLM (click ⚙ in the GUI, or use the CLI)
node MA-cli.js

On first launch, MA copies ma-config.example.json → MA-Config/ma-config.json. Edit it or configure via the GUI.


Features

| Feature | Description |
| --- | --- |
| Multi-LLM Support | OpenRouter, Ollama, OpenAI-compatible endpoints |
| Task Engine | 8 task types with planning → execution → summary pipeline |
| Workspace Tools | File read/write/list/delete/move/mkdir — sandboxed to MA-workspace/ |
| Command Execution | Sandboxed shell with configurable whitelist (30+ defaults) |
| Web Search & Fetch | Search the web, fetch & extract page text |
| Memory System | Episodic + semantic memory with keyword search |
| Knowledge Base | 9 reference docs loaded on-demand by topic |
| Project Archives | Persistent project state with open/close/status lifecycle |
| Agent Catalog | 6 specialist agents (code-reviewer, senior-coder, etc.) |
| Blueprint System | Task-type-specific execution guides for plan/execute/summarize phases |
| Slash Commands | 25 commands for health, memory, knowledge, projects, config, pulses, chores, models |
| File Context | Auto-detects file paths in chat and reads them for context |
| Drag & Drop | Drop files into the GUI chat — content sent to MA as context |
| Ollama Integration | Browse local models, pull new ones, auto-fill maxTokens from model info |
| Intelligent Model Routing | Evaluates tasks, selects the best model from the user roster, local-first, learns from results |
| Model Performance Tracking | Records model grades per task type/language, avoids poor performers, promotes good ones |
| Token Budget | Tracks context usage, reserves response budget, shows usage bar (up to 1M tokens) |
| Auto Self-Review | Reads back written files to verify completeness |
| History Compression | Compresses older chat turns to fit long conversations in context |
| Continuation | Graceful stop/continue when hitting token limits |
| Pulse Engine | Timer-driven recurring tasks: health scans, chore execution |
| Chores System | Repeating tasks delegated to agents, graded by MA |
| Health Scanner | 20-file integrity check with critical/warning reporting |
| User Guide | Built-in HTML user guide accessible from the GUI (? button) |

Architecture

MA/
├── MA-Server.js           HTTP server (port 3850)
├── MA-cli.js              Terminal CLI
├── MA-server/             Core modules (15 files)
│   ├── MA-core.js         Bootstrap, state, chat orchestration
│   ├── MA-llm.js          LLM calling (OpenRouter / Ollama / model management)
│   ├── MA-tasks.js        Intent classifier + task runner
│   ├── MA-pulse.js        Pulse engine (timers, health scans, chores)
│   ├── MA-model-router.js Intelligent model selection + performance tracking
│   ├── MA-workspace-tools.js  Tool execution engine
│   ├── MA-cmd-executor.js Sandboxed shell + whitelist
│   ├── MA-web-fetch.js    Web search / fetch
│   ├── MA-memory.js       Memory store (episodic/semantic)
│   ├── MA-project-archive.js  Project lifecycle
│   ├── MA-agents.js       Agent catalog
│   ├── MA-health.js       System health scanner
│   ├── MA-rake.js         RAKE keyword extraction
│   ├── MA-bm25.js         BM25 search scoring
│   └── MA-yake.js         YAKE keyword extraction
├── MA-client/             Browser GUI
│   └── MA-index.html      Single-file SPA
├── MA-Config/             Runtime config (gitignored)
├── MA-entity/             Entity definitions + agent roster
├── MA-knowledge/          Reference documentation (9 docs)
├── MA-blueprints/         Task execution guides
│   ├── core/              5 core blueprints
│   ├── modules/           8 task-type blueprints
│   ├── nekocore/          NekoCore build blueprint (5 parts)
│   └── rem-system/        REM System build blueprint (6 layers)
├── MA-workspace/          Sandboxed project workspace
│   ├── rem-system/        REM System Core (23 modules, 205 tests)
│   └── nekocore/          NekoCore Cognitive Mind (97 modules, 176 tests)
├── MA-logs/               Pulse logs (health scans, chore results)
└── MA-scripts/            Utility scripts

Configuration

LLM Setup

Edit MA-Config/ma-config.json or use the GUI settings panel (⚙):

{
  "type": "openrouter",
  "endpoint": "https://openrouter.ai/api/v1/chat/completions",
  "apiKey": "sk-or-...",
  "model": "anthropic/claude-sonnet-4",
  "maxTokens": 12288
}
| Field | Values | Default |
| --- | --- | --- |
| type | openrouter, ollama | |
| endpoint | API URL | |
| apiKey | Your key (blank for Ollama) | |
| model | Model identifier | |
| maxTokens | 1024–1000000 | 12288 |
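The field constraints above can be sketched as a small validator. This is illustrative only (MA's actual validation code is not shown here); the function name and error messages are assumptions, while the allowed types, the apiKey rule, and the maxTokens range come from the table and prose above.

```javascript
// Illustrative config validator (not MA's actual code): enforces the
// field constraints documented in the table above.
function validateConfig(cfg) {
  const errors = [];
  if (!['openrouter', 'ollama'].includes(cfg.type)) {
    errors.push(`type must be "openrouter" or "ollama", got "${cfg.type}"`);
  }
  if (typeof cfg.endpoint !== 'string' || cfg.endpoint.length === 0) {
    errors.push('endpoint is required');
  }
  if (cfg.type === 'openrouter' && !cfg.apiKey) {
    errors.push('apiKey is required for openrouter');
  }
  const max = cfg.maxTokens ?? 12288; // documented default
  if (!Number.isInteger(max) || max < 1024 || max > 1000000) {
    errors.push('maxTokens must be an integer in 1024–1000000');
  }
  return { ok: errors.length === 0, errors };
}
```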

Ollama (Local)

{
  "type": "ollama",
  "endpoint": "http://localhost:11434",
  "apiKey": "",
  "model": "llama3.1:8b",
  "maxTokens": 8192
}

When Ollama is selected in the GUI, the model field becomes a dropdown populated from your local Ollama instance. Selecting a model auto-fills maxTokens from the model's context length. You can also pull new models directly from the settings panel.
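The maxTokens auto-fill step can be sketched as follows. This is an illustrative helper, not MA's code; it assumes Ollama's /api/show response shape (a model_info object with an architecture-prefixed context_length key such as "llama.context_length"), which may vary by Ollama version.

```javascript
// Illustrative auto-fill of maxTokens from an Ollama /api/show response.
// The "<arch>.context_length" key in model_info is an assumption about
// the response shape; fall back to a sane default if it is absent.
function maxTokensFromShow(show, fallback = 8192) {
  const info = show.model_info ?? {};
  const key = Object.keys(info).find(k => k.endsWith('.context_length'));
  return key ? info[key] : fallback;
}
```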

Model Roster (Intelligent Routing)

MA can route tasks to different models based on job requirements. Configure a roster of available models in MA-Config/model-roster.json or via /models add:

{
  "models": [
    {
      "id": "ollama/llama3.1:8b",
      "provider": "ollama",
      "model": "llama3.1:8b",
      "endpoint": "http://localhost:11434",
      "contextWindow": 131072,
      "tier": "local",
      "strengths": ["python", "javascript"],
      "weaknesses": ["rust"]
    }
  ]
}

MA evaluates each task's complexity, language, and context needs, then selects the best model:

  • Local models first — always prefers free local models when they can handle the job
  • Performance learning — tracks model grades (A–F) per task type and language
  • Strength/weakness matching — avoids models with known weaknesses for the task
  • Tier escalation — only uses premium models for complex/architect-level work
  • Cost efficiency — prefers cheaper models when quality is comparable

Use /models research <name> to have MA research a model's capabilities via the LLM.
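The selection criteria above can be sketched as a scoring pass over the roster. The weights, field names, and grade-key format here are assumptions for illustration, not MA's actual routing algorithm; only the local-first preference, strength/weakness matching, and A–F grades come from the bullets above.

```javascript
// Illustrative routing score (weights and grade-key format are
// assumptions): local-first, strength/weakness matching, learned grades.
const GRADE_POINTS = { A: 4, B: 3, C: 2, D: 1, F: 0 };

function scoreModel(model, task, grades) {
  if (model.weaknesses?.includes(task.language)) return -Infinity; // hard avoid
  let score = 0;
  if (model.tier === 'local') score += 2;            // prefer free local models
  if (model.strengths?.includes(task.language)) score += 2;
  const grade = grades[`${model.id}:${task.type}`];  // hypothetical key format
  if (grade) score += GRADE_POINTS[grade] ?? 0;
  return score;
}

function routeTask(roster, task, grades = {}) {
  return roster
    .map(m => ({ model: m, score: scoreModel(m, task, grades) }))
    .sort((a, b) => b.score - a.score)[0]?.model ?? null;
}
```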

Command Whitelist

MA can only execute whitelisted commands. Manage the whitelist via:

  • GUI: Settings → Command Whitelist tab
  • Slash: /whitelist, /whitelist add, /whitelist remove, /whitelist reset
  • File: MA-Config/cmd-whitelist.json

Default whitelist includes: cargo, rustc, python, node, npm, gcc, go, git, cat, grep, and more. Dangerous binaries (rm, curl, bash, powershell, etc.) are always blocked.
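The check described above can be sketched as follows; this is an illustrative approximation, not MA-cmd-executor.js itself, and the block-list contents are taken from the examples in the sentence above.

```javascript
// Illustrative whitelist check: the command's first token must be
// whitelisted and must not be on the always-blocked list, which wins
// even if the binary was added to the whitelist.
const ALWAYS_BLOCKED = new Set(['rm', 'curl', 'bash', 'powershell']);

function isAllowed(cmd, whitelist) {
  const binary = cmd.trim().split(/\s+/)[0];
  if (ALWAYS_BLOCKED.has(binary)) return false;
  return whitelist.includes(binary);
}
```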


Ports

| Port | Purpose |
| --- | --- |
| 3850 | Default |
| 3851–3860 | Fallback range if default is busy |

MA uses smart port management: if port 3850 is occupied, the server identifies what's running, prompts you, and starts on the next available port. Background launches (e.g. from the process manager) auto-resolve without prompting.


API Reference

| Endpoint | Method | Body / Query | Description |
| --- | --- | --- | --- |
| /api/chat | POST | { message, history?, attachments? } | Chat / run tasks |
| /api/config | GET | | Get config status |
| /api/config | POST | { type, endpoint, apiKey, model, maxTokens } | Set config |
| /api/entity | GET | | Get entity info |
| /api/health | GET | | System health scan |
| /api/commands | GET | | List available slash commands |
| /api/slash | POST | { command } | Execute slash command |
| /api/whitelist | GET | | Get command whitelist |
| /api/whitelist/add | POST | { binary, subcommands? } | Add to whitelist |
| /api/whitelist/remove | POST | { binary } | Remove from whitelist |
| /api/whitelist/reset | POST | {} | Reset to defaults |
| /api/ollama/models | GET | ?endpoint=... | List local Ollama models |
| /api/ollama/show | POST | { endpoint?, model } | Get model info (context length, etc.) |
| /api/ollama/pull | POST | { endpoint?, model } | Pull a model from Ollama |
| /api/pulse/status | GET | | Pulse timer status + config |
| /api/pulse/config | POST | { healthScan?, choreCheck? } | Update pulse config |
| /api/pulse/start | POST | | Start all pulses |
| /api/pulse/stop | POST | | Stop all pulses |
| /api/pulse/logs | GET | ?type=health&lines=50 | Read pulse logs |
| /api/chores | GET | | List all chores |
| /api/chores/add | POST | { name, description?, assignTo?, intervalMs? } | Add a chore |
| /api/chores/update | POST | { id, ...fields } | Update a chore |
| /api/chores/remove | POST | { id } | Remove a chore |
| /api/models/roster | GET | | List model roster |
| /api/models/add | POST | { provider, model, endpoint, ... } | Add model to roster |
| /api/models/update | POST | { id, ...fields } | Update a roster model |
| /api/models/remove | POST | { id } | Remove model from roster |
| /api/models/route | POST | { message, taskType?, agentRole? } | Test model routing for a job |
| /api/models/performance | GET | | All model performance records |
| /api/models/research | POST | { model } | Research model capabilities via LLM |
| /api/memory/search | GET | ?query=...&limit=5 | Search memories |
| /api/memory/store | POST | { type, content, meta } | Store a memory |
| /api/memory/stats | GET | | Memory statistics |
| /api/memory/ingest | POST | { filePath } | Ingest file to memory |

Tools Available to MA

MA uses these tools via [TOOL:name {json}] blocks in LLM output. Params are validated with Zod schemas.

| Tool | Usage | Description |
| --- | --- | --- |
| ws_list | `[TOOL:ws_list {"path":"dir/"}]` | List directory |
| ws_read | `[TOOL:ws_read {"path":"file"}]` | Read file (≤32KB) |
| ws_write | `[TOOL:ws_write {"path":"file"}]content[/TOOL]` | Write file |
| ws_append | `[TOOL:ws_append {"path":"file"}]content[/TOOL]` | Append to file |
| ws_delete | `[TOOL:ws_delete {"path":"file"}]` | Delete file/folder |
| ws_mkdir | `[TOOL:ws_mkdir {"path":"dir/"}]` | Create directory |
| ws_move | `[TOOL:ws_move {"src":"old","dst":"new"}]` | Move/rename file |
| web_search | `[TOOL:web_search {"query":"search"}]` | Web search |
| web_fetch | `[TOOL:web_fetch {"url":"https://..."}]` | Fetch page text |
| cmd_run | `[TOOL:cmd_run {"cmd":"command"}]` | Run shell command |

All file tools are sandboxed to MA-workspace/. Command execution is sandboxed via the whitelist.
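Extracting these blocks from LLM output can be sketched with a single regex pass. This is an illustrative parser, not MA's actual implementation; it handles the flat JSON params shown in the table above (no nested braces) and the optional `content[/TOOL]` body form.

```javascript
// Illustrative parser for [TOOL:name {json}] blocks. The params JSON is
// assumed flat (no nested braces), matching the examples in the table;
// the body between ] and [/TOOL] is optional.
function parseToolBlocks(text) {
  const blocks = [];
  const re = /\[TOOL:(\w+)\s+(\{[^}]*\})\](?:([\s\S]*?)\[\/TOOL\])?/g;
  let m;
  while ((m = re.exec(text)) !== null) {
    blocks.push({ tool: m[1], params: JSON.parse(m[2]), body: m[3] ?? null });
  }
  return blocks;
}
```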


Memory & Session Persistence

How Chats & Memories Are Stored

MA maintains persistent memory across sessions using a flat-file storage system with full text indexing. Every conversation, task, and insight is automatically stored and becomes searchable context for future interactions.

Memory Types

| Type | Storage | Purpose | Persistence |
| --- | --- | --- | --- |
| Episodic | MA-entity/entity_ma/memories/episodic/ | Individual conversation events, tasks completed, interactions | ✓ Full |
| Semantic | MA-entity/entity_ma/memories/semantic/ | Abstracted knowledge, patterns, insights extracted from episodes | ✓ Full |
| Chat History | MA-Config/chat-history.json | Full chat transcript (user messages + MA responses) | ✓ Full |

Each memory record is a folder containing:

  • record.json — Metadata (id, type, topics, importance, decay, timestamps, access history)
  • semantic.txt — Plain-text content (what gets loaded into LLM context)

Storage Location

MA-entity/entity_ma/
├── memories/
│   ├── episodic/           ← Conversation events (~500 bytes–50 KB per memory)
│   │   ├── mem_xxx_yyy/
│   │   │   ├── record.json
│   │   │   └── semantic.txt
│   │   └── mem_aaa_bbb/
│   │       └── ...
│   ├── semantic/           ← Extracted knowledge (auto-compressed)
│   │   └── mem_ccc_ddd/
│   │       └── ...
│   └── index/
│       ├── memoryIndex.json  ← Fast topic-to-memory lookup tables
│       └── topicIndex.json   ← Topic frequency tracking
└── index/
    └── memoryIndex.json     ← Global index for all memories

Capacity & Scaling

MA's memory system has no hard storage limit and scales linearly:

  • Per-entity capacity: Depends on available disk space. Typical usage is ~50–200 MB per 10,000 memories.
  • Indexing: In-memory lookup uses three indexes (topic-to-memory, topic counts, recency). Indexes are cached and persisted to disk.
  • Search time: O(1) topic lookup + O(n) BM25 scoring over matching candidates, capped at configurable limit (default 10 results).
  • Retrieval: Automatic pagination via limit parameter; default returns top 10 most relevant memories per query.

Typical memory volume per user:

  • First week: ~50–100 episodic memories (5–10 per day)
  • First month: ~400–800 memories across episodic + semantic
  • After 1 year: ~5,000–15,000 memories (manageable, searches remain <100ms)

Retrieval & Ranking

When you ask MA something, it automatically searches stored memories using:

  1. Keyword Extraction — Your message is analyzed with RAKE (Rapid Automatic Keyword Extraction) and YAKE (Yet Another Keyword Extractor) to pull out key topics and phrases.

  2. Index Lookup — MA checks the topic index for matching memories in O(1) time.

  3. BM25 Scoring — For each candidate memory, MA ranks by:

    • Relevance (45%): How well the memory's topics match your query
    • Importance (35%): Manually weighted importance score (0.0–1.0) assigned when the memory was stored
    • Recency Decay (20%): Older memories decay naturally over time (1% per day for standard memories, but minimum floor of 0.1 so nothing is ever fully forgotten)
  4. Access tracking — Every retrieved memory gets an updated access count and timestamp. Frequently used memories are candidates for consolidation into semantic knowledge.

Result: Most relevant memories appear first; you get consistent context retrieval without manual tagging.
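The combined score can be sketched directly from the documented numbers: 45% relevance, 35% importance, 20% recency, with 1%/day decay and a 0.1 floor. The function and field names are illustrative assumptions; only the weights and decay parameters come from the list above.

```javascript
// Illustrative ranking score using the documented weights and decay.
function recencyFactor(ageDays) {
  return Math.max(0.1, 1 - 0.01 * ageDays); // 1%/day decay, floor 0.1
}

// relevance in [0,1] (e.g. a normalized BM25 score), mem.importance in [0,1].
function rankScore(mem, relevance, nowDays) {
  const ageDays = nowDays - mem.storedAtDays;
  return 0.45 * relevance + 0.35 * mem.importance + 0.20 * recencyFactor(ageDays);
}
```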

Cross-Session Continuity

MA remembers everything from previous conversations:

  • Start a new chat session → automatic memory search for relevant prior work
  • Mention a past project → MA finds all related memories and loads context
  • Reference your preferences → MA recalls them from semantic memory even months later
  • Return after days/weeks → Full chat history available plus all insights from intervening work

Example: If you ask "How did we solve the async bug last month?" MA will:

  1. Search episodic memories for "async bug" + related terms
  2. Retrieve the original debugging session
  3. Return the solution code + decision rationale
  4. Surface any follow-up notes or related errors

Memory Operations via API

| Endpoint | Method | Purpose |
| --- | --- | --- |
| /api/memory/search | GET | Search all memories by query, returns ranked results with scores |
| /api/memory/store | POST | Manually store episodic or semantic memory (rarely needed — MA auto-stores) |
| /api/memory/stats | GET | Get counts: episodic + semantic, total memory size, index health |
| /api/memory/ingest | POST | Ingest a file (project archive, codebase, documentation) as chunked semantic memories |

Example stats output:

{
  "episodic": 342,
  "semantic": 47,
  "total": 389,
  "indexHealth": "valid",
  "diskUsage": "12.4 MB",
  "topicsTracked": 156,
  "avgAccessCount": 2.3
}

Memory Best Practices

  1. Search before starting — Use /memory stats or /memory search <topic> to understand what MA already knows about a subject.

  2. Tag important findings — When storing manual memories via the API, set importance: 0.7+ for insights you'll want to prioritize in future searches.

  3. Ingest documentation — Use /memory ingest <file> to load project READMEs, architecture docs, or codebase snapshots. MA chunks them automatically and indexes all topics.

  4. Review consolidation — As episodic memories age, MA automatically compresses related ones into semantic knowledge. Check with /memory stats to see the semantic knowledge growing.

  5. Reset if needed — Use node MA-Reset-All.js to clear all memories (wipes MA-entity/entity_ma/memories/ completely). Chat history is separate and can be reset independently.


Health Check

node -e "const h=require('./MA-server/MA-health');console.log(h.formatReport(h.scan()))"

Reports: file count, critical errors, warnings. Checks JS syntax, JSON validity, HTML tag balance.


Made With MA

Projects built using MA:

  • NekoCore OS — Cognitive WebOS built with Memory Architect

Version

MA v1.0 — Part of NekoCore OS.

License

MIT License - See LICENSE for details.

Part of NekoCore OS.

About

Memory Architect (MA) is a standalone AI development agent that automates your entire development workflow. Written in Node.js with zero npm dependencies, MA builds, researches, writes code, manages projects, and maintains its own persistent memory across sessions.
