Open-source AI agent system — run agentic coding assistants with any local LLM
Quick Start • Features • Architecture • MCP Servers • Models • Roadmap
Bytez3 Nexus is an open-source AI agent system that lets you run powerful, agentic coding assistants using any local or cloud-hosted LLM — no proprietary API keys required. Built on a restored and extended version of the Claude Code CLI architecture, Nexus adds a transparent translation proxy that routes all agent requests through Ollama's OpenAI-compatible endpoint.
Unlike cloud-locked alternatives, Nexus gives you:
- Full data sovereignty — your code never leaves your machine
- Model freedom — swap between Llama, Qwen, DeepSeek, Mistral, or any Ollama model
- Agent-grade tooling — 30+ built-in tools, MCP server support, multi-agent coordination
- Zero vendor lock-in — open source, self-hosted, yours to extend
- Multi-agent coordination — orchestrate multiple AI agents working in parallel
- Recursive subagent spawning — agents can spawn their own nested subagents autonomously for complex, multi-tiered task delegation
- Agent memory (Brain) — centralized persistent memory context (e.g., project-scoped brains) attached automatically to every active subagent
- 30+ built-in tools — file editing, bash execution, grep, git, web search, and more
- Tool calling — full function calling with structured input/output schemas
- Streaming — real-time token streaming for responsive interactions
- Model Context Protocol — connect to any MCP-compatible tool server
- Extensible — add GitHub, GitKraken, database, and custom tool servers
- Dynamic tool discovery — agents automatically discover and use MCP tools
- Permission system — granular control over what agents can access
- Ollama integration — native support for local and cloud-hosted Ollama instances
- Any model — use whatever you've pulled: Llama, Qwen, DeepSeek, Mistral, CodeLlama
- Privacy by default — no telemetry, no data collection, no cloud dependency
- Offline capable — works entirely without internet (with local models)
- Ollama Cloud — connect to remote Ollama deployments with API key auth
- Extensible provider system — architecture supports adding new LLM providers
- Hybrid mode — mix local and cloud models as needed
- Ollama installed and running (
ollama serve) - A model pulled — e.g.
ollama pull qwen2.5-coder:7b - Node.js 18+
# Clone
git clone https://github.com/SteOnChain/bytez3-nexus.git
cd bytez3-nexus
# Install dependencies
npm install# Translation tests (no Ollama needed)
npm test
# Full test suite with live Ollama connection
npm run test:live
# Test with specific model
OLLAMA_MODEL=qwen2.5:7b npm run test:ollama# Local Ollama — run with any local model
CLAUDE_CODE_USE_OLLAMA=1 ANTHROPIC_MODEL=qwen2.5-coder:7b nexus
# Ollama Cloud — connect to remote instances
CLAUDE_CODE_USE_OLLAMA=1 \
OLLAMA_BASE_URL=https://your-cloud.ollama.ai \
OLLAMA_API_KEY=sk-your-key \
ANTHROPIC_MODEL=llama3.2 \
nexusBytez3 Nexus uses a fetch-level interception architecture. When the Ollama provider is active, all API calls from the agent system are transparently translated:
┌──────────────────────────────────────────────────────────┐
│ BYTEZ3 NEXUS │
│ │
│ ┌──────────┐ ┌──────────┐ ┌───────────────────┐ │
│ │ Agent │───▶│ SDK │───▶│ Fetch Override │ │
│ │ System │ │ Client │ │ (Translation) │ │
│ └──────────┘ └──────────┘ └─────────┬─────────┘ │
│ │ │ │
│ ┌────▼─────┐ ┌────────────▼─────────┐ │
│ │ MCP │ │ Request Translator │ │
│ │ Servers │ │ Anthropic → OpenAI │ │
│ └──────────┘ └────────────┬─────────┘ │
│ │ │
│ ┌────────────▼─────────┐ │
│ │ Ollama │ │
│ │ /v1/chat/completions │ │
│ └────────────┬─────────┘ │
│ │ │
│ ┌────────────▼─────────┐ │
│ │ Response Translator │ │
│ │ OpenAI SSE → Agent │ │
│ └──────────────────────┘ │
└──────────────────────────────────────────────────────────┘
| Agent Format (Anthropic) | LLM Format (OpenAI/Ollama) |
|---|---|
system: [{type: 'text', text}] |
{role: 'system', content} |
{type: 'tool_use', id, name, input} |
tool_calls: [{function: {name, arguments}}] |
{type: 'tool_result', tool_use_id} |
{role: 'tool', tool_call_id} |
content_block_delta (SSE) |
choices[].delta (SSE) |
stop_reason: 'tool_use' |
finish_reason: 'tool_calls' |
Bytez3 Nexus supports the Model Context Protocol (MCP) — an open standard for connecting AI agents with external tools and data sources. Any MCP-compatible server works out of the box:
| Server | Tools Provided |
|---|---|
| GitHub | Issues, PRs, code search, repo management |
| GitKraken | Git operations, branch management, worktrees |
| Prisma | Database migrations, schema management, Studio |
| Google Maps | Geocoding, routing, places, directions |
| Custom servers | Build your own with the MCP SDK |
MCP tools are automatically discovered, translated through the Ollama proxy, and available to the agent system — no configuration needed beyond connecting the server.
| Model | Size | Best For | Tool Calling |
|---|---|---|---|
qwen2.5-coder:7b |
4.7 GB | Code generation, refactoring | ✅ Excellent |
qwen2.5-coder:32b |
18 GB | Complex coding tasks | ✅ Excellent |
llama3.2 |
2.0 GB | General purpose, fast | ✅ Good |
deepseek-r1:8b |
4.9 GB | Reasoning, debugging | |
mistral:7b |
4.1 GB | Balanced performance | ✅ Good |
codellama:13b |
7.4 GB | Code-focused, large context | ✅ Good |
qwen2.5:72b |
41 GB | Best quality (needs GPU) | ✅ Excellent |
- For tool calling (MCP, file editing, bash): Use
qwen2.5-coder— best tool-calling accuracy - For speed: Use
llama3.2— smallest, fastest responses - For reasoning: Use
deepseek-r1:8b— chain-of-thought reasoning - For maximum quality: Use
qwen2.5:72b— if you have the VRAM
| Variable | Default | Description |
|---|---|---|
CLAUDE_CODE_USE_OLLAMA |
— | Set to 1 to enable Ollama provider |
OLLAMA_BASE_URL |
http://localhost:11434 |
Ollama server URL |
OLLAMA_API_KEY |
— | API key for Ollama Cloud authentication |
ANTHROPIC_MODEL |
— | Model name (e.g., qwen2.5-coder:7b) |
bytez3-nexus/
├── restored-src/src/ # Agent system source (TypeScript)
│ ├── main.tsx # CLI entry point
│ ├── services/api/
│ │ ├── client.ts # API client factory (Ollama injection)
│ │ ├── claude.ts # Message handling & streaming
│ │ └── ollama/ # Ollama translation proxy
│ │ ├── ollamaClient.ts # Fetch override factory
│ │ ├── requestTranslator.ts # Anthropic → OpenAI translation
│ │ └── responseTranslator.ts # OpenAI → Anthropic SSE translation
│ ├── services/mcp/ # MCP server integration
│ ├── tools/ # 30+ built-in tools
│ ├── coordinator/ # Multi-agent coordination
│ ├── assistant/ # Assistant mode
│ ├── plugins/ # Plugin system
│ ├── skills/ # Skills system
│ └── voice/ # Voice interaction
├── test-ollama.mjs # Test suite (49 assertions)
├── assets/ # Branding assets
└── package.json # Project metadata
═══ Bytez3 Nexus — Test Suite ═══
▸ Request Translation (Anthropic → OpenAI) 16/16 ✅
▸ Response Stream Translation (OpenAI → Anthropic) 10/10 ✅
▸ Tool Call Response Translation 8/8 ✅
▸ Non-Streaming Response Translation 10/10 ✅
▸ Live Ollama Connection 5/5 ✅
═══ Results ═══
49 passed 0 failed ✓ All systems operational
- Ollama integration — local + cloud model support
- Tool calling — full function calling translation
- Streaming — real-time SSE translation
- MCP support — Model Context Protocol compatibility
- Multi-model routing — route different tasks to different models
- Agent memory — persistent context across sessions
- Custom agent definitions — Markdown/YAML-based agent configuration
- Web UI — browser-based agent dashboard (port 3000)
- Plugin marketplace — community-built extensions (framework ready)
- Voice mode — voice-driven agent interaction (requires Whisper/TTS integration)
- Team agents — collaborative multi-agent workflows
Contributions are welcome! Whether it's bug fixes, new features, documentation, or ideas — open an issue or submit a PR.
Branch Protection: The
mainbranch is protected. All changes require a Pull Request with at least 1 approving review before merging. Direct pushes tomainare blocked.
# 1. Fork the repo on GitHub, then clone your fork
git clone https://github.com/YOUR_USERNAME/bytez3-nexus.git
cd bytez3-nexus
# 2. Install dependencies
cd restored-src
npm install
# 3. Create a feature branch
git checkout -b feat/your-feature-name
# 4. Set up your environment
cp .env.example .env
# Edit .env with your Ollama settings:
# CLAUDE_CODE_USE_OLLAMA=1
# ANTHROPIC_MODEL=qwen2.5-coder:7b
# 5. Start the Web UI for testing
npx tsx --env-file=.env src/server/webServer.ts
# Open http://localhost:3000
# 6. Make your changes, commit, and push
git add -A
git commit -m "feat: describe your change"
git push origin feat/your-feature-name
# 7. Open a Pull Request on GitHub
# → Base: main ← Compare: feat/your-feature-name
# → Request review from @SteOnChain- One feature per PR — keep changes focused and reviewable
- Test your changes — verify in the Web UI before submitting
- Follow conventional commits —
feat:,fix:,docs:,refactor: - Don't commit
.envfiles ornode_modules/
MIT © Bytez3
Built with ⚡ by Bytez3
