
⚡️ Agent Starter

Local AI Autonomy — Powered by FastAPI × Ollama (Llama 3)


Build, deploy, and run intelligent agents fully offline — no APIs, no cloud, no gatekeepers.


🚀 Overview

Agent Starter is a minimal yet production-ready template for creating local AI agents that can plan, act, and reflect using your own hardware.

It includes:

  • ⚙️ FastAPI backend to orchestrate message flow
  • 🤖 Ollama-powered LLMs (Llama 3, Mistral, etc.) for reasoning and response
  • 🧩 Tool interface (currently web.fetch) for real-world actions
  • 🧠 Short-term memory for contextual continuity
  • 🔁 Sense → Plan → Act → Reflect control loop
  • 🌐 Optional live web retrieval & summarization

Autonomous loop: Sense → Plan → Act → Reflect
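The loop above can be sketched in a few lines of Python. This is a minimal illustration with stubbed planner and tools, not the repository's actual implementation — in the real agent, plan would ask the LLM which tool to invoke:

```python
# Minimal sketch of a Sense -> Plan -> Act -> Reflect loop.
# The planner and tools are stubs; a real agent would call the LLM here.

def sense(query: str, memory: list[str]) -> dict:
    # Gather the user query plus a slice of recent context.
    return {"query": query, "context": memory[-3:]}

def plan(observation: dict) -> str:
    # Stub: pick web.fetch when the query contains a URL.
    return "web.fetch" if "http" in observation["query"] else "answer"

def act(action: str, observation: dict) -> str:
    if action == "web.fetch":
        return f"fetched page for: {observation['query']}"
    return f"direct answer to: {observation['query']}"

def reflect(result: str, memory: list[str]) -> str:
    # Store the outcome so the next turn has context.
    memory.append(result)
    return result

def run(query: str, memory: list[str]) -> str:
    obs = sense(query, memory)
    action = plan(obs)
    result = act(action, obs)
    return reflect(result, memory)
```

The key design point is that memory flows through every phase, so each turn sees what earlier turns did.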

💡 Quickstart

# 1. Clone and set up
git clone git@github.com:ben-scire/agent-starter.git
cd agent-starter
python -m venv .venv && source .venv/bin/activate
pip install -e .

# 2. Create .env
echo "LLM_PROVIDER=ollama
LLM_MODEL=llama3.1:8b-instruct-q4_K_M
LLM_BASE_URL=http://172.17.128.1:11434" > .env

# 3. Run FastAPI
uvicorn api.main:app --reload --env-file .env

# 4. Test
curl -s http://127.0.0.1:8000/chat \
  -H "Content-Type: application/json" \
  -d '{"query":"Fetch https://en.wikipedia.org/wiki/Artificial_intelligence and summarize it."}'

🧩 Architecture & Example

Agent Flow

FastAPI  (api/main.py)
└── SingleAgent.run()
    ├── Plan → uses Llama (via Ollama)
    ├── Act  → calls allowed tools (e.g., web.fetch)
    ├── Reflect → evaluates and summarizes
    └── Memory → stores short-term context

Llama 3 runs locally via Ollama, which listens on http://127.0.0.1:11434 by default; set LLM_BASE_URL in your .env (as in the quickstart above) if your Ollama server is reachable at a different address.
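As a rough sketch of how an LLM interface like core/llm.py can reach Ollama: the server exposes POST /api/chat, and with "stream": false it returns a single JSON object whose reply text lives under message.content. The function names here are illustrative, not the repository's actual API:

```python
# Sketch: calling a local Ollama server over its /api/chat endpoint.
import json
import urllib.request

OLLAMA_URL = "http://127.0.0.1:11434/api/chat"  # matches the default LLM_BASE_URL

def build_payload(model: str, prompt: str) -> dict:
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # one JSON response instead of a token stream
    }

def chat(model: str, prompt: str) -> str:
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(model, prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["message"]["content"]

if __name__ == "__main__":
    # Requires a running Ollama server with the model pulled.
    print(chat("llama3.1:8b-instruct-q4_K_M", "Say hello in one word."))
```

Because everything goes over plain local HTTP, every request and response can be logged and inspected.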

Example Query

curl -s http://127.0.0.1:8000/chat \
  -H "Content-Type: application/json" \
  -d '{"query":"Summarize https://en.wikipedia.org/wiki/Quantum_computing in 2 sentences."}'

Example Response

{
  "summary": "Quantum computing leverages quantum-mechanical phenomena to perform computations that are infeasible for classical systems. It promises exponential speedups for specific problems like factorization and simulation.",
  "citations": []
}

🧰 Core Components

| Feature          | File           | Description                       |
| ---------------- | -------------- | --------------------------------- |
| Core agent logic | core/agent.py  | Sense → Plan → Act → Reflect loop |
| Memory manager   | core/memory.py | Stores contextual exchanges       |
| LLM interface    | core/llm.py    | Routes messages to Ollama         |
| Web tool         | tools/web.py   | Fetch + clean webpage text        |
| API endpoint     | api/main.py    | FastAPI routes for /chat          |
Add new tools (filesystem, browser automation, crypto APIs, etc.) and plug them into the allowed_tools whitelist.
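One way such a whitelist can work is a registry that maps tool names to callables and refuses anything not explicitly allowed. This is a hypothetical sketch (the registry, decorator, and fs.read tool are invented for illustration, not the repository's code):

```python
# Sketch: a tool registry gated by an allowed_tools list.
from typing import Callable

TOOLS: dict[str, Callable[..., str]] = {}

def register(name: str):
    # Decorator that adds a function to the global tool registry.
    def wrap(fn):
        TOOLS[name] = fn
        return fn
    return wrap

@register("web.fetch")
def web_fetch(url: str) -> str:
    return f"<cleaned text of {url}>"  # stub; the real tool fetches and cleans HTML

@register("fs.read")  # hypothetical new filesystem tool
def fs_read(path: str) -> str:
    return f"<contents of {path}>"

def call_tool(name: str, allowed_tools: list[str], *args) -> str:
    # The agent may only invoke tools that appear in allowed_tools.
    if name not in allowed_tools:
        raise PermissionError(f"tool {name!r} is not whitelisted")
    return TOOLS[name](*args)
```

Keeping the allow-list separate from the registry means a tool can ship with the codebase yet stay disabled until you opt in.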


🧑‍💻 Why This Matters

Many agent frameworks rely on hosted APIs and paywalls. This project runs entirely locally, letting you:

  • Inspect every request and response
  • Prototype new agent behaviors
  • Benchmark real LLM latency on-device
  • Run safely offline

It is built for engineers who want transparency and control over their agents.


🪄 Coming Soon

  • Multi-agent orchestration (Maestro, Scrubsy, Vegas)
  • Persistent memory store
  • CLI + simple web UI
  • Optional GPU inference benchmark mode

⚖️ License

MIT © 2025 Ben Scire


“Build tools, not dependencies.”
