LocalPilot is a Windows-first local AI assistant starter project. It runs on your machine with Ollama, keeps its memory in plain local files, exposes explicit operating modes, and uses Python tools for file work, shell access, web lookup, screenshots, and guarded desktop control.
The first version is intentionally boring: no giant agent framework, no autonomous background loops, no vector database, and no self-modifying core logic. It is built to be easy to inspect, test, and extend.
- CLI chat entrypoint: python localpilot.py
- Optional Tkinter GUI with live activity timeline
- Explicit modes: chat, code, research, desktop, memory
- Local Ollama integration for text and placeholder vision
- LM Studio agent planner, vision, and local RAG embeddings
- Keyword router instead of opaque autonomous planning
- File and shell tools with confirmation gates
- DuckDuckGo research with 5-result cap
- Plain-file notes and learned facts memory
- Windows UI Automation before screenshot vision
- Structured JSONL logging plus readable text logs
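To make the "keyword router instead of opaque autonomous planning" item concrete, here is a minimal sketch of what such a router could look like. The keyword table and the `route` function are hypothetical and are not taken from app/router.py; only the five mode names come from this README.

```python
# Hypothetical keyword table; the real app/router.py may use different words.
MODE_KEYWORDS = {
    "code": ("code", "function", "script", "bug"),
    "research": ("search", "research", "look up"),
    "desktop": ("click", "screenshot", "window"),
    "memory": ("remember", "note", "recall"),
}


def route(message: str) -> str:
    """Return the first mode whose keyword appears in the message, else 'chat'."""
    text = message.lower()
    for mode, keywords in MODE_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return mode
    return "chat"
```

Because the table is plain data, a routing decision can be traced by reading it, which is the point of preferring keywords over a planner. Substring matching is crude (for example, "note" also matches "notebook"), so a real router would likely need word-boundary checks.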
cd C:\LocalPilot
python -m venv .venv
.venv\Scripts\activate
pip install -r requirements.txt
ollama pull gemma4:31b
ollama pull qwen2.5-coder:14b-instruct-q3_K_M
ollama pull qwen2.5-coder:7b
ollama pull granite3.3:2b
ollama pull nomic-embed-text
python localpilot.py
Recommended install:
powershell -ExecutionPolicy Bypass -File scripts/install_recommended_models.ps1
Optional models:
powershell -ExecutionPolicy Bypass -File scripts/install_optional_models.ps1
Optional Gemma 4 comparison models:
powershell -ExecutionPolicy Bypass -File scripts/install_optional_gemma4.ps1
RTX 3060 tuning:
powershell -ExecutionPolicy Bypass -File scripts/configure_ollama_rtx3060.ps1
Model check:
powershell -ExecutionPolicy Bypass -File scripts/check_models.ps1
Model doctor:
python localpilot.py --model-doctor
System doctor:
python localpilot.py --system-doctor
Doctor alias:
python localpilot.py --doctor
Benchmark:
powershell -ExecutionPolicy Bypass -File scripts/benchmark_models.ps1
Gemma 4 comparison:
model compare gemma4
- Main reasoning / chat role: gemma4:31b
- Coder role: qwen2.5-coder:14b-instruct-q3_K_M
- Coder fallback: qwen2.5-coder:7b
- Vision role: gemma4:31b
- Router role: granite3.3:2b
- Embedding role: nomic-embed-text
Optional slow quality mode:
ollama pull qwen3:30b
LocalPilot keeps coding and routing specialized by default, while this machine now uses gemma4:31b for both main reasoning and vision:
- gemma4:31b handles default planning, chat, and everyday reasoning on this machine
- qwen2.5-coder:14b-instruct-q3_K_M handles coding and app generation
- qwen2.5-coder:7b is the automatic coder fallback if the 14B coder model is missing
- gemma4:31b is also used for screenshots and visual inspection
- granite3.3:2b is reserved for fast routing experiments
- nomic-embed-text is reserved for future local memory search
- qwen3:30b remains available as an optional slow high-quality mode and is not the default
- gemma4:e4b is available as an optional faster Gemma comparison model
- gemma4:latest is available as an optional quality comparison model if it runs well on this machine
LocalPilot expects a reachable Ollama API, typically at http://127.0.0.1:11434.
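A quick way to confirm that the API is reachable is to ask Ollama for its installed models through the `/api/tags` endpoint. The sketch below uses only the standard library; the function names are illustrative and are not LocalPilot's own client code.

```python
import json
import urllib.error
import urllib.request

OLLAMA_URL = "http://127.0.0.1:11434"  # default address named in this README


def parse_model_names(payload: dict) -> list[str]:
    """Extract model tags from an /api/tags response body."""
    return [entry["name"] for entry in payload.get("models", [])]


def list_installed_models(base_url: str = OLLAMA_URL, timeout: float = 3.0) -> list[str]:
    """Return installed model tags, or raise ConnectionError if Ollama is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as response:
            return parse_model_names(json.load(response))
    except (urllib.error.URLError, OSError) as exc:
        raise ConnectionError(f"Ollama is not reachable at {base_url}: {exc}") from exc
```

Calling `list_installed_models()` while Ollama is running should return the same tags that `ollama list` prints; if the service is down, the call raises instead of returning an empty list.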
Runtime memory stays local by default. memory/notes.md and memory/learned_facts.json are treated as runtime data, not tracked repo content. If you want to keep or share memory intentionally later, export it into docs/ or a future exports/ folder instead of autosyncing the live memory files.
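An intentional export can be as simple as copying the two memory files into a separate folder. The following is a sketch only: `export_memory` is a hypothetical helper, not an existing LocalPilot command, and the target folder is whatever you choose.

```python
import shutil
from pathlib import Path

# The two runtime memory files named in this README.
MEMORY_FILES = ("notes.md", "learned_facts.json")


def export_memory(memory_dir: Path, export_dir: Path) -> list[Path]:
    """Copy existing memory files into export_dir and return the copied paths."""
    export_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in MEMORY_FILES:
        source = memory_dir / name
        if source.is_file():  # skip files that have not been created yet
            copied.append(Path(shutil.copy2(source, export_dir / name)))
    return copied
```

For example, `export_memory(Path("memory"), Path("exports"))` leaves the live files untouched and produces a snapshot you can review before committing or sharing it.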
For this PC, a shorter Ollama keep-alive is recommended so one large model does not stay loaded too long:
[Environment]::SetEnvironmentVariable("OLLAMA_KEEP_ALIVE", "2m", "User")
Recommended order on this RTX 3060 machine:
- Install recommended models.
- Run scripts/configure_ollama_rtx3060.ps1.
- Fully quit and restart Ollama.
- Run scripts/check_models.ps1.
- Run scripts/benchmark_models.ps1.
- Run LocalPilot and ask for model benchmark.
Performance notes:
- The old 4096 context default is not reliable for LM Studio agent follow-ups.
- Use 8192 at minimum for the planner model, and prefer 16384.
- Only one heavy model should stay loaded at a time.
- qwen3:30b is slow quality mode only and should not be the everyday default.
If models show as missing after a restart:
- Run ollama list.
- Run python localpilot.py --model-doctor.
- Check whether OLLAMA_MODELS changed.
- Rerun powershell -ExecutionPolicy Bypass -File scripts/install_recommended_models.ps1.
- Fully quit and restart Ollama.
LocalPilot will report similar installed models when the exact configured tag is missing. For example, qwen2.5-coder:14b is not treated as the same thing as qwen2.5-coder:14b-instruct-q3_K_M; it is only reported as a likely nearby tag.
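The exact-match rule with a "nearby tag" hint can be illustrated with a short sketch. It assumes that "nearby" means sharing the model name before the colon; the actual matching logic in LocalPilot may differ, and both function names are hypothetical.

```python
def model_family(tag: str) -> str:
    """Return the part of a tag before the first ':' (e.g. 'qwen2.5-coder')."""
    return tag.split(":", 1)[0]


def check_model(configured: str, installed: list[str]) -> dict:
    """Report an exact match, or list same-family tags as 'nearby' only."""
    if configured in installed:
        return {"status": "ok", "model": configured, "nearby": []}
    nearby = [tag for tag in installed if model_family(tag) == model_family(configured)]
    return {"status": "missing", "model": configured, "nearby": nearby}
```

With only qwen2.5-coder:14b installed, checking for qwen2.5-coder:14b-instruct-q3_K_M yields status "missing" with that tag listed under "nearby". The nearby tag is a hint for the user, never a silent substitute.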
If desktop observation shows dependency_missing for UI Automation:
- Run python localpilot.py --system-doctor
- Install the missing dependency into the LocalPilot virtual environment:
  .\.venv\Scripts\python.exe -m pip install uiautomation
- If more than one package is missing, reinstall the full requirements set:
  .\.venv\Scripts\python.exe -m pip install -r requirements.txt
- Restart LocalPilot using Run LocalPilot.bat
Run LocalPilot.bat is already configured to use .venv\Scripts\python.exe, so dependency fixes should be installed into that same environment.
LocalPilot uses OCR only as a support layer for page understanding and confidence scoring. It does not replace UI Automation safety checks, and OCR text alone is never enough to authorize clicking or typing.
OCR is used to help answer:
- what text is visible on screen
- what buttons, labels, and fields can be read
- whether OCR agrees with UI Automation and screenshot vision
- whether OCR should slightly raise or lower confidence
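The role of OCR as a support signal can be sketched as follows. The step size of 0.05 and both function names are assumptions made for illustration; the README states only that OCR may "slightly" raise or lower confidence and can never authorize an action on its own.

```python
def adjust_confidence(base: float, ocr_agrees: bool, step: float = 0.05) -> float:
    """Nudge a confidence score up or down by a small step, clamped to [0, 1]."""
    adjusted = base + step if ocr_agrees else base - step
    return max(0.0, min(1.0, adjusted))


def may_act(uia_confirmed: bool, user_approved: bool, ocr_found_text: bool) -> bool:
    """OCR text is deliberately ignored: only UI Automation plus approval can authorize."""
    return uia_confirmed and user_approved
```

Keeping `ocr_found_text` in the signature but out of the decision makes the rule explicit: OCR can influence how confident the assistant is, but not whether it is allowed to click or type.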
Install OCR dependencies:
.\.venv\Scripts\python.exe -m pip install pytesseract
Then install Tesseract OCR for Windows and make sure tesseract.exe is on PATH, or install it in:
C:\Program Files\Tesseract-OCR\tesseract.exe
Test OCR with:
python localpilot.py --doctor
and inside LocalPilot:
ocr screenshot
Generated OCR-preprocessed images are written under workspace/debug_views and stay local because workspace/* is ignored by Git.
cd C:\LocalPilot
python localpilot.py
By default LocalPilot starts the CLI and tries to open the GUI alongside it. If Tkinter is unavailable, the CLI still runs.
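The CLI-first startup with an optional GUI can be approximated by probing for Tkinter before deciding what to launch. This is a sketch under that assumption; `gui_available` and `pick_frontends` are illustrative names, not functions from localpilot.py.

```python
import importlib


def gui_available(module_name: str = "tkinter") -> bool:
    """Return True if the GUI toolkit can be imported on this machine."""
    try:
        importlib.import_module(module_name)
        return True
    except ImportError:
        return False


def pick_frontends(has_gui: bool) -> list[str]:
    """The CLI always runs; the GUI is added only when it is available."""
    return ["cli", "gui"] if has_gui else ["cli"]
```

A launcher could call `pick_frontends(gui_available())` and start each listed frontend, so a missing Tkinter install degrades to CLI-only rather than failing at startup.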
Install the browser bridge dependencies:
npm install --prefix browser
If LocalPilot does not find Chrome or Edge automatically, set a browser path explicitly:
$env:LOCALPILOT_BROWSER_EXECUTABLE="C:\Program Files\Google\Chrome\Application\chrome.exe"
Then run the AI-driven agent CLI:
python localpilot.py --agent-cli
For websites, the agent prefers Puppeteer-controlled browser tools over desktop mouse tools.
LocalPilot's AI-driven agent planner is intended to run through LM Studio with:
- Planner model: qwen2.5-coder-14b-instruct
- Vision model: qwen3-vl-8b-instruct
- Recommended planner context: 16384
- Minimum usable planner context: 8192
- Bad for follow-ups: 4096
- Recommended planner concurrency: Max Concurrent Predictions = 1
Recommended LM Studio settings for the planner model:
Context Length: 16384
Max Concurrent Predictions: 1
If LocalPilot warns that the planner context is too small, increase the LM Studio context length before testing multi-step follow-ups like continue, what happened, or screenshot and browser clarifications.
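The context thresholds above map onto a simple check. The sketch below shows one way such a warning could be derived; the function name and the returned labels are hypothetical, while the two thresholds come from this README.

```python
RECOMMENDED_CONTEXT = 16384  # recommended planner context from this README
MINIMUM_CONTEXT = 8192       # minimum usable planner context


def classify_planner_context(context_length: int) -> str:
    """Classify an LM Studio context length against the README thresholds."""
    if context_length >= RECOMMENDED_CONTEXT:
        return "recommended"
    if context_length >= MINIMUM_CONTEXT:
        return "usable"
    return "too_small"  # e.g. the old 4096 default
```

A value of 4096 falls into "too_small", which is the case where multi-step follow-ups are expected to be unreliable.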
LocalPilot can build a local RAG index for project knowledge and prior session summaries.
- Embedding endpoint: POST http://localhost:1234/v1/embeddings
- Embedding model: text-embedding-nomic-embed-text-v1.5
- Storage: memory/rag/
- Default indexed sources: app/, tests/, .pilotrules, README.md, config/settings.json, memory/rag/memory_bank.md, and session summaries from memory/sessions/
To use it through the AI agent:
Index the LocalPilot workspace for memory.
What files implement the timer tool and what do they do?
What model is the planner supposed to use?
If LM Studio embeddings are unavailable, LocalPilot returns a real error instead of pretending the index is ready.
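The "real error" behaviour can be sketched with a small standard-library client. It assumes the endpoint accepts an OpenAI-style request body (`model` plus `input`) and returns vectors under `data[i].embedding`; the function names are illustrative and are not the contents of app/rag/embeddings.py.

```python
import json
import urllib.error
import urllib.request

EMBEDDINGS_URL = "http://localhost:1234/v1/embeddings"
EMBEDDING_MODEL = "text-embedding-nomic-embed-text-v1.5"


def build_embedding_request(texts: list[str]) -> urllib.request.Request:
    """Build the POST request; the body shape is an assumed OpenAI-style payload."""
    body = json.dumps({"model": EMBEDDING_MODEL, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        EMBEDDINGS_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(texts: list[str], timeout: float = 30.0) -> list[list[float]]:
    """Return one vector per input text, or raise if the endpoint is unavailable."""
    try:
        with urllib.request.urlopen(build_embedding_request(texts), timeout=timeout) as response:
            payload = json.load(response)
    except (urllib.error.URLError, OSError) as exc:
        # Surface the failure instead of returning empty vectors.
        raise RuntimeError(f"LM Studio embeddings unavailable: {exc}") from exc
    return [item["embedding"] for item in payload["data"]]
```

Raising here means an indexer built on `embed` cannot silently write an empty or partial index when LM Studio is not running.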
For double-click launch on Windows, use Run LocalPilot.bat. It will:
- create .venv automatically on first run if it does not exist
- install requirements.txt into that virtual environment
- use .venv\Scripts\python.exe for all normal launches
- keep the window open if startup fails so the error stays visible
- File overwrite requires approval.
- File move into an existing target requires approval.
- Shell commands require approval.
- Dangerous shell commands are blocked by default.
- Desktop click, type, and hotkey actions require approval.
- Windows UI Automation is preferred over screenshots.
- Vision is only used when desktop inspection needs it.
- OCR is only used as a support signal for page understanding and confidence scoring.
- The assistant should not rewrite its own core logic without user approval.
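The approval rules above can be modelled as a single gate that every tool call passes through. The sketch below is illustrative only: the action names and the deny-list fragments are hypothetical, and the real patterns in app/safety.py are not shown in this README.

```python
# Action names are hypothetical labels for the approval rules listed above.
APPROVAL_REQUIRED = {"file_overwrite", "file_move_existing", "shell", "click", "type", "hotkey"}
# Hypothetical deny-list; the real blocked patterns may differ.
DANGEROUS_SHELL_FRAGMENTS = ("del /s", "rmdir /s", "shutdown")


def gate(action: str, command: str = "") -> str:
    """Return 'blocked', 'needs_approval', or 'allowed' for a proposed action."""
    if action == "shell":
        lowered = command.lower()
        if any(fragment in lowered for fragment in DANGEROUS_SHELL_FRAGMENTS):
            return "blocked"  # dangerous commands never reach the approval prompt
    if action in APPROVAL_REQUIRED:
        return "needs_approval"
    return "allowed"
```

Checking the deny-list before the approval set means a dangerous command is rejected outright rather than offered to the user for confirmation. Substring matching is a blunt instrument, so a production check would need proper command parsing.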
C:\LocalPilot
  README.md
  .gitignore
  requirements.txt
  localpilot.py
  config\
    settings.json
    model_profiles.json
    capabilities.json
  app\
    __init__.py
    rag\
      __init__.py
      chunker.py
      embeddings.py
      indexer.py
      retriever.py
      sources.py
      vector_store.py
    main.py
    router.py
    safety.py
    memory.py
    logger.py
  app\llm\
    __init__.py
    ollama_client.py
    prompts.py
  app\tools\
    __init__.py
    files.py
    shell.py
    web.py
    screen.py
    mouse_keyboard.py
    windows_ui.py
    ocr.py
  app\modes\
    __init__.py
    chat_mode.py
    code_mode.py
    research_mode.py
    desktop_mode.py
  memory\
    notes.md
    learned_facts.json
  logs\
    .gitkeep
  workspace\
    .gitkeep
  tests\
    test_safety.py
    test_files.py
    test_router.py
    test_memory.py
- Terminal chat loop
- GUI status window with role and mode activity
- Keyword routing
- File listing, read, write, append, mkdir, copy, move
- Shell execution with approval and dangerous-command blocking
- Web search with capped results
- Notes save/search/show
- Screenshot capture and mouse position
- Basic active window and UI Automation inspection
- Vision entrypoint placeholder with graceful failure path
- More reliable natural-language parsing for code and desktop commands
- Richer GUI controls for approvals and history filtering
- Stronger Windows UI Automation control actions
- Better structured learned facts updates
- Keep multimodal verification strong for the configured vision role
- Add stronger OCR backends later if Tesseract is not enough
- Add Whisper.cpp later for voice input
- Stabilize the existing tool interfaces and tests.
- Improve command parsing without turning the app into one giant loop.
- Add better memory indexing while staying file-based.
- Expand Windows UI Automation read-only inspection.
- Make vision analysis production-ready once the local Ollama multimodal path is verified.
Target repository: https://github.com/Code4life69/LocalPilot
If gh is not installed, create the repository manually on GitHub as a public repo named LocalPilot, then run:
cd C:\LocalPilot
git init
git add .
git commit -m "Initial LocalPilot starter"
git branch -M main
git remote add origin https://github.com/Code4life69/LocalPilot.git
git push -u origin main