LocalPilot

LocalPilot is a Windows-first local AI assistant starter project. It runs on your machine with Ollama, keeps its memory in plain local files, exposes explicit operating modes, and uses Python tools for file work, shell access, web lookup, screenshots, and guarded desktop control.

The first version is intentionally boring: no giant agent framework, no autonomous background loops, no vector database, and no self-modifying core logic. It is built to be easy to inspect, test, and extend.

Features

CLI chat entrypoint: python localpilot.py
Optional Tkinter GUI with live activity timeline
Explicit modes: chat, code, research, desktop, memory
Local Ollama integration for text and placeholder vision
LM Studio agent planner, vision, and local RAG embeddings
Keyword router instead of opaque autonomous planning
File and shell tools with confirmation gates
DuckDuckGo research with 5-result cap
Plain-file notes and learned facts memory
Windows UI Automation before screenshot vision
Structured JSONL logging plus readable text logs
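
As a rough illustration of what keyword routing across the explicit modes could look like, here is a minimal sketch. The keyword table and the route function are hypothetical and are not taken from app/router.py; only the mode names come from the list above.

```python
# Hypothetical keyword table; the real router's keywords may differ.
MODE_KEYWORDS = {
    "code": ("code", "script", "function", "bug"),
    "research": ("search", "research", "look up"),
    "desktop": ("click", "screenshot", "window"),
    "memory": ("remember", "note", "recall"),
}


def route(message: str, default: str = "chat") -> str:
    """Return the first mode whose keyword appears in the message."""
    text = message.lower()
    for mode, keywords in MODE_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return mode
    return default
```

Plain substring matching keeps the decision easy to inspect, at the cost of false positives (for example, "debug" contains "bug"), so a real router would likely want word-boundary checks.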

Installation

cd C:\LocalPilot
python -m venv .venv
.venv\Scripts\activate
pip install -r requirements.txt
ollama pull gemma4:31b
ollama pull qwen2.5-coder:14b-instruct-q3_K_M
ollama pull qwen2.5-coder:7b
ollama pull granite3.3:2b
ollama pull nomic-embed-text
python localpilot.py

Recommended install:

powershell -ExecutionPolicy Bypass -File scripts/install_recommended_models.ps1

Optional models:

powershell -ExecutionPolicy Bypass -File scripts/install_optional_models.ps1

Optional Gemma 4 comparison models:

powershell -ExecutionPolicy Bypass -File scripts/install_optional_gemma4.ps1

RTX 3060 tuning:

powershell -ExecutionPolicy Bypass -File scripts/configure_ollama_rtx3060.ps1

Model check:

powershell -ExecutionPolicy Bypass -File scripts/check_models.ps1

Model doctor:

python localpilot.py --model-doctor

System doctor:

python localpilot.py --system-doctor

Doctor alias:

python localpilot.py --doctor

Benchmark:

powershell -ExecutionPolicy Bypass -File scripts/benchmark_models.ps1

Gemma 4 comparison (typed inside LocalPilot):

model compare gemma4

Required Ollama Models

Main reasoning / chat role: gemma4:31b
Coder role: qwen2.5-coder:14b-instruct-q3_K_M
Coder fallback: qwen2.5-coder:7b
Vision role: gemma4:31b
Router role: granite3.3:2b
Embedding role: nomic-embed-text

Optional slow quality mode:

ollama pull qwen3:30b

LocalPilot keeps coding and routing specialized by default, while this machine now uses gemma4:31b for both main reasoning and vision:

gemma4:31b handles default planning, chat, and everyday reasoning on this machine
qwen2.5-coder:14b-instruct-q3_K_M handles coding and app generation
qwen2.5-coder:7b is the automatic coder fallback if the 14B coder model is missing
gemma4:31b is also used for screenshots and visual inspection
granite3.3:2b is reserved for fast routing experiments
nomic-embed-text is reserved for future local memory search
qwen3:30b remains available as an optional slow high-quality mode and is not the default
gemma4:e4b is available as an optional faster Gemma comparison model
gemma4:latest is available as an optional quality comparison model if it runs well on this machine
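
The coder fallback described above can be sketched as a small lookup. The function name pick_model and its signature are assumptions for illustration, not LocalPilot's actual API.

```python
from typing import Iterable, Optional


def pick_model(installed: Iterable[str], primary: str,
               fallback: Optional[str] = None) -> str:
    """Prefer the primary tag; use the fallback only if the primary is missing."""
    available = set(installed)
    if primary in available:
        return primary
    if fallback is not None and fallback in available:
        return fallback
    raise LookupError(f"Neither {primary!r} nor {fallback!r} is installed")
```

For example, pick_model(["qwen2.5-coder:7b"], "qwen2.5-coder:14b-instruct-q3_K_M", "qwen2.5-coder:7b") returns the 7B tag, while a machine with neither model gets an explicit error rather than a silent substitution.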

LocalPilot expects a reachable Ollama API, typically at http://127.0.0.1:11434.
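
To confirm that the API is reachable before starting LocalPilot, a check along these lines may help. It uses Ollama's GET /api/tags endpoint, which lists locally installed models; the function names are hypothetical and this is not LocalPilot's own client code.

```python
import json
import urllib.error
import urllib.request

OLLAMA_URL = "http://127.0.0.1:11434"


def parse_model_names(payload: dict) -> list:
    """Extract sorted model tags from an /api/tags response body."""
    return sorted(model["name"] for model in payload.get("models", []))


def list_installed_models(base_url: str = OLLAMA_URL, timeout: float = 3.0) -> list:
    """Ask Ollama which models are installed, or raise if it is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return parse_model_names(json.load(resp))
    except (urllib.error.URLError, OSError) as exc:
        raise ConnectionError(f"Ollama is not reachable at {base_url}: {exc}") from exc
```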

Runtime memory stays local by default. memory/notes.md and memory/learned_facts.json are treated as runtime data, not tracked repo content. If you want to keep or share memory intentionally later, export it into docs/ or a future exports/ folder instead of autosyncing the live memory files.
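
One way to do such an intentional export is a one-shot copy into a timestamped folder. This is only a sketch: the export_memory helper and the timestamped layout are assumptions, and no such export command is described in this README.

```python
import shutil
from datetime import datetime
from pathlib import Path

MEMORY_FILES = ("notes.md", "learned_facts.json")


def export_memory(memory_dir: Path, export_root: Path) -> Path:
    """Copy the live memory files into a new timestamped export folder."""
    target = export_root / datetime.now().strftime("%Y%m%d-%H%M%S")
    target.mkdir(parents=True, exist_ok=False)
    for name in MEMORY_FILES:
        source = memory_dir / name
        if source.exists():
            # copy2 also preserves file timestamps.
            shutil.copy2(source, target / name)
    return target
```

Calling export_memory(Path("memory"), Path("exports")) leaves the live files untouched and produces a snapshot that can be reviewed before it is committed or shared.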

For this PC, a shorter Ollama keep-alive is recommended so one large model does not stay loaded too long:

[Environment]::SetEnvironmentVariable("OLLAMA_KEEP_ALIVE", "2m", "User")

Recommended order on this RTX 3060 machine:

1. Install recommended models.
2. Run scripts/configure_ollama_rtx3060.ps1.
3. Fully quit and restart Ollama.
4. Run scripts/check_models.ps1.
5. Run scripts/benchmark_models.ps1.
6. Run LocalPilot and ask for model benchmark.

Performance notes:

The old 4096 context default is not reliable for LM Studio agent follow-ups.
Use 8192 at minimum for the planner model, and prefer 16384.
Only one heavy model should stay loaded at a time.
qwen3:30b is slow quality mode only and should not be the everyday default.

Troubleshooting Models

If models show as missing after a restart:

1. Run ollama list.
2. Run python localpilot.py --model-doctor.
3. Check whether OLLAMA_MODELS changed.
4. Rerun powershell -ExecutionPolicy Bypass -File scripts/install_recommended_models.ps1.
5. Fully quit and restart Ollama.

LocalPilot will report similar installed models when the exact configured tag is missing. For example, qwen2.5-coder:14b is not treated as the same thing as qwen2.5-coder:14b-instruct-q3_K_M; it is only reported as a likely nearby tag.
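
The distinction between an exact match and a nearby tag can be sketched as follows. The find_model function and its result shape are hypothetical; the model doctor's real output format is not documented here.

```python
def find_model(configured: str, installed: list) -> dict:
    """Report an exact match, or list installed tags that share the base name."""
    if configured in installed:
        return {"status": "installed", "model": configured, "nearby": []}
    base = configured.split(":", 1)[0]
    nearby = sorted(tag for tag in installed if tag.split(":", 1)[0] == base)
    # Nearby tags are hints for the user only; they are never used as substitutes.
    return {"status": "missing", "model": configured, "nearby": nearby}
```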

Troubleshooting Desktop Observation

If desktop observation shows dependency_missing for UI Automation:

Run python localpilot.py --system-doctor
Install the missing dependency into the LocalPilot virtual environment:

.\.venv\Scripts\python.exe -m pip install uiautomation

If more than one package is missing, reinstall the full requirements set:

.\.venv\Scripts\python.exe -m pip install -r requirements.txt

Restart LocalPilot using Run LocalPilot.bat

Run LocalPilot.bat is already configured to use .venv\Scripts\python.exe, so dependency fixes should be installed into that same environment.

OCR Support

LocalPilot uses OCR only as a support layer for page understanding and confidence scoring. It does not replace UI Automation safety checks, and OCR text alone is never enough to authorize clicking or typing.

OCR is used to help answer:

what text is visible on screen
what buttons, labels, and fields can be read
whether OCR agrees with UI Automation and screenshot vision
whether OCR should slightly raise or lower confidence
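
The rule that OCR only nudges confidence and never authorizes an action on its own might be modelled like this. The step size, threshold, and function names are illustrative assumptions, not values taken from LocalPilot.

```python
def adjust_confidence(base: float, ocr_agrees: bool, step: float = 0.05) -> float:
    """Nudge confidence slightly up or down depending on OCR agreement."""
    adjusted = base + step if ocr_agrees else base - step
    return max(0.0, min(1.0, adjusted))


def may_act(uia_confirmed: bool, confidence: float, threshold: float = 0.8) -> bool:
    """OCR can shift confidence, but UI Automation must still confirm the target."""
    return uia_confirmed and confidence >= threshold
```

Even when may_act returns True, click and type actions would still go through the approval gate listed under Safety Rules.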

Install OCR dependencies:

.\.venv\Scripts\python.exe -m pip install pytesseract

Then install Tesseract OCR for Windows and make sure tesseract.exe is on PATH, or install it in:

C:\Program Files\Tesseract-OCR\tesseract.exe
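
The PATH-then-default lookup above could be checked with a short helper. The find_tesseract name is hypothetical, and how LocalPilot itself locates the binary is not shown in this README.

```python
import shutil
from pathlib import Path
from typing import Optional

DEFAULT_TESSERACT = Path(r"C:\Program Files\Tesseract-OCR\tesseract.exe")


def find_tesseract(default: Path = DEFAULT_TESSERACT) -> Optional[str]:
    """Return the tesseract executable from PATH, else the default install path."""
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    if default.is_file():
        return str(default)
    return None
```

If the binary is found only at the default location, pytesseract can be pointed at it by assigning the returned path to pytesseract.pytesseract.tesseract_cmd.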

Test OCR with:

python localpilot.py --doctor

and inside LocalPilot:

ocr screenshot

Generated OCR-preprocessed images are written under workspace/debug_views and stay local because workspace/* is ignored by Git.

How To Run

cd C:\LocalPilot
python localpilot.py

By default LocalPilot starts the CLI and tries to open the GUI alongside it. If Tkinter is unavailable, the CLI still runs.
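
A minimal sketch of that fallback, assuming the check is a plain import test (the gui_available helper is hypothetical):

```python
import importlib


def gui_available(module_name: str = "tkinter") -> bool:
    """Return True if the GUI toolkit can be imported on this interpreter."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True
```

An import check alone does not guarantee a window can be created (for example, on a session without a display), so a real launcher would likely also catch errors when the window is first constructed.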

Puppeteer-Controlled Browser

Install the browser bridge dependencies:

npm install --prefix browser

If LocalPilot does not find Chrome or Edge automatically, set a browser path explicitly:

$env:LOCALPILOT_BROWSER_EXECUTABLE="C:\Program Files\Google\Chrome\Application\chrome.exe"
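
The lookup order implied here (explicit override first, then automatic detection) can be sketched in Python. The candidate paths are assumed default install locations and the find_browser helper is hypothetical; only the environment variable name comes from this README.

```python
import os
from pathlib import Path
from typing import Iterable, Mapping, Optional

ENV_VAR = "LOCALPILOT_BROWSER_EXECUTABLE"

# Assumed default install locations; adjust for your machine.
DEFAULT_CANDIDATES = (
    r"C:\Program Files\Google\Chrome\Application\chrome.exe",
    r"C:\Program Files (x86)\Microsoft\Edge\Application\msedge.exe",
)


def find_browser(env: Mapping[str, str] = os.environ,
                 candidates: Iterable[str] = DEFAULT_CANDIDATES) -> Optional[str]:
    """Prefer the explicit override, then the first candidate that exists."""
    explicit = env.get(ENV_VAR)
    if explicit:
        return explicit
    for candidate in candidates:
        if Path(candidate).is_file():
            return candidate
    return None
```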

Then run the AI-driven agent CLI:

python localpilot.py --agent-cli

For websites, the agent prefers Puppeteer-controlled browser tools over desktop mouse tools.

LM Studio Agent Planner

LocalPilot's AI-driven agent planner is intended to run through LM Studio with:

Planner model: qwen2.5-coder-14b-instruct
Vision model: qwen3-vl-8b-instruct
Recommended planner context: 16384
Minimum usable planner context: 8192
Bad for follow-ups: 4096
Recommended planner concurrency: Max Concurrent Predictions = 1

Recommended LM Studio settings for the planner model:

Context Length: 16384
Max Concurrent Predictions: 1

If LocalPilot warns that the planner context is too small, increase the LM Studio context length before testing multi-step follow-ups such as "continue" or "what happened", or screenshot and browser clarifications.
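
The thresholds above translate into a simple three-way check. The function name and the returned labels are hypothetical; only the 8192 and 16384 values come from this section.

```python
MIN_CONTEXT = 8192
RECOMMENDED_CONTEXT = 16384


def check_planner_context(context_length: int) -> str:
    """Classify a planner context length against the documented thresholds."""
    if context_length < MIN_CONTEXT:
        return "too_small"
    if context_length < RECOMMENDED_CONTEXT:
        return "usable"
    return "recommended"
```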

Local RAG Memory

LocalPilot can build a local RAG index for project knowledge and prior session summaries.

Embedding endpoint: POST http://localhost:1234/v1/embeddings
Embedding model: text-embedding-nomic-embed-text-v1.5
Storage: memory/rag/
Default indexed sources: app/, tests/, .pilotrules, README.md, config/settings.json, memory/rag/memory_bank.md, and session summaries from memory/sessions/

To use it through the AI agent:

Index the LocalPilot workspace for memory.
What files implement the timer tool and what do they do?
What model is the planner supposed to use?

If LM Studio embeddings are unavailable, LocalPilot returns a real error instead of pretending the index is ready.
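
For reference, a request to the embedding endpoint listed above might look like the sketch below. It assumes the OpenAI-compatible request and response shape that LM Studio's local server exposes; the helper names are hypothetical and this is not the code in app/rag/embeddings.py.

```python
import json
import urllib.error
import urllib.request

EMBEDDINGS_URL = "http://localhost:1234/v1/embeddings"
EMBEDDING_MODEL = "text-embedding-nomic-embed-text-v1.5"


def build_payload(texts, model: str = EMBEDDING_MODEL) -> bytes:
    """Encode an OpenAI-style embeddings request body."""
    return json.dumps({"model": model, "input": list(texts)}).encode("utf-8")


def parse_embeddings(body: dict) -> list:
    """Pull the vectors out of an OpenAI-style embeddings response."""
    return [item["embedding"] for item in body["data"]]


def embed(texts, url: str = EMBEDDINGS_URL, timeout: float = 30.0) -> list:
    """POST texts to the endpoint and fail loudly if it is unavailable."""
    request = urllib.request.Request(
        url, data=build_payload(texts), headers={"Content-Type": "application/json"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as resp:
            return parse_embeddings(json.load(resp))
    except (urllib.error.URLError, OSError) as exc:
        raise RuntimeError(f"LM Studio embeddings unavailable at {url}: {exc}") from exc
```

Raising instead of returning an empty list mirrors the behaviour described above: a failed embedding call should surface as an error, not as an index that looks ready.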

For double-click launch on Windows, use Run LocalPilot.bat. It will:

create .venv automatically on first run if it does not exist
install requirements.txt into that virtual environment
use .venv\Scripts\python.exe for all normal launches
keep the window open if startup fails so the error stays visible

Safety Rules

File overwrite requires approval.
File move into an existing target requires approval.
Shell commands require approval.
Dangerous shell commands are blocked by default.
Desktop click, type, and hotkey actions require approval.
Windows UI Automation is preferred over screenshots.
Vision is only used when desktop inspection needs it.
OCR is only used as a support signal for page understanding and confidence scoring.
The assistant should not rewrite its own core logic without user approval.
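
The two shell rules above (block dangerous commands, require approval for the rest) can be sketched as a small gate. The pattern list and function names are illustrative assumptions and do not reproduce app/safety.py.

```python
import re

# Hypothetical block-list; not the project's actual patterns.
DANGEROUS_PATTERNS = (
    r"\brm\s+-rf\b",
    r"\bformat\s+[a-z]:",
    r"\bdel\s+/s\b",
    r"\bshutdown\b",
)


def classify_command(command: str) -> str:
    """Return "blocked" for dangerous commands, else "needs_approval"."""
    lowered = command.lower()
    if any(re.search(pattern, lowered) for pattern in DANGEROUS_PATTERNS):
        return "blocked"
    return "needs_approval"


def run_allowed(command: str, approved: bool) -> bool:
    """A command may run only if it is not blocked and the user approved it."""
    return classify_command(command) == "needs_approval" and approved
```

Note that no command is ever classified as safe to run without approval; approval is the default path and blocking is the exception layered on top of it.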

Project Structure

C:\LocalPilot
  README.md
  .gitignore
  requirements.txt
  localpilot.py
  config\
    settings.json
    model_profiles.json
    capabilities.json
  app\
    __init__.py
    rag\
      __init__.py
      chunker.py
      embeddings.py
      indexer.py
      retriever.py
      sources.py
      vector_store.py
    main.py
    router.py
    safety.py
    memory.py
    logger.py
  app\llm\
    __init__.py
    ollama_client.py
    prompts.py
  app\tools\
    __init__.py
    files.py
    shell.py
    web.py
    screen.py
    mouse_keyboard.py
    windows_ui.py
    ocr.py
  app\modes\
    __init__.py
    chat_mode.py
    code_mode.py
    research_mode.py
    desktop_mode.py
  memory\
    notes.md
    learned_facts.json
  logs\
    .gitkeep
  workspace\
    .gitkeep
  tests\
    test_safety.py
    test_files.py
    test_router.py
    test_memory.py

What Works In v1

Terminal chat loop
GUI status window with role and mode activity
Keyword routing
File listing, read, write, append, mkdir, copy, move
Shell execution with approval and dangerous-command blocking
Web search with capped results
Notes save/search/show
Screenshot capture and mouse position
Basic active window and UI Automation inspection
Vision entrypoint placeholder with graceful failure path

Current TODO

More reliable natural-language parsing for code and desktop commands
Richer GUI controls for approvals and history filtering
Stronger Windows UI Automation control actions
Better structured learned facts updates
Keep multimodal verification strong for the configured vision role
Add stronger OCR backends later if Tesseract is not enough
Add Whisper.cpp later for voice input

First Roadmap

Stabilize the existing tool interfaces and tests.
Improve command parsing without turning the app into one giant loop.
Add better memory indexing while staying file-based.
Expand Windows UI Automation read-only inspection.
Make vision analysis production-ready once the local Ollama multimodal path is verified.

GitHub

Target repository: https://github.com/Code4life69/LocalPilot

If gh is not installed, create the repository manually on GitHub as a public repo named LocalPilot, then run:

cd C:\LocalPilot
git init
git add .
git commit -m "Initial LocalPilot starter"
git branch -M main
git remote add origin https://github.com/Code4life69/LocalPilot.git
git push -u origin main
