LocalPilot

LocalPilot is a Windows-first local AI assistant starter project. It runs on your machine with Ollama, keeps its memory in plain local files, exposes explicit operating modes, and uses Python tools for file work, shell access, web lookup, screenshots, and guarded desktop control.

The first version is intentionally boring: no giant agent framework, no autonomous background loops, no vector database, and no self-modifying core logic. It is built to be easy to inspect, test, and extend.

Features

  • CLI chat entrypoint: python localpilot.py
  • Optional Tkinter GUI with live activity timeline
  • Explicit modes: chat, code, research, desktop, memory
  • Local Ollama integration for text and placeholder vision
  • LM Studio agent planner, vision, and local RAG embeddings
  • Keyword router instead of opaque autonomous planning
  • File and shell tools with confirmation gates
  • DuckDuckGo research with 5-result cap
  • Plain-file notes and learned facts memory
  • Windows UI Automation before screenshot vision
  • Structured JSONL logging plus readable text logs
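
As an illustration of the keyword-router idea listed above, here is a minimal sketch. The keyword table and the `route` function are hypothetical and are not taken from app/router.py; only the mode names come from this README.

```python
# Hypothetical keyword table for illustration; the real router may differ.
MODE_KEYWORDS = {
    "code": ("code", "function", "script", "bug"),
    "research": ("search", "research", "look up"),
    "desktop": ("click", "screenshot", "window"),
    "memory": ("remember", "note", "recall"),
}


def route(message: str, default: str = "chat") -> str:
    """Return the first mode whose keyword appears in the message."""
    text = message.lower()
    for mode, keywords in MODE_KEYWORDS.items():
        # Plain substring matching: easy to inspect, but "note" also matches "notebook".
        if any(keyword in text for keyword in keywords):
            return mode
    return default


if __name__ == "__main__":
    print(route("Take a screenshot of the window"))
    print(route("hello there"))
```

Because the table is plain data, a routing decision can be explained by pointing at the keyword that matched, which is the main appeal over a model-driven planner.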

Installation

cd C:\LocalPilot
python -m venv .venv
.venv\Scripts\activate
pip install -r requirements.txt
ollama pull gemma4:31b
ollama pull qwen2.5-coder:14b-instruct-q3_K_M
ollama pull qwen2.5-coder:7b
ollama pull granite3.3:2b
ollama pull nomic-embed-text
python localpilot.py

Recommended install:

powershell -ExecutionPolicy Bypass -File scripts/install_recommended_models.ps1

Optional models:

powershell -ExecutionPolicy Bypass -File scripts/install_optional_models.ps1

Optional Gemma 4 comparison models:

powershell -ExecutionPolicy Bypass -File scripts/install_optional_gemma4.ps1

RTX 3060 tuning:

powershell -ExecutionPolicy Bypass -File scripts/configure_ollama_rtx3060.ps1

Model check:

powershell -ExecutionPolicy Bypass -File scripts/check_models.ps1

Model doctor:

python localpilot.py --model-doctor

System doctor:

python localpilot.py --system-doctor

Doctor alias:

python localpilot.py --doctor

Benchmark:

powershell -ExecutionPolicy Bypass -File scripts/benchmark_models.ps1

Gemma 4 comparison (entered at the LocalPilot prompt):

model compare gemma4

Required Ollama Models

  • Main reasoning / chat role: gemma4:31b
  • Coder role: qwen2.5-coder:14b-instruct-q3_K_M
  • Coder fallback: qwen2.5-coder:7b
  • Vision role: gemma4:31b
  • Router role: granite3.3:2b
  • Embedding role: nomic-embed-text

Optional slow quality mode:

ollama pull qwen3:30b

LocalPilot keeps coding and routing specialized by default, while this machine now uses gemma4:31b for both main reasoning and vision:

  • gemma4:31b handles default planning, chat, and everyday reasoning on this machine
  • qwen2.5-coder:14b-instruct-q3_K_M handles coding and app generation
  • qwen2.5-coder:7b is the automatic coder fallback if the 14B coder model is missing
  • gemma4:31b is also used for screenshots and visual inspection
  • granite3.3:2b is reserved for fast routing experiments
  • nomic-embed-text is reserved for future local memory search
  • qwen3:30b remains available as an optional slow high-quality mode and is not the default
  • gemma4:e4b is available as an optional faster Gemma comparison model
  • gemma4:latest is available as an optional quality comparison model if it runs well on this machine
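
The coder fallback rule above can be sketched as follows. This is an assumption about the selection logic, not LocalPilot's actual code; `pick_coder_model` is a hypothetical helper, and only the two model tags come from this README.

```python
from typing import Iterable, Optional

CODER_PRIMARY = "qwen2.5-coder:14b-instruct-q3_K_M"
CODER_FALLBACK = "qwen2.5-coder:7b"


def pick_coder_model(installed: Iterable[str]) -> Optional[str]:
    """Prefer the 14B coder tag, fall back to 7B, else report nothing usable."""
    available = set(installed)
    if CODER_PRIMARY in available:
        return CODER_PRIMARY
    if CODER_FALLBACK in available:
        return CODER_FALLBACK
    return None  # Caller should surface a "coder model missing" message.


if __name__ == "__main__":
    print(pick_coder_model(["qwen2.5-coder:7b", "granite3.3:2b"]))
```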

LocalPilot expects a reachable Ollama API, typically at http://127.0.0.1:11434.
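
To check reachability by hand, a small script can query Ollama's `/api/tags` endpoint, which lists installed models. This is a sketch, not LocalPilot's own client; the helper names are hypothetical and the base URL is the typical default mentioned above.

```python
import json
import urllib.request
from typing import List, Optional


def parse_model_names(payload: bytes) -> List[str]:
    """Extract model names from an /api/tags response body."""
    data = json.loads(payload)
    return [model["name"] for model in data.get("models", [])]


def list_ollama_models(
    base_url: str = "http://127.0.0.1:11434", timeout: float = 3.0
) -> Optional[List[str]]:
    """Return installed model names, or None if the API is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as response:
            return parse_model_names(response.read())
    except (OSError, ValueError):
        # OSError covers connection failures and timeouts; ValueError covers bad JSON.
        return None


if __name__ == "__main__":
    models = list_ollama_models()
    print("Ollama unreachable" if models is None else models)
```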

Runtime memory stays local by default. memory/notes.md and memory/learned_facts.json are treated as runtime data, not tracked repo content. If you want to keep or share memory intentionally later, export it into docs/ or a future exports/ folder instead of autosyncing the live memory files.

For this PC, a shorter Ollama keep-alive is recommended so one large model does not stay loaded too long:

[Environment]::SetEnvironmentVariable("OLLAMA_KEEP_ALIVE", "2m", "User")

Recommended order on this RTX 3060 machine:

  1. Install recommended models.
  2. Run scripts/configure_ollama_rtx3060.ps1.
  3. Fully quit and restart Ollama.
  4. Run scripts/check_models.ps1.
  5. Run scripts/benchmark_models.ps1.
  6. Run LocalPilot and ask for model benchmark.

Performance notes:

  • The old 4096 context default is not reliable for LM Studio agent follow-ups.
  • Use 8192 at minimum for the planner model, and prefer 16384.
  • Only one heavy model should stay loaded at a time.
  • qwen3:30b is slow quality mode only and should not be the everyday default.

Troubleshooting Models

If models show missing after restart:

  1. Run ollama list.
  2. Run python localpilot.py --model-doctor.
  3. Check whether OLLAMA_MODELS changed.
  4. Rerun powershell -ExecutionPolicy Bypass -File scripts/install_recommended_models.ps1.
  5. Fully quit and restart Ollama.

LocalPilot will report similar installed models when the exact configured tag is missing. For example, qwen2.5-coder:14b is not treated as the same thing as qwen2.5-coder:14b-instruct-q3_K_M; it is only reported as a likely nearby tag.
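
One way to picture that behavior is the sketch below, which treats tags sharing the same name before the colon as "nearby". The matching rule and the `check_model` helper are assumptions for illustration; the model doctor may use a different heuristic.

```python
from typing import Dict, List


def model_family(tag: str) -> str:
    """Return the part of an Ollama tag before the first colon."""
    return tag.split(":", 1)[0]


def check_model(configured: str, installed: List[str]) -> Dict[str, object]:
    """Exact tag match is required; same-family tags are only reported as hints."""
    if configured in installed:
        return {"status": "ok", "nearby": []}
    nearby = [tag for tag in installed if model_family(tag) == model_family(configured)]
    return {"status": "missing", "nearby": nearby}


if __name__ == "__main__":
    print(check_model(
        "qwen2.5-coder:14b-instruct-q3_K_M",
        ["qwen2.5-coder:14b", "granite3.3:2b"],
    ))
```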

Troubleshooting Desktop Observation

If desktop observation shows dependency_missing for UI Automation:

  1. Run python localpilot.py --system-doctor.
  2. Install the missing dependency into the LocalPilot virtual environment:
.\.venv\Scripts\python.exe -m pip install uiautomation
  3. If more than one package is missing, reinstall the full requirements set:
.\.venv\Scripts\python.exe -m pip install -r requirements.txt
  4. Restart LocalPilot using Run LocalPilot.bat.

Run LocalPilot.bat is already configured to use .venv\Scripts\python.exe, so dependency fixes should be installed into that same environment.

OCR Support

LocalPilot uses OCR only as a support layer for page understanding and confidence scoring. It does not replace UI Automation safety checks, and OCR text alone is never enough to authorize clicking or typing.

OCR is used to help answer:

  • what text is visible on screen
  • what buttons, labels, and fields can be read
  • whether OCR agrees with UI Automation and screenshot vision
  • whether OCR should slightly raise or lower confidence
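
The "support signal" role described above could look roughly like this. The step size, threshold, and function names are hypothetical; the point of the sketch is that OCR only nudges a score, while UI Automation confirmation stays mandatory.

```python
def adjust_confidence(base: float, ocr_text: str, expected_label: str, step: float = 0.05) -> float:
    """Nudge confidence up if OCR sees the expected label, down otherwise."""
    agrees = expected_label.lower() in ocr_text.lower()
    adjusted = base + step if agrees else base - step
    return max(0.0, min(1.0, adjusted))  # Keep the score inside [0, 1].


def may_act(uia_confirmed: bool, confidence: float, threshold: float = 0.8) -> bool:
    """OCR can shift confidence, but it can never replace the UI Automation check."""
    return uia_confirmed and confidence >= threshold


if __name__ == "__main__":
    score = adjust_confidence(0.78, "Save   Cancel", "Save")
    print(may_act(uia_confirmed=True, confidence=score))
    print(may_act(uia_confirmed=False, confidence=1.0))
```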

Install OCR dependencies:

.\.venv\Scripts\python.exe -m pip install pytesseract

Then install Tesseract OCR for Windows and make sure tesseract.exe is on PATH, or install it in:

C:\Program Files\Tesseract-OCR\tesseract.exe
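
The lookup order described above (PATH first, then the default install folder) can be sketched like this. `find_tesseract` is a hypothetical helper, not LocalPilot's own code; the injectable `which` and `exists` parameters are there only to make the logic testable.

```python
import os
import shutil
from typing import Callable, Optional

DEFAULT_TESSERACT = r"C:\Program Files\Tesseract-OCR\tesseract.exe"


def find_tesseract(
    which: Callable[[str], Optional[str]] = shutil.which,
    exists: Callable[[str], bool] = os.path.isfile,
) -> Optional[str]:
    """Return the tesseract executable path, or None if it cannot be found."""
    on_path = which("tesseract")
    if on_path:
        return on_path
    if exists(DEFAULT_TESSERACT):
        return DEFAULT_TESSERACT
    return None


if __name__ == "__main__":
    print(find_tesseract() or "Tesseract not found")
```

If the executable is found outside PATH, pytesseract can be pointed at it by assigning the path to `pytesseract.pytesseract.tesseract_cmd`.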

Test OCR with:

python localpilot.py --doctor

and inside LocalPilot:

ocr screenshot

Generated OCR-preprocessed images are written under workspace/debug_views and stay local because workspace/* is ignored by Git.

How To Run

cd C:\LocalPilot
python localpilot.py

By default LocalPilot starts the CLI and tries to open the GUI alongside it. If Tkinter is unavailable, the CLI still runs.
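
A minimal sketch of that fallback, assuming the GUI is only attempted when Tkinter imports cleanly. `gui_available` is a hypothetical name, and a successful import does not guarantee that a window can actually be opened.

```python
import importlib


def gui_available(module_name: str = "tkinter") -> bool:
    """Return True if the GUI toolkit can be imported in this interpreter."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    if gui_available():
        print("Starting CLI with GUI")
    else:
        print("Tkinter unavailable, starting CLI only")
```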

Puppeteer-Controlled Browser

Install the browser bridge dependencies:

npm install --prefix browser

If LocalPilot does not find Chrome or Edge automatically, set a browser path explicitly:

$env:LOCALPILOT_BROWSER_EXECUTABLE="C:\Program Files\Google\Chrome\Application\chrome.exe"
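
How the override might be resolved is sketched below. Only the environment variable name and the Chrome path come from this README; the Edge path, the candidate order, and the `resolve_browser` helper are assumptions for illustration.

```python
import os
from typing import Callable, Mapping, Optional

ENV_VAR = "LOCALPILOT_BROWSER_EXECUTABLE"
# Assumed default install locations; not taken from LocalPilot's source.
CANDIDATES = (
    r"C:\Program Files\Google\Chrome\Application\chrome.exe",
    r"C:\Program Files (x86)\Microsoft\Edge\Application\msedge.exe",
)


def resolve_browser(
    env: Optional[Mapping[str, str]] = None,
    exists: Callable[[str], bool] = os.path.isfile,
) -> Optional[str]:
    """Prefer the explicit override, then fall back to known install paths."""
    env = os.environ if env is None else env
    override = env.get(ENV_VAR)
    if override:
        # A bad override returns None instead of silently picking another browser.
        return override if exists(override) else None
    for path in CANDIDATES:
        if exists(path):
            return path
    return None


if __name__ == "__main__":
    print(resolve_browser() or "No browser found")
```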

Then run the AI-driven agent CLI:

python localpilot.py --agent-cli

For websites, the agent prefers Puppeteer-controlled browser tools over desktop mouse tools.

LM Studio Agent Planner

LocalPilot's AI-driven agent planner is intended to run through LM Studio with:

  • Planner model: qwen2.5-coder-14b-instruct
  • Vision model: qwen3-vl-8b-instruct
  • Recommended planner context: 16384
  • Minimum usable planner context: 8192
  • Bad for follow-ups: 4096
  • Recommended planner concurrency: Max Concurrent Predictions = 1

Recommended LM Studio settings for the planner model:

Context Length: 16384
Max Concurrent Predictions: 1

If LocalPilot warns that the planner context is too small, increase the LM Studio context length before testing multi-step follow-ups like continue, what happened, or screenshot and browser clarifications.
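
The thresholds above translate into a simple check. This sketch only encodes the numbers from this section; the function name and status strings are hypothetical, and how LocalPilot actually reads the context length from LM Studio is not shown.

```python
MIN_PLANNER_CONTEXT = 8192
RECOMMENDED_PLANNER_CONTEXT = 16384


def planner_context_status(context_length: int) -> str:
    """Classify a planner context length against the documented thresholds."""
    if context_length < MIN_PLANNER_CONTEXT:
        return "too_small"  # e.g. 4096: unreliable for multi-step follow-ups
    if context_length < RECOMMENDED_PLANNER_CONTEXT:
        return "usable"
    return "recommended"


if __name__ == "__main__":
    for size in (4096, 8192, 16384):
        print(size, planner_context_status(size))
```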

Local RAG Memory

LocalPilot can build a local RAG index for project knowledge and prior session summaries.

  • Embedding endpoint: POST http://localhost:1234/v1/embeddings
  • Embedding model: text-embedding-nomic-embed-text-v1.5
  • Storage: memory/rag/
  • Default indexed sources: app/, tests/, .pilotrules, README.md, config/settings.json, memory/rag/memory_bank.md, and session summaries from memory/sessions/

To use it through the AI agent:

Index the LocalPilot workspace for memory.
What files implement the timer tool and what do they do?
What model is the planner supposed to use?

If LM Studio embeddings are unavailable, LocalPilot returns a real error instead of pretending the index is ready.
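
For reference, a request to the embedding endpoint listed above might look like the sketch below. It assumes LM Studio's OpenAI-compatible request and response shape (`model` and `input` in, `data[i].embedding` out); the helper names and `EmbeddingError` are hypothetical and are not LocalPilot's own classes.

```python
import json
import urllib.request
from typing import List

EMBEDDINGS_URL = "http://localhost:1234/v1/embeddings"
EMBEDDING_MODEL = "text-embedding-nomic-embed-text-v1.5"


class EmbeddingError(RuntimeError):
    """Raised when the embedding endpoint cannot be used."""


def build_request(texts: List[str]) -> urllib.request.Request:
    """Build the POST request for a batch of texts."""
    body = json.dumps({"model": EMBEDDING_MODEL, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        EMBEDDINGS_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_embeddings(payload: bytes) -> List[List[float]]:
    """Pull one vector per input text out of the response body."""
    data = json.loads(payload)
    return [item["embedding"] for item in data["data"]]


def embed(texts: List[str], timeout: float = 30.0) -> List[List[float]]:
    """Return embeddings, or raise a real error instead of returning placeholders."""
    try:
        with urllib.request.urlopen(build_request(texts), timeout=timeout) as response:
            return parse_embeddings(response.read())
    except (OSError, ValueError, KeyError) as exc:
        raise EmbeddingError(f"LM Studio embeddings unavailable: {exc}") from exc
```

Raising on failure, rather than returning an empty list, matches the behavior described above: an index is never reported as ready when no vectors were produced.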

For double-click launch on Windows, use Run LocalPilot.bat. It will:

  • create .venv automatically on first run if it does not exist
  • install requirements.txt into that virtual environment
  • use .venv\Scripts\python.exe for all normal launches
  • keep the window open if startup fails so the error stays visible

Safety Rules

  • File overwrite requires approval.
  • File move into an existing target requires approval.
  • Shell commands require approval.
  • Dangerous shell commands are blocked by default.
  • Desktop click, type, and hotkey actions require approval.
  • Windows UI Automation is preferred over screenshots.
  • Vision is only used when desktop inspection needs it.
  • OCR is only used as a support signal for page understanding and confidence scoring.
  • The assistant should not rewrite its own core logic without user approval.
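
The shell rules above (block dangerous commands, require approval for the rest) can be sketched as a small gate. The token list and function names are hypothetical and far from exhaustive; app/safety.py may implement this differently.

```python
from typing import Callable

# Illustrative blocklist only; a real list would need to be broader.
DANGEROUS_TOKENS = ("format ", "del /s", "rd /s", "rm -rf", "shutdown")


def is_dangerous(command: str) -> bool:
    """Return True if the command contains a blocked token."""
    lowered = command.lower()
    return any(token in lowered for token in DANGEROUS_TOKENS)


def gate_shell_command(command: str, approve: Callable[[str], bool]) -> str:
    """Block dangerous commands outright; everything else needs explicit approval."""
    if is_dangerous(command):
        return "blocked"  # The approval callback is never consulted.
    return "approved" if approve(command) else "denied"


if __name__ == "__main__":
    ask = lambda cmd: input(f"Run '{cmd}'? [y/N] ").strip().lower() == "y"
    print(gate_shell_command("dir", ask))
```

Checking the blocklist before asking for approval means a user cannot approve a blocked command by accident.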

Project Structure

C:\LocalPilot
  README.md
  .gitignore
  requirements.txt
  localpilot.py
  config\
    settings.json
    model_profiles.json
    capabilities.json
  app\
    __init__.py
    main.py
    router.py
    safety.py
    memory.py
    logger.py
    rag\
      __init__.py
      chunker.py
      embeddings.py
      indexer.py
      retriever.py
      sources.py
      vector_store.py
    llm\
      __init__.py
      ollama_client.py
      prompts.py
    tools\
      __init__.py
      files.py
      shell.py
      web.py
      screen.py
      mouse_keyboard.py
      windows_ui.py
      ocr.py
    modes\
      __init__.py
      chat_mode.py
      code_mode.py
      research_mode.py
      desktop_mode.py
  memory\
    notes.md
    learned_facts.json
  logs\
    .gitkeep
  workspace\
    .gitkeep
  tests\
    test_safety.py
    test_files.py
    test_router.py
    test_memory.py

What Works In v1

  • Terminal chat loop
  • GUI status window with role and mode activity
  • Keyword routing
  • File listing, read, write, append, mkdir, copy, move
  • Shell execution with approval and dangerous-command blocking
  • Web search with capped results
  • Notes save/search/show
  • Screenshot capture and mouse position
  • Basic active window and UI Automation inspection
  • Vision entrypoint placeholder with graceful failure path

Current TODO

  • More reliable natural-language parsing for code and desktop commands
  • Richer GUI controls for approvals and history filtering
  • Stronger Windows UI Automation control actions
  • Better structured learned facts updates
  • Keep multimodal verification strong for the configured vision role
  • Add stronger OCR backends later if Tesseract is not enough
  • Add Whisper.cpp later for voice input

First Roadmap

  1. Stabilize the existing tool interfaces and tests.
  2. Improve command parsing without turning the app into one giant loop.
  3. Add better memory indexing while staying file-based.
  4. Expand Windows UI Automation read-only inspection.
  5. Make vision analysis production-ready once the local Ollama multimodal path is verified.

GitHub

Target repository: https://github.com/Code4life69/LocalPilot

If gh is not installed, create the repository manually on GitHub as a public repo named LocalPilot, then run:

cd C:\LocalPilot
git init
git add .
git commit -m "Initial LocalPilot starter"
git branch -M main
git remote add origin https://github.com/Code4life69/LocalPilot.git
git push -u origin main
