🛠️ Autonomous Coding Assistant

An agentic debugging tool built on Claude's native tool use. You give it a buggy Python file and a plain-English description of the bug. It then autonomously reads the code, searches documentation when it's stuck, writes a fix, runs the test suite, reads the failures, and iterates, looping until the tests pass or it reaches its limit of 5 test cycles. Crucially, Claude decides which tool to call next based on each result; nothing in the code hardcodes a fixed read→fix→test sequence. That open-ended, result-driven control loop over a real tool surface (code execution, web search, file I/O, pytest) is what makes this a genuine agent rather than a single prompt, and it's the part worth showing an employer.

How the agent loop works

                         ┌──────────────────────────────────────┐
   buggy file +          │                Streamlit UI           │
   bug description ─────▶ │  upload · live log · diff · results   │
                         └───────────────────┬──────────────────┘
                                             │ background thread
                                             ▼
        ┌─────────────────────── agent loop (agent.py) ───────────────────────┐
        │                                                                      │
        │   ┌─────────────┐  tool_use   ┌──────────────────────────────────┐  │
        │   │   Claude    │ ──────────▶ │  dispatch tool (tools.py)        │  │
        │   │ (Sonnet 4.6)│             │  • execute_code   ┐              │  │
        │   │  + adaptive │             │  • web_search     │→ sandbox.py  │  │
        │   │   thinking  │ ◀────────── │  • read_write_file│  (Docker or  │  │
        │   └─────────────┘ tool_result │  • run_tests      ┘   subprocess)│  │
        │          │                    └──────────────────────────────────┘  │
        │          │ tests pass? ── no ──▶ iterate (up to 5 test cycles)       │
        │          ▼                                                           │
        │     tests pass ✓  /  max iterations reached  /  stopped by user      │
        └──────────────────────────────────────────────────────────────────────┘

Each turn, Claude is given the four tool schemas (tool_schemas.py) and decides what to do. The loop intercepts every tool_use block, executes it, logs it, and feeds the tool_result back, repeating until Claude stops calling tools, the tests go green, the 5-cycle limit is reached, or the user stops the run.

Project layout

File	Role
`app.py`	Streamlit UI: setup wizard, inputs, live log, diff/result panel
`agent.py`	The Claude tool-use loop + iteration logic
`tools.py`	The four tool implementations + dispatcher
`tool_schemas.py`	JSON schemas passed to Claude's `tools` parameter
`sandbox.py`	Docker / subprocess execution backend
`Dockerfile`	The `coding-agent-sandbox` image (pytest preinstalled)
`.env.example`	Required env vars with placeholder values
`requirements.txt`	Pinned dependencies
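To make the split between tool_schemas.py and tools.py concrete, here is a rough sketch of what one schema and a dispatcher might look like. The `name` / `description` / `input_schema` shape is the format the Anthropic API expects for tool definitions; the `test_path` parameter, the handler body, and the error-string convention are assumptions for illustration and may differ from the actual files.

```python
# Hypothetical schema for one of the four tools.
RUN_TESTS_SCHEMA = {
    "name": "run_tests",
    "description": "Run the pytest suite against the current version of the file.",
    "input_schema": {
        "type": "object",
        "properties": {
            "test_path": {"type": "string", "description": "Path to the test file."},
        },
        "required": ["test_path"],
    },
}


def _run_tests(test_path: str) -> str:
    # Placeholder body; a real handler would hand off to the sandbox backend.
    return f"(would run pytest on {test_path})"


# Tool name -> handler. The other three tools would be registered the same way.
TOOL_HANDLERS = {"run_tests": _run_tests}


def dispatch_tool(name: str, tool_input: dict) -> str:
    """Route a tool_use request to its handler and always return text."""
    handler = TOOL_HANDLERS.get(name)
    if handler is None:
        return f"error: unknown tool {name!r}"
    try:
        return handler(**tool_input)
    except TypeError as exc:
        # Covers missing or unexpected arguments (and, in this simple sketch,
        # any TypeError raised inside the handler itself).
        return f"error: bad arguments for {name}: {exc}"
```

Returning errors as text rather than raising lets the model read the failure in the next tool_result and try something else, which fits the result-driven loop this project is built around.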

Setup

1. Get API keys

Anthropic — sign in at https://console.anthropic.com/, open API Keys, and create a key (starts with sk-ant-).
Tavily (web search) — sign up at https://tavily.com/ and copy your key from the dashboard (starts with tvly-). The free tier is plenty for a demo.

2. Install

python -m venv .venv
# Windows (PowerShell):  .venv\Scripts\Activate.ps1
# macOS/Linux:           source .venv/bin/activate
pip install -r requirements.txt

3. (Recommended) Build the sandbox image

docker build -t coding-agent-sandbox .

This builds the image the app uses to run code/tests in an isolated, network-disabled container. You can sanity-check it directly:

docker run --rm coding-agent-sandbox python -c "import pytest; print('sandbox OK')"

Docker is optional but strongly recommended. Without it, the app falls back to running code in a plain subprocess on your machine (see the security notice).
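The Docker-or-subprocess choice could be sketched along these lines. This is an illustration under stated assumptions, not the contents of sandbox.py: the function name, the `/work` mount point, and the decision to detect Docker by looking for the CLI on PATH are all hypothetical. The Docker flags themselves (`--rm`, `--network none`, `-v`, `-w`) are standard `docker run` options.

```python
import shutil
import sys
from typing import Optional

SANDBOX_IMAGE = "coding-agent-sandbox"  # image name from the build step above


def build_run_command(
    workspace: str, script: str, docker_path: Optional[str]
) -> tuple:
    """Return (backend_name, argv) for running `script` from `workspace`."""
    if docker_path:
        return "docker", [
            docker_path, "run", "--rm",   # ephemeral: container is removed on exit
            "--network", "none",          # no network access inside the container
            "-v", f"{workspace}:/work",   # expose only the session workspace
            "-w", "/work",
            SANDBOX_IMAGE, "python", script,
        ]
    # Fallback: same interpreter as the app, no isolation at all.
    return "subprocess", [sys.executable, script]


# shutil.which only confirms the CLI exists, not that the daemon is running;
# a fuller check would also try a cheap docker command.
backend, argv = build_run_command("/tmp/session", "fix.py", shutil.which("docker"))
```

The returned argv can be passed to `subprocess.run(argv, capture_output=True, text=True, timeout=...)`; in the fallback case the caller would also set `cwd` to the workspace.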

4. Run

# With the venv activated:
streamlit run app.py

# Or without activating (Windows):
.venv\Scripts\python.exe -m streamlit run app.py

On first launch you'll see a setup screen: paste your two keys, and they're validated with a live API call and saved to a local .env. You won't see that screen again (use the Reset API keys expander in the sidebar to re-enter them). Then upload or paste a buggy .py file, describe the bug, optionally attach a pytest file, and click Run agent.
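As a rough idea of what saving the keys might involve, the sketch below does a cheap prefix check and writes a .env file. It is not the app's validation: the app validates with a live API call, which this example does not attempt. The variable names `ANTHROPIC_API_KEY` and `TAVILY_API_KEY` are assumptions; check .env.example for the names the app actually reads.

```python
from pathlib import Path

# Assumed variable names; the prefixes come from the "Get API keys" step.
EXPECTED_PREFIXES = {
    "ANTHROPIC_API_KEY": "sk-ant-",
    "TAVILY_API_KEY": "tvly-",
}


def looks_valid(name: str, value: str) -> bool:
    """Cheap format check only; it cannot tell whether a key is live."""
    return value.strip().startswith(EXPECTED_PREFIXES[name])


def save_env(path: Path, keys: dict) -> None:
    """Write KEY=value lines to `path`, refusing obviously malformed keys."""
    bad = [name for name, value in keys.items() if not looks_valid(name, value)]
    if bad:
        raise ValueError(f"unexpected key format for: {', '.join(bad)}")
    path.write_text("".join(f"{name}={value.strip()}\n" for name, value in keys.items()))
```

A prefix check like this catches pasting the wrong key into the wrong field before any network call is made. Keep .env out of version control.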

When the agent finishes, a Download fixed file button appears below the diff so you can save the fixed version — the agent always works on an isolated copy of your file and never modifies the original.
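The "isolated copy plus diff" behavior can be approximated with the standard library alone. The sketch below is an assumption about the general approach, not the app's code; the function names and the `agent-session-` directory prefix are made up for illustration.

```python
import difflib
import shutil
import tempfile
from pathlib import Path


def make_working_copy(original: Path) -> Path:
    """Copy the uploaded file into a fresh temp directory and return the copy."""
    workspace = Path(tempfile.mkdtemp(prefix="agent-session-"))
    return Path(shutil.copy2(original, workspace / original.name))


def render_diff(original: Path, fixed: Path) -> str:
    """Unified diff between the untouched original and the agent's version."""
    before = original.read_text().splitlines(keepends=True)
    after = fixed.read_text().splitlines(keepends=True)
    return "".join(
        difflib.unified_diff(
            before, after,
            fromfile=f"a/{original.name}",
            tofile=f"b/{fixed.name}",
        )
    )
```

Because every write goes to the copy, the original stays byte-for-byte unchanged, and the diff shown in the UI is always computed against what you uploaded.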

⚠️ Security notice

This agent executes code on your computer. When the Docker sandbox is available, code and tests run inside an ephemeral, network-isolated container. When Docker is not available, the app falls back to running code in a plain subprocess directly on your host — with no isolation. The model writes and runs this code autonomously.

The live log shows every tool call (timestamp, tool, inputs, output) — review them, especially in subprocess-fallback mode (the UI flags it in yellow).
Do not point this at sensitive code, secrets, or a machine you can't afford to have arbitrary code run on, without reviewing each tool call first.
File writes are confined to a throwaway per-session workspace, but executed code is only as contained as your backend (Docker = isolated; subprocess = not). Prefer Docker.
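One common way to confine file writes to a workspace is to resolve every requested path and reject anything that lands outside the workspace root. The sketch below shows that technique; it is an assumption about how such confinement could work, and the function name is hypothetical.

```python
from pathlib import Path


def resolve_in_workspace(workspace: Path, requested: str) -> Path:
    """Return the absolute target path, or raise if it escapes the workspace."""
    root = workspace.resolve()
    # resolve() collapses ".." segments and follows symlinks, so the check
    # below runs against the real destination rather than the raw string.
    target = (root / requested).resolve()
    if target != root and root not in target.parents:
        raise PermissionError(f"{requested!r} escapes the session workspace")
    return target
```

With this check, a request for `src/fix.py` resolves inside the workspace, while `../outside.py` or an absolute path elsewhere on disk raises `PermissionError`. Note that this only limits the file tool; code the agent executes can still open any path the backend allows, which is why the Docker backend matters.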

Built with the Anthropic Python SDK, Streamlit, and Tavily.
