Refactor to production-grade Jarvis AI assistant with FastAPI backend #1
Open
askmy-stack wants to merge 3 commits into
Documents project structure, tech stack, setup instructions, key conventions, known issues, and LangChain patterns for use by AI coding assistants. https://claude.ai/code/session_01BsGXyhfA6BWuSjaA23KmET
Complete rewrite implementing the full modernization roadmap:

Architecture:
- FastAPI async backend (backend/main.py) with /chat/stream SSE endpoint
- Streamlit frontend (app.py) calls backend, renders streaming tokens in real time
- LangChain v0.3 tool-calling agent (backend/agent.py) with astream_events v2
- ChromaDB persistent vector memory (backend/memory.py); survives restarts
- pydantic-settings typed config (backend/config.py); fails fast on missing keys

Four Jarvis tools:
- get_system_info: psutil CPU/memory/disk (no API key)
- web_search: DuckDuckGo (no API key)
- get_weather: Open-Meteo free API (no API key)
- control_device: Home Assistant REST API (optional, allow-listed)

Production readiness:
- .gitignore, .env.example, pyproject.toml with uv support
- Dockerfile + docker-compose.yml (one-command deploy, chroma_db volume)
- GitHub Actions CI: ruff lint, mypy type-check, pytest, gitleaks secret scan
- structlog structured logging throughout
- LangSmith tracing support (opt-in via LANGCHAIN_TRACING_V2=true)

Tests:
- tests/test_tools.py: tool unit tests, allow-list enforcement, no LLM calls
- tests/test_api.py: FastAPI endpoint tests with mocked agent
- tests/conftest.py: fake API key fixture for CI

Fixes all pre-existing bugs:
- Logic flaw: model no longer called on every keystroke
- Deprecated langchain.chat_models / langchain.llms imports updated to v0.3
- Deprecated chat(messages) call replaced with astream_events
- input builtin no longer shadowed

https://claude.ai/code/session_01BsGXyhfA6BWuSjaA23KmET
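The commit message says control_device is allow-listed and that tests/test_tools.py checks allow-list enforcement. The diff itself is not shown here, so the following is only a minimal sketch of what such a check could look like. The entity IDs, the action names, and the names validate_command and DeviceCommand are all hypothetical, and no Home Assistant call is made.

```python
from dataclasses import dataclass

# Hypothetical allow-lists; the PR text does not show the real values.
ALLOWED_ENTITIES = frozenset({"light.living_room", "switch.desk_lamp"})
ALLOWED_ACTIONS = frozenset({"turn_on", "turn_off"})


@dataclass(frozen=True)
class DeviceCommand:
    """A command that has already passed the allow-list check."""
    entity_id: str
    action: str


def validate_command(entity_id: str, action: str) -> DeviceCommand:
    """Return a command only if both entity and action are allow-listed."""
    if entity_id not in ALLOWED_ENTITIES:
        raise PermissionError(f"entity not allow-listed: {entity_id}")
    if action not in ALLOWED_ACTIONS:
        raise PermissionError(f"action not allow-listed: {action}")
    return DeviceCommand(entity_id=entity_id, action=action)
```

Rejecting before any network call means a unit test can exercise the check without a running Home Assistant instance or an LLM, which matches the "no LLM calls" note above.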
Documents:
- What has been completed (architecture redesign, tools, streaming, memory, etc.)
- What remains to do (revoke API keys, set up .env, run tests, deploy)
- Step-by-step execution for local development, Docker deployment, and CI/CD
- Architecture explained in simple terms
- Troubleshooting common issues
- Complete checklist for deployment

https://claude.ai/code/session_01BsGXyhfA6BWuSjaA23KmET
Summary
Completely refactored the codebase from a simple Streamlit chatbot into a production-grade personal AI assistant called Jarvis. The new architecture separates concerns into a FastAPI backend with a tool-calling LangChain agent and a Streamlit frontend that streams responses token-by-token.
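To make the token-by-token streaming concrete, here is a rough, standard-library-only sketch of how tokens can be framed as Server-Sent Events. It is not the PR's code: fake_token_stream stands in for the agent, and the JSON payload shape and the [DONE] end marker are assumptions made for illustration.

```python
import asyncio
import json
from collections.abc import AsyncIterator


async def fake_token_stream(text: str) -> AsyncIterator[str]:
    """Stand-in for the agent: yields one whitespace-delimited token at a time."""
    for token in text.split():
        await asyncio.sleep(0)  # hand control back to the event loop between tokens
        yield token


async def sse_frames(tokens: AsyncIterator[str]) -> AsyncIterator[str]:
    """Wrap each token in an SSE frame: a 'data:' line followed by a blank line."""
    async for token in tokens:
        yield f"data: {json.dumps({'token': token})}\n\n"
    yield "data: [DONE]\n\n"  # hypothetical end-of-stream marker


async def collect(text: str) -> list[str]:
    """Gather all frames into a list (a real endpoint would stream them instead)."""
    return [frame async for frame in sse_frames(fake_token_stream(text))]
```

In a FastAPI backend, a generator like sse_frames would typically be passed to a streaming response with media type text/event-stream, and the frontend would append each token to the displayed message as it arrives.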
Key Changes
- Backend Architecture: Created a FastAPI backend (backend/main.py) that exposes /chat/stream for streaming responses, /health for liveness checks, and session management endpoints
- LangChain Agent: Implemented a tool-calling agent (backend/agent.py) using gpt-4o-mini with four integrated tools: get_system_info, web_search, get_weather, and control_device
- Vector Memory: Added persistent ChromaDB-backed memory (backend/memory.py) that stores conversation exchanges and retrieves semantically similar past interactions to inject context into the agent
- Streaming: Implemented token-by-token streaming via AgentExecutor.astream_events(version="v2") so users see text appearing in real time as the LLM generates it
- Configuration: Introduced typed, validated settings via pydantic-settings (backend/config.py) that are read from .env with sensible defaults
- Streamlit UI: Completely rewrote app.py to call the backend and render streaming tokens in real time
- Testing: Added comprehensive test suite:
  - tests/test_tools.py: unit tests for all tools (no LLM calls)
  - tests/test_api.py: FastAPI endpoint tests with mocked agent
  - tests/conftest.py: shared fixtures for CI compatibility
- DevOps & Documentation:
  - Dockerfile and docker-compose.yml for one-command deployment
  - README.md with architecture diagram, quick start, and configuration guide
  - CLAUDE.md for AI assistant guidance on the codebase
  - GitHub Actions CI workflow (ci.yml) for linting, type-checking, testing, and secret scanning
  - .gitignore and .env.example
- Project Metadata: Created pyproject.toml with all dependencies, dev extras, and tool configurations (ruff, mypy, pytest)

Notable Implementation Details
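As an illustration of the fail-fast configuration described under Key Changes, the sketch below loads typed settings from an environment mapping and raises immediately when a required key is missing. It deliberately uses only the standard library as a stand-in for pydantic-settings, so it is not the contents of backend/config.py. The variable names OPENAI_API_KEY, MODEL_NAME, and CHROMA_DIR, and the ./chroma_db default, are assumptions.

```python
import os
from collections.abc import Mapping
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Typed settings; field names are hypothetical, not taken from the PR."""
    openai_api_key: str
    model_name: str = "gpt-4o-mini"
    chroma_dir: str = "./chroma_db"


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from env, failing fast if the required key is absent."""
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        # Fail at startup rather than on the first model call.
        raise RuntimeError("OPENAI_API_KEY is required but not set")
    return Settings(
        openai_api_key=key,
        model_name=env.get("MODEL_NAME", "gpt-4o-mini"),
        chroma_dir=env.get("CHROMA_DIR", "./chroma_db"),
    )
```

Passing the environment in as a mapping keeps the loader easy to test with a fake key, in the same spirit as the fake API key fixture in tests/conftest.py.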
https://claude.ai/code/session_01BsGXyhfA6BWuSjaA23KmET