Problem
Profile creation currently works via a heavyweight subprocess chain:
FastAPI → subprocess: gemini CLI (Node.js) → MCP protocol → FastMCP server → Meticulous API
This introduces multiple failure points:
- The Gemini CLI (`@google/gemini-cli`) is a Node.js app requiring Node + npm in the container
- An MCP server (s6-managed Python process) must be running for the CLI to call tools
- Lightweight / non-standard distros (e.g. Puppy Linux) have had issues where the CLI fails silently, breaking all AI features
- The subprocess spawn + MCP handshake adds latency to every profile creation
Meanwhile, image analysis (`/analyze_coffee`) already uses the Python Gemini SDK (`google-generativeai`) directly and works reliably.
Proposed Solution
Replace the Gemini CLI subprocess with the Python Gemini SDK's native function-calling / tool-use API. The new flow:
FastAPI → Python Gemini SDK (tool declarations) → direct HTTP to Meticulous API
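As a hedged sketch of what a native tool declaration could look like (the function body and the field names are illustrative, not the project's actual OEPF schema): the SDK accepts plain typed Python functions as tools and derives the declaration from the signature and docstring.

```python
# Sketch: a typed Python function the Gemini SDK can register as a tool.
# Field names below are illustrative, not the real OEPF schema.

def create_profile(name: str, temperature_c: float, target_weight_g: float) -> dict:
    """Create an espresso profile on the Meticulous machine.

    Args:
        name: Human-readable profile name.
        temperature_c: Brew temperature in Celsius.
        target_weight_g: Target output weight in grams.
    """
    # In the real implementation this would call the Meticulous API
    # (pyMeticulous/httpx); here it just returns the structured payload.
    return {
        "name": name,
        "temperature": temperature_c,
        "stop_weight": target_weight_g,
    }

# With the SDK installed, registration is one line (not run here):
# model = genai.GenerativeModel("gemini-1.5-flash", tools=[create_profile])
```

The model then responds with structured tool-call requests (name + arguments) instead of free text, which the server executes directly.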
Implementation Steps
- Define Gemini tool declarations in Python for `create_profile` (and optionally `list_profiles`, `run_profile`, `validate_profile`). The SDK supports function-calling natively — declare tools as Python functions with type hints, and the model returns structured tool-call requests.
- Implement a tool-call loop in `coffee.py`: send the prompt to Gemini with tool declarations → receive a tool-call request → execute it (call the Meticulous machine API directly via `pyMeticulous`/`httpx`) → send the result back → get the final response. Typically 1-2 tool calls per profile.
- Import profile validation/building logic from the MCP server's `profile_builder.py` and `profile_validator.py` directly into the FastAPI server (they are pure Python, with no MCP dependency).
- Remove the Gemini CLI dependency from `Dockerfile.unified` (Node.js, npm, `@google/gemini-cli`). Optionally keep the MCP server for external integrations (Claude Desktop, etc.), but it is no longer required for core functionality.
- Remove or demote the MCP s6 service — it can become optional rather than a hard dependency.
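The tool-call loop from the steps above can be sketched as follows. The Gemini call is stubbed behind a minimal `model` interface so the control flow is visible without API access; in the real SDK the tool call arrives as a `function_call` part on the response, and names like `TOOLS` and `run_tool_loop` are hypothetical.

```python
# Sketch of the manual tool-call loop that replaces the CLI's auto-looping.
# The SDK transport is abstracted behind `model`; TOOLS is an illustrative registry.
from dataclasses import dataclass

@dataclass
class ToolCall:
    name: str
    args: dict

# Registry mapping tool names to executors; real entries would hit the
# Meticulous API via pyMeticulous/httpx.
TOOLS = {
    "create_profile": lambda args: {"ok": True, "profile": args["name"]},
}

def run_tool_loop(model, prompt: str, max_rounds: int = 5) -> str:
    """Send the prompt, execute any tool calls the model requests,
    and return the model's final text answer."""
    response = model.send(prompt)
    for _ in range(max_rounds):
        call = getattr(response, "tool_call", None)
        if call is None:
            return response.text              # model produced a final answer
        result = TOOLS[call.name](call.args)  # execute the requested tool
        response = model.send_tool_result(call.name, result)
    raise RuntimeError("tool loop did not converge")
```

Bounding the loop with `max_rounds` keeps a confused model from looping forever, which the CLI's `-y` mode handled implicitly.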
Benefits
| Aspect | Current (CLI) | Proposed (SDK) |
| --- | --- | --- |
| Architecture | 3 processes (FastAPI + CLI subprocess + MCP server) | 1 process (FastAPI only) |
| Container deps | Python + Node.js + npm | Python only |
| Image size | ~100 MB larger due to Node.js/npm | Smaller |
| Memory footprint | Higher (Node.js runtime per invocation) | Lower (important for RPi / Puppy Linux) |
| Latency | Subprocess spawn + MCP handshake overhead | Direct SDK call |
| Failure modes | CLI binary missing, MCP server down, s6 restart race | Single code path, same as image analysis |
| API key handling | Env var must reach CLI + SDK separately | One SDK client for all AI features |
Risks / Considerations
- The Gemini CLI's `-y` (yolo) mode auto-loops tool calls. We need our own loop (straightforward).
- OEPF schema validation currently runs inside the MCP server; it needs to be imported into the FastAPI server.
- The MCP server is still useful for external tool integrations (Claude Desktop, Cursor, etc.) — keep it as an optional service, not a hard requirement for profile creation.
- Ensure backward compatibility: the `/analyze_and_profile` endpoint's response format must stay the same.
Acceptance Criteria