Replace Gemini CLI subprocess with Python SDK function-calling #214

@hessius

Description

Problem

Profile creation currently works via a heavyweight subprocess chain:

FastAPI → subprocess: gemini CLI (Node.js) → MCP protocol → FastMCP server → Meticulous API

This introduces multiple failure points:

  • The Gemini CLI (@google/gemini-cli) is a Node.js app requiring Node + npm in the container
  • An MCP server (s6-managed Python process) must be running for the CLI to call tools
  • On lightweight / non-standard distros (e.g. Puppy Linux) the CLI has failed silently, breaking all AI features
  • The subprocess spawn + MCP handshake adds latency to every profile creation

Meanwhile, image analysis (/analyze_coffee) already uses the Python Gemini SDK (google-generativeai) directly and works reliably.

Proposed Solution

Replace the Gemini CLI subprocess with the Python Gemini SDK's native function-calling / tool-use API. The new flow:

FastAPI → Python Gemini SDK (tool declarations) → direct HTTP to Meticulous API
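As a sketch of the proposed flow: the `google-generativeai` SDK can derive tool declarations automatically from plain Python functions with type hints and docstrings. The function name matches step 1 below, but its parameters here are illustrative placeholders, not the actual Meticulous/OEPF schema fields:

```python
def create_profile(name: str, temperature: float, pressure: float) -> dict:
    """Create an espresso profile on the machine.

    Illustrative signature: the real tool would mirror the
    Meticulous API / OEPF schema fields.
    """
    profile = {"name": name, "temperature": temperature, "pressure": pressure}
    # In the real implementation this would POST to the Meticulous API
    # (via pyMeticulous or httpx) and return the API response.
    return profile


def build_model():
    # Deferred import so the sketch reads without the SDK installed.
    import google.generativeai as genai

    # Passing Python callables via `tools=` makes the SDK generate the
    # function declarations from the signatures and docstrings.
    return genai.GenerativeModel("gemini-1.5-flash", tools=[create_profile])
```

With `build_model().start_chat(enable_automatic_function_calling=True)` the SDK can even drive the tool-call loop itself; step 2 below describes the manual version, which gives us control over error handling.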

Implementation Steps

  1. Define Gemini tool declarations in Python for create_profile (and optionally list_profiles, run_profile, validate_profile). The SDK supports function-calling natively — declare tools as Python functions with type hints, and the model returns structured tool-call requests.

  2. Implement a tool-call loop in coffee.py: send prompt to Gemini with tool declarations → receive tool-call request → execute it (call Meticulous machine API directly via pyMeticulous / httpx) → send result back → get final response. Typically 1-2 tool calls per profile.

  3. Import profile validation/building logic from the MCP server's profile_builder.py and profile_validator.py directly into the FastAPI server (they are pure Python, no MCP dependency).

  4. Remove the Gemini CLI dependency from Dockerfile.unified (Node.js, npm, @google/gemini-cli). Optionally keep the MCP server for external integrations (Claude Desktop, etc.) but it is no longer required for core functionality.

  5. Remove or demote the MCP s6 service — it can become optional rather than a hard dependency.
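The tool-call loop from step 2 can be sketched as follows. `FakeChat` stands in for an SDK chat session and the handler is illustrative; the response shape is also simplified (the real SDK surfaces calls as `function_call` parts inside the candidate content):

```python
from dataclasses import dataclass


def create_profile(name: str) -> dict:
    # Illustrative handler; the real one would call the Meticulous
    # machine API directly (pyMeticulous / httpx) after validating the
    # profile against the OEPF schema.
    return {"status": "created", "name": name}


TOOL_HANDLERS = {"create_profile": create_profile}


@dataclass
class FunctionCall:
    name: str
    args: dict


def run_tool_loop(chat, prompt: str, max_rounds: int = 5) -> str:
    """Send a prompt, execute the tool calls the model requests, feed
    the results back, and return the model's final text response."""
    response = chat.send_message(prompt)
    for _ in range(max_rounds):
        calls = getattr(response, "function_calls", [])
        if not calls:
            return response.text  # no more tool calls: final answer
        results = [
            {"name": c.name, "response": TOOL_HANDLERS[c.name](**c.args)}
            for c in calls
        ]
        response = chat.send_message(results)  # hand tool output back
    raise RuntimeError("tool-call loop did not converge")


class FakeChat:
    """Stand-in for an SDK chat session: the first reply requests one
    tool call, the second reply is the final text."""

    def __init__(self):
        self.turn = 0

    def send_message(self, _msg):
        self.turn += 1
        reply = type("Reply", (), {})()
        if self.turn == 1:
            reply.function_calls = [
                FunctionCall("create_profile", {"name": "Gentle Bloom"})
            ]
        else:
            reply.function_calls = []
            reply.text = "Profile 'Gentle Bloom' created."
        return reply
```

Running `run_tool_loop(FakeChat(), "make a gentle profile")` completes after a single tool round, consistent with the "typically 1-2 tool calls per profile" observation above.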

Benefits

| Aspect | Current (CLI) | Proposed (SDK) |
| --- | --- | --- |
| Architecture | 3 processes (FastAPI + CLI subprocess + MCP server) | 1 process (FastAPI only) |
| Container deps | Python + Node.js + npm | Python only |
| Image size | ~100 MB larger due to Node.js/npm | Smaller |
| Memory footprint | Higher (Node.js runtime per invocation) | Lower (important for RPi / Puppy Linux) |
| Latency | Subprocess spawn + MCP handshake overhead | Direct SDK call |
| Failure modes | CLI binary missing, MCP server down, s6 restart race | Single code path, same as image analysis |
| API key handling | Env var must reach CLI + SDK separately | One SDK client for all AI features |

Risks / Considerations

  • The Gemini CLI's -y (yolo) mode auto-loops tool calls; with the SDK we must implement our own loop (straightforward).
  • OEPF schema validation currently runs inside the MCP server — needs to be imported into FastAPI.
  • The MCP server is still useful for external tool integrations (Claude Desktop, Cursor, etc.) — keep it as an optional service, not a hard requirement for profile creation.
  • Ensure backward compatibility: the /analyze_and_profile endpoint's response format must stay the same.

Acceptance Criteria

  • Profile creation uses Python SDK function-calling (no subprocess)
  • Image analysis continues to work unchanged
  • OEPF schema validation integrated into FastAPI server
  • Node.js/npm removed from Dockerfile (or made optional)
  • MCP s6 service no longer required for core profile creation
  • All existing tests pass, new tests cover the tool-call loop
  • Container image size reduced
  • Works on Raspberry Pi and Puppy Linux

Labels: enhancement (New feature or request)