perf(cache): add Redis caching layer for Gemini LLM and OCR extraction #97

@AndresL230

Problem

The two heaviest operations in the request path have no caching:

  • Gemini calls in backend/services/gemini_service.py (call_gemini, call_gemini_json, call_gemini_multiturn): every classify/summarize/extract call on document upload, every study-guide generation, and every room summary hits the API fresh. Each call takes multiple seconds, and cost scales linearly with traffic.
  • OCR/extraction in backend/services/extraction_service.py (Docling / GOT-OCR / Tesseract). Re-uploads of the same file (common during testing, retries, and student re-submissions) re-run the full OCR pipeline.

No cache is currently shared across workers. The only caching today is the per-request stat_cache dict in backend/routes/profile.py:491 and the DB-backed table caches (room_summaries, course_summary, study_guides).

Proposal

Introduce Redis as a shared cache, accessed through a thin wrapper module (e.g., backend/services/cache.py) so it can be mocked in tests the way db/connection.py is.

Cache keys:

  • Gemini: gemini:{sha256(model + prompt + params + response_schema)}
  • OCR: ocr:{sha256(file_bytes)}:{backend_name}
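
A minimal sketch of how the two key formats above could be derived. The function names are hypothetical, and serializing the inputs as canonical JSON (rather than plain string concatenation) is an assumption: it avoids two different input combinations concatenating to the same string, and makes the key independent of dict ordering.

```python
import hashlib
import json
from typing import Any, Mapping, Optional


def gemini_cache_key(
    model: str,
    prompt: str,
    params: Mapping[str, Any],
    response_schema: Optional[Mapping[str, Any]] = None,
) -> str:
    """Build gemini:{sha256(...)} from the request inputs (hypothetical helper)."""
    # sort_keys + fixed separators give a stable byte string for equal inputs.
    payload = json.dumps(
        {
            "model": model,
            "prompt": prompt,
            "params": dict(params),
            "response_schema": response_schema,
        },
        sort_keys=True,
        separators=(",", ":"),
    )
    return "gemini:" + hashlib.sha256(payload.encode("utf-8")).hexdigest()


def ocr_cache_key(file_bytes: bytes, backend_name: str) -> str:
    """Build ocr:{sha256(file_bytes)}:{backend_name} (hypothetical helper)."""
    return f"ocr:{hashlib.sha256(file_bytes).hexdigest()}:{backend_name}"
```

Including backend_name in the OCR key keeps results from Docling, GOT-OCR, and Tesseract separate for the same file.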

TTLs:

  • Gemini classify/extract on document content: long (7–30 days) — content-addressed, safe to keep.
  • Gemini summaries that depend on graph/mastery state: either a short TTL, or a key that includes a hash of the underlying state so that a state change produces a new key (mirror what social_cache_service.py already does with member_hash).
  • OCR: long (content-addressed by file hash).
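
One way the TTL tiers and the state-hashed key could look in code. Everything here is an assumption for illustration: the concrete TTL values, the names TTL_SECONDS, state_hash, and summary_cache_key, and the key layout. It does not claim to reproduce how member_hash is computed in social_cache_service.py.

```python
import hashlib
import json
from typing import Any, Mapping

DAY = 24 * 60 * 60

# Hypothetical TTL table; only the 7-30 day range comes from the proposal.
TTL_SECONDS = {
    "gemini_document": 7 * DAY,        # lower bound of the 7-30 day range
    "gemini_state_summary": 15 * 60,   # assumed value for "short"
    "ocr": 30 * DAY,                   # assumed value for "long"
}


def state_hash(state: Mapping[str, Any]) -> str:
    """Stable short digest of the state a summary depends on."""
    canonical = json.dumps(state, sort_keys=True, separators=(",", ":"), default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def summary_cache_key(room_id: str, state: Mapping[str, Any]) -> str:
    """Key changes whenever the underlying state changes, so stale entries are never read."""
    return f"gemini:summary:{room_id}:{state_hash(state)}"
```

With a state-hashed key, old entries are not deleted on a state change; they simply stop being read and expire through their TTL.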

Acceptance criteria

  • services/cache.py wrapper with get, set(ttl), and get_or_compute helpers
  • gemini_service.call_gemini* functions route through the cache (opt-in via kwarg so existing callers can skip it when needed)
  • extraction_service caches OCR results by file-content hash + backend name
  • REDIS_URL added to .env.example and config.py
  • docker-compose.yml adds a redis service
  • Tests use a fake/in-memory Redis (e.g., fakeredis) via conftest.py
  • Cache hit/miss metrics logged so we can measure the win
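
A rough sketch of what the services/cache.py wrapper from the first criterion might look like. The class names, the JSON serialization, and the hit/miss counters are assumptions. The client only needs get(name) and set(name, value, ex=ttl), which matches the redis-py client; InMemoryClient is a stand-in for tests that ignores TTLs.

```python
import json
import logging
from typing import Any, Callable, Dict, Optional

logger = logging.getLogger("cache")


class InMemoryClient:
    """Test stand-in exposing the get/set subset used below; TTL is ignored."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def get(self, name: str) -> Optional[bytes]:
        return self._data.get(name)

    def set(self, name: str, value: bytes, ex: Optional[int] = None) -> bool:
        self._data[name] = value
        return True


class Cache:
    """Thin wrapper with get, set(ttl), and get_or_compute (hypothetical API)."""

    def __init__(self, client: Any) -> None:
        self.client = client
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> Optional[Any]:
        raw = self.client.get(key)
        if raw is None:
            self.misses += 1
            logger.info("cache miss key=%s", key)
            return None
        self.hits += 1
        logger.info("cache hit key=%s", key)
        return json.loads(raw)

    def set(self, key: str, value: Any, ttl: int) -> None:
        # Values must be JSON-serializable in this sketch.
        self.client.set(key, json.dumps(value).encode("utf-8"), ex=ttl)

    def get_or_compute(self, key: str, ttl: int, compute: Callable[[], Any]) -> Any:
        cached = self.get(key)
        if cached is not None:
            return cached
        value = compute()
        self.set(key, value, ttl)
        return value
```

In production the client would presumably be built with redis.Redis.from_url(REDIS_URL). Note one limitation of this sketch: a cached JSON null is indistinguishable from a miss, so a computed None is recomputed on every call.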

Tradeoffs / risks

  • New infra dependency (Redis container + env config).
  • Encryption boundary: cached values must respect the encrypt_if_present / decrypt_if_present rules in services/encryption.py. Decide whether to cache plaintext (faster, more sensitive) or ciphertext (safer, requires re-decrypt on read). Default recommendation: cache ciphertext for anything that touches encrypted columns.
  • Invalidation surface: must integrate with the existing _invalidate_study_guide_cache background task in routes/documents.py and apply_graph_update in services/graph_service.py.
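
To make the ciphertext recommendation concrete, here is a sketch in which only encrypted values are written to the store and decryption happens on every read. The encrypt and decrypt callables are injected because the real signatures of encrypt_if_present / decrypt_if_present are not shown in this issue. The base64 codec below is a reversible stand-in for testing only and is NOT encryption; the function name cached_encrypted and the dict-based store are also assumptions.

```python
import base64
from typing import Callable, Dict


def fake_encrypt(plaintext: str) -> str:
    """Reversible stand-in for tests only; base64 provides no confidentiality."""
    return base64.b64encode(plaintext.encode("utf-8")).decode("ascii")


def fake_decrypt(ciphertext: str) -> str:
    return base64.b64decode(ciphertext.encode("ascii")).decode("utf-8")


def cached_encrypted(
    store: Dict[str, str],
    key: str,
    compute: Callable[[], str],
    encrypt: Callable[[str], str],
    decrypt: Callable[[str], str],
) -> str:
    """Return plaintext to the caller while the store only ever holds ciphertext."""
    ciphertext = store.get(key)
    if ciphertext is not None:
        # Hit: pay the decrypt cost on read, as the tradeoff above describes.
        return decrypt(ciphertext)
    plaintext = compute()
    store[key] = encrypt(plaintext)
    return plaintext
```

The cost of this choice is one decrypt per cache hit, which should still be far cheaper than a Gemini call or an OCR run.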

Related

  • Companion issues: in-process lru_cache for hot reads, HTTP Cache-Control + ETag for GETs.
