perf(cache): add Redis caching layer for Gemini LLM and OCR extraction #97

## Problem

The two heaviest operations in the request path have no caching:

- **Gemini calls** in `backend/services/gemini_service.py` (`call_gemini`, `call_gemini_json`, `call_gemini_multiturn`) — every classify/summarize/extract on document upload, every study-guide generation, every room summary hits the API fresh. Latency is multi-second and cost scales linearly with traffic.
- **OCR/extraction** in `backend/services/extraction_service.py` (Docling / GOT-OCR / Tesseract). Re-uploads of the same file (common during testing, retries, and student re-submissions) re-run the full OCR pipeline.

There is currently no shared cache across workers — only the per-request `stat_cache` dict in `backend/routes/profile.py:491` and DB-backed table caches (`room_summaries`, `course_summary`, `study_guides`).

## Proposal

Introduce Redis as a shared cache, accessed through a thin wrapper module (e.g., `backend/services/cache.py`) so it can be mocked in tests the way `db/connection.py` is.

**Cache keys:**
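- Gemini: `gemini:{sha256(model + prompt + params + response_schema)}`, with the inputs serialized canonically (e.g., sorted-key JSON) before hashing so that equal requests always map to the same key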
- OCR: `ocr:{sha256(file_bytes)}:{backend_name}`
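A minimal sketch of how these keys could be built. The function names `gemini_cache_key` and `ocr_cache_key` are hypothetical, and serializing the inputs as sorted-key JSON is an assumption rather than part of the proposal; it is used here because plain string concatenation would let `"ab" + "c"` and `"a" + "bc"` collide.

```python
import hashlib
import json
from typing import Any, Mapping, Optional


def gemini_cache_key(
    model: str,
    prompt: str,
    params: Optional[Mapping[str, Any]] = None,
    response_schema: Optional[Mapping[str, Any]] = None,
) -> str:
    """Build a content-addressed key for a Gemini request (hypothetical helper)."""
    # Canonical JSON (sorted keys, fixed separators) so logically equal
    # requests hash identically regardless of dict ordering.
    payload = json.dumps(
        {
            "model": model,
            "prompt": prompt,
            "params": dict(params or {}),
            "response_schema": (
                dict(response_schema) if response_schema is not None else None
            ),
        },
        sort_keys=True,
        separators=(",", ":"),
    )
    return "gemini:" + hashlib.sha256(payload.encode("utf-8")).hexdigest()


def ocr_cache_key(file_bytes: bytes, backend_name: str) -> str:
    """Build the OCR key from the file's content hash and the backend name."""
    return f"ocr:{hashlib.sha256(file_bytes).hexdigest()}:{backend_name}"
```

Because the model name is part of the hashed payload, switching models produces new keys without any explicit invalidation.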

**TTLs:**
- Gemini classify/extract on document content: long (7–30 days). Keys are content-addressed, so entries are safe to keep.
- Gemini summaries that depend on graph/mastery state: either a short TTL, or a key that includes a hash of the underlying state (mirror what `social_cache_service.py` already does with `member_hash`).
- OCR: long (content-addressed by file hash).
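One way the TTL policy and state-hashed keys could look, as a sketch only. The `TTL_SECONDS` table, the `state_hash` and `summary_cache_key` helpers, and the `gemini:summary:` prefix are all hypothetical; only the 7–30 day range comes from the list above, and the other durations are placeholders. The actual `member_hash` logic in `social_cache_service.py` is not reproduced here.

```python
import hashlib
import json
from typing import Any, Mapping

DAY = 24 * 60 * 60

# Placeholder TTLs in seconds. Only the 7-30 day range for document-level
# Gemini results comes from the proposal; the other values are assumptions.
TTL_SECONDS = {
    "gemini_document": 7 * DAY,  # lower bound of the proposed 7-30 day range
    "ocr": 30 * DAY,             # assumption: "long" is not quantified
    "gemini_summary": 15 * 60,   # assumption: "short" is not quantified
}


def state_hash(state: Mapping[str, Any]) -> str:
    """Hash the state a summary depends on, independent of dict key order."""
    canonical = json.dumps(state, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def summary_cache_key(room_id: str, state: Mapping[str, Any]) -> str:
    # When the underlying state changes, the key changes too, so stale
    # entries are never read again and simply expire on their own TTL.
    return f"gemini:summary:{room_id}:{state_hash(state)}"
```

With state-hashed keys, explicit invalidation becomes optional for these entries: a changed state yields a different key, and the old entry ages out.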

## Acceptance criteria

- [ ] `backend/services/cache.py` wrapper with `get`, `set(ttl)`, and `get_or_compute` helpers
- [ ] `gemini_service.call_gemini*` route through the cache (opt-in via kwarg so existing callers can skip when needed)
- [ ] `extraction_service` caches OCR results by file-content hash + backend name
- [ ] `REDIS_URL` added to `.env.example` and `config.py`
- [ ] `docker-compose.yml` adds a `redis` service
- [ ] Tests use a fake/in-memory Redis (e.g., `fakeredis`) via `conftest.py`
- [ ] Cache hit/miss metrics logged so we can measure the win
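As a rough illustration of the wrapper and metrics criteria, the sketch below shows one possible shape for `get`, `set(ttl)`, and `get_or_compute` with hit/miss logging. The `Cache` class, the `KeyValueClient` protocol, the `InMemoryClient` stand-in, and JSON as the value encoding are all assumptions. The wrapper only relies on `get(name)` and `set(name, value, ex=...)`, which the redis-py client provides, so in production the client could be created with `redis.Redis.from_url(REDIS_URL)`.

```python
import json
import logging
from typing import Any, Callable, Dict, Optional, Protocol

logger = logging.getLogger("cache")


class KeyValueClient(Protocol):
    """The subset of the redis-py client interface this wrapper relies on."""

    def get(self, name: str) -> Optional[bytes]: ...

    def set(self, name: str, value: bytes, ex: Optional[int] = None) -> Any: ...


class Cache:
    """Thin JSON cache wrapper (hypothetical shape for backend/services/cache.py)."""

    def __init__(self, client: KeyValueClient) -> None:
        self._client = client
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> Optional[Any]:
        raw = self._client.get(key)
        if raw is None:
            self.misses += 1
            logger.info("cache miss key=%s", key)
            return None
        self.hits += 1
        logger.info("cache hit key=%s", key)
        return json.loads(raw)

    def set(self, key: str, value: Any, ttl: int) -> None:
        # `ex` is the expiry in seconds, matching redis-py's `set` signature.
        self._client.set(key, json.dumps(value).encode("utf-8"), ex=ttl)

    def get_or_compute(self, key: str, ttl: int, compute: Callable[[], Any]) -> Any:
        # Limitation of this sketch: a cached JSON null is treated as a miss.
        cached = self.get(key)
        if cached is not None:
            return cached
        value = compute()
        self.set(key, value, ttl)
        return value


class InMemoryClient:
    """Dict-backed stand-in for tests; it ignores expiry entirely."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def get(self, name: str) -> Optional[bytes]:
        return self._data.get(name)

    def set(self, name: str, value: bytes, ex: Optional[int] = None) -> bool:
        self._data[name] = value
        return True
```

Injecting the client keeps the wrapper mockable in the same way as `db/connection.py`: tests can pass `fakeredis` or the in-memory stand-in, and the `hits` and `misses` counters give a simple starting point for measuring the win.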

## Tradeoffs / risks

- New infra dependency (Redis container + env config).
- Encryption boundary: cached values must respect the `encrypt_if_present` / `decrypt_if_present` rules in `services/encryption.py`. Decide whether to cache plaintext (faster, more sensitive) or ciphertext (safer, requires re-decrypt on read). **Default recommendation: cache ciphertext for anything that touches encrypted columns.**
- Invalidation surface: must integrate with the existing `_invalidate_study_guide_cache` background task in `routes/documents.py` and `apply_graph_update` in `services/graph_service.py`.

## Related

- Companion issues: in-process `lru_cache` for hot reads, HTTP `Cache-Control` + `ETag` for GETs.
