Add backend/CLAUDE.md contributor guide #6233
Closes #6232. Backend had no CLAUDE.md despite being the most complex part of the codebase. This doc covers service architecture, database patterns, auth, testing, local dev setup, and common gotchas — sourced from operational experience across the team. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Greptile Summary
This PR adds backend/CLAUDE.md. The content is accurate against the codebase and well-organized. One actionable issue was found:
Confidence Score: 4/5
Safe to merge after fixing the missing import in the Rate Limiting example, which would otherwise misdirect contributors trying to apply the pattern.
One P1 issue exists: the Rate Limiting code snippet omits the required import alias.
backend/CLAUDE.md: the Rate Limiting example (lines 116–122) needs the missing import alias added.
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
Mobile([Mobile / Desktop Client])
subgraph GKE["GKE — prod-omi-backend"]
BL[backend-listen\nmain.py]
end
subgraph CR["Cloud Run"]
PU[pusher\npusher/main.py]
DI[diarizer\ndiarizer/main.py]
AP[agent-proxy\nagent-proxy/main.py]
end
subgraph Modal["Modal — GPU"]
VAD[vad]
NJ[notifications-job\ncron]
end
subgraph Stores["Shared Stores"]
FS[(Firestore)]
RD[(Redis)]
PC[(Pinecone)]
N4[(Neo4j)]
end
DG[Deepgram STT]
VM[User Agent VMs\nprivate IP :8080]
Mobile -- REST / WS --> BL
BL -- WS pub/sub --> PU
BL --> DI
BL --> VAD
BL --> DG
PU --> DI
PU --> DG
AP -- WS --> VM
Mobile -- agent WS --> AP
BL --- FS & RD & PC & N4
PU --- FS & RD
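The flowchart's edges can also be read as a small adjacency map, which helps when asking questions such as "which stores can pusher touch?". The sketch below is only a transcription of the diagram into Python. The `EDGES` dict and the `reachable` helper are hypothetical names, not code from the repository, and the undirected store links (`---`) are treated here as service-to-store edges.

```python
from typing import Dict, List, Set

# Directed edges transcribed from the flowchart (node labels, not module names).
# Assumption: the undirected `---` store links are modelled as service -> store.
EDGES: Dict[str, List[str]] = {
    "mobile": ["backend-listen", "agent-proxy"],
    "backend-listen": [
        "pusher", "diarizer", "vad", "deepgram",
        "firestore", "redis", "pinecone", "neo4j",
    ],
    "pusher": ["diarizer", "deepgram", "firestore", "redis"],
    "agent-proxy": ["user-agent-vm"],
}


def reachable(start: str) -> Set[str]:
    """Return every node reachable from `start` by following edges."""
    seen: Set[str] = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in EDGES.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen
```

For example, `reachable("pusher")` contains `"redis"` but not `"pinecone"`, matching the diagram, where only backend-listen links to Pinecone and Neo4j.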
Reviews (1): Last reviewed commit: "Add backend/CLAUDE.md contributor guide ..."
```python
# Applied via dependency injection
uid: str = Depends(auth.with_rate_limit(auth.get_current_user_uid, "chat:send_message"))

# Policies defined in utils/rate_limit_config.py
# RATE_LIMIT_BOOST env var (float, default 1.0) scales all limits
# RATE_LIMIT_SHADOW_MODE (bool) logs violations without blocking
```
Missing import for `auth` alias in Rate Limiting example
The rate limiting snippet uses auth.with_rate_limit(auth.get_current_user_uid, ...), but the auth here is actually the utils.other.endpoints module imported as an alias — not firebase_admin.auth. Without seeing the import, a contributor will naturally assume auth refers to firebase_admin.auth (as shown in the HTTP Endpoints section above), and be confused when with_rate_limit doesn't exist on it.
Across the codebase, the pattern is consistently:

```python
from utils.other import endpoints as auth

uid: str = Depends(auth.with_rate_limit(auth.get_current_user_uid, "chat:send_message"))
```

The HTTP Endpoints example in the same section uses `from utils.other.endpoints import get_current_user_uid` (a direct import), which is inconsistent with the Rate Limiting example's aliased style. Adding the import makes the pattern self-contained and unambiguous.
database/          # All persistence: Firestore, Redis, Pinecone, Neo4j
  _client.py       # Firestore singleton (google.cloud.firestore.Client)
  redis_db.py      # Redis: caching, rate limiting (Lua scripts), pub/sub
  memories.py      # Example domain module (conversations.py, users.py, etc.)
Misleading "Example domain module" label for `database/memories.py`
The comment # Example domain module (conversations.py, users.py, etc.) could be read as "this file is just an example/placeholder." The intent seems to be "this is one instance of the domain module pattern, alongside conversations.py, users.py, etc." — but the phrasing is ambiguous and could lead contributors to disregard the file.
A clearer label would be:
- memories.py # Example domain module (conversations.py, users.py, etc.)
+ memories.py # Domain module (same pattern: conversations.py, users.py, etc.)
Removed deployment targets, GKE namespaces, Cloud Logging queries, Helm charts, and monitoring details. Added real dev learnings: async gotchas, WebSocket state races, queue safety, langdetect pitfalls, lazy import patterns. Trimmed from 210 to 130 lines. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replaced guessed descriptions with code-verified details for every directory: pusher (real-time distribution hub with 5 task types), diarizer (3 endpoints, pyannote + wespeaker models), agent-proxy (VM lifecycle + chat history + encryption), modal (speaker ID + VAD + cron), database (25+ domain modules), utils (60+ files with RAG tools), routers (42 files). Added LOC counts for key files. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
lgtm
Closes BasedHardware#6232. The backend/ directory is the most complex part of the Omi codebase (45+ routers, 5 microservices, Firestore/Redis/Pinecone/Neo4j) but had no CLAUDE.md. This adds a focused contributor guide sourced from operational experience across the team (mon, hiro, kenji), covering service architecture, database patterns, auth, testing, local dev, and common gotchas.
Link related issues: BasedHardware#6232
---
_This PR was drafted by AI on behalf of @beastoin_