fix(occurrences): remove all n+1 queries from occurrence API #1274
… remove N+1

Issue: #1271

`OccurrenceViewSet.list` issued one query per occurrence to fetch detection paths, the best machine prediction, and the best identification, scaling roughly linearly with page size and producing the 34-40s cold-cache responses tracked in Sentry as AMI-PLATFORM-API-BMY.

This change adds reusable Prefetch factories in `ami/main/models_future/occurrence.py` and wires them into the list path:

- `prefetch_detection_images()` populates a `prefetched_detection_images` to_attr list of detections that have a path
- `prefetch_classifications_for_best_prediction()` populates `detections__classifications` with select_related taxon and algorithm
- `best_prediction_from_prefetch` / `best_identification_from_prefetch` pick the winners in Python without re-querying

`OccurrenceQuerySet.with_list_prefetches()` bundles these prefetches and is called from `OccurrenceViewSet.get_queryset` only when action == 'list'. `OccurrenceListSerializer` reads from the prefetch caches when present and falls back to the existing model methods for non-list callers (admin, exports, signals).

Local measurement on a project with 25 occurrences:

| page size | before | after | reduction |
|---|---|---|---|
| limit=5 | 24 | 13 | 1.8x |
| limit=25 | 64 | 7 | 9.1x |

A new `TestOccurrenceListQueryCount` guards against regression by asserting the query count does not grow with page size.

Follow-ups (separate PR):

- django-zen-queries `QueriesDisabledViewMixin` to fail loudly on lazy queries during list rendering
- audit other large-list serializers (SourceImage, Taxon) for the same pattern

Co-Authored-By: Claude <noreply@anthropic.com>
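The "pick the winners in Python without re-querying" step can be sketched with plain objects standing in for prefetched rows. Names and fields here are illustrative, not the project's actual model API:

```python
from dataclasses import dataclass
from typing import Optional, Sequence

@dataclass
class Classification:
    # Minimal stand-in for a prefetched Classification row; the real model
    # also carries taxon/algorithm relations pulled in via select_related.
    pk: int
    score: float
    terminal: bool

def best_prediction(classifications: Sequence[Classification]) -> Optional[Classification]:
    """Choose the winner in memory, mirroring an ORDER BY
    -terminal, -score, -pk, so no extra query is issued per occurrence."""
    if not classifications:
        return None
    return max(classifications, key=lambda c: (c.terminal, c.score, c.pk))
```

The same shape works for picking the best identification, just with a different sort key.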
✅ Deploy Preview for antenna-preview canceled.

✅ Deploy Preview for antenna-ssec canceled.
📝 Walkthrough

List and detail queryset prefetches were split and centralized into new helpers; serializers consume prefetched detections/classifications/identifications via helper functions for best-identification, best-prediction, and detection image URLs. Tests and exporter usage were updated to rely on these prefetch contracts.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    participant Client as Client
    participant View as OccurrenceViewSet\n(get_queryset)
    participant QS as OccurrenceQuerySet\n(with_*_prefetches)
    participant DB as Database
    participant Serializer as OccurrenceList/Detail\nSerializer
    participant Helpers as models_future/occurrence.py
    Client->>View: GET /occurrences/ or /occurrences/{id}/
    View->>QS: build queryset -> with_list_prefetches() / with_detail_prefetches()
    QS->>DB: query occurrences + prefetch(detections, classifications, identifications...)
    DB-->>QS: rows + prefetched related objects
    View->>Serializer: pass queryset / instance
    Serializer->>Helpers: best_identification_from_prefetch / best_prediction_from_prefetch / detection_image_urls_from_prefetch
    Helpers-->>Serializer: chosen identification/prediction and image URLs
    Serializer-->>Client: JSON response
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Pull request overview
This PR targets the /api/v2/occurrences/ list endpoint’s dominant N+1 sources by adding reusable Prefetch factories, wiring them into the list queryset, and teaching the list serializer to consume prefetched caches (with fallbacks), plus adding a regression query-count test.
Changes:
- Added `ami/main/models_future/occurrence.py` with reusable `Prefetch` factories and Python helpers for selecting "best" records from prefetched data.
- Added `OccurrenceQuerySet.with_list_prefetches()` and applied it only for `OccurrenceViewSet` list actions.
- Updated `OccurrenceListSerializer` to read detection images / best prediction / best identification from prefetch caches when present, and added a query-count regression test.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| ami/main/tests.py | Adds TestOccurrenceListQueryCount to guard against query-count scaling with page size. |
| ami/main/models_future/occurrence.py | Introduces prefetch factories and helper functions to pick best prediction/identification and derive detection image URLs from caches. |
| ami/main/models.py | Adds OccurrenceQuerySet.with_list_prefetches() to apply the list serializer’s required prefetches. |
| ami/main/api/views.py | Applies list-only prefetching via with_list_prefetches(); keeps existing detail prefetch behavior. |
| ami/main/api/serializers.py | Routes list fields through prefetched data when available, with fallbacks to existing model methods. |
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@ami/main/models_future/occurrence.py`:
- Around line 69-91: best_prediction_from_prefetch currently aggregates all
det.classifications and ranks them, which diverges from
Occurrence.best_prediction because that method first applies per-algorithm
max-score filtering via Occurrence.predictions(); update
best_prediction_from_prefetch to call/use Occurrence.predictions() (or replicate
its per-algorithm max-score grouping) to produce the filtered set of
Classification objects, then apply the same sort key (terminal, -score, -pk) and
return the first element; ensure you reference and reuse
Occurrence.predictions() so the prefetch path preserves identical semantics to
best_prediction.
In `@ami/main/tests.py`:
- Around line 2956-2959: The try/except around caches["default"].clear() is
swallowing failures—remove the broad except or replace it with a specific catch
that fails the test; call caches["default"].clear() directly (so exceptions
propagate) or if you must catch, re-raise or call pytest.fail() with the caught
exception text so a cache-clear failure causes the test to fail and exposes the
error rather than hiding it.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 829c8693-706f-40db-89be-84b6b55a965a
📒 Files selected for processing (5)
- ami/main/api/serializers.py
- ami/main/api/views.py
- ami/main/models.py
- ami/main/models_future/occurrence.py
- ami/main/tests.py
- Prevent detail-path N+1: `_best_prediction` now requires both
`detections` AND nested `classifications` to be in the prefetch cache
before calling `best_prediction_from_prefetch`. Detail viewset only
prefetches `detections`, so the previous gate still walked
`det.classifications.all()` lazily (one query per detection).
- Mirror `Occurrence.best_prediction` semantics in the prefetch helper:
`Occurrence.predictions()` filters classifications to the per-algorithm
max-score row before ordering by `-terminal, -score`. The helper now
applies the same per-algorithm grouping in Python so list and detail
return the same winner for occurrences whose top-scoring classification
is non-terminal.
- Add deterministic tiebreaker to `best_identification_from_prefetch`:
tuple-keyed comparison (`created_at_is_set, created_at, pk`) treats
`None` timestamps as lowest and breaks ties by `pk`, matching
`order_by('-created_at', '-pk')`.
- Collapse the duplicate `detections` prefetch: prior PR loaded
detections twice (once via a `to_attr` filtered list, once via
`detections__classifications`). Replace both with a single
`prefetch_detections_for_list()` that nests classifications, and
filter for non-null paths in Python. Saves one query per request.
- Don't swallow cache-clear failures in the regression guard. A silent
cache-clear failure would let the warm cache hide query-count
regressions.
Refs: #1271
PR: #1274
Co-Authored-By: Claude <noreply@anthropic.com>
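The per-algorithm max-score grouping described in this commit can be sketched in plain Python. Tuples stand in for Classification rows here; the real helper operates on prefetched model instances, so this is a sketch of the semantics, not the actual code:

```python
from collections import defaultdict
from typing import Optional, Sequence, Tuple

# (algorithm_id, terminal, score, pk) — illustrative stand-in for a row.
Row = Tuple[int, bool, float, int]

def best_prediction_grouped(rows: Sequence[Row]) -> Optional[Row]:
    """Keep only each algorithm's max-score classification(s), then order
    the finalists by (-terminal, -score, -pk) and take the first: the same
    semantics as filtering to per-algorithm max before ordering in SQL."""
    by_algo: dict = defaultdict(list)
    for row in rows:
        by_algo[row[0]].append(row)
    finalists = []
    for group in by_algo.values():
        top = max(r[2] for r in group)
        finalists.extend(r for r in group if r[2] == top)
    if not finalists:
        return None
    return max(finalists, key=lambda r: (r[1], r[2], r[3]))
```

The grouping matters because an algorithm's terminal-but-lower-score row gets dropped by the per-algorithm max filter, so a different algorithm's terminal top row can win overall.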
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@ami/main/api/serializers.py`:
- Around line 1377-1379: The fast-path should only run for list responses, not
any object with a cached 'detections' relation. Change the condition in the
serializer method (the branch using detection_image_urls_from_prefetch and
obj.detection_images) to require an explicit list-only marker (e.g. check
getattr(obj, "_prefetched_for_list", False) && "detections" in getattr(obj,
"_prefetched_objects_cache", {})); then set that marker when you build the list
queryset that prefetches detections (for example, after the prefetch in the list
view or in the queryset iteration set obj._prefetched_for_list = True) so
OccurrenceSerializer and other detail callers still call detection_images() and
preserve ordering/filtering.
📒 Files selected for processing (3)
- ami/main/api/serializers.py
- ami/main/models_future/occurrence.py
- ami/main/tests.py
Mirror BEST_MACHINE_PREDICTION_ORDER: define the identification ordering
once and use it everywhere a "best identification" is selected.
- New constant `BEST_IDENTIFICATION_ORDER = ("-created_at", "-pk")`
alongside `BEST_MACHINE_PREDICTION_ORDER`.
- `Occurrence.best_identification` and `OccurrenceQuerySet.with_verification_info`
now order by the constant. Adds a `-pk` deterministic tiebreaker; created_at
is auto_now_add so equal timestamps are vanishingly rare in practice.
- `best_identification_from_prefetch` references the constant in its
docstring (the helper's tuple key already encodes the same ordering).
Co-Authored-By: Claude <noreply@anthropic.com>
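The equivalence between the ordering constant and the helper's tuple key can be sketched as follows. Timestamps are plain numbers here for brevity (the real field is a datetime), and the pair tuples are hypothetical stand-ins for Identification rows:

```python
BEST_IDENTIFICATION_ORDER = ("-created_at", "-pk")  # the constant from this commit

def identification_key(created_at, pk):
    """max() key equivalent to ORDER BY -created_at, -pk, with a None
    created_at sorting lowest and pk breaking exact-timestamp ties."""
    return (created_at is not None, created_at if created_at is not None else 0, pk)

def best_identification(idents):
    # idents: iterable of (created_at, pk) stand-ins for Identification rows
    return max(idents, key=lambda i: identification_key(*i), default=None)
```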
…paths

Per PR review: the cache-key-gated fallbacks in `OccurrenceListSerializer` exercised the same N+1 path the PR closed, and silently triggered on the detail endpoint via shared serializer inheritance.

- Drop `_best_prediction`, `_best_identification`, `get_detection_images` fallbacks. Serializer trusts the prefetch contract.
- Add `prefetch_detections_for_detail()` factory + `with_detail_prefetches()` queryset method. Detail viewset now applies it instead of an inline Prefetch.
- Drop `has_prefetched_classifications()` helper (no longer needed).
- Add `TestOccurrenceDetailQueryCount` regression guard. Detail endpoint measures 12 queries; list (warm test client) measures 6.

Closes review threads on PR #1274 (mihow comments 3164296309/3164306519, CodeRabbit comment 3163836361).

Co-Authored-By: Claude <noreply@anthropic.com>
…on_images
Per PR review (mihow): adopt the 'disable model methods' safety pattern as
strict prefetch contract. Helpers now raise RuntimeError when required
relations are missing from `_prefetched_objects_cache` instead of silently
slow-pathing through Django's lazy lookups.
- `_require_prefetch()` gate on best_prediction / best_identification /
detection_image_urls helpers.
- Restore old behavior: list responses cap detection_images at 3
(`OccurrenceListSerializer.detection_images_limit`). Detail subclass
overrides to None (unbounded) — TODO: bound when occurrence tracking
enables thousands of detections per occurrence.
- New tests:
- TestOccurrenceResponseShape pins list + detail JSON keys + the list cap.
- TestOccurrencePrefetchHelpersEdgeCases asserts safe defaults on empty
prefetched relations, and RuntimeError when prefetch is missing.
Co-Authored-By: Claude <noreply@anthropic.com>
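The strict gate described in this commit can be sketched against Django's `_prefetched_objects_cache` attribute (the attribute name is real Django internals; the helper name, message, and test double below are illustrative):

```python
def require_prefetch(instance, *relations):
    """Raise instead of letting the ORM lazily re-query: if any named
    relation is absent from the instance's prefetch cache, fail loudly."""
    cache = getattr(instance, "_prefetched_objects_cache", {})
    missing = [rel for rel in relations if rel not in cache]
    if missing:
        raise RuntimeError(f"{instance!r} is missing prefetched relations {missing}")

class FakeOccurrence:
    """Test double: mimics a model instance that was (or wasn't) prefetched."""
    def __init__(self, **prefetched):
        if prefetched:
            self._prefetched_objects_cache = prefetched
```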
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@ami/main/models_future/occurrence.py`:
- Around line 83-85: The strict-path prefetch check currently only asserts
detections were prefetched but then iterates det.classifications.all(), which
can trigger per-detection queries; update the check to require the nested
classifications prefetch (use _require_prefetch(occurrence,
"detections__classifications") or otherwise validate that each Detection object
has a prefetched 'classifications' attribute) before iterating, so the code
fails fast when nested prefetching is missing.
📒 Files selected for processing (5)
- ami/main/api/serializers.py
- ami/main/api/views.py
- ami/main/models.py
- ami/main/models_future/occurrence.py
- ami/main/tests.py
🚧 Files skipped from review as they are similar to previous changes (2)
- ami/main/api/views.py
- ami/main/tests.py
Drop `TestOccurrenceResponseShape` — most assertions are key-presence checks already enforced by serializer Meta. Fold the only valuable assertion (the `detection_images` per-row cap on list responses) into `_list_occurrences` in `TestOccurrenceListQueryCount`, where the response data is already inspected.

Co-Authored-By: Claude <noreply@anthropic.com>
…cover image to 1
Exports invoke OccurrenceExportSerializer (subclasses OccurrenceSerializer),
whose inherited get_determination_details / _best_prediction call the strict
prefetch helpers in models_future/occurrence.py. JSONExporter.get_queryset()
did not apply the prefetch, so two CI tests in ami.exports.tests.DataExportTest
raised RuntimeError ("Occurrence X is missing prefetched relations
['detections']"). Add .with_detail_prefetches() to the exporter queryset.
Cap OccurrenceListSerializer.detection_images_limit from 3 to 1 — list cards
render a single cover image; payload stays tight without changing response
shape.
Document the nested-prefetch caveat in _require_prefetch: it only checks
top-level relations, so a caller prefetching 'detections' without nested
'classifications' silently re-introduces per-detection queries. Tighter
enforcement is deferred to the django-zen-queries follow-up (issue #1271),
which catches per-row queries at the view boundary regardless of depth.
Co-Authored-By: Claude <noreply@anthropic.com>
Pull request overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.
best_prediction_from_prefetch:
- Skip classifications with score=None (mirrors SQL NULL semantics:
Max(NULL) = NULL, score IN (NULL) never matches).
- Removes the asymmetry between -inf in max calc and 0.0 in sort key.
- Document Python-vs-SQL semantic difference around per-algorithm
max grouping (Python is the stricter interpretation).
TestOccurrenceDetailQueryCount:
- Add test_detail_query_count_does_not_scale_with_detections,
inflating one occurrence to 9 detections x 4 classifications.
The single-detection test alone could not catch a multi-detection
N+1; this guards the actual property the test name asserts.
Co-Authored-By: Claude <noreply@anthropic.com>
Actionable comments posted: 1
🧹 Nitpick comments (1)
ami/main/api/serializers.py (1)
1384-1426: ⚡ Quick win: Memoize `_best_prediction` per occurrence to avoid duplicate scans.

`_best_prediction(obj)` is used in both `get_determination_details` (line 1406) and `get_best_machine_prediction` (line 1425). Caching it once on the object avoids repeated Python traversal of prefetched detections/classifications.

Suggested refactor:

```diff
 def _best_prediction(self, obj: Occurrence):
     from ami.main.models_future.occurrence import best_prediction_from_prefetch
-    return best_prediction_from_prefetch(obj)
+    cache_attr = "_cached_best_prediction_from_prefetch"
+    if hasattr(obj, cache_attr):
+        return getattr(obj, cache_attr)
+    value = best_prediction_from_prefetch(obj)
+    setattr(obj, cache_attr, value)
+    return value
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@ami/main/api/serializers.py` around lines 1384-1426, the `_best_prediction` traversal is called twice; modify `_best_prediction` to memoize its result on the Occurrence instance (e.g., check for and return `obj._cached_best_prediction` if present, otherwise compute via `best_prediction_from_prefetch(obj)`, assign it to `obj._cached_best_prediction` and return it) so both `get_determination_details` and `get_best_machine_prediction` reuse the cached value and avoid duplicate scans; update only the `_best_prediction` function (reference symbols: `_best_prediction`, `best_prediction_from_prefetch`, `get_determination_details`, `get_best_machine_prediction`) and ensure you cache `None` results as well.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@ami/main/tests.py`:
- Around line 3049-3054: The test creates a Detection with bbox set to
normalized floats but the codebase expects absolute pixel coordinates; update
the Detection.objects.create call to supply integer pixel bbox values instead of
normalized floats—for example compute with the fixture image dimensions (e.g.,
[int(source_image.width*0.1), int(source_image.height*0.1),
int(source_image.width*0.2), int(source_image.height*0.2)]) or use explicit
pixel values like [10, 10, 20, 20] for the bbox field so the Detection.bbox
matches production assumptions.
---
Nitpick comments:
In `@ami/main/api/serializers.py`:
- Around line 1384-1426: The _best_prediction traversal is called twice; modify
_best_prediction to memoize its result on the Occurrence instance (e.g., check
for and return obj._cached_best_prediction if present, otherwise compute via
best_prediction_from_prefetch(obj), assign it to obj._cached_best_prediction and
return it) so both get_determination_details and get_best_machine_prediction
reuse the cached value and avoid duplicate scans; update only the
_best_prediction function (reference symbols: _best_prediction,
best_prediction_from_prefetch, get_determination_details,
get_best_machine_prediction) and ensure you cache None results as well.
📒 Files selected for processing (4)
- ami/exports/format_types.py
- ami/main/api/serializers.py
- ami/main/models_future/occurrence.py
- ami/main/tests.py
🚧 Files skipped from review as they are similar to previous changes (1)
- ami/main/models_future/occurrence.py
```python
det = Detection.objects.create(
    source_image=source_image,
    occurrence=occurrence,
    timestamp=source_image.timestamp,
    bbox=[0.1, 0.1, 0.2, 0.2],
    path=f"detections/inflated_{i}.jpg",
)
```
Use pixel-space bbox values in this detection fixture.
At Line 3053, the bbox uses normalized-looking floats. This repo treats Detection.bbox as absolute pixel coordinates, so using integer pixel values here will keep fixtures consistent with production assumptions.
Suggested fix

```diff
 for i in range(8):
     det = Detection.objects.create(
         source_image=source_image,
         occurrence=occurrence,
         timestamp=source_image.timestamp,
-        bbox=[0.1, 0.1, 0.2, 0.2],
+        bbox=[10, 10, 20, 20],
         path=f"detections/inflated_{i}.jpg",
     )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ami/main/tests.py` around lines 3049 - 3054, The test creates a Detection
with bbox set to normalized floats but the codebase expects absolute pixel
coordinates; update the Detection.objects.create call to supply integer pixel
bbox values instead of normalized floats—for example compute with the fixture
image dimensions (e.g., [int(source_image.width*0.1),
int(source_image.height*0.1), int(source_image.width*0.2),
int(source_image.height*0.2)]) or use explicit pixel values like [10, 10, 20,
20] for the bbox field so the Detection.bbox matches production assumptions.
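The normalized-to-pixel conversion the comment suggests is a small helper; this sketch is hypothetical and not part of the codebase:

```python
def to_pixel_bbox(norm_bbox, width, height):
    """Convert normalized [x1, y1, x2, y2] fractions of the image size
    into the absolute pixel coordinates Detection.bbox is expected to hold."""
    x1, y1, x2, y2 = norm_bbox
    return [int(x1 * width), int(y1 * height), int(x2 * width), int(y2 * height)]
```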
Staging A/B benchmark

Side-by-side run against the two staging deployments (arctia = PR, serbia = baseline).

Single-request latency (5 runs each, varied offset to bypass cachalot)

The wall-clock speedup is smaller than the 9.1× query-count reduction at limit=25 — per-request fixed costs (TLS, auth, project filter, presigned URL generation) don't scale with query count, so they dominate the residual.

Detail endpoint (3 runs)

Concurrent load (10 in flight, 30 total, …)
| metric | arctia (PR) | serbia (baseline) | speedup |
|---|---|---|---|
| wall | 17.2s | 47.8s | 2.78× |
| p50 | 5.92s | 14.80s | 2.50× |
| p95 | 6.51s | 21.29s | 3.27× |
| p99 | 6.54s | 21.68s | 3.31× |
Tail widens under load — the PR holds steady, the baseline blows out at p95/p99.
Worker-pool unblocking — /captures/?project_id=5&limit=10 while 8 /occurrences/?limit=25 are in flight
| env | captures baseline | captures under occurrence load |
|---|---|---|
| arctia (PR) | 0.32–0.50s | 1.46–2.37s |
| serbia (baseline) | 0.91–1.66s | 3.50–5.72s |
Captures-under-occurrence-load on the PR (~2s) is roughly equivalent to captures-no-load on the baseline (~1.2s). Workers free up ~3× faster, so the "every page feels frozen" symptom under multi-reviewer use should largely go away.
Other view A/B (PR doesn't change these — sanity check that nothing regressed)
| view | arctia (PR) | serbia (baseline) |
|---|---|---|
| captures | 0.40s | 0.89s |
| taxa | 0.27s | 0.44s |
| events | 0.45s | 0.90s |
| deployments | 0.32s | 0.48s |
| jobs | 0.64s | 1.10s |
| sourceimages | 0.27s | 0.26s |
sourceimages is identical on both → confirms baseline path. Other deltas are infra/cache noise (sequential probes, no shared load); the PR doesn't touch these viewsets.
Production sizing reference (no benchmark run on prod, just for scale)
api.antenna.insectai.org Vermont Atlas of Life (project 18, 31,741 occurrences):
| limit | latency (no PR) |
|---|---|
| 5 | 1.25s avg |
| 25 | 2.24s avg |
| 100 | 6.07s avg |
Post-deploy: based on the staging multipliers, expect roughly halved latency at limit=25 (~1.0s) and ~2× at limit=5 (~0.6s). Larger pages benefit less in absolute terms because per-row serializer cost dominates above the N+1 ceiling.
Correctness
- Top-level keys identical on PR vs baseline ✓
- `detection_images` length = 1 in list response on both (this dataset has 1 detection per occurrence; cap not visible)
- Detail content diff shows only env-specific URLs and a re-run classification (different `id`/`score`/`created_at`) — expected, since arctia ran ML again
- Identification create flow exercised in UI on arctia (occurrences 50781, 50804 identified live, no 500) — confirms strict prefetch contract holds under POST
- Occurrences-by-collection JSON export triggered for 7,090 records — validates `JSONExporter.get_queryset()` strict-prefetch wiring
Still pending
- Soak window on `AMI-PLATFORM-API-BMY` (the original 34–40s cold-response Sentry issue) — needs ~30 min of real reviewer load on arctia
- Production benchmark on `api.antenna.insectai.org` after merge
Production load test (api.antenna.insectai.org, project 18 = 31,741 occurrences)

Light concurrency, no concurrent users at probe time.
Prod baseline holds up better than …

Export verification status
Reusable curl + xargs runner that probes the project's occurrence count, then issues N parallel requests at varied offsets to bypass cachalot. Reports wall, avg, min, p50, p95, p99, max, and per-status error counts. Used to A/B compare staging deployments (arctia vs serbia) and to characterize prod baseline for PR #1274. Re-run after deploy to verify the speedup against prod's pre-merge numbers in the PR thread. Co-Authored-By: Claude <noreply@anthropic.com>
Production load test — scaled

Runs against …
Findings:
Inferred post-merge expectation, based on the staging multipliers (~3.3× p95 reduction at limit=25 under concurrent load):
Reproducer

```shell
# Pre-merge prod baseline
./scripts/benchmark_occurrences_list.sh https://api.antenna.insectai.org 18 10 30 25
./scripts/benchmark_occurrences_list.sh https://api.antenna.insectai.org 18 20 60 25
./scripts/benchmark_occurrences_list.sh https://api.antenna.insectai.org 18 10 30 100

# Post-merge prod re-run (same exact commands)
# Compare against the table above.
```

The script: …
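The per-run summary the runner reports (wall, avg, min, p50, p95, p99, max) can be computed with nearest-rank percentiles. This is a sketch of that calculation; the shell script's exact method may differ:

```python
import math

def latency_stats(samples):
    """avg/min/p50/p95/p99/max over a list of latencies (seconds),
    using the nearest-rank percentile definition."""
    xs = sorted(samples)
    n = len(xs)

    def pct(p):
        # nearest-rank: ceil(p/100 * n)-th value, 1-indexed
        return xs[max(0, math.ceil(p * n / 100) - 1)]

    return {
        "avg": sum(xs) / n,
        "min": xs[0],
        "p50": pct(50),
        "p95": pct(95),
        "p99": pct(99),
        "max": xs[-1],
    }
```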
…rings

Simplify the prefetch contract surface introduced in this branch:

- Drop `_best_prediction` / `_best_identification` thin wrappers in `OccurrenceListSerializer`; inline `best_*_from_prefetch` calls.
- Drop `prefetches_for_list_serializer` / `prefetches_for_detail_serializer` 1-line list wrappers; `with_list_prefetches` / `with_detail_prefetches` now call the Prefetch factories directly.
- Compress `_require_prefetch` and `best_*_from_prefetch` docstrings; keep the strict-prefetch behavior and the SQL/Python null-handling note as a single line each.
- `OccurrenceSerializer.detection_images_limit = 100` (was None) — bound detail responses from the start; pagination is the long-term answer.

Net -47 lines, no behavior change beyond the detail detection_images cap.

Co-Authored-By: Claude <noreply@anthropic.com>
Summary
Speeds up the occurrences list endpoint by ~9x and frees up gunicorn workers that were previously pinned for 30+ seconds per request, which under multi-user load was stalling every other API endpoint sharing the pool (captures, jobs, deployments — "every page feels frozen"). Fewer queries per request also drops Postgres load and lets the same hardware support more concurrent reviewers. Closes #1271 (Sentry `AMI-PLATFORM-API-BMY`, 34–40s cold responses).

Mechanically: introduces reusable `Prefetch` factories and wires them into `OccurrenceViewSet` list and detail paths. Strict prefetch contract — serializer raises `RuntimeError` if a caller skips the prefetch instead of silently re-introducing per-row queries.

Measurement

`CaptureQueriesContext` against an in-test project with 25 occurrences (each: one detection, one classification). Cachalot cache cleared between runs. Baseline grew ~1.6 queries per occurrence (clear N+1). Post-fix is flat regardless of page size.
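The shape of that regression guard, abstracted away from Django: in the real test, `measure(limit)` would wrap one list request in `CaptureQueriesContext` and return the number of captured queries; here it is any callable, so this is a sketch rather than the project's test code:

```python
def assert_flat_query_count(measure, limits=(5, 25)):
    """Guard that the per-request query count does not grow with page
    size: measure once per page size and require identical counts."""
    counts = [measure(limit) for limit in limits]
    assert len(set(counts)) == 1, f"query count scales with page size: {counts}"
    return counts[0]
```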
Changes
- `ami/main/models_future/occurrence.py` (new module)
  - `prefetch_detections_for_list()` — detections + nested classifications, ordered for stable galleries
  - `prefetch_detections_for_detail()` — detections + nested classifications + `source_image`, ordered most-recent-first
  - `best_prediction_from_prefetch()` — mirrors `Occurrence.predictions()` per-algorithm max-score grouping in Python
  - `best_identification_from_prefetch()` — mirrors `BEST_IDENTIFICATION_ORDER` + `withdrawn=False`
  - `detection_image_urls_from_prefetch(occurrence, limit=...)` — bounded URL list
  - `_require_prefetch()` — strict gate; raises if relations missing from `_prefetched_objects_cache`
- `OccurrenceQuerySet.with_list_prefetches()` / `with_detail_prefetches()` — aggregator methods on the queryset.
- `OccurrenceListSerializer`: `detection_images`, `determination_details`, `best_machine_prediction` now route through the strict helpers — no fallback to model methods. `detection_images_limit: int | None = 1` — list cards render a single cover image.
- `OccurrenceSerializer` (detail) overrides `detection_images_limit = None` (unbounded; TODO: bound when occurrence tracking lands).
- `OccurrenceViewSet.get_queryset` calls `with_list_prefetches()` on list, `with_detail_prefetches()` on detail.
- `JSONExporter.get_queryset` (`ami/exports/format_types.py`) adds `with_detail_prefetches()`. The export serializer subclasses `OccurrenceSerializer` and inherits the strict helpers, so it needs the same prefetch contract.
Behavior changes (read carefully)

- List `detection_images` is now capped at 1 (was unbounded via the model method, but in practice fed from the same query and rarely larger). Detail response still returns all detection images. Response shape unchanged.
- Any caller of `OccurrenceListSerializer` or `OccurrenceSerializer` must apply `with_list_prefetches()` or `with_detail_prefetches()` on the queryset. CI surfaced one missing site (`JSONExporter`) — fixed here. Audited all six occurrence-related serializers; only the three that hit the strict helpers (`OccurrenceListSerializer`, `OccurrenceSerializer`, `OccurrenceExportSerializer`) need prefetches, and all three are wired.
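The bounded detection-image list reduces, in essence, to filtering prefetched paths and slicing. A stand-in sketch (the real code builds full media URLs from prefetched Detection objects; this helper is hypothetical):

```python
from typing import List, Optional, Sequence

def capped_image_paths(paths: Sequence[Optional[str]], limit: Optional[int] = None) -> List[str]:
    """Drop missing paths, preserve prefetch ordering, and apply the cap
    (limit=1 for list cards; None or a large bound for detail)."""
    present = [p for p in paths if p]
    return present if limit is None else present[:limit]
```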
Caveats

`_require_prefetch` only checks top-level relations. A caller prefetching `detections` without nested `classifications` will silently re-introduce per-detection queries. Mitigated by always going through the list/detail factories. Tighter depth-enforcement deferred to the django-zen-queries follow-up.
Deferred to follow-up

- `django-zen-queries` `QueriesDisabledViewMixin` as a guardrail (cachalot interaction needs a quick spike — cache-key lookups may register as queries)
- `SourceImageViewSet` and `TaxonViewSet` list paths for the same pattern
Test plan

- `python manage.py test ami.main.tests.TestOccurrenceListQueryCount` passes
- `python manage.py test ami.main.tests.TestOccurrenceDetailQueryCount` passes
- `python manage.py test ami.main.tests.TestOccurrencePrefetchHelpersEdgeCases` passes (zero-detection no 500, missing prefetch raises)
- `python manage.py test ami.exports.tests.DataExportTest.test_export_record_count ami.exports.tests.DataExportTest.test_json_export_record_count` passes (regression caught the missing exporter prefetch)
- `/api/v2/occurrences/?project_id=18&limit=25` against staging, confirm response shape unchanged and Sentry latency drops
- `/api/v2/occurrences/<id>/` still returns full nested detections/predictions