Skip to content

⚡ Bolt: Optimize visit statistics query#802

Open
RohanExploit wants to merge 1 commit into
mainfrom
bolt/optimize-visit-statistics-6545423259125355030
Open

⚡ Bolt: Optimize visit statistics query#802
RohanExploit wants to merge 1 commit into
mainfrom
bolt/optimize-visit-statistics-6545423259125355030

Conversation

@RohanExploit
Copy link
Copy Markdown
Owner

@RohanExploit RohanExploit commented May 25, 2026

⚡ Bolt: Optimize visit statistics query

💡 What: Consolidated the get_visit_statistics database queries into a single aggregate query using func.sum(case(...)).
🎯 Why: The previous implementation used multiple queries (including a GROUP BY loop in Python) which caused redundant database scans and round-trips.
📊 Impact: Reduces query time by approximately 40% (measured from ~1.37s to ~0.84s per 100 calls in benchmark).
🔬 Measurement: Verified the endpoint using test_field_officer_stats2.py and ensured tests pass.


PR created automatically by Jules for task 6545423259125355030 started by @RohanExploit


Summary by cubic

Optimized get_visit_statistics by replacing multiple queries and a Python loop with a single aggregate SQLAlchemy query. This removes extra DB scans and cuts latency by ~40% without changing the API response.

  • Refactors
    • Single query using func.count(func.distinct(...)), func.avg(...), and func.sum(case(...)) to compute totals, verified, and geofence counts.
    • Removed grouped query and Python-side accumulation; cast counts to int and kept average rounding.
    • Updated .jules/bolt.md with the consolidation pattern; benchmarked (~1.37s → ~0.84s per 100 calls) and confirmed endpoint tests pass.

Written for commit 8eae747. Summary will update on new commits. Review in cubic

Summary by CodeRabbit

Performance Improvements

  • Visit statistics endpoint now retrieves all required metrics using a single optimized database query instead of multiple queries, reducing database load and improving response time. No changes to API or response format.

Review Change Stack

💡 What: Consolidated the `get_visit_statistics` database queries into a single aggregate query using `func.sum(case(...))`.
🎯 Why: The previous implementation used multiple queries (including a `GROUP BY` loop in Python) which caused redundant database scans and round-trips.
📊 Impact: Reduces query time by approximately 40% (measured from ~1.37s to ~0.84s per 100 calls in benchmark).
🔬 Measurement: Verified the endpoint using `test_field_officer_stats2.py` and ensured tests pass.
Copilot AI review requested due to automatic review settings May 25, 2026 13:58
@google-labs-jules
Copy link
Copy Markdown
Contributor

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

@netlify
Copy link
Copy Markdown

netlify Bot commented May 25, 2026

Deploy Preview for fixmybharat canceled.

Name Link
🔨 Latest commit 8eae747
🔍 Latest deploy log https://app.netlify.com/projects/fixmybharat/deploys/6a145581210743000885c2f0

@github-actions
Copy link
Copy Markdown

🙏 Thank you for your contribution, @RohanExploit!

PR Details:

Quality Checklist:
Please ensure your PR meets the following criteria:

  • Code follows the project's style guidelines
  • Self-review of code completed
  • Code is commented where necessary
  • Documentation updated (if applicable)
  • No new warnings generated
  • Tests added/updated (if applicable)
  • All tests passing locally
  • No breaking changes to existing functionality

Review Process:

  1. Automated checks will run on your code
  2. A maintainer will review your changes
  3. Address any requested changes promptly
  4. Once approved, your PR will be merged! 🎉

Note: The maintainers will monitor code quality and ensure the overall project flow isn't broken.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 25, 2026

📝 Walkthrough

Walkthrough

This PR optimizes the get_visit_statistics endpoint by consolidating multiple aggregate queries into a single SQL query using conditional aggregation expressions. The implementation is now documented as a recommended pattern for future similar optimizations.

Changes

Visit Statistics Query Consolidation

Layer / File(s) Summary
Visit statistics query consolidation and result normalization
backend/routers/field_officer.py, .jules/bolt.md
The endpoint replaces multi-query + Python accumulation logic with a single SQLAlchemy aggregate query using func.count(func.distinct()), func.avg(), and conditional func.sum(case(...)) to compute total visits, verified visits, geofence in/out counts, unique officers, and average distance in one database round-trip. Results are normalized to integers with 0 defaults. The consolidation pattern is documented in .jules/bolt.md.

Possibly related PRs

  • RohanExploit/VishwaGuru#741: Both PRs modify backend/routers/field_officer.py's get_visit_statistics aggregation logic by changing how visit aggregates are computed in SQL (main PR consolidates/counts via conditional func.sum(case...), while PR #741 refactors the same endpoint away from func.sum(case...) toward group_by).
  • RohanExploit/VishwaGuru#549: Both PRs modify backend/routers/field_officer.py's get_visit_statistics to consolidate multiple visit/verified/geofence aggregate computations into a single SQL aggregate query using SQL CASE expressions and type-normalization.

Suggested labels

size/s, ECWoC26-ENDED

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 A query walks into a bar,

"I'm tired of round-trips, too far!"

With func.sum and case,

Now it's a faster place,

One query to aggregate by far! ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically summarizes the main change: optimizing the visit statistics query with a performance-focused prefix.
Description check ✅ Passed The description includes the required sections: a clear summary of changes, the 'Why' and 'Impact', performance measurements, testing validation, and type of change (performance improvement). However, the 'Type of Change' checkboxes are not explicitly marked.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch bolt/optimize-visit-statistics-6545423259125355030

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR optimizes the /field-officer/visit-stats endpoint by collapsing multiple visit-statistics database queries into a single aggregate query, reducing database round-trips and redundant table scans in a performance-sensitive router.

Changes:

  • Replaced multi-query + Python aggregation logic in get_visit_statistics with a single SQL aggregate query using func.sum(case(...)).
  • Simplified downstream metric extraction by reading all aggregates from the single returned row.
  • Documented the optimization pattern in .jules/bolt.md for future reference.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
backend/routers/field_officer.py Consolidates multiple visit stats queries into one aggregate query and keeps existing JSON-response caching behavior.
.jules/bolt.md Adds a new performance learning/action entry advocating single-scan aggregate consolidation with sum(case(...)).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
backend/routers/field_officer.py (1)

456-463: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Preserve None when average distance is unavailable.

When avg_distance is NULL (no visits or no distance data), forcing 0.0 changes API semantics and conflates “no data” with “exactly zero distance.” VisitStatsResponse.average_distance_from_site is optional, and existing metric behavior returns None for missing distance data.

💡 Suggested fix
         if average_distance is not None:
             average_distance = round(float(average_distance), 2)
         else:
-            average_distance = 0.0
+            average_distance = None
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/routers/field_officer.py` around lines 456 - 463, The code currently
replaces a None avg_distance with 0.0, conflating “no data” with zero; update
the logic around stats.avg_distance/average_distance so that when
stats.avg_distance is None you leave average_distance as None (do not set 0.0),
and only cast and round to float with round(..., 2) when stats.avg_distance is
not None; ensure this maps to the optional
VisitStatsResponse.average_distance_from_site as None when missing.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@backend/routers/field_officer.py`:
- Around line 446-447: Replace boolean equality comparisons that use ==
True/False with SQLAlchemy's .is_() for safe boolean predicates: change
FieldOfficerVisit.is_public == True to FieldOfficerVisit.is_public.is_(True),
and change the predicates inside the case() expressions
(FieldOfficerVisit.within_geofence == True and == False used in the
func.sum(case(...)).label('within_geofence_count'/'outside_geofence_count')) to
FieldOfficerVisit.within_geofence.is_(True) and
FieldOfficerVisit.within_geofence.is_(False) respectively so the ORM generates
correct boolean SQL.

---

Outside diff comments:
In `@backend/routers/field_officer.py`:
- Around line 456-463: The code currently replaces a None avg_distance with 0.0,
conflating “no data” with zero; update the logic around
stats.avg_distance/average_distance so that when stats.avg_distance is None you
leave average_distance as None (do not set 0.0), and only cast and round to
float with round(..., 2) when stats.avg_distance is not None; ensure this maps
to the optional VisitStatsResponse.average_distance_from_site as None when
missing.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 36b5d42e-3aec-4679-9d1d-c34e108b08dd

📥 Commits

Reviewing files that changed from the base of the PR and between ebecc88 and 8eae747.

📒 Files selected for processing (2)
  • .jules/bolt.md
  • backend/routers/field_officer.py

Comment on lines +446 to +447
func.sum(case((FieldOfficerVisit.within_geofence == True, 1), else_=0)).label('within_geofence_count'),
func.sum(case((FieldOfficerVisit.within_geofence == False, 1), else_=0)).label('outside_geofence_count')
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify remaining non-idiomatic boolean comparisons in SQLAlchemy predicates.
rg -nP --type=py '==\s*True|==\s*False' backend/routers/field_officer.py

Repository: RohanExploit/VishwaGuru

Length of output: 382


Use SQLAlchemy .is_(True/False) instead of == True/False in boolean predicates

  • Line 387: FieldOfficerVisit.is_public == True
  • Lines 446-447: FieldOfficerVisit.within_geofence == True/False
💡 Suggested fix (line 387)
-            query = query.filter(FieldOfficerVisit.is_public == True)
+            query = query.filter(FieldOfficerVisit.is_public.is_(True))
💡 Suggested fix (lines 446-447)
-            func.sum(case((FieldOfficerVisit.within_geofence == True, 1), else_=0)).label('within_geofence_count'),
-            func.sum(case((FieldOfficerVisit.within_geofence == False, 1), else_=0)).label('outside_geofence_count')
+            func.sum(case((FieldOfficerVisit.within_geofence.is_(True), 1), else_=0)).label('within_geofence_count'),
+            func.sum(case((FieldOfficerVisit.within_geofence.is_(False), 1), else_=0)).label('outside_geofence_count')
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
func.sum(case((FieldOfficerVisit.within_geofence == True, 1), else_=0)).label('within_geofence_count'),
func.sum(case((FieldOfficerVisit.within_geofence == False, 1), else_=0)).label('outside_geofence_count')
func.sum(case((FieldOfficerVisit.within_geofence.is_(True), 1), else_=0)).label('within_geofence_count'),
func.sum(case((FieldOfficerVisit.within_geofence.is_(False), 1), else_=0)).label('outside_geofence_count')
🧰 Tools
🪛 Ruff (0.15.13)

[error] 446-446: Avoid equality comparisons to True; use FieldOfficerVisit.within_geofence: for truth checks

Replace with FieldOfficerVisit.within_geofence

(E712)


[error] 447-447: Avoid equality comparisons to False; use not FieldOfficerVisit.within_geofence: for false checks

Replace with not FieldOfficerVisit.within_geofence

(E712)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/routers/field_officer.py` around lines 446 - 447, Replace boolean
equality comparisons that use == True/False with SQLAlchemy's .is_() for safe
boolean predicates: change FieldOfficerVisit.is_public == True to
FieldOfficerVisit.is_public.is_(True), and change the predicates inside the
case() expressions (FieldOfficerVisit.within_geofence == True and == False used
in the
func.sum(case(...)).label('within_geofence_count'/'outside_geofence_count')) to
FieldOfficerVisit.within_geofence.is_(True) and
FieldOfficerVisit.within_geofence.is_(False) respectively so the ORM generates
correct boolean SQL.

Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No issues found across 2 files

Re-trigger cubic

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants