Fix blockchain integrity, optimize deduplication, and add issue caching #396

RohanExploit wants to merge 4 commits into main
Conversation
- Added `previous_integrity_hash` and `parent_issue_id` to `Issue` model and migration.
- Updated `create_issue` to store duplicate issues with `status='duplicate'` and `parent_issue_id` for better tracking.
- Implemented `GET /api/issues/{issue_id}` with thread-safe caching (`issue_details_cache`).
- Fixed blockchain verification to use stored `previous_integrity_hash` (O(1) verification) with fallback for legacy data.
- Added cache invalidation for issue updates (`upvote`, `verify`, `status_update`).
Co-authored-by: RohanExploit <178623867+RohanExploit@users.noreply.github.com>
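For readers skimming the diff, here is a minimal sketch of what the O(1) verification described above could look like. The field names (`integrity_hash`, `previous_integrity_hash`) and the `description|category|prev_hash` recipe come from this PR; the import path and function name are assumptions, not the actual implementation.

```python
import hashlib

from backend.models import Issue  # assumed module path


def verify_issue_integrity(issue, db) -> bool:
    """Recompute one issue's chained hash and compare it to the stored seal."""
    prev_hash = issue.previous_integrity_hash
    if prev_hash is None:
        # Legacy fallback: rows created before this PR have no stored previous
        # hash, so look up the immediately preceding row instead (old O(n) path).
        prev_row = (
            db.query(Issue.integrity_hash)
            .filter(Issue.id < issue.id)
            .order_by(Issue.id.desc())
            .first()
        )
        prev_hash = prev_row[0] if prev_row and prev_row[0] else ""

    expected = hashlib.sha256(
        f"{issue.description}|{issue.category}|{prev_hash}".encode()
    ).hexdigest()
    return expected == issue.integrity_hash
```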
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode; when this mode is on, I will only act on comments where you specifically mention me. New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task.
❌ Deploy Preview for fixmybharat failed. Why did it fail? →
🙏 Thank you for your contribution, @RohanExploit!

PR Details:
Quality Checklist:
Review Process:
Note: The maintainers will monitor code quality and ensure the overall project flow isn't broken.
📝 Walkthrough

Adds a global thread-safe `issue_details_cache`.

Changes
Sequence Diagram

```mermaid
sequenceDiagram
    participant Client
    participant Router as issues.Router
    participant Blockchain as IntegrityCalc
    participant DB as Database
    participant Cache as CacheSystem

    Client->>Router: POST /create_issue(payload)
    Router->>Blockchain: fetch_previous_hash() / compute(prev_hash, description, category)
    Blockchain-->>Router: prev_hash, integrity_hash
    Router->>DB: query nearby issues
    DB-->>Router: nearby list
    alt no nearby
        Router->>DB: save Issue(integrity_hash, previous_integrity_hash, status="new")
        DB-->>Router: new_issue
        Router->>Cache: invalidate recent_issues_cache, nearby_issues_cache
        Router-->>Client: {id: issue_id, message: "new issue created"}
    else nearby found
        Router->>DB: save Issue(status="duplicate", parent_issue_id, previous_integrity_hash)
        DB-->>Router: duplicate_issue
        Router->>Cache: invalidate recent_issues_cache, nearby_issues_cache
        Router-->>Client: {id: null, message: "linked to existing issue"}
    end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs
🚥 Pre-merge checks | ✅ 3 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (3 passed)

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
🧪 Generate unit tests (beta)

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
backend/routers/issues.py (2)
146-161: ⚠️ Potential issue | 🟠 Major. Stale cache: linked issue's detail cache not invalidated after duplicate-triggered upvote.

When a duplicate is detected, the closest existing issue is upvoted (lines 152–156), but `issue_details_cache` for that `linked_issue_id` is never invalidated. Subsequent `GET /api/issues/{linked_issue_id}` calls will return a stale upvote count until the TTL expires. Add invalidation after the duplicate upvote:
🐛 Proposed fix
```diff
     # Commit the upvote
     await run_in_threadpool(db.commit)
+    # Invalidate cache for the upvoted issue
+    issue_details_cache.invalidate(f"issue_{linked_issue_id}")
     logger.info(f"Spatial deduplication: Linked new report to existing issue {linked_issue_id}")
```

Also applies to: 228-240
228-240: ⚠️ Potential issue | 🟡 Minor. Cache not invalidated for duplicate issues.

When a duplicate is created, the `recent_issues_cache` and `nearby_issues_cache` are not cleared (the invalidation block at lines 236–238 is gated by `new_issue.status != "duplicate"`). While duplicates have `status='duplicate'` and probably won't appear in "open" filtered queries, the `recent_issues` endpoint has no status filter, so duplicates could appear there once the cache expires but wouldn't show up while the old cache is still warm. This inconsistency is minor but could confuse debugging.

Consider moving cache invalidation outside the `status != "duplicate"` guard, or documenting why duplicates shouldn't invalidate list caches. A sketch of one possible restructuring follows.
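For illustration only, the guard could be restructured roughly as follows (cache names are taken from the review comments; the surrounding `create_issue` code and attribute names are assumed):

```python
# Clear the list caches for every saved report, duplicate or not, so the
# recent/nearby endpoints never serve from a stale warm cache.
recent_issues_cache.clear()
nearby_issues_cache.clear()

if new_issue.status == "duplicate" and new_issue.parent_issue_id is not None:
    # The duplicate path upvotes its parent, so that issue's detail cache is stale too.
    issue_details_cache.invalidate(f"issue_{new_issue.parent_issue_id}")
```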
🤖 Fix all issues with AI agents
In `@backend/routers/issues.py`:
- Around line 168-177: Add a short comment above the block that computes
prev_hash/integrity_hash (the run_in_threadpool call that queries
Issue.integrity_hash and the subsequent prev_hash and integrity_hash
computation) noting the race condition: concurrent create_issue requests can
read the same latest integrity_hash before commits and thus produce identical
previous_integrity_hash values (a forked chain); mention this is a known
limitation unless writes are serialized (e.g., DB row/table locks, serializable
transactions, or a monotonic sequence enforced), so auditors are aware and you
can consider adding locking or transaction isolation in future.
- Around line 270-315: The get_issue route (function get_issue handling GET
"/api/issues/{issue_id}") is declared before static routes and will preemptively
match requests like "/api/issues/nearby" causing 422s; move the get_issue route
declaration so it appears after the static routes "/api/issues/nearby",
"/api/issues/user", "/api/issues/status", and "/api/issues/recent" in the file
so the static paths are registered first and will be matched before the
parameterized "/api/issues/{issue_id}" handler.
🧹 Nitpick comments (1)
backend/init_db.py (1)
127-147: Migration steps follow the existing pattern.

The new migration steps are consistent with the rest of the file. One minor note: lines 130 and 137 use `print()` while the index creation on line 144 uses `logger.info()`. This inconsistency exists elsewhere in the file too, but worth standardizing if you do a cleanup pass.
backend/routers/issues.py
Outdated
```python
# Fetch previous issue's integrity hash for blockchain chaining
# Optimization: Fetch only the last hash to maintain the chain with minimal overhead
prev_issue = await run_in_threadpool(
    lambda: db.query(Issue.integrity_hash).order_by(Issue.id.desc()).first()
)
prev_hash = prev_issue[0] if prev_issue and prev_issue[0] else ""

# Simple but effective SHA-256 chaining
hash_content = f"{description}|{category}|{prev_hash}"
integrity_hash = hashlib.sha256(hash_content.encode()).hexdigest()
```
Race condition: concurrent create_issue calls can fetch the same prev_hash.
Two concurrent requests can both read the same latest integrity_hash before either commits, resulting in two issues with identical previous_integrity_hash values — a forked chain. This is somewhat inherent to the design without serialized writes, but worth documenting as a known limitation, especially if blockchain integrity audits are performed on the full chain.
At minimum, consider adding a comment acknowledging this concurrency limitation.
🤖 Prompt for AI Agents
In `@backend/routers/issues.py` around lines 168 - 177, Add a short comment above
the block that computes prev_hash/integrity_hash (the run_in_threadpool call
that queries Issue.integrity_hash and the subsequent prev_hash and
integrity_hash computation) noting the race condition: concurrent create_issue
requests can read the same latest integrity_hash before commits and thus produce
identical previous_integrity_hash values (a forked chain); mention this is a
known limitation unless writes are serialized (e.g., DB row/table locks,
serializable transactions, or a monotonic sequence enforced), so auditors are
aware and you can consider adding locking or transaction isolation in future.
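If the team later decides to serialize chain writes rather than only documenting the limitation, one option is a PostgreSQL advisory lock (the migration notes elsewhere in this PR mention PostgreSQL). The sketch below is hedged: the lock key, helper name, and `Issue` import path are hypothetical, and it assumes a SQLAlchemy `Session`.

```python
import hashlib

from sqlalchemy import text
from sqlalchemy.orm import Session

from backend.models import Issue  # assumed module path

ISSUE_CHAIN_LOCK_KEY = 396_001  # arbitrary application-level lock id (hypothetical)


def append_issue_to_chain(db: Session, description: str, category: str, **fields) -> Issue:
    # pg_advisory_xact_lock blocks until the lock is available and releases it
    # automatically when the enclosing transaction commits or rolls back, so two
    # concurrent create_issue calls can no longer read the same prev_hash.
    db.execute(text("SELECT pg_advisory_xact_lock(:key)"), {"key": ISSUE_CHAIN_LOCK_KEY})

    prev = db.query(Issue.integrity_hash).order_by(Issue.id.desc()).first()
    prev_hash = prev[0] if prev and prev[0] else ""
    integrity_hash = hashlib.sha256(
        f"{description}|{category}|{prev_hash}".encode()
    ).hexdigest()

    issue = Issue(
        description=description,
        category=category,
        integrity_hash=integrity_hash,
        previous_integrity_hash=prev_hash,
        **fields,
    )
    db.add(issue)
    db.commit()
    return issue
```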
Pull request overview
This PR implements blockchain integrity improvements, duplicate issue tracking, and performance optimizations for the issue reporting system. The changes enhance data reliability by storing the previous hash directly on each issue record for O(1) blockchain verification, while preserving all duplicate reports for analytics purposes. A new caching layer for individual issue details is introduced along with the missing GET endpoint that the frontend expects.
Changes:
- Blockchain integrity: Store `previous_integrity_hash` on each issue for efficient verification, with fallback to the legacy query method
- Duplicate tracking: Save duplicate reports with `status='duplicate'` and `parent_issue_id` while maintaining the existing API contract (returning `id=None`)
- Performance: Add `issue_details_cache` and implement caching for the new issue details endpoint
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| backend/models.py | Add previous_integrity_hash and parent_issue_id columns to Issue model with ForeignKey constraint |
| backend/init_db.py | Add migration scripts for new columns and index on parent_issue_id |
| backend/cache.py | Add new issue_details_cache instance with 60s TTL and 1000 max entries |
| backend/routers/issues.py | Implement blockchain hash storage, duplicate issue tracking, new GET endpoint, and cache invalidation |
| tests/test_blockchain.py | Update tests to include previous_integrity_hash in test data |
backend/models.py
Outdated
```python
action_plan = Column(JSONEncodedDict, nullable=True)
integrity_hash = Column(String, nullable=True)  # Blockchain integrity seal
previous_integrity_hash = Column(String, nullable=True)
parent_issue_id = Column(Integer, ForeignKey("issues.id"), nullable=True, index=True)
```
The ForeignKey constraint ForeignKey("issues.id") on parent_issue_id should specify ondelete behavior. Without an explicit ondelete parameter, if the parent issue is deleted, the database behavior depends on the backend configuration and could cause constraint violations or orphaned records. Consider adding ondelete="SET NULL" or ondelete="CASCADE" based on the desired behavior for duplicate tracking when parent issues are deleted.
```diff
-parent_issue_id = Column(Integer, ForeignKey("issues.id"), nullable=True, index=True)
+parent_issue_id = Column(Integer, ForeignKey("issues.id", ondelete="SET NULL"), nullable=True, index=True)
```
backend/routers/issues.py
Outdated
```python
# Invalidate cache so new issue appears
try:
    recent_issues_cache.clear()
    nearby_issues_cache.clear()
```
The nearby_issues_cache is being cleared when a new non-duplicate issue is created, but this cache should also be invalidated when issues are updated (e.g., status changes, upvotes) since those changes affect the nearby issues query results. Consider whether this cache should have a shorter TTL or be invalidated more granularly based on geographic proximity to changed issues.
tests/test_blockchain.py
Outdated
```diff
     category="Road",
-    integrity_hash="invalidhash"
+    integrity_hash="invalidhash",
+    previous_integrity_hash=""
```
Tests should be updated to verify that duplicate issues are properly saved with status='duplicate' and parent_issue_id set correctly. The existing test_blockchain.py only tests the blockchain integrity fields but doesn't cover the new duplicate tracking functionality introduced in this PR. Consider adding tests that create a duplicate report and verify the database record is created with the correct status and parent_issue_id.
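A hedged sketch of the kind of test being asked for. The fixture names (`client`, `db_session`), the create endpoint path, and the payload fields are assumptions and would need to match the existing setup in `tests/test_blockchain.py`:

```python
from backend.models import Issue  # assumed module path


def test_duplicate_report_is_saved_and_linked(client, db_session):
    payload = {
        "description": "Pothole near the market",
        "category": "Road",
        "latitude": 12.9716,
        "longitude": 77.5946,
    }

    # First report creates a brand-new issue.
    first = client.post("/api/issues", json=payload)  # endpoint path assumed
    parent_id = first.json()["id"]
    assert parent_id is not None

    # Second report at the same spot should keep the existing API contract (id=None)...
    second = client.post("/api/issues", json=payload)
    assert second.json()["id"] is None

    # ...while still persisting a row with status='duplicate' and parent_issue_id set.
    duplicate = (
        db_session.query(Issue)
        .filter(Issue.status == "duplicate")
        .order_by(Issue.id.desc())
        .first()
    )
    assert duplicate is not None
    assert duplicate.parent_issue_id == parent_id
```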
backend/init_db.py
Outdated
```python
conn.execute(text("ALTER TABLE issues ADD COLUMN parent_issue_id INTEGER"))
print("Migrated database: Added parent_issue_id column.")
```
The migration adds the parent_issue_id column but doesn't create the foreign key constraint defined in the SQLAlchemy model. For existing databases, the ForeignKey constraint won't be enforced until explicitly added via migration. Consider adding a migration statement to create the constraint: ALTER TABLE issues ADD CONSTRAINT fk_parent_issue_id FOREIGN KEY (parent_issue_id) REFERENCES issues(id) with appropriate ON DELETE behavior.
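If the constraint should also exist on already-deployed PostgreSQL databases, a migration step in the style of the `safe_migrate` helper introduced later in this PR might look like the following. The constraint name is hypothetical, and SQLite databases would need a table rebuild instead, since SQLite's `ALTER TABLE` cannot add constraints:

```python
safe_migrate(
    """
    ALTER TABLE issues
    ADD CONSTRAINT fk_issues_parent_issue_id
    FOREIGN KEY (parent_issue_id) REFERENCES issues(id) ON DELETE SET NULL
    """,
    "Migrated database: Added foreign key constraint on parent_issue_id.",
)
```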
backend/routers/issues.py
Outdated
```python
issue = db.query(
    Issue.id,
    Issue.category,
    Issue.description,
    Issue.created_at,
    Issue.image_path,
    Issue.status,
    Issue.upvotes,
    Issue.location,
    Issue.latitude,
    Issue.longitude
).filter(Issue.id == issue_id).first()
```
The GET endpoint returns issues regardless of their status, including issues with status='duplicate'. Consider whether duplicate issues should be exposed via this public API endpoint, or if they should be filtered out since they're primarily for internal analytics. If duplicates should be accessible, document this behavior in the docstring.
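If the decision is to hide duplicates from the public endpoint, the filter change is small. This is a sketch only; the selected columns are elided here and `HTTPException` is assumed to be imported from FastAPI:

```python
# Exclude internal duplicate records from the public detail endpoint.
issue = (
    db.query(Issue)
    .filter(Issue.id == issue_id, Issue.status != "duplicate")
    .first()
)
if issue is None:
    raise HTTPException(status_code=404, detail="Issue not found")
```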
backend/routers/issues.py
Outdated
```python
action_plan=None,
integrity_hash=integrity_hash,
previous_integrity_hash=prev_hash,
status="duplicate",
```
The status value "duplicate" should be added to the IssueStatus enum in backend/schemas.py. Currently, the Issue model uses this string literal status value, but it's not defined in the enum. This could cause validation issues or unexpected behavior when the status field is validated against the enum in API responses or status transitions. Add DUPLICATE = "duplicate" to the IssueStatus enum.
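A minimal sketch of that addition, assuming `IssueStatus` is a `str`-backed enum in `backend/schemas.py` as the comment describes (the existing members shown are illustrative, not taken from the file):

```python
from enum import Enum


class IssueStatus(str, Enum):
    NEW = "new"              # illustrative existing member
    RESOLVED = "resolved"    # illustrative existing member
    DUPLICATE = "duplicate"  # proposed: matches status='duplicate' written by create_issue
```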
2 issues found across 5 files
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="backend/routers/issues.py">
<violation number="1" location="backend/routers/issues.py:211">
P2: Missing `issue_details_cache` invalidation for the parent issue after the deduplication upvote. The `upvote_issue` endpoint correctly invalidates the cache, but this deduplication code path does not, so `GET /api/issues/{linked_issue_id}` will return a stale upvote count. Add `issue_details_cache.invalidate(f"issue_{linked_issue_id}")` after the upvote commit.</violation>
<violation number="2" location="backend/routers/issues.py:270">
P0: Route declaration order bug: This parameterized route `/api/issues/{issue_id}` must be declared after all static `/api/issues/*` sub-routes (like `/api/issues/nearby`, `/api/issues/user`, `/api/issues/status`, `/api/issues/recent`). FastAPI/Starlette matches routes in declaration order, so requests to static paths will attempt int validation on path segments like 'nearby' and return 422 errors instead of reaching the correct handlers.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
backend/routers/issues.py
Outdated
```python
        linked_issue_id=linked_issue_id
    )


@router.get("/api/issues/{issue_id}", response_model=IssueSummaryResponse)
```
P0: Route declaration order bug: This parameterized route /api/issues/{issue_id} must be declared after all static /api/issues/* sub-routes (like /api/issues/nearby, /api/issues/user, /api/issues/status, /api/issues/recent). FastAPI/Starlette matches routes in declaration order, so requests to static paths will attempt int validation on path segments like 'nearby' and return 422 errors instead of reaching the correct handlers.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At backend/routers/issues.py, line 270:
<comment>Route declaration order bug: This parameterized route `/api/issues/{issue_id}` must be declared after all static `/api/issues/*` sub-routes (like `/api/issues/nearby`, `/api/issues/user`, `/api/issues/status`, `/api/issues/recent`). FastAPI/Starlette matches routes in declaration order, so requests to static paths will attempt int validation on path segments like 'nearby' and return 422 errors instead of reaching the correct handlers.</comment>
<file context>
@@ -247,6 +267,53 @@ async def create_issue(
linked_issue_id=linked_issue_id
)
+@router.get("/api/issues/{issue_id}", response_model=IssueSummaryResponse)
+def get_issue(issue_id: int, db: Session = Depends(get_db)):
+ """
</file context>
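For reference, a sketch of the fixed ordering. Handler bodies are placeholders; only the relative declaration order matters, because FastAPI matches routes in the order they are registered:

```python
from fastapi import APIRouter

router = APIRouter()

# Static sub-paths are declared first...
@router.get("/api/issues/nearby")
def get_nearby_issues(lat: float, lon: float):
    ...

@router.get("/api/issues/recent")
def get_recent_issues():
    ...

# ...and the parameterized route comes last, so "nearby"/"recent" are never
# parsed as an int issue_id (which previously produced 422 errors).
@router.get("/api/issues/{issue_id}")
def get_issue(issue_id: int):
    ...
```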
- Refactored `backend/init_db.py` to use a `safe_migrate` helper function.
- Each migration step is now executed in an independent transaction (commit on success, rollback on failure).
- This prevents `current transaction is aborted` errors on PostgreSQL when checking for existing columns, ensuring all migrations (including `previous_integrity_hash` and `parent_issue_id`) run successfully.
- This fixes the deployment failure where new columns were missing due to aborted transactions.

Co-authored-by: RohanExploit <178623867+RohanExploit@users.noreply.github.com>
1 issue found across 1 file (changes from recent commits).
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="backend/init_db.py">
<violation number="1" location="backend/init_db.py:40">
P2: Silent exception swallowing hides real migration failures. The `except Exception` block discards the error without any logging, making it impossible to diagnose unexpected failures (permissions, disk issues, DB corruption, etc.). At minimum, log the exception at `debug` or `warning` level so operators can troubleshoot.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
backend/init_db.py
Outdated
```python
except Exception:
    conn.rollback()
```
P2: Silent exception swallowing hides real migration failures. The except Exception block discards the error without any logging, making it impossible to diagnose unexpected failures (permissions, disk issues, DB corruption, etc.). At minimum, log the exception at debug or warning level so operators can troubleshoot.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At backend/init_db.py, line 40:
<comment>Silent exception swallowing hides real migration failures. The `except Exception` block discards the error without any logging, making it impossible to diagnose unexpected failures (permissions, disk issues, DB corruption, etc.). At minimum, log the exception at `debug` or `warning` level so operators can troubleshoot.</comment>
<file context>
@@ -31,206 +31,98 @@ def migrate_db():
+ conn.commit()
+ if success_msg:
+ logger.info(success_msg)
+ except Exception:
+ conn.rollback()
+
</file context>
```diff
-except Exception:
-    conn.rollback()
+except Exception as e:
+    conn.rollback()
+    logger.debug(f"Migration skipped (likely already applied): {e}")
```
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@backend/init_db.py`:
- Around line 34-41: The safe_migrate function is swallowing all
exceptions—capture and log the exception before rolling back so real failures
are visible; inside safe_migrate (which calls conn.execute(text(sql)) and
conn.commit()) update the except Exception block to log the exception (e.g.,
logger.debug or logger.error with the exception object and include context like
the SQL or success_msg) and then perform conn.rollback() so you still revert on
error but retain diagnostic information.
backend/init_db.py
Outdated
```python
def safe_migrate(sql, success_msg):
    try:
        conn.execute(text(sql))
        conn.commit()
        if success_msg:
            logger.info(success_msg)
    except Exception:
        conn.rollback()
```
safe_migrate silently swallows all migration failures — at minimum log the exception.
The except Exception block rolls back but discards the error entirely. This is fine for expected "column/index already exists" errors, but genuine failures (permission denied, syntax errors, disk full, connection issues) will be invisible, making production debugging very difficult.
Add at least a logger.debug so failures are traceable when needed:
Proposed fix
```diff
 def safe_migrate(sql, success_msg):
     try:
         conn.execute(text(sql))
         conn.commit()
         if success_msg:
             logger.info(success_msg)
-    except Exception:
+    except Exception as e:
         conn.rollback()
+        logger.debug("Migration skipped or failed: %s — %s", sql.strip()[:60], e)
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
def safe_migrate(sql, success_msg):
    try:
        conn.execute(text(sql))
        conn.commit()
        if success_msg:
            logger.info(success_msg)
    except Exception as e:
        conn.rollback()
        logger.debug("Migration skipped or failed: %s — %s", sql.strip()[:60], e)
```
🧰 Tools
🪛 Ruff (0.15.0)
[warning] 40-40: Do not catch blind exception: Exception
(BLE001)
🤖 Prompt for AI Agents
In `@backend/init_db.py` around lines 34 - 41, The safe_migrate function is
swallowing all exceptions—capture and log the exception before rolling back so
real failures are visible; inside safe_migrate (which calls
conn.execute(text(sql)) and conn.commit()) update the except Exception block to
log the exception (e.g., logger.debug or logger.error with the exception object
and include context like the SQL or success_msg) and then perform
conn.rollback() so you still revert on error but retain diagnostic information.
- Moved `vite`, `tailwindcss`, `postcss`, and `autoprefixer` from `devDependencies` to `dependencies`.
- This ensures these packages are installed and available during the build process on platforms like Render/Netlify, which often set `NODE_ENV=production` and prune dev dependencies before building.
- Fixes the "vite: command not found" error during deployment.

Co-authored-by: RohanExploit <178623867+RohanExploit@users.noreply.github.com>
🔍 Quality Reminder
- Moved `vite`, `tailwindcss`, `postcss`, and `autoprefixer` from `devDependencies` to `dependencies`.
- This ensures these packages are installed and available during the build process on platforms like Render/Netlify, which often set `NODE_ENV=production` and prune dev dependencies before building.
- Fixes the "vite: command not found" error during deployment.
- Regenerated `package-lock.json` to reflect changes.

Co-authored-by: RohanExploit <178623867+RohanExploit@users.noreply.github.com>
This PR addresses several critical issues and optimizations:

- Blockchain integrity: `previous_integrity_hash` is stored on each record, enabling robust verification. Legacy data is supported via a fallback mechanism.
- Duplicate tracking: duplicate reports are saved with `status='duplicate'` and linked via `parent_issue_id`. This preserves data for analytics while maintaining the existing frontend behavior (returning `id=None` for duplicates).
- Performance: added `issue_details_cache` to `backend/cache.py` and implemented caching for the new `GET /api/issues/{issue_id}` endpoint.
- New endpoint: `GET /api/issues/{issue_id}`, which was missing but expected by the frontend/memory.

PR created automatically by Jules for task 8147028805788840111 started by @RohanExploit
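A minimal sketch of what the cached detail endpoint could look like. The cache's `get`/`set` method names, the import paths, and the response construction are assumptions (not the actual implementation in this PR); `router`, `get_db`, and `IssueSummaryResponse` are assumed to already exist in the module:

```python
from fastapi import Depends, HTTPException
from sqlalchemy.orm import Session

from backend.cache import issue_details_cache  # assumed import path
from backend.models import Issue               # assumed import path


@router.get("/api/issues/{issue_id}", response_model=IssueSummaryResponse)
def get_issue(issue_id: int, db: Session = Depends(get_db)):
    cache_key = f"issue_{issue_id}"
    cached = issue_details_cache.get(cache_key)
    if cached is not None:
        return cached

    issue = db.query(Issue).filter(Issue.id == issue_id).first()
    if issue is None:
        raise HTTPException(status_code=404, detail="Issue not found")

    result = IssueSummaryResponse.model_validate(issue, from_attributes=True)
    issue_details_cache.set(cache_key, result)  # 60s TTL per backend/cache.py
    return result
```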
Summary by cubic
Strengthens issue blockchain verification, speeds up reads with caching, fixes Postgres migrations, and stabilizes frontend builds.
Data Integrity
Performance, Migrations & Build
Written for commit 0af9212. Summary will update on new commits.
Summary by CodeRabbit
New Features
Refactor