🛡️ Sentinel: [MEDIUM] Fix Insufficient Logging of Database Exceptions#102
Conversation
Co-authored-by: daggerstuff <261005129+daggerstuff@users.noreply.github.com>
|
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task. |
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
Reviewer's guide (collapsed on small PRs)Reviewer's GuideAdds structured server-side logging for sqlite3 database exceptions in dataset APIs while preserving existing sanitized HTTP 500 responses, and documents the security learning in the Sentinel log. Sequence diagram for dataset API database error handling with loggingsequenceDiagram
actor Client
participant FastAPI as FastAPIApp
participant DatasetAPI as dataset_api_endpoint
participant SQLite as SQLiteDatabase
participant Logger as Logger
Client->>FastAPI: HTTP request (e.g. GET /datasets)
FastAPI->>DatasetAPI: Call list_datasets / get_dataset_metadata / query_dataset
DatasetAPI->>SQLite: Execute SQL query
SQLite-->>DatasetAPI: sqlite3.Error e
DatasetAPI->>Logger: error(Database error: e)
DatasetAPI->>FastAPI: Raise HTTPException 500 Database error occurred
FastAPI-->>Client: HTTP 500 response with generic error message
Flow diagram for logger usage and exception handling in dataset_apiflowchart TD
A[Start dataset API handler] --> B[Open SQLite connection]
B --> C[Execute SQL query]
C --> D{sqlite3.Error raised?}
D -- No --> E[Build normal response payload]
E --> F[Return successful HTTP response]
D -- Yes --> G["logger.error(Database error: e)"]
G --> H[Raise HTTPException 500 Database error occurred]
H --> I[Return generic 500 to client]
F --> J[Close connection in finally]
I --> J[Close connection in finally]
J --> K[End]
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
📝 WalkthroughWalkthroughAdded server-side logging of SQLite database exception details in three API handlers ( Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Hey - I've left some high level feedback:
- Consider logging the full traceback for these database errors (e.g., using
logger.exception(...)orlogger.error(..., exc_info=True)) so investigations have stack context rather than just the error message string. - You now have three nearly identical
sqlite3.Errorhandlers; consider extracting a small helper (e.g.,handle_db_error(e)) or context manager to centralize the logging and HTTPException raising logic.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Consider logging the full traceback for these database errors (e.g., using `logger.exception(...)` or `logger.error(..., exc_info=True)`) so investigations have stack context rather than just the error message string.
- You now have three nearly identical `sqlite3.Error` handlers; consider extracting a small helper (e.g., `handle_db_error(e)`) or context manager to centralize the logging and HTTPException raising logic.Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
There was a problem hiding this comment.
Pull request overview
This PR improves backend observability for SQLite failures in the Dataset Access API by logging sqlite3.Error exceptions server-side while preserving sanitized 500 responses to clients.
Changes:
- Added module-level logger initialization in
api/dataset_api.py. - Updated three API endpoints to catch
sqlite3.Error as eand log the error before raising a genericHTTPException(500). - Documented the new logging requirement in
.Jules/sentinel.md.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
api/dataset_api.py |
Adds logger + logs sqlite exceptions in dataset endpoints before returning sanitized 500s. |
.Jules/sentinel.md |
Adds a new Sentinel entry describing server-side logging for DB errors. |
Comments suppressed due to low confidence (1)
api/dataset_api.py:189
- In
list_datasets,connis first assigned inside thetry, but thefinallyblock unconditionally references it. Ifget_db_connection()raises beforeconnis set, this will raise anUnboundLocalErrorand mask the original DB exception. Initializeconn = Nonebefore thetry(as done in the other endpoints) so thefinallycheck is safe.
finally:
if conn:
conn.close()
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| ) | ||
| except sqlite3.Error: | ||
| except sqlite3.Error as e: | ||
| logger.error(f"Database error: {e}") |
There was a problem hiding this comment.
Logging with logger.error(f"Database error: {e}") drops the stack trace, which makes incident investigation harder. Prefer logger.exception(...) or logger.error(..., exc_info=True) in the except sqlite3.Error block so the traceback is captured alongside the message.
| logger.error(f"Database error: {e}") | |
| logger.exception("Database error occurred while listing datasets") |
| ) | ||
| except sqlite3.Error: | ||
| except sqlite3.Error as e: | ||
| logger.error(f"Database error: {e}") |
There was a problem hiding this comment.
This except sqlite3.Error block logs only the exception string; without exc_info/traceback it can be difficult to diagnose which query/line failed. Use logger.exception(...) (or logger.error(..., exc_info=True)) so the traceback is emitted in logs.
| logger.error(f"Database error: {e}") | |
| logger.exception("Database error while fetching dataset metadata") |
|
|
||
| except sqlite3.Error: | ||
| except sqlite3.Error as e: | ||
| logger.error(f"Database error: {e}") |
There was a problem hiding this comment.
Same as above: this logs only the exception message without the traceback. Switch to logger.exception(...) (or logger.error(..., exc_info=True)) to capture stack traces for auditing/debugging DB failures.
| logger.error(f"Database error: {e}") | |
| logger.exception("Database error") |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
api/dataset_api.py (1)
139-190:⚠️ Potential issue | 🟡 Minor
connmay be unbound infinallyblock if exception occurs early.Unlike
get_dataset_metadata(line 199) andquery_dataset(line 261), this function doesn't initializeconn = Nonebefore thetryblock. Ifget_db_connection()raises a non-sqlite3.Errorexception, thefinallyblock will raiseUnboundLocalErrorwhen evaluatingif conn:.🔧 Suggested fix
async def list_datasets( current_auth_entity: Any = Depends(get_current_active_user_or_api_key), ): """List all available datasets (tables in the database).""" datasets = [] + conn = None try: conn = get_db_connection()🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/dataset_api.py` around lines 139 - 190, The variable conn can be uninitialized if get_db_connection() raises before conn is assigned, causing an UnboundLocalError in the finally block; fix by initializing conn = None before the try in the function that builds datasets (where conn, get_db_connection, cursor, and finally are used), ensure the finally checks if conn is not None (or uses a safer context) before calling conn.close(), and keep existing sqlite3.Error handling intact so cleanup never references an unbound variable.
🧹 Nitpick comments (1)
api/dataset_api.py (1)
19-20: Consider includingexc_info=Truefor full stack traces.The PR goal is to audit potential SQL injection probing, but logging only the exception message (
f"Database error: {e}") omits the stack trace. Per the pattern inapi/embedding_agent/app.py(line 120), the project usesexc_info=Trueto capture full tracebacks for debugging.🔧 Suggested improvement across all three logging sites
- logger.error(f"Database error: {e}") + logger.error(f"Database error: {e}", exc_info=True)Apply to lines 185, 243, and 324.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/dataset_api.py` around lines 19 - 20, Search dataset_api.py for the logger.error calls that log exceptions using the message "Database error: {e}" (three occurrences) and change each call to include exc_info=True so the full stack trace is captured (e.g., logger.error(f"Database error: {e}", exc_info=True)); preserve the existing message and behavior otherwise and follow the pattern used in api/embedding_agent/app.py where exc_info=True is passed to logger.error.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@api/dataset_api.py`:
- Around line 139-190: The variable conn can be uninitialized if
get_db_connection() raises before conn is assigned, causing an UnboundLocalError
in the finally block; fix by initializing conn = None before the try in the
function that builds datasets (where conn, get_db_connection, cursor, and
finally are used), ensure the finally checks if conn is not None (or uses a
safer context) before calling conn.close(), and keep existing sqlite3.Error
handling intact so cleanup never references an unbound variable.
---
Nitpick comments:
In `@api/dataset_api.py`:
- Around line 19-20: Search dataset_api.py for the logger.error calls that log
exceptions using the message "Database error: {e}" (three occurrences) and
change each call to include exc_info=True so the full stack trace is captured
(e.g., logger.error(f"Database error: {e}", exc_info=True)); preserve the
existing message and behavior otherwise and follow the pattern used in
api/embedding_agent/app.py where exc_info=True is passed to logger.error.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 43cec153-7832-4009-87fc-dad313481485
📒 Files selected for processing (2)
.Jules/sentinel.mdapi/dataset_api.py
🚨 Severity: MEDIUM
💡 Vulnerability: Swallowed database exceptions hid potential underlying malicious activity (e.g., SQL injection probing) behind generic 500 responses without creating any auditable server logs.
🔧 Fix: Imported the
loggingmodule and initialized a standard logger. Caughtsqlite3.Erroraseand loggedlogger.error(f"Database error: {e}")before returning the sanitizedHTTPExceptioninapi/dataset_api.py.✅ Verification: Review
api/dataset_api.pyand invoke an invalid database call to verifylogger.errorappears in stdout while the 500 response remains clean.PR created automatically by Jules for task 6310620232349936904 started by @daggerstuff
Summary by Sourcery
Add server-side logging for database exceptions in dataset APIs to improve security monitoring and auditability.
Bug Fixes:
Documentation:
Summary by cubic
Add server-side logging for database exceptions in the dataset API to improve monitoring without leaking details to clients. Errors are logged and clients still receive a generic 500.
loggingand setlogger = logging.getLogger(__name__)inapi/dataset_api.py.sqlite3.Error as eand loglogger.error(f"Database error: {e}")before raisingHTTPException(500)..Jules/sentinel.md.Written for commit 827b285. Summary will update on new commits.
Summary by CodeRabbit