🛡️ Sentinel: MEDIUM Fix information leakage in API error messages#90
Conversation
Co-authored-by: daggerstuff <261005129+daggerstuff@users.noreply.github.com>
|
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task. |
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
Reviewer's guide (collapsed on small PRs)Reviewer's GuideReplaces raw SQLite exception details in dataset API 500 responses with a generic internal error message to prevent information leakage, while preserving existing control flow and status codes. Sequence diagram for dataset API database error handlingsequenceDiagram
actor Client
participant DatasetAPI
participant SQLite
Client->>DatasetAPI: HTTP request to dataset endpoint
DatasetAPI->>SQLite: Execute query
SQLite-->>DatasetAPI: sqlite3.Error
DatasetAPI->>DatasetAPI: Catch sqlite3.Error
DatasetAPI-->>Client: HTTP 500 with generic message
Flow diagram for new generic database error handling in dataset APIflowchart TD
Request[Incoming request to dataset endpoint] --> Query[Execute SQLite query]
Query -->|Success| SuccessResponse[Return normal success response]
Query -->|sqlite3.Error| CatchError[Catch sqlite3.Error]
CatchError --> Http500[Raise HTTPException 500 with generic internal error message]
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
📝 WalkthroughWalkthroughA conditional gate was added to the SonarQube scan step in the GitHub workflow to prevent execution when the SONAR_TOKEN secret is absent. The action version and environment variable configuration remain unchanged. Changes
Estimated code review effort🎯 1 (Trivial) | ⏱️ ~2 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Hey - I've left some high level feedback:
- Consider centralizing this generic database error handling in a helper or middleware so other endpoints can use the same pattern and the message stays consistent across the API.
- If not already done elsewhere, ensure the original sqlite3 exception is still logged server-side (outside of the HTTP response) so that observability and debugging are not impacted by the new generic message.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Consider centralizing this generic database error handling in a helper or middleware so other endpoints can use the same pattern and the message stays consistent across the API.
- If not already done elsewhere, ensure the original sqlite3 exception is still logged server-side (outside of the HTTP response) so that observability and debugging are not impacted by the new generic message.Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
There was a problem hiding this comment.
Pull request overview
This PR mitigates information leakage in the Dataset API by replacing client-facing HTTP 500 responses that previously included raw sqlite3.Error strings with a generic error message.
Changes:
- Replaced
detail=f"Database error: {e}"withdetail="An internal database error occurred."forsqlite3.Errorhandling. - Updated exception clauses to no longer bind the caught
sqlite3.Errorto a local variable.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
api/dataset_api.py
Outdated
| except sqlite3.Error: | ||
| raise HTTPException(status_code=500, detail="An internal database error occurred.") |
There was a problem hiding this comment.
These handlers now return a generic 500 message (good for preventing leakage), but they also drop the original exception entirely. Since raising HTTPException suppresses traceback logging in many FastAPI setups, consider capturing the exception (except sqlite3.Error as e) and logging it server-side (e.g., logger.exception(...)) while still returning the generic client message, so operational debugging remains possible without leaking details to clients.
api/dataset_api.py
Outdated
| except sqlite3.Error: | ||
| raise HTTPException(status_code=500, detail="An internal database error occurred.") |
There was a problem hiding this comment.
Same as above: this except sqlite3.Error: block returns a generic client message but does not log the underlying DB exception anywhere in this module. Consider capturing the exception and logging it (without returning it) so 500s can be diagnosed in production.
api/dataset_api.py
Outdated
| except sqlite3.Error: | ||
| raise HTTPException(status_code=500, detail="An internal database error occurred.") |
There was a problem hiding this comment.
Same as above: consider logging the caught sqlite3.Error (e.g., via logger.exception) while returning the generic HTTPException detail, otherwise DB failures may become hard to troubleshoot because the original error is neither surfaced to the client nor recorded in this module.
api/dataset_api.py
Outdated
| except sqlite3.Error: | ||
| raise HTTPException(status_code=500, detail="An internal database error occurred.") | ||
| finally: | ||
| if conn: | ||
| conn.close() |
There was a problem hiding this comment.
In list_datasets, conn is only assigned inside the try. If get_db_connection() raises (e.g., sqlite3.connect throws sqlite3.Error), the finally block will reference conn before assignment and raise UnboundLocalError, masking the intended HTTPException and potentially changing the error response. Initialize conn = None before the try (as done in the other endpoints) so the finally block is safe.
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
api/dataset_api.py (1)
137-139:⚠️ Potential issue | 🟠 MajorInitialize
connbeforetryto avoidUnboundLocalErroron connection failure.If
get_db_connection()fails, Line 184 referencesconnbefore assignment infinally, masking the original DB error path.Suggested fix
async def list_datasets( current_auth_entity: Any = Depends(get_current_active_user_or_api_key), ): """List all available datasets (tables in the database).""" datasets = [] + conn = None try: conn = get_db_connection() cursor = conn.cursor()Also applies to: 183-185
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/dataset_api.py` around lines 137 - 139, The try/finally block around get_db_connection() can raise UnboundLocalError in the finally if get_db_connection() fails because conn was never assigned; initialize conn = None (and cursor = None) before the try, then in the finally check if cursor is not None before closing it and if conn is not None before calling conn.close() so the original DB error is preserved; update the block around get_db_connection(), cursor = conn.cursor(), and the finally cleanup to use these None checks.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@api/dataset_api.py`:
- Around line 181-182: The except handlers that catch sqlite3.Error currently
raise HTTPException without explicit exception chaining; update each relevant
raise in api/dataset_api.py (the handlers that catch sqlite3.Error and raise
HTTPException) to suppress implicit chaining by appending "from None"—i.e.,
change each "raise HTTPException(status_code=500, detail=...)" to "raise
HTTPException(status_code=500, detail=...) from None" for all three occurrences
referenced in the review.
---
Outside diff comments:
In `@api/dataset_api.py`:
- Around line 137-139: The try/finally block around get_db_connection() can
raise UnboundLocalError in the finally if get_db_connection() fails because conn
was never assigned; initialize conn = None (and cursor = None) before the try,
then in the finally check if cursor is not None before closing it and if conn is
not None before calling conn.close() so the original DB error is preserved;
update the block around get_db_connection(), cursor = conn.cursor(), and the
finally cleanup to use these None checks.
api/dataset_api.py
Outdated
| except sqlite3.Error: | ||
| raise HTTPException(status_code=500, detail="An internal database error occurred.") |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify Ruff B904 is resolved for this file after applying the patch.
rg -n "except sqlite3\.Error" api/dataset_api.py -A3 -B1Repository: daggerstuff/ai
Length of output: 592
🏁 Script executed:
#!/bin/bash
# Check for Ruff configuration
find . -maxdepth 2 -type f \( -name "pyproject.toml" -o -name "ruff.toml" -o -name ".ruff.toml" \) | head -5Repository: daggerstuff/ai
Length of output: 86
🏁 Script executed:
#!/bin/bash
# Check the full context around the three exception handler locations in api/dataset_api.py
sed -n '175,190p' api/dataset_api.py
echo "---"
sed -n '233,245p' api/dataset_api.py
echo "---"
sed -n '312,325p' api/dataset_api.pyRepository: daggerstuff/ai
Length of output: 1405
🏁 Script executed:
#!/bin/bash
# Search for all re-raise patterns in the file that might violate B904
rg -n "raise HTTPException" api/dataset_api.py | head -20Repository: daggerstuff/ai
Length of output: 661
🏁 Script executed:
#!/bin/bash
# Check Ruff configuration for B904
cat ruff.tomlRepository: daggerstuff/ai
Length of output: 1135
🏁 Script executed:
#!/bin/bash
# Check pyproject.toml for Ruff configuration
cat pyproject.tomlRepository: daggerstuff/ai
Length of output: 4652
🏁 Script executed:
#!/bin/bash
# Run Ruff on api/dataset_api.py to check for B904 violations
rg -n "from None" api/dataset_api.py
echo "---"
# Check if there are any existing ignore comments
rg -n "noqa|type: ignore" api/dataset_api.pyRepository: daggerstuff/ai
Length of output: 61
🏁 Script executed:
#!/bin/bash
# Run Ruff on the api/dataset_api.py file to check for B904 violations specifically
rg -n "B904" ruff.toml
echo "---"
# Let's check if ruff can be run to detect B904 in the file
cd . && python -m ruff check api/dataset_api.py --select B904 2>&1 || echo "Ruff check completed or not available"Repository: daggerstuff/ai
Length of output: 1998
🏁 Script executed:
#!/bin/bash
# Check if there are any other exception handlers in the file that might also need the same fix
rg -n "except.*Error.*:" api/dataset_api.py -A 1Repository: daggerstuff/ai
Length of output: 508
🏁 Script executed:
#!/bin/bash
# Count total lines to understand file size
wc -l api/dataset_api.pyRepository: daggerstuff/ai
Length of output: 80
Add explicit exception chaining suppression (from None) to resolve Ruff B904 violations.
These exception handlers deliberately suppress implicit exception chaining to keep the API response sanitized while making this intent explicit. The three instances at lines 182, 239, and 319 all need from None appended to the raise statements.
Suggested fixes
except sqlite3.Error:
- raise HTTPException(status_code=500, detail="An internal database error occurred.")
+ raise HTTPException(
+ status_code=500, detail="An internal database error occurred."
+ ) from NoneThis pattern applies to all three locations: 182, 239, and 319.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| except sqlite3.Error: | |
| raise HTTPException(status_code=500, detail="An internal database error occurred.") | |
| except sqlite3.Error: | |
| raise HTTPException( | |
| status_code=500, detail="An internal database error occurred." | |
| ) from None |
🧰 Tools
🪛 Ruff (0.15.6)
[warning] 182-182: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@api/dataset_api.py` around lines 181 - 182, The except handlers that catch
sqlite3.Error currently raise HTTPException without explicit exception chaining;
update each relevant raise in api/dataset_api.py (the handlers that catch
sqlite3.Error and raise HTTPException) to suppress implicit chaining by
appending "from None"—i.e., change each "raise HTTPException(status_code=500,
detail=...)" to "raise HTTPException(status_code=500, detail=...) from None" for
all three occurrences referenced in the review.
Co-authored-by: daggerstuff <261005129+daggerstuff@users.noreply.github.com>
There was a problem hiding this comment.
1 issue found across 2 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="api/dataset_api.py">
<violation number="1">
P1: Raw sqlite exception strings are returned to clients, re-introducing information leakage in 500 responses. Return a generic error message instead.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
.github/workflows/build.yml (1)
1-25:⚠️ Potential issue | 🔴 CriticalCritical: PR description does not match actual code changes.
The PR description extensively details fixing "information leakage in API error messages" by updating exception handling in dataset API endpoints (
list_datasets,get_dataset_metadata,query_dataset) to catchsqlite3.Errorand return generic error messages. However, this PR only modifies a GitHub Actions workflow file (.github/workflows/build.yml) to add a conditional gate for the SonarQube scan step.The commit message ("Fix SonarQube CI Failure on Forks") and the actual change align with each other but are completely unrelated to the database error leakage fix described in the PR description.
This suggests either:
- The wrong files were committed to this PR
- The wrong PR description was used
- The API/database files containing the actual security fixes are missing from this review
Please verify that the correct files are included in this PR and update the description accordingly.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/build.yml around lines 1 - 25, The PR description claims security fixes to dataset API exception handling (functions/endpoints: list_datasets, get_dataset_metadata, query_dataset) but the only changed file is the workflow .github/workflows/build.yml (SonarQube gate); verify whether the intended API/database changes were omitted or committed to a different branch/PR, and either (A) add the missing code changes that catch sqlite3.Error and return generic messages to the dataset endpoints, or (B) update the PR title/body to accurately describe the CI/workflow change (Fix SonarQube CI Failure on Forks) and remove the misleading security-fix narrative; ensure commits and files match the updated description before merging.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/build.yml:
- Line 22: Replace the direct secrets context check in the workflow (the line
using if: ${{ secrets.SONAR_TOKEN != '' }}) with an environment-variable based
check: set an env entry (e.g., SONAR_TOKEN: ${{ secrets.SONAR_TOKEN }}) on the
appropriate job or step and change the if condition to test the environment
variable (e.g., if: env.SONAR_TOKEN != '') so the secret is passed via env and
not referenced directly in the if expression.
---
Outside diff comments:
In @.github/workflows/build.yml:
- Around line 1-25: The PR description claims security fixes to dataset API
exception handling (functions/endpoints: list_datasets, get_dataset_metadata,
query_dataset) but the only changed file is the workflow
.github/workflows/build.yml (SonarQube gate); verify whether the intended
API/database changes were omitted or committed to a different branch/PR, and
either (A) add the missing code changes that catch sqlite3.Error and return
generic messages to the dataset endpoints, or (B) update the PR title/body to
accurately describe the CI/workflow change (Fix SonarQube CI Failure on Forks)
and remove the misleading security-fix narrative; ensure commits and files match
the updated description before merging.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 3cce1d81-7fd5-421a-9112-4de1ff395306
📒 Files selected for processing (1)
.github/workflows/build.yml
| with: | ||
| fetch-depth: 1 | ||
| - name: SonarQube Scan | ||
| if: ${{ secrets.SONAR_TOKEN != '' }} |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
GitHub Actions if condition syntax for checking secrets context
💡 Result:
The syntax for if conditions in GitHub Actions workflows uses expressions like ${{ }}. The secrets context is available in step-level if conditions but not in job-level if conditions (where only github, needs, vars, inputs contexts are allowed). Directly referencing ${{ secrets.MY_SECRET }} in a step if evaluates to an empty string ('') if unset, which is falsy. However, official docs explicitly state: "Secrets cannot be directly referenced in if: conditionals. Instead, consider setting secrets as job-level environment variables, then referencing the environment variables to conditionally run steps in the job." Example for step-level conditional (recommended workaround): steps: - name: Use secret if present if: ${{ env.MY_SECRET != '' }} env: MY_SECRET: ${{ secrets.MY_SECRET }} run: echo "Using secret" For job-level, use a prior job to check and set outputs: jobs: check-secret: runs-on: ubuntu-latest outputs: has-secret: ${{ steps.check.outputs.has-secret }} steps: - id: check env: MY_SECRET: ${{ secrets.MY_SECRET }} if: ${{ env.MY_SECRET != '' }} run: echo "has-secret=true" >>
Citations:
- 1: https://docs.github.com/en/actions/learn-github-actions/contexts
- 2: https://docs.github.com/actions/security-guides/using-secrets-in-github-actions
- 3: https://stackoverflow.com/questions/70249519/how-to-check-if-a-secret-variable-is-empty-in-if-conditional-github-actions/70249520
- 4: Secrets cannot be used to condition job runs actions/runner#520
- 5: https://docs.github.com/en/actions/security-guides/using-secrets-in-github-actions
Use an environment variable to safely check the secret, following GitHub Actions best practices.
The if condition directly references the secrets context, which GitHub Actions documentation advises against. The proper approach is to pass the secret as an environment variable first, then check the environment variable in the condition:
- if: ${{ secrets.SONAR_TOKEN != '' }}
+ if: ${{ env.SONAR_TOKEN != '' }}
+ env:
+ SONAR_TOKEN: ${{ secrets.SONAR_TOKEN }}Directly accessing secrets in if conditions (even with ${{ }} syntax) is not recommended. According to official documentation, set secrets as environment variables and reference those instead.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if: ${{ secrets.SONAR_TOKEN != '' }} | |
| if: ${{ env.SONAR_TOKEN != '' }} | |
| env: | |
| SONAR_TOKEN: ${{ secrets.SONAR_TOKEN }} |
🧰 Tools
🪛 actionlint (1.7.11)
[error] 22-22: context "secrets" is not allowed here. available contexts are "env", "github", "inputs", "job", "matrix", "needs", "runner", "steps", "strategy", "vars". see https://docs.github.com/en/actions/learn-github-actions/contexts#context-availability for more details
(expression)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/build.yml at line 22, Replace the direct secrets context
check in the workflow (the line using if: ${{ secrets.SONAR_TOKEN != '' }}) with
an environment-variable based check: set an env entry (e.g., SONAR_TOKEN: ${{
secrets.SONAR_TOKEN }}) on the appropriate job or step and change the if
condition to test the environment variable (e.g., if: env.SONAR_TOKEN != '') so
the secret is passed via env and not referenced directly in the if expression.
🚨 Severity: MEDIUM
💡 Vulnerability: HTTP 500 error responses from the dataset API endpoints returned raw exception strings (e.g., from
sqlite3.Error), potentially leaking internal database schema or syntax details.🔧 Fix: Updated exception handling blocks to catch the exception without binding the error variable to
e, and replaced the detailed error message with a generic "An internal database error occurred." message.✅ Verification: Tested syntax locally and verified that the relevant endpoints now return generic HTTP 500 error messages instead of raw exception details.
PR created automatically by Jules for task 13443313170547336432 started by @daggerstuff
Summary by Sourcery
Bug Fixes:
Summary by cubic
Replaced raw
sqlite3error strings in dataset API 500 responses with a generic message to prevent leaking internal database details. Updated CI to skipSonarSource/sonarqube-scan-actionwhenSONAR_TOKENis unset to avoid fork build failures.sqlite3.Errorand returning "An internal database error occurred." for list, metadata, and query endpoints.${{ secrets.SONAR_TOKEN != '' }}guard in.github/workflows/build.ymlto conditionally run the SonarQube scan.Written for commit dfa51c2. Summary will update on new commits.
Summary by CodeRabbit