-
Notifications
You must be signed in to change notification settings - Fork 5
Collection: adding an error message column #328
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughAdds a nullable error_message column to collections (migration + model), adds an error-extraction helper used to persist failure messages during collection creation, and renames variables / tweaks the mismatch error string in document read logic. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant API as Collections API
participant BG as Background Task
participant DB as Database
Client->>API: POST /collections
API->>DB: Insert collection (status: pending)
API->>BG: Enqueue creation job
BG->>DB: Attempt creation
alt Success
BG->>DB: Update status: completed (error_message: null)
else Failure
BG->>API: Propagate exception
API->>API: extract_error_message(err)
API->>DB: Update status: failed (error_message: parsed message)
end
API-->>Client: 202 Accepted
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
Tip 🔌 Remote MCP (Model Context Protocol) integration is now available!Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats. ✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🔭 Outside diff range comments (1)
backend/app/api/routes/collections.py (1)
259-268: Use the extracted, sanitized message for both persistence and callbackEnsures consistent, user-friendly messaging and avoids leaking overly verbose payloads.
- message = extract_error_message(err) - collection.error_message = message + message = extract_error_message(err) + collection.error_message = message @@ - callback.fail(str(err)) + # Prefer the sanitized message for callback responses + callback.fail(message if 'message' in locals() and message else str(err))
🧹 Nitpick comments (4)
backend/app/crud/document.py (1)
84-93: Fix grammar and improve clarity of mismatch error messageUse proper spacing and plurality; optional: include missing IDs to aid debugging.
- raise ValueError(f"Requested atleast {n} document retrieved {m}") + raise ValueError(f"Requested {n} documents, retrieved {m}")If useful, consider logging which IDs were missing:
found = {d.id for d in results} missing = [str(i) for i in doc_ids if i not in found] logger.error("... | {'missing_ids': %s}", missing, exc_info=True)backend/app/alembic/versions/5a59c6c29a82_add_error_message_column_in_collections_.py (1)
20-24: Consider text type or app-level truncation for potentially long error messagesError payloads can be verbose. Two options:
- DB: switch to
sa.Text()for unlimited size across dialects.- App: truncate stored value (e.g., 1–4KB). I’ve proposed app-level truncation in
extract_error_message.- sa.Column("error_message", sqlmodel.sql.sqltypes.AutoString(), nullable=True), + sa.Column("error_message", sa.Text(), nullable=True),backend/app/models/collection.py (1)
46-46: Modernize type hint to PEP 604 union (per Ruff UP045)Use
str | Nonefor consistency with project style and Python 3.10+.- error_message: Optional[str] = Field(default=None, nullable=True) + error_message: str | None = Field(default=None, nullable=True)backend/app/api/routes/collections.py (1)
33-39: Add unit tests for error extraction edge casesCover JSON payload, malformed payload (ensuring fallback), OpenAIError path, and length capping.
I can add tests similar to
tests/api/routes/test_threads.py::test_handle_openai_error*for:
- JSON payload with {"error":{"message":"..."}}
- Non-JSON/non-Python payload (fallback to str)
- OpenAIError with .body/.message
- Extremely long message (verify truncation)
Want me to open a follow-up PR with these tests?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
backend/app/alembic/versions/5a59c6c29a82_add_error_message_column_in_collections_.py(1 hunks)backend/app/api/routes/collections.py(4 hunks)backend/app/crud/document.py(1 hunks)backend/app/models/collection.py(1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (3)
backend/app/alembic/versions/5a59c6c29a82_add_error_message_column_in_collections_.py (3)
backend/app/alembic/versions/3389c67fdcb4_add_alter_columns_in_collections_table.py (1)
upgrade(29-61)backend/app/alembic/versions/f23675767ed2_add_collection_tables.py (1)
upgrade(20-46)backend/app/alembic/versions/4aa1f48c6321_add_inconistency_fixes.py (1)
upgrade(20-49)
backend/app/api/routes/collections.py (2)
backend/app/api/routes/threads.py (1)
handle_openai_error(52-56)backend/app/tests/api/routes/test_threads.py (2)
test_handle_openai_error(338-349)test_handle_openai_error_with_message(352-357)
backend/app/models/collection.py (1)
backend/app/alembic/versions/3389c67fdcb4_add_alter_columns_in_collections_table.py (1)
upgrade(29-61)
🪛 Ruff (0.12.2)
backend/app/api/routes/collections.py
17-17: app.api.routes.responses.handle_openai_error imported but unused
Remove unused import: app.api.routes.responses.handle_openai_error
(F401)
backend/app/models/collection.py
46-46: Use X | None for type annotations
Convert to X | None
(UP045)
🔇 Additional comments (1)
backend/app/alembic/versions/5a59c6c29a82_add_error_message_column_in_collections_.py (1)
14-16: Migration chain integrity verifiedThe referenced down_revision
"e9dd35eff62c"is defined in
backend/app/alembic/versions/e9dd35eff62c_add_openai_conversation_table.py
and has no other children, confirming it was the head before this migration. No changes needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🔭 Outside diff range comments (1)
backend/app/api/routes/collections.py (1)
248-268: Use the sanitized message for both persistence and callbackExtract once, reuse for DB update and user callback. This avoids duplicating logic and ensures external responses don’t leak raw stack traces.
except Exception as err: logger.error( f"[do_create_collection] Collection Creation Failed | {{'collection_id': '{payload.key}', 'error': '{str(err)}'}}", exc_info=True, ) + sanitized_message = extract_error_message(err) if "assistant" in locals(): _backout(assistant_crud, assistant.id) try: collection = collection_crud.read_one(UUID(payload.key)) collection.status = CollectionStatus.failed collection.updated_at = now() - message = extract_error_message(err) - collection.error_message = message + collection.error_message = sanitized_message collection_crud._update(collection) except Exception as suberr: logger.warning( f"[do_create_collection] Failed to update collection status | {{'collection_id': '{payload.key}', 'reason': '{str(suberr)}'}}" ) - callback.fail(str(err)) + callback.fail(sanitized_message)
🧹 Nitpick comments (4)
backend/app/crud/document.py (1)
87-87: Fix grammar in error message for clarityUse “at least” and pluralize “documents”.
- raise ValueError(f"Requested atleast {n} document retrieved {m}") + raise ValueError(f"Requested at least {n} documents, retrieved {m}")backend/app/models/collection.py (1)
46-46: Adopt modern Optional syntax (ruff UP045)Prefer X | None over Optional[X] for Python 3.10+.
- error_message: Optional[str] = Field(default=None, nullable=True) + error_message: str | None = Field(default=None, nullable=True)backend/app/api/routes/collections.py (1)
17-17: Remove unused import (ruff F401)
handle_openai_erroris imported but unused.-from app.api.routes.responses import handle_openai_errorbackend/app/alembic/versions/5a59c6c29a82_add_error_message_column_in_collections_.py (1)
21-24: Column type choice: considersa.Text()for unbounded error messagesIf errors can be long,
sa.Text()is safer thanAutoString()(which may map to VARCHAR). If you prefer a bounded size, set a length (e.g.,sa.String(1024)) and clamp the string in code (as suggested).- sa.Column("error_message", sqlmodel.sql.sqltypes.AutoString(), nullable=True), + sa.Column("error_message", sa.Text(), nullable=True),
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
backend/app/alembic/versions/5a59c6c29a82_add_error_message_column_in_collections_.py(1 hunks)backend/app/api/routes/collections.py(4 hunks)backend/app/crud/document.py(1 hunks)backend/app/models/collection.py(1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (3)
backend/app/models/collection.py (1)
backend/app/alembic/versions/3389c67fdcb4_add_alter_columns_in_collections_table.py (1)
upgrade(29-61)
backend/app/crud/document.py (1)
backend/app/tests/crud/documents/test_crud_document_read_many.py (3)
test_number_read_is_expected(20-27)test_negative_skip_raises_exception(66-73)test_skip_greater_than_limit_is_difference(103-113)
backend/app/alembic/versions/5a59c6c29a82_add_error_message_column_in_collections_.py (3)
backend/app/alembic/versions/3389c67fdcb4_add_alter_columns_in_collections_table.py (1)
upgrade(29-61)backend/app/alembic/versions/f23675767ed2_add_collection_tables.py (1)
upgrade(20-46)backend/app/alembic/versions/8e7dc5eab0b0_add_fk_constraint_to_user_id_columns.py (1)
upgrade(20-42)
🪛 Ruff (0.12.2)
backend/app/models/collection.py
46-46: Use X | None for type annotations
Convert to X | None
(UP045)
backend/app/api/routes/collections.py
17-17: app.api.routes.responses.handle_openai_error imported but unused
Remove unused import: app.api.routes.responses.handle_openai_error
(F401)
🔇 Additional comments (1)
backend/app/alembic/versions/5a59c6c29a82_add_error_message_column_in_collections_.py (1)
21-24: LGTM: migration aligns with model changeAdds a nullable error_message column; downgrade drops it. Matches model addition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (2)
backend/app/api/routes/collections.py (2)
19-19: Fix linter error: unused import handle_openai_error (Ruff F401)handle_openai_error is imported but not used, which will fail linting. Either remove the import or (preferred) leverage it inside extract_error_message for OpenAIError instances.
To resolve while improving behavior, see the suggested refactor to extract_error_message below (Lines 35-59).
35-59: Harden error extraction, use handle_openai_error for OpenAIError, and avoid potential AttributeError
- Use handle_openai_error for OpenAIError to unify behavior with other routes.
- Guard message types to avoid calling .strip() on non-strings (possible AttributeError).
- Keep JSON-first parsing, fall back to ast.literal_eval, then raw body.
- Minor: accept “Error code 429 - ...” with or without colon.
Apply:
-def extract_error_message(err: Exception) -> str: - err_str = str(err).strip() - - body = re.sub(r"^Error code:\s*\d+\s*-\s*", "", err_str) - message = None - try: - payload = json.loads(body) - if isinstance(payload, dict): - message = payload.get("error", {}).get("message") - except Exception: - pass - - if message is None: - try: - payload = ast.literal_eval(body) - if isinstance(payload, dict): - message = payload.get("error", {}).get("message") - except Exception: - pass - - if not message: - message = body - - return message.strip()[:1000] +def extract_error_message(err: Exception) -> str: + # Prefer existing handler for OpenAI errors + if isinstance(err, OpenAIError): + try: + msg = handle_openai_error(err) + if isinstance(msg, str) and msg.strip(): + return msg.strip()[:1000] + except Exception: + # Fall through to generic handling + pass + + err_str = str(err).strip() + # Strip leading "Error code {n} - " (colon optional) if present + body = re.sub(r"^Error code:?\s*\d+\s*-\s*", "", err_str) + + # Try JSON payloads first + try: + payload = json.loads(body) + if isinstance(payload, dict): + error_obj = payload.get("error") + if isinstance(error_obj, dict): + msg = error_obj.get("message") + if isinstance(msg, str) and msg.strip(): + return msg.strip()[:1000] + except Exception: + pass + + # Fallback: safely evaluate single-quoted dicts or Python-like literals + try: + payload = ast.literal_eval(body) + if isinstance(payload, dict): + error_obj = payload.get("error") + if isinstance(error_obj, dict): + msg = error_obj.get("message") + if isinstance(msg, str) and msg.strip(): + return msg.strip()[:1000] + except Exception: + pass + + # Last resort: return sanitized body + return body[:1000]
🧹 Nitpick comments (1)
backend/app/api/routes/collections.py (1)
279-281: Persist sanitized message; send sanitized message in callbacks & verify DB column capacitySummary: I ran the checks you requested.
- Migration: backend/app/alembic/versions/5a59c6c29a82_add_error_message_column_in_collections_.py — adds error_message as sa.Column("error_message", sqlmodel.sql.sqltypes.AutoString(), nullable=True)
- Model: backend/app/models/collection.py — error_message: Optional[str] = Field(default=None, nullable=True)
- Tests: no tests referencing extract_error_message found under backend/app/tests
Actionable items:
- Send the sanitized message in the callback instead of str(err). Example (apply outside the shown hunk):
message = extract_error_message(err) collection.error_message = message # ... callback.fail(message)
- Confirm the DB column can hold up to 1000 chars. AutoString() is used in the migration; if you need an explicit limit, change the migration to sa.String(1000) or sa.Text() and update the model Field to use sa_column=Column(String(1000)) (or Text) so ORM and DB agree.
- Add unit tests for extract_error_message (raw text; JSON-like "Error code N - {...}"; single-quoted dict fallback; OpenAIError; oversize clamping).
I couldn't conclusively determine whether AutoString() already provides the desired capacity for your target DB — please confirm or update the migration accordingly.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these settings in your CodeRabbit configuration.
📒 Files selected for processing (1)
backend/app/api/routes/collections.py(4 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
backend/app/api/routes/collections.py (2)
backend/app/api/routes/responses.py (2)
responses(297-376)handle_openai_error(27-43)backend/app/api/routes/threads.py (1)
handle_openai_error(52-56)
🪛 Ruff (0.12.2)
backend/app/api/routes/collections.py
19-19: app.api.routes.responses.handle_openai_error imported but unused
Remove unused import: app.api.routes.responses.handle_openai_error
(F401)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: checks (3.11.7, 6)
🔇 Additional comments (1)
backend/app/api/routes/collections.py (1)
4-6: LGTM: Necessary imports for robust parsingjson, ast, and re are appropriate for the parsing logic introduced below. No issues here.
backend/app/alembic/versions/5a59c6c29a82_add_error_message_column_in_collections_.py
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (4)
backend/app/crud/document.py (4)
84-84: Prefer explicit len() calls over map(...) for readability.map(len, ...) is terse but less clear. Two explicit len() calls are simpler to read and debug.
Apply this diff:
- (retrieved_count, requested_count) = map(len, (results, doc_ids)) + retrieved_count = len(results) + requested_count = len(doc_ids)
87-89: Fix grammar in the exception message.“atleast” → “at least”, and pluralize “documents”. Small polish improves operator experience and searchability in logs/alerts.
Apply this diff:
- raise ValueError( - f"Requested atleast {requested_count} document retrieved {retrieved_count}" - ) + raise ValueError( + f"Requested at least {requested_count} documents, retrieved {retrieved_count}" + )
84-95: Optional: handle duplicate doc_ids to avoid false mismatch and improve diagnostics.If callers accidentally pass duplicates in doc_ids, IN(...) returns distinct rows and you'll flag a mismatch even when all unique IDs exist. Consider comparing unique counts and/or logging which IDs are missing.
Two optional improvements:
- Dedupe for the count comparison:
- retrieved_count = len(results) - requested_count = len(doc_ids) + retrieved_count = len(results) + requested_count = len(set(doc_ids)) # treat duplicates as one requested document
- Include missing IDs in the error log for faster triage (keep counts as-is if semantics matter):
requested_ids_set = set(doc_ids) result_ids_set = {doc.id for doc in results} missing_ids = [str(_id) for _id in (requested_ids_set - result_ids_set)] logger.error( f"[DocumentCrud.read_each] Mismatch in retrieved documents | " f"{{'owner_id': {self.owner_id}, 'requested_count': {requested_count}, " f"'retrieved_count': {retrieved_count}, 'missing_doc_ids': {missing_ids}}}", exc_info=True, )Please confirm whether duplicates are expected upstream before changing the comparison semantics.
90-94: Capture or Drop the Exception Alias inread_each
Ruff flagsF841becauseerris assigned but never used. To satisfy lint and keep logs consistent, either include the exception text in the log payload or remove the alias.– File:
backend/app/crud/document.py, lines 90–94Suggested fixes:
Include the error in the log:
except ValueError as err: logger.error( - f"[DocumentCrud.read_each] Mismatch in retrieved documents | {{'owner_id': {self.owner_id}, 'requested_count': {requested_count}, 'retrieved_count': {retrieved_count}}}", + f"[DocumentCrud.read_each] Mismatch in retrieved documents | {{'owner_id': {self.owner_id}, 'requested_count': {requested_count}, 'retrieved_count': {retrieved_count}, 'error': {err}}}", exc_info=True, )Drop the unused alias:
except ValueError as err:
except ValueError: logger.error( f"[DocumentCrud.read_each] Mismatch in retrieved documents | {{'owner_id': {self.owner_id}, 'requested_count': {requested_count}, 'retrieved_count': {retrieved_count}}}", exc_info=True, )You can verify locally with: ```shell ruff check backend/app/crud/document.py
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
backend/app/crud/document.py(1 hunks)
🧰 Additional context used
🪛 Ruff (0.12.2)
backend/app/crud/document.py
90-90: Local variable err is assigned to but never used
Remove assignment to unused variable err
(F841)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: checks (3.11.7, 6)
🔇 Additional comments (1)
backend/app/crud/document.py (1)
84-86: Nice variable rename; prior feedback addressed.Switching from m/n to retrieved_count/requested_count improves readability and aligns with the earlier review suggestion. Thanks for cleaning this up.
* altering the table and routes code to add error column
Summary
Target issue is #238
Checklist
Before submitting a pull request, please ensure that you mark these task.
fastapi run --reload app/main.pyordocker compose upin the repository root and test.Summary by CodeRabbit
New Features
Bug Fixes
Chores