Surface LLM errors to users with classified, actionable messages #112
Conversation
- Add new error types: RateLimitError, LLMTimeoutError, LLMAuthenticationError
- Implement classify_llm_error() to detect and categorize different error types
- Update safe_call_llm_with_tools() to use error classification
- Update WebSocket error handler to send specific error types and messages
- Add comprehensive error classification tests
- Ensure all backend errors surface to users with helpful hints
Co-authored-by: garland3 <1162675+garland3@users.noreply.github.com>

- Fix test that tried to modify the immutable Exception class
- Add comprehensive documentation in docs/error_handling_improvements.md
- Add demo script to visualize error handling
- Add integration tests for the error flow
- All tests passing (13/13)
Co-authored-by: garland3 <1162675+garland3@users.noreply.github.com>

- Fix f-string formatting in logger call (use % formatting)
- Fix test logic for API key check (use AND instead of OR)
- Improve test for user-friendly messages (check substrings, not characters)
- All tests still passing (13/13)
- CodeQL security scan: 0 alerts ✅
Co-authored-by: garland3 <1162675+garland3@users.noreply.github.com>

- Add comprehensive visual diagram showing the error flow
- Documents the complete path from error to user message
- Shows classification logic and error handling at each layer
- 501 total lines changed across 7 files
Co-authored-by: garland3 <1162675+garland3@users.noreply.github.com>
@ktpedre can you review this?
ktpedre
left a comment
The changes look good to me from a visual scan. If I want to test live, should I just check out the copilot/report-rate-throttling-errors branch and give it a try? It should be easy to recreate the throttling events by issuing a few queries.
…integration tests
```python
    detail="Rate limit exceeded. Please try again later."
)

logger.info(f"Chat completion requested for model: {request.model}")
```
Check failure: Code scanning / CodeQL: Log Injection (High), user-provided value

Copilot Autofix (AI, 1 day ago)
To fix the log injection vulnerability, sanitize user input before logging. Specifically, remove or replace newline characters from user-supplied strings to prevent log injection attacks as recommended in the background. For this case, before logging request.model, process the value to remove \n and \r (and, optionally, mark or quote it to make it clear it's user-supplied). In the code, assign a sanitized version of request.model to a local variable (e.g., model_name) and use this sanitized value in the log entry.
Edits required:
- In mocks/llm-mock/main_rate_limit.py, around line 180, create a sanitized version of request.model and log that instead.
- No new methods or imports are needed, as Python string methods suffice.
```diff
@@ -177,7 +177,8 @@
         detail="Rate limit exceeded. Please try again later."
     )
 
-    logger.info(f"Chat completion requested for model: {request.model}")
+    model_name = str(request.model).replace('\r', '').replace('\n', '')
+    logger.info(f"Chat completion requested for model: {model_name}")
 
     # Simulate random errors
     error_type = should_simulate_error()
```
```python
@app.post("/test/scenario/{scenario}")
async def set_test_scenario(scenario: str, response_data: Dict[str, Any] = None):
    """Set specific test scenario for controlled testing."""
    logger.info(f"Test scenario set: {scenario}")
```
Check failure: Code scanning / CodeQL: Log Injection (High), user-provided value

Copilot Autofix (AI, 1 day ago)
The problem arises from directly logging the user-provided scenario string. To mitigate log injection, we should sanitize the input before logging. The common, recommended approach for plain text logs is to remove or replace any newline and carriage return characters (\r, \n) from the user-provided value to prevent misleading or forged log entries.
The best fix here is to sanitize the scenario string immediately before logging it, replacing \r and \n with empty strings. You can achieve this inline in the log call or assign the sanitized value to a new variable before logging. Since we only see the relevant lines, apply the change directly on or immediately before line 266 in mocks/llm-mock/main_rate_limit.py. As this is a trivial Python string operation, no additional methods or imports are needed.
```diff
@@ -263,7 +263,8 @@
 @app.post("/test/scenario/{scenario}")
 async def set_test_scenario(scenario: str, response_data: Dict[str, Any] = None):
     """Set specific test scenario for controlled testing."""
-    logger.info(f"Test scenario set: {scenario}")
+    sanitized_scenario = scenario.replace('\r', '').replace('\n', '')
+    logger.info(f"Test scenario set: {sanitized_scenario}")
 
     # Check rate limit
     if not rate_limiter.is_allowed():
```
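Since the same two-line sanitization recurs at both flagged call sites, a small helper would keep the log statements tidy. This is a hedged sketch, not code from the PR; the helper name is hypothetical.

```python
def sanitize_for_log(value: object) -> str:
    """Strip CR/LF from a user-supplied value before logging (hypothetical helper).

    Removing newlines prevents a caller from forging extra log lines (log injection).
    """
    return str(value).replace("\r", "").replace("\n", "")

# Possible usage at the two flagged call sites (sketch):
# logger.info(f"Chat completion requested for model: {sanitize_for_log(request.model)}")
# logger.info(f"Test scenario set: {sanitize_for_log(scenario)}")
```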
…e error classification
Pull request overview
This PR implements comprehensive error handling for LLM service failures, addressing the issue where users were left with no feedback when rate limits or other backend errors occurred. The implementation classifies errors into specific domain types (rate limits, timeouts, authentication failures) and surfaces user-friendly messages to the frontend while logging detailed information for debugging.
Key changes:
- Error classification system that transforms technical LLM errors into user-friendly messages
- New domain error types for rate limits, timeouts, and authentication failures
- Enhanced WebSocket error handling with categorized error types sent to frontend
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 9 comments.
Summary per file:
| File | Description |
|---|---|
| backend/domain/errors.py | Added new error types: RateLimitError, LLMTimeoutError, LLMAuthenticationError, LLMServiceError |
| backend/application/chat/utilities/error_utils.py | Implemented classify_llm_error() function to detect and classify errors with user-friendly messages |
| backend/main.py | Enhanced WebSocket handler to catch specific error types and send categorized error responses to frontend |
| backend/application/chat/service.py | Added logic to bubble up DomainError exceptions to transport layer for consistent handling |
| backend/tests/test_error_classification.py | Unit tests for error classification (9 test cases) |
| backend/tests/test_error_flow_integration.py | Integration tests for error flow (4 test cases) |
| docs/developer/error_handling_improvements.md | Documentation explaining error handling improvements and error messages |
| docs/developer/error_flow_diagram.md | Visual diagram showing complete error flow from LLM to UI |
| docs/developer/README.md | Updated to reference new error handling documentation |
| scripts/demo_error_handling.py | Demonstration script showing error classification examples |
| mocks/llm-mock/main_rate_limit.py | Mock LLM server with rate limiting and error simulation for testing |
| config/defaults/llmconfig-buggy.yml | Configuration for mock rate-limited LLM server |
| agent_start.sh | Improved process cleanup to avoid killing all Python processes |
| .env.example | Changed APP_NAME from "Chat UI 13" to "ATLAS" |
| IMPLEMENTATION_SUMMARY.md | Comprehensive summary of implementation and testing results |
Comments suppressed due to low confidence (6)
backend/application/chat/service.py:4
- Import of 'json' is not used.
import json
backend/application/chat/service.py:5
- Import of 'asyncio' is not used.
import asyncio
backend/application/chat/service.py:26
- Import of 'tool_utils' is not used.
Import of 'notification_utils' is not used.
from .utilities import tool_utils, file_utils, notification_utils, error_utils
backend/application/chat/service.py:28
- Import of 'AgentContext' is not used.
Import of 'AgentEvent' is not used.
from .agent.protocols import AgentContext, AgentEvent
backend/application/chat/service.py:29
- Import of 'create_authorization_manager' is not used.
from core.auth_utils import create_authorization_manager
backend/application/chat/utilities/error_utils.py:334
- Illegal class 'NoneType' raised; will result in a TypeError being raised instead.
raise last_error
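On that last suppressed comment (`raise last_error` at error_utils.py:334): if every retry path leaves `last_error` as `None`, re-raising it produces a `TypeError` rather than a meaningful error. A minimal sketch of a guarded re-raise, assuming the `LLMServiceError` type this PR adds; the helper name and wording are hypothetical:

```python
from domain.errors import LLMServiceError  # error type introduced in this PR


def _reraise_last_error(last_error):
    """Sketch: guard the final re-raise so a None last_error cannot surface
    as "TypeError: exceptions must derive from BaseException"."""
    if last_error is not None:
        raise last_error
    raise LLMServiceError("LLM call failed but no underlying error was captured")
```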
```python
error_types = ["server_error", "network_error", None, None, None, None]
error_type = random.choice(error_types)

if error_type:
```
Copilot AI (Nov 25, 2025)
The docstring states "~10% chance of server or network error" but the implementation has a 2/6 (approximately 33%) chance. The error_types list has 2 error values out of 6 total elements. Update the documentation to reflect the actual probability, or adjust the list to match the documented 10% (e.g., use 1 error type and 9 None values for ~10%).
```diff
-error_types = ["server_error", "network_error", None, None, None, None]
-error_type = random.choice(error_types)
-if error_type:
+# 1 in 10 chance (~10%) of simulating an error
+error_types = ["error"] + [None] * 9
+error_marker = random.choice(error_types)
+if error_marker:
+    error_type = random.choice(["server_error", "network_error"])
```
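A simpler route to the documented ~10% rate is to compare against `random.random()` directly. A sketch of how `should_simulate_error()` could be written under that approach; this is not part of the Copilot suggestion:

```python
import random
from typing import Optional


def should_simulate_error() -> Optional[str]:
    """Return an error type roughly 10% of the time, otherwise None (sketch)."""
    if random.random() < 0.10:  # matches the documented ~10% probability
        return random.choice(["server_error", "network_error"])
    return None
```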
```python
logger.warning("Rate limit exceeded, locking out for 30 seconds")
return False

from datetime import timedelta
```
Copilot AI (Nov 25, 2025)
The timedelta import should be moved to line 14 with the other datetime imports. Import statements should be organized at the top of the file, not scattered throughout the code.
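For illustration, a minimal shape the mock's `RateLimiter` could take, with `timedelta` imported once at the top of the file as the comment recommends. The request limit and window size are assumptions; only the 30-second lockout comes from the log message above.

```python
from datetime import datetime, timedelta
from typing import List, Optional


class RateLimiter:
    """Sketch: allow max_requests per rolling window, then lock out for 30 seconds."""

    def __init__(self, max_requests: int = 5, window_seconds: int = 60):
        self.max_requests = max_requests
        self.window = timedelta(seconds=window_seconds)
        self.requests: List[datetime] = []
        self.lockout_until: Optional[datetime] = None

    def is_allowed(self) -> bool:
        now = datetime.now()
        if self.lockout_until and now < self.lockout_until:
            return False
        # Keep only requests that are still inside the rolling window
        self.requests = [t for t in self.requests if now - t < self.window]
        if len(self.requests) >= self.max_requests:
            logger.warning("Rate limit exceeded, locking out for 30 seconds")
            self.lockout_until = now + timedelta(seconds=30)
            return False
        self.requests.append(now)
        return True
```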
```markdown
# Error Handling Improvements

## Problem
When backend errors occurred (especially rate limiting from services like Cerebras), users were left staring at a non-responsive UI with no indication of what went wrong. Errors were only visible in backend logs.

## Solution
Implemented comprehensive error classification and user-friendly error messaging system.

## Changes

### 1. New Error Types (`backend/domain/errors.py`)
- `RateLimitError` - For rate limiting scenarios
- `LLMTimeoutError` - For timeout scenarios
- `LLMAuthenticationError` - For authentication failures
- `LLMServiceError` - For generic LLM service failures

### 2. Error Classification (`backend/application/chat/utilities/error_utils.py`)
Added `classify_llm_error()` function that:
- Detects error type from exception class name or message content
- Returns appropriate domain error class
- Provides user-friendly message (shown in UI)
- Provides detailed log message (for debugging)

### 3. WebSocket Error Handling (`backend/main.py`)
Enhanced error handling to:
- Catch specific error types (RateLimitError, LLMTimeoutError, etc.)
- Send user-friendly messages to frontend
- Include `error_type` field for frontend categorization
- Log full error details for debugging

### 4. Tests
- `backend/tests/test_error_classification.py` - Unit tests for error classification
- `backend/tests/test_error_flow_integration.py` - Integration tests
- `scripts/demo_error_handling.py` - Visual demonstration

## Example: Rate Limiting Error

### Before
```
User sends message → Rate limit hit → UI sits there thinking forever
Backend logs: "litellm.RateLimitError: CerebrasException - We're experiencing high traffic..."
User: 🤷 *No idea what's happening*
```

### After
```
User sends message → Rate limit hit → Error displayed in chat
UI shows: "The AI service is experiencing high traffic. Please try again in a moment."
Backend logs: "Rate limit error: litellm.RateLimitError: CerebrasException - We're experiencing high traffic..."
User: ✅ *Knows to wait and try again*
```

## Error Messages

| Error Type | User Message | When It Happens |
|------------|--------------|-----------------|
| **RateLimitError** | "The AI service is experiencing high traffic. Please try again in a moment." | API rate limits exceeded |
| **LLMTimeoutError** | "The AI service request timed out. Please try again." | Request takes too long |
| **LLMAuthenticationError** | "There was an authentication issue with the AI service. Please contact your administrator." | Invalid API keys, auth failures |
| **LLMServiceError** | "The AI service encountered an error. Please try again or contact support if the issue persists." | Generic LLM service errors |

## Security & Privacy
- Sensitive details (API keys, etc.) NOT exposed to users
- Full error details logged for admin debugging
- User messages are helpful but non-technical

## Testing
Run the demonstration:
```bash
python scripts/demo_error_handling.py
```

Run tests:
```bash
cd backend
export PYTHONPATH=/path/to/atlas-ui-3/backend
python -m pytest tests/test_error_classification.py -v
python -m pytest tests/test_error_flow_integration.py -v
```
```
Copilot AI (Nov 25, 2025)
This markdown file is incorrectly wrapped in a code fence. The opening markdown on line 1 and closing on line 81 should be removed. Markdown documentation files should not be wrapped in code fences.
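To make the documentation above concrete, here is a minimal sketch of what the new domain errors and `classify_llm_error()` could look like. The class hierarchy, keyword heuristics, and return shape below are assumptions; only the type names and user-facing messages come from the PR.

```python
# Sketch: domain error types named in backend/domain/errors.py
class DomainError(Exception):
    pass

class LLMServiceError(DomainError):
    pass

class RateLimitError(LLMServiceError):
    pass

class LLMTimeoutError(LLMServiceError):
    pass

class LLMAuthenticationError(LLMServiceError):
    pass


def classify_llm_error(exc: Exception):
    """Map a raw LLM exception to (error_class, user_message, log_message). Sketch only."""
    name = type(exc).__name__.lower()
    text = str(exc).lower()

    if "ratelimit" in name or "rate limit" in text or "high traffic" in text:
        return (RateLimitError,
                "The AI service is experiencing high traffic. Please try again in a moment.",
                f"Rate limit error: {exc}")
    if "timeout" in name or "timed out" in text:
        return (LLMTimeoutError,
                "The AI service request timed out. Please try again.",
                f"Timeout error: {exc}")
    if "auth" in name or "api key" in text or "unauthorized" in text:
        return (LLMAuthenticationError,
                "There was an authentication issue with the AI service. Please contact your administrator.",
                f"Authentication error: {exc}")
    return (LLMServiceError,
            "The AI service encountered an error. Please try again or contact support if the issue persists.",
            f"LLM service error: {exc}")
```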
```markdown
# Error Flow Diagram

## Complete Error Handling Flow

```
USER SENDS MESSAGE
  -> WebSocket handler handle_chat() (main.py)
  -> ChatService.handle_chat_message() (service.py)
  -> ChatOrchestrator.execute() (orchestrator.py)
  -> ToolsModeRunner.run() (modes/tools.py)
  -> error_utils.safe_call_llm_with_tools() (utilities/error_utils.py)
  -> LLMCaller.call_with_tools() (modules/llm/litellm_caller.py)
  -> LiteLLM library (calls Cerebras/OpenAI/etc.)

SUCCESS (200 OK): the response flows back up unchanged.

ERROR (e.g. rate limit):
  Exception: RateLimitError "We're experiencing high traffic right now!"
  -> error_utils.classify_llm_error(exception) returns:
       - error_class: RateLimitError
       - user_msg: "The AI service is experiencing high traffic..."
       - log_msg: full details
  -> raise RateLimitError(user_msg)
  -> back to the WebSocket handler (main.py), exception catching:
       except RateLimitError: send {type: "error", message: <user-friendly msg>, error_type: "rate_limit"}
       except LLMTimeoutError / LLMAuthenticationError / ValidationError / etc.: send appropriate message to user

WebSocket message sent:
  {
    "type": "error",
    "message": "The AI service is experiencing high traffic...",
    "error_type": "rate_limit"
  }

Frontend (websocketHandlers.js):
  case 'error':
    setIsThinking(false)
    addMessage({
      role: 'system',
      content: `Error: ${data.message}`,
      timestamp: new Date().toISOString()
    })

UI DISPLAYS ERROR as a system message:
  "Error: The AI service is experiencing high traffic. Please try again in a moment."
  [User can see the error and knows what to do]
```

## Key Points

1. **Error Classification**: The `classify_llm_error()` function examines the exception type and message to determine the appropriate error category.
2. **User-Friendly Messages**: Technical errors are translated into helpful, actionable messages for users.
3. **Detailed Logging**: Full error details are logged for debugging purposes (not shown to users).
4. **Error Type Field**: The `error_type` field allows the frontend to potentially handle different error types differently in the future (e.g., automatic retry for timeouts).
5. **No Sensitive Data Exposure**: API keys, stack traces, and other sensitive information are never sent to the frontend.
```
Copilot AI (Nov 25, 2025)
This markdown file is incorrectly wrapped in a code fence. The opening markdown on line 1 and closing on line 156 should be removed. Markdown documentation files should not be wrapped in code fences.
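The handler side of the flow the diagram describes might look roughly like this: a hedged sketch assuming FastAPI WebSockets and the domain errors above, not the exact code in backend/main.py. The "rate_limit" error_type value appears in the diagram; the other error_type strings here are assumptions.

```python
import logging

from domain.errors import (  # error types introduced in this PR
    RateLimitError,
    LLMTimeoutError,
    LLMAuthenticationError,
)

logger = logging.getLogger(__name__)


async def handle_chat(websocket, chat_service, payload):
    """Sketch of the catch-and-report step the diagram shows; not the actual handler."""
    try:
        await chat_service.handle_chat_message(payload)
    except RateLimitError as exc:
        await websocket.send_json({
            "type": "error",
            "message": str(exc),  # classify_llm_error already produced a user-friendly message
            "error_type": "rate_limit",
        })
    except LLMTimeoutError as exc:
        await websocket.send_json({"type": "error", "message": str(exc), "error_type": "timeout"})
    except LLMAuthenticationError as exc:
        await websocket.send_json({"type": "error", "message": str(exc), "error_type": "authentication"})
    except Exception:
        # Log full details server-side; never send stack traces or keys to the client.
        logger.exception("Unhandled error in chat handler")
        await websocket.send_json({
            "type": "error",
            "message": "The AI service encountered an error. Please try again or contact support if the issue persists.",
            "error_type": "llm_error",
        })
```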
# Implementation Complete: Rate Limiting & Backend Error Reporting

## ✅ Task Completed Successfully

All backend errors (including rate limiting) are now properly reported to users with helpful, actionable messages.

---

## What Was Changed

### 1. Error Classification System
Created a comprehensive error detection and classification system that:
- Detects rate limit errors (Cerebras, OpenAI, etc.)
- Detects timeout errors
- Detects authentication failures
- Handles generic LLM errors

### 2. User-Friendly Error Messages
Users now see helpful messages instead of silence:

| Situation | User Sees |
|-----------|-----------|
| Rate limit hit | "The AI service is experiencing high traffic. Please try again in a moment." |
| Request timeout | "The AI service request timed out. Please try again." |
| Auth failure | "There was an authentication issue with the AI service. Please contact your administrator." |
| Other errors | "The AI service encountered an error. Please try again or contact support if the issue persists." |

### 3. Security & Privacy
- ✅ No sensitive information (API keys, internal errors) exposed to users
- ✅ Full error details still logged for debugging
- ✅ CodeQL security scan: 0 vulnerabilities

---

## Files Modified (8 files, 501 lines)

### Backend Core
- `backend/domain/errors.py` - New error types
- `backend/application/chat/utilities/error_utils.py` - Error classification logic
- `backend/main.py` - Enhanced WebSocket error handling

### Tests (All Passing ✅)
- `backend/tests/test_error_classification.py` - 9 unit tests
- `backend/tests/test_error_flow_integration.py` - 4 integration tests

### Documentation
- `docs/error_handling_improvements.md` - Complete guide
- `docs/error_flow_diagram.md` - Visual flow diagram
- `scripts/demo_error_handling.py` - Interactive demonstration

---

## How to Test

### 1. Run Automated Tests
```bash
cd backend
export PYTHONPATH=/path/to/atlas-ui-3/backend
python -m pytest tests/test_error_classification.py tests/test_error_flow_integration.py -v
```
**Result**: 13/13 tests passing ✅

### 2. View Demonstration
```bash
python scripts/demo_error_handling.py
```
Shows examples of all error types and their user-friendly messages.

### 3. Manual Testing (Optional)
To see the error handling in action:
1. Start the backend server
2. Configure an invalid API key or trigger a rate limit
3. Send a message through the UI
4. Observe the error message displayed to the user

---

## Before & After Example

### Before (The Problem)
```
User: *Sends a message*
Backend: *Hits Cerebras rate limit*
UI: *Sits there thinking... forever*
Backend Logs: "litellm.RateLimitError: We're experiencing high traffic..."
User: 🤷 "Is it broken? Should I refresh? Wait?"
```

### After (The Solution)
```
User: *Sends a message*
Backend: *Hits Cerebras rate limit*
UI: *Shows error message in chat*
    "The AI service is experiencing high traffic.
     Please try again in a moment."
Backend Logs: "Rate limit error: litellm.RateLimitError: ..."
User: ✅ "OK, I'll wait a bit and try again"
```

---

## Key Benefits

1. **Better User Experience**: Users know what happened and what to do
2. **Reduced Support Burden**: Fewer "why isn't it working?" questions
3. **Maintained Security**: No sensitive data exposed
4. **Better Debugging**: Full error details still logged
5. **Extensible**: Easy to add new error types in the future

---

## What Happens Now

The error classification system is now active and will:
- Automatically detect and classify backend errors
- Send user-friendly messages to the frontend
- Log detailed error information for debugging
- Work for any LLM provider (Cerebras, OpenAI, Anthropic, etc.)

No further action needed - the system is ready to use!

---

## Documentation

For more details, see:
- `docs/error_handling_improvements.md` - Complete technical documentation
- `docs/error_flow_diagram.md` - Visual diagram of error flow
- Code comments in modified files

---

## Security Verification

✅ CodeQL Security Scan: **0 alerts**
✅ Code Review: **All comments addressed**
✅ Tests: **13/13 passing**
✅ No sensitive data exposure verified
Copilot AI (Nov 25, 2025)
Remove emojis from this documentation file. The codebase convention is "No emojis please" in code or docs. Replace checkmarks and other emojis with text equivalents (e.g., "✅" → "[PASS]" or "DONE").
```python
print("""
✅ All errors are now properly classified and communicated to users

Key improvements:
1. Rate limit errors → Clear message to wait and try again
2. Timeout errors → Clear message about timeout, suggest retry
3. Auth errors → User told to contact admin (no key exposure)
4. Generic errors → Helpful message with support guidance

✅ Detailed error information is still logged for debugging
✅ No sensitive information is exposed to users
✅ Users are no longer left wondering what happened
""")
```
Copilot AI (Nov 25, 2025)
Remove emojis from this script. The codebase convention is "No emojis please" in code or docs. Replace checkmarks with text equivalents (e.g., "✅" → "[PASS]" or "DONE").
```
User: 🤷 *No idea what's happening*
```

### After
```
User sends message → Rate limit hit → Error displayed in chat
UI shows: "The AI service is experiencing high traffic. Please try again in a moment."
Backend logs: "Rate limit error: litellm.RateLimitError: CerebrasException - We're experiencing high traffic..."
User: ✅ *Knows to wait and try again*
```
Copilot AI (Nov 25, 2025)
Remove emojis from this documentation file. The codebase convention is "No emojis please" in code or docs. Replace emojis with text equivalents.
```python
)
from domain.sessions.models import Session
from domain.errors import DomainError
from interfaces.llm import LLMProtocol, LLMResponse
```
Copilot AI (Nov 25, 2025)
Import of 'LLMResponse' is not used.
```diff
-from interfaces.llm import LLMProtocol, LLMResponse
+from interfaces.llm import LLMProtocol
```
Backend errors (rate limits, timeouts, auth failures) were logged but never surfaced to users, leaving the UI in a perpetual thinking state with no feedback.
Changes
- Error Classification (`backend/application/chat/utilities/error_utils.py`): `classify_llm_error()` to detect error types from exception content
- Domain Errors (`backend/domain/errors.py`): `RateLimitError`, `LLMTimeoutError`, `LLMAuthenticationError`
- WebSocket Error Handling (`backend/main.py`): error responses include an `error_type` field

Example:
Error messages are user-friendly, security-conscious (no API key exposure), and extensible.
Tests: 13 new tests covering classification logic and error flow
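As a flavor of what those classification tests can assert, here is a hypothetical unit test, not copied from backend/tests/test_error_classification.py; the import path and the three-value return shape of classify_llm_error() are assumptions.

```python
# Hypothetical test: a rate-limit-looking exception maps to RateLimitError
# and a message that is safe to show users.
from application.chat.utilities.error_utils import classify_llm_error
from domain.errors import RateLimitError


def test_rate_limit_is_classified_with_friendly_message():
    raw = Exception("litellm.RateLimitError: CerebrasException - We're experiencing high traffic...")

    error_class, user_msg, log_msg = classify_llm_error(raw)

    assert error_class is RateLimitError
    assert "high traffic" in user_msg
    assert "api key" not in user_msg.lower()  # nothing sensitive leaks to the UI
```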