fix: pop up errors during onboarding, make health banner only appear after onboarding, make OpenAI not trigger model fetching when not selected (#616)
Conversation
@lucaseduoli I might need to test it, since onboarding will let you know of any issues.
Pull request overview
This pull request improves onboarding state management and API query enablement across the application. The changes ensure that queries for conversations, nudges, and provider health are only enabled after onboarding is complete, preventing premature API calls. The PR also enhances error messaging for provider validation failures by parsing and extracting detailed error information from various API response formats.
Key changes:
- Added onboarding completion tracking with cross-tab synchronization via localStorage and context API
- Updated query hooks to conditionally enable based on onboarding status and provider health
- Improved error message extraction from provider validation failures to show clearer, more actionable messages to users
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| `src/api/provider_validation.py` | Added error parsing functions to extract detailed messages from various provider API error formats; replaced generic error messages with specific error details |
| `frontend/contexts/chat-context.tsx` | Added `isOnboardingComplete` state with localStorage synchronization and cross-tab sync via storage events |
| `frontend/components/chat-renderer.tsx` | Updated to mark onboarding as complete in context when onboarding finishes or is skipped |
| `frontend/app/api/queries/useGetConversationsQuery.ts` | Added check to only enable query after onboarding completion |
| `frontend/app/api/queries/useGetNudgesQuery.ts` | Added checks to enable query only after onboarding completion and when the LLM provider is healthy |
| `frontend/app/api/queries/useProviderHealthQuery.ts` | Added checks to enable query after onboarding completion and disable during active ingestion; included `hasChatError` in query key for refetching |
| `frontend/components/provider-health-banner.tsx` | Improved error message display logic to handle cases where LLM and embedding have the same or different errors |
| `frontend/app/onboarding/_components/onboarding-card.tsx` | Fixed provider configuration check to only apply to the currently selected provider |
| `frontend/app/onboarding/_components/ibm-onboarding.tsx` | Updated formatting and query enablement logic for IBM Watson models |
```ts
// Include hasChatError in queryKey so React Query refetches when it changes
// This ensures the health check runs with test_completion=true when chat errors occur
const testCompletion = params?.test_completion ?? hasChatError;
const queryKey = ["provider", "health", testCompletion, hasChatError];
```
Including hasChatError in the query key will cause the query to refetch whenever hasChatError changes, even though the testCompletion value computed from it may not have changed. This could lead to unnecessary refetches. For example, if params?.test_completion is explicitly true, the query will refetch when hasChatError toggles between true/false, even though testCompletion remains true. Consider only including testCompletion in the query key, not hasChatError.
Suggested change:
```ts
const queryKey = ["provider", "health", testCompletion];
```
```ts
// If health data is not available yet, assume healthy (optimistic)
// Only disable if health data exists and shows LLM error
const { data: health } = useProviderHealthQuery();
const isLLMHealthy = health === undefined || (health?.status === "healthy" && !health?.llm_error);
```
The condition `health === undefined` will never be true because `useProviderHealthQuery()` is called without checking if it's enabled. When the query is disabled (e.g., during onboarding), `health` will be undefined, but the parent query will also be disabled. However, there's a circular dependency here: this query depends on the health query being enabled, but the health query is only enabled after onboarding is complete. During onboarding, both queries will be disabled, so this optimistic assumption won't help.
More importantly, the condition `health?.status === "healthy" && !health?.llm_error` checks both `status === "healthy"` and `!llm_error`, but according to the health query interface, if the status is "healthy", there shouldn't be an `llm_error`. This seems redundant. Consider simplifying to just `!health?.llm_error`, or document why both checks are necessary.
Suggested change:
```ts
// Only disable if health data shows LLM error
// According to the health query interface, if status is "healthy", llm_error will not be present
const { data: health } = useProviderHealthQuery();
const isLLMHealthy = !health?.llm_error;
```
```ts
(!!debouncedEndpoint && !!debouncedApiKey && !!debouncedProjectId) ||
  getFromEnv ||
  alreadyConfigured,
```
The enabled condition has been changed to require all three values (endpoint, apiKey, and projectId) to be present. However, when `getFromEnv` is true, `debouncedApiKey` will be empty (because `apiKey` is set to `""` on line 122). This means the query will never be enabled when using environment API keys, even though it should be. The condition should be: `enabled: (!!debouncedEndpoint && (!!debouncedApiKey || getFromEnv) && !!debouncedProjectId) || alreadyConfigured`.
Suggested change:
```ts
(
  !!debouncedEndpoint &&
  (!!debouncedApiKey || getFromEnv) &&
  !!debouncedProjectId
) || alreadyConfigured,
```
```python
            if "message" in error_data:
                return error_data["message"]

            # Generic format: {"message": "..."}
```
The comment on line 91 says "Generic format: {"message": "..."}" but the code is checking for "detail" field, not "message". The "message" field is already checked on line 88. This comment appears to be incorrect or the code logic doesn't match the comment.
Suggested change:
```python
            # Generic format: {"detail": "..."}
```
```python
def _parse_json_error_message(error_text: str) -> str:
    """Parse JSON error message and extract just the message field."""
    try:
        # Try to parse as JSON
        error_data = json.loads(error_text)

        if isinstance(error_data, dict):
            # WatsonX format: {"errors": [{"code": "...", "message": "..."}], ...}
            if "errors" in error_data and isinstance(error_data["errors"], list):
                errors = error_data["errors"]
                if len(errors) > 0 and isinstance(errors[0], dict):
                    message = errors[0].get("message", "")
                    if message:
                        return message
                    code = errors[0].get("code", "")
                    if code:
                        return f"Error: {code}"

            # OpenAI format: {"error": {"message": "...", "type": "...", "code": "..."}}
            if "error" in error_data:
                error_obj = error_data["error"]
                if isinstance(error_obj, dict):
                    message = error_obj.get("message", "")
                    if message:
                        return message

            # Direct message field
            if "message" in error_data:
                return error_data["message"]

            # Generic format: {"detail": "..."}
            if "detail" in error_data:
                return error_data["detail"]
    except (json.JSONDecodeError, ValueError, TypeError):
        pass

    # Return original text if not JSON or can't parse
    return error_text
```
The `_parse_json_error_message` function has significant code duplication with `_extract_error_details` (lines 51-108). Both functions parse the same error formats (WatsonX, OpenAI, generic) in nearly identical ways. Consider refactoring to eliminate this duplication, perhaps by making `_extract_error_details` call `_parse_json_error_message` for the JSON parsing logic, or by extracting the common parsing logic into a shared helper function.
```python
# If the error message contains JSON, parse it to extract just the message
error_str = str(e)
if "IBM Watson API error: " in error_str:
    json_part = error_str.split("IBM Watson API error: ", 1)[1]
    parsed_message = _parse_json_error_message(json_part)
    if parsed_message != json_part:
        raise Exception(f"IBM Watson API error: {parsed_message}")
```
The error re-parsing logic (lines 511-517) checks if the error string contains "IBM Watson API error: " and then tries to parse the JSON part. However, this same parsing is already done in _extract_error_details (line 498-502). This creates a double-parsing scenario that could be avoided. Consider handling the error message parsing completely within _extract_error_details to eliminate this redundant logic.
Suggested change: remove this duplicated re-parsing block.
```python
# If the error message contains JSON, parse it to extract just the message
error_str = str(e)
if "IBM Watson API error: " in error_str:
    json_part = error_str.split("IBM Watson API error: ", 1)[1]
    parsed_message = _parse_json_error_message(json_part)
    if parsed_message != json_part:
        raise Exception(f"IBM Watson API error: {parsed_message}")
```
This error re-parsing logic is identical to the one in _test_watsonx_completion_with_tools (lines 511-517). This duplication should be refactored into a shared helper function or eliminated by ensuring _extract_error_details handles the parsing completely.
```ts
  return localStorage.getItem(ONBOARDING_STEP_KEY) === null;
});

// Sync onboarding completion state with localStorage
```
The comment says "Sync onboarding completion state with localStorage", but the effect doesn't write to localStorage—it only reads from it. This effect is monitoring localStorage changes (including cross-tab changes) and updating component state accordingly. Consider updating the comment to something like "Monitor localStorage for onboarding completion changes".
Suggested change:
```ts
// Monitor localStorage for onboarding completion changes
```
```python
        # Generic format: {"detail": "..."}
        if "detail" in error_data:
            return error_data["detail"]
    except (json.JSONDecodeError, ValueError, TypeError):
```
'except' clause does nothing but pass and there is no explanatory comment.
Suggested change:
```python
    except (json.JSONDecodeError, ValueError, TypeError):
        # If parsing fails, fall back to returning the original error text.
```
Pull request overview
Copilot reviewed 12 out of 12 changed files in this pull request and generated 12 comments.
```ts
} else {
  // No filter to create, just complete

  // Wait a bit before completing
  setTimeout(() => {
    onComplete();
  }, 1000);
}
```
The onboarding completion callback is invoked inside both branches (filter creation and no filter), but the timing is different. When a filter is being created, onComplete() is called after the filter mutation completes (line 223). When no filter is needed, it's called immediately with a 1-second delay (line 231).
This creates a race condition where filter creation might not complete before onboarding is marked as complete, potentially causing the filter to not be properly associated. Consider always waiting for any async operations to complete before calling onComplete(), or at least ensure consistent timing.
```ts
// Set error message and jump back one step
setError(errorMessage);
setCurrentStep(STEP_LIST.length);
```
The error handling logic has a potential issue. When a file fails during ingestion (lines 105-151), the code:
- Sets the error message (line 140)
- Sets currentStep to STEP_LIST.length (line 141) to show "Done"
- Then after 1 second, resets to null (line 149)
However, setting currentStep to STEP_LIST.length (which equals 4) to indicate completion seems incorrect for an error state. The step list has indices 0-3, so index 4 would be out of bounds. This might cause the AnimatedProviderSteps component to show unexpected behavior.
Consider setting currentStep to a specific error state or to the last valid step index when an error occurs.
Suggested change:
```ts
setCurrentStep(STEP_LIST.length - 1);
```
```ts
// Check if there are any active ingestion tasks
const { data: tasks = [] } = useGetTasksQuery();
const hasActiveIngestion = tasks.some(
  (task) =>
    task.status === "pending" ||
    task.status === "running" ||
    task.status === "processing",
);
```
[nitpick] The hasActiveIngestion check disables health queries during ingestion, but the useGetTasksQuery() is called unconditionally. This means tasks are always being polled even when not needed. Since useGetTasksQuery likely polls frequently, this could be inefficient.
Consider making the tasks query conditional as well, or accepting that tasks will always be fetched since they're used elsewhere in the application.
```python
Raises:
    Exception: If validation fails with message "Setup failed, please try again or select a different provider."
    Exception: If validation fails, raises the original exception with the actual error message.
```
[nitpick] The docstring update is misleading. The function still raises an exception on validation failure, but the message has changed from a generic one to "raises the original exception with the actual error message." However, looking at line 175, it just re-raises the caught exception without modification.
The docstring should clarify that the function propagates the original error message from the provider API instead of replacing it with a generic message.
Suggested change:
```python
    Exception: If validation fails, propagates the original exception and its error message from the provider API.
```
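As a hedged illustration of the propagation behavior the docstring should describe (the names `validate` and `failing_check` are hypothetical; a bare `raise` is the mechanism the PR's line 175 uses):

```python
def validate(check) -> None:
    """Run a provider check.

    Raises:
        Exception: If validation fails, propagates the original exception
            and its error message from the provider API.
    """
    try:
        check()
    except Exception:
        # A bare `raise` re-raises the caught exception unchanged,
        # preserving its type, message, and traceback.
        raise


def failing_check():
    raise RuntimeError("401: invalid API key")
```

The caller therefore sees the provider's actual error text rather than a generic replacement.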
```ts
const [isOnboardingComplete, setIsOnboardingComplete] = useState(() => {
  if (typeof window === "undefined") return false;
  return localStorage.getItem("onboarding-step") === null;
```
There's an inconsistency in how onboarding completion is checked. In chat-context.tsx, the code checks if localStorage.getItem(ONBOARDING_STEP_KEY) === null (line 121), while in chat/page.tsx, it checks localStorage.getItem("onboarding-step") === null (line 644).
The constant ONBOARDING_STEP_KEY is imported in chat-context.tsx and equals "onboarding_current_step" (using underscore), but chat/page.tsx uses the hardcoded string "onboarding-step" (using hyphen).
This inconsistency means the two checks are looking at different localStorage keys and will produce different results. The chat page should import and use ONBOARDING_STEP_KEY constant for consistency.
```ts
    scoreThreshold: parsedFilterData?.scoreThreshold ?? 0,
  },
  {
    enabled: isOnboardingComplete && !isConversationsLoading, // Only fetch nudges after onboarding is complete AND chat history is not loading
```
The nudges query now depends on both isOnboardingComplete and !isConversationsLoading, but isConversationsLoading is extracted from useGetConversationsQuery which itself is only enabled when isOnboardingComplete is true. This creates a circular dependency where:
- Conversations query won't run until onboarding is complete
- Nudges query won't run until conversations are loaded
- But conversations query is disabled during onboarding
This means nudges will always wait for conversations to load first, even though they don't necessarily need to. Consider if this is the intended behavior or if nudges should be enabled as soon as onboarding completes, independent of conversations loading state.
Suggested change:
```ts
enabled: isOnboardingComplete, // Only fetch nudges after onboarding is complete
```
```python
def _parse_json_error_message(error_text: str) -> str:
    """Parse JSON error message and extract just the message field."""
    try:
        # Try to parse as JSON
        error_data = json.loads(error_text)

        if isinstance(error_data, dict):
            # WatsonX format: {"errors": [{"code": "...", "message": "..."}], ...}
            if "errors" in error_data and isinstance(error_data["errors"], list):
                errors = error_data["errors"]
                if len(errors) > 0 and isinstance(errors[0], dict):
                    message = errors[0].get("message", "")
                    if message:
                        return message
                    code = errors[0].get("code", "")
                    if code:
                        return f"Error: {code}"

            # OpenAI format: {"error": {"message": "...", "type": "...", "code": "..."}}
            if "error" in error_data:
                error_obj = error_data["error"]
                if isinstance(error_obj, dict):
                    message = error_obj.get("message", "")
                    if message:
                        return message

            # Direct message field
            if "message" in error_data:
                return error_data["message"]

            # Generic format: {"detail": "..."}
            if "detail" in error_data:
                return error_data["detail"]
    except (json.JSONDecodeError, ValueError, TypeError):
        pass

    # Return original text if not JSON or can't parse
    return error_text


def _extract_error_details(response: httpx.Response) -> str:
    """Extract detailed error message from API response."""
    try:
        # Try to parse JSON error response
        error_data = response.json()

        # Common error response formats
        if isinstance(error_data, dict):
            # WatsonX format: {"errors": [{"code": "...", "message": "..."}], ...}
            if "errors" in error_data and isinstance(error_data["errors"], list):
                errors = error_data["errors"]
                if len(errors) > 0 and isinstance(errors[0], dict):
                    # Extract just the message from the first error
                    message = errors[0].get("message", "")
                    if message:
                        return message
                    # Fallback to code if no message
                    code = errors[0].get("code", "")
                    if code:
                        return f"Error: {code}"

            # OpenAI format: {"error": {"message": "...", "type": "...", "code": "..."}}
            if "error" in error_data:
                error_obj = error_data["error"]
                if isinstance(error_obj, dict):
                    message = error_obj.get("message", "")
                    error_type = error_obj.get("type", "")
                    code = error_obj.get("code", "")
                    if message:
                        details = message
                        if error_type:
                            details += f" (type: {error_type})"
                        if code:
                            details += f" (code: {code})"
                        return details

            # Anthropic format: {"error": {"message": "...", "type": "..."}}
            if "message" in error_data:
                return error_data["message"]

            # Generic format: {"message": "..."}
            if "detail" in error_data:
                return error_data["detail"]

        # If JSON parsing worked but no structured error found, try parsing text
        response_text = response.text[:500]
        parsed = _parse_json_error_message(response_text)
        if parsed != response_text:
            return parsed
        return response_text

    except (json.JSONDecodeError, ValueError):
        # If JSON parsing fails, try parsing the text as JSON string
        response_text = response.text[:500] if response.text else f"HTTP {response.status_code}"
        parsed = _parse_json_error_message(response_text)
        if parsed != response_text:
            return parsed
        return response_text
```
[nitpick] There's code duplication in the error message parsing logic. The _parse_json_error_message function (lines 11-48) and _extract_error_details function (lines 51-108) contain very similar error parsing logic for WatsonX, OpenAI, and other formats.
The _extract_error_details function even calls _parse_json_error_message as a fallback (lines 97, 106), suggesting that these two functions could be better unified or one could delegate to the other more consistently to reduce duplication.
```ts
} = useGetIBMModelsQuery(
  {
    endpoint: debouncedEndpoint ? debouncedEndpoint : undefined,
    apiKey: getFromEnv ? "" : debouncedApiKey ? debouncedApiKey : undefined,
```
[nitpick] The ternary operator formatting is inconsistent and could be clearer. The condition `getFromEnv ? "" : debouncedApiKey ? debouncedApiKey : undefined` can be simplified to `apiKey: getFromEnv ? "" : (debouncedApiKey || undefined)`. This is more concise and achieves the same result: empty string when using environment variables, the debounced API key if it exists, or undefined otherwise.
Suggested change:
```ts
apiKey: getFromEnv ? "" : (debouncedApiKey || undefined),
```
```tsx
<AnimatedProviderSteps
  currentStep={currentStep}
  setCurrentStep={setCurrentStep}
  isCompleted={false}
  steps={STEP_LIST}
  storageKey={ONBOARDING_UPLOAD_STEPS_KEY}
  hasError={!!error}
/>
```
The hasError prop is passed to AnimatedProviderSteps but it's not clear if this component has been updated to accept and handle this prop. Please ensure that AnimatedProviderSteps has been updated accordingly to use the hasError prop for visual feedback.
```ts
)?.[0] ??
  undefined) &&
```
This use of variable 'undefined' always evaluates to false.
Suggested change:
```ts
)?.[0] !== undefined &&
```
edwinjosechittilappilly left a comment
LGTM, on testing it is working.
This pull request introduces several improvements to onboarding state management and query enablement logic throughout the frontend codebase. The key changes ensure that API queries for conversations, nudges, and provider health are only enabled after onboarding is complete, and in some cases, when the LLM provider is healthy and ingestion is not active. Additionally, onboarding completion is now tracked and synchronized across browser tabs, and the context API has been extended to expose onboarding status. There are also minor UI and logic fixes related to provider configuration and error messaging.
Onboarding State Management
- Added `isOnboardingComplete` and `setOnboardingComplete` to `ChatContext`, including logic to sync onboarding state with localStorage and across browser tabs. Onboarding completion is now set in context when onboarding finishes. (`frontend/contexts/chat-context.tsx`, `frontend/components/chat-renderer.tsx`)

Query Enablement Logic
- Updated `useGetConversationsQuery`, `useGetNudgesQuery`, and `useProviderHealthQuery` so queries are only enabled after onboarding is complete, and (for nudges) when the LLM provider is healthy, and (for provider health) when ingestion is not active. This prevents premature API calls before the app is ready. (`frontend/app/api/queries/useGetConversationsQuery.ts`, `frontend/app/api/queries/useGetNudgesQuery.ts`, `frontend/app/api/queries/useProviderHealthQuery.ts`)

Provider Health and Error Messaging
- Improved provider health checking and the error messages shown in the health banner. (`frontend/app/api/queries/useProviderHealthQuery.ts`, `frontend/components/provider-health-banner.tsx`)

Onboarding UI Logic
- Fixed provider configuration checks and query enablement in the onboarding components. (`frontend/app/onboarding/_components/onboarding-card.tsx`, `frontend/app/onboarding/_components/ibm-onboarding.tsx`)

Code Cleanup
- Minor formatting and cleanup. (`frontend/components/provider-health-banner.tsx`, `frontend/app/onboarding/_components/ibm-onboarding.tsx`)

Let me know if you'd like to go through any of these changes in more detail!