Conversation
…resource conflict detection

Backend changes:
- Add comprehensive schemas for model health, size, server info, and selection analysis
- Implement LiteLLM client methods for /v1/model/info and /health/latest endpoints
- Create model processor service with intelligent grouping, size parsing, and provider inference
- Add new API endpoints: /models/by-server and /models/analyze-selection
- Support duplicate detection across multiple Ollama servers

Frontend changes:
- Create ServerGroup component to display models grouped by Ollama host
- Enhance ModelCard with health status, size badges, and performance metrics
- Add SelectionAnalysis component with real-time conflict warnings and recommendations
- Update useLLMModels hook to support server grouping and selection analysis
- Create utility functions for model selection and formatting
- Add comprehensive CSS styling with color-coded health/size indicators

Key features:
- Group models by Ollama server (10.0.1.14, 10.0.1.150, etc.)
- Display health status with response times from LiteLLM health checks
- Show model sizes (tiny/small/medium/large) with memory estimates
- Detect and warn about resource conflicts (multiple large models on the same host)
- Identify duplicate models across servers with performance recommendations
- Resolve :latest tags to actual versions (e.g., llama3.2:latest → 3b)
- Backward compatible with the simple flat model list view

This helps users avoid memory contention by preventing selection of multiple large models on the same Ollama server, improving council debate performance.
…d of single column
Pull request overview
This PR implements a comprehensive server-grouped model list with health monitoring and resource conflict detection for LiteLLM models across multiple Ollama servers. The feature helps users avoid memory contention by detecting and warning about resource conflicts when selecting multiple large models on the same host.
Key Changes:
- Added backend services for model processing with intelligent grouping, size parsing, provider inference, and health status integration
- Created new API endpoints for server-grouped models and selection analysis with conflict detection
- Built frontend components for server grouping visualization, health status display, and real-time selection analysis with warnings
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 24 comments.
| File | Description |
|---|---|
| frontend/src/utils/modelSelection.js | New utility functions for model lookup, diversity analysis, server statistics, and formatting helpers |
| frontend/src/styles/index.css | Comprehensive styling for server groups, health badges, size tiers, warnings, and selection analysis UI |
| frontend/src/hooks/useLLMModels.js | Enhanced hook with server grouping support, selection tracking, and analysis integration |
| frontend/src/components/ServerGroup.jsx | New component to display models grouped by Ollama server with collapsible sections and resource warnings |
| frontend/src/components/SelectionAnalysis.jsx | Component showing diversity scores, warnings, and recommendations for model selections |
| frontend/src/components/ModelCard.jsx | Enhanced card with health status, size badges, performance metrics, and duplicate detection |
| frontend/src/components/QueryInput.jsx | Updated to support both server-grouped and flat model views with selection analysis integration |
| frontend/src/App.jsx | Integration of new hooks and props for server grouping functionality |
| backend/services/model_processor.py | Core service for parsing model metadata, grouping by server, detecting duplicates, and analyzing selections |
| backend/services/litellm_client.py | Added methods to fetch model info and health status from LiteLLM endpoints |
| backend/models/schemas.py | New schemas for model health, size, server info, warnings, and selection analysis |
| backend/api/config.py | New endpoints for server-grouped models and selection analysis with caching |
```jsx
{rec.from_server.split('//')[1]} to{' '}
{rec.to_server.split('//')[1]}
```
Parsing the server URL with split('//')[1] yields undefined when the URL doesn't contain '//', so the UI silently renders nothing (and the call throws if the field itself is undefined). Add defensive checks or use the URL API for more robust parsing to handle edge cases where the server URL format might be unexpected.
```python
all_models = []
for server_data in server_map.values():
    all_models.extend(server_data["models"])

# Group by base_model:actual_tag to find duplicates
model_groups = {}
for model in all_models:
    key = f"{model.base_model}:{model.actual_tag}"
    if key not in model_groups:
        model_groups[key] = []
    model_groups[key].append(model)

# Mark duplicates and set better_server
for key, models in model_groups.items():
    if len(models) > 1:
        # Sort by TPM (descending) to find best server
        models_sorted = sorted(
            models,
            key=lambda m: m.server_tpm,
            reverse=True
        )
        best_server = models_sorted[0].api_base

        for model in models:
            model.is_duplicate = True
            model.duplicate_count = len(models)
            if model.api_base != best_server:
                model.better_server = best_server
```
The duplicate detection iterates through all_models multiple times. With many models across servers, this could become inefficient. Consider collecting models into model_groups dictionary in a single pass during the initial model creation loop (lines 159-226) rather than iterating again through all_models.
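A self-contained sketch of that single-pass idea, bucketing models while walking server_map once instead of first flattening into all_models. The Model dataclass here is only a stand-in with the field names the snippet above assumes (base_model, actual_tag, api_base, server_tpm):

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional

@dataclass
class Model:
    # Minimal stand-in for the real processed-model object
    base_model: str
    actual_tag: str
    api_base: str
    server_tpm: int
    is_duplicate: bool = False
    duplicate_count: int = 0
    better_server: Optional[str] = None

def mark_duplicates(server_map):
    # Bucket models while walking server_map once, instead of
    # flattening into all_models and iterating a second time
    groups = defaultdict(list)
    for server_data in server_map.values():
        for model in server_data["models"]:
            groups[f"{model.base_model}:{model.actual_tag}"].append(model)

    # Only the (much smaller) duplicate groups need another pass
    for models in groups.values():
        if len(models) > 1:
            best = max(models, key=lambda m: m.server_tpm).api_base
            for model in models:
                model.is_duplicate = True
                model.duplicate_count = len(models)
                if model.api_base != best:
                    model.better_server = best
```

Using max() instead of a full sort also avoids the O(n log n) step when only the best server is needed.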
```python
global _server_groups_cache

# Use cache if available (TODO: add expiration logic)
if _server_groups_cache is not None:
    return _server_groups_cache

client = LiteLLMClient()
raw_data = await client.get_model_info()

server_groups = await process_models_with_health(raw_data)
_server_groups_cache = server_groups
```
The global _server_groups_cache is accessed and modified without any locking mechanism in an async context. This can lead to race conditions where multiple concurrent requests might all see the cache as None and trigger redundant expensive processing operations. Consider using asyncio.Lock to protect cache access or implement a proper async caching solution.
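A common fix is double-checked locking with asyncio.Lock. A self-contained sketch, where _load_server_groups() is a hypothetical stand-in for the real LiteLLM fetch-and-process path:

```python
import asyncio

_server_groups_cache = None
_cache_lock = asyncio.Lock()

async def _load_server_groups():
    # Hypothetical stand-in for the real fetch + processing work
    await asyncio.sleep(0)
    return {"servers": []}

async def get_server_groups():
    global _server_groups_cache
    # Fast path: no lock contention once the cache is populated
    if _server_groups_cache is not None:
        return _server_groups_cache
    async with _cache_lock:
        # Re-check inside the lock: a concurrent task may have
        # filled the cache while this one was waiting
        if _server_groups_cache is None:
            _server_groups_cache = await _load_server_groups()
        return _server_groups_cache
```

Concurrent callers that race past the fast path serialize on the lock, so the expensive load runs at most once.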
```python
    """
    global _server_groups_cache

    # Use cache if available (TODO: add expiration logic)
```
The TODO comment mentions adding expiration logic but provides no details about the intended cache duration or invalidation strategy. Consider documenting the expected cache behavior or creating a proper ticket to track this work. Without expiration, the cache will become stale indefinitely after the first request.
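If expiration is added, a minimal monotonic-clock TTL is usually enough. The names below are illustrative, not the PR's actual API:

```python
import time

CACHE_TTL_SECONDS = 120  # assumed default; made configurable later in the PR

_cache_value = None
_cache_time = 0.0

def get_cached(loader, ttl=CACHE_TTL_SECONDS):
    """Return the cached value, reloading via loader() once the TTL lapses."""
    global _cache_value, _cache_time
    now = time.monotonic()
    # A None value is treated as a miss in this sketch
    if _cache_value is None or now - _cache_time > ttl:
        _cache_value = loader()
        _cache_time = now
    return _cache_value
```

time.monotonic() is preferred over time.time() here because wall-clock adjustments would otherwise extend or shorten the TTL.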
```jsx
<div
  className="server-header"
  onClick={() => setExpanded(!expanded)}
>
```
The server header is clickable to expand/collapse but has no ARIA attributes to indicate its role or state. Add role="button", aria-expanded={expanded}, and aria-controls attributes to improve accessibility for screen reader users who need to understand that this is an expandable section.
```python
warnings.append(SelectionWarning(
    severity="high",
    server=api_base,
    message=f"⚠️ {len(large_models)} large models selected on {api_base.split('//')[1]}. Expect significant delays during model swapping.",
```
The string formatting split('//')[1] will throw an IndexError if the api_base doesn't contain '//'. This same issue appears multiple times in the codebase. Add defensive checks to handle malformed URLs gracefully, or use urllib.parse to properly parse URLs.
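A defensive helper along those lines, using urllib.parse and falling back to the raw string when no network location can be extracted (the helper name is illustrative):

```python
from urllib.parse import urlparse

def extract_host(api_base: str) -> str:
    """Return host:port from a server URL, or the raw string as a fallback."""
    parsed = urlparse(api_base)
    # netloc is empty when the string has no '//' network location,
    # so malformed inputs pass through unchanged instead of raising
    return parsed.netloc or api_base
```

Calling it everywhere the message is built keeps the formatting logic in one place instead of repeating split('//')[1].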
```jsx
<button
  className="details-toggle"
  onClick={(e) => {
    e.stopPropagation();
    setShowDetails(!showDetails);
  }}
>
  {showDetails ? '▲' : '▼'}
</button>
```
The details toggle button should have an aria-label or aria-expanded attribute to indicate its purpose and state to screen reader users. Consider adding aria-expanded={showDetails} and aria-label="Toggle model details" for better accessibility.
```js
  selectionAnalysis,
  loading: modelsLoading,
  handleModelSelect,
  clearSelection
```
Unused variable clearSelection.
```diff
- clearSelection
```
```python
Model processing service for organizing and analyzing LiteLLM models
"""
import re
from typing import Dict, List, Tuple
```
Import of 'Tuple' is not used.
```diff
- from typing import Dict, List, Tuple
+ from typing import Dict, List
```
```python
"""
import re
from typing import Dict, List, Tuple
from datetime import datetime
```
Import of 'datetime' is not used.
```diff
- from datetime import datetime
```
…sibility improvements

- Fix URL parsing vulnerabilities using urlparse/URL API with fallbacks
- Fix cache race condition with asyncio.Lock
- Implement configurable cache expiration (default 120s via CACHE_TTL_SECONDS)
- Remove unused imports (Tuple, datetime) and variables (clearSelection)
- Add accessibility attributes to ServerGroup and ModelCard components
- Replace non-semantic div with button for ServerGroup expandable header
Pull request overview
Copilot reviewed 13 out of 13 changed files in this pull request and generated 20 comments.
```js
export function findModelById(modelId, serverGroups) {
  for (const group of serverGroups) {
    const model = group.models.find(m => m.id === modelId);
    if (model) return model;
  }
  return null;
}
```
The function returns null when a model isn't found instead of a more descriptive result. While this works, the function name 'findModelById' suggests it should return the model or throw an error. Consider renaming to 'findModelByIdOrNull' or documenting the null return behavior clearly to avoid confusion.
```js
    case 'unhealthy':
      return '✗';
    default:
      return '?';
```
The default case returns '?' for unknown health status, but this icon may not be intuitive to users. Consider using a more descriptive icon or text label like 'Unknown' or '⚪' (hollow circle) to better indicate uncertain status.
```diff
- return '?';
+ return 'Unknown';
```
```jsx
  getSizeClassName
} from '../utils/modelSelection';

/**
 * Safely extract hostname from URL string
 * @param {string} url - URL to parse (e.g., "http://192.168.1.100:11434")
 * @returns {string} hostname or full URL if parsing fails
 */
function extractHostname(url) {
  try {
    const parsed = new URL(url);
    return parsed.hostname;
  } catch {
    // Fallback: try to extract after // and before :
    const match = url.match(/\/\/([^:\/]+)/);
    return match ? match[1] : url;
  }
}
```
The same extractHostname function is duplicated here as well. This is a third occurrence of identical code across the codebase. Consolidate all instances into a single utility function in modelSelection.js.
```diff
-   getSizeClassName
- } from '../utils/modelSelection';
-
- /**
-  * Safely extract hostname from URL string
-  * @param {string} url - URL to parse (e.g., "http://192.168.1.100:11434")
-  * @returns {string} hostname or full URL if parsing fails
-  */
- function extractHostname(url) {
-   try {
-     const parsed = new URL(url);
-     return parsed.hostname;
-   } catch {
-     // Fallback: try to extract after // and before :
-     const match = url.match(/\/\/([^:\/]+)/);
-     return match ? match[1] : url;
-   }
- }
+   getSizeClassName,
+   extractHostname,
+ } from '../utils/modelSelection';
```
```js
function getDiversityLevel(score) {
  if (score >= 70) return 'good';
  if (score >= 40) return 'fair';
  return 'poor';
}
```
The getDiversityLevel function is defined at the bottom of the file and only used internally. Following React conventions, helper functions should typically be defined outside the component (at the top of the file) or explicitly marked as internal. Consider moving it above the component definition for better code organization.
```python
CACHE_TTL_SECONDS = int(os.getenv("CACHE_TTL_SECONDS", "120"))  # Default 2 minutes
```
The CACHE_TTL_SECONDS environment variable is read with a string default but immediately cast to int. If the environment variable contains an invalid value, this will raise a ValueError. Consider adding error handling or validation for the environment variable value.
```diff
- CACHE_TTL_SECONDS = int(os.getenv("CACHE_TTL_SECONDS", "120"))  # Default 2 minutes
+ def _get_cache_ttl(default: int = 120) -> int:
+     """
+     Retrieve cache TTL from environment, falling back to a safe default if invalid.
+     """
+     raw_value = os.getenv("CACHE_TTL_SECONDS")
+     if raw_value is None:
+         return default
+     try:
+         ttl = int(raw_value)
+     except ValueError:
+         return default
+     # Ensure TTL is positive; otherwise, fall back to default.
+     if ttl <= 0:
+         return default
+     return ttl
+
+ CACHE_TTL_SECONDS = _get_cache_ttl()  # Default 2 minutes
```
```jsx
return (
  <div
    className={`model-card ${isSelected ? 'selected' : ''} ${getSizeClassName(model.size.size_tier)}`}
```
The CSS class 'selected' is applied to model cards, but this class is not defined in the stylesheet. The code should use 'model-card-selected' which is defined at line 36, or the missing 'selected' class should be added to the stylesheet.
```diff
- className={`model-card ${isSelected ? 'selected' : ''} ${getSizeClassName(model.size.size_tier)}`}
+ className={`model-card ${isSelected ? 'model-card-selected' : ''} ${getSizeClassName(model.size.size_tier)}`}
```
```js
  hasConflict: largeModels.length > 1 || (largeModels.length >= 1 && mediumModels.length >= 1)
};
```
The conflict detection logic uses a hardcoded threshold (hasConflict: largeModels.length > 1 || ...). This business logic is duplicated in multiple places (here and in model_processor.py). Consider extracting this to a shared constant or configuration to ensure consistency across the codebase.
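On the backend side, the rule could sit behind one named helper so both call sites reference the same thresholds. A sketch with assumed names (the constant and function do not exist in the PR):

```python
# Single source of truth for the resource-conflict rule
MAX_LARGE_MODELS_PER_HOST = 1

def has_resource_conflict(large_count: int, medium_count: int) -> bool:
    """True when a host carries >1 large model, or a large+medium mix."""
    return (
        large_count > MAX_LARGE_MODELS_PER_HOST
        or (large_count >= 1 and medium_count >= 1)
    )
```

The frontend copy would then mirror this helper (or fetch the verdict from the API) rather than re-encoding the thresholds inline.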
```js
export function formatMemory(gb) {
  if (gb < 1) {
    return `${Math.round(gb * 1024)}MB`;
  } else {
    return `${gb}GB`;
  }
}
```
The formatMemory function converts values less than 1 GB to MB but doesn't handle the edge case where gb is 0 or negative. While unlikely with the current data, adding validation would make the function more robust.
```jsx
checked={isSelected}
onChange={onSelect}
onClick={(e) => e.stopPropagation()}
tabIndex={-1}
```
The checkbox inside the model card has tabIndex={-1} and aria-hidden="true", which removes it from keyboard navigation. However, the parent div handles keyboard events. This creates a confusing experience where the visual checkbox can't be directly focused. Consider either making the checkbox focusable or using a custom styled element that better reflects the actual interaction model.
```diff
- tabIndex={-1}
```
```python
@router.post("/models/analyze-selection", response_model=SelectionAnalysis)
async def analyze_model_selection(request: AnalyzeSelectionRequest):
    """
    Analyze selected models for resource conflicts and provide recommendations
    """
    # Get current server groups
    server_groups = await get_models_by_server()

    # Analyze selection
    analysis = await analyze_selection(request.model_ids, server_groups)

    return analysis
```
The analyze_selection endpoint doesn't validate that the model_ids in the request actually exist in the current server groups before processing. While the code handles missing models gracefully by filtering them out, this could hide issues where clients send stale or incorrect model IDs. Consider adding validation or returning information about invalid IDs.
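One way to surface stale IDs instead of silently dropping them. This sketch assumes a dict-shaped server group ({"models": [{"id": ...}]}), which may differ from the PR's actual Pydantic models:

```python
def split_valid_ids(model_ids, server_groups):
    """Partition requested IDs into (known, unknown) against current groups."""
    known = {m["id"] for g in server_groups for m in g["models"]}
    valid = [mid for mid in model_ids if mid in known]
    unknown = [mid for mid in model_ids if mid not in known]
    return valid, unknown
```

The endpoint could then include the unknown list in the response (or raise a 422) so clients learn their model list is stale.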
- Consolidate duplicate extractHostname() function into shared modelSelection.js utility
- Add error handling for CACHE_TTL_SECONDS parsing with validation and logging
- Update CLAUDE.md with Docker build validation requirements
- Addresses Copilot review feedback: DRY principle and crash prevention