
feat: implement server-grouped model list with health monitoring and resource conflict detection#9

Merged
bjeans merged 6 commits into main from claude/organize-model-list-ZwkBE on Dec 20, 2025

Conversation

bjeans (Owner) commented Dec 18, 2025

Backend changes:

  • Add comprehensive schemas for model health, size, server info, and selection analysis
  • Implement LiteLLM client methods for /v1/model/info and /health/latest endpoints
  • Create model processor service with intelligent grouping, size parsing, and provider inference
  • Add new API endpoints: /models/by-server and /models/analyze-selection
  • Support for duplicate detection across multiple Ollama servers

Frontend changes:

  • Create ServerGroup component to display models grouped by Ollama host
  • Enhance ModelCard with health status, size badges, and performance metrics
  • Add SelectionAnalysis component with real-time conflict warnings and recommendations
  • Update useLLMModels hook to support server grouping and selection analysis
  • Create utility functions for model selection and formatting
  • Add comprehensive CSS styling with color-coded health/size indicators

Key features:

  • Group models by Ollama server (10.0.1.14, 10.0.1.150, etc.)
  • Display health status with response times from LiteLLM health checks
  • Show model sizes (tiny/small/medium/large) with memory estimates
  • Detect and warn about resource conflicts (multiple large models on same host)
  • Identify duplicate models across servers with performance recommendations
  • Resolve :latest tags to actual versions (e.g., llama3.2:latest → 3b)
  • Backward compatible with simple flat model list view

This helps users avoid memory contention by preventing selection of multiple
large models on the same Ollama server, improving council debate performance.

github-actions bot commented Dec 18, 2025

🔍 Vulnerabilities of bjeans/multi-ai-chat:pr-9

📦 Image Reference: bjeans/multi-ai-chat:pr-9
digest: sha256:d51bb603b20ae3ad860d2e83868dc4187b5b938df5cc13dd45cd622d35aaa9c6
vulnerabilities: critical: 0, high: 0, medium: 0, low: 0
size: 48 MB
packages: 74
📦 Base Image python:3-alpine
also known as
  • 3-alpine3.23
  • 3.14-alpine
  • 3.14-alpine3.23
  • 3.14.2-alpine
  • 3.14.2-alpine3.23
  • 897f7d46efcffcccfc367bab2142c1e247247caaaa53b9065240cad2531ee121
  • alpine
  • alpine3.23
digest: sha256:079b889b270e730f38bdbabf505362538db449a216895c2c45664b1538bd34d5
vulnerabilities: critical: 0, high: 0, medium: 0, low: 0

Copilot AI (Contributor) left a comment

Pull request overview

This PR implements a comprehensive server-grouped model list with health monitoring and resource conflict detection for LiteLLM models across multiple Ollama servers. The feature helps users avoid memory contention by detecting and warning about resource conflicts when selecting multiple large models on the same host.

Key Changes:

  • Added backend services for model processing with intelligent grouping, size parsing, provider inference, and health status integration
  • Created new API endpoints for server-grouped models and selection analysis with conflict detection
  • Built frontend components for server grouping visualization, health status display, and real-time selection analysis with warnings

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 24 comments.

Show a summary per file
File Description
frontend/src/utils/modelSelection.js New utility functions for model lookup, diversity analysis, server statistics, and formatting helpers
frontend/src/styles/index.css Comprehensive styling for server groups, health badges, size tiers, warnings, and selection analysis UI
frontend/src/hooks/useLLMModels.js Enhanced hook with server grouping support, selection tracking, and analysis integration
frontend/src/components/ServerGroup.jsx New component to display models grouped by Ollama server with collapsible sections and resource warnings
frontend/src/components/SelectionAnalysis.jsx Component showing diversity scores, warnings, and recommendations for model selections
frontend/src/components/ModelCard.jsx Enhanced card with health status, size badges, performance metrics, and duplicate detection
frontend/src/components/QueryInput.jsx Updated to support both server-grouped and flat model views with selection analysis integration
frontend/src/App.jsx Integration of new hooks and props for server grouping functionality
backend/services/model_processor.py Core service for parsing model metadata, grouping by server, detecting duplicates, and analyzing selections
backend/services/litellm_client.py Added methods to fetch model info and health status from LiteLLM endpoints
backend/models/schemas.py New schemas for model health, size, server info, warnings, and selection analysis
backend/api/config.py New endpoints for server-grouped models and selection analysis with caching

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +65 to +66
{rec.from_server.split('//')[1]} to{' '}
{rec.to_server.split('//')[1]}
Copilot AI, Dec 18, 2025:

The server URL parsing using split('//')[1] will throw an error if the URL doesn't contain '//'. Add defensive checks or use the URL API for more robust parsing to handle edge cases where the server URL format might be unexpected.

Comment on lines +229 to +257
all_models = []
for server_data in server_map.values():
    all_models.extend(server_data["models"])

# Group by base_model:actual_tag to find duplicates
model_groups = {}
for model in all_models:
    key = f"{model.base_model}:{model.actual_tag}"
    if key not in model_groups:
        model_groups[key] = []
    model_groups[key].append(model)

# Mark duplicates and set better_server
for key, models in model_groups.items():
    if len(models) > 1:
        # Sort by TPM (descending) to find best server
        models_sorted = sorted(
            models,
            key=lambda m: m.server_tpm,
            reverse=True
        )
        best_server = models_sorted[0].api_base

        for model in models:
            model.is_duplicate = True
            model.duplicate_count = len(models)
            if model.api_base != best_server:
                model.better_server = best_server

Copilot AI, Dec 18, 2025:

The duplicate detection iterates through all_models multiple times. With many models across servers, this could become inefficient. Consider collecting models into model_groups dictionary in a single pass during the initial model creation loop (lines 159-226) rather than iterating again through all_models.
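The single-pass grouping the reviewer describes could look like this. A sketch under assumed data shapes: `base_model`, `actual_tag`, and the `"models"` key come from the snippet above; the function name and everything else is illustrative, not code from the PR.

```python
from collections import defaultdict

def group_models(server_map):
    """Collect duplicate groups in one pass while walking the server map.

    Hypothetical helper: builds the same base_model:actual_tag groups as the
    original code, but without first flattening into an all_models list.
    """
    model_groups = defaultdict(list)
    for server_data in server_map.values():
        for model in server_data["models"]:
            # Same composite key the original loop builds
            model_groups[f"{model.base_model}:{model.actual_tag}"].append(model)
    return model_groups
```

The duplicate-marking pass over `model_groups` would then stay as-is; only the group construction is folded into the existing walk.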

Comment thread backend/api/config.py Outdated
Comment on lines +53 to +63
global _server_groups_cache

# Use cache if available (TODO: add expiration logic)
if _server_groups_cache is not None:
    return _server_groups_cache

client = LiteLLMClient()
raw_data = await client.get_model_info()

server_groups = await process_models_with_health(raw_data)
_server_groups_cache = server_groups
Copilot AI, Dec 18, 2025:

The global _server_groups_cache is accessed and modified without any locking mechanism in an async context. This can lead to race conditions where multiple concurrent requests might all see the cache as None and trigger redundant expensive processing operations. Consider using asyncio.Lock to protect cache access or implement a proper async caching solution.

Comment thread backend/api/config.py Outdated
"""
global _server_groups_cache

# Use cache if available (TODO: add expiration logic)
Copilot AI, Dec 18, 2025:

The TODO comment mentions adding expiration logic but provides no details about the intended cache duration or invalidation strategy. Consider documenting the expected cache behavior or creating a proper ticket to track this work. Without expiration, the cache will become stale indefinitely after the first request.

Comment thread frontend/src/components/ServerGroup.jsx Outdated
Comment on lines +18 to +21
<div
  className="server-header"
  onClick={() => setExpanded(!expanded)}
>
Copilot AI, Dec 18, 2025:

The server header is clickable to expand/collapse but has no ARIA attributes to indicate its role or state. Add role="button", aria-expanded={expanded}, and aria-controls attributes to improve accessibility for screen reader users who need to understand that this is an expandable section.

Comment thread backend/services/model_processor.py Outdated
warnings.append(SelectionWarning(
severity="high",
server=api_base,
message=f"⚠️ {len(large_models)} large models selected on {api_base.split('//')[1]}. Expect significant delays during model swapping.",
Copilot AI, Dec 18, 2025:

The string formatting split('//')[1] will throw an IndexError if the api_base doesn't contain '//'. This same issue appears multiple times in the codebase. Add defensive checks to handle malformed URLs gracefully, or use urllib.parse to properly parse URLs.
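The urllib.parse approach the comment suggests might look like this (a sketch; the function name and fallback behavior are assumptions, not code from the PR):

```python
from urllib.parse import urlparse

def display_host(api_base: str) -> str:
    """Return a short host label, falling back to the raw string for odd inputs."""
    parsed = urlparse(api_base)
    if parsed.netloc:            # present for URLs like http://10.0.1.14:11434
        return parsed.netloc
    return api_base              # no scheme or '//': show the value unmodified
```

Unlike `split('//')[1]`, this never raises on a bare hostname; it simply echoes the input back.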

Comment on lines +79 to +87
<button
  className="details-toggle"
  onClick={(e) => {
    e.stopPropagation();
    setShowDetails(!showDetails);
  }}
>
  {showDetails ? '▲' : '▼'}
</button>
Copilot AI, Dec 18, 2025:

The details toggle button should have an aria-label or aria-expanded attribute to indicate its purpose and state to screen reader users. Consider adding aria-expanded={showDetails} and aria-label="Toggle model details" for better accessibility.

Comment thread frontend/src/App.jsx Outdated
selectionAnalysis,
loading: modelsLoading,
handleModelSelect,
clearSelection
Copilot AI, Dec 18, 2025:

Unused variable clearSelection.

Suggested change: remove the unused clearSelection binding.

Comment thread backend/services/model_processor.py Outdated
Model processing service for organizing and analyzing LiteLLM models
"""
import re
from typing import Dict, List, Tuple
Copilot AI, Dec 18, 2025:

Import of 'Tuple' is not used.

Suggested change
from typing import Dict, List, Tuple
from typing import Dict, List

Comment thread backend/services/model_processor.py Outdated
"""
import re
from typing import Dict, List, Tuple
from datetime import datetime
Copilot AI, Dec 18, 2025:

Import of 'datetime' is not used.

Suggested change: remove the unused `from datetime import datetime` line.

…sibility improvements

  - Fix URL parsing vulnerabilities using urlparse/URL API with fallbacks
  - Fix cache race condition with asyncio.Lock
  - Implement configurable cache expiration (default 120s via CACHE_TTL_SECONDS)
  - Remove unused imports (Tuple, datetime) and variables (clearSelection)
  - Add accessibility attributes to ServerGroup and ModelCard components
  - Replace non-semantic div with button for ServerGroup expandable header
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 20 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +5 to +11
export function findModelById(modelId, serverGroups) {
  for (const group of serverGroups) {
    const model = group.models.find(m => m.id === modelId);
    if (model) return model;
  }
  return null;
}
Copilot AI, Dec 20, 2025:

The function returns null when a model isn't found instead of a more descriptive result. While this works, the function name 'findModelById' suggests it should return the model or throw an error. Consider renaming to 'findModelByIdOrNull' or documenting the null return behavior clearly to avoid confusion.

case 'unhealthy':
return '✗';
default:
return '?';
Copilot AI, Dec 20, 2025:

The default case returns '?' for unknown health status, but this icon may not be intuitive to users. Consider using a more descriptive icon or text label like 'Unknown' or '⚪' (hollow circle) to better indicate uncertain status.

Suggested change
return '?';
return 'Unknown';

Comment thread frontend/src/components/ModelCard.jsx Outdated
Comment on lines +6 to +24
  getSizeClassName
} from '../utils/modelSelection';

/**
 * Safely extract hostname from URL string
 * @param {string} url - URL to parse (e.g., "http://192.168.1.100:11434")
 * @returns {string} hostname or full URL if parsing fails
 */
function extractHostname(url) {
  try {
    const parsed = new URL(url);
    return parsed.hostname;
  } catch {
    // Fallback: try to extract after // and before :
    const match = url.match(/\/\/([^:\/]+)/);
    return match ? match[1] : url;
  }
}

Copilot AI, Dec 20, 2025:

The same extractHostname function is duplicated here as well. This is a third occurrence of identical code across the codebase. Consolidate all instances into a single utility function in modelSelection.js.

Suggested change: delete the local JSDoc and extractHostname definition shown above and import the shared helper instead:

  getSizeClassName,
  extractHostname,
} from '../utils/modelSelection';

Comment on lines +101 to +105
function getDiversityLevel(score) {
  if (score >= 70) return 'good';
  if (score >= 40) return 'fair';
  return 'poor';
}
Copilot AI, Dec 20, 2025:

The getDiversityLevel function is defined at the bottom of the file and only used internally. Following React conventions, helper functions should typically be defined outside the component (at the top of the file) or explicitly marked as internal. Consider moving it above the component definition for better code organization.

Comment thread backend/api/config.py Outdated
Comment on lines 18 to 20
CACHE_TTL_SECONDS = int(os.getenv("CACHE_TTL_SECONDS", "120")) # Default 2 minutes


Copilot AI, Dec 20, 2025:

The CACHE_TTL_SECONDS environment variable is read with a string default but immediately cast to int. If the environment variable contains an invalid value, this will raise a ValueError. Consider adding error handling or validation for the environment variable value.

Suggested change
CACHE_TTL_SECONDS = int(os.getenv("CACHE_TTL_SECONDS", "120"))  # Default 2 minutes
def _get_cache_ttl(default: int = 120) -> int:
    """
    Retrieve cache TTL from environment, falling back to a safe default if invalid.
    """
    raw_value = os.getenv("CACHE_TTL_SECONDS")
    if raw_value is None:
        return default
    try:
        ttl = int(raw_value)
    except ValueError:
        return default
    # Ensure TTL is positive; otherwise, fall back to default.
    if ttl <= 0:
        return default
    return ttl

CACHE_TTL_SECONDS = _get_cache_ttl()  # Default 2 minutes


return (
<div
className={`model-card ${isSelected ? 'selected' : ''} ${getSizeClassName(model.size.size_tier)}`}
Copilot AI, Dec 20, 2025:

The CSS class 'selected' is applied to model cards, but this class is not defined in the stylesheet. The code should use 'model-card-selected' which is defined at line 36, or the missing 'selected' class should be added to the stylesheet.

Suggested change
className={`model-card ${isSelected ? 'selected' : ''} ${getSizeClassName(model.size.size_tier)}`}
className={`model-card ${isSelected ? 'model-card-selected' : ''} ${getSizeClassName(model.size.size_tier)}`}

Comment on lines +50 to +51
hasConflict: largeModels.length > 1 || (largeModels.length >= 1 && mediumModels.length >= 1)
};
Copilot AI, Dec 20, 2025:

The conflict detection logic uses a hardcoded threshold (hasConflict: largeModels.length > 1 || ...). This business logic is duplicated in multiple places (here and in model_processor.py). Consider extracting this to a shared constant or configuration to ensure consistency across the codebase.
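On the Python side, one way to centralize the rule (illustrative only: `conflict_rules` is not a module in this PR, and the function name is assumed):

```python
# Hypothetical shared module, e.g. backend/config/conflict_rules.py
MAX_LARGE_MODELS_PER_HOST = 1

def has_conflict(large_count: int, medium_count: int) -> bool:
    """Mirror of the frontend rule: more than one large model, or a large
    model combined with a medium one, on the same host is a conflict."""
    return (
        large_count > MAX_LARGE_MODELS_PER_HOST
        or (large_count >= 1 and medium_count >= 1)
    )
```

The frontend could then consume the same thresholds via an API response or a generated constants file, so the two implementations cannot drift apart.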

Comment on lines +62 to +68
export function formatMemory(gb) {
  if (gb < 1) {
    return `${Math.round(gb * 1024)}MB`;
  } else {
    return `${gb}GB`;
  }
}
Copilot AI, Dec 20, 2025:

The formatMemory function converts values less than 1 GB to MB but doesn't handle the edge case where gb is 0 or negative. While unlikely with the current data, adding validation would make the function more robust.

checked={isSelected}
onChange={onSelect}
onClick={(e) => e.stopPropagation()}
tabIndex={-1}
Copilot AI, Dec 20, 2025:

The checkbox inside the model card has tabIndex={-1} and aria-hidden="true", which removes it from keyboard navigation. However, the parent div handles keyboard events. This creates a confusing experience where the visual checkbox can't be directly focused. Consider either making the checkbox focusable or using a custom styled element that better reflects the actual interaction model.

Suggested change: remove tabIndex={-1} so the checkbox stays focusable.

Comment thread backend/api/config.py
Comment on lines +87 to +98
@router.post("/models/analyze-selection", response_model=SelectionAnalysis)
async def analyze_model_selection(request: AnalyzeSelectionRequest):
    """
    Analyze selected models for resource conflicts and provide recommendations
    """
    # Get current server groups
    server_groups = await get_models_by_server()

    # Analyze selection
    analysis = await analyze_selection(request.model_ids, server_groups)

    return analysis
Copilot AI, Dec 20, 2025:

The analyze_selection endpoint doesn't validate that the model_ids in the request actually exist in the current server groups before processing. While the code handles missing models gracefully by filtering them out, this could hide issues where clients send stale or incorrect model IDs. Consider adding validation or returning information about invalid IDs.
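A sketch of the validation the reviewer asks for, with simplified dict shapes standing in for the real schema objects (the function name and return shape are assumptions):

```python
def validate_selection(model_ids, server_groups):
    """Split requested IDs into known and unknown against the current groups.

    The endpoint could reject the request, or include the `invalid` list in
    the response, instead of silently filtering stale IDs out.
    """
    known = {m["id"] for g in server_groups for m in g["models"]}
    valid = [mid for mid in model_ids if mid in known]
    invalid = [mid for mid in model_ids if mid not in known]
    return valid, invalid
```

Surfacing the `invalid` list would let clients notice when their cached model catalog has gone stale.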

  - Consolidate duplicate extractHostname() function into shared modelSelection.js utility
  - Add error handling for CACHE_TTL_SECONDS parsing with validation and logging
  - Update CLAUDE.md with Docker build validation requirements
  - Addresses Copilot review feedback: DRY principle and crash prevention
@bjeans bjeans merged commit 9d114da into main Dec 20, 2025
2 checks passed
@bjeans bjeans deleted the claude/organize-model-list-ZwkBE branch December 20, 2025 21:14
