# Automating Vision Model Selection Using LLMs & MCP

## 🎯 **Learning Objectives**

This tutorial covers the complete MCP Vision Pipeline system, built on a Registry of Registries (RoR) architecture:

1. **Registry of Registries Concept** - Central discovery hub for AI service registries
2. **LLM-Driven Registry Selection** - How LLMs autonomously choose the best registry via MCP tools
3. **Dynamic Model Discovery** - LLMs discover models through MCP tool exploration
4. **Pipeline Composer Usage** - Orchestrating optimal execution order
5. **MCP Server Registration** - How tools are dynamically added and servers registered

---

## 🏗️ **Architecture Overview**

The system implements **MCP-based discovery**, where LLMs use MCP tools to autonomously navigate the architecture:

![MCP Vision Pipeline Architecture](mcp_tutorial_1.png)


**Key Principle:** Everything is MCP-based! LLMs discover registries, models, and execution order through MCP tool calls.

**Critical Rule:** There's no hardcoded "selection logic" - LLMs make decisions by exploring MCP tools autonomously.

---

In [7]:
# Required imports for MCP Vision Pipeline
import asyncio
import json
import subprocess
import time
from pathlib import Path

# Display utilities
from IPython.display import display, Markdown, Image
import warnings
warnings.filterwarnings('ignore')

print("🚀 MCP Vision Pipeline Tutorial Setup Complete!")
print("📅 Tutorial Date: September 3, 2025")
print("🔧 Architecture: Registry of Registries (RoR)")

🚀 MCP Vision Pipeline Tutorial Setup Complete!
📅 Tutorial Date: September 3, 2025
🔧 Architecture: Registry of Registries (RoR)


## 📚 **Section 1: Registry of Registries (RoR) Concept**

The **Registry of Registries** is a hierarchical discovery system that allows AI agents to:

### 🔍 **Multi-Tier Discovery**
1. **Tier 1:** Discover available registries
2. **Tier 2:** Explore registry capabilities
3. **Tier 3:** Find specific AI models
4. **Tier 4:** Compose execution pipelines

### 🏢 **Registry Types**
- **`openos`** - OpenOS.AI asset registry with models for general AI and computer vision
- **`aicenter`** - AI Center for general AI model management and inference
- **`roadnet`** - Road network analysis and traffic-specific computer vision models

### 🎯 **Benefits**
- **Scalability:** Easy to add new model registries
- **Flexibility:** Dynamic model discovery
- **Governance:** Centralized orchestration control
- **Optimization:** Intelligent pipeline composition
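
The first two tiers can be sketched as plain Python functions over an in-memory copy of the registry data. The function names mirror the `registryList` and `registryServers` MCP tools used later in this tutorial; the sample data, the snake_case names, and the exact return shapes are assumptions for illustration, not the server's actual implementation.

```python
# Illustrative in-memory registry; a trimmed, made-up sample, not the real data file.
SAMPLE_SERVICES = {
    "openos": {
        "description": "Asset registry for general computer vision",
        "capabilities": ["vehicle_detection", "ocr"],
        "registry_url": "https://example.invalid/openos-assets/mcp",
    },
    "roadnet": {
        "description": "Road and traffic analysis models",
        "capabilities": ["traffic_monitoring", "pedestrian_detection"],
        "registry_url": "https://example.invalid/roadnet-assets/mcp",
    },
}


def registry_list(services: dict) -> dict:
    """Tier 1: summarize every registry (id, description, capabilities)."""
    return {
        "registries": [
            {
                "id": registry_id,
                "description": details.get("description", ""),
                "capabilities": details.get("capabilities", []),
            }
            for registry_id, details in services.items()
        ]
    }


def registry_servers(services: dict, registry_id: str) -> dict:
    """Tier 2: resolve one registry id to its asset-registry MCP URL."""
    if registry_id not in services:
        raise KeyError(f"Unknown registry: {registry_id}")
    return {
        "registryId": registry_id,
        "registry_url": services[registry_id]["registry_url"],
    }


listing = registry_list(SAMPLE_SERVICES)
print([r["id"] for r in listing["registries"]])
print(registry_servers(SAMPLE_SERVICES, "openos")["registry_url"])
```

In the real system the LLM reaches these tiers through MCP tool calls rather than direct function calls; the sketch only shows the shape of the data it reasons over.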

In [8]:
# Demo: Registry Discovery Process

# Actual Registry of Registries Implementation
# From: servers/mcp1_registry_of_registries.py

import json
import os
from pathlib import Path

# Load actual registry data
registry_file = "data/registry_of_registries.json"
with open(registry_file, 'r') as f:
    registry_data = json.load(f)

print("🔧 Registry of Registries - Actual Implementation")
print("=" * 50)

# Show the actual registryList function implementation
print("📋 Available Registries from registryList():")
services = registry_data.get("services", {})
registry_info = []

for registry_id, details in services.items():
    registry_entry = {
        "id": registry_id,
        "name": registry_id.capitalize(),
        "description": details.get("description", ""),
        "capabilities": details.get("capabilities", []),
        "domains": details.get("domains", [])
    }
    registry_info.append(registry_entry)
    print(f"\n📦 {registry_id}:")
    print(f"   📝 Description: {details.get('description', 'N/A')}")
    print(f"   🎯 Capabilities: {', '.join(details.get('capabilities', []))}")
    print(f"   🏗️ Domains: {', '.join(details.get('domains', []))}")
    print(f"   🌐 Registry URL: {details.get('registry_url', 'N/A')}")

print(f"\n✅ Total registries discovered: {len(registry_info)}")
print("\n🚀 This is the actual data structure returned by the MCP server!")

🔧 Registry of Registries - Actual Implementation
📋 Available Registries from registryList():

📦 openos:
   📝 Description: OpenOS.AI Asset Registry and models for general AI and computer vision
   🎯 Capabilities: computer_vision, object_detection, vehicle_detection, license_plate_detection, ocr, vehicle_classification, color_detection, tracking
   🏗️ Domains: automotive, surveillance, industrial
   🌐 Registry URL: https://765e087eb5f9.ngrok-free.app/openos-assets/mcp

📦 roadnet:
   📝 Description: Road network analysis and traffic computer vision assets and models
   🎯 Capabilities: road_analysis, traffic_monitoring, license_plate_recognition, weather_classification, pedestrian_detection
   🏗️ Domains: transportation, traffic_management, road_safety
   🌐 Registry URL: https://765e087eb5f9.ngrok-free.app/roadnet-assets/mcp

📦 aicenter:
   📝 Description: AI Center for general AI model management and inference
   🎯 Capabilities: ai_model_management, inference, training, model_deployment
   

In [None]:
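# Prompt sent to the LLM. IMG is not defined in this notebook; it stands for the
# image path (for example, IMAGE_PATH from the Configuration cell further down).
PROMPT = (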
    f"Here is an image file://{IMG}. "
    "Using the OpenOS registry extract vehicle attributes from this image. "
    
    "**FOLLOW THESE 3 PHASES IN ORDER:**\n"
    "\n"
    "**PHASE 1 - DISCOVERY (NO MODEL CALLS)**\n"
    "1. Discover available registries using registryList\n"
    "2. Use registryServers to get the asset registry MCP server URLs\n"
    "3. Connect to asset registry, get metadata and list available models\n"
    "4. Identify vehicle-related analysis capabilities and extract model_url\n"
    "❌ FORBIDDEN: Do not call any analysis tools before getting pipeline\n"
    "❌ FORBIDDEN: Do not register model servers yet\n"
    "\n"
    "**PHASE 2 - PIPELINE PLANNING (MANDATORY)**\n"
    "5. Connect to pipeline-composer service from registries\n"
    "6. Send relevant model names to pipeline-composer for vehicle analysis\n"
    "7. Get the execution plan with proper order\n"
    "❌ FORBIDDEN: Do not register model servers until Phase 3\n"
    "❌ FORBIDDEN: Do not call any analysis tools until Phase 3\n"
    "\n"
    "**PHASE 3 - EXECUTION (FOLLOW PIPELINE ORDER)**\n"
    "8. Register the model server from Phase 1\n"
    "9. Execute analysis tools in EXACT order from pipeline-composer\n"
    "10. Use ONLY the tools specified in the pipeline response\n"
    "\n"
    "**CRITICAL RULES:**\n"
    "- Complete each phase before moving to next\n"
    "- Never call analysis tools before getting pipeline\n"
    "- Never register model servers before Phase 3\n"
    "- Always follow pipeline execution order exactly\n"
    "\n"
    "Start with Phase 1 and continue with Phase 2 and Phase 3"
)

## 🤖 **Section 2: LLM-Driven Registry Selection**

### 🧠 **How LLMs Select Registries Autonomously**

**Critical Point:** There is **NO hardcoded registry selection logic**! The LLM makes autonomous decisions by:

1. **Calling `registryList()` MCP tool** - Gets available registries
2. **Analyzing tool descriptions and capabilities** - From MCP responses
3. **Making intelligent decisions** - Based on task requirements
4. **Calling `registryServers(registryId)`** - For the chosen registry

### 🔧 **MCP Tool-Based Selection Process**

The LLM discovers registries through **MCP tool calls only**:

```python
# Illustrative pseudocode: `llm.call_tool` stands in for the MCP tool calls the
# LLM issues on its own; it is not a real client API.

# Step 1: the LLM calls the MCP tool registryList()
registries = llm.call_tool("registryList")

# Step 2: the LLM analyzes the response and chooses based on:
# - capabilities: ["computer_vision", "vehicle_detection", "ocr"]
# - domains: ["automotive", "surveillance"]
# - description: "OpenOS.AI Asset Registry for AI and computer vision"

# Step 3: the LLM calls registryServers("openos")
servers = llm.call_tool("registryServers", registryId="openos")
```

### 💡 **Why This Architecture Works**

- **Autonomous Decision Making**: LLM reads MCP tool descriptions and makes choices
- **No Hardcoded Logic**: Selection logic resides in LLM reasoning, not code
- **Flexible & Scalable**: New registries automatically discoverable via MCP
- **Natural Language Processing**: LLM understands registry purposes from descriptions
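
One way to give an LLM access to the Registry of Registries is a remote MCP tool entry in the OpenAI Responses API. The `ResponseMcpCall...` events in the execution log later in this tutorial suggest the client does something similar, but that is an assumption, as are the helper name `build_mcp_tool` and the example URL. A minimal sketch:

```python
def build_mcp_tool(server_label: str, server_url: str) -> dict:
    """Build a remote-MCP tool entry for the Responses API `tools` list."""
    if not server_url.startswith(("http://", "https://")):
        raise ValueError(f"server_url must be an HTTP(S) URL, got: {server_url!r}")
    return {
        "type": "mcp",
        "server_label": server_label,
        "server_url": server_url,
        # Skip per-call approval prompts so the model can explore tools unattended.
        "require_approval": "never",
    }


tools = [build_mcp_tool("ror", "https://example.invalid/ror/mcp")]
print(tools[0]["server_label"])

# With the `openai` package installed and OPENAI_API_KEY set, the list would be
# passed to the model roughly like this (not executed here):
#
#   from openai import OpenAI
#   response = OpenAI().responses.create(
#       model="gpt-4.1",
#       tools=tools,
#       input="List the available registries.",
#   )
```

Nothing in the tool entry names a registry to prefer; the model sees only the tools the server advertises and decides from their descriptions.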

## 🔍 **Section 3: Model Discovery Process**

### 📊 **Model Metadata Structure**

Once a registry is selected, the LLM must:

1. **Get Registry Metadata** - Understand registry structure
2. **List Available Models** - Discover all models
3. **Filter Relevant Models** - Remove non-vehicle models
4. **Analyze Model Capabilities** - Understand each model's purpose

### 🎯 **Vehicle-Relevant Models**

| Model Name | Purpose | Input | Output |
|------------|---------|--------|--------|
| `detect_colour` | Vehicle body color detection | Image | Color name |
| `detect_plate_colour` | License plate color | Image | Plate color |
| `detect_plate` | License plate location | Image | Bounding box |
| `read_plate` | OCR for license plates | Image | Plate text |
| `detect_make_model` | Vehicle identification | Image | Make & model |

### 🚫 **Non-Vehicle Models (Filtered Out by LLM)**
- `classify_weather` - Weather classification
- `detect_pedestrians` - Pedestrian detection  
- `gender_detector` - Gender classification

### 🧠 **LLM's Role in Model Selection**

**Critical Step:** The LLM must analyze ALL discovered models and filter them before sending the list to the pipeline composer:

1. **Analyze Model Descriptions** - Understand what each model does
2. **Match to Task Requirements** - Identify vehicle-relevant capabilities
3. **Filter Out Irrelevant Models** - Remove weather, pedestrian, gender detection
4. **Send Filtered List** - Only vehicle models go to pipeline composer

**Why This Matters:** The pipeline composer receives a curated list of relevant models, enabling better optimization and preventing unnecessary model loading.
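
The filtering itself happens in the LLM's reasoning, not in code. For testing, though, it can help to compare the model list that was sent to the pipeline composer against the expectations in the tables above. The helper below is a hypothetical sanity check (the name `check_curated_models` and the hard-coded sets are assumptions for this tutorial's vehicle task), not part of the pipeline:

```python
VEHICLE_MODELS = {
    "detect_colour", "detect_plate_colour", "detect_plate",
    "read_plate", "detect_make_model",
}
NON_VEHICLE_MODELS = {"classify_weather", "detect_pedestrians", "gender_detector"}


def check_curated_models(sent_to_composer: list[str]) -> dict:
    """Report models that should not have been sent and models that are missing."""
    sent = set(sent_to_composer)
    return {
        "unexpected": sorted(sent & NON_VEHICLE_MODELS),
        "missing": sorted(VEHICLE_MODELS - sent),
    }


report = check_curated_models(["detect_plate", "read_plate", "classify_weather"])
print(report)
```

An empty `unexpected` list and an empty `missing` list would indicate that the LLM's curated list matches the tables for this task.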

---

# 🚀 **Hands-On Demo: Execute MCP Vision Pipeline**

This interactive section demonstrates how to execute the complete MCP Vision Pipeline from within this Jupyter notebook and analyze the results.

## 📋 **Demo Overview**

1. **Execute the Pipeline** using the simplified Jupyter client
2. **Capture the Final Response** in the correct format  
3. **Save Response to Text File** for parsing
4. **Parse with Existing Parser** to get the call sequence summary
5. **Analyze the 3-Phase Workflow** and compliance


In [9]:
# Import the MCP Vision Pipeline client
import sys
import os
import json
from pathlib import Path

# Add the current directory to Python path
current_dir = Path.cwd()
sys.path.append(str(current_dir))

# Import our simplified client
try:
    from jupyter_mcp_client import execute_mcp_vision_pipeline, save_response_for_parsing
    print("✅ MCP Vision Pipeline Client imported successfully")
except ImportError as e:
    print(f"❌ Failed to import MCP client: {e}")
    print("   Make sure jupyter_mcp_client.py is in the current directory")

✅ MCP Vision Pipeline Client imported successfully


## ⚙️ **Configuration**

Set up the parameters for your MCP Vision Pipeline execution. Update these values based on your environment:

- **Image path**: Path to the vehicle image you want to analyze
- **Registry URL**: URL of your Registry of Registries server
- **Model**: OpenAI model to use for the analysis


In [10]:
# Configuration - Update these values for your setup
IMAGE_PATH = "image1.jpg"  # Update with your image path
REGISTRY_URL = "https://765e087eb5f9.ngrok-free.app/ror/mcp"  # Update with your registry URL
MODEL = "gpt-4.1"  # or "gpt-4o-mini", "gpt-4o", etc.

print(f"📸 Image: {IMAGE_PATH}")
print(f"🔗 Registry: {REGISTRY_URL}")
print(f"🤖 Model: {MODEL}")

# Verify the image file exists (the check is skipped for http(s) URLs)
if not IMAGE_PATH.startswith(("http://", "https://")) and not os.path.exists(IMAGE_PATH):
    print(f"⚠️  Warning: Image file not found at {IMAGE_PATH}")
    print("   Make sure the path is correct or update IMAGE_PATH variable")
else:
    print("✅ Configuration looks good!")

📸 Image: image1.jpg
🔗 Registry: https://765e087eb5f9.ngrok-free.app/ror/mcp
🤖 Model: gpt-4.1
✅ Configuration looks good!


## 🚀 **Execute MCP Vision Pipeline**

Run the complete 3-phase workflow:

### **Phase 1: Discovery**
- Find registries using `registryList`
- Get asset servers using `registryServers` 
- Connect to asset registry and get metadata
- List available models and extract `model_url`

### **Phase 2: Planning** 
- Connect to pipeline-composer service
- Send vehicle-related models for analysis planning
- Get optimized execution order

### **Phase 3: Execution**
- Register model servers 
- Execute analysis tools in pipeline order
- Extract comprehensive vehicle attributes
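
The execution log in the next cell prints "Discovery Round" headers and stops with "No new endpoints discovered", which suggests the client calls the model again whenever a response reveals MCP servers it has not registered yet. The loop below is a guess at that control flow, not the code in `jupyter_mcp_client.py`; `run_round` is a hypothetical stand-in for one LLM call and is stubbed so the example runs offline.

```python
from typing import Callable


def run_discovery_rounds(
    start_url: str,
    run_round: Callable[[list[str]], list[str]],
    max_rounds: int = 5,
) -> list[str]:
    """Call the model repeatedly, adding any newly discovered MCP endpoints.

    `run_round` receives the registered server URLs and returns the URLs
    mentioned in the model's response. The loop ends when a round reveals
    nothing new or when `max_rounds` is reached.
    """
    registered = [start_url]
    for _ in range(max_rounds):
        discovered = run_round(list(registered))
        # dict.fromkeys removes duplicates while keeping first-seen order.
        new_urls = [u for u in dict.fromkeys(discovered) if u not in registered]
        if not new_urls:
            break
        registered.extend(new_urls)
    return registered


def fake_round(active: list[str]) -> list[str]:
    """Stub: the registry reveals an asset server, which reveals a model server."""
    reveals = {
        "https://example.invalid/ror/mcp": ["https://example.invalid/assets/mcp"],
        "https://example.invalid/assets/mcp": ["https://example.invalid/models/mcp"],
    }
    return [url for server in active for url in reveals.get(server, [])]


servers = run_discovery_rounds("https://example.invalid/ror/mcp", fake_round)
print(len(servers))
```

The `max_rounds` cap is a defensive choice in this sketch so that a model that keeps inventing endpoints cannot loop forever.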


In [11]:
# Execute the MCP Vision Pipeline
print("🚀 Starting MCP Vision Pipeline execution...")
print("=" * 60)

result = execute_mcp_vision_pipeline(
    image_path=IMAGE_PATH,
    registry_url=REGISTRY_URL,
    model=MODEL,
    verbose=True  # Set to False to reduce output
)

print("\n" + "=" * 60)
print("🏁 Execution completed!")
print(f"Status: {result['status']}")
print(f"Total rounds: {result.get('total_rounds', 0)}")
print(f"Total servers registered: {result.get('total_servers', 0)}")

if result['status'] != 'success':
    print(f"❌ Error: {result.get('error', 'Unknown error')}")
    if 'traceback' in result:
        print("Traceback:")
        print(result['traceback'])

🚀 Starting MCP Vision Pipeline execution...
🚀 Starting MCP Vision Pipeline with gpt-4.1
📋 Registry URL: https://765e087eb5f9.ngrok-free.app/ror/mcp
🖼️ Image: image1.jpg

--- Discovery Round 1 ---
🔌 Active tools: 1
Calling ResponseMcpCallInProgressEvent(item_id='mcp_68b92c864938819396a281d5f06b92aa011022aa1f6b341c', output_index=1, sequence_number=7, type='response.mcp_call.in_progress')
Completed ResponseMcpCallCompletedEvent(sequence_number=10, type='response.mcp_call.completed', output_index=1, item_id='mcp_68b92c864938819396a281d5f06b92aa011022aa1f6b341c')
Calling ResponseMcpCallInProgressEvent(item_id='mcp_68b92c8d6e148193ad72b0dde7465300011022aa1f6b341c', output_index=2, sequence_number=13, type='response.mcp_call.in_progress')
Completed ResponseMcpCallCompletedEvent(sequence_number=16, type='response.mcp_call.completed', output_index=2, item_id='mcp_68b92c8d6e148193ad72b0dde7465300011022aa1f6b341c')
Calling ResponseMcpCallInProgressEvent(item_id='mcp_68b92c99a2f08193a17d4177b8a64

Completed ResponseMcpCallCompletedEvent(sequence_number=580, type='response.mcp_call.completed', output_index=15, item_id='mcp_68b92d4f8ccc8196be50b5d784afc2000dc98caa950ac76f')
Calling ResponseMcpCallInProgressEvent(item_id='mcp_68b92d568ac081968c05ec7d751f296d0dc98caa950ac76f', output_index=16, sequence_number=583, type='response.mcp_call.in_progress')
Completed ResponseMcpCallCompletedEvent(sequence_number=586, type='response.mcp_call.completed', output_index=16, item_id='mcp_68b92d568ac081968c05ec7d751f296d0dc98caa950ac76f')
Calling ResponseMcpCallInProgressEvent(item_id='mcp_68b92d5d80c88196a78e419ea3d544d40dc98caa950ac76f', output_index=17, sequence_number=589, type='response.mcp_call.in_progress')
Completed ResponseMcpCallCompletedEvent(sequence_number=592, type='response.mcp_call.completed', output_index=17, item_id='mcp_68b92d5d80c88196a78e419ea3d544d40dc98caa950ac76f')
  🏁 Response completed
🏁 No new endpoints discovered, execution complete.

🏁 Execution completed!
Status: su

In [5]:
print(result)

{'status': 'success', 'final_response': {'type': 'response.completed', 'response': {'id': 'resp_68b91c32e2788197b838c17cf71cc74400c678d824f619da', 'created_at': 1756961843.0, 'error': None, 'incomplete_details': None, 'instructions': None, 'metadata': {}, 'model': 'gpt-4.1-2025-04-14', 'object': 'response', 'output': [{'id': 'mcpl_68b91c3309c081979d91c041a62e3af900c678d824f619da', 'server_label': 'ror', 'tools': [{'input_schema': {'properties': {}, 'type': 'object'}, 'name': 'registryList', 'annotations': {'read_only': False}, 'description': '\n    Return the list of all registries that are currently available.\n    This is the entry point for service discovery.\n    '}, {'input_schema': {'properties': {'registryId': {'title': 'Registryid', 'type': 'string'}}, 'required': ['registryId'], 'type': 'object'}, 'name': 'registryServers', 'annotations': {'read_only': False}, 'description': '\n    Given a registryId, return its asset registry MCP server.\n    The client can register this MCP 

## 💾 **Save Response for Parsing**

Save the final response to a text file so you can parse it with the existing `json_parser_mcp.py` script to get the detailed call sequence.
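
For a quick look without the parser script, the call sequence can also be read straight from the response JSON. This sketch assumes the Responses API item layout, in which each tool invocation appears in `response.output` as an item with `type == "mcp_call"` plus `name` and `server_label` fields. The sample data is made up, and `json_parser_mcp.py` may work differently.

```python
def extract_mcp_calls(final_response: dict) -> list[tuple[str, str]]:
    """Return (server_label, tool_name) pairs in the order the calls were made."""
    output_items = final_response.get("response", {}).get("output", [])
    return [
        (item.get("server_label", "?"), item.get("name", "?"))
        for item in output_items
        if item.get("type") == "mcp_call"
    ]


# Made-up sample shaped like a saved `response.completed` event.
sample = {
    "type": "response.completed",
    "response": {
        "output": [
            {"type": "mcp_list_tools", "server_label": "ror", "tools": []},
            {"type": "mcp_call", "server_label": "ror", "name": "registryList"},
            {"type": "mcp_call", "server_label": "ror", "name": "registryServers"},
            {"type": "message", "content": []},
        ]
    },
}

for number, (server, tool) in enumerate(extract_mcp_calls(sample), start=1):
    print(f"Call #{number}: {tool} on server '{server}'")
```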


In [12]:
# Save the final response for parsing
if result['status'] == 'success':
    # Save to final_response.txt
    success = save_response_for_parsing(result, "final_response.txt")
    
    if success:
        print("\n📄 Response saved successfully!")
        print("   File: final_response.txt")
        print("   Size:", os.path.getsize("final_response.txt"), "bytes")
        
        # Show a preview of the saved data
        print("\n📋 Preview of saved response:")
        with open("final_response.txt", 'r') as f:
            data = json.load(f)
            print(f"   Type: {data.get('type', 'unknown')}")
            if 'response' in data:
                response = data['response']
                if isinstance(response, dict):
                    print(f"   Response ID: {response.get('id', 'unknown')}")
                    print(f"   Model: {response.get('model', 'unknown')}")
                    if 'output' in response:
                        print(f"   Output items: {len(response['output'])}")
    else:
        print("❌ Failed to save response")
else:
    print("❌ Cannot save response - execution was not successful")

✅ Final response saved to final_response.txt
   You can now parse it with: python mcp_response_final_parser.py

📄 Response saved successfully!
   File: final_response.txt
   Size: 23944 bytes

📋 Preview of saved response:
   Type: response.completed
   Response ID: resp_68b92cfecccc81969e4713b687f7fa7d0dc98caa950ac76f
   Model: gpt-4.1-2025-04-14
   Output items: 19


## 🔍 **Parse the Response**

Now use the existing parser to analyze the call sequence and extract the vehicle attributes. This will show you the exact sequence of MCP calls like:

```
--- 📞 Call #1: registryList on server 'ror' ---
--- 📞 Call #2: registryServers on server 'ror' ---  
--- 📞 Call #3: get_registry_metadata on server 'openos-assets' ---
...
--- 📞 Call #12: detect_colour on server 'openos-assets-models' ---
```
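
If the call sequence is needed programmatically, header lines in that format can be picked out with a regular expression. This is a small sketch that assumes the header layout shown above stays stable; `parse_call_headers` is a hypothetical helper, not part of the parser script.

```python
import re

# Matches e.g. "--- 📞 Call #3: get_registry_metadata on server 'openos-assets' ---"
CALL_HEADER = re.compile(r"Call #(\d+): (\S+) on server '([^']+)'")


def parse_call_headers(parser_output: str) -> list[tuple[int, str, str]]:
    """Extract (call_number, tool_name, server_label) from the parser's text output."""
    return [
        (int(number), tool, server)
        for number, tool, server in CALL_HEADER.findall(parser_output)
    ]


text = """
--- 📞 Call #1: registryList on server 'ror' ---
  ▶️  Arguments: {}
--- 📞 Call #2: registryServers on server 'ror' ---
"""
print(parse_call_headers(text))
```

In the notebook, `result_parse.stdout` from the parsing cell would be the natural input.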


In [13]:
# Parse the response using the existing parser
if os.path.exists("final_response.txt"):
    print("🔍 Parsing the response with json_parser_mcp.py...")
    
    # Run the parser
    import subprocess
    try:
        result_parse = subprocess.run([
            "python3", "json_parser_mcp.py", "final_response.txt"
        ], capture_output=True, text=True, cwd=current_dir)
        
        if result_parse.returncode == 0:
            print("✅ Parsing completed successfully!")
            print("\n📊 Parsed Results:")
            print("-" * 50)
            print(result_parse.stdout)
        else:
            # A missing json_parser_mcp.py also lands here (non-zero exit code)
            print("❌ Parsing failed:")
            print(result_parse.stderr)
            
    except FileNotFoundError:
        # Raised when the python3 executable or the working directory is missing
        print("❌ python3 or the working directory was not found")
        print("   Make sure you're running this notebook from the mcp_vision_pipeline directory")
    except Exception as e:
        print(f"❌ Error running parser: {str(e)}")
else:
    print("❌ final_response.txt not found - make sure the previous step succeeded")

🔍 Parsing the response with mcp_response_final_parser.py...
✅ Parsing completed successfully!

📊 Parsed Results:
--------------------------------------------------
                🤖 AI Assistant Execution Flow 🤖

--- 📞 Call #1: registryList on server 'ror' ---
  ▶️  Arguments: {}
  ◀️  Output: {'registries': [{'id': 'openos', 'name': 'Openos', 'description': 'OpenOS.AI Asset Registry and models for general AI and computer vision', 'capabilities': ['computer_vision', 'object_detection', 'vehicle_detection', 'license_plate_detection', 'ocr', 'vehicle_classification', 'color_detection', 'tracking'], 'domains': ['automotive', 'surveillance', 'industrial']}, {'id': 'roadnet', 'name': 'Roadnet', 'description': 'Road network analysis and traffic computer vision assets and models', 'capabilities': ['road_analysis', 'traffic_monitoring', 'license_plate_recognition', 'weather_classification', 'pedestrian_detection'], 'domains': ['transportation', 'traffic_management', 'road_safety']}, {'id': 'ai

## 📊 **Analysis: 3-Phase Workflow Validation**

Based on the parsed output above, let's analyze the execution to verify proper 3-phase compliance:

### ✅ **Expected Call Sequence Pattern**

**Phase 1: Discovery (Calls 1-4)**
1. `registryList` - Discover available registries
2. `registryServers` - Get OpenOS asset registry URL  
3. `get_registry_metadata` - Connect and get metadata (with model_url)
4. `list_models` - Discover all available vision models

**Phase 2: Planning (Calls 5-6)**  
5. `registryServers` - Get pipeline-composer service URL
6. `compose_pipeline` - **CRITICAL** - Get execution order

**Phase 3: Execution (Calls 7-12)**
7. `register_mcp_tool` - Register model server
8. `detect_plate_colour` - Execute step 1 of pipeline
9. `detect_plate` - Execute step 2 of pipeline  
10. `read_plate` - Execute step 3 of pipeline (uses bbox from step 2)
11. `detect_make_model` - Execute step 4 of pipeline
12. `detect_colour` - Execute step 5 of pipeline
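
The note on call 10 (`read_plate` uses the bounding box from `detect_plate`) shows why order matters. How the pipeline-composer derives its order is not shown in this tutorial; one common way to turn such dependencies into a valid order is a topological sort, sketched here with Python's standard `graphlib` module. The dependency map is an assumption that contains only the one dependency stated above.

```python
from graphlib import TopologicalSorter

# Map each model to the models whose output it needs (assumed, see lead-in).
DEPENDENCIES = {
    "detect_plate_colour": set(),
    "detect_plate": set(),
    "read_plate": {"detect_plate"},  # needs the plate bounding box
    "detect_make_model": set(),
    "detect_colour": set(),
}

order = list(TopologicalSorter(DEPENDENCIES).static_order())
print(order)
```

Any order that places `detect_plate` before `read_plate` satisfies this map; the sequence listed above is one such order.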

### 🎯 **Key Validation Points**

- ✅ **No premature model calls** - No analysis before pipeline composition
- ✅ **Pipeline composer usage** - Mandatory orchestration step
- ✅ **Exact order compliance** - Followed pipeline-composer instructions
- ✅ **Complete vehicle analysis** - All attributes extracted


## 🎉 **Demo Summary**

This hands-on demo walked through a complete MCP Vision Pipeline execution: discovery, pipeline planning, and ordered execution.


### 🎯 **Key Learning Points**

- **MCP-Based Discovery**: Everything discovered through MCP tool exploration
- **LLM Autonomy**: No hardcoded logic - LLM makes all choices
- **Pipeline Orchestration**: Mandatory composition before execution
- **Dynamic Registration**: Servers registered as they're discovered


### 🔄 **Next Steps**


- **Production Scale** - Use a RAG approach for the pipeline-composer; see this [tutorial](https://youtu.be/RX7UYUQ1kKY) and this [Jupyter notebook](https://github.com/opencyber-space/CosmosAI/tree/7d9b003a70c9e50bd0ffa21be3cc3e6f8dca74ea/video_tutorial_series/08_AutoAIExpert_RAG_Based)
- **Models from AIGrid** - If you prefer models from the AIGrid, see this [tutorial](https://youtu.be/G_yKqIbBP5Q) and this [Jupyter notebook](https://github.com/opencyber-space/CosmosAI/tree/7d9b003a70c9e50bd0ffa21be3cc3e6f8dca74ea/video_tutorial_series/02_Part1_onboard_gemma3_llama_cpp)

You now have a complete working example of the MCP Vision Pipeline with a Registry of Registries architecture!


## 🛠️ **Troubleshooting**

### Common Issues and Solutions

**1. Registry connection failed**  
- Verify the `REGISTRY_URL` is correct and accessible
- Check if MCP servers are running
- Test with: `curl {REGISTRY_URL}`

**2. Model errors**
- Verify OpenAI API key is set: `echo $OPENAI_API_KEY`
- Check if the specified model is available in your account

**3. Parser not found**
- Make sure `json_parser_mcp.py` exists in the same directory
- Run the notebook from the `mcp_vision_pipeline` directory

**4. Import errors**
- Ensure `jupyter_mcp_client.py` is in the current directory
- Check Python path and dependencies

### 🔧 **Debug Commands**

```bash
# Check if servers are running
docker ps

# Test registry connectivity  
curl -X GET https://your-registry-url/ror/mcp

# Verify environment
echo $OPENAI_API_KEY
python --version
```
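
The file and environment checks can also be run from a notebook cell. This is a small sketch that assumes the file names used in this tutorial; unlike `echo $OPENAI_API_KEY`, it reports only whether the key is set, so the secret is not written into the notebook output.

```python
import os
from pathlib import Path


def preflight(required_files: list[str], env_vars: list[str]) -> list[str]:
    """Return human-readable problems; secret values are never included."""
    problems = []
    for name in required_files:
        if not Path(name).is_file():
            problems.append(f"missing file: {name}")
    for var in env_vars:
        if not os.environ.get(var):
            problems.append(f"environment variable not set: {var}")
    return problems


issues = preflight(
    required_files=["jupyter_mcp_client.py", "json_parser_mcp.py"],
    env_vars=["OPENAI_API_KEY"],
)
print("\n".join(issues) if issues else "All preflight checks passed")
```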
