# Downloading and Running Hugging Face Models with Ollama

This notebook demonstrates how to download models from Hugging Face and run them with Ollama, including popular models like Gemma, Llama, and others.

## Table of Contents
1. [Prerequisites and Setup](#prerequisites)
2. [Direct Ollama Model Downloads](#direct-downloads)
3. [Hugging Face Model Integration](#hf-integration)
4. [Model Conversion Process](#conversion)
5. [Running and Testing Models](#testing)
6. [Model Management](#management)
7. [Performance Comparison](#comparison)

## 1. Prerequisites and Setup {#prerequisites}

First, let's install the required packages and set up our environment:

In [None]:
# Install required packages
!pip install ollama huggingface_hub transformers torch
!pip install requests tqdm rich

In [None]:
import ollama
import requests
import json
import subprocess
import time
from pathlib import Path
from typing import List, Dict, Optional
from rich.console import Console
from rich.table import Table
from rich.panel import Panel
from rich.progress import Progress, SpinnerColumn, TextColumn, BarColumn, TaskProgressColumn
from rich.markdown import Markdown
from huggingface_hub import HfApi, list_models

console = Console()
client = ollama.Client()

console.print("✅ All packages imported successfully!", style="green")

### Verify Ollama Installation

In [None]:
def check_ollama_status():
    """Check if Ollama is running and accessible"""
    try:
        response = requests.get('http://localhost:11434/api/tags', timeout=5)
        if response.status_code == 200:
            models = response.json()
            console.print("✅ Ollama is running!", style="green")
            console.print(f"📦 Currently installed models: {len(models.get('models', []))}", style="blue")
            return True
        else:
            console.print(f"❌ Ollama responded with status: {response.status_code}", style="red")
            return False
    except requests.exceptions.ConnectionError:
        console.print("❌ Cannot connect to Ollama. Make sure it's running:", style="red")
        console.print("   Run: ollama serve", style="yellow")
        return False
    except Exception as e:
        console.print(f"❌ Error checking Ollama: {e}", style="red")
        return False

ollama_running = check_ollama_status()
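
The `/api/tags` response used above carries more than a model count. The sketch below turns such a payload into a list of names and sizes. It assumes each entry has a `name` (or `model`) field and a `size` in bytes; `summarize_tags` and the sample payload are hypothetical and only illustrate the shape.

```python
from typing import Any, Dict, List, Tuple


def summarize_tags(payload: Dict[str, Any]) -> List[Tuple[str, float]]:
    """Return (name, size in GiB) pairs, largest first, from an /api/tags payload."""
    rows = []
    for entry in payload.get("models", []):
        # Assumption: entries identify the model under 'name' or 'model'
        name = entry.get("name") or entry.get("model") or "unknown"
        size_gib = entry.get("size", 0) / 1024**3
        rows.append((name, round(size_gib, 2)))
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Hand-written sample payload (illustrative values, not real server output)
sample = {
    "models": [
        {"name": "gemma2:2b", "size": 2 * 1024**3},
        {"name": "llama3.1:8b", "size": 5 * 1024**3},
    ]
}

for name, size_gib in summarize_tags(sample):
    print(f"{name}: {size_gib} GiB")
```

With the server running, the same helper could be fed the live response, for example `summarize_tags(requests.get('http://localhost:11434/api/tags', timeout=5).json())`.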

## 2. Direct Ollama Model Downloads {#direct-downloads}

Ollama provides direct access to many popular models. Let's explore what's available and download some models:

In [None]:
class OllamaModelManager:
    """Manager for Ollama model operations"""
    
    def __init__(self):
        self.client = ollama.Client()
        self.console = Console()
    
    def list_available_models(self) -> List[str]:
        """List currently installed models"""
        try:
            models = self.client.list()
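            # Newer ollama-python releases expose the tag as 'model'; older ones use 'name'
            return [model.get('model') or model.get('name') for model in models['models']]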
        except Exception as e:
            self.console.print(f"❌ Error listing models: {e}", style="red")
            return []
    
    def pull_model(self, model_name: str) -> bool:
        """Download a model from Ollama library"""
        try:
            self.console.print(f"📥 Downloading {model_name}...", style="cyan")
            
            # Use subprocess to show real-time progress
            process = subprocess.Popen(
                ['ollama', 'pull', model_name],
                stdout=subprocess.PIPE,
                stderr=subprocess.STDOUT,
                universal_newlines=True,
                bufsize=1
            )
            
            # Stream output in real-time
            for line in process.stdout:
                print(line.strip())
            
            process.wait()
            
            if process.returncode == 0:
                self.console.print(f"✅ Successfully downloaded {model_name}!", style="green")
                return True
            else:
                self.console.print(f"❌ Failed to download {model_name}", style="red")
                return False
                
        except Exception as e:
            self.console.print(f"❌ Error downloading {model_name}: {e}", style="red")
            return False
    
    def get_model_info(self, model_name: str) -> Optional[Dict]:
        """Get detailed information about a model"""
        try:
            info = self.client.show(model_name)
            return info
        except Exception as e:
            self.console.print(f"❌ Error getting info for {model_name}: {e}", style="red")
            return None
    
    def remove_model(self, model_name: str) -> bool:
        """Remove a model"""
        try:
            subprocess.run(['ollama', 'rm', model_name], check=True)
            self.console.print(f"🗑️ Removed {model_name}", style="yellow")
            return True
        except (subprocess.CalledProcessError, FileNotFoundError):
            # FileNotFoundError is raised when the ollama CLI is not on PATH
            self.console.print(f"❌ Failed to remove {model_name}", style="red")
            return False

# Initialize model manager
model_manager = OllamaModelManager()

# Show currently installed models
current_models = model_manager.list_available_models()
console.print(f"📦 Currently installed models: {len(current_models)}", style="blue")
for model in current_models:
    console.print(f"  • {model}", style="dim")
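
As an alternative to shelling out to `ollama pull`, the Python client can stream pull progress itself through `Client.pull(model, stream=True)`. The sketch below only formats one progress event. It assumes events are dict-like with a `status` field and, during layer downloads, `total` and `completed` byte counts; `format_pull_event` is a hypothetical helper name.

```python
from typing import Any


def format_pull_event(event: Any) -> str:
    """Render one pull progress event as a short status line."""
    status = event.get("status", "")
    total = event.get("total")
    completed = event.get("completed")
    # Byte counts are only present while a layer is being downloaded
    if total and completed is not None:
        return f"{status}: {100 * completed / total:.0f}%"
    return status


# Hand-written events that mimic the assumed shape
events = [
    {"status": "pulling manifest"},
    {"status": "pulling layer", "total": 200, "completed": 50},
    {"status": "success"},
]
for event in events:
    print(format_pull_event(event))

# Against a running server (not executed here):
#   import ollama
#   for event in ollama.Client().pull("gemma2:2b", stream=True):
#       print(format_pull_event(event))
```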

### Popular Models Available in Ollama

Here are some popular models you can download directly:

In [None]:
# Popular models available in Ollama
popular_models = {
    "Gemma Models": [
        "gemma2:2b",
        "gemma2:9b", 
        "gemma2:27b",
        "gemma:2b",
        "gemma:7b"
    ],
    "Llama Models": [
        "llama3.2:1b",
        "llama3.2:3b",
        "llama3.1:8b",
        "llama3.1:70b",
        "llama2:7b",
        "llama2:13b"
    ],
    "Code Models": [
        "codellama:7b",
        "codellama:13b",
        "codegemma:2b",
        "codegemma:7b"
    ],
    "Other Popular Models": [
        "mistral:7b",
        "mixtral:8x7b",
        "phi3:3.8b",
        "qwen2:7b",
        "deepseek-coder:6.7b"
    ]
}

# Display available models in a nice table
table = Table(title="🤖 Popular Models Available in Ollama")
table.add_column("Category", style="cyan", no_wrap=True)
table.add_column("Models", style="green")
table.add_column("Size Info", style="yellow")

size_info = {
    "1b-3b": "~1-2GB (Fast, good for basic tasks)",
    "7b-9b": "~4-5GB (Balanced performance)",
    "13b-27b": "~7-15GB (High quality responses)",
    "70b+": "~40GB+ (Best quality, requires powerful hardware)"
}

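def size_category(model: str) -> str:
    """Map a tag such as 'llama3.1:8b' to a size_info key via the parameter count after ':'."""
    params = 1.0
    # Mixture-of-experts tags like '8x7b' are treated as 8 * 7 = 56B, a rough upper bound
    for factor in model.split(":")[-1].rstrip("b").split("x"):
        params *= float(factor)
    if params < 5:
        return "1b-3b"
    if params < 12:
        return "7b-9b"
    if params < 40:
        return "13b-27b"
    return "70b+"

for category, models in popular_models.items():
    models_text = "\n".join([f"• {model}" for model in models])
    
    # List every size class present in the category, smallest first
    present = {size_category(model) for model in models}
    sizes_text = "\n".join(size_info[key] for key in size_info if key in present)
    
    table.add_row(category, models_text, sizes_text)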

console.print(table)
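
The size ranges in the table can be sanity-checked with simple arithmetic: a quantized weight file takes roughly (parameter count × bits per weight ÷ 8) bytes. The sketch below assumes about 4.5 bits per weight, a guess at a typical 4-bit quantization plus overhead rather than a figure from Ollama. `estimate_size_gb` is a hypothetical helper.

```python
def estimate_size_gb(params_billion: float, bits_per_weight: float = 4.5) -> float:
    """Rough weight-file size in GB: parameters x bits per weight / 8 bits per byte."""
    # params_billion * 1e9 weights * (bits / 8) bytes each, divided by 1e9 bytes per GB
    return params_billion * bits_per_weight / 8


for params in (2, 7, 13, 27, 70):
    print(f"{params}B parameters -> about {estimate_size_gb(params):.1f} GB")
```

Actual downloads vary with the quantization chosen for each tag, so treat the result as an order-of-magnitude check before pulling a large model.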

### Download Gemma Models

Let's download some Gemma models as examples:

In [None]:
# Models to download (starting with smaller ones)
models_to_download = [
    "gemma2:2b",  # Small, fast model
    # "gemma2:9b",  # Uncomment if you have enough RAM/VRAM
]

if ollama_running:
    console.print("🚀 Starting model downloads...", style="bold cyan")
    
    for model in models_to_download:
        # Skip models that are already installed
        if model in current_models:
            console.print(f"✅ {model} already installed!", style="green")
            continue
        
        # pull_model prints its own progress and success/failure messages
        model_manager.pull_model(model)
        
        time.sleep(1)  # Brief pause between downloads
    
    # Update model list
    current_models = model_manager.list_available_models()
    console.print(f"\n📦 Total models now: {len(current_models)}", style="blue")
else:
    console.print("⚠️ Ollama not running. Please start Ollama first.", style="yellow")

## 3. Hugging Face Model Integration {#hf-integration}

Now let's explore how to work with Hugging Face models and potentially convert them for Ollama:

In [None]:
class HuggingFaceModelExplorer:
    """Explorer for Hugging Face models"""
    
    def __init__(self):
        self.api = HfApi()
        self.console = Console()
    
    def search_models(self, query: str, limit: int = 10) -> List[Dict]:
        """Search for models on Hugging Face"""
        try:
            models = list_models(
                search=query,
                limit=limit,
                sort="downloads",
                direction=-1
            )
            
            model_info = []
            for model in models:
                info = {
                    'id': model.id,
                    'downloads': getattr(model, 'downloads', 0),
                    'likes': getattr(model, 'likes', 0),
                    'tags': getattr(model, 'tags', []),
                    'pipeline_tag': getattr(model, 'pipeline_tag', 'unknown')
                }
                model_info.append(info)
            
            return model_info
            
        except Exception as e:
            self.console.print(f"❌ Error searching models: {e}", style="red")
            return []
    
    def display_models(self, models: List[Dict], title: str = "Models"):
        """Display models in a nice table"""
        if not models:
            self.console.print("No models found.", style="yellow")
            return
        
        table = Table(title=f"🤗 {title}")
        table.add_column("Model ID", style="cyan")
        table.add_column("Downloads", style="green")
        table.add_column("Likes", style="yellow")
        table.add_column("Type", style="magenta")
        
        for model in models:
            downloads = f"{model['downloads']:,}" if model['downloads'] else "N/A"
            likes = str(model['likes']) if model['likes'] else "0"
            pipeline = model['pipeline_tag'] or "unknown"
            
            table.add_row(
                model['id'],
                downloads,
                likes,
                pipeline
            )
        
        self.console.print(table)

# Initialize HF explorer
hf_explorer = HuggingFaceModelExplorer()

# Search for Gemma models on Hugging Face
console.print("🔍 Searching for Gemma models on Hugging Face...", style="cyan")
gemma_models = hf_explorer.search_models("gemma", limit=8)
hf_explorer.display_models(gemma_models, "Gemma Models on Hugging Face")

In [None]:
# Search for other popular models
console.print("\n🔍 Searching for Llama models...", style="cyan")
llama_models = hf_explorer.search_models("llama", limit=6)
hf_explorer.display_models(llama_models, "Llama Models on Hugging Face")

console.print("\n🔍 Searching for Mistral models...", style="cyan")
mistral_models = hf_explorer.search_models("mistral", limit=6)
hf_explorer.display_models(mistral_models, "Mistral Models on Hugging Face")
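
Most search hits are PyTorch or Safetensors checkpoints, while Ollama imports GGUF weights most directly. The sketch below narrows results shaped like the dictionaries returned by `search_models` to GGUF repositories. It assumes such repositories carry a `gguf` tag on the Hub; `gguf_only` and the sample data are hypothetical.

```python
from typing import Dict, List


def gguf_only(models: List[Dict]) -> List[Dict]:
    """Keep only entries whose tag list contains 'gguf' (case-insensitive)."""
    kept = []
    for model in models:
        tags = {str(tag).lower() for tag in (model.get("tags") or [])}
        if "gguf" in tags:
            kept.append(model)
    return kept


# Hand-written sample in the same shape as the search_models output
sample_results = [
    {"id": "example-org/model-a", "tags": ["transformers", "safetensors"]},
    {"id": "example-org/model-a-GGUF", "tags": ["gguf", "text-generation"]},
    {"id": "example-org/model-b", "tags": None},
]

for model in gguf_only(sample_results):
    print(model["id"])
```

In the notebook this could be chained as `gguf_only(hf_explorer.search_models("gemma gguf", limit=20))`.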

## 4. Model Conversion Process {#conversion}

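While Ollama provides many models directly, you may want to run a Hugging Face model that is not in the Ollama library. Ollama imports weights most directly from GGUF files, so the usual workflow is to download the model, convert it to GGUF if needed, and register it with a Modelfile: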

In [None]:
class ModelConverter:
    """Helper for converting Hugging Face models to Ollama format"""
    
    def __init__(self):
        self.console = Console()
    
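    def create_modelfile(self, model_path: str, model_name: str, 
                        system_prompt: Optional[str] = None, 
                        temperature: float = 0.7) -> str:
        """Create a Modelfile for Ollama.

        model_path is written to the FROM line, so it must be something Ollama
        can resolve, such as a local GGUF file or an installed model name.
        """
        
        modelfile_content = f"FROM {model_path}\n"
        
        if system_prompt:
            # Triple quotes keep prompts that contain double quotes intact
            modelfile_content += f'SYSTEM """{system_prompt}"""\n'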
        
        modelfile_content += f"PARAMETER temperature {temperature}\n"
        modelfile_content += "PARAMETER num_ctx 4096\n"
        
        # Save to file
        modelfile_path = f"Modelfile.{model_name}"
        with open(modelfile_path, 'w') as f:
            f.write(modelfile_content)
        
        self.console.print(f"📄 Created Modelfile: {modelfile_path}", style="green")
        self.console.print(f"Content:\n{modelfile_content}", style="dim")
        
        return modelfile_path
    
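    def convert_hf_model(self, hf_model_id: str, ollama_model_name: str, 
                        system_prompt: Optional[str] = None) -> bool:
        """Register a model with Ollama through a generated Modelfile.

        hf_model_id is passed straight to FROM, so point it at a local GGUF file
        (for example one produced with llama.cpp). This method does not download
        or convert weights itself.
        """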
        
        try:
            self.console.print(f"🔄 Converting {hf_model_id} to Ollama format...", style="cyan")
            
            # Create Modelfile
            modelfile_path = self.create_modelfile(
                hf_model_id, 
                ollama_model_name,
                system_prompt
            )
            
            # Create the model in Ollama
            self.console.print(f"🏗️ Creating Ollama model: {ollama_model_name}", style="cyan")
            
            process = subprocess.Popen(
                ['ollama', 'create', ollama_model_name, '-f', modelfile_path],
                stdout=subprocess.PIPE,
                stderr=subprocess.STDOUT,
                universal_newlines=True
            )
            
            # Stream output
            for line in process.stdout:
                print(line.strip())
            
            process.wait()
            
            if process.returncode == 0:
                self.console.print(f"✅ Successfully created {ollama_model_name}!", style="green")
                return True
            else:
                self.console.print(f"❌ Failed to create {ollama_model_name}", style="red")
                return False
                
        except Exception as e:
            self.console.print(f"❌ Conversion error: {e}", style="red")
            return False
    
    def download_and_convert(self, hf_model_id: str, ollama_model_name: str):
        """Download from HF and convert to Ollama (example workflow)"""
        
        self.console.print(f"📋 Conversion workflow for {hf_model_id}:", style="bold cyan")
        
        steps = [
            "1. Download model from Hugging Face",
            "2. Convert to GGUF format (if needed)",
            "3. Create Ollama Modelfile",
            "4. Import into Ollama",
            "5. Test the model"
        ]
        
        for step in steps:
            self.console.print(f"  {step}", style="dim")
        
        self.console.print("\n⚠️ Note: This is a simplified example.", style="yellow")
        self.console.print("For complex conversions, you might need additional tools like:", style="yellow")
        self.console.print("  • llama.cpp for GGUF conversion", style="dim")
        self.console.print("  • Specific model conversion scripts", style="dim")

# Initialize converter
converter = ModelConverter()

# Example: Show conversion workflow
converter.download_and_convert("google/gemma-2b", "my-gemma-2b")
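
The five workflow steps can be sketched as a list of shell commands. The block only builds the commands and runs nothing. It assumes a local llama.cpp checkout that provides `convert_hf_to_gguf.py`, and that the Modelfile for step 3 already contains a `FROM` line pointing at the generated GGUF file. `conversion_commands`, `llama_cpp_dir` and `work_dir` are hypothetical names.

```python
from pathlib import Path
from typing import List


def conversion_commands(hf_model_id: str, ollama_name: str,
                        llama_cpp_dir: str = "llama.cpp",
                        work_dir: str = "models") -> List[List[str]]:
    """Build (but do not run) the commands for a download, convert, import, test cycle."""
    local_dir = Path(work_dir) / hf_model_id.replace("/", "__")
    gguf_path = Path(work_dir) / f"{ollama_name}.gguf"
    modelfile = Path(work_dir) / f"Modelfile.{ollama_name}"
    return [
        # 1. Download the checkpoint from Hugging Face
        ["huggingface-cli", "download", hf_model_id, "--local-dir", str(local_dir)],
        # 2. Convert to GGUF with the llama.cpp script (assumed checkout location)
        ["python", str(Path(llama_cpp_dir) / "convert_hf_to_gguf.py"), str(local_dir),
         "--outfile", str(gguf_path), "--outtype", "q8_0"],
        # 3-4. Import into Ollama; the Modelfile is assumed to say: FROM <gguf_path>
        ["ollama", "create", ollama_name, "-f", str(modelfile)],
        # 5. Smoke-test the imported model
        ["ollama", "run", ollama_name, "Say hello in one sentence."],
    ]


for command in conversion_commands("google/gemma-2b", "my-gemma-2b"):
    print(" ".join(command))
```

Each command could be passed to `subprocess.run(command, check=True)` once the paths are confirmed. Gated repositories such as `google/gemma-2b` also require an authenticated Hugging Face login before the download step succeeds.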

### Creating Custom Modelfiles

Let's create some example Modelfiles for different use cases:

In [None]:
# Example Modelfiles for different purposes
modelfile_examples = {
    "coding_assistant": {
        "system_prompt": "You are an expert programming assistant. Provide clear, well-commented code examples and explain complex concepts simply.",
        "temperature": 0.3,
        "description": "Focused on coding tasks with lower temperature for consistency"
    },
    "creative_writer": {
        "system_prompt": "You are a creative writing assistant. Write engaging stories with vivid descriptions and compelling characters.",
        "temperature": 0.9,
        "description": "Creative writing with higher temperature for variety"
    },
    "research_assistant": {
        "system_prompt": "You are a research assistant. Provide accurate, well-sourced information and cite your reasoning clearly.",
        "temperature": 0.5,
        "description": "Balanced temperature for factual accuracy"
    }
}

console.print("📝 Example Modelfile Configurations:", style="bold cyan")

for name, config in modelfile_examples.items():
    console.print(f"\n🎯 {name.replace('_', ' ').title()}:", style="green")
    console.print(f"   Description: {config['description']}", style="dim")
    console.print(f"   Temperature: {config['temperature']}", style="blue")
    console.print(f"   System Prompt: {config['system_prompt'][:80]}...", style="yellow")
    
    # Create example Modelfile content
    modelfile_content = f"""FROM gemma2:2b
SYSTEM "{config['system_prompt']}"
PARAMETER temperature {config['temperature']}
PARAMETER num_ctx 4096
PARAMETER top_p 0.9"""
    
    console.print(Panel(
        modelfile_content,
        title=f"Modelfile.{name}",
        border_style="dim"
    ))
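
To move from printed panels to files that `ollama create` can read, the configurations can be rendered by a small helper. This is a minimal sketch: `render_modelfile` is a hypothetical name, and it wraps the system prompt in triple quotes so that prompts containing double quotes stay intact.

```python
from typing import Dict, Optional, Union


def render_modelfile(base: str, system: Optional[str] = None,
                     params: Optional[Dict[str, Union[int, float, str]]] = None) -> str:
    """Return Modelfile text with FROM, an optional SYSTEM block, and PARAMETER lines."""
    lines = [f"FROM {base}"]
    if system:
        if '"""' in system:
            # A triple quote inside the prompt would end the SYSTEM block early
            raise ValueError('system prompt must not contain """')
        lines.append(f'SYSTEM """{system}"""')
    for key, value in (params or {}).items():
        lines.append(f"PARAMETER {key} {value}")
    return "\n".join(lines) + "\n"


text = render_modelfile(
    "gemma2:2b",
    system='Answer with "yes" or "no" only.',
    params={"temperature": 0.3, "num_ctx": 4096},
)
print(text)
```

Writing `text` to a file such as `Modelfile.coding_assistant` and running `ollama create coding-assistant -f Modelfile.coding_assistant` would then register the variant; the model name here is only an example.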

## 5. Running and Testing Models {#testing}

Now let's test the models we've downloaded:

In [None]:
class ModelTester:
    """Test and compare different models"""
    
    def __init__(self):
        self.client = ollama.Client()
        self.console = Console()
    
    def test_model(self, model_name: str, prompt: str, max_tokens: int = 150) -> Dict:
        """Test a single model with a prompt"""
        try:
            start_time = time.time()
            
            response = self.client.chat(
                model=model_name,
                messages=[{'role': 'user', 'content': prompt}],
                options={'num_predict': max_tokens}
            )
            
            end_time = time.time()
            
            return {
                'model': model_name,
                'prompt': prompt,
                'response': response['message']['content'],
                'time_taken': end_time - start_time,
                'success': True
            }
            
        except Exception as e:
            return {
                'model': model_name,
                'prompt': prompt,
                'error': str(e),
                'success': False
            }
    
    def compare_models(self, models: List[str], prompt: str) -> List[Dict]:
        """Compare multiple models with the same prompt"""
        results = []
        
        self.console.print(f"🔬 Testing prompt: '{prompt}'", style="cyan")
        self.console.print(f"📊 Comparing {len(models)} models...\n", style="blue")
        
        for model in models:
            self.console.print(f"Testing {model}...", style="dim")
            result = self.test_model(model, prompt)
            results.append(result)
            
            if result['success']:
                self.console.print(f"✅ {model}: {result['time_taken']:.2f}s", style="green")
            else:
                self.console.print(f"❌ {model}: {result['error']}", style="red")
        
        return results
    
    def display_comparison(self, results: List[Dict]):
        """Display comparison results in a nice format"""
        successful_results = [r for r in results if r['success']]
        
        if not successful_results:
            self.console.print("❌ No successful results to display", style="red")
            return
        
        self.console.print("\n📊 Model Comparison Results:", style="bold cyan")
        
        for i, result in enumerate(successful_results, 1):
            self.console.print(f"\n{i}. 🤖 {result['model']}:", style="bold green")
            self.console.print(f"   ⏱️ Time: {result['time_taken']:.2f} seconds", style="blue")
            self.console.print(f"   📝 Response length: {len(result['response'])} characters", style="yellow")
            
            # Show response in a panel
            self.console.print(Panel(
                result['response'][:300] + ("..." if len(result['response']) > 300 else ""),
                title=f"Response from {result['model']}",
                border_style="green"
            ))

# Initialize tester
tester = ModelTester()

# Get available models for testing
available_models = model_manager.list_available_models()
console.print(f"📦 Available models for testing: {len(available_models)}", style="blue")

if available_models:
    for model in available_models:
        console.print(f"  • {model}", style="dim")
else:
    console.print("⚠️ No models available. Please download some models first.", style="yellow")

In [None]:
# Test models with different types of prompts
test_prompts = [
    "Explain quantum computing in simple terms.",
    "Write a Python function to calculate fibonacci numbers.",
    "What are the benefits of renewable energy?"
]

if available_models and ollama_running:
    # Test with first available model
    test_model = available_models[0]
    
    console.print(f"\n🧪 Testing {test_model} with different prompts:", style="bold cyan")
    
    for i, prompt in enumerate(test_prompts, 1):
        console.print(f"\n{i}. Testing: '{prompt}'", style="cyan")
        
        result = tester.test_model(test_model, prompt, max_tokens=100)
        
        if result['success']:
            console.print(f"✅ Response time: {result['time_taken']:.2f}s", style="green")
            console.print(Panel(
                result['response'],
                title=f"Response from {test_model}",
                border_style="green"
            ))
        else:
            console.print(f"❌ Error: {result['error']}", style="red")
        
        time.sleep(1)  # Brief pause between tests
    
    # If multiple models available, compare them
    if len(available_models) > 1:
        console.print("\n🔬 Comparing multiple models...", style="bold magenta")
        comparison_results = tester.compare_models(
            available_models[:2],  # Compare first two models
            "What is artificial intelligence?"
        )
        tester.display_comparison(comparison_results)
        
else:
    console.print("⚠️ Cannot run tests - no models available or Ollama not running", style="yellow")
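
The tests above measure only total response time. For interactive use, time to first token often matters more, and it can be measured by consuming a streamed response. The sketch below accepts any iterable of chunks shaped like `{'message': {'content': ...}}`, which is the shape assumed for `client.chat(..., stream=True)`; `time_stream` is a hypothetical helper.

```python
import time
from typing import Any, Dict, Iterable


def time_stream(chunks: Iterable[Any]) -> Dict[str, Any]:
    """Consume streamed chat chunks and record time to first token and total time."""
    start = time.perf_counter()
    first_token_at = None
    parts = []
    for chunk in chunks:
        text = chunk["message"]["content"]
        # Record the moment the first non-empty piece of text arrives
        if text and first_token_at is None:
            first_token_at = time.perf_counter() - start
        parts.append(text)
    return {
        "text": "".join(parts),
        "time_to_first_token": first_token_at,
        "total_time": time.perf_counter() - start,
    }


def fake_stream():
    """Stand-in for a streamed response so the sketch runs without a server."""
    for piece in ("Hel", "lo", "!"):
        time.sleep(0.01)
        yield {"message": {"content": piece}}


result = time_stream(fake_stream())
print(result["text"], result["time_to_first_token"], result["total_time"])
```

With a model installed, the fake generator could be replaced by `tester.client.chat(model=test_model, messages=[{'role': 'user', 'content': prompt}], stream=True)`.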

## 6. Model Management {#management}

Let's create tools for managing our models effectively:

In [None]:
class AdvancedModelManager:
    """Advanced model management with detailed information"""
    
    def __init__(self):
        self.client = ollama.Client()
        self.console = Console()
    
    def get_detailed_model_info(self) -> List[Dict]:
        """Get detailed information about all models"""
        try:
            models = self.client.list()
            detailed_info = []
            
            for model in models['models']:
                # Newer ollama-python releases expose the tag as 'model'; older ones use 'name'
                name = model.get('model') or model.get('name')
                size_gb = (model.get('size') or 0) / (1024**3)
                modified = model.get('modified_at') or 'Unknown'
                
                # Try to get additional info
                try:
                    info = self.client.show(name)
                    details = info.get('details') or {}
                    parameters = details.get('parameter_size') or 'Unknown'
                    family = details.get('family') or 'Unknown'
                except Exception:
                    parameters = 'Unknown'
                    family = 'Unknown'
                
                detailed_info.append({
                    'name': name,
                    'size_gb': size_gb,
                    'modified': modified,
                    'parameters': parameters,
                    'family': family
                })
            
            return detailed_info
            
        except Exception as e:
            self.console.print(f"❌ Error getting model info: {e}", style="red")
            return []
    
    def display_model_table(self, models: List[Dict]):
        """Display models in a detailed table"""
        if not models:
            self.console.print("No models found.", style="yellow")
            return
        
        table = Table(title="🤖 Installed Models - Detailed View")
        table.add_column("Model Name", style="cyan", no_wrap=True)
        table.add_column("Size (GB)", style="green")
        table.add_column("Parameters", style="yellow")
        table.add_column("Family", style="magenta")
        table.add_column("Modified", style="blue")
        
        total_size = 0
        for model in models:
            size_str = f"{model['size_gb']:.1f}"
            total_size += model['size_gb']
            
            # Format modified date (a datetime in newer clients, an ISO string in older ones)
            modified = model['modified']
            if hasattr(modified, 'strftime'):
                modified = modified.strftime('%Y-%m-%d')
            elif modified != 'Unknown':
                # An ISO-8601 timestamp starts with the date, e.g. 2024-08-01T12:00:00Z
                modified = str(modified)[:10]
            
            table.add_row(
                model['name'],
                size_str,
                str(model['parameters']),
                model['family'],
                modified
            )
        
        self.console.print(table)
        self.console.print(f"\n💾 Total storage used: {total_size:.1f} GB", style="bold blue")
    
    def model_usage_stats(self, models: List[Dict]):
        """Show model count and storage used per model family"""
        if not models:
            return
        
        # Group by family
        families = {}
        for model in models:
            family = model['family']
            if family not in families:
                families[family] = []
            families[family].append(model)
        
        self.console.print("\n📊 Model Statistics:", style="bold cyan")
        
        for family, family_models in families.items():
            count = len(family_models)
            total_size = sum(m['size_gb'] for m in family_models)
            self.console.print(f"  🏷️ {family}: {count} models, {total_size:.1f} GB", style="green")
    
    def cleanup_suggestions(self, models: List[Dict]):
        """Suggest models that could be removed to save space"""
        if len(models) <= 2:
            self.console.print("\n🧹 No cleanup suggestions - you have few models.", style="green")
            return
        
        # Sort by size (largest first)
        large_models = sorted(models, key=lambda x: x['size_gb'], reverse=True)
        
        self.console.print("\n🧹 Cleanup Suggestions:", style="bold yellow")
        self.console.print("Consider removing large models you don't use frequently:", style="dim")
        
        for model in large_models[:3]:  # Show top 3 largest
            if model['size_gb'] > 5:  # Only suggest large models
                self.console.print(f"  • {model['name']} ({model['size_gb']:.1f} GB)", style="yellow")
                self.console.print(f"    Command: ollama rm {model['name']}", style="dim")

# Initialize advanced manager
advanced_manager = AdvancedModelManager()

if ollama_running:
    # Get detailed model information
    detailed_models = advanced_manager.get_detailed_model_info()
    
    if detailed_models:
        advanced_manager.display_model_table(detailed_models)
        advanced_manager.model_usage_stats(detailed_models)
        advanced_manager.cleanup_suggestions(detailed_models)
    else:
        console.print("No models found or error retrieving model information.", style="yellow")
else:
    console.print("⚠️ Ollama not running - cannot retrieve model information", style="yellow")

## 7. Performance Comparison {#comparison}

Let's create a comprehensive performance comparison tool:

In [None]:
class PerformanceBenchmark:
    """Benchmark different models for performance comparison"""
    
    def __init__(self):
        self.client = ollama.Client()
        self.console = Console()
    
    def benchmark_model(self, model_name: str, test_prompts: List[str], 
                       num_runs: int = 3) -> Dict:
        """Benchmark a single model with multiple prompts"""
        results = {
            'model': model_name,
            'tests': [],
            'avg_time': 0,
            'total_tokens': 0,
            'success_rate': 0
        }
        
        successful_tests = 0
        total_time = 0
        total_tokens = 0
        
        for prompt in test_prompts:
            prompt_results = []
            
            for run in range(num_runs):
                try:
                    start_time = time.time()
                    
                    response = self.client.chat(
                        model=model_name,
                        messages=[{'role': 'user', 'content': prompt}],
                        options={'num_predict': 100}
                    )
                    
                    end_time = time.time()
                    response_time = end_time - start_time
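                    response_text = response['message']['content']
                    # Prefer Ollama's own generated-token count; fall back to a word count
                    token_count = response.get('eval_count') or len(response_text.split())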
                    
                    prompt_results.append({
                        'time': response_time,
                        'tokens': token_count,
                        'success': True
                    })
                    
                    successful_tests += 1
                    total_time += response_time
                    total_tokens += token_count
                    
                except Exception as e:
                    prompt_results.append({
                        'error': str(e),
                        'success': False
                    })
            
            # Calculate averages for this prompt
            successful_runs = [r for r in prompt_results if r['success']]
            if successful_runs:
                avg_time = sum(r['time'] for r in successful_runs) / len(successful_runs)
                avg_tokens = sum(r['tokens'] for r in successful_runs) / len(successful_runs)
            else:
                avg_time = 0
                avg_tokens = 0
            
            results['tests'].append({
                'prompt': prompt,
                'runs': prompt_results,
                'avg_time': avg_time,
                'avg_tokens': avg_tokens
            })
        
        # Calculate overall statistics
        total_tests = len(test_prompts) * num_runs
        results['success_rate'] = (successful_tests / total_tests) * 100 if total_tests > 0 else 0
        results['avg_time'] = total_time / successful_tests if successful_tests > 0 else 0
        results['avg_tokens_per_second'] = (total_tokens / total_time) if total_time > 0 else 0
        
        return results
    
    def display_benchmark_results(self, results: List[Dict]):
        """Display benchmark results in a comprehensive format"""
        if not results:
            self.console.print("No benchmark results to display.", style="yellow")
            return
        
        # Summary table
        table = Table(title="🏆 Model Performance Benchmark")
        table.add_column("Model", style="cyan")
        table.add_column("Avg Time (s)", style="green")
        table.add_column("Tokens/sec", style="yellow")
        table.add_column("Success Rate", style="blue")
        table.add_column("Rating", style="magenta")
        
        for result in results:
            # Heuristic rating: 10+ tokens/sec earns the full speed score, averaged with the success rate
            speed_score = min(100, (result['avg_tokens_per_second'] / 10) * 100)
            reliability_score = result['success_rate']
            overall_rating = (speed_score + reliability_score) / 2
            
            if overall_rating >= 80:
                rating = "⭐⭐⭐⭐⭐"
            elif overall_rating >= 60:
                rating = "⭐⭐⭐⭐"
            elif overall_rating >= 40:
                rating = "⭐⭐⭐"
            elif overall_rating >= 20:
                rating = "⭐⭐"
            else:
                rating = "⭐"
            
            table.add_row(
                result['model'],
                f"{result['avg_time']:.2f}",
                f"{result['avg_tokens_per_second']:.1f}",
                f"{result['success_rate']:.1f}%",
                rating
            )
        
        self.console.print(table)
        
        # Detailed breakdown
        self.console.print("\n📊 Detailed Performance Analysis:", style="bold cyan")
        
        for result in results:
            self.console.print(f"\n🤖 {result['model']}:", style="bold green")
            
            for test in result['tests']:
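                successful_runs = sum(1 for r in test['runs'] if r['success'])
                total_runs = len(test['runs'])
                
                # Add an ellipsis only when the prompt is actually truncated
                prompt = test['prompt']
                preview = prompt if len(prompt) <= 50 else prompt[:50] + "..."
                self.console.print(f"  📝 '{preview}'", style="dim")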
                self.console.print(f"     Success: {successful_runs}/{total_runs}, "
                                 f"Avg Time: {test['avg_time']:.2f}s, "
                                 f"Avg Tokens: {test['avg_tokens']:.0f}", style="blue")

# Benchmark test prompts
benchmark_prompts = [
    "What is machine learning?",
    "Write a Python function to sort a list.",
    "Explain the theory of relativity briefly."
]

if ollama_running and available_models:
    benchmark = PerformanceBenchmark()
    
    console.print("\n🏁 Starting Performance Benchmark...", style="bold cyan")
    models_to_test = available_models[:2]  # Limit to first 2 models to save time
    console.print(f"Testing {len(models_to_test)} model(s) with {len(benchmark_prompts)} prompts", style="blue")
    
    benchmark_results = []
    
    for model in models_to_test:
        console.print(f"\n🔬 Benchmarking {model}...", style="cyan")
        
        with Progress(
            SpinnerColumn(),
            TextColumn("[progress.description]{task.description}"),
            console=console,
        ) as progress:
            task = progress.add_task(f"Testing {model}...", total=None)
            
            result = benchmark.benchmark_model(model, benchmark_prompts, num_runs=2)
            benchmark_results.append(result)
            
            progress.update(task, description=f"Finished {model}")
    
    # Display results
    benchmark.display_benchmark_results(benchmark_results)
    
else:
    console.print("⚠️ Cannot run benchmark - no models available or Ollama not running", style="yellow")
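
The star rating in `display_benchmark_results` averages a speed score (capped at 100 and reached at 10 tokens/sec) with the success rate, then maps the result to one to five stars in 20-point bands. The standalone sketch below reproduces that mapping so it can be checked or tuned in isolation. The function name `star_rating` and the `target_tps` parameter are illustrative assumptions, not part of the notebook's classes.

```python
def star_rating(tokens_per_second: float, success_rate: float,
                target_tps: float = 10.0) -> str:
    """Map throughput and success rate (0-100) to a 1-5 star string.

    target_tps is the throughput that earns a full speed score; the
    default of 10 mirrors the constant used in the benchmark display.
    """
    # Speed score scales linearly with throughput and is capped at 100
    speed_score = min(100.0, (tokens_per_second / target_tps) * 100.0)
    # Overall rating is the plain average of speed and reliability
    overall = (speed_score + success_rate) / 2
    # Each 20-point band adds one star: <20 -> 1, 20-39 -> 2, ..., >=80 -> 5
    stars = min(5, int(overall // 20) + 1)
    return "⭐" * stars


# A fast, fully reliable model gets five stars
print(star_rating(12.0, 100.0))
# Half the target speed with perfect reliability averages to 75, so four stars
print(star_rating(5.0, 100.0))
```

Because speed and reliability are weighted equally, a model that never fails but generates slowly can still outrank a fast model that fails half the time. Raising `target_tps` makes the speed component stricter on faster hardware.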

## Conclusion and Next Steps

This notebook covered downloading models directly with Ollama, integrating and converting Hugging Face models, and testing, managing, and benchmarking the results. The cell below prints a summary:

In [None]:
# Summary of what we've accomplished
summary = """
# 🎉 Hugging Face + Ollama Integration Complete!

## What We've Covered:

### 📥 **Model Downloads**
- Direct Ollama model downloads (Gemma, Llama, etc.)
- Hugging Face model exploration
- Model availability checking

### 🔄 **Model Conversion**
- Creating custom Modelfiles
- Converting HF models to Ollama format
- Custom system prompts and parameters

### 🧪 **Testing & Benchmarking**
- Model performance testing
- Comparative analysis
- Speed and success-rate metrics

### 🛠️ **Management Tools**
- Advanced model information
- Storage usage tracking
- Cleanup suggestions

## 🚀 Next Steps:

1. **Download More Models**
   ```bash
   ollama pull gemma2:9b
   ollama pull llama3.1:8b
   ollama pull codellama:7b
   ```

2. **Create Custom Models**
   - Experiment with different system prompts
   - Tune generation parameters (such as temperature) for your use case
   - Create specialized assistants

3. **Build Applications**
   - Integrate models into your projects
   - Create web interfaces
   - Build domain-specific chatbots

4. **Optimize Performance**
   - Test different quantization levels
   - Monitor resource usage
   - Implement caching strategies

## 📚 Useful Commands:

```bash
# List available models
ollama list

# Download a model
ollama pull model_name

# Remove a model
ollama rm model_name

# Create custom model
ollama create my_model -f Modelfile

# Show model info
ollama show model_name
```

---

*Happy modeling with Ollama and Hugging Face! 🤗🦙*
"""

console.print(Panel(
    Markdown(summary),
    title="📋 Tutorial Summary",
    border_style="gold1"  # Rich has no plain "gold" color; "gold1" is a valid name
))

# Final status check
if ollama_running:
    final_models = model_manager.list_available_models()
    console.print(f"\n✅ Tutorial complete! You now have {len(final_models)} model(s) ready to use.", style="bold green")
    
    if final_models:
        console.print("🎯 Try chatting with your models:", style="cyan")
        for model in final_models[:3]:  # Show first 3
            console.print(f"  ollama run {model}", style="dim")
else:
    console.print("\n⚠️ Start Ollama to begin using your models: ollama serve", style="yellow")

console.print("\n🚀 Ready to build amazing AI applications!", style="bold magenta")
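
As a starting point for the "Build Applications" step, here is a minimal sketch of calling a local model over Ollama's REST API using only the standard library. It assumes the default endpoint at `http://localhost:11434` and a non-streaming request to `/api/generate`. The helper names (`build_generate_request`, `ask`, `tokens_per_second`) are hypothetical and do not appear elsewhere in this notebook. Throughput is derived from the `eval_count` and `eval_duration` fields that Ollama includes in its reply; the duration is reported in nanoseconds.

```python
import json
import urllib.request

# Default local Ollama endpoint (assumption: server started with `ollama serve`)
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def tokens_per_second(reply: dict) -> float:
    """Compute generation throughput from Ollama's reply metadata."""
    duration_ns = reply.get("eval_duration", 0)
    if duration_ns <= 0:
        return 0.0  # metadata missing, so throughput is unknown
    return reply.get("eval_count", 0) / (duration_ns / 1e9)


def ask(model: str, prompt: str, timeout: float = 120.0) -> dict:
    """Send the prompt to a running Ollama server and return the parsed JSON reply."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Usage (requires a running Ollama server and a pulled model):
# reply = ask("llama3.1:8b", "What is machine learning?")
# print(reply["response"])
# print(f"{tokens_per_second(reply):.1f} tokens/sec")
```

Keeping the request construction separate from the network call makes the payload easy to inspect or unit test without a server. For multi-turn conversations or streaming output, the `ollama` Python package imported at the top of this notebook is likely the more convenient route.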