# 🦙 Ollama 7B/14B Models on Google Colab

This notebook will guide you through setting up and running Ollama with either 7B or 14B parameter models on Google Colab.

## 📋 Prerequisites
- Google Colab account
- **For 14B models**: a GPU runtime is required, and Colab Pro is recommended
  - Go to **Runtime → Change runtime type → Hardware accelerator → GPU**

## 🚀 What we'll accomplish:
1. Install Ollama on the Colab runtime
2. Choose and download either 7B or 14B models
3. Set up the environment for interactive use
4. Provide examples and utilities for model interaction

## 🎯 Model Options:
- **7B Models**: A good fit for Colab Free, faster responses, roughly 4-6GB RAM
- **14B Models**: Higher quality, Colab Pro recommended, roughly 8-12GB RAM (the tags offered in Step 3 are 13B variants)
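
The memory figures above can be sanity-checked with simple arithmetic. The sketch below is a rough estimate only: it assumes 4-bit quantized weights (an assumption, not something this notebook states) and counts the weights alone, so the real footprint is higher once the KV cache and runtime overhead are added. The function name `estimate_weight_memory_gb` is hypothetical.

```python
def estimate_weight_memory_gb(params_billions: float, bits_per_weight: float = 4.0) -> float:
    """Rough size of the model weights alone, in GB (10^9 bytes).

    Assumption: every weight is stored at `bits_per_weight` bits.
    """
    bytes_per_weight = bits_per_weight / 8
    # billions of parameters * bytes per parameter = gigabytes
    return params_billions * bytes_per_weight


for size in (7, 13):
    weights_gb = estimate_weight_memory_gb(size)
    print(f"{size}B at 4-bit: ~{weights_gb:.1f} GB of weights, before cache and overhead")
```

Under these assumptions a 7B model needs about 3.5 GB for weights and a 13B model about 6.5 GB, which is consistent with the 4-6GB and 8-12GB ranges once overhead is included.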

---

**⚠️ Important Notes:**
- The model will run on Colab's servers, not your local machine
- Free Colab has time limits; consider Colab Pro for longer sessions and 14B models
- Models are downloaded fresh each session (not persistent)
- 14B models may time out on free Colab due to memory constraints

In [None]:
# Check GPU availability and system info
import subprocess
import os
import time
import requests
import json

print("🔍 System Information:")
print("=" * 50)

# Check if GPU is available
try:
    result = subprocess.run(['nvidia-smi'], capture_output=True, text=True)
    if result.returncode == 0:
        print("✅ GPU Available!")
        print(result.stdout)
    else:
        print("❌ No GPU detected")
except FileNotFoundError:
    print("❌ nvidia-smi not found - likely no GPU")

# Check system resources
print("\n💾 System Resources:")
try:
    result = subprocess.run(['free', '-h'], capture_output=True, text=True)
    print(result.stdout)
except Exception as e:
    print(f"Could not get memory info: {e}")

print("\n💿 Disk Space:")
try:
    result = subprocess.run(['df', '-h', '/'], capture_output=True, text=True)
    print(result.stdout)
except Exception as e:
    print(f"Could not get disk info: {e}")


## 📦 Step 1: Install Ollama

We'll download and install Ollama on the Colab runtime.
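
Since the install script runs in full every time the cell is executed, it may be worth checking first whether the binary is already on the `PATH`. The following is a minimal sketch using only the standard library; the helper name `find_executable` is an illustration, not part of Ollama.

```python
import shutil
from typing import Optional


def find_executable(name: str = "ollama") -> Optional[str]:
    """Return the full path of `name` if it is on the PATH, else None."""
    return shutil.which(name)


path = find_executable()
if path:
    print(f"ollama is already installed at {path}")
else:
    print("ollama not found; run the install cell")
```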


In [None]:
# Install Ollama
print("🚀 Installing Ollama...")
print("=" * 50)

# Download and install Ollama
!curl -fsSL https://ollama.com/install.sh | sh

print("\n✅ Ollama installation completed!")

# Verify installation
!which ollama
!ollama --version


## 🔄 Step 2: Start Ollama Service

We need to start the Ollama service in the background before we can use it.
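
A fixed sleep can be too short on a slow runtime and needlessly long on a fast one. As a hedged alternative, the sketch below polls the API until it answers or a deadline passes. The function name `wait_for_server` and its default timings are assumptions for illustration; the `http://localhost:11434/api/tags` endpoint is the one this notebook already uses for its health check.

```python
import time
import urllib.error
import urllib.request


def wait_for_server(url: str = "http://localhost:11434/api/tags",
                    timeout: float = 30.0,
                    interval: float = 1.0) -> bool:
    """Poll `url` until it returns HTTP 200 or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # Server not accepting connections yet; retry after a short pause
            pass
        time.sleep(interval)
    return False
```

Calling `wait_for_server()` right after launching `ollama serve` returns `True` as soon as the server responds, and `False` if it never comes up within the timeout.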


In [None]:
# Start Ollama service in background
import subprocess
import time
import os

print("🔄 Starting Ollama service...")
print("=" * 50)

def start_ollama_server():
    """Start `ollama serve` as a background process and return its handle."""
    try:
        # Popen returns immediately, so no helper thread is needed.
        # Output is discarded: an unread PIPE can fill up and stall the server.
        process = subprocess.Popen(['ollama', 'serve'],
                                   stdout=subprocess.DEVNULL,
                                   stderr=subprocess.DEVNULL)
        return process
    except Exception as e:
        print(f"Error starting Ollama server: {e}")
        return None

# Start the server; the handle is kept so the process can be inspected later
server_process = start_ollama_server()

# Wait a bit for the server to start
print("⏳ Waiting for Ollama server to start...")
time.sleep(10)

# Check if server is running
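def check_server():
    """Return True if the Ollama API answers on the default port."""
    try:
        # -f makes curl exit non-zero on HTTP errors, not only on connection failures
        response = subprocess.run(['curl', '-sf', 'http://localhost:11434/api/tags'],
                                capture_output=True, text=True, timeout=10)
        return response.returncode == 0
    except Exception:
        return False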

if check_server():
    print("✅ Ollama server is running!")
else:
    print("⚠️ Server might not be ready yet. Trying alternative method...")
    # Alternative method - start directly
    os.system('nohup ollama serve > /dev/null 2>&1 &')
    time.sleep(5)
    if check_server():
        print("✅ Ollama server is now running!")
    else:
        print("❌ Server startup may need more time. Continue to next cell.")

print("\n🎯 Server should be running on http://localhost:11434")


## 📥 Step 3: Download a 7B or 14B Model

Now we'll choose and download a model. The default is Llama 2 7B, but you can pick any of the other 7B models or one of the larger 13B models listed in the cell below.
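
Before pulling, it can help to see which tags the server already has, for example when a cell is re-run within the same session. The sketch below assumes the `/api/tags` response is a JSON object with a `models` list whose entries carry a `name` field; the helper names `installed_models` and `fetch_installed_models` are hypothetical, and the sample payload is hand-written rather than real server output.

```python
import json
import urllib.request


def installed_models(tags_payload: dict) -> list:
    """Extract model tags from a parsed /api/tags response."""
    entries = tags_payload.get("models") or []
    return [entry["name"] for entry in entries if "name" in entry]


def fetch_installed_models(base_url: str = "http://localhost:11434",
                           timeout: float = 10.0) -> list:
    """Query a running Ollama server and return the tags it has stored."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as response:
        return installed_models(json.loads(response.read().decode("utf-8")))


# Offline example with a hand-written payload (not real server output)
sample = {"models": [{"name": "llama2:7b"}, {"name": "mistral:7b"}]}
print("llama2:7b" in installed_models(sample))  # prints True
```

With the server running, `fetch_installed_models()` could be used to skip `ollama pull` when the chosen tag is already present.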


In [None]:
# Available models you can choose from:
print("🎯 Available Models (7B and 14B):")
print("=" * 60)

# 7B Models - Faster, less memory usage
models_7b = {
    "llama2:7b": "Meta's Llama 2 7B - Good general purpose model",
    "mistral:7b": "Mistral 7B - Efficient and fast model",
    "codellama:7b": "Code Llama 7B - Specialized for code generation",
    "vicuna:7b": "Vicuna 7B - Fine-tuned for conversations",
    "orca-mini:7b": "Orca Mini 7B - Small but capable model"
}

# 14B-class models (13B tags) - Higher quality, more memory usage
models_14b = {
    "llama2:13b": "Meta's Llama 2 13B - Higher quality general purpose",
    "codellama:13b": "Code Llama 13B - Advanced code generation",
    "vicuna:13b": "Vicuna 13B - Superior conversational abilities",
    "wizard-vicuna-uncensored:13b": "Wizard Vicuna 13B - Uncensored responses"
}

# Number the entries so they match the 1-N choice requested below
print("🚀 7B Models (Recommended for Colab Free):")
for number, (model, description) in enumerate(models_7b.items(), start=1):
    print(f"  {number}. {model}: {description}")

print("\n🔥 13B-14B Models (Requires more memory - Colab Pro recommended):")
for number, (model, description) in enumerate(models_14b.items(), start=1):
    print(f"  {number}. {model}: {description}")

print("\n" + "=" * 60)
print("📊 Memory Requirements:")
print("  • 7B models: ~4-6GB RAM")
print("  • 13B models: ~8-12GB RAM")
print("  • GPU acceleration helps significantly with larger models")
print("=" * 60)

# Model selection with size consideration
print("\n🎯 Model Selection:")
use_14b = input("Do you want to use a 14B model? (y/n, default=n): ").lower().strip()

if use_14b in ['y', 'yes']:
    print("\n⚠️  WARNING: 14B models require more memory!")
    print("   • Make sure you have GPU runtime enabled")
    print("   • Consider Colab Pro for better performance")

    model_choice = input(f"\nChoose 14B model (1-{len(models_14b)}, default=1): ").strip()
    model_list = list(models_14b.keys())

    try:
        model_idx = int(model_choice) - 1 if model_choice else 0
        if model_idx < 0:
            raise IndexError  # reject 0 and negatives, which would index from the end
        model_name = model_list[model_idx]
    except (ValueError, IndexError):
        model_name = model_list[0]  # Default to first 14B model

    print(f"📥 Selected: {model_name}")
    download_time = "10-20 minutes"
else:
    print("\n✅ Using 7B model (recommended for Colab)")
    model_choice = input(f"Choose 7B model (1-{len(models_7b)}, default=1): ").strip()
    model_list = list(models_7b.keys())

    try:
        model_idx = int(model_choice) - 1 if model_choice else 0
        if model_idx < 0:
            raise IndexError  # reject 0 and negatives, which would index from the end
        model_name = model_list[model_idx]
    except (ValueError, IndexError):
        model_name = "llama2:7b"  # Default fallback

    print(f"📥 Selected: {model_name}")
    download_time = "5-10 minutes"

print(f"\n🚀 Pulling model: {model_name}")
print(f"⚠️  This may take {download_time} depending on your connection...")

# Pull the model
# subprocess should already be imported from previous cells

print(f"⏬ Downloading {model_name}...")
# Output is captured, so no progress is shown until the pull finishes
result = subprocess.run(['ollama', 'pull', model_name],
                       capture_output=True, text=True)

if result.returncode == 0:
    print(f"\n✅ Model {model_name} downloaded successfully!")
    print(f"💾 Parameter count: {model_name.split(':')[1].upper()}")
else:
    print(f"\n❌ Error downloading model:")
    print(f"Error: {result.stderr}")
    print("💡 Try restarting the Ollama service and try again")
    print("💡 For 14B models, ensure you have sufficient memory/GPU")


## 🧪 Step 4: Test the Model

Let's run a simple test to make sure everything is working!
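
The same test can also be driven from Python through the server's HTTP API, which makes the response available as a string instead of cell output. This is a minimal sketch, assuming the `/api/generate` endpoint with a non-streaming request and a `response` field in the reply; the helper names `build_generate_request` and `generate` are hypothetical.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # default address used throughout this notebook


def build_generate_request(model: str, prompt: str,
                           base_url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build a non-streaming POST request for /api/generate."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 300.0) -> str:
    """Send the prompt to a running Ollama server and return the generated text."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body.get("response", "")
```

Once the server is up and a model has been pulled, `generate(model_name, "Tell me a joke")` would return the model's reply as plain text.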


In [None]:
# Test the model with a simple prompt
print("🧪 Testing the model...")
print("=" * 50)

test_prompt = "Tell me a short joke"

print(f"📝 Prompt: {test_prompt}")
print("\n🤖 Response:")
print("-" * 30)

# Run the model
!ollama run {model_name} "{test_prompt}"

print("\n✅ Test completed! If a response appears above, your model is working.")
