# Backdoor AI - Ollama on Google Colab (Llama 3.3 70B)

This notebook helps you run Ollama on Google Colab to use with your Backdoor AI application. You can install and run Llama 3.3 70B directly in Colab, then connect your Backdoor AI app to it.

## How it works

1. This notebook will first optimize your Colab environment for large models
2. We'll install Ollama on this Colab instance
3. We'll download the Llama 3.3 70B model
4. We'll set up Cloudflared to create a secure tunnel to your Ollama instance
5. You'll get a URL to use in your Backdoor AI settings

Let's get started!

## 1. Memory Optimization for Large Models

First, let's free up disk space and memory so there are enough resources for large models.

In [None]:
# Memory optimization functions
import os
import shutil
import subprocess
import gc
import time
from IPython.display import display, HTML, clear_output

# Install required packages first
!pip install -q psutil
import psutil

def clear_disk_space():
    """Clean up disk space by removing unnecessary files."""
    print("🧹 Cleaning up disk space...")
    
    # Clean apt cache
    subprocess.run("apt-get clean", shell=True)
    
    # Remove unnecessary packages
    subprocess.run("apt-get -y autoremove", shell=True)
    
    # Clean pip cache
    subprocess.run("rm -rf ~/.cache/pip", shell=True)
    
    # Remove temporary files
    temp_dirs = ['/tmp', '/var/tmp']
    for temp_dir in temp_dirs:
        if os.path.exists(temp_dir):
            try:
                for item in os.listdir(temp_dir):
                    item_path = os.path.join(temp_dir, item)
                    # Skip our ollama directories
                    if item.startswith('ollama') or item.startswith('backdoor'):
                        continue
                    
                    try:
                        if os.path.isdir(item_path):
                            shutil.rmtree(item_path)
                        else:
                            os.remove(item_path)
                    except Exception as e:
                        pass  # Skip files that can't be removed
            except Exception as e:
                print(f"Warning: Could not clean {temp_dir}: {e}")
    
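    # Remove unused Docker images/containers, but only if Docker is installed.
    # (subprocess.run with shell=True does not raise when the command is missing,
    # so check for the binary explicitly instead of relying on try/except.)
    if shutil.which("docker"):
        subprocess.run("docker system prune -af", shell=True, stderr=subprocess.DEVNULL)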
    
    print("✅ Disk cleanup complete!")
    
    # Show disk space
    show_disk_usage()

def show_disk_usage():
    """Show current disk usage."""
    try:
        df_output = subprocess.check_output("df -h /", shell=True, text=True)
        print("\n📊 Disk Space Available:")
        for line in df_output.split('\n'):
            print(line)
    except:
        print("Could not retrieve disk usage information")

def show_memory_usage():
    """Show current memory usage."""
    try:
        memory = psutil.virtual_memory()
        total_gb = memory.total / (1024 ** 3)
        available_gb = memory.available / (1024 ** 3)
        used_gb = memory.used / (1024 ** 3)
        percent = memory.percent
        
        print(f"\n📊 Memory Usage:")
        print(f"Total Memory: {total_gb:.2f} GB")
        print(f"Available: {available_gb:.2f} GB")
        print(f"Used: {used_gb:.2f} GB ({percent}%)")
    except:
        print("Could not retrieve memory usage information")

def clear_memory():
    """Clear Python memory and, if PyTorch is installed, its CUDA cache."""
    gc.collect()

    try:
        import torch
        if torch.cuda.is_available():
            torch.cuda.empty_cache()
            print("✅ PyTorch CUDA cache cleared")
    except ImportError:
        pass  # PyTorch is optional here

    print("✅ Python memory cleared")

def clean_model_files(keep_models=None):
    """Clean up model files to free space, optionally keeping specified models."""
    if keep_models is None:
        keep_models = []
    
    print(f"🧹 Cleaning model files (keeping: {', '.join(keep_models) if keep_models else 'none'})...")
    
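    # Clean Ollama model files (except the ones specified to keep).
    # Caveat: Ollama keeps its data in 'blobs' and 'manifests' subdirectories,
    # so this name match only protects top-level entries whose names contain
    # the model name. `ollama rm <model>` is the safer way to delete one model.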
    ollama_dirs = ['/root/.ollama', '/tmp/ollama']
    
    for ollama_dir in ollama_dirs:
        if os.path.exists(ollama_dir):
            models_path = os.path.join(ollama_dir, 'models')
            if os.path.exists(models_path):
                for model_file in os.listdir(models_path):
                    should_keep = False
                    for keep_model in keep_models:
                        if keep_model in model_file:
                            should_keep = True
                            break
                    
                    if not should_keep:
                        try:
                            model_path = os.path.join(models_path, model_file)
                            if os.path.isdir(model_path):
                                shutil.rmtree(model_path)
                            else:
                                os.remove(model_path)
                            print(f"  - Removed: {model_file}")
                        except Exception as e:
                            print(f"  - Could not remove {model_file}: {e}")
    
    print("✅ Model cleanup complete!")
    show_disk_usage()

def monitor_download_progress(model_name):
    """Monitor the download progress of a model."""
    last_size = 0
    download_dir = '/root/.ollama/models'
    
    print(f"🔄 Monitoring download progress for {model_name}")
    
    try:
        while True:
            if not os.path.exists(download_dir):
                time.sleep(1)
                continue
                
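            # Ollama stores model data as digest-named blobs, so file names
            # do not contain the model name; sum every file instead.
            total_size = 0
            for root, dirs, files in os.walk(download_dir):
                for file in files:
                    try:
                        total_size += os.path.getsize(os.path.join(root, file))
                    except OSError:
                        pass  # File may vanish while a download is in progress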
            
            if total_size > last_size:
                clear_output(wait=True)
                print(f"Downloading {model_name}...")
                print(f"Downloaded: {total_size / (1024**3):.2f} GB")
                last_size = total_size
            
            time.sleep(2)
    except KeyboardInterrupt:
        print("Download monitoring stopped")

# Run optimization process
print("🚀 Optimizing environment for large language models...")
clear_disk_space()
clear_memory()

# Set environment variables for improved performance
os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:512"

# Show current resource usage
show_memory_usage()
show_disk_usage()

print("\n✅ Optimization complete! Ready to continue.")
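Before moving on, it can help to confirm that the cleanup left enough room for the model. The sketch below is one way to do that with `shutil.disk_usage`; the helper name `has_room_for_model` is a hypothetical addition, not part of the original notebook, and the threshold you pass in is your own estimate of the model size.

```python
import shutil

def has_room_for_model(required_gb, path="/"):
    """Return True if `path` has at least `required_gb` GiB free (hypothetical helper)."""
    free_gb = shutil.disk_usage(path).free / (1024 ** 3)
    print(f"Free disk space on {path}: {free_gb:.1f} GB (need {required_gb} GB)")
    return free_gb >= required_gb
```

For example, `has_room_for_model(50)` returns `False` when less than 50 GB is free, which is a cue to pick a smaller model before starting a long download.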

## 2. Set up environment

Now let's install the required packages. We'll need Ollama and Cloudflared for tunneling.

In [None]:
# Install Ollama
!curl -fsSL https://ollama.com/install.sh | sh

# Install cloudflared for tunneling
!wget -q https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-amd64.deb
!dpkg -i cloudflared-linux-amd64.deb

# Install other dependencies (ipywidgets is needed by the tunnel cell below)
!pip install -q requests pyngrok httpx ipywidgets

# Set up directories
!mkdir -p /tmp/ollama/models
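Install scripts can fail quietly, so a quick check that both binaries ended up on `PATH` may save time later. This is a minimal sketch using `shutil.which`; the helper name `missing_tools` is an assumption made for illustration.

```python
import shutil

def missing_tools(names):
    """Return the command names from `names` that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]

missing = missing_tools(["ollama", "cloudflared"])
if missing:
    print(f"❌ Not found on PATH: {', '.join(missing)}")
else:
    print("✅ ollama and cloudflared are installed")
```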

## 3. Start Ollama server

Now we'll start the Ollama server in the background.

In [None]:
import subprocess
import time
import requests
import json
from IPython.display import clear_output

# Start Ollama server in background.
# Output goes to a log file: a PIPE that nobody reads can fill up and stall the server.
ollama_log = open("/tmp/ollama_server.log", "a")
ollama_process = subprocess.Popen(
    ["ollama", "serve"],
    stdout=ollama_log,
    stderr=subprocess.STDOUT
)

# Wait for Ollama to start
print("Starting Ollama server...")
time.sleep(5)

# Check if Ollama is running
try:
    response = requests.get("http://localhost:11434/api/version", timeout=10)
    if response.status_code == 200:
        print(f"✅ Ollama is running! Version: {response.json().get('version')}")
    else:
        print(f"❌ Ollama returned unexpected status: {response.status_code}")
except Exception as e:
    print(f"❌ Failed to connect to Ollama: {e}")
    print("Trying to start again...")
    # Stop the previous process before retrying
    ollama_process.terminate()
    time.sleep(2)
    # Start again with Popen so the retry is tracked like the first attempt
    ollama_process = subprocess.Popen(
        ["ollama", "serve"],
        stdout=ollama_log,
        stderr=subprocess.STDOUT
    )
    time.sleep(5)
    try:
        response = requests.get("http://localhost:11434/api/version", timeout=10)
        if response.status_code == 200:
            print(f"✅ Second attempt succeeded! Ollama is running. Version: {response.json().get('version')}")
        else:
            print(f"❌ Ollama returned unexpected status: {response.status_code}")
    except Exception:
        print("❌ Failed to start Ollama after multiple attempts. See /tmp/ollama_server.log")
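A fixed `time.sleep(5)` can be too short on a busy runtime and needlessly long on a fast one. As an alternative, the sketch below polls the version endpoint until it answers. It uses only the standard library; the helper name `wait_for_http` and its default timings are assumptions, not part of the original notebook.

```python
import time
import urllib.error
import urllib.request

def wait_for_http(url, timeout=60.0, interval=1.0):
    """Poll `url` until it answers with HTTP 200 or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval + 1) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # Server is not accepting connections yet
        time.sleep(interval)
    return False
```

With this in place, `wait_for_http("http://localhost:11434/api/version")` could stand in for the sleep-then-check pattern above.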

## 4. Download Llama 3.3 70B model

Now, let's download the Llama 3.3 70B model. This is a large download (about 43 GB for the default quantized `llama3.3:70b` build in the Ollama library), so it will take some time.

In [None]:
# Download Llama 3.3 70B
print("🚀 Downloading Llama 3.3 70B model...")
print("This is a large model (about 43 GB in its default quantized build) and will take some time to download.")
print("You'll see progress below. Please don't interrupt the process.")

# Run the download command
!ollama pull llama3.3:70b

# Verify the model is available
print("\n📋 Available models:")
!ollama list
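If you would rather track the download from Python than from the CLI output, Ollama's REST API also exposes a pull endpoint that streams one JSON object per line. The sketch below formats a single such line. It assumes the `status`, `total`, and `completed` fields described in Ollama's API documentation; the helper name `format_pull_progress` is hypothetical.

```python
import json

def format_pull_progress(line):
    """Turn one JSON status line from a streamed pull into readable text."""
    event = json.loads(line)
    status = event.get("status", "unknown")
    total = event.get("total")
    completed = event.get("completed", 0)
    if total:  # Only layer-download events carry byte counts
        percent = 100 * completed / total
        return (f"{status}: {completed / 1024**3:.2f} / "
                f"{total / 1024**3:.2f} GB ({percent:.0f}%)")
    return status
```

Each non-empty line from `requests.post("http://localhost:11434/api/pull", json={"model": "llama3.3:70b"}, stream=True).iter_lines()` could be passed to this function. Treat the request body shape as an assumption to verify against your Ollama version.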

## 5. Test the Llama 3.3 70B model

Let's make sure the model works by asking it a question relevant to its use in Backdoor AI.

In [None]:
import requests
import json
from IPython.display import display, HTML

# Define a relevant prompt for Backdoor AI usage
test_prompt = """
I'm using you with the Backdoor AI Flask application. Can you:
1. Help me understand how to use Llama 3.3 70B effectively
2. Explain what makes Llama 3.3 70B different from other models
3. Suggest some good use cases for this model

Please provide a brief response.
"""

# Set up the API call
url = "http://localhost:11434/api/chat"
payload = {
    "model": "llama3.3:70b",
    "messages": [
        {"role": "user", "content": test_prompt}
    ],
    "stream": False
}

# Make the API call
try:
    print("Testing Llama 3.3 70B with a Backdoor AI related question...\n")
    # Generation on a 70B model can be slow, so allow a generous timeout
    response = requests.post(url, json=payload, timeout=600)
    if response.status_code == 200:
        result = response.json()
        content = result.get("message", {}).get("content", "No content returned")
        print("✅ Model response:\n")
        print(content)
    else:
        print(f"❌ Error: Server returned status {response.status_code}")
        print(response.text)
except Exception as e:
    print(f"❌ Error testing model: {str(e)}")
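The test above sets `"stream": False`, so nothing is shown until the whole answer is ready. With `"stream": True`, the chat endpoint instead sends newline-delimited JSON events that each carry a piece of the reply. The following sketch assembles those pieces. It assumes each event has a `message.content` field and a final event with `"done": true`, as described in Ollama's API documentation; the helper name `collect_stream` is hypothetical.

```python
import json

def collect_stream(lines):
    """Join the content chunks from newline-delimited JSON chat events."""
    parts = []
    for raw in lines:
        if not raw:
            continue  # Skip keep-alive blank lines
        event = json.loads(raw)
        parts.append(event.get("message", {}).get("content", ""))
        if event.get("done"):
            break  # Final event: stop reading
    return "".join(parts)
```

In practice the `lines` argument would be `requests.post(url, json=payload, stream=True).iter_lines()` with `"stream": True` in the payload.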

## 6. Set up a tunnel to access your Ollama instance

Now we'll set up a Cloudflare tunnel so your Backdoor AI application can access this Ollama instance.

In [None]:
import subprocess
import threading
import time
import re
import ipywidgets as widgets  # Needed for the Button and HTML widgets below
from IPython.display import display, HTML

# Function to run cloudflared tunnel in a separate thread
def run_tunnel():
    global tunnel_process, tunnel_url
    tunnel_process = subprocess.Popen(
        ["cloudflared", "tunnel", "--url", "http://localhost:11434"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True
    )
    
    # Extract tunnel URL
    tunnel_url = None
    url_pattern = re.compile(r'https://[\w.-]+\.trycloudflare\.com')
    
    while True:
        line = tunnel_process.stdout.readline()
        if not line and tunnel_process.poll() is not None:
            break
        
        match = url_pattern.search(line)
        if match and not tunnel_url:
            tunnel_url = match.group(0)
            print(f"✅ Tunnel established at: {tunnel_url}")
            # Update tunnel info
            tunnel_info.value = f"<div style='padding: 10px; background-color: #e6ffe6; border-radius: 5px;'><b>✅ Your Ollama API is accessible at:</b><br><code>{tunnel_url}</code><br><br>Use this URL in your Backdoor AI settings as the Ollama API Base URL.<br>Keep this notebook running while you're using Ollama with your app!</div>"

# Create widgets for tunnel info
tunnel_button = widgets.Button(description='Start Tunnel', button_style='info')
tunnel_info = widgets.HTML("")

# Function to start tunnel
def on_tunnel_button_click(b):
    b.description = "Starting..."
    b.disabled = True
    tunnel_info.value = "<div style='padding: 10px; background-color: #fff3e6; border-radius: 5px;'>⏳ Creating secure tunnel to Ollama... (this may take a moment)</div>"
    
    # Start tunnel in separate thread
    thread = threading.Thread(target=run_tunnel)
    thread.daemon = True
    thread.start()
    
    # Check for tunnel URL
    attempts = 0
    while attempts < 10 and not tunnel_url:
        time.sleep(1)
        attempts += 1
    
    if not tunnel_url:
        tunnel_info.value = "<div style='padding: 10px; background-color: #ffe6e6; border-radius: 5px;'>❌ Failed to create tunnel. Check the output below for details.</div>"
        b.description = "Try Again"
        b.disabled = False

# Connect events
tunnel_button.on_click(on_tunnel_button_click)

# Initialize global variables
tunnel_process = None
tunnel_url = None

# Display widgets
display(widgets.HTML("<h3>Create a secure tunnel to your Ollama instance:</h3>"))
display(widgets.HTML("<p>This will create a public URL that you can use to connect your Backdoor AI application to this Ollama instance.</p>"))
display(tunnel_button)
display(tunnel_info)
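The tunnel cell finds the public URL by scanning each line of cloudflared's output with a regular expression. The sketch below isolates that step so it can be tried on its own. The pattern is the one used above; the helper name `extract_tunnel_url` and the sample log line are made up for illustration.

```python
import re

URL_PATTERN = re.compile(r"https://[\w.-]+\.trycloudflare\.com")

def extract_tunnel_url(line):
    """Return the first trycloudflare.com URL in `line`, or None."""
    match = URL_PATTERN.search(line)
    return match.group(0) if match else None

# Hypothetical log line, not real cloudflared output
sample = "INF |  https://example-words-here.trycloudflare.com  |"
print(extract_tunnel_url(sample))
```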

## 7. Connect Backdoor AI to your Ollama instance

Once you have your tunnel URL, follow these steps to connect Backdoor AI to your Ollama instance:

1. Go to your Backdoor AI settings page
2. Select "Ollama (Remote)" as your LLM provider
3. In the "Ollama API URL" field, enter the tunnel URL from above
4. In the "Ollama Model" dropdown, select "llama3.3:70b"
5. Click "Save Settings"
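Before saving the settings, you may want to confirm that the tunnel URL actually reaches Ollama and that the model is listed. The sketch below parses the JSON body returned by Ollama's `/api/tags` endpoint. It assumes the response has a `models` list whose entries carry a `name` field; the helper names `model_names` and `has_model` are hypothetical.

```python
def model_names(tags_response):
    """Extract model names from a parsed /api/tags response body."""
    return [m.get("name") for m in tags_response.get("models", [])]

def has_model(tags_response, wanted="llama3.3:70b"):
    """Return True if `wanted` appears among the listed model names."""
    return wanted in model_names(tags_response)
```

For instance, `has_model(requests.get(f"{tunnel_url}/api/tags", timeout=30).json())` should return `True` once the download in step 4 has finished.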

**Important Notes:**
- Keep this notebook running as long as you're using Ollama with your app
- The tunnel URL will change each time you restart this notebook
- Google Colab sessions have limited runtime (up to about 12 hours on the free tier, and often less depending on availability)
- Your model downloads will be lost when the Colab session ends

If you want a more permanent solution, consider setting up Ollama on your own machine or a cloud server.

## Additional Information

### Troubleshooting

If you encounter issues:

1. **Ollama not starting**: Try restarting the runtime (Runtime > Restart runtime) and run all cells again
2. **Model download failing**: Check your internet connection and that the runtime has enough free disk space for the model, then retry or try a smaller model
3. **Tunnel not working**: Make sure Ollama is running properly, then try starting the tunnel again

### Keeping Colab Active

Google Colab sessions will disconnect after periods of inactivity. To keep your session active:
- Keep the browser tab open
- Consider using browser extensions that simulate activity

### Shutting Down

When you're done, remember to:
1. Switch your Backdoor AI back to another provider if needed
2. Close this notebook
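Closing the notebook does not by itself stop the background processes started earlier. A minimal sketch for shutting them down cleanly is shown below. The helper name `stop_process` and the five-second grace period are assumptions; `tunnel_process` and `ollama_process` are the variables created in the cells above.

```python
import subprocess

def stop_process(proc, grace_seconds=5):
    """Terminate `proc`, escalating to kill if it does not exit in time."""
    if proc is None or proc.poll() is not None:
        return  # Never started or already exited
    proc.terminate()
    try:
        proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()
        proc.wait()
```

Calling `stop_process(tunnel_process)` followed by `stop_process(ollama_process)` closes the public URL first and then the server behind it.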

### For Production Use

For a more reliable solution, consider:
- Running Ollama on your own hardware or a cloud VM
- Setting up proper authentication and security measures
- Using persistent storage for model files