<div align="center">
<img src="https://poorit.in/image.png" alt="Poorit" width="40" style="vertical-align: middle;"> <b>AI SYSTEMS ENGINEERING 1</b>

## Unit 1: Comparing Model Providers - OpenAI, Gemini, and Ollama

**CV Raman Global University, Bhubaneswar**  
*AI Center of Excellence*

---

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Poorit-Technologies/cvraman-ai-notebooks/blob/main/ai-systems-engineering-1/unit-1/02-ai-systems-engineering-1-unit1-comparing-models.ipynb)

</div>

---

### What You'll Learn

In this notebook, you will:

1. **Understand the Chat Completions API** and how HTTP endpoints work
2. **Compare Python client libraries** vs raw HTTP requests
3. **Use OpenAI-compatible endpoints** to connect to Google Gemini
4. **Run local models with Ollama** for free, private inference

**Duration:** ~1.5 hours

---

## 1. Environment Setup

In [None]:
# Install required packages
!pip install -q openai requests

In [None]:
import os
import requests
from getpass import getpass
from openai import OpenAI

In [None]:
# Configure OpenAI API Key
openai_api_key = getpass("Enter your OpenAI API Key: ")
os.environ['OPENAI_API_KEY'] = openai_api_key

---

## 2. Understanding HTTP Endpoints

The **Chat Completions API** is simply an HTTP endpoint that:
- Receives a POST request with messages
- Returns a completion (the model's response)

Let's make a raw HTTP call to understand what's happening behind the scenes.

In [None]:
# Raw HTTP call to OpenAI endpoint

headers = {
    "Authorization": f"Bearer {openai_api_key}",
    "Content-Type": "application/json"
}

payload = {
    "model": "gpt-4o-mini",
    "messages": [
        {"role": "user", "content": "Tell me a fun fact about India"}
    ]
}

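response = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers=headers,
    json=payload,
    timeout=30  # Avoid hanging indefinitely on network problems
)
response.raise_for_status()  # Raise an error on 4xx/5xx responses (e.g. an invalid key)

print(response.json()["choices"][0]["message"]["content"])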
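
The one-line `print` above reaches several levels into the JSON body. The sketch below shows the shape being navigated, using a hand-written sample dictionary (all values are made up for illustration) and a hypothetical helper, `extract_reply`, that also handles the `error` object returned when a request fails.

```python
# Sample payload shaped like a Chat Completions response (values are made up).
sample_response = {
    "id": "chatcmpl-example",
    "object": "chat.completion",
    "model": "gpt-4o-mini",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Example reply text."},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 14, "completion_tokens": 5, "total_tokens": 19},
}


def extract_reply(data):
    """Return the first choice's text, or raise a readable error (hypothetical helper)."""
    if "error" in data:
        # Failed requests carry an "error" object instead of "choices".
        err = data["error"]
        message = err.get("message", err) if isinstance(err, dict) else err
        raise RuntimeError(f"API error: {message}")
    return data["choices"][0]["message"]["content"]


print(extract_reply(sample_response))
print("Tokens used:", sample_response["usage"]["total_tokens"])
```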

---

## 3. Python Client Libraries

The **openai** package is a Python client library - a wrapper around the HTTP endpoint.

It provides:
- Cleaner Python syntax
- Type hints and autocomplete
- Typed exceptions and automatic retries for transient errors

**Important**: The client library is open-source and lightweight. It doesn't contain any model code!
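
To make the "wrapper" idea concrete, here is a minimal sketch of what such a library does internally, written with only the Python standard library. `TinyChatClient` and its methods are hypothetical names invented for this illustration; the real `openai` package is far more complete. The point is that the class only assembles and sends an HTTP request, with no model code anywhere.

```python
import json
import urllib.request


class TinyChatClient:
    """Hypothetical, minimal stand-in for a client library (not the real openai package)."""

    def __init__(self, api_key, base_url="https://api.openai.com/v1"):
        self.api_key = api_key
        self.base_url = base_url.rstrip("/")

    def build_request(self, model, messages):
        """Assemble the POST request without sending it."""
        body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
        return urllib.request.Request(
            url=f"{self.base_url}/chat/completions",
            data=body,
            headers={
                "Authorization": f"Bearer {self.api_key}",
                "Content-Type": "application/json",
            },
            method="POST",
        )

    def create(self, model, messages, timeout=30):
        """Send the request and return the parsed JSON body (needs network access)."""
        request = self.build_request(model, messages)
        with urllib.request.urlopen(request, timeout=timeout) as resp:
            return json.load(resp)


# Build (but do not send) a request to inspect what would go over the wire.
tiny = TinyChatClient(api_key="sk-placeholder")
req = tiny.build_request("gpt-4o-mini", [{"role": "user", "content": "Hello"}])
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```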

In [None]:
# Using the OpenAI client library (much cleaner!)

client = OpenAI(api_key=openai_api_key)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Tell me a fun fact about India"}]
)

print(response.choices[0].message.content)

---

## 4. OpenAI-Compatible Endpoints

OpenAI's Chat Completions API became so popular that other providers created **compatible endpoints**.

This means you can use the same Python client library to call:
- OpenAI (GPT models)
- Google Gemini
- Ollama (local models)
- Many others!
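
In practice, "compatible" means the request path and JSON shape stay the same while only the base URL (and API key) change. The sketch below lists the base URLs used in this notebook and derives the full endpoint for each. The dictionary and the helper `chat_completions_url` are illustrative names, not part of any library.

```python
# Base URLs used in this notebook; the OpenAI entry is the client's default.
PROVIDER_BASE_URLS = {
    "openai": "https://api.openai.com/v1",
    "gemini": "https://generativelanguage.googleapis.com/v1beta/openai/",
    "ollama": "http://localhost:11434/v1",
}


def chat_completions_url(base_url):
    """Join a base URL with the shared Chat Completions path (hypothetical helper)."""
    return base_url.rstrip("/") + "/chat/completions"


for provider, base_url in PROVIDER_BASE_URLS.items():
    print(f"{provider:>7}: {chat_completions_url(base_url)}")
```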

### Google Gemini

To use Gemini, get your API key from: https://aistudio.google.com/apikey

In [None]:
# Configure Gemini (optional)
google_api_key = getpass("Enter your Google API Key (or press Enter to skip): ")

if google_api_key:
    GEMINI_BASE_URL = "https://generativelanguage.googleapis.com/v1beta/openai/"
    
    gemini_client = OpenAI(
        base_url=GEMINI_BASE_URL,
        api_key=google_api_key
    )
    
    response = gemini_client.chat.completions.create(
        model="gemini-2.5-flash",  # Model names change over time; check AI Studio for current ones
        messages=[{"role": "user", "content": "Tell me a fun fact about India"}]
    )
    
    print("Gemini response:")
    print(response.choices[0].message.content)
else:
    print("Skipping Gemini - no API key provided")

---

## 5. Running Local Models with Ollama

**Ollama** lets you run open-source models — and in Google Colab, we can install and run it directly inside the VM!

**Benefits:**
- Free (no API charges)
- Private (data stays on the Colab VM)
- Uses Colab's free T4 GPU (16GB VRAM)

**Limitations:**
- Less powerful than frontier models
- Free-tier GPU memory limits you to smaller models (this notebook uses 1B–3B parameters)

We'll install Ollama, start the server, and pull a small model — all within this notebook.

In [None]:
# Install zstd (required by Ollama installer) and then install Ollama
!sudo apt-get update -qq && sudo apt-get install -y -qq zstd > /dev/null
!curl -fsSL https://ollama.com/install.sh | sh

# Start Ollama server as a background daemon
import subprocess
import time

subprocess.Popen(["ollama", "serve"], stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
time.sleep(5)  # Wait for the server to start

# Verify it's running
try:
    status = requests.get("http://localhost:11434", timeout=5)
    status.raise_for_status()
    print("Ollama is running!")
except requests.exceptions.RequestException:
    print("Ollama failed to start. Try re-running this cell.")
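
A fixed `time.sleep(5)` can be too short on a slow VM and longer than needed on a fast one. One alternative, sketched below under the assumption that a plain HTTP GET to the server root succeeds once Ollama is ready, is to poll until the server answers. Both function names are hypothetical. The demo at the bottom uses a fake probe so it runs without Ollama; in the notebook you would call `wait_until_ready(ollama_is_up)` instead.

```python
import time
import urllib.error
import urllib.request


def wait_until_ready(probe, timeout=30.0, interval=0.5):
    """Call probe() repeatedly until it returns True or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def ollama_is_up(url="http://localhost:11434"):
    """Return True if an HTTP GET to the server succeeds (assumed readiness check)."""
    try:
        with urllib.request.urlopen(url, timeout=2) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


# Demo with a fake probe that succeeds on the third attempt.
attempts = iter([False, False, True])
print(wait_until_ready(lambda: next(attempts), timeout=5, interval=0.01))
```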

In [None]:
# Pull a small model (~1.3GB download, works well on Colab free tier)
# For a more capable model, try: !ollama pull llama3.2:3b (requires GPU runtime)
!ollama pull llama3.2:1b

In [None]:
# Connect to Ollama using OpenAI-compatible endpoint

OLLAMA_BASE_URL = "http://localhost:11434/v1"

ollama_client = OpenAI(
    base_url=OLLAMA_BASE_URL,
    api_key="ollama"  # Ollama doesn't need a real key
)

response = ollama_client.chat.completions.create(
    model="llama3.2:1b",
    messages=[{"role": "user", "content": "Tell me a fun fact about India"}]
)

print("Llama 3.2 response:")
print(response.choices[0].message.content)

In [None]:
# Try DeepSeek R1 (reasoning model)
!ollama pull deepseek-r1:1.5b

In [None]:
response = ollama_client.chat.completions.create(
    model="deepseek-r1:1.5b",
    messages=[{"role": "user", "content": "What is 15 * 23?"}]
)

print("DeepSeek R1 response:")
print(response.choices[0].message.content)
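
Reasoning models such as DeepSeek R1 may include their intermediate reasoning in the returned text, wrapped in `<think>...</think>` tags. Whether the tags appear depends on the model and the Ollama version, so treat the sketch below as an assumption to check against your own output. The helper `split_reasoning` is a hypothetical name, and the sample string is hand-written rather than real model output.

```python
import re

# Matches one <think>...</think> block, including newlines inside it.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", flags=re.DOTALL)


def split_reasoning(text):
    """Separate <think> blocks from the final answer (hypothetical helper)."""
    reasoning = "\n".join(part.strip() for part in THINK_BLOCK.findall(text))
    answer = THINK_BLOCK.sub("", text).strip()
    return reasoning, answer


# Hand-written sample in the assumed format.
sample = "<think>\n15 * 23 = 15 * 20 + 15 * 3 = 300 + 45\n</think>\n\n15 * 23 = 345"
reasoning, answer = split_reasoning(sample)
print("Reasoning:", reasoning)
print("Answer:", answer)
```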

---

## 6. Comparing Providers

Let's create a utility to compare responses from different providers.

In [None]:
def compare_models(prompt, clients):
    """Compare responses from multiple model providers."""
    for name, (client, model) in clients.items():
        try:
            response = client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": prompt}]
            )
            print(f"\n--- {name} ({model}) ---")
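            text = response.choices[0].message.content or ""
            # Show at most 300 characters; append "..." only when the text was cut
            print(text[:300] + ("..." if len(text) > 300 else ""))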
        except Exception as e:
            print(f"\n--- {name} ---")
            print(f"Error: {e}")
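
Since response speed is one of the dimensions worth comparing, a small timing wrapper can be handy alongside `compare_models`. The sketch below is one possible approach; `timed_call` is a hypothetical helper and `fake_completion` is a stand-in so the example runs without any API key. In the notebook you could wrap a real call instead, for example `timed_call(client.chat.completions.create, model=model, messages=messages)`.

```python
import time


def timed_call(fn, *args, **kwargs):
    """Run fn and return (result, elapsed_seconds). Hypothetical helper."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def fake_completion(prompt):
    """Stand-in for a model call so this example needs no network access."""
    time.sleep(0.01)
    return f"echo: {prompt}"


reply, seconds = timed_call(fake_completion, "Hello")
print(f"{reply!r} took {seconds:.3f}s")
```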

In [None]:
# Set up clients dictionary
clients = {
    "OpenAI": (client, "gpt-4o-mini"),
    "Ollama": (ollama_client, "llama3.2:1b")
}

# Add Gemini if available
if google_api_key:
    clients["Gemini"] = (gemini_client, "gemini-2.5-flash")

# Compare!
compare_models("Explain quantum computing in simple terms", clients)

---

## 7. Exercise: Build a Multi-Provider Summarizer

Modify the website summarizer from notebook 01 to use Ollama instead of OpenAI.

In [None]:
# Exercise: Create a summarizer using Ollama

def summarize_with_ollama(text):
    """Summarize text using a local Ollama model."""
    # Your implementation here
    pass

# Test with some sample text

---

## Key Takeaways

1. **Chat Completions API** is an HTTP endpoint - you can call it with raw requests

2. **Python client libraries** are convenient wrappers, not model implementations

3. **OpenAI-compatible endpoints** let you use the same code for multiple providers

4. **Ollama** provides free, private inference with open-source models — and runs inside Google Colab!

### Provider Comparison

| Provider | Cost | Privacy | Quality | Speed |
|----------|------|---------|---------|-------|
| OpenAI | Paid | Cloud | Highest | Fast |
| Gemini | Free tier | Cloud | High | Fast |
| Ollama | Free | Colab VM | Varies | Depends on hardware |

### What's Next?

In the next notebook, we'll explore:
- Tokenization and how text is converted to tokens
- Understanding context windows and API costs
- Conversation memory

---

## Additional Resources

- [OpenAI API Documentation](https://platform.openai.com/docs/)
- [Google AI Studio](https://aistudio.google.com/)
- [Ollama Documentation](https://ollama.com/)

---

**Course Information:**
- **Institution:** CV Raman Global University, Bhubaneswar
- **Program:** AI Center of Excellence
- **Course:** AI Systems Engineering 1
- **Developed by:** [Poorit Technologies](https://poorit.in) - *Transform Graduates into Industry-Ready Professionals*

---