## Getting Started

### Prerequisites
1. Create an Ollama account at [ollama.com](https://ollama.com)
2. Generate an API key from [ollama.com/settings/keys](https://ollama.com/settings/keys)
3. Set the API key as `OLLAMA_API_KEY` in the `.env` file in this folder
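
As a rough sketch of step 3, the helper below writes a placeholder `.env` file. The variable names and default values are the ones this notebook reads later with `os.getenv()`. The helper name `write_env_template` is hypothetical, and the key value is a placeholder that still has to be replaced by hand.

```python
from pathlib import Path

# Variable names match what the notebook reads with os.getenv();
# the values are placeholders and defaults, not real credentials.
ENV_TEMPLATE = (
    "OLLAMA_API_KEY=your_api_key_here\n"
    "OLLAMA_API_ENDPOINT=https://ollama.com\n"
    "DEFAULT_MODEL=gpt-oss:120b\n"
)


def write_env_template(folder: Path) -> bool:
    """Create a placeholder .env in `folder`; return False if one already exists."""
    env_path = folder / ".env"
    if env_path.exists():
        return False  # never overwrite a file that may hold a real key
    env_path.write_text(ENV_TEMPLATE, encoding="utf-8")
    return True
```

Calling `write_env_template(Path.cwd())` from the notebook folder would create the file once. Later calls leave an existing `.env` untouched.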

### Installation
```bash
pip install ollama python-dotenv requests
```

In [1]:
# Setup: Install required packages and import libraries
import sys
import subprocess

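# Map each pip package name to the module name it is imported under
# (python-dotenv is imported as `dotenv`, not `python_dotenv`)
packages = {'ollama': 'ollama', 'python-dotenv': 'dotenv', 'requests': 'requests'}

for package, module in packages.items():
    try:
        __import__(module)
    except ImportError:
        print(f"Installing {package}...")
        subprocess.check_call([sys.executable, "-m", "pip", "install", "-q", package])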

print("✓ All packages installed successfully!")

Installing python-dotenv...
✓ All packages installed successfully!
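
A pip package name and its import name can differ: `python-dotenv` is imported as `dotenv`. The sketch below shows one way to check for missing packages without importing them, using `importlib.util.find_spec` from the standard library. The mapping and the function name `missing_packages` are illustrative assumptions, not part of the notebook.

```python
import importlib.util

# pip distribution name -> importable module name
PACKAGE_TO_MODULE = {
    "ollama": "ollama",
    "python-dotenv": "dotenv",
    "requests": "requests",
}


def missing_packages(package_to_module: dict[str, str]) -> list[str]:
    """Return the pip names whose top-level module cannot be found."""
    return [
        package
        for package, module in package_to_module.items()
        if importlib.util.find_spec(module) is None  # None means not installed
    ]
```

`missing_packages(PACKAGE_TO_MODULE)` returns only the names that still need a `pip install`, so an already installed package is not reinstalled on every run.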


In [2]:
# Import all required libraries
import os
from pathlib import Path
from dotenv import load_dotenv
import ollama
import base64

# Load environment variables from .env file
# Use Path.cwd() since __file__ is not defined in Jupyter notebooks
env_path = Path.cwd() / '.env'
if env_path.exists():
    load_dotenv(env_path)
else:
    # Fall back to the author's local checkout of the ollama-cloud directory;
    # change this path to wherever your own .env file lives
    load_dotenv(dotenv_path='d:\\ollama-n8n\\ollama-cloud\\.env')

# Get API key and configuration
OLLAMA_API_KEY = os.getenv('OLLAMA_API_KEY')
OLLAMA_ENDPOINT = os.getenv('OLLAMA_API_ENDPOINT', 'https://ollama.com')
DEFAULT_MODEL = os.getenv('DEFAULT_MODEL', 'gpt-oss:120b')

# Verify API key is set
if not OLLAMA_API_KEY or OLLAMA_API_KEY == 'your_api_key_here':
    print("WARNING: OLLAMA_API_KEY not set in .env file!")
    print("Please set your API key in the .env file: https://ollama.com/settings/keys")
else:
    print("✓ Ollama API configured")
    print(f"  Endpoint: {OLLAMA_ENDPOINT}")
    print(f"  Default Model: {DEFAULT_MODEL}")

# Initialize Ollama client for cloud API
client = ollama.Client(
    host=OLLAMA_ENDPOINT,
    headers={'Authorization': f'Bearer {OLLAMA_API_KEY}'}
)

✓ Ollama API configured
  Endpoint: https://ollama.com
  Default Model: gpt-oss:120b
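
The cell above creates the client even when the key is missing, in which case the header would read `Bearer None`. A minimal sketch of failing fast instead is shown below. `build_auth_headers` is a hypothetical helper, not part of the `ollama` library.

```python
from typing import Optional

PLACEHOLDER_KEY = "your_api_key_here"  # placeholder value the notebook checks for


def build_auth_headers(api_key: Optional[str]) -> dict[str, str]:
    """Return Bearer auth headers, or raise if the key is missing or a placeholder."""
    if not api_key or api_key.strip() == PLACEHOLDER_KEY:
        raise ValueError("OLLAMA_API_KEY is not set; add it to the .env file")
    return {"Authorization": f"Bearer {api_key.strip()}"}
```

With this helper, the client could be created as `ollama.Client(host=OLLAMA_ENDPOINT, headers=build_auth_headers(OLLAMA_API_KEY))`, so a missing key raises an error at setup time rather than on the first request.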


## Vision Capability - Analyze Images

Some Ollama models are multimodal: they accept images alongside text and can describe an image or answer questions about it.

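**Note**: The `images` field takes each image as a plain base64 string, not a data URI. For an image URL, the helper below downloads the image first and then encodes it. For a local file, it reads and encodes the file contents.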
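
As a minimal sketch of the encoding step, the helpers below turn raw image bytes into a plain base64 string and strip a data-URI prefix if one is present. Both function names are hypothetical and not part of the `ollama` library.

```python
import base64


def encode_image_bytes(raw: bytes) -> str:
    """Encode raw image bytes as a plain base64 string (no data-URI prefix)."""
    return base64.b64encode(raw).decode("ascii")


def strip_data_uri(value: str) -> str:
    """Drop a leading 'data:<type>;base64,' prefix if present."""
    if value.startswith("data:") and "," in value:
        return value.split(",", 1)[1]  # keep only the payload after the first comma
    return value
```

For example, `strip_data_uri("data:image/png;base64,YWJj")` yields `"YWJj"`, which is the form passed in the `images` list of a chat message.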

In [3]:
def test_vision_with_url(model: str = 'gemma3:4b', image_url: str = 'https://raw.githubusercontent.com/opencv/opencv/master/samples/data/lena.jpg') -> str:
    """
    Test vision capability using an image URL.
    Downloads the image and converts to base64 for Ollama.
    
    Args:
        model: Vision-capable model (llama4:scout, qwen2.5vl, etc.)
        image_url: URL of the image to analyze
    
    Returns:
        Model's analysis of the image
    """
    print(f"   Testing vision capability with {model}...")
    print(f"   Image URL: {image_url}")
    print(f"   Downloading image...")
    
    try:
        # Download image from URL with proper headers to avoid 403 errors
        import requests
        headers = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36'
        }
        response_img = requests.get(image_url, headers=headers, timeout=10, allow_redirects=True)
        response_img.raise_for_status()
        
        # Verify we got image data; any non-image content type (including HTML) is rejected
        content_type = response_img.headers.get('content-type', '').lower()
        if not content_type.startswith('image'):
            print(f"   Warning: Received {content_type} instead of image")
            print(f"   Response preview: {response_img.content[:100]}")
            raise ValueError(f"URL returned {content_type} instead of an image")
        
        # Convert to base64 - Ollama expects just the base64 string, not data URI
        image_data = base64.b64encode(response_img.content).decode('utf-8')
        
        print(f"   ✓ Image downloaded ({len(response_img.content)} bytes, {content_type})\n")
        
        response = client.chat(
            model=model,
            messages=[
                {
                    'role': 'user',
                    'content': 'Describe what you see in this image in detail.',
                    'images': [image_data]  # Pass base64 string directly, not data URI
                }
            ],
            stream=False
        )
        
        answer = response.message.content
        print(f"Vision Analysis:\n{answer}\n")
        return answer
    
    except Exception as e:
        print(f"Error: {str(e)}")
        print("Note: Make sure the URL is accessible and the model supports vision.")
        raise

def test_vision_with_base64(model: str = 'gemma3:4b', image_path: str = None) -> str:
    """
    Test vision capability with local image (base64 encoded).
    
    Args:
        model: Vision-capable model
        image_path: Path to local image file
    
    Returns:
        Model's analysis
    """
    if image_path is None:
        print("   Note: For this example to work, provide a path to a local image file.")
        print("   Example: test_vision_with_base64(image_path='./path/to/image.jpg')\n")
        return "No image provided"
    
    # Read and encode image
    with open(image_path, 'rb') as f:
        image_data = base64.b64encode(f.read()).decode('utf-8')
    
    # Determine the image format for the log output only; the request carries
    # just the raw base64 string, so no media type is sent to Ollama
    file_ext = Path(image_path).suffix.lower()
    media_type_map = {
        '.jpg': 'image/jpeg',
        '.jpeg': 'image/jpeg',
        '.png': 'image/png',
        '.gif': 'image/gif',
        '.webp': 'image/webp'
    }
    media_type = media_type_map.get(file_ext, 'image/jpeg')
    
    print(f"   Testing vision with local image...")
    print(f"   Model: {model}")
    print(f"   Image: {image_path} ({media_type})\n")
    
    response = client.chat(
        model=model,
        messages=[
            {
                'role': 'user',
                'content': 'Analyze this image and describe what you see.',
                'images': [image_data]  # Pass base64 string directly, not data URI
            }
        ],
        stream=False
    )
    
    answer = response.message.content
    print(f"Vision Analysis:\n{answer}\n")
    return answer

# Test vision with URL
print("Testing vision with online image:\n")
vision_result = test_vision_with_url()

Testing vision with online image:

   Testing vision capability with gemma3:4b...
   Image URL: https://raw.githubusercontent.com/opencv/opencv/master/samples/data/lena.jpg
   Downloading image...
   ✓ Image downloaded (91814 bytes, image/jpeg)

Vision Analysis:
Okay, here’s a detailed description of what I see in the image:

**Overall Impression:**

The image is a portrait, likely taken in the late 1960s or early 1970s, judging by the fashion and the color palette. It has a slightly vintage, possibly slightly faded, feel due to the color processing. The lighting is warm and creates a soft, flattering effect.

**The Subject:**

*   **Woman:** A young woman is the central focus. She has a relaxed, contemplative expression with her eyes looking slightly to the right.
*   **Hair:** Her hair is long, wavy, and a rich, dark brown color. A strand of hair is draped dramatically over the side of her hat, adding a touch of bohemian style.
*   **Skin Tone:** She appears to have fair skin with a 