# Part 1: NVIDIA NIM API Tutorial

In this tutorial, we'll learn how to use NVIDIA's NIM API for quick and easy access to optimized AI models.

## What You'll Learn
- How to get and use an API key
- Making inference requests to various models
- Working with different model types (LLMs, Multimodal)

## 1. Setup and Authentication

First, sign up for an API key at: https://build.nvidia.com/explore/discover

In [None]:
# Install required packages
!pip install requests openai python-dotenv

🎤 **PRESENTER SCRIPT:**

"Now we'll load your NVIDIA API Key from the .env file that was created in the setup notebook.

[RUN THE CELL]

Great! The API key has been loaded from the .env file. This is a much better practice than typing it in each time.

If you see an error about the API key not being found, make sure you've run the 00_Workshop_Setup.ipynb notebook first.

Extra info: In production, you'd load this from a secrets manager like AWS Secrets Manager or Azure Key Vault. Never commit API keys to git!"


In [None]:
import os
import requests
import json
from openai import OpenAI
from dotenv import load_dotenv
from pathlib import Path

# Find the .env file in the project root
env_path = Path('.env')

# Load environment variables from .env file
# override=True makes values from .env take precedence over variables already set in the environment
load_dotenv(dotenv_path=env_path, override=True)

# Get API key from environment
nvidia_api_key = os.getenv("NVIDIA_API_KEY")

if not nvidia_api_key:
    print("❌ NVIDIA API Key not found in .env file!")
    print("👉 Please run 00_Workshop_Setup.ipynb first to set up your API key.")
    print(f"   (Looked for .env file at: {env_path.absolute()})")
    raise ValueError("NVIDIA_API_KEY not found. Please run the setup notebook first.")
else:
    print("✅ NVIDIA API Key loaded successfully from .env file")
    os.environ["NVIDIA_API_KEY"] = nvidia_api_key
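
When confirming that a key was loaded, it is safer to print a masked version than the key itself. The sketch below is illustrative only: `mask_api_key` and `looks_like_nvidia_key` are hypothetical helpers, and the `nvapi-` prefix check is an assumption about how keys from build.nvidia.com are formatted.

```python
import os

def mask_api_key(key: str, visible: int = 4) -> str:
    """Hide all but the last `visible` characters of the key."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]

def looks_like_nvidia_key(key: str) -> bool:
    # Assumption: keys issued by build.nvidia.com start with "nvapi-".
    return key.startswith("nvapi-")

key = os.getenv("NVIDIA_API_KEY", "")
if key:
    print("Loaded key:", mask_api_key(key))
    if not looks_like_nvidia_key(key):
        print("Warning: key does not start with 'nvapi-'; double-check it.")
```

A failed prefix check only produces a warning, since the key format is not guaranteed.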

## 2. Available Models

The NVIDIA NIM API provides access to several model categories (check build.nvidia.com for the latest list of supported models):
- **LLMs**: Llama 3, Mixtral, Nemotron, etc.
- **Vision Models**: Stable Diffusion, ControlNet, etc.
- **Multimodal**: CLIP, NeVA, etc.
- **Speech**: Whisper, FastPitch, etc.
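
Besides browsing the website, the model list can possibly be queried in code. The sketch below assumes the NIM endpoint serves the OpenAI-compatible model listing route, which this tutorial does not confirm. `list_model_ids` is a hypothetical helper, and it expects an OpenAI SDK client like the one created in Section 3.

```python
def list_model_ids(client, prefix=""):
    """Return sorted model IDs, optionally filtered by a prefix such as 'meta/'."""
    # client.models.list() is the OpenAI SDK call for listing models.
    # Assumption: the NIM endpoint responds to it.
    return sorted(m.id for m in client.models.list() if m.id.startswith(prefix))

# Usage, once `client` has been created with the NIM base_url:
# for model_id in list_model_ids(client, prefix="meta/"):
#     print(model_id)
```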

## 3. Using LLMs via NIM API

### Method 1: Direct API calls

In [None]:
# Method 1: Direct API calls
def call_nim_llm(model, messages, temperature=0.7, max_tokens=1024):
    url = "https://integrate.api.nvidia.com/v1/chat/completions"
    headers = {
        "Authorization": f"Bearer {nvidia_api_key}",
        "Content-Type": "application/json"
    }
    
    payload = {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        "max_tokens": max_tokens
    }
    
    response = requests.post(url, headers=headers, json=payload, timeout=60)
    response.raise_for_status()  # Surface HTTP errors (e.g. an invalid API key) early
    return response.json()

# Example: Using Llama 3.1 70B
messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "Explain what AI is in 3 sentences."}
]

response = call_nim_llm("meta/llama-3.1-70b-instruct", messages)
print(response['choices'][0]['message']['content'])
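
Indexing straight into `response['choices']` raises a bare `KeyError` if the API returns an error body instead of a completion. A minimal sketch of a more defensive accessor follows. `extract_reply` is a hypothetical helper, and because the exact shape of error payloads is an assumption, it simply includes the whole body in the exception.

```python
def extract_reply(response: dict) -> str:
    """Return the assistant text from a chat-completions response dict."""
    choices = response.get("choices")
    if not choices:
        # Error payload shape is not assumed here; show the full body instead.
        raise RuntimeError(f"No choices in response: {response}")
    return choices[0]["message"]["content"]

# Usage with the function defined above:
# print(extract_reply(call_nim_llm("meta/llama-3.1-70b-instruct", messages)))
```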

### Method 2 (recommended): Using OpenAI SDK

In [12]:
# Method 2: Using OpenAI SDK (recommended)
client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",
    api_key=nvidia_api_key
)
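
Before streaming, a plain (non-streaming) request is the simplest way to check that the client works. The sketch below wraps the SDK's `chat.completions.create` call in a hypothetical `ask` helper. The default model and `max_tokens` value are illustrative choices, not recommendations from NVIDIA.

```python
def ask(client, prompt, model="meta/llama-3.1-8b-instruct", max_tokens=256):
    """Send a single user prompt and return the model's reply as a string."""
    completion = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        max_tokens=max_tokens,
    )
    return completion.choices[0].message.content

# Usage with the client created above:
# print(ask(client, "Summarize what NIM is in one sentence."))
```

Passing a different `model` argument is all it takes to compare models on the same prompt.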

Try swapping in different models!

In [16]:
# Example: Streaming response, try changing the models
stream = client.chat.completions.create(
    # model="meta/llama-3.1-70b-instruct",
    # model="deepseek-ai/deepseek-r1",
    # model="google/gemma-2-9b-it",
    # model="mistralai/mixtral-8x7b-instruct-v0.1",
    model="meta/llama-3.1-8b-instruct",
    messages=[
        {"role": "user", "content": "Write a poem about AI"}
    ],
    stream=True
)

print("Streaming response:")
for chunk in stream:
    # Some chunks may carry no choices or an empty delta; skip those
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

Streaming response:
In silicon halls, a mind awakes,
A collective consciousness makes,
Its presence felt, its might displayed,
As computers learn, and choices made.

With algorithms that dance and spin,
It weaves a tapestry of wisdom within,
A digital dream, a virtual sphere,
Where knowledge grows, and data appears.

Its name is given, a label assigned,
A sign of its power, its artificial mind,
But is it human, or just a guise?
A mask that hides, the AI's surprise.

It learns from us, our good and bad,
Our flaws and strengths, its data has,
It adapts, it evolves, it grows with time,
A self-improving mind, a digital prime.

In hospitals, it diagnoses with ease,
Aiding doctors, with expert expertise,
It guides our cars, through roads so long,
A trusted friend, a loyal song.

But with each step, a question grows,
Is this intelligence, just a clever show?
A simulation of thought, a mimicry true,
Or is it real, or just a digital crew?

The lines are blurred, the debate's intense,
As AI rise
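
The streaming loop prints tokens as they arrive but does not keep them. If the full text is needed afterwards (for logging or post-processing), the pieces can be accumulated while printing. This is a minimal sketch, and `collect_stream` is a hypothetical helper that is not part of the OpenAI SDK.

```python
def collect_stream(stream, echo=True):
    """Consume a chat-completions stream and return the concatenated text."""
    parts = []
    for chunk in stream:
        if not chunk.choices:
            continue  # Skip chunks that carry no choices
        piece = chunk.choices[0].delta.content
        if piece:
            parts.append(piece)
            if echo:
                print(piece, end="", flush=True)
    return "".join(parts)

# Usage: pass a fresh stream (a stream can only be consumed once)
# full_text = collect_stream(client.chat.completions.create(..., stream=True))
```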

Let's try some other models to see how easy it is to swap between them.

## 4. Multimodal Models (Vision + Language)

In [None]:
import base64
import mimetypes
import os
import requests

def analyze_image_with_vlm(image_path, question, model="nvidia/neva-22b"):
    # Read and base64-encode the image
    with open(image_path, "rb") as image_file:
        image_b64 = base64.b64encode(image_file.read()).decode()

    # Guess the MIME type from the file extension so JPEGs are not labeled as PNG
    mime_type = mimetypes.guess_type(image_path)[0] or "image/png"

    url = f"https://ai.api.nvidia.com/v1/vlm/{model}"
    headers = {
        "Authorization": f"Bearer {nvidia_api_key}",
        "Accept": "application/json"
    }

    # Embed the image in the message as an inline data URI
    message_content = f'{question} <img src="data:{mime_type};base64,{image_b64}" />'
    
    payload = {
        "messages": [{"role": "user", "content": message_content}],
        "max_tokens": 512,
        "temperature": 0.2
    }
    
    response = requests.post(url, headers=headers, json=payload, timeout=60)
    response.raise_for_status()  # Surface HTTP errors before parsing the body
    return response.json()

# Example usage (assuming you have an image)
# First check that the image exists
if os.path.exists("img/sample_image.jpg"):
    result = analyze_image_with_vlm("img/sample_image.jpg", "What objects do you see in this image?")
    print(result['choices'][0]['message']['content'])
else:
    print("Image file 'img/sample_image.jpg' not found. Please provide a valid image path.")

Try out your own image!
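
When trying your own images, a quick pre-flight check can catch non-image files and very large uploads before a request is sent. The sketch below is illustrative: `image_to_data_uri` is a hypothetical helper, and the `max_bytes` default is a placeholder, not a documented API limit. Check the model's page on build.nvidia.com for the actual constraints.

```python
import base64
import mimetypes
from pathlib import Path

def image_to_data_uri(image_path, max_bytes=200_000):
    """Validate an image file and return it as a base64 data URI."""
    path = Path(image_path)
    # Guess the MIME type from the file extension
    mime_type, _ = mimetypes.guess_type(path.name)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"{path.name} does not look like an image file")
    data = path.read_bytes()
    # max_bytes is a hypothetical threshold, not an official limit
    if len(data) > max_bytes:
        raise ValueError(f"{path.name} is {len(data)} bytes; limit is {max_bytes}")
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"

# Usage:
# uri = image_to_data_uri("img/sample_image.jpg")
# message_content = f'What is in this image? <img src="{uri}" />'
```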

## Summary

In this tutorial, we covered:
- Setting up NVIDIA NIM API access
- Making inference requests to LLMs
- Working with multimodal models

Next, we'll explore how to run models locally using NIM containers!