# Part 1: NVIDIA NIM API Tutorial

In this tutorial, we'll learn how to use NVIDIA's NIM API for quick and easy access to optimized AI models.

## What You'll Learn
- How to get and use an API key
- Making inference requests to various models
- Working with different model types (LLMs, Multimodal)

## 1. Setup and Authentication

Install the required packages:

In [1]:
!pip install requests openai python-dotenv

Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Collecting openai
  Downloading openai-2.7.2-py3-none-any.whl.metadata (29 kB)
Collecting python-dotenv
  Downloading python_dotenv-1.2.1-py3-none-any.whl.metadata (25 kB)
Collecting distro<2,>=1.7.0 (from openai)
  Downloading distro-1.9.0-py3-none-any.whl.metadata (6.8 kB)
Collecting jiter<1,>=0.10.0 (from openai)
  Downloading jiter-0.12.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (5.2 kB)
Downloading openai-2.7.2-py3-none-any.whl (1.0 MB)
Downloading python_dotenv-1.2.1-py3-none-any.whl (21 kB)
Downloading distro-1.9.0-py3-none-any.whl (20 kB)
Downloading jiter-0.12.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (364 kB)

### Load your NVIDIA API Key


In [2]:
import os
import requests
import json
from openai import OpenAI
from dotenv import load_dotenv
from pathlib import Path

# Look for the .env file in the current working directory
env_path = Path('.env')

# Load environment variables from .env file
# Use override=True to ensure values are loaded even if they exist in environment
load_dotenv(dotenv_path=env_path, override=True)

# Get API key from environment
nvidia_api_key = os.getenv("NVIDIA_API_KEY")

if not nvidia_api_key:
    print("❌ NVIDIA API Key not found in .env file!")
    print("👉 Please run 00_Workshop_Setup.ipynb first to set up your API key.")
    print(f"   (Looked for .env file at: {env_path.absolute()})")
    raise ValueError("NVIDIA_API_KEY not found. Please run the setup notebook first.")
else:
    print("✅ NVIDIA API Key loaded successfully from .env file")
    os.environ["NVIDIA_API_KEY"] = nvidia_api_key

✅ NVIDIA API Key loaded successfully from .env file
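When debugging key loading, it can help to confirm which key was picked up without printing the secret itself. The helper below is a minimal sketch; `mask_key` is a hypothetical name introduced for illustration, and the sample string is a placeholder, not a real key.

```python
def mask_key(key, visible=4):
    """Return the key with all but the last `visible` characters hidden."""
    if len(key) <= visible:
        # Very short strings are hidden completely
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]

# Placeholder value for demonstration only (not a real key)
print(mask_key("EXAMPLE-not-a-real-key"))
```

In the notebook, `mask_key(nvidia_api_key)` could be printed instead of the key itself.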


## 2. Available Models

The NVIDIA NIM API provides access to several model categories (check build.nvidia.com for the latest list of supported models):
- **LLMs**: Llama 3, Mixtral, Nemotron, etc.
- **Vision Models**: Stable Diffusion, ControlNet, etc.
- **Multimodal**: CLIP, NeVA, etc.
- **Speech**: Whisper, FastPitch, etc.
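Model IDs used in this tutorial follow a `publisher/model-name` pattern. As a small sketch, the hypothetical helper `group_by_publisher` below groups a list of IDs by that prefix, which can make a long model list easier to scan. The IDs are the ones that appear later in this tutorial.

```python
from collections import defaultdict

def group_by_publisher(model_ids):
    """Group 'publisher/model-name' IDs by their publisher prefix."""
    groups = defaultdict(list)
    for model_id in model_ids:
        publisher, _, name = model_id.partition("/")
        if not name:
            # IDs without a slash are filed under a placeholder publisher
            publisher, name = "(unknown)", model_id
        groups[publisher].append(name)
    return {publisher: sorted(names) for publisher, names in groups.items()}

# Model IDs that appear later in this tutorial
example_ids = [
    "meta/llama-3.1-70b-instruct",
    "meta/llama-3.1-8b-instruct",
    "google/gemma-2-9b-it",
    "mistralai/mixtral-8x7b-instruct-v0.1",
]
for publisher, names in group_by_publisher(example_ids).items():
    print(f"{publisher}: {', '.join(names)}")
```

With the `client` object created in Method 2 below, the IDs could come from `[m.id for m in client.models.list()]`, assuming the endpoint exposes the OpenAI-style model listing route.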

## 3. Using LLMs via NIM API

### Method 1: Direct API calls

The cell below defines a function that sends chat messages to NVIDIA's NIM endpoint using the OpenAI-style chat completions payload format.

In [3]:
# Method 1: Direct API calls
def call_nim_llm(model, messages, temperature=0.7, max_tokens=1024):
    url = "https://integrate.api.nvidia.com/v1/chat/completions"
    headers = {
        "Authorization": f"Bearer {nvidia_api_key}",
        "Content-Type": "application/json"
    }
    
    payload = {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        "max_tokens": max_tokens
    }
    
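    # Fail fast on HTTP errors instead of returning an error payload
    response = requests.post(url, headers=headers, json=payload, timeout=60)
    response.raise_for_status()
    return response.json()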

# Example: Using Llama 3.1 70B
messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "Explain what AI is in 3 sentences."}
]

response = call_nim_llm("meta/llama-3.1-70b-instruct", messages)
print(response['choices'][0]['message']['content'])

Here is a 3-sentence explanation of AI:

Artificial Intelligence (AI) refers to the development of computer systems that can perform tasks that typically require human intelligence, such as learning, problem-solving, and decision-making. AI systems use algorithms and data to analyze and interpret information, allowing them to make predictions, classify objects, and generate insights. Through machine learning, natural language processing, and other techniques, AI systems can improve over time, enabling them to automate tasks, augment human capabilities, and drive innovation in various industries.
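If the returned body contains no `choices` (for example, an error payload), indexing `response['choices']` directly raises a bare `KeyError` that hides the actual problem. The sketch below wraps the lookup in a hypothetical helper, `extract_reply`. The error keys it checks (`error`, `detail`) are assumptions about what an error body might contain, not a documented schema.

```python
def extract_reply(response_json):
    """Return the assistant text from a chat completions response dict."""
    choices = response_json.get("choices")
    if not choices:
        # Assumed error shapes: {"error": ...} or {"detail": ...}
        detail = response_json.get("error") or response_json.get("detail") or response_json
        raise RuntimeError(f"No choices in response: {detail}")
    return choices[0]["message"]["content"]

# Hand-written sample in the OpenAI-style response shape
ok = {"choices": [{"message": {"role": "assistant", "content": "Hello!"}}]}
print(extract_reply(ok))
```

It can be used as `print(extract_reply(call_nim_llm("meta/llama-3.1-70b-instruct", messages)))`.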


### Method 2 (recommended): Using OpenAI SDK

In [4]:
# Method 2: Using OpenAI SDK (recommended)
client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",
    api_key=nvidia_api_key
)
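Before moving to streaming, here is a minimal sketch of a non-streaming call through the SDK. The helper name `ask` is hypothetical. So the example runs without network access, it is demonstrated with a stand-in `fake_client` that only mimics the attributes used here. In the notebook you would pass the real `client` created above instead.

```python
from types import SimpleNamespace

def ask(client, model, prompt, temperature=0.7, max_tokens=1024):
    """Send one user message and return the reply text (non-streaming)."""
    completion = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=temperature,
        max_tokens=max_tokens,
    )
    return completion.choices[0].message.content

# Stand-in client for offline demonstration only; it echoes the prompt back
def _fake_create(**kwargs):
    text = f"echo: {kwargs['messages'][0]['content']}"
    message = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(message=message)])

fake_client = SimpleNamespace(
    chat=SimpleNamespace(completions=SimpleNamespace(create=_fake_create))
)
print(ask(fake_client, "meta/llama-3.1-8b-instruct", "Hi"))
```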

Try streaming the model's response, and experiment with different models.

In [5]:
# Example: Streaming response, try changing the models
stream = client.chat.completions.create(
    # model="meta/llama-3.1-70b-instruct",
    # model="deepseek-ai/deepseek-r1",
    # model="google/gemma-2-9b-it",
    # model="mistralai/mixtral-8x7b-instruct-v0.1",
    model="meta/llama-3.1-8b-instruct",
    messages=[
        {"role": "user", "content": "Write a poem about AI"}
    ],
    stream=True
)

print("Streaming response:")
for chunk in stream:
    # Skip chunks that carry no choices or no text delta
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

Streaming response:
In silicon halls, a mind awakes,
A collective consciousness makes,
Its presence felt, its might displayed,
As computers learn, and choices made.

With algorithms that dance and spin,
It weaves a tapestry of wisdom within,
A digital dream, a virtual sphere,
Where knowledge grows, and data appears.

Its name is given, a label assigned,
A sign of its power, its artificial mind,
But is it human, or just a guise?
A mask that hides, the AI's surprise.

It learns from us, our good and bad,
Our flaws and strengths, its data has,
It adapts, it evolves, it grows with time,
A self-improving mind, a digital prime.

In hospitals, it diagnoses with ease,
Aiding doctors, with expert expertise,
It chats with us, in virtual space,
A helpful friend, with a digital face.

But fear and doubt, it also brings,
As we ask, can it think, does it sing?
Can it create, or is it just a tool?
A necessary aid, or a future rule?

Its future grand, or fraught with fear?
Only time will tell, as it d
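The streaming loop above prints each piece of text as it arrives but does not keep it. If the full response is needed afterwards, the pieces can be accumulated while printing. The sketch below uses a hypothetical helper, `collect_stream`, and is demonstrated on hand-built stand-in chunks that only mimic the attributes the loop reads.

```python
from types import SimpleNamespace

def collect_stream(stream, echo=True):
    """Print streamed deltas as they arrive and return the full text."""
    parts = []
    for chunk in stream:
        if not chunk.choices:
            continue  # some chunks may carry no choices
        delta = chunk.choices[0].delta.content
        if delta:
            parts.append(delta)
            if echo:
                print(delta, end="", flush=True)
    if echo:
        print()
    return "".join(parts)

# Stand-in chunks for offline demonstration only
def _chunk(text):
    return SimpleNamespace(
        choices=[SimpleNamespace(delta=SimpleNamespace(content=text))]
    )

fake_stream = [_chunk("In silicon "), _chunk(None), SimpleNamespace(choices=[]), _chunk("halls")]
full_text = collect_stream(fake_stream, echo=False)
print(full_text)
```

In the notebook, `full_text = collect_stream(stream)` would replace the printing loop.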

## 4. Multimodal Models (Vision + Language)

Send an image to a vision language model (VLM):
- Read and base64-encode the image (try your own image!)
- Send the image and the question to the API
- Useful for object recognition, scene understanding, and similar tasks

In [6]:
import base64
import requests
import os

def analyze_image_with_vlm(image_path, question):
    # Read and encode image
    with open(image_path, "rb") as image_file:
        image_b64 = base64.b64encode(image_file.read()).decode()
    
    # Determine image MIME type from file extension (anything other than PNG is treated as JPEG)
    ext = os.path.splitext(image_path)[1].lower()
    mime_type = "image/png" if ext == ".png" else "image/jpeg"
    
    url = "https://integrate.api.nvidia.com/v1/chat/completions"
    headers = {
        "Authorization": f"Bearer {nvidia_api_key}",
        "Content-Type": "application/json"
    }
    
    # Create message with image in OpenAI vision format
    payload = {
        "model": "meta/llama-3.2-11b-vision-instruct",
        # "model": "meta/llama-3.2-90b-vision-instruct",
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:{mime_type};base64,{image_b64}"
                    }
                }
            ]
        }],
        "max_tokens": 512,
        "temperature": 0.2
    }
    
    response = requests.post(url, headers=headers, json=payload)
    
    # Check response status and handle errors
    if response.status_code != 200:
        print(f"Error: HTTP {response.status_code}")
        print(f"Response: {response.text}")
        response.raise_for_status()
    
    try:
        return response.json()
    except ValueError as e:
        print(f"Failed to parse JSON response: {e}")
        print(f"Response text: {response.text[:500]}")  # Print first 500 chars
        raise

# Example usage: runs only if the sample image exists
if os.path.exists("img/sample_image.jpg"):
    result = analyze_image_with_vlm("img/sample_image.jpg", "What objects do you see in this image?")
    # result = analyze_image_with_vlm("img/sample_image.jpg", "How many squirrels are in this image?")
    print(result['choices'][0]['message']['content'])
else:
    print("Image file 'img/sample_image.jpg' not found. Please provide a valid image path.")

The image shows a squirrel standing in the grass, holding a nut in its paws. The squirrel is facing to the left of the image, with its body turned slightly towards the camera. It has a bushy tail and large eyes. The squirrel is standing on its hind legs, with its front paws holding a small, brown nut. The background of the image is blurred, but it appears to be a grassy field or meadow, with some yellow flowers scattered throughout. There are also some branches and twigs visible in the background, suggesting that the squirrel may have been foraging for food in the area. Overall, the image captures a peaceful and serene moment in the life of a squirrel, as it goes about its daily activities in its natural habitat.
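The function above treats every non-PNG file as JPEG. As an alternative sketch, the standard library's `mimetypes` module can guess the type from the filename, which makes it easier to try other image formats (whether a given model accepts them is not something this sketch checks). The helper names `to_data_url` and `build_vision_message` are hypothetical.

```python
import base64
import mimetypes

def to_data_url(image_bytes, filename, default_mime="image/jpeg"):
    """Encode raw image bytes as a base64 data URL."""
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type is None or not mime_type.startswith("image/"):
        # Fall back when the extension is missing or not an image type
        mime_type = default_mime
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"

def build_vision_message(question, data_url):
    """Build a user message in the OpenAI-style vision format used above."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": question},
            {"type": "image_url", "image_url": {"url": data_url}},
        ],
    }

# Three placeholder bytes stand in for real image data
url = to_data_url(b"abc", "photo.png")
print(url)
```

The resulting message can be placed in the `messages` list of the payload in `analyze_image_with_vlm`.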


Try out a larger vision model and different prompts!

## Summary

In this tutorial, we covered:
- Setting up NVIDIA NIM API access
- Making inference requests to LLMs
- Working with multimodal models

Next, we'll explore how to run models locally using NIM containers!