# 1. Prerequisites
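Python 3.6 or later (the examples use f-strings)

requests library (pip install requests)

An NVIDIA NIM API key, needed for the requests below (get one at build.nvidia.com by clicking "Get API Key")

(Optional) openai library, used only in section 6 (pip install openai)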

# 2. Find and Get Your API Key
Go to build.nvidia.com.

Pick a model (e.g., Mixtral-8x7b, Llama).

Click "Get API Key" and save the value (it starts with nvapi-...).
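
Rather than pasting the key into the notebook, you may prefer to read it from the environment. This is a minimal sketch; the variable name NVIDIA_API_KEY and the helper load_api_key are assumptions made for illustration, not names required by NVIDIA.

```python
import os


def load_api_key(env_var: str = "NVIDIA_API_KEY") -> str:
    """Return the API key stored in an environment variable.

    The variable name is an assumption for this sketch; use any name you like.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key.")
    return key
```

With the variable exported in your shell, API_KEY = load_api_key() can stand in for the hard-coded string used in the next section.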

# 3. Make Your First API Request

In [None]:
import requests
import json

API_URL = "https://integrate.api.nvidia.com/v1/chat/completions"
API_KEY = "nvapi-..."  # Replace with your key

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {API_KEY}"
}

data = {
    "model": "mistralai/mixtral-8x7b-instruct-v0.1",
    "messages": [{"role": "user", "content": "Say hello in French!"}],
    "temperature": 0.5,
    "max_tokens": 100,
}

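# Send the request; the timeout avoids hanging forever on a stalled connection
response = requests.post(API_URL, headers=headers, json=data, timeout=30)
response.raise_for_status()  # raise on 4xx/5xx (e.g. an invalid key)
print(response.json())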
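
The printed JSON contains more than the reply text. Assuming the endpoint returns the OpenAI-style chat completion shape (the reply under choices[0].message.content), a small helper can pull out just the text. The helper name extract_reply and the sample payload are made up for illustration.

```python
def extract_reply(payload: dict) -> str:
    """Return the assistant's text from an OpenAI-style chat completion payload."""
    choices = payload.get("choices") or []
    if not choices:
        # Error payloads typically carry no "choices"; surface them unchanged.
        raise ValueError(f"No choices in response: {payload}")
    return choices[0]["message"]["content"]


# Hand-written sample in the assumed shape, so the sketch runs offline
sample = {"choices": [{"message": {"role": "assistant", "content": "Bonjour !"}}]}
print(extract_reply(sample))
```

On a live call this would be extract_reply(response.json()).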

# 4. See All Available Models

In [None]:
models_url = "https://integrate.api.nvidia.com/v1/models"
response = requests.get(models_url, headers=headers)
models = response.json().get("data", [])
for model in models:
    print(model.get("id"))
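
The model list can be long, so it may help to filter it by keyword. The sketch below works on the same list of dicts that the cell above reads from "data"; filter_model_ids is a hypothetical helper, and the second sample id is invented for the demo.

```python
def filter_model_ids(models: list, keyword: str) -> list:
    """Return the sorted model ids that contain keyword (case-insensitive)."""
    keyword = keyword.lower()
    ids = (m.get("id", "") for m in models)
    return sorted(i for i in ids if i and keyword in i.lower())


# Sample entries shaped like the "data" list; the second id is made up
sample_models = [
    {"id": "mistralai/mixtral-8x7b-instruct-v0.1"},
    {"id": "example/llama-demo"},
]
print(filter_model_ids(sample_models, "mixtral"))
```

Against the live list, filter_model_ids(models, "llama") would show only the ids containing "llama".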


# 5. Streaming Responses
To display tokens as they arrive (like a chatbot), request a streamed response:

In [None]:
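# Ask the server to stream; without "stream": True the reply arrives as one JSON body
stream_headers = {**headers, "Accept": "text/event-stream"}
stream_data = {**data, "stream": True}
response = requests.post(API_URL, headers=stream_headers, json=stream_data, stream=True)

def stream_response(resp):
    for line in resp.iter_lines():
        if not line or not line.startswith(b"data: "):
            continue  # skip blank keep-alive lines and non-data fields
        payload = line[6:]
        if payload.strip() == b"[DONE]":  # end-of-stream sentinel, not JSON
            break
        choices = json.loads(payload).get("choices") or []
        if choices:
            print(choices[0].get("delta", {}).get("content") or "", end="", flush=True)

stream_response(response)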
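
If you need the full text afterwards instead of printing it piece by piece, the same parsing can accumulate the chunks. This sketch assumes OpenAI-style server-sent events, where each line looks like data: {...} and the stream ends with a data: [DONE] sentinel that is not JSON. collect_stream_text is a hypothetical helper, and the sample lines are hand-written so the block runs offline.

```python
import json


def collect_stream_text(lines) -> str:
    """Join the content deltas from an iterable of SSE byte lines."""
    parts = []
    for line in lines:
        if not line or not line.startswith(b"data: "):
            continue  # blank separators and non-data fields
        body = line[len(b"data: "):]
        if body.strip() == b"[DONE]":
            break  # end-of-stream sentinel
        choices = json.loads(body).get("choices") or []
        if choices:
            parts.append(choices[0].get("delta", {}).get("content") or "")
    return "".join(parts)


# Hand-written lines in the assumed wire format
sample_lines = [
    b'data: {"choices": [{"delta": {"content": "Bon"}}]}',
    b"",
    b'data: {"choices": [{"delta": {"content": "jour"}}]}',
    b"data: [DONE]",
]
print(collect_stream_text(sample_lines))  # Bonjour
```

With a live streamed response, the call would be collect_stream_text(response.iter_lines()).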


# 6. Using the OpenAI Python Client (Optional)
If you’re familiar with the OpenAI API, you can point its Python client at the NVIDIA endpoint by overriding base_url:

In [None]:
from openai import OpenAI

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",
    api_key=API_KEY
)

response = client.chat.completions.create(
    model="mistralai/mixtral-8x7b-instruct-v0.1",
    messages=[{"role": "user", "content": "Hello World!"}],
    temperature=1,
    max_tokens=100
)
print(response.choices[0].message.content)
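
The OpenAI client can also stream: passing stream=True makes create() return an iterator of chunks whose text sits in choices[0].delta.content. A minimal sketch follows, with stream_chat as a hypothetical wrapper. It takes the client as an argument, so it works with the client object created above.

```python
def stream_chat(client, prompt: str,
                model: str = "mistralai/mixtral-8x7b-instruct-v0.1"):
    """Yield text pieces from a streamed chat completion as they arrive."""
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    for chunk in stream:
        # Some chunks carry no choices or an empty delta; skip those.
        if chunk.choices and chunk.choices[0].delta.content:
            yield chunk.choices[0].delta.content
```

Typical use is a loop such as: for piece in stream_chat(client, "Hello World!"): print(piece, end="").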


# 7. Summary
You can try out many high-end AI models by getting an API key and sending a couple of lines of Python code.

Use /v1/models to see what’s available.

POST your prompt to /v1/chat/completions to get a response.

Use streaming for real-time output.

Optional: Connect via the OpenAI Python client for even easier use.