# Getting Started with NVIDIA

This notebook demonstrates how to run inference against a model hosted remotely on [NVIDIA](https://build.nvidia.com/explore/discover).

In [None]:
!pip install git+https://github.com/ibm-granite-community/utils langchain langchain_nvidia_ai_endpoints

## Accessing NVIDIA services

### Establish NVIDIA Account

Sign up for an account at [NVIDIA](https://build.nvidia.com/explore/discover).

During the signup process, generate an `NVIDIA_API_KEY`.

There are three ways to provide this value to the cells below. In order of precedence:

1. As an environment variable
2. As a Google Colab secret
3. Entered by the user at a `getpass()` prompt

In [None]:
from ibm_granite_community.notebook_utils import get_env_var

NVIDIA_API_KEY = get_env_var('NVIDIA_API_KEY')
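
The precedence order described above can be sketched in plain Python. This is only an illustration of the lookup order, not the actual implementation of `get_env_var`; the function name `resolve_api_key` is hypothetical, and the Colab step assumes the `google.colab.userdata` module, which is only importable inside Google Colab.

```python
import os
from getpass import getpass


def resolve_api_key(name: str) -> str:
    """Hypothetical helper: look up a secret in the order the notebook describes."""
    # 1. Environment variable takes precedence.
    value = os.environ.get(name)
    if value:
        return value

    # 2. Google Colab secret (assumption: google.colab is only available in Colab).
    try:
        from google.colab import userdata

        value = userdata.get(name)
        if value:
            return value
    except Exception:
        # Not running in Colab, or the secret is missing or not accessible.
        pass

    # 3. Fall back to asking the user without echoing the input.
    return getpass(f"Enter {name}: ")
```

Calling `resolve_api_key("NVIDIA_API_KEY")` returns immediately when the environment variable is set, so the interactive prompt only appears as a last resort.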

## Querying the model with LangChain

### Choose a model

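Granite models available on NVIDIA, listed here by their Hugging Face names. The client in this notebook uses the `ibm/` prefix instead, as in `ibm/granite-3.0-8b-instruct`.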

- [`ibm-granite/granite-3.0-8b-instruct`](https://build.nvidia.com/ibm/granite-3_0-8b-instruct)
- [`ibm-granite/granite-guardian-3.0-8B`](https://build.nvidia.com/ibm/granite-guardian-3_0-8b)
- [`ibm-granite/granite-3.0-3b-a800m-instruct`](https://build.nvidia.com/ibm/granite-3_0-3b-a800m-instruct)

### Instantiate the Model Client

In [None]:
from langchain_nvidia_ai_endpoints import ChatNVIDIA

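# Chat client for the hosted Granite model; the sampling settings apply to every call.
client = ChatNVIDIA(
    model="ibm/granite-3.0-8b-instruct",
    api_key=NVIDIA_API_KEY,
    temperature=0.2,  # low temperature for more deterministic output
    top_p=0.7,        # nucleus sampling cutoff
    max_tokens=1024,  # upper bound on the number of generated tokens
)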

### Perform Inference

In [None]:
prompt = "Write a ballad about IBM"

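# Stream the reply and print each chunk as it arrives.
for chunk in client.stream([{"role": "user", "content": prompt}]):
    print(chunk.content, end="", flush=True)
print()  # finish the streamed output with a newline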
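
The streaming loop above prints the reply but does not keep it. A minimal sketch of collecting the streamed text into one string follows. The helper name `collect_stream` is hypothetical, and it assumes each chunk exposes its text as a string in a `.content` attribute, as the loop above does. Stand-in chunks are used here so the example runs without an API key.

```python
from types import SimpleNamespace
from typing import Iterable


def collect_stream(chunks: Iterable, echo: bool = True) -> str:
    """Hypothetical helper: join the .content of each streamed chunk into one string."""
    parts = []
    for chunk in chunks:
        text = chunk.content  # assumption: content is a plain string
        if echo:
            print(text, end="", flush=True)
        parts.append(text)
    if echo:
        print()  # end the echoed output with a newline
    return "".join(parts)


# Stand-in chunks that mimic objects with a .content attribute.
fake_chunks = [SimpleNamespace(content=t) for t in ("Big ", "Blue ", "ballad")]
full_text = collect_stream(fake_chunks, echo=False)
```

With the notebook's client, the same helper could be called as `collect_stream(client.stream([{"role": "user", "content": prompt}]))` to both display and retain the reply.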
