# This notebook is a simple example of using Hugging Face's transformers library to get started with large language models

In [8]:
# import requirements
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
# Create device agnostic code:
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cpu'

### For our purposes, we want to use smaller models, so that multiple people can test the demo at the same time and we don't use all of Jet's resources

In [9]:
# OPT is an open-source family of language models released in a range of sizes
# The parameter count determines how much VRAM the model uses and affects the complexity/accuracy of the answers it can generate.
model_name = "facebook/opt-350m"
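
To make the link between parameter count and memory concrete, here is a rough back-of-the-envelope sketch. It assumes the nominal parameter counts implied by the model names (not exact figures) and counts only the weights, so activations, the generation cache, and framework overhead come on top. `NOMINAL_PARAMS` and `weight_memory_gib` are hypothetical names introduced for illustration.

```python
# Rough weight-memory estimate: parameter count x bytes per parameter.
# The counts are the nominal sizes from the model names, not exact figures.
NOMINAL_PARAMS = {
    "facebook/opt-125m": 125e6,
    "facebook/opt-350m": 350e6,
    "facebook/opt-1.3b": 1.3e9,
}
BYTES_PER_PARAM = {"float32": 4, "float16": 2}


def weight_memory_gib(num_params: float, dtype: str = "float32") -> float:
    """Return the approximate memory needed to hold the weights, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


for name, num_params in NOMINAL_PARAMS.items():
    fp32 = weight_memory_gib(num_params, "float32")
    fp16 = weight_memory_gib(num_params, "float16")
    print(f"{name}: ~{fp32:.2f} GiB in float32, ~{fp16:.2f} GiB in float16")
```

By this estimate the 350m model needs a bit over 1 GiB for its weights in float32, which is why it is a comfortable choice for a shared machine.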

There are a few different ways to utilize language models with transformers.

The first (and simplest) is to create a pipeline for generation.

In [12]:
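# Pass the device chosen above so the pipeline runs on the GPU when one is available
generator = pipeline('text-generation', model=model_name, max_new_tokens=50, device=device)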
response = generator("Hello, I am conscious and ")

response

[{'generated_text': 'Hello, I am conscious and  I am a very good person. I am a very good person. I am a very good person. I am a very good person. I am a very good person. I am a very good person. I am a very good person. I'}]
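
The repeated sentence in the output above is probably a symptom of the default greedy decoding, which always picks the single most likely next token. As a minimal sketch, generation arguments can be passed when calling the pipeline to sample instead. The specific values below are illustrative assumptions, not tuned settings, and the sampled text will differ from the output shown above.

```python
from transformers import pipeline, set_seed

set_seed(0)  # make the sampled output reproducible from run to run

generator = pipeline("text-generation", model="facebook/opt-350m", max_new_tokens=50)

prompt = "Hello, I am conscious and "
response = generator(
    prompt,
    do_sample=True,          # sample from the distribution instead of always taking the top token
    top_k=50,                # restrict sampling to the 50 most likely tokens (illustrative value)
    temperature=0.7,         # values below 1 sharpen the distribution (illustrative value)
    repetition_penalty=1.2,  # penalize tokens that already appeared (illustrative value)
)
print(response[0]["generated_text"])
```

Small models can still produce odd text with these settings; the point is only that the decoding strategy, not just the model, shapes the result.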

The next way, which allows for more complex use of the model, is to load it with `AutoTokenizer` and `AutoModelForCausalLM`.

In [13]:
tokenizer = AutoTokenizer.from_pretrained(model_name)
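# Move the model to the same device as the inputs; otherwise generation fails when device is "cuda"
model = AutoModelForCausalLM.from_pretrained(model_name).to(device)

# Named `prompt` to avoid shadowing Python's built-in `input`
prompt = "Hello, I am conscious and "

input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)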

outputs = model.generate(input_ids, max_new_tokens=50)
output = tokenizer.decode(outputs[0])

output

'</s>Hello, I am conscious and  I am a very good person. I am a very good person. I am a very good person. I am a very good person. I am a very good person. I am a very good person. I am a very good person. I'
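
The leading `</s>` in the output above is a special token that the OPT tokenizer adds at the start of the sequence. A minimal sketch of hiding it, assuming the same `facebook/opt-350m` tokenizer, is to pass `skip_special_tokens=True` to `decode`. Only the tokenizer is loaded here, so no model weights are needed.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-350m")

# Tokenize the same prompt used above; without return_tensors this is a plain list of ids.
token_ids = tokenizer("Hello, I am conscious and ").input_ids

raw = tokenizer.decode(token_ids)                              # keeps special tokens such as '</s>'
clean = tokenizer.decode(token_ids, skip_special_tokens=True)  # drops them

print(repr(raw))
print(repr(clean))
```

The same keyword works on the generated ids, for example `tokenizer.decode(outputs[0], skip_special_tokens=True)`.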

In [7]:
!nvidia-smi

Fri Aug  4 02:39:03 2023       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 536.67                 Driver Version: 536.67       CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                     TCC/WDDM  | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  NVIDIA GeForce RTX 3080 Ti   WDDM  | 00000000:0B:00.0  On |                  N/A |
|  0%   44C    P5              65W / 350W |   4380MiB / 12288MiB |      2%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                    