# WK03: Models

## Setup

This code imports the functions we need to run our inference pipelines:

In [None]:
from PIL import Image
from transformers import pipeline

### Text Completion

Let's use the GPT2 model to create some text completions:

In [None]:
TEXT_GEN_MODEL = "openai-community/gpt2"

Here we define some phrases that we'll use as sentence starters:

In [None]:
PHRASE_EXAMPLES = [
  "How much wood would a woodchuck chuck if ",
  "I once knew a man from Nantucket, who ",
  "To be or not to be, "
]

We'll use a Transformers pipeline object to run inference:

In [None]:
generator = pipeline("text-generation", model=TEXT_GEN_MODEL)

Here we run the generator on a starter phrase:

In [None]:
generator("To be or not to be, ")
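
`PHRASE_EXAMPLES` holds three starters, but only one of them is run here. The text-generation pipeline returns a list of dicts, each with a `generated_text` key, so a small loop can collect one completion per phrase. The sketch below is one possible way to do that: `complete_all` and `fake_generator` are hypothetical names, and the stand-in generator exists only so the cell runs without downloading a model.

```python
from typing import Callable, Dict, Iterable


def complete_all(generate: Callable[[str], list], prompts: Iterable[str]) -> Dict[str, str]:
  """Map each prompt to the first completion returned by `generate`."""
  completions = {}
  for prompt in prompts:
    outputs = generate(prompt)  # expected shape: [{"generated_text": "..."}]
    completions[prompt] = outputs[0]["generated_text"]
  return completions


# Stand-in for the real pipeline (assumption: same output shape)
def fake_generator(prompt: str) -> list:
  return [{"generated_text": prompt + "..."}]


demo = complete_all(fake_generator, ["To be or not to be, "])
print(demo)
```

With the real pipeline, `complete_all(generator, PHRASE_EXAMPLES)` should return one completion per starter.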

Here we ask for longer responses:

In [None]:
generator("To be or not to be, ", max_length=100, pad_token_id=0)
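
For decoder-only models such as GPT-2, `max_length` counts the prompt tokens plus the generated tokens, so a longer prompt leaves less room for new text. The sketch below only illustrates that arithmetic: `new_token_budget` is a hypothetical helper, and the prompt length of 7 tokens is an assumed value, not a measured one.

```python
def new_token_budget(prompt_tokens: int, max_length: int) -> int:
  """Tokens left for generation when max_length includes the prompt."""
  return max(0, max_length - prompt_tokens)


# Assumed prompt length of 7 tokens, for illustration only
print(new_token_budget(prompt_tokens=7, max_length=100))
print(new_token_budget(prompt_tokens=7, max_length=32))
```

To limit only the generated part, pass `max_new_tokens` instead of `max_length`.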

#### Changing Model

Changing the model is as easy as:

In [None]:
TEXT_GEN_MODEL = "Xenova/llama2.c-stories110M"
generator = pipeline("text-generation", model=TEXT_GEN_MODEL)

Rerun with the new model:

In [None]:
generator("To be or not to be, ", max_length=32, pad_token_id=0)

One last model:

In [None]:
TEXT_GEN_MODEL = "facebook/opt-125m"
generator = pipeline("text-generation", model=TEXT_GEN_MODEL)

generator("To be or not to be, ", max_length=32, pad_token_id=0)

### Text Sentiment Analysis

Define model and create some example phrases:

In [None]:
TEXT_SENT_MODEL = "joeddav/distilbert-base-uncased-go-emotions-student"

EXAMPLE_TEXTS = [
  "What a wonderful day",
  "OMG my head hurts",
  "What am I doing here?"
]

Create inference pipeline object:

In [None]:
analyzer = pipeline("sentiment-analysis", model=TEXT_SENT_MODEL)

Run on the example phrases:

In [None]:
for t in EXAMPLE_TEXTS:
  result = analyzer(t)
  print(t, "->", result[0]["label"])

We can also define our pipeline like this if we want to get scores for all possible sentiments:

In [None]:
full_analyzer = pipeline("sentiment-analysis", model=TEXT_SENT_MODEL, return_all_scores=True)

In [None]:
for t in EXAMPLE_TEXTS:
  result = full_analyzer(t)
  sorted_result = sorted(result[0], key=lambda s: s["score"], reverse=True)
  top_3_labels = [s["label"] for s in sorted_result[:3]]
  print(t, "->", top_3_labels)
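
The sort-and-slice step above can also be wrapped in a small helper. This is a minimal sketch using `heapq.nlargest` from the standard library: `top_labels` is a hypothetical name, and `sample_scores` contains made-up scores in the same shape as one entry of the pipeline output, not real model results.

```python
import heapq


def top_labels(scores, k=3):
  """Return the labels of the k highest-scoring entries."""
  best = heapq.nlargest(k, scores, key=lambda s: s["score"])
  return [s["label"] for s in best]


# Made-up scores for illustration; real values come from full_analyzer(t)[0]
sample_scores = [
  {"label": "neutral", "score": 0.09},
  {"label": "joy", "score": 0.61},
  {"label": "sadness", "score": 0.08},
  {"label": "admiration", "score": 0.22},
]

print(top_labels(sample_scores, k=2))
```

In the loop above, `top_labels(result[0])` would replace the two lines that sort and slice.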

### Image Description

New model definition/location:

In [None]:
IMAGE_CAP_MODEL = "Salesforce/blip-image-captioning-base"

A test image:

In [None]:
test_image = Image.open("./imgs/GDTM.jpg").convert("RGB")
display(test_image)

The inference object:

In [None]:
img_captioner = pipeline(task="image-to-text", model=IMAGE_CAP_MODEL)

Run inference:

In [None]:
result = img_captioner(test_image)
print(result[0]["generated_text"])

Other image description models:
- [`LLAVA`](https://huggingface.co/llava-hf/llava-interleave-qwen-0.5b-hf)
- [`VIT`](https://huggingface.co/nlpconnect/vit-gpt2-image-captioning)

### Next-Word Model

A next-word model, or _fill-mask_ model, can be used to get the probabilities/scores of the words most likely to complete a sentence:

In [None]:
MASK_MODEL = "FacebookAI/xlm-roberta-large"
unmasker = pipeline("fill-mask", model=MASK_MODEL)

In [None]:
unmasker("To be or not to be that is the <mask>")
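
The mask token depends on the model: RoBERTa-style models such as this one use `<mask>`, while BERT-style models use `[MASK]`. One way to keep prompts portable is to write them with a placeholder and substitute the token at run time. The sketch below assumes a `{mask}` placeholder convention, and `build_masked_prompt` is a hypothetical helper.

```python
def build_masked_prompt(template: str, mask_token: str, placeholder: str = "{mask}") -> str:
  """Replace the single placeholder in `template` with the model's mask token."""
  if template.count(placeholder) != 1:
    raise ValueError("template must contain exactly one placeholder")
  return template.replace(placeholder, mask_token)


TEMPLATE = "To be or not to be that is the {mask}"

print(build_masked_prompt(TEMPLATE, "<mask>"))   # RoBERTa-style token
print(build_masked_prompt(TEMPLATE, "[MASK]"))   # BERT-style token
```

With a loaded pipeline, the token can be read from `unmasker.tokenizer.mask_token` rather than typed by hand.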