# Text prediction/text generation

Now that we have established an understanding of how text embeddings work, we will use those embeddings to try to guess the next word in a sequence. It's normal for the cells below to take a bit of time to run.

In [1]:
from transformers import GPT2Tokenizer, AutoModelForCausalLM
import numpy as np

# Here we grab gpt2 tokenizer and model from the hub. You can also use your own model.
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("openai-community/gpt2")
tokenizer.pad_token_id = tokenizer.eos_token_id


# Make the sequence we are trying to get the next token for.
current_sequence = "The Boston Red"
inputs = tokenizer(current_sequence, return_tensors="pt")

# Here we ask the model to generate the next 10 tokens and keep the score of each chosen token so we can compare the probabilities.
outputs = model.generate(**inputs, max_new_tokens=10, return_dict_in_generate=True, output_scores=True)
transition_scores = model.compute_transition_scores(
    outputs.sequences, outputs.scores, normalize_logits=True
)

# Here we separate the newly generated tokens from the prompt tokens and print each one with its probability
input_length = inputs.input_ids.shape[1]
generated_tokens = outputs.sequences[:, input_length:]
for tok, score in zip(generated_tokens[0], transition_scores[0]):
    # The output you'll see is:
    # | token | token string | log probability | probability
    print(f"| {tok:5d} | {tokenizer.decode(tok):8s} | {score.numpy():.3f} | {np.exp(score.numpy()):.2%}")


Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


| 17689 |  Sox     | -0.028 | 97.25%
|   389 |  are     | -2.219 | 10.87%
|   287 |  in      | -3.029 | 4.84%
|   262 |  the     | -1.113 | 32.87%
| 15925 |  midst   | -0.623 | 53.62%
|   286 |  of      | -0.001 | 99.86%
|   257 |  a       | -0.972 | 37.85%
| 25448 |  rebuilding | -2.890 | 5.56%
|  1429 |  process | -0.422 | 65.56%
|   326 |  that    | -1.493 | 22.46%


# Sequence Generation

Sequence generation is simply running the above procedure repeatedly to produce more than one word: each predicted token is appended to the sequence, and the model is asked again.
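The loop itself can be sketched without a real model. In the toy example below, `NEXT_WORD_PROBS` and `generate` are hypothetical names, and the lookup table is made up for illustration. A real language model conditions on the entire sequence so far, not only on the last word, but the repeat-and-append structure is the same.

```python
# Hypothetical next-word probabilities standing in for a real language model.
NEXT_WORD_PROBS = {
    "The": {"Boston": 0.6, "team": 0.4},
    "Boston": {"Red": 0.9, "area": 0.1},
    "Red": {"Sox": 0.97, "Line": 0.03},
    "Sox": {"are": 0.7, "won": 0.3},
    "are": {"in": 0.55, "a": 0.45},
}


def generate(words, max_new_words=5):
    """Repeatedly append the most likely next word to the sequence."""
    words = list(words)
    for _ in range(max_new_words):
        candidates = NEXT_WORD_PROBS.get(words[-1])
        if not candidates:  # no known continuation, so stop early
            break
        next_word = max(candidates, key=candidates.get)
        words.append(next_word)
    return words


print(" ".join(generate(["The", "Boston"], max_new_words=3)))
```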

In [3]:
import torch
from transformers import pipeline


def continue_sequence(input, model, temperature=0.7, top_k=50, top_p=0.95, max_new_tokens=50):
    """
    Generate new text based on the input sequence.
    """
    # We set up the pipeline to use the model and generate text
    # The torch dtype is set to bfloat16 for a smaller memory footprint
    text_continuation = pipeline("text-generation", model=model, do_sample=True, temperature=temperature, top_k=top_k, top_p=top_p, torch_dtype=torch.bfloat16)

    # Generate up to max_new_tokens new tokens (50 by default)
    outputs = text_continuation(input, max_new_tokens = max_new_tokens)

    return outputs

# Here we are trying out a different model. 
# There are many models at https://huggingface.co/models
# This is a smaller model that is more efficient to run on a CPU
model = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"

# The starting text that the model will continue
stub = "The Boston Red Sox are"

# Generate new text based on the input sequence
model_output = continue_sequence(stub, model)

print(f"Model Output: {model_output[0]['generated_text']}")

Device set to use mps:0


Model Output: The Boston Red Sox are a professional baseball team based in Boston, Massachusetts. They are currently members of the American League East division. The Red Sox have won three World Series championships, in 1904, 1912, and 20


### Multiple candidate sequences

Since the models have some randomness in the next word they select, they generate different options from the same input.
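The amount of randomness is controlled in part by the `temperature` argument passed to `continue_sequence`. As a rough sketch of the idea, the scores for each candidate are divided by the temperature before being turned into probabilities, so a low temperature concentrates probability on the top candidate and a high temperature spreads it out. The candidate words and scores below are invented for illustration, and `softmax_with_temperature` is a hypothetical helper, not a library function.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical candidate words and scores (not taken from a real model).
candidates = ["a", "in", "the", "playing"]
logits = [2.0, 1.5, 1.0, 0.2]

rng = random.Random(0)  # seeded only so the sketch is repeatable
for temperature in (0.2, 0.7, 1.5):
    probs = softmax_with_temperature(logits, temperature)
    picks = rng.choices(candidates, weights=probs, k=5)
    print(temperature, [round(p, 2) for p in probs], picks)
```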

In [4]:
# Reusing the model from above
model = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"

# Feel free to change this text! 
stub = "The Boston Red Sox are"

# Generate new text based on the input sequence
num_sequences_to_generate = 5
for i in range(num_sequences_to_generate):
    model_output = continue_sequence(stub, model)
    print(f"Model Output {i}: {model_output[0]['generated_text']}")

Device set to use mps:0


Model Output 0: The Boston Red Sox are taking on the Toronto Blue Jays in the American League Championship Series, and here's what you need to know about the teams and their matchups:

1. Boston Red Sox: Boston has a 3-1 lead in the


Device set to use mps:0


Model Output 1: The Boston Red Sox are 4-1 against the Los Angeles Angels in the regular season, and the teams split their two postseason matchups. The Red Sox won the first two games of the series, but the Angels won the next three. The Ang


Device set to use mps:0


Model Output 2: The Boston Red Sox are seeking to end their nine-year playoff drought and return to the World Series.
They've done it before, but it hasn't happened yet.
The team, which has won 28 National League East division titles,


Device set to use mps:0


Model Output 3: The Boston Red Sox are a professional baseball team based in Boston, Massachusetts. They play in the Eastern Division of Major League Baseball's American League. The team was founded in 1901 and has won five World Series championships, most recently in 20


Device set to use mps:0


Model Output 4: The Boston Red Sox are 3-1 on their current 10-game homestand and are still in first place in the American League East. The Red Sox are just three games back of the Yankees for first place in the division.
The


### Removing Randomness

We can also remove this randomness by always generating the most likely next token. This is not generally advised but is instructive.
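To see why restricting the candidates removes the randomness, here is a small sketch of top-k sampling over an invented set of next-word probabilities. `sample_top_k` is a hypothetical helper written for this illustration. When `k` is 1, only the most likely word survives the filter, so the "sample" is the same every time.

```python
import random


def sample_top_k(probabilities, k, rng):
    """Keep the k most likely candidates, then sample among them."""
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    kept = ranked[:k]
    words = [word for word, _ in kept]
    weights = [prob for _, prob in kept]
    return rng.choices(words, weights=weights, k=1)[0]


# Hypothetical next-word probabilities (not taken from a real model).
next_word_probs = {"a": 0.40, "in": 0.30, "the": 0.20, "playing": 0.10}

rng = random.Random()
greedy_picks = {sample_top_k(next_word_probs, k=1, rng=rng) for _ in range(20)}
varied_picks = {sample_top_k(next_word_probs, k=4, rng=rng) for _ in range(20)}
print(greedy_picks)  # always a single word when k=1
print(varied_picks)
```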

In [None]:
# Reusing the model from above
model = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"

# Feel free to change this text! 
stub = "The Boston Red Sox are"

# Generate new text based on the input sequence
num_sequences_to_generate = 2
for i in range(num_sequences_to_generate):
    # Here we are setting the top_k to 1 to get the most likely next token which removes the randomness and makes it deterministic
    model_output = continue_sequence(stub, model, top_k=1)
    print(f"Model Output {i}: {model_output[0]['generated_text']}")

Device set to use mps:0


Model Output 0: The Boston Red Sox are a professional baseball team based in Boston, Massachusetts. They are a member club of the American League (AL) East division of Major League Baseball (MLB). The Red Sox have won 27 World Series championships, more than any other


Device set to use mps:0


Model Output 1: The Boston Red Sox are a professional baseball team based in Boston, Massachusetts. They are a member club of the American League (AL) East division of Major League Baseball (MLB). The Red Sox have won 27 World Series championships, more than any other


# Chatting with models

Sequence generation is interesting, but let's set up an actual Q&A model like the ones we see in popular products such as ChatGPT.

In [None]:
import torch
from transformers import pipeline

model = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"

# Download the model and specify the task
# The torch dtype is set to bfloat16 for a smaller memory footprint
pipe = pipeline("text-generation", model=model, torch_dtype=torch.bfloat16)

# Set the system prompt and user message
system_prompt = "You are a friendly chatbot who always responds with extremely relevant information and provides concise answers."
user_message = "How do I best use AI for education?"

messages = [
    {
        "role": "system",
        "content": system_prompt,
    },
    {
        "role": "user", 
        "content": user_message,
    },
]

prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipe(prompt, max_new_tokens=400, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])

Device set to use mps:0


<|system|>
You are a friendly chatbot who always responds with extremely relevant information and provides concise answers.</s>
<|user|>
How do I best use AI for education?</s>
<|assistant|>
AI has become an essential tool for education, providing various benefits and enhancing the quality and relevance of learning experiences for students. Here are a few ways in which AI can be used for education:

1. Personalized Learning: AI-based tools can analyze student data and provide personalized learning experiences. For instance, students can access content based on their learning style, progress, and interests.

2. Automated Assessments: AI-based tools can automate assessments, freeing up teachers' time for more meaningful tasks. Students can receive feedback on their performance immediately after the assessment.

3. Teacher Training: AI-based tools can provide teachers with data-driven insights on their students' learning. This data can be used to improve teaching techniques, such as class
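The printed output begins with the prompt itself, which shows what `apply_chat_template` produced: each message is wrapped in a role tag and closed with `</s>`, and an open `<|assistant|>` tag is added so that the model writes the reply next. The function below is a hand-rolled imitation of that layout, inferred only from the output above. `format_chat` is a hypothetical name, the exact whitespace of the real template may differ, and in practice the tokenizer's own template should be used.

```python
def format_chat(messages, add_generation_prompt=True):
    """Hand-rolled imitation of the prompt layout printed above."""
    parts = []
    for message in messages:
        # Each turn is a role tag, the content, and an end-of-sequence marker.
        parts.append(f"<|{message['role']}|>\n{message['content']}</s>\n")
    if add_generation_prompt:
        # An open assistant tag cues the model to write its reply next.
        parts.append("<|assistant|>\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a friendly chatbot."},
    {"role": "user", "content": "How do I best use AI for education?"},
]
print(format_chat(messages))
```

Seen this way, a chat model is still doing sequence continuation: the conversation is flattened into one string, and the model continues it from the assistant tag.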

### Augmenting based on user data

One neat trick that is frequently used to get more specific content out of a model is to attach relevant information to the system prompt. Sometimes a retrieval step is used to find that information first; this is called retrieval-augmented generation (RAG), and it is the basis of a lot of tools.
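As a rough sketch of what a retrieval step might look like, the example below picks the snippet that shares the most words with the question and attaches it to the system prompt. The snippets, the `tokenize` and `retrieve` helpers, and the word-overlap scoring are all assumptions made for illustration. Real systems typically compare text embeddings rather than counting shared words.

```python
def tokenize(text):
    """Lowercase a string and split it into a set of words."""
    return set(text.lower().replace("?", "").replace(".", "").split())


def retrieve(question, documents):
    """Return the document sharing the most words with the question."""
    question_words = tokenize(question)
    return max(documents, key=lambda doc: len(question_words & tokenize(doc)))


# Hypothetical snippets standing in for a real document store.
documents = [
    "The session ends with worked examples in Python.",
    "Lunch is served in the main hall at noon.",
    "Facilitators are data scientists with industry experience.",
]

question = "Who are the facilitators?"
context = retrieve(question, documents)
system_prompt = "Use the following information to inform your responses: " + context
print(system_prompt)
```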

For our example we will take the description from this session and add it to the prompt.

In [8]:
import torch
from transformers import pipeline

model = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"

# Download the model and specify the task
# The torch dtype is set to bfloat16 for a smaller memory footprint
pipe = pipeline("text-generation", model=model, torch_dtype=torch.bfloat16)

# This is the session description that we will add to the system prompt
session_description = "This session is for any data analyst who wants to build a foundation for machine learning and AI using what they already know with traditional analytic methods. It provides a broad overview and motivating examples across different tasks including analyzing textual data, predicting which of two or more groups individuals will belong to in the future, and forecasting future metrics plus next word prediction just like fancy generative AI models. Importantly, the information will be presented in a way to build a bridge between traditional analytic methods and artificial intelligence: spelling out what is common and what is unique, “translating” AI jargon, and pointing out where AI methods are similar to other tried-and-true methods familiar to data analysts. For the areas covered, the session will show how the methods have evolved to reach their current state, what AI is doing behind the scenes with the data, and how these methods can be useful when working with educational data. The session will end with worked examples featuring simple code in Python, but no prior experience in Python is needed. Although the session covers a lot of ground, attendees can work at their own pace thanks to instructions, code examples, and other resources that they can return to after convening. Facilitators are data scientists with a combined 15 years of industry experience working with real world data—primarily K-12 educational data—who want to prove that if you have written code in any language to analyze quantitative data before, you are well situated to apply methods used in machine learning and AI.  This is the session for data analysts who have wanted to try their hand at using AI in their work but would like guidance in determining how, where, or when to start."

# Set the system prompt and user message
system_prompt = "You are a friendly chatbot who always responds with extremely relevant information and provides concise answers. Use the following session description to inform your responses: " + session_description
user_message = "What are the learning outcomes of this session?"

messages = [
    {
        "role": "system",
        "content": system_prompt,
    },
    {
        "role": "user", 
        "content": user_message,
    },
]

prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipe(prompt, max_new_tokens=400, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])


Device set to use mps:0


<|system|>
You are a friendly chatbot who always responds with extremely relevant information and provides concise answers. Use the following session description to inform your responses: This session is for any data analyst who wants to build a foundation for machine learning and AI using what they already know with traditional analytic methods. It provides a broad overview and motivating examples across different tasks including analyzing textual data, predicting which of two or more groups individuals will belong to in the future, and forecasting future metrics plus next word prediction just like fancy generative AI models. Importantly, the information will be presented in a way to build a bridge between traditional analytic methods and artificial intelligence: spelling out what is common and what is unique, “translating” AI jargon, and pointing out where AI methods are similar to other tried-and-true methods familiar to data analysts. For the areas covered, the session will show ho