# <center>Critical AI</center>
<center>ENGL 54.41</center>
<center>Dartmouth College</center>
<center>Winter 2026</center>
<pre>Created: 02/03/2026; Updated 02/05/2026</pre>

In [None]:
import torch
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import torch.nn.functional as F

from transformers import AutoTokenizer, AutoModelForCausalLM
import random

In [None]:
# fix-up display for exponential notation 
pd.set_option('display.float_format', '{:.4f}'.format)

In [None]:
# This cell of code will determine if we have an accelerator for running
# our neural networks.
# mps == Metal Performance Shaders; this means we have an Apple Silicon device (M-series Macs)
# cuda == Compute Unified Device Architecture, a toolkit from NVIDIA; this means we have an NVIDIA GPU
# cpu == Just using the general-purpose CPU for our calculations

if hasattr(torch.backends, "mps") and torch.backends.mps.is_available():
    device = torch.device('mps')
elif torch.cuda.is_available():
    device = torch.device('cuda')
else:
    device = torch.device('cpu')
print('Using device: {0}'.format(device))

In [None]:
# Model names are constructed from the model provider plus the model name.
# We're going to load the 1B model from the Allen Institute for AI. This is the
# base (foundation) model from the OLMo 2 series, with 1 billion parameters.
model_name = "allenai/OLMo-2-0425-1B"

# the tokenizer is tied to the model itself
tokenizer = AutoTokenizer.from_pretrained(model_name)

# load the model and put on the correct device
model = AutoModelForCausalLM.from_pretrained(model_name,
    dtype=torch.float16,
    device_map = "auto")

In [None]:
# show us the basic configuration of the model--how many layers, attention heads, 
# vocabulary size, embedding width, etc:
model.config

In [None]:
# put the model into evaluation mode and display its architecture
model.eval()

## Generation as Prediction of Next Tokens

We'll now do some very basic next-token prediction to generate text. These outputs will not be very interesting or creative because we will always take the single most probable next token. This is basic text completion; when used in generation, it is also known as __greedy__ search. The question at each step is: what is the most likely next token in this sequence?

In [None]:
# first, to return to tokenization: the model/network takes as input a tensor of tokens, not language/text. 
# we need to convert our text fragments into a tensor of token ids:
mission = """Dartmouth educates the most promising students and prepares them for a lifetime of learning and of responsible leadership through a faculty dedicated to teaching and the creation of knowledge."""
print(tokenizer(mission)['input_ids'])

In [None]:
prompt = "Dartmouth College is located in Hanover, New"
inputs = tokenizer(prompt,
                   return_tensors="pt").to(model.device)
outputs = model(input_ids = inputs['input_ids'])

In [None]:
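# the first output holds the logits, shaped (batch, sequence length, vocabulary size):
# one row of scores over the vocabulary for each input position. The last row
# scores the candidates for the next token.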
outputs[0].shape

In [None]:
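# logits for the final position: one score per vocabulary entry for the next token
predicted_tokens = outputs.logits[:,-1,:]

# sort all token IDs from most to least probable; pids[0] is the most probable next token:
pids = torch.argsort(predicted_tokens, descending = True)[0]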

In [None]:
# here we'll see the predicted tokens by their token ID:
pids

In [None]:
# this will give us the most probable token, including its leading space:
tokenizer.decode(pids[0])

In [None]:
# Out of curiosity...the second most probable next token?
tokenizer.decode(pids[1])

In [None]:
# Out of curiosity...the third most probable next token?
tokenizer.decode(pids[2])

In [None]:
# It seems likely that this model has some confidence in that answer...
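
One way to put a number on that confidence is to convert the raw scores (logits) into probabilities with a softmax. The sketch below uses made-up logits for three candidate tokens, so the names `candidates` and `toy_logits` and their values are illustrative assumptions, not output from OLMo. With the real model, the same idea would apply to `predicted_tokens`.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    # subtract the max before exponentiating for numerical stability
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# hypothetical logits for three candidate next tokens (not real model output)
candidates = [" Hampshire", " York", " Jersey"]
toy_logits = [9.0, 4.0, 3.0]

toy_probs = softmax(toy_logits)
for token, p in sorted(zip(candidates, toy_probs), key=lambda pair: -pair[1]):
    print(f"{token!r}: {p:.4f}")

# greedy selection simply takes the highest-probability candidate
greedy_choice = candidates[toy_probs.index(max(toy_probs))]
print("greedy choice:", repr(greedy_choice))
```

A gap of a few points between logits becomes a very large gap between probabilities, which is why a model can look so certain about one continuation.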

In [None]:
def next_tokens(prompt, n=10):
    """
    returns: decoded string of the prompt followed by the generated tokens.
    This function is deterministic. It will return the same outputs for the same prompt
    because we are always going to take the most probable next token.
    """
    inputs = tokenizer(prompt,
                        padding=True,
                        return_tensors="pt").to(next(model.parameters()).device)
    input_ids = inputs["input_ids"]
 
    # this iterates n times and demonstrates autoregression.
    # On our first iteration, we have just the original inputs as supplied as 'prompt'. 
    # On each iteration through the loop we'll append the generated token to our inputs.
    # Note that we do not need to tokenize the output because we are generating tokens; we'll
    # decode (translate back to language) the entire tensor of tokens once the loop has completed.
    
    for i in range(n):
        logits = model(input_ids).logits[:, -1, :]
        pid = torch.argsort(logits, descending=True)[:, :1]
        input_ids =  torch.cat((input_ids, pid),dim=1)
    return tokenizer.decode(input_ids[0])

In [None]:
# take just the next most likely token (n = 1)
next_tokens("1 + 2 = ", n = 1)

In [None]:
next_tokens("10 + 2 = ", n = 1)

In [None]:
next_tokens("Hello, I am a", n = 2)

In [None]:
next_tokens("Hello, I am an", n = 2)

In [None]:
next_tokens("The quick brown fox jumps over the lazy")

In [None]:
# Now try some more on your own. 

## Stochastic Generation 

Always taking the highest-probability token results in rather dull outputs. To make these more interesting, and to introduce the possibility of unexpected, creative, or even readable outputs, we need to vary our selection from among these tokens. Autoregression helps increase the likelihood that these outputs will make sense by generating each next token from the entire input so far. If we were generating from only the previous token, or from just a few preceding tokens, we would not be able to generate coherent sentences. Larger models generally produce better output. With fine-tuning, models will generate token sequences that conform to human/reader preferences and that more closely resemble everyday prose.

The method used below is a highly simplified way of selecting from the distribution of probability values. It introduces some randomness by sampling from the entire vocabulary, with each token's chance of selection proportional to its probability. We'll soon see other methods to control this selection process.
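
To see what sampling from a distribution means before applying it to the model, here is a small sketch with a hypothetical four-token vocabulary and made-up probabilities (`sample_vocab` and `sample_probs` are assumptions for illustration). It uses Python's `random.choices` as a stand-in for `torch.multinomial`; both draw items in proportion to the weights supplied.

```python
import random
from collections import Counter

# hypothetical next-token probabilities (illustrative values, not model output)
sample_vocab = [" dog", " cat", " fox", " river"]
sample_probs = [0.70, 0.20, 0.08, 0.02]

rng = random.Random(0)  # seeded so that the draws are repeatable

# greedy selection: always the single most probable token
greedy = sample_vocab[sample_probs.index(max(sample_probs))]

# stochastic selection: draw 1,000 tokens in proportion to their probabilities
draws = rng.choices(sample_vocab, weights=sample_probs, k=1000)
counts = Counter(draws)

print("greedy:", repr(greedy))
for token in sample_vocab:
    print(f"{token!r}: sampled {counts[token]} of 1000 times")
```

Greedy selection would return the same token every time, while sampling occasionally picks the less likely candidates, roughly as often as their probabilities suggest.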

In [None]:
def next_tokens_mn(prompt, n=10):
    inp_tok = tokenizer(prompt,
                        padding=True,
                        return_tensors="pt").to(next(model.parameters()).device)
    input_ids = inp_tok["input_ids"]

    for i in range(n):
        logits = model(input_ids).logits[:, -1, :]
        
        # we'll take the softmax of predictions to normalize the values
        probs = F.softmax(logits, dim=-1)
        
        # and from these we'll use multinomial sampling to select a token
        pid = torch.multinomial(probs, num_samples=1)
        
        # add new token to our inputs and continue loop
        input_ids =  torch.cat((input_ids, pid),dim=1)
        
    # return decoded tokens
    return tokenizer.decode(input_ids[0])

In [None]:
next_tokens_mn("The quick brown fox")

In [None]:
next_tokens_mn("To bake a cake: ",n = 25)

## Perplexity 

We can also use our large language models as language models, in the strict sense, to understand how they model input sequences. Perplexity is a measure of how well the model predicts a sequence. The lower the perplexity, the less "perplexed," we might say, the model is by the input. Tokens with higher scores were less likely to be predicted by the model.

You might use this to probe the model's predictive power. Are sentences with lower perplexity found in the training data? Not necessarily. Do sentences with higher perplexity values give us original language fragments? Perhaps. Can we use these values to predict whether a sentence was likely to be generated by a language model? Potentially, but not as the only feature.
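
The arithmetic behind these scores can be sketched with made-up numbers. In the assumed example below, each value in `token_probs` is the probability a model assigned to the token that actually came next. The per-token score computed by `get_perplexity` is the inverse of that probability, since exp(-log p) = 1/p, and the conventional perplexity of a whole sequence is the exponential of the average negative log probability.

```python
import math

# hypothetical probabilities assigned to each actual next token (not model output)
token_probs = [0.50, 0.25, 0.90, 0.01]

# per-token score: exp(-log p), which equals 1 / p
per_token = [math.exp(-math.log(p)) for p in token_probs]

# sequence perplexity: exp of the mean negative log probability
mean_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
sequence_perplexity = math.exp(mean_nll)

for p, score in zip(token_probs, per_token):
    print(f"p = {p:.2f} -> per-token score = {score:.2f}")
print(f"sequence perplexity = {sequence_perplexity:.2f}")
```

A single very unlikely token (here, the one with probability 0.01) dominates the per-token plot and also pulls up the score for the whole sequence.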

In [None]:
def get_perplexity(sentence):
    """
    returns a DataFrame with one row per token: the token and its per-token
    perplexity (the inverse of the probability the model assigned to it)
    """

    # we extract tokens in this way in case we have subword tokens
    inputs = tokenizer(sentence, return_tensors="pt").to(model.device)
    tokens = tokenizer.convert_ids_to_tokens(inputs['input_ids'][0])
    
    # replace the byte-level BPE marker for a leading space ('Ġ') with an actual space
    tokens = [t.replace('Ġ',' ') for t in tokens]

    # inference with labels for perplexity scores
    outputs = model(**inputs, 
                labels=inputs['input_ids'])

    # obtain logprobs
    log_probs = F.log_softmax(outputs.logits, dim=-1)

    # pad perplexity values for first token--we won't have predictions for it.
    perplexities = [0]

    # extract per-token perplexity scores from log_probs
    for i in range(1, inputs['input_ids'].size(1)):
        target_id = inputs['input_ids'][0, i]
        target_log_prob = log_probs[0, i -1 , target_id].item()  
        p = torch.exp(-torch.tensor(target_log_prob)).item()
        perplexities.append(p)
    out = pd.DataFrame({"tokens":tokens,"perplexities":perplexities})
    return out

In [None]:
df = get_perplexity("The quick brown fox jumps over the lazy dog")
df.plot(y = "perplexities",
        x = "tokens",
        title = "Perplexity",
        figsize=(10,3))
plt.xticks(range(len(df)), df['tokens'], rotation=45)
plt.tight_layout()
plt.show()