<a href="https://colab.research.google.com/github/noakishere/cart498-genai/blob/main/A2/CART498_GENAI_A02.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

The following algorithm uses **GPT-2** to predict the most likely next tokens for each line of a poem.

Each line's final word is then replaced with the prediction ranked X-th by probability (here X = 4), producing a new poem.
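
As a rough illustration of the replacement step, the sketch below picks the candidate at a given rank from a toy next-word distribution. The words and probabilities are invented for illustration and are not real GPT-2 output; `pick_by_rank` is a hypothetical helper, not part of the notebook's code.

```python
def pick_by_rank(candidates: dict[str, float], rank: int) -> str:
    """Return the candidate with the rank-th highest probability (1-based)."""
    if not 1 <= rank <= len(candidates):
        raise ValueError(f"rank must be between 1 and {len(candidates)}")
    # Sort candidate words from most to least probable.
    ordered = sorted(candidates, key=candidates.get, reverse=True)
    return ordered[rank - 1]


# Toy distribution: made-up values for illustration only.
toy_next_words = {
    "winter": 0.40,
    "snow": 0.25,
    "ice": 0.15,
    "frost": 0.12,
    "stone": 0.08,
}

prompt = "One must have a mind of"
print(f"{prompt} {pick_by_rank(toy_next_words, 4)}")
# prints "One must have a mind of frost"
```

A rank of 1 would always reproduce the model's most likely continuation; higher ranks drift further from the expected word, which is what gives the rewritten poem its character.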

**Original Poem:**
The Snow Man
by Wallace Stevens (1879-1955)

One must have a mind of winter

To regard the frost and the boughs

Of the pine-trees crusted with snow;

And have been cold a long time

To behold the junipers shagged with ice,

The spruces rough in the distant glitter

Of the January sun; and not to think

Of any misery in the sound of the wind,

In the sound of a few leaves,

Which is the sound of the land

Full of the same wind

That is blowing in the same bare place

For the listener, who listens in the snow,

And, nothing himself, beholds

Nothing that is not there and the nothing that is.

In [8]:
# Install the transformers library if not already installed
!pip install transformers

from transformers import AutoTokenizer, AutoModelForCausalLM, LogitsProcessorList, TopKLogitsWarper
import torch

# Initialize the GPT-2 tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("openai-community/gpt2")
model = AutoModelForCausalLM.from_pretrained("openai-community/gpt2")

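top_k = 20        # number of highest-probability candidates to rank
chosenKIndex = 4  # "p + x": 1-based rank of the prediction used as the replacement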

# Define the prompts: the poem's lines, cut off before the word to be replaced
lines = ["One must have a mind of",
         "To regard the frost and the",
         "Of the pine-trees crusted with",
         "And have been cold a long time",
         "To behold the junipers shagged with",
         "The spruces rough in the distant",
         "Of the January sun; and not to",
         "Of any misery in the sound of the",
         "In the sound of a few",
         "Which is the sound of the",
         "Full of the same",
         "That is blowing in the same bare",
         "For the listener, who listens in the",
         "And, nothing himself,",
         "Nothing that is not there and the nothing that"]

for line in lines:
  prompt = line

  # Tokenize the prompt and convert it to input IDs
  input_ids = tokenizer(prompt, return_tensors="pt").input_ids

  # Generate the model's output logits for the next token
  with torch.no_grad():
      outputs = model(input_ids)
      logits = outputs.logits[:, -1, :]  # Get logits for the last token

  # Apply a Top-K logits processor: keep the 50 highest logits and mask the rest,
  # so the softmax below renormalizes over those candidates only
  logits_processor = LogitsProcessorList([TopKLogitsWarper(top_k=50)])
  processed_logits = logits_processor(input_ids, logits)

  # Convert logits to probabilities using softmax
  probabilities = torch.softmax(processed_logits, dim=-1).squeeze()

  top_probs, top_indices = torch.topk(probabilities, top_k)

  # Decode the tokens (stripping GPT-2's "Ġ" leading-space marker) and pair them with their probabilities
  continuations = [token.replace("Ġ", "") for token in tokenizer.convert_ids_to_tokens(top_indices.tolist())]
  results = list(zip(continuations, top_probs.tolist()))

  # Print the line completed with the candidate at the chosen rank (1-based)
  for rank, (word, prob) in enumerate(results, 1):
      if rank == chosenKIndex:
        print(f"{line} {word}")


One must have a mind of your
To regard the frost and the ice
Of the pine-trees crusted with salt
And have been cold a long time now
To behold the junipers shagged with blood
The spruces rough in the distant ,
Of the January sun; and not to say
Of any misery in the sound of the world
In the sound of a few gunshots
Which is the sound of the engine
Full of the same name
That is blowing in the same bare chest
For the listener, who listens in the first
And, nothing himself, no
Nothing that is not there and the nothing that was
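
In the output above, the fourth-ranked candidate for "The spruces rough in the distant" is a comma, because the ranking includes punctuation tokens as well as words. One possible refinement, sketched below under the assumption that only alphabetic candidates are wanted, is to drop non-word tokens before counting ranks. `pick_word_by_rank` is a hypothetical helper and the candidate list is made up for illustration.

```python
from typing import Optional


def pick_word_by_rank(ranked_tokens: list[str], rank: int) -> Optional[str]:
    """Return the rank-th (1-based) purely alphabetic token, or None if too few remain."""
    # Keep only alphabetic tokens, preserving their probability order.
    words = [tok for tok in ranked_tokens if tok.isalpha()]
    return words[rank - 1] if 1 <= rank <= len(words) else None


# Made-up candidates, already sorted from most to least probable.
toy_ranked = ["hills", "light", "sky", ",", "mountains", "."]
print(pick_word_by_rank(toy_ranked, 4))
# prints "mountains"
```

Whether to filter at all is a stylistic choice: leaving punctuation in keeps the result faithful to the model's raw ranking, while filtering guarantees every line ends in a word.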
