# Recipe Generation with GPT-2


This notebook explores how to **fine-tune a transformer** for a custom text completion task -- specifically, we'll teach GPT-2 to generate full recipe instructions from just a list of ingredients.

Before the world met ChatGPT, GPT-2 was already showing us what generative AI could do -- if we were paying attention. To learn more about this large language model, which was **trained on 8 million web pages**, check out the official model card on [Hugging Face](https://huggingface.co/gpt2).

Before we get cooking, here's a roadmap of what is to come:
- Set up the GPU runtime
- Download and inspect the dataset
- Create a DataFrame and split it into training and test sets
- Test the base model on a single sentence
- Fine-tune the model using the recipe dataset
- Evaluate the fine-tuned model's recipe-writing skills

## GPU time

In [None]:
# Check which GPU is available for model training
!nvidia-smi -L

GPU 0: Tesla T4 (UUID: GPU-3ede2645-63ec-de5f-3b35-f15a1a4b3d32)


In [None]:
# Install, if needed, and import the Hugging Face Transformers library
!pip install transformers
import transformers



## Download the dataset

In [None]:
import pandas as pd

In [None]:
# Install gdown, a Python tool that helps you download files from Google Drive directly into your Colab
!pip install --upgrade gdown



This dataset includes 120K recipes! All the recipes are formatted in the same way, and, as we will see, formatting is critical for solving text completion tasks like this. Part of the magic of transformers is that they are able to recognize and learn patterns.

In [None]:
import gdown
# Download the recipe dataset (120K examples) from Google Drive
gdrivelink='https://drive.google.com/uc?id=10KF1LqW9k2MgTb1GwSPlfvX1INFPDxi5'
gdown.download(gdrivelink, quiet=False)

Downloading...
From (original): https://drive.google.com/uc?id=10KF1LqW9k2MgTb1GwSPlfvX1INFPDxi5
From (redirected): https://drive.google.com/uc?id=10KF1LqW9k2MgTb1GwSPlfvX1INFPDxi5&confirm=t&uuid=6dbb02e2-5318-40bf-a7e0-46eb23376b2b
To: /content/recipes.csv
100%|██████████| 343M/343M [00:04<00:00, 85.2MB/s]


'recipes.csv'

The recipe dataset is available at https://eightportions.com/datasets/Recipes/

## Create the DataFrame

In [None]:
# Create the DataFrame from the CSV file that we just downloaded
df = pd.read_csv('recipes.csv')

In [None]:
# Check the first five rows
df.head()

Unnamed: 0.1,Unnamed: 0,title,ingredients,instructions,source,ingredients_length,instructions_length,combined
0,0,Slow Cooker Chicken and Dumplings,"['4 skinless, boneless chicken breast halves '...","place the chicken, butter, soup, and onion in ...",ar,5,54,"\n Ingredients: \n 4 skinless, boneless chick..."
1,1,Awesome Slow Cooker Pot Roast,['2 (10.75 ounce) cans condensed cream of mush...,"in a slow cooker, mix cream of mushroom soup, ...",ar,4,45,\n Ingredients: \n 2 (10.75 ounce) cans conde...
2,2,Brown Sugar Meatloaf,"['1/2 cup packed brown sugar ', '1/2 cup ketch...",preheat oven to 350 degrees f (175 degrees c)....,ar,10,68,\n Ingredients: \n 1/2 cup packed brown sugar...
3,3,Best Chocolate Chip Cookies,"['1 cup butter, softened ', '1 cup white sugar...",preheat oven to 350 degrees f (175 degrees c)....,ar,11,75,"\n Ingredients: \n 1 cup butter, softened \n..."
4,4,Homemade Mac and Cheese Casserole,"['8 ounces whole wheat rotini pasta ', '3 cups...",preheat oven to 350 degrees f. line a 2-quart ...,ar,13,176,\n Ingredients: \n 8 ounces whole wheat rotin...


In [None]:
# Check the length
len(df)

121456

In [None]:
# Now check the shape
df.shape

(121456, 8)

In [None]:
# Check for null values
df.isna().sum()

Unnamed: 0,0
Unnamed: 0,0
title,0
ingredients,0
instructions,0
source,0
ingredients_length,0
instructions_length,0
combined,0


Time to explore the `combined` column -- the only column that matters for this text generation task.

In [None]:
# 'combined' is the only column we care about in this exercise
# It joins the ingredients and the instructions, and it's essential that the training set is formatted very carefully
df.combined

Unnamed: 0,combined
0,"\n Ingredients: \n 4 skinless, boneless chick..."
1,\n Ingredients: \n 2 (10.75 ounce) cans conde...
2,\n Ingredients: \n 1/2 cup packed brown sugar...
3,"\n Ingredients: \n 1 cup butter, softened \n..."
4,\n Ingredients: \n 8 ounces whole wheat rotin...
...,...
121451,\n Ingredients: \n 4 ears fresh corn \n 2 hea...
121452,\n Ingredients: \n 4 large plum tomatoes \n s...
121453,\n Ingredients: \n 3 tablespoons olive oil \n...
121454,\n Ingredients: \n 8 ounces butter \n 8 ounce...


In [None]:
# Let's improve the formatting by printing a single value: .iloc[0] selects just the first row
# Important note: Every element in this column follows this format
print(df.combined.iloc[0])

 
 Ingredients: 
 4 skinless, boneless chicken breast halves  
 2 tablespoons butter  
 2 (10.75 ounce) cans condensed cream of chicken soup  
 1 onion, finely diced  
 2 (10 ounce) packages refrigerated biscuit dough, torn into pieces  
 Instructions: 
 place the chicken, butter, soup, and onion in a slow cooker, and fill with enough water to cover. cover, and cook for 5 to 6 hours on high. about 30 minutes before serving, place the torn biscuit dough in the slow cooker. cook until the dough is no longer raw in the center.  <|endoftext|>


Notice: Every recipe ends with the same marker -- `<|endoftext|>`. This is crucial because the model needs to know where one training example ends and the next begins.
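As a minimal sketch of that layout, the hypothetical helper below assembles one training string from a list of ingredients and an instruction paragraph. `format_recipe` is not part of the dataset's tooling; its name and the exact whitespace are assumptions based on the printed example above.

```python
END_OF_TEXT = '<|endoftext|>'  # End-of-text marker, as seen in the dataset

def format_recipe(ingredients, instructions):
    """Assemble one training example in the Ingredients/Instructions layout."""
    lines = [' ', ' Ingredients: ']
    lines += [f' {item}  ' for item in ingredients]
    lines += [' Instructions: ', f' {instructions}  {END_OF_TEXT}']
    return '\n'.join(lines)

# Made-up mini recipe, for illustration only
example = format_recipe(
    ['2 tablespoons butter', '1 onion, finely diced'],
    'melt the butter and cook the onion until soft.',
)
print(example)
```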

Also, notice that ingredients in the text often appear in the same order in the Ingredients section *and* the Instructions section. Humans can notice this pattern, of course, and transformers can, too.
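One rough way to check that ordering on the first recipe is to look up where each ingredient first appears in the instructions. The keywords below are hand-picked for illustration, not extracted automatically, so treat this as a sanity check rather than a general method.

```python
instructions = (
    'place the chicken, butter, soup, and onion in a slow cooker, and fill '
    'with enough water to cover. cover, and cook for 5 to 6 hours on high. '
    'about 30 minutes before serving, place the torn biscuit dough in the '
    'slow cooker. cook until the dough is no longer raw in the center.'
)

# Hand-picked keywords, listed in the order of the Ingredients section
keywords = ['chicken', 'butter', 'soup', 'onion', 'biscuit']

# Position of each keyword's first mention in the instructions (-1 if absent)
positions = [instructions.find(word) for word in keywords]
same_order = positions == sorted(positions)

print(positions)
print('Same order as the ingredient list:', same_order)
```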

So, here's **the game plan**: We will show GPT-2 120K examples in which the Ingredients section is followed by the Instructions section. Then we will give it only the Ingredients section, plus the `Instructions:` label, and prompt it to generate the instructions itself.

## Make the training and test set

So, how do you measure the performance of generating text?

It's a little murky.

Even if there is not a clear metric we can use, we will still split the dataset into two -- training and test.
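One commonly used, if imperfect, number for language models is perplexity: the exponential of the average negative log-probability the model assigns to each held-out token. This notebook does not compute it, but the sketch below shows the arithmetic on made-up token probabilities (the values are illustrative, not model outputs).

```python
import math

def perplexity(token_probs):
    """exp(mean negative log-probability) over the given token probabilities."""
    nll = [-math.log(p) for p in token_probs]
    return math.exp(sum(nll) / len(nll))

# Hypothetical probabilities a model might assign to four held-out tokens
confident = [0.9, 0.8, 0.95, 0.85]
uncertain = [0.25, 0.25, 0.25, 0.25]

print(perplexity(confident))
print(perplexity(uncertain))
```

Lower is better: a model that guesses uniformly among four options on every token has a perplexity of 4.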

In [None]:
# Create the training set, just the 'combined' column values and grab the first 120,000
dataset_train = df.combined.values[:120000]

In [None]:
# Check the length
len(dataset_train)

120000

In [None]:
# For the test set, we will take the rest
dataset_test = df.combined.values[120000:]

In [None]:
# Check the length
len(dataset_test)

1456

As we see above, we have a huge training set. But we need to turn the training and test sets into **text files**. And this is why those `<|endoftext|>` tokens are so important.
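To see why the marker matters: once the recipes are joined into one long text with `'\n'.join(...)`, `<|endoftext|>` is the only thing that separates one recipe from the next. A small sketch with two shortened, made-up recipes:

```python
END_OF_TEXT = '<|endoftext|>'

# Two shortened, made-up recipes in the same layout as the dataset
recipes = [
    ' \n Ingredients: \n 1 egg  \n Instructions: \n fry the egg.  ' + END_OF_TEXT,
    ' \n Ingredients: \n 2 slices bread  \n Instructions: \n toast the bread.  ' + END_OF_TEXT,
]

# Join the recipes into one flat text, one after another
corpus = '\n'.join(recipes)

# The marker lets us recover the individual recipes from the flat text
recovered = [chunk.strip() for chunk in corpus.split(END_OF_TEXT) if chunk.strip()]
print(len(recovered))
```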

In [None]:
# Here's how to create the text file for 'dataset_train'
with open('dataset_train.txt', 'w') as f:
  f.write('\n'.join(dataset_train))

In [None]:
# And now for the test set
with open('dataset_test.txt', 'w') as f:
  f.write('\n'.join(dataset_test))

In [None]:
# These are massive files, so, let's just display a small piece of 'dataset_train.txt'
!head -20 dataset_train.txt

 
 Ingredients: 
 4 skinless, boneless chicken breast halves  
 2 tablespoons butter  
 2 (10.75 ounce) cans condensed cream of chicken soup  
 1 onion, finely diced  
 2 (10 ounce) packages refrigerated biscuit dough, torn into pieces  
 Instructions: 
 place the chicken, butter, soup, and onion in a slow cooker, and fill with enough water to cover. cover, and cook for 5 to 6 hours on high. about 30 minutes before serving, place the torn biscuit dough in the slow cooker. cook until the dough is no longer raw in the center.  <|endoftext|>
 
 Ingredients: 
 2 (10.75 ounce) cans condensed cream of mushroom soup  
 1 (1 ounce) package dry onion soup mix  
 1 1/4 cups water  
 5 1/2 pounds pot roast  
 Instructions: 
 in a slow cooker, mix cream of mushroom soup, dry onion soup mix and water. place pot roast in slow cooker and coat with soup mixture. cook on high setting for 3 to 4 hours, or on low setting for 8 to 9 hours.  <|endoftext|>
 
 Ingredients: 
 1/2 cup packed brown sugar  


## Load the pretrained GPT-2 model

We're now ready to load the core ingredients: the **pretrained GPT-2 model** and its **tokenizer**. We'll use Hugging Face's `transformers` library to grab both.

Specifically, we’ll be using:
1. `AutoModelForCausalLM` -- the GPT-2 language model for text generation
2. `AutoTokenizer` -- the tokenizer that breaks text into tokens the model understands

In [None]:
from transformers import AutoTokenizer, AutoModelForCausalLM

In [None]:
# Here you can either use your own model or download one from Hugging Face
# We, of course, will be using GPT-2 -- specifically DistilGPT-2, the distilled version,
# which is smaller and faster than the famous original and still almost as accurate
tokenizer = AutoTokenizer.from_pretrained('distilgpt2')
model = AutoModelForCausalLM.from_pretrained('distilgpt2')


## Alternatively, load a fine-tuned model we previously saved

In [None]:
import gdown
gdrivelink='https://drive.google.com/drive/folders/1qHEQ6zpOeGDiBlZ3q4zkpMVSVa86d69q?usp=sharing'
gdown.download_folder(gdrivelink, quiet=True)

In [None]:
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained('recipe_generation_model')
model = AutoModelForCausalLM.from_pretrained('recipe_generation_model')

## Test the model on one sentence


Let's find out how the pretrained model does on the task of completing a sentence.

We’ll first encode a sentence using the tokenizer and see what the model sees.

**Side Note**: Passing `return_tensors='pt'` asks the tokenizer to return **PyTorch** tensors. Even though we didn’t explicitly import `torch`, the `transformers` library loads it for us behind the scenes.

So, the output is a PyTorch tensor -- a powerful data structure that we’ll work with more as we go deeper into model building.

In [None]:
# First, define the input text
# And then encode it with 'tokenizer.encode()'
input_text = 'I hit the slopes early to snowboard, caught the first lift, and by noon I had already'
enc_input = tokenizer.encode(input_text, return_tensors='pt', add_special_tokens=False)
enc_input

tensor([[   40,  2277,   262, 35082,  1903,   284,  6729,  3526,    11,  4978,
           262,   717, 10303,    11,   290,   416, 19613,   314,   550,  1541]])

Now that we've tokenized our input sentence, we can feed it into the model and generate predictions to complete the sentence.
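Before running generation, it may help to see what `temperature`, `top_k`, and `top_p` do to a next-token distribution. The sketch below applies them to a made-up five-token vocabulary in plain Python. It illustrates the general idea only: the library's actual implementation differs in detail, and the scores and the function name `filter_distribution` are assumptions.

```python
import math

def filter_distribution(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Return {token: probability} after temperature, top-k and top-p filtering."""
    # Temperature: divide scores before softmax (lower = sharper distribution)
    scaled = {tok: score / temperature for tok, score in logits.items()}
    peak = max(scaled.values())
    exps = {tok: math.exp(s - peak) for tok, s in scaled.items()}
    total = sum(exps.values())
    ranked = sorted(((tok, e / total) for tok, e in exps.items()),
                    key=lambda pair: pair[1], reverse=True)

    # Top-k: keep only the k most likely tokens
    if top_k > 0:
        ranked = ranked[:top_k]

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p
    kept, cumulative = [], 0.0
    for tok, prob in ranked:
        kept.append((tok, prob))
        cumulative += prob
        if cumulative >= top_p:
            break

    # Renormalise so the surviving probabilities sum to 1
    norm = sum(prob for _, prob in kept)
    return {tok: prob / norm for tok, prob in kept}

# Made-up scores for the token following "by noon I had already"
toy_logits = {' been': 2.0, ' done': 1.5, ' hit': 1.0, ' begun': 0.5, ' banana': -3.0}
filtered = filter_distribution(toy_logits, temperature=0.9, top_k=4, top_p=0.9)
print(filtered)
```

With `do_sample=True`, the next token is then drawn at random from whatever survives the filtering, which is why repeated runs give different completions.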

Note: The `model.generate()` call in the next cell is a **beast**, so make sure to read through the comment lines.

In [None]:
output_sequences = model.generate(
    input_ids = enc_input, # The encoded input from earlier
    max_length = 70, # Max total length in tokens, prompt included
    temperature = 0.9, # Controls randomness: closer to 1 = more creative, closer to 0 = more predictable
    top_k = 20, # Considers only the 20 most likely next tokens
    top_p = 0.9, # Dynamic cutoff: keeps the most likely tokens until their cumulative probability reaches 0.9
    repetition_penalty = 1, # Penalty for repeating tokens; 1 means no penalty
    do_sample = True, # If True, this allows for randomness instead of always picking the highest-probability token
    num_return_sequences = 5 # Number of output sequences
)
for i in range(len(output_sequences)):
  print(f'{i}: {tokenizer.decode(output_sequences[i])}\n')

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.


0: I hit the slopes early to snowboard, caught the first lift, and by noon I had already begun my trek.


It was not the first time I had encountered a mountain in Canada. On my way to work at the Canadian Tire Centre, I spotted the first big thing I noticed: the snow was coming off the ice. I

1: I hit the slopes early to snowboard, caught the first lift, and by noon I had already been riding the mountain for a day and a half. My legs were sore but I was able to get back up to my feet to get back up to the base of the lift. I took off my shoes and I was able to climb the ridge

2: I hit the slopes early to snowboard, caught the first lift, and by noon I had already hit a couple of hundred feet.
The next morning, we had to wait at the airport for an airport flight to Paris. There was a short time between flights from the airport to Paris and I was already halfway there. We had just had a

3: I hit the slopes early to snowboard, caught the first lift, and by noon I had already done what

Obviously, these outputs are -- mostly, if not entirely -- nonsense. But what about a recipe? Can our model accomplish that?

In [None]:
# Display the first element from 'dataset_train'
dataset_train[0]

' \n Ingredients: \n 4 skinless, boneless chicken breast halves  \n 2 tablespoons butter  \n 2 (10.75 ounce) cans condensed cream of chicken soup  \n 1 onion, finely diced  \n 2 (10 ounce) packages refrigerated biscuit dough, torn into pieces  \n Instructions: \n place the chicken, butter, soup, and onion in a slow cooker, and fill with enough water to cover. cover, and cook for 5 to 6 hours on high. about 30 minutes before serving, place the torn biscuit dough in the slow cooker. cook until the dough is no longer raw in the center.  <|endoftext|>'

In [None]:
# Split the text above at 'Instructions:' and add the label back to form the prompt
dataset_train[0].split('Instructions:')[0]+'Instructions:'

' \n Ingredients: \n 4 skinless, boneless chicken breast halves  \n 2 tablespoons butter  \n 2 (10.75 ounce) cans condensed cream of chicken soup  \n 1 onion, finely diced  \n 2 (10 ounce) packages refrigerated biscuit dough, torn into pieces  \n Instructions:'
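The same split can be wrapped in a small helper so that every prompt ends exactly the way the training examples do. `make_prompt` is a hypothetical name, and the shortened recipe string is made up for the example.

```python
def make_prompt(recipe_text, marker='Instructions:'):
    """Keep everything up to and including the first 'Instructions:' label."""
    head, found, _ = recipe_text.partition(marker)
    if not found:
        raise ValueError(f'{marker!r} not found in recipe text')
    return head + marker

# Shortened, made-up recipe in the dataset's layout
sample = ' \n Ingredients: \n 1 egg  \n Instructions: \n fry the egg.  <|endoftext|>'
prompt = make_prompt(sample)
print(repr(prompt))
```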

In [None]:
# Now, let's try it as our new input text
# Note: Need to increase max_length because the prompt itself is much longer
new_input_text = dataset_train[0].split('Instructions:')[0]+'Instructions:'
enc_input = tokenizer.encode(new_input_text, return_tensors='pt', add_special_tokens=False)
output_sequences = model.generate(
    input_ids = enc_input, # The encoded input from earlier
    max_length = 150, # Max total length in tokens, prompt included
    temperature = 0.9, # Controls randomness: closer to 1 = more creative, closer to 0 = more predictable
    top_k = 20, # Considers only the 20 most likely next tokens
    top_p = 0.9, # Dynamic cutoff: keeps the most likely tokens until their cumulative probability reaches 0.9
    repetition_penalty = 1, # Penalty for repeating tokens; 1 means no penalty
    do_sample = True, # If True, this allows for randomness instead of always picking the highest-probability token
    num_return_sequences = 5 # Number of output sequences
)
for i in range(len(output_sequences)):
  print(f'{i}: {tokenizer.decode(output_sequences[i])}\n')

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


0:  
 Ingredients: 
 4 skinless, boneless chicken breast halves  
 2 tablespoons butter  
 2 (10.75 ounce) cans condensed cream of chicken soup  
 1 onion, finely diced  
 2 (10 ounce) packages refrigerated biscuit dough, torn into pieces  
 Instructions:

1:  
 Ingredients: 
 4 skinless, boneless chicken breast halves  
 2 tablespoons butter  
 2 (10.75 ounce) cans condensed cream of chicken soup  
 1 onion, finely diced  
 2 (10 ounce) packages refrigerated biscuit dough, torn into pieces  
 Instructions: Preheat oven to 350F. Cut chicken breast halves into two pieces. Place chicken breast halves on the baking sheet. Cover the baking sheet and bake for an additional 20 minutes or until a lightly golden brown surface forms. Cover the dough in a large bowl and let cool slightly. Transfer the chicken breast halves to a lightly lined baking sheet or sheet. Cover the dough in a large bowl and bake for an add

The results aren't a total mess -- well, at least the ones that aren't left blank.

GPT-2 is writing grammatically correct English but still isn't accomplishing the task we set out to do. So, up next? Fine-tuning the model.

## Fine-tune the model

To begin fine-tuning, we’ll download a lightweight training script provided by Michele Samorani, Associate Professor, Information Systems & Analytics at the Leavey School of Business at Santa Clara University.

The script is adapted from Hugging Face’s official examples. It handles the model training loop and saves outputs to a local `experiments/` folder.

The original Hugging Face version can be found [here](https://github.com/huggingface/transformers/blob/main/examples/pytorch/language-modeling/run_clm.py).



In [None]:
# Download the custom fine-tuning script from the course server
!curl https://webpages.scu.edu/ftp/msamorani/NLP/run_lm_finetuning.py > run_lm_finetuning.py

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 31098  100 31098    0     0   216k      0 --:--:-- --:--:-- --:--:--  216k


In [None]:
# Create a folder to store training outputs and model checkpoints
!mkdir experiments

In [None]:
# Set a couple of parameters
# First, epochs set to three so the model will go through the dataset three times
epochs = 3

In [None]:
# Then find the file that has our training set
file_with_training_set = 'dataset_train.txt'

In [None]:
# Create the bash script to fine-tune the language model
# Note: `for epoch in 3` iterates over the single value 3, so the loop body runs once
# The command it builds:
# 1. Calls the training script
# 2. Points to the input file
# 3. Saves model checkpoints to experiments/epoch_3
# 4. Uses DistilGPT-2 and basic Hugging Face flags, with --num_train_epochs=3
text = f"for epoch in {epochs} \n"+\
"do \n"+\
"python run_lm_finetuning.py "+\
f"--output_dir=experiments/epoch_{epochs} "+\
"--model_type=gpt2 "+\
"--model_name_or_path=distilgpt2 "+\
f"--train_data_file={file_with_training_set} "+\
"--do_train "+\
"--overwrite_output_dir "+\
"--save_steps=500 " +\
f"--num_train_epochs={epochs} \n" +\
"done"

In [None]:
# Write the fine-tuning bash script to a file called 'run_experiments.sh'
# This will be executed in the next cell to launch training
with open('run_experiments.sh', mode='w') as f:
  f.write(text)
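For readers who find the concatenated string hard to scan, here is one alternative way to assemble the same command from a list of flags. This is only a sketch: it reuses the flag names from the string-built script above without adding new ones, drops the `for` wrapper, and prints the command rather than running it.

```python
import shlex

epochs = 3
file_with_training_set = 'dataset_train.txt'

# Same flags as in the string-built script, one per list entry
command = [
    'python', 'run_lm_finetuning.py',
    f'--output_dir=experiments/epoch_{epochs}',
    '--model_type=gpt2',
    '--model_name_or_path=distilgpt2',
    f'--train_data_file={file_with_training_set}',
    '--do_train',
    '--overwrite_output_dir',
    '--save_steps=500',
    f'--num_train_epochs={epochs}',
]

# shlex.join quotes arguments where needed and returns one shell-ready line
script_line = shlex.join(command)
print(script_line)
```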

This is the core training step, where the model learns from our dataset using the shell script we generated above. The script calls the training script once and asks for 3 epochs, saving a checkpoint every 500 steps.

**Note:** Due to time and GPU constraints, I manually interrupted training after about 25 minutes, at roughly 1,700 of the 10,104 iterations in the first epoch (about 17% of that epoch). Despite the early stop, several useful checkpoints (500, 1000, 1500) were created and will be used in the next step.
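The checkpoint list follows from simple arithmetic. Taking roughly 1,700 completed iterations (the figure noted in the launch cell's comment), 10,104 iterations per epoch (from the training progress bar), and `--save_steps=500`:

```python
iterations_done = 1700        # approximate point where training was interrupted
iterations_per_epoch = 10104  # from the training progress bar
save_steps = 500              # value passed as --save_steps

# Fraction of one epoch that finished before the interruption
fraction_of_epoch = iterations_done / iterations_per_epoch

# A checkpoint is written every `save_steps` iterations
checkpoints = list(range(save_steps, iterations_done + 1, save_steps))

print(f'{fraction_of_epoch:.1%} of one epoch')
print(checkpoints)
```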

In [None]:
# Launch the fine-tuning process
# Note: This was manually stopped after about 25 minutes at iteration 1700+
!bash run_experiments.sh

2025-05-17 00:41:39.341068: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1747442499.360835   14938 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1747442499.367082   14938 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
Epoch:   0% 0/3 [00:00<?, ?it/s]
Iteration:   0% 0/10104 [00:00<?, ?it/s]
`loss_type=None` was set in the config but it is unrecognised.Using the default loss: `ForCausalLMLoss`.

Iteration:   0% 1/10104 [00:01<3:51:17,  1.37s/it]
Iteration:   0% 2/10104 [00:02<2:44:02,  1.03it/s]
Iteration:   0% 3/10104 [00:02<2:25:08,  1.16it/s]
Iteration:   0% 4/10104 [00:03<2:17:02,  1.23it/s]
Iteration:   0% 5/10104 [00:04<2:11:15,

In [None]:
# Check the available checkpoints
!ls experiments/epoch_3

checkpoint-1000  checkpoint-1500  checkpoint-500


## Save the fine-tuned model


In this step, we save our fine-tuned language model using Hugging Face’s `AutoModelForCausalLM` and `AutoTokenizer` classes.

- The model and tokenizer are loaded from the last available checkpoint, `checkpoint-1500`
- Then we save them to a new folder: `recipe_generation_model`
- Finally, we mount Google Drive and copy the model over for long-term storage and later use
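Picking the last checkpoint by eye works here, but sorting the folder names alphabetically would put `checkpoint-500` last, as the `ls` output above shows. A small sketch that sorts by step number instead (the folder names are hard-coded from that listing, and `latest_checkpoint` is a hypothetical helper):

```python
def latest_checkpoint(names):
    """Return the checkpoint folder with the highest step number."""
    return max(names, key=lambda name: int(name.rsplit('-', 1)[1]))

# Folder names as listed by `ls experiments/epoch_3`
names = ['checkpoint-1000', 'checkpoint-1500', 'checkpoint-500']

print(sorted(names)[-1])         # alphabetical order picks the wrong one
print(latest_checkpoint(names))  # numeric order picks the latest
```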

In [None]:
# Load the model + tokenizer from the last checkpoint
tokenizer = AutoTokenizer.from_pretrained('experiments/epoch_3/checkpoint-1500')
model = AutoModelForCausalLM.from_pretrained('experiments/epoch_3/checkpoint-1500')

# Define a folder name for the saved model
model_path = 'recipe_generation_model'

# And save the model and tokenizer locally
model.save_pretrained(model_path)
tokenizer.save_pretrained(model_path)

('recipe_generation_model/tokenizer_config.json',
 'recipe_generation_model/special_tokens_map.json',
 'recipe_generation_model/vocab.json',
 'recipe_generation_model/merges.txt',
 'recipe_generation_model/added_tokens.json',
 'recipe_generation_model/tokenizer.json')

In [None]:
# Mount Google Drive so we can export our fine-tuned model
from google.colab import drive
drive.mount('/content/drive', force_remount=True)

# And copy the saved model to your Google Drive
import shutil
shutil.copytree(model_path, '/content/drive/MyDrive/' + model_path)

Mounted at /content/drive


'/content/drive/MyDrive/recipe_generation_model'

## Test the fine-tuned model's recipe writing skills

In [None]:
# Begin by printing the first recipe in the test set
print(dataset_test[0])

 
 Ingredients: 
 2 large red bell peppers 
 1 pound boneless, skinless chicken thighs 
 salt and freshly ground black pepper 
 1 tablespoon dried italian seasoning or herbes de provence 
 1/4 cup extra-virgin olive oil, plus more liberal drizzling 
 1 pound large shrimp, peeled and deveined, tails in tact 
 1 tablespoon seafood seasoning, (recommended: old bay) 
 2 tablespoons white balsamic or red wine vinegar 
 2 red chilies, seeded and chopped 
 2 cloves garlic, grated or pasted 
 Instructions: 
 put the peppers under the broiler to blacken evenly all over; leave the oven door slightly ajar to allow steam to escape. (alternatively char the peppers over an open flame on the stovetop.) put the peppers in a bowl and cover with plastic wrap. cool the peppers in order to handle, peel and seed. preheat a grill pan or griddle pan over medium-high heat. meanwhile, season the chicken with salt and pepper and dried herbs, and coat with a little extra-virgin olive oil. season the shrimp with 

In [None]:
# Need to split at 'Instructions:' and add the label back; this will be our prompt
print(dataset_test[0].split('Instructions:')[0] + 'Instructions:')

 
 Ingredients: 
 2 large red bell peppers 
 1 pound boneless, skinless chicken thighs 
 salt and freshly ground black pepper 
 1 tablespoon dried italian seasoning or herbes de provence 
 1/4 cup extra-virgin olive oil, plus more liberal drizzling 
 1 pound large shrimp, peeled and deveined, tails in tact 
 1 tablespoon seafood seasoning, (recommended: old bay) 
 2 tablespoons white balsamic or red wine vinegar 
 2 red chilies, seeded and chopped 
 2 cloves garlic, grated or pasted 
 Instructions:


In [None]:
# Now store it as input_text_test
input_text_test = dataset_test[0].split('Instructions:')[0] + 'Instructions:'

In [None]:
input_text_test

' \n Ingredients: \n 2 large red bell peppers \n 1 pound boneless, skinless chicken thighs \n salt and freshly ground black pepper \n 1 tablespoon dried italian seasoning or herbes de provence \n 1/4 cup extra-virgin olive oil, plus more liberal drizzling \n 1 pound large shrimp, peeled and deveined, tails in tact \n 1 tablespoon seafood seasoning, (recommended: old bay) \n 2 tablespoons white balsamic or red wine vinegar \n 2 red chilies, seeded and chopped \n 2 cloves garlic, grated or pasted \n Instructions:'

Take 2 with the recipe generation...

In [None]:
# And now to test out the fine-tuned model
# Note: this line overwrites input_text_test with the first *training* recipe (not a held-out example),
# which makes the output directly comparable with the base-model run earlier
input_text_test = dataset_train[0].split('Instructions:')[0]+'Instructions:'
enc_input = tokenizer.encode(input_text_test, return_tensors='pt', add_special_tokens=False)
output_sequences = model.generate(
    input_ids = enc_input, # The encoded input from earlier
    max_length = 300, # Max total length in tokens, prompt included
    temperature = 0.9, # Controls randomness: closer to 1 = more creative, closer to 0 = more predictable
    top_k = 20, # Considers only the 20 most likely next tokens
    top_p = 0.9, # Dynamic cutoff: keeps the most likely tokens until their cumulative probability reaches 0.9
    repetition_penalty = 1, # Penalty for repeating tokens; 1 means no penalty
    do_sample = True, # If True, this allows for randomness instead of always picking the highest-probability token
    num_return_sequences = 5 # Number of output sequences
)
for i in range(len(output_sequences)):
  print(f'{i}: {tokenizer.decode(output_sequences[i])}\n')

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


0:  
 Ingredients: 
 4 skinless, boneless chicken breast halves  
 2 tablespoons butter  
 2 (10.75 ounce) cans condensed cream of chicken soup  
 1 onion, finely diced  
 2 (10 ounce) packages refrigerated biscuit dough, torn into pieces  
 Instructions: 
 preheat oven to 350 degrees f (175 degrees c). grease and flour a 9 inch baking dish. in a bowl, combine chicken, butter, onion, and onion. add chicken to prepared baking dish, and stir until chicken is lightly browned, about 5 minutes. using your hands, lightly brush chicken with melted butter. bake for 20 minutes, or until chicken is tender. transfer chicken to an airtight container and cool on the baking sheet.  <|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext

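And look at that! These instructions read like a real recipe, even if the details are off: the onion is added twice, and the soup and biscuit dough never appear. Keep in mind that this prompt came from the training set, so it is not a true held-out test.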
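The decoded text above trails off into a run of `<|endoftext|>` markers, most likely because sequences that finish early are padded with the end-of-text token while others in the batch are still generating. A minimal post-processing sketch, assuming the decoded string is already in hand (the sample string and the helper name `clean_generation` are made up):

```python
END_OF_TEXT = '<|endoftext|>'

def clean_generation(decoded, prompt=''):
    """Drop the prompt and everything from the first end-of-text marker on."""
    text = decoded[len(prompt):] if decoded.startswith(prompt) else decoded
    return text.split(END_OF_TEXT, 1)[0].strip()

# Made-up stand-in for a decoded model output
prompt = ' \n Ingredients: \n 1 egg  \n Instructions:'
decoded = prompt + ' \n fry the egg.  ' + END_OF_TEXT * 3
print(clean_generation(decoded, prompt))
```

Passing `skip_special_tokens=True` to `tokenizer.decode` should also hide these markers, though it does not remove the prompt.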

Just for fun, let's see if the model can handle that **snowboarding prompt** from before.

In [None]:
# Re-run the snowboarding prompt, this time with the fine-tuned model
input_text_snowboarding = 'I hit the slopes early to snowboard, caught the first lift, and by noon I had already'
enc_input = tokenizer.encode(input_text_snowboarding, return_tensors='pt', add_special_tokens=False)
output_sequences = model.generate(
    input_ids = enc_input, # The encoded input from earlier
    max_length = 70, # Max total length in tokens, prompt included
    temperature = 0.9, # Controls randomness: closer to 1 = more creative, closer to 0 = more predictable
    top_k = 20, # Considers only the 20 most likely next tokens
    top_p = 0.9, # Dynamic cutoff: keeps the most likely tokens until their cumulative probability reaches 0.9
    repetition_penalty = 1, # Penalty for repeating tokens; 1 means no penalty
    do_sample = True, # If True, this allows for randomness instead of always picking the highest-probability token
    num_return_sequences = 5 # Number of output sequences
)
for i in range(len(output_sequences)):
  print(f'{i}: {tokenizer.decode(output_sequences[i])}\n')

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


0: I hit the slopes early to snowboard, caught the first lift, and by noon I had already set it. the last lift, the first lift, was the last. it was the last lift. the last lift, the first lift, was the last.
 <|endoftext|><|endoftext|>

1: I hit the slopes early to snowboard, caught the first lift, and by noon I had already finished running the steeps. the last lift was about halfway done, and the final was about halfway done. the lift had been done, and the lift had been done. <|endoftext|>

2: I hit the slopes early to snowboard, caught the first lift, and by noon I had already been able to lift the legs off the rocks. <|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|>

3: I hit the slopes early to s

**Final check: model analysis after fine-tuning**

The outputs above suggest that fine-tuning worked as intended: the model now generates recipe-style instructions but no longer responds meaningfully to off-domain prompts, like snowboarding. This is a sign of domain specialization. Even after only 1,500 training steps, our DistilGPT-2 model has shifted toward recipe generation at the expense of general language tasks.
