<a href="https://colab.research.google.com/github/Zardian18/Recipe-maker-LSTM/blob/master/LSTM_recipes.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
!wget https://raw.githubusercontent.com/Zardian18/helper-functions-colab/master/helper.py

--2024-01-11 21:48:19--  https://raw.githubusercontent.com/Zardian18/helper-functions-colab/master/helper.py
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.109.133, 185.199.111.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.109.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 17171 (17K) [text/plain]
Saving to: ‘helper.py’


2024-01-11 21:48:19 (39.0 MB/s) - ‘helper.py’ saved [17171/17171]



In [2]:
!pip install kaggle



In [3]:
from google.colab import drive
drive.mount("/content/drive")

Mounted at /content/drive


In [4]:
! mkdir -p ~/.kaggle

In [5]:
!cp /content/drive/MyDrive/kaggle/kaggle.json ~/.kaggle/kaggle.json

In [6]:
!chmod 600 ~/.kaggle/kaggle.json

In [7]:
! kaggle datasets download -d hugodarwood/epirecipes

Downloading epirecipes.zip to /content
  0% 0.00/11.3M [00:00<?, ?B/s]
100% 11.3M/11.3M [00:00<00:00, 131MB/s]


In [8]:
! unzip epirecipes.zip

Archive:  epirecipes.zip
  inflating: epi_r.csv               
  inflating: full_format_recipes.json  
  inflating: recipe.py               
  inflating: utils.py                


In [9]:
import numpy as np
import json
import re
import string
import tensorflow as tf
from tensorflow.keras import layers, models, callbacks, losses

## Loading the data

In [10]:
with open("/content/full_format_recipes.json") as json_data:
  recipe_data = json.load(json_data)

In [11]:
len(recipe_data)

20130

In [12]:
recipe_data[:3]

[{'directions': ['1. Place the stock, lentils, celery, carrot, thyme, and salt in a medium saucepan and bring to a boil. Reduce heat to low and simmer until the lentils are tender, about 30 minutes, depending on the lentils. (If they begin to dry out, add water as needed.) Remove and discard the thyme. Drain and transfer the mixture to a bowl; let cool.',
   '2. Fold in the tomato, apple, lemon juice, and olive oil. Season with the pepper.',
   '3. To assemble a wrap, place 1 lavash sheet on a clean work surface. Spread some of the lentil mixture on the end nearest you, leaving a 1-inch border. Top with several slices of turkey, then some of the lettuce. Roll up the lavash, slice crosswise, and serve. If using tortillas, spread the lentils in the center, top with the turkey and lettuce, and fold up the bottom, left side, and right side before rolling away from you.'],
  'fat': 7.0,
  'date': '2006-09-01T04:00:00.000Z',
  'categories': ['Sandwich',
   'Bean',
   'Fruit',
   'Tomato',
  

In [13]:
filtered_data =[
    "Recipe for "+ x["title"]+ " | "+ " ".join(x["directions"])
    for x in recipe_data
    if 'title' in x
    and x['title'] is not None
    and 'directions' in x
    and x['directions'] is not None
    ]

In [14]:
filtered_data[:3]

['Recipe for Lentil, Apple, and Turkey Wrap  | 1. Place the stock, lentils, celery, carrot, thyme, and salt in a medium saucepan and bring to a boil. Reduce heat to low and simmer until the lentils are tender, about 30 minutes, depending on the lentils. (If they begin to dry out, add water as needed.) Remove and discard the thyme. Drain and transfer the mixture to a bowl; let cool. 2. Fold in the tomato, apple, lemon juice, and olive oil. Season with the pepper. 3. To assemble a wrap, place 1 lavash sheet on a clean work surface. Spread some of the lentil mixture on the end nearest you, leaving a 1-inch border. Top with several slices of turkey, then some of the lettuce. Roll up the lavash, slice crosswise, and serve. If using tortillas, spread the lentils in the center, top with the turkey and lettuce, and fold up the bottom, left side, and right side before rolling away from you.',
 'Recipe for Boudin Blanc Terrine with Red Onion Confit  | Combine first 9 ingredients in heavy medium 

## Tokenizing data

In [15]:
def pad_punctuation(s):
  s = re.sub(f"([{string.punctuation}])", r' \1', s)
  s = re.sub(" +", " ", s)
  return s

In [16]:
text_data = [pad_punctuation(x) for x in filtered_data]

In [17]:
text_data[:3]

['Recipe for Lentil , Apple , and Turkey Wrap | 1 . Place the stock , lentils , celery , carrot , thyme , and salt in a medium saucepan and bring to a boil . Reduce heat to low and simmer until the lentils are tender , about 30 minutes , depending on the lentils . (If they begin to dry out , add water as needed . ) Remove and discard the thyme . Drain and transfer the mixture to a bowl ; let cool . 2 . Fold in the tomato , apple , lemon juice , and olive oil . Season with the pepper . 3 . To assemble a wrap , place 1 lavash sheet on a clean work surface . Spread some of the lentil mixture on the end nearest you , leaving a 1 -inch border . Top with several slices of turkey , then some of the lettuce . Roll up the lavash , slice crosswise , and serve . If using tortillas , spread the lentils in the center , top with the turkey and lettuce , and fold up the bottom , left side , and right side before rolling away from you .',
 'Recipe for Boudin Blanc Terrine with Red Onion Confit | Combi

In [18]:
text_ds = tf.data.Dataset.from_tensor_slices(text_data).batch(32).shuffle(1000)

In [19]:
text_ds

<_ShuffleDataset element_spec=TensorSpec(shape=(None,), dtype=tf.string, name=None)>

In [20]:
len(text_data)

20111

In [21]:
# Use a dedicated name instead of shadowing the built-in sum()
total_words = 0
for paragraph in text_data:
  total_words += len(paragraph.split())

print(f"Average words per paragraph: {total_words/len(text_data)}")

Average words per paragraph: 185.03664661130725


In [22]:
vectorize_layer = layers.TextVectorization(
    standardize="lower",  # lowercase only; punctuation is not stripped
    max_tokens = 10000,
    output_mode = "int",
    output_sequence_length = 200+1,
)

In [23]:
vectorize_layer.adapt(text_ds)
vocab = vectorize_layer.get_vocabulary()

In [24]:
vocab[:10]

['', '[UNK]', '.', ',', 'and', 'to', 'in', 'the', 'with', 'a']

In [25]:
vocab[-10:]

['(rectangles',
 '(recrystallize',
 '(ravioli',
 '(raspberries',
 '(raita',
 '(quinoa',
 '(quince',
 '(pâté',
 '(purée',
 '(prosciutto']

In [26]:
vocab[4:14]

['and', 'to', 'in', 'the', 'with', 'a', 'until', '1', 'minutes', 'of']

In [27]:
vectorize_layer((text_data[3]))

<tf.Tensor: shape=(201,), dtype=int64, numpy=
array([  24,   14, 3697, 4229,    6,  267,  252,   51,   25,   15,   32,
          6,   74,   27,   48,   18,   28,  139,   15,    2,   16,  113,
         20,  127,   10,  865,    4,  471,    5,   89,    3,   17,   63,
         12,    2,   16,  249,    4, 1462,  345,    2,   67,   10,  287,
          5,   36,   72,   52,    3,   17,   36,   12,    2,   16,  180,
          8,  100,   20,   80,    5,   67,    2,   85,  208,    8,   22,
          4,   31,    2,   16,  208,    5,   48,  474,  267,   29,    2,
        150,   15,    5,  181,    3,   45,    3,    4,   68,   10,  208,
         33,  183,   98,    3,   17,  340,   12,    2,   84,  305,  446,
        383,    3,   35,  208,    5,  216,    4,  948,    8,  165,    5,
        191,  149,    2,  111,  526,    3,   21,  408,  648,    3,    4,
        254,  174,   23,   51,    6,   48,    2,  544,   15,    5,  153,
          4,   67,   10,   51,   33,  287,    4,  428,    3,   17,  114,
     

In [28]:
text_data[3]

'Recipe for Mahi -Mahi in Tomato Olive Sauce | Heat oil in heavy large skillet over medium -high heat . Add onion ; sauté until translucent and beginning to brown , about 4 minutes . Add wine and anchovy paste . Boil until reduced to 3 /4 cup , about 3 minutes . Add tomatoes with juice ; bring to boil . Sprinkle fish with salt and pepper . Add fish to skillet atop tomato mixture . Reduce heat to low , cover , and simmer until fish is cooked through , about 9 minutes . Using slotted metal spatula , transfer fish to plate and tent with foil to keep warm . Mix olives , 2 teaspoons oregano , and orange peel into sauce in skillet . Increase heat to high and boil until sauce is reduced and thickened , about 6 minutes . Season to taste with salt and pepper . Place 1 fish fillet on each of 4 plates . Pour sauce over and around fish , sprinkle with remaining 1 teaspoon oregano , and serve with warm toasted bread .'

## Creating Training Set

In [29]:
def prepare_inputs(text):
  text = tf.expand_dims(text, axis=-1)
  tokenized_sentences = vectorize_layer(text)
  x = tokenized_sentences[:,:-1]
  y = tokenized_sentences[:,1:]

  return x,y

In [30]:
train_ds = text_ds.map(prepare_inputs)
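
To see what `prepare_inputs` produces, here is a minimal sketch on a toy batch, using plain NumPy instead of `vectorize_layer`. The token ids are made up for illustration; the slicing is the same as in the function above, so the target at each position is the token that follows the corresponding input token.

```python
import numpy as np

# Hypothetical token ids for a batch of two padded sequences (0 = padding)
tokens = np.array([
    [24, 14, 267, 2, 0],
    [24, 14, 51, 25, 2],
])

x = tokens[:, :-1]  # every token except the last
y = tokens[:, 1:]   # every token except the first (inputs shifted by one)

print(x)
print(y)
```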

In [31]:
inputs = layers.Input(shape=(None, ), dtype="int32")
x = layers.Embedding(10000, 100)(inputs)
x = layers.LSTM(128, return_sequences=True)(x)
outputs = layers.Dense(10000, activation="softmax")(x)

lstm = models.Model(inputs, outputs)
lstm.summary()

Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 input_1 (InputLayer)        [(None, None)]            0         
                                                                 
 embedding (Embedding)       (None, None, 100)         1000000   
                                                                 
 lstm (LSTM)                 (None, None, 128)         117248    
                                                                 
 dense (Dense)               (None, None, 10000)       1290000   
                                                                 
Total params: 2407248 (9.18 MB)
Trainable params: 2407248 (9.18 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
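
The parameter counts in the summary can be checked by hand. The sketch below uses the standard formulas for embedding, LSTM, and dense layers with the sizes from the model above; the variable names are only for illustration.

```python
vocab_size = 10000    # matches max_tokens of the vectorization layer
embedding_dim = 100
lstm_units = 128

# Embedding: one vector of size embedding_dim per vocabulary entry
embedding_params = vocab_size * embedding_dim
# LSTM: 4 gates, each with input weights, recurrent weights, and a bias
lstm_params = 4 * (lstm_units * (embedding_dim + lstm_units) + lstm_units)
# Dense: one weight per (LSTM unit, word) pair plus one bias per word
dense_params = lstm_units * vocab_size + vocab_size

total = embedding_params + lstm_params + dense_params
print(embedding_params, lstm_params, dense_params, total)
# → 1000000 117248 1290000 2407248
```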


In [32]:
loss_fn = losses.SparseCategoricalCrossentropy()
lstm.compile("adam", loss_fn)

In [33]:
class TextGenerator(callbacks.Callback):
    def __init__(self, index_to_word, top_k=10):
        super().__init__()
        self.index_to_word = index_to_word
        # Inverse vocabulary mapping: word -> token index
        self.word_to_index = {
            word: index for index, word in enumerate(index_to_word)
        }

    def sample_from(self, probs, temperature):
        # Rescale the probabilities with the temperature, then renormalize
        probs = probs ** (1 / temperature)
        probs = probs / np.sum(probs)
        return np.random.choice(len(probs), p=probs), probs

    def generate(self, start_prompt, max_tokens, temperature):
        # Map prompt words to token indices; unknown words map to 1 ([UNK])
        start_tokens = [
            self.word_to_index.get(x, 1) for x in start_prompt.split()
        ]
        sample_token = None
        info = []
        # Stop at max_tokens or when the padding token (0) is sampled
        while len(start_tokens) < max_tokens and sample_token != 0:
            x = np.array([start_tokens])
            # Predicted next-token probabilities for every position
            y = self.model.predict(x, verbose=0)
            # Sample the next token from the distribution at the last position
            sample_token, probs = self.sample_from(y[0][-1], temperature)
            info.append({"prompt": start_prompt, "word_probs": probs})
            start_tokens.append(sample_token)
            start_prompt = start_prompt + " " + self.index_to_word[sample_token]
        print(f"\ngenerated text:\n{start_prompt}\n")
        return info

    def on_epoch_end(self, epoch, logs=None):
        self.generate("recipe for", max_tokens=100, temperature=1.0)

In [34]:
model_checkpoint_callback = callbacks.ModelCheckpoint(
    filepath="./checkpoint/checkpoint.ckpt",
    save_weights_only=True,
    save_freq="epoch",
    verbose=0,
)

tensorboard_callback = callbacks.TensorBoard(log_dir="./logs")

text_generator = TextGenerator(vocab)

In [35]:
lstm.fit(
    train_ds,
    epochs=25,
    callbacks=[model_checkpoint_callback, tensorboard_callback, text_generator],
)

Epoch 1/25
generated text:
recipe for stir peanut parchment goose for sherry , salt milk , stirring oven with eggs over sauce together 1 a oven until use melting 

Epoch 2/25
generated text:
recipe for nectarine cherries with grilled pearl toasts | whisk oil together 2 tbsp , lightly , , wine , 2 sauce , and sugar in the third of water . cut cake in them vinaigrette , cut ! the egg [UNK] with emulsify tops ) for and refrigerate . preheat the gelato . tossing , about 3 minutes . press the canned , slice bag with the sheet and cornstarch soup , until strainer cook in a bowl . in strainer over whisk in small bowls . preheat oven to lamb . prepare drippings apples with sliced tuna panna

Epoch 3/25
generated text:
recipe for white tea chips | cut sorbet with a the jars five -ounce have coarse the oil forms , turning occasionally , until vegetables are soft , stir to detach , and cover and purée vigorously over high heat , about 5 minutes . cover ; add juice , avocado , and 1 /4 cup water i

<keras.src.callbacks.History at 0x7d44f8a17760>

## Generating Text

In [36]:
def print_probs(info, vocab, top_k=5):
    for step in info:
        print(f"\nPROMPT: {step['prompt']}")
        word_probs = step["word_probs"]
        # Top-k probabilities and their token indices, highest first
        p_sorted = np.sort(word_probs)[::-1][:top_k]
        i_sorted = np.argsort(word_probs)[::-1][:top_k]
        for p, idx in zip(p_sorted, i_sorted):
            print(f"{vocab[idx]}:   \t{np.round(100*p,2)}%")
        print("--------\n")

In [37]:
info = text_generator.generate(
    "recipe for roasted vegetables | chop 1 /", max_tokens=10, temperature=1.0
)


generated text:
recipe for roasted vegetables | chop 1 / 2 inches



In [38]:
print_probs(info, vocab)


PROMPT: recipe for roasted vegetables | chop 1 /
3:   	21.97%
2:   	16.14%
1:   	11.42%
5:   	5.89%
half:   	4.8%
--------


PROMPT: recipe for roasted vegetables | chop 1 / 2
tablespoons:   	8.99%
1:   	6.74%
cups:   	5.26%
garlic:   	4.47%
at:   	4.4%
--------



In [39]:
info = text_generator.generate(
    "recipe for roasted vegetables | chop 1 /", max_tokens=10, temperature=0.2
)


generated text:
recipe for roasted vegetables | chop 1 / 3 /4



In [40]:
print_probs(info, vocab)


PROMPT: recipe for roasted vegetables | chop 1 /
3:   	79.74%
2:   	17.08%
1:   	3.03%
5:   	0.11%
half:   	0.04%
--------


PROMPT: recipe for roasted vegetables | chop 1 / 3
/4:   	100.0%
tablespoons:   	0.0%
-:   	0.0%
pounds:   	0.0%
cups:   	0.0%
--------



In [41]:
info = text_generator.generate(
    "recipe for chocolate ice cream |", max_tokens=7, temperature=1.0
)
print_probs(info, vocab)


generated text:
recipe for chocolate ice cream | peel


PROMPT: recipe for chocolate ice cream |
in:   	30.03%
combine:   	8.0%
bring:   	6.36%
melt:   	5.09%
whisk:   	4.83%
--------



In [42]:
info = text_generator.generate(
    "recipe for chocolate ice cream |", max_tokens=7, temperature=0.2
)
print_probs(info, vocab)


generated text:
recipe for chocolate ice cream | in


PROMPT: recipe for chocolate ice cream |
in:   	99.79%
combine:   	0.13%
bring:   	0.04%
melt:   	0.01%
whisk:   	0.01%
--------



## Custom Generation

In [53]:
import numpy as np

def generate_text_from_model(model, index_to_word, start_prompt, max_tokens=100, temperature=1.0):
    word_to_index = {word: index for index, word in enumerate(index_to_word)}

    def sample_from(probs, temperature):
        probs = probs ** (1 / temperature)
        probs = probs / np.sum(probs)
        return np.random.choice(len(probs), p=probs), probs

    start_tokens = [word_to_index.get(x, 1) for x in start_prompt.split()]
    sample_token = None
    generated_text = ""

    while len(start_tokens) < max_tokens and sample_token != 0:
        x = np.array([start_tokens])
        y = model.predict(x, verbose=0)
        sample_token, _ = sample_from(y[0][-1], temperature)
        start_tokens.append(sample_token)
        generated_text += " " + index_to_word[sample_token]

    return generated_text.strip()

In [46]:
model = lstm
index_to_word = vocab

In [58]:
start_prompt = "recipe for "

generated_text = generate_text_from_model(model, index_to_word, start_prompt, temperature=0.3)
print(f"Generated text:\n{start_prompt + generated_text}")

Generated text:
recipe for grilled pork chops with salsa verde , and dill | preheat oven to 400°f . place 1 tablespoon oil in large bowl . add to dressing and toss to coat . season with salt and pepper . grill until cooked through , about 3 minutes per side . transfer to platter . sprinkle with remaining 1 /2 teaspoon sesame seeds .


In [63]:
start_prompt = "recipe for "

generated_text = generate_text_from_model(model, index_to_word, start_prompt, temperature=0.2)
print("Recipe for:")
generated_text

Recipe for:


'grilled chicken with lemon , fennel , and mint | preheat oven to 400°f . toss together all ingredients in a large bowl . season with salt and pepper . heat oil in a large nonstick skillet over medium -high heat until hot but not smoking . working in batches , cook chicken in batches , turning once , until golden brown , about 5 minutes . turn off heat and cook until golden brown , about 5 minutes total . transfer to a platter and keep warm .'

In [65]:
start_prompt = input("Enter what would you want to generate recipe for: ")

generated_text = generate_text_from_model(model, index_to_word, start_prompt, temperature=0.2)
print("Recipe :")
generated_text

Enter what would you want to generate recipe for: recipe for chocolate brownie | 
Recipe :


'preheat oven to 350°f . butter a 9 -inch -diameter cake pan with 1 1 /2 -inch -high sides with nonstick spray . line bottom of pan with parchment paper . sift flour , baking powder , and salt into medium bowl . using electric mixer , beat butter and butter in large bowl until fluffy . add eggs 1 at a time , beating well after each addition . beat in eggs 1 at a time , beating well after each addition . beat in eggs 1 at a time . add eggs 1'

Because each token is sampled at random, a different recipe is generated on every run; the temperature controls how strongly the sampling favors the most probable tokens.
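
Since `sample_from` draws tokens with `np.random.choice`, fixing NumPy's global seed before sampling should make a run repeatable. A minimal sketch with a made-up probability vector (not model output); `sample_tokens` is a hypothetical helper used only for this illustration:

```python
import numpy as np

# Made-up next-token probabilities, for illustration only
probs = np.array([0.5, 0.3, 0.2])

def sample_tokens(n, seed):
    # Hypothetical helper: reseed the global NumPy generator, then draw n tokens
    np.random.seed(seed)
    return [int(np.random.choice(len(probs), p=probs)) for _ in range(n)]

first = sample_tokens(10, seed=42)
second = sample_tokens(10, seed=42)
print(first == second)  # True: same seed, same samples
```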

In [68]:
start_prompt = input("Enter what would you want to generate recipe for: ")

generated_text = generate_text_from_model(model, index_to_word, start_prompt, temperature=0.2)
print("Recipe :")
generated_text

Enter what would you want to generate recipe for: cold coffee | 
Recipe :


'storage -speed /off foamy simmer ) blend brown sugar , brown sugar , and salt in a small saucepan over medium heat , stirring constantly , until mixture is reduced to about 2 1 /2 cups , about 3 minutes . stir in vanilla and stir until smooth . add 1 /2 cup sugar and 1 /4 cup water and stir until melted and smooth . pour into a large bowl and chill , covered , until cold , about 1 hour .'

In [72]:
start_prompt = input("Enter what would you want to generate recipe for: ")

generated_text = generate_text_from_model(model, index_to_word, start_prompt, temperature=0.5)
print("Recipe :")
generated_text

Enter what would you want to generate recipe for: cold coffee | 
Recipe :


'generally angle tons touch soda white chocolate and almond frosting | in a food processor blend the flour , the sugar , the salt , the cinnamon , the salt , the cinnamon , and the sugar until it resembles coarse meal with some of the butter . make a 1 /2 -inch thick , drop in the water , cover , and refrigerate for 1 hour . in a small bowl , combine the apricots , sugar , and 1 /2 cup of the syrup . stir in the remaining 3 /4 cup sugar . set'

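Different temperatures produce different recipes: lower values give more predictable, repetitive text, while higher values give more varied but less coherent text.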
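
The effect of temperature can be reproduced without the model. The sketch below applies the same rescaling as `sample_from` to a made-up distribution; the numbers are illustrative, not model output, and `apply_temperature` is a name chosen for this example.

```python
import numpy as np

def apply_temperature(probs, temperature):
    # Same rescaling as sample_from: raise to 1/temperature, then renormalize
    scaled = probs ** (1 / temperature)
    return scaled / np.sum(scaled)

# Made-up next-token distribution, for illustration only
probs = np.array([0.4, 0.3, 0.2, 0.1])

for t in (1.0, 0.5, 0.2):
    print(t, np.round(apply_temperature(probs, t), 3))
```

As the temperature drops, probability mass concentrates on the most likely token. This matches the `print_probs` output earlier, where the top candidate rises from 21.97% at temperature 1.0 to 79.74% at 0.2.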

In [74]:
start_prompt = input("Enter what would you want to generate recipe for: ")

generated_text = generate_text_from_model(model, index_to_word, start_prompt, temperature=0.2)
print("Recipe :")
generated_text

Enter what would you want to generate recipe for: pizza | 
Recipe :


'help wall least harden low low low low low low dissolve the flour a food processor ) and pulse the dough in a food processor until it is very smooth . add the flour and pulse the dough until the dough is smooth . knead the dough in a food processor until smooth . transfer the dough to a floured surface and knead it until the dough is smooth . knead the dough until smooth and elastic , adding more water by 1 inch of the dough for 1 1 /2 hours , until the dough is very'
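
The prompts `cold coffee |` and `pizza |` do not follow the `recipe for <title> |` pattern of the training strings, which may be why the generated text starts with unrelated words. A hypothetical helper (`build_prompt` is not part of the notebook) could format user input like the training data, reusing the logic of `pad_punctuation` and the lowercasing done by the vectorization layer:

```python
import re
import string

def pad_punctuation(s):
    # Same logic as the notebook's pad_punctuation: add a space before punctuation
    s = re.sub(f"([{string.punctuation}])", r" \1", s)
    return re.sub(" +", " ", s)

def build_prompt(title):
    # Hypothetical helper: mimic the "Recipe for <title> | ..." training format
    return pad_punctuation(f"Recipe for {title} |").lower().strip()

print(build_prompt("Cold Coffee"))
```

The returned string can then be passed as `start_prompt` to `generate_text_from_model`.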

In [76]:
start_prompt = input("Enter what would you want to generate recipe for: ")

generated_text = generate_text_from_model(model, index_to_word, start_prompt, temperature=1)
print("Recipe :")
generated_text

Enter what would you want to generate recipe for: pizza | 
Recipe :


'2 lukewarm milk for 3 /4 cup of frozen liquid . 2 . lightly with water , drop the brittle quarters into the small bowl and cover and freeze for 30 minutes , then cover and refrigerate . 2 . preheat the oven to 350°f . 325°f with each of the cheddar , and cook for 2 minutes , about 7 to 10 minutes , or until the eggs are firm . while time , break the cobs into salt and ground to combine . before serving , you should be stiff and about completely before you allow'

In [77]:
start_prompt = input("Enter what would you want to generate recipe for: ")

generated_text = generate_text_from_model(model, index_to_word, start_prompt, temperature=1)
print("Recipe :")
generated_text

Enter what would you want to generate recipe for: recipe for 
Recipe :


'braised turkey and peppers with tomatoes , olives , and vinegar | combine anchovies and garlic in blender . whisk until beginning to smoke , about 3 tablespoons soy sauce , without running water until vegetables are soft and red wine is almost at color . transfer to small bowl ; mash peas with some salt and pepper . add chicken to skillet and pork chops . pour marinade into plastic and secure with cheese . sprinkle fish with fish and marinade can be prepared 3 hours ahead . cover tightly with plastic . refrigerate until turkey can'

In [81]:
start_prompt = "recipe for "

generated_text = generate_text_from_model(model, index_to_word, start_prompt, temperature=0.75)
print("Recipe for:")
generated_text

Recipe for:


'coconut lime pie | preheat oven to 400°f . finely grind yolks and flour in processor . add sugar and ground cardamom . blend until just combined . process in bowl with coarse puree . using on /off turns , grind spice mixture in processor until smooth . transfer to small bowl . cool slightly . mix in orange peel . mix in parsley , and next 5 ingredients . process until smooth . transfer mixture to large bowl . whisk in cream mixture . spread glaze over cake and bake until golden brown and center is golden'