# 🥙 LSTM on Recipe Data

In this notebook, we'll walk through the steps required to train your own LSTM to generate text from the recipes dataset.

In [1]:
%load_ext autoreload
%autoreload 2

import numpy as np
import json
import re
import string

import tensorflow as tf
from tensorflow.keras import layers, models, callbacks, losses


## 0. Parameters <a name="parameters"></a>

In [24]:
VOCAB_SIZE = 10000
MAX_LEN = 200
EMBEDDING_DIM = 100
N_UNITS = 128
VALIDATION_SPLIT = 0.2
SEED = 42
LOAD_MODEL = False
BATCH_SIZE = 32
EPOCHS = 25

## 1. Load the data <a name="load"></a>

In [2]:
# Load the full dataset
with open("/app/data/epirecipes/full_format_recipes.json") as json_data:
    recipe_data = json.load(json_data)

In [3]:
# Filter the dataset
filtered_data = [
    "Recipe for " + x["title"] + " | " + " ".join(x["directions"])
    for x in recipe_data
    if "title" in x
    and x["title"] is not None
    and "directions" in x
    and x["directions"] is not None
]

In [18]:
# Count the recipes
n_recipes = len(filtered_data)
print(f"{n_recipes} recipes loaded")

20111 recipes loaded


In [20]:
example = filtered_data[1]
print(example)

Recipe for Boudin Blanc Terrine with Red Onion Confit  | Combine first 9 ingredients in heavy medium saucepan. Add 3 shallots. Bring to simmer. Remove from heat, cover and let stand 30 minutes. Chill overnight. Preheat oven to 325°F. Line 7-cup pâté or bread pan with plastic wrap. Melt butter in heavy small skillet over low heat. Add remaining 5 shallots. Cover and cook until very soft, stirring occasionally, about 15 minutes. Transfer to processor. Add pork, eggs, flour and Port and puree. Strain cream mixture, pressing on solids to extract as much liquid as possible. With processor running, add cream through feed tube and process just until combined with pork. Transfer to large bowl. Mix in currants. Spoon mixture into prepared pan. Cover with foil. Place pan in large pan. Add boiling water to larger pan to within 1/2 inch of top of terrine. Bake until terrine begins to shrink from sides of pan and knife inserted into center comes out clean, about 1 1/2 hours. Uncover and cool on rac

## 2. Tokenise the data

In [21]:
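# Pad punctuation with spaces so that each mark is treated as a separate 'word'
def pad_punctuation(s):
    # re.escape keeps characters such as "]" and "\" literal inside the character class
    s = re.sub(f"([{re.escape(string.punctuation)}])", r" \1 ", s)
    # Collapse the runs of spaces introduced by the padding
    s = re.sub(" +", " ", s)
    return s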


text_data = [pad_punctuation(x) for x in filtered_data]

In [22]:
# Display an example of a recipe
example_data = text_data[9]
example_data

'Recipe for Ham Persillade with Mustard Potato Salad and Mashed Peas | Chop enough parsley leaves to measure 1 tablespoon ; reserve . Chop remaining leaves and stems and simmer with broth and garlic in a small saucepan , covered , 5 minutes . Meanwhile , sprinkle gelatin over water in a medium bowl and let soften 1 minute . Strain broth through a fine - mesh sieve into bowl with gelatin and stir to dissolve . Season with salt and pepper . Set bowl in an ice bath and cool to room temperature , stirring . Toss ham with reserved parsley and divide among jars . Pour gelatin on top and chill until set , at least 1 hour . Whisk together mayonnaise , mustard , vinegar , 1 / 4 teaspoon salt , and 1 / 4 teaspoon pepper in a large bowl . Stir in celery , cornichons , and potatoes . Pulse peas with marjoram , oil , 1 / 2 teaspoon pepper , and 1 / 4 teaspoon salt in a food processor to a coarse mash . Layer peas , then potato salad , over ham . '

In [25]:
# Convert to a TensorFlow Dataset (batched first, so the shuffle reorders whole batches)
text_ds = (
    tf.data.Dataset.from_tensor_slices(text_data)
    .batch(BATCH_SIZE)
    .shuffle(1000)
)

In [26]:
# Create a vectorisation layer
vectorize_layer = layers.TextVectorization(
    standardize="lower",
    max_tokens=VOCAB_SIZE,
    output_mode="int",
    output_sequence_length=MAX_LEN + 1,
)

In [31]:
# Adapt the layer to the training set
vectorize_layer.adapt(text_ds)
vocab = vectorize_layer.get_vocabulary()

In [34]:
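# Display some word mappings, skipping vocab[0] (padding) and vocab[1] ('[UNK]');
# the printed index is therefore the token id minus 2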
for i, word in enumerate(vocab[2:50]):
    print(f"{i}: {word}")

0: .
1: ,
2: and
3: to
4: in
5: the
6: with
7: a
8: until
9: 1
10: minutes
11: -
12: of
13: 2
14: for
15: heat
16: add
17: about
18: over
19: bowl
20: ;
21: /
22: salt
23: into
24: recipe
25: |
26: on
27: medium
28: large
29: mixture
30: 4
31: pepper
32: (
33: )
34: 3
35: oil
36: is
37: water
38: transfer
39: or
40: stir
41: cook
42: pan
43: remaining
44: then
45: oven
46: stirring
47: cover


In [35]:
# Display the same example converted to ints
example_tokenised = vectorize_layer(example_data)
print(example_tokenised.numpy())

[  26   16  557    1    8  298  335  189    4 1054  494   27  332  228
  235  262    5  594   11  133   22  311    2  332   45  262    4  671
    4   70    8  171    4   81    6    9   65   80    3  121    3   59
   12    2  299    3   88  650   20   39    6    9   29   21    4   67
  529   11  164    2  320  171  102    9  374   13  643  306   25   21
    8  650    4   42    5  931    2   63    8   24    4   33    2  114
   21    6  178  181 1245    4   60    5  140  112    3   48    2  117
  557    8  285  235    4  200  292  980    2  107  650   28   72    4
  108   10  114    3   57  204   11  172    2   73  110  482    3  298
    3  190    3   11   23   32  142   24    3    4   11   23   32  142
   33    6    9   30   21    2   42    6  353    3 3224    3    4  150
    2  437  494    8 1281    3   37    3   11   23   15  142   33    3
    4   11   23   32  142   24    6    9  291  188    5    9  412  572
    2  230  494    3   46  335  189    3   20  557    2    0    0    0
    0 
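To make the word-to-integer mapping concrete, here is a minimal pure-Python sketch of a frequency-ranked vocabulary lookup, assuming lowercasing and whitespace splitting as configured for the layer above. The helpers `build_vocab` and `vectorize` and the toy sentences are hypothetical illustrations, not the `TextVectorization` implementation.

```python
from collections import Counter


def build_vocab(texts, max_tokens):
    """Rank words by frequency; ids 0 and 1 are reserved for padding and unknown words."""
    counts = Counter(word for text in texts for word in text.lower().split())
    most_common = [word for word, _ in counts.most_common(max_tokens - 2)]
    return ["", "[UNK]"] + most_common


def vectorize(text, vocab, seq_len):
    """Map words to ids, truncate to seq_len, and right-pad with 0."""
    lookup = {word: index for index, word in enumerate(vocab)}
    ids = [lookup.get(word, 1) for word in text.lower().split()][:seq_len]
    return ids + [0] * (seq_len - len(ids))


# Hypothetical toy corpus, already punctuation-padded
toy_texts = [
    "recipe for soup | boil water .",
    "recipe for salad | chop lettuce .",
]
toy_vocab = build_vocab(toy_texts, max_tokens=8)
toy_ids = vectorize("Recipe for pizza | bake .", toy_vocab, seq_len=8)

print(toy_vocab)
print(toy_ids)  # "pizza" and "bake" map to 1 ([UNK]); trailing zeros are padding
```

The trailing zeros in the real output above come from the same padding rule: recipes shorter than `MAX_LEN + 1` tokens are filled with token 0.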

## 3. Create the Training Set

In [37]:
# Create the training set of recipes and the same text shifted by one word
def prepare_inputs(text):
    text = tf.expand_dims(text, -1)
    tokenized_sentences = vectorize_layer(text)
    x = tokenized_sentences[:, :-1]
    y = tokenized_sentences[:, 1:]
    return x, y


train_ds = text_ds.map(prepare_inputs)
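The slicing in `prepare_inputs` can be illustrated on a single short sequence. The sketch below uses plain Python lists rather than tensors, and the helper name `shift_by_one` is hypothetical; the token ids are the first six of the vectorised example shown earlier.

```python
def shift_by_one(token_ids):
    """Return (inputs, targets) where targets are the inputs shifted left by one."""
    return token_ids[:-1], token_ids[1:]


# First six token ids of the vectorised example shown earlier
toy_tokens = [26, 16, 557, 1, 8, 298]
x, y = shift_by_one(toy_tokens)

# At each position the model sees the input token and must predict the next one
for position, (seen, target) in enumerate(zip(x, y)):
    print(f"position {position}: input token {seen} -> target token {target}")
```

This is also why the vectorisation layer uses `output_sequence_length=MAX_LEN + 1`: dropping one token from each end leaves inputs and targets of length `MAX_LEN`.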

## 4. Build the LSTM <a name="build"></a>

In [54]:
inputs = layers.Input(shape=(None,), dtype="int32")
x = layers.Embedding(VOCAB_SIZE, EMBEDDING_DIM)(inputs)
x = layers.LSTM(N_UNITS, return_sequences=True)(x)
outputs = layers.Dense(VOCAB_SIZE, activation="softmax")(x)
lstm = models.Model(inputs, outputs)
lstm.summary()

Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 input_1 (InputLayer)        [(None, None)]            0         
                                                                 
 embedding (Embedding)       (None, None, 100)         1000000   
                                                                 
 lstm (LSTM)                 (None, None, 128)         117248    
                                                                 
 dense (Dense)               (None, None, 10000)       1290000   
                                                                 
Total params: 2,407,248
Trainable params: 2,407,248
Non-trainable params: 0
_________________________________________________________________
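The parameter counts in the summary can be checked by hand. The sketch below assumes the standard LSTM layout of four gate blocks, each with a weight matrix over the concatenated input and hidden state plus a bias vector; the variable names are illustrative.

```python
VOCAB_SIZE = 10000
EMBEDDING_DIM = 100
N_UNITS = 128

# One EMBEDDING_DIM-sized vector per vocabulary entry
embedding_params = VOCAB_SIZE * EMBEDDING_DIM

# Four gate blocks, each with weights over [input, hidden] plus a bias
lstm_params = 4 * (N_UNITS * (EMBEDDING_DIM + N_UNITS) + N_UNITS)

# Dense layer: weights from every LSTM unit to every vocabulary entry, plus biases
dense_params = N_UNITS * VOCAB_SIZE + VOCAB_SIZE

total_params = embedding_params + lstm_params + dense_params
print(embedding_params, lstm_params, dense_params, total_params)
```

The four values agree with the `Param #` column and the total reported by `lstm.summary()`.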


In [55]:
if LOAD_MODEL:
    # Reload a previously saved model instead of training from scratch
    lstm = models.load_model("./models/lstm", compile=False)

## 5. Train the LSTM <a name="train"></a>

In [56]:
loss_fn = losses.SparseCategoricalCrossentropy()
lstm.compile("adam", loss_fn)
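As a sanity check on the loss scale, sparse categorical cross-entropy is the negative log of the probability assigned to the correct token. The sketch below, in plain Python with a hypothetical helper `sparse_cross_entropy`, computes the loss of a model that spreads its probability uniformly over the vocabulary, which is roughly what an untrained network does.

```python
import math

VOCAB_SIZE = 10000


def sparse_cross_entropy(probs, target_index):
    """Negative log-probability assigned to the correct token."""
    return -math.log(probs[target_index])


# A uniform prediction gives every token probability 1 / VOCAB_SIZE
uniform_probs = [1.0 / VOCAB_SIZE] * VOCAB_SIZE
uniform_loss = sparse_cross_entropy(uniform_probs, target_index=42)
print(f"{uniform_loss:.4f}")  # prints 9.2103
```

This value, ln(10000), is close to the loss of 9.2105 reported for the first training step in the log, which suggests the model starts from a near-uniform prediction.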

In [57]:
# Create a TextGenerator callback that samples text at the end of each epoch
class TextGenerator(callbacks.Callback):
    def __init__(self, index_to_word, top_k=10):  # top_k is accepted but unused
        super().__init__()
        self.index_to_word = index_to_word
        # Inverse vocabulary mapping: word -> token id
        self.word_to_index = {
            word: index for index, word in enumerate(index_to_word)
        }

    def sample_from(self, probs, temperature):
        # Temperature scaling: values below 1 sharpen the distribution,
        # values above 1 flatten it
        probs = probs ** (1 / temperature)
        probs = probs / np.sum(probs)
        return np.random.choice(len(probs), p=probs), probs

    def generate(self, start_prompt, max_tokens, temperature):
        # Map prompt words to token ids; unknown words become 1 ([UNK])
        start_tokens = [
            self.word_to_index.get(x, 1) for x in start_prompt.split()
        ]
        sample_token = None
        info = []
        # Stop at max_tokens or when the padding token (0) is sampled
        while len(start_tokens) < max_tokens and sample_token != 0:
            x = np.array([start_tokens])
            # Predicted probabilities for every position in the sequence
            y = self.model.predict(x, verbose=0)
            # Sample the next token from the distribution at the last position
            sample_token, probs = self.sample_from(y[0][-1], temperature)
            info.append({"prompt": start_prompt, "word_probs": probs})
            # Feed the sampled token back in as part of the next input
            start_tokens.append(sample_token)
            start_prompt = start_prompt + " " + self.index_to_word[sample_token]
        print(f"\ngenerated text:\n{start_prompt}\n")
        return info

    def on_epoch_end(self, epoch, logs=None):
        self.generate("recipe for", max_tokens=100, temperature=1.0)

In [58]:
# Create a model save checkpoint
model_checkpoint_callback = callbacks.ModelCheckpoint(
    filepath="./checkpoint/checkpoint.ckpt",
    save_weights_only=True,
    save_freq="epoch",
    verbose=0,
)

tensorboard_callback = callbacks.TensorBoard(log_dir="./logs")

# Create the text generator callback from the vocabulary
text_generator = TextGenerator(vocab)

In [59]:
lstm.fit(
    train_ds,
    epochs=EPOCHS,
    callbacks=[model_checkpoint_callback, tensorboard_callback, text_generator],
)

Epoch 1/25


  1/629 [..............................] - ETA: 1:29:12 - loss: 9.2105
  2/629 [..............................] - ETA: 17:54 - loss: 9.2084
  3/629 [..............................] - ETA: 20:02 - loss: 9.2062
  4/629 [..............................] - ETA: 28:14 - loss: 9.2038

generated text:
recipe for roasted in port uncovered to 350°f , with ingredients in 1 1 to onion . cover , in large - center . high in help mixer cooking and 2 liquid with 1 just the remains to salt . large heated . pepper 

Epoch 2/25
generated text:
recipe for melon - green rice with raisins with blackberry slaw | preheat oven to 300°f . 

Epoch 3/25
generated text:
recipe for poached vegetables | preheat oven to 400°f . mix off potatoes , sautéed garlic , and pepper in a same bowl with 1 / 2 tsp . ( can be mixer shell turn over high heat and turn in batches , then chill flavors , about 16 minutes . 

Epoch 4/25
generated text:
recipe for plantain tart with yogurt | roast pork , cinnamon , vegetable , orange juice , white garlic and rosemary sprigs with shortening . beat in bowl gently dressing to grape . grill until tender , about 3 / 2 hours . ) transfer pork to chunks . dip tortilla between evenly jars for prepared pan . spread onto tomatoes . 

Epoch 5/25
generated text:
recipe f

<keras.callbacks.History at 0x7f597c12f3d0>

In [60]:
# Save the final model
lstm.save("./models/lstm")



INFO:tensorflow:Assets written to: ./models/lstm/assets




## 6. Generate text using the LSTM

In [61]:
def print_probs(info, vocab, top_k=5):
    # For each generation step, print the top_k most likely next words
    for step in info:
        print(f"\nPROMPT: {step['prompt']}")
        word_probs = step["word_probs"]
        # Indices of the top_k probabilities, highest first
        i_sorted = np.argsort(word_probs)[::-1][:top_k]
        for idx in i_sorted:
            print(f"{vocab[idx]}:   \t{np.round(100*word_probs[idx],2)}%")
        print("--------\n")

In [62]:
info = text_generator.generate(
    "recipe for roasted vegetables | chop 1 /", max_tokens=10, temperature=1.0
)


generated text:
recipe for roasted vegetables | chop 1 / 8 inch



In [63]:
print_probs(info, vocab)


PROMPT: recipe for roasted vegetables | chop 1 /
4:   	42.67%
2:   	39.88%
3:   	9.52%
8:   	6.4%
16:   	0.36%
--------


PROMPT: recipe for roasted vegetables | chop 1 / 8
-:   	39.74%
inch:   	32.56%
cup:   	8.99%
of:   	3.21%
teaspoon:   	3.1%
--------



In [64]:
info = text_generator.generate(
    "recipe for roasted vegetables | chop 1 /", max_tokens=10, temperature=0.2
)


generated text:
recipe for roasted vegetables | chop 1 / 4 cup



In [65]:
print_probs(info, vocab)


PROMPT: recipe for roasted vegetables | chop 1 /
4:   	58.36%
2:   	41.6%
3:   	0.03%
8:   	0.0%
16:   	0.0%
--------


PROMPT: recipe for roasted vegetables | chop 1 / 4
-:   	56.54%
cup:   	42.11%
inch:   	1.35%
of:   	0.0%
to:   	0.0%
--------
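The effect of temperature can be reproduced by hand from the printed probabilities. The sketch below applies the same rescaling as `sample_from` to the top five probabilities reported at `temperature=1.0` for the word following "1 /". It is written in plain Python, the helper name `apply_temperature` is hypothetical, and the rest of the vocabulary is ignored on the assumption that its contribution is negligible.

```python
def apply_temperature(probs, temperature):
    """Rescale probabilities as in sample_from: p ** (1 / T), then renormalise."""
    scaled = [p ** (1.0 / temperature) for p in probs]
    total = sum(scaled)
    return [s / total for s in scaled]


# Top-5 next-word probabilities printed earlier for temperature=1.0
top5_words = ["4", "2", "3", "8", "16"]
top5_probs = [0.4267, 0.3988, 0.0952, 0.064, 0.0036]

sharpened = apply_temperature(top5_probs, temperature=0.2)
for word, p in zip(top5_words, sharpened):
    print(f"{word}: {100 * p:.2f}%")
```

The rescaled values land close to the 58.36% and 41.6% reported for "4" and "2" at `temperature=0.2`, while the less likely candidates shrink to almost nothing. Raising each probability to the power 1 / 0.2 = 5 widens the gap between likely and unlikely words before renormalising.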



In [66]:
info = text_generator.generate(
    "recipe for chocolate ice cream |", max_tokens=7, temperature=1.0
)
print_probs(info, vocab)


generated text:
recipe for chocolate ice cream | preheat


PROMPT: recipe for chocolate ice cream |
preheat:   	19.03%
bring:   	9.78%
in:   	9.7%
1:   	5.74%
whisk:   	5.23%
--------



In [67]:
info = text_generator.generate(
    "recipe for chocolate ice cream |", max_tokens=7, temperature=0.2
)
print_probs(info, vocab)


generated text:
recipe for chocolate ice cream | preheat


PROMPT: recipe for chocolate ice cream |
preheat:   	92.97%
bring:   	3.34%
in:   	3.19%
1:   	0.23%
whisk:   	0.15%
--------

