# Generative AI

Generative AI encompasses artificial intelligence techniques that produce new content, such as text, images, audio, or video, that resembles or is inspired by existing data. These models learn patterns from training data and generate new samples that follow those patterns.

### Downloading the IMDB Dataset

The dataset consists of IMDB movie reviews and is commonly used for sentiment analysis. Each labeled review is classified as either positive or negative.

In [1]:
!wget https://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz

--2024-08-07 01:43:23--  https://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz
Resolving ai.stanford.edu (ai.stanford.edu)... 171.64.68.10
Connecting to ai.stanford.edu (ai.stanford.edu)|171.64.68.10|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 84125825 (80M) [application/x-gzip]
Saving to: ‘aclImdb_v1.tar.gz’


2024-08-07 01:43:25 (41.1 MB/s) - ‘aclImdb_v1.tar.gz’ saved [84125825/84125825]



The command `!tar -xf aclImdb_v1.tar.gz` extracts the contents of the `aclImdb_v1.tar.gz` archive file in the current directory.

In [2]:
!tar -xf aclImdb_v1.tar.gz
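
Where shell commands are unavailable, the same extraction can be sketched with Python's standard `tarfile` module. The helper name `extract_archive` is an illustrative assumption and is not part of the original notebook.

```python
import tarfile

def extract_archive(archive_path, destination="."):
    """Extract a gzip-compressed tar archive into `destination`."""
    with tarfile.open(archive_path, "r:gz") as archive:
        archive.extractall(destination)

# Hypothetical usage, assuming the download above has completed:
# extract_archive("aclImdb_v1.tar.gz")
```

Only extract archives from sources you trust, since `extractall` writes whatever paths the archive contains.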

# Text Generation in Keras

In [3]:
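import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Load every review under aclImdb as raw text; no labels are needed for language modeling
dataset = keras.utils.text_dataset_from_directory(
    'aclImdb', batch_size=256, label_mode=None)
# Replace the HTML line-break tags in the reviews with spaces
dataset = dataset.map(lambda x: tf.strings.regex_replace(x, '<br />', ' '))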

Found 100006 files.


In [4]:
dataset

<_MapDataset element_spec=TensorSpec(shape=(None,), dtype=tf.string, name=None)>

# Text Vectorization

In [5]:
from tensorflow.keras.layers import TextVectorization

sequence_length = 100  # each review is truncated or zero-padded to 100 tokens
vocab_size = 15000     # keep at most 15,000 vocabulary entries
text_vectorization = TextVectorization(
    max_tokens=vocab_size,
    output_mode='int',
    output_sequence_length=sequence_length,
)
text_vectorization.adapt(dataset)
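
To make the integer encoding concrete, here is a rough plain-Python sketch of what a layer configured this way does: lowercase the text, strip punctuation, split on whitespace, map words to indices, then truncate or zero-pad to a fixed length. The functions `standardize`, `build_vocab`, and `vectorize` are illustrative stand-ins, not Keras APIs, and the alphabetical tie-breaking is an assumption made only to keep the toy output deterministic.

```python
import string

def standardize(text):
    """Lowercase, strip punctuation, and split on whitespace."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return cleaned.split()

def build_vocab(texts, max_tokens):
    """Toy vocabulary: index 0 is padding, index 1 is out-of-vocabulary."""
    counts = {}
    for text in texts:
        for word in standardize(text):
            counts[word] = counts.get(word, 0) + 1
    # Most frequent words first; ties broken alphabetically (an assumption)
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    return ["", "[UNK]"] + ranked[: max_tokens - 2]

def vectorize(text, vocab, sequence_length):
    """Map words to indices, then truncate or zero-pad to a fixed length."""
    index = {word: i for i, word in enumerate(vocab)}
    ids = [index.get(word, 1) for word in standardize(text)]
    ids = ids[:sequence_length]
    return ids + [0] * (sequence_length - len(ids))

toy_vocab = build_vocab(["This movie was great", "This movie was bad"], max_tokens=6)
toy_ids = vectorize("This movie was awful!", toy_vocab, sequence_length=6)
print(toy_vocab)  # ['', '[UNK]', 'movie', 'this', 'was', 'bad']
print(toy_ids)    # [3, 2, 4, 1, 0, 0]
```

The unseen word "awful" maps to index 1, and the two trailing zeros are padding.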

# Using a TextVectorization layer to create a language modeling dataset

In [6]:
def prepare_lm_dataset(text_batch):
    vectorized_sequences = text_vectorization(text_batch)
    # Prepare input data (x) by excluding the last element of each sequence
    x = vectorized_sequences[:, :-1]
    # Prepare target data (y) by excluding the first element of each sequence
    y = vectorized_sequences[:, 1:]
    # Return the prepared input and target data
    return x, y

# Apply the prepare_lm_dataset function to each element in the dataset
lm_dataset = dataset.map(prepare_lm_dataset, num_parallel_calls=4)
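
The offset-by-one slicing in `prepare_lm_dataset` can be illustrated on a single toy sequence with plain Python lists. The helper name and the token ids below are made up for illustration.

```python
def shift_for_language_modeling(token_ids):
    """Return (inputs, targets) where targets[i] is the token after inputs[i]."""
    inputs = token_ids[:-1]   # drop the last token
    targets = token_ids[1:]   # drop the first token
    return inputs, targets

toy_sequence = [3, 2, 4, 7, 0, 0]
toy_x, toy_y = shift_for_language_modeling(toy_sequence)
print(toy_x)  # [3, 2, 4, 7, 0]
print(toy_y)  # [2, 4, 7, 0, 0]
```

With `sequence_length = 100`, each input and target sequence therefore has 99 positions.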


# Transformer-Based Sequence-to-Sequence Model

In [7]:
import tensorflow as tf
from tensorflow.keras import layers

class PositionalEmbedding(layers.Layer):
    def __init__(self, sequence_length, input_dim, output_dim, **kwargs):
        super().__init__(**kwargs)
        self.token_embeddings = layers.Embedding(
            input_dim=input_dim, output_dim=output_dim
        )
        self.position_embeddings = layers.Embedding(
            input_dim=sequence_length, output_dim=output_dim
        )
        self.sequence_length = sequence_length
        self.input_dim = input_dim
        self.output_dim = output_dim

    def call(self, inputs):
        length = tf.shape(inputs)[-1]
        positions = tf.range(start=0, limit=length, delta=1)
        embedded_tokens = self.token_embeddings(inputs)
        embedded_positions = self.position_embeddings(positions)
        return embedded_tokens + embedded_positions

    def compute_mask(self, inputs, mask=None):
        return tf.math.not_equal(inputs, 0)

    def get_config(self):
        config = super().get_config()
        config.update({
            "output_dim": self.output_dim,
            "sequence_length": self.sequence_length,
            "input_dim": self.input_dim
        })
        return config
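
As a minimal sketch of what `call` computes, the plain-Python example below adds a position vector to each token vector. The two lookup tables are tiny hand-written stand-ins for the learned `Embedding` weights, and all values are made up.

```python
def positional_embedding(token_ids, token_table, position_table):
    """Add the vector for position i to the vector of the token at position i."""
    return [
        [t + p for t, p in zip(token_table[token_id], position_table[position])]
        for position, token_id in enumerate(token_ids)
    ]

# Hypothetical 2-dimensional embedding tables (3 tokens, 3 positions)
toy_token_table = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
toy_position_table = [[0.1, 0.0], [0.2, 0.0], [0.3, 0.0]]

toy_embedded = positional_embedding([2, 2, 1], toy_token_table, toy_position_table)
```

Token 2 appears at positions 0 and 1 and receives a different vector each time, which is how the model can tell the two occurrences apart.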


The `TransformerEncoder` class implements a transformer encoder layer with multi-head attention, feed-forward network, and residual connections with layer normalization.

In [8]:
class TransformerEncoder(layers.Layer):
    def __init__(self, embed_dim, dense_dim, num_heads, **kwargs):
        super().__init__(**kwargs)
        self.embed_dim = embed_dim
        self.dense_dim = dense_dim
        self.num_heads = num_heads

        # Multi-head attention layer
        self.attention = layers.MultiHeadAttention(
            num_heads=num_heads, key_dim=embed_dim
        )

        # Feed-forward network
        self.dense_proj = tf.keras.Sequential([
            layers.Dense(dense_dim, activation="relu"),
            layers.Dense(embed_dim),
        ])

        # Layer normalization layers
        self.layernorm_1 = layers.LayerNormalization()
        self.layernorm_2 = layers.LayerNormalization()

    def call(self, inputs, mask=None):
        if mask is not None:
            mask = mask[:, tf.newaxis, :]  # Adjust mask shape for attention

        # Apply multi-head attention
        attention_output = self.attention(
            query=inputs, value=inputs, key=inputs, attention_mask=mask
        )

        # Apply residual connection and layer normalization
        proj_input = self.layernorm_1(inputs + attention_output)

        # Apply feed-forward network
        proj_output = self.dense_proj(proj_input)

        # Apply another residual connection and layer normalization
        return self.layernorm_2(proj_input + proj_output)

    def get_config(self):
        config = super().get_config()
        config.update({
            "embed_dim": self.embed_dim,
            "num_heads": self.num_heads,
            "dense_dim": self.dense_dim
        })
        return config
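
The `call` method above forwards only the padding mask to the attention layer. Next-token models commonly combine it with a causal mask, so that a position cannot attend to later tokens, which would otherwise expose the target. The sketch below shows how such a combined mask could be built. It is plain Python, is not taken from the original notebook, and the helper name `causal_padding_mask` is hypothetical.

```python
def causal_padding_mask(token_ids, pad_id=0):
    """mask[i][j] is True when query position i may attend to key position j."""
    n = len(token_ids)
    return [
        # Allowed only if j is not in the future and is not a padding token
        [j <= i and token_ids[j] != pad_id for j in range(n)]
        for i in range(n)
    ]

toy_mask = causal_padding_mask([5, 8, 3, 0])
for row in toy_mask:
    print(row)
```

The result is lower-triangular, with the column of the padding token switched off entirely.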

The code defines a text generation callback for a Keras model that generates text based on a given prompt at the end of each epoch, using specified temperatures to control the randomness of predictions.

In [9]:
import numpy as np
from tensorflow import keras

tokens_index = dict(enumerate(text_vectorization.get_vocabulary()))

def sample_next(predictions, temperature=1.0):
    predictions = np.asarray(predictions).astype("float64")
    predictions = np.log(predictions) / temperature
    exp_preds = np.exp(predictions)
    predictions = exp_preds / np.sum(exp_preds)
    probas = np.random.multinomial(1, predictions, 1)
    return np.argmax(probas)

class TextGenerator(keras.callbacks.Callback):
    def __init__(self, prompt, generate_length, model_input_length, temperatures=(1.,), print_freq=1):
        super().__init__()
        self.prompt = prompt
        self.generate_length = generate_length
        self.model_input_length = model_input_length
        self.temperatures = temperatures
        self.print_freq = print_freq

    def on_epoch_end(self, epoch, logs=None):
        if (epoch + 1) % self.print_freq != 0:
            return

        for temperature in self.temperatures:
            print(f"== Generating with temperature {temperature}")
            sentence = self.prompt
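            for _ in range(self.generate_length):
                # Re-tokenize the sentence so far (zero-padded to the fixed length)
                tokenized_sentence = text_vectorization([sentence])
                predictions = self.model(tokenized_sentence)
                # Read the distribution at the last non-padding position rather than
                # at the loop index, so sampling is conditioned on the latest token
                last_position = int(np.count_nonzero(tokenized_sentence.numpy()[0])) - 1
                next_token = sample_next(predictions[0, last_position, :], temperature)
                sampled_token = tokens_index[next_token]
                sentence += " " + sampled_token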
            print(sentence)
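
To see how temperature changes the sampling distribution, the sketch below applies the same reweighting as `sample_next` (log, divide by temperature, renormalize) to a made-up three-token distribution. It uses only the standard library, and the helper name `reweight` is illustrative.

```python
import math

def reweight(probabilities, temperature):
    """Rescale a probability distribution by temperature and renormalize."""
    scaled = [math.log(p) / temperature for p in probabilities]
    exps = [math.exp(s) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

toy_probs = [0.6, 0.3, 0.1]
sharp = reweight(toy_probs, 0.2)  # low temperature: mass concentrates on the top token
flat = reweight(toy_probs, 1.5)   # high temperature: mass spreads toward the other tokens
print(sharp)
print(flat)
```

A temperature of 1.0 leaves the distribution unchanged. Lower values make sampling closer to greedy, which is consistent with the more repetitive output at temperature 0.2 in the training log.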

This cell redefines `PositionalEmbedding` with a simpler constructor and then builds the model: an input layer, the positional embedding, one transformer encoder layer, and a dense softmax output over the vocabulary that predicts the next token at each position.

In [10]:
from tensorflow import keras
from tensorflow.keras import layers

class PositionalEmbedding(layers.Layer):
    def __init__(self, sequence_length, vocab_size, embed_dim):
        super().__init__()
        self.token_embeddings = layers.Embedding(input_dim=vocab_size, output_dim=embed_dim)
        self.position_embeddings = layers.Embedding(input_dim=sequence_length, output_dim=embed_dim)
        self.sequence_length = sequence_length

    def call(self, inputs):
        length = tf.shape(inputs)[-1]
        positions = tf.range(start=0, limit=length, delta=1)
        embedded_tokens = self.token_embeddings(inputs)
        embedded_positions = self.position_embeddings(positions)
        return embedded_tokens + embedded_positions

    def compute_mask(self, inputs, mask=None):
        # Wrap the TensorFlow op in a Lambda layer so it can run on the symbolic input; token id 0 marks padding
        return keras.layers.Lambda(lambda x: tf.math.not_equal(x, 0))(inputs)

# Example usage in a model
sequence_length = 100
vocab_size = 15000
embed_dim = 256
dense_dim = 2048
num_heads = 2

inputs = keras.Input(shape=(None,), dtype="int64")
x = PositionalEmbedding(sequence_length, vocab_size, embed_dim)(inputs)
x = TransformerEncoder(embed_dim, dense_dim, num_heads)(x)
outputs = layers.Dense(vocab_size, activation="softmax")(x)
model = keras.Model(inputs, outputs)



The code compiles the Keras model using the RMSprop optimizer and sparse categorical cross-entropy loss function for training.

In [11]:
model.compile(
    loss="sparse_categorical_crossentropy",
    optimizer="rmsprop",
)
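
As a rough illustration of the loss, sparse categorical cross-entropy takes integer target ids and averages the negative log-probability the model assigned to each correct token. The plain-Python sketch below ignores masking and sample weighting, and the probabilities are made up.

```python
import math

def sparse_categorical_crossentropy(targets, predicted_probs):
    """Mean negative log-probability assigned to the correct token ids."""
    losses = [-math.log(probs[target]) for target, probs in zip(targets, predicted_probs)]
    return sum(losses) / len(losses)

toy_targets = [2, 0]                  # correct token id at each position
toy_predictions = [
    [0.1, 0.2, 0.7],                  # the model favors token 2 (correct)
    [0.5, 0.25, 0.25],                # the model is less sure about token 0
]
toy_loss = sparse_categorical_crossentropy(toy_targets, toy_predictions)
print(toy_loss)
```

For scale, a uniform guess over the 15,000-token vocabulary would score ln(15000) ≈ 9.62 under this formula.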

In [12]:
prompt = "This movie"
text_gen_callback = TextGenerator(
    prompt,
    generate_length=50,
    model_input_length=sequence_length,
    temperatures=(0.2, 0.5, 1.0, 1.5))

In [13]:
model.fit(lm_dataset, epochs=10, callbacks=[text_gen_callback])

Epoch 1/10
391/391 ━━━━━━━━━━━━━━━━━━━━ 0s 344ms/step - loss: 6.1708
== Generating with temperature 0.2
This movie movie this this movie movie this this movie movie this this movie movie this this movie movie this this movie movie this this movie movie this this movie movie this this movie movie this this movie movie this this movie movie this this movie movie this this movie movie this
== Generating with temperature 0.5
This movie movie this this movie movie this this movie movie this this movie i this dont movie this this movie this it movie this that movie this this movie movie this it movie this this movie movie this this movie movie this this movie movie it this it movie this this
== Generating with temperature 1.0
This movie it as make common this parts movie this  its neighbours this movie shark episode an entertainment it out of a bad hollywood andor this movie you would laugh                     
== Generating with temperature 1.5
This movie are hopelessl

<keras.src.callbacks.history.History at 0x78aa6e8175e0>

### Conclusion

Because GPU time was limited, I could train the model for only 10 epochs rather than the 200 I had planned. As a result, the generated text is repetitive and largely incoherent. I attribute this suboptimal performance primarily to the short training run.