Question 1: What is Generative AI and what are its primary use cases across
industries?


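=> Generative AI refers to models that learn patterns from training data and use them to produce new content such as text, images, audio, and code. Typical use cases include drafting and summarizing documents, code completion, image and design generation, conversational assistants, and synthetic data generation. These systems can improve efficiency, creativity, and personalization across sectors, while raising important ethical and governance considerations.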

Question 2: Explain the role of probabilistic modeling in generative models. How do
these models differ from discriminative models?


=> Probabilistic modeling enables generative models to learn the underlying data distribution and generate new samples by modeling P(X, Y) or P(X).

In contrast, discriminative models focus only on predicting outputs by modeling P(Y | X) and do not generate new data.
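
As a minimal sketch of this distinction, the toy example below estimates a joint distribution P(X, Y) from counts, derives P(Y | X) from it, and samples a new (x, y) pair. The weather/activity data and the helper names (`joint`, `conditional`, `sample_pair`) are made up for illustration and are not part of the original answer.

```python
import random
from collections import Counter

# Hypothetical toy data: (x = weather, y = activity) pairs.
data = [("sunny", "walk"), ("sunny", "walk"), ("sunny", "read"),
        ("rainy", "read"), ("rainy", "read"), ("rainy", "walk")]

# Generative view: estimate the joint distribution P(X, Y) from counts.
counts = Counter(data)
total = sum(counts.values())
joint = {pair: c / total for pair, c in counts.items()}

def conditional(x):
    """P(Y | X = x), the quantity a discriminative model targets, derived from the joint."""
    p_x = sum(p for (xi, _), p in joint.items() if xi == x)
    return {y: p / p_x for (xi, y), p in joint.items() if xi == x}

def sample_pair(rng):
    """Draw a new (x, y) pair from the joint; this needs a model of P(X, Y)."""
    pairs = list(joint)
    weights = [joint[p] for p in pairs]
    return rng.choices(pairs, weights=weights, k=1)[0]

rng = random.Random(0)
print(conditional("sunny"))
print(sample_pair(rng))
```

A model that only stores P(Y | X) could answer the first query but would have no way to produce the second, since it never models how X itself is distributed.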

Question 3: What is the difference between Autoencoders and Variational
Autoencoders (VAEs) in the context of text generation?



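=> Autoencoders compress and reconstruct text using deterministic latent vectors. Because nothing constrains how those vectors are arranged in the latent space, sampling an arbitrary point rarely decodes to coherent text, which limits their use for generation.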

Variational Autoencoders introduce probabilistic modeling in the latent space, enabling meaningful text generation through sampling, making them more suitable for generative NLP applications.
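
The difference in the latent step can be sketched in a few lines. This is an illustrative example only: the encoder outputs `mu` and `log_var` are made-up numbers, and the function names are hypothetical. It shows a deterministic code, a sampled code using the reparameterization z = mu + sigma * eps, and the closed-form KL term that pulls the VAE's latent distribution toward a standard normal.

```python
import math
import random

def encode_deterministic(mu):
    """Plain autoencoder: the latent code is a fixed vector for a given input."""
    return list(mu)

def encode_variational(mu, log_var, rng):
    """VAE: sample z = mu + sigma * eps with eps ~ N(0, 1) (reparameterization)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, sigma^2) || N(0, 1) ), summed over latent dimensions."""
    return -0.5 * sum(1 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))

# Hypothetical encoder outputs for one sentence (2-dimensional latent space).
mu, log_var = [0.5, -1.0], [-0.2, 0.1]
rng = random.Random(0)

print(encode_deterministic(mu))              # same code on every call
print(encode_variational(mu, log_var, rng))  # a different code on each call
print(kl_to_standard_normal(mu, log_var))    # regularization term of the VAE loss
```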

Question 4: Describe the working of attention mechanisms in Neural Machine
Translation (NMT). Why are they critical?

=> Attention mechanisms allow NMT models to dynamically focus on relevant parts of the input during translation. They eliminate the fixed-length bottleneck, improve long-sentence translation, enable word alignment, and form the foundation of modern Transformer-based systems.
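
A single attention step can be sketched as follows. This is a simplified illustration, not the model used later in this notebook: it scores encoder states with a plain dot product rather than the additive (Bahdanau) scoring, and the vectors and function names are made up for the example.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(decoder_state, encoder_states):
    """Return (context vector, attention weights) using dot-product scores."""
    scores = [sum(d * e for d, e in zip(decoder_state, h))
              for h in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [sum(w * h[i] for w, h in zip(weights, encoder_states))
               for i in range(dim)]
    return context, weights

# Hypothetical 2-dimensional encoder states for a 3-word source sentence.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [0.0, 2.0]

context, weights = attend(decoder_state, encoder_states)
print(weights)  # one weight per source position, summing to 1
print(context)  # weighted average of the encoder states
```

Because the context vector is recomputed at every decoding step, the decoder is not limited to a single fixed-length summary of the source sentence.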

Question 5: What ethical considerations must be addressed when using generative AI
for creative content such as poetry or storytelling?

=> While generative AI enables innovative creative expression, it raises serious ethical concerns around authorship, plagiarism, bias, transparency, economic impact, and misuse. Responsible development and deployment require clear policies, legal frameworks, and ethical guidelines to ensure AI supports, rather than undermines, human creativity.

Question 6: Use the following small text dataset to train a simple Variational
Autoencoder (VAE) for text reconstruction:
["The sky is blue", "The sun is bright", "The grass is green",
"The night is dark", "The stars are shining"]
1. Preprocess the data (tokenize and pad the sequences).
2. Build a basic VAE model for text reconstruction.
3. Train the model and show how it reconstructs or generates similar sentences.
Include your code, explanation, and sample outputs.
(Include your Python code and output in the code box below.)

In [12]:
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, Embedding, LSTM, Dense
from tensorflow.keras.models import Model
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# -----------------------------
# 1. Dataset
# -----------------------------
sentences = [
    "The sky is blue",
    "The sun is bright",
    "The grass is green",
    "The night is dark",
    "The stars are shining"
]

# -----------------------------
# 2. Preprocessing
# -----------------------------
tokenizer = Tokenizer()
tokenizer.fit_on_texts(sentences)
sequences = tokenizer.texts_to_sequences(sentences)

max_len = max(len(seq) for seq in sequences)
vocab_size = len(tokenizer.word_index) + 1

padded_sequences = pad_sequences(sequences, maxlen=max_len, padding='post')
padded_sequences = np.array(padded_sequences)

print("Vocabulary Size:", vocab_size)
print("Max Sequence Length:", max_len)
print("Padded Sequences:\n", padded_sequences)

# -----------------------------
# 3. Encoder
# -----------------------------
embedding_dim = 16
latent_dim = 8

encoder_inputs = Input(shape=(max_len,))
x = Embedding(vocab_size, embedding_dim)(encoder_inputs)
x = LSTM(32)(x)

z_mean = Dense(latent_dim)(x)
z_log_var = Dense(latent_dim)(x)

# Reparameterization Trick
def sampling(args):
    z_mean, z_log_var = args
    epsilon = tf.random.normal(shape=tf.shape(z_mean))
    return z_mean + tf.exp(0.5 * z_log_var) * epsilon

z = tf.keras.layers.Lambda(sampling)([z_mean, z_log_var])

encoder = Model(encoder_inputs, [z_mean, z_log_var, z])

# -----------------------------
# 4. Decoder
# -----------------------------
latent_inputs = Input(shape=(latent_dim,))
x = tf.keras.layers.RepeatVector(max_len)(latent_inputs)
x = LSTM(32, return_sequences=True)(x)
decoder_outputs = Dense(vocab_size, activation="softmax")(x)

decoder = Model(latent_inputs, decoder_outputs)

# -----------------------------
# 5. Custom VAE Model
# -----------------------------
class VAE(Model):
    def __init__(self, encoder, decoder):
        super(VAE, self).__init__()
        self.encoder = encoder
        self.decoder = decoder

    def train_step(self, data):
        with tf.GradientTape() as tape:
            z_mean, z_log_var, z = self.encoder(data)
            reconstruction = self.decoder(z)

            reconstruction_loss = tf.reduce_mean(
                tf.keras.losses.sparse_categorical_crossentropy(
                    data, reconstruction
                )
            )

            kl_loss = -0.5 * tf.reduce_mean(
                1 + z_log_var - tf.square(z_mean) - tf.exp(z_log_var)
            )

            total_loss = reconstruction_loss + kl_loss

        grads = tape.gradient(total_loss, self.trainable_weights)
        self.optimizer.apply_gradients(zip(grads, self.trainable_weights))

        return {"loss": total_loss}

vae = VAE(encoder, decoder)
vae.compile(optimizer="adam")

# -----------------------------
# 6. Train
# -----------------------------
vae.fit(padded_sequences, epochs=200, verbose=0)

# -----------------------------
# 7. Reconstruction
# -----------------------------
z_mean, z_log_var, z = encoder.predict(padded_sequences)
predictions = decoder.predict(z)

predicted_words = np.argmax(predictions, axis=-1)

index_word = {v: k for k, v in tokenizer.word_index.items()}

print("\nReconstructed Sentences:")
for seq in predicted_words:
    sentence = []
    for idx in seq:
        if idx != 0 and idx in index_word:
            sentence.append(index_word[idx])
    print(" ".join(sentence))


Vocabulary Size: 14
Max Sequence Length: 4
Padded Sequences:
 [[ 1  3  2  4]
 [ 1  5  2  6]
 [ 1  7  2  8]
 [ 1  9  2 10]
 [ 1 11 12 13]]
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 174ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 169ms/step

Reconstructed Sentences:
is is is is
is is is is
is is is is
is is is is
is is is is
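
Explanation: all five reconstructions are the same token repeated, so the model has not yet learned to reconstruct its inputs. Two likely contributors can be read off the code. First, with five sentences and Keras's default batch size of 32, each epoch is a single gradient step, so 200 epochs amounts to only 200 updates at Adam's default learning rate. Second, the KL term can push the encoder toward the prior before the decoder learns to use z, a failure mode often called posterior collapse. Training for more steps addresses the first; a common mitigation for the second is to ramp the KL weight up gradually (KL annealing). The sketch below shows only the weighting schedule; `kl_weight`, `total_loss`, and `warmup_steps` are hypothetical names that do not appear in the notebook, and whether this fixes the run above has not been verified.

```python
def kl_weight(step, warmup_steps=100):
    """Linearly ramp the KL weight from 0 to 1 over `warmup_steps` updates."""
    if warmup_steps <= 0:
        return 1.0
    return min(1.0, step / warmup_steps)

def total_loss(reconstruction_loss, kl_loss, step, warmup_steps=100):
    """Weighted VAE objective: reconstruction + weight(step) * KL."""
    return reconstruction_loss + kl_weight(step, warmup_steps) * kl_loss

# Hypothetical loss values, only to show how the weighting changes over time.
for step in (0, 50, 100, 200):
    print(step, kl_weight(step), total_loss(2.0, 0.5, step))
```

In the notebook's `train_step`, the same idea would replace `reconstruction_loss + kl_loss` with a weighted sum driven by a step counter.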


Question 7: Use a pre-trained GPT model (like GPT-2 or GPT-3) to translate a short
English paragraph into French and German. Provide the original and translated text.

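=> Large pre-trained GPT models such as GPT-3 can translate through prompt-based generation even though they were not trained as dedicated translation systems: given an instruction or a few example pairs, they continue the prompt in the target language. Quality depends on model size and on how much of the target language appeared in the pre-training data. GPT-2, which was trained mostly on English web text, translates far less reliably, so its output should be checked against a reference translation.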
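
One way the prompt for such a translation might be built is sketched below. This only constructs the few-shot prompt strings; the model call is left out, and no translated output is shown because none was recorded in this answer. The function name, the example pairs, and the placeholder paragraph are illustrative assumptions.

```python
def build_translation_prompt(text, target_language, examples):
    """Build a few-shot prompt: example pairs followed by the text to translate."""
    lines = []
    for english, translated in examples:
        lines.append(f"English: {english}")
        lines.append(f"{target_language}: {translated}")
        lines.append("")
    lines.append(f"English: {text}")
    lines.append(f"{target_language}:")
    return "\n".join(lines)

# Illustrative example pairs (not from the original answer).
french_examples = [("Good morning.", "Bonjour."), ("Thank you.", "Merci.")]
german_examples = [("Good morning.", "Guten Morgen."), ("Thank you.", "Danke.")]

paragraph = "The weather is nice today."  # placeholder input text
french_prompt = build_translation_prompt(paragraph, "French", french_examples)
german_prompt = build_translation_prompt(paragraph, "German", german_examples)
print(french_prompt)
print(german_prompt)
```

Each string would be passed to the model's text-generation call, and the continuation after the final language label read off as the translation.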

Question 8: Implement a simple attention-based encoder-decoder model for
English-to-Spanish translation using Tensorflow or PyTorch.



In [13]:
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Embedding, LSTM, Dense
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# -----------------------------
# 1. Dataset
# -----------------------------
english_sentences = [
    "hello",
    "how are you",
    "i am fine",
    "good morning",
    "thank you"
]

spanish_sentences = [
    "<start> hola <end>",
    "<start> como estas <end>",
    "<start> estoy bien <end>",
    "<start> buenos dias <end>",
    "<start> gracias <end>"
]

# -----------------------------
# 2. Tokenization
# -----------------------------
eng_tokenizer = Tokenizer(filters='')
spa_tokenizer = Tokenizer(filters='')

eng_tokenizer.fit_on_texts(english_sentences)
spa_tokenizer.fit_on_texts(spanish_sentences)

eng_seq = eng_tokenizer.texts_to_sequences(english_sentences)
spa_seq = spa_tokenizer.texts_to_sequences(spanish_sentences)

max_eng_len = max(len(s) for s in eng_seq)
max_spa_len = max(len(s) for s in spa_seq)

eng_seq = pad_sequences(eng_seq, maxlen=max_eng_len, padding='post')
spa_seq = pad_sequences(spa_seq, maxlen=max_spa_len, padding='post')

eng_vocab_size = len(eng_tokenizer.word_index) + 1
spa_vocab_size = len(spa_tokenizer.word_index) + 1

# -----------------------------
# 3. Model Parameters
# -----------------------------
embedding_dim = 64
units = 128
batch_size = len(english_sentences)

# -----------------------------
# 4. Encoder
# -----------------------------
class Encoder(tf.keras.Model):
    def __init__(self, vocab_size, embedding_dim, units):
        super().__init__()
        self.embedding = Embedding(vocab_size, embedding_dim)
        self.lstm = LSTM(units, return_sequences=True, return_state=True)

    def call(self, x):
        x = self.embedding(x)
        output, state_h, state_c = self.lstm(x)
        return output, state_h, state_c

# -----------------------------
# 5. Attention (Bahdanau)
# -----------------------------
class Attention(tf.keras.layers.Layer):
    def __init__(self, units):
        super().__init__()
        self.W1 = Dense(units)
        self.W2 = Dense(units)
        self.V = Dense(1)

    def call(self, encoder_outputs, hidden):
        hidden_with_time_axis = tf.expand_dims(hidden, 1)
        score = self.V(tf.nn.tanh(
            self.W1(encoder_outputs) + self.W2(hidden_with_time_axis)
        ))
        attention_weights = tf.nn.softmax(score, axis=1)
        context_vector = attention_weights * encoder_outputs
        context_vector = tf.reduce_sum(context_vector, axis=1)
        return context_vector, attention_weights

# -----------------------------
# 6. Decoder
# -----------------------------
class Decoder(tf.keras.Model):
    def __init__(self, vocab_size, embedding_dim, units):
        super().__init__()
        self.embedding = Embedding(vocab_size, embedding_dim)
        self.lstm = LSTM(units, return_state=True, return_sequences=True)
        self.fc = Dense(vocab_size)
        self.attention = Attention(units)

    def call(self, x, encoder_outputs, state_h, state_c):
        context_vector, attention_weights = self.attention(
            encoder_outputs, state_h
        )
        x = self.embedding(x)
        x = tf.concat([tf.expand_dims(context_vector, 1), x], axis=-1)
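        # Carry the previous state into the LSTM; without initial_state the
        # decoder restarts from a zero state at every time step.
        output, state_h, state_c = self.lstm(
            x, initial_state=[state_h, state_c]
        )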
        output = self.fc(output)
        return output, state_h, state_c

# -----------------------------
# 7. Training Setup
# -----------------------------
encoder = Encoder(eng_vocab_size, embedding_dim, units)
decoder = Decoder(spa_vocab_size, embedding_dim, units)

optimizer = tf.keras.optimizers.Adam()
loss_object = tf.keras.losses.SparseCategoricalCrossentropy(
    from_logits=True, reduction='none'
)

def loss_function(real, pred):
    loss = loss_object(real, pred)
    return tf.reduce_mean(loss)

# -----------------------------
# 8. Training Loop (Simple)
# -----------------------------
EPOCHS = 500

for epoch in range(EPOCHS):
    with tf.GradientTape() as tape:
        enc_output, enc_h, enc_c = encoder(eng_seq)

        dec_input = tf.expand_dims(spa_seq[:, 0], 1)
        total_loss = 0

        for t in range(1, max_spa_len):
            predictions, enc_h, enc_c = decoder(
                dec_input, enc_output, enc_h, enc_c
            )
            total_loss += loss_function(spa_seq[:, t], predictions[:, 0, :])
            dec_input = tf.expand_dims(spa_seq[:, t], 1)

    variables = encoder.trainable_variables + decoder.trainable_variables
    gradients = tape.gradient(total_loss, variables)
    optimizer.apply_gradients(zip(gradients, variables))

    if epoch % 100 == 0:
        print(f"Epoch {epoch}, Loss {total_loss.numpy():.4f}")

# -----------------------------
# 9. Test Translation
# -----------------------------
def translate(sentence):
    sequence = eng_tokenizer.texts_to_sequences([sentence])
    sequence = pad_sequences(sequence, maxlen=max_eng_len, padding='post')
    enc_output, enc_h, enc_c = encoder(sequence)

    dec_input = tf.expand_dims(
        [spa_tokenizer.word_index['<start>']], 0
    )

    result = []

    for _ in range(max_spa_len):
        predictions, enc_h, enc_c = decoder(
            dec_input, enc_output, enc_h, enc_c
        )
        predicted_id = tf.argmax(predictions[0][0]).numpy()

        word = spa_tokenizer.index_word.get(predicted_id, '')
        if word == '<end>':
            break
        result.append(word)

        dec_input = tf.expand_dims([predicted_id], 0)

    return " ".join(result)

print("\nTranslations:")
for s in english_sentences:
    print(s, "→", translate(s))


Epoch 0, Loss 7.1916
Epoch 100, Loss 0.0587
Epoch 200, Loss 0.0118
Epoch 300, Loss 0.0057
Epoch 400, Loss 0.0034

Translations:
hello → hola
how are you → como estas
i am fine → estoy bien
good morning → buenos dias
thank you → gracias
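
Note that the sentences tested above are the training sentences, so this output shows memorization rather than generalization. Also, the training loop averages the loss over every target position, including the padded positions (token id 0) that follow `<end>` in the shorter sentences. A common refinement is to mask those positions out of the loss. The sketch below illustrates the idea in plain Python on made-up probabilities; `masked_cross_entropy` and `pad_id` are hypothetical names, not part of the notebook.

```python
import math

def masked_cross_entropy(targets, probs, pad_id=0):
    """Mean negative log-likelihood over non-padding positions only.

    targets: list of token ids; probs: one probability list per position.
    """
    losses = [-math.log(p[t]) for t, p in zip(targets, probs) if t != pad_id]
    return sum(losses) / len(losses) if losses else 0.0

# Hypothetical 3-token vocabulary (0 = padding) and a 3-step target sequence.
targets = [1, 2, 0]
probs = [[0.1, 0.8, 0.1],
         [0.2, 0.2, 0.6],
         [0.3, 0.3, 0.4]]  # the prediction at the padded step is ignored

print(masked_cross_entropy(targets, probs))
```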


Question 9: Use the following short poetry dataset to simulate poem generation with a
pre-trained GPT model:
["Roses are red, violets are blue,",
"Sugar is sweet, and so are you.",
"The moon glows bright in silent skies,",
"A bird sings where the soft wind sighs."]
Using this dataset as a reference for poetic structure and language, generate a new 2-4
line poem using a pre-trained GPT model (such as GPT-2). You may simulate
fine-tuning by prompting the model with similar poetic patterns.
Include your code, the prompt used, and the generated poem in your answer.

In [15]:
poetry_dataset = [
    "Roses are red, violets are blue,",
    "Sugar is sweet, and so are you.",
    "The moon glows bright in silent skies,",
    "A bird sings where the soft wind sighs."
]

from transformers import GPT2LMHeadModel, GPT2Tokenizer
import torch

# Load pre-trained GPT-2
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Prompt simulating fine-tuning
prompt = """Roses are red, violets are blue,
Sugar is sweet, and so are you.
The moon glows bright in silent skies,
A bird sings where the soft wind sighs.

Write a new short poem in a similar style:
"""

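# Encode input (the tokenizer call also returns the attention mask)
inputs = tokenizer(prompt, return_tensors="pt")

# Generate poem
output = model.generate(
    inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 defines no pad token
    max_length=120,
    num_return_sequences=1,
    temperature=0.9,
    top_k=50,
    top_p=0.95,
    do_sample=True
)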

# Decode output
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)

print(generated_text)


Roses are red, violets are blue,
Sugar is sweet, and so are you.
The moon glows bright in silent skies,
A bird sings where the soft wind sighs.

Write a new short poem in a similar style:

You may be my teacher,

But the moon glows bright in silent skies.

The wind sighs and tears and sings;

The moon glows white and beautiful,

But the wind sighs and tears and sings.
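
The decoded text above includes the prompt and runs to five lines, while the task asks for a 2-4 line poem. A small post-processing step can trim it. The sketch below is one possible approach; `extract_poem` is a hypothetical helper, and the strings are shortened stand-ins for the notebook's prompt and output.

```python
def extract_poem(generated_text, prompt, max_lines=4):
    """Drop the echoed prompt and keep the first `max_lines` non-empty lines."""
    if generated_text.startswith(prompt):
        generated_text = generated_text[len(prompt):]
    lines = [line.strip() for line in generated_text.splitlines()
             if line.strip()]
    return "\n".join(lines[:max_lines])

# Shortened stand-ins for the notebook's prompt and generated text.
prompt = "Write a new short poem in a similar style:\n"
generated = (prompt
             + "\nYou may be my teacher,\n"
             + "\nBut the moon glows bright in silent skies.\n"
             + "\nThe wind sighs and tears and sings;\n")

poem = extract_poem(generated, prompt, max_lines=2)
print(poem)
```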


Question 10: Imagine you are building a creative writing assistant for a publishing
company. The assistant should generate story plots and character descriptions using
Generative AI. Describe how you would design the system, including model selection,
training data, bias mitigation, and evaluation methods. Explain the real-world challenges
you might face.


In [21]:
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Define inputs
genre = "Fantasy"
audience = "Young Adult"
tone = "Adventurous"

prompt = f"""
Genre: {genre}
Audience: {audience}
Tone: {tone}

Generate:
1. Story plot (3 paragraphs)
2. Main characters
3. Central theme
"""

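# The tokenizer call returns both the input ids and the attention mask
inputs = tokenizer(prompt, return_tensors="pt")

output = model.generate(
    inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 defines no pad token
    max_length=300,
    temperature=0.8,
    top_p=0.95,
    do_sample=True
)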

generated_text = tokenizer.decode(output[0], skip_special_tokens=True)

print(generated_text)




Genre: Fantasy
Audience: Young Adult
Tone: Adventurous

Generate:
1. Story plot (3 paragraphs)
2. Main characters
3. Central theme
4. Characters are different (no voice acting)
5. Plot has to be completed
6. Plot is based on the main characters
7. Characters are in a different world (they don't have the same name)
8. Main story was planned from the beginning (they don't like the same name)
9. Story doesn't have any interesting characters (or any dialogue)
10. Main main character (with the same name) is a normal person (it's not bad)
11. Main main character isn't a bad person (it's good)
12. Main main character is strong (it's not bad)
13. Main main character is a strong person (it's good)
14. Main main character is stupid (it's not bad)
15. Main main character is good (it's not bad)
16. Main main character is stupid (it's not bad)
17. Main main character is great (it's not bad)
18. Main main character is likeable (it's not bad)
19. Main main character is stupid (it's not bad)
20. Main
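
The output above does not follow the instructions in the prompt. Base GPT-2 is not instruction-tuned, so it continues the numbered list, drifts into near-duplicate lines, and stops only when `max_length` cuts it off. For the evaluation part of the design, one cheap automatic signal for this kind of degeneration is a distinct-n score, the share of n-grams in a text that are unique. The sketch below is a minimal version under simple whitespace tokenization; the function name and the `varied` example sentence are illustrative, and such a score would complement human review by editors rather than replace it.

```python
def distinct_n(text, n=2):
    """Ratio of unique n-grams to total n-grams (1.0 means no repetition)."""
    tokens = text.split()
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return len(set(ngrams)) / len(ngrams) if ngrams else 0.0

# First string mimics the repeated lines in the output above.
repetitive = ("Main main character is stupid (it's not bad) "
              "Main main character is stupid (it's not bad)")
# Second string is a made-up example of non-repetitive text.
varied = "A young cartographer discovers that her maps redraw the land itself"

print(distinct_n(repetitive))
print(distinct_n(varied))
```

A low score could be used to reject or regenerate a draft before it reaches an editor.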