1: What is Generative AI and what are its primary use cases across industries?

Answer:

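Generative AI is a branch of Artificial Intelligence that creates new content (text, images, audio, video, code, or synthetic data) by learning patterns from existing data.
Unlike predictive or discriminative AI, which mainly classifies or forecasts from its inputs, Generative AI produces new outputs that resemble human-created content.
Typical use cases include drafting marketing copy and reports, code completion in software development, image and video generation for media and design, and synthetic data generation for training or testing other models.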



---



2: Explain the role of probabilistic modeling in generative models. How do these models differ from discriminative models?

Answer:

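Role of Probabilistic Modeling in Generative Models

Probabilistic modeling is the foundation of generative models. A generative model learns the probability distribution p(x) of the data (or the joint distribution p(x, y) when labels are present), so it can draw new, realistic samples from that distribution.

Captures uncertainty
Real-world data is noisy. A probabilistic model outputs a distribution over possible values rather than a single fixed output.

Enables data generation
Once the distribution is learned, the model can sample new data points from it.

Handles missing data well
Because the model covers the joint distribution over all features, missing values can be inferred by conditioning on the observed ones.

Supports Bayesian reasoning
A prior over parameters or latent variables can be updated to a posterior as new data arrives.

Difference from Discriminative Models

A discriminative model learns only the conditional distribution p(y | x), or a decision boundary directly, so it can classify or predict but cannot generate new inputs. A generative model learns p(x) or p(x, y), which supports sampling and, through Bayes' rule, classification as well.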
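
The idea of learning a distribution and then sampling from it can be sketched with a toy example. The block below is a minimal illustration, not a production model: it assumes one-dimensional data from two classes, fits a Gaussian per class (the generative view), samples new points, and uses Bayes' rule to recover p(y | x), which a discriminative model would learn directly. All names and numbers are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D data from two classes (hypothetical values for illustration)
x0 = rng.normal(loc=-2.0, scale=1.0, size=500)  # class 0
x1 = rng.normal(loc=3.0, scale=1.0, size=500)   # class 1

# Generative view: estimate p(x | y) as one Gaussian per class, plus a prior p(y)
params = {0: (x0.mean(), x0.std()), 1: (x1.mean(), x1.std())}
prior = {0: 0.5, 1: 0.5}

def sample(n, label):
    """Draw n new points from the learned class-conditional distribution."""
    mu, sigma = params[label]
    return rng.normal(mu, sigma, size=n)

def gaussian_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def posterior_class1(x):
    """p(y=1 | x) obtained from the generative model via Bayes' rule."""
    p0 = gaussian_pdf(x, *params[0]) * prior[0]
    p1 = gaussian_pdf(x, *params[1]) * prior[1]
    return p1 / (p0 + p1)

new_points = sample(5, label=1)
print("Sampled points for class 1:", new_points)
print("p(y=1 | x=3.0):", posterior_class1(3.0))
```

A discriminative model such as logistic regression would fit `posterior_class1` directly and would have no equivalent of `sample`.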



---



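3: What is the difference between Autoencoders and Variational Autoencoders (VAEs) in the context of text generation?

Answer:

Difference Between Autoencoders (AEs) and Variational Autoencoders (VAEs) in Text Generation

Autoencoders and Variational Autoencoders are both encoder–decoder architectures, but they differ in how they represent the latent space and therefore in how well they generate text.

An autoencoder maps each sentence to a single, deterministic latent vector and is trained only to reconstruct its input. Nothing constrains the space between training points, so decoding a random latent vector often gives incoherent text.

A VAE maps each sentence to a distribution over latent vectors (a mean and a log-variance), samples from it, and adds a KL-divergence term that pulls these distributions toward a standard normal prior. The latent space becomes smooth and continuous, so new sentences can be generated by sampling from the prior or by interpolating between encoded sentences.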
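
The difference in latent representation can be made concrete with a small NumPy sketch. It assumes a stand-in vector `h` in place of a real encoder output, and the helper names `sample_z` and `kl_to_standard_normal` are hypothetical. The autoencoder uses the vector as-is, while the VAE treats it as a mean, samples around it with the reparameterization trick, and measures the KL divergence to a standard normal prior.

```python
import numpy as np

rng = np.random.default_rng(0)
latent_dim = 4

# Stand-in for an encoder's output on one sentence (hypothetical numbers)
h = rng.normal(size=latent_dim)

# Plain autoencoder: the latent code is one deterministic vector
z_ae = h

# VAE: the encoder outputs a mean and a log-variance, and z is sampled
z_mean = h
z_log_var = np.full(latent_dim, -1.0)

def sample_z(z_mean, z_log_var, rng):
    """Reparameterization trick: z = mean + std * eps, with eps ~ N(0, I)."""
    eps = rng.standard_normal(z_mean.shape)
    return z_mean + np.exp(0.5 * z_log_var) * eps

def kl_to_standard_normal(z_mean, z_log_var):
    """KL( N(mean, exp(log_var)) || N(0, I) ), summed over latent dimensions."""
    return -0.5 * np.sum(1 + z_log_var - z_mean**2 - np.exp(z_log_var))

z1 = sample_z(z_mean, z_log_var, rng)
z2 = sample_z(z_mean, z_log_var, rng)
kl = kl_to_standard_normal(z_mean, z_log_var)

print("AE latent (always the same):", z_ae)
print("VAE latent, sample 1       :", z1)
print("VAE latent, sample 2       :", z2)
print("KL term                    :", kl)
```

Encoding the same sentence twice gives the same `z_ae` but different VAE samples, and the KL term is what keeps those samples close to the prior used at generation time.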



---



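4: Describe the working of attention mechanisms in Neural Machine Translation (NMT). Why are they critical?

Answer:

Working of Attention Mechanisms in Neural Machine Translation (NMT) and Why They Are Critical

Attention mechanisms were introduced to overcome key limitations of early encoder–decoder NMT models and are now central to modern translation systems (including Transformers).

Early models compressed the whole source sentence into one fixed-length vector, which lost information on long sentences. With attention, the decoder at each output step (1) scores every encoder hidden state against its current state, (2) normalizes the scores with a softmax to get attention weights, and (3) takes the weighted sum of the encoder states as a context vector used to predict the next word.

This is critical because it removes the fixed-length bottleneck, lets the model align each target word with the relevant source words, and improves translation quality on long sentences.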
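
How attention turns encoder states into a context vector can be sketched in NumPy: score each encoder state against the decoder state, apply a softmax, and take the weighted sum. This is a minimal illustration that assumes dot-product scoring for brevity (Bahdanau-style attention computes the scores with a small feed-forward network instead). The function `attend` and the toy arrays are hypothetical.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())  # subtract the max for numerical stability
    return e / e.sum()

def attend(decoder_state, encoder_states):
    """Return the context vector and attention weights for one decoder step."""
    scores = encoder_states @ decoder_state   # one score per source position
    weights = softmax(scores)                 # weights sum to 1
    context = weights @ encoder_states        # weighted sum of encoder states
    return context, weights

# Five source positions, each with an 8-dimensional one-hot "hidden state"
encoder_states = np.eye(5, 8)
# A decoder state that points in the direction of source position 2
decoder_state = 4.0 * encoder_states[2]

context, weights = attend(decoder_state, encoder_states)
print("Attention weights:", np.round(weights, 3))
print("Most attended source position:", int(np.argmax(weights)))
```

The weights concentrate on the source position whose state best matches the decoder state, which is the soft alignment that a fixed-length encoding cannot provide.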



---



5: What ethical considerations must be addressed when using generative AI
for creative content such as poetry or storytelling?

Answer:

Ethical Considerations in Using Generative AI for Creative Content (Poetry & Storytelling)

When Generative AI is used for creative domains like poetry or storytelling, it raises ethical, legal, and social concerns beyond technical performance.

Authorship & Intellectual Property (IP)

Ownership of AI-generated poems or stories is unclear: it could lie with the user, the developer, or no one.

There is a risk of unintentional plagiarism if the training data includes copyrighted works.

Style Imitation

AI can imitate famous writers’ styles (e.g., Shakespeare, Tagore) and may reproduce the style of a specific author too closely.

This blurs the line between creative inspiration and copying.

Bias and Representation

Training data may contain:

Cultural bias

Gender stereotypes

Ethnic misrepresentation

As a result, generated narratives can reinforce harmful tropes.



---



6: Use the following small text dataset to train a simple Variational
Autoencoder (VAE) for text reconstruction:
["The sky is blue", "The sun is bright", "The grass is green",
"The night is dark", "The stars are shining"]
1. Preprocess the data (tokenize and pad the sequences).
2. Build a basic VAE model for text reconstruction.
3. Train the model and show how it reconstructs or generates similar sentences.
Include your code, explanation, and sample outputs.
(Include your Python code and output in the code box below.)


In [None]:
# =========================
# 1. Imports
# =========================
import numpy as np
import tensorflow as tf
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras import layers, Model, backend as K

# =========================
# 2. Dataset
# =========================
sentences = [
    "the sky is blue",
    "the sun is bright",
    "the grass is green",
    "the night is dark",
    "the stars are shining"
]

# =========================
# 3. Tokenization & Padding
# =========================
tokenizer = Tokenizer()
tokenizer.fit_on_texts(sentences)

sequences = tokenizer.texts_to_sequences(sentences)
max_len = max(len(seq) for seq in sequences)
vocab_size = len(tokenizer.word_index) + 1

X = pad_sequences(sequences, maxlen=max_len, padding='post')

# =========================
# 4. VAE Parameters
# =========================
embedding_dim = 16
latent_dim = 8

# =========================
# 5. Encoder
# =========================
inputs = layers.Input(shape=(max_len,))
x = layers.Embedding(vocab_size, embedding_dim)(inputs)
x = layers.LSTM(32)(x)

z_mean = layers.Dense(latent_dim)(x)
z_log_var = layers.Dense(latent_dim)(x)

def sampling(args):
    z_mean, z_log_var = args
    epsilon = K.random_normal(shape=(K.shape(z_mean)[0], latent_dim))
    return z_mean + K.exp(0.5 * z_log_var) * epsilon

z = layers.Lambda(sampling)([z_mean, z_log_var])

encoder = Model(inputs, [z_mean, z_log_var, z])

# =========================
# 6. Decoder
# =========================
latent_inputs = layers.Input(shape=(latent_dim,))
x = layers.RepeatVector(max_len)(latent_inputs)
x = layers.LSTM(32, return_sequences=True)(x)
outputs = layers.TimeDistributed(
    layers.Dense(vocab_size, activation='softmax')
)(x)

decoder = Model(latent_inputs, outputs)

# =========================
# 7. VAE Model
# =========================
vae_outputs = decoder(encoder(inputs)[2])
vae = Model(inputs, vae_outputs)

# =========================
# 8. Loss Function
# =========================
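# Use the symbolic `inputs` tensor as the target (not the NumPy array X) so the
# loss is tied to the current batch and works for any batch size.
reconstruction_loss = tf.keras.losses.sparse_categorical_crossentropy(
    inputs, vae_outputs
)
# Sum over timesteps, then average over the batch
reconstruction_loss = K.mean(K.sum(reconstruction_loss, axis=-1))

# KL divergence between q(z|x) = N(z_mean, exp(z_log_var)) and the prior N(0, I):
# sum over latent dimensions, then average over the batch
kl_loss = -0.5 * K.mean(
    K.sum(1 + z_log_var - K.square(z_mean) - K.exp(z_log_var), axis=-1)
)

# Note: attaching a symbolic loss with add_loss() relies on TF 2.x tf.keras
# (Keras 2) behaviour; newer Keras versions may need a custom layer or train_step.
vae.add_loss(reconstruction_loss + kl_loss)
vae.compile(optimizer='adam')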

# =========================
# 9. Training
# =========================
vae.fit(X, X, epochs=300, verbose=0)

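# =========================
# 10. Reconstruction
# =========================
preds = vae.predict(X, verbose=0)

def decode_sentence(pred):
    """Convert per-timestep probabilities to words, skipping the padding index 0."""
    words = []
    for timestep in pred:
        idx = int(np.argmax(timestep))
        if idx == 0:
            continue
        words.append(tokenizer.index_word.get(idx, "<unk>"))
    return " ".join(words)

print("\nOriginal vs Reconstructed:\n")
for i in range(len(sentences)):
    print("Original     :", sentences[i])
    print("Reconstructed:", decode_sentence(preds[i]))
    print("-" * 40)

# =========================
# 11. Generation from the prior
# =========================
# Decode latent vectors sampled from N(0, I) to obtain new sentence-like outputs
z_samples = np.random.normal(size=(3, latent_dim))
for pred in decoder.predict(z_samples, verbose=0):
    print("Generated    :", decode_sentence(pred))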
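
To make the preprocessing step concrete, here is a dependency-free sketch of what tokenization and post-padding produce for this dataset. It is an approximation written for illustration, not the Keras implementation: it assumes indices are assigned by descending word frequency with 0 reserved for padding, and the helpers `encode` and `pad_post` are hypothetical.

```python
from collections import Counter

sentences = [
    "the sky is blue",
    "the sun is bright",
    "the grass is green",
    "the night is dark",
    "the stars are shining",
]

# Count words, then assign 1-based indices by descending frequency (0 = padding)
counts = Counter(word for s in sentences for word in s.lower().split())
word_index = {word: i + 1 for i, (word, _) in enumerate(counts.most_common())}

def encode(sentence):
    """Map each word of a sentence to its integer index."""
    return [word_index[w] for w in sentence.lower().split()]

def pad_post(seq, max_len, pad_id=0):
    """Append pad_id until the sequence reaches max_len (no truncation)."""
    return seq + [pad_id] * (max_len - len(seq))

sequences = [encode(s) for s in sentences]
max_len = max(len(s) for s in sequences)
padded = [pad_post(s, max_len) for s in sequences]

print("Vocabulary size (with padding):", len(word_index) + 1)
print("First sentence as indices:", padded[0])   # [1, 3, 2, 4]
print("Padding a shorter sequence:", pad_post([1, 3], max_len))
```

Every sentence in this dataset has four words, so padding leaves the sequences unchanged here. It matters only once sentences of different lengths are added.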




---



7: Use a pre-trained GPT model (like GPT-2 or GPT-3) to translate a short English paragraph into French and German. Provide the original and translated text. (Include your Python code and output in the code box below.)

In [None]:
# =========================
# 1. Install & Import
# =========================
# Install once if needed: pip install transformers torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer
import torch

# =========================
# 2. Load Pre-trained GPT-2
# =========================
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

# =========================
# 3. Translation Prompts
# =========================
english_text = (
    "Artificial intelligence is transforming the world by improving "
    "efficiency and enabling new innovations."
)

prompt_french = f"Translate English to French:\nEnglish: {english_text}\nFrench:"
prompt_german = f"Translate English to German:\nEnglish: {english_text}\nGerman:"

# =========================
# 4. Generate Translation
# =========================
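# Note: GPT-2 was trained mainly on English web text, so its French and German
# output is often inaccurate; treat the result as a demonstration of prompting.
def generate_translation(prompt, max_new_tokens=60):
    # Tokenize with an attention mask so generate() does not have to infer it
    enc = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        outputs = model.generate(
            **enc,
            max_new_tokens=max_new_tokens,        # budget for the continuation only
            do_sample=True,
            temperature=0.7,
            top_k=50,
            pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token
        )
    # Keep only the newly generated tokens, then cut at the first line break
    new_tokens = outputs[0][enc["input_ids"].shape[1]:]
    text = tokenizer.decode(new_tokens, skip_special_tokens=True)
    return text.strip().split("\n")[0]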

french_translation = generate_translation(prompt_french)
german_translation = generate_translation(prompt_german)

print("Original English:\n", english_text)
print("\nFrench Translation:\n", french_translation)
print("\nGerman Translation:\n", german_translation)




---



8: Implement a simple attention-based encoder-decoder model for
English-to-Spanish translation using Tensorflow or PyTorch.
(Include your Python code and output in the code box below.)


In [None]:
import tensorflow as tf
import numpy as np
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.layers import Embedding, LSTM, Dense
from tensorflow.keras.models import Model

# Dataset
eng_sentences = [
    "hello",
    "how are you",
    "i am fine",
    "thank you",
    "good night"
]

spa_sentences = [
    "<start> hola <end>",
    "<start> como estas <end>",
    "<start> estoy bien <end>",
    "<start> gracias <end>",
    "<start> buenas noches <end>"
]

# Tokenizers
eng_tokenizer = Tokenizer()
# filters="" keeps "<" and ">" so the "<start>" and "<end>" markers survive
spa_tokenizer = Tokenizer(filters="")

eng_tokenizer.fit_on_texts(eng_sentences)
spa_tokenizer.fit_on_texts(spa_sentences)

eng_seq = eng_tokenizer.texts_to_sequences(eng_sentences)
spa_seq = spa_tokenizer.texts_to_sequences(spa_sentences)

max_eng_len = max(len(s) for s in eng_seq)
max_spa_len = max(len(s) for s in spa_seq)

eng_seq = pad_sequences(eng_seq, maxlen=max_eng_len, padding="post")
spa_seq = pad_sequences(spa_seq, maxlen=max_spa_len, padding="post")

eng_vocab = len(eng_tokenizer.word_index) + 1
spa_vocab = len(spa_tokenizer.word_index) + 1

embedding_dim = 64
latent_dim = 128

# Encoder (keep a reference to the embedding layer so inference can reuse it)
encoder_inputs = tf.keras.Input(shape=(max_eng_len,))
enc_emb_layer = Embedding(eng_vocab, embedding_dim)
enc_emb = enc_emb_layer(encoder_inputs)

encoder_lstm = LSTM(latent_dim, return_sequences=True, return_state=True)
encoder_outputs, state_h, state_c = encoder_lstm(enc_emb)

# Additive (Bahdanau-style) attention
class BahdanauAttention(tf.keras.layers.Layer):
    def __init__(self, units):
        super().__init__()
        self.W1 = Dense(units)
        self.W2 = Dense(units)
        self.V = Dense(1)

    def call(self, query, values):
        # query: (batch, units) decoder state; values: (batch, src_len, units)
        query = tf.expand_dims(query, 1)
        score = self.V(tf.nn.tanh(self.W1(values) + self.W2(query)))
        attention_weights = tf.nn.softmax(score, axis=1)
        context_vector = attention_weights * values
        context_vector = tf.reduce_sum(context_vector, axis=1)
        return context_vector

# Decoder, trained with teacher forcing one timestep at a time.
# Note: applying raw tf ops to symbolic tensors assumes TF 2.x tf.keras (Keras 2).
decoder_inputs = tf.keras.Input(shape=(max_spa_len - 1,))
dec_emb_layer = Embedding(spa_vocab, embedding_dim)
dec_emb = dec_emb_layer(decoder_inputs)

decoder_lstm = LSTM(latent_dim, return_sequences=True, return_state=True)
attention = BahdanauAttention(latent_dim)

outputs = []
state_h_dec, state_c_dec = state_h, state_c

for t in range(max_spa_len - 1):
    context = attention(state_h_dec, encoder_outputs)
    x = tf.concat([context[:, None, :], dec_emb[:, t:t+1, :]], axis=-1)
    out, state_h_dec, state_c_dec = decoder_lstm(
        x, initial_state=[state_h_dec, state_c_dec]
    )
    outputs.append(out)

decoder_outputs = tf.concat(outputs, axis=1)
decoder_dense = Dense(spa_vocab, activation="softmax")
decoder_outputs = decoder_dense(decoder_outputs)

model = Model([encoder_inputs, decoder_inputs], decoder_outputs)

model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"]
)

# Decoder input is the target without its last token; the label is the target
# shifted left by one position.
model.fit(
    [eng_seq, spa_seq[:, :-1]],
    spa_seq[:, 1:],
    epochs=300,
    verbose=0
)

def translate(sentence):
    seq = eng_tokenizer.texts_to_sequences([sentence])
    seq = pad_sequences(seq, maxlen=max_eng_len, padding="post")

    # Reuse the trained layers; new Embedding layers would have random weights
    enc_out, h, c = encoder_lstm(enc_emb_layer(tf.constant(seq)))

    target = spa_tokenizer.word_index["<start>"]
    end_id = spa_tokenizer.word_index["<end>"]
    result = []

    for _ in range(max_spa_len):
        context = attention(h, enc_out)
        step_emb = dec_emb_layer(tf.constant([[target]]))
        x = tf.concat([context[:, None, :], step_emb], axis=-1)

        out, h, c = decoder_lstm(x, initial_state=[h, c])
        pred = int(tf.argmax(decoder_dense(out)[0, 0]).numpy())

        if pred == end_id or pred == 0:  # stop at <end> or at padding
            break
        result.append(spa_tokenizer.index_word.get(pred, ""))
        target = pred

    return " ".join(result)

# Show the translations for the training sentences
for s in eng_sentences:
    print(s, "->", translate(s))




---



9: Use the following short poetry dataset to simulate poem generation with a
pre-trained GPT model:
["Roses are red, violets are blue,",
"Sugar is sweet, and so are you.",
"The moon glows bright in silent skies,",
"A bird sings where the soft wind sighs."]
Using this dataset as a reference for poetic structure and language, generate a new 2-4
line poem using a pre-trained GPT model (such as GPT-2). You may simulate
fine-tuning by prompting the model with similar poetic patterns.
Include your code, the prompt used, and the generated poem in your answer.
(Include your Python code and output in the code box below.)

In [None]:
# =========================
# 1. Imports
# =========================
from transformers import GPT2LMHeadModel, GPT2Tokenizer
import torch

# =========================
# 2. Load Pre-trained GPT-2
# =========================
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

# =========================
# 3. Prompt (Simulated Fine-Tuning)
# =========================
prompt = (
    "Roses are red, violets are blue,\n"
    "Sugar is sweet, and so are you.\n"
    "The moon glows bright in silent skies,\n"
    "A bird sings where the soft wind sighs.\n\n"
    "Write a new short poem with 2 to 4 lines in a similar poetic style:\n"
)

# =========================
# 4. Generate Poem
# =========================
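# Tokenize with an attention mask so generate() does not have to infer it
enc = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    outputs = model.generate(
        **enc,
        max_new_tokens=40,                    # budget for the poem only
        temperature=0.8,
        top_k=50,
        top_p=0.95,
        do_sample=True,
        num_return_sequences=1,
        pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token
    )

# Decode only the continuation, then keep at most four non-empty lines
new_tokens = outputs[0][enc["input_ids"].shape[1]:]
continuation = tokenizer.decode(new_tokens, skip_special_tokens=True)
poem_lines = [line.strip() for line in continuation.split("\n") if line.strip()]
generated_poem = "\n".join(poem_lines[:4])

print("PROMPT USED:\n")
print(prompt)
print("\nGENERATED POEM:\n")
print(generated_poem)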
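
The `temperature` and `top_k` arguments passed to `generate()` control how the next token is sampled. The NumPy sketch below illustrates the idea on a toy vocabulary of five tokens. It is a simplified illustration under stated assumptions, not the transformers implementation: nucleus (`top_p`) filtering is omitted for brevity, and the function name `sample_next_token` and the logits are hypothetical.

```python
import numpy as np

def sample_next_token(logits, temperature=0.8, top_k=3, rng=None):
    """Sample one token id after temperature scaling and top-k filtering."""
    if rng is None:
        rng = np.random.default_rng()
    # Lower temperature sharpens the distribution, higher temperature flattens it
    scaled = np.asarray(logits, dtype=float) / temperature
    # Keep only the top_k highest-scoring tokens; mask the rest out
    top_idx = np.argsort(scaled)[-top_k:]
    masked = np.full_like(scaled, -np.inf)
    masked[top_idx] = scaled[top_idx]
    # Softmax over the surviving tokens
    probs = np.exp(masked - masked.max())
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs)), probs

rng = np.random.default_rng(0)
toy_logits = [2.0, 1.0, 0.5, -1.0, -3.0]
token, probs = sample_next_token(toy_logits, temperature=0.8, top_k=3, rng=rng)

print("Sampling probabilities:", np.round(probs, 3))
print("Sampled token id:", token)
```

Tokens outside the top three get probability zero, so the sampler stays varied among plausible continuations without drifting into very unlikely ones.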




---



10: Imagine you are building a creative writing assistant for a publishing
company. The assistant should generate story plots and character descriptions using
Generative AI. Describe how you would design the system, including model selection,
training data, bias mitigation, and evaluation methods. Explain the real-world challenges
you might face.
(Include your Python code and output in the code box below.)

In [None]:
# =========================
# Creative Writing Assistant Demo
# =========================
from transformers import GPT2LMHeadModel, GPT2Tokenizer
import torch

# Load model
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def generate(prompt, max_new_tokens=100, temperature=0.8, top_p=0.9):
    """Sample a continuation of `prompt` and return only the new text."""
    enc = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        outputs = model.generate(
            **enc,
            max_new_tokens=max_new_tokens,        # budget for the continuation only
            temperature=temperature,
            top_p=top_p,
            do_sample=True,
            pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token
        )
    new_tokens = outputs[0][enc["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()

# Prompt for story plot
plot_prompt = (
    "Generate a short story plot for a fantasy novel:\n"
    "Genre: Fantasy\n"
    "Tone: Mysterious\n"
    "Focus: Lost kingdom and hidden power\n\n"
    "Story Plot:\n"
)

generated_plot = generate(plot_prompt, max_new_tokens=100, temperature=0.8)

print("=== GENERATED STORY PLOT ===\n")
print(generated_plot)

# Prompt for character description
character_prompt = (
    "Create a character description for a novel:\n"
    "Role: Protagonist\n"
    "Personality: Curious, brave, flawed\n"
    "Setting: Ancient fantasy world\n\n"
    "Character Description:\n"
)

generated_character = generate(character_prompt, max_new_tokens=90, temperature=0.85)

print("\n=== GENERATED CHARACTER DESCRIPTION ===\n")
print(generated_character)
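
The question also asks about evaluation methods. One automatic signal that could complement human review by editors is distinct-n, the ratio of unique n-grams to total n-grams across a set of generated samples, where a low value suggests repetitive output. The sketch below is an assumption about how such a check might look. The function name `distinct_n` and the sample strings are hypothetical and are not outputs of the model above.

```python
def distinct_n(texts, n=2):
    """Ratio of unique n-grams to total n-grams over a list of texts."""
    total = 0
    unique = set()
    for text in texts:
        tokens = text.lower().split()
        ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        total += len(ngrams)
        unique.update(ngrams)
    return len(unique) / total if total else 0.0

# Hypothetical generated plots: two identical samples and one different sample
samples = [
    "the lost king returns",
    "the lost king returns",
    "a hidden power wakes",
]

print("distinct-1:", round(distinct_n(samples, n=1), 3))
print("distinct-2:", round(distinct_n(samples, n=2), 3))
```

A diversity score like this says nothing about coherence, bias, or originality, so it would be one input alongside human ratings rather than a replacement for them.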
