### Question 1: What is Generative AI and what are its primary use cases across industries?  
**Answer:**  
Generative AI refers to artificial intelligence systems capable of creating new content such as text, images, audio, or video by learning patterns from existing data.  
- **Primary Use Cases:**  
  - **Healthcare:** Drug discovery, medical imaging synthesis.  
  - **Entertainment:** Script writing, music composition, game design.  
  - **Marketing:** Personalized ad copy, product descriptions.  
  - **Education:** Automated tutoring, content generation.  
  - **Software Development:** Code generation, documentation.  
  - **Design:** Fashion, architecture, and art prototyping.  

---

### Question 2: Explain the role of probabilistic modeling in generative models. How do these models differ from discriminative models?  
**Answer:**  
- **Probabilistic Modeling in Generative Models:**  
  - Generative models estimate the **joint probability distribution** \(P(x, y)\), or the data distribution \(P(x)\) when no labels are involved (as in Variational Autoencoders).  
  - They capture how data is generated, allowing simulation of new samples.  
  - Examples: Naive Bayes, Hidden Markov Models, Variational Autoencoders.  

- **Difference from Discriminative Models:**  
  - **Generative Models:** Learn distribution of inputs and outputs; can generate new data.  
  - **Discriminative Models:** Focus on conditional probability \(P(y|x)\); used for classification.  
  - Generative = “how data is formed,” Discriminative = “decision boundary.”  
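
As a minimal sketch of the distinction, the snippet below estimates a joint distribution \(P(x, y)\) from counts on a made-up toy dataset, derives \(P(y|x)\) from it, and samples a new pair. The data and the helper names `conditional` and `sample_pair` are illustrative assumptions, not part of the original answer.

```python
from collections import Counter
import random

# Toy labelled data (x = weather, y = activity); values are made up for illustration.
data = [
    ("sunny", "walk"), ("sunny", "walk"), ("sunny", "read"),
    ("rainy", "read"), ("rainy", "read"), ("rainy", "walk"),
]

# Generative view: estimate the joint distribution P(x, y) from counts.
joint = {pair: count / len(data) for pair, count in Counter(data).items()}

def conditional(y, x):
    """Derive P(y | x) from the joint: P(x, y) / sum over y' of P(x, y')."""
    p_x = sum(p for (xi, _), p in joint.items() if xi == x)
    return joint.get((x, y), 0.0) / p_x

def sample_pair(rng):
    """Draw a new (x, y) pair from the estimated joint distribution."""
    pairs = list(joint)
    return rng.choices(pairs, weights=[joint[p] for p in pairs], k=1)[0]

p_walk_given_sunny = conditional("walk", "sunny")
new_pair = sample_pair(random.Random(0))
print(f"P(walk | sunny) = {p_walk_given_sunny:.3f}")
print("Sampled pair:", new_pair)
```

A discriminative model would learn only the quantity returned by `conditional` and could not produce `new_pair`, since it never models how \(x\) itself is distributed.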

---

### Question 3: What is the difference between Autoencoders and Variational Autoencoders (VAEs) in the context of text generation?  
**Answer:**  
- **Autoencoders (AE):**  
  - Compress input into latent space and reconstruct it.  
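  - Deterministic encoding: each input maps to a single point and the latent space is not regularized, so arbitrary latent points tend to decode poorly and generation has limited diversity.  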

- **Variational Autoencoders (VAE):**  
  - Introduce probabilistic latent variables.  
  - Learn distribution over latent space → enables sampling and diverse text generation.  
  - Better suited for creative tasks like paraphrasing or poetry generation.  
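
The contrast can be sketched in a few lines of plain Python. The toy "encoders" below use hand-picked weights rather than a trained network, and the function names are hypothetical. The sketch shows only the deterministic code of an autoencoder versus the sampled code of a VAE (via the reparameterization trick) and the KL term that regularizes the VAE latent space.

```python
import math
import random

def ae_encode(x):
    """Deterministic toy encoder: the same input always gives the same code."""
    return [0.5 * x[0] + 0.1 * x[1], -0.2 * x[0] + 0.3 * x[1]]

def vae_encode(x):
    """Toy probabilistic encoder: returns mean and log-variance of q(z | x)."""
    mu = ae_encode(x)
    log_var = [-1.0, -1.0]  # fixed here for illustration; learned in a real VAE
    return mu, log_var

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, sigma^2) || N(0, 1)), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))

x = [1.0, 2.0]
rng = random.Random(0)
code_a, code_b = ae_encode(x), ae_encode(x)   # identical codes
mu, log_var = vae_encode(x)
z1 = reparameterize(mu, log_var, rng)         # two different samples
z2 = reparameterize(mu, log_var, rng)
kl = kl_to_standard_normal(mu, log_var)
print(code_a == code_b, z1 != z2, round(kl, 4))
```

Decoding `z1` and `z2` with a trained decoder would give two different outputs for the same input, which is where the diversity in VAE text generation comes from.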

---

### Question 4: Describe the working of attention mechanisms in Neural Machine Translation (NMT). Why are they critical?  
**Answer:**  
- **Working of Attention:**  
  - At each decoding step, the model assigns weights to different input tokens.  
  - Creates a **context vector** highlighting relevant words for translation.  
  - Allows the decoder to “focus” on specific parts of the source sentence.  

- **Critical Importance:**  
  - Removes the fixed-length bottleneck of a single encoder vector, so long sentences keep their context.  
  - Improves translation accuracy and fluency.  
  - Enables alignment between source and target languages.  
  - Foundation for Transformer models (e.g., GPT, BERT), which are built on self-attention.  
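
A single decoding step can be sketched as follows. For brevity this uses dot-product scoring, which is a simplification chosen for illustration (additive, Bahdanau-style scoring is another common choice). The vectors are made up and `attend` is a hypothetical helper name.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, encoder_states):
    """Return (context_vector, attention_weights) for one decoder step."""
    # Score each source position against the decoder state (dot product).
    scores = [sum(q * h for q, h in zip(query, state)) for state in encoder_states]
    weights = softmax(scores)
    # Context vector: attention-weighted sum of the encoder states.
    dim = len(encoder_states[0])
    context = [sum(w * state[i] for w, state in zip(weights, encoder_states))
               for i in range(dim)]
    return context, weights

# Three source tokens with 2-dimensional encoder states (illustrative values).
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [2.0, 0.0]
context, weights = attend(decoder_state, encoder_states)
print("weights:", [round(w, 3) for w in weights])
print("context:", [round(c, 3) for c in context])
```

The weights sum to 1 and can be read as a soft alignment: source positions whose states match the decoder state receive most of the mass.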

---

### Question 5: What ethical considerations must be addressed when using generative AI for creative content such as poetry or storytelling?  
**Answer:**  
- **Plagiarism & Originality:** Ensure generated content is not copied.  
- **Bias & Representation:** Avoid reinforcing stereotypes in narratives.  
- **Ownership & Attribution:** Clarify authorship of AI-generated works.  
- **Misinformation Risks:** Prevent creation of deceptive or harmful stories.  
- **Emotional Impact:** Consider psychological effects of generated content.  
- **Transparency:** Clearly disclose when content is AI-generated.  

---

### Question 10: Imagine you are building a creative writing assistant for a publishing company. The assistant should generate story plots and character descriptions using Generative AI. Describe how you would design the system, including model selection, training data, bias mitigation, and evaluation methods. Explain the real-world challenges you might face.  
**Answer:**  
- **System Design:**  
  - **Model Selection:** Use GPT-based models (e.g., GPT-3.5) for coherent text generation.  
  - **Training Data:** Curated datasets of novels, scripts, and character profiles.  
  - **Bias Mitigation:**  
    - Diverse datasets across cultures and genres.  
    - Post-processing filters to remove harmful stereotypes.  
  - **Evaluation Methods:**  
    - Human-in-the-loop review for creativity and coherence.  
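    - Metrics: BLEU and ROUGE (n-gram overlap with reference texts, so of limited value for open-ended stories), complemented by qualitative ratings of storytelling quality.  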

- **Real-World Challenges:**  
  - **Bias & Fairness:** Risk of cultural or gender bias in generated plots.  
  - **Creativity vs. Repetition:** AI may produce formulaic or repetitive stories.  
  - **Copyright Issues:** Generated text may resemble existing works.  
  - **User Trust:** Ensuring transparency about AI involvement.  
  - **Scalability:** Balancing computational cost with creative depth.  
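
One way to put a number on the repetition challenge is a distinct-n score: the fraction of n-grams in a generated text that are unique. This metric is not named in the answer above; it is offered here as one possible automatic check, and the whitespace tokenization is a simplifying assumption.

```python
def distinct_n(text, n=2):
    """Fraction of unique n-grams in `text` (1.0 = no repeated n-grams)."""
    tokens = text.lower().split()
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0  # text shorter than n tokens
    return len(set(ngrams)) / len(ngrams)

repetitive_plot = "the hero wins the hero wins the hero wins"
varied_plot = "a knight meets a dragon at dawn"

score_repetitive = distinct_n(repetitive_plot)
score_varied = distinct_n(varied_plot)
print(score_repetitive, score_varied)
```

A low score flags formulaic output for human review. It does not measure coherence or creativity, so it complements rather than replaces the human-in-the-loop evaluation.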

---


In [5]:
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
import re

# Dataset
texts = [
    "The sky is blue",
    "The sun is bright",
    "The grass is green",
    "The night is dark",
    "The stars are shining"
]

# 1. Preprocess the data
def preprocess_text(texts):
    # Lowercase (the Keras Tokenizer also lowercases and strips punctuation by default)
    texts = [t.lower() for t in texts]

    # Tokenize
    tokenizer = Tokenizer(char_level=False, oov_token='<OOV>')
    tokenizer.fit_on_texts(texts)

    # Convert to sequences
    sequences = tokenizer.texts_to_sequences(texts)

    # Pad sequences
    max_len = max(len(seq) for seq in sequences)
    padded = pad_sequences(sequences, maxlen=max_len, padding='post')

    return tokenizer, padded, max_len

tokenizer, X_train, max_len = preprocess_text(texts)
vocab_size = len(tokenizer.word_index) + 1

print("Vocabulary:", tokenizer.word_index)
print("Vocab size:", vocab_size)
print("Max length:", max_len)
print("Padded sequences:\n", X_train)

# One-hot encode for reconstruction target
X_train_onehot = tf.keras.utils.to_categorical(X_train, num_classes=vocab_size)
print("\nOne-hot shape:", X_train_onehot.shape)

Vocabulary: {'<OOV>': 1, 'the': 2, 'is': 3, 'sky': 4, 'blue': 5, 'sun': 6, 'bright': 7, 'grass': 8, 'green': 9, 'night': 10, 'dark': 11, 'stars': 12, 'are': 13, 'shining': 14}
Vocab size: 15
Max length: 4
Padded sequences:
 [[ 2  4  3  5]
 [ 2  6  3  7]
 [ 2  8  3  9]
 [ 2 10  3 11]
 [ 2 12 13 14]]

One-hot shape: (5, 4, 15)


In [8]:
from transformers import GPT2LMHeadModel, GPT2Tokenizer, AutoTokenizer, AutoModelForSeq2SeqLM

# Load GPT-2 model and tokenizer
model_name = "gpt2"
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
model = GPT2LMHeadModel.from_pretrained(model_name)

# Set pad token
tokenizer.pad_token = tokenizer.eos_token

def translate_with_gpt2(text, target_language, max_new_tokens=50):
    """
    Use GPT-2 for translation via few-shot prompting.

    GPT-2 was trained mostly on English text, so the output is often not a
    usable translation; compare with the Helsinki-NLP results below.
    """
    # Create few-shot prompt
    if target_language == "French":
        prompt = f"""Translate English to French:
English: Hello, how are you?
French: Bonjour, comment allez-vous?
English: I love programming.
French: J'adore la programmation.
English: {text}
French:"""
    elif target_language == "German":
        prompt = f"""Translate English to German:
English: Hello, how are you?
German: Hallo, wie geht es dir?
English: I love programming.
German: Ich liebe Programmieren.
English: {text}
German:"""
    else:
        # Without this branch, `prompt` would be undefined for other languages
        raise ValueError(f"Unsupported target language: {target_language}")

    # Encode and generate
    inputs = tokenizer.encode(prompt, return_tensors="pt", truncation=True)

    outputs = model.generate(
        inputs,
        max_new_tokens=max_new_tokens,  # cap on generated tokens, excluding the prompt
        num_return_sequences=1,
        temperature=0.7,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
        eos_token_id=tokenizer.encode("\n")[0]  # stop at the first newline
    )

    # Decode and extract translation
    generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
    translation = generated_text.split(f"{target_language}:")[-1].strip().split("\n")[0]
    return translation

# Original text
original_text = "The weather is beautiful today and I want to go for a walk in the park."

print("=" * 60)
print("ORIGINAL TEXT (English):")
print(f'"{original_text}"')
print("=" * 60)

# Translate to French
print("\nFRENCH TRANSLATION (GPT-2):")
french_translation = translate_with_gpt2(original_text, "French")
print(f'"{french_translation}"')

# Translate to German
print("\nGERMAN TRANSLATION (GPT-2):")
german_translation = translate_with_gpt2(original_text, "German")
print(f'"{german_translation}"')

# Alternative: Using Helsinki-NLP Translation Models (Direct loading to avoid pipeline KeyError)
print("\n" + "=" * 60)
print("ALTERNATIVE: Using Helsinki-NLP Translation Models (Direct Load)")
print("=" * 60)

def translate_manual(text, tokenizer, model):
    inputs = tokenizer(text, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=100)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# English to French
model_name_fr = "Helsinki-NLP/opus-mt-en-fr"
tokenizer_fr = AutoTokenizer.from_pretrained(model_name_fr)
model_fr = AutoModelForSeq2SeqLM.from_pretrained(model_name_fr)
fr_result_manual = translate_manual(original_text, tokenizer_fr, model_fr)
print(f"\nFrench (Manual): {fr_result_manual}")

# English to German
model_name_de = "Helsinki-NLP/opus-mt-en-de"
tokenizer_de = AutoTokenizer.from_pretrained(model_name_de)
model_de = AutoModelForSeq2SeqLM.from_pretrained(model_name_de)
de_result_manual = translate_manual(original_text, tokenizer_de, model_de)
print(f"German (Manual): {de_result_manual}")

ORIGINAL TEXT (English):
"The weather is beautiful today and I want to go for a walk in the park."

FRENCH TRANSLATION (GPT-2):
"Nous avons sommes de l'intervention."

GERMAN TRANSLATION (GPT-2):
"I'm a bit late so I'll wait in the car for you."

ALTERNATIVE: Using Helsinki-NLP Translation Models (Direct Load)


French (Manual): Le temps est beau aujourd'hui et je veux faire une promenade dans le parc.


German (Manual): Das Wetter ist heute schön und ich möchte einen Spaziergang im Park machen.


In [9]:
import tensorflow as tf
import numpy as np
from tensorflow.keras import layers, Model
import re

# Sample parallel corpus (English -> Spanish)
data = [
    ("hello", "hola"),
    ("how are you", "cómo estás"),
    ("good morning", "buenos días"),
    ("thank you", "gracias"),
    ("good night", "buenas noches"),
    ("i love you", "te quiero"),
    ("what is your name", "cómo te llamas"),
    ("nice to meet you", "mucho gusto"),
    ("where are you from", "de dónde eres"),
    ("i am a student", "soy estudiante")
]

# Preprocessing functions
def preprocess(text):
    text = text.lower().strip()
    text = re.sub(r"[^a-záéíóúüñ\s]", "", text)
    return "<start> " + text + " <end>"

# Prepare data
english_texts = [preprocess(pair[0]) for pair in data]
spanish_texts = [preprocess(pair[1]) for pair in data]

# Tokenization
def tokenize(texts):
    tokenizer = tf.keras.preprocessing.text.Tokenizer(filters='')
    tokenizer.fit_on_texts(texts)
    sequences = tokenizer.texts_to_sequences(texts)
    max_len = max(len(seq) for seq in sequences)
    padded = tf.keras.preprocessing.sequence.pad_sequences(sequences, maxlen=max_len, padding='post')
    return tokenizer, padded, max_len

eng_tokenizer, eng_data, eng_max_len = tokenize(english_texts)
spa_tokenizer, spa_data, spa_max_len = tokenize(spanish_texts)

vocab_size_eng = len(eng_tokenizer.word_index) + 1
vocab_size_spa = len(spa_tokenizer.word_index) + 1

print(f"English vocab size: {vocab_size_eng}")
print(f"Spanish vocab size: {vocab_size_spa}")
print(f"English max length: {eng_max_len}")
print(f"Spanish max length: {spa_max_len}")

# Attention Layer
class BahdanauAttention(layers.Layer):
    def __init__(self, units):
        super(BahdanauAttention, self).__init__()
        self.W1 = layers.Dense(units)
        self.W2 = layers.Dense(units)
        self.V = layers.Dense(1)

    def call(self, query, values):
        # query: (batch_size, hidden_size)
        # values: (batch_size, max_length, hidden_size)
        query_with_time_axis = tf.expand_dims(query, 1)

        score = self.V(tf.nn.tanh(
            self.W1(query_with_time_axis) + self.W2(values)
        ))

        attention_weights = tf.nn.softmax(score, axis=1)
        context_vector = attention_weights * values
        context_vector = tf.reduce_sum(context_vector, axis=1)

        return context_vector, attention_weights

# Hyperparameters
embedding_dim = 64
units = 128
batch_size = 2

# Encoder
class Encoder(Model):
    def __init__(self, vocab_size, embedding_dim, enc_units, batch_sz):
        super(Encoder, self).__init__()
        self.batch_sz = batch_sz
        self.enc_units = enc_units
        self.embedding = layers.Embedding(vocab_size, embedding_dim)
        self.gru = layers.GRU(enc_units,
                             return_sequences=True,
                             return_state=True,
                             recurrent_initializer='glorot_uniform')

    def call(self, x, hidden):
        x = self.embedding(x)
        output, state = self.gru(x, initial_state=hidden)
        return output, state

    def initialize_hidden_state(self):
        return tf.zeros((self.batch_sz, self.enc_units))

# Decoder
class Decoder(Model):
    def __init__(self, vocab_size, embedding_dim, dec_units, batch_sz):
        super(Decoder, self).__init__()
        self.batch_sz = batch_sz
        self.dec_units = dec_units
        self.embedding = layers.Embedding(vocab_size, embedding_dim)
        self.gru = layers.GRU(dec_units,
                             return_sequences=True,
                             return_state=True,
                             recurrent_initializer='glorot_uniform')
        self.fc = layers.Dense(vocab_size)
        self.attention = BahdanauAttention(dec_units)

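    def call(self, x, hidden, enc_output):
        # `hidden` is used only as the attention query
        context_vector, attention_weights = self.attention(hidden, enc_output)
        x = self.embedding(x)
        # Concatenate the context vector with the embedded input token
        x = tf.concat([tf.expand_dims(context_vector, 1), x], axis=-1)
        # Note: no initial_state is passed, so the GRU starts from zeros at every step;
        # pass initial_state=hidden to carry the decoder state across steps
        output, state = self.gru(x)
        output = tf.reshape(output, (-1, output.shape[2]))
        x = self.fc(output)  # logits over the target vocabulary
        return x, state, attention_weights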

# Initialize models
encoder = Encoder(vocab_size_eng, embedding_dim, units, batch_size)
decoder = Decoder(vocab_size_spa, embedding_dim, units, batch_size)

# Optimizer and loss
optimizer = tf.keras.optimizers.Adam()
loss_object = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True, reduction='none')

def loss_function(real, pred):
    mask = tf.math.logical_not(tf.math.equal(real, 0))
    loss_ = loss_object(real, pred)
    mask = tf.cast(mask, dtype=loss_.dtype)
    loss_ *= mask
    return tf.reduce_mean(loss_)

# Training step
@tf.function
def train_step(inp, targ, enc_hidden):
    loss = 0
    with tf.GradientTape() as tape:
        enc_output, enc_hidden = encoder(inp, enc_hidden)
        dec_hidden = enc_hidden
        dec_input = tf.expand_dims([spa_tokenizer.word_index['<start>']] * batch_size, 1)

        for t in range(1, targ.shape[1]):
            predictions, dec_hidden, _ = decoder(dec_input, dec_hidden, enc_output)
            loss += loss_function(targ[:, t], predictions)
            dec_input = tf.expand_dims(targ[:, t], 1)

    batch_loss = (loss / int(targ.shape[1]))
    variables = encoder.trainable_variables + decoder.trainable_variables
    gradients = tape.gradient(loss, variables)
    optimizer.apply_gradients(zip(gradients, variables))
    return batch_loss

# Training loop
EPOCHS = 300
dataset = tf.data.Dataset.from_tensor_slices((eng_data, spa_data)).batch(batch_size)

print("\nTraining...")
for epoch in range(EPOCHS):
    enc_hidden = encoder.initialize_hidden_state()
    total_loss = 0

    for (batch, (inp, targ)) in enumerate(dataset):
        batch_loss = train_step(inp, targ, enc_hidden)
        total_loss += batch_loss

    if (epoch + 1) % 50 == 0:
        print(f'Epoch {epoch+1}, Loss: {total_loss.numpy():.4f}')

# Translation function
def translate(sentence):
    sentence = preprocess(sentence)
    sequence = eng_tokenizer.texts_to_sequences([sentence])
    inputs = tf.keras.preprocessing.sequence.pad_sequences(sequence, maxlen=eng_max_len, padding='post')
    inputs = tf.convert_to_tensor(inputs)

    result = ''
    hidden = [tf.zeros((1, units))]
    enc_out, enc_hidden = encoder(inputs, hidden)
    dec_hidden = enc_hidden
    dec_input = tf.expand_dims([spa_tokenizer.word_index['<start>']], 0)

    for t in range(spa_max_len):
        predictions, dec_hidden, attention_weights = decoder(dec_input, dec_hidden, enc_out)
        predicted_id = int(tf.argmax(predictions[0]).numpy())

        # Index 0 is the padding id and has no entry in index_word
        predicted_word = spa_tokenizer.index_word.get(predicted_id, '')
        if predicted_word == '<end>':
            return result.strip()

        if predicted_word:
            result += predicted_word + ' '
        dec_input = tf.expand_dims([predicted_id], 0)

    return result.strip()

# Test translations
print("\n" + "=" * 50)
print("TRANSLATION RESULTS")
print("=" * 50)
test_sentences = ["hello", "how are you", "good morning", "thank you", "i love you"]
for sent in test_sentences:
    translation = translate(sent)
    print(f"English: '{sent}' -> Spanish: '{translation}'")

English vocab size: 25
Spanish vocab size: 21
English max length: 6
Spanish max length: 5

Training...
Epoch 50, Loss: 0.1993
Epoch 100, Loss: 0.0280
Epoch 150, Loss: 0.0121
Epoch 200, Loss: 0.0069
Epoch 250, Loss: 0.0044
Epoch 300, Loss: 0.0031

TRANSLATION RESULTS
English: 'hello' -> Spanish: 'hola'
English: 'how are you' -> Spanish: 'cómo estás'
English: 'good morning' -> Spanish: 'buenos días'
English: 'thank you' -> Spanish: 'gracias'
English: 'i love you' -> Spanish: 'te quiero'


In [10]:
from transformers import GPT2LMHeadModel, GPT2Tokenizer
import torch

# Load pre-trained GPT-2
model_name = "gpt2"
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
model = GPT2LMHeadModel.from_pretrained(model_name)

# Set pad token
tokenizer.pad_token = tokenizer.eos_token

# Reference poetry examples (for illustration; the prompts below inline these lines directly)
poetry_examples = [
    "Roses are red, violets are blue,",
    "Sugar is sweet, and so are you.",
    "The moon glows bright in silent skies,",
    "A bird sings where the soft wind sighs."
]

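def generate_poem(prompt, max_length=100, temperature=0.8, num_return_sequences=1):
    """
    Generate text from a poetic prompt with GPT-2 using sampling.

    Note: max_length counts the prompt tokens too, so a long prompt leaves
    little room for the generated poem.
    """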
    inputs = tokenizer.encode(prompt, return_tensors="pt")

    outputs = model.generate(
        inputs,
        max_length=max_length,
        num_return_sequences=num_return_sequences,
        temperature=temperature,
        top_k=50,
        top_p=0.95,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
        no_repeat_ngram_size=2
    )

    generated_texts = []
    for output in outputs:
        text = tokenizer.decode(output, skip_special_tokens=True)
        generated_texts.append(text)

    return generated_texts

# Create poetic prompt using few-shot prompting (examples in the prompt, no training)
poetic_prompt = """Write a romantic poem following AABB rhyme scheme:

Roses are red, violets are blue,
Sugar is sweet, and so are you.
The moon glows bright in silent skies,
A bird sings where the soft wind sighs.

New poem:"""

print("=" * 60)
print("POEM GENERATION WITH GPT-2")
print("=" * 60)

print("\n--- Prompt Used ---")
print(poetic_prompt)
print("\n" + "-" * 60)

# Generate poems
print("\n--- Generated Poems ---\n")

for i in range(3):
    print(f"Generation {i+1}:")
    generated = generate_poem(poetic_prompt, max_length=80, temperature=0.9)
    # Extract only the new generated part (after "New poem:")
    poem = generated[0].split("New poem:")[-1].strip()
    print(poem)
    print("-" * 40)

# Alternative: More structured generation
print("\n" + "=" * 60)
print("STRUCTURED GENERATION (2-4 lines)")
print("=" * 60)

structured_prompt = """Write a short 2-line romantic poem:

Roses are red, violets are blue,
Sugar is sweet, and so are you.

New 2-line poem:"""

generated_structured = generate_poem(structured_prompt, max_length=60, temperature=0.7)
new_poem = generated_structured[0].split("New 2-line poem:")[-1].strip()
print(f"\nGenerated Poem:\n{new_poem}")

# Style-conditioned prompting (no fine-tuning is performed)
print("\n" + "=" * 60)
print("STYLE-CONDITIONED GENERATION")
print("=" * 60)

style_prompt = """Write a nature poem in the style of romantic poetry, using imagery of stars and night:

The moon glows bright in silent skies,
A bird sings where the soft wind sighs.

New nature poem about stars:"""

generated_style = generate_poem(style_prompt, max_length=70, temperature=0.85)
style_poem = generated_style[0].split("New nature poem about stars:")[-1].strip()
print(f"\nGenerated Nature Poem:\n{style_poem}")

POEM GENERATION WITH GPT-2

--- Prompt Used ---
Write a romantic poem following AABB rhyme scheme:

Roses are red, violets are blue,
Sugar is sweet, and so are you.
The moon glows bright in silent skies,
A bird sings where the soft wind sighs.

New poem:

------------------------------------------------------------

--- Generated Poems ---

Generation 1:
a letter A is given to you, to me: I am a prince, a man
----------------------------------------
Generation 2:
Cursing tears from the lips of a woman
 (sounds like one):
----------------------------------------
Generation 3:
"Famous Poems of A ABB"
 (English translation from Wikipedia):
----------------------------------------

STRUCTURED GENERATION (2-4 lines)

Generated Poem:
I'll be up, I'll come to your house, my love will be sweet

STYLE-CONDITIONED GENERATION

Generated Nature Poem:
"Mama, make a moon of love to make me a man" (Song of the Seven
