# Assignment


Q.1.  What is Generative AI and what are its primary use cases across
industries?

Answer ->>

Generative AI refers to AI systems that create new content, such as text, images, audio, video, or code, based on patterns learned from vast datasets. Unlike traditional AI, which analyzes or classifies data, generative models produce original outputs mimicking human creativity.

**Core Mechanism :**

Generative AI primarily relies on models like GANs (Generative Adversarial Networks), VAEs (Variational Autoencoders), and transformers (e.g., GPT architectures). These systems generate content by predicting probable sequences or structures from training data, enabling applications from writing assistants to image synthesis.

**Key Use Cases :**

1. **Healthcare :**
It accelerates drug discovery by proposing novel molecular structures and enhances medical imaging through synthetic data generation. Personalized treatment plans and predictive disease modeling are also common.

2. **Manufacturing :**
Engineers use it for generative design, optimizing parts for weight, strength, and material efficiency—e.g., General Motors designs lighter vehicle components. Predictive maintenance and quality control reduce downtime and waste.

3. **Marketing and Retail :**
Tools create personalized content, product recommendations, and marketing copy. Chatbots deliver tailored customer responses, while demand forecasting improves inventory management.

4. **Finance and Insurance :**
It aids fraud detection, generates financial reports, and simulates scenarios for risk assessment. Personalized strategies enhance customer engagement.

5. **Entertainment and Media :**
Generative AI produces art, music, scripts, and videos, speeding up creative workflows. It also supports personalized recommendations.

6. **Software Development :**
Code-generation assistants write, debug, and optimize software, reducing development time and lowering the barrier for less experienced programmers.

Q.2. Explain the role of probabilistic modeling in generative models. How do
these models differ from discriminative models?

Answer ->>

Probabilistic modeling forms the foundation of generative models by capturing the underlying probability distributions of data, enabling them to generate new samples that resemble the training data. These models learn the joint probability $P(X)$ or $P(X, Y)$ over input features $X$ (and labels $Y$, if supervised), often factorized as $P(X) = \prod_{i=1}^{n} P(x_i \mid x_1, \ldots, x_{i-1})$, allowing sequential sampling to produce novel outputs like text or images.
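As a rough illustration of the factorization above, the sketch below fits a first-order (bigram) approximation, P(x_i | x_{i-1}), to a toy corpus and samples one token at a time. The corpus, the `<s>`/`</s>` boundary markers, and the helper names `fit_bigram` and `sample` are assumptions made for this example; a full autoregressive model would condition on the entire prefix rather than only the previous token.

```python
import random
from collections import Counter, defaultdict

# Toy corpus (an assumption for illustration).
corpus = ["the sky is blue", "the sun is bright", "the grass is green"]

def fit_bigram(sentences):
    """Estimate P(next | previous) from bigram counts."""
    counts = defaultdict(Counter)
    for s in sentences:
        tokens = ["<s>"] + s.split() + ["</s>"]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return {
        prev: {w: c / sum(nxts.values()) for w, c in nxts.items()}
        for prev, nxts in counts.items()
    }

def sample(model, rng, max_len=10):
    """Draw tokens one at a time, each conditioned on the previous token."""
    token, out = "<s>", []
    for _ in range(max_len):
        words, probs = zip(*model[token].items())
        token = rng.choices(words, weights=probs)[0]
        if token == "</s>":
            break
        out.append(token)
    return " ".join(out)

model = fit_bigram(corpus)
print(model["the"])                       # conditional distribution after "the"
print(sample(model, random.Random(0)))    # one sampled sentence
```

Because sampling mixes the continuations seen in training, the model can emit sentences that never appeared verbatim in the corpus, which is the sense in which a generative model produces "novel" outputs.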

**Role in Generative Models :**

Generative models use probabilistic approaches to estimate data distributions explicitly. For instance, Variational Autoencoders (VAEs) impose probabilistic priors (e.g., Gaussian) on latent spaces and optimize losses like reconstruction error plus KL-divergence to match learned distributions to priors. This enables sampling from the latent space to create diverse, realistic data points, powering applications like image synthesis.
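To make the two loss terms concrete, here is a minimal sketch of the VAE objective for a diagonal Gaussian posterior and a standard normal prior, using the closed-form KL divergence. The function names and the `beta` weight are assumptions for illustration; they are not taken from any particular library.

```python
import math

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) in closed form."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var))

def cross_entropy(probs, target_index):
    """Reconstruction term for one token: negative log-likelihood of the target."""
    return -math.log(probs[target_index])

def vae_loss(token_probs, targets, mu, log_var, beta=1.0):
    """Reconstruction error plus a (beta-weighted) KL penalty."""
    recon = sum(cross_entropy(p, t) for p, t in zip(token_probs, targets))
    return recon + beta * kl_to_standard_normal(mu, log_var)

# The KL term vanishes when the posterior already equals the prior ...
print(kl_to_standard_normal([0.0, 0.0], [0.0, 0.0]))
# ... and grows as the posterior mean moves away from zero.
print(kl_to_standard_normal([1.0, -1.0], [0.0, 0.0]))
```

The KL term is what keeps the latent space close to the prior, so that vectors drawn from the prior at generation time land in regions the decoder has seen during training.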

**Differences from Discriminative Models :**

- Generative models learn the full data distribution to generate new instances, while discriminative models focus on boundaries between classes by estimating $P(Y \mid X)$.
- Generative approaches support tasks like data augmentation and anomaly detection; discriminative ones excel in classification but cannot generate data.
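A small worked example may help separate the two views. The joint table below uses made-up probabilities for a word X and a label Y; from it, one can compute both the conditional a discriminative model targets, P(Y | X), and the conditional a generative model can sample from, P(X | Y). All names and numbers are assumptions for illustration.

```python
# Toy joint distribution P(X, Y) over a word X and a label Y (made-up numbers).
joint = {
    ("bright", "day"): 0.35, ("dark", "day"): 0.05,
    ("bright", "night"): 0.10, ("dark", "night"): 0.50,
}

def p_y_given_x(y, x):
    """P(Y | X): the quantity a discriminative model estimates directly."""
    p_x = sum(p for (xi, _), p in joint.items() if xi == x)
    return joint[(x, y)] / p_x

def p_x_given_y(x, y):
    """P(X | Y): available only because the full joint is modeled."""
    p_y = sum(p for (_, yi), p in joint.items() if yi == y)
    return joint[(x, y)] / p_y

print(round(p_y_given_x("night", "dark"), 3))   # classify: label given word
print(round(p_x_given_y("bright", "day"), 3))   # generate: word given label
```

A discriminative model would learn only the first function and therefore has nothing to sample X from.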

Q.3. What is the difference between Autoencoders and Variational
Autoencoders (VAEs) in the context of text generation?

Answer ->>

Autoencoders (AEs) and Variational Autoencoders (VAEs) both compress and reconstruct data via encoder-decoder architectures, but VAEs introduce probabilistic elements that make them suitable for generative tasks like text generation.

**The difference between Autoencoders and Variational
Autoencoders :**

Autoencoders :
- The latent output is a fixed vector.
- The loss function is reconstruction loss only.
- Generative ability is poor (the latent space does not support smooth sampling).
- Example : Denoising noisy sentences.

Variational Autoencoders :
- The latent output is a set of distribution parameters (mean and variance).
- The loss function is reconstruction loss plus KL divergence.
- Generative ability is strong (probabilistic sampling from the latent space).
- Example : Generating variations of input text.
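The contrast in the latent output can be sketched in a few lines. The encoders below are stand-ins with fixed arithmetic instead of learned weights (an assumption purely for illustration); the point is only that the autoencoder returns one code per input, while the VAE returns parameters from which different codes are sampled.

```python
import math
import random

def ae_encode(x):
    """Plain autoencoder: the same input always maps to the same code."""
    return [0.5 * v for v in x]          # stand-in for a learned encoder

def vae_encode(x):
    """VAE encoder: returns distribution parameters (mean, log-variance)."""
    mu = [0.5 * v for v in x]            # stand-in for a learned mean head
    log_var = [-2.0 for _ in x]          # stand-in for a learned variance head
    return mu, log_var

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

x = [1.0, 2.0, 3.0]
rng = random.Random(0)
mu, log_var = vae_encode(x)
print(ae_encode(x))                       # identical on every call
print(reparameterize(mu, log_var, rng))   # varies from draw to draw
print(reparameterize(mu, log_var, rng))
```

Decoding several such draws for one input sentence is what yields the "variations of input text" listed above.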

Q.4.  Describe the working of attention mechanisms in Neural Machine
Translation (NMT). Why are they critical?

Answer ->>

Attention mechanisms in Neural Machine Translation (NMT) enable the decoder to dynamically focus on relevant parts of the source sequence during target generation, overcoming limitations of fixed context vectors.

**How Attention Works :**

In NMT's encoder-decoder framework (often RNN- or LSTM-based), the encoder processes the source sentence into hidden states $h_1, h_2, \ldots, h_n$, each representing contextual information at position $i$. For each decoder timestep $t$ (generating target word $y_t$), attention computes:

1. Alignment scores: A similarity function (e.g., dot product) between decoder state $s_t$ and each encoder state $h_i$: $e_{ti} = \mathrm{score}(s_t, h_i)$.

2. Attention weights: Softmax-normalized scores $\alpha_{ti} = \exp(e_{ti}) / \sum_j \exp(e_{tj})$, indicating focus on source positions.

3. Context vector: Weighted sum $c_t = \sum_i \alpha_{ti} h_i$, fed to the decoder alongside $s_t$ to predict $y_t$.

This repeats per target word, with visualizations showing peak weights aligning source-target pairs (e.g., high focus on "cat" when translating a pronoun referring to it).
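The three steps can be traced by hand on toy vectors. The sketch below uses dot-product scoring; the vectors and the helper name `attend` are assumptions for illustration, and a trained model would supply the encoder and decoder states.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    m = max(scores)                              # subtract max for stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(s_t, encoder_states):
    """One decoder step: scores -> weights -> context vector."""
    scores = [dot(s_t, h) for h in encoder_states]        # alignment scores
    weights = softmax(scores)                             # attention weights
    dim = len(encoder_states[0])
    context = [sum(w * h[d] for w, h in zip(weights, encoder_states))
               for d in range(dim)]                       # weighted sum
    return context, weights

encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]     # h_1, h_2, h_3
s_t = [2.0, 0.0]                                          # decoder state
context, weights = attend(s_t, encoder_states)
print([round(w, 3) for w in weights])
print([round(c, 3) for c in context])
```

Here the first and third encoder states score equally against the decoder state and share most of the weight, while the second receives the least.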

**Why Critical ?**

Without attention, decoders rely on a single fixed-length context vector from the final encoder state, which creates an information bottleneck for long sentences and can lower BLEU scores by around 5 points in some reported comparisons. Attention handles variable-length inputs, captures long-range dependencies, and improves fluency and accuracy, as shown in the foundational attention papers (Bahdanau et al., 2015; Luong et al., 2015), which reported major gains on WMT translation tasks.

Q.5. What ethical considerations must be addressed when using generative AI
for creative content such as poetry or storytelling?

Answer ->>

Generative AI for creative content like poetry or storytelling raises ethical challenges around authenticity, ownership, and societal impact. Key considerations include protecting intellectual property, ensuring transparency, and mitigating biases to avoid harm.

**Intellectual Property Rights :**
AI models trained on vast datasets often reproduce styles or phrases from existing works, risking unintentional plagiarism or copyright infringement. Creators must verify originality, attribute influences when possible, and clarify AI involvement to prevent misrepresenting machine output as human creation.

**Transparency and Disclosure :**
Failing to disclose AI use deceives audiences, eroding trust in creative fields where human authorship holds value. Ethical practice requires labeling AI-generated poetry or stories, allowing consumers to contextualize the work and distinguish it from human efforts.

**Bias and Fairness :**
Training data embeds cultural, gender, or racial biases, leading to stereotypical narratives (e.g., clichéd portrayals in storytelling). Developers and users should audit outputs for inclusivity, diversify datasets, and implement fairness checks to promote equitable representation.

**Misinformation and Authenticity :**
AI can generate convincingly false historical details or emotional manipulations in stories, blurring reality-fiction lines. Responsible use demands fact-checking, especially for poetry evoking real events, and prioritizing human oversight to maintain artistic integrity.

Q.6. Use the following small text dataset to train a simple Variational
Autoencoder (VAE) for text reconstruction:

["The sky is blue", "The sun is bright", "The grass is green",
"The night is dark", "The stars are shining"]

1. Preprocess the data (tokenize and pad the sequences).
2. Build a basic VAE model for text reconstruction.
3. Train the model and show how it reconstructs or generates similar sentences.

Include your code, explanation, and sample outputs.


In [1]:
# Answer ->>

import numpy as np
import tensorflow as tf
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.utils import to_categorical

# Dataset
sentences = [
    "The sky is blue",
    "The sun is bright",
    "The grass is green",
    "The night is dark",
    "The stars are shining"
]

# Tokenization
tokenizer = Tokenizer()
tokenizer.fit_on_texts(sentences)

sequences = tokenizer.texts_to_sequences(sentences)
word_index = tokenizer.word_index
vocab_size = len(word_index) + 1

# Padding
max_length = max(len(seq) for seq in sequences)
padded_sequences = pad_sequences(sequences, maxlen=max_length, padding='post')

# One-hot encoding for reconstruction
one_hot_targets = to_categorical(padded_sequences, num_classes=vocab_size)

print("Vocabulary:", word_index)
print("Padded sequences:\n", padded_sequences)

Vocabulary: {'the': 1, 'is': 2, 'sky': 3, 'blue': 4, 'sun': 5, 'bright': 6, 'grass': 7, 'green': 8, 'night': 9, 'dark': 10, 'stars': 11, 'are': 12, 'shining': 13}
Padded sequences:
 [[ 1  3  2  4]
 [ 1  5  2  6]
 [ 1  7  2  8]
 [ 1  9  2 10]
 [ 1 11 12 13]]
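The cell above covers preprocessing only. One way the remaining steps (build, train, reconstruct, sample) could look is sketched below. It is written in plain NumPy rather than Keras so that the reparameterization step and both loss terms are explicit, and it uses a single linear layer for the encoder and for the decoder. The padded sequences and vocabulary are copied from the output above; `latent_dim`, `beta`, `lr`, the step count, and the hand-written gradients are assumptions of this sketch, not the notebook's own model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Padded sequences and vocabulary copied from the preprocessing output above.
padded = np.array([[1, 3, 2, 4], [1, 5, 2, 6], [1, 7, 2, 8],
                   [1, 9, 2, 10], [1, 11, 12, 13]])
index_word = {1: "the", 2: "is", 3: "sky", 4: "blue", 5: "sun", 6: "bright",
              7: "grass", 8: "green", 9: "night", 10: "dark", 11: "stars",
              12: "are", 13: "shining"}
N, T = padded.shape
V, latent_dim = 14, 4                       # vocab size (incl. padding id 0)
x = np.eye(V)[padded].reshape(N, T * V)     # flattened one-hot inputs

# Linear encoder (mean and log-variance heads) and linear decoder.
W_mu = rng.normal(0, 0.1, (T * V, latent_dim))
b_mu = np.zeros(latent_dim)
W_lv = rng.normal(0, 0.1, (T * V, latent_dim))
b_lv = np.zeros(latent_dim)
W_de = rng.normal(0, 0.1, (latent_dim, T * V))
b_de = np.zeros(T * V)

def decode(z):
    """Map latent vectors to per-position token probabilities."""
    logits = (z @ W_de + b_de).reshape(-1, T, V)
    logits = logits - logits.max(axis=-1, keepdims=True)
    p = np.exp(logits)
    return p / p.sum(axis=-1, keepdims=True)

def recon_loss(p):
    """Mean per-sentence cross-entropy against the true token ids."""
    picked = p[np.arange(N)[:, None], np.arange(T)[None, :], padded]
    return -np.log(picked).sum(axis=1).mean()

def to_text(ids):
    return " ".join(index_word.get(int(i), "<pad>") for i in ids)

initial_loss = recon_loss(decode(x @ W_mu + b_mu))
lr, beta = 0.05, 0.1
for step in range(2000):
    mu, log_var = x @ W_mu + b_mu, x @ W_lv + b_lv
    eps = rng.standard_normal(mu.shape)
    std = np.exp(0.5 * log_var)
    z = mu + std * eps                      # reparameterization trick
    p = decode(z)
    # Hand-derived gradients of (reconstruction + beta * KL), batch-averaged.
    d_logits = (p.reshape(N, T * V) - x) / N
    d_z = d_logits @ W_de.T
    d_mu = d_z + beta * mu / N
    d_lv = d_z * eps * 0.5 * std + beta * 0.5 * (np.exp(log_var) - 1.0) / N
    grads = [(W_de, z.T @ d_logits), (b_de, d_logits.sum(0)),
             (W_mu, x.T @ d_mu), (b_mu, d_mu.sum(0)),
             (W_lv, x.T @ d_lv), (b_lv, d_lv.sum(0))]
    for param, grad in grads:
        param -= lr * grad                  # in-place SGD update

final_loss = recon_loss(decode(x @ W_mu + b_mu))
recon_ids = decode(x @ W_mu + b_mu).argmax(axis=-1)
for row in recon_ids:
    print("reconstructed:", to_text(row))
sampled = decode(rng.standard_normal((1, latent_dim))).argmax(axis=-1)[0]
print("sampled from prior:", to_text(sampled))
print(f"reconstruction loss: {initial_loss:.3f} -> {final_loss:.3f}")
```

Reconstructions decode the posterior mean of each training sentence, while the sampled line decodes a latent vector drawn from the prior, so it may mix words from different sentences. With only five sentences the model mostly memorizes, and the exact outputs depend on the seed and hyperparameters, so none are listed here.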


Q.7. Use a pre-trained GPT model (like GPT-2 or GPT-3) to translate a short
English paragraph into French and German. Provide the original and translated text.

In [5]:
# Answer ->>

from transformers import pipeline

# Load GPT-2 model
generator = pipeline("text-generation", model="gpt2")

text = """
Translate the following paragraph into French:

Artificial Intelligence is transforming the world rapidly.
It is used in healthcare, finance, education, and transportation
to improve efficiency and decision-making.
"""

output = generator(text, max_length=200, temperature=0.3)

print(output[0]['generated_text'])

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Both `max_new_tokens` (=256) and `max_length`(=200) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)



Translate the following paragraph into French:

Artificial Intelligence is transforming the world rapidly. 
It is used in healthcare, finance, education, and transportation 
to improve efficiency and decision-making.

It is used in the production of new technologies, including robotics, artificial intelligence, and robotics.

It is used in the production of new technologies, including robotics, artificial intelligence, and robotics.

It is used in the production of new technologies, including robotics, artificial intelligence, and robotics.

It is used in the production of new technologies, including robotics, artificial intelligence, and robotics.

It is used in the production of new technologies, including robotics, artificial intelligence, and robotics.

It is used in the production of new technologies, including robotics, artificial intelligence, and robotics.

It is used in the production of new technologies, including robotics, artificial intelligence, and robotics.

It is used 
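In the run above, GPT-2 continues the English text instead of translating it, and the German half of the task is not attempted. Base GPT-2 is a text-continuation model trained mostly on English, so a bare instruction tends not to work. One common workaround is a few-shot prompt that shows the "English: ... / French: ..." pattern and stops where the translation should begin. The sketch below only builds such prompts for both target languages; the helper name `build_translation_prompt` and the example pairs are assumptions for illustration, and even with this prompt GPT-2's translations are likely to be poor compared with a model trained for translation.

```python
# Few-shot example pairs (assumed for illustration; any short, correct pairs work).
FEW_SHOT = {
    "French": [("Good morning.", "Bonjour."),
               ("Thank you very much.", "Merci beaucoup.")],
    "German": [("Good morning.", "Guten Morgen."),
               ("Thank you very much.", "Vielen Dank.")],
}

def build_translation_prompt(text, language):
    """Format a few-shot prompt that ends where the translation should start."""
    lines = []
    for source, target in FEW_SHOT[language]:
        lines.append(f"English: {source}")
        lines.append(f"{language}: {target}")
        lines.append("")
    lines.append(f"English: {' '.join(text.split())}")   # collapse line breaks
    lines.append(f"{language}:")
    return "\n".join(lines)

paragraph = """Artificial Intelligence is transforming the world rapidly.
It is used in healthcare, finance, education, and transportation
to improve efficiency and decision-making."""

for language in ("French", "German"):
    prompt = build_translation_prompt(paragraph, language)
    print(prompt, end="\n\n")
    # To generate, pass the prompt to the pipeline from the cell above, e.g.:
    # generator(prompt, max_new_tokens=60, do_sample=False)
```

Using `max_new_tokens` instead of `max_length` also avoids the conflict reported in the warning above.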

Q.8. Implement a simple attention-based encoder-decoder model for
English-to-Spanish translation using Tensorflow or PyTorch.


In [None]:
# Answer  ->>

import tensorflow as tf
import numpy as np
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# =========================
# 1. Dataset
# =========================
pairs = [
    ("hello", "hola"),
    ("how are you", "como estas"),
    ("i am fine", "estoy bien"),
    ("thank you", "gracias"),
    ("good night", "buenas noches")
]

eng_sentences = [p[0] for p in pairs]
spa_sentences = ["<start> " + p[1] + " <end>" for p in pairs]

# =========================
# 2. Tokenization
# =========================
eng_tokenizer = Tokenizer(filters='')
spa_tokenizer = Tokenizer(filters='')

eng_tokenizer.fit_on_texts(eng_sentences)
spa_tokenizer.fit_on_texts(spa_sentences)

eng_seq = eng_tokenizer.texts_to_sequences(eng_sentences)
spa_seq = spa_tokenizer.texts_to_sequences(spa_sentences)

max_eng_len = max(len(seq) for seq in eng_seq)
max_spa_len = max(len(seq) for seq in spa_seq)

eng_seq = pad_sequences(eng_seq, maxlen=max_eng_len, padding='post')
spa_seq = pad_sequences(spa_seq, maxlen=max_spa_len, padding='post')

eng_vocab_size = len(eng_tokenizer.word_index) + 1
spa_vocab_size = len(spa_tokenizer.word_index) + 1

# =========================
# 3. Attention Layer
# =========================
class BahdanauAttention(tf.keras.layers.Layer):
    def __init__(self, units):
        super().__init__()
        self.W1 = tf.keras.layers.Dense(units)
        self.W2 = tf.keras.layers.Dense(units)
        self.V = tf.keras.layers.Dense(1)

    def call(self, hidden, encoder_outputs):
        hidden_with_time_axis = tf.expand_dims(hidden, 1)
        score = self.V(tf.nn.tanh(
            self.W1(encoder_outputs) + self.W2(hidden_with_time_axis)
        ))
        attention_weights = tf.nn.softmax(score, axis=1)
        context_vector = attention_weights * encoder_outputs
        context_vector = tf.reduce_sum(context_vector, axis=1)
        return context_vector, attention_weights

# =========================
# 4. Encoder
# =========================
embedding_dim = 64
units = 128

class Encoder(tf.keras.Model):
    def __init__(self, vocab_size, embedding_dim, units):
        super().__init__()
        self.embedding = tf.keras.layers.Embedding(vocab_size, embedding_dim)
        self.lstm = tf.keras.layers.LSTM(
            units,
            return_sequences=True,
            return_state=True
        )

    def call(self, x):
        x = self.embedding(x)
        output, state_h, state_c = self.lstm(x)
        return output, state_h, state_c

# =========================
# 5. Decoder
# =========================
class Decoder(tf.keras.Model):
    def __init__(self, vocab_size, embedding_dim, units):
        super().__init__()
        self.embedding = tf.keras.layers.Embedding(vocab_size, embedding_dim)
        self.lstm = tf.keras.layers.LSTM(
            units,
            return_sequences=True,
            return_state=True
        )
        self.fc = tf.keras.layers.Dense(vocab_size)
        self.attention = BahdanauAttention(units)

    def call(self, x, hidden, cell, encoder_outputs):
        context_vector, attention_weights = self.attention(hidden, encoder_outputs)
        x = self.embedding(x)
        x = tf.concat([tf.expand_dims(context_vector, 1), x], axis=-1)
        output, state_h, state_c = self.lstm(x, initial_state=[hidden, cell])
        output = tf.reshape(output, (-1, output.shape[2]))
        x = self.fc(output)
        return x, state_h, state_c, attention_weights

# =========================
# 6. Initialize Models
# =========================
encoder = Encoder(eng_vocab_size, embedding_dim, units)
decoder = Decoder(spa_vocab_size, embedding_dim, units)

optimizer = tf.keras.optimizers.Adam()
loss_object = tf.keras.losses.SparseCategoricalCrossentropy(
    from_logits=True, reduction='none'
)

# =========================
# 7. Training Step
# =========================
@tf.function
def train_step(inp, targ):
    loss = 0

    with tf.GradientTape() as tape:
        enc_output, enc_hidden, enc_cell = encoder(inp)

        dec_hidden = enc_hidden
        dec_cell = enc_cell

        dec_input = tf.expand_dims(
            [spa_tokenizer.word_index['<start>']] * inp.shape[0], 1
        )

        for t in range(1, targ.shape[1]):
            predictions, dec_hidden, dec_cell, _ = decoder(
                dec_input, dec_hidden, dec_cell, enc_output
            )

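            # Mask padded target positions, then reduce to a scalar so the
            # accumulated loss can be printed as a single number per epoch.
            step_loss = loss_object(targ[:, t], predictions)
            mask = tf.cast(tf.not_equal(targ[:, t], 0), step_loss.dtype)
            loss += tf.reduce_mean(step_loss * mask)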
            dec_input = tf.expand_dims(targ[:, t], 1)

    batch_loss = loss / int(targ.shape[1])
    variables = encoder.trainable_variables + decoder.trainable_variables
    gradients = tape.gradient(loss, variables)
    optimizer.apply_gradients(zip(gradients, variables))

    return batch_loss

# =========================
# 8. Training Loop
# =========================
EPOCHS = 300
dataset = tf.data.Dataset.from_tensor_slices((eng_seq, spa_seq)).batch(2)

for epoch in range(EPOCHS):
    total_loss = 0
    for inp, targ in dataset:
        batch_loss = train_step(inp, targ)
        total_loss += batch_loss

    if epoch % 50 == 0:
        print(f"Epoch {epoch}, Loss {total_loss.numpy():.4f}")

# =========================
# 9. Translation Function
# =========================
def translate(sentence):
    inputs = eng_tokenizer.texts_to_sequences([sentence])
    inputs = pad_sequences(inputs, maxlen=max_eng_len, padding='post')
    inputs = tf.convert_to_tensor(inputs)

    result = ""

    enc_out, enc_hidden, enc_cell = encoder(inputs)
    dec_hidden = enc_hidden
    dec_cell = enc_cell

    dec_input = tf.expand_dims([spa_tokenizer.word_index['<start>']], 0)

    for t in range(max_spa_len):
        predictions, dec_hidden, dec_cell, attention_weights = decoder(
            dec_input, dec_hidden, dec_cell, enc_out
        )

        predicted_id = tf.argmax(predictions[0]).numpy()
        word = spa_tokenizer.index_word.get(predicted_id, '')

        if word == '<end>':
            break

        result += word + " "
        dec_input = tf.expand_dims([predicted_id], 0)

    return result.strip()

# =========================
# 10. Test Translation
# =========================
print("\nTranslations:")
print("hello ->", translate("hello"))
print("thank you ->", translate("thank you"))
print("good night ->", translate("good night"))

Q.9. Use the following short poetry dataset to simulate poem generation with a
pre-trained GPT model:

["Roses are red, violets are blue,",
"Sugar is sweet, and so are you.",
"The moon glows bright in silent skies,",
"A bird sings where the soft wind sighs."]

Using this dataset as a reference for poetic structure and language, generate a new 2-4
line poem using a pre-trained GPT model (such as GPT-2). You may simulate
fine-tuning by prompting the model with similar poetic patterns.

Include your code, the prompt used, and the generated poem in your answer.

In [6]:
# Answer ->>

from transformers import pipeline, set_seed

# Load GPT-2 model
generator = pipeline("text-generation", model="gpt2")

# Set seed for reproducibility
set_seed(42)

# Prompt
prompt = """
Write a short 2-4 line poem in the style of the following lines:

Roses are red, violets are blue,
Sugar is sweet, and so are you.
The moon glows bright in silent skies,
A bird sings where the soft wind sighs.

New poem:
"""

# Generate text
output = generator(
    prompt,
    max_length=120,
    num_return_sequences=1,
    temperature=0.8,
    top_k=50
)

print(output[0]["generated_text"])

Passing `generation_config` together with generation-related arguments=({'top_k', 'num_return_sequences', 'temperature', 'max_length'}) is deprecated and will be removed in future versions. Please pass either a `generation_config` object OR all generation parameters explicitly, but not both.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Both `max_new_tokens` (=256) and `max_length`(=120) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)



Write a short 2-4 line poem in the style of the following lines:

Roses are red, violets are blue,
Sugar is sweet, and so are you.
The moon glows bright in silent skies,
A bird sings where the soft wind sighs.

New poem:

Climb aboard the starship,

The moon is blue and red and white,

There's a great deal to see.

The moon glows bright in silent skies,

A bird sings where the soft wind sighs.

New poem:

Climb aboard the starship,

The moon is green and blue,

You're called to the moon,

The moon glows bright in silent skies,

A bird sings where the soft wind sighs.

New poem:

Climb aboard the starship,

The moon is pink and pink,

You're called to the moon,

The moon glows bright in silent skies,

A bird sings where the soft wind sighs.

New poem:

Climb aboard the starship,

The moon is purple and blue,

The moon glows bright in silent skies,

A bird sings where the soft wind sighs.

New poem:

Climb aboard the starship,

The moon is green and blue,

The moon glows bright in silen

Q.10. Imagine you are building a creative writing assistant for a publishing
company. The assistant should generate story plots and character descriptions using
Generative AI. Describe how you would design the system, including model selection,
training data, bias mitigation, and evaluation methods. Explain the real-world challenges
you might face.

Answer ->>

Designing a creative writing assistant for a publishing company involves balancing generative power with ethical safeguards to produce compelling, original story plots and character descriptions. The system would leverage large language models fine-tuned for narrative creativity, ensuring outputs align with publishing standards like coherence, diversity, and market appeal.

**Model Selection :**
I'd select a transformer-based large language model such as GPT-4o or Llama 3.1 (405B parameters) as the core, because such models perform well at zero-shot creative text generation conditioned on the conversation history. For domain specificity, fine-tune a smaller model like Mistral-7B-Instruct on creative writing tasks using LoRA (Low-Rank Adaptation) to minimize compute costs while preserving narrative fluency. Integrate retrieval-augmented generation (RAG) with a vector index (e.g., FAISS) storing genre-specific plot tropes and archetypes, so that relevant inspirations can be injected into the prompt at generation time and outputs stay better grounded.
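The retrieval step can be sketched without any external service. The example below ranks a few trope snippets by bag-of-words cosine similarity and prepends the best match to the prompt. The snippets, the helper names, and the prompt layout are hypothetical; a production system would replace `embed` with a learned embedding model and the list scan with a vector index.

```python
import math
from collections import Counter

# Hypothetical trope snippets standing in for a genre knowledge base.
TROPES = [
    "fantasy: a reluctant heir must reclaim a stolen throne",
    "thriller: a detective races to stop a bomber before midnight",
    "romance: rival bakers fall in love during a village contest",
]

def embed(text):
    """Bag-of-words counts; a real system would use learned embeddings."""
    return Counter(text.lower().replace(":", " ").split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(query, k=1):
    """Return the k snippets most similar to the query."""
    q = embed(query)
    return sorted(TROPES, key=lambda t: cosine(q, embed(t)), reverse=True)[:k]

def build_prompt(request):
    """Inject retrieved tropes ahead of the writing task."""
    context = "\n".join(retrieve(request))
    return f"Relevant tropes:\n{context}\n\nTask: {request}\nPlot outline:"

print(build_prompt("a detective story set at midnight"))
```

Keeping the tropes outside the model weights means editors can update or remove entries without retraining.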

**Training Data :**
Curate a high-quality dataset of 100K+ public-domain stories, outlines, and character sheets from sources like Project Gutenberg, Wattpad archives (with permissions), and anonymized publishing slush piles. Augment with synthetic data generated via self-instruct methods: prompt base models to create diverse plots across genres (fantasy, romance, thriller). Include metadata tags for genre, tone, length, and diversity markers. Preprocess to filter low-quality text (e.g., via perplexity scores) and ensure 40% representation of underrepresented voices in authorship.

**Bias Mitigation :**
Embed fairness at every stage. During data curation, use tools like Perspective API to flag toxic content, review stereotypical content separately, and oversample diverse examples (e.g., non-Western settings, LGBTQ+ characters). In training, apply debiasing techniques like counterfactual data augmentation, which rewrites biased prompts into alternative variants (e.g., "strong male hero" → gender-neutral or gender-swapped versions). At inference, deploy a multi-stage filter: (1) bias classifiers scoring outputs for stereotypes, (2) human-in-the-loop review for edge cases, and (3) style transfer modules to enforce inclusivity. Regularly audit generated characters with association tests in the spirit of WEAT (Word Embedding Association Test).
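Counterfactual data augmentation can be illustrated with a word-swap pass over training prompts. The swap list below is a tiny hand-picked assumption; real lists are much larger and need extra care with ambiguous words such as "her", which this sketch deliberately leaves out.

```python
import re

# Tiny illustrative swap list (an assumption, not a complete resource).
PAIRS = [("he", "she"), ("man", "woman"), ("king", "queen"), ("hero", "heroine")]
SWAP = {}
for a, b in PAIRS:
    SWAP[a], SWAP[b] = b, a

PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, SWAP)) + r")\b",
                     flags=re.IGNORECASE)

def counterfactual(text):
    """Return the text with each listed term replaced by its counterpart."""
    def replace(match):
        word = match.group(0)
        swapped = SWAP[word.lower()]
        return swapped.capitalize() if word[0].isupper() else swapped
    return PATTERN.sub(replace, text)

def augment(prompts):
    """Keep every original prompt and add its counterfactual when it differs."""
    out = []
    for prompt in prompts:
        out.append(prompt)
        swapped = counterfactual(prompt)
        if swapped != prompt:
            out.append(swapped)
    return out

print(augment(["The strong hero saved the king. He smiled.",
               "Rain fell on the city."]))
```

Training on both versions of each prompt discourages the model from tying traits such as strength to one gender.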

**Evaluation Methods :**
Use a hybrid human-AI framework:

- Automated: ROUGE/BERTScore for similarity to gold-standard plots; Self-BLEU for diversity across outputs (lower means less repetition and more original plots); n-gram uniqueness across batches as a second diversity signal.

- Human: Blind A/B tests with editors rating plots/characters on scales for engagement (1-5), originality, and commercial viability; inter-annotator agreement via Cohen's kappa.

- Publisher-Specific: Track downstream metrics like acceptance rates into editing pipelines or reader surveys. Conduct iterative red-teaming: generate adversarial prompts to expose weaknesses.
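Two of the metrics listed above are simple enough to compute directly. The sketch below implements n-gram uniqueness (often called distinct-n) for a batch of generated plots and Cohen's kappa for two editors' ratings. The function names and sample data are assumptions for illustration; an established metrics library would be preferable in production.

```python
from collections import Counter

def distinct_n(texts, n=2):
    """Share of unique n-grams across a batch (higher means more varied)."""
    ngrams = []
    for text in texts:
        tokens = text.lower().split()
        ngrams.extend(tuple(tokens[i:i + n])
                      for i in range(len(tokens) - n + 1))
    return len(set(ngrams)) / len(ngrams) if ngrams else 0.0

def cohens_kappa(ratings_a, ratings_b):
    """Agreement between two raters, corrected for chance agreement.

    Assumes equal-length rating lists and that chance agreement is below 1.
    """
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    return (observed - expected) / (1.0 - expected)

plots = ["a knight seeks a lost crown", "a knight seeks a lost sword"]
print(distinct_n(plots, n=2))            # repeated openings lower the score
print(cohens_kappa([5, 4, 3, 4], [5, 4, 2, 4]))
```

A low distinct-n across a batch is an early warning of repetitive plots, and a low kappa suggests the rating guidelines need tightening before the human scores are trusted.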

**Real-World Challenges :**

Fine-tuning at scale incurs high GPU costs (~$10K+ per run), risking budget overruns; mitigate with cloud bursting or model distillation. Legal hurdles include IP lawsuits over outputs that resemble training data, which can be addressed with opt-out datasets and indemnity clauses. User adoption faces skepticism from writers who fear job loss; counter this by positioning the tool as a "co-pilot" with collaborative interfaces. Output inconsistency (e.g., repetitive plots) requires ongoing prompt engineering, while ethical pushback on AI authorship calls for transparent watermarking (e.g., SynthID). Finally, evolving regulations, such as a possible "high-risk" classification under the EU AI Act, could mandate audits and delay deployment.