# Generative AI - Text Generation and Machine Translation | Assignment

**Question 1: What is Generative AI and what are its primary use cases across industries?**


Generative AI is a type of artificial intelligence that creates new content, such as text, images, audio, video, or code, by learning patterns from large amounts of data. Unlike traditional AI systems that focus on prediction or classification, generative AI produces new outputs that resemble its training data. Its primary use cases include content creation in marketing and media, code generation in software development, medical report drafting in healthcare, synthetic data generation to support fraud detection in finance, personalized tutoring in education, and intelligent chatbots in customer service.

**Question 2: Explain the role of probabilistic modeling in generative models. How do these models differ from discriminative models?**

Probabilistic modeling plays a central role in generative models because these models learn the probability distribution of the data itself. In other words, a generative model tries to estimate the joint probability distribution P(X, Y) or sometimes just P(X), which allows it to generate new data samples that resemble the original dataset. By modeling how data is distributed, generative models can create new text, images, or other content, and can also perform tasks like classification by applying Bayes’ theorem.

In contrast, discriminative models focus only on learning the conditional probability P(Y | X), which directly maps inputs to outputs for prediction tasks. They do not attempt to understand or model how the data was generated; instead, they learn decision boundaries between classes. For example, logistic regression and neural classifiers are discriminative, while models like Naive Bayes, Hidden Markov Models, GANs, and VAEs are generative. The key difference is that generative models model the data distribution and can generate new samples, whereas discriminative models only predict labels.
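As a small illustration of the distinction above, the following sketch uses a made-up joint table P(X, Y) over two binary variables (the numbers are assumptions chosen only for illustration). It shows how a model that knows the joint distribution can both recover P(Y | X) by conditioning and draw new (x, y) pairs, whereas a model that stores only P(Y | X) can do the former but not the latter.

```python
import numpy as np

# Hypothetical joint distribution P(X, Y): rows are x in {0, 1}, columns are y in {0, 1}.
joint = np.array([[0.30, 0.10],
                  [0.15, 0.45]])

# Marginal P(X), obtained by summing the joint over y.
p_x = joint.sum(axis=1)

# Conditioning (Bayes' rule): P(Y | X) = P(X, Y) / P(X).
p_y_given_x = joint / p_x[:, None]

# Generative use: draw new (x, y) pairs from the joint distribution.
rng = np.random.default_rng(0)
flat_idx = rng.choice(joint.size, size=5, p=joint.ravel())
samples = np.column_stack(np.unravel_index(flat_idx, joint.shape))

# Discriminative use: pick the most likely label for each value of x.
predicted_labels = p_y_given_x.argmax(axis=1)

print("P(Y | X):\n", p_y_given_x)
print("Sampled (x, y) pairs:\n", samples)
print("Predicted label per x:", predicted_labels)
```

Only the first half of the script needs the joint table; a discriminative model would learn `p_y_given_x` directly and would have nothing to sample `samples` from.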

**Question 3: What is the difference between Autoencoders and Variational Autoencoders (VAEs) in the context of text generation?**

A standard Autoencoder consists of an encoder that compresses input text into a fixed latent vector and a decoder that reconstructs the original text. It learns a deterministic mapping, meaning each input corresponds to a single point in the latent space. While useful for feature extraction and dimensionality reduction, it is not well-suited for generating new text because its latent space is not structured probabilistically.

A Variational Autoencoder (VAE), on the other hand, is a probabilistic generative model. Instead of mapping inputs to a single vector, it maps them to a distribution (usually defined by a mean and variance). A latent vector is sampled from this distribution and passed to the decoder, creating a smooth and continuous latent space. This allows VAEs to generate new and diverse text samples, making them more suitable for text generation tasks.
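The contrast between a single latent point and a latent distribution can be sketched in a few lines of NumPy. The encoder outputs below (`z_mean`, `z_log_var`) are hypothetical values, not taken from a trained model, and `sample_latent` is an illustrative helper. The sketch shows the reparameterization step and the closed-form KL term that pulls the latent distribution toward a standard normal prior.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical encoder outputs for one sentence (illustrative values only).
z_mean = np.array([0.5, -1.0])
z_log_var = np.array([0.0, -2.0])

# A plain autoencoder would use one fixed latent code for this input.
z_deterministic = z_mean

def sample_latent(mean, log_var, rng):
    """Reparameterization: z = mean + std * epsilon, with epsilon ~ N(0, I)."""
    epsilon = rng.standard_normal(mean.shape)
    return mean + np.exp(0.5 * log_var) * epsilon

# A VAE draws a different code each time it encodes the same input.
z_samples = np.stack([sample_latent(z_mean, z_log_var, rng) for _ in range(3)])

# KL divergence between N(mean, var) and the prior N(0, I),
# summed over the latent dimensions.
kl = -0.5 * np.sum(1 + z_log_var - z_mean**2 - np.exp(z_log_var))

print("Autoencoder code:", z_deterministic)
print("VAE samples:\n", z_samples)
print("KL term:", kl)
```

Because nearby samples must all decode to plausible text, the decoder is pushed toward a smoother latent space than a deterministic autoencoder would learn.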

**Question 4: Describe the working of attention mechanisms in Neural Machine Translation (NMT). Why are they critical?**

In Neural Machine Translation (NMT), the attention mechanism allows the decoder to focus on relevant parts of the source sentence while generating each target word. Instead of relying on a single fixed context vector from the encoder, the model calculates attention weights over all encoder hidden states at every decoding step. These weights determine which source words are most important for predicting the next target word, and a context vector is formed accordingly.

Attention is critical because it solves the problem of information loss in long sentences and improves alignment between source and target words. It significantly enhances translation quality, especially for complex and lengthy inputs, and forms the foundation of modern architectures like the Transformer.
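The per-step computation described above can be sketched with NumPy. This is a minimal example assuming dot-product scoring, which is only one of several scoring functions (additive scoring with a learned layer is another); the encoder states and decoder state are made-up vectors, and `attention_context` is an illustrative helper.

```python
import numpy as np

def softmax(scores):
    """Numerically stable softmax over a 1-D array."""
    shifted = scores - scores.max()
    exp_scores = np.exp(shifted)
    return exp_scores / exp_scores.sum()

def attention_context(decoder_state, encoder_states):
    """Return attention weights and the context vector for one decoding step."""
    scores = encoder_states @ decoder_state   # one score per source position
    weights = softmax(scores)                 # normalize scores to sum to 1
    context = weights @ encoder_states        # weighted sum of encoder states
    return weights, context

# Hypothetical encoder hidden states for a 3-word source sentence (hidden size 2).
encoder_states = np.array([[1.0, 0.0],
                           [0.0, 1.0],
                           [1.0, 1.0]])
# Hypothetical decoder hidden state at the current step.
decoder_state = np.array([2.0, 0.0])

weights, context = attention_context(decoder_state, encoder_states)
print("Attention weights:", weights)
print("Context vector:", context)
```

Here the first and third source positions align best with the decoder state, so they receive equal and larger weights than the second position, and the context vector is recomputed in the same way at every decoding step.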

**Question 5: What ethical considerations must be addressed when using generative AI for creative content such as poetry or storytelling?**

When using generative AI for creative content like poetry or storytelling, several ethical considerations must be addressed. One major concern is authorship and intellectual property, as AI models are trained on large datasets that may include copyrighted material. There is a risk of unintentionally generating content that closely resembles existing works, raising issues of plagiarism and ownership. Transparency is also important: audiences should know whether content is AI-generated or human-written.

Another key concern is bias and harmful content. Generative models may reflect societal biases present in their training data, leading to stereotypical, offensive, or misleading outputs. Additionally, misuse for misinformation, deepfake narratives, or manipulation is a serious risk. Therefore, responsible use requires content moderation, bias mitigation, clear attribution, and adherence to legal and ethical guidelines.


In [5]:
# Question 6: Use the following small text dataset to train a simple Variational
# Autoencoder (VAE) for text reconstruction:
# ["The sky is blue", "The sun is bright", "The grass is green",
# "The night is dark", "The stars are shining"]
# 1. Preprocess the data (tokenize and pad the sequences).
# 2. Build a basic VAE model for text reconstruction.
# 3. Train the model and show how it reconstructs or generates similar sentences.


import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# -----------------------------
# 1. Dataset
# -----------------------------
sentences = [
    "The sky is blue",
    "The sun is bright",
    "The grass is green",
    "The night is dark",
    "The stars are shining"
]

# -----------------------------
# 2. Preprocessing
# -----------------------------
tokenizer = Tokenizer()
tokenizer.fit_on_texts(sentences)
sequences = tokenizer.texts_to_sequences(sentences)

vocab_size = len(tokenizer.word_index) + 1
max_len = max(len(seq) for seq in sequences)

X = pad_sequences(sequences, maxlen=max_len, padding='post')

# -----------------------------
# 3. VAE Parameters
# -----------------------------
embedding_dim = 16
latent_dim = 8

# -----------------------------
# 4. Encoder
# -----------------------------
encoder_inputs = keras.Input(shape=(max_len,))
x = layers.Embedding(vocab_size, embedding_dim)(encoder_inputs)
x = layers.LSTM(32)(x)

z_mean = layers.Dense(latent_dim)(x)
z_log_var = layers.Dense(latent_dim)(x)

# Reparameterization
def sampling(args):
    z_mean, z_log_var = args
    epsilon = tf.random.normal(shape=(tf.shape(z_mean)[0], latent_dim))
    return z_mean + tf.exp(0.5 * z_log_var) * epsilon

z = layers.Lambda(sampling)([z_mean, z_log_var])

encoder = keras.Model(encoder_inputs, [z_mean, z_log_var, z])

# -----------------------------
# 5. Decoder
# -----------------------------
latent_inputs = keras.Input(shape=(latent_dim,))
x = layers.RepeatVector(max_len)(latent_inputs)
x = layers.LSTM(32, return_sequences=True)(x)
decoder_outputs = layers.TimeDistributed(
    layers.Dense(vocab_size, activation="softmax")
)(x)

decoder = keras.Model(latent_inputs, decoder_outputs)

# -----------------------------
# 6. VAE Model (Subclassed)
# -----------------------------
class VAE(keras.Model):
    def __init__(self, encoder, decoder):
        super(VAE, self).__init__()
        self.encoder = encoder
        self.decoder = decoder
        self.loss_tracker = keras.metrics.Mean(name="loss")

    def train_step(self, data):
        with tf.GradientTape() as tape:
            z_mean, z_log_var, z = self.encoder(data)
            reconstruction = self.decoder(z)

            # Reconstruction loss
            recon_loss = keras.losses.sparse_categorical_crossentropy(
                data, reconstruction
            )
            recon_loss = tf.reduce_mean(recon_loss)

            # KL loss
            kl_loss = -0.5 * tf.reduce_mean(
                1 + z_log_var - tf.square(z_mean) - tf.exp(z_log_var)
            )

            total_loss = recon_loss + kl_loss

        grads = tape.gradient(total_loss, self.trainable_weights)
        self.optimizer.apply_gradients(zip(grads, self.trainable_weights))

        self.loss_tracker.update_state(total_loss)
        return {"loss": self.loss_tracker.result()}

vae = VAE(encoder, decoder)
vae.compile(optimizer="adam")

# -----------------------------
# 7. Training
# -----------------------------
vae.fit(X, epochs=200, batch_size=2, verbose=0)

# -----------------------------
# 8. Reconstruction
# -----------------------------
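# Decode from the latent mean (rather than a random sample z) so the
# reconstruction is deterministic for a given input.
z_mean, z_log_var, z = encoder.predict(X)
reconstructed = decoder.predict(z_mean)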
predicted_tokens = np.argmax(reconstructed, axis=-1)

print("\nOriginal vs Reconstructed:\n")
for i, seq in enumerate(predicted_tokens):
    words = [tokenizer.index_word.get(idx, "") for idx in seq if idx != 0]
    print("Original:", sentences[i])
    print("Reconstructed:", " ".join(words))
    print()


1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 132ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 151ms/step

Original vs Reconstructed:

Original: The sky is blue
Reconstructed: the the is is

Original: The sun is bright
Reconstructed: the is is is

Original: The grass is green
Reconstructed: the is is is

Original: The night is dark
Reconstructed: the is is dark

Original: The stars are shining
Reconstructed: stars are shining shining
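To put a number on the reconstruction quality shown above, one simple option is position-by-position token accuracy. The sketch below is an assumption about how one might score the output, not part of the original notebook; `token_accuracy` is a hypothetical helper, and the sentence pairs are copied from the printed output.

```python
def token_accuracy(originals, reconstructions):
    """Fraction of tokens that match at the same position (case-insensitive)."""
    matches, total = 0, 0
    for original, reconstruction in zip(originals, reconstructions):
        original_tokens = original.lower().split()
        reconstructed_tokens = reconstruction.lower().split()
        total += len(original_tokens)
        matches += sum(o == r for o, r in zip(original_tokens, reconstructed_tokens))
    return matches / total

originals = [
    "The sky is blue",
    "The sun is bright",
    "The grass is green",
    "The night is dark",
    "The stars are shining",
]
reconstructions = [
    "the the is is",
    "the is is is",
    "the is is is",
    "the is is dark",
    "stars are shining shining",
]

accuracy = token_accuracy(originals, reconstructions)
print(f"Token accuracy: {accuracy:.2f}")
```

Most of the matches come from the frequent words "the" and "is", which suggests the decoder has mainly learned the shared sentence template rather than the content words.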



In [8]:
# Question 7: Use a pre-trained GPT model (like GPT-2 or GPT-3) to translate a short
# English paragraph into French and German. Provide the original and translated text.

import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Load GPT-2 model and tokenizer
model_name = "gpt2"
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
model = GPT2LMHeadModel.from_pretrained(model_name)

model.eval()

# English paragraph
english_text = """Artificial Intelligence is transforming the world by enabling machines to learn from data and make intelligent decisions. It is widely used in healthcare, finance, education, and many other industries."""

# Translation function
def translate(text, target_language):
    prompt = f"Translate the following English text into {target_language}:\n{text}\nTranslation:"

    inputs = tokenizer.encode(prompt, return_tensors="pt")

    with torch.no_grad():
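        # Greedy decoding: `temperature` only takes effect when do_sample=True,
        # so it is omitted here. `max_length` counts the prompt tokens as well.
        outputs = model.generate(
            inputs,
            max_length=250,
            num_return_sequences=1,
            pad_token_id=tokenizer.eos_token_id
        )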

    decoded_text = tokenizer.decode(outputs[0], skip_special_tokens=True)

    # Extract only translation part
    if "Translation:" in decoded_text:
        translated_part = decoded_text.split("Translation:")[-1].strip()
    else:
        translated_part = decoded_text.strip()

    return translated_part

# Generate translations
french_translation = translate(english_text, "French")
german_translation = translate(english_text, "German")

# Print results
print("Original English:\n")
print(english_text)

print("\nFrench Translation:\n")
print(french_translation)

print("\nGerman Translation:\n")
print(german_translation)



GPT2LMHeadModel LOAD REPORT from: gpt2
Key                  | Status     |  | 
---------------------+------------+--+-
h.{0...11}.attn.bias | UNEXPECTED |  | 

Notes:
- UNEXPECTED	:can be ignored when loading from different task/architecture; not ok if you expect identical arch.


Original English:

Artificial Intelligence is transforming the world by enabling machines to learn from data and make intelligent decisions. It is widely used in healthcare, finance, education, and many other industries.

French Translation:

Artificial Intelligence is transforming the world by enabling machines to learn from data and make intelligent decisions. It is widely used in healthcare, finance, education, and many other industries.
The following is a translation of the following text into French:
Artificial Intelligence is transforming the world by enabling machines to learn from data and make intelligent decisions. It is widely used in healthcare, finance, education, and many other industries.
The following is a translation of the following text into German:
Artificial Intelligence is transforming the world by enabling machines to learn from data and make intelligent decisions. It is widely used in healthcare, finance, education, and many other industries.
The following is a 
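The output above suggests that the base GPT-2 checkpoint continues the English prompt rather than translating it, which is plausible given that it has not been tuned to follow instructions. One workaround that is sometimes tried with base language models is few-shot prompting, where the prompt shows a few source/target pairs before the sentence to translate. The sketch below only builds such a prompt; `build_few_shot_prompt` and the example pairs are hypothetical, and there is no guarantee that GPT-2 translates well even with this format.

```python
def build_few_shot_prompt(examples, text, target_language):
    """Build a prompt that shows example translations before the new sentence."""
    lines = []
    for source, target in examples:
        lines.append(f"English: {source}")
        lines.append(f"{target_language}: {target}")
        lines.append("")  # blank line between examples
    lines.append(f"English: {text}")
    lines.append(f"{target_language}:")
    return "\n".join(lines)

# Hypothetical demonstration pairs (not part of the original notebook).
french_examples = [
    ("Good morning.", "Bonjour."),
    ("Thank you very much.", "Merci beaucoup."),
]

prompt = build_few_shot_prompt(french_examples, "The weather is nice today.", "French")
print(prompt)
```

The resulting string could be passed to the tokenizer in place of the instruction-style prompt, with the text generated after the final `French:` label treated as the candidate translation.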

In [10]:
# Question 8: Implement a simple attention-based encoder-decoder model for
# English-to-Spanish translation using Tensorflow or PyTorch.

# Simple Attention-based Encoder-Decoder (English → Spanish)

import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, LSTM, Embedding, Dense, Attention, Concatenate
from tensorflow.keras.models import Model
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# -----------------------------
# 1. Sample Data
# -----------------------------
english_sentences = [
    "hello",
    "how are you",
    "i am fine",
    "thank you"
]

spanish_sentences = [
    "hola",
    "como estas",
    "estoy bien",
    "gracias"
]

# Add start and end tokens
spanish_sentences = ["<start> " + s + " <end>" for s in spanish_sentences]

# -----------------------------
# 2. Tokenization
# -----------------------------
eng_tokenizer = Tokenizer(filters='')
eng_tokenizer.fit_on_texts(english_sentences)
eng_sequences = eng_tokenizer.texts_to_sequences(english_sentences)

spa_tokenizer = Tokenizer(filters='')
spa_tokenizer.fit_on_texts(spanish_sentences)
spa_sequences = spa_tokenizer.texts_to_sequences(spanish_sentences)

eng_vocab_size = len(eng_tokenizer.word_index) + 1
spa_vocab_size = len(spa_tokenizer.word_index) + 1

# -----------------------------
# 3. Padding (VERY IMPORTANT)
# -----------------------------
max_len_eng = max(len(seq) for seq in eng_sequences)
max_len_spa = max(len(seq) for seq in spa_sequences)

encoder_input_data = pad_sequences(eng_sequences, maxlen=max_len_eng, padding='post')
decoder_input_data = pad_sequences(
    [seq[:-1] for seq in spa_sequences],
    maxlen=max_len_spa - 1,
    padding='post'
)
decoder_output_data = pad_sequences(
    [seq[1:] for seq in spa_sequences],
    maxlen=max_len_spa - 1,
    padding='post'
)

# -----------------------------
# 4. Model Architecture
# -----------------------------
embedding_dim = 64
units = 128

# Encoder
encoder_inputs = Input(shape=(max_len_eng,))
encoder_embedding = Embedding(eng_vocab_size, embedding_dim)(encoder_inputs)
encoder_outputs, state_h, state_c = LSTM(units, return_sequences=True, return_state=True)(encoder_embedding)

# Decoder
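# Use a variable-length time dimension so the same model also accepts the
# growing target sequence during step-by-step inference in translate().
decoder_inputs = Input(shape=(None,))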
decoder_embedding = Embedding(spa_vocab_size, embedding_dim)(decoder_inputs)
decoder_lstm_outputs, _, _ = LSTM(units, return_sequences=True, return_state=True)(
    decoder_embedding, initial_state=[state_h, state_c]
)

# Attention Layer
attention = Attention()
attention_output = attention([decoder_lstm_outputs, encoder_outputs])

# Concatenate attention output and decoder LSTM output
concat = Concatenate(axis=-1)([decoder_lstm_outputs, attention_output])

# Final Dense layer
decoder_dense = Dense(spa_vocab_size, activation='softmax')
decoder_outputs = decoder_dense(concat)

# Build Model
model = Model([encoder_inputs, decoder_inputs], decoder_outputs)

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
model.summary()

# -----------------------------
# 5. Train Model
# -----------------------------
decoder_output_data = np.expand_dims(decoder_output_data, -1)

model.fit(
    [encoder_input_data, decoder_input_data],
    decoder_output_data,
    batch_size=2,
    epochs=200,
    verbose=1
)

# -----------------------------
# 6. Test Translation
# -----------------------------
def translate(sentence):
    seq = eng_tokenizer.texts_to_sequences([sentence])
    seq = pad_sequences(seq, maxlen=max_len_eng, padding='post')

    target_seq = np.zeros((1, 1))
    target_seq[0, 0] = spa_tokenizer.word_index['<start>']

    decoded_sentence = ""

    for _ in range(max_len_spa):
        output_tokens = model.predict([seq, target_seq], verbose=0)
        sampled_token_index = np.argmax(output_tokens[0, -1, :])
        sampled_word = spa_tokenizer.index_word.get(sampled_token_index, '')

        if sampled_word == "<end>" or sampled_word == '':
            break

        decoded_sentence += " " + sampled_word

        target_seq = np.append(target_seq, [[sampled_token_index]], axis=1)

    return decoded_sentence.strip()

print("\nTranslation Results:")
print("hello →", translate("hello"))
print("how are you →", translate("how are you"))

Epoch 1/200
[1m2/2[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 30ms/step - loss: 2.1948
Epoch 2/200
[1m2/2[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 27ms/step - loss: 2.1783
Epoch 3/200
[1m2/2[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 34ms/step - loss: 2.1615
Epoch 4/200
[1m2/2[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 27ms/step - loss: 2.1543
Epoch 5/200
[1m2/2[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 30ms/step - loss: 2.1379
Epoch 6/200
[1m2/2[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 26ms/step - loss: 2.1192
Epoch 7/200
[1m2/2[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 25ms/step - loss: 2.0771
Epoch 8/200
[1m2/2[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 25ms/step - loss: 2.0540
Epoch 9/200
[1m2/2[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 25ms/step - loss: 2.0248
Epoch 10/200
[1m2/2[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 26ms/step - loss: 1.9713
Epoch 11/



hello → hola
how are you → como estas


In [11]:
# Question 9: Use the following short poetry dataset to simulate poem generation with a
# pre-trained GPT model:
# ["Roses are red, violets are blue,",
# "Sugar is sweet, and so are you.",
# "The moon glows bright in silent skies,",
# "A bird sings where the soft wind sighs."]
# Using this dataset as a reference for poetic structure and language, generate a new 2-4
# line poem using a pre-trained GPT model (such as GPT-2). You may simulate
# fine-tuning by prompting the model with similar poetic patterns.


# Poem Generation using Pre-trained GPT-2

from transformers import GPT2LMHeadModel, GPT2Tokenizer
import torch

# Load pre-trained GPT-2 model and tokenizer
model_name = "gpt2"
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
model = GPT2LMHeadModel.from_pretrained(model_name)

model.eval()

# Given poetry dataset (used as structural reference in prompt)
poetry_dataset = [
    "Roses are red, violets are blue,",
    "Sugar is sweet, and so are you.",
    "The moon glows bright in silent skies,",
    "A bird sings where the soft wind sighs."
]

# Create prompt to simulate fine-tuning via pattern prompting
prompt = "\n".join(poetry_dataset) + "\n\nWrite a new 4-line poem in similar rhyming and lyrical style:\n"

# Encode input
input_ids = tokenizer.encode(prompt, return_tensors="pt")

# Generate poem
with torch.no_grad():
    output = model.generate(
        input_ids,
        max_length=150,
        num_return_sequences=1,
        temperature=0.9,
        top_k=50,
        top_p=0.95,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id
    )

# Decode and print result
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)

print("----- Original Dataset -----")
for line in poetry_dataset:
    print(line)

print("\n----- Generated Poem -----")
print(generated_text[len(prompt):])


----- Original Dataset -----
Roses are red, violets are blue,
Sugar is sweet, and so are you.
The moon glows bright in silent skies,
A bird sings where the soft wind sighs.

----- Generated Poem -----

Song of The Night The sun glows brightly, but what is it?

It's an empty place, without a place.

Song of The Night The sun glows bright, but what is it?

It's an empty place, without a place.

Song of The Night The sun glows bright, but what is it?

It's an empty place, without a place.
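The generated text above loops over the same two lines. A light post-processing step can keep only the first few distinct, non-empty lines so the result fits the requested 2-4 line format. The helper below, `extract_poem`, is a hypothetical sketch and not part of the original notebook; the sample text is copied from the output above.

```python
def extract_poem(generated_text, max_lines=4):
    """Keep the first `max_lines` distinct, non-empty lines of the generated text."""
    poem_lines = []
    seen = set()
    for raw_line in generated_text.splitlines():
        line = raw_line.strip()
        if not line or line in seen:
            continue  # skip blank lines and exact repeats
        seen.add(line)
        poem_lines.append(line)
        if len(poem_lines) == max_lines:
            break
    return "\n".join(poem_lines)

sample_output = "\n".join([
    "",
    "Song of The Night The sun glows brightly, but what is it?",
    "",
    "It's an empty place, without a place.",
    "",
    "Song of The Night The sun glows bright, but what is it?",
    "",
    "It's an empty place, without a place.",
])

poem = extract_poem(sample_output)
print(poem)
```

Repetition can also be reduced at generation time: `model.generate` accepts `no_repeat_ngram_size` and `repetition_penalty`, though suitable values would need to be found by experiment.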




**Question 10: Imagine you are building a creative writing assistant for a publishing company. The assistant should generate story plots and character descriptions using Generative AI. Describe how you would design the system, including model selection, training data, bias mitigation, and evaluation methods. Explain the real-world challenges you might face.**

To build a creative writing assistant, I would use a powerful pre-trained language model such as GPT-4 or LLaMA as the core engine. The system would generate story plots and character descriptions using structured prompts and controlled text generation settings. Fine-tuning on licensed books, public-domain literature, and genre-specific datasets would help the model learn narrative styles and storytelling patterns. A retrieval system could also provide plot frameworks or genre rules to improve coherence.

To reduce bias, the training data would be carefully filtered and regularly audited. Content moderation tools and human review would ensure outputs are safe, diverse, and respectful. Evaluation would combine automated metrics (like coherence and originality checks) with human feedback from editors. Real-world challenges include avoiding copyright issues, preventing biased or repetitive content, maintaining long-story consistency, managing high computation costs, and ensuring publishers trust the AI-generated material.
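As one concrete example of an automated check for repetitive output, the sketch below computes a distinct-n score: the ratio of unique n-grams to total n-grams across a set of generated texts. The choice of this metric is an assumption for illustration (the design above does not name a specific one), and `distinct_n` is a hypothetical helper using simple whitespace tokenization.

```python
def distinct_n(texts, n=2):
    """Ratio of unique n-grams to total n-grams; lower values mean more repetition."""
    total = 0
    unique = set()
    for text in texts:
        tokens = text.lower().split()
        ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        total += len(ngrams)
        unique.update(ngrams)
    return len(unique) / total if total else 0.0

repetitive_plot = ["the hero leaves home the hero leaves home the hero leaves home"]
varied_plot = ["a retired cartographer finds a map that redraws itself every night"]

print("Repetitive plot distinct-2:", distinct_n(repetitive_plot))
print("Varied plot distinct-2:", distinct_n(varied_plot))
```

A score like this only flags surface repetition, so it would complement rather than replace editor feedback on coherence and originality.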



