## Text Generation and Machine Translation

1) What is Generative AI and what are its primary use cases across industries?

->

Generative AI refers to a class of artificial intelligence models that are capable of creating new content such as text, images, audio, video, or code, rather than just analyzing or classifying existing data. These models learn the underlying patterns and structure of data and use that knowledge to generate realistic and meaningful outputs.

### Primary Use Cases Across Industries
- Healthcare
  - Generating synthetic medical data
  - Assisting in medical report summarization
  - Drug discovery and molecular design
- Finance
  - Automated report generation
  - Fraud scenario simulation
  - Personalized financial advice
- Entertainment & Media
  - Storytelling, poetry, and script writing
  - Music and image generation
  - Game design and character creation
- Education
  - Automated content creation
  - Personalized learning material
  - Intelligent tutoring systems
- Software Development
  - Code generation and debugging
  - Documentation writing
  - Test case generation

Overall, Generative AI enhances creativity, productivity, and automation across domains.


2) Explain the role of probabilistic modeling in generative models. How do these models differ from discriminative models?

->

Probabilistic modeling plays a central role in generative models by learning the joint probability distribution of data and labels.

### Role of Probabilistic Modeling
- Models uncertainty in data
- Enables sampling of new data points
- Allows generation of diverse and realistic outputs
- Helps capture hidden (latent) structures in data

Generative models learn:
\[
P(X, Y) \quad \text{or} \quad P(X)
\]
where \(X\) represents the data and \(Y\) the labels. Unsupervised generative models, which have no labels, model \(P(X)\) alone.


### Difference Between Generative and Discriminative Models

| Aspect | Generative Models | Discriminative Models |
|------|------------------|-----------------------|
| What they learn | Joint distribution \(P(X, Y)\) | Conditional probability \(P(Y \mid X)\) |
| Goal | Generate new data | Predict labels |
| Output | New samples | Class predictions |
| Examples | GANs, VAEs, HMMs | Logistic Regression, SVM, CNNs |

In simple terms:  
Generative models learn *how data is created*, while discriminative models learn *how to distinguish between classes*.
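
The contrast in the table can be illustrated with a tiny hand-made joint distribution. The sketch below is only an illustration: the probabilities and the names `joint`, `marginal_x`, `posterior`, and `sample_pair` are assumptions, not part of the original answer. It shows that a model of \(P(X, Y)\) can both draw new pairs and recover \(P(Y \mid X)\) by dividing the joint by the marginal, whereas a discriminative model estimates only the latter.

```python
import random

# Hypothetical joint distribution P(X, Y) over a feature X and a label Y.
joint = {
    ("sunny", "play"): 0.4,
    ("sunny", "stay"): 0.1,
    ("rainy", "play"): 0.1,
    ("rainy", "stay"): 0.4,
}

def marginal_x(x):
    """P(X = x), obtained by summing the joint over all labels."""
    return sum(p for (xi, _), p in joint.items() if xi == x)

def posterior(y, x):
    """P(Y = y | X = x) = P(x, y) / P(x): the quantity a discriminative model targets."""
    return joint.get((x, y), 0.0) / marginal_x(x)

def sample_pair(rng):
    """Draw a new (x, y) pair from the joint, which requires a generative model."""
    pairs = list(joint)
    weights = [joint[p] for p in pairs]
    return rng.choices(pairs, weights=weights, k=1)[0]

rng = random.Random(0)
print("P(play | sunny) =", round(posterior("play", "sunny"), 2))
print("Sampled pair:", sample_pair(rng))
```

Knowing the joint is strictly more information than knowing the conditional, which is why the generative model can answer both kinds of question here.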


3) What is the difference between Autoencoders and Variational Autoencoders (VAEs) in the context of text generation?

->

### Autoencoders
- Consist of an encoder and a decoder
- Learn a deterministic compressed representation of input data
- Focus on reconstruction accuracy
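- Not ideal for generating diverse new text, because the latent space is not regularized and arbitrary latent points may not decode to meaningful output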

### Variational Autoencoders (VAEs)
- Learn a probabilistic latent space
- Encoder outputs a distribution (mean and variance)
  - A latent vector is sampled from this distribution and passed to the decoder
- Better suited for text generation and creativity

### Key Differences

| Aspect | Autoencoder | Variational Autoencoder |
|------|-------------|-------------------------|
| Latent space | Deterministic | Probabilistic |
| Output diversity | Low | High |
| Sampling | Not supported | Supported |
| Text generation | Limited | Effective |

VAEs enable smooth interpolation and generation of novel text samples, making them more suitable for generative tasks.
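
The difference between a deterministic code and a sampled one can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions: the "encoders" are hand-written linear maps rather than trained networks, and the names `encode_deterministic`, `encode_variational`, `sample_latent`, and `kl_to_standard_normal` are hypothetical. The sampling step uses the reparameterization formula, and the KL term is the usual closed form for a diagonal Gaussian against a standard normal, summed over latent dimensions.

```python
import math
import random

def encode_deterministic(x):
    """Toy autoencoder encoder: the same input always maps to the same code."""
    return [0.5 * x, -0.25 * x]

def encode_variational(x):
    """Toy VAE encoder: returns a mean and a log-variance per latent dimension."""
    z_mean = [0.5 * x, -0.25 * x]
    z_log_var = [-1.0, -2.0]
    return z_mean, z_log_var

def sample_latent(z_mean, z_log_var, rng):
    """Reparameterization: z = mean + exp(0.5 * log_var) * epsilon, epsilon ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(z_mean, z_log_var)]

def kl_to_standard_normal(z_mean, z_log_var):
    """KL(N(mean, var) || N(0, 1)), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(z_mean, z_log_var))

rng = random.Random(0)
z_mean, z_log_var = encode_variational(2.0)
print("Autoencoder code:", encode_deterministic(2.0))
print("VAE sample 1    :", sample_latent(z_mean, z_log_var, rng))
print("VAE sample 2    :", sample_latent(z_mean, z_log_var, rng))
print("KL penalty      :", kl_to_standard_normal(z_mean, z_log_var))
```

Repeated calls to `sample_latent` give different codes for the same input, which is the source of output diversity, while the KL penalty pulls the encoder's distributions toward a shared prior so that nearby latent points decode to related outputs.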


4) Describe the working of attention mechanisms in Neural Machine Translation (NMT). Why are they critical?

->

In Neural Machine Translation, attention mechanisms allow the model to focus on relevant parts of the input sentence when generating each word in the output sentence.

### How Attention Works
1. The encoder processes the entire source sentence.
2. At each decoding step, attention:
   - Assigns weights to all input tokens
   - Determines which source words are most relevant
3. The decoder uses this weighted information to generate the next word.

This avoids compressing the entire sentence into a single fixed-length vector.
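
The three steps above can be sketched for a single decoding step as follows. This is a minimal illustration, assuming dot-product scoring (other scoring functions, such as additive attention, follow the same weight-then-sum pattern); the vectors and the names `softmax` and `attend` are made up for the example.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(decoder_state, encoder_states):
    """One attention step: score each source position, then average the encoder states."""
    # Step 2a: one score per source token (dot product with the decoder state)
    scores = [sum(q * k for q, k in zip(decoder_state, h)) for h in encoder_states]
    # Step 2b: normalize the scores into attention weights
    weights = softmax(scores)
    # Step 3: the context vector is the weighted sum of encoder states
    dim = len(encoder_states[0])
    context = [sum(w * h[d] for w, h in zip(weights, encoder_states))
               for d in range(dim)]
    return weights, context

# Hypothetical encoder outputs for a three-token source sentence
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [0.0, 2.0]

weights, context = attend(decoder_state, encoder_states)
print("Attention weights:", [round(w, 3) for w in weights])
print("Context vector   :", [round(c, 3) for c in context])
```

Because the weights are recomputed at every decoding step, each output word can draw on a different part of the source sentence.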

### Why Attention Is Critical
- Handles long sentences effectively
- Improves translation accuracy
- Aligns source and target words dynamically
- Enables context-aware translations

Attention mechanisms are the foundation of Transformer models, which dispense with recurrence entirely and have largely superseded RNN-based NMT systems.


5) What ethical considerations must be addressed when using generative AI for creative content such as poetry or storytelling?

->

Using Generative AI for creative content raises important ethical concerns.

### Key Ethical Considerations
- Originality and Plagiarism
  - Risk of generating content too similar to existing works
- Copyright and Ownership
  - Unclear ownership of AI-generated content
- Bias and Representation
  - Models may reflect societal biases present in training data
- Misinformation
  - Generated stories may spread false or misleading narratives
- Human Creativity
  - Over-reliance on AI may devalue human creative effort
- Transparency
  - Users should know whether content is AI-generated

### Responsible Use
- Clear disclosure of AI involvement
- Human oversight in creative workflows
- Ethical dataset curation
- Respect for cultural and artistic values

In [1]:
!pip install transformers torch --quiet
!pip install tensorflow --quiet

In [2]:
"""
6) Use the following small text dataset to train a simple Variational
Autoencoder (VAE) for text reconstruction:

["The sky is blue", "The sun is bright", "The grass is green",
"The night is dark", "The stars are shining"]

1. Preprocess the data (tokenize and pad the sequences).
2. Build a basic VAE model for text reconstruction.
3. Train the model and show how it reconstructs or generates similar sentences.

Include your code, explanation, and sample outputs.

->

"""

import numpy as np
import tensorflow as tf
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.layers import (
    Input, Embedding, LSTM, Dense,
    RepeatVector, Layer
)

from tensorflow.keras.models import Model


texts = [
    "The sky is blue",
    "The sun is bright",
    "The grass is green",
    "The night is dark",
    "The stars are shining"
]


tokenizer = Tokenizer()
tokenizer.fit_on_texts(texts)

sequences = tokenizer.texts_to_sequences(texts)
word_index = tokenizer.word_index
vocab_size = len(word_index) + 1

max_len = max(len(seq) for seq in sequences)
padded_sequences = pad_sequences(sequences, maxlen=max_len, padding="post")

print("Word Index:", word_index)
print("Padded Sequences:\n", padded_sequences)


class Sampling(Layer):
    def call(self, inputs):
        z_mean, z_log_var = inputs
        epsilon = tf.random.normal(shape=tf.shape(z_mean))
        return z_mean + tf.exp(0.5 * z_log_var) * epsilon


class KLDivergenceLayer(Layer):
    def call(self, inputs):
        z_mean, z_log_var = inputs
        kl_loss = -0.5 * tf.reduce_mean(
            1 + z_log_var - tf.square(z_mean) - tf.exp(z_log_var)
        )
        self.add_loss(kl_loss)
        return inputs


embedding_dim = 16
latent_dim = 8

# Encoder
encoder_inputs = Input(shape=(max_len,))
x = Embedding(vocab_size, embedding_dim, mask_zero=True)(encoder_inputs)
x = LSTM(32)(x)

z_mean = Dense(latent_dim)(x)
z_log_var = Dense(latent_dim)(x)

# Attach the KL divergence term to the model's loss via add_loss
z_mean, z_log_var = KLDivergenceLayer()([z_mean, z_log_var])

# Sampling
z = Sampling()([z_mean, z_log_var])

# Decoder
decoder_inputs = RepeatVector(max_len)(z)
decoder_lstm = LSTM(32, return_sequences=True)(decoder_inputs)
decoder_outputs = Dense(vocab_size, activation="softmax")(decoder_lstm)

vae = Model(encoder_inputs, decoder_outputs)


vae.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy"
)

vae.summary()

vae.fit(
    padded_sequences,
    padded_sequences,
    epochs=200,
    verbose=0
)

print("✅ VAE training completed")


predictions = vae.predict(padded_sequences)

reverse_word_index = {i: w for w, i in word_index.items()}

def decode_sentence(pred):
    words = []
    for token in pred:
        word = reverse_word_index.get(np.argmax(token), "")
        if word:
            words.append(word)
    return " ".join(words)

print("\nOriginal vs Reconstructed Sentences:\n")
for i, pred in enumerate(predictions):
    print("Original     :", texts[i])
    print("Reconstructed:", decode_sentence(pred))
    print()

Word Index: {'the': 1, 'is': 2, 'sky': 3, 'blue': 4, 'sun': 5, 'bright': 6, 'grass': 7, 'green': 8, 'night': 9, 'dark': 10, 'stars': 11, 'are': 12, 'shining': 13}
Padded Sequences:
 [[ 1  3  2  4]
 [ 1  5  2  6]
 [ 1  7  2  8]
 [ 1  9  2 10]
 [ 1 11 12 13]]


✅ VAE training completed
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 381ms/step

Original vs Reconstructed Sentences:

Original     : The sky is blue
Reconstructed: is is is is

Original     : The sun is bright
Reconstructed: is is is is

Original     : The grass is green
Reconstructed: is is is is

Original     : The night is dark
Reconstructed: the is is is

Original     : The stars are shining
Reconstructed: are shining shining shining



In [3]:
"""
7)  Use a pre-trained GPT model (like GPT-2 or GPT-3) to translate a short
English paragraph into French and German. Provide the original and translated text.
->

"""

from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Load tokenizer and model
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")


english_text = (
    "Artificial intelligence is transforming industries by enabling machines "
    "to learn from data and make intelligent decisions."
)


# Translate English to French

prompt_french = f"Translate the following text from English to French:\nEnglish: {english_text}\nFrench:"

inputs_fr = tokenizer.encode(prompt_french, return_tensors="pt")
outputs_fr = model.generate(
    inputs_fr,
    max_length=120,
    num_return_sequences=1,
    no_repeat_ngram_size=2,
    pad_token_id=tokenizer.eos_token_id
)

french_translation = tokenizer.decode(outputs_fr[0], skip_special_tokens=True)
print("French Translation:\n", french_translation)


# Translate English to German

prompt_german = f"Translate the following text from English to German:\nEnglish: {english_text}\nGerman:"

inputs_de = tokenizer.encode(prompt_german, return_tensors="pt")
outputs_de = model.generate(
    inputs_de,
    max_length=120,
    num_return_sequences=1,
    no_repeat_ngram_size=2,
    pad_token_id=tokenizer.eos_token_id
)

german_translation = tokenizer.decode(outputs_de[0], skip_special_tokens=True)
print("German Translation:\n", german_translation)

French Translation:
 Translate the following text from English to French:
English: Artificial intelligence is transforming industries by enabling machines to learn from data and make intelligent decisions.
French: The future of artificial intelligence will be a world of machines. Artificial Intelligence is the future. The world is changing. We are changing the world. It is time to change the way we think. This is a new era of technology. A new age of innovation. And it is not just about the machines, it's about us. Our future is about machines and machines are the only way to make it happen.
German Translation:
 Translate the following text from English to German:
English: Artificial intelligence is transforming industries by enabling machines to learn from data and make intelligent decisions.
German: The future of artificial intelligence will be a world of machines. Artificial Intelligence is the future. The world is changing. It is a new world. We are changing the world, and we are t

In [4]:
"""
8) Implement a simple attention-based encoder-decoder model for
English-to-Spanish translation using Tensorflow or PyTorch.

->

"""

import numpy as np
import tensorflow as tf
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.layers import Embedding, LSTM, Dense

# English sentences
eng_sentences = [
    "i love you",
    "how are you",
    "i am happy",
    "good morning",
    "thank you"
]

# Spanish translations
spa_sentences = [
    "te amo",
    "como estas",
    "estoy feliz",
    "buenos dias",
    "gracias"
]

# Tokenizers
eng_tokenizer = Tokenizer()
spa_tokenizer = Tokenizer()

eng_tokenizer.fit_on_texts(eng_sentences)
spa_tokenizer.fit_on_texts(spa_sentences)

eng_seq = eng_tokenizer.texts_to_sequences(eng_sentences)
spa_seq = spa_tokenizer.texts_to_sequences(spa_sentences)

# Padding
max_eng_len = max(len(seq) for seq in eng_seq)
max_spa_len = max(len(seq) for seq in spa_seq)

eng_pad = pad_sequences(eng_seq, maxlen=max_eng_len, padding='post')
spa_pad = pad_sequences(spa_seq, maxlen=max_spa_len, padding='post')

eng_vocab = len(eng_tokenizer.word_index) + 1
spa_vocab = len(spa_tokenizer.word_index) + 1

embedding_dim = 64
units = 64

encoder_inputs = tf.keras.Input(shape=(max_eng_len,))
encoder_embedding = Embedding(eng_vocab, embedding_dim)(encoder_inputs)
encoder_outputs, state_h, state_c = LSTM(
    units, return_sequences=True, return_state=True
)(encoder_embedding)

encoder_states = [state_h, state_c]

attention = tf.keras.layers.AdditiveAttention()


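# Variable-length decoder input: training feeds max_spa_len - 1 tokens,
# while translate() feeds max_spa_len tokens
decoder_inputs = tf.keras.Input(shape=(None,))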
decoder_embedding = Embedding(spa_vocab, embedding_dim)(decoder_inputs)

decoder_lstm = LSTM(units, return_sequences=True, return_state=True)
decoder_outputs, _, _ = decoder_lstm(
    decoder_embedding, initial_state=encoder_states
)

# Apply attention
context_vector = attention([decoder_outputs, encoder_outputs])

# Concatenate attention context and decoder output
decoder_concat = tf.keras.layers.Concatenate()(
    [decoder_outputs, context_vector]
)

decoder_dense = Dense(spa_vocab, activation='softmax')
decoder_outputs = decoder_dense(decoder_concat)


model = tf.keras.Model(
    [encoder_inputs, decoder_inputs],
    decoder_outputs
)

model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

model.summary()

spa_pad_input = spa_pad[:, :-1]
spa_pad_output = spa_pad[:, 1:]

model.fit(
    [eng_pad, spa_pad_input],
    spa_pad_output,
    epochs=300,
    verbose=0
)

print("Training completed")

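def translate(sentence):
    # Encode and pad the English sentence exactly as during training
    seq = eng_tokenizer.texts_to_sequences([sentence])
    seq = pad_sequences(seq, maxlen=max_eng_len, padding='post')

    # The dataset has no dedicated start-of-sequence token, so "te" is used
    # as a stand-in first decoder input; the remaining positions stay 0 (padding)
    start_token = spa_tokenizer.word_index.get("te", 1)
    decoder_input = np.zeros((1, max_spa_len))
    decoder_input[0, 0] = start_token

    # Single forward pass: pick the most likely Spanish token at each position
    prediction = model.predict([seq, decoder_input], verbose=0)
    tokens = np.argmax(prediction[0], axis=1)

    # Map token ids back to words, skipping padding (id 0)
    reverse_spa_index = {i: w for w, i in spa_tokenizer.word_index.items()}
    return " ".join(reverse_spa_index[t] for t in tokens if t in reverse_spa_index)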

print("English :", "i love you")
print("Spanish :", translate("i love you"))

Training completed
English : i love you
Spanish : amo amo
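
The `translate` function above predicts every output position in one pass from a single seeded token. A more common inference pattern for encoder-decoder models is greedy, step-by-step decoding with explicit start and end markers, feeding each predicted word back in as the next decoder input. The sketch below illustrates only that loop: `toy_scores` is a hypothetical lookup table standing in for a trained decoder, and the `<start>` and `<end>` markers are assumptions that the dataset above does not contain.

```python
def toy_scores(source, prefix):
    """Stand-in for a trained decoder: next-token scores given the tokens so far."""
    table = {
        ("<start>",): {"te": 0.9, "amo": 0.1, "<end>": 0.0},
        ("<start>", "te"): {"te": 0.1, "amo": 0.8, "<end>": 0.1},
        ("<start>", "te", "amo"): {"te": 0.0, "amo": 0.1, "<end>": 0.9},
    }
    # Unknown prefixes simply end the sentence in this toy table
    return table.get(tuple(prefix), {"<end>": 1.0})

def greedy_decode(source, score_fn, max_len=10):
    """Pick the highest-scoring token at each step until <end> or max_len."""
    prefix = ["<start>"]
    for _ in range(max_len):
        scores = score_fn(source, prefix)
        best = max(scores, key=scores.get)
        if best == "<end>":
            break
        prefix.append(best)
    return prefix[1:]  # drop the <start> marker

print(" ".join(greedy_decode("i love you", toy_scores)))
```

With a real model, `score_fn` would run the decoder on the current prefix together with the encoder outputs, and the training targets would need the same start and end markers added to every Spanish sentence.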


In [5]:
"""
9)  Use the following short poetry dataset to simulate poem generation with a pre-trained GPT model:

["Roses are red, violets are blue,",
"Sugar is sweet, and so are you.",
"The moon glows bright in silent skies,",
"A bird sings where the soft wind sighs."]

Using this dataset as a reference for poetic structure and language, generate a new 2-4
line poem using a pre-trained GPT model (such as GPT-2). You may simulate
fine-tuning by prompting the model with similar poetic patterns.

Include your code, the prompt used, and the generated poem in your answer.

->

"""

from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Load GPT-2 tokenizer and model
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

prompt = (
    "Roses are red, violets are blue,\n"
    "Sugar is sweet, and so are you.\n"
    "The moon glows bright in silent skies,\n"
    "A bird sings where the soft wind sighs.\n\n"
    "Write a short poem:\n"
)

inputs = tokenizer.encode(prompt, return_tensors="pt")

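# Note: temperature and top_p only take effect when do_sample=True;
# without it, generate() decodes greedily and ignores both settings.
outputs = model.generate(
    inputs,
    max_length=120,
    num_return_sequences=1,
    temperature=0.8,
    top_p=0.9,
    no_repeat_ngram_size=2,
    pad_token_id=tokenizer.eos_token_id
)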

generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text)


Roses are red, violets are blue,
Sugar is sweet, and so are you.
The moon glows bright in silent skies,
A bird sings where the soft wind sighs.

Write a short poem:
"I am the moon, the sun, my soul, I am your soul."
. . .
I'm the star, your star. I'm your sun. You're my star
And I'll be your moon. And I will be yours.
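
The `temperature` and `top_p` arguments passed to `generate` control how the next token is drawn when sampling is enabled. The sketch below shows the idea on a made-up four-token vocabulary. It is a simplified illustration, not the library's implementation, and the names `apply_temperature`, `top_p_filter`, and `sample_token` are hypothetical.

```python
import math
import random

def apply_temperature(logits, temperature):
    """Softmax over logits / temperature: values below 1 sharpen the distribution."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, top_p):
    """Keep the most probable tokens until their total mass reaches top_p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}

def sample_token(logits, temperature, top_p, rng):
    """Draw one token id from the temperature-scaled, nucleus-filtered distribution."""
    nucleus = top_p_filter(apply_temperature(logits, temperature), top_p)
    ids = list(nucleus)
    return rng.choices(ids, weights=[nucleus[i] for i in ids], k=1)[0]

logits = [2.0, 1.0, 0.5, -1.0]  # hypothetical next-token logits
rng = random.Random(0)
print([sample_token(logits, 0.8, 0.9, rng) for _ in range(5)])
```

Lower temperatures and smaller `top_p` values make the output more predictable, while higher values trade coherence for variety, which is usually the more interesting setting for poetry.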


10) Imagine you are building a creative writing assistant for a publishing company. The assistant should generate story plots and character descriptions using Generative AI. Describe how you would design the system, including model selection, training data, bias mitigation, and evaluation methods. Explain the real-world challenges you might face.

->

Imagine building a creative writing assistant for a publishing company that helps authors generate story plots and character descriptions. Such a system must balance creativity with quality, originality, and ethical responsibility.

System Design Overview

The creative writing assistant can be designed as a text-generation system powered by Generative AI, consisting of the following components:

1. User Input Interface
2. Generative Model Core
3. Post-processing & Safety Layer
4. Evaluation & Feedback Module

Model Selection

The choice of model is critical for generating coherent and creative text.

- Transformer-based Language Models (preferred):
  - Examples: GPT-style models, T5, or similar large language models
  - Strengths:
    - Strong contextual understanding
    - Ability to generate long, coherent narratives
    - Effective handling of character consistency and plot flow

- Why not simpler models (RNN/LSTM)?
  - Limited long-range coherence
  - Struggle with complex narrative structures

The model can be fine-tuned specifically for fiction writing tasks such as plot generation, dialogue creation, and character profiling.


Training Data

High-quality and diverse training data is essential.

Sources of Training Data
- Public-domain novels and short stories
- Story summaries and plot outlines
- Character descriptions from fiction databases
- Screenplays and scripts (where legally permitted)

Data Preparation
- Clean and normalize text
- Remove duplicates and copyrighted material
- Annotate structure (e.g., plot, setting, character traits)
- Balance genres (fantasy, mystery, romance, sci-fi)
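
Two of these preparation steps, duplicate removal and genre balancing, can be sketched as follows. This is an illustrative outline, assuming each record is a dictionary with `text` and `genre` fields; the sample records and the helper names `normalize`, `deduplicate`, and `genre_weights` are made up for the example.

```python
import hashlib
import re
from collections import Counter

def normalize(text):
    """Lowercase and collapse whitespace so trivially different copies compare equal."""
    return re.sub(r"\s+", " ", text.strip().lower())

def deduplicate(records):
    """Keep the first record for each distinct normalized text."""
    seen, unique = set(), []
    for rec in records:
        key = hashlib.sha256(normalize(rec["text"]).encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique

def genre_weights(records):
    """Inverse-frequency sampling weights: every genre gets the same total weight."""
    counts = Counter(rec["genre"] for rec in records)
    return [1.0 / (len(counts) * counts[rec["genre"]]) for rec in records]

# Made-up example records
records = [
    {"text": "A detective returns to her hometown.", "genre": "mystery"},
    {"text": "A  detective returns to her hometown. ", "genre": "mystery"},
    {"text": "A dragon guards a forgotten library.", "genre": "fantasy"},
    {"text": "Two rivals inherit the same bakery.", "genre": "romance"},
    {"text": "A locked room hides a second door.", "genre": "mystery"},
]

unique = deduplicate(records)
weights = genre_weights(unique)
for rec, w in zip(unique, weights):
    print(f"{rec['genre']:8s} weight={w:.3f}  {rec['text']}")
```

Exact-hash matching only catches near-verbatim copies. Detecting paraphrased or lightly edited duplicates would need a fuzzier comparison.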

Bias Mitigation Strategies

Bias in generative writing can negatively impact inclusivity and representation.

Mitigation Techniques
- Use diverse and balanced datasets
- Filter or reweight biased content during training
- Apply debiasing techniques in embeddings
- Include human-in-the-loop review for sensitive outputs
- Regular audits for stereotypes related to gender, race, or culture

Bias mitigation ensures fair, inclusive, and responsible creative output.



Evaluation Methods

Evaluating creative text is challenging and requires both automatic and human-centered approaches.

Automatic Evaluation
- Perplexity (fluency measure)
- Diversity metrics (n-gram diversity)
- Repetition detection
- Length and coherence checks
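
Three of these automatic checks are simple enough to sketch directly. The example below is a minimal illustration using whitespace tokenization; the function names are hypothetical, and the token probabilities passed to `perplexity` are invented stand-ins for values a language model would supply.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-token windows in the sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def distinct_n(tokens, n):
    """Fraction of n-grams that are unique; higher means more diverse text."""
    grams = ngrams(tokens, n)
    return len(set(grams)) / len(grams) if grams else 0.0

def repeated_ngrams(tokens, n):
    """n-grams that occur more than once, with their counts."""
    counts = Counter(ngrams(tokens, n))
    return {g: c for g, c in counts.items() if c > 1}

def perplexity(token_probs):
    """exp of the average negative log-probability assigned to each token."""
    return math.exp(-sum(math.log(p) for p in token_probs) / len(token_probs))

tokens = "the moon is bright and the moon is cold".split()
print("distinct-2       :", distinct_n(tokens, 2))
print("repeated bigrams :", repeated_ngrams(tokens, 2))
print("perplexity       :", perplexity([0.25, 0.25, 0.25, 0.25]))
```

These scores flag degenerate output such as loops and repeated phrases, but they say little about plot quality, which is why they are paired with human review.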

Human Evaluation
- Editorial review by writers and editors
- Creativity and originality scoring
- Consistency of characters and plot
- Reader engagement feedback

Human evaluation is especially important for creative tasks.


Real-World Challenges

Building such a system involves several practical challenges:

- Creativity vs. Originality
  - Avoiding plagiarism while still producing engaging content
- Copyright and Ownership
  - Determining who owns AI-generated stories
- Quality Control
  - Ensuring outputs meet publishing standards
- Bias and Cultural Sensitivity
  - Preventing harmful stereotypes or misrepresentation
- Over-Reliance on AI
  - Risk of reducing human creativity and authorial voice
- Scalability
  - Serving multiple authors with personalized styles