## Generative AI - Text Generation and Machine Translation

#1. What is Generative AI and what are its primary use cases across industries?

  Generative AI is a branch of artificial intelligence that focuses on creating new data, content, or patterns that resemble the training data. Unlike traditional AI models that perform classification or prediction, generative models learn the underlying distribution of data and generate new outputs such as text, images, audio, and code.

  Generative AI is powered by deep learning architectures such as Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), and Transformer-based models like GPT.

  ### Primary Use Cases Across Industries:

  1. Healthcare:

  * Drug discovery and molecule generation
  * Medical report summarization
  * Synthetic medical data generation for research

  2. Finance:

  * Fraud detection simulations
  * Automated report generation
  * Risk scenario modeling

  3. Education:

  * Intelligent tutoring systems
  * Automated content creation
  * Language translation and learning assistants

  4. Media & Entertainment:

  * Story and script generation
  * Music composition
  * Image and video creation

  5. E-commerce:

  * Product description generation
  * Chatbots and virtual assistants
  * Personalized recommendations

  6. Software Development:

  * Code generation
  * Documentation writing
  * Debugging assistance

  Across these domains, generative AI is used to automate drafting, summarization, and synthesis tasks, which can improve productivity and support creative work.


#2. Explain the role of probabilistic modeling in generative models. How do these models differ from discriminative models?


  Probabilistic modeling plays a crucial role in generative models as it helps them learn the probability distribution of the data. Generative models aim to model the joint probability P(X, Y) or P(X), which allows them to generate new samples that resemble the original dataset.

  These models estimate how data is generated by capturing hidden patterns and structures using probability distributions.

  ### Role of Probabilistic Modeling:

  * Learns data distribution
  * Handles uncertainty in data
  * Enables realistic data generation
  * Supports sampling of new data points

  ### Difference Between Generative and Discriminative Models:

  | Aspect            | Generative Models                       | Discriminative Models             |
  | ----------------- | --------------------------------------- | --------------------------------- |
  | Learning Approach | Learns data distribution P(X) or P(X,Y) | Learns decision boundary P(Y \| X) |
  | Purpose           | Generate new data                       | Classify or predict labels        |
  | Examples          | VAE, GAN, GPT                           | Logistic Regression, SVM, CNN     |
  | Output            | Synthetic data generation               | Predictions/labels                |

  Generative models can create new samples, while discriminative models focus only on classification and prediction tasks.
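
  To make the idea of learning a distribution and sampling from it concrete, here is a minimal sketch in plain Python: a bigram model estimates P(next word | previous word) from a handful of sentences and then samples new ones. The toy corpus, the `<s>`/`</s>` boundary tokens, and the function names are assumptions made for illustration; this is far simpler than a VAE, GAN, or GPT model.

```python
import random
from collections import Counter, defaultdict

# Toy corpus (an assumption for illustration).
corpus = [
    "The sky is blue",
    "The sun is bright",
    "The grass is green",
    "The night is dark",
    "The stars are shining",
]

def fit_bigrams(sentences):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts

def sample_sentence(counts, rng, max_len=10):
    """Draw words one at a time in proportion to their bigram counts."""
    token, words = "<s>", []
    for _ in range(max_len):
        followers = counts[token]
        candidates = list(followers)
        weights = [followers[w] for w in candidates]
        token = rng.choices(candidates, weights=weights, k=1)[0]
        if token == "</s>":
            break
        words.append(token)
    return " ".join(words)

counts = fit_bigrams(corpus)
rng = random.Random(0)  # fixed seed so runs are repeatable
samples = [sample_sentence(counts, rng) for _ in range(3)]
print(samples)
```

  Because the model stores a distribution rather than a decision rule, it can produce combinations such as a subject from one training sentence with an adjective from another. A discriminative model trained on the same data would only map inputs to labels.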

#3. What is the difference between Autoencoders and Variational Autoencoders (VAEs) in the context of text generation?


  Autoencoders and Variational Autoencoders (VAEs) are neural network architectures used for representation learning and data generation.

  ### Autoencoders:

  * Deterministic model
  * Compresses input into a latent representation
  * Reconstructs the same input
  * Does not generate diverse outputs

  ### Variational Autoencoders (VAEs):

  * Probabilistic model
  * Learns a distribution in latent space
  * Generates new and diverse data
  * Uses encoder, latent space sampling, and decoder

  ### Key Differences:

  | Feature            | Autoencoder   | Variational Autoencoder      |
  | ------------------ | ------------- | ---------------------------- |
  | Latent Space       | Fixed vector  | Probability distribution     |
  | Output Diversity   | Low           | High                         |
  | Generation Ability | Limited       | Strong generative capability |
  | Mathematical Basis | Deterministic | Probabilistic                |

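  In text generation, VAEs are generally preferred over plain autoencoders because sampling different points from the learned latent distribution and decoding them yields new, varied sentences rather than a reconstruction of a single input.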
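
  The difference in the latent space can be sketched with NumPy. The encoder outputs below (`z_mean`, `z_log_var`) are made-up values for one hypothetical sentence, not results from a trained model. The sketch contrasts a deterministic latent code with a sampled one and computes the KL term that a VAE adds to its loss.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical encoder outputs for one sentence (illustrative values only).
z_mean = np.array([0.5, -1.0])
z_log_var = np.array([-2.0, -1.0])

def autoencoder_latent(z_mean):
    """A plain autoencoder maps an input to one fixed latent vector."""
    return z_mean

def vae_latent(z_mean, z_log_var, rng):
    """Reparameterization trick: z = mean + std * eps, with eps ~ N(0, I)."""
    eps = rng.standard_normal(z_mean.shape)
    return z_mean + np.exp(0.5 * z_log_var) * eps

def kl_to_standard_normal(z_mean, z_log_var):
    """KL( N(mean, var) || N(0, I) ), summed over latent dimensions."""
    return -0.5 * np.sum(1.0 + z_log_var - z_mean**2 - np.exp(z_log_var))

print("autoencoder:", autoencoder_latent(z_mean))
print("VAE draw 1: ", vae_latent(z_mean, z_log_var, rng))
print("VAE draw 2: ", vae_latent(z_mean, z_log_var, rng))
print("KL term:    ", kl_to_standard_normal(z_mean, z_log_var))
```

  Each VAE draw lands at a different point near the mean, which is what gives the decoder varied inputs. The KL term keeps those points close to a standard normal so that sampling from N(0, I) at generation time is meaningful.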

#4. Describe the working of attention mechanisms in Neural Machine Translation (NMT). Why are they critical?

  Attention mechanisms allow Neural Machine Translation (NMT) models to focus on relevant parts of the input sentence while generating each word in the output sentence.

  ### Working of Attention Mechanism:

  1. The encoder processes the input sentence and produces hidden states.
  2. The decoder generates output word by word.
  3. At each decoding step, attention scores are calculated between the current decoder state and all encoder states.
  4. The scores are normalized (typically with softmax) into attention weights, and a context vector is formed as the weighted sum of the encoder hidden states.
  5. The context vector helps the decoder generate more accurate translations.
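
  The steps above can be sketched for a single decoding step with NumPy. This sketch assumes dot-product scoring for brevity (additive scoring, as in Bahdanau attention, is another common choice), and the encoder and decoder states are made-up vectors.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    shifted = x - np.max(x)
    exps = np.exp(shifted)
    return exps / exps.sum()

def attend(decoder_state, encoder_states):
    """One attention step: score, normalize, and take the weighted sum."""
    scores = encoder_states @ decoder_state   # one score per source position
    weights = softmax(scores)                 # attention weights sum to 1
    context = weights @ encoder_states        # weighted sum of encoder states
    return context, weights

# Three source positions with 2-dimensional hidden states (illustrative values).
encoder_states = np.array([[1.0, 0.0],
                           [0.0, 1.0],
                           [1.0, 1.0]])
decoder_state = np.array([2.0, 0.0])

context, weights = attend(decoder_state, encoder_states)
print("weights:", weights)
print("context:", context)
```

  Positions whose encoder state aligns with the decoder state receive larger weights, so they contribute more to the context vector used for the next output word.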

  ### Why Attention is Critical:

  * Handles long sentences effectively
  * Prevents information loss
  * Improves translation accuracy
  * Enables context-aware generation
  * Forms the foundation of Transformer models

  Without attention, traditional encoder-decoder models struggle with long sequences because the entire source sentence must be compressed into a single fixed-size context vector.

#5. What ethical considerations must be addressed when using generative AI for creative content such as poetry or storytelling?


  When using generative AI for creative content, several ethical concerns must be addressed:

  1. Copyright and Intellectual Property:
     AI-generated content may unintentionally replicate copyrighted material.

  2. Bias and Fairness:
     Training data may contain social or cultural biases that get reflected in generated stories or poems.

  3. Misinformation and Deepfakes:
     Generative AI can create misleading or false content that appears authentic.

  4. Plagiarism Issues:
     Generated text may closely resemble existing works.

  5. Transparency:
     Users should be informed when content is AI-generated.

  6. Cultural Sensitivity:
     Content should respect diverse cultures and avoid offensive narratives.

  Ethical AI usage requires responsible data sourcing, bias mitigation, and human oversight.

In [5]:

'''
6. Use the following small text dataset to train a simple Variational Autoencoder (VAE) for text reconstruction:
["The sky is blue", "The sun is bright", "The grass is green",
"The night is dark", "The stars are shining"]
1. Preprocess the data (tokenize and pad the sequences).
2. Build a basic VAE model for text reconstruction.
3. Train the model and show how it reconstructs or generates similar sentences
'''

import numpy as np
import tensorflow as tf
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.layers import Input, Dense, Lambda
from tensorflow.keras.models import Model
from tensorflow.keras import backend as K


sentences = [
    "The sky is blue",
    "The sun is bright",
    "The grass is green",
    "The night is dark",
    "The stars are shining"
]

tokenizer = Tokenizer()
tokenizer.fit_on_texts(sentences)
sequences = tokenizer.texts_to_sequences(sentences)
max_len = max(len(seq) for seq in sequences)
padded_sequences = pad_sequences(sequences, maxlen=max_len, padding='post')

vocab_size = len(tokenizer.word_index) + 1
input_dim = max_len
latent_dim = 2

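# Encoder: maps the padded token-id vector to the mean and log-variance of q(z|x)
inputs = Input(shape=(input_dim,))
h = Dense(16, activation='relu')(inputs)
z_mean = Dense(latent_dim)(h)
z_log_var = Dense(latent_dim)(h)

def sampling(args):
    # Reparameterization trick: z = mean + std * epsilon, with epsilon ~ N(0, I)
    z_mean, z_log_var = args
    epsilon = K.random_normal(shape=(K.shape(z_mean)[0], latent_dim))
    return z_mean + K.exp(0.5 * z_log_var) * epsilon

z = Lambda(sampling)([z_mean, z_log_var])

# Decoder: maps a latent sample back to a vector of length input_dim.
# Note: sigmoid outputs lie in (0, 1) while the targets are raw token ids (1..13),
# so the outputs cannot match the targets or be read back as token ids.
decoder_h = Dense(16, activation='relu')
decoder_output = Dense(input_dim, activation='sigmoid')
h_decoded = decoder_h(z)
outputs = decoder_output(h_decoded)

# Note: only the MSE reconstruction loss is used; no KL-divergence term is added,
# so the latent space is not regularized toward N(0, I) as in a full VAE.
vae = Model(inputs, outputs)
vae.compile(optimizer='adam', loss='mse')

vae.fit(padded_sequences, padded_sequences, epochs=100, verbose=0)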


reconstructed = vae.predict(padded_sequences)
print("Original Sequences:")
print(padded_sequences)
print("Reconstructed Output:")
print(reconstructed)

Original Sequences:
[[ 1  3  2  4]
 [ 1  5  2  6]
 [ 1  7  2  8]
 [ 1  9  2 10]
 [ 1 11 12 13]]
Reconstructed Output:
[[0.7987171  0.7872294  0.73943    0.6373226 ]
 [0.9998831  0.98235554 0.9580662  0.9516652 ]
 [0.17151682 0.96933573 0.4885814  0.36369297]
 [0.99999726 0.9972383  0.9893089  0.98508257]
 [0.9999999  0.9998861  0.9978672  0.9930313 ]]
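
The reconstructed values above lie between 0 and 1, while the inputs are token ids from 1 to 13, so they cannot be read back as words. One possible remedy, sketched here as an assumption rather than as the method used in the cell above, is to scale the ids into [0, 1] before training and to rescale and round the outputs afterwards. The `index_word` mapping is written out by hand to be consistent with the printed sequences, and the helper names are hypothetical.

```python
import numpy as np

# Word index consistent with the padded sequences printed above
# (the Keras Tokenizer lowercases text and assigns ids by descending frequency).
index_word = {
    1: "the", 2: "is", 3: "sky", 4: "blue", 5: "sun", 6: "bright", 7: "grass",
    8: "green", 9: "night", 10: "dark", 11: "stars", 12: "are", 13: "shining",
}
max_id = max(index_word)

padded = np.array([
    [1, 3, 2, 4],
    [1, 5, 2, 6],
    [1, 7, 2, 8],
    [1, 9, 2, 10],
    [1, 11, 12, 13],
])

def scale(ids):
    """Map token ids into [0, 1] so they are reachable by a sigmoid output."""
    return ids / max_id

def unscale(outputs):
    """Map model outputs in [0, 1] back to the nearest valid token id."""
    return np.clip(np.rint(outputs * max_id), 0, max_id).astype(int)

def decode(ids):
    """Turn a row of token ids into a sentence, skipping padding (id 0)."""
    return " ".join(index_word.get(int(i), "") for i in ids if i != 0)

# Stand-in for model output: the scaled targets plus a small error.
approx_outputs = scale(padded) + 0.02
for row in unscale(approx_outputs):
    print(decode(row))
```

With this scheme the model would be fit on `scale(padded_sequences)` and its predictions passed through `unscale` and `decode`. Treating ids as numeric magnitudes remains a crude encoding; a per-position softmax over the vocabulary is the more usual design for text.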


In [9]:
'''
7. Use a pre-trained GPT model (like GPT-2 or GPT-3) to translate a short English paragraph into French and German.
Provide the original and translated text.
'''

!pip install transformers --quiet

from transformers import pipeline

gpt_translator = pipeline("text-generation", model="gpt2")

text = """Artificial Intelligence is transforming the world by enabling machines to learn from data,
make decisions, and create content. It is widely used in healthcare, finance, education, and many other industries."""

prompt_fr = f"Translate the following English text to French:\n{text}\nFrench:"
prompt_de = f"Translate the following English text to German:\n{text}\nGerman:"

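# Use max_new_tokens rather than max_length: max_length also counts the prompt tokens,
# and when both are set max_new_tokens takes precedence.
# return_full_text=False returns only the continuation, so the prompt need not be split off.
french_output = gpt_translator(prompt_fr, max_new_tokens=150, num_return_sequences=1, return_full_text=False)
german_output = gpt_translator(prompt_de, max_new_tokens=150, num_return_sequences=1, return_full_text=False)

french_translation = french_output[0]["generated_text"].strip()
german_translation = german_output[0]["generated_text"].strip()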

print("Original Text:\n", text)
print("\nFrench Translation:\n", french_translation)
print("\nGerman Translation:\n", german_translation)



Original Text:
 Artificial Intelligence is transforming the world by enabling machines to learn from data, 
make decisions, and create content. It is widely used in healthcare, finance, education, and many other industries.

French Translation:
 "Aur"
Noun: Artificial intelligence
This phrase is used to describe the idea that machine learning can learn from the data. It can be used to describe the idea that machine learning can learn from information. The "Aur" is a verb in both English and French. It means "to learn". It is often used to describe the idea that machine learning can learn from information.
English: "Aur"
English: "Aur"

German Translation:
 ich aus dem Theologicalen, ich aus dem Theologicalen, ich aus dem Theologicalen, ich aus dem Theologicalen, ich aus dem Theologicalen, ich aus dem Theologicalen, ich aus dem Theologicalen,
The world today is a computer world, a world of computers. It is a world of machines. It is a computer world of computers. It is a computer world 
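
The output above is not a translation: the base GPT-2 model was trained on largely English text and is not instruction-tuned, so a bare "Translate ..." prompt mostly triggers free continuation. Two small mitigations are sketched below under stated assumptions: a few-shot prompt that shows the expected pattern, and a post-processing step that keeps only the first generated line. The helper names and the example pairs are hypothetical, and neither step is expected to make GPT-2 a reliable translator; a model trained for translation (for example a MarianMT or T5 checkpoint) would be the more appropriate tool.

```python
def build_few_shot_prompt(text, target_language, examples):
    """Lay out example pairs in a fixed pattern, ending where the model should continue."""
    lines = []
    for source, target in examples:
        lines.append(f"English: {source}")
        lines.append(f"{target_language}: {target}")
    lines.append(f"English: {text}")
    lines.append(f"{target_language}:")
    return "\n".join(lines)

def extract_translation(generated_text, prompt):
    """Drop the echoed prompt and keep only the first non-empty generated line."""
    if generated_text.startswith(prompt):
        continuation = generated_text[len(prompt):]
    else:
        continuation = generated_text
    for line in continuation.splitlines():
        if line.strip():
            return line.strip()
    return ""

# Hypothetical example pairs used only to show the pattern.
examples = [("Good morning", "Bonjour"), ("Thank you", "Merci")]
prompt = build_few_shot_prompt("Hello", "French", examples)
print(prompt)

# Stand-in for a model response that keeps rambling after the first line.
fake_generation = prompt + " Salut\nEnglish: something else\nFrench: autre chose"
print(extract_translation(fake_generation, prompt))
```

Cutting at the first line removes the repeated and off-topic text that follows the initial continuation, which is the part of the output above that is least useful.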

In [10]:
#8. Implement a simple attention-based encoder-decoder model for English-to-Spanish translation using TensorFlow or PyTorch.

!pip install tensorflow numpy --quiet

import tensorflow as tf
import numpy as np
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.layers import Embedding, LSTM, Dense
from tensorflow.keras.models import Model

english_sentences = [
    "hello",
    "how are you",
    "i am fine",
    "thank you",
    "good morning"
]

spanish_sentences = [
    "hola",
    "como estas",
    "estoy bien",
    "gracias",
    "buenos dias"
]

spanish_sentences = ["<start> " + s + " <end>" for s in spanish_sentences]

eng_tokenizer = Tokenizer(filters='')
eng_tokenizer.fit_on_texts(english_sentences)
eng_sequences = eng_tokenizer.texts_to_sequences(english_sentences)
eng_padded = pad_sequences(eng_sequences, padding='post')

spa_tokenizer = Tokenizer(filters='')
spa_tokenizer.fit_on_texts(spanish_sentences)
spa_sequences = spa_tokenizer.texts_to_sequences(spanish_sentences)
spa_padded = pad_sequences(spa_sequences, padding='post')

eng_vocab_size = len(eng_tokenizer.word_index) + 1
spa_vocab_size = len(spa_tokenizer.word_index) + 1

max_eng_len = eng_padded.shape[1]
max_spa_len = spa_padded.shape[1]

class BahdanauAttention(tf.keras.layers.Layer):
    def __init__(self, units):
        super(BahdanauAttention, self).__init__()
        self.W1 = Dense(units)
        self.W2 = Dense(units)
        self.V = Dense(1)

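    def call(self, encoder_output, hidden):
        # hidden: (batch, units) -> (batch, 1, units) so it broadcasts over time steps
        hidden_with_time_axis = tf.expand_dims(hidden, 1)

        # Additive (Bahdanau) score for every encoder time step: (batch, src_len, 1)
        score = self.V(
            tf.nn.tanh(self.W1(encoder_output) + self.W2(hidden_with_time_axis))
        )

        # Normalize over source positions, then take the weighted sum of encoder outputs
        attention_weights = tf.nn.softmax(score, axis=1)
        context_vector = attention_weights * encoder_output
        context_vector = tf.reduce_sum(context_vector, axis=1)

        return context_vector, attention_weights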

class Encoder(Model):
    def __init__(self, vocab_size, embedding_dim, enc_units):
        super(Encoder, self).__init__()
        self.embedding = Embedding(vocab_size, embedding_dim)
        self.lstm = LSTM(enc_units, return_sequences=True, return_state=True)

    def call(self, x):
        x = self.embedding(x)
        output, state_h, state_c = self.lstm(x)
        return output, state_h, state_c


class Decoder(Model):
    def __init__(self, vocab_size, embedding_dim, dec_units):
        super(Decoder, self).__init__()
        self.embedding = Embedding(vocab_size, embedding_dim)
        self.lstm = LSTM(dec_units, return_sequences=True, return_state=True)
        self.fc = Dense(vocab_size)
        self.attention = BahdanauAttention(dec_units)

    def call(self, x, enc_output, hidden_h, hidden_c):
        context_vector, attention_weights = self.attention(enc_output, hidden_h)

        x = self.embedding(x)
        x = tf.concat([tf.expand_dims(context_vector, 1), x], axis=-1)

        output, state_h, state_c = self.lstm(x, initial_state=[hidden_h, hidden_c])
        output = self.fc(output)

        return output, state_h, state_c, attention_weights


embedding_dim = 64
units = 128
batch_size = len(eng_padded)

encoder = Encoder(eng_vocab_size, embedding_dim, units)
decoder = Decoder(spa_vocab_size, embedding_dim, units)

optimizer = tf.keras.optimizers.Adam()
loss_object = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

@tf.function
def train_step(inp, targ):
    loss = 0

    with tf.GradientTape() as tape:
        enc_output, enc_h, enc_c = encoder(inp)

        dec_h, dec_c = enc_h, enc_c
        dec_input = tf.expand_dims([spa_tokenizer.word_index['<start>']] * batch_size, 1)

        # Teacher forcing: the ground-truth token at step t becomes the next decoder input
        for t in range(1, targ.shape[1]):
            predictions, dec_h, dec_c, _ = decoder(dec_input, enc_output, dec_h, dec_c)
            loss += loss_object(targ[:, t], predictions[:, 0])

            dec_input = tf.expand_dims(targ[:, t], 1)

    batch_loss = loss / int(targ.shape[1])
    variables = encoder.trainable_variables + decoder.trainable_variables
    gradients = tape.gradient(loss, variables)
    optimizer.apply_gradients(zip(gradients, variables))

    return batch_loss


EPOCHS = 100

for epoch in range(EPOCHS):
    loss = train_step(eng_padded, spa_padded)
    if (epoch + 1) % 10 == 0:
        print(f"Epoch {epoch+1}, Loss: {loss:.4f}")


def translate(sentence):
    sequence = eng_tokenizer.texts_to_sequences([sentence])
    padded = pad_sequences(sequence, maxlen=max_eng_len, padding='post')

    enc_output, enc_h, enc_c = encoder(padded)

    dec_h, dec_c = enc_h, enc_c
    dec_input = tf.expand_dims([spa_tokenizer.word_index['<start>']], 0)

    result = []

    for _ in range(max_spa_len):
        predictions, dec_h, dec_c, _ = decoder(dec_input, enc_output, dec_h, dec_c)
        predicted_id = tf.argmax(predictions[0][0]).numpy()

        word = spa_tokenizer.index_word.get(predicted_id, '')
        if word == '<end>':
            break

        result.append(word)
        dec_input = tf.expand_dims([predicted_id], 0)

    return ' '.join(result)


test_sentence = "hello"
translation = translate(test_sentence)

print("English:", test_sentence)
print("Spanish:", translation)





Epoch 10, Loss: 1.7010
Epoch 20, Loss: 1.4081
Epoch 30, Loss: 1.2405
Epoch 40, Loss: 0.9829
Epoch 50, Loss: 0.6358
Epoch 60, Loss: 0.2847
Epoch 70, Loss: 0.0964
Epoch 80, Loss: 0.0403
Epoch 90, Loss: 0.0214
Epoch 100, Loss: 0.0136
English: hello
Spanish: hola
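
The training loop above averages the loss over every target position, including the zero padding after `<end>`. On a toy dataset this is harmless, but a common refinement is to mask padded positions out of the loss. The NumPy sketch below illustrates the idea for one decoding step; the function name and the `pad_id=0` convention are assumptions, and this is not the loss used in the cell above.

```python
import numpy as np

def masked_cross_entropy(logits, targets, pad_id=0):
    """Mean cross-entropy over non-padding targets.

    logits:  (batch, vocab) unnormalized scores for one decoding step
    targets: (batch,) integer token ids, where pad_id marks padding
    """
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    per_example = -log_probs[np.arange(len(targets)), targets]
    mask = (targets != pad_id).astype(float)
    # Guard against a batch that contains only padding.
    return float((per_example * mask).sum() / max(mask.sum(), 1.0))

# Two examples at one time step; the second target is padding.
logits = np.array([[2.0, 0.5, 0.1, 0.0],
                   [0.3, 0.2, 4.0, 0.0]])
targets = np.array([1, 0])
print(masked_cross_entropy(logits, targets))
```

With the mask applied, the logits of the padded example have no effect on the loss, so the model is not rewarded merely for predicting padding.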


In [13]:
'''
9. Use the following short poetry dataset to simulate poem generation with a pre-trained GPT model:
["Roses are red, violets are blue,",
"Sugar is sweet, and so are you.",
"The moon glows bright in silent skies,",
"A bird sings where the soft wind sighs."]
Using this dataset as a reference for poetic structure and language, generate a new 2-4
line poem using a pre-trained GPT model (such as GPT-2). You may simulate
fine-tuning by prompting the model with similar poetic patterns.
Include your code, the prompt used, and the generated poem in your answer.
'''


!pip install transformers torch --quiet

from transformers import pipeline

poetry_dataset = [
    "Roses are red, violets are blue,",
    "Sugar is sweet, and so are you.",
    "The moon glows bright in silent skies,",
    "A bird sings where the soft wind sighs."
]

generator = pipeline("text-generation", model="gpt2")

# Build the prompt from poetry_dataset so the reference lines are defined in one place
prompt = (
    "Here are some poetic lines:\n"
    + "\n".join(poetry_dataset)
    + "\n\nWrite a new short poem (2-4 lines) in a similar poetic style:\n"
)

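output = generator(
    prompt,
    max_new_tokens=60,   # caps the continuation; max_length would also count the prompt tokens
    num_return_sequences=1,
    do_sample=True,      # temperature and top_k only take effect when sampling
    temperature=0.9,
    top_k=50
)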

generated_text = output[0]["generated_text"]
new_poem = generated_text.replace(prompt, "").strip()

print("Prompt Used:\n")
print(prompt)

print("\nGenerated Poem:\n")
print(new_poem)




Prompt Used:

Here are some poetic lines:
Roses are red, violets are blue,
Sugar is sweet, and so are you.
The moon glows bright in silent skies,
A bird sings where the soft wind sighs.

Write a new short poem (2-4 lines) in a similar poetic style:


Generated Poem:

If you'll let me say so,

I'll try I can think of another one

I write to you for a long time,

and you'll follow in my footsteps

I'll not let you stay

You'll be left behind like a thousand leaves

You'll think yourself a living thing

Write a new short poem in a similar poetic style:

The sun will shine long days when you look on

The moon glows bright in silent skies,

A bird sings where the soft wind sighs.

The moon glows bright in silent skies,

A bird sings where the soft wind sighs.
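
The `temperature` and `top_k` arguments used in the cell above control how the next token is drawn. The NumPy sketch below shows one common way these two settings are applied to a vector of logits; it is a simplified illustration with made-up logits, not the library's internal implementation, and the function names are hypothetical.

```python
import numpy as np

def top_k_temperature_probs(logits, temperature=0.9, top_k=2):
    """Rescale logits by temperature, keep the top_k largest, and renormalize."""
    scaled = np.asarray(logits, dtype=float) / temperature
    k = min(top_k, scaled.size)
    cutoff = np.sort(scaled)[-k]                      # k-th largest value
    filtered = np.where(scaled >= cutoff, scaled, -np.inf)
    filtered = filtered - filtered.max()              # for numerical stability
    probs = np.exp(filtered)                          # masked entries become 0
    return probs / probs.sum()

def sample_token(logits, rng, temperature=0.9, top_k=2):
    """Draw one token id from the filtered distribution."""
    probs = top_k_temperature_probs(logits, temperature, top_k)
    return int(rng.choice(len(probs), p=probs))

logits = [2.0, 1.0, 0.1, -1.0]   # made-up scores for a 4-token vocabulary
rng = np.random.default_rng(0)
print(top_k_temperature_probs(logits, temperature=1.0, top_k=2))
print(top_k_temperature_probs(logits, temperature=0.5, top_k=2))
print(sample_token(logits, rng))
```

Lower temperatures concentrate probability on the highest-scoring token, and a smaller `top_k` removes unlikely tokens entirely. Both reduce randomness, at the cost of more repetitive output like the repeated lines in the poem above.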


#10. Imagine you are building a creative writing assistant for a publishing company. The assistant should generate story plots and character descriptions using Generative AI. Describe how you would design the system, including model selection, training data, bias mitigation, and evaluation methods. Explain the real-world challenges you might face.



  1. Model Selection:

  * Use GPT-4/GPT-style Transformer models for text generation
  * Fine-tuned LLM for story plots and character descriptions

  2. Training Data:

  * Books, novels, scripts, and storytelling datasets
  * Genre-specific datasets (fantasy, romance, thriller)
  * Clean, diverse, and unbiased data sources

  3. Bias Mitigation:

  * Data filtering
  * Ethical dataset curation
  * Reinforcement Learning from Human Feedback (RLHF)
  * Regular bias audits

  4. System Architecture:
     User Input → Prompt Processing → LLM Generation → Content Filtering → Output Delivery

  5. Evaluation Methods:

  * BLEU and ROUGE scores where reference texts exist (these overlap metrics are of limited use for open-ended creative output)
  * Human evaluation for creativity
  * Perplexity for language fluency
  * User feedback analysis

  6. Real-World Challenges:

  * High computational cost
  * Data privacy concerns
  * Hallucination in generated content
  * Ethical and copyright issues
  * Maintaining originality and creativity
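
  Two of the automatic checks mentioned under evaluation can be sketched in a few lines of Python. Perplexity is computed here from a list of per-token probabilities that a language model would supply (the values below are made up). The distinct-n ratio is an additional, commonly used repetition measure that is not in the list above and is included as an assumption about what would be useful for this system.

```python
import math

def perplexity(token_probs):
    """exp of the average negative log-probability assigned to each token."""
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)

def distinct_n(text, n=2):
    """Share of n-grams that are unique; low values indicate repetition."""
    tokens = text.split()
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0
    return len(set(ngrams)) / len(ngrams)

# Made-up per-token probabilities for a short generated sentence.
print(perplexity([0.25, 0.25, 0.25, 0.25]))

varied = "a lone rider crosses the frozen pass at dawn"
looping = "the story started the story started the story started"
print(distinct_n(varied), distinct_n(looping))
```

  Neither number measures creativity, so both would complement rather than replace human evaluation.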

  This system would enhance productivity for writers by generating high-quality story ideas and character descriptions efficiently.
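
  The architecture in item 4 can be sketched as a chain of small functions. Everything here is a hypothetical illustration: the function names, the keyword-based filter, the retry limit, and the `stub_llm` stand-in are assumptions, and a production system would call a real model and use a far more capable content filter.

```python
from typing import Callable, Iterable

def build_prompt(genre: str, theme: str, tone: str) -> str:
    """Prompt processing: turn user input into a structured prompt."""
    return f"Genre: {genre}\nTheme: {theme}\nTone: {tone}\n\nStory Plot:\n"

def passes_filter(text: str, blocked_terms: Iterable[str]) -> bool:
    """Content filtering: reject text containing any blocked term (case-insensitive)."""
    lowered = text.lower()
    return not any(term.lower() in lowered for term in blocked_terms)

def generate_plot(genre: str, theme: str, tone: str,
                  generate: Callable[[str], str],
                  blocked_terms: Iterable[str] = (),
                  max_attempts: int = 3) -> str:
    """User input -> prompt -> generation -> filtering -> output."""
    blocked = list(blocked_terms)
    prompt = build_prompt(genre, theme, tone)
    for _ in range(max_attempts):
        candidate = generate(prompt).strip()
        if candidate and passes_filter(candidate, blocked):
            return candidate
    return ""  # nothing acceptable was produced; defer to a human editor

def stub_llm(prompt: str) -> str:
    """Stand-in for a real model call, used so the sketch runs offline."""
    return "A disgraced cartographer maps a kingdom that her closest ally plans to sell."

print(generate_plot("Fantasy", "Adventure and betrayal", "Dark", stub_llm))
```

  Passing the generator in as a function keeps the pipeline independent of any particular model, so the same filtering and retry logic can wrap GPT-2 during prototyping and a larger fine-tuned model later.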

In [14]:
!pip install transformers torch --quiet

from transformers import pipeline
import torch

generator = pipeline(
    "text-generation",
    model="gpt2",
    device=0 if torch.cuda.is_available() else -1
)

def generate_story_plot(genre, theme, tone):
    prompt = f"""
You are a creative writing assistant for a publishing company.

Generate a compelling story plot.

Genre: {genre}
Theme: {theme}
Tone: {tone}

Story Plot:
"""
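    output = generator(
        prompt,
        max_new_tokens=200,  # caps the continuation; max_length would also count the prompt tokens
        do_sample=True,      # temperature, top_k, and top_p only take effect when sampling
        temperature=0.9,
        top_k=50,
        top_p=0.95,
        num_return_sequences=1
    )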

    result = output[0]["generated_text"]
    return result.replace(prompt, "").strip()


def generate_characters(genre, tone):
    prompt = f"""
You are a creative writing assistant.

Create two unique character descriptions.

Genre: {genre}
Tone: {tone}

Character Descriptions:
"""
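    output = generator(
        prompt,
        max_new_tokens=200,  # caps the continuation; max_length would also count the prompt tokens
        do_sample=True,      # temperature, top_k, and top_p only take effect when sampling
        temperature=0.9,
        top_k=50,
        top_p=0.95,
        num_return_sequences=1
    )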

    result = output[0]["generated_text"]
    return result.replace(prompt, "").strip()


genre = "Fantasy"
theme = "Adventure and betrayal"
tone = "Dark"

plot = generate_story_plot(genre, theme, tone)
characters = generate_characters(genre, tone)

print("=== Generated Story Plot ===\n")
print(plot)

print("\n=== Generated Characters ===\n")
print(characters)




=== Generated Story Plot ===

What is the story of the year:

August 3rd

What is the year of the story:

August 4th

Where is the story that started the story:

September 1st

What is the year that started the story:

September 2nd

What is the year that started the story:

September 3rd

What is the year that started the story:

September 4th

What is the year that started the story:

September 5th

What is the year that started the story:

September 6th

What is the year that started the story:

September 7th

What is the year that started the story:

September 8th

What is the year that started the story:

September 9th

What is the year that started the story:

September 10th

What is the year that started the story:

September 11th

What is the year that started the story:

September 12th

What is the year that started the story:

September 13th

What is the year that started the story:

September 14th

What

=== Generated Characters ===

Characters have names. They have the same