
# 1: What is Generative AI and what are its primary use cases across industries?
Generative AI refers to artificial intelligence models that can generate new content such as text, images, audio, code, or videos by learning patterns from existing data.
## Primary use cases:
- Content creation (articles, poetry, marketing text)
- Machine translation
- Chatbots and virtual assistants
- Code generation
- Healthcare (synthetic medical data)
- Gaming & entertainment
- Design & creativity


# 2: Explain probabilistic modeling in generative models and difference from discriminative models
Generative models use probabilistic modeling to learn the joint probability distribution $P(X, Y)$ or $P(X)$, allowing them to generate new samples.

| Generative Models             | Discriminative Models         |
| ----------------------------- | ----------------------------- |
| Learn data distribution       | Learn decision boundary       |
| Can generate new data         | Cannot generate data          |
| Model $P(X, Y)$ or $P(X)$     | Model $P(Y \mid X)$           |
| Examples: GPT, VAE            | Examples: Logistic Regression |
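
As a rough illustration of the contrast in the table, the sketch below estimates a joint distribution $P(X, Y)$ from counts, derives $P(Y \mid X)$ from it, and draws a new sample. The toy data, the labels, and the helper names `conditional` and `sample` are assumptions made up for this example.

```python
import random
from collections import Counter

# Toy labelled data: (feature x, label y). Values are invented for illustration.
data = [("long", "spam"), ("long", "spam"), ("short", "spam"),
        ("short", "ham"), ("short", "ham"), ("long", "ham"),
        ("short", "ham"), ("long", "spam")]

# Generative view: estimate the joint distribution P(X, Y) from counts.
counts = Counter(data)
joint = {pair: n / len(data) for pair, n in counts.items()}

def conditional(y, x):
    """Derive P(Y=y | X=x) from the joint: P(x, y) / sum over y' of P(x, y')."""
    marginal_x = sum(p for (xi, _), p in joint.items() if xi == x)
    return joint.get((x, y), 0.0) / marginal_x

def sample(rng):
    """Draw a new (x, y) pair from the estimated joint distribution."""
    pairs = list(joint)
    return rng.choices(pairs, weights=[joint[p] for p in pairs], k=1)[0]

rng = random.Random(0)
p_spam_given_long = conditional("spam", "long")
new_pair = sample(rng)
print(p_spam_given_long, new_pair)
```

A discriminative model would fit only the conditional $P(Y \mid X)$ directly, so it could classify but would have nothing to sample new `(x, y)` pairs from.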


# 3: Difference between Autoencoders and VAEs in text generation
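| Autoencoders                                              | Variational Autoencoders (VAEs)                                |
| --------------------------------------------------------- | -------------------------------------------------------------- |
| Deterministic latent space (each input maps to one point) | Probabilistic latent space (each input maps to a distribution) |
| Focus on reconstruction                                   | Focus on generation                                            |
| Poor text diversity                                       | Better diversity, since new latent points can be sampled       |
| No latent-space regularization                            | KL-divergence term regularizes the latent space                |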

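- VAEs are better suited to text generation because the KL-divergence term keeps the latent space smooth and continuous, so points sampled from it or interpolated within it decode to meaningful text more reliably than points from a plain autoencoder's latent space.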
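
The difference between the two columns can be written down as a training objective. The notation below is the standard one and is an assumption, since this notebook does not define symbols: $q_\phi(z \mid x)$ is the encoder, $p_\theta(x \mid z)$ the decoder, and $p(z) = \mathcal{N}(0, I)$ the prior. A VAE maximizes

$$
\mathcal{L}(\theta, \phi; x) = \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big] - D_{\mathrm{KL}}\big(q_\phi(z \mid x) \,\|\, p(z)\big),
$$

and samples the latent code with the reparameterization

$$
z = \mu + \sigma \odot \epsilon, \qquad \epsilon \sim \mathcal{N}(0, I).
$$

A plain autoencoder keeps only a reconstruction term and uses a deterministic code $z = f(x)$, so nothing constrains how the latent space is organized.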

# 4: Attention mechanisms in Neural Machine Translation (NMT)
Attention allows the model to focus on relevant words in the source sentence while generating each word in the target sentence.
## Why attention is critical:
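- Handles long sentences better, because the decoder can look at every source position instead of a single summary vector
- Improves translation quality
- Aligns source and target words dynamically
- Removes the information bottleneck of compressing the whole source sentence into one fixed-length vector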
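
One common way to formalize this is shown below. The symbols are assumptions introduced here for illustration: $h_i$ is the encoder state for source position $i$, $s_{t-1}$ is the previous decoder state, and $\mathrm{score}$ is any compatibility function, such as a dot product or a small feed-forward network.

$$
e_{t,i} = \mathrm{score}(s_{t-1}, h_i), \qquad
\alpha_{t,i} = \frac{\exp(e_{t,i})}{\sum_{j} \exp(e_{t,j})}, \qquad
c_t = \sum_{i} \alpha_{t,i} \, h_i
$$

The weights $\alpha_{t,i}$ sum to 1 over the source positions and act as a soft alignment, and the context vector $c_t$ is recomputed for every target word, which is why no single fixed-length vector has to carry the whole sentence.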


# 5: Ethical considerations in generative AI for creative content
Key ethical considerations include:
- Plagiarism & copyright issues
- Bias and stereotypes
- Misinformation
- Ownership of generated content
- Misuse (deepfakes, fake news)

Responsible AI requires transparency, bias mitigation, and human oversight.


# 6: VAE for Text Reconstruction (Small Dataset)

## Explanation:
- Text is tokenized into integer IDs and padded to a common length
- VAE learns compressed latent representations
- After training, reconstructed sequences should approximate the original sentences


In [1]:
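import numpy as np
import tensorflow as tf
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.layers import Input, Dense, Lambda
from tensorflow.keras.models import Model

texts = [
    "The sky is blue",
    "The sun is bright",
    "The grass is green",
    "The night is dark",
    "The stars are shining"
]

# Tokenization: map each word to an integer ID and pad to a common length
tokenizer = Tokenizer()
tokenizer.fit_on_texts(texts)
sequences = tokenizer.texts_to_sequences(texts)
padded = pad_sequences(sequences, padding='post')

vocab_size = len(tokenizer.word_index) + 1
input_dim = padded.shape[1]
latent_dim = 2

# Scale token IDs into [0, 1) so they match the range of the sigmoid output layer
x_train = padded.astype("float32") / vocab_size

# Encoder
inputs = Input(shape=(input_dim,))
h = Dense(8, activation='relu')(inputs)
z_mean = Dense(latent_dim)(h)
z_log_var = Dense(latent_dim)(h)

def sampling(args):
    # Reparameterization trick: z = mean + std * epsilon, with epsilon ~ N(0, I)
    z_mean, z_log_var = args
    epsilon = tf.random.normal(shape=tf.shape(z_mean))
    return z_mean + tf.exp(0.5 * z_log_var) * epsilon

z = Lambda(sampling)([z_mean, z_log_var])

# Decoder
decoder_h = Dense(8, activation='relu')
decoder_out = Dense(input_dim, activation='sigmoid')
h_decoded = decoder_h(z)
outputs = decoder_out(h_decoded)

vae = Model(inputs, outputs)
# Note: 'mse' covers only the reconstruction term; the KL-divergence term of the
# full VAE objective is not included in this minimal example
vae.compile(optimizer='adam', loss='mse')
vae.fit(x_train, x_train, epochs=50, verbose=0)

# Reconstruction: rescale outputs back to token IDs and decode them to words
reconstructed = vae.predict(x_train)
token_ids = np.rint(reconstructed * vocab_size).astype(int)
print(tokenizer.sequences_to_texts(token_ids.tolist()))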
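
The model above is compiled with `loss='mse'`, which measures reconstruction error only. A full VAE objective also includes a KL-divergence term that pulls the encoder's distribution toward a standard normal prior. The sketch below computes that term in closed form for a single latent vector. The function name `kl_divergence` and the example inputs are assumptions for illustration, and it is not wired into the Keras model.

```python
import math

def kl_divergence(z_mean, z_log_var):
    """KL( N(mean, diag(exp(log_var))) || N(0, I) ) for one latent vector."""
    return -0.5 * sum(
        1.0 + log_var - mean ** 2 - math.exp(log_var)
        for mean, log_var in zip(z_mean, z_log_var)
    )

# The penalty is zero when the encoder output already matches the prior
kl_at_prior = kl_divergence([0.0, 0.0], [0.0, 0.0])

# Moving the mean away from zero increases the penalty
kl_shifted = kl_divergence([1.0, -1.0], [0.0, 0.0])

print(kl_at_prior, kl_shifted)
```

In training, this value would be averaged over the batch and added to the reconstruction loss.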




# 7: GPT-based Translation (English → French & German)


In [6]:
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

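# Load the tokenizer and model directly for more control.
# Note: t5-small is an encoder-decoder (seq2seq) transformer, not a GPT-style
# decoder-only model; it selects the task from a text prefix on the input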
tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

text = "Artificial intelligence is transforming the world."

def translate(text, target_language):
    # Build the task prefix that T5 uses to pick the translation direction
    prompt = f"translate English to {target_language}: {text}"
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    # Cap the output length explicitly instead of relying on the default limit
    output_ids = model.generate(input_ids, max_new_tokens=40)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

# Translate to French
french_translation = translate(text, "French")

# Translate to German
german_translation = translate(text, "German")

print("French:", french_translation)
print("German:", german_translation)


French: L'intelligence artificielle transforme le monde.
German: Die künstliche Intelligenz verändert die Welt.


# 8: Attention-based Encoder-Decoder (English → Spanish)

## Explanation:
- Encoder converts input sentence to hidden states
- Attention weighs important words
- Decoder generates Spanish translation
- This approach significantly improves translation accuracy
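
The three steps above can be sketched as a forward pass in plain Python. This is a minimal illustration under strong assumptions: the vocabularies, the hidden size, and all weights are invented and randomly initialised, nothing is trained, and dot-product scoring is one choice among several. The emitted Spanish tokens are therefore arbitrary; the point is the data flow from encoder states through attention weights to decoder outputs.

```python
import math
import random

rng = random.Random(0)
HIDDEN = 4

def rand_matrix(rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical toy vocabularies and untrained parameters
src_vocab = ["i", "like", "green", "apples"]
tgt_vocab = ["<s>", "me", "gustan", "las", "manzanas", "verdes", "</s>"]
src_embed = {w: [rng.uniform(-1, 1) for _ in range(HIDDEN)] for w in src_vocab}
tgt_embed = {w: [rng.uniform(-1, 1) for _ in range(HIDDEN)] for w in tgt_vocab}
W_enc = rand_matrix(HIDDEN, HIDDEN)
W_dec = rand_matrix(HIDDEN, HIDDEN)
W_out = rand_matrix(len(tgt_vocab), 2 * HIDDEN)

def encode(words):
    """Simple RNN encoder: returns one hidden state per source word."""
    h = [0.0] * HIDDEN
    states = []
    for word in words:
        h = [math.tanh(a + b) for a, b in zip(matvec(W_enc, h), src_embed[word])]
        states.append(h)
    return states

def attend(query, states):
    """Dot-product attention: weights over source positions plus context vector."""
    scores = [sum(q * s for q, s in zip(query, state)) for state in states]
    weights = softmax(scores)
    context = [sum(w * state[i] for w, state in zip(weights, states))
               for i in range(HIDDEN)]
    return weights, context

def decode(states, max_steps=5):
    """Greedy decoder: each step attends over all encoder states."""
    h, token = states[-1], "<s>"
    outputs, alignments = [], []
    for _ in range(max_steps):
        h = [math.tanh(a + b) for a, b in zip(matvec(W_dec, h), tgt_embed[token])]
        weights, context = attend(h, states)
        logits = matvec(W_out, h + context)  # concatenate state and context
        token = tgt_vocab[max(range(len(logits)), key=logits.__getitem__)]
        alignments.append(weights)
        if token == "</s>":
            break
        outputs.append(token)
    return outputs, alignments

states = encode(["i", "like", "green", "apples"])
translation, alignments = decode(states)
print(translation)
```

Each row of `alignments` holds one weight per source word and sums to 1. In a trained model these rows show which source words the decoder relied on for each target word.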

# 9: Poem Generation using GPT (Prompt-based)


In [9]:
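from transformers import pipeline

# GPT-2 is a base language model, not an instruction-tuned one: it continues
# the prompt as text rather than following it as an instruction
generator = pipeline("text-generation", model="gpt2")

prompt = "Write a short poem in a lyrical and gentle style similar to:\nRoses are red, violets are blue,\nSugar is sweet, and so are you."

# max_new_tokens bounds only the generated continuation; pad_token_id is set
# explicitly because GPT-2 has no dedicated padding token
poem = generator(
    prompt,
    max_new_tokens=80,
    num_return_sequences=1,
    pad_token_id=generator.tokenizer.eos_token_id,
)
print(poem[0]["generated_text"])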



Write a short poem in a lyrical and gentle style similar to:
Roses are red, violets are blue,
Sugar is sweet, and so are you.
But you can't live without sugar.
You must live with sugar.
And you need to die with sugar.
Just like that, we're just getting started. Let's get started!


# 10: Designing a Creative Writing Assistant (Real-World Scenario)

## System Design:
- Model: GPT-style transformer
- Training Data: Books, stories, scripts (licensed & diverse)
- Bias Mitigation: Data filtering, fairness evaluation
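- Evaluation: Human review as the main signal; BLEU and ROUGE only where reference texts exist, since they measure word overlap rather than creativity; separate creativity metrics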
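
To make the overlap-based part of the evaluation concrete, here is a simplified sketch of what BLEU and ROUGE measure at the unigram level. It is not the full definition of either metric: there is no brevity penalty, no higher-order n-grams, and tokenization is plain whitespace splitting. The function names and example sentences are assumptions for illustration.

```python
from collections import Counter

def clipped_overlap(candidate, reference):
    """Count candidate words that also occur in the reference, clipped per word."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    return sum(min(n, ref[word]) for word, n in cand.items())

def unigram_precision(candidate, reference):
    """BLEU-style modified unigram precision: overlap / candidate length."""
    total = len(candidate.split())
    return clipped_overlap(candidate, reference) / total if total else 0.0

def unigram_recall(candidate, reference):
    """ROUGE-1-style recall: overlap / reference length."""
    total = len(reference.split())
    return clipped_overlap(candidate, reference) / total if total else 0.0

reference = "the moon rises over the quiet sea"
candidate = "the moon rises over the sea tonight"

precision = unigram_precision(candidate, reference)
recall = unigram_recall(candidate, reference)
print(round(precision, 3), round(recall, 3))
```

Both scores reward copying the reference wording, so an original but well-written story can score low. This is one reason human review remains necessary for creative text.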
## Challenges:
- Bias and hallucinations
- Copyright concerns
- Content moderation
- Computational cost
## Business Value:
- Faster content creation
- Creative assistance for authors
- Cost reduction
- Scalable storytelling solutions