In [3]:

 1. What is Generative AI?  
Generative AI refers to a category of artificial intelligence systems designed to create new content such as text, images, music, or data by learning patterns from existing datasets.

---

 2. How is Generative AI different from traditional AI?  
Traditional AI often focuses on classification or prediction based on input data, while Generative AI creates new, original content by modeling data distributions.

---

 3. Name two applications of Generative AI in the industry  
1. Text generation for chatbots and content creation.  
2. Image synthesis for design, art, and media production.

---

 4. What are some challenges associated with Generative AI?  
Challenges include data bias, generating realistic but misleading content (deepfakes), high computational costs, and difficulty in evaluating generated outputs.

---

 5. Why is Generative AI important for modern applications?  
It enables automation of creative tasks, personalized content generation, simulation for training, and augmentation of human creativity.

---

 6. What is probabilistic modeling in the context of Generative AI?  
Probabilistic modeling involves creating models that represent the probability distributions of data, enabling the generation of new samples based on learned probabilities.

---

 7. Define a generative model  
A generative model is a type of model that learns to generate new data samples similar to the training data by capturing its underlying distribution.

---

 8. Explain how an n-gram model works in text generation  
An n-gram model predicts the next word in a sequence from the previous (n-1) words. It estimates the conditional probability of each candidate next word from n-gram counts in the training data, then picks or samples a word from that distribution.
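
As a rough illustration of the counting step, the sketch below estimates bigram (n=2) probabilities by maximum likelihood. The function name `bigram_probabilities` and the toy sentence are assumptions made for this example only.

```python
from collections import Counter

def bigram_probabilities(tokens):
    """Estimate P(next | prev) by maximum likelihood from bigram counts."""
    pair_counts = Counter(zip(tokens, tokens[1:]))
    # Count each token only where it has a successor, so every row sums to 1.
    prev_counts = Counter(tokens[:-1])
    return {
        (prev, nxt): count / prev_counts[prev]
        for (prev, nxt), count in pair_counts.items()
    }

tokens = "the cat is on the mat".split()
probs = bigram_probabilities(tokens)
print(probs[("the", "cat")])  # 0.5: "the" is followed once by "cat" and once by "mat"
```

Any bigram that never occurs in the training text is simply missing from the table, which is the data sparsity problem in miniature.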

---

 9. What are the limitations of n-gram models?  
They suffer from data sparsity, limited context (short memory), and inability to capture long-range dependencies in language.

---

 10. How can you improve the performance of an n-gram model?  
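By applying smoothing techniques and using back-off or interpolation methods to better estimate probabilities for unseen sequences. Increasing n gives the model more context, but it also makes the counts sparser, so it helps only when there is enough training data.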
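
One simple smoothing scheme is add-alpha (Laplace) smoothing, sketched below for bigrams. This is a minimal sketch that assumes the vocabulary is just the set of words seen in the text; `laplace_bigram_prob` is a name chosen for illustration.

```python
from collections import Counter

def laplace_bigram_prob(tokens, prev, nxt, alpha=1.0):
    """P(nxt | prev) with add-alpha smoothing over the observed vocabulary."""
    vocab_size = len(set(tokens))
    pair_counts = Counter(zip(tokens, tokens[1:]))
    prev_counts = Counter(tokens[:-1])
    # Every possible successor gets alpha pseudo-counts, so nothing has probability 0.
    return (pair_counts[(prev, nxt)] + alpha) / (prev_counts[prev] + alpha * vocab_size)

tokens = "the cat is on the mat".split()
seen = laplace_bigram_prob(tokens, "the", "cat")    # observed bigram
unseen = laplace_bigram_prob(tokens, "the", "is")   # never observed, still > 0
print(seen, unseen)
```

The unseen bigram now receives a small non-zero probability, at the cost of taking some probability mass away from the bigrams that were actually observed.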

---

 11. What is the Markov assumption, and how does it apply to text generation?  
The Markov assumption states that the probability of a word depends only on a fixed number of previous words, simplifying text modeling by limiting context.

---

 12. Why are probabilistic models important in generative AI?  
They allow the model to quantify uncertainty and generate diverse, plausible samples by modeling the distribution of data.

---

 13. What is an autoencoder?  
An autoencoder is a neural network that learns to compress input data into a lower-dimensional latent representation and then reconstructs the original data from it.

---

 14. How does a VAE differ from a standard autoencoder?  
A Variational Autoencoder (VAE) models the latent space probabilistically, learning distributions rather than fixed codes, enabling controlled and diverse generation.
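
To make "learning distributions rather than fixed codes" concrete, the sketch below draws a latent vector from a Gaussian described by a mean and a log-variance, using the reparameterization trick commonly used in VAEs. The values of `mu` and `log_var` are made up here; in a real VAE the encoder network would produce them.

```python
import numpy as np

def sample_latent(mu, log_var, rng):
    """Draw z ~ N(mu, diag(exp(log_var))) as z = mu + sigma * eps."""
    eps = rng.standard_normal(mu.shape)       # noise independent of the parameters
    return mu + np.exp(0.5 * log_var) * eps   # sigma = exp(0.5 * log_var)

rng = np.random.default_rng(0)
mu = np.array([0.0, 2.0])        # hypothetical encoder output (mean)
log_var = np.array([0.0, -2.0])  # hypothetical encoder output (log-variance)
samples = np.stack([sample_latent(mu, log_var, rng) for _ in range(5000)])
print(samples.mean(axis=0))  # close to mu
print(samples.std(axis=0))   # close to exp(0.5 * log_var)
```

Because the randomness is isolated in `eps`, gradients can flow through `mu` and `log_var` during training, which is why this formulation is preferred over sampling directly.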

---

 15. Why are VAEs useful in generative modeling?  
They allow smooth interpolation in the latent space and can generate new data samples by sampling from learned latent distributions.

---

 16. What role does the decoder play in an autoencoder?  
The decoder reconstructs the original data from the compressed latent representation.

---

 17. How does the latent space affect text generation in a VAE?  
The latent space captures meaningful variations in the data; sampling from it enables generation of diverse and coherent text outputs.

---

 18. What is the purpose of the Kullback-Leibler (KL) divergence term in VAEs?  
KL divergence regularizes the latent distribution to be close to a prior distribution (usually Gaussian), encouraging smooth and structured latent spaces.
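
When the prior is a standard Gaussian and the encoder outputs a diagonal Gaussian, the KL term has a well-known closed form. The sketch below assumes that setup; `kl_to_standard_normal` is an illustrative name, not part of any library.

```python
import numpy as np

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dimensions."""
    return -0.5 * np.sum(1.0 + log_var - mu**2 - np.exp(log_var), axis=-1)

# Zero when the encoder distribution already matches the prior.
kl_match = kl_to_standard_normal(np.zeros(3), np.zeros(3))
# Grows as the mean moves away from the origin.
kl_shifted = kl_to_standard_normal(np.array([1.0, 0.0]), np.zeros(2))
print(kl_match, kl_shifted)
```

In a VAE loss this value is added to the reconstruction error, so the encoder is penalized for placing codes far from the prior.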

---

 19. How can you prevent overfitting in a VAE?  
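By using regularization (like the KL term), dropout, and early stopping, and by choosing a latent dimension that is large enough to represent the data but not so large that the model can memorize individual training examples.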

---

 20. Explain why VAEs are commonly used for unsupervised learning tasks  
Because they learn to represent data structure without labeled data, making them ideal for feature learning and generation.

---

 21. What is a transformer model?  
A transformer is a neural network architecture that uses self-attention mechanisms to model dependencies in sequential data, excelling in tasks like language understanding and generation.

---

 22. Explain the purpose of self-attention in transformers  
Self-attention allows the model to weigh the importance of different words in a sequence, capturing relationships regardless of distance.
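
A minimal single-head version of scaled dot-product self-attention can be sketched in NumPy as follows. The random projection matrices `w_q`, `w_k`, and `w_v` stand in for learned weights, and the dimensions are arbitrary choices for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over one sequence."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])  # similarity of every position to every other
    weights = softmax(scores, axis=-1)       # each row sums to 1
    return weights @ v, weights

rng = np.random.default_rng(0)
seq_len, d_model, d_head = 4, 8, 8
x = rng.standard_normal((seq_len, d_model))
w_q, w_k, w_v = (rng.standard_normal((d_model, d_head)) for _ in range(3))
out, weights = self_attention(x, w_q, w_k, w_v)
print(out.shape, weights.shape)  # (4, 8) (4, 4)
```

Row `i` of `weights` shows how much position `i` draws on every other position, and nothing in the computation depends on how far apart two positions are.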

---

 23. How does a GPT model generate text?  
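GPT generates text autoregressively: at each step it predicts a probability distribution over the next token given the prompt and all previously generated tokens, selects a token from that distribution, appends it to the sequence, and repeats.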
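
The loop itself is simple and can be shown with a toy stand-in for the model. In the sketch below, `toy_next_token_probs` is a hypothetical function invented for this example (it is not how GPT computes probabilities); only the generate loop reflects the autoregressive idea.

```python
import numpy as np

def toy_next_token_probs(context, vocab_size):
    """Hypothetical stand-in for a language model: favors (last token + 1) mod vocab_size."""
    probs = np.full(vocab_size, 0.1 / (vocab_size - 1))
    probs[(context[-1] + 1) % vocab_size] = 0.9
    return probs

def generate(prompt, steps, vocab_size=5):
    tokens = list(prompt)
    for _ in range(steps):
        probs = toy_next_token_probs(tokens, vocab_size)  # condition on everything so far
        tokens.append(int(np.argmax(probs)))              # greedy choice of next token
    return tokens

generated = generate([0], steps=6)
print(generated)  # [0, 1, 2, 3, 4, 0, 1]
```

Each generated token is fed back in as part of the context for the next step, which is what "autoregressive" means here.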

---

 24. What are the key differences between a GPT model and an RNN?  
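GPT uses the transformer architecture, whose self-attention lets every position attend directly to earlier positions, enabling parallel processing during training and better handling of long-range dependencies. RNNs process tokens one at a time and must carry all past context in a fixed-size hidden state, which limits their effective memory.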

---

 25. How does fine-tuning improve a pre-trained GPT model?  
Fine-tuning adapts the model to a specific domain or task by training it on targeted data, improving relevance and accuracy.

---

 26. What is zero-shot learning in the context of GPT models?  
Zero-shot learning refers to GPT's ability to perform tasks without explicit training on them, based solely on prompts and pre-trained knowledge.

---

 27. Describe how prompt engineering can impact GPT model performance  
Carefully crafted prompts guide GPT to generate more accurate, relevant, or creative responses by shaping the input context.

---

 28. Why are large datasets essential for training GPT models?  
Large datasets provide diverse language patterns, improving the model's generalization and robustness.

---

 29. What are potential ethical concerns with GPT models?  
Risks include generation of biased, harmful, or misleading content, privacy issues, and misuse for misinformation.

---

 30. How does the attention mechanism contribute to GPT’s ability to handle long-range dependencies?  
Attention allows GPT to focus on relevant parts of the entire input sequence, capturing relationships even between distant words.

---

 31. What are some limitations of GPT models for real-world applications?  
They can produce incorrect or nonsensical outputs, struggle with reasoning, require large compute resources, and may perpetuate biases.

---

 32. How can GPT models be adapted for domain-specific text generation?  
By fine-tuning on domain-specific corpora or using prompt engineering to steer output style and content.

---

 33. What are some common metrics for evaluating text generation quality?  
Metrics include BLEU, ROUGE, perplexity, and human evaluation for fluency and relevance.
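
Of these metrics, perplexity is the easiest to compute by hand: it is the exponential of the average negative log-probability the model assigned to each reference token. The sketch below assumes those per-token probabilities are already available as a list; the numbers are made up.

```python
import math

def perplexity(token_probs):
    """Perplexity = exp of the mean negative log-probability per token."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

uniform = perplexity([0.25, 0.25, 0.25, 0.25])  # model is choosing among 4 equally likely options
confident = perplexity([0.9, 0.8, 0.95, 0.85])  # model assigns high probability to each token
print(uniform, confident)
```

Lower is better: a perplexity near 4 means the model is about as uncertain as a uniform choice among four tokens at every step.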

---

 34. Explain the difference between deterministic and probabilistic text generation  
Deterministic generation produces the same output for a given input, while probabilistic generation samples from a distribution, yielding varied outputs.
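
The contrast can be seen on a single decoding step. The sketch below compares greedy selection (deterministic) with temperature sampling (probabilistic) over a made-up vector of logits; the helper names are chosen for illustration.

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max()  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def greedy(logits):
    """Deterministic: always return the highest-scoring token index."""
    return int(np.argmax(logits))

def sample(logits, temperature, rng):
    """Probabilistic: draw a token index from the temperature-scaled distribution."""
    probs = softmax(logits / temperature)
    return int(rng.choice(len(logits), p=probs))

logits = np.array([2.0, 1.0, 0.5, 0.1])  # hypothetical scores for 4 tokens
rng = np.random.default_rng(0)
greedy_picks = [greedy(logits) for _ in range(5)]
sampled_picks = [sample(logits, 1.0, rng) for _ in range(200)]
print(greedy_picks)        # the same index every time
print(set(sampled_picks))  # several different indices
```

Lowering the temperature sharpens the distribution toward the greedy choice, while raising it flattens the distribution and increases variety.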

---

 35. How does beam search improve text generation in language models?  
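Beam search keeps the k highest-scoring partial sequences (the beams) at each step instead of committing to a single best next word, and returns the complete sequence with the highest overall probability. This can find sequences that greedy decoding misses, at the cost of more computation.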
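
A toy example shows how this can beat greedy decoding. The transition table below is invented for the illustration, and the model is assumed to depend only on the previous token, which keeps the sketch short.

```python
import math

# Hypothetical toy model: next-token probabilities depend only on the previous token.
TRANSITIONS = {
    "<s>": {"a": 0.6, "b": 0.4},
    "a": {"x": 0.5, "y": 0.5},
    "b": {"x": 0.95, "y": 0.05},
}

def beam_search(transitions, start, steps, num_beams):
    beams = [([start], 0.0)]  # (tokens, cumulative log-probability)
    for _ in range(steps):
        candidates = []
        for tokens, score in beams:
            for nxt, p in transitions[tokens[-1]].items():
                candidates.append((tokens + [nxt], score + math.log(p)))
        # Keep only the num_beams best partial sequences.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:num_beams]
    return beams[0]

greedy_tokens, greedy_score = beam_search(TRANSITIONS, "<s>", steps=2, num_beams=1)
beam_tokens, beam_score = beam_search(TRANSITIONS, "<s>", steps=2, num_beams=2)
print(greedy_tokens, math.exp(greedy_score))  # starts with "a", total probability about 0.30
print(beam_tokens, math.exp(beam_score))      # starts with "b", total probability about 0.38
```

With a single beam the search commits to "a" because it looks best locally; with two beams it keeps "b" alive long enough to discover that "b x" is the more probable sequence overall.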

"""



In [None]:
# 1. Generate a random sentence using probabilistic modeling (Markov Chain)
import random

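def generate_markov_chain(text, n=2, length=10):
    """Generate up to `length` words from an order-n Markov chain built from `text`."""
    words = text.split()
    if len(words) <= n:
        raise ValueError("text must contain more than n words")
    # Map each n-word context to the list of words observed right after it.
    ngrams = {}
    for i in range(len(words) - n):
        key = tuple(words[i:i + n])
        ngrams.setdefault(key, []).append(words[i + n])
    # Start from a random context, then repeatedly sample a successor.
    current = random.choice(list(ngrams.keys()))
    output = list(current)
    for _ in range(length - n):
        if current not in ngrams:
            break  # dead end: this context never had a successor in the text
        output.append(random.choice(ngrams[current]))
        current = tuple(output[-n:])
    return ' '.join(output)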

sample_text = "The cat is on the mat"
print("Markov Chain generated sentence:")
print(generate_markov_chain(sample_text, n=2, length=10))


In [None]:
# 2. Build a simple Autoencoder model using Keras for sentences (example using character-level encoding)

import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

# Example simple character set and sentence for demo
chars = sorted(list(set("The cat is on the mat")))
char_to_int = {c: i for i, c in enumerate(chars)}
int_to_char = {i: c for c, i in char_to_int.items()}
max_len = 21  # the demo sentence has 21 characters; 20 would cut off the final 't'

def encode_sentence(sentence):
    encoded = np.zeros((max_len, len(chars)))
    for i, c in enumerate(sentence[:max_len]):
        encoded[i, char_to_int[c]] = 1
    return encoded.flatten()

# Prepare training data (repeat for demo)
sentences = ["The cat is on the mat"] * 100
x_train = np.array([encode_sentence(s) for s in sentences])

input_dim = x_train.shape[1]
encoding_dim = 32

input_layer = Input(shape=(input_dim,))
encoded = Dense(encoding_dim, activation='relu')(input_layer)
decoded = Dense(input_dim, activation='sigmoid')(encoded)

autoencoder = Model(input_layer, decoded)
autoencoder.compile(optimizer='adam', loss='binary_crossentropy')

autoencoder.fit(x_train, x_train, epochs=20, batch_size=10, verbose=1)


In [None]:
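# 3. Use Hugging Face transformers library to fine-tune GPT-2 on custom data (template, requires datasets and tokenizers)
# Note: TextDataset is deprecated in transformers in favor of the datasets library, so it may be unavailable in newer releases.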

from transformers import GPT2Tokenizer, GPT2LMHeadModel, TextDataset, DataCollatorForLanguageModeling, Trainer, TrainingArguments

def fine_tune_gpt2(train_file, output_dir):
    tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
    model = GPT2LMHeadModel.from_pretrained('gpt2')

    dataset = TextDataset(
        tokenizer=tokenizer,
        file_path=train_file,
        block_size=128)

    data_collator = DataCollatorForLanguageModeling(
        tokenizer=tokenizer, mlm=False,
    )

    training_args = TrainingArguments(
        output_dir=output_dir,
        overwrite_output_dir=True,
        num_train_epochs=1,
        per_device_train_batch_size=2,
        save_steps=10_000,
        save_total_limit=2,
    )

    trainer = Trainer(
        model=model,
        args=training_args,
        data_collator=data_collator,
        train_dataset=dataset,
    )

    trainer.train()
    trainer.save_model(output_dir)

# To run, prepare a 'train.txt' file with your custom data
# fine_tune_gpt2('train.txt', './gpt2-finetuned')


In [None]:
# 4. Simple RNN Text Generation Model in Keras

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, SimpleRNN, Embedding
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

texts = [
    "the cat is on the mat",
    "the dog is in the house",
    "the cat likes to sleep",
    "the dog likes to play"
]

tokenizer = Tokenizer()
tokenizer.fit_on_texts(texts)
total_words = len(tokenizer.word_index) + 1

input_sequences = []
for line in texts:
    token_list = tokenizer.texts_to_sequences([line])[0]
    for i in range(1, len(token_list)):
        n_gram_sequence = token_list[:i+1]
        input_sequences.append(n_gram_sequence)

max_seq_len = max([len(x) for x in input_sequences])
input_sequences = np.array(pad_sequences(input_sequences, maxlen=max_seq_len, padding='pre'))

X = input_sequences[:,:-1]
y = input_sequences[:,-1]

from tensorflow.keras.utils import to_categorical
y = to_categorical(y, num_classes=total_words)

model = Sequential()
model.add(Embedding(total_words, 10, input_length=max_seq_len-1))
model.add(SimpleRNN(50))
model.add(Dense(total_words, activation='softmax'))

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()

model.fit(X, y, epochs=100, verbose=1)

def generate_text(seed_text, next_words=5):
    for _ in range(next_words):
        token_list = tokenizer.texts_to_sequences([seed_text])[0]
        token_list = pad_sequences([token_list], maxlen=max_seq_len-1, padding='pre')
        predicted = model.predict(token_list, verbose=0)
        predicted_word_index = np.argmax(predicted, axis=1)[0]
        output_word = tokenizer.index_word.get(predicted_word_index, '')
        seed_text += " " + output_word
    return seed_text

print(generate_text("the cat", 5))

In [None]:
# 5. LSTM-based Text Generation Model in Keras

from tensorflow.keras.layers import LSTM

model = Sequential()
model.add(Embedding(total_words, 10, input_length=max_seq_len-1))
model.add(LSTM(100))
model.add(Dense(total_words, activation='softmax'))

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()

model.fit(X, y, epochs=100, verbose=1)

def generate_text_lstm(seed_text, next_words=5):
    for _ in range(next_words):
        token_list = tokenizer.texts_to_sequences([seed_text])[0]
        token_list = pad_sequences([token_list], maxlen=max_seq_len-1, padding='pre')
        predicted = model.predict(token_list, verbose=0)
        predicted_word_index = np.argmax(predicted, axis=1)[0]
        output_word = tokenizer.index_word.get(predicted_word_index, '')
        seed_text += " " + output_word
    return seed_text

print(generate_text_lstm("the dog", 5))

In [None]:
# 6. GPT-2 Story Generation with Hugging Face Transformers

from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer_gpt2 = GPT2Tokenizer.from_pretrained("gpt2")
model_gpt2 = GPT2LMHeadModel.from_pretrained("gpt2")

def generate_story(prompt, max_length=100):
    inputs = tokenizer_gpt2.encode(prompt, return_tensors="pt")
    # Greedy decoding; early_stopping only applies to beam search, so it is omitted here.
    outputs = model_gpt2.generate(
        inputs,
        max_length=max_length,
        num_return_sequences=1,
        no_repeat_ngram_size=2,
        pad_token_id=tokenizer_gpt2.eos_token_id,  # GPT-2 has no pad token; reuse EOS
    )
    story = tokenizer_gpt2.decode(outputs[0], skip_special_tokens=True)
    return story

prompt = "Once upon a time in a land far away,"
print(generate_story(prompt))

In [None]:
# 7. GRU-based Text Generation Model in Keras

from tensorflow.keras.layers import GRU

model = Sequential()
model.add(Embedding(total_words, 10, input_length=max_seq_len-1))
model.add(GRU(100))
model.add(Dense(total_words, activation='softmax'))

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()

model.fit(X, y, epochs=100, verbose=1)

def generate_text_gru(seed_text, next_words=5):
    for _ in range(next_words):
        token_list = tokenizer.texts_to_sequences([seed_text])[0]
        token_list = pad_sequences([token_list], maxlen=max_seq_len-1, padding='pre')
        predicted = model.predict(token_list, verbose=0)
        predicted_word_index = np.argmax(predicted, axis=1)[0]
        output_word = tokenizer.index_word.get(predicted_word_index, '')
        seed_text += " " + output_word
    return seed_text

print(generate_text_gru("the cat", 5))

In [None]:
# 8. GPT-2 Text Generation with Beam Search Decoding

def generate_text_beam_search(prompt, max_length=100, num_beams=5):
    inputs = tokenizer_gpt2.encode(prompt, return_tensors="pt")
    outputs = model_gpt2.generate(
        inputs,
        max_length=max_length,
        num_beams=num_beams,
        early_stopping=True,
        num_return_sequences=1
    )
    text = tokenizer_gpt2.decode(outputs[0], skip_special_tokens=True)
    return text

print(generate_text_beam_search("In a future world,", max_length=50))

In [None]:
# 9. GPT-2 Text Generation with Custom Temperature Setting

def generate_text_temperature(prompt, max_length=100, temperature=1.0):
    inputs = tokenizer_gpt2.encode(prompt, return_tensors="pt")
    outputs = model_gpt2.generate(
        inputs,
        max_length=max_length,
        temperature=temperature,
        do_sample=True,
        top_k=50,
        top_p=0.95,
        num_return_sequences=1
    )
    text = tokenizer_gpt2.decode(outputs[0], skip_special_tokens=True)
    return text

print(generate_text_temperature("The mysterious forest", temperature=0.7))
print(generate_text_temperature("The mysterious forest", temperature=1.5))

In [None]:
# 10. GPT-2 Temperature Sampling Experiment

temperatures = [0.3, 0.7, 1.0, 1.3]
prompt = "The future of AI is"

for temp in temperatures:
    print(f"Temperature: {temp}")
    print(generate_text_temperature(prompt, temperature=temp))
    print("-"*50)

In [None]:
# 11. Simple LSTM Text Generation Model

from tensorflow.keras.layers import LSTM

model = Sequential()
model.add(Embedding(total_words, 10, input_length=max_seq_len-1))
model.add(LSTM(100))
model.add(Dense(total_words, activation='softmax'))

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()

model.fit(X, y, epochs=100, verbose=1)

def generate_text_lstm(seed_text, next_words=5):
    for _ in range(next_words):
        token_list = tokenizer.texts_to_sequences([seed_text])[0]
        token_list = pad_sequences([token_list], maxlen=max_seq_len-1, padding='pre')
        predicted = model.predict(token_list, verbose=0)
        predicted_word_index = np.argmax(predicted, axis=1)[0]
        output_word = tokenizer.index_word.get(predicted_word_index, '')
        seed_text += " " + output_word
    return seed_text

print(generate_text_lstm("the dog", 5))

In [None]:
# 12. Custom Attention-based Text Generation Architecture in Keras

from tensorflow.keras.layers import Layer, Input, Dense, LSTM, Embedding
from tensorflow.keras.models import Model
import tensorflow.keras.backend as K

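class Attention(Layer):
    """Attention pooling: collapses a sequence of LSTM states into one context vector."""

    def __init__(self, **kwargs):
        super().__init__(**kwargs)

    def build(self, input_shape):
        # input_shape is (batch, timesteps, features).
        # W projects each timestep's features to a single unnormalized score.
        self.W = self.add_weight(name='attention_weight',
                                 shape=(input_shape[-1], 1),
                                 initializer='random_normal',
                                 trainable=True)
        # One bias per timestep, so the layer requires a fixed sequence length.
        self.b = self.add_weight(name='attention_bias',
                                 shape=(input_shape[1], 1),
                                 initializer='zeros',
                                 trainable=True)
        super().build(input_shape)

    def call(self, x):
        e = K.tanh(K.dot(x, self.W) + self.b)  # scores, shape (batch, timesteps, 1)
        a = K.softmax(e, axis=1)               # attention weights over timesteps
        output = x * a                         # weight each timestep's features
        return K.sum(output, axis=1)           # context vector, shape (batch, features)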

input_seq = Input(shape=(max_seq_len-1,))
embedding = Embedding(total_words, 10)(input_seq)
lstm_out = LSTM(100, return_sequences=True)(embedding)
attention_out = Attention()(lstm_out)
output = Dense(total_words, activation='softmax')(attention_out)

model = Model(inputs=input_seq, outputs=output)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()

model.fit(X, y, epochs=100, verbose=1)