<a href="https://colab.research.google.com/github/bushht/Assignments/blob/main/Assignment13.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Bushra Hoteit**

Github link:

**1. Dataset Preparation**

In [1]:
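# Download the Tiny Shakespeare corpus
import requests

url = "https://raw.githubusercontent.com/karpathy/char-rnn/master/data/tinyshakespeare/input.txt"
response = requests.get(url, timeout=30)
response.raise_for_status()  # fail early if the download did not succeed

# Keep the corpus in memory as a single string
text_data = response.text

print(text_data[:1000])  # Preview the first 1000 characters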



First Citizen:
Before we proceed any further, hear me speak.

All:
Speak, speak.

First Citizen:
You are all resolved rather to die than to famish?

All:
Resolved. resolved.

First Citizen:
First, you know Caius Marcius is chief enemy to the people.

All:
We know't, we know't.

First Citizen:
Let us kill him, and we'll have corn at our own price.
Is't a verdict?

All:
No more talking on't; let it be done: away, away!

Second Citizen:
One word, good citizens.

First Citizen:
We are accounted poor citizens, the patricians good.
What authority surfeits on would relieve us: if they
would yield us but the superfluity, while it were
wholesome, we might guess they relieved us humanely;
but they think we are too dear: the leanness that
afflicts us, the object of our misery, is as an
inventory to particularise their abundance; our
sufferance is a gain to them Let us revenge this with
our pikes, ere we become rakes: for the gods know I
speak this in hunger for bread, not in thirst for revenge.



**2. Exploring Generative Pre-trained Transformers (GPTs)**

**Model Architecture**

***Describe the architecture of GPTs, focusing on aspects such as the transformer model, attention mechanisms, and how these models are trained.***

GPTs are based on the Transformer architecture and use only the decoder part of the original Transformer.

Key components:

Input Embedding -> Converts tokens (words or subwords) into dense vector representations.

Positional Encoding -> Adds information about the position of each token, since transformers have no inherent notion of sequence order.

Multi-head Self-Attention -> Allows the model to focus on different parts of the input when predicting the next word.

Feedforward Layers -> Fully connected layers that transform the output of the attention layers.

Layer Normalization & Residual Connections -> Help stabilize and accelerate training.
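As a minimal sketch of the first two components, the snippet below looks up token embeddings and adds sinusoidal position encodings as in the original Transformer. The sizes, the random embedding table, and the function name `sinusoidal_positions` are illustrative assumptions; GPT models typically learn their position embeddings rather than using fixed sinusoids.

```python
import numpy as np

def sinusoidal_positions(seq_len, d_model):
    """Return a (seq_len, d_model) matrix of sinusoidal position encodings.

    Assumes d_model is even.
    """
    positions = np.arange(seq_len)[:, None]      # (seq_len, 1)
    dims = np.arange(0, d_model, 2)[None, :]     # even dimension indices
    angles = positions / np.power(10000.0, dims / d_model)
    encoding = np.zeros((seq_len, d_model))
    encoding[:, 0::2] = np.sin(angles)           # even dimensions
    encoding[:, 1::2] = np.cos(angles)           # odd dimensions
    return encoding

rng = np.random.default_rng(0)
vocab_size, d_model = 50, 8                      # toy sizes, for illustration only
embedding_table = rng.normal(size=(vocab_size, d_model))

token_ids = np.array([3, 17, 3, 42])             # token 3 appears at positions 0 and 2
token_vectors = embedding_table[token_ids]       # input embedding lookup
block_input = token_vectors + sinusoidal_positions(len(token_ids), d_model)

# The two copies of token 3 share an embedding but differ once positions are added.
print(np.allclose(token_vectors[0], token_vectors[2]))
print(np.allclose(block_input[0], block_input[2]))
```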

The self-attention mechanism allows GPT to attend to all previous tokens in the input sequence when predicting the next word.

GPT uses causal (masked) attention, which means it can only look at tokens to the left (previous words) when generating the next word.

Attention weights determine how much importance to give to each previous token when predicting the next.
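The masking described above can be sketched with a single attention head in NumPy. This is a simplified illustration, not GPT's actual implementation: the random projection matrices `w_q`, `w_k`, `w_v` and the toy sizes are assumptions, and real models use several heads plus learned weights.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def causal_self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product attention with a causal mask."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    seq_len = x.shape[0]
    # True above the diagonal: position i must not see positions j > i.
    future = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    weights = softmax(scores, axis=-1)           # each row sums to 1
    return weights @ v, weights

rng = np.random.default_rng(0)
seq_len, d_model = 4, 8                          # toy sizes
x = rng.normal(size=(seq_len, d_model))
w_q, w_k, w_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))

output, weights = causal_self_attention(x, w_q, w_k, w_v)
print(np.round(weights, 2))  # lower-triangular: zero weight on future tokens
```

Each row of `weights` holds the importance that one position assigns to itself and to the positions before it.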

GPTs are trained in a self-supervised way using next-token prediction: the training labels come from the text itself, so no manual annotation is needed. The model learns to predict the next token, one at a time, using maximum likelihood estimation.
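Written out, the maximum likelihood objective for a sequence of tokens $x_1, \dots, x_T$ is usually expressed as the negative log-likelihood below. The symbols ($\theta$ for the model parameters, $T$ for the sequence length) are notation introduced here for illustration.

$$
\mathcal{L}(\theta) = -\sum_{t=1}^{T} \log p_\theta\left(x_t \mid x_1, \dots, x_{t-1}\right)
$$

Minimizing $\mathcal{L}(\theta)$ is the same as minimizing the cross-entropy between the predicted next-token distribution and the token that actually follows.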

***Provide an overview of how GPTs generate text, including a discussion on tokenization, probabilities, and sequence generation.***

GPTs generate text in the following steps:

***Tokenization:***

Input text is split into tokens, for example words or subword units such as "un", "break", "able".

Tokens are then mapped to integers using a tokenizer.  
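A toy word-level version of this mapping might look like the sketch below. The helper names `build_vocab` and `encode` are hypothetical, and real GPT tokenizers use subword schemes such as byte-pair encoding rather than whitespace splitting.

```python
from collections import Counter

def build_vocab(text, oov_token="<OOV>"):
    """Assign integer ids by descending word frequency; id 1 is the OOV token."""
    counts = Counter(text.lower().split())
    vocab = {oov_token: 1}                 # id 0 is left free for padding
    for word, _ in counts.most_common():
        vocab[word] = len(vocab) + 1
    return vocab

def encode(text, vocab, oov_token="<OOV>"):
    """Map each word to its id, falling back to the OOV id for unknown words."""
    return [vocab.get(word, vocab[oov_token]) for word in text.lower().split()]

vocab = build_vocab("speak speak hear me speak")
ids = encode("hear me roar", vocab)
print(vocab)
print(ids)   # "roar" was never seen, so it maps to the OOV id
```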

***Sequence generation:***

Input tokens are passed through the model.

The model outputs a probability distribution over the next possible token.

The next token is selected using one of several strategies: greedy decoding (choose the most probable token), sampling (draw a token at random according to the probabilities), or top-k / top-p sampling (sample from a filtered set of the most probable tokens).

The selected token is appended to the sequence and the process repeats.
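The selection strategies listed above can be sketched as small functions over a probability vector. The function names and the example distribution are assumptions made for illustration.

```python
import numpy as np

def greedy(probs):
    """Pick the single most probable token id."""
    return int(np.argmax(probs))

def sample(probs, rng):
    """Draw a token id at random according to the probabilities."""
    return int(rng.choice(len(probs), p=probs))

def top_k_filter(probs, k):
    """Keep the k most probable tokens and renormalize."""
    keep = np.argsort(probs)[-k:]
    filtered = np.zeros_like(probs)
    filtered[keep] = probs[keep]
    return filtered / filtered.sum()

def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose total mass reaches p."""
    order = np.argsort(probs)[::-1]                    # most probable first
    cumulative = np.cumsum(probs[order])
    cutoff = int(np.searchsorted(cumulative, p)) + 1   # tokens needed to reach p
    keep = order[:cutoff]
    filtered = np.zeros_like(probs)
    filtered[keep] = probs[keep]
    return filtered / filtered.sum()

rng = np.random.default_rng(0)
probs = np.array([0.5, 0.3, 0.15, 0.05])

print(greedy(probs))
print(top_k_filter(probs, k=2))
print(top_p_filter(probs, p=0.9))
print(sample(top_k_filter(probs, k=2), rng))           # only ids 0 or 1 can appear
```

Greedy decoding is deterministic, while the sampling variants trade some likelihood for more varied output.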

***Probability Distribution:***

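Each step of generation involves computing a probability for every token in the vocabulary.
Under greedy decoding, the token with the highest probability is chosen as the next word; sampling-based methods instead draw the next token from this distribution.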
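To make the probability step concrete, the sketch below turns raw model scores (logits) into a distribution with a softmax. The `temperature` parameter and the example logits are assumptions added for illustration; a lower temperature concentrates probability on the top token.

```python
import numpy as np

def next_token_probs(logits, temperature=1.0):
    """Convert logits to probabilities with a temperature-scaled softmax."""
    scaled = np.asarray(logits, dtype=float) / temperature
    scaled -= scaled.max()                 # subtract the max for numerical stability
    exp = np.exp(scaled)
    return exp / exp.sum()

logits = [2.0, 1.0, 0.1]                   # made-up scores for three tokens

for temperature in (0.5, 1.0, 2.0):
    probs = next_token_probs(logits, temperature)
    print(temperature, np.round(probs, 3), "argmax:", int(np.argmax(probs)))
```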

**Training**

***Implement a basic text generation model using Python libraries such as TensorFlow (Keras, or PyTorch).***

In [2]:
# Lowercase the text and replace newlines with spaces
text = text_data.lower().replace('\n', ' ')

# Tokenization
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.preprocessing.sequence import pad_sequences
import numpy as np

tokenizer = Tokenizer(num_words=4000, oov_token="<OOV>")
tokenizer.fit_on_texts([text])
word_index = tokenizer.word_index
total_words = min(len(word_index) + 1, 4000)

# Generate input sequences
input_sequences = []
tokens = tokenizer.texts_to_sequences([text])[0]

for i in range(10, len(tokens)):
    seq = tokens[i-10:i+1]  # 10 words input + 1 word output
    input_sequences.append(seq)

input_sequences = np.array(input_sequences)
X, y = input_sequences[:, :-1], input_sequences[:, -1]
y = to_categorical(y, num_classes=total_words)


In [3]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

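model = Sequential([
    # Map each of the 10 input word ids to a 32-dimensional vector
    # (input_length is omitted: recent Keras versions infer it and warn if it is passed)
    Embedding(input_dim=total_words, output_dim=32),
    LSTM(64),  # summarize the 10-word context into one 64-dimensional state
    Dense(total_words, activation='softmax')  # probability of each vocabulary word
])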

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])




***Train the model on the selected dataset to generate text based on seed input.***

In [4]:
history = model.fit(X, y, epochs=10, batch_size=128, verbose=1)


Epoch 1/10
[1m1595/1595[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m27s[0m 16ms/step - accuracy: 0.0615 - loss: 6.4676
Epoch 2/10
[1m1595/1595[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m41s[0m 16ms/step - accuracy: 0.0735 - loss: 5.9975
Epoch 3/10
[1m1595/1595[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m41s[0m 16ms/step - accuracy: 0.0881 - loss: 5.8007
Epoch 4/10
[1m1595/1595[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m26s[0m 16ms/step - accuracy: 0.1029 - loss: 5.6475
Epoch 5/10
[1m1595/1595[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m28s[0m 18ms/step - accuracy: 0.1119 - loss: 5.5101
Epoch 6/10
[1m1595/1595[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m27s[0m 17ms/step - accuracy: 0.1145 - loss: 5.4264
Epoch 7/10
[1m1595/1595[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m27s[0m 17ms/step - accuracy: 0.1181 - loss: 5.3569
Epoch 8/10
[1m1595/1595[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m40s[0m 16ms/step - accuracy: 0.1199 - loss: 5.2893
Epoch 9/

In [5]:
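def generate_text(seed_text, next_words=20):
    """Greedily extend seed_text by next_words words using the trained model."""
    for _ in range(next_words):
        # Encode the running text; pad_sequences keeps only the last 10 word ids
        token_list = tokenizer.texts_to_sequences([seed_text])[0]
        token_list = pad_sequences([token_list], maxlen=10, padding='pre')
        predicted = model.predict(token_list, verbose=0)
        # Greedy decoding: take the single most probable word
        output_word = tokenizer.index_word[np.argmax(predicted)]
        seed_text += " " + output_word
    return seed_text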

# Example usage
print(generate_text("the king was"))


the king was <OOV> to the <OOV> of the <OOV> of the <OOV> of the <OOV> of the <OOV> of the <OOV> of
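The output above falls into a repeating "<OOV> of the" loop because greedy decoding always takes the top word, and the `<OOV>` token stands in for every rare word at once. One possible remedy, sketched below, is to block the padding and `<OOV>` ids and sample from the remaining probabilities. The helper `pick_word_index` is hypothetical and is not part of the notebook; it assumes the Keras `Tokenizer` convention that id 0 is reserved for padding and id 1 is the `oov_token`.

```python
import numpy as np

def pick_word_index(probs, blocked=(0, 1), temperature=1.0, rng=None):
    """Sample a word id from probs, never returning a blocked id.

    blocked defaults to (0, 1): padding and <OOV> under the assumed
    Keras Tokenizer convention.
    """
    if rng is None:
        rng = np.random.default_rng()
    blocked = list(blocked)
    probs = np.asarray(probs, dtype=float)
    logits = np.log(probs + 1e-12) / temperature   # temperature < 1 sharpens
    logits -= logits.max()
    weights = np.exp(logits)
    weights[blocked] = 0.0                         # remove padding and <OOV>
    weights /= weights.sum()
    return int(rng.choice(len(weights), p=weights))

# Toy distribution in which <OOV> (id 1) dominates, as in the run above.
toy_probs = np.array([0.0, 0.6, 0.25, 0.15])
rng = np.random.default_rng(0)
picks = [pick_word_index(toy_probs, rng=rng) for _ in range(10)]
print(picks)   # only ids 2 and 3 can appear
```

Inside `generate_text`, this would replace `np.argmax(predicted)` with something like `pick_word_index(predicted[0])`. Whether the resulting text reads better has not been tested here.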


**3. Application Demonstration**

***Describe and implement a small practical example demonstrating a content creation application using your trained model.***

We can use the model to build an application that helps authors write stories from an input they provide, such as a theme or a few opening lines.

Steps include:

* Prompting the user for a seed input.
* Using the trained model to generate a short continuation.
* Displaying the generated content to the user.

In [6]:

def creative_writing_assistant(seed_text, word_count=50):
    print(f"User Prompt: {seed_text}")
    print("\nGenerated Continuation:\n")
    generated = generate_text(seed_text, next_words=word_count)
    print(generated)


In [7]:
creative_writing_assistant("thou art more lovely")


User Prompt: thou art more lovely

Generated Continuation:

thou art more lovely <OOV> and <OOV> me to the <OOV> of the <OOV> of the <OOV> of the <OOV> of the <OOV> of the <OOV> of the <OOV> of the <OOV> of the <OOV> of the <OOV> of the <OOV> of the <OOV> of the <OOV> of the <OOV> of the <OOV> of


**4. Documentation and Report writing**

*   **Introduction to Generative AI and its significance.**

Generative AI refers to a class of artificial intelligence models that are capable of creating new content such as text, images, music, or code. Unlike traditional AI, which focuses on classification or prediction, generative AI synthesizes new data based on patterns it has learned during training.

Its significance lies in its wide-ranging applications: enhancing creativity, automating content creation, personalizing experiences, and even assisting in scientific discoveries. Popular tools like ChatGPT and Google Bard have demonstrated how generative AI can change how we interact with machines and consume information.

*   **Description of GPTs architecture and its functionality.**

(Covered in Section 2 above; summarized here.)

GPTs are a subclass of language models based on the Transformer architecture.

GPTs utilize:

Self-attention mechanisms to weigh the importance of different words in a sequence.

A layered architecture built from a stack of decoder-only Transformer blocks (the encoder half of the original encoder-decoder design is not used).

Tokenization, where input text is split into tokens (words or subwords) and converted to numerical format.

Autoregressive training, predicting the next token based on previous ones.

The model is trained on a massive amount of text data in a self-supervised manner and can be fine-tuned for specific tasks like summarization, translation, or text generation.

*   **Methodology and findings from the hands-on model implementation.**

I implemented a simple text generation model using TensorFlow/Keras, trained on the Tiny Shakespeare dataset. The steps included:

Data Preparation: Downloaded, tokenized, and converted text into sequences.

Model Architecture: A small LSTM model was used to learn word-level dependencies.

Training: I trained the model to predict the next word in a sequence, using categorical crossentropy loss and an Adam optimizer.

Text Generation: A function was implemented to generate new text based on a seed input.

Findings:

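The model learned some frequent word patterns: training accuracy rose from about 6% to about 12% over the logged epochs. With greedy decoding, however, the generated text collapses into a repeating "<OOV> of the" loop rather than Shakespeare-like prose.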

Accuracy improved with proper tokenization and moderate model depth.

Adding too many layers or long sequences caused memory issues in Colab.
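One memory cost that can be estimated from the numbers in this notebook is the one-hot label matrix built by `to_categorical`. The sketch below is plain arithmetic under stated assumptions: about 1595 batches of 128 samples (from the training log), a 4000-word vocabulary, and 4 bytes per value (float32; the figure doubles if the array is float64). The helper `label_memory_gib` is hypothetical.

```python
def label_memory_gib(num_samples, values_per_sample, bytes_per_value=4):
    """Approximate size in GiB of a dense label array."""
    return num_samples * values_per_sample * bytes_per_value / 1024**3

num_samples = 1595 * 128      # upper bound: 1595 batches of 128 samples
vocab_size = 4000

one_hot_gib = label_memory_gib(num_samples, vocab_size)  # one row of 4000 values per sample
integer_gib = label_memory_gib(num_samples, 1)           # one integer id per sample

print(f"one-hot labels: {one_hot_gib:.2f} GiB")
print(f"integer labels: {integer_gib:.4f} GiB")
```

Keras also provides a `sparse_categorical_crossentropy` loss that accepts integer labels directly, which would avoid building the one-hot matrix. That change was not tried in this notebook.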

*   **Applications of Generative AI and a practical demonstration.**

Generative AI has real-world applications in:

Creative writing and content generation, for example auto-generating news articles or poems.

Chatbots and virtual assistants (customer support).

Personalized learning materials and language tutoring.

Demonstration:
We built a small content creation tool that uses the trained model to continue a user's input prompt.

Example Output (from the run recorded in Section 3):
Input: "thou art more lovely"
Output: "thou art more lovely <OOV> and <OOV> me to the <OOV> of the <OOV> of the ..."

*   **Discussion on ethical considerations and potential solutions.**

Generative AI introduces several ethical concerns:

* Misinformation and deepfakes: Fake news and altered media can spread easily.
* Bias: Models can reflect or amplify societal biases from their training data.
* Job displacement: Especially in creative fields (writing, music, design).
* Ownership: Who owns AI-generated content?

Solutions include:

* Transparent datasets and documentation.
* Bias detection and mitigation techniques.
* Watermarking or traceability for AI-generated content.
* Ethical guidelines.

*   **Conclusion summarizing key insights and future perspectives in Generative AI.**

Generative AI, particularly transformer-based models like GPTs, has transformed the landscape of artificial intelligence. Our project showed that a small-scale LSTM model can be trained end to end and wired into a simple content creation tool, and also how far such a model remains from the fluency of large pre-trained transformers.

Future directions involve:

Exploring larger pre-trained models like GPT-2 or GPT-3.

Using transfer learning for better results with smaller datasets.

Integrating generative models into real-time applications responsibly.

Generative AI is a promising frontier, and with thoughtful development, it can bring tremendous value across industries while addressing critical ethical challenges.