
Test 1: Transformers Architecture Task

Objective:

Develop a simple transformer-based model to solve a specific problem, such as text classification, sentiment analysis, or language translation.

Task Description:

* Choose a public dataset relevant to the problem (like IMDb reviews for sentiment analysis).

* Implement a transformer model using an AI framework (e.g., TensorFlow or PyTorch).

* Train the model on the dataset and evaluate its performance.

* Document the process, including data preprocessing, model architecture, training process, and evaluation metrics.




In [None]:
import tensorflow as tf
from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense, Dropout

In [None]:
# Keep only the 10,000 most frequent words in the vocabulary
num_words = 10000
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=num_words)

In [None]:
# Pad or truncate every review to exactly 100 tokens
maxlen = 100
x_train = pad_sequences(x_train, maxlen=maxlen)
x_test = pad_sequences(x_test, maxlen=maxlen)
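
As a rough illustration of what the padding step does, the sketch below mimics the default behaviour of `pad_sequences` (`padding='pre'`, `truncating='pre'`) in plain Python. The helper name `pad_pre` is hypothetical and not part of Keras, and the sketch assumes integer lists, a positive `maxlen`, and a fill value of 0.

```python
def pad_pre(sequences, maxlen, value=0):
    """Left-pad short sequences and keep only the last `maxlen` items of long ones."""
    padded = []
    for seq in sequences:
        kept = list(seq)[-maxlen:]                            # truncate from the front
        padded.append([value] * (maxlen - len(kept)) + kept)  # pad at the front
    return padded


example = pad_pre([[5, 6], [1, 2, 3, 4, 5, 6]], maxlen=4)
print(example)  # [[0, 0, 5, 6], [3, 4, 5, 6]]
```

With these defaults a long review keeps its last 100 tokens, so the ending of the review is what the model sees.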

In [None]:
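def create_model():
    model = Sequential()
    # Map each word index to a 32-dimensional vector
    model.add(Embedding(num_words, 32, input_length=maxlen))
    # The first LSTM returns its full sequence so the second LSTM can consume it
    model.add(LSTM(32, return_sequences=True))
    model.add(LSTM(32))
    model.add(Dense(64, activation='relu'))
    model.add(Dropout(0.5))  # regularization against overfitting
    # A single sigmoid unit outputs the probability of a positive review
    model.add(Dense(1, activation='sigmoid'))
    return model

model = create_model()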
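
To get a feel for the size of this network, the parameter count can be worked out by hand. The sketch below assumes the standard LSTM layout of four gates, each with an input kernel, a recurrent kernel and a bias. The helper names are hypothetical, and `model.summary()` remains the authoritative figure.

```python
def lstm_params(input_dim, units):
    # 4 gates x (input kernel + recurrent kernel + bias)
    return 4 * (units * (input_dim + units) + units)


def dense_params(input_dim, units):
    # weight matrix + bias vector
    return input_dim * units + units


num_words, embed_dim = 10000, 32
counts = {
    "embedding": num_words * embed_dim,
    "lstm_1": lstm_params(embed_dim, 32),
    "lstm_2": lstm_params(32, 32),
    "dense_1": dense_params(32, 64),
    "dense_2": dense_params(64, 1),
}
total = sum(counts.values())
print(counts)
print(total)  # 338817
```

Under these assumptions the embedding table accounts for 320,000 of the 338,817 parameters, so the vocabulary size dominates the model size.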

In [None]:
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
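
For intuition about the loss chosen here, binary cross-entropy can be computed by hand on a couple of predictions. This is a minimal sketch of the formula, not the Keras implementation. The function name and the clipping constant `eps` are assumptions made for the illustration.

```python
import math


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over the batch."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


confident_right = binary_crossentropy([1, 0], [0.9, 0.1])
confident_wrong = binary_crossentropy([1, 0], [0.1, 0.9])
print(round(confident_right, 4), round(confident_wrong, 4))  # 0.1054 2.3026
```

The loss stays small when the predicted probability agrees with the label and grows quickly when the model is confidently wrong.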

In [None]:
model.fit(x_train, y_train, validation_data=(x_test, y_test), epochs=5, batch_size=64)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5



In [None]:
# evaluate returns [loss, accuracy] because 'accuracy' is the only metric
scores = model.evaluate(x_test, y_test, verbose=0)
print("Accuracy:%.2f%% " % (scores[1]*100))

Accuracy:83.01% 
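
The accuracy reported above is the fraction of reviews whose thresholded sigmoid output matches the label. The sketch below shows that calculation on four made-up predictions. The function name, the 0.5 threshold as an explicit argument, and the sample values are assumptions for illustration only.

```python
def binary_accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of examples where the thresholded probability equals the label."""
    predictions = [1 if p > threshold else 0 for p in y_prob]
    correct = sum(int(pred == y) for pred, y in zip(predictions, y_true))
    return correct / len(y_true)


# Hypothetical labels and predicted probabilities
acc = binary_accuracy([1, 0, 1, 1], [0.8, 0.3, 0.4, 0.9])
print("Accuracy:%.2f%%" % (acc * 100))  # Accuracy:75.00%
```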


* The IMDb reviews dataset is used for this assignment. It contains 50,000 movie reviews, each labelled as positive or negative and split evenly into 25,000 training and 25,000 test reviews, which makes it a common benchmark for sentiment analysis.

* The code first imports the required libraries and loads the dataset, keeping only the 10,000 most frequent words. Each review is then padded or truncated to exactly 100 tokens so that every input has the same length.

* The model is a recurrent network rather than a transformer: an embedding layer, two stacked LSTM layers, and two dense layers. The first LSTM layer sets `return_sequences=True` so that it passes its full output sequence to the second LSTM layer, which returns only its final output. A Dropout layer between the two dense layers reduces overfitting.

* The model is compiled with the binary cross-entropy loss, which suits binary classification problems, and the Adam optimizer.

* The `fit` method trains the model for 5 epochs and reports validation metrics on the test set after each epoch. The `evaluate` method then computes the final test accuracy, which is printed (83.01% in this run).

* The code could be improved with callbacks such as EarlyStopping, which stops training once a monitored metric (for example, the validation loss) stops improving.
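
The early-stopping idea from the last point can be sketched in plain Python. This is an illustration of the patience rule only, not the Keras callback itself. The function name and the validation losses are hypothetical.

```python
def early_stopping_epoch(val_losses, patience=2, min_delta=0.0):
    """Return the 1-based epoch at which training would stop, or None if it never stops."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            best = loss   # improvement: remember it and reset the counter
            wait = 0
        else:
            wait += 1     # no improvement this epoch
            if wait >= patience:
                return epoch
    return None


losses = [0.52, 0.41, 0.39, 0.40, 0.43, 0.47]  # hypothetical validation losses
stop_epoch = early_stopping_epoch(losses, patience=2)
print(stop_epoch)  # 5
```

In Keras the equivalent would be to pass `tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=2, restore_best_weights=True)` in the `callbacks` list of `model.fit`. Ideally the monitored data would be a validation split separate from the test set.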




In [None]:
%pip install torch transformers



In [None]:
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Load the pre-trained model and tokenizer once, not on every call
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def generate_text(prompt, max_length=50, num_return_sequences=1):
    # Tokenize the prompt; this also returns the attention mask
    inputs = tokenizer(prompt, return_tensors="pt")

    # Sample continuations; GPT-2 has no pad token, so reuse the EOS token
    output = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_length=max_length,
        num_return_sequences=num_return_sequences,
        no_repeat_ngram_size=2,
        do_sample=True,
        temperature=0.7,
        pad_token_id=tokenizer.eos_token_id,
    )

    # Decode every returned sequence, not only the first
    return [tokenizer.decode(seq, skip_special_tokens=True) for seq in output]

In [None]:
for generated_text in generate_text("Once upon a time in a small village"):
    print(generated_text)


Once upon a time in a small village, my daughter passed by a house. A few houses were in need of food, and she was told that she had no children, but that all her household was in danger. I went into the village and
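
The `generate` call above samples with `temperature=0.7` and `no_repeat_ngram_size=2`. The sketch below illustrates both ideas in plain Python, independent of the transformers library. Dividing the logits by a temperature below 1 sharpens the next-token distribution, and the n-gram rule bans any token that would repeat an n-gram already present in the output. The function names, logits and token ids are hypothetical, and this is not how the library implements these options internally.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Turn logits into probabilities after dividing them by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)                         # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def banned_next_tokens(generated, ngram_size=2):
    """Tokens that would complete an n-gram already present in `generated`."""
    if ngram_size < 1 or len(generated) + 1 < ngram_size:
        return set()
    # The last (ngram_size - 1) tokens form the start of the next n-gram
    prefix = tuple(generated[len(generated) - (ngram_size - 1):])
    banned = set()
    for i in range(len(generated) - ngram_size + 1):
        ngram = tuple(generated[i:i + ngram_size])
        if ngram[:-1] == prefix:
            banned.add(ngram[-1])
    return banned


logits = [2.0, 1.0, 0.1]                       # hypothetical next-token logits
p_default = softmax_with_temperature(logits, temperature=1.0)
p_sharp = softmax_with_temperature(logits, temperature=0.7)

random.seed(0)
next_token = random.choices(range(len(logits)), weights=p_sharp, k=1)[0]

print(p_default[0] < p_sharp[0])               # True: the top token gains probability
print(banned_next_tokens([7, 3, 9, 4, 7]))     # {3}: the bigram (7, 3) already occurred
```

The generated story above also stops mid-sentence because `max_length=50` counts the prompt tokens as well as the new ones.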
