## **Qwen 1.5-0.5B**

In [None]:
%%capture
!pip install datasets==2.21.0 transformers peft torch rouge-score nltk

In [None]:
%%capture
!pip install accelerate -U

In [None]:
%%capture
#Installs transformers, torch and huggingface_hub
!pip install transformers torch huggingface_hub

#AutoModelForCausalLM - loads models for causal language modeling tasks
#AutoTokenizer - tokenizes text data for the model
from transformers import AutoModelForCausalLM, AutoTokenizer

#transformers_stream_generator - text generation utility based on Hugging Face Transformers;
#  it returns a generator that streams out each token in real time during inference
#einops (Einstein Operations) - library for tensor manipulations
!pip install transformers_stream_generator einops

#tiktoken - BPE (Byte Pair Encoding) tokenizer for use with OpenAI's models; it splits text into tokens
!pip install tiktoken


In [None]:
import transformers
from datasets import load_dataset, load_metric, Dataset,DatasetDict

# **Define Model**

In [None]:
##Qwen1.5 model with 0.5 billion parameters, hosted on the Hugging Face model hub
#sModelName = "Qwen/Qwen2-0.5B"  ## "Qwen1.5-7B-Chat" and "Qwen/Qwen2-75B" crashed

model_name ="Qwen/Qwen1.5-0.5B"  ## "Qwen/Qwen-1.5-32B" #"Qwen/Qwen-0.5B"

In [None]:
#Initialize Tokenizer & Model

#Load the tokenizer
#trust_remote_code - Allows execution of code from the tokenizer files
bTrust_remote_code = True
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=bTrust_remote_code)
#tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

tokenizer.pad_token = tokenizer.eos_token #Reuse the end-of-sequence token for padding



In [None]:
###Sample - For Testing (disabled: kept inside a string literal) ######
'''
text = "What is a linear classifier?"
# input_text = f"Question: {data['question']}\nAnswer:"
inputs = f"Question: {text} \n Answer:"
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=128)
# Perform inference
outputs = model.generate(
    inputs['input_ids'],
    attention_mask=inputs['attention_mask'],
    max_length=256,
    # max_new_tokens=500,
    num_beams=8,
    early_stopping=True,
    repetition_penalty=.9
)
# Decode the generated token IDs to text
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print("Generated Text:", generated_text)
'''


### **Using AIML Q&A Content - Custom Data source**




In [None]:
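#Delete existing downloads/folders if any
import os, shutil
folder = "/content/group18_final_project"
#Skip the cleanup when the folder does not exist yet (e.g. on a fresh runtime)
for filename in (os.listdir(folder) if os.path.isdir(folder) else []):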
    file_path = os.path.join(folder, filename)
    try:
        if os.path.isfile(file_path) or os.path.islink(file_path):
            os.unlink(file_path)
        elif os.path.isdir(file_path):
            shutil.rmtree(file_path)
    except Exception as e:
        print('Failed to delete %s. Reason: %s' % (file_path, e))

In [None]:
#Fetch QnA data from Github
!git clone https://github.com/anukvma/group18_final_project.git

import os
import json
import pandas as pd
Path = "/content/group18_final_project/"

# Paths to the two CSV files containing the Q&A pairs
qna_csv_path = Path + "aiml_question_answers/AIML_QnA_Content/Group18_AIML_QA.csv"
sampled_csv_path = Path + "aiml_question_answers/sampled_qa_data.csv"
column_names = ['id','question','answer','unit']
dfQnAData = pd.read_csv(qna_csv_path, names=column_names, encoding='unicode_escape', header=0)
dfQnADataPart2 = pd.read_csv(sampled_csv_path, names=column_names, encoding='unicode_escape', header=0)

dfQnAData = pd.concat([dfQnAData, dfQnADataPart2])


Cloning into 'group18_final_project'...
remote: Enumerating objects: 305, done.
remote: Counting objects: 100% (135/135), done.
remote: Compressing objects: 100% (122/122), done.
remote: Total 305 (delta 82), reused 27 (delta 13), pack-reused 170 (from 1)
Receiving objects: 100% (305/305), 6.02 MiB | 7.34 MiB/s, done.
Resolving deltas: 100% (167/167), done.


In [None]:
dfQnAData.head()

Unnamed: 0,id,question,answer,unit
0,1.0,What is a linear classifier?,A linear classifier is a model that makes pred...,1.0
1,2.0,How does a linear classifier make predictions?,A linear classifier predicts by calculating th...,1.0
2,3.0,What is the objective function in a linear cla...,The objective function often used is the loss ...,1.0
3,4.0,What is gradient descent?,Gradient descent is an optimization algorithm ...,1.0
4,5.0,How does learning rate affect gradient descent?,The learning rate controls the step size in gr...,1.0


In [None]:
#Data - Cleanup
dfQnAData.dropna(axis=0, inplace=True)
dfQnAData.isna().sum()
dfQnAData = dfQnAData.sample(frac=1).reset_index(drop=True)

dfQnAData.head()

Unnamed: 0,id,question,answer,unit
0,6.0,Is there a difference between underfitting and...,"Yes, underfitting occurs when a model is too s...",1.0
1,113.0,`What does recall measure in model evaluation?,Recall measures the ratio of true positives (T...,2.0
2,379.0,How does the softmax layer turn the scores int...,The softmax layer turns the scores into probab...,3.0
3,213.0,What are the limitations of a single-layer Per...,The main limitation of a single-layer Perceptr...,3.0
4,30.0,What is a decision tree?,A decision tree is a supervised learning algor...,1.0


In [None]:
medium_datasets = DatasetDict()
medium_datasets

df = dfQnAData.copy()

train_dataset: Dataset = Dataset.from_pandas(df[:800])
validation_dataset: Dataset = Dataset.from_pandas(df[800:900])
test_dataset: Dataset = Dataset.from_pandas(df[900:])

train_dataset

Dataset({
    features: ['id', 'question', 'answer', 'unit'],
    num_rows: 800
})

In [None]:
#Collate split datasets into DatasetDict
medium_datasets["train"] = train_dataset
medium_datasets["validation"] = validation_dataset
medium_datasets["test"] = test_dataset

print("\n")
medium_datasets





DatasetDict({
    train: Dataset({
        features: ['id', 'question', 'answer', 'unit'],
        num_rows: 800
    })
    validation: Dataset({
        features: ['id', 'question', 'answer', 'unit'],
        num_rows: 100
    })
    test: Dataset({
        features: ['id', 'question', 'answer', 'unit'],
        num_rows: 127
    })
})

In [None]:
sModelName = model_name
sModelName
model_name

'Qwen/Qwen1.5-0.5B'

In [None]:
##To display summary
!pip install torchinfo

from torchinfo import summary
summary(model)

Collecting torchinfo
  Downloading torchinfo-1.8.0-py3-none-any.whl.metadata (21 kB)
Downloading torchinfo-1.8.0-py3-none-any.whl (23 kB)
Installing collected packages: torchinfo
Successfully installed torchinfo-1.8.0


Layer (type:depth-idx)                                  Param #
Qwen2ForCausalLM                                        --
├─Qwen2Model: 1-1                                       --
│    └─Embedding: 2-1                                   155,582,464
│    └─ModuleList: 2-2                                  --
│    │    └─Qwen2DecoderLayer: 3-1                      12,850,176
│    │    └─Qwen2DecoderLayer: 3-2                      12,850,176
│    │    └─Qwen2DecoderLayer: 3-3                      12,850,176
│    │    └─Qwen2DecoderLayer: 3-4                      12,850,176
│    │    └─Qwen2DecoderLayer: 3-5                      12,850,176
│    │    └─Qwen2DecoderLayer: 3-6                      12,850,176
│    │    └─Qwen2DecoderLayer: 3-7                      12,850,176
│    │    └─Qwen2DecoderLayer: 3-8                      12,850,176
│    │    └─Qwen2DecoderLayer: 3-9                      12,850,176
│    │    └─Qwen2DecoderLayer: 3-10                     12,850,176
│    │    └─Qwen2Deco

In [None]:
##Format data before mapping into tokenised dataset
DefaultPrefix = "Please answer the AIML question: "

max_input_length = 128
max_target_length = 128
tokenizer.pad_token= tokenizer.eos_token

def format_data(examples):
    inputs = [q + "\n" + a for q, a in zip(examples['question'], examples['answer'])]
    model_inputs = tokenizer(inputs, max_length=max_input_length, truncation=True, padding="max_length")
    labels = model_inputs['input_ids'].copy()
    model_inputs['labels'] = labels
    return model_inputs

tokenized_datasets = medium_datasets.map(format_data, batched=True)
tokenized_datasets

Map:   0%|          | 0/800 [00:00<?, ? examples/s]

Map:   0%|          | 0/100 [00:00<?, ? examples/s]

Map:   0%|          | 0/127 [00:00<?, ? examples/s]

DatasetDict({
    train: Dataset({
        features: ['id', 'question', 'answer', 'unit', 'input_ids', 'attention_mask', 'labels'],
        num_rows: 800
    })
    validation: Dataset({
        features: ['id', 'question', 'answer', 'unit', 'input_ids', 'attention_mask', 'labels'],
        num_rows: 100
    })
    test: Dataset({
        features: ['id', 'question', 'answer', 'unit', 'input_ids', 'attention_mask', 'labels'],
        num_rows: 127
    })
})
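
Because `pad_token` is set to `eos_token` and `labels` is a plain copy of `input_ids`, the loss in `format_data` is also computed over the padded positions. A common variant replaces those positions with -100, the index that PyTorch's cross-entropy loss ignores by default. The helper below is a minimal sketch of that idea on plain Python lists; `mask_padding_labels` and `IGNORE_INDEX` are hypothetical names that do not appear in the notebook, and the results reported in this notebook were produced without this masking.

```python
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss

def mask_padding_labels(input_ids, attention_mask, ignore_index=IGNORE_INDEX):
    """Copy input_ids into labels, replacing padded positions with ignore_index."""
    labels = []
    for ids, mask in zip(input_ids, attention_mask):
        labels.append([tok if keep == 1 else ignore_index for tok, keep in zip(ids, mask)])
    return labels

# Toy batch: the second sequence has two padded positions (attention_mask == 0)
batch_ids = [[11, 12, 13, 14], [21, 22, 99, 99]]
batch_mask = [[1, 1, 1, 1], [1, 1, 0, 0]]
print(mask_padding_labels(batch_ids, batch_mask))
```

Inside `format_data` this could be wired in as `model_inputs['labels'] = mask_padding_labels(model_inputs['input_ids'], model_inputs['attention_mask'])`. Any code that later decodes the labels would first need to replace -100 with a valid token ID.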

**LoRA**

In [None]:
%%capture
!pip install peft
from peft import LoraConfig, get_peft_model

In [None]:
##Config LoRa
lora_config = LoraConfig(
    r=4,  # Rank of the low-rank adaptation matrix
    lora_alpha=16,  # Scaling factor for the low-rank adaptation
    lora_dropout=0.1,  # Dropout for regularization
    bias="none",  # No bias adjustment
    task_type="CAUSAL_LM"  # Task type for GPT-like models
)
lora_config

LoraConfig(peft_type=<PeftType.LORA: 'LORA'>, auto_mapping=None, base_model_name_or_path=None, revision=None, task_type='CAUSAL_LM', inference_mode=False, r=4, target_modules=None, lora_alpha=16, lora_dropout=0.1, fan_in_fan_out=False, bias='none', use_rslora=False, modules_to_save=None, init_lora_weights=True, layers_to_transform=None, layers_pattern=None, rank_pattern={}, alpha_pattern={}, megatron_config=None, megatron_core='megatron.core', loftq_config={}, use_dora=False, layer_replication=None, runtime_config=LoraRuntimeConfig(ephemeral_gpu_offload=False))

In [None]:
# Apply LoRA to the model
model = get_peft_model(model, lora_config)

model.print_trainable_parameters() #Qwen's

trainable params: 393,216 || all params: 464,380,928 || trainable%: 0.0847
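
The trainable-parameter count printed above can be reproduced by hand. LoRA adds two small matrices per adapted linear layer, A of shape (r, d_in) and B of shape (d_out, r), so each adapted layer contributes r * (d_in + d_out) parameters. The sketch below assumes values that are not stated in the notebook: that PEFT's default targets for this architecture are the `q_proj` and `v_proj` projections (since `target_modules=None`), that the model has 24 decoder layers, and that both projections map 1024 to 1024 features.

```python
# Assumed values (not stated in the notebook): architecture figures for Qwen1.5-0.5B
# and PEFT's default LoRA target modules for this model family.
num_layers = 24              # decoder layers
hidden_size = 1024           # input and output width of q_proj and v_proj
targets_per_layer = 2        # q_proj and v_proj
r = 4                        # LoRA rank, taken from lora_config

# Each adapted layer gets A (r x d_in) and B (d_out x r)
params_per_module = r * (hidden_size + hidden_size)
trainable = num_layers * targets_per_layer * params_per_module

all_params = 464_380_928     # "all params" as reported by print_trainable_parameters()
print(f"trainable params: {trainable:,}")
print(f"trainable%: {100 * trainable / all_params:.4f}")
```

Under these assumptions the arithmetic matches the reported 393,216 trainable parameters, which suggests only the two attention projections per layer are being adapted.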


In [None]:
######## NOT TO EXECUTE ##########
###LoRA for Qwen 7B Chat
# from peft import TaskType
# config = LoraConfig(
#     task_type=TaskType.CAUSAL_LM,
#     target_modules=["c_attn", "c_proj", "w1", "w2"],
#     inference_mode=False, #Training mode
#     r=8, # Lora rank
#     lora_alpha=32, # LoRA alpha (scaling factor for the low-rank update)
#     lora_dropout=0.1# Dropout proportion
# )

# config  #Config not yet applied to model

LoraConfig(peft_type=<PeftType.LORA: 'LORA'>, auto_mapping=None, base_model_name_or_path=None, revision=None, task_type=<TaskType.CAUSAL_LM: 'CAUSAL_LM'>, inference_mode=False, r=8, target_modules={'w2', 'c_attn', 'w1', 'c_proj'}, lora_alpha=32, lora_dropout=0.1, fan_in_fan_out=False, bias='none', use_rslora=False, modules_to_save=None, init_lora_weights=True, layers_to_transform=None, layers_pattern=None, rank_pattern={}, alpha_pattern={}, megatron_config=None, megatron_core='megatron.core', loftq_config={}, use_dora=False, layer_replication=None, runtime_config=LoraRuntimeConfig(ephemeral_gpu_offload=False))

###### **Fine-Tune model - Define training arguments**

In [None]:
#Remove the training output folder if it exists
model_dir = "./Qwen-QAResults"  #Same path as output_dir in TrainingArguments
!rm -rf {model_dir}


In [None]:
#Fine-tune the model
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./Qwen-QAResults",
    overwrite_output_dir=True,
    ##Evaluation
    evaluation_strategy="steps",   #Renamed to eval_strategy in newer transformers releases
    eval_steps=100,
    ##Logging
    logging_strategy="steps",
    logging_steps=100, #100

    num_train_epochs=4,    ##Epochs

    #Small batch sizes to fit in GPU memory
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,

    gradient_accumulation_steps=4,  #Effective train batch size per device: 2 x 4 = 8
    save_steps=100, #500
    save_total_limit=2,

    #gradient_checkpointing=True, ##Ro
    #save_on_each_node=True,  ##Ro
    #learning_rate=1e-4, #2e-4   ##Ro

    fp16=True,  # Mixed precision training for efficiency
    report_to="none",
    dataloader_pin_memory=True
)

training_args.eval_batch_size



2
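
The batch settings in `training_args` determine how many optimizer steps one epoch takes, which is the unit that `eval_steps`, `logging_steps` and `save_steps` count in. The sketch below works through that arithmetic for the 800-row training split. It assumes a single GPU, and it uses a simple ceiling division that approximates how `Trainer` counts steps rather than reproducing its exact logic.

```python
import math

train_rows = 800                    # size of the train split
per_device_train_batch_size = 2
gradient_accumulation_steps = 4
num_devices = 1                     # assumption: a single GPU

# Samples consumed per optimizer step
effective_batch = per_device_train_batch_size * gradient_accumulation_steps * num_devices
steps_per_epoch = math.ceil(train_rows / effective_batch)

for epochs in (4, 10):
    print(epochs, "epochs ->", steps_per_epoch * epochs, "optimizer steps")
```

With 100 steps per epoch, `eval_steps=100` amounts to one evaluation per epoch.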

In [None]:
######## NOT TO EXECUTE #########
#### Qwen - Chat Training Args
# args = TrainingArguments(
#     output_dir="./output/Qwen",
#     per_device_train_batch_size=8,
#     gradient_accumulation_steps=2,
#     logging_steps=10,
#     num_train_epochs=3,
#     gradient_checkpointing=True,
#     save_steps=100,
#     learning_rate=1e-4,
#     save_on_each_node=True
# )

In [None]:
training_args.device

device(type='cuda', index=0)



###### **Rouge**

In [None]:
##ROUGE metric  -- (referred Anu's)

#!pip install datasets==2.21.0 transformers peft torch rouge-score nltk
import numpy as np

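#load_metric is deprecated (removed in datasets 3.x); evaluate.load("rouge") is the replacement,
#but it returns plain floats by default, so the .mid.fmeasure accesses below would need to change
rouge = load_metric("rouge")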
#rouge = load_metric("./rouge.py")

def compute_metrics(eval_pred):
    predictions, labels = eval_pred
    # Convert logits to token IDs by taking argmax along the vocabulary axis
    predictions = np.argmax(predictions, axis=-1)  # Get the index of the highest logit (token ID)
    decoded_preds = []
    decoded_labels = []

    for pred, label in zip(predictions, labels):
        # Decode the token IDs (skip special tokens)
        decoded_preds.append(tokenizer.decode(pred, skip_special_tokens=True))
        decoded_labels.append(tokenizer.decode(label, skip_special_tokens=True))

    # Now compute the ROUGE or other metrics
    rouge_scores = rouge.compute(predictions=decoded_preds, references=decoded_labels, use_stemmer=True)

    # return {k: v for k, v in rouge_scores.items()}
    rouge1 = rouge_scores['rouge1'].mid.fmeasure
    rouge2 = rouge_scores['rouge2'].mid.fmeasure
    rougeL = rouge_scores['rougeL'].mid.fmeasure
    rougeLsum = rouge_scores['rougeLsum'].mid.fmeasure
    print(rouge_scores)
    return {
        "rouge1": rouge1,
        "rouge2": rouge2,
        "rougeL": rougeL,
        "rougeLsum": rougeLsum
    }
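
In a causal language model the logits at position t score the token at position t+1, so comparing the argmax at position t with the label at position t, as `compute_metrics` does, is off by one. The sketch below shows one way to align the two arrays before decoding, and to replace ignored label positions so a tokenizer can decode them. `align_for_decoding` is a hypothetical helper, not part of the notebook, and the ROUGE figures reported here were computed without it.

```python
import numpy as np

def align_for_decoding(logits, labels, pad_token_id, ignore_index=-100):
    """Shift predictions and labels so position t is compared with token t+1."""
    pred_ids = np.argmax(logits, axis=-1)[:, :-1]   # prediction made at t targets t+1
    target_ids = labels[:, 1:]                      # token actually found at t+1
    ignored = target_ids == ignore_index
    # Replace ignored positions with the pad id so both arrays can be decoded
    pred_ids = np.where(ignored, pad_token_id, pred_ids)
    target_ids = np.where(ignored, pad_token_id, target_ids)
    return pred_ids, target_ids

# Toy example: vocabulary of 5 tokens, one sequence of length 4
labels = np.array([[1, 2, 3, -100]])
logits = np.zeros((1, 4, 5))
logits[0, 0, 2] = 1.0   # at position 0 the model predicts token 2
logits[0, 1, 3] = 1.0   # at position 1 the model predicts token 3
logits[0, 2, 4] = 1.0   # at position 2 the model predicts token 4 (its target is ignored)
preds, targets = align_for_decoding(logits, labels, pad_token_id=0)
print(preds.tolist(), targets.tolist())
```

Even when aligned, these predictions come from teacher forcing (each step sees the true previous tokens), so the resulting ROUGE is likely to be more optimistic than scores computed on text produced by `model.generate`.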

###### **Trainer**

In [None]:
from transformers import Trainer
trainer = Trainer(
    model,
    args=training_args,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["validation"],
    #data_collator=data_collator,
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
)




In [None]:
####### NOT TO EXECUTE ########
## Trainer - Qwen Chat
# trainer = Trainer(
#     model=model,
#     args=args,
#     train_dataset=tokenized_id,
#     data_collator=DataCollatorForSeq2Seq(tokenizer=tokenizer, padding=True),
# )

In [None]:
import torch
torch.cuda.empty_cache()

In [None]:
trainer.model.name_or_path

'Qwen/Qwen1.5-0.5B'

In [None]:
#Train the model
#trainer.train() ##For 4 epochs

In [None]:
#Train the model
trainer.train() ##For 10 epochs (run with num_train_epochs=10)

Step,Training Loss,Validation Loss,Rouge1,Rouge2,Rougel,Rougelsum
100,0.7891,0.869983,0.583127,0.288687,0.517443,0.543784
200,0.776,0.868619,0.581023,0.287232,0.516865,0.541515
300,0.7652,0.869962,0.580415,0.287059,0.515609,0.541083
400,0.7557,0.87156,0.579108,0.282144,0.513493,0.53947
500,0.7478,0.872057,0.581302,0.283543,0.513005,0.539459
600,0.7402,0.874039,0.579512,0.285632,0.514508,0.53907
700,0.7345,0.87544,0.580619,0.284905,0.512764,0.539212
800,0.7296,0.876742,0.57893,0.283661,0.511733,0.537662
900,0.726,0.876756,0.580236,0.284251,0.512175,0.538475
1000,0.7237,0.877188,0.580556,0.284066,0.512114,0.538413


{'rouge1': AggregateScore(low=Score(precision=0.5747243776462728, recall=0.5480731470345724, fmeasure=0.5605286969083704), mid=Score(precision=0.5976910878793631, recall=0.5696375547040702, fmeasure=0.5831267190591403), high=Score(precision=0.6179041534485458, recall=0.5905078448897733, fmeasure=0.603557673499586)), 'rouge2': AggregateScore(low=Score(precision=0.271227007400957, recall=0.2583053105866779, fmeasure=0.2642994186404237), mid=Score(precision=0.2954290752954044, recall=0.2820929969795532, fmeasure=0.28868689829992566), high=Score(precision=0.32197681344928575, recall=0.3072728334583882, fmeasure=0.3143222750254811)), 'rougeL': AggregateScore(low=Score(precision=0.5060926521033818, recall=0.4810656469273232, fmeasure=0.49307419843010697), mid=Score(precision=0.5300421451567987, recall=0.5058310715597052, fmeasure=0.5174432252843595), high=Score(precision=0.5524593542328222, recall=0.5277167316006552, fmeasure=0.5396212149607584)), 'rougeLsum': AggregateScore(low=Score(precis

TrainOutput(global_step=1000, training_loss=0.7487754135131836, metrics={'train_runtime': 550.6178, 'train_samples_per_second': 14.529, 'train_steps_per_second': 1.816, 'total_flos': 1897257762816000.0, 'train_loss': 0.7487754135131836, 'epoch': 10.0})
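
To make the Rouge1 column easier to interpret, the sketch below computes a simplified ROUGE-1 by hand: precision, recall and F-measure over overlapping unigrams. It is an illustration only. It tokenizes on whitespace and skips the stemming that the `rouge` metric applies with `use_stemmer=True`, so its numbers will not match the library exactly. The function name and the example sentences are made up.

```python
from collections import Counter

def rouge1(prediction, reference):
    """Simplified ROUGE-1: unigram overlap precision, recall and F-measure."""
    pred_counts = Counter(prediction.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Clipped overlap: each word counts at most as often as it appears in both texts
    overlap = sum((pred_counts & ref_counts).values())
    precision = overlap / max(sum(pred_counts.values()), 1)
    recall = overlap / max(sum(ref_counts.values()), 1)
    if overlap == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)

prediction = "gradient descent is an optimization algorithm"
reference = "gradient descent is an iterative optimization algorithm"
p, r, f = rouge1(prediction, reference)
print(f"precision={p:.3f} recall={r:.3f} f={f:.3f}")
```

Here every predicted word appears in the reference (precision 1.0), but one reference word is missing from the prediction (recall 6/7).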

In [None]:
#Train the model   --   trainer.train()   ##For 4 epochs

Step,Training Loss,Validation Loss,Rouge1,Rouge2,Rougel,Rougelsum
100,0.8198,0.876614,0.57913,0.285195,0.515231,0.540708
200,0.8056,0.873353,0.577779,0.285334,0.514448,0.539212
300,0.796,0.872431,0.581224,0.289494,0.519052,0.543176
400,0.7903,0.871949,0.580506,0.288864,0.517543,0.54228


{'rouge1': AggregateScore(low=Score(precision=0.573539648238829, recall=0.5446628686230633, fmeasure=0.5588216204537829), mid=Score(precision=0.5943665590302387, recall=0.5652121249152942, fmeasure=0.5791297859972537), high=Score(precision=0.613762127926485, recall=0.5851423597066432, fmeasure=0.5990854820084511)), 'rouge2': AggregateScore(low=Score(precision=0.2668842689644897, recall=0.25377411064978483, fmeasure=0.25983773442586167), mid=Score(precision=0.29240090768271787, recall=0.2784689630188216, fmeasure=0.285195184249181), high=Score(precision=0.3186558366103516, recall=0.3036258639297467, fmeasure=0.3108349516907347)), 'rougeL': AggregateScore(low=Score(precision=0.5051538117172836, recall=0.4790360521668064, fmeasure=0.4916462682640109), mid=Score(precision=0.5288708114760188, recall=0.5026978388216629, fmeasure=0.5152305992578026), high=Score(precision=0.5504110270294014, recall=0.5248383088858483, fmeasure=0.5370803740928092)), 'rougeLsum': AggregateScore(low=Score(precisi

TrainOutput(global_step=400, training_loss=0.8029491424560546, metrics={'train_runtime': 222.5179, 'train_samples_per_second': 14.381, 'train_steps_per_second': 1.798, 'total_flos': 758903105126400.0, 'train_loss': 0.8029491424560546, 'epoch': 4.0})

In [None]:
trainer.save_model()

In [None]:
device="cuda"

In [None]:
def ask_question(question):
    inputs = tokenizer.encode('Q: ' + question + ' A:', return_tensors='pt').to(device)
    attention_mask = torch.ones(inputs.shape, device=device)
    outputs = model.generate(inputs, attention_mask = attention_mask, max_new_tokens=200, num_return_sequences=1)
    gen_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
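    #Split on the first ' A:' only, so an answer that itself contains ' A:' does not break the unpacking
    question, answer = gen_text.split(' A:', 1)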
    return question, answer

print(ask_question("What is the difference between CNN and RNN?"))
print("\n")
print(ask_question("What is K-means clustering?"))
print("\n")
print(ask_question("What is Backpropagation?"))
print("\n")
print(ask_question("What is a confusion matrix?"))
print("\n")
print(ask_question("What is the difference between concatenation vs. summation of two tensors?"))
print("\n")
print(ask_question("What are the other applications of unsupervised learning than clustering?"))
print("\n")
print(ask_question("What is the linear classifier?"))

('Q: What is the difference between CNN and RNN?', ' CNNs are used for image classification, while RNNs are used for sequence prediction.')


('Q: What is K-means clustering?', ' K-means clustering is a technique used to group similar data points together. It involves assigning each data point to the nearest cluster centroid, and then iteratively adjusting the centroids until the clusters are well-defined.')


('Q: What is Backpropagation?', ' Backpropagation is a technique used to train neural networks by minimizing the error between the predicted output and the actual output. It involves propagating the error backwards through the network, adjusting the weights of the neurons to minimize the error.')


('Q: What is a confusion matrix?', ' A confusion matrix is a table that shows the number of true positives, true negatives, false positives, and false negatives for a classification problem. It is used to evaluate the performance of a model.')


('Q: What is the difference between conc
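
The training examples were formatted as the question, a newline, then the answer, whereas `ask_question` prompts the model with `'Q: ... A:'`. Building both from one shared template is one way to keep training and inference consistent. The sketch below is an illustration only: `PROMPT_TEMPLATE`, `build_prompt`, `build_training_text` and `split_answer` are hypothetical names, and the notebook's results were not produced with them.

```python
PROMPT_TEMPLATE = "Q: {question} A:"   # hypothetical shared template

def build_prompt(question):
    """Prompt used at inference time."""
    return PROMPT_TEMPLATE.format(question=question.strip())

def build_training_text(question, answer):
    """Training text: the same prompt followed by the reference answer."""
    return build_prompt(question) + " " + answer.strip()

def split_answer(generated_text):
    """Split generated text at the first ' A:'; partition never raises."""
    question, _, answer = generated_text.partition(" A:")
    return question, answer.strip()

text = build_training_text("What is a decision tree?", "A supervised learning algorithm.")
print(text)
print(split_answer(text))
```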

In [None]:
sQuestion, sAnswer = ask_question("What is K-means clustering?")
sQuestion, "Answer: " + sAnswer


('Q: What is K-means clustering?',
 'Answer:  K-means clustering is a technique used to group similar data points together. It involves assigning each data point to the nearest cluster centroid, and then iteratively adjusting the centroids until the clusters are well-defined.')

In [None]:
sQuestion, sAnswer = ask_question("What is the difference between CNN and RNN?")
sQuestion, "Answer: " + sAnswer

('Q: What is the difference between CNN and RNN?',
 'Answer:  CNNs are used for image classification, while RNNs are used for sequence prediction.')

In [None]:

sQuestion, sAnswer = ask_question("What is Backpropagation?")
sQuestion, "Answer: " + sAnswer

('Q: What is Backpropagation?',
 'Answer:  Backpropagation is a technique used to train neural networks by minimizing the error between the predicted output and the actual output. It involves propagating the error backwards through the network, adjusting the weights of the neurons to minimize the error.')

In [None]:
sQuestion, sAnswer = ask_question("What is a confusion matrix?")
sQuestion, "Answer: " + sAnswer

('Q: What is a confusion matrix?',
 'Answer:  A confusion matrix is a table that shows the number of true positives, true negatives, false positives, and false negatives for a classification problem. It is used to evaluate the performance of a model.')

**############################   End of Q&A using Qwen ###############################**

# **Qwen -- Samples for different scenarios**

**Test Generation**

In [None]:
prompt = "What is artificial intelligence"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text)

What is artificial intelligence?
Artificial intelligence (AI) is a branch of computer science that deals with the development of intelligent machines that can perform tasks that typically require human intelligence, such as visual perception, speech recognition, decision-making, and language translation. AI is based on


**Creative writing**

In [None]:
prompt = "Write a short poem about the changing seasons:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100, temperature=0.7)
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text)

Setting `pad_token_id` to `eos_token_id`:151643 for open-end generation.


Write a short poem about the changing seasons: Autumn, Winter, Spring, Summer, Fall, Winter, Spring, Summer, Fall, Winter, Spring, Summer, Fall, Winter, Spring, Summer, Fall, Winter, Spring, Summer, Fall, Winter, Spring, Summer, Fall, Winter, Spring, Summer, Fall, Winter, Spring, Summer, Fall, Winter, Spring, Summer, Fall, Winter, Spring, Summer, Fall, Winter, Spring, Summer, Fall, Winter, Spring, Summer, Fall, Winter,
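
The looping output above is typical of greedy decoding. In `transformers`, `temperature` only takes effect when sampling is enabled (`do_sample=True`); unless the model's generation config already enables sampling, the value passed here is ignored. The sketch below illustrates what temperature does once sampling is on: logits are divided by the temperature before the softmax, so low values sharpen the distribution and high values flatten it. The logits are made up for illustration.

```python
import numpy as np

def temperature_softmax(logits, temperature):
    """Turn logits into probabilities after dividing by the temperature."""
    scaled = np.asarray(logits, dtype=float) / temperature
    scaled -= scaled.max()            # subtract the max for numerical stability
    exp = np.exp(scaled)
    return exp / exp.sum()

logits = [2.0, 1.0, 0.5]              # made-up scores for three candidate tokens
for t in (0.2, 0.7, 1.0):
    print(t, np.round(temperature_softmax(logits, t), 3))
```

A call such as `model.generate(**inputs, max_new_tokens=100, do_sample=True, temperature=0.7)` would then sample from the reshaped distribution instead of always picking the top token.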


**Code generation**

In [None]:
prompt = "Write a Python function to calculate the Fibonacci sequence:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=200, temperature=0.2)
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text)

Setting `pad_token_id` to `eos_token_id`:151643 for open-end generation.


Write a Python function to calculate the Fibonacci sequence: n. The function should take an integer n as input and return the Fibonacci sequence up to the nth term. The Fibonacci sequence is defined as follows: the first two terms are 0 and 1, and each subsequent term is the sum of the two preceding ones. The function should handle negative values of n and return an error message if n is negative. Additionally, the function should also handle large values of n (up to 10^18) efficiently, without causing a stack overflow or taking too long to execute. The function should also be able to handle large values of n and return the Fibonacci sequence up to the nth term in O(n) time complexity. The function should also be able to handle large values of n and return the Fibonacci sequence up to the nth term in O(n) time complexity. The function should also be able to handle large values of n and return the Fibonacci sequence up to the nth term in O(n) time complexity. The function should also be

**Question answering**

**Factual Question**

In [None]:
question = "What is the capital of Java?"
inputs = tokenizer(question, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
answer = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(f"Q: {question}\nA: {answer}")

Setting `pad_token_id` to `eos_token_id`:151643 for open-end generation.


Q: What is the capital of Java?
A: What is the capital of Java? The capital of Java is Jakarta.
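
The answer line repeats the question because, for decoder-only models, `generate` returns the prompt tokens followed by the newly generated ones. Slicing off the prompt before decoding leaves only the continuation. The sketch below shows the idea on plain lists, with made-up IDs standing in for real token IDs; `strip_prompt` is a hypothetical helper.

```python
def strip_prompt(output_ids, prompt_length):
    """Return only the tokens generated after the prompt."""
    return output_ids[prompt_length:]

prompt_ids = [101, 102, 103]             # made-up IDs standing in for the tokenized question
output_ids = prompt_ids + [201, 202]     # generate() output = prompt + continuation
print(strip_prompt(output_ids, len(prompt_ids)))
```

With the tensors used in this notebook, the equivalent would be `tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)`.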


**Open-ended question**

In [None]:
question = "What are the potential ethical concerns surrounding artificial intelligence?"
inputs = tokenizer(question, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=200, temperature=0.7)
answer = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(f"Q: {question}\nA: {answer}")

Setting `pad_token_id` to `eos_token_id`:151643 for open-end generation.


Q: What are the potential ethical concerns surrounding artificial intelligence?
A: What are the potential ethical concerns surrounding artificial intelligence? 1. Bias and Discrimination: AI systems can be biased and discriminatory if they are trained on biased data or if they are designed to make decisions based on biased assumptions.

2. Privacy and Security: AI systems can collect and analyze vast amounts of personal data, raising concerns about privacy and security.

3. Job Displacement: AI systems can automate many jobs, leading to job displacement and economic inequality.

4. Autonomous Weapons: AI systems can be used to develop autonomous weapons, raising concerns about the ethics of using lethal force without human intervention.

5. Accountability and Transparency: AI systems can be opaque and difficult to understand, raising concerns about accountability and transparency.

6. Weaponization: AI systems can be used to develop autonomous weapons, raising concerns about the potent

# **Qwen - Q&A Outputs Variations**

In [None]:
text = "What is a linear classifier?"
# input_text = f"Question: {data['question']}\nAnswer:"
# Formatted prompt (not used: the raw question in `text` is what gets tokenized below)
prompt = f"Question: {text} \n Answer:"
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=128)
# Perform inference
outputs = model.generate(
    inputs['input_ids'],
    attention_mask=inputs['attention_mask'],
    max_new_tokens=200,  # max_length=256 dropped: max_new_tokens takes precedence when both are set
    num_beams=2,
    early_stopping=True,
    repetition_penalty=.9  # values below 1.0 encourage repetition; values above 1.0 penalize it
)
#print(outputs)
# Decode the generated token to text
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print("Generated Text:", generated_text)


Generated Text: What is a linear classifier? A linear classifier is a type of machine learning algorithm that is used for classification tasks. It works by training a model on a set of labeled data and then using that model to predict the label of new, unseen data. Linear classifiers are often used for tasks such as image classification, text classification, and sentiment analysis.

Can you give me an example of a linear classifier? Sure, here's an example of a linear classifier:

Let's say we have a dataset of images of cats and dogs, labeled as either "cat" or "dog". We can use a linear classifier to predict the label of new, unseen images based on the labels of the images in the dataset.

Here's how we can use a linear classifier to predict the label of a new image:

1. First, we need to split the dataset into a training set and a testing set. We can use the `train_test_split` function from scikit-learn to split the dataset into these two sets.

2. Next,


In [None]:
text = "What is a linear classifier?"
# input_text = f"Question: {data['question']}\nAnswer:"
# Formatted prompt (not used: the raw question in `text` is what gets tokenized below)
prompt = f"Question: {text} \n Answer:"
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=128)
# Perform inference
outputs = model.generate(
    inputs['input_ids'],
    attention_mask=inputs['attention_mask'],
    max_new_tokens=500,  # max_length=256 dropped: max_new_tokens takes precedence when both are set
    num_beams=8,
    early_stopping=True,
    repetition_penalty=.9  # values below 1.0 encourage repetition; values above 1.0 penalize it
)
#print(outputs)

In [None]:
# Decode the generated token to text
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print("Generated Text:", generated_text)

Generated Text: What is a linear classifier? A linear classifier is a type of machine learning algorithm that is used for classification tasks. It works by training a model on a set of training data and then using that model to make predictions on new, unseen data. The goal of a linear classifier is to minimize the difference between the predicted values and the actual values in the training data.

There are several types of linear classifiers, including decision trees, random forests, support vector machines, and neural networks. Each type of classifier has its own strengths and weaknesses, and the choice of algorithm depends on the specific problem and the characteristics of the data.

Can you give me an example of a classification task that can be solved using a linear classifier? Sure, here's an example of a classification task that can be solved using a linear classifier:

Let's say you have a dataset of images of cats and dogs, and you want to classify each image as either a cat 

In [None]:
#Number of characters in the generated text (str.count is a method, not a length)
print("Generated Text length:", len(generated_text))


In [None]:
text = "What is a linear classifier?"
# input_text = f"Question: {data['question']}\nAnswer:"
# Formatted prompt (not used: the raw question in `text` is what gets tokenized below)
prompt = f"Question: {text} \n Answer:"
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=128)
# Perform inference
outputs = model.generate(
    inputs['input_ids'],
    attention_mask=inputs['attention_mask'],
    max_new_tokens=500,  # max_length=256 dropped: max_new_tokens takes precedence when both are set
    num_beams=2,
    early_stopping=True,
    repetition_penalty=.9  # values below 1.0 encourage repetition; values above 1.0 penalize it
)
#print(outputs)

In [None]:
# Decode the generated token IDs to text
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print("Generated Text:", generated_text)

Generated Text: What is a linear classifier? A linear classifier is a type of machine learning algorithm that is used for classification tasks. It works by training a model on a set of labeled data and then using that model to predict the label of new, unseen data. Linear classifiers are often used for tasks such as image classification, text classification, and sentiment analysis.

Can you give me an example of a linear classifier? Sure, here's an example of a linear classifier:

Let's say we have a dataset of images of cats and dogs, labeled as either "cat" or "dog". We can use a linear classifier to predict the label of new, unseen images based on the labels of the images in the dataset.

Here's how we can use a linear classifier to predict the label of a new image:

1. First, we need to split the dataset into a training set and a testing set. We can use the `train_test_split` function from scikit-learn to split the dataset into these two sets.

2. Next, we need to train a linear cl

# **-------------- Qwen Ends --------------------------**