 Employ [Hugging Face](https://huggingface.co/models?pipeline_tag=text-classification&sort=trending&search=sentiment) transformers for the same classification task as in the first assignment.

Explore Hugging Face models to find a pre-trained model that is suitable and promising for fine-tuning to your task. It makes sense to pick one that has been pre-trained on the same language and/or text genre.

As a bonus, you can also employ a [domain adaptation](https://huggingface.co/learn/llm-course/chapter7/3?fw=pt) approach, explore [parameter-efficient fine-tuning](https://huggingface.co/docs/peft/main/quicktour) (e.g. LoRA), or [prompting language models](https://huggingface.co/docs/transformers/v4.49.0/en/tasks/prompting).

Compare the performance of your model(s) with the ones developed for the first assignment.

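Note: most of the pre-trained models tried below fail on long inputs. Some reviews exceed the models' 512-token limit (for example, 574 tokens for the review at index 697), and the pipeline raises an error unless the text is truncated.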

In [29]:
import utils
from utils import CustomDataset, CustomDataset1
from datasets import Dataset, load_dataset
import numpy as np
import pandas as pd
from sklearn.metrics import accuracy_score, f1_score
from transformers import AutoTokenizer, AutoModelForSequenceClassification, TrainingArguments, Trainer, pipeline, EarlyStoppingCallback

In [None]:
# Build 5 folds: each fold drops a 10% window (starting at 0%, 20%, 40%, 60%, 80%) from the file it slices
train_ds = load_dataset("csv", data_files={"train": "../common/data_sentiment_preprocessed.csv"}, split=[f"train[:{k}%]+train[{k+10}%:]" for k in range(0, 100, 20)])
val_ds = load_dataset("csv", data_files=["../common/data_sentiment_preprocessed_val.csv"], split=[f"train[:{k}%]+train[{k+10}%:]" for k in range(0, 100, 20)])

def compute_metrics(p):
    preds = np.argmax(p.predictions, axis=1)
    return {
        "accuracy": accuracy_score(p.label_ids, preds),
        "f1": f1_score(p.label_ids, preds),
    }

def tokenize_function(examples):
    global tokenizer
    return tokenizer(examples['text'], truncation=True, padding='max_length', max_length=128)

def tokenize_datasets(train_ds, val_ds, preprocess_function):
    # Rename the label column and tokenize every fold, updating both lists in place.
    for folds in (train_ds, val_ds):
        for idx, item in enumerate(folds):
            # Skip the rename if an earlier run already renamed the column.
            if "sentiment_label" in item.column_names:
                item = item.rename_column("sentiment_label", "label")
            folds[idx] = item.map(
                preprocess_function,
                batched=True,
                desc="Running tokenizer on dataset",
            )

def model_init():
    global model_path
    return AutoModelForSequenceClassification.from_pretrained(model_path, num_labels=2)

def run_cross_validation(train_ds, val_ds, model_init, tokenizer, model_name):
    tokenize_datasets(train_ds, val_ds, tokenize_function)

    accuracies = []
    f1s = []

    for i in range(len(train_ds)):
        print(f"Running fold {i+1}/{len(train_ds)}")

        training_args = TrainingArguments(
            output_dir=f"./results/{model_name}/fold_{i}",
            per_device_train_batch_size=16,
            per_device_eval_batch_size=64,
            num_train_epochs=10,  # Give room for early stopping
            eval_strategy="epoch",
            save_strategy="epoch",
            logging_dir=f"./logs/fold_{i}",
            logging_steps=10,
            load_best_model_at_end=True,
            metric_for_best_model="f1",  # Use F1 for early stopping
            greater_is_better=True,
        )

        trainer = Trainer(
            model_init=model_init,
            args=training_args,
            train_dataset=train_ds[i],
            eval_dataset=val_ds[i],
            compute_metrics=compute_metrics,
            tokenizer=tokenizer,
            callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
        )

        trainer.train()
        metrics = trainer.evaluate()

        accuracies.append(metrics["eval_accuracy"])
        f1s.append(metrics["eval_f1"])

    avg_accuracy = np.mean(accuracies)
    avg_f1 = np.mean(f1s)

    print(f"\nAverage Accuracy: {avg_accuracy:.4f}")
    print(f"Average F1 Score: {avg_f1:.4f}")
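As a small illustration of what `compute_metrics` does with the model's raw outputs, the sketch below takes the arg-max over each row of logits and then derives accuracy and binary F1 by hand. It is plain Python so that it runs without `numpy` or `scikit-learn`; the logits and labels are made-up toy values, and the helper names are hypothetical.

```python
def argmax_rows(logits):
    """Return the index of the largest logit in each row."""
    return [max(range(len(row)), key=row.__getitem__) for row in logits]


def binary_accuracy_f1(labels, preds, positive=1):
    """Compute accuracy and F1 for the positive class from two label lists."""
    pairs = list(zip(labels, preds))
    tp = sum(1 for y, p in pairs if y == positive and p == positive)
    fp = sum(1 for y, p in pairs if y != positive and p == positive)
    fn = sum(1 for y, p in pairs if y == positive and p != positive)
    accuracy = sum(1 for y, p in pairs if y == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, f1


# Toy logits for four examples, two classes each (illustrative values only).
logits = [[2.0, -1.0], [0.1, 0.3], [-0.5, 1.5], [1.2, 0.4]]
labels = [0, 1, 0, 1]

preds = argmax_rows(logits)
accuracy, f1 = binary_accuracy_f1(labels, preds)
print(preds, accuracy, f1)
```

With its default settings, `f1_score` treats label 1 as the positive class, which is what the `positive=1` default mirrors here.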

In [16]:
train_ds[4]

Dataset({
    features: ['id', 'text', 'sentiment_label', 'clean_text', 'tokenized_text'],
    num_rows: 7980
})

In [21]:
len(train_ds), len(val_ds)

(5, 5)
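The five folds come from the percent-slicing strings passed to `load_dataset`. The sketch below rebuilds those strings and lists which percentage window each fold leaves out; it relies only on the expressions in the loading cell, and the helper names are hypothetical. Each fold keeps 90% of the file it slices, and the five dropped windows together cover only half of that file. If a conventional 5-fold partition was intended, the dropped window would probably need to be 20% wide.

```python
def fold_split_strings(step=20, held_out=10):
    """Rebuild the split expressions used in the loading cell."""
    return [f"train[:{k}%]+train[{k + held_out}%:]" for k in range(0, 100, step)]


def dropped_windows(step=20, held_out=10):
    """Return the (start %, end %) window each fold leaves out."""
    return [(k, k + held_out) for k in range(0, 100, step)]


splits = fold_split_strings()
windows = dropped_windows()
# Total share of the file that is ever left out by some fold.
coverage = sum(end - start for start, end in windows)

for split, window in zip(splits, windows):
    print(split, "drops", window)
print("percent of rows dropped by at least one fold:", coverage)
```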

In [61]:
combined_sentiment_df = pd.read_csv("../common/data_sentiment_preprocessed.csv")
combined_sentiment_df_val = pd.read_csv("../common/data_sentiment_preprocessed_val.csv")
x_train = combined_sentiment_df.text
y_train = combined_sentiment_df.sentiment_label
x_val = combined_sentiment_df_val.text
y_val = combined_sentiment_df_val.sentiment_label

## Making use of pretrained huggingface models

### siebert/sentiment-roberta-large-english

In [62]:


# Model card: https://huggingface.co/siebert/sentiment-roberta-large-english?library=transformers
# Article: https://www.sciencedirect.com/science/article/pii/S0167811622000477

from transformers import pipeline

siebert_roberta = pipeline("text-classification", model="siebert/sentiment-roberta-large-english")


print(siebert_roberta("I love you!"))
print(siebert_roberta("I hate you!"))
print(siebert_roberta("neutral text"))


Device set to use cuda:0


[{'label': 'POSITIVE', 'score': 0.9987329840660095}]
[{'label': 'NEGATIVE', 'score': 0.9992897510528564}]
[{'label': 'POSITIVE', 'score': 0.9969078898429871}]


In [63]:
#siebert_roberta
mapper = {
    "NEGATIVE": 0,
    "POSITIVE": 1
} 
utils.apply_kaggle_model(siebert_roberta, mapper, x_val, y_val)

You seem to be using the pipelines sequentially on GPU. In order to maximize efficiency please use a dataset
Token indices sequence length is longer than the specified maximum sequence length for this model (574 > 512). Running this sequence through the model will result in indexing errors


Error processing text at index 697: Positives: First time going to this place today. Let me tell you, coming from a family of chefs this was delectable, the dine in meals came out fast, they were LARGE portions, and very good temperature. We ordered the flowered onion( fried and whole), we ordered the Louisiana chook both entrees. Then I had the parmigiana as my main with mash and veg. The mash and veg was perfectly cooked, though the mash tastes a little like packet mash. The sauce with the Louisiana chicken is a little spicy so if you ca n’t tolerate a little spice the sauce is n’t for you. But man oh man the crunch on the chook and the juicy chicken was incredible, was thoroughly enjoyable. The parmigiana was LARGE so much so I could n’t finish it all. Great that they gave takeaways Negatives: The drink I ordered was the summer one in the mocktails section, tasted great only issue I really had was the lemon seeds in the drink, lucky the straws were n’t big enough to suck them up oth
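The error above comes from a review that tokenizes to 574 tokens, more than the 512 positions the model supports. One common way around this, sketched below, is to ask the pipeline to truncate: Hugging Face text-classification pipelines forward `truncation` and `max_length` to the tokenizer. The internals of `utils.apply_kaggle_model` are not shown in this notebook, so `classify_with_truncation` and the `StubPipeline` stand-in (used only so the snippet runs without downloading a model) are hypothetical.

```python
class StubPipeline:
    """Stand-in for a transformers pipeline; records the keyword arguments it receives."""

    def __init__(self):
        self.last_kwargs = None

    def __call__(self, texts, **kwargs):
        self.last_kwargs = kwargs
        # Fake prediction rule, only here to produce pipeline-shaped output.
        return [
            {"label": "POSITIVE" if "love" in text else "NEGATIVE", "score": 1.0}
            for text in texts
        ]


def classify_with_truncation(pipe, texts, label_to_id, max_length=512):
    """Run a pipeline with truncation enabled and map its labels to integers."""
    outputs = pipe(list(texts), truncation=True, max_length=max_length)
    return [label_to_id[out["label"]] for out in outputs]


mapper = {"NEGATIVE": 0, "POSITIVE": 1}
stub = StubPipeline()
predictions = classify_with_truncation(stub, ["I love you!", "I hate you!"], mapper)
print(predictions, stub.last_kwargs)
```

With a real model the call would look like `classify_with_truncation(siebert_roberta, x_val, mapper)`. Truncation discards everything past the limit, so sentiment expressed late in a long review is lost.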

### saiffff/distilbert-imdb-sentiment

In [13]:
# Use a pipeline as a high-level helper
from transformers import pipeline

saiffff = pipeline("text-classification", model="saiffff/distilbert-imdb-sentiment")
print(saiffff("I don't like you!"))
print(saiffff("this is really good!"))
print(saiffff("neutral text"))

Device set to use mps:0


[{'label': 'LABEL_0', 'score': 0.7047289609909058}]
[{'label': 'LABEL_1', 'score': 0.9926829934120178}]
[{'label': 'LABEL_0', 'score': 0.6355293393135071}]


In [14]:
mapper = {
    "LABEL_0": 0,
    "LABEL_1": 1,
}
utils.apply_kaggle_model(saiffff, mapper, x_val, y_val)

Token indices sequence length is longer than the specified maximum sequence length for this model (574 > 512). Running this sequence through the model will result in indexing errors


Error processing text at index 697: Positives: First time going to this place today. Let me tell you, coming from a family of chefs this was delectable, the dine in meals came out fast, they were LARGE portions, and very good temperature. We ordered the flowered onion( fried and whole), we ordered the Louisiana chook both entrees. Then I had the parmigiana as my main with mash and veg. The mash and veg was perfectly cooked, though the mash tastes a little like packet mash. The sauce with the Louisiana chicken is a little spicy so if you ca n’t tolerate a little spice the sauce is n’t for you. But man oh man the crunch on the chook and the juicy chicken was incredible, was thoroughly enjoyable. The parmigiana was LARGE so much so I could n’t finish it all. Great that they gave takeaways Negatives: The drink I ordered was the summer one in the mocktails section, tasted great only issue I really had was the lemon seeds in the drink, lucky the straws were n’t big enough to suck them up oth

## Bonus

### training models

#### saiffff/distilbert-imdb-sentiment

In [52]:
# https://huggingface.co/saiffff/distilbert-imdb-sentiment
model_path = "saiffff/distilbert-imdb-sentiment"
tokenizer = AutoTokenizer.from_pretrained(model_path)


In [53]:
run_cross_validation(train_ds, val_ds, model_init, tokenizer, 'saiffff_model')

Running tokenizer on dataset: 5 training folds (7979, 7979, 7979, 7980 and 7980 examples) and 5 validation folds (1091, 1090, 1091, 1091 and 1091 examples)

Running fold 1/5


| Epoch | Training Loss | Validation Loss | Accuracy | F1 |
|---|---|---|---|---|
| 1 | 0.3252 | 0.285304 | 0.880843 | 0.884547 |
| 2 | 0.163 | 0.378014 | 0.88451 | 0.887299 |
| 3 | 0.0804 | 0.535727 | 0.870761 | 0.878553 |
| 4 | 0.0839 | 0.675212 | 0.88451 | 0.89322 |
| 5 | 0.025 | 0.758838 | 0.877177 | 0.888704 |
| 6 | 0.0037 | 0.746095 | 0.875344 | 0.886477 |
| 7 | 0.0002 | 0.849747 | 0.87901 | 0.890547 |


Running fold 2/5


| Epoch | Training Loss | Validation Loss | Accuracy | F1 |
|---|---|---|---|---|
| 1 | 0.2627 | 0.296299 | 0.888073 | 0.889493 |
| 2 | 0.1121 | 0.382961 | 0.888991 | 0.8901 |
| 3 | 0.0115 | 0.587212 | 0.881651 | 0.88176 |
| 4 | 0.0825 | 0.633065 | 0.889908 | 0.893238 |
| 5 | 0.0559 | 0.775273 | 0.888991 | 0.891285 |
| 6 | 0.0002 | 0.788367 | 0.888991 | 0.892061 |
| 7 | 0.0002 | 0.883373 | 0.886239 | 0.887067 |


Running fold 3/5


| Epoch | Training Loss | Validation Loss | Accuracy | F1 |
|---|---|---|---|---|
| 1 | 0.2733 | 0.289091 | 0.88176 | 0.867692 |
| 2 | 0.0979 | 0.431633 | 0.873511 | 0.857732 |
| 3 | 0.0461 | 0.525089 | 0.870761 | 0.858291 |
| 4 | 0.0579 | 0.668384 | 0.880843 | 0.867347 |


Running fold 4/5


| Epoch | Training Loss | Validation Loss | Accuracy | F1 |
|---|---|---|---|---|
| 1 | 0.2955 | 0.258857 | 0.890926 | 0.894783 |
| 2 | 0.2034 | 0.381676 | 0.874427 | 0.883404 |
| 3 | 0.0245 | 0.509166 | 0.895509 | 0.897666 |
| 4 | 0.0009 | 0.649577 | 0.887259 | 0.888688 |
| 5 | 0.0006 | 0.723298 | 0.886343 | 0.886654 |
| 6 | 0.0001 | 0.783081 | 0.892759 | 0.893151 |


Running fold 5/5


| Epoch | Training Loss | Validation Loss | Accuracy | F1 |
|---|---|---|---|---|
| 1 | 0.303 | 0.24598 | 0.896425 | 0.890821 |
| 2 | 0.2704 | 0.376244 | 0.880843 | 0.882459 |
| 3 | 0.0626 | 0.525369 | 0.890926 | 0.885246 |
| 4 | 0.0011 | 0.603046 | 0.892759 | 0.887175 |



Average Accuracy: 0.8896
Average F1 Score: 0.8885
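The per-epoch logs can be read against the training configuration: the checkpoint with the highest validation F1 is restored at the end (`load_best_model_at_end=True`, `metric_for_best_model="f1"`), and training stops once F1 has failed to improve for three consecutive evaluations (`early_stopping_patience=3`). The sketch below replays that rule on the F1 values logged for folds 1 and 3 of this run. It is a simplified model of the callback, not the library's code, and `simulate_early_stopping` is a hypothetical helper.

```python
def simulate_early_stopping(scores, patience=3):
    """Return (best_epoch, last_epoch_run) for a metric where higher is better."""
    best_epoch, best_score, waited = 0, float("-inf"), 0
    for epoch, score in enumerate(scores, start=1):
        if score > best_score:
            best_epoch, best_score, waited = epoch, score, 0
        else:
            waited += 1
            if waited >= patience:
                return best_epoch, epoch
    return best_epoch, len(scores)


# Validation F1 per epoch, copied from the logs of this run.
fold_1_f1 = [0.884547, 0.887299, 0.878553, 0.89322, 0.888704, 0.886477, 0.890547]
fold_3_f1 = [0.867692, 0.857732, 0.858291, 0.867347]

fold_1_result = simulate_early_stopping(fold_1_f1)
fold_3_result = simulate_early_stopping(fold_3_f1)
print(fold_1_result, fold_3_result)
```

In fold 3 the first epoch is already the best one, so training ends after epoch 4 even though validation accuracy and F1 barely move; only the steadily rising validation loss shows the overfitting.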


### domain adaptation

#### distilbert/distilbert-base-uncased

In [55]:
model_path = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_path)

In [56]:
run_cross_validation(train_ds, val_ds, model_init, tokenizer, 'distilbert_base')


Running fold 1/5


Some weights of DistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight', 'pre_classifier.bias', 'pre_classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


| Epoch | Training Loss | Validation Loss | Accuracy | F1 |
|---|---|---|---|---|
| 1 | 0.3582 | 0.299208 | 0.868928 | 0.877042 |
| 2 | 0.1523 | 0.366943 | 0.87626 | 0.884517 |
| 3 | 0.048 | 0.493979 | 0.879927 | 0.888321 |
| 4 | 0.0576 | 0.705606 | 0.873511 | 0.88 |
| 5 | 0.0955 | 0.781516 | 0.875344 | 0.885522 |
| 6 | 0.0004 | 0.792805 | 0.873511 | 0.883642 |


Running fold 2/5




| Epoch | Training Loss | Validation Loss | Accuracy | F1 |
|---|---|---|---|---|
| 1 | 0.2804 | 0.315443 | 0.876147 | 0.880425 |
| 2 | 0.1111 | 0.363057 | 0.888991 | 0.893953 |
| 3 | 0.0665 | 0.499873 | 0.877982 | 0.881567 |
| 4 | 0.1119 | 0.727809 | 0.883486 | 0.888889 |
| 5 | 0.0095 | 0.817715 | 0.868807 | 0.880734 |


Running fold 3/5




| Epoch | Training Loss | Validation Loss | Accuracy | F1 |
|---|---|---|---|---|
| 1 | 0.3261 | 0.305674 | 0.87626 | 0.859813 |
| 2 | 0.1325 | 0.410822 | 0.870761 | 0.858291 |
| 3 | 0.1298 | 0.499975 | 0.870761 | 0.863504 |
| 4 | 0.0762 | 0.759717 | 0.862511 | 0.85119 |
| 5 | 0.0443 | 0.792202 | 0.872594 | 0.861692 |
| 6 | 0.0002 | 0.961636 | 0.863428 | 0.852329 |


Running fold 4/5




| Epoch | Training Loss | Validation Loss | Accuracy | F1 |
|---|---|---|---|---|
| 1 | 0.3169 | 0.286523 | 0.871677 | 0.879518 |
| 2 | 0.0966 | 0.40529 | 0.87901 | 0.885417 |
| 3 | 0.0828 | 0.42514 | 0.898258 | 0.899365 |
| 4 | 0.0472 | 0.797337 | 0.865261 | 0.863256 |
| 5 | 0.0005 | 0.745605 | 0.882676 | 0.882784 |
| 6 | 0.0001 | 0.839794 | 0.893676 | 0.895118 |


Running fold 5/5




| Epoch | Training Loss | Validation Loss | Accuracy | F1 |
|---|---|---|---|---|
| 1 | 0.3565 | 0.266315 | 0.887259 | 0.880698 |
| 2 | 0.2375 | 0.526802 | 0.84143 | 0.850216 |
| 3 | 0.1106 | 0.39236 | 0.902841 | 0.897485 |
| 4 | 0.0323 | 0.54396 | 0.903758 | 0.896347 |
| 5 | 0.0611 | 0.67043 | 0.890926 | 0.882527 |
| 6 | 0.0005 | 0.789638 | 0.891842 | 0.881526 |



Average Accuracy: 0.8882
Average F1 Score: 0.8885


### parameter-efficient fine-tuning (LoRA)

#### FacebookAI/roberta-base

In [59]:
model_path = "FacebookAI/roberta-base"
tokenizer = AutoTokenizer.from_pretrained(model_path)

def peft_model_init():
    global model_path
    model = AutoModelForSequenceClassification.from_pretrained(model_path, num_labels=2)
    from peft import LoraConfig, TaskType, get_peft_model
    peft_config = LoraConfig(
        task_type=TaskType.SEQ_CLS, r=2, lora_alpha=16, lora_dropout=0.1, bias="none",
    )
    model = get_peft_model(model, peft_config)
    return model
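To get a feel for how small the LoRA update is with `r=2`, the arithmetic below counts the parameters it adds. The figures rest on assumptions that are not stated in this notebook: that `roberta-base` has 12 layers with hidden size 768, that PEFT's default for RoBERTa adapts only the `query` and `value` projections, and that the classification head stays fully trainable for `TaskType.SEQ_CLS`. Calling `model.print_trainable_parameters()` on the PEFT model would give the authoritative numbers.

```python
def lora_param_count(d_in, d_out, r):
    """Parameters in one LoRA pair: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


# Assumed roberta-base dimensions (not taken from this notebook).
hidden_size = 768
num_layers = 12
adapted_per_layer = 2  # assumed default targets: query and value projections
num_labels = 2

# Values from the LoraConfig in the cell above.
r = 2
lora_alpha = 16

adapter_params = num_layers * adapted_per_layer * lora_param_count(hidden_size, hidden_size, r)

# Classification head: dense (768 -> 768) plus out_proj (768 -> 2), weights and biases.
head_params = (hidden_size * hidden_size + hidden_size) + (hidden_size * num_labels + num_labels)

trainable_params = adapter_params + head_params
scaling = lora_alpha / r  # factor applied to the low-rank update in standard LoRA

print(adapter_params, head_params, trainable_params, scaling)
```

Under these assumptions most of the trainable weights sit in the freshly initialized classification head rather than in the adapters, which may be part of why the training loss falls much more slowly here than in the fully fine-tuned runs.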

In [None]:
run_cross_validation(train_ds, val_ds, peft_model_init, tokenizer, 'roberta_base')


Running fold 1/5


Some weights of RobertaForSequenceClassification were not initialized from the model checkpoint at FacebookAI/roberta-base and are newly initialized: ['classifier.dense.bias', 'classifier.dense.weight', 'classifier.out_proj.bias', 'classifier.out_proj.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
No label_names provided for model class `PeftModelForSequenceClassification`. Since `PeftModel` hides base models input arguments, if label_names is not given, label_names can't be set automatically within `Trainer`. Note that empty label_names list will be used instead.

| Epoch | Training Loss | Validation Loss | Accuracy | F1 |
|---|---|---|---|---|
| 1 | 0.2834 | 0.291398 | 0.887259 | 0.895141 |
| 2 | 0.2386 | 0.297508 | 0.886343 | 0.890653 |
| 3 | 0.2763 | 0.312307 | 0.873511 | 0.876786 |
| 4 | 0.2867 | 0.292205 | 0.896425 | 0.903993 |
| 5 | 0.2681 | 0.294012 | 0.892759 | 0.899914 |
| 6 | 0.2672 | 0.298114 | 0.887259 | 0.893691 |
| 7 | 0.2427 | 0.323134 | 0.888176 | 0.899007 |


Running fold 2/5



| Epoch | Training Loss | Validation Loss | Accuracy | F1 |
|---|---|---|---|---|
| 1 | 0.2622 | 0.283144 | 0.89633 | 0.899556 |
| 2 | 0.1984 | 0.288031 | 0.890826 | 0.892502 |
| 3 | 0.267 | 0.296859 | 0.880734 | 0.880074 |
| 4 | 0.3066 | 0.295606 | 0.89633 | 0.902334 |
| 5 | 0.2173 | 0.288924 | 0.897248 | 0.901408 |
| 6 | 0.2901 | 0.283764 | 0.892661 | 0.896368 |
| 7 | 0.2586 | 0.305786 | 0.892661 | 0.897098 |


Running fold 3/5



| Epoch | Training Loss | Validation Loss | Accuracy | F1 |
|---|---|---|---|---|
| 1 | 0.2768 | 0.309514 | 0.88451 | 0.87931 |
| 2 | 0.2389 | 0.300686 | 0.879927 | 0.868342 |
| 3 | 0.2983 | 0.29441 | 0.879927 | 0.866734 |
| 4 | 0.3217 | 0.295033 | 0.887259 | 0.881844 |
| 5 | 0.2474 | 0.289663 | 0.886343 | 0.87674 |
| 6 | 0.2703 | 0.294829 | 0.887259 | 0.881617 |
| 7 | 0.2539 | 0.305467 | 0.889093 | 0.882638 |
| 8 | 0.1896 | 0.301762 | 0.887259 | 0.882071 |
| 9 | 0.2184 | 0.300911 | 0.887259 | 0.880234 |
| 10 | 0.2818 | 0.304393 | 0.889093 | 0.88241 |


Running fold 4/5



| Epoch | Training Loss | Validation Loss | Accuracy | F1 |
|---|---|---|---|---|
| 1 | 0.3871 | 0.294165 | 0.890009 | 0.893993 |
| 2 | 0.2539 | 0.285684 | 0.885426 | 0.892334 |
| 3 | 0.274 | 0.273447 | 0.900092 | 0.901536 |
| 4 | 0.1542 | 0.270018 | 0.899175 | 0.900362 |
| 5 | 0.2312 | 0.279205 | 0.892759 | 0.896734 |
| 6 | 0.1032 | 0.288713 | 0.899175 | 0.9 |


Running fold 5/5



| Epoch | Training Loss | Validation Loss | Accuracy | F1 |
|---|---|---|---|---|
| 1 | 0.3984 | 0.275404 | 0.892759 | 0.891566 |
| 2 | 0.2938 | 0.260548 | 0.906508 | 0.903409 |
| 3 | 0.3007 | 0.246411 | 0.912007 | 0.907869 |
| 4 | 0.2344 | 0.244158 | 0.912007 | 0.907692 |
| 5 | 0.209 | 0.2456 | 0.912007 | 0.907692 |
| 6 | 0.1246 | 0.264031 | 0.908341 | 0.903288 |



Average Accuracy: 0.8988
Average F1 Score: 0.8997
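To compare the three fine-tuning runs side by side, the sketch below recomputes each average from the best-epoch F1 of every fold, copied from the training logs above, and adds the spread across folds, which the notebook does not report. The dictionary keys are just labels chosen for this example.

```python
from statistics import mean, stdev

# Best validation F1 per fold (the epoch restored by load_best_model_at_end).
best_f1 = {
    "saiffff/distilbert-imdb-sentiment": [0.89322, 0.893238, 0.867692, 0.897666, 0.890821],
    "distilbert-base-uncased": [0.888321, 0.893953, 0.863504, 0.899365, 0.897485],
    "FacebookAI/roberta-base + LoRA": [0.903993, 0.902334, 0.882638, 0.901536, 0.907869],
}

# Mean and sample standard deviation across the five folds.
summary = {name: (mean(scores), stdev(scores)) for name, scores in best_f1.items()}

# Print the runs from highest to lowest mean F1.
for name, (avg, spread) in sorted(summary.items(), key=lambda item: item[1][0], reverse=True):
    print(f"{name:40s} mean F1 = {avg:.4f}  std = {spread:.4f}")
```

The gap between the runs should be read against this fold-to-fold spread, and the folds are not independent because their training and validation slices overlap heavily.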
