# Text classification

The task concentrates on classification of sentence pairs.

This type of classification is useful for problems such as determining the similarity of sentences or checking if a text passage contains an answer to a question.

In [None]:
from google.colab import drive
drive.mount('/content/mydrive')

1. Use the FIQA-PL dataset that was used in lab 1 and lab 2 (so we need the passages, the questions and their relations).

In [3]:
pip install datasets

In [4]:
from datasets import load_dataset
import pandas as pd

corpus_dataset = load_dataset("clarin-knext/fiqa-pl", "corpus")
corpus = corpus_dataset['corpus']
corpus = corpus.to_pandas()

queries_dataset = load_dataset("clarin-knext/fiqa-pl", "queries")
queries = queries_dataset['queries']
queries = queries.to_pandas()

qa_dataset = load_dataset("clarin-knext/fiqa-pl-qrels")
train_data = qa_dataset['train']
qrels = train_data.to_pandas()

Generating corpus split: 57638 examples
Generating queries split: 6648 examples
Generating train split: 14166 examples
Generating validation split: 1238 examples
Generating test split: 1706 examples

2. Create a dataset of positive and negative sentence pairs.

I. In each pair the first element is a question and the second element is a passage, i.e. "{question} {separator} {passage}", where the separator should be taken from the model's tokenizer.

II. Use the relations to mark the positive pairs (i.e. pairs where the question is answered by the passage).

III. Use your own strategy to mark negative pairs (i.e. you can draw the negative examples, but there are better strategies to define the negative examples). The number of negative examples should be larger than the number of positive examples.
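The pair format and the negative-sampling rule described above can be sketched in plain Python. This is a minimal illustration, not the code used in this notebook: `build_pair`, `sample_negatives`, the toy passages and the `"</s>"` separator are all assumptions made for the example (in practice the separator would come from `tokenizer.sep_token`). The sampler excludes every passage marked relevant for the question, so a known answer is never labelled as a negative.

```python
import random

def build_pair(question, passage, sep_token):
    """Join a question and a passage with the tokenizer's separator token."""
    return f"{question} {sep_token} {passage}"

def sample_negatives(question_id, relevant, all_passage_ids, k, rng):
    """Draw k passage ids that are not marked relevant for the question."""
    positives = relevant.get(question_id, set())
    available = [pid for pid in all_passage_ids if pid not in positives]
    return rng.sample(available, k)

# Toy data, invented for illustration only.
passages = {
    "p1": "Passage one.",
    "p2": "Passage two.",
    "p3": "Passage three.",
    "p4": "Passage four.",
}
relevant = {"q1": {"p1", "p2"}}  # question id -> ids of passages that answer it
sep_token = "</s>"               # placeholder; use tokenizer.sep_token in practice

rng = random.Random(0)
negative_ids = sample_negatives("q1", relevant, list(passages), k=2, rng=rng)
pairs = [(build_pair("Question?", passages[pid], sep_token), 0) for pid in negative_ids]
```

Hugging Face tokenizers also accept the question and the passage as two separate arguments (`tokenizer(question, passage, truncation=True)`), in which case the separator is inserted by the tokenizer itself.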

In [5]:
from transformers import AutoTokenizer
import pandas as pd
import random

pairs = []

# Attach the question text to every (query, passage) relation.
merged_qrels = pd.merge(
    qrels.astype({'query-id': 'str', 'corpus-id': 'str'}),
    queries.astype({'_id': 'str'}),
    left_on='query-id', right_on='_id',
    suffixes=('_qrel', '_query')
)

# Map passage id -> passage text for constant-time lookup.
corpus_texts = corpus[['text', '_id']].set_index('_id')['text'].to_dict()
corpus_ids = list(corpus_texts)

for _, row in merged_qrels.iterrows():
    question = row['text']
    corpus_id = row['corpus-id']
    passage = corpus_texts.get(corpus_id, "")

    # Positive pair: the passage is marked as answering the question.
    pairs.append({
        "text": f"{question} {passage}",
        "label": 1
    })

    # Negative pairs: two random passages other than the relevant one.
    # Drawing three distinct ids guarantees that at least two differ from corpus_id.
    negative_ids = [cid for cid in random.sample(corpus_ids, 3) if cid != corpus_id][:2]

    for neg_id in negative_ids:
        pairs.append({
            "text": f"{question} {corpus_texts[neg_id]}",
            "label": 0
        })

# Shuffle so that positive and negative pairs are interleaved.
dataset_df = pd.DataFrame(pairs).sample(frac=1).reset_index(drop=True)

print(dataset_df.head())


                                                text  label
0  Jak inwestować w nieruchomości bez użycia pien...      1
1  Odsetki od kaucji wpłacanych właścicielom w Mi...      1
2  Jakie są powody, aby otrzymać więcej niż jedną...      0
3  Czy można zarobić na kredyt hipoteczny? Jedną ...      0
4  Skąd tyle szumu o obniżeniu ratingu kredytoweg...      0


In [6]:
class_counts = dataset_df['label'].value_counts()
print(f"Number of positive examples (class 1): {class_counts.get(1, 0)}")
print(f"Number of negative examples (class 0): {class_counts.get(0, 0)}")

Number of positive examples (class 1): 14166
Number of negative examples (class 0): 28332


In [8]:
pip install sacremoses


In [9]:
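# HerBERT tokenizer for Polish text.
pl_tokenizer = AutoTokenizer.from_pretrained("allegro/herbert-base-cased")

def process_data(row):
    """Tokenize one pair and keep its label and whitespace-normalized text."""
    text = str(row['text'])
    text = ' '.join(text.split())  # collapse repeated whitespace
    # Pad and truncate every example to the model's maximum length.
    encodings = pl_tokenizer(text, padding="max_length", truncation=True)
    encodings['label'] = row['label']
    encodings['text'] = text
    return encodings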

In [10]:
processed_data = []

for i in range(len(dataset_df)):
    processed_data.append(process_data(dataset_df.iloc[i]))
new_df = pd.DataFrame(processed_data)
new_df.head()

Unnamed: 0,attention_mask,input_ids,label,text,token_type_ids
0,"[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...","[0, 2912, 25662, 1019, 6726, 2770, 21068, 3929...",1,Jak inwestować w nieruchomości bez użycia pien...,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ..."
1,"[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...","[0, 3134, 10736, 2173, 25996, 2090, 9099, 71, ...",1,Odsetki od kaucji wpłacanych właścicielom w Mi...,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ..."
2,"[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...","[0, 9733, 2264, 18533, 1947, 2802, 10746, 2944...",0,"Jakie są powody, aby otrzymać więcej niż jedną...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ..."
3,"[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...","[0, 3007, 2545, 16761, 1998, 11397, 10853, 311...",0,Czy można zarobić na kredyt hipoteczny? Jedną ...,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ..."
4,"[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...","[0, 13232, 4108, 2897, 2182, 1007, 8646, 6920,...",0,Skąd tyle szumu o obniżeniu ratingu kredytoweg...,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ..."


3. The dataset from point 2 should be split into training, evaluation and testing subsets.

In [11]:
from sklearn.model_selection import train_test_split
import pyarrow as pa
from datasets import Dataset

train_data, temp_data = train_test_split(new_df, test_size=0.3, random_state=42)  
eval_data, test_data = train_test_split(temp_data, test_size=0.5, random_state=42)  

print(f"Training set size: {len(train_data)}")
print(f"Evaluation set size: {len(eval_data)}")
print(f"Test set size: {len(test_data)}")

train = Dataset(pa.Table.from_pandas(train_data))
valid = Dataset(pa.Table.from_pandas(eval_data))
test = Dataset(pa.Table.from_pandas(test_data))

Training set size: 29748
Evaluation set size: 6375
Test set size: 6375
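The split above is purely random, so the share of positive pairs can drift slightly between the subsets. A stratified split keeps the class ratio identical in each subset. The sketch below is a plain-Python illustration under assumed names (`stratified_split` and the toy label list are not part of the notebook); scikit-learn's `train_test_split` offers the same behaviour through its `stratify` argument.

```python
import random
from collections import defaultdict

def stratified_split(labels, fractions=(0.7, 0.15, 0.15), seed=42):
    """Return (train, eval, test) index lists that preserve the label ratio."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    train_idx, eval_idx, test_idx = [], [], []
    for indices in by_label.values():
        rng.shuffle(indices)
        n_train = round(len(indices) * fractions[0])
        n_eval = round(len(indices) * fractions[1])
        train_idx.extend(indices[:n_train])
        eval_idx.extend(indices[n_train:n_train + n_eval])
        test_idx.extend(indices[n_train + n_eval:])  # remainder goes to the test set
    return train_idx, eval_idx, test_idx

# Toy labels with the same 1:2 positive-to-negative ratio as the dataset above.
labels = [1] * 100 + [0] * 200
train_idx, eval_idx, test_idx = stratified_split(labels)
```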


4. Train a text classifier using the Transformers library that distinguishes between the positive and the negative pairs. To make the process manageable, use base-size models and a runtime that provides GPU/TPU acceleration. Consult the discussions related to fine-tuning Transformer models to select a sensible set of hyperparameters. You can also run several trainings with different hyperparameters if you have access to large computing resources.

5. Make sure you monitor the relevant metrics on the validation set during training. The last saved model might not be the one with the best performance.

6. Report the results you have obtained for the model. Use appropriate measures, since the dataset is not balanced.
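To see why accuracy alone is not an appropriate measure here, consider a classifier that always predicts the negative class. The sketch below computes the usual binary metrics by hand; the function name `binary_metrics` is an assumption for this example, and the class counts are the ones reported for the dataset built earlier (14166 positive and 28332 negative pairs).

```python
def binary_metrics(labels, predictions):
    """Accuracy, precision, recall and F1 for the positive class (label 1)."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 0)

    accuracy = (tp + tn) / len(labels)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Class counts of the full dataset: 14166 positives, 28332 negatives.
labels = [1] * 14166 + [0] * 28332
always_negative = [0] * len(labels)

baseline = binary_metrics(labels, always_negative)
```

The trivial baseline reaches an accuracy of about two thirds while its recall and F1 for the positive class are zero, which is why precision, recall and F1 are reported alongside accuracy.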

In [12]:
print(train_data['text'][:2])

875      Dlaczego banki zachęcają mnie do korzystania z...
41645    Czy spłacać kredyt hipoteczny z góry, czy inwe...
Name: text, dtype: object


In [13]:
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "allegro/herbert-base-cased", num_labels=2
)

model

Some weights of BertForSequenceClassification were not initialized from the model checkpoint at allegro/herbert-base-cased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


BertForSequenceClassification(
  (bert): BertModel(
    (embeddings): BertEmbeddings(
      (word_embeddings): Embedding(50000, 768, padding_idx=1)
      (position_embeddings): Embedding(514, 768)
      (token_type_embeddings): Embedding(2, 768)
      (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
      (dropout): Dropout(p=0.1, inplace=False)
    )
    (encoder): BertEncoder(
      (layer): ModuleList(
        (0-11): 12 x BertLayer(
          (attention): BertAttention(
            (self): BertSdpaSelfAttention(
              (query): Linear(in_features=768, out_features=768, bias=True)
              (key): Linear(in_features=768, out_features=768, bias=True)
              (value): Linear(in_features=768, out_features=768, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (output): BertSelfOutput(
              (dense): Linear(in_features=768, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e

In [None]:
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from transformers import TrainingArguments, Trainer
import numpy as np

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=1)

    precision, recall, f1, _ = precision_recall_fscore_support(labels, predictions, average='binary')
    accuracy = accuracy_score(labels, predictions)

    return {
        'accuracy': accuracy,
        'precision': precision,
        'recall': recall,
        'f1': f1
    }

training_args = TrainingArguments(
    output_dir="/content/mydrive/MyDrive/NLP/result",
    do_train=True,
    do_eval=True,
    evaluation_strategy="steps",
    eval_steps=300,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    learning_rate=5e-05,
    num_train_epochs=1,
    logging_first_step=True,
    logging_strategy="steps",
    logging_steps=50,
    save_strategy="epoch",
    fp16=True
)

trainer = Trainer(
    model=model,                        
    args=training_args,                
    train_dataset=train,       
    eval_dataset=valid,       
    tokenizer=pl_tokenizer,
    compute_metrics=compute_metrics, 
)

trainer.train()

| Step | Training Loss | Validation Loss | Accuracy | Precision | Recall | F1 |
|---|---|---|---|---|---|---|
| 300 | 0.1243 | 0.410593 | 0.909804 | 0.829437 | 0.913686 | 0.869526 |
| 600 | 0.08 | 0.425984 | 0.924078 | 0.91169 | 0.851693 | 0.880671 |
| 900 | 0.1202 | 0.364854 | 0.92549 | 0.913776 | 0.854077 | 0.882918 |
| 1200 | 0.1091 | 0.372139 | 0.926588 | 0.900246 | 0.873629 | 0.886738 |
| 1500 | 0.0985 | 0.351075 | 0.929569 | 0.912 | 0.869814 | 0.890408 |
| 1800 | 0.226 | 0.280312 | 0.930196 | 0.902534 | 0.883166 | 0.892745 |


TrainOutput(global_step=1860, training_loss=0.1186908438641538, metrics={'train_runtime': 1115.6971, 'train_samples_per_second': 26.663, 'train_steps_per_second': 1.667, 'total_flos': 7827027674849280.0, 'train_loss': 0.1186908438641538, 'epoch': 1.0})
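The run above saves a checkpoint only at the end of the epoch (`save_strategy="epoch"`), so the intermediate evaluations cannot be used to pick a better checkpoint. A possible alternative, shown here only as a sketch and not as the configuration used for the reported results, is to save at the same steps at which the model is evaluated and let the `Trainer` reload the best checkpoint. The dictionary name `best_checkpoint_kwargs` is hypothetical; the keys are `TrainingArguments` parameters. Newer `transformers` releases name the first one `eval_strategy`.

```python
# Extra keyword arguments for TrainingArguments (a sketch, not the settings of this run).
best_checkpoint_kwargs = {
    "evaluation_strategy": "steps",   # called eval_strategy in newer transformers releases
    "eval_steps": 300,
    "save_strategy": "steps",         # has to match the evaluation strategy
    "save_steps": 300,                # a multiple of eval_steps
    "load_best_model_at_end": True,   # reload the best checkpoint when training ends
    "metric_for_best_model": "f1",    # key returned by compute_metrics
    "greater_is_better": True,        # a higher F1 is better
    "save_total_limit": 2,            # limit the number of checkpoints kept on disk
}

# Hypothetical usage:
# training_args = TrainingArguments(output_dir="...", **best_checkpoint_kwargs)
```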

In [45]:
trainer.evaluate()

{'eval_loss': 0.2603095769882202,
 'eval_accuracy': 0.9298823529411765,
 'eval_precision': 0.8985507246376812,
 'eval_recall': 0.8869814020028612,
 'eval_f1': 0.8927285817134629,
 'eval_runtime': 52.6238,
 'eval_samples_per_second': 121.143,
 'eval_steps_per_second': 7.582,
 'epoch': 1.0}

In [46]:
from sklearn.metrics import precision_score, recall_score, f1_score

predict_results = trainer.predict(test)

logits = predict_results.predictions
labels = predict_results.label_ids

predictions = np.argmax(logits, axis=1)

accuracy = (predictions == labels).mean()

precision = precision_score(labels, predictions)
recall = recall_score(labels, predictions)
f1 = f1_score(labels, predictions)

print(f"Accuracy: {accuracy}")
print(f"Precision: {precision}")
print(f"Recall: {recall}")
print(f"F1 Score: {f1}")

Accuracy: 0.9345882352941176
Precision: 0.9067234848484849
Recall: 0.8969555035128806
F1 Score: 0.9018130445020014


In [47]:
model.save_pretrained('/content/mydrive/MyDrive/NLP/model/')

In [40]:
trainer.save_model("/content/mydrive/MyDrive/NLP/result/checkpoint-1860")

7. Use the classifier as a re-ranker for finding the answers to the questions. Since the re-ranker is slow, you have to limit the subset of possible passages to the top-n (10, 50 or 100, depending on your GPU) texts returned by a much faster model, e.g. FTS.

8. The scheme for re-ranking is as follows:
I. Find passage candidates using FTS, where the query is the question.
II. Take top-n results returned by FTS.
III. Use the model to classify all pairs, where the first sentence is the question (query) and the second sentence is the passage returned by the FTS.
IV. Use the score returned by the model (i.e. the probability of the positive outcome) to re-rank the passages.

9. Compute how much the result of searching the passages improved over the results from lab 2. Use NDCG to compare the results.
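The re-ranking scheme above can be sketched independently of Elasticsearch and of the trained model. In the example below the scorer is a stand-in with hard-coded logits, and all names (`positive_probability`, `positive_from_pipeline`, `rerank`, the toy candidates) are assumptions made for illustration. The label name `"LABEL_1"` is the default that Transformers assigns to class 1 when no `id2label` mapping is configured.

```python
import math

def positive_probability(logits):
    """Softmax over [negative_logit, positive_logit]; return P(positive)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    return exps[1] / sum(exps)

def positive_from_pipeline(result, positive_label="LABEL_1"):
    """Convert a binary pipeline result {'label', 'score'} to P(positive).

    The pipeline's 'score' refers to the predicted label, so it has to be
    flipped when the predicted label is the negative class.
    """
    if result["label"] == positive_label:
        return result["score"]
    return 1.0 - result["score"]

def rerank(question, candidates, score_fn, top_k=5):
    """Order (passage_id, passage_text) candidates by P(positive), best first."""
    scored = [(score_fn(question, text), pid) for pid, text in candidates]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [pid for _, pid in scored[:top_k]]

# Stand-in scorer with hard-coded logits; a real one would run the classifier.
fake_logits = {"a": [2.0, -1.0], "b": [-0.5, 1.5], "c": [0.0, 0.0]}

def toy_score(question, passage_text):
    return positive_probability(fake_logits[passage_text])

candidates = [("doc-a", "a"), ("doc-b", "b"), ("doc-c", "c")]  # FTS order
order = rerank("q", candidates, toy_score, top_k=3)
```

Sorting directly by the pipeline's `score` would place a passage that is confidently classified as negative near the top, which is why the conversion to the positive-class probability matters.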

In [3]:
from transformers import pipeline

# "sentiment-analysis" is an alias of the generic "text-classification" task.
classifier = pipeline("text-classification", model="result")




In [4]:
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

In [None]:
analyzer = {
    "mappings": {
        "properties": {
            "text": {
                "type": "text",
                "analyzer": "nlp_analyzer"
            }
        }
    },
    "settings": {
        "analysis": {
            "analyzer": {
                "nlp_analyzer": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "morfologik_stem", "lowercase"]
                }
            }
        }
    }
}

In [None]:
from datasets import load_dataset
import pandas as pd

# Convert every split to pandas: the cells below rely on DataFrame methods
# such as iterrows(), groupby() and boolean indexing.
corpus_dataset = load_dataset("clarin-knext/fiqa-pl", "corpus")
corpus = corpus_dataset['corpus'].to_pandas()

queries_dataset = load_dataset("clarin-knext/fiqa-pl", "queries")
queries = queries_dataset['queries'].to_pandas()

qa_dataset = load_dataset("clarin-knext/fiqa-pl-qrels")
test_data = qa_dataset['test'].to_pandas()

In [None]:
from elasticsearch.helpers import bulk, BulkIndexError

# es.indices.delete(index="analyzer")
# Create the index under the same name that is used for indexing and searching below.
es.indices.create(index="analyzer", body=analyzer)

documents = []

for i, row in corpus.iterrows():
    document = {
        "_index": "analyzer",
        "_id": i,
        "_source": {
            "text": row["text"],
        }
    }
    documents.append(document)


try:
    success, _ = bulk(es, documents)
except BulkIndexError as e:
    print(f"{len(e.errors)} documents failed to load.")
    for error in e.errors:
        print(error)

In [None]:
query_blank = {
    "match": {
        "text": {
            "query": "blank"
        }
    }
}

query = queries['text'][110]
query_blank['match']['text']['query'] = query

resp = es.search(index="analyzer", query=query_blank)

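# Build one classifier input per FTS hit.
data = []
for val in resp['hits']['hits'][:10]:
    data.append(f"Question: {query} Passage: {val['_source']['text']}")

# Score every candidate; inputs are cut to 512 characters to keep them short.
rank = {}
for i, text in enumerate(data):
    text = text[:512]
    rank[i] = classifier(text)

rank.items()

# Order candidates by the pipeline's 'score' (the confidence of the predicted label) and keep the top 5.
sorted_rank = {k: v for k, v in sorted(rank.items(), key=lambda item: item[1][0]['score'], reverse=True)}
sorted_5 = {k: sorted_rank[k] for k in list(sorted_rank)[:5]}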

In [None]:
import numpy as np

max_matches = test_data.groupby('query-id')['corpus-id'].count().rename('count')

def calc_ndcg_k(true, predicted):
    dcg = np.sum(predicted / np.log2(np.arange(2, len(true) + 2)))
    idcg = np.sum(true / np.log2(np.arange(2, len(true) + 2)))
    ndcg = dcg / idcg
    return ndcg

ndcg = 0
i = 0

for query_id in test_data['query-id'].unique():
    query = queries[queries['_id'] == str(query_id)].iloc[0]['text']
    query_blank['match']['text']['query'] = query

    resp = es.search(index="analyzer", query=query_blank)

    corpus_ids = test_data[test_data['query-id'] == query_id]['corpus-id']

    relevant_set = set(corpus[corpus['_id'].isin(corpus_ids.astype(str))].index)

    predicted_relevances = np.zeros(5)
    true_relevances = np.zeros(5)

    matches = min(max_matches.loc[query_id], 5)
    true_relevances[:matches] = 1

    for idx, hit in enumerate(resp['hits']['hits'][:5]):
        doc_id = int(hit['_id'])

        if doc_id in relevant_set:
            predicted_relevances[idx] = 1

    ndcg_score = calc_ndcg_k(true_relevances, predicted_relevances)
    ndcg += ndcg_score
    i += 1

average_ndcg = ndcg / i

print(f"NDCG without re-ranking: {average_ndcg:.4f}")

In [7]:
ndcg = 0
i = 0

for query_id in test_data['query-id'].unique():
    query = queries[queries['_id'] == str(query_id)].iloc[0]['text']
    query_blank['match']['text']['query'] = query
    resp = es.search(index="analyzer", query=query_blank)

    data = []
    for val in resp['hits']['hits'][:10]:
        data.append(f"Question: {query} Passage: {val['_source']['text']}")

    corpus_ids = test_data[test_data['query-id'] == query_id]['corpus-id']
    tmp = set()
    for ind in corpus_ids:
        _id = corpus[corpus['_id'] == str(ind)].index.tolist()[0]
        tmp.add(_id)

    matches = min(max_matches.loc[query_id], 5)
    true_relevances = np.zeros(5)
    predicted_relevances = np.zeros(5)
    true_relevances[:matches] = 1

    rank = {}
    for j, text in enumerate(data):
        rank[j] = classifier(text[:512])

    sorted_rank = {k: v for k, v in sorted(rank.items(), key=lambda item: item[1][0]['score'], reverse=True)}
    sorted_5 = {k: sorted_rank[k] for k in list(sorted_rank)[:5]}

    for j, ind in enumerate(sorted_5.keys()):
        val = resp['hits']['hits'][ind]
        _id = int(val['_id'])

        if _id in tmp:
            predicted_relevances[j] = 1
        else:
            predicted_relevances[j] = 0

    ndcg_score = calc_ndcg_k(true_relevances, predicted_relevances)
    ndcg += ndcg_score
    i += 1

average_ndcg = ndcg / i
print(f"NDCG with re-ranking: {average_ndcg:.4f}")

NDCG with re-ranking: 0.1675
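The NDCG@5 reported above uses binary gains: a hit at rank r contributes 1 / log2(r + 1), and the ideal ranking places all relevant passages (at most five) at the top. A minimal standalone sketch of the same computation follows; the function names and the example ranking are assumptions for illustration and are not taken from the notebook's results.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain for a list of gains ordered by rank."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))

def ndcg_at_k(ranked_relevances, n_relevant, k=5):
    """NDCG@k with binary gains; n_relevant is the number of known relevant passages."""
    gains = list(ranked_relevances[:k])
    ideal = [1] * min(n_relevant, k)  # best case: every relevant passage at the top
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg else 0.0

# Example: two relevant passages, retrieved at ranks 2 and 5.
example = ndcg_at_k([0, 1, 0, 0, 1], n_relevant=2)  # roughly 0.62
```

Moving a relevant passage from rank 5 to rank 1 raises the score, which is the effect a good re-ranker is expected to have on the FTS results.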


### Questions

1. Would simpler methods, such as Bayesian bag-of-words, work for sentence-pair classification?

Yes, Bayesian bag-of-words can work for sentence-pair classification, but it has a limited ability to capture the context and semantics of sentences, which may limit its effectiveness on more advanced tasks.

2. Which hyperparameters did you choose for training?

The choice of hyperparameters depends on the model, but it usually covers the learning rate, the number of epochs, the batch size and, for neural networks, the sequence length and the warm-up steps. Resources such as the Hugging Face documentation and research papers help in choosing them. My parameters:
- output_dir="/content/mydrive/MyDrive/NLP/result",
- do_train=True,
- do_eval=True,
- evaluation_strategy="steps",
- eval_steps=300,
- per_device_train_batch_size=16,
- per_device_eval_batch_size=16,
- learning_rate=5e-05,
- num_train_epochs=1,
- logging_first_step=True,
- logging_strategy="steps",
- logging_steps=50,
- save_strategy="epoch",
- fp16=True

3. What are the advantages and disadvantages of neural network models in NLP?

The advantages include a better understanding of context and the flexibility to adapt to different NLP tasks. The disadvantages include high computational requirements and a lack of interpretability, which can make it difficult to explain decisions made on the basis of the model.