This tutorial is copied from:
https://medium.com/swlh/simple-transformers-multi-class-text-classification-with-bert-roberta-xlnet-xlm-and-8b585000ce3a

In [5]:
import pandas as pd

train_df = pd.read_csv('./data/ag_news_csv/train.csv', header=None)
train_df.head()

Unnamed: 0,0,1,2
0,3,Wall St. Bears Claw Back Into the Black (Reuters),"Reuters - Short-sellers, Wall Street's dwindli..."
1,3,Carlyle Looks Toward Commercial Aerospace (Reu...,Reuters - Private investment firm Carlyle Grou...
2,3,Oil and Economy Cloud Stocks' Outlook (Reuters),Reuters - Soaring crude prices plus worries\ab...
3,3,Iraq Halts Oil Exports from Main Southern Pipe...,Reuters - Authorities have halted oil export\f...
4,3,"Oil prices soar to all-time record, posing new...","AFP - Tearaway world oil prices, toppling reco..."


In [6]:
train_df['text'] = train_df.iloc[:, 1] + " " + train_df.iloc[:, 2]
train_df = train_df.drop(train_df.columns[[1, 2]], axis=1)
train_df.columns = ['label', 'text']
train_df = train_df[['text', 'label']]
train_df['text'] = train_df['text'].apply(lambda x: x.replace('\\', ' '))
train_df['label'] = train_df['label'].apply(lambda x:x-1)

eval_df = pd.read_csv('./data/ag_news_csv/test.csv', header=None)
eval_df['text'] = eval_df.iloc[:, 1] + " " + eval_df.iloc[:, 2]
eval_df = eval_df.drop(eval_df.columns[[1, 2]], axis=1)
eval_df.columns = ['label', 'text']
eval_df = eval_df[['text', 'label']]
eval_df['text'] = eval_df['text'].apply(lambda x: x.replace('\\', ' '))
eval_df['label'] = eval_df['label'].apply(lambda x:x-1)

In [7]:
train_df.head()

Unnamed: 0,text,label
0,Wall St. Bears Claw Back Into the Black (Reute...,2
1,Carlyle Looks Toward Commercial Aerospace (Reu...,2
2,Oil and Economy Cloud Stocks' Outlook (Reuters...,2
3,Iraq Halts Oil Exports from Main Southern Pipe...,2
4,"Oil prices soar to all-time record, posing new...",2


In [8]:
eval_df.head()

Unnamed: 0,text,label
0,Fears for T N pension after talks Unions repre...,2
1,The Race is On: Second Private Team Sets Launc...,3
2,Ky. Company Wins Grant to Study Peptides (AP) ...,3
3,Prediction Unit Helps Forecast Wildfires (AP) ...,3
4,Calif. Aims to Limit Farm-Related Smog (AP) AP...,3


Simple Transformers requires data to be in Pandas DataFrames with at least two columns. You can simply name your columns `text` and `labels`, and Simple Transformers will take care of handling the data. Alternatively, you can follow the positional convention below (the AG News DataFrames above, whose columns are named `text` and `label`, rely on it).

- The first column contains the text and is of type str.
- The second column contains the labels and is of type int.

For **multiclass classification**, the **labels should be integers starting from 0**. If your data uses other labels, you can use a Python dict to keep a mapping from the original labels to the integer labels.
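
As a minimal sketch of such a mapping, assuming hypothetical string labels (the names `raw_labels`, `label_to_id`, and `id_to_label` are illustrative and not part of Simple Transformers):

```python
# Hypothetical raw labels; any hashable values (strings, non-contiguous ints) work.
raw_labels = ["sports", "world", "business", "sports", "sci/tech"]

# Sort the distinct labels so the mapping is reproducible across runs.
label_to_id = {label: idx for idx, label in enumerate(sorted(set(raw_labels)))}

# Inverse mapping, for turning integer predictions back into the original labels.
id_to_label = {idx: label for label, idx in label_to_id.items()}

# Integer labels starting from 0, ready to be used as the label column.
int_labels = [label_to_id[label] for label in raw_labels]

print(label_to_id)
print(int_labels)
```

With a DataFrame, the same dict can be applied to a column with `Series.map(label_to_id)`.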

In [9]:
#!pip install simpletransformers

In [4]:
from simpletransformers.classification import ClassificationModel

# Create a ClassificationModel
model = ClassificationModel('roberta', 'roberta-base', num_labels=4, use_cuda=False)

Downloading:   0%|          | 0.00/501M [00:00<?, ?B/s]

Some weights of the model checkpoint at roberta-base were not used when initializing RobertaForSequenceClassification: ['lm_head.bias', 'lm_head.dense.weight', 'lm_head.dense.bias', 'lm_head.layer_norm.weight', 'lm_head.layer_norm.bias', 'lm_head.decoder.weight']
- This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of RobertaForSequenceClassification were not initialized from the model checkpoint at roberta-base and are newly initialized: ['classifier.dense.weight', 'classifier.dense.bias', 'classifier.out_proj.weight', 'classifier.out

Downloading:   0%|          | 0.00/899k [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/456k [00:00<?, ?B/s]

To load a previously saved model instead of a default model, change the model_name to the path of a directory that contains a saved model.

`model = ClassificationModel('xlnet', 'path_to_model/', num_labels=4)`

A ClassificationModel has a dict args which contains many attributes that provide control over hyperparameters. For a detailed description of each attribute, please refer to the [repo](https://github.com/ThilinaRajapakse/simpletransformers). The default values are given below.

self.args = {
    "output_dir": "outputs/",
    "cache_dir": "cache_dir/",

    "fp16": True,
    "fp16_opt_level": "O1",
    "max_seq_length": 128,
    "train_batch_size": 8,
    "gradient_accumulation_steps": 1,
    "eval_batch_size": 8,
    "num_train_epochs": 1,
    "weight_decay": 0,
    "learning_rate": 4e-5,
    "adam_epsilon": 1e-8,
    "warmup_ratio": 0.06,
    "warmup_steps": 0,
    "max_grad_norm": 1.0,

    "logging_steps": 50,
    "save_steps": 2000,

    "overwrite_output_dir": False,
    "reprocess_input_data": False,
    "evaluate_during_training": False,

    "process_count": cpu_count() - 2 if cpu_count() > 2 else 1,
    "n_gpu": 1,
}

Any of these attributes can be modified when creating a ClassificationModel or when calling its train_model method by simply passing in a dict containing the key-value pairs to be updated. An example is given below.

```
# Create a ClassificationModel with modified attributes
model = ClassificationModel('roberta', 'roberta-base', num_labels=4, args={'learning_rate':1e-5, 'num_train_epochs': 2, 'reprocess_input_data': True, 'overwrite_output_dir': True})
```
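
Conceptually, passing such a dict behaves like a dict update on top of the defaults. The sketch below illustrates that idea in plain Python; it is not the library's actual implementation, and `DEFAULT_ARGS` and `merge_args` are hypothetical names holding a subset of the defaults listed above.

```python
# Subset of the default values listed above (copied from this tutorial).
DEFAULT_ARGS = {
    "learning_rate": 4e-5,
    "num_train_epochs": 1,
    "train_batch_size": 8,
    "reprocess_input_data": False,
    "overwrite_output_dir": False,
}


def merge_args(defaults, overrides=None):
    """Return a new dict in which keys from `overrides` replace the defaults."""
    merged = dict(defaults)         # copy so the defaults stay untouched
    merged.update(overrides or {})  # user-supplied values win
    return merged


args = merge_args(DEFAULT_ARGS, {"learning_rate": 1e-5, "num_train_epochs": 2})
print(args["learning_rate"], args["num_train_epochs"], args["train_batch_size"])
```

Keys that are not overridden, such as `train_batch_size` here, keep their default values.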

## Training

In [16]:
# Train the model (num_labels is set when creating the ClassificationModel, not here)
model.train_model(train_df, args={'learning_rate':1e-5, 'num_train_epochs': 2, 'reprocess_input_data': True, 'overwrite_output_dir': True})

  0%|          | 0/120000 [00:00<?, ?it/s]

Epoch:   0%|          | 0/2 [00:00<?, ?it/s]

Running Epoch 0 of 2:   0%|          | 0/15000 [00:00<?, ?it/s]

KeyboardInterrupt: 

That’s all you have to do to train the model. You can also change the hyperparameters by passing a dict of the relevant attributes to the `train_model` method. Note that these modifications persist after training is completed.
The `train_model` method saves a checkpoint of the model every n steps, where n is `self.args['save_steps']`. When training completes, the final model is saved to `self.args['output_dir']`.
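
The step counts in the progress bars above follow from the dataset size and the batch size. The arithmetic below is a back-of-the-envelope sketch, assuming checkpoints land on exact multiples of `save_steps` and that `gradient_accumulation_steps` stays at its default of 1; the variable names are illustrative.

```python
import math

num_examples = 120_000   # rows in train_df, per the first progress bar
train_batch_size = 8     # default value
num_train_epochs = 2     # value passed in the training call
save_steps = 2000        # default value

# Number of optimizer steps needed to see every example once.
steps_per_epoch = math.ceil(num_examples / train_batch_size)
total_steps = steps_per_epoch * num_train_epochs

# Assumption: one checkpoint at every multiple of save_steps.
checkpoint_steps = list(range(save_steps, total_steps + 1, save_steps))

print(steps_per_epoch)        # 15000, matching "Running Epoch 0 of 2: 0/15000"
print(total_steps)            # 30000
print(len(checkpoint_steps))  # 15
```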

## Evaluation

To evaluate the model, just call eval_model. This method has three return values.
- **result:** The evaluation result in the form of a dict. By default, only the Matthews correlation coefficient (MCC) is calculated for multiclass classification.
- **model_outputs:** A list of model outputs for each item in the evaluation dataset. This is useful if you need probabilities for each class rather than a single prediction. The outputs are raw scores: the prediction is the class with the highest score, and applying a softmax function over the outputs gives the class probabilities.
- **wrong_predictions:** A list of InputFeature of each incorrect prediction. The text may be obtained from the InputFeature.text_a attribute. (The InputFeature class can be found in the utils.py file in the repo)
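
To turn model outputs into probabilities, a softmax can be applied per example. The sketch below uses only the standard library and assumes the outputs are raw, unnormalized scores (logits), which is my understanding of the library's behaviour; the `softmax` helper and the example values are illustrative.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Illustrative raw outputs for one example with two classes.
logits = [0.04048078, -0.02660424]
probs = softmax(logits)

# The predicted class is the index of the largest probability (or largest logit).
predicted_class = max(range(len(probs)), key=probs.__getitem__)

print(probs)
print(predicted_class)
```

Because softmax preserves the ordering of the scores, taking the argmax of the probabilities gives the same class as taking the argmax of the raw outputs.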

You can also include additional metrics to be used in the evaluation. Simply pass in the metrics functions as keyword arguments to the `eval_model` method. The metrics functions should take in two parameters, the first one being the true label, and the second being the predictions. This follows the sklearn standard.

For metric functions that need additional parameters (such as `average` for `f1_score` in sklearn), wrap them in your own function that sets those parameters and pass that wrapper to `eval_model`.

In [None]:
from sklearn.metrics import f1_score, accuracy_score


def f1_multiclass(labels, preds):
    return f1_score(labels, preds, average='micro')
    
result, model_outputs, wrong_predictions = model.eval_model(eval_df, f1=f1_multiclass, acc=accuracy_score)

For reference, the results I obtained with these hyperparameters are as follows:  
`{'mcc': 0.937104098029913, 'f1': 0.9527631578947369, 'acc': 0.9527631578947369}`
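
The `f1` and `acc` values above are identical, which is expected: for single-label multiclass data, micro-averaged F1 reduces to accuracy, because every misclassification counts once as a false positive and once as a false negative. The pure-Python sketch below checks this on made-up labels; `accuracy` and `micro_f1` are illustrative helpers, not sklearn functions.

```python
def accuracy(labels, preds):
    """Fraction of predictions that match the true labels."""
    return sum(l == p for l, p in zip(labels, preds)) / len(labels)


def micro_f1(labels, preds):
    """Micro-averaged F1: pool TP/FP/FN over all classes, then compute F1."""
    classes = set(labels) | set(preds)
    tp = fp = fn = 0
    for c in classes:
        tp += sum(l == c and p == c for l, p in zip(labels, preds))
        fp += sum(l != c and p == c for l, p in zip(labels, preds))
        fn += sum(l == c and p != c for l, p in zip(labels, preds))
    # Note: this sketch assumes at least one correct prediction (tp > 0).
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up labels for four classes; two of the eight predictions are wrong.
labels = [0, 1, 2, 3, 2, 1, 0, 3]
preds = [0, 1, 2, 2, 2, 1, 3, 3]

print(accuracy(labels, preds))
print(micro_f1(labels, preds))
```

Passing `average='macro'` or `average='weighted'` to `f1_score` instead would generally give a value that differs from accuracy.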

## Prediction/Testing

In real-world applications, we often have no idea what the true label is. To perform predictions on arbitrary examples, you can use the predict method. This method is fairly similar to the `eval_model` method except that this takes in a simple list of text and returns a list of predictions and a list of model outputs.  

`predictions, raw_outputs = model.predict(['Some arbitrary sentence'])` 
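
The returned predictions are the integer labels used in training, so a small lookup turns them back into readable names. The class names below are my understanding of the AG News categories after the `x-1` shift applied earlier; they are not stated in this notebook, so verify them against the dataset's own documentation. The `predictions` list is a made-up stand-in for the output of `model.predict`.

```python
# Assumed AG News class names, indexed by the zero-based labels used above.
ID_TO_LABEL = {0: "World", 1: "Sports", 2: "Business", 3: "Sci/Tech"}

# Stand-in for the `predictions` returned by model.predict(...).
predictions = [2, 3, 0]

# int() guards against NumPy integer types in the real predictions.
predicted_names = [ID_TO_LABEL[int(p)] for p in predictions]
print(predicted_names)
```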

# Quick example

Taken from GitHub (for a quick run): https://github.com/ThilinaRajapakse/simpletransformers

1. Initialize a task-specific model
2. Train the model with train_model()
3. Evaluate the model with eval_model()
4. Make predictions on (unlabelled) data with predict()

In [17]:
from simpletransformers.classification import ClassificationModel, ClassificationArgs
import pandas as pd
import logging

In [18]:
logging.basicConfig(level=logging.INFO)
transformers_logger = logging.getLogger("transformers")
transformers_logger.setLevel(logging.WARNING)

In [19]:
# Preparing train data
train_data = [
    ["Aragorn was the heir of Isildur", 1],
    ["Frodo was the heir of Isildur", 0],
]
train_df = pd.DataFrame(train_data)
train_df.columns = ["text", "labels"]

In [20]:
train_df.head()

Unnamed: 0,text,labels
0,Aragorn was the heir of Isildur,1
1,Frodo was the heir of Isildur,0


In [22]:
# Preparing eval data
eval_data = [
    ["Theoden was the king of Rohan", 1],
    ["Merry was the king of Rohan", 0],
]
eval_df = pd.DataFrame(eval_data)
eval_df.columns = ["text", "labels"]

In [23]:
eval_df.head()

Unnamed: 0,text,labels
0,Theoden was the king of Rohan,1
1,Merry was the king of Rohan,0


In [24]:
# Optional model configuration
model_args = ClassificationArgs(num_train_epochs=1)

In [26]:
# Create a ClassificationModel
model = ClassificationModel(
    "roberta", "roberta-base", args=model_args, use_cuda=False)

Some weights of the model checkpoint at roberta-base were not used when initializing RobertaForSequenceClassification: ['lm_head.bias', 'lm_head.dense.weight', 'lm_head.dense.bias', 'lm_head.layer_norm.weight', 'lm_head.layer_norm.bias', 'lm_head.decoder.weight']
- This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of RobertaForSequenceClassification were not initialized from the model checkpoint at roberta-base and are newly initialized: ['classifier.dense.weight', 'classifier.dense.bias', 'classifier.out_proj.weight', 'classifier.out

In [27]:
# Train the model
model.train_model(train_df)

INFO:simpletransformers.classification.classification_model: Converting to features started. Cache is not used.


  0%|          | 0/2 [00:00<?, ?it/s]

Epoch:   0%|          | 0/1 [00:00<?, ?it/s]

Running Epoch 0 of 1:   0%|          | 0/1 [00:00<?, ?it/s]

INFO:simpletransformers.classification.classification_model: Training of roberta model complete. Saved to outputs/.


(1, 0.6666298508644104)

In [28]:
# Evaluate the model
result, model_outputs, wrong_predictions = model.eval_model(eval_df)

INFO:simpletransformers.classification.classification_model: Converting to features started. Cache is not used.


  0%|          | 0/2 [00:00<?, ?it/s]

Running Evaluation:   0%|          | 0/1 [00:00<?, ?it/s]

  mcc = cov_ytyp / np.sqrt(cov_ytyt * cov_ypyp)
INFO:simpletransformers.classification.classification_model:{'mcc': 0.0, 'tp': 0, 'tn': 1, 'fp': 0, 'fn': 1, 'auroc': 1.0, 'auprc': 1.0, 'eval_loss': 0.6931201815605164}
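
The `mcc` of 0.0 in the log line above can be reproduced from the confusion counts it reports. For binary labels, MCC is (tp·tn − fp·fn) / sqrt((tp+fp)(tp+fn)(tn+fp)(tn+fn)); when the denominator is zero the score is conventionally reported as 0, which is presumably what the stray `mcc = cov_ytyp / ...` warning fragment relates to. `binary_mcc` is an illustrative helper, not the library's code.

```python
import math


def binary_mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Degenerate case (e.g. only one class was ever predicted).
        return 0.0
    return numerator / denominator


# Counts from the evaluation log: the model predicted class 0 for both examples.
print(binary_mcc(tp=0, tn=1, fp=0, fn=1))  # 0.0

# A perfect classifier on the same two examples, for comparison.
print(binary_mcc(tp=1, tn=1, fp=0, fn=0))  # 1.0
```

With only two training sentences and one epoch, a near-chance result like this is unsurprising; the quick example is meant to exercise the API rather than produce a useful model.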


In [31]:
# Make predictions with the model
predictions, raw_outputs = model.predict(["Sam was a Wizard"])

INFO:simpletransformers.classification.classification_model: Converting to features started. Cache is not used.


  0%|          | 0/1 [00:00<?, ?it/s]

  0%|          | 0/1 [00:00<?, ?it/s]

In [32]:
predictions

[0]

In [33]:
raw_outputs

array([[ 0.04048078, -0.02660424]])