# Simple Transformers: Introducing The Easy To Use BERT, RoBERTa, XLNet, and XLM
Want to use Transformer models for NLP? Pages of code got you down? Not anymore, because Simple Transformers is on the job. Start, train, and evaluate Transformers with just 3 lines of code!

#### Original Source

https://towardsdatascience.com/simple-transformers-introducing-the-easiest-bert-roberta-xlnet-and-xlm-library-58bf8c59b2a3 

## Preface
The Simple Transformers library is built as a wrapper around the excellent Transformers library by Hugging Face. I am eternally grateful for the hard work done by the folks at Hugging Face to enable the public to easily access and use Transformer models. I don’t know what I’d have done without you guys!



## Introduction
I believe it’s fair to say that the success of Transformer models has been nothing short of phenomenal in advancing the field of Natural Language Processing. Not only have they shown staggering leaps in performance on many of the NLP tasks they were designed to solve, but pre-trained Transformers are also almost uncannily good at Transfer Learning. This means that anyone can take advantage of the long hours and the mind-boggling computational power that have gone into training these models to perform a countless variety of NLP tasks. You don’t need the deep pockets of Google or Facebook to build a state-of-the-art model to solve your NLP problem anymore!

Or so one might hope. The truth is that getting these models to work still requires substantial technical know-how. Unless you have expertise or at least experience in deep learning, it can seem a daunting challenge. I am happy to say that my previous articles on Transformers (here and here) seem to have helped a lot of people get a start on using Transformers. Interestingly, I noticed that people of various backgrounds (linguistics, medicine, and business to name but a few) were attempting to use these models to solve problems in their own domain. However, the technical barriers that need to be overcome in order to adapt Transformers to specific tasks are non-trivial and may even be rather discouraging.



## Simple Transformers
This conundrum was the main motivation behind my decision to develop a simple library to perform (binary and multiclass) text classification (the most common NLP task that I’ve seen) using Transformers. The idea was to make it as simple as possible, which means abstracting away a lot of the implementation and technical details. The implementation of the library can be found on GitHub. I highly encourage you to look at it to get a better idea of how everything works, although it is not necessary to know the inner details to use the library.

To that end, the Simple Transformers library was written so that a Transformer model can be initialized, trained on a given dataset, and evaluated on a given dataset, in just 3 lines of code! Let’s see how it’s done, shall we?

## Installation

In [None]:
!pip install simpletransformers

## Usage
A quick look at how to use this library on the Yelp Reviews dataset.

Download Yelp Reviews Dataset.

https://s3.amazonaws.com/fast-ai-nlp/yelp_review_polarity_csv.tgz
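
If you would rather script this step, a minimal sketch along the following lines may help. It assumes the archive keeps its CSV files inside a subfolder and that you want them copied into data/ (the prefix used by the loading code). `extract_csvs` and `download_yelp` are hypothetical helper names for illustration, not part of Simple Transformers.

```python
import os
import shutil
import tarfile
import urllib.request

# URL taken from the article; 'data' matches the prefix used when loading the CSVs.
YELP_URL = "https://s3.amazonaws.com/fast-ai-nlp/yelp_review_polarity_csv.tgz"


def extract_csvs(archive_path, dest_dir="data"):
    """Copy every .csv member of a .tgz archive into dest_dir, dropping subfolders."""
    os.makedirs(dest_dir, exist_ok=True)
    extracted = []
    with tarfile.open(archive_path, "r:gz") as archive:
        for member in archive.getmembers():
            if member.isfile() and member.name.endswith(".csv"):
                # Keep only the file name so nothing is written outside dest_dir.
                target = os.path.join(dest_dir, os.path.basename(member.name))
                with archive.extractfile(member) as src, open(target, "wb") as dst:
                    shutil.copyfileobj(src, dst)
                extracted.append(target)
    return sorted(extracted)


def download_yelp(dest_dir="data", archive_path="yelp_review_polarity_csv.tgz"):
    """Download the archive if it is not already present, then extract the CSVs."""
    if not os.path.exists(archive_path):
        urllib.request.urlretrieve(YELP_URL, archive_path)
    return extract_csvs(archive_path, dest_dir)
```

Calling `download_yelp()` once should leave the CSV files in data/, assuming the archive layout described above.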

In [2]:
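import pandas as pd

# Folder that holds the extracted train.csv and test.csv
prefix = 'data/'

# The CSV files have no header row: column 0 is the class, column 1 is the review text
train_df = pd.read_csv(prefix + 'train.csv', header=None)
eval_df = pd.read_csv(prefix + 'test.csv', header=None)

# Map the original classes (1 = negative, 2 = positive) to 0/1 labels
train_df[0] = (train_df[0] == 2).astype(int)
eval_df[0] = (eval_df[0] == 2).astype(int)

# Put the text first and the label second, replacing newlines with spaces
train_df = pd.DataFrame({
    'text': train_df[1].replace(r'\n', ' ', regex=True),
    'label': train_df[0]
})

print(train_df.head())

eval_df = pd.DataFrame({
    'text': eval_df[1].replace(r'\n', ' ', regex=True),
    'label': eval_df[0]
})

print(eval_df.head())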

                                                text  label
0  Unfortunately, the frustration of being Dr. Go...      0
1  Been going to Dr. Goldberg for over 10 years. ...      1
2  I don't know what Dr. Goldberg was like before...      0
3  I'm writing this review to give you a heads up...      0
4  All the food is great here. But the best thing...      1
                                                text  label
0  Contrary to other reviews, I have zero complai...      1
1  Last summer I had an appointment to get new ti...      0
2  Friendly staff, same starbucks fair you get an...      1
3  The food is good. Unfortunately the service is...      0
4  Even when we didn't have a car Filene's Baseme...      1


Nothing fancy here; we are just getting the data into the correct form. This is all you have to do for any dataset:

- Create two pandas DataFrame objects for the train and eval portions.
- Each DataFrame should have two columns. The first column contains the text that you want to train or evaluate on and has the datatype str. The second column has the corresponding label and has the datatype int.

Update: It is now recommended to name the columns labels and text rather than relying on the order of the columns.
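
As a rough illustration of that format, here is a small sketch that builds a frame with named text and labels columns. The three sample rows are invented for the example, and `to_model_frame` is a hypothetical helper name rather than anything provided by the library.

```python
import pandas as pd

# Toy rows invented for illustration; the real data comes from train.csv and test.csv.
raw = pd.DataFrame({
    0: [2, 1, 2],  # original classes: 1 = negative, 2 = positive
    1: ["Great food.\nWill return.", "Slow service.", "Loved it!"],
})


def to_model_frame(raw_df):
    """Return a DataFrame with a str 'text' column and an int 'labels' column."""
    return pd.DataFrame({
        "text": raw_df[1].replace(r"\n", " ", regex=True).astype(str),
        "labels": (raw_df[0] == 2).astype(int),
    })


toy_df = to_model_frame(raw)
print(toy_df)
```

Naming the columns explicitly means the frame no longer depends on column order.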
## Model Training
With the data in order, it’s time to train and evaluate the model.

In [3]:
from simpletransformers.classification import ClassificationModel


# Create a ClassificationModel (model type, then pre-trained model name)
model = ClassificationModel('roberta', 'roberta-base')

# Train the model
model.train_model(train_df)

# Evaluate the model
result, model_outputs, wrong_predictions = model.eval_model(eval_df)


Some weights of the model checkpoint at roberta-base were not used when initializing RobertaForSequenceClassification: ['roberta.pooler.dense.weight', 'lm_head.bias', 'lm_head.decoder.weight', 'lm_head.dense.weight', 'lm_head.layer_norm.bias', 'lm_head.dense.bias', 'lm_head.layer_norm.weight', 'roberta.pooler.dense.bias']
- This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of RobertaForSequenceClassification were not initialized from the model checkpoint at roberta-base and are newly initialized: ['classifier.out_proj.weight', 'classi


In [6]:
result

{'mcc': 0.8924143244050525,
 'tp': 17725,
 'tn': 18225,
 'fp': 775,
 'fn': 1275,
 'auroc': 0.9883916717451523,
 'auprc': 0.9888277144496783,
 'eval_loss': 0.2168530688913245}
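
The dictionary reports raw confusion-matrix counts rather than accuracy or F1. A short sketch of how those could be derived from the counts, using the standard textbook formulas (this is not code from the library):

```python
import math

# Confusion-matrix counts copied from the result dictionary above.
tp, tn, fp, fn = 17725, 18225, 775, 1275

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

# Matthews correlation coefficient, which should agree with the reported 'mcc'.
mcc = (tp * tn - fp * fn) / math.sqrt(
    (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
)

print(f"accuracy={accuracy:.4f} precision={precision:.4f} "
      f"recall={recall:.4f} f1={f1:.4f} mcc={mcc:.4f}")
```

With these counts the accuracy comes out at roughly 0.946, and the recomputed MCC matches the reported value.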

In [7]:
model_outputs

array([[-2.49804688,  2.69921875],
       [ 3.22265625, -4.0234375 ],
       [-2.65429688,  2.9375    ],
       ...,
       [ 3.21679688, -4.01953125],
       [ 3.22265625, -4.02734375],
       [ 1.6953125 , -2.20507812]])
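
Assuming each row of model_outputs holds unnormalized per-class scores (logits), a softmax turns them into probabilities. The sketch below uses the first three rows printed above; `softmax` is a small helper written for this example.

```python
import numpy as np

# First three rows of model_outputs as printed above (one column per class).
logits = np.array([
    [-2.49804688, 2.69921875],
    [3.22265625, -4.0234375],
    [-2.65429688, 2.9375],
])


def softmax(x):
    """Row-wise softmax; subtracting the row max keeps exp() numerically stable."""
    shifted = x - x.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


probs = softmax(logits)           # each row sums to 1
predicted = probs.argmax(axis=1)  # index of the most likely class per row

print(probs.round(4))
print(predicted)
```

The class with the larger score in each row is the predicted label, so these three rows map to classes 1, 0, and 1.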

## That’s it!

To make predictions on other text, ClassificationModel comes with a predict(to_predict) method which, given a list of texts, returns the model predictions and the raw model outputs.

For more details on all available methods, please see the GitHub repo. The repo also contains a minimal example of using the library.

In [11]:
to_predict = ["Been going to Dr. Goldberg for over 10 years. I think I was one of his 1st patients when he started at MHMG. He's been great over the years and is really all about the big picture. It is because of him, not my now former gyn Dr. Markoff, that I found out I have fibroids. He explores all options with you and is very patient and understanding. He doesn't judge and asks all the right questions. Very thorough and wants to be kept in the loop on every aspect of your medical health and your life.",
              "This place is absolute garbage...  Half of the tees are not available, including all the grass tees.  It is cash only, and they sell the last bucket at 8, despite having lights.  And if you finish even a minute after 8, don't plan on getting a drink.  The vending machines are sold out (of course) and they sell drinks inside, but close the drawers at 8 on the dot.  There are weeds grown all over the place.  I noticed some sort of batting cage, but it looks like those are out of order as well.  Someone should buy this place and turn it into what it should be.",
              "this place sucks old school trash"]
predictions = model.predict(to_predict)
predictions

(array([1, 0, 0]), array([[-2.9765625 ,  3.484375  ],
        [ 3.21289062, -4.01171875],
        [ 2.71679688, -3.42382812]]))