# Complete walkthrough

*Installing the Hugging Face PyTorch interface for BERT* <br>
`!pip3 install -q transformers`


## A. Easy dataset (IMDB)
The first example trains a simple binary classifier (positive vs. negative sentiment) on the IMDB movie review dataset.

In [2]:
!pip3 install -q transformers


In [1]:
import io
import os
from pathlib import Path

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import torch
from sklearn.model_selection import train_test_split
from torch.utils.data import TensorDataset, DataLoader, RandomSampler, SequentialSampler
from tqdm import tqdm, trange
from transformers import AutoTokenizer, BertConfig, BertTokenizer
from transformers import DistilBertForSequenceClassification, DistilBertTokenizerFast
from transformers import Trainer, TrainingArguments

In [21]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
n_gpu = torch.cuda.device_count()
if torch.cuda.is_available():
    # print explicitly: an expression inside an `if` block is not echoed by the notebook
    print(torch.cuda.get_device_name(0))

#### 1. Get the Data

In [5]:
# download the dataset
!wget http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz
!tar -xf aclImdb_v1.tar.gz

--2022-02-22 02:48:39--  http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz
Resolving ai.stanford.edu (ai.stanford.edu)... 171.64.68.10
Connecting to ai.stanford.edu (ai.stanford.edu)|171.64.68.10|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 84125825 (80M) [application/x-gzip]
Saving to: ‘aclImdb_v1.tar.gz’


2022-02-22 02:48:42 (33.0 MB/s) - ‘aclImdb_v1.tar.gz’ saved [84125825/84125825]



In [None]:
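# uncompress the dataset (only needed if the archive was not already extracted above)
# the data is organised as:
# aclImdb/train/neg
# aclImdb/train/pos
# aclImdb/test/neg
# aclImdb/test/pos
# there is also aclImdb/train/unsup (unlabeled reviews), which is not used here

!tar -xvzf aclImdb_v1.tar.gz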


#### 2. Split the data

In [20]:
def read_imdb_split(split_dir):
    '''
    Read one split of the dataset and return its texts and labels.
    @param : split_dir (str) the directory where the data is stored
    @return : (texts, labels), with label 0 for "neg" and 1 for "pos"
    '''
    split_dir = Path(split_dir)
    texts = []
    labels = []
    # the two types of reviews are stored in separate subfolders
    for label_dir in ["pos", "neg"]:
        for text_file in (split_dir/label_dir).iterdir():
            # append the file content
            texts.append(text_file.read_text(encoding="utf-8"))
            # append the label based on the subfolder name
            # (compare strings with ==, not `is`, which tests object identity)
            labels.append(0 if label_dir == "neg" else 1)

    return texts, labels
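
The reader can be checked without downloading the full archive. The sketch below builds a tiny directory with the same `pos`/`neg` layout in a temporary folder and reads it back. The helper name `read_labeled_dir` and the sample texts are illustrative assumptions, not part of the notebook; the logic mirrors the function above, comparing the folder name with `==` and sorting the files so that the order is reproducible.

```python
import tempfile
from pathlib import Path


def read_labeled_dir(split_dir):
    """Return (texts, labels) from split_dir/pos and split_dir/neg (hypothetical helper)."""
    split_dir = Path(split_dir)
    texts, labels = [], []
    for label_dir in ["pos", "neg"]:
        # sorted() makes the file order reproducible across runs
        for text_file in sorted((split_dir / label_dir).iterdir()):
            texts.append(text_file.read_text(encoding="utf-8"))
            labels.append(0 if label_dir == "neg" else 1)
    return texts, labels


# Build a toy dataset with the same layout as aclImdb/train
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    samples = {"pos": ["Great film."], "neg": ["Dull plot.", "Bad acting."]}
    for label, reviews in samples.items():
        (root / label).mkdir()
        for i, review in enumerate(reviews):
            (root / label / f"{i}.txt").write_text(review, encoding="utf-8")
    demo_texts, demo_labels = read_labeled_dir(root)

print(demo_labels)
```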

In [21]:
# extract train & test text data
train_texts, train_labels = read_imdb_split('aclImdb/train')
test_texts, test_labels = read_imdb_split('aclImdb/test')

In [22]:
train_texts[0]

# each entry is the raw text of one review;
# its label (0 = negative, 1 = positive) is stored at the same index in train_labels


'"Telefilms" tend to fall under the pitfalls of a low budget and a hasty shooting schedule, which is why this film always tends to buck the trend.<br /><br />George C. Scott embodies Ebenezer Scrooge perfectly, fully encompassing all of his cold tendencies, and still makes him a simpathetic character. The production value for this film was exceptional, never relying on boffo special effects or soundstage set-ups, yet relying on the depth and clarity of on-site shooting and strong backdrops. A movie that certainly stands alone.'

In [23]:
train_labels[0]

1

In [24]:
# split the training data into train & validation sets (80% / 20%)
train_texts, val_texts, train_labels, val_labels = train_test_split(train_texts, train_labels, test_size=.2)
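
The split above is random and unseeded, so it changes on every run. A possible variant, sketched here on toy data, fixes the seed and stratifies on the labels so that both parts keep the same positive/negative ratio. The toy `texts`/`labels` lists and the seed value are assumptions made for illustration; `stratify` and `random_state` are standard `train_test_split` parameters.

```python
from collections import Counter

from sklearn.model_selection import train_test_split

# Toy stand-ins for the review texts and their 0/1 labels
texts = [f"review {i}" for i in range(100)]
labels = [0] * 50 + [1] * 50

tr_texts, va_texts, tr_labels, va_labels = train_test_split(
    texts,
    labels,
    test_size=0.2,
    stratify=labels,   # keep the pos/neg ratio identical in both parts
    random_state=42,   # fixed seed so the split is reproducible
)

print(Counter(tr_labels), Counter(va_labels))
```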

In [35]:

tokenizer = DistilBertTokenizerFast.from_pretrained('distilbert-base-uncased')

loading file https://huggingface.co/distilbert-base-uncased/resolve/main/vocab.txt from cache at /root/.cache/huggingface/transformers/0e1bbfda7f63a99bb52e3915dcf10c3c92122b827d92eb2d34ce94ee79ba486c.d789d64ebfe299b0e416afc4a169632f903f693095b4629a7ea271d5a0cf2c99
loading file https://huggingface.co/distilbert-base-uncased/resolve/main/tokenizer.json from cache at /root/.cache/huggingface/transformers/75abb59d7a06f4f640158a9bfcde005264e59e8d566781ab1415b139d2e4c603.7f2721073f19841be16f41b0a70b600ca6b880c8f3df6f3535cbc704371bdfa4
loading file https://huggingface.co/distilbert-base-uncased/resolve/main/added_tokens.json from cache at None
loading file https://huggingface.co/distilbert-base-uncased/resolve/main/special_tokens_map.json from cache at None
loading file https://huggingface.co/distilbert-base-uncased/resolve/main/tokenizer_config.json from cache at /root/.cache/huggingface/transformers/8c8624b8ac8aa99c60c912161f8332de003484428c47906d7ff7eb7f73eecdbb.20430bd8e10ef77a7d2977accef

In [26]:
train_encodings = tokenizer(train_texts, truncation=True, padding=True, max_length=128)
val_encodings = tokenizer(val_texts, truncation=True, padding=True, max_length=128)
test_encodings = tokenizer(test_texts, truncation=True, padding=True, max_length=128)

In [27]:
train_labels[:2]

[0, 0]

In [28]:
train_texts[:2]

["This movie is a great example of how even some very funny jokes can go terribly wrong. i really expected at least something from this movie after seeing the add which was funny as hell but the movie wasn't half as good.<br /><br />The weird part is that the jokes are actually funny, the spoofs of the smoking ban, Jo Bole... etc. are genuinely good jokes but i don't know whom to blame this movie flop on.<br /><br />The prime candidates may be:- 1) The hammers ( actors) and hammeresses (actresses) and not even the funny kind 2) The director 3)The guy who cast the actors and/or the director Anyway if you are really really bored and i mean really see this movie, or else get a copy of each and every ad or teaser of this movie and laugh your butt of because those will be far funnier than the film.<br /><br />p.s the only saving grace of this film is mahesh manjrekar and the funny chappu bhai",
 'I would of enjoyed this film but Van Damme just does the same old same old rubbish time after t

In [29]:
train_encodings.keys()

dict_keys(['input_ids', 'attention_mask'])
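
To make the two keys concrete, here is a minimal pure-Python sketch of what `truncation=True, padding=True, max_length=128` does conceptually: sequences are cut to `max_length`, padded to the longest remaining sequence, and the attention mask marks real tokens with 1 and padding with 0. The function `pad_batch` is a hypothetical helper, not the tokenizer's implementation, and it truncates naively (the real tokenizer keeps the closing `[SEP]` token when it truncates).

```python
PAD_ID = 0  # DistilBERT's pad_token_id, as listed in its model config


def pad_batch(sequences, max_length):
    """Truncate to max_length, then pad every sequence to the longest one left."""
    truncated = [seq[:max_length] for seq in sequences]
    width = max(len(seq) for seq in truncated)
    input_ids = [seq + [PAD_ID] * (width - len(seq)) for seq in truncated]
    attention_mask = [[1] * len(seq) + [0] * (width - len(seq)) for seq in truncated]
    return {"input_ids": input_ids, "attention_mask": attention_mask}


# Two made-up token id sequences of different lengths
batch = pad_batch([[101, 2023, 3185, 102], [101, 2307, 102]], max_length=128)
print(batch["input_ids"])
print(batch["attention_mask"])
```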

In [30]:
all_items = train_encodings.items()
{key: torch.tensor(val[0]) for key, val in all_items}


{'attention_mask': tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1]),
 'input_ids': tensor([  101,  2023,  3185,  2003,  1037,  2307,  2742,  1997,  2129,  2130,
          2070,  2200,  6057, 13198,  2064,  2175, 16668,  3308,  1012,  1045,
          2428,  3517,  2012,  2560,  2242,  2013,  2023,  3185,  2044,  3773,
          1996,  5587,  2029,  2001,  6057,  2004,  3109,  2021,  1996,  3185,
          2347,  1005,  1056,  2431,  2004,  2204,  1012,  1026,  7987,  1013,
          1028,  1026,  7987,  1013,  1028,  1996,  6881,  2112,  2003,  2008,
          1996, 13198,  2024,  2941,  6057,  1010,  199

In [31]:
class IMDbDataset(torch.utils.data.Dataset):
    def __init__(self, encodings, labels):
        self.encodings = encodings
        self.labels = labels

    def __getitem__(self, idx):
        item = {key: torch.tensor(val[idx]) for key, val in self.encodings.items()}
        item['labels'] = torch.tensor(self.labels[idx])
        return item

    def __len__(self):
        return len(self.labels)


In [32]:
train_dataset = IMDbDataset(train_encodings, train_labels)
val_dataset = IMDbDataset(val_encodings, val_labels)
test_dataset = IMDbDataset(test_encodings, test_labels)

In [33]:
train_dataset[0]

{'attention_mask': tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1]),
 'input_ids': tensor([  101,  2023,  3185,  2003,  1037,  2307,  2742,  1997,  2129,  2130,
          2070,  2200,  6057, 13198,  2064,  2175, 16668,  3308,  1012,  1045,
          2428,  3517,  2012,  2560,  2242,  2013,  2023,  3185,  2044,  3773,
          1996,  5587,  2029,  2001,  6057,  2004,  3109,  2021,  1996,  3185,
          2347,  1005,  1056,  2431,  2004,  2204,  1012,  1026,  7987,  1013,
          1028,  1026,  7987,  1013,  1028,  1996,  6881,  2112,  2003,  2008,
          1996, 13198,  2024,  2941,  6057,  1010,  199

In [31]:

training_args = TrainingArguments(
    output_dir='./results',          # output directory
    num_train_epochs=3,              # total number of training epochs
    per_device_train_batch_size=16,  # batch size per device during training
    per_device_eval_batch_size=64,   # batch size for evaluation
    warmup_steps=500,                # number of warmup steps for learning rate scheduler
    weight_decay=0.01,               # strength of weight decay
    logging_dir='./logs',            # directory for storing logs
    logging_steps=10,
)

model = DistilBertForSequenceClassification.from_pretrained("distilbert-base-uncased")

trainer = Trainer(
    model=model,                         # the instantiated 🤗 Transformers model to be trained
    args=training_args,                  # training arguments, defined above
    train_dataset=train_dataset,         # training dataset
    eval_dataset=val_dataset             # evaluation dataset
)

trainer.train()

PyTorch: setting up devices
The default value for the training argument `--report_to` will change in v5 (from all installed integrations to none). In v5, you will need to use `--report_to all` to get the same behavior as now. You should start updating your code and make this info disappear :-).
loading configuration file https://huggingface.co/distilbert-base-uncased/resolve/main/config.json from cache at /root/.cache/huggingface/transformers/23454919702d26495337f3da04d1655c7ee010d5ec9d77bdb9e399e00302c0a1.91b885ab15d631bf9cee9dc9d25ece0afd932f2f5130eba28f2055b2220c0333
Model config DistilBertConfig {
  "activation": "gelu",
  "architectures": [
    "DistilBertForMaskedLM"
  ],
  "attention_dropout": 0.1,
  "dim": 768,
  "dropout": 0.1,
  "hidden_dim": 3072,
  "initializer_range": 0.02,
  "max_position_embeddings": 512,
  "model_type": "distilbert",
  "n_heads": 12,
  "n_layers": 6,
  "pad_token_id": 0,
  "qa_dropout": 0.1,
  "seq_classif_dropout": 0.2,
  "sinusoidal_pos_embds": false,

| Step | Training Loss |
|------|---------------|
| 10 | 0.6839 |
| 20 | 0.6967 |
| 30 | 0.693 |
| 40 | 0.703 |
| 50 | 0.701 |
| 60 | 0.6868 |
| 70 | 0.6798 |
| 80 | 0.6868 |
| 90 | 0.6873 |
| 100 | 0.6784 |


Saving model checkpoint to ./results/checkpoint-500
Configuration saved in ./results/checkpoint-500/config.json
Model weights saved in ./results/checkpoint-500/pytorch_model.bin
Saving model checkpoint to ./results/checkpoint-1000
Configuration saved in ./results/checkpoint-1000/config.json
Model weights saved in ./results/checkpoint-1000/pytorch_model.bin
Saving model checkpoint to ./results/checkpoint-1500
Configuration saved in ./results/checkpoint-1500/config.json
Model weights saved in ./results/checkpoint-1500/pytorch_model.bin
Saving model checkpoint to ./results/checkpoint-2000
Configuration saved in ./results/checkpoint-2000/config.json
Model weights saved in ./results/checkpoint-2000/pytorch_model.bin
Saving model checkpoint to ./results/checkpoint-2500
Configuration saved in ./results/checkpoint-2500/config.json
Model weights saved in ./results/checkpoint-2500/pytorch_model.bin
Saving model checkpoint to ./results/checkpoint-3000
Configuration saved in ./results/checkpoint-3

TrainOutput(global_step=3750, training_loss=0.24245980685949325, metrics={'train_runtime': 473.4983, 'train_samples_per_second': 126.716, 'train_steps_per_second': 7.92, 'total_flos': 1987010979840000.0, 'train_loss': 0.24245980685949325, 'epoch': 3.0})
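
The run above reports only the training loss; the validation set is passed to the `Trainer` but never scored. One way to get an accuracy figure, sketched here under the assumption that plain accuracy is the metric of interest, is a `compute_metrics` function that turns logits into class predictions. The toy logits and labels are made up for the check.

```python
import numpy as np


def compute_metrics(eval_pred):
    """Accuracy from (logits, labels); unpacking works for a tuple or an EvalPrediction."""
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return {"accuracy": float((predictions == labels).mean())}


# Toy check with made-up logits for four reviews
toy_logits = np.array([[2.0, -1.0], [0.1, 0.3], [-0.5, 1.5], [1.2, 0.4]])
toy_labels = np.array([0, 1, 0, 0])
print(compute_metrics((toy_logits, toy_labels)))
```

Passing `compute_metrics=compute_metrics` when constructing the `Trainer` and then calling `trainer.evaluate()` would report this metric on `val_dataset`.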

## B. Custom dataset (LL)

##### 1. Data

In [41]:
# when debugging cryptic error messages (CUDA errors in particular),
# it helps to switch back to the CPU, where tracebacks are easier to read
device = torch.device("cpu")

In [3]:
# custom dataset
liberta_leasing_dataset = pd.read_excel("/content/processed_for_BERT.xlsx")

In [42]:
#  list all present columns 
liberta_leasing_dataset.columns

Index(['Unnamed: 0', 'Posted Date', 'Value Date', 'Description', 'Debit',
       'Credit', 'Balance', 'Vect_D2V', 'PREDICTION', 'BANK_ID'],
      dtype='object')

In [43]:
# limit the dataset to columns of interest (text + label)
ll_dataset = liberta_leasing_dataset[["Description", "PREDICTION"]].astype('str').reset_index(drop=True)

In [44]:
# map each class name (text) to an integer id; the one-hot encoding follows in the next step
ll_dataset["PREDICTION"] = ll_dataset["PREDICTION"].map({'TRANSFERT':0, 
                              'PURCHASE':1, 
                              'LOAN':2, 
                              'CHARGES':3, 
                              'SALARY':4, 
                              'CASH':5,
                              'REVERSAL':6})

In [7]:
# turn the integer class ids into a one-hot vector (one column per class)
# note: the column renaming assumes that all 7 classes occur in the data
labels_df = pd.get_dummies(ll_dataset["PREDICTION"])
labels_df.columns = ["cat0", "cat1", "cat2", "cat3", "cat4", "cat5", "cat6"]
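
Two things can go wrong silently in the mapping and one-hot steps: `.map` turns any label missing from the dictionary into `NaN`, and `get_dummies` produces fewer than seven columns if a class never occurs. A defensive variant, sketched on a toy column, lists the unmapped labels and declares all seven categories up front. The toy series and the label `"REFUND"` are made up for illustration.

```python
import pandas as pd

label2id = {"TRANSFERT": 0, "PURCHASE": 1, "LOAN": 2, "CHARGES": 3,
            "SALARY": 4, "CASH": 5, "REVERSAL": 6}

# Toy label column; "REFUND" is a made-up value that is not in the mapping
raw = pd.Series(["PURCHASE", "TRANSFERT", "REFUND", "CASH"])

ids = raw.map(label2id)
unknown = raw[ids.isna()].unique().tolist()   # labels that silently became NaN
print("unmapped labels:", unknown)

# Declaring all seven categories keeps seven columns even when a class is absent
known = ids.dropna().astype(int)
categorical = pd.Series(pd.Categorical(known, categories=list(range(7))))
one_hot = pd.get_dummies(categorical, dtype=int)
one_hot.columns = [f"cat{i}" for i in range(7)]
print(one_hot)
```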

In [8]:
ll_dataset = pd.concat([ll_dataset[["Description"]],labels_df],axis=1)

In [9]:
ll_dataset

| | Description | cat0 | cat1 | cat2 | cat3 | cat4 | cat5 | cat6 |
|---|---|---|---|---|---|---|---|---|
| 0 | TRSF/OKENU CHIBUIKE AUSTINE/004* *5582/OKENU A... | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | POS/WEB PMT PALMS SHOPPING MALL V.I NG | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| 2 | TRSF/OKENU CHIBUIKE AUSTINE/004* *5582/OKENU A... | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | POS/WEB PMT LATAPET INVESTMENT S LANG | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| 4 | TRSF/OKENU CHIBUIKE AUSTINE/004* 5582/OKENU AU... | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 827 | POS/WEB PMT PILLS AND TABS LIMT LALA NG | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| 828 | 901Airtime- 2348025959940 USSD 32584756389592513 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| 829 | USSD/OKENU AUSTINE CHIBUIKE/00XXXX7441/OKENU C... | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| 830 | POS/WEB PMT FOODCO NIGERIA LIMTED/ LANG NG | 0 | 1 | 0 | 0 | 0 | 0 | 0 |


##### 2. Split train-val-test

In [45]:
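# split the data into two parts (scikit-learn helper function, 75% / 25% by default);
# the held-out part, named `test` here, serves as the validation set below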
train, test = train_test_split(ll_dataset)

In [12]:
# train_data
train_data = list(train["Description"].values)
train_labels = list(train.drop("Description", axis=1).values)

In [13]:
# val_data
val_data = list(test["Description"].values)
val_labels = list(test.drop("Description", axis=1).values)

In [14]:
# we don't have separately labeled test data yet, so the validation split is reused as a placeholder
# TODO: replace with real test data!

test_data = list(test["Description"].values)
test_labels = list(test.drop("Description", axis=1).values)

##### 3. Encodings

In [29]:
# name of the model of interest (DistilBERT, uncased)
# the problem type is multi_label_classification, unlike the binary classification in part A
# (problem_type is a model setting; it is passed again when the model is built in step 4)

model_ckpt = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_ckpt, problem_type="multi_label_classification")

train_encodings = tokenizer(train_data, truncation=True, padding=True, max_length=128)
val_encodings = tokenizer(val_data, truncation=True, padding=True, max_length=128)
test_encodings = tokenizer(test_data, truncation=True, padding=True, max_length=128)

loading configuration file https://huggingface.co/distilbert-base-uncased/resolve/main/config.json from cache at /root/.cache/huggingface/transformers/23454919702d26495337f3da04d1655c7ee010d5ec9d77bdb9e399e00302c0a1.91b885ab15d631bf9cee9dc9d25ece0afd932f2f5130eba28f2055b2220c0333
Model config DistilBertConfig {
  "_name_or_path": "distilbert-base-uncased",
  "activation": "gelu",
  "architectures": [
    "DistilBertForMaskedLM"
  ],
  "attention_dropout": 0.1,
  "dim": 768,
  "dropout": 0.1,
  "hidden_dim": 3072,
  "initializer_range": 0.02,
  "max_position_embeddings": 512,
  "model_type": "distilbert",
  "n_heads": 12,
  "n_layers": 6,
  "pad_token_id": 0,
  "problem_type": "multi_label_classification",
  "qa_dropout": 0.1,
  "seq_classif_dropout": 0.2,
  "sinusoidal_pos_embds": false,
  "tie_weights_": true,
  "transformers_version": "4.16.2",
  "vocab_size": 30522
}

loading file https://huggingface.co/distilbert-base-uncased/resolve/main/vocab.txt from cache at /root/.cache/huggin

In [30]:
# Dataset class that wraps the encodings and labels
# in the format the Trainer expects
class LLDataset(torch.utils.data.Dataset):
    def __init__(self, encodings, labels):
        self.encodings = encodings
        self.labels = labels

    def __getitem__(self, idx):
        item = {key: torch.tensor(val[idx]) for key, val in self.encodings.items()}
        item['labels'] = torch.tensor(self.labels[idx]).to(dtype=torch.float)
        return item

    def __len__(self):
        return len(self.labels)


In [31]:
# Datasets of train, validation and testing
train_dataset = LLDataset(train_encodings, train_labels)
val_dataset = LLDataset(val_encodings, val_labels)
test_dataset = LLDataset(test_encodings, test_labels)

In [32]:
train_dataset[0]

{'attention_mask': tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]),
 'input_ids': tensor([  101,  1046,  4757,  2094,  1035,  9152,  2361,  1013,  7929,  2368,
          2226,  9610,  8569, 17339,  5899,  2063,  1013,  4002, 20348, 20348,
         24087,  2620,  2475,  1013, 16950, 14905,  2278, 23154,   102,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0]),
 'labels': tensor([1., 0., 0., 0., 0., 0., 0.])}
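
The labels are float one-hot vectors because, with `problem_type="multi_label_classification"`, the Transformers classification heads use a binary cross-entropy loss on the logits (`BCEWithLogitsLoss`), which scores every class as an independent yes/no decision. Since each transaction here carries exactly one category, a single-label setup with softmax cross-entropy on integer labels would be an alternative. The sketch below compares the two losses in plain Python on made-up logits; it is an illustration of the formulas, not the library code.

```python
import math


def bce_with_logits(logits, targets):
    """Mean binary cross-entropy over classes, each class scored independently."""
    total = 0.0
    for z, t in zip(logits, targets):
        p = 1.0 / (1.0 + math.exp(-z))          # sigmoid
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(logits)


def cross_entropy(logits, target_index):
    """Softmax cross-entropy for a single correct class."""
    m = max(logits)                             # subtract max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target_index]


logits = [2.0, -1.0, -1.0, -2.0, -2.0, -3.0, -3.0]   # made-up scores for 7 classes
one_hot = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]        # class 0 is the true class

print(round(bce_with_logits(logits, one_hot), 4))
print(round(cross_entropy(logits, one_hot.index(1.0)), 4))
```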

##### 4. Model definition + training

In [33]:
# Build the model by specifying the checkpoint and the number of labels

from transformers import (AutoTokenizer, AutoModelForSequenceClassification, 
                          TrainingArguments, Trainer)
num_labels=7
model = AutoModelForSequenceClassification.from_pretrained(model_ckpt, 
                                                           num_labels=num_labels, 
                                                           problem_type="multi_label_classification")

loading configuration file https://huggingface.co/distilbert-base-uncased/resolve/main/config.json from cache at /root/.cache/huggingface/transformers/23454919702d26495337f3da04d1655c7ee010d5ec9d77bdb9e399e00302c0a1.91b885ab15d631bf9cee9dc9d25ece0afd932f2f5130eba28f2055b2220c0333
Model config DistilBertConfig {
  "_name_or_path": "distilbert-base-uncased",
  "activation": "gelu",
  "architectures": [
    "DistilBertForMaskedLM"
  ],
  "attention_dropout": 0.1,
  "dim": 768,
  "dropout": 0.1,
  "hidden_dim": 3072,
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1",
    "2": "LABEL_2",
    "3": "LABEL_3",
    "4": "LABEL_4",
    "5": "LABEL_5",
    "6": "LABEL_6"
  },
  "initializer_range": 0.02,
  "label2id": {
    "LABEL_0": 0,
    "LABEL_1": 1,
    "LABEL_2": 2,
    "LABEL_3": 3,
    "LABEL_4": 4,
    "LABEL_5": 5,
    "LABEL_6": 6
  },
  "max_position_embeddings": 512,
  "model_type": "distilbert",
  "n_heads": 12,
  "n_layers": 6,
  "pad_token_id": 0,
  "problem_type": "multi_la

In [35]:
# Training configuration + training run

training_args = TrainingArguments(
    output_dir='./results',          # output directory
    num_train_epochs=10,             # total number of training epochs
    per_device_train_batch_size=16,  # batch size per device during training
    per_device_eval_batch_size=64,   # batch size for evaluation
    warmup_steps=500,                # warmup steps for the learning rate scheduler (exceeds the 390 total steps of this run)
    weight_decay=0.01,               # strength of weight decay
    logging_dir='./logs2',           # directory for storing logs
    logging_steps=10,                # log the training loss every 10 steps
)


trainer = Trainer(
    model=model,                         # the instantiated 🤗 Transformers model to be trained
    args=training_args,                  # training arguments, defined above
    train_dataset=train_dataset,         # training dataset
    eval_dataset=val_dataset             # evaluation dataset
)

trainer.train()

PyTorch: setting up devices
The default value for the training argument `--report_to` will change in v5 (from all installed integrations to none). In v5, you will need to use `--report_to all` to get the same behavior as now. You should start updating your code and make this info disappear :-).
***** Running training *****
  Num examples = 624
  Num Epochs = 10
  Instantaneous batch size per device = 16
  Total train batch size (w. parallel, distributed & accumulation) = 16
  Gradient Accumulation steps = 1
  Total optimization steps = 390


| Step | Training Loss |
|------|---------------|
| 10 | 0.2897 |
| 20 | 0.2755 |
| 30 | 0.2601 |
| 40 | 0.2607 |
| 50 | 0.2524 |
| 60 | 0.2444 |
| 70 | 0.2399 |
| 80 | 0.2425 |
| 90 | 0.2287 |
| 100 | 0.1909 |




Training completed. Do not forget to share your model on huggingface.co/models =)




TrainOutput(global_step=390, training_loss=0.1477740915922018, metrics={'train_runtime': 29.0918, 'train_samples_per_second': 214.494, 'train_steps_per_second': 13.406, 'total_flos': 101719193808960.0, 'train_loss': 0.1477740915922018, 'epoch': 10.0})
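
The step count in the log follows from the dataset size: 624 examples in batches of 16 give 39 steps per epoch, hence 390 steps over 10 epochs. With `warmup_steps=500`, the `Trainer`'s default linear schedule is therefore still warming up when training ends. The sketch below reproduces that arithmetic in plain Python; the two function names are illustrative, and the schedule formula is a re-implementation of linear warmup followed by linear decay, not a call into the library.

```python
import math


def total_steps(num_examples, batch_size, epochs):
    """Optimizer steps for a single device with no gradient accumulation."""
    return math.ceil(num_examples / batch_size) * epochs


def linear_warmup_factor(step, warmup_steps, training_steps):
    """Multiplier applied to the peak learning rate by a linear warmup/decay schedule."""
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    return max(0.0, (training_steps - step) / max(1, training_steps - warmup_steps))


steps = total_steps(num_examples=624, batch_size=16, epochs=10)
print(steps)                                        # 390, as in the training log
print(linear_warmup_factor(steps, 500, steps))      # 0.78
```

Under these assumptions the learning rate peaks at 78% of its configured value on the very last step; a smaller `warmup_steps` (a few dozen steps, for instance) may suit a run this short better.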

##### 5. Inference

In [55]:
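# we can also make up new, artificial records and classify them
text = ["NFT/GTB B/O /NIG INTER BRANCH SETTLEMENT SYS/JUNE 2021"]
# move the inputs to whichever device the model is on
# (the Trainer moved the model to the GPU if one is available)
encoding = tokenizer(text, return_tensors="pt").to(model.device)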

In [56]:

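# forward pass (no gradients are needed for inference)
model.eval()
with torch.no_grad():
    outputs = model(**encoding)
# index of the highest-scoring class
predictions = outputs.logits.argmax(-1)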

In [57]:
predictions 

tensor([0], device='cuda:0')

In [None]:
# label ids, for reference (the prediction above, 0, corresponds to TRANSFERT):
# {'TRANSFERT':0,
#  'PURCHASE':1,
#  'LOAN':2,
#  'CHARGES':3,
#  'SALARY':4,
#  'CASH':5,
#  'REVERSAL':6}
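
Rather than looking the id up by hand, the mapping can be inverted and applied to the model output. The sketch below works on a plain list of logits so that it runs without the model; the logits are made up and stand in for `outputs.logits[0].tolist()`, and `decode` is a hypothetical helper. The score is the sigmoid of the winning logit, which matches the per-class interpretation of the multi-label setup.

```python
import math

label2id = {"TRANSFERT": 0, "PURCHASE": 1, "LOAN": 2, "CHARGES": 3,
            "SALARY": 4, "CASH": 5, "REVERSAL": 6}
id2label = {idx: name for name, idx in label2id.items()}


def decode(logits):
    """Return (label, score) for the highest logit; score is the sigmoid of that logit."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    score = 1.0 / (1.0 + math.exp(-logits[best]))
    return id2label[best], score


# Made-up logits standing in for outputs.logits[0].tolist()
example_logits = [3.1, -2.0, -4.2, -3.5, -4.0, -3.8, -4.1]
print(decode(example_logits))
```

The same two dictionaries could also be passed as `id2label` and `label2id` when loading the model, so that its config stores the class names instead of `LABEL_0` to `LABEL_6`.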