## Comparison of the Best Models

In this notebook, the two models selected at the end of the <i>fine-tuning</i> process are used to classify the test set. Their results are compared, and the better-performing model is taken as the final model of this work.

In [None]:
# Installation of 3rd party libraries

!pip install transformers
!pip install --upgrade pytorch-lightning

Collecting transformers
  Downloading transformers-4.12.5-py3-none-any.whl (3.1 MB)
Collecting huggingface-hub<1.0,>=0.1.0
  Downloading huggingface_hub-0.1.2-py3-none-any.whl (59 kB)
Collecting pyyaml>=5.1
  Downloading PyYAML-6.0-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (596 kB)
Collecting tokenizers<0.11,>=0.10.1
  Downloading tokenizers-0.10.3-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (3.3 MB)
Collecting sacremoses
  Downloading sacremoses-0.0.46-py3-none-any.whl (895 kB)
Installing collected packages: pyyaml, tokenizers, sacremoses, huggingface-hub, transformers

In [None]:
from google.colab import drive
import pandas as pd
import numpy as np
from enum import Enum
from transformers import BertTokenizerFast as BertTokenizer, BertForSequenceClassification
import torch
from torch.utils.data import TensorDataset, DataLoader, SequentialSampler
from typing import List
import pytorch_lightning as pl
from sklearn.metrics import classification_report, accuracy_score, confusion_matrix, f1_score, cohen_kappa_score, matthews_corrcoef, precision_score, recall_score, balanced_accuracy_score

In [None]:
CONSTANTS = {
    'SPREADSHEET_PATH': '/content/drive/My Drive/MAC499 - Kaique e Yurick/DB/Test_Dataset.csv',
    'BERT_MODEL_NAME': 'neuralmind/bert-large-portuguese-cased',
    'BATCH_SIZE': 2,
    'EPOCHS': 3,
    'LEARNING_RATE': 2e-5,
    'NUMBER_OF_BRANCHES': 13,
    'MAX_TOKEN_LENGTH': 512,
    'MODEL_DATA_LENGTH': {
        'FRONT_BACK': 3866
    }
}

In [None]:
# Definition of mapping from law branch name to a numeric identifier

class LawBranch(Enum):
    """Maps each law branch to a numeric identifier. The enum also stores
    the law branch name in free-text form.
    """

    Penal = (0, "Direito Penal (Direito Processual Penal)")
    Administrativo = (1, "Direito Administrativo (Licitações, Contratos Administrativos, Servidores, Desapropriação, Tribunal de Contas, Improbidade, etc.)")
    Tributario = (2, "Direito Tributário/Direito Financeiro")
    Civil = (3, "Direito Civil (Direito Comercial/Direito de Família)")
    Previdenciario = (4, "Direito Previdenciário")
    Trabalho = (5, "Direito do Trabalho")
    Processual_Civil = (6, "Direito Processual Civil")
    Eleitoral = (7, "Direito Eleitoral")
    Consumidor = (8, "Direito do Consumidor")
    Internacional = (9, "Direito Internacional (Público ou Privado)")
    Militar = (10, "Direito Militar")
    Economico = (11, "Direito Econômico (Direito concorrencial e Agências Reguladoras Setoriais, Intervenção no Domínio Econômico)")
    Ambiental = (12, "Direito Ambiental")

    def get_identifier(self) -> int:
        """Retrieves the identifier number for this instance of LawBranch.

        Returns:
            int: identifier of this instance of LawBranch
        """
        return self.value[0]
    
    @staticmethod
    def get_all_names() -> List[str]:
        """Retrieves a list of all names defined in LawBranch enum.

        Returns:
            List[str]: the list of names.
        """
        return [law_branch.name for law_branch in LawBranch]
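
The enum above maps a branch to its identifier. Reading model outputs needs the opposite direction, from a predicted class index back to a branch name. The following is a minimal sketch of that reverse lookup. `MiniLawBranch` is a shortened, hypothetical copy of the enum (three members, abbreviated descriptions), and `from_identifier` is an illustrative helper that is not part of the notebook.

```python
from enum import Enum


class MiniLawBranch(Enum):
    """Shortened, hypothetical copy of LawBranch used only for illustration."""

    Penal = (0, "Direito Penal")
    Administrativo = (1, "Direito Administrativo")
    Tributario = (2, "Direito Tributário/Direito Financeiro")

    def get_identifier(self) -> int:
        # The identifier is the first element of the member's value tuple.
        return self.value[0]


def from_identifier(identifier: int) -> MiniLawBranch:
    """Hypothetical helper: returns the branch whose identifier matches."""
    for branch in MiniLawBranch:
        if branch.get_identifier() == identifier:
            return branch
    raise ValueError(f"Unknown law branch identifier: {identifier}")


# Illustrative predicted class indices, not taken from the real test set.
predicted_ids = [2, 0, 1]
predicted_names = [from_identifier(i).name for i in predicted_ids]
print(predicted_names)
```

A linear scan is enough for 13 branches; a precomputed dictionary would be the natural alternative if the lookup ran inside a hot loop.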

In [None]:
drive.mount('/content/drive', force_remount=True)

Mounted at /content/drive


In [None]:
test_documents = pd.read_csv(CONSTANTS['SPREADSHEET_PATH'])
test_documents.dropna(inplace=True, subset=['cod_acordao'])
test_documents.head()

| Unnamed: 0.1 | Unnamed: 0 | cod_acordao | ramo | tipo_acordao | cabecalho | ementa | decisao | indexacao | somente_ementa | indicacao_exclusiva_ementa_voto | expressoes_chave |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 850 | HC 141487 | 0 | HC | HC 141487 / MG - MINAS GERAIS HABEAS CORPUS Re... | EMENTA HABEAS CORPUS. PENAL. TRÁFICO DE DROGAS... | A Turma, por maioria, denegou a ordem, nos ter... | ['VOTO VENCIDO', 'MIN', 'MARCO AURÉLIO: CONFIS... | Sim | Sim | habeas corpus |
| 1 | 1942 | MS 22938 | 1 | MS | MS 22938 / PA - PARÁ MANDADO DE SEGURANÇA Rela... | EMENTA: Mandado de Segurança. Pensão temporári... | Apresentado o feito em mesa, o julgamento foi ... | ['REJEIÇÃO', 'PRELIMINAR', 'DECADÊNCIA', 'IMPE... | Sim | Sim | pensão temporária |
| 2 | 5339 | INQ 1978 | 0 | INQ | Inq 1978 / PR - PARANÁ INQUÉRITO Relator(a):&n... | E M E N T A: SUPOSTA PRÁTICA DO DELITO DE CORR... | O Tribunal, à unanimidade, rejeitou a denúncia... | ['VIDE EMENTA'] | Sim | Sim | ação penal |
| 3 | 4982 | RHC 107759 | 0 | RHC | RHC 107759 / RJ - RIO DE JANEIRO RECURSO ORDIN... | Ementa: PENAL. PROCESSUAL PENAL. HABEAS CORPUS... | Por maioria de votos, a Turma deu parcial prov... | ['VIDE EMENTA', 'VOTO VENCIDO', 'MIN', 'MARCO ... | Sim | Sim | penal |
| 4 | 4315 | RE 260404 | 10 | RE | RE 260404 / MG - MINAS GERAIS RECURSO EXTRAORD... | EMENTA: Recurso extraordinário. Alegação de in... | A Turma decidiu remeter o presente recurso ext... | ['CONSTITUCIONALIDADE', 'DISPOSITIVO', 'CÓDIGO... | Sim | Não | crimes militares |


In [None]:
torch.cuda.empty_cache()

# If there's a GPU available...
if torch.cuda.is_available():    

    # Tell PyTorch to use the GPU.    
    device = torch.device("cuda")
    print('There are %d GPU(s) available.' % torch.cuda.device_count())
    print('We will use the GPU:', torch.cuda.get_device_name(0))

# If not...
else:
    print('No GPU available, using the CPU instead.')
    device = torch.device("cpu")

There are 1 GPU(s) available.
We will use the GPU: Tesla K80


In [None]:
class LawDocumentClassifier(pl.LightningModule):
    
    def __init__(self, number_classes: int, steps_per_epoch: int=None, epochs: int=None, learning_rate: float=2e-5, weight_decay: float=0.001, warm_up_proportion: float=0.3):
        super().__init__()
        
        self.model = BertForSequenceClassification.from_pretrained(
            CONSTANTS['BERT_MODEL_NAME'],
            num_labels=number_classes,                      # Number of output labels (one per law branch)
            output_attentions=False,                        # Do not return attention weights
            output_hidden_states=False                      # Do not return all hidden states
        )
        self.steps_per_epoch = steps_per_epoch
        self.epochs = epochs
        self.learning_rate = learning_rate
        self.warm_up_proportion = warm_up_proportion
        self.weight_decay = weight_decay
        
    def forward(self, input_ids, attention_mask, labels=None):
        output = self.model(input_ids,
                            attention_mask=attention_mask,
                            labels=labels,
                            return_dict=True)
        
        return output.loss, output.logits

In [None]:
tokenizer = BertTokenizer.from_pretrained(CONSTANTS['BERT_MODEL_NAME'])

In [None]:
# Function that tokenizes the ementas (case summaries)
def tokenize(ementas):
  """
    @param    ementas (list): List of ementas to be tokenized.
    @return   dataloader (torch.utils.data.DataLoader): DataLoader holding the tokenized inputs for prediction.
  """
  input_ids = []
  attention_masks = []

  for ementa in ementas:
    encoded_dict = tokenizer.encode_plus(
        ementa,
        add_special_tokens = True,
        max_length = CONSTANTS['MAX_TOKEN_LENGTH'],
        padding = 'max_length',
        truncation = True,
        return_attention_mask = True,
        return_tensors = 'pt',
        return_token_type_ids=False,
    )
        
    input_ids.append(encoded_dict['input_ids'])
    attention_masks.append(encoded_dict['attention_mask'])

  input_ids = torch.cat(input_ids, dim=0)
  attention_masks = torch.cat(attention_masks, dim=0)

  prediction_data = TensorDataset(input_ids, attention_masks)
  prediction_sampler = SequentialSampler(prediction_data)
  return DataLoader(prediction_data, sampler=prediction_sampler, batch_size=CONSTANTS['BATCH_SIZE'])


# Function that runs the model to classify the acórdãos (court rulings)
def classify(acordaos, model):
  """
    @param  acordaos (pd.DataFrame): DataFrame with the acórdãos to be classified.
    @param  model (LawDocumentClassifier): Fine-tuned BERT classifier to be used.
    @return classifications (dict): Dictionary with the true labels, predicted labels, and ementas.
  """
  classifications = {
      'True Label': [],
      'Predicted Label': [],
      'Ementa': []
  }

  for _, row in acordaos.iterrows():  # index is unused; avoids shadowing the built-in `id`
    classifications['Ementa'].append(row['ementa'])
    classifications['True Label'].append(row['ramo'])
  prediction_dataloader = tokenize(classifications['Ementa'])
  
  model.eval()
  predictions = []
  for batch in prediction_dataloader:
    batch = tuple(t.to(device) for t in batch)
    
    b_input_ids, b_input_mask = batch

    with torch.no_grad():
        _, outputs_logits = model(b_input_ids, attention_mask=b_input_mask)

    logits = outputs_logits
    logits = logits.detach().cpu().numpy()    
    predictions.append(logits)

  for prediction_batch in predictions:
    predicted_labels = np.argmax(prediction_batch, axis=1).flatten()
    for prediction in predicted_labels:
        classifications['Predicted Label'].append(prediction)

  return classifications
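
The last loop in `classify` turns per-batch logits into one flat list of predicted labels by taking the index of the largest logit in each row. The sketch below reproduces that step with plain Python lists instead of NumPy arrays. The logit values are made up for illustration, and `argmax` and `flatten_predictions` are hypothetical names, not functions from the notebook.

```python
from typing import List


def argmax(scores: List[float]) -> int:
    """Index of the largest score (first one wins on ties)."""
    return max(range(len(scores)), key=scores.__getitem__)


def flatten_predictions(batches: List[List[List[float]]]) -> List[int]:
    """One predicted class index per row, across all batches, in order."""
    return [argmax(row) for batch in batches for row in batch]


# Illustrative logits for 3 classes: a full batch of 2 rows followed by
# a final, smaller batch of 1 row (as happens with an odd number of documents).
batches = [
    [[0.1, 2.3, -1.0], [1.5, 0.2, 0.3]],
    [[-0.4, -0.2, 0.9]],
]
predicted_labels = flatten_predictions(batches)
print(predicted_labels)
```

Because the sampler is sequential, the flattened predictions stay aligned with the order of the true labels collected from the DataFrame.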

In [None]:
finetuned_model = LawDocumentClassifier(
    CONSTANTS['NUMBER_OF_BRANCHES'],
    steps_per_epoch=len(test_documents) // CONSTANTS['BATCH_SIZE'],
    epochs=CONSTANTS['EPOCHS'],
    learning_rate=CONSTANTS['LEARNING_RATE'],
    weight_decay=0.001,
    warm_up_proportion=0.3
)
finetuned_model2 = LawDocumentClassifier(
    CONSTANTS['NUMBER_OF_BRANCHES'],
    steps_per_epoch=len(test_documents) // CONSTANTS['BATCH_SIZE'],
    epochs=CONSTANTS['EPOCHS'],
    learning_rate=CONSTANTS['LEARNING_RATE'],
    weight_decay=0.001,
    warm_up_proportion=0.1
)

finetuned_model.load_state_dict(torch.load('/content/drive/My Drive/MAC499 - Kaique e Yurick/Projeto/saved_models/frontback_finetuned_6.bin'))
finetuned_model2.load_state_dict(torch.load('/content/drive/My Drive/MAC499 - Kaique e Yurick/Projeto/saved_models/frontback_finetuned_1.bin'))

Some weights of the model checkpoint at neuralmind/bert-large-portuguese-cased were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.dense.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'cls.predictions.bias']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from th

<All keys matched successfully>

In [None]:
finetuned_model.cuda()
classifications = classify(test_documents, finetuned_model)
finetuned_model2.cuda()
classifications2 = classify(test_documents, finetuned_model2)

In [None]:
def compute_metrics(results):
    """Computes the evaluation metrics for one model's classifications."""
    y_true = results['True Label']
    y_pred = results['Predicted Label']
    return {
        "Average Accuracy": accuracy_score(y_true, y_pred),
        "Weighted Precision": precision_score(y_true, y_pred, average='weighted'),
        "Weighted Recall": recall_score(y_true, y_pred, average='weighted'),
        "Weighted F1 Score": f1_score(y_true, y_pred, average='weighted'),
        "Balanced Accuracy": balanced_accuracy_score(y_true, y_pred),
        "Cohen Kappa Score": cohen_kappa_score(y_true, y_pred),
        "Matthews Correlation Coefficient": matthews_corrcoef(y_true, y_pred),
    }

metrics = compute_metrics(classifications)
metrics2 = compute_metrics(classifications2)

In [None]:
print(metrics)

{'Average Accuracy': 0.827503015681544, 'Weighted Precision': 0.8250596250832032, 'Weighted Recall': 0.827503015681544, 'Weighted F1 Score': 0.8241550566492756, 'Balanced Accuracy': 0.5967947453601627, 'Cohen Kappa Score': 0.7629496172704222, 'Matthews Correlation Coefficient': 0.7632930613650024}
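
In the output above, the weighted recall is identical to the accuracy, while the balanced accuracy (about 0.60) is well below it (about 0.83). Both facts follow from the definitions: recall weighted by class support reduces to plain accuracy, and balanced accuracy is the unweighted mean of per-class recall, so weak recall on rare classes pulls it down. The sketch below shows this on a toy, made-up label set using only the standard library. The function names are illustrative and this is not the scikit-learn implementation.

```python
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Recall for each class that appears in y_true."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / support[label] for label in support}


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


def weighted_recall(y_true, y_pred):
    """Per-class recall averaged with class support as weights."""
    support = Counter(y_true)
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls[c] * support[c] for c in support) / len(y_true)


# Toy data (assumption): class 0 is frequent, class 1 is rare.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 8 + [0, 1]  # one of the two rare examples is missed

acc = accuracy(y_true, y_pred)
w_recall = weighted_recall(y_true, y_pred)
bal_acc = balanced_accuracy(y_true, y_pred)
print(acc, w_recall, bal_acc)
```

Here a single missed example in the rare class costs one tenth of the accuracy but a quarter of the balanced accuracy, which is the same kind of gap seen in the notebook's results.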


In [None]:
print(metrics2)

{'Average Accuracy': 0.827503015681544, 'Weighted Precision': 0.8313131935383273, 'Weighted Recall': 0.827503015681544, 'Weighted F1 Score': 0.8278404654663029, 'Balanced Accuracy': 0.6236816186867652, 'Cohen Kappa Score': 0.7641118601708068, 'Matthews Correlation Coefficient': 0.7644748981291795}
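
The two printed dictionaries can be compared metric by metric. The sketch below copies the values shown above and computes the difference (second model minus first). The variable names `differences` and `second_never_worse` are illustrative additions.

```python
# Values copied from the two outputs printed above.
metrics = {
    'Average Accuracy': 0.827503015681544,
    'Weighted Precision': 0.8250596250832032,
    'Weighted Recall': 0.827503015681544,
    'Weighted F1 Score': 0.8241550566492756,
    'Balanced Accuracy': 0.5967947453601627,
    'Cohen Kappa Score': 0.7629496172704222,
    'Matthews Correlation Coefficient': 0.7632930613650024,
}
metrics2 = {
    'Average Accuracy': 0.827503015681544,
    'Weighted Precision': 0.8313131935383273,
    'Weighted Recall': 0.827503015681544,
    'Weighted F1 Score': 0.8278404654663029,
    'Balanced Accuracy': 0.6236816186867652,
    'Cohen Kappa Score': 0.7641118601708068,
    'Matthews Correlation Coefficient': 0.7644748981291795,
}

# Positive difference means the second model scored higher.
differences = {name: metrics2[name] - metrics[name] for name in metrics}
second_never_worse = all(diff >= 0 for diff in differences.values())

for name, diff in differences.items():
    print(f"{name:<34} {metrics[name]:.4f}  {metrics2[name]:.4f}  {diff:+.4f}")
print("Second model never worse:", second_never_worse)
```

On these numbers the second model (`frontback_finetuned_1.bin`) ties the first on accuracy and weighted recall and scores higher on every other metric, with the largest gain in balanced accuracy. The margins are small and come from a single test set, so they are best read as a tie-breaker rather than a decisive gap.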
