# Adding the Indexação Field to the Truncated Model

This notebook evaluates how the 'Indexação' (indexing) field, present in some STF rulings (acórdãos), affects the performance of the Truncated model when classifying rulings by branch of law.

To do so, the Truncated model, already trained on the ementas (ruling summaries) of the training set, is loaded and then trained again on the valid, preprocessed indexação data before being evaluated.

## Initialization and constant definitions

As a first step, all notebook initialization is gathered at the top of this document. It covers:

1. Installation of external libraries
2. Library imports
3. Definition of constant values that are reused throughout the notebook
4. Initialization of the file system integrated with Google Drive

In [1]:
# Installation of 3rd party libraries

!pip install transformers
!pip install --upgrade pytorch-lightning

Collecting transformers
  Downloading transformers-4.9.2-py3-none-any.whl (2.6 MB)
Collecting huggingface-hub==0.0.12
  Downloading huggingface_hub-0.0.12-py3-none-any.whl (37 kB)
Collecting sacremoses
  Downloading sacremoses-0.0.45-py3-none-any.whl (895 kB)
Collecting pyyaml>=5.1
  Downloading PyYAML-5.4.1-cp37-cp37m-manylinux1_x86_64.whl (636 kB)
Collecting tokenizers<0.11,>=0.10.1
  Downloading tokenizers-0.10.3-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (3.3 MB)
Installing collected packages: tokenizers, sacremoses, pyyaml, huggingface-hub, transformers
  Attempting uninstall: pyyaml
    Found existing installation: PyYAML 3.13
    Uninstalling PyYAML-3.13:
      Successfully uninsta

In [2]:
# Imports

from google.colab import drive
import pandas as pd
import numpy as np
from transformers import BertTokenizerFast as BertTokenizer, BertModel, BertForSequenceClassification, AdamW, get_linear_schedule_with_warmup
from enum import Enum
from typing import List
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader, TensorDataset, SequentialSampler
import re
import pytorch_lightning as pl
from pytorch_lightning import seed_everything
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, classification_report

In [25]:
# Constants

CONSTANTS = {
    'TRAINING_DATASET': '/content/drive/My Drive/MAC499 - Kaique e Yurick/DB/Train_Dataset.csv',
    'VALIDATION_DATASET': '/content/drive/My Drive/MAC499 - Kaique e Yurick/DB/Validation_Dataset.csv',
    'TEST_DATASET': '/content/drive/My Drive/MAC499 - Kaique e Yurick/DB/Test_Dataset.csv',
    'BERT_MODEL_NAME': 'neuralmind/bert-large-portuguese-cased',
    'SEED': 13,
    'MODEL_PATH': '/content/drive/My Drive/MAC499 - Kaique e Yurick/Projeto/saved_models/truncated_model.bin'
}

# Hyperparameters

HYPERPARAMETERS = {
    'BATCH_SIZE': 4,
    'EPOCHS': 3,
    'MAX_NUMBER_TOKENS': 512,
    'LEARNING_RATE': 2e-5,
    'NUMBER_OF_BRANCHES': 13
}

In [4]:
# Mounting Google Drive

drive.mount('/content/drive', force_remount=True)

Mounted at /content/drive


## Checking GPU availability

The next step is to check that the GPU Google provides free of charge as the notebook's runtime is working correctly. A GPU offers higher computational performance than running the same calculations on the CPU.

In [5]:
torch.cuda.empty_cache()

# If there's a GPU available...
if torch.cuda.is_available():    

    # Tell PyTorch to use the GPU.    
    device = torch.device("cuda")
    print('There are %d GPU(s) available.' % torch.cuda.device_count())
    print('We will use the GPU:', torch.cuda.get_device_name(0))

# If not...
else:
    print('No GPU available, using the CPU instead.')
    device = torch.device("cpu")

There are 1 GPU(s) available.
We will use the GPU: Tesla T4


### Reproducibility

For reproducibility, a seed can be set through PyTorch Lightning.

In [6]:
seed_everything(CONSTANTS['SEED'])

Global seed set to 13


13

## Loading the data

After the initial setup, the data is loaded in the same way as in the notebook that creates and trains the Truncated model.

At this stage, the .csv files containing the training and validation sets are loaded.

In [7]:
# Read the training dataset from .csv file
documents = pd.read_csv(CONSTANTS['TRAINING_DATASET'])
# Read the validation dataset from .csv file
documents_val = pd.read_csv(CONSTANTS['VALIDATION_DATASET'])

The goal of this notebook is to use the data in the 'Indexação' field of the rulings, which was not done in the notebook that creates the Truncated model. The indexação values therefore need to be preprocessed before they can be fed to the model.

Preprocessing converts the indexação values to plain text and removes a few special characters, along with phrases that add nothing to inference, such as 'VIDE EMENTA' and 'AGUARDANDO INDEXAÇÃO'. A new dataframe is then created containing only the processed indexação values and their corresponding law-branch labels.
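
The cleaning rule described above can be tried on a single value. This is a minimal sketch, assuming each raw value is a stringified Python list such as `"['A, B']"`; `clean_indexacao` and `PLACEHOLDERS` are hypothetical names used only for illustration and are not part of the notebook.

```python
import re

# Placeholder phrases that carry no information for classification
PLACEHOLDERS = ('VIDE EMENTA', 'AGUARDANDO INDEXAÇÃO')


def clean_indexacao(raw: str) -> str:
    """Remove list punctuation and placeholder phrases from one indexação value."""
    text = re.sub(r"[\]\[']", '', raw)  # drop brackets and single quotes
    for phrase in PLACEHOLDERS:
        text = text.replace(phrase, '')
    return text


print(clean_indexacao("['PRISÃO PREVENTIVA, PRAZO, EXCESSO']"))
print(repr(clean_indexacao("['VIDE EMENTA']")))  # empty string: such a row is discarded
```

Rows whose cleaned value is empty carry no usable text, which is why they are left out of the new dataframe.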

In [8]:
# Process the indexação column
ramos = []
indexes = []
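for i, index in enumerate(documents['indexacao'].tolist()):
  # Strip the brackets and quotes left over from the stringified list
  x = re.sub(r"[\]\[']", '', index)
  # Drop placeholder phrases that carry no information for inference
  x = x.replace('VIDE EMENTA', '').replace('AGUARDANDO INDEXAÇÃO', '')
  # Keep only rows that still have indexação text
  if x != '':
    indexes.append(x)
    ramos.append(documents['ramo'][i])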
documents = pd.DataFrame(list(zip(ramos, indexes)),
               columns =['ramo', 'indexacao'])
documents

| | ramo | indexacao |
|---|---|---|
| 0 | 1 | AUSÊNCIA, DECADÊNCIA ADMINISTRATIVA, ATO, APOS... |
| 1 | 1 | OCORRÊNCIA, CASO CONCRETO, AUMENTO, REMUNERAÇÃ... |
| 2 | 1 | CONSTITUCIONALIDADE, DISPOSITIVO, LEI ORGÂNICA... |
| 3 | 1 | DECLARAÇÃO, INCONSTITUCIONALIDADE, LEI ESTADUA... |
| 4 | 0 | NECESSIDADE, FUNDAMENTAÇÃO IDÔNEA, RECUSA, SUB... |
| ... | ... | ... |
| 2689 | 4 | OCORRÊNCIA, ERRO DE FATO, ACÓRDÃO, STF, DECLAR... |
| 2690 | 2 | PRINCÍPIO DA ANTERIORIDADE NONAGESIMAL, INTEGR... |
| 2691 | 1 | DECRETO PRESIDENCIAL, HOMOLOGAÇÃO, DEMARCAÇÃO,... |
| 2692 | 0 | PP0021, PRISÃO PREVENTIVA, PRAZO, EXCESSO, SUP... |


Training in this notebook follows the same structure as in the notebook that creates the Truncated model: the tokenizer is downloaded, and classes are defined for the Dataset, the DataModule and the model itself.

Finally, the Truncated model is loaded from Drive and trained again, using the dataframe with the indexação values as input.
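
The dataset class in this notebook asks the tokenizer for fixed-length inputs (`padding="max_length"`, `truncation=True`). The sketch below illustrates the idea in plain Python. It is a simplification under stated assumptions: `encode_fixed_length` is a hypothetical helper, and the special-token ids are illustrative placeholders, since the real values come from the tokenizer's vocabulary.

```python
from typing import List, Tuple

# Illustrative special-token ids; the real values come from the tokenizer vocabulary
CLS_ID, SEP_ID, PAD_ID = 101, 102, 0


def encode_fixed_length(token_ids: List[int], max_length: int) -> Tuple[List[int], List[int]]:
    """Truncate and pad token ids to exactly max_length, returning ids and attention mask."""
    body = token_ids[: max_length - 2]            # leave room for [CLS] and [SEP]
    input_ids = [CLS_ID] + body + [SEP_ID]
    attention_mask = [1] * len(input_ids)         # 1 marks real tokens
    padding = max_length - len(input_ids)
    input_ids += [PAD_ID] * padding               # pad on the right
    attention_mask += [0] * padding               # 0 marks padding
    return input_ids, attention_mask


ids, mask = encode_fixed_length([7, 8, 9], max_length=8)
print(ids)   # [101, 7, 8, 9, 102, 0, 0, 0]
print(mask)  # [1, 1, 1, 1, 1, 0, 0, 0]
```

Every example then has the same length, so batches can be stacked into a single tensor, and the attention mask tells the model which positions to ignore.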

In [9]:
class LawDocumentDataset(Dataset):
  def __init__(self, dataframe: pd.DataFrame, tokenizer: BertTokenizer, max_token_length: int=512):
    self.dataframe = dataframe
    self.tokenizer = tokenizer
    self.max_token_length = max_token_length

  def __len__(self):
    return len(self.dataframe)

  def __getitem__(self, index: int):
    row = self.dataframe.iloc[index]
    summary_document = row.indexacao
    law_branch_id = row.ramo

    encoding = self.tokenizer.encode_plus(
      summary_document,
      add_special_tokens=True,          # Add `[CLS]` and `[SEP]`
      max_length=self.max_token_length,
      return_token_type_ids=False,
      padding="max_length",
      truncation=True,                  # Truncate encoding to the max length
      return_attention_mask=True,       # Return attention mask
      return_tensors="pt"               # Return PyTorch tensor
    )

    labels = np.eye(HYPERPARAMETERS['NUMBER_OF_BRANCHES'])[law_branch_id]  # Return a list with zeros, except for index law_branch_id that assumes one

    return dict(
        summary_document=summary_document,
        input_ids=encoding["input_ids"].flatten(),
        attention_mask=encoding["attention_mask"].flatten(),
        labels=torch.FloatTensor(labels)
    )
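
The label encoding used in `__getitem__` can be examined in isolation. This is a minimal sketch, assuming the same 13 classes as `HYPERPARAMETERS['NUMBER_OF_BRANCHES']`; `one_hot` is a hypothetical helper that wraps the `np.eye(...)[law_branch_id]` expression.

```python
import numpy as np

NUMBER_OF_BRANCHES = 13  # same value as HYPERPARAMETERS['NUMBER_OF_BRANCHES']


def one_hot(law_branch_id: int, number_of_classes: int = NUMBER_OF_BRANCHES) -> np.ndarray:
    """Return a vector of zeros with a single one at position law_branch_id."""
    return np.eye(number_of_classes)[law_branch_id]


label = one_hot(4)
print(label)
print(int(np.argmax(label)))  # recovers the class id: 4
```

Because the targets are float vectors rather than integer class ids, recent `transformers` versions are expected to treat the task as multi-label and apply a binary cross-entropy loss; this is worth confirming against the installed version.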

In [10]:
class LawDocumentDataModule(pl.LightningDataModule):
    
    def __init__(self, train_dataframe: pd.DataFrame, validation_dataframe: pd.DataFrame, tokenizer: BertTokenizer, batch_size: int, max_token_length: int=512):
        super().__init__()
        
        self.train_dataframe = train_dataframe
        self.validation_dataframe = validation_dataframe
        self.tokenizer = tokenizer
        self.batch_size = batch_size
        self.max_token_length = max_token_length

    def setup(self, stage=None):
        print("Train dataframe shape: {train_shape} | Validation dataframe shape: {val_shape}".format(train_shape=self.train_dataframe.shape, val_shape=self.validation_dataframe.shape))

        self.train_dataset = LawDocumentDataset(self.train_dataframe, self.tokenizer, self.max_token_length)
        self.validation_dataset = LawDocumentDataset(self.validation_dataframe, self.tokenizer, self.max_token_length)

    def train_dataloader(self):
        return DataLoader(self.train_dataset, batch_size=self.batch_size, shuffle=True, num_workers=2)

    def val_dataloader(self):
        return DataLoader(self.validation_dataset, batch_size=self.batch_size, shuffle=False, num_workers=2)

In [11]:
tokenizer = BertTokenizer.from_pretrained(CONSTANTS['BERT_MODEL_NAME'])


In [12]:
data_module = LawDocumentDataModule(documents, documents_val, tokenizer, batch_size=HYPERPARAMETERS['BATCH_SIZE'], max_token_length=HYPERPARAMETERS['MAX_NUMBER_TOKENS'])
data_module.setup()
data_module

Train dataframe shape: (2694, 2) | Validation dataframe shape: (829, 11)


<__main__.LawDocumentDataModule at 0x7f6dc9bceb90>

In [13]:
class LawDocumentClassifier(pl.LightningModule):
    
    def __init__(self, number_classes: int, steps_per_epoch: int=None, epochs: int=None, learning_rate: float=2e-5):
        super().__init__()
        
        self.model = BertForSequenceClassification.from_pretrained(
            "neuralmind/bert-large-portuguese-cased",
            num_labels=number_classes,                      # Number of output labels (one per law branch)
            output_attentions=False,                        # Do not return attention weights
            output_hidden_states=False                      # Do not return all hidden states
        )
        self.steps_per_epoch = steps_per_epoch
        self.epochs = epochs
        self.learning_rate = learning_rate
        
    def forward(self, input_ids, attention_mask, labels=None):
        output = self.model(input_ids,
                            attention_mask=attention_mask,
                            labels=labels,
                            return_dict=True)
        
        return output.loss, output.logits
        
    def training_step(self, batch, batch_index):
        input_ids = batch["input_ids"]
        attention_mask = batch["attention_mask"]
        labels = batch["labels"]
        
        loss, outputs = self(input_ids, attention_mask, labels)
        
        self.log("train_loss", loss, prog_bar=True, logger=True)
        
        return {"loss": loss, "predictions": outputs, "labels": labels}
        
    def validation_step(self, batch, batch_index):
        input_ids = batch["input_ids"]
        attention_mask = batch["attention_mask"]
        labels = batch["labels"]
        loss, outputs = self(input_ids, attention_mask, labels)

        classification_labels = self.convert_to_classification_labels(labels.cpu())
        classification_predictions = self.convert_to_classification_labels(outputs.cpu())

        metrics = {
            "validation_loss": loss,
            "validation_accuracy": accuracy_score(classification_labels, classification_predictions),
            "validation_precision": precision_score(classification_labels, classification_predictions, average='micro'),
            "validation_recall": recall_score(classification_labels, classification_predictions, average='micro'),
            "validation_f1": f1_score(classification_labels, classification_predictions, average='micro'),
        }

        self.log_dict(metrics)
        return metrics
        
    def training_epoch_end(self, outputs):
        labels = []
        predictions = []
        
        for output in outputs:
            for output_labels in output["labels"].detach().cpu():
                labels.append(output_labels)
            for output_predictions in output["predictions"].detach().cpu():
                predictions.append(output_predictions)
                
        labels = torch.stack(labels).int()
        predictions = torch.stack(predictions)
            
    def configure_optimizers(self):
        optimizer = AdamW(self.parameters(), lr=self.learning_rate)
        warmup_steps = self.steps_per_epoch // 3
        total_steps = self.steps_per_epoch * self.epochs - warmup_steps
        
        scheduler = get_linear_schedule_with_warmup(optimizer, warmup_steps, total_steps)
        
        return [optimizer], [scheduler]

    def convert_to_classification_labels(self, classifications):
        formatted_classifications = []

        for classification in classifications:
            formatted_classifications.append(np.argmax(classification).flatten())

        return formatted_classifications
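
The numbers passed to the scheduler in `configure_optimizers` can be worked out by hand. This is a minimal sketch, assuming the 2694 training rows reported by `setup()` and the hyperparameters defined earlier; `linear_warmup_factor` is a hypothetical helper that mirrors the linear warmup and decay multiplier applied by `get_linear_schedule_with_warmup`. How often the scheduler is stepped depends on the Lightning configuration and is not modeled here.

```python
def linear_warmup_factor(step: int, warmup_steps: int, total_steps: int) -> float:
    """Learning-rate multiplier: linear ramp up during warmup, then linear decay to zero."""
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))


# Values taken from this notebook: 2694 training rows, batch size 4, 3 epochs
steps_per_epoch = 2694 // 4                            # 673
warmup_steps = steps_per_epoch // 3                    # 224
total_steps = steps_per_epoch * 3 - warmup_steps       # 1795

for step in (0, 112, 224, 1795):
    print(step, linear_warmup_factor(step, warmup_steps, total_steps))
```

The multiplier starts at 0, reaches 1 at the end of warmup (scheduler step 224) and returns to 0 at scheduler step 1795.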

In [14]:
# import gc
# gc.collect()

In [15]:
model = LawDocumentClassifier(HYPERPARAMETERS['NUMBER_OF_BRANCHES'], len(documents) // HYPERPARAMETERS['BATCH_SIZE'], HYPERPARAMETERS['EPOCHS'], HYPERPARAMETERS['LEARNING_RATE'])
model.load_state_dict(torch.load(CONSTANTS['MODEL_PATH']))

Some weights of the model checkpoint at neuralmind/bert-large-portuguese-cased were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.dense.bias', 'cls.predictions.bias', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.weight']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from th

<All keys matched successfully>

In [16]:
trainer = pl.Trainer(max_epochs=HYPERPARAMETERS['EPOCHS'], gpus=1, progress_bar_refresh_rate=30)
trainer.fit(model, data_module)

GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
  f"DataModule.{name} has already been called, so it will not be called again. "
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name  | Type                          | Params
--------------------------------------------------------
0 | model | BertForSequenceClassification | 334 M 
--------------------------------------------------------
334 M     Trainable params
0         Non-trainable params
334 M     Total params
1,337.639 Total estimated model params size (MB)


Global seed set to 13

  f"One of the returned values {set(extra.keys())} has a `grad_fn`. We will detach it automatically"

In [17]:
trainer.logged_metrics

{'epoch': 2,
 'train_loss': tensor(0.1189),
 'validation_accuracy': 0.6201297640800476,
 'validation_f1': 0.6201297640800476,
 'validation_loss': 0.14214029908180237,
 'validation_precision': 0.6201297640800476,
 'validation_recall': 0.6201297640800476}
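
The four validation metrics above are identical. That is expected with `average='micro'` in a single-label problem: every misclassified example counts once as a false positive (for the predicted class) and once as a false negative (for the true class), so micro precision, micro recall and accuracy coincide. The sketch below checks this in plain Python; the labels are illustrative and `micro_precision_recall` is a hypothetical helper.

```python
from typing import List, Tuple


def micro_precision_recall(y_true: List[int], y_pred: List[int]) -> Tuple[float, float]:
    """Micro-averaged precision and recall for single-label predictions."""
    classes = set(y_true) | set(y_pred)
    tp = fp = fn = 0
    for c in classes:
        for t, p in zip(y_true, y_pred):
            tp += (p == c and t == c)   # predicted c, truly c
            fp += (p == c and t != c)   # predicted c, truly another class
            fn += (p != c and t == c)   # truly c, predicted another class
    return tp / (tp + fp), tp / (tp + fn)


# Illustrative labels, not taken from the notebook's data
y_true = [0, 1, 1, 2, 0, 3]
y_pred = [0, 1, 2, 2, 0, 1]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
precision, recall = micro_precision_recall(y_true, y_pred)
print(accuracy, precision, recall)
```

Macro or weighted averaging would give the per-class view that the micro average hides.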

In [18]:
torch.save(model.state_dict(), '/content/drive/My Drive/MAC499 - Kaique e Yurick/Projeto/index_truncated_model.bin')

## Evaluation on the Test Set

With training finished, the model is evaluated on the test set so that it can be compared with the Truncated model without the indexação field.

The procedure is the same as in the metrics notebook that compares the different model-building approaches: the test set is loaded from Drive, and the data is tokenized and then classified by the models.
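
The final classification step reduces each row of logits to the index of its largest value. This is a minimal sketch with made-up logits (three classes for brevity, not real model output), assuming the model returns one row of logits per document in each batch.

```python
import numpy as np

# Illustrative logits for two batches (3 classes for brevity); not real model output
batches = [
    np.array([[2.1, -0.3, 0.4], [0.2, 1.7, -1.0]]),
    np.array([[-0.5, 0.1, 3.2]]),
]

predicted_labels = []
for logits in batches:
    # The highest logit in each row is the predicted class
    predicted_labels.extend(np.argmax(logits, axis=1).tolist())

print(predicted_labels)  # [0, 1, 2]
```

Since only the position of the maximum matters, no softmax or sigmoid is needed to obtain the predicted class.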

In [19]:
test_documents = pd.read_csv(CONSTANTS['TEST_DATASET'])
test_documents.dropna(inplace=True, subset=['cod_acordao'])

In [20]:
# Definition of mapping from law branch name to a numeric identifier

class LawBranch(Enum):
    """Mapping to a Law Branch and an identification. The enum also stores
    the law branch name in a free text form.
    """

    Penal = (0, "Direito Penal (Direito Processual Penal)")
    Administrativo = (1, "Direito Administrativo (Licitações, Contratos Administrativos, Servidores, Desapropriação, Tribunal de Contas, Improbidade, etc.)")
    Tributario = (2, "Direito Tributário/Direito Financeiro")
    Civil = (3, "Direito Civil (Direito Comercial/Direito de Família)")
    Previdenciario = (4, "Direito Previdenciário")
    Trabalho = (5, "Direito do Trabalho")
    Processual_Civil = (6, "Direito Processual Civil")
    Eleitoral = (7, "Direito Eleitoral")
    Consumidor = (8, "Direito do Consumidor")
    Internacional = (9, "Direito Internacional (Público ou Privado)")
    Militar = (10, "Direito Militar")
    Economico = (11, "Direito Econômico (Direito concorrencial e Agências Reguladoras Setoriais, Intervenção no Domínio Econômico)")
    Ambiental = (12, "Direito Ambiental")

    def get_identifier(self) -> int:
        """Retrieves the identifier number for this instance of LawBranch.

        Returns:
            int: identifier of this instance of LawBranch
        """
        return self.value[0]
    
    @staticmethod
    def get_all_names() -> List[str]:
        """Retrieves a list of all names defined in LawBranch enum.

        Returns:
            List[str]: the list of names.
        """
        return [law_branch.name for law_branch in LawBranch]

In [21]:
# Function to tokenize the ementas
def tokenize(ementas):
  """
    @param    ementas (list): List of ementas to be tokenized.
    @return   dataloader (torch.utils.data.DataLoader): DataLoader with the tokenized inputs for prediction
  """
  input_ids = []
  attention_masks = []

  for ementa in ementas:
    encoded_dict = tokenizer.encode_plus(
        ementa,
        add_special_tokens = True,
        max_length = HYPERPARAMETERS['MAX_NUMBER_TOKENS'],
        padding = 'max_length',
        truncation = True,
        return_attention_mask = True,
        return_tensors = 'pt',
        return_token_type_ids=False,
    )
        
    input_ids.append(encoded_dict['input_ids'])
    attention_masks.append(encoded_dict['attention_mask'])

  input_ids = torch.cat(input_ids, dim=0)
  attention_masks = torch.cat(attention_masks, dim=0)

  prediction_data = TensorDataset(input_ids, attention_masks)
  prediction_sampler = SequentialSampler(prediction_data)
  return DataLoader(prediction_data, sampler=prediction_sampler, batch_size=HYPERPARAMETERS['BATCH_SIZE'])


# Function that calls the model to classify the rulings
def classify(acordaos, model):
  """
    @param  acordaos (pd.DataFrame): Dataframe with the rulings to be classified.
    @param  model (LawDocumentClassifier): Fine-tuned classifier; its forward pass returns (loss, logits).
    @return classifications (dict): Dictionary with the true labels, predicted labels and ementas.
  """
  classifications = {
      'True Label': [],
      'Predicted Label': [],
      'Ementa': []
  }

  for _, row in acordaos.iterrows():
    classifications['Ementa'].append(row['ementa'])
    classifications['True Label'].append(row['ramo'])
  prediction_dataloader = tokenize(classifications['Ementa'])
  
  model.eval()
  predictions = []
  for batch in prediction_dataloader:
    batch = tuple(t.to(device) for t in batch)
    
    b_input_ids, b_input_mask = batch

    with torch.no_grad():
        _, outputs_logits = model(b_input_ids, attention_mask=b_input_mask)

    logits = outputs_logits
    logits = logits.detach().cpu().numpy()    
    predictions.append(logits)

  for prediction_batch in predictions:
    predicted_labels = np.argmax(prediction_batch, axis=1).flatten()
    for prediction in predicted_labels:
        classifications['Predicted Label'].append(prediction)

  return classifications

In [26]:
truncated_model = LawDocumentClassifier(HYPERPARAMETERS['NUMBER_OF_BRANCHES'], len(documents) // HYPERPARAMETERS['BATCH_SIZE'], HYPERPARAMETERS['EPOCHS'], HYPERPARAMETERS['LEARNING_RATE'])
truncated_model.load_state_dict(torch.load(CONSTANTS['MODEL_PATH']))

Some weights of the model checkpoint at neuralmind/bert-large-portuguese-cased were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.dense.bias', 'cls.predictions.bias', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.weight']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from th

<All keys matched successfully>

In [27]:
model.cuda()
index_classifications = classify(test_documents, model)
truncated_model.cuda()
truncated_classifications = classify(test_documents, truncated_model)

In [28]:
print(classification_report(index_classifications['True Label'], index_classifications['Predicted Label'], target_names=LawBranch.get_all_names()))

                  precision    recall  f1-score   support

           Penal       0.97      0.97      0.97       334
  Administrativo       0.88      0.93      0.91       259
      Tributario       0.78      0.83      0.80        46
           Civil       0.67      0.17      0.27        12
  Previdenciario       0.79      0.94      0.86        32
        Trabalho       0.92      0.73      0.81        15
Processual_Civil       0.62      0.54      0.58        48
       Eleitoral       1.00      0.80      0.89         5
      Consumidor       0.00      0.00      0.00         4
   Internacional       0.87      1.00      0.93        53
         Militar       0.86      0.92      0.89        13
       Economico       0.00      0.00      0.00         3
       Ambiental       0.00      0.00      0.00         5

        accuracy                           0.89       829
       macro avg       0.64      0.60      0.61       829
    weighted avg       0.88      0.89      0.88       829



  _warn_prf(average, modifier, msg_start, len(result))


In [29]:
print(classification_report(truncated_classifications['True Label'], truncated_classifications['Predicted Label'], target_names=LawBranch.get_all_names()))

                  precision    recall  f1-score   support

           Penal       0.98      0.97      0.98       334
  Administrativo       0.84      0.96      0.89       259
      Tributario       0.82      0.78      0.80        46
           Civil       1.00      0.08      0.15        12
  Previdenciario       0.83      0.91      0.87        32
        Trabalho       0.92      0.73      0.81        15
Processual_Civil       0.71      0.52      0.60        48
       Eleitoral       1.00      0.80      0.89         5
      Consumidor       0.00      0.00      0.00         4
   Internacional       0.90      0.98      0.94        53
         Militar       0.92      0.85      0.88        13
       Economico       0.00      0.00      0.00         3
       Ambiental       0.00      0.00      0.00         5

        accuracy                           0.90       829
       macro avg       0.69      0.58      0.60       829
    weighted avg       0.88      0.90      0.88       829



  _warn_prf(average, modifier, msg_start, len(result))
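
The gap between the macro and weighted averages in both reports follows from class imbalance. The sketch below recomputes the two averages from the per-class f1-scores and supports printed above. Because the inputs are already rounded to two decimals, it reproduces the reports only to that precision; `macro_average` and `weighted_average` are hypothetical helper names.

```python
# Per-class f1-scores and supports copied from the two reports above
support = [334, 259, 46, 12, 32, 15, 48, 5, 4, 53, 13, 3, 5]
f1_index = [0.97, 0.91, 0.80, 0.27, 0.86, 0.81, 0.58, 0.89, 0.00, 0.93, 0.89, 0.00, 0.00]
f1_truncated = [0.98, 0.89, 0.80, 0.15, 0.87, 0.81, 0.60, 0.89, 0.00, 0.94, 0.88, 0.00, 0.00]


def macro_average(scores):
    """Unweighted mean: every class counts equally, regardless of size."""
    return sum(scores) / len(scores)


def weighted_average(scores, weights):
    """Mean weighted by support: large classes dominate the result."""
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)


for name, f1 in (("index", f1_index), ("truncated", f1_truncated)):
    print(name, round(macro_average(f1), 2), round(weighted_average(f1, support), 2))
```

The three classes with no correct predictions (Consumidor, Economico and Ambiental, 12 test documents in total) pull the macro average down to about 0.6 for both models, while the weighted average stays at 0.88 because Penal and Administrativo account for most of the 829 test documents.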
