<a href="https://colab.research.google.com/github/jtneumann/GHwithNLP-SA_LP/blob/master/nn_based_sentiment_(wk4).ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Setting up our environment
Before installing the packages we'll use, we have to switch the Colab runtime from the default setting to GPU. Go to Runtime >> Change runtime type and select GPU as the hardware accelerator.
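
Once the runtime is switched, it can be worth confirming that a GPU is actually visible before starting the long installation steps. A minimal sketch, assuming PyTorch is already present in the runtime (`gpu_available` is a hypothetical helper name, not part of any library):

```python
import importlib.util


def gpu_available():
    """Return True if PyTorch is installed and can see a CUDA device."""
    # find_spec avoids an ImportError when torch is missing from the runtime
    if importlib.util.find_spec("torch") is None:
        return False
    import torch

    return torch.cuda.is_available()


print("GPU available:", gpu_available())
```

If this prints `False` on Colab, the runtime type is most likely still set to the default.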

We'll store our input and output data on Google Drive, so before installing the required packages we also have to connect the Colab notebook to our Google Drive account.

In [None]:
from google.colab import drive
drive.mount('/content/drive', force_remount=True)

Mounted at /content/drive


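Now we are ready to install the packages listed in the requirements.txt file. The path to your Drive folder starts with "/content/drive/My\ Drive/"; the backslash escapes the space so that the shell treats the path as a single argument.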
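
Why the space needs escaping can be shown with Python's standard `shlex` module. This is only an illustration; the `requirements.txt` location below is a hypothetical placeholder, not the project's real folder:

```python
import shlex

# Hypothetical location; substitute the folder you use on your own Drive.
req_path = "/content/drive/My Drive/my_project/requirements.txt"

# Unquoted, the shell would see two arguments instead of one path.
print(shlex.split("pip install -r " + req_path))

# shlex.quote wraps the path so that it survives as a single argument.
command = "pip install -r " + shlex.quote(req_path)
print(command)
print(shlex.split(command))
```

Escaping the space with a backslash, as in the cell below, has the same effect as quoting the whole path.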

In [None]:
!pip install -r /content/drive/My\ Drive/crowintelligence/projektek/manning/sentiment_analysis_project/Colab/requirements.txt

Collecting altair==4.0.1
  Downloading https://files.pythonhosted.org/packages/a8/07/d8acf03571db619ff117df5730dd5c0b1ad0822aa02ad1084d73e2659442/altair-4.0.1-py3-none-any.whl (708kB)
Collecting imbalanced-learn==0.6.2
  Downloading https://files.pythonhosted.org/packages/c8/73/36a13185c2acff44d601dc6107b5347e075561a49e15ddd4e69988414c3e/imbalanced_learn-0.6.2-py3-none-any.whl (163kB)
Collecting Keyness==0.25
  Downloading https://files.pythonhosted.org/packages/30/1c/2324f377631362dec550e4fcbb107af5628c34674711eff33660fb8e32fa/Keyness-0.25.tar.gz
Collecting nltk==3.4.5
  Downloading https://files.pythonhosted.org/packages/f6/1d/d925cfb4f324ede997f6d47bea4d9babba51b49e87a767c170b77005889d/nltk-3.4.5.zip (1.5MB)
Collecting simpletransformers==0.22.1
  Downloading https://files.pythonh

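Next, install NVIDIA's apex, which provides the mixed-precision training (optimization level O1) that appears in the training logs later in this notebook.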

In [None]:
%%writefile setup.sh

export CUDA_HOME=/usr/local/cuda-10.1
git clone https://github.com/NVIDIA/apex
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./apex

Writing setup.sh


In [None]:
!sh setup.sh

Cloning into 'apex'...
remote: Enumerating objects: 4, done.
remote: Counting objects: 100% (4/4), done.
remote: Compressing objects: 100% (4/4), done.
remote: Total 6593 (delta 0), reused 0 (delta 0), pack-reused 6589
Receiving objects: 100% (6593/6593), 13.70 MiB | 6.95 MiB/s, done.
Resolving deltas: 100% (4383/4383), done.
  cmdoptions.check_install_build_global(options)
Created temporary directory: /tmp/pip-ephem-wheel-cache-tx8uddbh
Created temporary directory: /tmp/pip-req-tracker-o7cgbejd
Created requirements tracker '/tmp/pip-req-tracker-o7cgbejd'
Created temporary directory: /tmp/pip-install-fznnpnmz
Processing ./apex
  Created temporary directory: /tmp/pip-req-build-oz6wu_wv
  Added file:///content/apex to build tracker '/tmp/pip-req-tracker-o7cgbejd'
    Running setup.py (path:/tmp/pip-req-build-oz6wu_wv/setup.py) egg_info for package from file:///content/apex
    Running command python setup.py egg_info
    torch.__version__  =  1.4.0
    running egg_info
    cr

# Using transformers' built-in sentiment analyzer pipeline

In [None]:
import warnings
warnings.filterwarnings("ignore")

import pandas as pd
from sklearn.metrics import classification_report
from transformers import pipeline

nlp = pipeline("sentiment-analysis")

df = pd.read_csv("/content/drive/My Drive/crowintelligence/projektek/manning/sentiment_analysis_project/Colab/data/processed/annotation.tsv", sep="\t")

reviews = df["reviews"]
rating_classes = df["rating_class"]


# there are two built-in categories, positive and negative
# we use the label probability to classify it into three categories
def classify_sentiment(text, threshold):
  s = nlp(text)[0]
  label = s["label"]
  score = s["score"]
  if label == "NEGATIVE" and score > threshold:
    return 0
  elif label == "POSITIVE" and score > threshold:
    return 2
  else:
    return 1


target_names = ["negative", "neutral", "positive"]
thresholds = [0.95]
# thresholds = [0.65, 0.75, 0.85, 0.95, 0.99]
reports = []
for th in thresholds:
  sentiment_values = [classify_sentiment(r, th) for r in reviews]
  report = classification_report(
      rating_classes, sentiment_values, target_names=target_names
      )
  reports.append(report)

for report in reports:
  print(report)


FileNotFoundError: ignored
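
To see how the threshold turns the pipeline's two labels into three classes without downloading the model, the decision rule can be exercised on its own. This is a sketch with made-up pipeline outputs; the `fake_outputs` list and the `to_three_classes` helper are illustrative assumptions, not real predictions:

```python
def to_three_classes(label, score, threshold):
    """Map a (label, score) pair to 0=negative, 1=neutral, 2=positive."""
    if score > threshold:
        if label == "NEGATIVE":
            return 0
        if label == "POSITIVE":
            return 2
    # low-confidence predictions (or unknown labels) count as neutral
    return 1


# Made-up outputs in the shape the pipeline returns: {"label": ..., "score": ...}
fake_outputs = [
    {"label": "POSITIVE", "score": 0.999},
    {"label": "POSITIVE", "score": 0.80},
    {"label": "NEGATIVE", "score": 0.97},
    {"label": "NEGATIVE", "score": 0.60},
]

results = {}
for th in [0.65, 0.95]:
    results[th] = [
        to_three_classes(o["label"], o["score"], th) for o in fake_outputs
    ]
    print(th, results[th])
```

Raising the threshold moves more reviews into the neutral class, which is why the commented-out list of thresholds in the cell above is worth sweeping.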

# Train your own classifier

First, let's make train and test corpora.

In [None]:
import pandas as pd
from sklearn.model_selection import train_test_split
from simpletransformers.classification import ClassificationModel

# read corpus
df = pd.read_csv("/content/drive/My Drive/crowintelligence/projektek/manning/sentiment_analysis_project/Colab/data/processed/training.tsv", sep="\t")
print(len(df))
# stratified train-test split on the data frame: the default test size is 25%,
# and stratify keeps the class proportions the same in both parts
train_df, test_df = train_test_split(df,
                                     stratify=df["rating_class"],
                                     random_state=42)

# save train and test corpora
train_df.to_csv("/content/drive/My Drive/crowintelligence/projektek/manning/sentiment_analysis_project/Colab/data/processed/train.tsv", index=False, sep="\t")
test_df.to_csv("/content/drive/My Drive/crowintelligence/projektek/manning/sentiment_analysis_project/Colab/data/processed/test.tsv", index=False, sep="\t")

4500
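
The effect of `stratify` can be illustrated with a small standard-library sketch. It is not scikit-learn's implementation, and the balanced 1,500-per-class label list is an assumption suggested by the equal class supports in the classification reports, not something read from `training.tsv`:

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, test_fraction=0.25, seed=42):
    """Return (train_idx, test_idx) with equal class proportions in both."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return train_idx, test_idx


# Assumed toy labels: 4,500 rows, 1,500 per class
labels = [0] * 1500 + [1] * 1500 + [2] * 1500
train_idx, test_idx = stratified_split(labels)

print(len(train_idx), len(test_idx))
print(Counter(labels[i] for i in test_idx))
```

With a 25% test share, 4,500 rows split into 3,375 training rows and 1,125 test rows, which agrees with the total support of 1,125 in the classification reports.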


Now we can train a three-class classifier on top of the pretrained distilbert-base-uncased model, which encodes the reviews.

In [None]:
model = ClassificationModel(
    "distilbert",
    "distilbert-base-uncased",
    use_cuda=True,
    num_labels=3,
    args={
        "output_dir": "/content/drive/My Drive/crowintelligence/projektek/manning/sentiment_analysis_project/Colab/outputs/",
        "best_model_dir": "/content/drive/My Drive/crowintelligence/projektek/manning/sentiment_analysis_project/Colab/outputs/best_model/",
        "reprocess_input_data": True,
        "sliding_window": True,
        "overwrite_output_dir": True,
        "max_seq_length": 512,
        "num_train_epochs": 20,
        "train_batch_size": 20,
        "eval_batch_size": 20,
    },
)


# pass the evaluation data by keyword (eval_df), as the later training cell does
model.train_model(train_df, eval_df=test_df)


  "Dataframe headers not specified. Falling back to using column 0 as text and column 1 as labels."



Selected optimization level O1:  Insert automatic casts around Pytorch functions and Tensor methods.

Defaults for this optimization level are:
enabled                : True
opt_level              : O1
cast_model_type        : None
patch_torch_functions  : True
keep_batchnorm_fp32    : None
master_weights         : None
loss_scale             : dynamic
Processing user overrides (additional kwargs that are not None)...
After processing overrides, optimization options are:
enabled                : True
opt_level              : O1
cast_model_type        : None
patch_torch_functions  : True
keep_batchnorm_fp32    : None
master_weights         : None
loss_scale             : dynamic


[Progress widgets omitted: 20 epochs, 193 iterations per epoch]

Running loss: 1.098835
Running loss: 1.101221
Running loss: 1.031914
Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 32768.0
Running loss: 0.743980
Running loss: 0.474351
Running loss: 0.862221
Running loss: 0.491847
Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 16384.0
Running loss: 0.419262
Running loss: 0.092381
Running loss: 0.265601
Running loss: 0.379687
Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 8192.0
Running loss: 0.001650
Running loss: 0.134456
Running loss: 0.025375
Running loss: 0.001950
Running loss: 0.193915
Running loss: 0.058893
Running loss: 0.036110
Running loss: 0.000359
Running loss: 0.024378
Running loss: 0.091759
Running loss: 0.000661
Running loss: 0.019234
Running loss: 0.018882
Running loss: 0.000250



Let's evaluate our newly trained model.

In [None]:
from sklearn.metrics import classification_report

result, model_outputs, wrong_predictions = model.eval_model(test_df)

target_names = ["negative", "neutral", "positive"]
predicted_class = [list(e[0]) for e in model_outputs]
predicted_class = [e.index(max(e)) for e in predicted_class]
print(
    classification_report(
        list(test_df["rating_class"]), predicted_class, target_names=target_names
    )
)

  "Dataframe headers not specified. Falling back to using column 0 as text and column 1 as labels."


              precision    recall  f1-score   support

    negative       0.76      0.67      0.71       375
     neutral       0.55      0.62      0.58       375
    positive       0.75      0.75      0.75       375

    accuracy                           0.68      1125
   macro avg       0.69      0.68      0.68      1125
weighted avg       0.69      0.68      0.68      1125
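
With `sliding_window` enabled, a long review is split into several windows, and the evaluation code above uses only the scores of the first window (`e[0]`). One alternative is to average the scores over all windows before taking the argmax. The sketch below assumes each entry of `model_outputs` holds one row of class scores per window; the numbers are made up for illustration, and `predict_from_windows` is a hypothetical helper:

```python
def predict_from_windows(window_scores):
    """Average per-window class scores and return the index of the best class."""
    n_windows = len(window_scores)
    n_classes = len(window_scores[0])
    mean_scores = [
        sum(window[c] for window in window_scores) / n_windows
        for c in range(n_classes)
    ]
    return mean_scores.index(max(mean_scores))


# Made-up outputs: the first review has one window, the second has two
toy_outputs = [
    [[2.1, 0.3, -1.0]],
    [[0.9, 0.8, -0.5], [-1.2, 0.4, 2.5]],
]

first_window = [list(e[0]).index(max(e[0])) for e in toy_outputs]
averaged = [predict_from_windows(e) for e in toy_outputs]
print(first_window)
print(averaged)
```

In this toy case the two strategies disagree on the second review, because the strongest signal sits in its second window. Whether averaging helps on the real data would have to be checked against the test set.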




# Fine-tune distilbert

## Preprocessing

In [None]:
import random

import pandas as pd

# from transformers import DistilBertTokenizer

df_train = pd.read_csv("/content/drive/My Drive/crowintelligence/projektek/manning/sentiment_analysis_project/Colab/data/processed/train.tsv", sep="\t")
reviews_train = list(df_train["reviews"])
reviews_train = [r.lower().strip() for r in reviews_train]
ratings_train = df_train["rating_class"]

df_test = pd.read_csv("/content/drive/My Drive/crowintelligence/projektek/manning/sentiment_analysis_project/Colab/data/processed/test.tsv", sep="\t")
reviews_test = list(df_test["reviews"])
reviews_test = [r.lower().strip() for r in reviews_test]
ratings_test = df_test["rating_class"]

with open("/content/drive/My Drive/crowintelligence/projektek/manning/sentiment_analysis_project/Colab/data/raw/reviews_without_ratings.txt", "r") as f:
    # skip empty lines so that no blank "review" ends up in the sample
    reviews = [r for r in f.read().split("\n") if r.strip()]

evalset = random.sample(reviews, 500)
evalset = [r.lower().strip() for r in evalset]
# tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-uncased',
#                                           unk_token='<unk>')

# tokenized_train = [" ".join(tokenizer.tokenize(r)) for r in reviews_train]
# tokenized_test = [" ".join(tokenizer.tokenize(r)) for r in reviews_test]

all_train = "\n".join(reviews_train)
all_test = "\n".join(reviews_test)
all_eval = "\n".join(evalset)
with open("/content/drive/My Drive/crowintelligence/projektek/manning/sentiment_analysis_project/Colab/data/processed/train.txt", "w") as outfile:
    outfile.write(all_train)

with open("/content/drive/My Drive/crowintelligence/projektek/manning/sentiment_analysis_project/Colab/data/processed/test.txt", "w") as outfile:
    outfile.write(all_test)

with open("/content/drive/My Drive/crowintelligence/projektek/manning/sentiment_analysis_project/Colab/data/processed/eval.txt", "w") as outfile:
    outfile.write(all_eval)
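
The three text files are written with one review per line, so a review that itself contains a line break would be spread over several lines. A minimal sketch of a normalization step that avoids this; the `to_single_line` helper and the sample texts are hypothetical and not taken from the corpus:

```python
def to_single_line(text):
    """Lowercase the text and collapse all whitespace, including newlines."""
    return " ".join(text.lower().split())


# Made-up sample reviews
samples = ["Great phone!\nBattery lasts two days.", "  Too   expensive. ", ""]
cleaned = [to_single_line(s) for s in samples]
cleaned = [s for s in cleaned if s]  # drop entries that are empty after cleaning
print(cleaned)
```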


## Train your own language model based on distilbert

In [None]:
from simpletransformers.language_modeling import LanguageModelingModel


train_args = {
    "output_dir": "/content/drive/My Drive/crowintelligence/projektek/manning/sentiment_analysis_project/Colab/langmods/",
    "best_model_dir": "/content/drive/My Drive/crowintelligence/projektek/manning/sentiment_analysis_project/Colab/langmods/best_model/",
    "reprocess_input_data": True,
    "overwrite_output_dir": True,
    "num_train_epochs": 10,
    "evaluate_during_training": True,
}

model = LanguageModelingModel('distilbert', 'distilbert-base-uncased',
                              use_cuda=True,
                              args=train_args)
model.train_model("/content/drive/My Drive/crowintelligence/projektek/manning/sentiment_analysis_project/Colab/data/processed/train.txt",
                  eval_file="/content/drive/My Drive/crowintelligence/projektek/manning/sentiment_analysis_project/Colab/data/processed/test.txt")

model.eval_model("/content/drive/My Drive/crowintelligence/projektek/manning/sentiment_analysis_project/Colab/data/processed/eval.txt")


Selected optimization level O1:  Insert automatic casts around Pytorch functions and Tensor methods.

Defaults for this optimization level are:
enabled                : True
opt_level              : O1
cast_model_type        : None
patch_torch_functions  : True
keep_batchnorm_fp32    : None
master_weights         : None
loss_scale             : dynamic
Processing user overrides (additional kwargs that are not None)...
After processing overrides, optimization options are:
enabled                : True
opt_level              : O1
cast_model_type        : None
patch_torch_functions  : True
keep_batchnorm_fp32    : None
master_weights         : None
loss_scale             : dynamic


[Progress widgets omitted: 10 epochs, 145 iterations per epoch]

Running loss: 3.590918
Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 32768.0
Running loss: 3.626575
Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 16384.0
Running loss: 3.633977
Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 8192.0
Running loss: 2.772837
Running loss: 2.789246
Running loss: 2.901237
Running loss: 2.210917
Running loss: 2.084237
Running loss: 2.025275
Running loss: 3.854798
Running loss: 3.123387
Running loss: 1.930326
Running loss: 1.861753
Running loss: 2.016312




{'eval_loss': 2.422663921659643, 'perplexity': tensor(11.2759)}
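
The reported perplexity is the exponential of the evaluation loss, so the two numbers in this output can be checked against each other with a short calculation:

```python
import math

eval_loss = 2.422663921659643  # value copied from the output above
perplexity = math.exp(eval_loss)
print(round(perplexity, 4))
```

The result agrees with the reported `tensor(11.2759)` to the printed precision. A lower evaluation loss therefore always means a lower perplexity.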

# Build a classifier using your own language model

In [None]:
model2 = ClassificationModel(
    model_type="distilbert",
    model_name="/content/drive/My Drive/crowintelligence/projektek/manning/sentiment_analysis_project/Colab/langmods/best_model/",
    use_cuda=True,
    num_labels=3,
    args={
        "output_dir": "/content/drive/My Drive/crowintelligence/projektek/manning/sentiment_analysis_project/Colab/outputs2/",
        "best_model_dir": "/content/drive/My Drive/crowintelligence/projektek/manning/sentiment_analysis_project/Colab/outputs2/best_model/",
        "evaluate_during_training": True,
        "reprocess_input_data": True,
        "sliding_window": True,
        "overwrite_output_dir": True,
        "max_seq_length": 512,
        "num_train_epochs": 20,
        "train_batch_size": 20,
        "eval_batch_size": 20,
    },
)

# training is slow: a few epochs take about an hour
model2.train_model(train_df, eval_df=test_df)


  "Dataframe headers not specified. Falling back to using column 0 as text and column 1 as labels."


Selected optimization level O1:  Insert automatic casts around Pytorch functions and Tensor methods.

Defaults for this optimization level are:
enabled                : True
opt_level              : O1
cast_model_type        : None
patch_torch_functions  : True
keep_batchnorm_fp32    : None
master_weights         : None
loss_scale             : dynamic
Processing user overrides (additional kwargs that are not None)...
After processing overrides, optimization options are:
enabled                : True
opt_level              : O1
cast_model_type        : None
patch_torch_functions  : True
keep_batchnorm_fp32    : None
master_weights         : None
loss_scale             : dynamic


[Progress widgets omitted: 20 epochs, 193 iterations per epoch]

Running loss: 1.070511
Running loss: 1.079139
Running loss: 0.574585
Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 32768.0
Running loss: 0.800024

  "Dataframe headers not specified. Falling back to using column 0 as text and column 1 as labels."

Running loss: 0.724640
Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 16384.0
Running loss: 0.613212
Running loss: 0.669993
Running loss: 0.511302
Running loss: 0.010672
Running loss: 0.115390
Running loss: 0.000839
Running loss: 0.115753
Running loss: 0.016286
Running loss: 0.017435
Running loss: 0.000258
Running loss: 0.108820
Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 16384.0
Running loss: 0.000343
Running loss: 0.016846
Running loss: 0.000191
Running loss: 0.000175
Running loss: 0.000161
Running loss: 0.000157
Running loss: 0.000133
Running loss: 0.074045
Running loss: 0.000099



## Evaluate it

In [None]:
from sklearn.metrics import classification_report

result2, model_outputs2, wrong_predictions2 = model2.eval_model(test_df)
target_names = ["negative", "neutral", "positive"]
predicted_class2 = [list(e[0]) for e in model_outputs2]
predicted_class2 = [e.index(max(e)) for e in predicted_class2]
print(
    classification_report(
        list(test_df["rating_class"]), predicted_class2, target_names=target_names
    )
)


  "Dataframe headers not specified. Falling back to using column 0 as text and column 1 as labels."


              precision    recall  f1-score   support

    negative       0.78      0.66      0.71       375
     neutral       0.54      0.64      0.59       375
    positive       0.75      0.73      0.74       375

    accuracy                           0.68      1125
   macro avg       0.69      0.68      0.68      1125
weighted avg       0.69      0.68      0.68      1125
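
The macro average in each report is the unweighted mean of the three per-class scores. Recomputing the macro F1 from the rounded values printed in the two reports gives a quick side-by-side comparison of the two classifiers (`macro_average` is a small helper written for this sketch):

```python
def macro_average(scores):
    """Unweighted mean of per-class scores."""
    return sum(scores) / len(scores)


# Per-class F1 (negative, neutral, positive), copied from the two reports
f1_pretrained = [0.71, 0.58, 0.75]  # classifier on distilbert-base-uncased
f1_own_lm = [0.71, 0.59, 0.74]      # classifier on our own language model

macro_pretrained = macro_average(f1_pretrained)
macro_own_lm = macro_average(f1_own_lm)
print(round(macro_pretrained, 2), round(macro_own_lm, 2))
```

Because the inputs are already rounded to two decimals, this reproduces the reports only approximately. Both classifiers come out at a macro F1 of about 0.68, in line with the `macro avg` rows.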

