# BERT Weather Condition Custom Training and Sentiment Analysis Evaluation

Import the necessary libraries and packages: pandas for DataFrames and in-memory data management, NumPy for numerical operations, wordcloud, matplotlib, and seaborn for visualization, and scikit-learn for splitting the data into training and evaluation sets.

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from wordcloud import WordCloud
import os

Read the data into a pandas DataFrame, keeping only the summary and icon columns.

In [2]:
path = "daily_weather_2020.csv"  # path to the data CSV
if os.path.exists(path):
    print("File Found")
    # Load only the two columns used below, both as strings
    df = pd.read_csv(path, usecols=['summary', 'icon'], low_memory=True, dtype=str)
else:
    print("no file found")

File Found


# Data Cleaning and Validation setup

Rename the columns and confirm that the data loaded as expected. Each value in the icon column is then replaced with a sentiment label: rain, snow, wind, and fog become negative, clear-day becomes positive, and partly-cloudy-day and cloudy become neutral. These labels serve as the ground truth when training and evaluating the model. Every row in the dataset is labeled.

In [3]:
# Rename the columns, then print basic information about the dataframe
df = df.rename(columns={'summary':'statement', 'icon': 'sentiment'})
df.info()

# Icon value -> sentiment label
ICON_TO_SENTIMENT = {
    'rain': 'negative', 'snow': 'negative', 'wind': 'negative', 'fog': 'negative',
    'clear-day': 'positive',
    'partly-cloudy-day': 'neutral', 'cloudy': 'neutral',  # neutral (possibly not enough data)
}

# Overwrite the sentiment column of df in place (the CSV on disk is untouched)
def setValidation():
    print("SETTING UP DATA VALIDATION")
    # Values not listed in the mapping are left unchanged
    df['sentiment'] = df['sentiment'].replace(ICON_TO_SENTIMENT)
    print("VALIDATION HAS BEEN SET UP")

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 30688 entries, 0 to 30687
Data columns (total 2 columns):
 #   Column     Non-Null Count  Dtype 
---  ------     --------------  ----- 
 0   statement  30685 non-null  object
 1   sentiment  30688 non-null  object
dtypes: object(2)
memory usage: 479.6+ KB
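
The `df.info()` output lists 30685 non-null `statement` values out of 30688 rows, so three rows have no summary text. A tokenizer generally expects string input, so one option is to drop those rows before training. The following is a minimal sketch on a toy frame and is not part of the original notebook; `toy`, `n_missing`, and `clean` are illustrative names.

```python
import pandas as pd

# Toy stand-in for the weather dataframe (values are illustrative only)
toy = pd.DataFrame({
    'statement': ['Clear throughout the day.', None, 'Light rain in the morning.'],
    'sentiment': ['clear-day', 'rain', 'rain'],
})

# Count rows whose statement is missing
n_missing = int(toy['statement'].isna().sum())

# Drop those rows and rebuild a clean 0..n-1 index
clean = toy.dropna(subset=['statement']).reset_index(drop=True)

print(n_missing, len(clean))
```

Applying the same step to `df` would reduce the row counts reported in later cells by the number of dropped rows.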


The labels that were set up can be checked by running the snippet below, which counts the rows in each sentiment class. This step can be time-consuming and resource-intensive on large datasets, so it can be skipped when reusing a dataset that has already been checked.

In [4]:
# check the validation edit worked
def checkValidationSetting():
    print("CHECKING VALIDATION DATA")
    favorable_count = 0
    unfavorable_count = 0
    neutral_count = 0

    for i in range(len(df.sentiment)):
        if(df.sentiment[i] == 'positive'):
            favorable_count +=1
        elif(df.sentiment[i] == 'negative'):
            unfavorable_count +=1
        elif(df.sentiment[i] == 'neutral'):
            neutral_count +=1

    print('Favorable Count: ' + str(favorable_count))
    print('Unfavorable Count: ' + str(unfavorable_count))
    print('Neutral Count: ' + str(neutral_count))


setValidation()
checkValidationSetting()
print('PROCESS COMPLETED')

SETTING UP DATA VALIDATION
VALIDATION HAS BEEN SET UP
CHECKING VALIDATION DATA
Favorable Count: 8527
Unfavorable Count: 12904
Neutral Count: 9257
PROCESS COMPLETED
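
The three counts above sum to 30688, the number of rows reported by `df.info()`, so every row received one of the three labels. The same tally can also be obtained without a Python loop. The following is a sketch on a toy series, assuming the `sentiment` column already holds the three labels; the variable names are illustrative.

```python
import pandas as pd

# Toy stand-in for df['sentiment'] (illustrative values)
sentiment = pd.Series(['positive', 'negative', 'negative',
                       'neutral', 'positive', 'negative'])

# Count how often each label occurs
counts = sentiment.value_counts()

# .get returns 0 for a label that never occurs
favorable = int(counts.get('positive', 0))
unfavorable = int(counts.get('negative', 0))
neutral = int(counts.get('neutral', 0))

print('Favorable Count: ' + str(favorable))
print('Unfavorable Count: ' + str(unfavorable))
print('Neutral Count: ' + str(neutral))
```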


# Test and Train DataFrames

In [5]:
train, eva = train_test_split(df, test_size=0.1)  # hold out 10% of the rows for evaluation
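
The split above is random and unseeded, so the rows in each part differ between runs. If repeatable splits with matching class proportions are wanted, `train_test_split` accepts `random_state` and `stratify`. The following is a minimal sketch on a toy frame, not the notebook's own split; the seed value and the toy data are arbitrary assumptions.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy frame with an imbalanced sentiment column (illustrative values)
toy = pd.DataFrame({
    'statement': [f'summary {i}' for i in range(100)],
    'sentiment': ['negative'] * 50 + ['neutral'] * 30 + ['positive'] * 20,
})

train_toy, eval_toy = train_test_split(
    toy,
    test_size=0.1,
    stratify=toy['sentiment'],  # keep class proportions the same in both parts
    random_state=42,            # arbitrary seed, chosen only for repeatability
)

print(len(train_toy), len(eval_toy))
print(eval_toy['sentiment'].value_counts().to_dict())
```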

# Model Setup

The BERT model is set up here as a sequence classifier. It is configured with three output classes (num_labels=3), and these three classes are mapped to the condition sentiments in the cells that follow.

In [7]:
!pip install simpletransformers

ERROR: Could not find a version that satisfies the requirement simpletransformers (from versions: none)
ERROR: No matching distribution found for simpletransformers


In [14]:
from simpletransformers.classification import ClassificationModel

# Create a ClassificationModel (BERT, three output classes); use_cuda=True requires a GPU
model = ClassificationModel('bert', 'bert-base-cased', num_labels=3,
                            args={'reprocess_input_data': True, 'overwrite_output_dir': True},
                            use_cuda=True)

Some weights of the model checkpoint at bert-base-cased were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.LayerNorm.bias', 'cls.seq_relationship.weight', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.dense.bias', 'cls.seq_relationship.bias', 'cls.predictions.bias', 'cls.predictions.decoder.weight', 'cls.predictions.transform.dense.weight']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at b

The sentiment strings are converted here to the integer class labels the model expects: 0 for positive, 1 for negative, and 2 for neutral.

In [15]:
# Label encoding: 0 = positive, 1 = negative, 2 = neutral
def making_label(st):
    if st == 'positive':
        return 0
    elif st == 'neutral':
        return 2
    else:
        return 1  # 'negative' (any other value also falls through to 1)

train['label'] = train['sentiment'].apply(making_label)
eva['label'] = eva['sentiment'].apply(making_label)
print(train.shape)

(27619, 3)
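
`making_label` sends every value other than `'positive'` and `'neutral'` to class 1, so an icon that was never mapped to a sentiment would silently be treated as negative. A dictionary lookup makes such values visible instead. The following is a minimal sketch; `LABELS`, the toy series, and the unmapped value `'clear-night'` are illustrative assumptions and do not come from the notebook.

```python
import pandas as pd

# Same encoding as making_label: 0 = positive, 1 = negative, 2 = neutral
LABELS = {'positive': 0, 'negative': 1, 'neutral': 2}

# Toy sentiment column with one hypothetical unmapped value
sentiment = pd.Series(['positive', 'neutral', 'negative', 'clear-night'])

# Series.map returns NaN for keys missing from the dictionary
labels = sentiment.map(LABELS)

# Collect the distinct values that could not be encoded
unknown = sentiment[labels.isna()].unique().tolist()
print(unknown)
```

Rows listed in `unknown` can then be inspected or dropped before training rather than being counted as negative.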


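The dataset is prepared for training by creating a training DataFrame and an evaluation DataFrame, each with a text column and a label column. Newlines in the text are replaced with spaces, and the evaluation DataFrame uses the last 400 rows of the held-out split.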

In [16]:
train_df = pd.DataFrame({
    'text': train['statement'][:27619].replace(r'\n', ' ', regex=True),
    'label': train['label'][:27619]
})
print(train_df)
eval_df = pd.DataFrame({
    'text': eva['statement'][-400:].replace(r'\n', ' ', regex=True),
    'label': eva['label'][-400:]
})

                                                    text  label
14822                  Partly cloudy throughout the day.      2
3072                                    Foggy overnight.      2
23524                          Clear throughout the day.      0
2110                      Light rain throughout the day.      1
17889                          Clear throughout the day.      0
...                                                  ...    ...
10752        Humid and mostly cloudy throughout the day.      2
383                    Mostly cloudy throughout the day.      2
28601  Possible light snow (1–3 in.) in the morning a...      1
11672                  Mostly cloudy throughout the day.      2
5628                   Partly cloudy throughout the day.      0

[27619 rows x 2 columns]


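The model is then trained on the training DataFrame and evaluated on the evaluation DataFrame. The evaluation returns the result metrics, the raw model outputs, and the wrong predictions, which are used below to compute model metrics. In the run recorded here, the cell was interrupted during preprocessing, so the later cells show no output.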

In [18]:
model.train_model(train_df)
result, model_outputs, wrong_predictions = model.eval_model(eval_df)

  0%|          | 0/27619 [00:00<?, ?it/s]

Process ForkPoolWorker-3:
Traceback (most recent call last):
  File "/usr/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/usr/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "/usr/local/lib/python3.7/dist-packages/simpletransformers/classification/classification_utils.py", line 127, in preprocess_data_multiprocessing
    return_tensors="pt",
  File "/usr/local/lib/python3.7/dist-packages/transformers/tokenization_utils_base.py", line 2460, in __call__
    **kwargs,
  File "/usr/local/lib/python3.7/dist-packages/transformers/tokenization_utils_base.py", line 2651, in batch_encode_plus
    **kwargs,
  File "/usr/local/lib/python3.7/dist-packages/transformers/tokenization_utils_fast.py", line 427, in _batch_encode_plus
    is_pretokenized=is_split_into_words,
KeyboardInte

KeyboardInterrupt: ignored

# Model Metrics and Model Performance

scikit-learn is used to evaluate how the model performs after being trained on this dataset. An accuracy above 90% is taken here as enough for the model to be considered useful and accurate.

In [None]:
print(result)
print(model_outputs)

lst = []
for arr in model_outputs:
    lst.append(np.argmax(arr))

true = eval_df['label'].tolist()
predicted = lst
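
`model_outputs` holds one row of raw class scores per evaluation example, and the predicted class is the index of the largest score in that row. The following sketch uses made-up scores to show the same conversion in a single vectorized call, followed by an accuracy computation; `toy_outputs` and `toy_true` are illustrative and are not results from the model.

```python
import numpy as np

# Made-up raw scores for three examples; columns are classes 0, 1, 2
toy_outputs = np.array([
    [ 2.1, -0.3, 0.4],
    [-1.0,  3.2, 0.1],
    [ 0.2,  0.1, 1.5],
])
toy_true = [0, 1, 1]

# Index of the highest score in each row
toy_pred = np.argmax(toy_outputs, axis=1).tolist()

# Fraction of rows where the prediction matches the true label
accuracy = float(np.mean(np.array(toy_pred) == np.array(toy_true)))
print(toy_pred, accuracy)
```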

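Create a visualization of the evaluation results as a confusion matrix, showing how many examples of each class were predicted as each class. The row totals give the number of evaluation examples per class, so their proportions should roughly correspond to the counts reported by the check validation function earlier.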

In [None]:
from sklearn import metrics

# Label order follows making_label: 0 = positive, 1 = negative, 2 = neutral
class_names = ['positive', 'negative', 'neutral']

mat = metrics.confusion_matrix(true, predicted, labels=[0, 1, 2])
print(mat)

df_cm = pd.DataFrame(mat, index=class_names, columns=class_names)

sns.heatmap(df_cm, annot=True, fmt='d')
plt.show()

print(metrics.classification_report(true, predicted, labels=[0, 1, 2], target_names=class_names))
print('Model Accuracy: ', 100*metrics.accuracy_score(true, predicted), '%')

Here the class distribution of the labeled dataset is visualized with seaborn and matplotlib.

In [None]:
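# Bar chart of the number of rows per sentiment class
sns.countplot(x=df['sentiment'])
plt.show()

# Row counts per class (df has no 'label' column; labels were only added to train and eva)
df['sentiment'].value_counts()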

# Live Demo and Model Testing
This function lets the model be evaluated on manually entered text. It serves as a simple interface for testing the model with real-time, human-written queries.

In [None]:
def get_result(statement):
    # predict returns (predicted labels, raw model outputs)
    predictions, raw_outputs = model.predict([statement])
    pos = int(np.argmax(raw_outputs[0]))  # index of the highest-scoring class
    sentiment_dict = {0:'positive',1:'negative',2:'neutral'}
    return sentiment_dict[pos]

sentiment = get_result(input("Input a phrase for Validation: "))
print("The input data was classified as:", sentiment)