<a href="https://colab.research.google.com/github/mukamal/causal-language-detection-SVM-MNB/blob/main/causal_language_detection_SVM_MNB.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Compare MNB and SVM for Causal Language Detection


In [None]:
# Store the data in your Google Drive, then mount the drive in Colab.
# You will be asked for an authorization code to complete the process.

from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


Yu et al. (2019) annotated conclusion sentences from biomedical and health research papers to identify their claim strength. Each sentence was annotated as belonging to one of four categories:

Label = 0 : No relationship (1356 cases)
Label = 1 : Direct causal (494 cases)
Label = 2 : Conditional causal (213 cases)
Label = 3 : Correlational (998 cases)


# Build LinearSVC model with sklearn

## Step 1: Read in data

In [None]:
# read in the data. The dataset includes two columns: label, sentence
import pandas as pd
train=pd.read_csv("/content/drive/My Drive/data/pubmed_causal_language_use.csv")
#train.head()
y=train['label'].values
X=train['sentence'].values
X.shape

(3061,)

## Step 2: Split train/test data for hold-out test

In [None]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

print(X_train.shape, y_train.shape, X_test.shape, y_test.shape)
print(X_train[0])
print(y_train[0])
print(X_test[0])
print(y_test[0])

(2448,) (2448,) (613,) (613,)
The high rate of text message usage makes it feasible to recruit YAMs for a prospective study in which personalized text messages are used to promote healthy behaviours.
0
The lack of symptoms and the preoperative EGD findings were not suggestive of this diagnosis in any case.
0
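The split sizes printed above follow from `test_size=0.2`: scikit-learn rounds the test share up and gives the remainder to training. The arithmetic can be sketched in plain Python as below. The helper `shuffle_split` is hypothetical and will not select the same rows as `train_test_split`; it only illustrates the sizes and the role of a fixed seed.

```python
import math
import random

def shuffle_split(n_samples, test_size=0.2, seed=0):
    """Hypothetical helper: return (train_idx, test_idx) after one seeded shuffle."""
    n_test = math.ceil(test_size * n_samples)   # the test share is rounded up
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)        # same seed, same split
    return indices[n_test:], indices[:n_test]

train_idx, test_idx = shuffle_split(3061, test_size=0.2, seed=0)
print(len(train_idx), len(test_idx))
```

Reusing the same seed gives the same split on every run, which is what `random_state=0` does in the cell above.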


## Step 2.1 Data Checking

In [None]:
import numpy as np
unique, counts = np.unique(y_train, return_counts=True)
print(np.asarray((unique, counts)))


[[   0    1    2    3]
 [1055  409  169  815]]


The sample output shows that the training set is skewed: 1055/2448 = 43% of the examples are "No relationship". All other categories are smaller, and "Conditional causal" is the smallest with 169/2448 = 7%.
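Because the classes are skewed, a useful reference point is the accuracy of always predicting the largest class. A minimal sketch using the training counts printed above (the variable names are illustrative):

```python
# Class counts in the training split, copied from the output above
train_counts = {0: 1055, 1: 409, 2: 169, 3: 815}

total = sum(train_counts.values())
majority_label = max(train_counts, key=train_counts.get)
baseline_accuracy = train_counts[majority_label] / total

print(majority_label, round(baseline_accuracy, 3))
```

Any classifier trained below should beat this baseline by a clear margin to be considered useful.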














## Step 3: Vectorization

In [None]:
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import TfidfVectorizer

#  unigram and bigram term frequency vectorizer, set minimum document frequency to 5
gram12_count_vectorizer = CountVectorizer(encoding='latin-1', ngram_range=(1,2), min_df=5)


### Step 3.1: Vectorize the training data

In [None]:
# this step is the same as the NB script

# The vectorizer can do "fit" and "transform"
# fit is a process to collect unique tokens into the vocabulary
# transform is a process to convert each document to vector based on the vocabulary
# These two processes can be done together using fit_transform(), or used individually: fit() or transform()

# fit vocabulary in training documents and transform the training documents into vectors
X_train_vec = gram12_count_vectorizer.fit_transform(X_train)

# check the content of a document vector
print(X_train_vec.shape)
print(X_train_vec[0].toarray())

# check the size of the constructed vocabulary
print(len(gram12_count_vectorizer.vocabulary_))

# print out the first 10 items in the vocabulary
print(list(gram12_count_vectorizer.vocabulary_.items())[:10])

# check word index in vocabulary
print(gram12_count_vectorizer.vocabulary_.get('prospective'))

(2448, 2371)
[[0 0 0 ... 0 0 0]]
2371
[('the', 2021), ('high', 879), ('rate', 1663), ('of', 1375), ('it', 1107), ('feasible', 727), ('to', 2142), ('for', 757), ('prospective', 1634), ('study', 1921)]
1634


### Step 3.2: Vectorize the test data

In [None]:
# this step is the same as the NB script

# use the vocabulary constructed from the training data to vectorize the test data. 
# Therefore, use "transform" only, not "fit_transform", 
# otherwise "fit" would generate a new vocabulary from the test data

X_test_vec = gram12_count_vectorizer.transform(X_test)

# print out #examples and #features in the test set
print(X_test_vec.shape)

(613, 2371)
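The difference between `fit` and `transform` can be illustrated without scikit-learn. The sketch below is a simplified stand-in, not `CountVectorizer` itself: the helpers `ngrams`, `fit_vocabulary`, and `transform` are hypothetical, and tokenization is a plain whitespace split rather than the library's default token pattern.

```python
from collections import Counter

def ngrams(text, n_max=2):
    """Unigrams and bigrams from a whitespace-tokenized, lowercased sentence."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]

def fit_vocabulary(docs, min_df=1):
    """Collect n-grams that occur in at least min_df documents."""
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(ngrams(doc)))
    kept = sorted(g for g, c in doc_freq.items() if c >= min_df)
    return {gram: index for index, gram in enumerate(kept)}

def transform(docs, vocab):
    """Count vocabulary n-grams per document; unseen n-grams are ignored."""
    rows = []
    for doc in docs:
        row = [0] * len(vocab)
        for gram in ngrams(doc):
            if gram in vocab:
                row[vocab[gram]] += 1
        rows.append(row)
    return rows

train_docs = ["smoking may cause cancer", "smoking is associated with cancer"]
test_docs = ["exercise may cause fatigue"]

vocab = fit_vocabulary(train_docs, min_df=1)   # vocabulary comes from training data only
train_vec = transform(train_docs, vocab)
test_vec = transform(test_docs, vocab)         # "exercise" and "fatigue" are dropped
print(len(vocab), sum(test_vec[0]))
```

The test vector has the same number of columns as the training vectors, which is why the test matrix above has 2371 features.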


## Step 4: Train a LinearSVC classifier

In [None]:
# import the LinearSVC module
from sklearn.svm import LinearSVC

# initialize the LinearSVC model
svm_clf = LinearSVC(C=.1)

# use the training data to train the model
svm_clf.fit(X_train_vec,y_train)

LinearSVC(C=0.1, class_weight=None, dual=True, fit_intercept=True,
          intercept_scaling=1, loss='squared_hinge', max_iter=1000,
          multi_class='ovr', penalty='l2', random_state=None, tol=0.0001,
          verbose=0)

### Step 4.1 Interpret a trained LinearSVC model

In [None]:
# note: in scikit-learn 1.0+ use get_feature_names_out(); get_feature_names() was removed in 1.2
feature_names = gram12_count_vectorizer.get_feature_names()
class_names = ["No relationship", "Direct causal", "Conditional causal", "Correlational"]

## LinearSVC trains one binary (one-vs-rest) classifier per category, so coef_[k] holds the feature weights for category k
## For each category, sort all (weight, feature) pairs in increasing order
## and print the 10 features that are the best indicators of that category (they are at the bottom of the ranked list)
for k, class_name in enumerate(class_names):
    feature_ranks = sorted(zip(svm_clf.coef_[k], feature_names))
    top_10 = feature_ranks[-10:]
    print(class_name)
    for weight, feature in top_10:
        print(feature)
    print()








No relationship
because
of children
appropriate
safety
cases of
assessment
can be
needed
studies
should

Direct causal 
oral
liver
effect
resulted
resulted in
effects
improves
benefit
effective
improved

Conditional causal 
appeared to
appears to
influence
improve
is likely
would
likely
could
might
may

Correlational
increased
predictors of
associated with
predictor
vary
associated
predicted
association
related to
predict



In [None]:
## For category "3" (Correlational), get all features and their weights and sort them in increasing order
feature_ranks = sorted(zip(svm_clf.coef_[3], gram12_count_vectorizer.get_feature_names()))
## get the 10 features that are the best indicators of the Correlational category (they are at the bottom of the ranked list)



correlational_10 = feature_ranks[-10:]
print("correlational words")
for i in range(0, len(correlational_10)):
    print(correlational_10[i])
print()

correlational words
(0.5216321245015781, 'increased')
(0.5364959013967424, 'predictors of')
(0.5440402895143588, 'associated with')
(0.5712281748229758, 'predictor')
(0.571739201138082, 'vary')
(0.5910894180990367, 'associated')
(0.608931479811782, 'predicted')
(0.6337471396251522, 'association')
(0.69503568338529, 'related to')
(0.9622244100524631, 'predict')
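The ranking idiom used above relies on tuples sorting by their first element, so `sorted(zip(weights, names))` orders features from the most negative to the most positive weight. A small sketch: three weights are rounded from the output above, while the two negative entries and their names are hypothetical, added only to show the ordering.

```python
# Three weights rounded from the output above; the negative ones are hypothetical
weights = [0.6950, -0.4100, 0.9622, 0.6337, -0.1200]
names = ["related to", "hypothetical_a", "predict", "association", "hypothetical_b"]

ranked = sorted(zip(weights, names))               # ascending by weight
top_3 = [name for _, name in ranked[-3:]][::-1]    # strongest indicators first
bottom_2 = [name for _, name in ranked[:2]]        # strongest evidence against the class

print(top_3)
print(bottom_2)
```

The features at the top of the ranked list (most negative weights) are evidence against the category, which is worth inspecting as well.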



## Step 5: Test the LinearSVC classifier

In [None]:
# test the classifier on the test data set, print accuracy score

svm_clf.score(X_test_vec,y_test)

0.7699836867862969

In [None]:
# # optimize SVM


# gram12_count_vectorizer = CountVectorizer(encoding='latin-1', ngram_range=(1,2), min_df=5)

# unigram_bool_vectorizer = CountVectorizer(encoding='latin-1', binary=True, min_df=4)
# unigram_count_vectorizer = CountVectorizer(encoding='latin-1', binary=False, min_df=6)


# X_train_vec = gram12_count_vectorizer.fit_transform(X_train)
# X_test_vec = gram12_count_vectorizer.transform(X_test)

# svm_clf = LinearSVC(C=.1)

# # use the training data to train the model
# svm_clf.fit(X_train_vec,y_train)

# svm_clf.score(X_test_vec,y_test)

0.7699836867862969

In [None]:
# print confusion matrix and classification report

from sklearn.metrics import confusion_matrix
y_pred = svm_clf.predict(X_test_vec)
cm=confusion_matrix(y_test, y_pred, labels=[0,1,2,3])
print(cm)
print()

from sklearn.metrics import classification_report
target_names = ['0','1','2','3']
print(classification_report(y_test, y_pred, target_names=target_names))

[[248  22   9  22]
 [ 24  56   1   4]
 [  9   4  25   6]
 [ 27   7   6 143]]

              precision    recall  f1-score   support

           0       0.81      0.82      0.81       301
           1       0.63      0.66      0.64        85
           2       0.61      0.57      0.59        44
           3       0.82      0.78      0.80       183

    accuracy                           0.77       613
   macro avg       0.72      0.71      0.71       613
weighted avg       0.77      0.77      0.77       613
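The numbers in the classification report can be recomputed from the confusion matrix: precision divides the diagonal entry by its column sum, and recall divides it by its row sum. A minimal sketch using the matrix printed above:

```python
# Confusion matrix copied from the output above (row: ground truth; col: prediction)
cm = [
    [248, 22, 9, 22],
    [24, 56, 1, 4],
    [9, 4, 25, 6],
    [27, 7, 6, 143],
]
n_classes = len(cm)

report = {}
for k in range(n_classes):
    true_pos = cm[k][k]
    predicted_k = sum(cm[i][k] for i in range(n_classes))   # column sum
    actual_k = sum(cm[k])                                    # row sum
    precision = true_pos / predicted_k
    recall = true_pos / actual_k
    f1 = 2 * precision * recall / (precision + recall)
    report[k] = (round(precision, 2), round(recall, 2), round(f1, 2))

accuracy = sum(cm[k][k] for k in range(n_classes)) / sum(map(sum, cm))
print(report)
print(accuracy)
```

Reading the matrix this way also shows where the errors go: most misclassified "Direct causal" sentences (24 of 29) are predicted as "No relationship".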



### Step 5.1 Interpret the prediction result

In [None]:
## get the confidence scores for all test examples from each of the four binary (one-vs-rest) classifiers
svm_confidence_scores = svm_clf.decision_function(X_test_vec)
## get the confidence scores for the first test example
print(svm_confidence_scores[0])

## the output has one score per category, in the order of svm_clf.classes_
## the predicted category is the one with the highest score,
## which is category 0 in the output below

## Confirm by printing out the actual prediction
print(y_pred[0])

[ 0.83173398 -1.21758405 -1.87764217 -0.36970694]
0
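The step from confidence scores to a prediction is an argmax over the per-category scores. A minimal sketch using the scores printed above for the first test example (the `margin` value is an illustrative extra, not something the notebook computes):

```python
# Decision scores for the first test example, copied from the output above
scores = [0.83173398, -1.21758405, -1.87764217, -0.36970694]
classes = [0, 1, 2, 3]

best = max(range(len(scores)), key=scores.__getitem__)
predicted = classes[best]

# Gap between the best and the second-best score: a rough sign of how clear the decision is
margin = scores[best] - sorted(scores)[-2]
print(predicted, round(margin, 4))
```

Here only category 0 has a positive score, so the prediction is unambiguous.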


### Step 5.2 Error Analysis

In [None]:
err_cnt = 0
for i in range(0, len(y_test)):
    if(y_test[i]==3 and y_pred[i]==0):
        print(X_test[i])
        err_cnt = err_cnt+1
print("errors:", err_cnt)

In ACS patients, without previous history of DM, MS is highly prevalent.
Incident gallstones and the metabolic syndrome share common risk factors.
Distractions are prevalent in ORs and in this study were linked to deterioration in intraoperative patient safety checks.
Extrapulmonary manifestations may be useful clues for diagnosis.
Oral impacts were more frequently reported in T2D cases than controls.
The main RFs identified for lead exposure were age≤ 3years old and pica behavior.
Chemerin showed positive correlations with potent health threatening components of lipid profile including triglyceride and cholesterol levels in adolescents.
The meta-analysis demonstrated that the G allele of the SUMO4 M55V polymorphism could be a susceptible risk locus to T2DM, mainly in the Chinese population, while the association in other ethnic population needs to be further validated in studies with relatively large samples.
Data from NIV can identify a change in breathing patterns that predicts seve

## Step 6: Cross Validation

In [None]:
from sklearn.pipeline import Pipeline
from sklearn.model_selection import cross_val_score

#gram12_count_vectorizer = CountVectorizer(encoding='latin-1', ngram_range=(1,2), min_df=5)

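# note: this pipeline uses a default unigram count vectorizer, not the unigram+bigram vectorizer from Step 3
svm_clf_pipe = Pipeline([('vect', CountVectorizer(encoding='latin-1', binary=False)),('svm', LinearSVC(C=.1))])
scores = cross_val_score(svm_clf_pipe, X, y, cv=5)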
scores.mean()

0.747149985605988
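Cross validation repeats the hold-out test several times, each time holding out a different fold and refitting the whole pipeline (vectorizer and classifier) on the rest. The sketch below shows only the index bookkeeping for plain contiguous folds; the helper `k_fold_indices` is hypothetical. For a classifier with an integer `cv`, `cross_val_score` uses stratified folds, so its folds also preserve the class proportions, which this sketch does not attempt.

```python
def k_fold_indices(n_samples, k):
    """Hypothetical helper: yield (train_idx, test_idx) for k contiguous folds."""
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)   # first folds absorb the remainder
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size

folds = list(k_fold_indices(3061, 5))
fold_sizes = [len(test_idx) for _, test_idx in folds]
print(fold_sizes)
```

Every example is used for testing exactly once, so the mean score is less dependent on one particular split than the single hold-out accuracy.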

In [None]:
# print out specific type of error for further analysis

# print out the Correlational examples that are mistakenly predicted as "No relationship"
# according to the SVM confusion matrix (row 3, column 0), there should be 27 such examples
# note if you use a different vectorizer option, your result might be different

err_cnt = 0
for i in range(0, len(y_test)):
    if(y_test[i]==3 and y_pred[i]==0):
        print(X_test[i])
        err_cnt = err_cnt+1

print("errors:", err_cnt)

In ACS patients, without previous history of DM, MS is highly prevalent.
Incident gallstones and the metabolic syndrome share common risk factors.
Distractions are prevalent in ORs and in this study were linked to deterioration in intraoperative patient safety checks.


# Build MNB with sklearn

## Step 3: Vectorization

In [None]:
unigram_bool_vectorizer = CountVectorizer(encoding='latin-1', binary=True, min_df=3)

X_train_vec = unigram_bool_vectorizer.fit_transform(X_train)
X_test_vec = unigram_bool_vectorizer.transform(X_test)

## Step 4: Train a MNB classifier

In [None]:
# import the MNB module
from sklearn.naive_bayes import MultinomialNB
from sklearn.naive_bayes import BernoulliNB

# initialize the MNB model
nb_clf= MultinomialNB()

# use the training data to train the MNB model
# feature_log_prob_ stores the conditional probs for all categories
# if the labels are strings, the index is in alphabetic order
# e.g. 'f' comes before 't' in alphabet, so 'f' is in [0] dimension and 't' in [1]

nb_clf.fit(X_train_vec,y_train)
print(nb_clf.classes_)
print(nb_clf.feature_log_prob_.shape)

[0 1 2 3]
(4, 2108)
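Each row of `feature_log_prob_` holds the smoothed log conditional probabilities of all features for one category. With additive (Laplace) smoothing, a feature's probability is its count plus `alpha`, divided by the category total plus `alpha` times the vocabulary size. A minimal sketch with hypothetical counts (the helper `smoothed_log_prob` is illustrative, not part of scikit-learn):

```python
import math

def smoothed_log_prob(feature_counts, alpha=1.0):
    """Log conditional probabilities for one category with additive smoothing."""
    total = sum(feature_counts)
    n_features = len(feature_counts)
    return [math.log((count + alpha) / (total + alpha * n_features))
            for count in feature_counts]

# Hypothetical counts of three features within one category
log_probs = smoothed_log_prob([3, 0, 1], alpha=1.0)
probs = [math.exp(lp) for lp in log_probs]
print([round(p, 3) for p in probs])
```

Smoothing keeps a feature that never occurred in a category (the second one here) from getting zero probability, which would otherwise rule out the category for any sentence containing it.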


### Step 4.1 Interpret a trained MNB model

In [None]:

print(unigram_bool_vectorizer.vocabulary_.get('correlate'))
for i in range(0,4):
  print(nb_clf.feature_log_prob_[i][unigram_bool_vectorizer.vocabulary_.get('correlate')])


440
-9.062246313731455
-9.076580381796658
-8.486940148245216
-8.023486859457636
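Log probabilities are easier to compare after exponentiating their difference, which gives a likelihood ratio. A minimal sketch using the four values printed above for 'correlate':

```python
import math

# Log conditional probabilities of 'correlate' for categories 0-3, copied from the output above
log_probs = [-9.062246313731455, -9.076580381796658, -8.486940148245216, -8.023486859457636]

most_likely_category = max(range(len(log_probs)), key=log_probs.__getitem__)

# How much more likely 'correlate' is under Correlational (3) than under No relationship (0)
ratio_corr_vs_no_rel = math.exp(log_probs[3] - log_probs[0])
print(most_likely_category, round(ratio_corr_vs_no_rel, 2))
```

So under this model 'correlate' is roughly 2.8 times more likely in a Correlational sentence than in a No relationship sentence.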


In [None]:
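# rank features by the log ratio: log P(w | Correlational) - log P(w | No relationship)
features = unigram_bool_vectorizer.get_feature_names()
no_rel_cond_prob = nb_clf.feature_log_prob_[0]
corr_cond_prob = nb_clf.feature_log_prob_[3]

# the rows of feature_log_prob_ are NumPy arrays, so the subtraction is element-wise
log_ratios = corr_cond_prob - no_rel_cond_prob

log_ratio_ranks = sorted(zip(log_ratios, features))
# the 10 features most indicative of "No relationship" (most negative log ratios)
print(log_ratio_ranks[:10])
# the 10 features most indicative of "Correlational" (most positive log ratios)
print(log_ratio_ranks[-10:])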

[(-4.208264617886666, 'needed'), (-3.8665153241646095, 'clinicaltrials'), (-3.8478231911524574, 'gov'), (-3.828774996181762, 'registration'), (-3.3587713669360264, 'trial'), (-3.3115184820854813, 'required'), (-3.0555851079482803, 'research'), (-2.968573730958651, 'registered'), (-2.9220537153237576, 'determine'), (-2.8732635511543254, 'therefore')]
[(2.5203639951980357, 'received'), (2.5203639951980357, 'strong'), (2.607375372187665, 'inversely'), (2.7064662748318966, 'lower'), (2.737428500435864, 'increased'), (2.761526052014924, 'controls'), (2.761526052014924, 'difference'), (2.76989430168544, 'associated'), (3.1182009959536554, 'correlated'), (3.1182009959536554, 'independently')]


## Step 5: Test the MNB classifier

In [None]:
# test the classifier on the test data set, print accuracy score

nb_clf.score(X_test_vec,y_test)

0.7830342577487766

In [None]:
# # optimize NB
# unigram_bool_vectorizer = CountVectorizer(encoding='latin-1', binary=True, min_df=3)

# X_train_vec = unigram_bool_vectorizer.fit_transform(X_train)
# X_test_vec = unigram_bool_vectorizer.transform(X_test)

# nb_clf= MultinomialNB()

# nb_clf.fit(X_train_vec,y_train)
# nb_clf.score(X_test_vec,y_test)


In [None]:
# print confusion matrix (row: ground truth; col: prediction)

from sklearn.metrics import confusion_matrix
y_pred = nb_clf.fit(X_train_vec, y_train).predict(X_test_vec)
cm=confusion_matrix(y_test, y_pred, labels=[0,1,2,3])
print(cm)

[[245  21   3  32]
 [ 22  53   0  10]
 [ 16   7  15   6]
 [ 11   5   0 167]]


In [None]:
# print classification report

from sklearn.metrics import precision_score
from sklearn.metrics import recall_score
print(precision_score(y_test, y_pred, average=None))
print(recall_score(y_test, y_pred, average=None))

from sklearn.metrics import classification_report
target_names = ['0','1','2','3']
print(classification_report(y_test, y_pred, target_names=target_names))

[0.83333333 0.61627907 0.83333333 0.77674419]
[0.81395349 0.62352941 0.34090909 0.91256831]
              precision    recall  f1-score   support

           0       0.83      0.81      0.82       301
           1       0.62      0.62      0.62        85
           2       0.83      0.34      0.48        44
           3       0.78      0.91      0.84       183

    accuracy                           0.78       613
   macro avg       0.76      0.67      0.69       613
weighted avg       0.79      0.78      0.78       613
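The gap between the macro and weighted averages in the report comes from how each class is counted: the macro average treats all four classes equally, while the weighted average scales each class by its support. A minimal sketch for recall, using the MNB confusion matrix printed above:

```python
# MNB confusion matrix copied from the output above (row: ground truth; col: prediction)
cm = [
    [245, 21, 3, 32],
    [22, 53, 0, 10],
    [16, 7, 15, 6],
    [11, 5, 0, 167],
]

support = [sum(row) for row in cm]
recall = [cm[k][k] / support[k] for k in range(len(cm))]

macro_recall = sum(recall) / len(recall)
weighted_recall = sum(r * s for r, s in zip(recall, support)) / sum(support)
print(round(macro_recall, 2), round(weighted_recall, 2))
```

The low recall on "Conditional causal" (15 of 44) pulls the macro average down to 0.67, while the weighted recall equals the overall accuracy of 0.78 because that class has few examples.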



### Step 5.1 Interpret the prediction result


In [None]:
## find the calculated posterior probability
posterior_probs = nb_clf.predict_proba(X_test_vec)

## find the posterior probabilities for the first test example
print(posterior_probs[0])

# find the category prediction for the first test example
y_pred = nb_clf.predict(X_test_vec)
print(y_pred[0])

# check the actual label for the first test example
print(y_test[0])

[6.98271028e-01 1.88539471e-02 2.52615299e-05 2.82849763e-01]
0
0
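The posterior probabilities come from normalizing per-category log scores (log prior plus the summed feature log probabilities) so that they sum to 1. A minimal sketch of that normalization using the log-sum-exp trick; the input scores are hypothetical and the helper `normalize_log_scores` is illustrative, not the scikit-learn implementation.

```python
import math

def normalize_log_scores(joint_log_likelihood):
    """Turn per-category log scores into probabilities that sum to 1."""
    peak = max(joint_log_likelihood)   # subtract the peak to avoid underflow in exp
    log_total = peak + math.log(sum(math.exp(v - peak) for v in joint_log_likelihood))
    return [math.exp(v - log_total) for v in joint_log_likelihood]

# Hypothetical joint log-likelihoods for the four categories
posterior = normalize_log_scores([-120.0, -123.6, -130.2, -120.9])
predicted = max(range(len(posterior)), key=posterior.__getitem__)
print([round(p, 4) for p in posterior], predicted)
```

Working in log space matters because the raw likelihoods of a whole sentence are far too small to represent directly as floating-point numbers.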


### Step 5.2 Error Analysis

In [None]:
# print out specific type of error for further analysis

# print out the Correlational examples that are mistakenly predicted as "No relationship"
# according to the MNB confusion matrix (row 3, column 0), there should be 11 such examples
# note if you use a different vectorizer option, your result might be different

err_cnt = 0
for i in range(0, len(y_test)):
    if(y_test[i]==3 and y_pred[i]==0):
        print(X_test[i])
        err_cnt = err_cnt+1
print("errors:", err_cnt)

Distractions are prevalent in ORs and in this study were linked to deterioration in intraoperative patient safety checks.
Extrapulmonary manifestations may be useful clues for diagnosis.
Data from NIV can identify a change in breathing patterns that predicts severe AECOPD.
Older adults with DM appear to perform poorly on an ambulatory measure of multitasking.
In this large multinational study, treatment with SGLT-2i versus other glucose-lowering drugs was associated with a lower risk of HHF and death, suggesting that the benefits seen with empagliflozin in a randomized trial may be a class effect applicable to a broad population of patients with type 2 diabetes  mellitus in real-world practice.
PCP beliefs about mammography effectiveness and screening recommendations are only modestly associated with use, suggesting other likely influences on patient participation in mammography.
We confi the low frequency of PI in the cervical cancer IB1 subgroup and its association with the depth of 

## Step 6: Cross Validation


In [None]:
from sklearn.pipeline import Pipeline
from sklearn.model_selection import cross_val_score

#unigram_bool_vectorizer = CountVectorizer(encoding='latin-1', binary=True, min_df=3)


nb_clf_pipe = Pipeline([('vect', CountVectorizer(encoding='latin-1', binary=True, min_df=5)),('nb', MultinomialNB())])
scores = cross_val_score(nb_clf_pipe, X, y, cv=5)
scores.mean()

0.7236237725106356

# Build BERT with sklearn

In [None]:
# install BERT sklearn wrapper written by charles9n
# check out the github page for fine tuning options and usage
# https://github.com/charles9n/bert-sklearn

!git clone -b master https://github.com/charles9n/bert-sklearn
!cd bert-sklearn; pip install .

fatal: destination path 'bert-sklearn' already exists and is not an empty directory.
Processing /content/bert-sklearn
Building wheels for collected packages: bert-sklearn
  Building wheel for bert-sklearn (setup.py) ... done
  Created wheel for bert-sklearn: filename=bert_sklearn-0.3.1-cp37-none-any.whl size=54235 sha256=fafaaf28cf24cc915ed7a2066cef69dba54069ae1d9c62240e8892533658160c
  Stored in directory: /root/.cache/pip/wheels/61/95/c6/5790aae8fb377f5ff356dbe58205aab28858595d6bff8197d0
Successfully built bert-sklearn
Installing collected packages: bert-sklearn
  Found existing installation: bert-sklearn 0.3.1
    Uninstalling bert-sklearn-0.3.1:
      Successfully uninstalled bert-sklearn-0.3.1
Successfully installed bert-sklearn-0.3.1


In [None]:
# fine-tune a BERT base uncased model
# the wrapper tokenizes the raw sentences and passes them to BERT itself, so no separate vectorization step is needed (unlike LinearSVC and MNB)
# first the pre-trained BERT model is loaded
# then training starts: 90% of the examples are used for training and the other 10% for validation (parameter tuning)
# the default setting is 3 epochs; each epoch is one pass over the training examples
from bert_sklearn import BertClassifier
model = BertClassifier()         # text/text pair classification
print(model)
model.fit(X_train, y_train)

Building sklearn text classifier...
BertClassifier(bert_config_json=None, bert_model='bert-base-uncased',
               bert_vocab=None, do_lower_case=None, epochs=3, eval_batch_size=8,
               fp16=False, from_tf=False, gradient_accumulation_steps=1,
               ignore_label=None, label_list=None, learning_rate=2e-05,
               local_rank=-1, logfile='bert_sklearn.log', loss_scale=0,
               max_seq_length=128, num_mlp_hiddens=500, num_mlp_layers=0,
               random_state=42, restore_file=None, train_batch_size=32,
               use_cuda=True, validation_fraction=0.1, warmup_proportion=0.1)
Loading bert-base-uncased model...
Defaulting to linear classifier/regressor
Loading Pytorch checkpoint

train data size: 2204, validation data size: 244



Epoch 1, Train loss: 0.8406, Val loss: 0.5015, Val accy: 81.56%
Epoch 2, Train loss: 0.2487, Val loss: 0.4549, Val accy: 85.66%
Epoch 3, Train loss: 0.0759, Val loss: 0.4093, Val accy: 87.30%

BertClassifier(bert_config_json=None, bert_model='bert-base-uncased',
               bert_vocab=None, do_lower_case=True, epochs=3, eval_batch_size=8,
               fp16=False, from_tf=False, gradient_accumulation_steps=1,
               ignore_label=None, label_list=array([0, 1, 2, 3]),
               learning_rate=2e-05, local_rank=-1, logfile='bert_sklearn.log',
               loss_scale=0, max_seq_length=128, num_mlp_hiddens=500,
               num_mlp_layers=0, random_state=42, restore_file=None,
               train_batch_size=32, use_cuda=True, validation_fraction=0.1,
               warmup_proportion=0.1)
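The logged data sizes follow from the configuration printed above. The sketch below redoes that arithmetic; it assumes the wrapper rounds the validation share down, which matches the logged sizes (2204 and 244), and the step counts are derived from the printed batch sizes rather than read from the library.

```python
import math

n_examples = 2448            # size of X_train
validation_fraction = 0.1    # from the printed configuration
train_batch_size = 32
eval_batch_size = 8

n_val = int(n_examples * validation_fraction)   # assumption: the share is rounded down
n_fit = n_examples - n_val

train_steps_per_epoch = math.ceil(n_fit / train_batch_size)
val_steps = math.ceil(n_val / eval_batch_size)
print(n_fit, n_val, train_steps_per_epoch, val_steps)
```

With three epochs, the model therefore sees about 3 x 69 = 207 mini-batches during fine-tuning.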

In [None]:
model.save('bert-sentiment.model')

In [None]:
y_pred = model.predict(X_test)





In [None]:
# BERT error analysis
err_cnt = 0
for i in range(0, len(y_test)):
  if (y_test[i]==3 and y_pred[i]==0):
    print(X_test[i])
    err_cnt = err_cnt+1
print("errors:", err_cnt)

In ACS patients, without previous history of DM, MS is highly prevalent.
Extrapulmonary manifestations may be useful clues for diagnosis.
Older adults with DM appear to perform poorly on an ambulatory measure of multitasking.
"There were no statistically significant differences in measured variables found between the two study groups."
Individual symptom resolution rates were highly variable.
The majority of cases of hemivertebra had coexisting anomalies, and in these cases the rate of perinatal loss was high.
In this pilot study, we found that the combination of SIRS criteria and PCT levels is useful for the early detection of sepsis in ED patients with suspected infection.
Epidemiological trends are more or less common to those of developing countries with a predominance of invasive ductal carcinoma.
There are significant age differences in adherence.
errors: 9


In [None]:
from sklearn.model_selection import cross_val_score
scores = cross_val_score(model,X_train,y_train,cv=3)
print(sum(scores)/len(scores))

Building sklearn text classifier...
Loading bert-base-uncased model...
Defaulting to linear classifier/regressor
Loading Pytorch checkpoint

train data size: 1469, validation data size: 163

Epoch 1, Train loss: 1.0495, Val loss: 0.6107, Val accy: 75.46%
Epoch 2, Train loss: 0.4509, Val loss: 0.3564, Val accy: 87.73%
Epoch 3, Train loss: 0.1406, Val loss: 0.3487, Val accy: 87.73%

Loss: 0.3659, Accuracy: 87.75%
Building sklearn text classifier...
Loading bert-base-uncased model...
Defaulting to linear classifier/regressor
Loading Pytorch checkpoint

train data size: 1469, validation data size: 163

Epoch 1, Train loss: 0.9599, Val loss: 0.5842, Val accy: 81.60%
Epoch 2, Train loss: 0.3951, Val loss: 0.4432, Val accy: 83.44%
Epoch 3, Train loss: 0.1304, Val loss: 0.4128, Val accy: 87.73%

Loss: 0.3737, Accuracy: 86.89%
Building sklearn text classifier...
Loading bert-base-uncased model...
Defaulting to linear classifier/regressor
Loading Pytorch checkpoint

train data size: 1469, validation data size: 163

Epoch 1, Train loss: 0.9711, Val loss: 0.5911, Val accy: 75.46%
Epoch 2, Train loss: 0.3938, Val loss: 0.4027, Val accy: 87.73%
Epoch 3, Train loss: 0.1131, Val loss: 0.3684, Val accy: 87.12%

Loss: 0.3909, Accuracy: 86.64%
87.09150326797385
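The printed score is the mean of the three fold accuracies, which this wrapper reports as percentages. A minimal sketch that recomputes it from the per-fold values in the log; because the logged values are rounded to two decimals, the result differs slightly from the exact mean above.

```python
# Per-fold test accuracies (in percent) as logged above, rounded to two decimals
fold_accuracy = [87.75, 86.89, 86.64]

mean_accuracy = sum(fold_accuracy) / len(fold_accuracy)
spread = max(fold_accuracy) - min(fold_accuracy)
print(round(mean_accuracy, 2), round(spread, 2))
```

The folds differ by about one percentage point, which gives a rough sense of how much a single accuracy figure can vary with the split. Note that this cross validation uses only `X_train` with 3 folds, while the SVM and MNB pipelines were cross-validated on all of `X` with 5 folds, so the three means are not measured under identical conditions.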

