<a href="https://colab.research.google.com/github/howard-haowen/NLP-demos/blob/main/NSYSU/W07-text-classification-with-spacy.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This notebook was written by [Haowen Jiang](https://howard-haowen.rohan.tw/) for the 2022 [NLP Workshop at NSYSU](https://howard-haowen.rohan.tw/NLP-demos/nsysu_workshop).

In [1]:
from datetime import date

today = date.today()
print("Last updated:", today)

Last updated: 2022-06-03


# Text Classification with spaCy

In this notebook, we're going to train text classification models using spaCy, mostly on its CLI.

There are numerous use cases for text classification, including

- Email spam detector
![](https://i1.wp.com/www.opinosis-analytics.com/wp-content/uploads/2020/08/document_classification.png?resize=872%2C436&ssl=1)

- Hate speech detector
![](https://i1.wp.com/www.opinosis-analytics.com/wp-content/uploads/2020/08/facebook_hatespeech.png?resize=721%2C548&ssl=1)

- Customer sentiment analysis
![](https://d33wubrfki0l68.cloudfront.net/9e1b2a906ae6b01cfe2d5d237e1e51f5d41864e3/2a5f9/static/348bb1d70089176ca2f61ea402094382/50bf7/main.png)

- Customer support system
![](https://www.opinosis-analytics.com/wp-content/uploads/2020/07/big_data_strategy_ticket_routing-1024x717.png)

- News classification
![](https://miro.medium.com/max/700/1*HgXA9v1EsqlrRDaC_iORhQ.png)

- Chatbot intent recognition 
![](https://assets-global.website-files.com/5e29a0c20f2d35836e6bc609/5eafc053bd54499b92d23c9d_Intent-Classification.png)

## Dataset

In this tutorial, we'll be using a dataset of 50K online reviews across 5 product categories. Read [this post](https://howard-haowen.rohan.tw/blog.ai/spacy/text-classification/sentiment-analysis/customer-reviews/fasttext/facets/2021/03/12/Classifying-customer-reviews-with-spaCy-v3.html#Preparing-the-dataset) of mine for details on how the dataset was processed into its current form.

In [2]:
!wget -O reviews.csv https://github.com/howard-haowen/NLP-demos/raw/main/online_shopping_5_cats_tra.csv

--2022-06-03 02:09:38--  https://github.com/howard-haowen/NLP-demos/raw/main/online_shopping_5_cats_tra.csv
Resolving github.com (github.com)... 140.82.112.4
Connecting to github.com (github.com)|140.82.112.4|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://raw.githubusercontent.com/howard-haowen/NLP-demos/main/online_shopping_5_cats_tra.csv [following]
--2022-06-03 02:09:38--  https://raw.githubusercontent.com/howard-haowen/NLP-demos/main/online_shopping_5_cats_tra.csv
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 8062412 (7.7M) [text/plain]
Saving to: ‘reviews.csv’


2022-06-03 02:09:39 (72.4 MB/s) - ‘reviews.csv’ saved [8062412/8062412]



In [3]:
import pandas as pd
pd.options.plotting.backend = "plotly"

In [4]:
df = pd.read_csv('reviews.csv')
df

Unnamed: 0,cat,label,review
0,平板,1,﻿很不錯。。。。。。很好的平板
1,平板,1,幫同學買的，同學說感覺挺好，質量也不錯
2,平板,1,東西不錯，一看就是正品包裝，還沒有開機，相信京東，都是老顧客，還是京東值得信賴，給五星好評
3,平板,1,總體而言，產品還是不錯的。
4,平板,1,好，不錯，真的很好不錯
...,...,...,...
49995,酒店,0,我們去鹽城的時候那裡的最低氣溫只有4度，晚上冷得要死，居然還不開空調，投訴到酒店客房部，得到...
49996,酒店,0,房間很小，整體設施老化，和四星的差距很大。毛巾太破舊了。早餐很簡陋。房間隔音很差，隔兩間房間...
49997,酒店,0,我感覺不行。。。價效比很差。不知道是銀川都這樣還是怎麼的！
49998,酒店,0,房間時間長，進去有點異味！服務員是不是不夠用啊！我在一樓找了半個小時以上才找到自己房間，想找...


### Proportion of categories


In [5]:
cat_counts = df['cat'].value_counts()
cat_counts.plot.bar()

### Train/test split


Let's keep the same split setup as in last week's [notebook](https://colab.research.google.com/github/howard-haowen/NLP-demos/blob/main/NSYSU/W06-text-classification-with-scikit-learn.ipynb#scrollTo=_2OT2Gk0dMP_). 

In [6]:
from sklearn.model_selection import train_test_split

In [7]:
TRAIN_SIZE = 0.7
RANDOM_STATE = 500
train, test = train_test_split(df, 
                               train_size=TRAIN_SIZE,
                               random_state=RANDOM_STATE)
print(f"The training set size: {train.shape}")
print(f"The valid_test set size: {test.shape}")

The training set size: (35000, 3)
The valid_test set size: (15000, 3)
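For intuition, here is a minimal plain-Python sketch of what a shuffled 70/30 split does: shuffle the row indices with a fixed seed, then cut them at the training proportion. The helper `simple_split` is hypothetical and will not reproduce the exact rows that scikit-learn's `train_test_split` selects; it only illustrates the idea and the resulting sizes.

```python
import random

def simple_split(n_rows, train_size, seed):
    """Shuffle row indices with a fixed seed and cut them at `train_size`.

    Hypothetical helper for illustration; not scikit-learn's implementation.
    """
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    cut = round(n_rows * train_size)
    return indices[:cut], indices[cut:]

# Same proportions and seed value as in the cell above
train_idx, test_idx = simple_split(n_rows=50_000, train_size=0.7, seed=500)
print(len(train_idx), len(test_idx))
```

Fixing the seed makes the split repeatable, which is why the notebook sets `RANDOM_STATE`.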


In [8]:
train['cat'].value_counts().plot.bar()

In [9]:
test['cat'].value_counts().plot.bar()

## Preprocessing

The only preprocessing we need to do is to convert the dataset into spaCy's `DocBin` format, a binary serialization format for a collection of spaCy `Doc` objects.

In [None]:
!pip install -U -q pip setuptools wheel
!pip install -U -q spacy

First, we'll create a list of tuples with raw texts as the first value and their category labels as the second value.

In [11]:
def df2tuple(df, text_col, cat_col):
    tuple_data = (
        df.apply(lambda row: (row[text_col], row[cat_col]), 
                 axis=1)
        .tolist()
    )
    return tuple_data

In [12]:
TEXT_COL = 'review'
CAT_COL = 'cat'
train_tuple = df2tuple(train, text_col=TEXT_COL, cat_col=CAT_COL)
train_tuple[:10]

[('蘋果除了不夠甜，酥脆可口還可以，包裝一個都沒壞給好評。', '水果'),
 ('先定的普通標間，感覺比較差，連三星都不如，就換到豪華標間，是不同的樓棟，環境要好很多，酒店給人整體感覺很平庸，沒有啥特點。', '酒店'),
 ('酒店是四星的，服務不錯，晚上還送點心。酒店的設施是按四星標準，但可能時間比較長了，設施有點老化，顯得比較舊了。不過整體來說還是不錯的。', '酒店'),
 ('住的是350的大床房,以四星標準衡量的話,傢俱太舊,房間太小,早餐品種單一,別的總體來說還可以吧,地段不錯,服務也滿好.', '酒店'),
 ('來江門出差一直都住這個酒店，覺得價效比還可以', '酒店'),
 ('一星也不想給，洗完頭皮屑越來越多，而且頭皮超級癢。用回潘婷後就好多了。', '洗髮水'),
 ('手機出現訊號不穩定現象，時而有訊號，時而無訊號，換卡，換運營商此問題都未得到解決。只能返修，然而返修需要自費快遞，同時返修期間無臨時手機備用服務，本次購物體驗很差。京東該改進了。客服口氣不好，很不滿意。差評',
  '平板'),
 ('這種蘋果已經買過多次了，水分多，甜酸可口。頭天晚上下單，第二天東西就送來了，京東的服務一如既往，很滿意。', '水果'),
 ('假貨不是正品，和專賣店賣的用了效果完全不一樣。只是沒必要為這麼點錢拿去鑑定了', '洗髮水'),
 ('差穿了2次就丟了！', '衣服')]

In [13]:
test_tuple = df2tuple(test, text_col=TEXT_COL, cat_col=CAT_COL)
test_tuple[:10]

[('打著雙十一的旗子，提高物價！l', '洗髮水'),
 ('騰訊影片會員卡呢??給不起就不要寫上去欺騙消費者', '平板'),
 ('酒店在我最喜歡的跑馬地區, 是靠山的,很舒服,不到兩分鐘步行時間還可以到臨近的百家國際超市,那裡可以購買到全球各地的新鮮美食.^_^,扯遠了,酒店房間就是小了一點,大家都知道的,香港寸土寸金啦,房間小一點是沒有辦法的,好在酒店周圍環境幽雅,安靜,酒店大堂很懷舊英式風格,還有免費巴士到地鐵金鐘站,價格也有競爭力,總的來說還是比較滿意的.',
  '酒店'),
 ('去香港的主要原因就是為了購物，住這家酒店令我最滿意的就它的地理位置非常的方便（尤其是帶寶寶旅遊的媽媽），你不會因為寶寶太小走不了多少路而困惑，因為這家酒店就位於香港最繁華的海港城中，不管是硬體還是軟體絕對符合四星的標準，尤其是床非常大很舒服。出了酒店的門就是令郎滿目各種品牌的商品，會讓你有足不出戶的感覺，因為你就住在商場裡，裡面吃飯購物非常方便。如果你的寶寶累了或者買的東西太多你可以馬上就回到酒店，稍作休息，真的非常方便！！',
  '酒店'),
 ('買了一個月才過來評價，我不是水軍，良心說不咋地，有重啟現象，反應速度有卡頓，不是為了能插儲存卡和otg，我不會買安卓的平板，挑了個大牌子，但很失望！開機充電充了一個小時一點也沒變，忠告大家如果你不是和我一樣就為了插個儲存卡擴充記憶體，還是不要買！！',
  '平板'),
 ('有點刺激頭皮，太敏感的需要注意', '洗髮水'),
 ('31號服務員成心坑人！！我們12點回賓館準備吃飯後收拾結帳，31號服務員問我們什麼時候離開，我們說馬上。按照慣例，服務員沒有提醒我們超時收費，我們就以為不收費。於是我們不慌不忙地吃飯、收拾……等我們兩點結帳的時候，竟然活生生多收半天！！不告知，不提醒，收費也不言語，直接開好發票從押金裡頭就扣了。這不是成心坑人是什麼？？？更可笑的是，我們住的501是個套間，我們剛進去的時候沒有找到電視機的遙控器，就電話給前臺，前臺派一個大媽來給看看。大媽進門溜達了一圈，冒出一句：“好像501就是沒有遙控器”。之後，我們在電視機上找到了遙控器。這樣的服務員、這樣的服務，真是給敦煌摸黑啊！！早餐也特離譜，8點，粥就是涼的……唉，這家賓館，去不得！！',
  '酒店'),
 ('蘋果蠻脆的，就是味道不勻，有些甜，有些

Then we make a directory for the `DocBin` objects we're about to create.

In [14]:
from pathlib import Path

path = Path("./data")
path.mkdir(parents=True, exist_ok=True)

In [15]:
import spacy
from spacy.tokens import DocBin
from tqdm.auto import tqdm

For the purpose of saving a dataset as a spaCy object, we don't really need a trained pipeline. So a blank language model will do. 

In [16]:
nlp = spacy.blank("zh")
unique_labels = df[CAT_COL].unique().tolist()

def make_docs(tuple_data, split):
    """
    tuple_data: a list of tuples with (raw text, category label)
    split: either `train` or `test`
    """
    docs = []
    
    for doc, label in tqdm(nlp.pipe(tuple_data, as_tuples=True), 
                           total = len(tuple_data)):
        
        # the default value for each unique label is False
        # (`cat` is used here so the loop variable `label` is not shadowed)
        label_dict = {cat: False for cat in unique_labels}
        # assign the label_dict to the doc.cats attribute
        doc.cats = label_dict
        # update the current label, which should be True
        doc.cats[label] = True
        docs.append(doc)
    
    # create spaCy DocBin object
    doc_bin = DocBin(docs=docs)
    # save the object to a default path
    doc_bin.to_disk(f"./data/{split}.spacy")
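To make the labelling step concrete, here is a small sketch of the dictionary that ends up in `doc.cats` for a single review. The helper `make_cats` is hypothetical (it is not part of spaCy), and the label list is copied from the categories in this dataset.

```python
def make_cats(label, all_labels):
    """Map every label to False, except `label`, which maps to True.

    Hypothetical helper for illustration; not a spaCy function.
    """
    if label not in all_labels:
        raise ValueError(f"Unknown label: {label!r}")
    return {name: name == label for name in all_labels}

labels = ["平板", "水果", "洗髮水", "衣服", "酒店"]
cats = make_cats("水果", labels)
print(cats)
```

Exactly one value is `True` per document, which is what a single-label classification task expects.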

In [41]:
make_docs(train_tuple, 'train')

  0%|          | 0/35000 [00:00<?, ?it/s]

In [42]:
make_docs(test_tuple, 'test')

  0%|          | 0/15000 [00:00<?, ?it/s]

## Configuration for training

All the settings for the training pipeline reside in a `.cfg` file. You could also use the widget on [this page](https://spacy.io/usage/training) to create a config file, but using the CLI makes it easier to automate your workflow 😇.

In [19]:
LANG = 'zh'
OPTIMIZE = 'efficiency'
CONFIG_PREFIX = 'cpu'
!python -m spacy init config configs/{CONFIG_PREFIX}_config.cfg \
--lang {LANG} \
--pipeline textcat \
--optimize {OPTIMIZE} \
--force

⚠ To generate a more effective transformer-based config (GPU-only),
install the spacy-transformers package and re-run this command. The config
generated now does not use transformers.
ℹ Generated config template specific for your use case
- Language: zh
- Pipeline: textcat
- Optimize for: efficiency
- Hardware: CPU
- Transformer: None
✔ Auto-filled config with all values
✔ Saved config
configs/cpu_config.cfg
You can now add your data and train your pipeline:
python -m spacy train cpu_config.cfg --paths.train ./train.spacy --paths.dev ./dev.spacy


Here's what the config file looks like. See [this page](https://spacy.io/api/data-formats#section-config) for detailed info.

In [20]:
!cat ./configs/cpu_config.cfg

[paths]
train = null
dev = null
vectors = null
init_tok2vec = null

[system]
gpu_allocator = null
seed = 0

[nlp]
lang = "zh"
pipeline = ["textcat"]
batch_size = 1000
disabled = []
before_creation = null
after_creation = null
after_pipeline_creation = null

[nlp.tokenizer]
@tokenizers = "spacy.zh.ChineseTokenizer"
segmenter = "char"

[components]

[components.textcat]
factory = "textcat"
scorer = {"@scorers":"spacy.textcat_scorer.v1"}
threshold = 0.5

[components.textcat.model]
@architectures = "spacy.TextCatBOW.v2"
exclusive_classes = true
ngram_size = 1
no_output_layer = false
nO = null

[corpora]

[corpora.dev]
@readers = "spacy.Corpus.v1"
path = ${paths.dev}
max_length = 0
gold_preproc = false
limit = 0
augmenter = null

[corpora.train]
@readers = "spacy.Corpus.v1"
path = ${paths.train}
max_length = 0
gold_preproc = false
limit = 0
augmenter = null

[training]
dev_corpus = "corpora.dev"
train_corpus = "corpora.train"
seed = ${system.seed}
gpu_allocator = ${system.gpu_allocator}
dro

## Downloading a pretrained model

If you specify a pretrained model in the config file, you'll have to download it before starting the training.

In [None]:
#!python -m spacy download zh_core_web_lg

## Training

Here comes the fun 😻 part! Grab a cup of coffee/tea 🍵 and come back later.

In [22]:
CONFIG_PREFIX = 'cpu'
CONFIG_FILE = CONFIG_PREFIX + '_config.cfg'
TRAIN_FILE = './data/train'
TEST_FILE = './data/test'
MODEL_DIR = f'./{CONFIG_PREFIX}_model'
!python -m spacy train configs/{CONFIG_FILE} \
--output {MODEL_DIR} \
--paths.train {TRAIN_FILE}.spacy \
--paths.dev {TEST_FILE}.spacy \
--verbose

[2022-06-03 02:11:37,213] [DEBUG] Config overrides from CLI: ['paths.train', 'paths.dev']
✔ Created output directory: cpu_model
ℹ Saving to output directory: cpu_model
ℹ Using CPU
ℹ To switch to GPU 0, use the option: --gpu-id 0
[2022-06-03 02:11:38,122] [INFO] Set up nlp object from config
[2022-06-03 02:11:38,152] [DEBUG] Loading corpus from path: data/test.spacy
[2022-06-03 02:11:38,153] [DEBUG] Loading corpus from path: data/train.spacy
[2022-06-03 02:11:38,153] [INFO] Pipeline: ['textcat']
[2022-06-03 02:11:38,159] [INFO] Created vocabulary
[2022-06-03 02:11:38,160] [INFO] Finished initializing nlp object
[2022-06-03 02:12:08,093] [INFO] Initialized pipeline components: ['textcat']
✔ Initialized pipeline
[2022-06-03 02:12:08,102] [DEBUG] Loading corpus from path: data/test.spacy
[2022-06-03 02:12:08,103] [DEBUG] Loading corpus from path: data/train.spacy
ℹ Pipeline: ['textcat']
ℹ Initi

## Evaluation

spaCy automatically saves two models, named `model-best` and `model-last`. Each model directory contains a `meta.json` file, which logs the model's performance.

In [23]:
import json

meta_path = "./cpu_model/model-best/meta.json"
with open(meta_path) as json_file:
    metrics = json.load(json_file)
metrics 

{'author': '',
 'components': ['textcat'],
 'description': '',
 'disabled': [],
 'email': '',
 'labels': {'textcat': ['平板', '水果', '洗髮水', '衣服', '酒店']},
 'lang': 'zh',
 'license': '',
 'name': 'pipeline',
 'performance': {'cats_f_per_type': {'平板': {'f': 0.8418098913,
    'p': 0.8965259619,
    'r': 0.7933884298},
   '水果': {'f': 0.9015829319, 'p': 0.9438040346, 'r': 0.8629776021},
   '洗髮水': {'f': 0.8452485509, 'p': 0.8829357798, 'r': 0.8106469003},
   '衣服': {'f': 0.895844201, 'p': 0.9313087491, 'r': 0.8629815745},
   '酒店': {'f': 0.9818367001, 'p': 0.995524957, 'r': 0.9685197589}},
  'cats_macro_auc': 0.9850454941,
  'cats_macro_auc_per_type': 0.0,
  'cats_macro_f': 0.893264455,
  'cats_macro_p': 0.9300198965,
  'cats_macro_r': 0.8597028531,
  'cats_micro_f': 0.8938958023,
  'cats_micro_p': 0.9310419525,
  'cats_micro_r': 0.8596,
  'cats_score': 0.893264455,
  'cats_score_desc': 'macro F',
  'textcat_loss': 4.8349148593},
 'pipeline': ['textcat'],
 'spacy_git_version': 'Unknown',
 'spacy_v

In [24]:
performance = metrics['performance']
score = performance['cats_score']
auc = performance['cats_macro_auc']
f1 = performance['cats_macro_f']
precision = performance['cats_macro_p']
recall = performance['cats_macro_r']
overall_dict = {'score': score, 'precision': precision, 'recall': recall, 'F1': f1, 'AUC': auc}
overall_df = pd.DataFrame(overall_dict, index=[0])
overall_df

Unnamed: 0,score,precision,recall,F1,AUC
0,0.893264,0.93002,0.859703,0.893264,0.985045


- ROC: An ROC curve (receiver operating characteristic curve) is a graph showing the performance of a classification model at all classification thresholds.

- AUC: AUC stands for "Area under the ROC Curve".

![](https://d2mk45aasx86xg.cloudfront.net/ROC_Curve_with_positive_rates_d8e0e2516d.webp)

- True positive rate (or recall)

\begin{align}
        TPR = \frac{TP}{TP+FN}
    \end{align}

- False positive rate

\begin{align}
        FPR = \frac{FP}{FP+TN}
    \end{align}
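The two rates above can be computed directly from predictions, and sweeping the threshold traces out the ROC curve. The following is a minimal sketch on made-up binary data (the labels and scores are invented for illustration, and the functions `rates` and `roc_auc` are hypothetical helpers, not spaCy's scorer).

```python
def rates(y_true, scores, threshold):
    """Return (TPR, FPR), predicting positive when score >= threshold."""
    tp = fp = tn = fn = 0
    for truth, score in zip(y_true, scores):
        predicted = score >= threshold
        if truth and predicted:
            tp += 1
        elif truth:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return tpr, fpr

def roc_auc(y_true, scores):
    """Trace the ROC curve over all thresholds and integrate it with the trapezoid rule."""
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tpr, fpr = rates(y_true, scores, threshold)
        points.append((fpr, tpr))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Toy data, invented for illustration: 1 = positive class, 0 = negative class
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
tpr, fpr = rates(y_true, scores, threshold=0.5)
auc = roc_auc(y_true, scores)
print(f"TPR={tpr:.3f}, FPR={fpr:.3f}, AUC={auc:.3f}")
```

An AUC of 1.0 means every positive example is scored above every negative one, while 0.5 is what random scoring would give.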

In [25]:
per_cat_dict = performance['cats_f_per_type']
per_cat_df = pd.DataFrame(per_cat_dict)
per_cat_df

Unnamed: 0,平板,水果,洗髮水,衣服,酒店
p,0.896526,0.943804,0.882936,0.931309,0.995525
r,0.793388,0.862978,0.810647,0.862982,0.96852
f,0.84181,0.901583,0.845249,0.895844,0.981837
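As a sanity check, the F-scores in the table can be recomputed from precision and recall with the usual harmonic-mean formula, and the macro F reported in `meta.json` is their unweighted mean. The sketch below copies the `p` and `r` values from the `meta.json` output shown earlier; the function name `f1` is our own.

```python
# Precision and recall per category, copied from meta.json above
per_type = {
    "平板": {"p": 0.8965259619, "r": 0.7933884298},
    "水果": {"p": 0.9438040346, "r": 0.8629776021},
    "洗髮水": {"p": 0.8829357798, "r": 0.8106469003},
    "衣服": {"p": 0.9313087491, "r": 0.8629815745},
    "酒店": {"p": 0.995524957, "r": 0.9685197589},
}

def f1(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if p + r else 0.0

f_scores = {cat: f1(v["p"], v["r"]) for cat, v in per_type.items()}
macro_f = sum(f_scores.values()) / len(f_scores)

for cat, f in f_scores.items():
    print(f"{cat}: {f:.6f}")
print(f"macro F: {macro_f:.6f}")
```

Because the macro average weights each category equally, a weak category such as 平板 pulls the overall score down regardless of how many reviews it has.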


## Loading the trained model

In [26]:
trained_nlp = spacy.load("./cpu_model/model-best")

Probabilities over text categories are stored in the `.cats` attribute of a `Doc` object.

In [36]:
text = "收到了！~ 樣式很好看！~ 面料不錯！！ 老公穿著看來好帥啊 哈哈 謝謝店家"
doc = trained_nlp(text)
cat_proba = doc.cats
cat_proba

{'平板': 2.3043754481477663e-05,
 '水果': 1.5334107956732623e-05,
 '洗髮水': 1.765126762620639e-05,
 '衣服': 0.999940037727356,
 '酒店': 3.992225629190216e-06}
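Since `doc.cats` is a plain dictionary, ranking the categories takes only a sort. Here is a small sketch using the probabilities printed above, copied into a literal so that it runs without the trained model.

```python
# Probabilities copied from the output above
cat_proba = {
    "平板": 2.3043754481477663e-05,
    "水果": 1.5334107956732623e-05,
    "洗髮水": 1.765126762620639e-05,
    "衣服": 0.999940037727356,
    "酒店": 3.992225629190216e-06,
}

# Sort categories from most to least probable
ranked = sorted(cat_proba.items(), key=lambda item: item[1], reverse=True)
top_cat, top_score = ranked[0]

for cat, score in ranked:
    print(f"{cat}: {score:.6f}")
print(f"Predicted category: {top_cat}")
```

With `exclusive_classes = true` in the config, the scores behave like a probability distribution over the five categories, so they sum to roughly 1.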

In [27]:
import random

def show_test():
    text, label = random.choice(test_tuple)
    predicted_proba = trained_nlp(text).cats
    predicted_cat = max(predicted_proba, key=predicted_proba.get)
    print(f"Text: {text}")
    print(f"True category: {label}")
    print(f"Category probabilities:\n{predicted_proba}")
    print(f"Predicted category: {predicted_cat}")

In [37]:
show_test()

Text: 價效比還是比較高的，比同是四星的杭州武林廣場那邊的一些四星好一些。去杭州出差常常住的酒店
True category: 酒店
Category probabilities:
{'平板': 5.975995236440212e-07, '水果': 1.6611232922514318e-06, '洗髮水': 9.347874083687202e-07, '衣服': 1.3484321925716358e-06, '酒店': 0.9999954700469971}
Predicted category: 酒店


# Assignment

Train a classification model on the same dataset by leveraging a pretrained transformer model.

> **Make sure to use a GPU**. Go to `Runtime` and then `Change runtime type`. Select `GPU` in the dropdown menu under `Hardware accelerator`.

In [29]:
!nvcc --version

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Mon_Oct_12_20:09:46_PDT_2020
Cuda compilation tools, release 11.1, V11.1.105
Build cuda_11.1.TC455_06.29190527_0


- Install dependencies

In [None]:
"""
!pip install -U pip setuptools wheel
!pip install -U spacy[cuda111,transformers]
"""

- Configure the training pipeline

In [None]:
"""
LANG = 'zh'
OPTIMIZE = 'accuracy'
CONFIG_PREFIX = 'gpu'
!python -m spacy init config configs/{CONFIG_PREFIX}_config.cfg \
--lang {LANG} \
--pipeline transformer,textcat \
--optimize {OPTIMIZE} \
--gpu \
--force
"""

- Display the config file

In [32]:
#!cat ./configs/gpu_config.cfg

- Download the transformer model

In [33]:
#!python -m spacy download zh_core_web_trf

- Start training

In [None]:
"""
CONFIG_PREFIX = 'gpu'
CONFIG_FILE = CONFIG_PREFIX + '_config.cfg'
TRAIN_FILE = './data/train'
TEST_FILE = './data/test'
MODEL_DIR = f'./{CONFIG_PREFIX}_model'
!python -m spacy train configs/{CONFIG_FILE} \
--output {MODEL_DIR} \
--paths.train {TRAIN_FILE}.spacy \
--paths.dev {TEST_FILE}.spacy \
--gpu-id 0 \
--verbose
"""

You'll see a training process like this if you do everything correctly.

In [None]:
"""
[2022-06-02 15:32:00,246] [DEBUG] Config overrides from CLI: ['paths.train', 'paths.dev']
✔ Created output directory: gpu_model
ℹ Saving to output directory: gpu_model
ℹ Using GPU: 0

=========================== Initializing pipeline ===========================
[2022-06-02 15:32:10,768] [INFO] Set up nlp object from config
[2022-06-02 15:32:10,777] [DEBUG] Loading corpus from path: data/test.spacy
[2022-06-02 15:32:10,778] [DEBUG] Loading corpus from path: data/train.spacy
[2022-06-02 15:32:10,778] [INFO] Pipeline: ['transformer', 'textcat']
[2022-06-02 15:32:10,782] [INFO] Created vocabulary
[2022-06-02 15:32:10,784] [INFO] Finished initializing nlp object
Downloading: 100% 29.0/29.0 [00:00<00:00, 37.6kB/s]
Downloading: 100% 624/624 [00:00<00:00, 693kB/s]
Downloading: 100% 107k/107k [00:00<00:00, 584kB/s] 
Downloading: 100% 263k/263k [00:00<00:00, 1.08MB/s]
Downloading: 100% 393M/393M [00:05<00:00, 69.4MB/s]
Some weights of the model checkpoint at bert-base-chinese were not used when initializing BertModel: ['cls.predictions.transform.dense.weight', 'cls.predictions.decoder.weight', 'cls.predictions.bias', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
[2022-06-02 15:33:03,926] [INFO] Initialized pipeline components: ['transformer', 'textcat']
✔ Initialized pipeline

============================= Training pipeline =============================
[2022-06-02 15:33:03,936] [DEBUG] Loading corpus from path: data/test.spacy
[2022-06-02 15:33:03,937] [DEBUG] Loading corpus from path: data/train.spacy
ℹ Pipeline: ['transformer', 'textcat']
ℹ Initial learn rate: 0.0
E    #       LOSS TRANS...  LOSS TEXTCAT  CATS_SCORE  SCORE 
---  ------  -------------  ------------  ----------  ------
  0       0           0.00          0.16        0.00    0.00
  0     200           0.87         62.01       83.82    0.84
  0     400          19.21         16.19       87.78    0.88
  0     600          32.73         14.51       88.72    0.89
  0     800          41.60         14.01       88.78    0.89
  0    1000          53.54         14.53       89.44    0.89
  0    1200          46.88         13.18       89.22    0.89
"""