# What is a pre-trained model?

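* CNN: produces a feature vector for an input object
* Image classification → reuse a pre-trained CNN for the CNN part and build your own model only for the NN part
* The CNN in this setup is what is called a "pre-trained model"
* Pre-trained model: a network model that converts an input object into something like a feature vector and can be shared across many tasks in its field
     * In natural language processing, the input object is a word sequence, and the data passed to the task is the corresponding sequence of word embeddings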

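### Characteristics of pre-trained models
* Usable across many tasks, so one powerful model is enough → hard to build, but convenient once it exists
* Reduces the amount of labeled data needed for downstream tasks → transfer learning; lowers the cost of creating supervised data
* Can be adapted to each downstream task → fine-tuning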
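
The idea of reusing a pre-trained part and training only a small task-specific part can be sketched in PyTorch as follows. This is a minimal illustration, not the notebook's method: `backbone` is a hypothetical, randomly initialized stand-in for a real pre-trained network, and the sizes and toy data are made up.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for a pre-trained feature extractor (e.g. a CNN or BERT).
backbone = nn.Sequential(nn.Linear(16, 32), nn.ReLU())
# Small task-specific head trained on the downstream task (3 classes here).
head = nn.Linear(32, 3)

# Transfer learning: freeze the backbone so that only the head is updated.
for p in backbone.parameters():
    p.requires_grad = False

opt = torch.optim.SGD(head.parameters(), lr=0.1)

x = torch.randn(4, 16)            # toy batch of 4 input objects
t = torch.tensor([0, 2, 1, 0])    # toy gold labels

backbone_before = backbone[0].weight.clone()
head_before = head.weight.clone()

# One training step on the downstream task.
loss = nn.functional.cross_entropy(head(backbone(x)), t)
opt.zero_grad()
loss.backward()
opt.step()

backbone_unchanged = torch.equal(backbone_before, backbone[0].weight)
head_changed = not torch.equal(head_before, head.weight)
print(backbone_unchanged, head_changed)
```

Fine-tuning in the sense used above would instead leave the backbone parameters trainable, so that they are adjusted together with the head.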

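### BERT
* The token embeddings it outputs are context-dependent: each embedding is built from the token's relation to the surrounding tokens
     * 「犬」 in 「私は犬が好き。」 ("I like dogs.", 犬 = dog) and 「犬」 in 「奴は警察の犬だ。」 ("He is a police informer.", 犬 = spy) have different senses
* Can be fine-tuned
     * Done with HuggingFace's transformers library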

In [1]:
pip install transformers[ja]

Defaulting to user installation because normal site-packages is not writeable
Note: you may need to restart the kernel to use updated packages.


In [2]:
from transformers import BertModel
bert = BertModel.from_pretrained('cl-tohoku/bert-base-japanese-v2')

Some weights of the model checkpoint at cl-tohoku/bert-base-japanese-v2 were not used when initializing BertModel: ['cls.predictions.decoder.weight', 'cls.predictions.transform.dense.bias', 'cls.seq_relationship.bias', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.bias', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.weight']
- This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


In [3]:
bert  # inspect the model structure

BertModel(
  (embeddings): BertEmbeddings(
    (word_embeddings): Embedding(32768, 768, padding_idx=0)
    (position_embeddings): Embedding(512, 768)
    (token_type_embeddings): Embedding(2, 768)
    (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
    (dropout): Dropout(p=0.1, inplace=False)
  )
  (encoder): BertEncoder(
    (layer): ModuleList(
      (0): BertLayer(
        (attention): BertAttention(
          (self): BertSelfAttention(
            (query): Linear(in_features=768, out_features=768, bias=True)
            (key): Linear(in_features=768, out_features=768, bias=True)
            (value): Linear(in_features=768, out_features=768, bias=True)
            (dropout): Dropout(p=0.1, inplace=False)
          )
          (output): BertSelfOutput(
            (dense): Linear(in_features=768, out_features=768, bias=True)
            (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
            (dropout): Dropout(p=0.1, inplace=False)
          

In [4]:
pip install torchinfo

Defaulting to user installation because normal site-packages is not writeable
Note: you may need to restart the kernel to use updated packages.


In [5]:
from torchinfo import summary  # inspect the parameter counts
summary(bert)

Layer (type:depth-idx)                             Param #
BertModel                                          --
├─BertEmbeddings: 1-1                              --
│    └─Embedding: 2-1                              25,165,824
│    └─Embedding: 2-2                              393,216
│    └─Embedding: 2-3                              1,536
│    └─LayerNorm: 2-4                              1,536
│    └─Dropout: 2-5                                --
├─BertEncoder: 1-2                                 --
│    └─ModuleList: 2-6                             --
│    │    └─BertLayer: 3-1                         7,087,872
│    │    └─BertLayer: 3-2                         7,087,872
│    │    └─BertLayer: 3-3                         7,087,872
│    │    └─BertLayer: 3-4                         7,087,872
│    │    └─BertLayer: 3-5                         7,087,872
│    │    └─BertLayer: 3-6                         7,087,872
│    │    └─BertLayer: 3-7                         7,087,872
│    │   
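
The parameter counts in the summary can be reproduced by hand. The sketch below uses the sizes printed in the model structure; the feed-forward width `intermediate = 3072` and the helper `linear` are assumptions, since the printed structure is cut off before that part of the layer.

```python
hidden = 768          # embedding size, from the printed model structure
vocab = 32768         # word_embeddings: Embedding(32768, 768)
max_pos = 512         # position_embeddings: Embedding(512, 768)
segments = 2          # token_type_embeddings: Embedding(2, 768)
intermediate = 3072   # assumption: feed-forward width of one BertLayer


def linear(n_in, n_out):
    """Parameters of a Linear layer: weight matrix plus bias."""
    return n_in * n_out + n_out


layer_norm = 2 * hidden   # one scale and one shift vector

embedding_params = {
    "word": vocab * hidden,
    "position": max_pos * hidden,
    "segment": segments * hidden,
    "layer_norm": layer_norm,
}

bert_layer = (
    4 * linear(hidden, hidden)      # query, key, value, attention output
    + layer_norm
    + linear(hidden, intermediate)  # feed-forward, expand
    + linear(intermediate, hidden)  # feed-forward, project back
    + layer_norm
)

print(embedding_params)
print(bert_layer)
```

With these sizes the per-layer total matches the 7,087,872 reported for each BertLayer, which supports the assumed feed-forward width.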

In [7]:
from transformers import BertJapaneseTokenizer  # create a tokenizer to convert the input text into token IDs
tknz = BertJapaneseTokenizer.from_pretrained('cl-tohoku/bert-base-japanese-v2')

In [8]:
tknz.tokenize("私は犬が好き。")

['私', 'は', '犬', 'が', '好き', '。']

In [9]:
tknz.encode("私は犬が好き。")  # convert each token to its token ID; the special tokens [CLS] (2) at the start and [SEP] (3) at the end are added

[2, 3946, 897, 3549, 862, 12215, 829, 3]

In [10]:
tknz.encode("私は犬が好き。", add_special_tokens=False)  # leave out the special tokens

[3946, 897, 3549, 862, 12215, 829]
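
What `encode` does on top of `tokenize` can be mimicked with a toy lookup. This is only a sketch: `vocab` contains just the IDs visible in the outputs above, and `toy_encode` is a hypothetical helper, not part of transformers.

```python
# Toy vocabulary built only from the IDs shown in the outputs above.
vocab = {"[CLS]": 2, "[SEP]": 3, "私": 3946, "は": 897, "犬": 3549,
         "が": 862, "好き": 12215, "。": 829}


def toy_encode(tokens, add_special_tokens=True):
    """Map tokens to IDs, optionally wrapping them in [CLS] ... [SEP]."""
    ids = [vocab[t] for t in tokens]
    if add_special_tokens:
        ids = [vocab["[CLS]"]] + ids + [vocab["[SEP]"]]
    return ids


tokens = ["私", "は", "犬", "が", "好き", "。"]
with_special = toy_encode(tokens)
without_special = toy_encode(tokens, add_special_tokens=False)
print(with_special)
print(without_special)
```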

In [11]:
import torch
x = tknz.encode("私は犬が好き。")
x = torch.LongTensor(x).unsqueeze(0)  # convert to a tensor for BERT; unsqueeze(0) makes it a batch with one element
x

tensor([[    2,  3946,   897,  3549,   862, 12215,   829,     3]])

In [12]:
y = bert(x)  # the BERT output is stored in y
y.last_hidden_state  # the sequence of embeddings output by BERT

tensor([[[ 1.6633e-01, -7.8501e-02,  1.1767e-03,  ...,  3.6960e-01,
           1.6008e-01, -5.5611e-01],
         [ 2.7389e-01, -2.8402e-01, -8.4956e-01,  ..., -6.4883e-01,
           3.8284e-01, -1.5853e-01],
         [-5.7850e-01,  3.8757e-01, -9.7429e-01,  ...,  1.2454e+00,
          -4.9265e-01, -3.7446e-01],
         ...,
         [ 7.1451e-01,  2.8899e-01, -5.4993e-01,  ...,  1.0793e-01,
          -1.8923e+00, -8.3096e-01],
         [ 3.0506e-01, -7.4390e-01, -6.6757e-01,  ...,  2.6773e-01,
          -9.7722e-01, -8.8383e-01],
         [ 1.9436e-01, -8.5321e+00, -1.4069e-01,  ...,  7.1000e-03,
          -9.3125e-02, -5.4594e-01]]], grad_fn=<NativeLayerNormBackward0>)

In [13]:
y.last_hidden_state.shape  # [batch size, sequence length in tokens, embedding dimension]

torch.Size([1, 8, 768])

In [14]:
y.last_hidden_state[0][3]  # embedding of the token at index 3, 「犬」 (index 0 is [CLS], so this is the 3rd word of the sentence)

tensor([ 8.5998e-01,  1.2333e-01, -4.1867e-01,  1.2998e-01,  1.5049e-01,
        -1.1773e+00, -8.8095e-01,  3.9046e-01,  1.1116e-02,  5.0804e-02,
         6.2080e-01, -1.4059e+00, -1.5116e+00, -1.0155e+00, -5.3455e-02,
         3.3133e-01, -1.2388e+00, -1.4632e+00,  3.0444e+00, -1.0437e+00,
        -4.2564e-01,  2.7795e-01, -9.6374e-01, -6.7335e-01, -7.0273e-01,
        -2.0534e-01, -3.4195e-02,  1.0376e+00,  3.1639e-01, -9.4961e-01,
        -4.9169e-02,  3.3885e-01,  1.7551e+00,  3.5499e-01,  1.4980e+00,
         7.2804e-02,  4.8653e-01, -2.7875e-01, -4.2737e-01,  3.1235e-01,
         7.8298e-01,  2.4274e-01, -7.7656e-01, -1.8049e-01,  1.1966e+00,
        -5.4871e-02,  1.1559e+00,  9.0174e-01, -5.0762e-01, -9.7665e-01,
         2.4764e-01, -1.2593e-01, -4.3765e+00, -5.8042e-01,  6.3760e-01,
        -1.8688e+00,  2.2698e-01, -2.3121e-01,  4.7407e-01, -2.4936e-01,
         1.9515e-01, -8.7563e-01, -1.8446e-01, -7.8302e-01,  1.0440e+00,
        -5.5363e-01, -2.9012e-01, -1.5317e+00,  3.8

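### Processing inside BERT
* Transformer: the architecture that BERT is based on
     * Originally a neural machine translation model
     * Broadly consists of an Encoder and a Decoder; BERT uses the Encoder part
     * BERT consists of BertEmbeddings and BertEncoder; BertEmbeddings contains the part that corresponds to the Transformer's Positional Encoding
     * BertEmbeddings converts each token into a distributed representation (embedding) and adds position information to it
     * The sequence of embeddings produced by BertEmbeddings is refined step by step by the 12 BertLayers in BertEncoder
     
* Position Embeddings: treat each token position as an abstract object and embed it in an n-dimensional space
     * BertEmbeddings sums three embedding vectors: the word embedding, the Position Embedding, and the Segment Embedding
     * Segment Embeddings: an embedding vector indicating whether a token belongs to the first or the second sentence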
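
The sum of the three embeddings can be sketched with plain PyTorch modules. The sizes are taken from the printed model structure; the weights here are randomly initialized rather than pre-trained, dropout is left out, and the variable names are made up for this sketch.

```python
import torch
import torch.nn as nn

hidden = 768
word_emb = nn.Embedding(32768, hidden, padding_idx=0)   # token ID -> vector
pos_emb = nn.Embedding(512, hidden)                     # position -> vector
seg_emb = nn.Embedding(2, hidden)                       # sentence 1 or 2 -> vector
norm = nn.LayerNorm(hidden, eps=1e-12)

ids = torch.tensor([[2, 3946, 897, 3549, 862, 12215, 829, 3]])
positions = torch.arange(ids.size(1)).unsqueeze(0)      # 0, 1, ..., 7
segments = torch.zeros_like(ids)                        # single sentence: all segment 0

# Element-wise sum of the three embeddings, followed by Layer Normalization.
emb = norm(word_emb(ids) + pos_emb(positions) + seg_emb(segments))
print(emb.shape)
```

Because all three tables map to vectors of the same size, the result keeps the shape [batch size, number of tokens, 768].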

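* BertLayer: takes a sequence of token embeddings as input
     * The input is passed to Multi-Head Attention, which outputs a sequence of token embeddings
     * A residual connection is applied to this output, followed by Layer Normalization
     * The result goes through a linear (feed-forward) transformation, and the same residual connection and Layer Normalization are applied once more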
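
The order of operations inside one layer can be sketched as below. This is an approximation built from PyTorch's own `nn.MultiheadAttention`, not the transformers implementation; the class name `SketchLayer`, the 12 heads, and the feed-forward width 3072 are assumptions.

```python
import torch
import torch.nn as nn


class SketchLayer(nn.Module):
    """Encoder layer: attention and feed-forward, each followed by residual + LayerNorm."""

    def __init__(self, hidden=768, heads=12, intermediate=3072, p=0.1):
        super().__init__()
        self.attn = nn.MultiheadAttention(hidden, heads, dropout=p, batch_first=True)
        self.norm1 = nn.LayerNorm(hidden, eps=1e-12)
        self.ff = nn.Sequential(
            nn.Linear(hidden, intermediate),
            nn.GELU(),
            nn.Linear(intermediate, hidden),
        )
        self.norm2 = nn.LayerNorm(hidden, eps=1e-12)
        self.drop = nn.Dropout(p)

    def forward(self, x):
        a, _ = self.attn(x, x, x)              # self-attention: query = key = value = x
        x = self.norm1(x + self.drop(a))       # residual connection, then LayerNorm
        f = self.ff(x)                         # position-wise feed-forward
        return self.norm2(x + self.drop(f))    # residual connection, then LayerNorm


layer = SketchLayer().eval()
x = torch.randn(1, 8, 768)                     # [batch, tokens, hidden]
with torch.no_grad():
    y = layer(x)
n_params = sum(p.numel() for p in layer.parameters())
print(y.shape, n_params)
```

The output has the same shape as the input, which is what lets 12 such layers be stacked one after another.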

* Multi-Head Attention: (omitted)

# Document classification with BERT

In [15]:
from transformers import BertForSequenceClassification

In [16]:
model = BertForSequenceClassification.from_pretrained(
    'cl-tohoku/bert-base-japanese-v2',
    num_labels = 3
)


Some weights of the model checkpoint at cl-tohoku/bert-base-japanese-v2 were not used when initializing BertForSequenceClassification: ['cls.predictions.decoder.weight', 'cls.predictions.transform.dense.bias', 'cls.seq_relationship.bias', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.bias', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.weight']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification wer

In [17]:
d = """
私は犬が好きです。一般に動物が好きです。
言葉が使える動物がいたら楽しいと思います。
"""

In [18]:
import torch
from transformers import BertJapaneseTokenizer
tknz = BertJapaneseTokenizer.from_pretrained(
    'cl-tohoku/bert-base-japanese-v2')
x = tknz.encode(d)  # convert the document to token IDs
x = torch.LongTensor(x).unsqueeze(0)
y = model(x)

In [19]:
y.logits.shape  # [batch size, number of labels]

torch.Size([1, 3])

In [20]:
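y.logits[0]  # the label with the largest logit is the prediction; here that is index 0 (the classification head is still untrained, so the result is arbitrary)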

tensor([ 0.3156, -0.1285,  0.2848], grad_fn=<SelectBackward0>)
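
Turning the logits into a prediction takes only an argmax; a softmax gives probabilities if they are needed. A small sketch using the values printed above:

```python
import math

logits = [0.3156, -0.1285, 0.2848]    # values printed above

# Softmax turns the logits into probabilities that sum to 1.
exps = [math.exp(v) for v in logits]
probs = [e / sum(exps) for e in exps]

# The prediction is the index of the largest logit (equivalently, of the largest probability).
pred = max(range(len(logits)), key=lambda i: logits[i])
print(pred, [round(p, 3) for p in probs])
```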

In [21]:
import torch.optim as optim
opt = optim.SGD(model.parameters(), lr=0.01)  # optimizer

In [22]:
ga = torch.LongTensor([1]).unsqueeze(0)  # gold label tensor, assuming the correct label for document d is 1

In [23]:
y = model(x, labels=ga)  # forward pass; when labels are given, the output also contains the loss

In [24]:
loss = y.loss  # loss value
opt.zero_grad()
loss.backward()
opt.step()
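
For single-label classification like this, the loss returned when `labels` is passed should be the cross-entropy between the logits and the gold label. The sketch below recomputes such a loss by hand from the logits printed earlier; it does not call the model, so treat it as an illustration of the formula rather than of the library internals.

```python
import math
import torch
import torch.nn.functional as F

logits = torch.tensor([[0.3156, -0.1285, 0.2848]])   # [batch size, number of labels]
gold = torch.tensor([1])                             # gold label of the single document

loss = F.cross_entropy(logits, gold)

# Same value by hand: negative log of the softmax probability of the gold label.
probs = torch.softmax(logits, dim=-1)
manual = -math.log(probs[0, 1].item())
print(loss.item(), manual)
```

In real training, the forward pass, `zero_grad()`, `backward()` and `step()` are repeated over many labeled documents rather than run once.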

In [25]:
torch.save(model.state_dict(), 'mymodel.bin')

In [26]:
import pickle


with open('myconfig.pkl', 'wb') as fw:
    pickle.dump(model.config, fw)

In [27]:
with open('myconfig.pkl', 'rb') as f:
    myconfig = pickle.load(f)
mymodel = BertForSequenceClassification(config=myconfig)  # rebuild the architecture from the saved config
mymodel.load_state_dict(torch.load('mymodel.bin'))  # then restore the fine-tuned weights
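
The save and restore pattern (store the weights, rebuild the same architecture, load the weights back) can be checked on a small stand-in. This sketch assumes nothing about BERT: `nn.Linear(4, 3)` is a hypothetical placeholder for the fine-tuned model, and an in-memory buffer replaces the file.

```python
import io
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(4, 3)              # hypothetical stand-in for the fine-tuned model

buffer = io.BytesIO()                # in-memory file, so nothing is written to disk
torch.save(model.state_dict(), buffer)
buffer.seek(0)

restored = nn.Linear(4, 3)           # must have the same architecture as the saved model
restored.load_state_dict(torch.load(buffer))
restored.eval()                      # inference mode before making predictions

x = torch.randn(2, 4)
same = torch.equal(model(x), restored(x))
print(same)
```

As an alternative to pickling the config by hand, transformers models also provide `save_pretrained` and `from_pretrained`, which store the config and the weights together in one directory.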

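# Sequence labeling with BERT

* Uses the class BertForTokenClassification
* Suited to tasks that assign a label to each item of an input sequence
* The final labels are chosen with the Viterbi algorithm from the per-token label probabilities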
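
BertForTokenClassification itself only returns per-token logits, so the Viterbi step has to be added on top. The following is a minimal sketch of that decoding step under made-up assumptions: a hypothetical 3-label scheme (0 = O, 1 = B, 2 = I), hand-written scores, and a transition table that forbids I directly after O.

```python
def viterbi(emissions, transitions):
    """Return the label sequence with the highest total score.

    emissions[t][k]   : score of label k at position t (e.g. a log-probability)
    transitions[j][k] : score of moving from label j to label k
    """
    n_labels = len(emissions[0])
    score = list(emissions[0])    # best score of any path ending in each label
    back = []                     # back[t-1][k]: best previous label for label k at position t

    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for k in range(n_labels):
            best_j = max(range(n_labels), key=lambda j: score[j] + transitions[j][k])
            pointers.append(best_j)
            new_score.append(score[best_j] + transitions[best_j][k] + emissions[t][k])
        score = new_score
        back.append(pointers)

    # Follow the back-pointers from the best final label.
    label = max(range(n_labels), key=lambda k: score[k])
    path = [label]
    for pointers in reversed(back):
        label = pointers[label]
        path.append(label)
    return path[::-1]


NEG = -1e9  # effectively forbids a transition
transitions = [
    [0.0, 0.0, NEG],   # from O: I is not allowed next
    [0.0, 0.0, 0.0],   # from B
    [0.0, 0.0, 0.0],   # from I
]
emissions = [
    [2.0, 0.5, 0.1],   # position 0: O scores highest
    [0.2, 1.0, 1.5],   # position 1: I scores highest on its own
    [0.1, 0.3, 2.0],   # position 2: I scores highest
]

greedy = [max(range(3), key=lambda k: e[k]) for e in emissions]
path = viterbi(emissions, transitions)
print(greedy, path)
```

Picking the best label at each position independently yields O, I, I, which violates the constraint, whereas Viterbi returns the best sequence that respects it.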

In [28]:
from transformers import BertForTokenClassification
model = BertForTokenClassification.from_pretrained(
                 'cl-tohoku/bert-base-japanese-v2',
                 num_labels = 9)  # num_labels: the number of labels

Some weights of the model checkpoint at cl-tohoku/bert-base-japanese-v2 were not used when initializing BertForTokenClassification: ['cls.predictions.decoder.weight', 'cls.seq_relationship.bias', 'cls.predictions.bias', 'cls.predictions.decoder.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.dense.weight', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.bias']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForTokenClassification were not initia

In [29]:
s = "田中さんは茨城大学の学生です。"

In [30]:
import torch
from transformers import BertJapaneseTokenizer
tknz = BertJapaneseTokenizer.from_pretrained('cl-tohoku/bert-base-japanese-v2')
tknz.tokenize(s)

['田中', 'さん', 'は', '茨城', '大学', 'の', '学生', 'です', '。']

In [31]:
x = tknz.encode(s)  # convert to a sequence of token IDs
x

[2, 13026, 11689, 897, 14121, 11188, 896, 12229, 12461, 829, 3]

In [32]:
x = torch.LongTensor(x).unsqueeze(0)
y = model(x)

In [33]:
y.logits.shape  # logits attribute: [batch size, number of tokens, number of labels]

torch.Size([1, 11, 9])

In [34]:
y.logits[0].shape  # logit of each label for each token

torch.Size([11, 9])

In [35]:
import torch.optim as optim
opt = optim.SGD(model.parameters(), lr=0.01)  # optimizer

* For the mapping between label names and label IDs, see the textbook

In [38]:
ga = torch.LongTensor([0, 1, 0, 0, 3, 7, 0, 0, 0, 0, 0]).unsqueeze(0)  # gold label sequence, one label per token including [CLS] and [SEP]

In [39]:
y = model(x, labels=ga)

In [40]:
loss = y.loss
opt.zero_grad()
loss.backward()
opt.step()
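
The gold label sequence has 11 entries because it must line up with the 11 token IDs, special tokens included. A small sketch of that alignment, assuming label 0 marks tokens that belong to no entity (the actual label names are in the textbook and not reproduced here):

```python
tokens = ["[CLS]", "田中", "さん", "は", "茨城", "大学", "の", "学生", "です", "。", "[SEP]"]
labels = [0, 1, 0, 0, 3, 7, 0, 0, 0, 0, 0]

# One label per token; [CLS] and [SEP] also need a label.
same_length = len(tokens) == len(labels)

# Keep only the tokens whose label is not 0 (assumed to mean "no entity").
labelled = [(tok, lab) for tok, lab in zip(tokens, labels) if lab != 0]
print(same_length, labelled)
```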

# Task inference with Pipeline
* pipeline: a function that runs inference for a given task using a model already trained for that task

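### Task types
* conversational: dialogue
* feature-extraction: feature extraction
* fill-mask: masked-word prediction (hide a word and have the model infer it)
* image-classification: image classification
* question-answering: question answering
* table-question-answering: question answering over table contents
* text2text-generation: translation, summarization, question answering
* text-classification / sentiment-analysis: sentiment analysis
* text-generation: text generation
* token-classification / ner: named entity recognition
* translation: translation
* translation_xx_to_yy: translation from language xx to language yy
* summarization: summarization
* zero-shot-classification: zero-shot classification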

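### Sentiment analysis
* The task of deciding whether an input sentence is positive or negative
* Essentially the same as document classification
* Potentially useful for things like review analysis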

In [16]:
from transformers import pipeline
net = pipeline('text-classification')

No model was supplied, defaulted to distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.



In [21]:
text1 = 'This book is very interesting.'
net(text1)

[{'label': 'POSITIVE', 'score': 0.9998513460159302}]

In [20]:
text2 = 'This book has some interesting parts.'
net(text2)

[{'label': 'POSITIVE', 'score': 0.9996337890625}]

* roberta-large-mnli: a model obtained by training the pre-trained model RoBERTa-large on the MNLI dataset

In [22]:
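net2 = pipeline('text-classification', model='roberta-large-mnli')  # MNLI is a natural language inference dataset, so the labels are CONTRADICTION / NEUTRAL / ENTAILMENT rather than sentiment classes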


Some weights of the model checkpoint at roberta-large-mnli were not used when initializing RobertaForSequenceClassification: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
- This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).



In [24]:
net2(text2)

[{'label': 'NEUTRAL', 'score': 0.6103535294532776}]

* daigo/bert-base-japanese-sentiment: one of the Japanese sentiment analysis models

In [None]:
# net3 = pipeline('text-classification',
#                model='daigo/bert-base-japanese-sentiment')
# net3("この犬は本当にお利口さんだ。")

### Named entity recognition

In [23]:
net = pipeline('ner')
text = 'Mr.Tanaka is a student at Ibaraki University.'
net(text)

No model was supplied, defaulted to dbmdz/bert-large-cased-finetuned-conll03-english and revision f2482bf (https://huggingface.co/dbmdz/bert-large-cased-finetuned-conll03-english).
Using a pipeline without specifying a model name and revision in production is not recommended.



[{'entity': 'I-PER',
  'score': 0.99895644,
  'index': 3,
  'word': 'Tanaka',
  'start': 3,
  'end': 9},
 {'entity': 'I-ORG',
  'score': 0.9990582,
  'index': 8,
  'word': 'I',
  'start': 26,
  'end': 27},
 {'entity': 'I-ORG',
  'score': 0.99033654,
  'index': 9,
  'word': '##bara',
  'start': 27,
  'end': 31},
 {'entity': 'I-ORG',
  'score': 0.9977101,
  'index': 10,
  'word': '##ki',
  'start': 31,
  'end': 33},
 {'entity': 'I-ORG',
  'score': 0.9953762,
  'index': 11,
  'word': 'University',
  'start': 34,
  'end': 44}]
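
The output is per token, so "Ibaraki University" arrives as four pieces (`I`, `##bara`, `##ki`, `University`). A minimal sketch of grouping them back into entities using the character offsets from the output above; `merge_entities` is a hypothetical helper, and the scores are dropped for brevity.

```python
text = "Mr.Tanaka is a student at Ibaraki University."
# Token-level output copied from above (scores and word pieces dropped for brevity).
tokens = [
    {"entity": "I-PER", "index": 3, "start": 3, "end": 9},
    {"entity": "I-ORG", "index": 8, "start": 26, "end": 27},
    {"entity": "I-ORG", "index": 9, "start": 27, "end": 31},
    {"entity": "I-ORG", "index": 10, "start": 31, "end": 33},
    {"entity": "I-ORG", "index": 11, "start": 34, "end": 44},
]


def merge_entities(tokens, text):
    """Group consecutive tokens with the same entity type into one span."""
    spans = []
    for tok in tokens:
        kind = tok["entity"].split("-")[-1]            # "I-ORG" -> "ORG"
        last = spans[-1] if spans else None
        if last and last["type"] == kind and tok["index"] == last["index"] + 1:
            last["end"] = tok["end"]                   # extend the current span
            last["index"] = tok["index"]
        else:
            spans.append({"type": kind, "start": tok["start"],
                          "end": tok["end"], "index": tok["index"]})
    # Read each merged span back from the original text.
    return [(s["type"], text[s["start"]:s["end"]]) for s in spans]


entities = merge_entities(tokens, text)
print(entities)
```

Recent versions of transformers can do a similar grouping inside the pipeline through the `aggregation_strategy` argument (for example `aggregation_strategy="simple"`), which is probably the more practical route.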

* dslim/bert-base-NER: a model trained with BERT and CoNLL-2003

In [None]:
# net2 = pipeline('ner',model='dslim/bert-base-NER')
# net2(text)

### Summarization
* Default pre-trained model: sshleifer/distilbart-cnn-12-6
* A model trained using BART (a pre-trained model used for text generation) and the CNN/DailyMail dataset

In [26]:
net = pipeline('summarization')
doc = """We introduce a new language representation 
model called BERT, which stands for Bidirectional Encoder 
Representations from Transformers. Unlike recent language 
representation models, BERT is designed to pre-train deep 
bidirectional representations from unlabeled text by 
jointly conditioning on both left and right context in all 
layers. As a result, the pre-trained BERT model can be 
fine-tuned with just one additional output layer to create 
state-of-the-art models for a wide range of tasks, such as 
question answering and language inference, without 
substantial task-specific architecture modifications. BERT 
is conceptually simple and empirically powerful. It obtains 
new state-of-the-art results on eleven natural language 
processing tasks, including pushing the GLUE score to 80.5% 
(7.7% point absolute improvement), MultiNLI accuracy to 
86.7% (4.6% absolute improvement), SQuAD v1.1 question 
answering Test F1 to 93.2 (1.5 point absolute improvement) 
and SQuAD v2.0 Test F1 to 83.1 (5.1 point absolute 
improvement)."""
net(doc, max_length=50, min_length=20)  # set the maximum and minimum length of the summary

No model was supplied, defaulted to sshleifer/distilbart-cnn-12-6 and revision a4f8f3e (https://huggingface.co/sshleifer/distilbart-cnn-12-6).
Using a pipeline without specifying a model name and revision in production is not recommended.



[{'summary_text': ' BERT is designed to pre-train deep bidirectional representations from unlabeled text by conditioning on both left and right context in all layers . The pre-trained BERT model can be fine-tuned with just one'}]

### Question answering
* Default pre-trained model: distilbert-base-cased-distilled-squad
* A model trained using the pre-trained model DistilBERT and the SQuAD dataset

In [27]:
net = pipeline('question-answering')
q = "How many tasks did BERT get SOTA on?"
net(context=doc, question=q)

No model was supplied, defaulted to distilbert-base-cased-distilled-squad and revision 626af31 (https://huggingface.co/distilbert-base-cased-distilled-squad).
Using a pipeline without specifying a model name and revision in production is not recommended.



{'score': 0.7516631484031677, 'start': 716, 'end': 722, 'answer': 'eleven'}

### Text generation
* Default pre-trained model: GPT-2
* Given the beginning of a text, generates a plausible continuation

In [28]:
net = pipeline('text-generation')
net("In this paper, we propose a new")

No model was supplied, defaulted to gpt2 and revision 6c0e608 (https://huggingface.co/gpt2).
Using a pipeline without specifying a model name and revision in production is not recommended.



Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[{'generated_text': 'In this paper, we propose a new approach based on the observation that the rate of weight loss in adult Chinese women is almost 3 times that in adult men, in comparison to the rate of weight loss in Western women over the same time period. This'}]

### Zero-shot classification
* Omitted