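# Text Sentiment Analysis

Text sentiment analysis is an important research area in natural language processing (NLP). It identifies the sentiment or attitude that the author expresses in a piece of text, usually labeled as "positive" or "negative". It is widely used in social media mining, analysis of order reviews on e-commerce platforms, movie review analysis, and similar applications.

To quantify sentiment polarity, a text is typically labeled with a floating-point number in [0, 1]: the closer to 1, the more positive the sentiment; the closer to 0, the more negative.

This exercise performs sentiment analysis on short Chinese sentences using BERT.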
 

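## Dataset

The dataset consists of hotel reviews collected by Tan Songbo from a hotel booking website: more than 7,000 reviews in total, of which over 5,000 are positive and over 2,000 are negative.

Data format:

| Field | label | review |
| ---- | ------- | ---------- |
| Meaning | Sentiment label (1 = positive, 0 = negative) | Review text |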
 
 
## Pre-trained Model

This exercise uses the Chinese **BERT-Base, Chinese** pre-trained model, which can be downloaded from [BERT-Base, Chinese](https://storage.googleapis.com/bert_models/2018_11_03/chinese_L-12_H-768_A-12.zip) and unzipped for use.

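## Introduction to BERT

BERT (Bidirectional Encoder Representations from Transformers) is a pre-trained NLP model proposed by Google in the paper [BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding](https://arxiv.org/abs/1810.04805), released in October 2018.

BERT pre-trains deep bidirectional representations by jointly conditioning on both left and right context in all Transformer layers. The pre-trained model can then be fine-tuned with just one additional output layer for a wide range of tasks, without substantial task-specific changes to the architecture. Its strength rests on two points. First, it introduces two new pre-training tasks, the Masked Language Model (MLM) and Next Sentence Prediction (NSP), which capture word-level and sentence-level representations respectively. Second, it is trained with large amounts of data and compute: the training data consists of the English open-source BooksCorpus and English Wikipedia, about 3.3 billion words in total; the base model has about 110 million parameters and the large model about 340 million; and pre-training BERT-Large took 4 days on 64 TPU chips, where a single TPU is estimated here to be roughly 7 to 8 times as fast as a mainstream GPU of the time. The Google team open-sourced several pre-trained models for different downstream tasks: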

- [BERT-Base, Uncased](https://storage.googleapis.com/bert_models/2018_10_18/uncased_L-12_H-768_A-12.zip): 12-layer, 768-hidden, 12-heads, 110M parameters
- [BERT-Large, Uncased](https://storage.googleapis.com/bert_models/2018_10_18/uncased_L-24_H-1024_A-16.zip): 24-layer, 1024-hidden, 16-heads, 340M parameters
- [BERT-Base, Cased](https://storage.googleapis.com/bert_models/2018_10_18/cased_L-12_H-768_A-12.zip): 12-layer, 768-hidden, 12-heads, 110M parameters
- [BERT-Large, Cased](https://storage.googleapis.com/bert_models/2018_10_18/cased_L-24_H-1024_A-16.zip): 24-layer, 1024-hidden, 16-heads, 340M parameters
- [BERT-Base, Multilingual Cased (New, recommended)](https://storage.googleapis.com/bert_models/2018_11_23/multi_cased_L-12_H-768_A-12.zip): 104 languages, 12-layer, 768-hidden, 12-heads, 110M parameters
- [BERT-Base, Multilingual Uncased (Orig, not recommended)](https://storage.googleapis.com/bert_models/2018_11_03/multilingual_L-12_H-768_A-12.zip)(Not recommended, use Multilingual Cased instead): 102 languages, 12-layer, 768-hidden, 12-heads, 110M parameters
- [BERT-Base, Chinese](https://storage.googleapis.com/bert_models/2018_11_03/chinese_L-12_H-768_A-12.zip): Chinese Simplified and Traditional, 12-layer, 768-hidden, 12-heads, 110M parameters

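The first four are English models, the Multilingual models cover multiple languages, and the last one is the Chinese model (character-level only). Uncased means all letters are converted to lowercase, while Cased preserves case. Here, layer is the number of layers (Transformer blocks), hidden is the hidden vector size, and heads is the number of self-attention heads.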

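### Feature Extractor

As its full name (Bidirectional Encoder Representations from Transformers) indicates, BERT uses the Transformer, more precisely the Transformer encoder, as its feature extractor. At the time of BERT's release, the Transformer was among the strongest feature extractors in NLP.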


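### Two Key Steps in BERT Pre-training

#### Masked Language Model

To train deep bidirectional representations, the authors take a very direct approach: mask some of the words in a sentence and have the encoder predict what they are.

In each sequence, 15% of the tokens are selected for prediction, and each selected token is handled as follows:

1) 80% of the time, the token is replaced with the ***[MASK]*** token

my dog is ***hairy*** → my dog is ***[MASK]***

2) 10% of the time, the token is replaced with a random token

my dog is ***hairy*** → my dog is ***apple***

3) 10% of the time, the token is left unchanged

my dog is ***hairy*** → my dog is ***hairy***

The authors note that the benefit of this scheme is that the encoder does not know which tokens it will be asked to predict or which have been replaced by random tokens, so it is forced to learn a contextual representation for every input token. They also point out the cost: because only 15% of the tokens in each batch are predicted, the model needs more pre-training steps to converge than a unidirectional (left-to-right) model that predicts every token.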
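The masking rule above can be sketched in a few lines of Python. This is a minimal illustration, not the code from the BERT repository: the function name `mask_tokens`, the toy vocabulary, and the per-token independent selection are assumptions made here for brevity (the original implementation selects a bounded number of positions per sequence).

```python
import random

MASK_TOKEN = "[MASK]"


def mask_tokens(tokens, vocab, mask_prob=0.15, seed=0):
    """Apply the 80/10/10 masking rule to a token list (hypothetical helper).

    Returns (masked_tokens, targets), where targets maps each selected
    position to the original token the model would have to predict.
    """
    rng = random.Random(seed)
    masked = list(tokens)
    targets = {}
    for i, token in enumerate(tokens):
        if rng.random() >= mask_prob:
            continue  # position not selected for prediction
        targets[i] = token
        roll = rng.random()
        if roll < 0.8:
            masked[i] = MASK_TOKEN         # 80%: replace with [MASK]
        elif roll < 0.9:
            masked[i] = rng.choice(vocab)  # 10%: replace with a random token
        # remaining 10%: keep the original token unchanged
    return masked, targets


tokens = "my dog is hairy and my cat is fluffy".split()
toy_vocab = ["apple", "dog", "cat", "is", "my", "hairy", "fluffy", "and"]
masked, targets = mask_tokens(tokens, toy_vocab, seed=1)
print(masked)
print(targets)
```

Positions that are not in `targets` are always left untouched, so the loss would only be computed on the selected 15% of tokens.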

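#### Next Sentence Prediction (NSP)

A binary classifier is pre-trained to learn the relationship between sentences, which helps downstream tasks that depend on how sentences relate to each other.

Training method: positive and negative samples are balanced 1:1. For each sentence pair (A, B), 50% of the time B is the sentence that actually follows A (positive sample), and 50% of the time B is a random sentence from the corpus (negative sample).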
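As a rough sketch of how such training pairs could be built, the example below draws the second sentence either from the next position in the same document or from a different document. The helper name `make_nsp_pairs` and the toy documents are assumptions for illustration only, and the sketch needs at least two documents.

```python
import random


def make_nsp_pairs(documents, seed=0):
    """Build (sentence_a, sentence_b, is_next) triples (hypothetical helper).

    documents is a list of documents; each document is a list of sentences.
    """
    rng = random.Random(seed)
    pairs = []
    for doc_idx, doc in enumerate(documents):
        for i in range(len(doc) - 1):
            if rng.random() < 0.5:
                # Positive sample: B really follows A
                pairs.append((doc[i], doc[i + 1], 1))
            else:
                # Negative sample: B is a random sentence from another document
                others = [d for j, d in enumerate(documents) if j != doc_idx]
                pairs.append((doc[i], rng.choice(rng.choice(others)), 0))
    return pairs


docs = [["a1", "a2", "a3"], ["b1", "b2"]]
for sentence_a, sentence_b, is_next in make_nsp_pairs(docs, seed=0):
    print(sentence_a, sentence_b, is_next)
```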


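## Prepare the Development Environment on Huawei Cloud ModelArts

### Open ModelArts

Go to https://www.huaweicloud.com/product/modelarts.html to open the ModelArts home page. Click the "Use Now" button, log in with your username and password, and enter the ModelArts console.

### Create a ModelArts Notebook

Next, create a notebook development environment in ModelArts. A ModelArts notebook provides a web-based Python development environment in which you can conveniently write and run code and view the results.

Step 1: On the ModelArts main page, click "Development Environment" and then "Create".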

![create_nb_create_button](./img/create_nb_create_button.png)

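Step 2: Fill in the notebook parameters:

| Parameter | Description |
| --- | --- |
| Billing mode | Pay-per-use |
| Name | Notebook instance name, e.g. text_sentiment_analysis |
| Work environment | Python3 |
| Resource pool | Select "Public resource pool" |
| Type | This case uses a fairly complex deep neural network that needs substantial compute; select "GPU" |
| Flavor | Select "8 cores &#124; 64GiB &#124; 1*p100" |
| Storage | Select EVS with a 5 GB disk |

Step 3: After configuring the notebook parameters, click Next to preview the notebook information. Once everything is confirmed, click "Create Now".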

![create_nb_creation_summary](./img/create_nb_creation_summary.png)

Step 4: Return to the development environment page, wait until the notebook has been created, then open it to continue.
![modelarts_notebook_index](./img/modelarts_notebook_index.png)

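### Create a Development Environment in ModelArts

Next, create the actual development environment used in the following steps.

Step 1: Click the "Open" button shown below to enter the notebook you just created.
![enter_dev_env](img/enter_dev_env.png)

Step 2: Create a Python 3 notebook. Click "New" in the upper-right corner and select the TensorFlow 1.13.1 environment.

Step 3: Click the file name "Untitled" in the upper-left corner and enter a name related to this exercise, such as "text_sentiment_analysis".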
![notebook_untitled_filename](./img/notebook_untitled_filename.png)
![notebook_name_the_ipynb](./img/notebook_name_the_ipynb.png)


### Write and Run Code in the Notebook

In the notebook, enter a simple print statement and click the Run button at the top to see the result:
![run_helloworld](./img/run_helloworld.png)


### Download the Dataset

In [1]:
from modelarts.session import Session
import os
session = Session()

if not os.path.exists('./text_sentiment_analysis/data'):
    print("start download data.")
    session.download_data(bucket_path="ai-course-common-26/text_sentiment_analysis/text_sentiment_analysis.tar.gz"
                          , path="./text_sentiment_analysis.tar.gz")

    # Extract the archive with tar
    !tar xf ./text_sentiment_analysis.tar.gz

    # Delete the archive with rm
    !rm ./text_sentiment_analysis.tar.gz

start download data.
Successfully download file ai-course-common-26/text_sentiment_analysis/text_sentiment_analysis.tar.gz from OBS to local ./text_sentiment_analysis.tar.gz


### Import Dependencies

In [2]:
import tensorflow as tf
from tensorflow import keras
import os
import re
from sklearn.model_selection import train_test_split
import pandas as pd

# Set the TensorFlow logging level to INFO
tf.logging.set_verbosity(tf.logging.INFO)

print('System dependencies imported successfully!')

System dependencies imported successfully!


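Add the BERT source directory to the system path so that the BERT source code can be imported.

In [3]:
import sys
sys.path.append('text_sentiment_analysis/bert')

Import the BERT utility modules

In [4]:
import tokenization
import modeling
import optimization

print('BERT utility modules imported successfully!')

BERT utility modules imported successfully!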


### Set Model and Data Paths

Set the paths of the BERT model files, the training data, and the model output.

In [5]:
# BERT model files
vocab_file = 'text_sentiment_analysis/model/chinese_L-12_H-768_A-12/vocab.txt'
bert_config_file = 'text_sentiment_analysis/model/chinese_L-12_H-768_A-12/bert_config.json'
init_checkpoint = 'text_sentiment_analysis/model/chinese_L-12_H-768_A-12/bert_model.ckpt'

# Dataset path
data_dir = 'text_sentiment_analysis/data/'

# Output location for the trained model
output_dir = 'text_sentiment_analysis/output/'

print("Dataset path:", data_dir)
print("Output path:", output_dir)
print("Chinese vocabulary path:", vocab_file)
print("BERT config file path:", bert_config_file)
print("Pre-trained checkpoint path:", init_checkpoint)

Dataset path: text_sentiment_analysis/data/
Output path: text_sentiment_analysis/output/
Chinese vocabulary path: text_sentiment_analysis/model/chinese_L-12_H-768_A-12/vocab.txt
BERT config file path: text_sentiment_analysis/model/chinese_L-12_H-768_A-12/bert_config.json
Pre-trained checkpoint path: text_sentiment_analysis/model/chinese_L-12_H-768_A-12/bert_model.ckpt


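### Set Training Hyperparameters

In [6]:
batch_size = 32 # batch size
learning_rate = 2e-5 # learning rate
num_train_epochs = 3 # number of training epochs

# Warmup proportion. During the warmup phase the learning rate starts small and increases gradually, which helps stabilize training.
warmup_proportion = 0.1

# How often (in steps) to save checkpoints and summaries
save_checkpoints_steps = 500
save_summary_steps = 100

print("Batch size:", batch_size)
print("Training epochs:", num_train_epochs)
print("Warmup proportion:", warmup_proportion)
print("Learning rate:", learning_rate)
print("Checkpoint save frequency (steps):", save_checkpoints_steps)
print("Summary save frequency (steps):", save_summary_steps)


Batch size: 32
Training epochs: 3
Warmup proportion: 0.1
Learning rate: 2e-05
Checkpoint save frequency (steps): 500
Summary save frequency (steps): 100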
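To make the role of `warmup_proportion` more concrete, here is a small sketch of a schedule with linear warmup followed by linear decay to zero. It is an illustration under assumptions, not a copy of BERT's `optimization.py`: the function name `learning_rate_at` is made up here, and the 300 total steps correspond to 3,200 training examples, a batch size of 32, and 3 epochs.

```python
def learning_rate_at(step, init_lr, num_train_steps, num_warmup_steps):
    """Learning rate at a given step (hypothetical helper, sketch only)."""
    if step < num_warmup_steps:
        # Warmup: ramp up linearly from 0 towards init_lr
        return init_lr * step / num_warmup_steps
    # After warmup: linear decay that reaches 0 at num_train_steps
    return init_lr * (1.0 - step / num_train_steps)


num_train_steps = 300                          # assumed: 3200 / 32 * 3
num_warmup_steps = int(num_train_steps * 0.1)  # warmup_proportion = 0.1

for step in (0, 15, 30, 150, 300):
    print(step, learning_rate_at(step, 2e-5, num_train_steps, num_warmup_steps))
```

With these numbers the first 30 steps are warmup steps, and the learning rate is highest right after the warmup ends.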


### Load the Dataset

In [7]:
# Build a balanced dataset (roughly equal numbers of each label)
def get_balance_corpus(corpus_size, corpus_pos, corpus_neg):
    sample_size = corpus_size // 2
    # Sample with replacement only when a class has fewer rows than requested
    pd_corpus_balance = pd.concat([corpus_pos.sample(sample_size, replace=corpus_pos.shape[0]<sample_size),
                                   corpus_neg.sample(sample_size, replace=corpus_neg.shape[0]<sample_size)])

    print('Number of reviews (total): %d' % pd_corpus_balance.shape[0])
    print('Number of reviews (positive): %d' % pd_corpus_balance[pd_corpus_balance.label==1].shape[0])
    print('Number of reviews (negative): %d' % pd_corpus_balance[pd_corpus_balance.label==0].shape[0])

    return pd_corpus_balance

# Read the dataset file
reviews_all = pd.read_csv(data_dir + 'ChnSentiCorp_htl_all.csv')

pd_positive = reviews_all[reviews_all.label==1]
pd_negative = reviews_all[reviews_all.label==0]

# Use a balanced dataset so the model is not biased towards the majority class
reviews_4000 = get_balance_corpus(4000, pd_positive, pd_negative)

# Split into training and test sets
train, test = train_test_split(reviews_4000, test_size=0.2)

print('Data loaded and split!')

Number of reviews (total): 4000
Number of reviews (positive): 2000
Number of reviews (negative): 2000
Data loaded and split!


Total dataset size

In [8]:
len(reviews_all)

7765

Show samples from the training set

In [9]:
train.sample(10)

| | label | review |
| --- | --- | --- |
| 5698 | 0 | 旁楼的房间千万不能住，房间太差，洗手间居然发现蟑螂，价格也不便宜，很不爽，主楼的房间到还可以 |
| 5080 | 1 | 前台服务态度很好很细心，但是硬件设施由于跟不上，老化。 |
| 1161 | 1 | 在当地这已经算是不错的了，看宣传资料：国家领导也下榻这里，夫复何求？由于到时较晚，餐厅备料几... |
| 4101 | 1 | 入住时碰到一群北京来的醉鬼在大堂舞醉拳,有个哥们指着女接待员的鼻子在骂"你妈X",而且骂了好... |
| 3702 | 1 | 不错的酒店！就是入住时间得不到保证！下午3点以后到达仍说没房，几乎操国骂才拿到钥匙。 |
| 206 | 1 | 1.预定标准间，抵达后，由于没有房间，免费升级到套房。赞一个2.房间很安静3.位于郑州比较靠... |
| 3309 | 1 | 我每次到惠州都住在这家酒店，今年发现有二点要改进：1、拖鞋太差，一穿就抽丝，还缠在脚上。2、... |
| 7440 | 0 | 相对来说，比起东莞，深圳的酒店，该酒店服务质量还是略有小疵。略欠热情与礼貌，有点爱搭不理的。... |
| 5154 | 1 | 地理位置不错,房间在香港来说,算大的,特别是卫生间很大,硬件比较旧.住这里只是涂个方便.只是... |
| 6274 | 0 | 个人感觉：1、条件差，房间小。2、服务不到位。钥匙老是需要到大堂刷卡。3、楼下是KTV,很吵... |


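### Load the Chinese Vocabulary of the Pre-trained BERT Model

In [10]:
# Note: BERT's tokenization.py lists chinese_L-12_H-768_A-12 among the lower-cased checkpoints,
# so do_lower_case=True is the matching setting; False is kept here to reproduce the original run.
tokenizer = tokenization.FullTokenizer(vocab_file=vocab_file, do_lower_case=False)

The tokenizer splits a sentence into individual characters.

Here is an example: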

In [11]:
tokenizer.tokenize("今天的天气真好！")

['今', '天', '的', '天', '气', '真', '好', '！']
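The character-level behaviour can be approximated with a few lines of plain Python. This is a simplified sketch, not BERT's `FullTokenizer`: the helper names `is_cjk` and `split_chars` are made up here, only the main CJK Unified Ideographs block is checked, and no WordPiece step is applied to non-Chinese words.

```python
def is_cjk(ch):
    """True for characters in the main CJK Unified Ideographs block (simplified check)."""
    return 0x4E00 <= ord(ch) <= 0x9FFF


def split_chars(text):
    """Split text so that each CJK character and punctuation mark is its own token."""
    tokens, word = [], ""
    for ch in text:
        if is_cjk(ch) or not ch.isalnum():
            if word:                 # flush the pending non-CJK word
                tokens.append(word)
                word = ""
            if not ch.isspace():     # whitespace only separates tokens
                tokens.append(ch)
        else:
            word += ch               # letters and digits stay together
    if word:
        tokens.append(word)
    return tokens


print(split_chars("今天的天气真好！"))
print(split_chars("BERT模型 v2"))
```

Because every Chinese character becomes its own token, the sequence length of a review is close to its character count, which matters for the 128-token limit used later.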

### Define the Data Classes

In [12]:
# Raw input example
class InputExample(object):

  def __init__(self, guid, text_a, text_b=None, label=None):
    self.guid = guid
    self.text_a = text_a
    self.text_b = text_b
    self.label = label

# Feature class in the form BERT consumes
class InputFeatures(object):

  def __init__(self,
               input_ids,
               input_mask,
               segment_ids,
               label_id,
               is_real_example=True):
    self.input_ids = input_ids
    self.input_mask = input_mask
    self.segment_ids = segment_ids
    self.label_id = label_id
    self.is_real_example = is_real_example
    
    
# Placeholder class for padding examples
class PaddingInputExample(object):
    pass


Extract the data and convert it to InputExample objects

In [13]:
# Names of the data column and the label column
DATA_COLUMN = 'review'
LABEL_COLUMN = 'label'

train_InputExamples = train.apply(lambda x: InputExample(guid=None, # globally unique ID, not used here
                                                         text_a = x[DATA_COLUMN], 
                                                         text_b = None, 
                                                         label = x[LABEL_COLUMN]), axis = 1)

test_InputExamples = test.apply(lambda x: InputExample(guid=None, 
                                                       text_a = x[DATA_COLUMN], 
                                                       text_b = None, 
                                                       label = x[LABEL_COLUMN]), axis = 1)

Convert the InputExample objects to InputFeatures.

The log prints the first five converted InputFeatures examples for inspection. Each number in input_ids is the zero-based line index of the token in the vocabulary file (vocab.txt).

In [14]:
# Sequence-pair truncation
def truncate_seq_pair(tokens_a, tokens_b, max_length):
    """Truncate a pair of sequences in place so that their total length does not exceed max_length."""
    while True:
        total_length = len(tokens_a) + len(tokens_b)
        if total_length <= max_length:
            break
        if len(tokens_a) > len(tokens_b):
            tokens_a.pop()
        else:
            tokens_b.pop()


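# convert_single_example converts one sentence into the inputs BERT expects: token IDs (input_ids),
# an attention mask (input_mask), and segment IDs (segment_ids), plus the label ID of the sentence.
# It also adds the [CLS] marker at the start and the [SEP] marker at the end.
def convert_single_example(ex_index, example, label_list, max_seq_length,
                           tokenizer):
    """Convert a single InputExample into a single InputFeatures."""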

    if isinstance(example, PaddingInputExample):
        return InputFeatures(
            input_ids=[0] * max_seq_length,
            input_mask=[0] * max_seq_length,
            segment_ids=[0] * max_seq_length,
            label_id=0,
            is_real_example=False)
    
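    # Convert the tokens and the label of one example to IDs and pack them into an InputFeatures object
    label_map = {}
    # Map each label to its index in label_list, starting from 0
    for (i, label) in enumerate(label_list):
        label_map[label] = i

    # Chinese text is split into characters; strings that are not in BERT's vocab.txt go through WordPiece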
    tokens_a = tokenizer.tokenize(example.text_a)
    tokens_b = None
    if example.text_b:
        tokens_b = tokenizer.tokenize(example.text_b)

    if tokens_b:
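        # Truncate tokens_a + tokens_b so that the total does not exceed the allowed length
        # "- 3" accounts for the three special tokens [CLS], [SEP], [SEP]
        truncate_seq_pair(tokens_a, tokens_b, max_seq_length - 3)
    else:
        # "- 2" accounts for the two special tokens [CLS] and [SEP]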
        if len(tokens_a) > max_seq_length - 2:
            tokens_a = tokens_a[0:(max_seq_length - 2)]

    tokens = []
    segment_ids = []
    tokens.append("[CLS]") # add the [CLS] marker at the start of the sequence
    segment_ids.append(0)
    for token in tokens_a:
        tokens.append(token)
        segment_ids.append(0)
    tokens.append("[SEP]") # add the [SEP] marker at the end of the sentence
    segment_ids.append(0)

    if tokens_b:
        for token in tokens_b:
            tokens.append(token)
            segment_ids.append(1)
        tokens.append("[SEP]")
        segment_ids.append(1)

    input_ids = tokenizer.convert_tokens_to_ids(tokens)  # convert the tokens to their vocabulary IDs
    # Attention mask: 1 for real tokens, 0 for padding
    input_mask = [1] * len(input_ids)

    # Pad the remaining positions with 0
    while len(input_ids) < max_seq_length:
        input_ids.append(0)
        input_mask.append(0)
        segment_ids.append(0)

    assert len(input_ids) == max_seq_length
    assert len(input_mask) == max_seq_length
    assert len(segment_ids) == max_seq_length

    label_id = label_map[example.label]
    
    # Log the first 5 examples
    if ex_index < 5:
        tf.logging.info("*** Example ***")
        tf.logging.info("guid: %s" % (example.guid)) # ID of the example
        tf.logging.info("tokens: %s" % " ".join([tokenization.printable_text(x) for x in tokens])) # each character is one token
        tf.logging.info("input_ids: %s" % " ".join([str(x) for x in input_ids]))  # token IDs, used to look up token embeddings
        tf.logging.info("input_mask: %s" % " ".join([str(x) for x in input_mask])) # attention mask: 1 = real token, 0 = padding
        tf.logging.info("segment_ids: %s" % " ".join([str(x) for x in segment_ids])) # segment IDs, used to look up segment embeddings
        tf.logging.info("label: %s (id = %d)" % (example.label, label_id)) # label and its ID

    # Pack everything into an InputFeatures object
    feature = InputFeatures(
        input_ids=input_ids,
        input_mask=input_mask,
        segment_ids=segment_ids,
        label_id=label_id,
        is_real_example=True)
    return feature

# Convert a collection of InputExample objects to InputFeatures
def convert_examples_to_features(examples, label_list, max_seq_length, tokenizer):

  features = []
  for (ex_index, example) in enumerate(examples):
    if ex_index % 10000 == 0:
      tf.logging.info("Writing example %d of %d" % (ex_index, len(examples)))

    feature = convert_single_example(ex_index, example, label_list,
                                     max_seq_length, tokenizer)

    features.append(feature)
  return features


# List of labels
label_list = [0, 1]

# Maximum sequence length in tokens
max_seq_length = 128

# Convert the InputExample data to the InputFeatures format that BERT consumes
train_features = convert_examples_to_features(train_InputExamples, label_list, max_seq_length, tokenizer)
test_features = convert_examples_to_features(test_InputExamples, label_list, max_seq_length, tokenizer)

INFO:tensorflow:Writing example 0 of 3200
INFO:tensorflow:*** Example ***
INFO:tensorflow:guid: None
INFO:tensorflow:tokens: [CLS] 非 常 差 的 一 家 酒 店 , 虽 然 前 台 小 姐 开 房 时 对 你 郑 重 声 明 是 " 二 星 级 " . 进 门 , 房 间 里 一 股 味 道 扑 面 而 来 , 往 墙 上 插 卡 取 电 ? 没 用 , 干 脆 直 接 按 电 钮 开 灯 ; 进 卫 生 间 洗 澡 打 开 水 龙 头 , 水 竟 然 从 管 子 里 喷 出 来 , 仔 细 一 看 , 原 来 淋 浴 器 的 管 子 和 龙 头 部 位 已 经 断 裂 成 两 部 分 , 酒 店 用 黑 胶 布 简 单 的 [SEP]
INFO:tensorflow:input_ids: 101 7478 2382 2345 4638 671 2157 6983 2421 117 6006 4197 1184 1378 2207 1995 2458 2791 3198 2190 872 6948 7028 1898 3209 3221 107 753 3215 5277 107 119 6822 7305 117 2791 7313 7027 671 5500 1456 6887 2800 7481 5445 3341 117 2518 1870 677 2991 1305 1357 4510 136 3766 4500 117 2397 5546 4684 2970 2902 4510 7175 2458 4128 132 6822 1310 4495 7313 3819 4074 2802 2458 3717 7987 1928 117 3717 4994 4197 794 5052 2094 7027 1613 1139 3341 117 798 5301 671 4692 117 1333 3341 3900 3861 1690 4638 5052 2094 1469 7987 1928 6956 855 2347 5307 3171 6162 2768 697 6956 1146 117 6983 2421 4500 794

INFO:tensorflow:label: 1 (id = 1)
INFO:tensorflow:*** Example ***
INFO:tensorflow:guid: None
INFO:tensorflow:tokens: [CLS] 非 常 物 有 所 值 , 入 住 后 房 间 内 很 舒 适 , 只 是 周 围 环 境 嘈 杂 , 前 台 服 务 小 姐 非 常 热 情 . [SEP]
INFO:tensorflow:input_ids: 101 7478 2382 4289 3300 2792 966 117 1057 857 1400 2791 7313 1079 2523 5653 6844 117 1372 3221 1453 1741 4384 1862 1648 3325 117 1184 1378 3302 1218 2207 1995 7478 2382 4178 2658 119 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
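The structure of the logged features can be reproduced on a toy scale. The sketch below mirrors the single-sentence path of the conversion (truncate, add `[CLS]` and `[SEP]`, map to IDs, pad with zeros); the helper name `build_features` and the eight-entry vocabulary with its IDs are hypothetical and unrelated to the real `vocab.txt`.

```python
def build_features(tokens, vocab, max_seq_length):
    """Return (input_ids, input_mask, segment_ids) for one sentence (sketch only)."""
    tokens = tokens[: max_seq_length - 2]       # leave room for [CLS] and [SEP]
    tokens = ["[CLS]"] + tokens + ["[SEP]"]
    input_ids = [vocab.get(t, vocab["[UNK]"]) for t in tokens]
    input_mask = [1] * len(input_ids)           # 1 = real token
    segment_ids = [0] * len(input_ids)          # single sentence: all segment 0
    padding = max_seq_length - len(input_ids)
    input_ids += [0] * padding                  # pad with ID 0
    input_mask += [0] * padding                 # 0 = padding, ignored by attention
    segment_ids += [0] * padding
    return input_ids, input_mask, segment_ids


# Hypothetical toy vocabulary; real IDs come from vocab.txt
toy_vocab = {"[PAD]": 0, "[UNK]": 1, "[CLS]": 2, "[SEP]": 3,
             "服": 4, "务": 5, "还": 6, "行": 7}

ids, mask, segments = build_features(list("服务还行"), toy_vocab, max_seq_length=8)
print(ids)       # [2, 4, 5, 6, 7, 3, 0, 0]
print(mask)      # [1, 1, 1, 1, 1, 1, 0, 0]
print(segments)  # [0, 0, 0, 0, 0, 0, 0, 0]
```

All three lists always have length `max_seq_length`, which is what the assertions in `convert_single_example` check.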

### Load the BERT Model Configuration

In [15]:
bert_config = modeling.BertConfig.from_json_file(bert_config_file)

!cat $bert_config_file # print the BERT network configuration

{
  "attention_probs_dropout_prob": 0.1, 
  "directionality": "bidi", 
  "hidden_act": "gelu", 
  "hidden_dropout_prob": 0.1, 
  "hidden_size": 768, 
  "initializer_range": 0.02, 
  "intermediate_size": 3072, 
  "max_position_embeddings": 512, 
  "num_attention_heads": 12, 
  "num_hidden_layers": 12, 
  "pooler_fc_size": 768, 
  "pooler_num_attention_heads": 12, 
  "pooler_num_fc_layers": 3, 
  "pooler_size_per_head": 128, 
  "pooler_type": "first_token_transform", 
  "type_vocab_size": 2, 
  "vocab_size": 21128
}


### Build the Model

In [16]:
# Create a classification model
def create_model(bert_config, is_training, input_ids, input_mask, segment_ids,
                 labels, num_labels, use_one_hot_embeddings):
  # Build the BERT model; its weights are restored from the pre-trained checkpoint in model_fn
  model = modeling.BertModel(
      config=bert_config,
      is_training=is_training,
      input_ids=input_ids,
      input_mask=input_mask,
      token_type_ids=segment_ids,
      use_one_hot_embeddings=use_one_hot_embeddings)

  output_layer = model.get_pooled_output()
  hidden_size = output_layer.shape[-1].value

  output_weights = tf.get_variable(
      "output_weights", [num_labels, hidden_size],
      initializer=tf.truncated_normal_initializer(stddev=0.02))

  output_bias = tf.get_variable(
      "output_bias", [num_labels], initializer=tf.zeros_initializer())

  with tf.variable_scope("loss"):
    if is_training:
      # Apply dropout with a rate of 0.1 (keep_prob=0.9)
      output_layer = tf.nn.dropout(output_layer, keep_prob=0.9)

    logits = tf.matmul(output_layer, output_weights, transpose_b=True)
    logits = tf.nn.bias_add(logits, output_bias)
    # Compute softmax probabilities and log-probabilities
    probabilities = tf.nn.softmax(logits, axis=-1)
    log_probs = tf.nn.log_softmax(logits, axis=-1)

    one_hot_labels = tf.one_hot(labels, depth=num_labels, dtype=tf.float32)

    per_example_loss = -tf.reduce_sum(one_hot_labels * log_probs, axis=-1)
    # Mean cross-entropy loss over the batch
    loss = tf.reduce_mean(per_example_loss)
    return (loss, per_example_loss, logits, probabilities)


# Return a `model_fn` closure for tf.estimator.Estimator
def model_fn_builder(bert_config, num_labels, init_checkpoint, learning_rate,
                     num_train_steps, num_warmup_steps):

  # Build the model
  def model_fn(features, labels, mode, params):

    tf.logging.info("*** Features ***")
    for name in sorted(features.keys()):
      tf.logging.info("  name = %s, shape = %s" % (name, features[name].shape))

    input_ids = features["input_ids"]
    input_mask = features["input_mask"]
    segment_ids = features["segment_ids"]
    label_ids = features["label_ids"]
    is_real_example = None
    if "is_real_example" in features:
      is_real_example = tf.cast(features["is_real_example"], dtype=tf.float32)
    else:
      is_real_example = tf.ones(tf.shape(label_ids), dtype=tf.float32)

    is_training = (mode == tf.estimator.ModeKeys.TRAIN)
    use_one_hot_embeddings = False

    # Build the model from the inputs: input_ids are the token IDs of the examples, label_ids are the label IDs
    (total_loss, per_example_loss, logits, probabilities) = create_model(
        bert_config, is_training, input_ids, input_mask, segment_ids, label_ids,
        num_labels, use_one_hot_embeddings)

    tvars = tf.trainable_variables()
    initialized_variable_names = {}
    scaffold_fn = None
    # Initialize variables from the pre-trained BERT checkpoint
    if init_checkpoint:
      (assignment_map, initialized_variable_names
      ) = modeling.get_assignment_map_from_checkpoint(tvars, init_checkpoint)
    
      tf.train.init_from_checkpoint(init_checkpoint, assignment_map)

    tf.logging.info("**** Trainable Variables ****")
    for var in tvars:
      init_string = ""
      if var.name in initialized_variable_names:
        init_string = ", *INIT_FROM_CKPT*"
      tf.logging.info("  name = %s, shape = %s%s", var.name, var.shape, init_string)

    output_spec = None
    # Training mode
    if mode == tf.estimator.ModeKeys.TRAIN:

      train_op = optimization.create_optimizer(
          total_loss, learning_rate, num_train_steps, num_warmup_steps, use_tpu=False)

      output_spec = tf.estimator.EstimatorSpec(
          mode=mode,
          loss=total_loss,
          train_op=train_op)
    # Evaluation mode
    elif mode == tf.estimator.ModeKeys.EVAL:

      def metric_fn(per_example_loss, label_ids, logits, is_real_example):
        predictions = tf.argmax(logits, axis=-1, output_type=tf.int32)
        accuracy = tf.metrics.accuracy(
            labels=label_ids, predictions=predictions, weights=is_real_example)
        loss = tf.metrics.mean(values=per_example_loss, weights=is_real_example)
        return {
            "eval_accuracy": accuracy,
            "eval_loss": loss,
        }

      eval_metrics = metric_fn(per_example_loss, label_ids, logits, is_real_example)
      output_spec = tf.estimator.EstimatorSpec(
          mode=mode,
          loss=total_loss,
          eval_metric_ops=eval_metrics)
    # Prediction mode
    else:
      output_spec = tf.estimator.EstimatorSpec(
          mode=mode,
          predictions={"probabilities": probabilities})
    
    return output_spec

  return model_fn


# Compute the number of training steps and warmup steps
num_train_steps = int(len(train_features) / batch_size * num_train_epochs)
num_warmup_steps = int(num_train_steps * warmup_proportion)

model_fn = model_fn_builder(
  bert_config=bert_config,
  num_labels=len(label_list),
  learning_rate=learning_rate,
  init_checkpoint=init_checkpoint,
  num_train_steps=num_train_steps,
  num_warmup_steps=num_warmup_steps)

print('Model built successfully.')

Model built successfully.
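The classification head in `create_model` is a single linear layer on top of the pooled `[CLS]` vector, followed by softmax and cross-entropy. The plain-Python sketch below works through that arithmetic for one example; the two-dimensional `pooled` vector and the weight values are made-up numbers (the real hidden size is 768), and dropout is omitted.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(pooled, weights, bias, label):
    """Linear layer + softmax + cross-entropy for a single example (sketch only)."""
    # logits[k] = weights[k] . pooled + bias[k], i.e. matmul with transpose_b=True
    logits = [sum(w * x for w, x in zip(row, pooled)) + b
              for row, b in zip(weights, bias)]
    probabilities = softmax(logits)
    loss = -math.log(probabilities[label])  # cross-entropy with a one-hot label
    return probabilities, loss


pooled = [1.0, -1.0]                  # stand-in for the 768-dim pooled output
weights = [[0.5, 0.0], [0.0, 0.5]]    # shape [num_labels, hidden_size]
bias = [0.0, 0.0]

probabilities, loss = classify(pooled, weights, bias, label=0)
print(probabilities, loss)
```

Only `weights` and `bias` are new parameters; everything below them comes from the pre-trained checkpoint and is fine-tuned together with the head.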


### Train the Model

Training takes about 10 minutes.

In [17]:
# Create an input_fn for tf.estimator.Estimator
def input_fn_builder(features, seq_length, is_training, drop_remainder):

  all_input_ids = []
  all_input_mask = []
  all_segment_ids = []
  all_label_ids = []

  for feature in features:
    all_input_ids.append(feature.input_ids)
    all_input_mask.append(feature.input_mask)
    all_segment_ids.append(feature.segment_ids)
    all_label_ids.append(feature.label_id)

  # The actual input function
  def input_fn(params):
    batch_size = params["batch_size"]

    num_examples = len(features)

    d = tf.data.Dataset.from_tensor_slices({
        "input_ids":
            tf.constant(
                all_input_ids, shape=[num_examples, seq_length],
                dtype=tf.int32),
        "input_mask":
            tf.constant(
                all_input_mask,
                shape=[num_examples, seq_length],
                dtype=tf.int32),
        "segment_ids":
            tf.constant(
                all_segment_ids,
                shape=[num_examples, seq_length],
                dtype=tf.int32),
        "label_ids":
            tf.constant(all_label_ids, shape=[num_examples], dtype=tf.int32),
    })

    # Repeat and shuffle the data in training mode
    if is_training:
      d = d.repeat()
      d = d.shuffle(buffer_size=100)

    d = d.batch(batch_size=batch_size, drop_remainder=drop_remainder)
    return d

  return input_fn


# Build the tf.estimator run configuration
run_config = tf.estimator.RunConfig(
    model_dir=output_dir,
    save_summary_steps=save_summary_steps,
    save_checkpoints_steps=save_checkpoints_steps)

# The Estimator that wraps the model
estimator = tf.estimator.Estimator(
  model_fn=model_fn,
  config=run_config,
  params={"batch_size": batch_size})

# Input function for training
train_input_fn = input_fn_builder(
    features=train_features,
    seq_length=max_seq_length,
    is_training=True,
    drop_remainder=False) # whether to drop the last incomplete batch

# Train
estimator.train(input_fn=train_input_fn, max_steps=num_train_steps)

print("训练结束")

INFO:tensorflow:Using config: {'_model_dir': 'text_sentiment_analysis/output/', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 500, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7feef82d3cc0>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
Instructions for updating:
Colocations handled automatically by placer.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:*** Features ***
INFO:tensorflow:  name = 


For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
If you depend on functionality not listed there, please file an issue.



INFO:tensorflow:**** Trainable Variables ****
INFO:tensorflow:  name = bert/embeddings/word_embeddings:0, shape = (21128, 768), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/embeddings/token_type_embeddings:0, shape = (2, 768), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/embeddings/position_embeddings:0, shape = (512, 768), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/embeddings/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/embeddings/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_0/attention/self/query/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_0/attention/self/query/bias:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_0/attention/self/key/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_0/attention/self/key/bias:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = ber

INFO:tensorflow:  name = bert/encoder/layer_4/attention/output/dense/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_4/attention/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_4/attention/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_4/attention/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_4/intermediate/dense/kernel:0, shape = (768, 3072), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_4/intermediate/dense/bias:0, shape = (3072,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_4/output/dense/kernel:0, shape = (3072, 768), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_4/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_4/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflo

INFO:tensorflow:  name = bert/encoder/layer_9/attention/self/query/bias:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_9/attention/self/key/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_9/attention/self/key/bias:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_9/attention/self/value/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_9/attention/self/value/bias:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_9/attention/output/dense/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_9/attention/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_9/attention/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_9/attention/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKP

训练结束


### Evaluate Model Accuracy

In [18]:
# Input function for evaluation
eval_input_fn = input_fn_builder(
    features=test_features,
    seq_length=max_seq_length,
    is_training=False,
    drop_remainder=False)

# Evaluate
evaluate_info = estimator.evaluate(input_fn=eval_input_fn, steps=None)

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:*** Features ***
INFO:tensorflow:  name = input_ids, shape = (?, 128)
INFO:tensorflow:  name = input_mask, shape = (?, 128)
INFO:tensorflow:  name = label_ids, shape = (?,)
INFO:tensorflow:  name = segment_ids, shape = (?, 128)
INFO:tensorflow:**** Trainable Variables ****
INFO:tensorflow:  name = bert/embeddings/word_embeddings:0, shape = (21128, 768), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/embeddings/token_type_embeddings:0, shape = (2, 768), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/embeddings/position_embeddings:0, shape = (512, 768), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/embeddings/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/embeddings/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_0/attention/self/query/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_0/attention/self/query/bias:0, s

INFO:tensorflow:  name = bert/encoder/layer_4/attention/self/value/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_4/attention/self/value/bias:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_4/attention/output/dense/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_4/attention/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_4/attention/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_4/attention/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_4/intermediate/dense/kernel:0, shape = (768, 3072), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_4/intermediate/dense/bias:0, shape = (3072,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_4/output/dense/kernel:0, shape = (3072, 768), *INIT_FROM_CK

INFO:tensorflow:  name = bert/encoder/layer_8/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_9/attention/self/query/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_9/attention/self/query/bias:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_9/attention/self/key/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_9/attention/self/key/bias:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_9/attention/self/value/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_9/attention/self/value/bias:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_9/attention/output/dense/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_9/attention/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
INFO:te

Print the evaluation metrics; eval_accuracy is the accuracy on the test set.

In [19]:
print(evaluate_info)

{'eval_accuracy': 0.8925, 'eval_loss': 0.39411587, 'loss': 0.39411587, 'global_step': 300}
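For reference, the accuracy metric weights each example by `is_real_example`, so padding examples (weight 0) do not count. A plain-Python sketch of that computation is shown below; `weighted_accuracy` is a hypothetical helper written for this illustration, not the `tf.metrics.accuracy` implementation.

```python
def weighted_accuracy(labels, predictions, weights=None):
    """Fraction of correctly predicted examples, weighted per example."""
    if weights is None:
        weights = [1.0] * len(labels)
    correct = sum(w for y, p, w in zip(labels, predictions, weights) if y == p)
    return correct / sum(weights)


labels = [1, 0, 1, 0]
predictions = [1, 0, 0, 0]
print(weighted_accuracy(labels, predictions))                        # 0.75
print(weighted_accuracy(labels, predictions, [1.0, 1.0, 0.0, 1.0]))  # 1.0
```

In the second call the misclassified third example has weight 0, so it is excluded from both the numerator and the denominator.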


### Results

Run predictions on a few sample sentences to inspect the results directly.

In [20]:
# Prediction helper: returns one (sentence, class probabilities, predicted label) triple per input
def getPrediction(in_sentences):
  labels = ["Negative", "Positive"]
  # guid="" and label=0 are dummy placeholders, since no true label is known at prediction time
  input_examples = [InputExample(guid="", text_a=x, text_b=None, label=0) for x in in_sentences]
  input_features = convert_examples_to_features(input_examples, label_list, max_seq_length, tokenizer)
  predict_input_fn = input_fn_builder(features=input_features, seq_length=max_seq_length, is_training=False, drop_remainder=False)
  predictions = estimator.predict(predict_input_fn)
  # probabilities[1] is the Positive-class probability; rounding it gives the class index (0 or 1)
  return [(sentence, prediction['probabilities'], labels[int(round(prediction['probabilities'][1]))])
          for sentence, prediction in zip(in_sentences, predictions)]

# Sample reviews to classify
pred_sentences = [
  "这家酒店实在太糟了",
  "这家酒店的服务不友好",
  "服务还行",
  "房间外面的风景很好",
  "前台的服务很周到"
]

predictions = getPrediction(pred_sentences)

INFO:tensorflow:Writing example 0 of 5
INFO:tensorflow:*** Example ***
INFO:tensorflow:guid: 
INFO:tensorflow:tokens: [CLS] 这 家 酒 店 实 在 太 糟 了 [SEP]
INFO:tensorflow:input_ids: 101 6821 2157 6983 2421 2141 1762 1922 5136 749 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 

INFO:tensorflow:  name = bert/encoder/layer_1/attention/self/value/bias:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_1/attention/output/dense/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_1/attention/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_1/attention/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_1/attention/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_1/intermediate/dense/kernel:0, shape = (768, 3072), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_1/intermediate/dense/bias:0, shape = (3072,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_1/output/dense/kernel:0, shape = (3072, 768), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_1/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tenso

INFO:tensorflow:  name = bert/encoder/layer_6/attention/self/query/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_6/attention/self/query/bias:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_6/attention/self/key/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_6/attention/self/key/bias:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_6/attention/self/value/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_6/attention/self/value/bias:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_6/attention/output/dense/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_6/attention/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_6/attention/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT

INFO:tensorflow:  name = bert/encoder/layer_10/intermediate/dense/bias:0, shape = (3072,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_10/output/dense/kernel:0, shape = (3072, 768), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_10/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_10/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_10/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_11/attention/self/query/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_11/attention/self/query/bias:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_11/attention/self/key/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_11/attention/self/key/bias:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = ber

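Print the predictions.

The call returns a list of triples, one per input sentence. In each triple, the first element is the original input sentence; the second is the array of predicted class probabilities, ordered `[Negative, Positive]`; the third is the predicted label (`Negative` or `Positive`).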

In [21]:
predictions

[('这家酒店实在太糟了', array([0.9965814 , 0.00341858], dtype=float32), 'Negative'),
 ('这家酒店的服务不友好', array([0.98558795, 0.01441207], dtype=float32), 'Negative'),
 ('服务还行', array([0.00570921, 0.9942908 ], dtype=float32), 'Positive'),
 ('房间外面的风景很好', array([0.00252695, 0.99747306], dtype=float32), 'Positive'),
 ('前台的服务很周到', array([0.0021987 , 0.99780124], dtype=float32), 'Positive')]

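The predictions show that the model judges the sentiment of these hotel reviews essentially correctly.

Readers can edit the review sentences in `pred_sentences` above to run sentiment analysis on their own text.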
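When experimenting with other sentences, it can help to reduce each probability pair to a single sentiment score in [0, 1]. The sketch below is one way to do that, assuming the probabilities are ordered `[Negative, Positive]` as in the output above. The helper name `summarize` and the 0.5 threshold are illustrative assumptions, and the sample values are copied from the output above.

```python
LABELS = ["Negative", "Positive"]  # same order as in getPrediction

def summarize(sentence, probabilities, threshold=0.5):
    """Return (sentence, positive-class score, label) for one prediction."""
    score = float(probabilities[1])          # probability of the Positive class
    label = LABELS[int(score > threshold)]   # 1 if above the threshold, else 0
    return sentence, score, label

# Two (sentence, probabilities) pairs taken from the notebook output
samples = [
    ("这家酒店实在太糟了", [0.9965814, 0.00341858]),
    ("服务还行", [0.00570921, 0.9942908]),
]
for sentence, probs in samples:
    _, score, label = summarize(sentence, probs)
    print(f"{sentence}\t{score:.4f}\t{label}")
```

A score close to 1 indicates a strongly positive review and a score close to 0 a strongly negative one; raising or lowering `threshold` trades off how readily a review is called positive.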

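## Summary

This experiment demonstrated text sentiment analysis as a downstream task of the pretrained BERT model and obtained good results (an evaluation accuracy of 0.8925 at global step 300) even with a fairly small training set. The pretrained BERT model is also highly general: with only minor changes to the input and output, it can be applied to most NLP tasks, including sequence labeling (word segmentation, NER, semantic labeling), classification (text classification, sentiment analysis), and sentence-relation judgment (question answering, natural language inference). BERT moves most of the work into the pretraining stage, so the fine-tuning stage needs very little additional work to complete an NLP task.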
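To give a sense of how small the task-specific output layer can be for classification, the sketch below applies a single linear layer and a softmax to a sentence vector. It is a toy illustration and not this notebook's code: the 4-dimensional `pooled` vector stands in for BERT's 768-dimensional sentence representation, and the weights and bias are made-up values, whereas in practice they are learned during fine-tuning.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(pooled, weights, bias):
    """Linear layer followed by softmax: one weight row and one bias per class."""
    logits = [sum(w * h for w, h in zip(row, pooled)) + b
              for row, b in zip(weights, bias)]
    return softmax(logits)

# Toy values (hypothetical), not taken from a trained model
pooled = [0.5, -1.0, 0.25, 2.0]
weights = [[0.1, 0.4, -0.2, -0.3],   # class 0: Negative
           [-0.1, -0.4, 0.2, 0.3]]   # class 1: Positive
bias = [0.0, 0.0]

probs = classify(pooled, weights, bias)
print(probs)  # two probabilities that sum to 1, ordered [Negative, Positive]
```

Adapting the same encoder to a different classification task then mostly amounts to changing the number of rows in `weights` to match the number of classes.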