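# Getting Started with Fine-Tuning in Hugging Face Transformers

This example walks through the main steps of fine-tuning a model with Transformers:
- Downloading the dataset
- Preprocessing the data
- Configuring training hyperparameters
- Setting up evaluation metrics
- A brief introduction to the Trainer
- Running the training
- Saving the model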

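## The YelpReviewFull Dataset

**Hugging Face dataset: [YelpReviewFull](https://huggingface.co/datasets/yelp_review_full)**

### Dataset Summary

The Yelp reviews dataset consists of reviews from Yelp. It is extracted from the Yelp Dataset Challenge 2015 data.

### Supported Tasks and Leaderboards
Text classification and sentiment classification: the dataset is mainly used for text classification, that is, predicting the sentiment of a given text.

### Language
The reviews are mainly written in English.

### Dataset Structure

#### Data Instances
A typical data point consists of a text and its corresponding label.

An example from the YelpReviewFull test set looks like this: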

```json
{
    'label': 0,
    'text': 'I got \'new\' tires from them and within two weeks got a flat. I took my car to a local mechanic to see if i could get the hole patched, but they said the reason I had a flat was because the previous patch had blown - WAIT, WHAT? I just got the tire and never needed to have it patched? This was supposed to be a new tire. \\nI took the tire over to Flynn\'s and they told me that someone punctured my tire, then tried to patch it. So there are resentful tire slashers? I find that very unlikely. After arguing with the guy and telling him that his logic was far fetched he said he\'d give me a new tire \\"this time\\". \\nI will never go back to Flynn\'s b/c of the way this guy treated me and the simple fact that they gave me a used tire!'
}
```

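#### Data Fields

- 'text': The review text is escaped using double quotes ("); any internal double quote is escaped by two double quotes (""). Newlines are escaped as a backslash followed by an "n" character, i.e. "\n".
- 'label': The review score. Ratings range from 1 to 5 stars and are stored as class indices 0 to 4, so `'label': 0` in the instance above is a 1-star review.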
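
As a rough illustration of the two fields, the sketch below maps a stored label index to its star rating and undoes the escaping described above. `LABEL_NAMES`, `label_to_stars` and `unescape_review` are hypothetical names introduced for this example (the class names mirror those that appear in the sample output later in this notebook); they are not part of the `datasets` API.

```python
# Hypothetical helpers for illustration; not part of the datasets library.
LABEL_NAMES = ["1 star", "2 star", "3 stars", "4 stars", "5 stars"]


def label_to_stars(label: int) -> int:
    """Convert a stored class index (0-4) to a star rating (1-5)."""
    if not 0 <= label < len(LABEL_NAMES):
        raise ValueError(f"label out of range: {label}")
    return label + 1


def unescape_review(text: str) -> str:
    """Undo the escaping described above (escaped newlines and doubled quotes)."""
    return text.replace("\\n", "\n").replace('""', '"')


raw = 'Great food.\\nThe ""Veggie Benny"" was the star.'
print(label_to_stars(0), LABEL_NAMES[0])
print(unescape_review(raw))
```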

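#### Data Splits

The Yelp reviews full star dataset is constructed by randomly taking 130,000 training samples and 10,000 test samples for each review star rating from 1 to 5. In total there are 650,000 training samples and 50,000 test samples.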

## Downloading the Dataset

In [1]:
from datasets import load_dataset

dataset = load_dataset("yelp_review_full")

In [2]:
dataset

DatasetDict({
    train: Dataset({
        features: ['label', 'text'],
        num_rows: 650000
    })
    test: Dataset({
        features: ['label', 'text'],
        num_rows: 50000
    })
})

In [4]:
dataset["train"][1]

{'label': 1,
 'text': "Unfortunately, the frustration of being Dr. Goldberg's patient is a repeat of the experience I've had with so many other doctors in NYC -- good doctor, terrible staff.  It seems that his staff simply never answers the phone.  It usually takes 2 hours of repeated calling to get an answer.  Who has time for that or wants to deal with it?  I have run into this problem with many other doctors and I just don't get it.  You have office workers, you have patients with medical needs, why isn't anyone answering the phone?  It's incomprehensible and not work the aggravation.  It's with regret that I feel that I have to give Dr. Goldberg 2 stars."}

In [5]:
import random
import pandas as pd
import datasets
from IPython.display import display, HTML

In [6]:
def show_random_elements(dataset, num_examples=10):
    assert num_examples <= len(dataset), "Can't pick more elements than there are in the dataset."
    picks = []
    for _ in range(num_examples):
        pick = random.randint(0, len(dataset)-1)
        while pick in picks:
            pick = random.randint(0, len(dataset)-1)
        picks.append(pick)
    
    df = pd.DataFrame(dataset[picks])
    for column, typ in dataset.features.items():
        if isinstance(typ, datasets.ClassLabel):
            df[column] = df[column].transform(lambda i: typ.names[i])
    display(HTML(df.to_html()))

In [7]:
show_random_elements(dataset["train"])

Unnamed: 0,label,text
0,3 stars,"The Cracked Egg, I want to like you so bad!\n\nThe service here is actually really good. They staff was super attentive and I never had to ask for a water refills, but my server was very sassy in a TV sitcom sort of way. She tried to make little quirky jokes every time she came to the table, but they just came off like she was mad at the world and we were the cause of it. Like I said very TV sitcom-ish. \n\nThe decor here needs to be updated desperately. It has a cafeteria look; very open and very cold, but with a country grandma style. By this I mean there's random farm and rooster decor on the walls. Which makes me laugh because roosters don't lay eggs. It's Science! \n\nThe food is tasty, but basic. I was here with four other people and I tried everyone's dish... I ordered the Buffalo Chicken Salad; the salad for the guy who never orders salad. This was great in my opinion, big chunks of chicken coated in buffalo sauce on huge bed of fresh greens w/ eggs and tomatoes. My foodie friends ordered the \""Veggie Benny\"", \""The Cobb\"", and the \""Pesto Scramble\"". \n\nThe \""Veggie Benny\"" was the star in this line-up. This plate comes with spinach, avocado, grilled tomatoes and two poached eggs. The eggs were cooked perfectly, the hollandaise sauce was creamy, but lite in an interesting way, and the hash browns were golden brown and crispy. I would most likely order this if I came back to the Cracked Egg. The other dishes were forgetful and don't need to be mentioned in this review. \n\nThis place needs a design overhaul and few updates on the menu. These simple changes could really elevate this place...Where's Robert Irvine?!?\n\nPro Tip: The baked goods in this place are steller. Try the Blueberry Muffin/Coffee Cake"
1,1 star,"I been to couple of massage/reflexology places in Cali and AZ but this is the WORST..not worth it ..I had a back pain suddenly after going to wet and wild on Saturday ,I called them took the appointment as they say they had only lady at the time .I said OK but i need a deep tissue concentrate on my back ..when i enter receptionist is rude ..massage should be done on body ..not on shirts/towels to relieve the stress ..she is like no shirt off..company policy..blah blah.. ..I m like really ..If i had pain in back what they going to do with shirt on but even i tried ..as expected at the end ..I didn't feel any thing ..WASTE of my TIME and MONEY. If you got a chance try going to steam bath you might feel the same thing ..instead of just going to this place wasting money..I not gonna go ever ..neither suggest to some one-If any thing policies you guys had please tell when we call for appointment not after going inside"
2,3 stars,"Nothing fancy...just good, filling diner food. Recommended for lunch. The fish plate is a good bet."
3,1 star,SHIT HOLE!!! Too bad 0 stars is not an option here. The staff and management are rude and only want money. Someone sprayed pepper spray and sent multiple patrons to the ER and the staff just sat there like they could care less. Save yourself the trouble and just flush your money down the toilet!. This place is so ghetto and trashy. All the money they make from $30 admissions and it still looks the same as it did in 1974. Disgrace to the gay community!!
4,3 stars,"Never heard of this place and went here for breakfast and was so pleasantly surprised with the quality, price and convenience (it was right next to our hotel) that I went back the very next day!\n\nVery solid breakfast. Omelettes were great. Portions plentiful. The potatoes as well as the nice fruit made the breakfast a winner. My son had the eggs& toast one day and blueberry crepes the next. Coffee was very good. \n\nTHEN after coming home, I found out it was a chain! I must say, I was surprised..."
5,2 star,"With my brother getting married and being a club person, we decided to go to the Bank Nightclub as part of his Las Vegas Bachelor Party. The best way I can describe my club experience with the Bank Nightclub is a hate-love relationship. The club itself I would say is ok but on the small side. It's one of the smaller clubs I've been too in Las Vegas and probably holds about 700 people comfortably, but like most clubs they try to get as many people in (like this night) and people were packed in their like sardines. We did pay for the VIP treatment, and boy did we pay. The seats that we got was about $4K including bottle service. No that is not a typo - $4K or a little more than $200 per person. \n\nSo what does $4K get you - gets you a table that you and your buddy have to squeeze into and I believe about 6 bottles. The person who set this up also got us a \""host\"" who hooked up the whole thing. The only thing our \""host\"" did for the night, was get us front of the line pass. Other than that, the guy was hanging out, drinking, and to add insult to injury, we also ended up paying for his buddy and he also found two ladies (not for us) but for him and his buddy to hit on while all 4 of them drank our booze.\n\nThe place is suppose to be upscale, but to be honest, I just didn't see. I've been to nicer clubs and while this place charges an arm and leg, it really doesn't deliver. The only saving grace is that some of the other guys seemed to have a goodtime. But even then, the night was a mixed bag. Some guys seemed to have a good time and other guys were like me and said WTF I just dropped $200 on that, and the rest were kind of like eh whatever.\n\nIf I had to do it all over again, honestly, I think The Bank is overhyped and under delivered. I wouldn't go back again and would give them 2 stars."
6,3 stars,"A hit or miss experience. Went there the other night and the older guys (managers?) running the host stand were very unpleasant. When we told them our party WAS indeed all present, but just in the bathroom, they took another ten minutes to get us taken care of. Then they scoffed when we asked where our waiter went fifteen minutes after taking our drink order.\n\nThe wait staff themselves are usually decent, to be honest. But it's a little inconsistent.\n\nIf cheapness is your thing, come here. After all, it's run by the wizards who brought you the authentic stylings of Chili's and On the Border (oh, sorry, most of the ladder here in Phoenix closed). You can get a run-of-the-mill pasta and one to take home for $12.95. I do enjoy the late night fridge raid of my second meal, but it's nothing I couldn't have spent ten minutes and made in my own pot. The classics menu is a who's who of boring concoctions with no original take on them.\n\nThe atmosphere here is interesting and surprisingly void of a North Scottsdale vibe. Problem is you never know exactly what you're going to get. I'm not a huge fan of any chain Italian, but I'll take Carrabba's across the street; feels a bit more warm to me even if it's less hip, and the menu offers more excitement."
7,4 stars,"One of the better buffets on the strip.\n\nThere were some boring/normal dishes, but there were also some unique dishes. Delicately prepared, and presented in small bites.\n\nThe problem with this place is the MASSIVE wait to get in. it's unbelievable really how bad the wait is.\n\nIf you can get a pass or get past the line, it's a good place to eat. Remember tho, it's still a buffet."
8,2 star,"This review is limited to the web page and related issues. I have not been in the store, nor, frankly, do I now intend to.\n\nOrdered my wife a Christmas present on off the web page. Entered all the correct info into their web form. Ski Pro cancels the order because the ship-to and the order-from were different, no email, left a phone call saying they were cancelling the order. Not making this up -- now think about this for a minute, they could EASILY have a cross-check on the order form if this were an issue.\n\nI get message, call them.\n\n[transcript]\nQ \""Thanks for the call, what's the problem?\""\nA \""We cancelled the order\""\nQ \""Huh?\""\nA \""Different addresses\""\nQ \""What the...\""\nA \""We can reinstate\""\nQ \""OK, will it get there in time?\""\nA \""Sure, if you pay extra for overnight\""\nQ \""Uhhh, why?\""\nA \""Cause we cancelled it\""\n\nGuy on the phone seemed quite pleased that the present will arrive after Xmas.\n\nI suggest that you, Dear Reader, avoid the SkiPro web operation just as poster Brady K avoided fish and chips. I suppose the quality of the goods and the shop are OK.\n\nCalling the Better Business Bureau next. Oops, they only get two stars, is this a theme in Phoenix?\n\nConclusion: shop here only in person, avoid the web."
9,4 stars,"For a ghost town of a shopping center, this Red Lobster is busy. It's always busy, especially during lunch. \n\nOnce you get past the mini-wait, everything starts to fall in place. Service at this Red Lobster has always been great. There have been times I've come here on a limited lunch break and told our server. To my surprise, they did what they can to get me and my group in & out. \n\nOn my most recent visit, I was craving a steak and also lobster. Then I found what I thought was perfect. It was their Lobster Steak meal. But if you've been here before, you know what this is. It's steak with lobster bits on top. I expected a lobster and a steak. My bad, I should have read the menu better. Also, the price was fair. \n\nAs for their other food choices (from previous review):\n\n-Bottomless Shrimp: I don't know if this is daily or all the time. But I have had it in the past. Their fried shrimp has more batter than shrimp, but still pretty good. \n\n-Seaside Shrimp Trio: Is a favorite when in the mood for shrimp. It's shrimp pasta, fried shrimp, and garlic shrimp\n\n-Tilapia and Cod I put these all together, but their cod is really good. Big thick pieces of flaky fish. Tilapia is prepared pretty good to. Both are not fishy. \n\n-Steaks: Mixed reviews here. Had steaks here come too dry. Had some come too juicy. Then had some just right. Not as good as a Ruth Chris' but definitely better than a Chili's or Denny's. \n\n-Lobster: My opinion, one of my favorites. Portions small thought. \n\n-Stuffed Sole: Tried once and only recently. Did not fill me up, but still good. Order an appetizer!!\n\n-Glazed Chicken: I know it's a seafood restaurant, but have stopped by for lunch just for their chicken breast. About 7 total oz and usually have the sauce put on the side.\n\n-Calamari and Fried Vegetables: An oxymoron, but you'll find it all gone before your soup arrives. Usually gone from me and yes, the only way I will eat vegetables. 
\n\n-Red Lobster Biscuits: Don't remember the actual name, but this is a must on every visit. So good and perfect to dip into you chowder. \n\nOverall, I recommend Red Lobster. I've never had a bad experience and you get what you pay for her. Price for seafood here is just about right for the quality you get. Just be prepared for a little wait at this location."


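## Preprocessing the Data

After downloading the dataset, use a tokenizer to process the text. Inputs of varying length can be handled with padding and truncation strategies.

The `map` method of Datasets applies a preprocessing function to the whole dataset in one call.

The following pads every example to the maximum length while processing the entire dataset: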

In [8]:
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")


def tokenize_function(examples):
    return tokenizer(examples["text"], padding="max_length", truncation=True)


tokenized_datasets = dataset.map(tokenize_function, batched=True)

Map: 100%|██████████| 650000/650000 [03:25<00:00, 3168.94 examples/s]
Map: 100%|██████████| 50000/50000 [00:16<00:00, 3122.98 examples/s]
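
With `batched=True`, the preprocessing function receives a batch as a dictionary of column name to list of values, which is why `tokenize_function` can pass `examples["text"]` (a list of strings) straight to the tokenizer. The toy sketch below imitates that calling convention in plain Python. `toy_batched_map` and `add_length` are hypothetical names, the batch size default is picked for this sketch, and the real `map` does far more than this.

```python
from typing import Callable, Dict, List

Batch = Dict[str, List]


def toy_batched_map(columns: Batch, fn: Callable[[Batch], Batch], batch_size: int = 1000) -> Batch:
    """Apply fn to slices of a column-oriented table and merge the returned columns in."""
    num_rows = len(next(iter(columns.values())))
    result: Batch = {}
    for start in range(0, num_rows, batch_size):
        # Each batch is a dict of column name -> list of values, not a list of rows.
        batch = {name: values[start:start + batch_size] for name, values in columns.items()}
        merged = {**batch, **fn(batch)}
        for name, values in merged.items():
            result.setdefault(name, []).extend(values)
    return result


def add_length(batch: Batch) -> Batch:
    """Example preprocessing function: one new value per input text."""
    return {"num_chars": [len(text) for text in batch["text"]]}


table = {"label": [0, 4, 2], "text": ["bad", "great place", "ok"]}
out = toy_batched_map(table, add_length, batch_size=2)
print(out["num_chars"])  # [3, 11, 2]
```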


In [9]:
show_random_elements(tokenized_datasets["train"], num_examples=1)

Unnamed: 0,label,text,input_ids,token_type_ids,attention_mask
0,5 stars,"everything was awesome.... the staff was incredible, professional and the experience was incredible.... \n\ni would do it again if i would go to las vegas again.","[101, 1917, 1108, 14918, 119, 119, 119, 119, 1103, 2546, 1108, 10965, 117, 1848, 1105, 1103, 2541, 1108, 10965, 119, 119, 119, 119, 165, 183, 165, 11437, 1156, 1202, 1122, 1254, 1191, 178, 1156, 1301, 1106, 17496, 1396, 11305, 1254, 119, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...]","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...]","[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...]"
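
The tokenized sample above shows the effect of padding: `input_ids` ends in a run of zeros, and `attention_mask` is 1 for real tokens and 0 for padding. The following is a minimal plain-Python sketch of that idea, assuming a pad id of 0 to match the output above. `pad_or_truncate` is a hypothetical helper, not the tokenizer's implementation; a real tokenizer also keeps special tokens such as the trailing `[SEP]` when truncating, which this sketch ignores.

```python
from typing import Dict, List


def pad_or_truncate(token_ids: List[int], max_length: int, pad_id: int = 0) -> Dict[str, List[int]]:
    """Return fixed-length input_ids plus an attention mask (1 = real token, 0 = padding)."""
    kept = token_ids[:max_length]       # truncation: drop tokens past max_length
    num_pad = max_length - len(kept)    # padding: fill up to max_length
    return {
        "input_ids": kept + [pad_id] * num_pad,
        "attention_mask": [1] * len(kept) + [0] * num_pad,
    }


short = pad_or_truncate([101, 1917, 1108, 102], max_length=6)
long = pad_or_truncate([101, 1917, 1108, 14918, 119, 102], max_length=4)
print(short)
print(long)
```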


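### Data Sampling

Use 1,000 samples to demonstrate small-scale training on BERT (with the PyTorch Trainer).

The `shuffle()` function randomly reorders the rows of the dataset. For more control over the shuffling algorithm, pass the `generator` parameter to use a different `numpy.random.Generator`.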

In [10]:
small_train_dataset = tokenized_datasets["train"].shuffle(seed=42).select(range(1000))
small_eval_dataset = tokenized_datasets["test"].shuffle(seed=42).select(range(1000))
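
The shuffle-then-select pattern above amounts to drawing a reproducible random subset. Here is a plain-Python sketch of the same idea, assuming nothing about how `datasets` implements it: `sample_subset` is a hypothetical helper built on the standard library's `random` module, so it will not pick the same rows as `shuffle(seed=42)`.

```python
import random
from typing import List, Sequence


def sample_subset(rows: Sequence, num_samples: int, seed: int) -> List:
    """Shuffle row indices with a fixed seed, then keep the first num_samples rows."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)  # seeded, so the order is reproducible
    return [rows[i] for i in indices[:num_samples]]


rows = [f"review-{i}" for i in range(10_000)]
first = sample_subset(rows, num_samples=1000, seed=42)
second = sample_subset(rows, num_samples=1000, seed=42)
print(len(first), first == second)
```

Fixing the seed matters here because the training and evaluation subsets should stay the same across reruns of the notebook.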

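## Fine-Tuning Configuration

### Loading the BERT Model

Loading the model prints a warning that some weights (`classifier.weight` and `classifier.bias`) were not initialized from the checkpoint and are newly initialized. This is entirely expected when fine-tuning: the head used for the pretraining masked language modeling task is dropped and replaced with a new classification head, for which there are no pretrained weights. The library therefore warns that the model should be fine-tuned before it is used for inference, which is exactly what we are about to do.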

In [11]:
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("bert-base-cased", num_labels=5)

Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-cased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


### Training Hyperparameters (TrainingArguments)

Full list of parameters and defaults: https://huggingface.co/docs/transformers/v4.36.1/en/main_classes/trainer#transformers.TrainingArguments

Source definition: https://github.com/huggingface/transformers/blob/v4.36.1/src/transformers/training_args.py#L161

**Most important setting: the path where model weights are saved (`output_dir`)**

In [12]:
from transformers import TrainingArguments

model_dir = "models/bert-base-cased-finetune-yelp-zhangkun"

# logging_steps defaults to 500; given our training data size and step count, set it to 100
training_args = TrainingArguments(output_dir=model_dir,
                                  per_device_train_batch_size=16,
                                  num_train_epochs=5,
                                  logging_steps=100)
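
To see why `logging_steps=100` suits this run better than the default of 500, it helps to estimate the total number of optimizer steps. The sketch below assumes the 1,000-sample training subset, a single device and no gradient accumulation; `total_optimizer_steps` is a hypothetical helper, not a Transformers function.

```python
import math


def total_optimizer_steps(num_samples: int, batch_size: int, num_epochs: int) -> int:
    """Estimate optimizer steps for one device without gradient accumulation."""
    steps_per_epoch = math.ceil(num_samples / batch_size)  # the last batch may be partial
    return steps_per_epoch * num_epochs


steps = total_optimizer_steps(num_samples=1000, batch_size=16, num_epochs=5)
log_points = list(range(100, steps + 1, 100))
print(steps, log_points)  # 315 [100, 200, 300]
```

Under these assumptions the run takes 315 steps, so a logging interval of 500 would never be reached, while an interval of 100 yields three log entries.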

In [13]:
# Full hyperparameter configuration
print(training_args)

TrainingArguments(
_n_gpu=1,
adafactor=False,
adam_beta1=0.9,
adam_beta2=0.999,
adam_epsilon=1e-08,
auto_find_batch_size=False,
bf16=False,
bf16_full_eval=False,
data_seed=None,
dataloader_drop_last=False,
dataloader_num_workers=0,
dataloader_persistent_workers=False,
dataloader_pin_memory=True,
ddp_backend=None,
ddp_broadcast_buffers=None,
ddp_bucket_cap_mb=None,
ddp_find_unused_parameters=None,
ddp_timeout=1800,
debug=[],
deepspeed=None,
disable_tqdm=False,
dispatch_batches=None,
do_eval=False,
do_predict=False,
do_train=False,
eval_accumulation_steps=None,
eval_delay=0,
eval_steps=None,
evaluation_strategy=IntervalStrategy.NO,
fp16=False,
fp16_backend=auto,
fp16_full_eval=False,
fp16_opt_level=O1,
fsdp=[],
fsdp_config={'min_num_params': 0, 'xla': False, 'xla_fsdp_grad_ckpt': False},
fsdp_min_num_params=0,
fsdp_transformer_layer_cls_to_wrap=None,
full_determinism=False,
gradient_accumulation_steps=1,
gradient_checkpointing=False,
gradient_checkpointing_kwargs=None,
greater_is_better=

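### Evaluating Metrics During Training (Evaluate)

The **[Hugging Face Evaluate library](https://huggingface.co/docs/evaluate/index)** gives access to dozens of evaluation methods across domains (natural language processing, computer vision, reinforcement learning, and more) with a single line of code. The **full list of currently supported metrics is at https://huggingface.co/evaluate-metric**.

The Trainer does not automatically evaluate model performance during training, so we need to pass it a function that computes and reports metrics.

The Evaluate library provides a simple accuracy metric, which you can load with the `evaluate.load` function: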

In [14]:
import numpy as np
import evaluate

metric = evaluate.load("accuracy")

Downloading builder script: 100%|██████████| 4.20k/4.20k [00:00<00:00, 11.1MB/s]



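Next, call `compute` on the metric to calculate the accuracy of the predictions.

Before passing predictions to `compute`, the logits must be converted to predictions (**all Transformers models return logits**).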

In [15]:
def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return metric.compute(predictions=predictions, references=labels)
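
For intuition, the same computation can be written without NumPy or Evaluate: take the index of the largest logit in each row as the predicted class, then count how often it matches the label. This is only an illustrative sketch with made-up logits; `argmax` and `accuracy_from_logits` are hypothetical helpers, not library functions.

```python
from typing import Dict, Sequence


def argmax(row: Sequence[float]) -> int:
    """Index of the largest value in one row of logits."""
    return max(range(len(row)), key=lambda i: row[i])


def accuracy_from_logits(logits: Sequence[Sequence[float]], labels: Sequence[int]) -> Dict[str, float]:
    """Convert logits to class predictions and compare them with the labels."""
    predictions = [argmax(row) for row in logits]
    correct = sum(int(p == y) for p, y in zip(predictions, labels))
    return {"accuracy": correct / len(labels)}


logits = [
    [0.1, 2.3, -1.0, 0.0, 0.5],  # predicted class 1
    [1.9, 0.2, 0.3, -0.4, 0.0],  # predicted class 0
    [0.0, 0.1, 0.2, 0.3, 4.0],   # predicted class 4
    [0.5, 0.4, 0.3, 0.2, 0.1],   # predicted class 0
]
labels = [1, 0, 3, 0]
print(accuracy_from_logits(logits, labels))  # {'accuracy': 0.75}
```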

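#### Monitoring Metrics During Training

To monitor how the evaluation metrics change during training, set the `evaluation_strategy` parameter in `TrainingArguments` so that metrics are reported at the end of each epoch.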

In [16]:
from transformers import TrainingArguments, Trainer

training_args = TrainingArguments(output_dir=model_dir,
                                  evaluation_strategy="epoch", 
                                  per_device_train_batch_size=16,
                                  num_train_epochs=3,
                                  logging_steps=30)

## Starting Training

### Instantiating the Trainer

The `kernel version` warning shown below does not currently affect this example.

In [17]:
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=small_train_dataset,
    eval_dataset=small_eval_dataset,
    compute_metrics=compute_metrics,
)

Detected kernel version 5.4.0, which is below the recommended minimum of 5.5.0; this can cause the process to hang. It is recommended to upgrade the kernel to the minimum version or higher.


## Checking GPU Usage with nvidia-smi

To watch GPU usage in real time, poll with the `watch` command: `watch -n 1 nvidia-smi`:

```shell
Every 1.0s: nvidia-smi                                                   Wed Dec 20 14:37:41 2023

Wed Dec 20 14:37:41 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.129.03             Driver Version: 535.129.03   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Tesla T4                       Off | 00000000:00:0D.0 Off |                    0 |
| N/A   64C    P0              69W /  70W |   6665MiB / 15360MiB |     98%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A     18395      C   /root/miniconda3/bin/python                6660MiB |
+---------------------------------------------------------------------------------------+
```

In [19]:
trainer.train()

Epoch,Training Loss,Validation Loss,Accuracy
1,0.7088,1.134038,0.552
2,0.4595,1.198781,0.579
3,0.1951,1.306288,0.584


TrainOutput(global_step=189, training_loss=0.4430915640775489, metrics={'train_runtime': 328.2566, 'train_samples_per_second': 9.139, 'train_steps_per_second': 0.576, 'total_flos': 789354427392000.0, 'train_loss': 0.4430915640775489, 'epoch': 3.0})
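
The figures in the `TrainOutput` above can be checked by hand. The sketch below assumes the 1,000-sample training subset, a batch size of 16 on a single device, and 3 epochs, and it copies the runtime from the output; the variable names are chosen only for this example.

```python
import math

num_samples, batch_size, num_epochs = 1000, 16, 3
train_runtime = 328.2566  # seconds, copied from the TrainOutput above

# 63 steps per epoch (the last batch is partial), times 3 epochs.
global_step = math.ceil(num_samples / batch_size) * num_epochs
samples_per_second = num_samples * num_epochs / train_runtime
steps_per_second = global_step / train_runtime

print(global_step)                   # 189
print(round(samples_per_second, 3))  # 9.139
print(round(steps_per_second, 3))    # 0.576
```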

In [20]:
small_test_dataset = tokenized_datasets["test"].shuffle(seed=64).select(range(100))

In [22]:
trainer.evaluate(small_test_dataset)

{'eval_loss': 1.4290642738342285,
 'eval_accuracy': 0.56,
 'eval_runtime': 2.7928,
 'eval_samples_per_second': 35.806,
 'eval_steps_per_second': 4.655,
 'epoch': 3.0}

### Saving the Model and Training State

- Use `trainer.save_model` to save the model, which can later be reloaded with `from_pretrained()`
- Use `trainer.save_state` to save the training state

In [23]:
model_dir = "models/bert-base-cased-finetune-yelp-zhangkun"

In [24]:
trainer.save_model(model_dir)

In [29]:
trainer.save_state()

In [30]:
trainer.model.save_pretrained("./")
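
As a hedged sketch of the reload step, the helper below loads the saved weights with `from_pretrained()` and predicts star ratings for new texts. `predict_stars` is a hypothetical function written for this example. It assumes the model directory used above exists, and it reloads the tokenizer from `bert-base-cased` because the Trainer in this notebook was not given a tokenizer to save.

```python
def predict_stars(texts, model_dir="models/bert-base-cased-finetune-yelp-zhangkun"):
    """Reload the fine-tuned model and return a predicted star rating (1-5) per text."""
    # Imported lazily so that defining this helper does not require the libraries.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    # Assumption: the tokenizer was not saved with the model, so use the base checkpoint.
    tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
    model = AutoModelForSequenceClassification.from_pretrained(model_dir)
    model.eval()

    inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    # Class indices are 0-4; add 1 to express them as star ratings.
    return (logits.argmax(dim=-1) + 1).tolist()


# Example call (requires the saved model directory from the cells above):
# predict_stars(["The food was great and the staff was friendly."])
```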

## Homework: Train on the full YelpReviewFull dataset and see how high the accuracy can go