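# Getting Started with Fine-Tuning in Hugging Face Transformers

This example walks through the main steps of fine-tuning a model with Transformers:
- Downloading the dataset
- Preprocessing the data
- Configuring training hyperparameters
- Setting up evaluation metrics
- A brief introduction to the Trainer
- Running the training
- Saving the model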

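## The YelpReviewFull Dataset

**Hugging Face dataset: [YelpReviewFull](https://huggingface.co/datasets/yelp_review_full)**

### Dataset Summary

The Yelp reviews dataset consists of reviews from Yelp. It is extracted from the Yelp Dataset Challenge 2015 data.

### Supported Tasks and Leaderboards
Text classification, sentiment classification: the dataset is mainly used for text classification, that is, predicting the sentiment of a given text.

### Languages
The reviews are mainly written in English.

### Dataset Structure

#### Data Instances
A typical data point consists of a text and its corresponding label.

An example from the YelpReviewFull test set looks as follows: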

```json
{
    'label': 0,
    'text': 'I got \'new\' tires from them and within two weeks got a flat. I took my car to a local mechanic to see if i could get the hole patched, but they said the reason I had a flat was because the previous patch had blown - WAIT, WHAT? I just got the tire and never needed to have it patched? This was supposed to be a new tire. \\nI took the tire over to Flynn\'s and they told me that someone punctured my tire, then tried to patch it. So there are resentful tire slashers? I find that very unlikely. After arguing with the guy and telling him that his logic was far fetched he said he\'d give me a new tire \\"this time\\". \\nI will never go back to Flynn\'s b/c of the way this guy treated me and the simple fact that they gave me a used tire!'
}
```

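#### Data Fields

- 'text': The review text is escaped using double quotes ("), and any internal double quote is escaped by two double quotes (""). New lines are escaped by a backslash followed by an "n" character, that is "\n".
- 'label': Corresponds to the score of the review (between 1 and 5 stars). It is stored as a class index from 0 to 4, so the `'label': 0` in the example above is a 1-star review.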
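
The two field conventions can be made concrete with a small sketch. The class names are taken from the sampled output shown later in this notebook (`5 stars` is assumed by analogy), and both helper functions are hypothetical names introduced only for illustration.

```python
# Class names as displayed for this dataset's label feature (index 0..4).
STAR_NAMES = ["1 star", "2 star", "3 stars", "4 stars", "5 stars"]


def label_to_stars(label: int) -> int:
    """Convert a 0-based class index into a 1-5 star rating."""
    if not 0 <= label < len(STAR_NAMES):
        raise ValueError(f"label must be in 0..4, got {label}")
    return label + 1


def unescape_newlines(text: str) -> str:
    """Turn the literal two-character sequence backslash + 'n' into a real newline."""
    return text.replace("\\n", "\n")


example = {"label": 0, "text": "Food-Great\\nService-sucks"}
print(label_to_stars(example["label"]), STAR_NAMES[example["label"]])
print(unescape_newlines(example["text"]))
```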

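#### Data Splits

The Yelp reviews full star dataset is constructed by randomly taking 130,000 training samples and 10,000 testing samples for each review star from 1 to 5. In total there are 650,000 training samples and 50,000 testing samples.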

## Downloading the Dataset

In [1]:
import os
# Keep Hugging Face caches (datasets, models) under this directory.
os.environ['HF_HOME'] = '/root/autodl-tmp/cache/'

In [2]:
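import subprocess
import os

# Source /etc/network_turbo (the host's proxy setup script) in a bash subshell,
# then copy the proxy variables it exports into this process's environment.
result = subprocess.run('bash -c "source /etc/network_turbo && env | grep proxy"', shell=True, capture_output=True, text=True)
output = result.stdout
for line in output.splitlines():
    if '=' in line:
        # Split only on the first '=' because values may contain '=' themselves.
        var, value = line.split('=', 1)
        os.environ[var] = value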
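
The parsing step in the cell above can be tried without the proxy script. The following is a minimal sketch of the same logic as a standalone function; `parse_env_lines` and the sample proxy values are hypothetical and only serve the illustration.

```python
import os


def parse_env_lines(output: str) -> dict:
    """Parse `env`-style output (one NAME=value per line) into a dict."""
    variables = {}
    for line in output.splitlines():
        if "=" not in line:
            continue  # skip blank or malformed lines
        name, value = line.split("=", 1)  # split once: values may contain '='
        variables[name] = value
    return variables


# Hypothetical sample output; real values depend on the host.
sample = "http_proxy=http://127.0.0.1:8080\nhttps_proxy=http://127.0.0.1:8080?a=b\n"
proxies = parse_env_lines(sample)
print(proxies)
# To apply them to the current process: os.environ.update(proxies)
```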

In [3]:
from datasets import load_dataset

dataset = load_dataset("yelp_review_full")

  from .autonotebook import tqdm as notebook_tqdm
Using the latest cached version of the dataset since yelp_review_full couldn't be found on the Hugging Face Hub
Found the latest cached dataset configuration 'yelp_review_full' at /root/autodl-tmp/cache/datasets/yelp_review_full/yelp_review_full/0.0.0/c1f9ee939b7d05667af864ee1cb066393154bf85 (last modified on Thu Feb  6 13:05:47 2025).


In [4]:
dataset

DatasetDict({
    train: Dataset({
        features: ['label', 'text'],
        num_rows: 650000
    })
    test: Dataset({
        features: ['label', 'text'],
        num_rows: 50000
    })
})

In [5]:
dataset["train"][111]

{'label': 2,
 'text': "As far as Starbucks go, this is a pretty nice one.  The baristas are friendly and while I was here, a lot of regulars must have come in, because they bantered away with almost everyone.  The bathroom was clean and well maintained and the trash wasn't overflowing in the canisters around the store.  The pastries looked fresh, but I didn't partake.  The noise level was also at a nice working level - not too loud, music just barely audible.\\n\\nI do wish there was more seating.  It is nice that this location has a counter at the end of the bar for sole workers, but it doesn't replace more tables.  I'm sure this isn't as much of a problem in the summer when there's the space outside.\\n\\nThere was a treat receipt promo going on, but the barista didn't tell me about it, which I found odd.  Usually when they have promos like that going on, they ask everyone if they want their receipt to come back later in the day to claim whatever the offer is.  Today it was one of th

In [6]:
import random
import pandas as pd
import datasets
from IPython.display import display, HTML

In [7]:
def show_random_elements(dataset, num_examples=10):
    assert num_examples <= len(dataset), "Can't pick more elements than there are in the dataset."
    picks = []
    for _ in range(num_examples):
        pick = random.randint(0, len(dataset)-1)
        while pick in picks:
            pick = random.randint(0, len(dataset)-1)
        picks.append(pick)
    
    df = pd.DataFrame(dataset[picks])
    for column, typ in dataset.features.items():
        if isinstance(typ, datasets.ClassLabel):
            df[column] = df[column].transform(lambda i: typ.names[i])
    display(HTML(df.to_html()))

In [8]:
show_random_elements(dataset["train"])

Unnamed: 0,label,text
0,2 star,"Checked this place out on Saturday night and left unimpressed. The place was pretty empty but we were still seated in the very back corner by the kitchen (all the customers were seated against the far wall which felt a bit strange). After being seated, we waited about 15 minutes for the waitress to come by to take our order. Throughout dinner, the waitresses were all huddled around the sushi bar chatting with the sushi chefs, and they seemed annoyed any time they needed to check on their customers. Plus, the music was blasting so loud all night that I could barely have a conversation.\n\nThe non-sushi menu is limited and has only the basic chicken teriyaki, etc. Also, there weren't any combo plates like other places offer. I ordered a couple of rolls and they were fine (the spicy tuna was actually ok) but they were all covered in not so great sauces that were too salty or sour. The chicken teriyaki came out on a huge plate and was all breast meat...however, the chicken was very very bland (the chicken appeared steamed and later topped with a little sauce and a pile of green onions).\n\nNext time, we'll stick with I Love Sushi...better sushi, better non-sushi, and better prices. Plus, much better service."
1,3 stars,"I'm a Panda Express fan and I've eaten at the Primm Buffalo Bill location on many occasions. It's fine. It's mostly like any other Panda, except this location, like mall locations, doesn't have the fresh brewed China MIst iced tea, which is the best iced tea that I've had at any fast food locations. Also, no free refills =(\n\nThe food is pretty much the same, although it seems like the rice and noodles are a little more greasy and tired looking than at my Panda. I'll keep eating there, but it's not my favorite Panda Express location."
2,3 stars,"Few things satisfy one's appetite in such a primal way as steak. And few chefs are as innovative with food as Michael Mina. So when searching for a place with an innovative approach to steak, StripSteak should have been a no brainer, right?\n\nAs it turned out, StripSteak was a bit of a letdown. The reason, however, is a little hard to pin down. Suffice it to say that I found the restaurant both overwhelming and underwhelming.\n\nWhat was overwhelming were the ways in which they attempted to \""bribe\"" you into loving the restaurant. This begins the second you sit down, when they parade out to your table a Trio of Duck Fat Fries. Now who can resist duck fat fries? Clearly I was being manipulated here, and the truffle oil fries did just that. That said, because they bring them out to every table, while lying in wait for their next victim, the fries don't come out as hot and fresh as they could be. In addition to the fries, they also bring out to your table a cast iron pan of potato focaccia bread, which was delicious but dense. All this before even getting our appetizers!\n\nFor our meal, we both opted for the prix fixe menu, which is a fair deal at $55.\n\n1st Course\n\nMaryland Blue Crab Chowder | Bacon Lardon, Parsnip -- Intense crab flavor but a little watery. Not my favorite.\n\nBibb Wedge | Avocado, Bacon, Oregon Smokey Blue -- Good flavor, but I felt like I have had this before. Surprisingly staid for Michael Mina. In fact, both these dishes were really underwhelming.\n\n2nd Course\n\nSlow-Poached Prime Rib of Beef | Select Seasonal Side Dishes -- The meat was well cooked, but honestly, I have made better prime rib at home. The side dishes consisted of a wonderful brussels sprouts slaw, a predictable truffled mac and cheese, and an insipid creamed spinach.\n\nDry-Aged Bone-In Ribeye ($10 surcharge) -- This was delicious, perfectly cooked, and full of flavor. 
Now this is the reason that one comes to a steakhouse, not for all the distractions and bribes. And this is the reason I brought the 2006 Scarecrow Cabernet Sauvignon ($35 corkage), which paired perfectly with the meat.\n\n3rd Course\n\nMichael's Root Beer Float | Sassafras Ice Cream, Chocolate Chip Cookies -- Pure yum! I could have had a pitcher of this.\n\nFinally, as if this wasn't enough, before you leave and write your Yelp review, they present you with a bag of Chocolate Almond Brittle. Again, a bribe.\n\nThe service was for the most friendly and unpretentious, with the exception of the person who came to open and decant our wine (I don't think he was the somm). He seemed a little snotty and did not smile at all.\n\nSo would I come back again? Probably not, but I did enjoy some aspects of our meal. The next time I'm in Las Vegas and want to do a Michael Mina restaurant, I'll head to Nobhill Tavern instead."
3,3 stars,Used to come here alot when they first opened and probably would have rated them with 4 stars. Seems the prices have gone up a bit since then and the food is more or less the same. The place is still busy and in a good location.\n\nI still go every so often and still enjoy myself. Just feels a little overpriced now.
4,2 star,"Food-Great\nService-sucks\n\nWe have been here 4 times since they opened recently. The very first time, Service was wonderful, everything was A-okay. Second time, we called ahead since they offer that. We asked for a table for 4. We get there and check in to be told its a 35 minutes wait. Um, Ok. Why call ahead....? So then after the wait they lead us to a booth. I said we have to have a table...They explain that the take out employees take the call aheads and dont take any requests down. Um ok...well we wait in line again another 25 minutes. Finally get a table and guess what....No waiter was assigned to our table. Finally I flag a gentleman down after sitting for over 10 minutes and turns out he is a trainer in town to train this location since their so new....Well, He was wonderful. OF COURSE. Food was still a plus. Third and fourth visits...Pretty much same shit. Forget the food...I will go where they have there shit together !!"
5,1 star,"I've heard good things about oh yeah ice cream and I was psyched knowing that they are the only ice cream shop in Pittsburgh with vegan ice cream choices, we go on a Friday night, the place is packed, which is a good sign , we wait in line for about 40 minutes... There were about 8 people in front of us and two clerks that seemed to have to have long conversations with each customer. Which normally I'm a huge supporter of great customer service but 40 min to get through a line of 8 people... Come on! When we finally get served the clerk informs me they only have chocolate vegan ice cream... Which I hate chocolate and had my heart and sweet tooth set on vanilla bean coconut milk I've cream...he then informs me they placed a huge soy order and to come back in 5 days and they will have it all...does this help me now? No! So not only am I disappointed , but also being told to make another trip . This is one of the most completely annoying places that I've been. It's obvious the vegan ice creams are popular, there are silly hand written signs everywhere, do you think maybe you could make a productive sign for the front door saying your out of the vegan flavors before your vegan patrons have to waste their time waiting in line? I really wanted this to be our new date night dessert spot, but tofutti ice cream from my grocers freezer it is...fail!"
6,4 stars,"Good alternative to pho, I've had different kinds of shabu shabu this to me is mediocre. Popcorn chicken is also mediocre same stuff you get from boba tea houses. Good thing the prices are not outrageous"
7,2 star,"Our journey to Vintner Wine Market was born in a groupon email that we decided to go for. Half off booze and food...who are we to turn that down. So of course we wait until the day it expires and head on over. \n\nThe place is pretty easy to find. It's in the Arboretum shopping center and parking is easy. Basically it's an ex-music store in a glorified strip mall but the atmosphere is as good as it can be given the circumstances.\n\nThe food we can save you a lot of time on: don't bother. It's expensive and portions are small but not in the good fancy restaurant kind of way. The crab dip we got for an appetizer was a bit misleading unless you knew that \""crab dip\"" actually meant \""heated up mayonnaise\"". And the 10 honey chipotle wings I ordered were literally the smallest wings I've ever had. I could have held all 10 in the palm of one hand. They were listed as free range chicken wings...if that's the case I'll take the hormones, thanks. The flatbread pizza thingy's were ok though but a small portion.\n\nDrinks however are a different story. Lots of beers on tap and bottled. A whole store of wine by the bottle plus a decent by the glass selection. We liked both (red) wines we got. That's really the moral of the story here: Go for the alcohol and skip the rest.\n\nWe give it 2 vintners out of 5 for our experience. Although as a place just to go for drinks we would bump that up to 3."
8,3 stars,"Took a date here on my mission to try any and all legitimate Italian restaurants in town as new ones pop up. I was slacking on trying Bacio so I finally wanted to check it off my list. Get there for my reservation and Nicole the hostess was friendly and walked us to our 2 top next to the window over looking the pool. The restaurant looks very Miami with white everywhere. It looks very nice though. \n\nThey bring bread with olive oil and balsamic vinegar to the table in what I would describe as a small gravy float. The taste was very good but then problem is what with the gravy float type container the olive oil and balsamic was in it is impossible to get the correct oil to vinegar ratio because the balsamic sinks to the bottom and the dish is too deep to get the vinegar without the oil washing most of it away. The quality was great but a shallow saucer with the same mixture would be much better. \n\nThe serve comes to get our appetizer and drink orders. My date orders a glass of wine (very nice wines on the list) and I order an Aperal Spritz. The server comes back after about a minute and says they are out of the Aperal wine, I pointed out it was a liquor not a wine. He comes back shortly with both drinks. The spritz was made perfectly. The waiter was either new to Bacio or just didnt know his menu very well as there are only like 6 cocktails to choose from.\n\nFor appetizers She orders the Cesar and I order the Caprese. She enjoyed the ceasar and the caprese was good as well. Great quality tomatoes and olive oil with Bufalo mozarella. I wish there was larger pieces of basil though and they were micro slices instead. Different but it didn't really take away from the great quality I just would have loved it more if they had larger pieces of basil. \n\nWe order out entrees and she orders the Tagliolini Neri ai Frutti di Mare as she was in the moos for seafood and I order the Salsa Rustica with Spaghetti with Buffalo Mozzarella. 
We get the dishes after a perfect wait. The portions were fairly generous for the quality of the ingredients. She really liked her Frutti di mare and I was pretty impressed with my pasta. The pasta was cooked perfectly and the sauce was nice. The pasta was much better, twice as big and better flavor that that of Scarpetta at the Cosmo for a few dollars less.\n\nSide Note:\nWhile we were eating Carla was dining at a table right behind us with a guy in a suit. She is an attractive women and the pictures are accurate as to how she looks, not some stock photo from 1982 as many chefs and celebrities use. On several times during our meal she was talking and venting about her divorce of Frank Jr. from RAO's. Getting emotional as I can only imagine it had to be. I am always one to mind my own business unless someone makes it my business. I just feel it's inappropriate and unprofessional for a chef and owner to vent anything out where the public can hear or see her getting emotional about issues that the diners should have no business hearing. take it in the kitchen or stay home if you can't keep it together. It's easy to get caught up in emotion but as a professional there is a time and a place and your business isn't it. \n\nNow on for Dessert. We decide to share the the Tiramisu. I am a Tiramisu connoisseur. It is my favorite dessert of all time. Cannoli as well but Tiramisu is top of the pile. It is a dessert that had pretty specific ingredients yet every restaurant seems to have such drastically different consistency, flavor and presentation. That is also part of the intrigue of the dessert to me. The portion size is fairly large and very nice presentation. It had 2 small cookies on the plate with the Tiramisu. The favor of the Tiramisu was different. Not in a bad way but not in the way that I prefer either. It was creamy but the coco and espresso favors didn't balance out right with the other flavors. 
It was not bad, it just wasn't as impressive as the rest of the dinner was. The 2 cookies the size of a half dollar were the best part of the Tiramisu. So the star was not something you would really find traditionally. \n\nTo me the appetizers and meals were a solid 4 stars. Very good food. The server was decent, even though the server didn't seems to know his menu all that well. I had to ask the server for dessert instead of him suggesting it and trying to up sell. He was a nice and friendly guy, he could just use some more experience at Bacio. I would say he was about 40 so it wasn't for lack of experience in general, just at Bacio. The unprofessional behavior of Carla was a bit of a disappointment even though it didn't take away from the dinner. For this reason I am giving it an over all of 3 stars because of the small downers even though it is easily a solid 4 star restaurant.maybe just an off night my date and I went. I love the name Bacio for the place too."
9,1 star,The pharmacist here are by far the worst. The wait time is always 2 hours. And when I ask them for help to find another location that can assist me they say figure it out.\n\nThis is especially for the pharmacist wearing all the rose buttons . There was no reason for you to be rude to me today


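## Preprocessing the Data

After downloading the dataset, use a tokenizer to process the text. Inputs of varying length can be handled with padding and truncation strategies.

The `map` method of Datasets applies a preprocessing function over the entire dataset in one pass.

The cell below pads every example to the maximum length and processes the whole dataset: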

In [9]:
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")


def tokenize_function(examples):
    return tokenizer(examples["text"], padding="max_length", truncation=True)


tokenized_datasets = dataset.map(tokenize_function, batched=True)
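
To see what `padding="max_length"` and `truncation=True` do to a single sequence, here is a toy sketch in plain Python. It is not the tokenizer's implementation: `pad_and_truncate` is a hypothetical helper, the token ids are made up, and special tokens (which the real tokenizer keeps in place when truncating) are ignored.

```python
def pad_and_truncate(token_ids, max_length, pad_id=0):
    """Return (input_ids, attention_mask), both exactly max_length long."""
    kept = token_ids[:max_length]              # truncation: drop the overflow
    n_pad = max_length - len(kept)             # padding: fill the remainder
    input_ids = kept + [pad_id] * n_pad
    attention_mask = [1] * len(kept) + [0] * n_pad  # 1 = real token, 0 = padding
    return input_ids, attention_mask


short_ids, short_mask = pad_and_truncate([101, 146, 1176, 102], max_length=6)
long_ids, long_mask = pad_and_truncate(list(range(10)), max_length=6)
print(short_ids, short_mask)
print(long_ids, long_mask)
```

Padding every review to the maximum length keeps all examples the same shape, at the cost of spending compute on padding positions for short reviews.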

In [10]:
show_random_elements(tokenized_datasets["train"], num_examples=1)

Unnamed: 0,label,text,input_ids,token_type_ids,attention_mask
0,2 star,"*SIGH*\n\nI wanted to like you...I really did, what...with your quirky menu that doesn't quite make sense (salads, fried fish & chips, burgers as wraps?). I even gave you two tries, just in case the first was a fluke. The first time, I ordered the burger wrap...it sounded like a good idea...but I didn't like it...at all. This time I decided to go with a cup of Tuscany soup (chili mac) and the noodle stir fry w/ a soy cafe mocha to wash it down.... \n\nLet me preface this by saying I ordered through Living Social's new delivery app. \n\nUnfortunately, it took 2 hours for my order to arrive, when I called the restaurant after 1 1/2 hours, the guy on the other line was very abrupt and said that the food would be here in less than 5 minutes...well, 5 minutes turned into 30 minutes.\n\nThe soup was pretty tasty, nothing incredible, but tasty. \n\nThe noodle stir fry was...kinda icky. It was somewhat soupy with charred/burnt pieces of chicken (which, strangely enough, was my favorite part...), overcooked veggies, undercooked pasta (or maybe that's just the wheat pasta texture...) and they completely forgot my cafe mocha (I wanted to cry....really I did). The guy that delivered seemed not terribly concerned and just shrugged it off saying that they'd remove it from my card.\n\nSorry, but this place just isn't cheap enough for me to be okay with you getting it wrong a 3rd time. 
\n\n...I know this sounds like a 1 star review, but I just don't have the heart to give them less than 2 for some reason...weird.","[101, 115, 156, 23413, 3048, 115, 165, 183, 165, 183, 2240, 1458, 1106, 1176, 1128, 119, 119, 119, 146, 1541, 1225, 117, 1184, 119, 119, 119, 1114, 1240, 186, 6592, 15538, 13171, 1115, 2144, 112, 189, 2385, 1294, 2305, 113, 19359, 1116, 117, 15688, 3489, 111, 13228, 117, 171, 23872, 1116, 1112, 21738, 136, 114, 119, 146, 1256, 1522, 1128, 1160, 4642, 117, 1198, 1107, 1692, 1103, 1148, 1108, 170, 23896, 2391, 119, 1109, 1148, 1159, 117, 146, 2802, 1103, 171, 23872, 10561, 119, 119, 119, 1122, 4234, 1176, 170, 1363, 1911, 119, 119, 119, 1133, 146, 1238, 112, 189, ...]","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...]","[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...]"


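### Sampling the Data

The cell below draws 10,000 shuffled examples from each split for small-scale experiments with BERT (using the PyTorch Trainer). It also keeps references to the full splits, which the training run later in this notebook uses.

The `shuffle()` function randomly reorders the rows of the dataset. For more control over the shuffling algorithm, pass the `generator` parameter to use a different `numpy.random.Generator`.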

In [11]:
small_train_dataset = tokenized_datasets["train"].shuffle(seed=42).select(range(10000))
small_eval_dataset = tokenized_datasets["test"].shuffle(seed=42).select(range(10000))
full_train_dataset = tokenized_datasets["train"]
full_eval_dataset = tokenized_datasets["test"]
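
The `shuffle(seed=...).select(range(n))` pattern amounts to a seeded permutation followed by taking the first `n` rows. A minimal standard-library sketch of that idea follows; `shuffled_subset` is a hypothetical helper, and the resulting order will not match the permutation that Datasets produces for the same seed.

```python
import random


def shuffled_subset(rows, n, seed):
    """Shuffle a copy of the row indices with a fixed seed, then keep the first n."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)   # private RNG: same seed, same order
    return [rows[i] for i in indices[:n]]


rows = [f"review-{i}" for i in range(20)]
first = shuffled_subset(rows, n=5, seed=42)
second = shuffled_subset(rows, n=5, seed=42)
print(first)
print(first == second)
```

Fixing the seed is what makes the subset reproducible across notebook restarts.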

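## Fine-Tuning Configuration

### Loading the BERT Model

The warning printed below tells us that some weights (`classifier.weight` and `classifier.bias`) were not loaded from the checkpoint and are randomly initialized. This is perfectly normal when fine-tuning: the head used for the pretraining task is dropped and replaced with a new classification head for which no pretrained weights exist. The library therefore warns that the model should be fine-tuned before it is used for inference, which is exactly what we are about to do.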

In [12]:
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("bert-base-cased", num_labels=5,)

Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-cased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
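
The warning comes down to comparing parameter names in the checkpoint with those the new model expects. The sketch below illustrates that comparison with set differences; `compare_keys` and the shortened key lists are hypothetical and do not reproduce the library's loading code.

```python
def compare_keys(checkpoint_keys, model_keys):
    """Return (newly_initialized, unused) parameter names."""
    checkpoint_keys, model_keys = set(checkpoint_keys), set(model_keys)
    newly_initialized = sorted(model_keys - checkpoint_keys)  # in model, not in checkpoint
    unused = sorted(checkpoint_keys - model_keys)             # in checkpoint, not in model
    return newly_initialized, unused


# Hypothetical, heavily shortened key lists for illustration only.
checkpoint = ["encoder.layer.0.weight", "pooler.dense.weight", "mlm_head.bias"]
classifier_model = ["encoder.layer.0.weight", "pooler.dense.weight",
                    "classifier.weight", "classifier.bias"]

new_keys, unused_keys = compare_keys(checkpoint, classifier_model)
print(new_keys)
print(unused_keys)
```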


### Training Hyperparameters (TrainingArguments)

Full list of parameters and default values: https://huggingface.co/docs/transformers/v4.36.1/en/main_classes/trainer#transformers.TrainingArguments

Source definition: https://github.com/huggingface/transformers/blob/v4.36.1/src/transformers/training_args.py#L161

**The most important setting: the path where model weights are saved (`output_dir`)**

In [13]:
from transformers import TrainingArguments

model_dir = "/root/autodl-tmp/models/bert-base-cased-finetune-yelp"

# logging_steps defaults to 500; set it to 100 to suit our training data and step count
training_args = TrainingArguments(output_dir=model_dir,
                                  per_device_train_batch_size=65,
                                  num_train_epochs=5,
                                  logging_steps=100,
                                  save_steps=1000,
                                  save_total_limit=5)

In [14]:
# Print the full hyperparameter configuration
print(training_args)

TrainingArguments(
_n_gpu=1,
accelerator_config={'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None, 'use_configured_state': False},
adafactor=False,
adam_beta1=0.9,
adam_beta2=0.999,
adam_epsilon=1e-08,
auto_find_batch_size=False,
batch_eval_metrics=False,
bf16=False,
bf16_full_eval=False,
data_seed=None,
dataloader_drop_last=False,
dataloader_num_workers=0,
dataloader_persistent_workers=False,
dataloader_pin_memory=True,
dataloader_prefetch_factor=None,
ddp_backend=None,
ddp_broadcast_buffers=None,
ddp_bucket_cap_mb=None,
ddp_find_unused_parameters=None,
ddp_timeout=1800,
debug=[],
deepspeed=None,
disable_tqdm=False,
dispatch_batches=None,
do_eval=False,
do_predict=False,
do_train=False,
eval_accumulation_steps=None,
eval_delay=0,
eval_do_concat_batches=True,
eval_on_start=False,
eval_steps=None,
eval_strategy=IntervalStrategy.NO,
eval_use_gather_object=False,
evaluation_str

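### Evaluating Metrics During Training (Evaluate)

The **[Hugging Face Evaluate library](https://huggingface.co/docs/evaluate/index)** gives access to dozens of evaluation methods across domains (NLP, computer vision, reinforcement learning, and more) with a single line of code. The **full list of currently supported metrics** is at https://huggingface.co/evaluate-metric

The Trainer does not automatically evaluate model performance during training, so we need to pass it a function that computes and reports metrics.

The Evaluate library provides a simple accuracy function, which you can load with `evaluate.load`: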

In [15]:
import numpy as np
import evaluate

metric = evaluate.load("accuracy")


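Next, call `compute` on the metric to calculate the accuracy of the predictions.

Before passing predictions to `compute`, we need to convert the logits into predictions (**all Transformers models return logits**).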

In [16]:
def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return metric.compute(predictions=predictions, references=labels)
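
The logits-to-accuracy step can be checked by hand on a tiny batch. This is a dependency-free sketch of the same computation that `np.argmax` and the accuracy metric perform; the logits and labels are made up, and `argmax` and `accuracy` are local helpers rather than library functions.

```python
def argmax(row):
    """Index of the largest value in a list of scores."""
    return max(range(len(row)), key=row.__getitem__)


def accuracy(logits, labels):
    """Fraction of rows whose highest-scoring class equals the label."""
    predictions = [argmax(row) for row in logits]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Made-up logits for 4 reviews over 5 classes (0 = 1 star ... 4 = 5 stars).
logits = [
    [2.1, 0.3, -1.0, -0.5, -2.0],   # predicts class 0
    [-1.2, 0.1, 1.9, 0.4, -0.3],    # predicts class 2
    [-2.0, -1.0, 0.2, 1.1, 3.0],    # predicts class 4
    [0.0, 1.5, 0.2, -0.1, -0.9],    # predicts class 1
]
labels = [0, 2, 3, 1]
print(accuracy(logits, labels))
```

Three of the four predictions match their labels, so the sketch reports an accuracy of 0.75.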

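#### Monitoring Metrics During Training

To monitor how evaluation metrics change during training, set the `evaluation_strategy` parameter in `TrainingArguments` so that metrics are reported at the end of each epoch. Newer Transformers releases call this parameter `eval_strategy` (the printed configuration above lists it under that name) and deprecate the old spelling.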

In [17]:
from transformers import TrainingArguments, Trainer

training_args = TrainingArguments(output_dir=model_dir,
                                  evaluation_strategy="epoch", 
                                  per_device_train_batch_size=65,
                                  num_train_epochs=3,
                                  logging_steps=100,
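                                  # Stored on the arguments only: Trainer does not act on this field.
                                  # To actually resume, call trainer.train(resume_from_checkpoint=True).
                                  resume_from_checkpoint=True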
                                 )
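
During training the Trainer writes checkpoints into `checkpoint-<global_step>` subfolders of `output_dir`, and resuming continues from the most recent one. The sketch below shows how such a folder could be located with the standard library; `latest_checkpoint` is a hypothetical helper and the step numbers in the demo are made up.

```python
import os
import re
import tempfile

CHECKPOINT_RE = re.compile(r"^checkpoint-(\d+)$")


def latest_checkpoint(output_dir):
    """Return the path of the checkpoint-<step> folder with the highest step, or None."""
    best_step, best_path = -1, None
    for name in os.listdir(output_dir):
        match = CHECKPOINT_RE.match(name)
        path = os.path.join(output_dir, name)
        if match and os.path.isdir(path) and int(match.group(1)) > best_step:
            best_step, best_path = int(match.group(1)), path
    return best_path


# Demo on a throwaway directory with made-up step numbers.
with tempfile.TemporaryDirectory() as demo_dir:
    for step in (500, 1000, 10000):
        os.mkdir(os.path.join(demo_dir, f"checkpoint-{step}"))
    found = os.path.basename(latest_checkpoint(demo_dir))
print(found)
```

The steps are compared as integers because a plain string sort would rank `checkpoint-500` after `checkpoint-10000`.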



## Start Training

### Instantiating the Trainer

The `kernel version` warning printed below does not currently affect running this example.

In [18]:
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=full_train_dataset,
    eval_dataset=full_eval_dataset,
    compute_metrics=compute_metrics,
)

Detected kernel version 5.4.0, which is below the recommended minimum of 5.5.0; this can cause the process to hang. It is recommended to upgrade the kernel to the minimum version or higher.


## Checking GPU Usage with nvidia-smi

To watch GPU usage in real time, poll with the `watch` command: `watch -n 1 nvidia-smi`:

```shell
Every 1.0s: nvidia-smi                                                   Wed Dec 20 14:37:41 2023

Wed Dec 20 14:37:41 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.129.03             Driver Version: 535.129.03   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Tesla T4                       Off | 00000000:00:0D.0 Off |                    0 |
| N/A   64C    P0              69W /  70W |   6665MiB / 15360MiB |     98%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A     18395      C   /root/miniconda3/bin/python                6660MiB |
+---------------------------------------------------------------------------------------+
```
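
The same readings can be collected from Python through the query mode of `nvidia-smi`, which prints one CSV line per GPU. This is a minimal sketch: `parse_gpu_line` and `read_gpus` are hypothetical helpers, and `read_gpus` only works on a machine with an NVIDIA driver installed.

```python
import subprocess

QUERY = ["nvidia-smi",
         "--query-gpu=utilization.gpu,memory.used,memory.total",
         "--format=csv,noheader,nounits"]


def parse_gpu_line(line):
    """Parse one 'util, used, total' CSV line into a dict of integers."""
    util, used, total = (int(field.strip()) for field in line.split(","))
    return {"util_percent": util, "memory_used_mib": used, "memory_total_mib": total}


def read_gpus():
    """Run nvidia-smi once and return one dict per GPU (requires an NVIDIA driver)."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return [parse_gpu_line(line) for line in out.splitlines() if line.strip()]


# Values taken from the snapshot above, written in the query's CSV layout.
print(parse_gpu_line("98, 6665, 15360"))
```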

In [19]:
trainer.train()

Epoch,Training Loss,Validation Loss,Accuracy
1,0.7228,0.711266,0.68656
2,0.6229,0.702056,0.69478
3,0.5589,0.731692,0.69498


TrainOutput(global_step=30000, training_loss=0.6564423007965088, metrics={'train_runtime': 19613.9315, 'train_samples_per_second': 99.419, 'train_steps_per_second': 1.53, 'total_flos': 5.130803778048e+17, 'train_loss': 0.6564423007965088, 'epoch': 3.0})
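
The figures in `TrainOutput` can be cross-checked against the configuration. The arithmetic below assumes a single GPU and no gradient accumulation, so that one optimizer step consumes exactly one batch of 65 samples.

```python
import math

train_samples = 650_000      # size of the full training split
batch_size = 65              # per_device_train_batch_size, on a single GPU
epochs = 3                   # num_train_epochs
train_runtime = 19613.9315   # seconds, from TrainOutput

steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * epochs
samples_per_second = train_samples * epochs / train_runtime
steps_per_second = total_steps / train_runtime

print(steps_per_epoch, total_steps)
print(round(samples_per_second, 3), round(steps_per_second, 2))
```

The result is 10,000 steps per epoch and 30,000 steps in total, which agrees with `global_step=30000`, and the throughput values agree with the reported `train_samples_per_second` and `train_steps_per_second`.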

In [20]:
small_test_dataset = tokenized_datasets["test"].shuffle(seed=64).select(range(100))

In [21]:
trainer.evaluate(small_test_dataset)

{'eval_loss': 0.8205113410949707,
 'eval_accuracy': 0.65,
 'eval_runtime': 0.3972,
 'eval_samples_per_second': 251.759,
 'eval_steps_per_second': 32.729,
 'epoch': 3.0}
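
An accuracy measured on only 100 samples is a noisy estimate, which helps explain why 0.65 differs from the 0.69498 measured on the full test split after epoch 3. A rough sketch of the sampling uncertainty, assuming a normal approximation to the binomial distribution:

```python
import math

n = 100          # size of small_test_dataset
p = 0.65         # eval_accuracy reported on those 100 samples

standard_error = math.sqrt(p * (1 - p) / n)
low, high = p - 1.96 * standard_error, p + 1.96 * standard_error
print(f"{standard_error:.4f}", f"{low:.3f}", f"{high:.3f}")
```

The approximate 95% interval spans roughly 0.56 to 0.74, so the full-split accuracy lies comfortably inside it.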

### Saving the Model and Training State

- Use `trainer.save_model` to save the model; it can later be reloaded with `from_pretrained()`
- Use `trainer.save_state` to save the training state

In [22]:
trainer.save_model(model_dir)

In [23]:
trainer.save_state()

In [24]:
# trainer.model.save_pretrained("./")
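
Reloading the saved model for inference might look like the sketch below. It assumes the saved folder holds no tokenizer files (the Trainer above was created without a tokenizer), so the tokenizer is loaded from the base checkpoint. `load_finetuned` and `logits_to_stars` are hypothetical helpers, and the logits in the demo are made up.

```python
import math


def load_finetuned(model_dir, tokenizer_name="bert-base-cased"):
    """Load the saved classifier plus the tokenizer of the base checkpoint."""
    # Imported lazily so the helper below can be used without transformers installed.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer
    model = AutoModelForSequenceClassification.from_pretrained(model_dir)
    tokenizer = AutoTokenizer.from_pretrained(tokenizer_name)
    return model, tokenizer


def logits_to_stars(logits):
    """Softmax a list of 5 logits; return (star rating 1-5, its probability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]   # subtract max for numerical stability
    probs = [e / sum(exps) for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return best + 1, probs[best]


stars, prob = logits_to_stars([-1.0, 0.2, 2.5, 0.7, -0.4])   # made-up logits
print(stars, round(prob, 3))
```

With a trained model, the logits for one review could be obtained with `model(**tokenizer(text, return_tensors="pt", truncation=True)).logits[0].tolist()` and passed to `logits_to_stars`.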

## Homework: train on the full YelpReviewFull dataset and see how high the accuracy can go