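Pipeline: a workflow that chains three stages (data preprocessing, model invocation, and result post-processing) so that we can pass in raw text and get the final result directly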
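
To make the three-stage idea concrete before touching the real library, here is a minimal sketch in plain Python. The functions `preprocess`, `forward`, `postprocess`, and `toy_pipeline` are hypothetical stand-ins, and the keyword-counting "model" is only an illustration, not how transformers scores text.

```python
from typing import Dict, List

# Hypothetical stand-ins for the three stages. A real pipeline would use a
# tokenizer, a neural network, and a label mapping instead.
def preprocess(text: str) -> List[str]:
    """Stage 1: turn raw text into model inputs (here: lowercase tokens)."""
    return text.lower().split()

def forward(tokens: List[str]) -> Dict[str, float]:
    """Stage 2: the 'model' call (here: a toy keyword counter)."""
    positive = {"good", "great", "love"}
    negative = {"bad", "awful", "hate"}
    pos = sum(tok.strip(".,!?") in positive for tok in tokens)
    neg = sum(tok.strip(".,!?") in negative for tok in tokens)
    return {"positive": float(pos), "negative": float(neg)}

def postprocess(scores: Dict[str, float]) -> Dict[str, object]:
    """Stage 3: turn raw scores into a readable result."""
    label = max(scores, key=scores.get)
    return {"label": label, "score": scores[label]}

def toy_pipeline(text: str) -> Dict[str, object]:
    """Chain the stages so the caller only sees text in, result out."""
    return postprocess(forward(preprocess(text)))

print(toy_pipeline("I love this great course!"))
```

The caller never sees tokens or raw scores, which is the convenience the real `pipeline` function provides.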

In [1]:
from transformers.pipelines import SUPPORTED_TASKS

List the supported task types

In [None]:
for k, v in SUPPORTED_TASKS.items():
    print(k, v)

Creating and using a Pipeline

In [5]:
from transformers import pipeline

Create a pipeline directly from the task type

In [None]:
pipe = pipeline("text-classification")
pipe("I've been waiting for a HuggingFace course my whole life.")

Create a pipeline from the task type and an explicit model name

In [None]:
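# Browse available models at https://huggingface.co/models
# device=0 places the pipeline on the first GPU (requires CUDA)
pipe = pipeline("text-classification", model="uer/roberta-base-finetuned-dianping-chinese", device=0)
pipe("我觉得不太行！")  # "I don't think it's very good!"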

Load the model first, then create the pipeline

In [None]:
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# With this approach, both the model and the tokenizer must be passed explicitly
model = AutoModelForSequenceClassification.from_pretrained("uer/roberta-base-finetuned-dianping-chinese")
tokenizer = AutoTokenizer.from_pretrained("uer/roberta-base-finetuned-dianping-chinese")
# The device is set on the pipeline (device=0 is the first GPU), not in from_pretrained
pipe = pipeline("text-classification", model=model, tokenizer=tokenizer, device=0)
pipe("我觉得不太行！")

In [None]:
pipe.model.device
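
The cells above hard-code `device=0`, which fails on a machine without a GPU. A small helper can choose the value instead. This is a sketch: `pick_device` is a hypothetical name, and it assumes the integer convention for the pipeline's `device` argument in which -1 means CPU and a non-negative integer is a CUDA device index.

```python
def pick_device(cuda_available: bool, gpu_index: int = 0) -> int:
    """Return a value for the pipeline's `device` argument.

    Assumption: -1 selects the CPU and a non-negative integer selects
    the CUDA device with that index.
    """
    return gpu_index if cuda_available else -1

# In the notebook this could be used roughly as follows:
#   import torch
#   device = pick_device(torch.cuda.is_available())
#   pipe = pipeline("text-classification", model=model,
#                   tokenizer=tokenizer, device=device)
print(pick_device(True))
print(pick_device(False))
```

Checking `pipe.model.device` afterwards confirms where the model actually ended up.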

Finding out which parameters a Pipeline accepts

In [None]:
qa_pipe = pipeline("question-answering", model="uer/roberta-base-chinese-extractive-qa")

In [None]:
qa_pipe

In [None]:
# QuestionAnsweringPipeline

In [None]:
# Question: "What is the capital of China?"  Context: "The capital of China is Beijing."
qa_pipe(question="中国的首都是哪里？", context="中国的首都是北京", max_answer_len=1)
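
The `max_answer_len` argument in the call above caps the length of the extracted answer. The sketch below illustrates the general idea of length-limited span selection in plain Python. It is a simplified illustration rather than the library's decoding code: `best_span` is a hypothetical helper, the scores are made up, and it assumes one token per Chinese character.

```python
from typing import List, Tuple

def best_span(start_scores: List[float], end_scores: List[float],
              max_answer_len: int) -> Tuple[int, int]:
    """Return inclusive (start, end) token indices of the best-scoring
    span that covers at most max_answer_len tokens."""
    best = (0, 0)
    best_score = float("-inf")
    for s, s_score in enumerate(start_scores):
        # The end may not precede the start, and the span length is capped.
        last_end = min(s + max_answer_len, len(end_scores))
        for e in range(s, last_end):
            score = s_score + end_scores[e]
            if score > best_score:
                best_score = score
                best = (s, e)
    return best

# Assumption: one token per character. All scores are invented.
tokens = list("中国的首都是北京")
start_scores = [0.1, 0.1, 0.1, 0.1, 0.1, 0.2, 5.0, 0.5]
end_scores = [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 1.0, 4.0]

for limit in (2, 1):
    s, e = best_span(start_scores, end_scores, max_answer_len=limit)
    print(limit, "".join(tokens[s:e + 1]))
```

With these invented scores, a limit of 2 allows the full two-character span, while a limit of 1 forces a single-character answer, which is why a very small `max_answer_len` can truncate an otherwise correct answer.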

Other Pipeline examples

In [None]:
checkpoint = "google/owlvit-base-patch32"
detector = pipeline(model=checkpoint, task="zero-shot-object-detection")

In [None]:
import requests
from PIL import Image

url = "https://unsplash.com/photos/oj0zeY2Ltk4/download?ixid=MnwxMjA3fDB8MXxzZWFyY2h8MTR8fHBpY25pY3xlbnwwfHx8fDE2Nzc0OTE1NDk&force=true&w=640"
im = Image.open(requests.get(url, stream=True).raw)
im

In [None]:
predictions = detector(
    im,
    candidate_labels=["hat", "sunglasses", "book"],
)
predictions

In [None]:
from PIL import ImageDraw

draw = ImageDraw.Draw(im)

for prediction in predictions:
    box = prediction["box"]
    label = prediction["label"]
    score = prediction["score"]
    xmin, ymin, xmax, ymax = box.values()
    draw.rectangle((xmin, ymin, xmax, ymax), outline="red", width=1)
    draw.text((xmin, ymin), f"{label}: {round(score,2)}", fill="red")

im
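
The drawing loop above renders every prediction, including weak ones. One option is to filter by score before drawing. The sketch below assumes predictions shaped like those used in the loop (dicts with `score`, `label`, and a `box` dict). `filter_predictions`, the 0.3 threshold, and the sample values are all hypothetical.

```python
from typing import Any, Dict, List

def filter_predictions(predictions: List[Dict[str, Any]],
                       min_score: float = 0.3) -> List[Dict[str, Any]]:
    """Keep predictions with score >= min_score, highest score first."""
    kept = [p for p in predictions if p["score"] >= min_score]
    return sorted(kept, key=lambda p: p["score"], reverse=True)

# Made-up predictions that mimic the structure used in the drawing loop.
sample_predictions = [
    {"score": 0.12, "label": "book",
     "box": {"xmin": 10, "ymin": 20, "xmax": 60, "ymax": 90}},
    {"score": 0.81, "label": "hat",
     "box": {"xmin": 100, "ymin": 15, "xmax": 180, "ymax": 70}},
    {"score": 0.45, "label": "sunglasses",
     "box": {"xmin": 110, "ymin": 75, "xmax": 170, "ymax": 95}},
]

for p in filter_predictions(sample_predictions):
    print(p["label"], p["score"], p["box"])
```

Passing the filtered list to the drawing loop instead of `predictions` would leave only the more confident boxes on the image.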

How a Pipeline works under the hood

In [None]:
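from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

tokenizer = AutoTokenizer.from_pretrained("uer/roberta-base-finetuned-dianping-chinese")
model = AutoModelForSequenceClassification.from_pretrained("uer/roberta-base-finetuned-dianping-chinese")

# Step 1: preprocessing. The text means "The food at this restaurant tastes great."
input_text = "这家餐厅的菜味道很好"
inputs = tokenizer(input_text, return_tensors="pt")

# Step 2: model call (inference only, so gradients are not needed)
with torch.no_grad():
    res = model(**inputs)

# Step 3: post-processing. Softmax turns the logits into probabilities.
logits = res.logits
probs = torch.softmax(logits, dim=-1)

pred = torch.argmax(probs, dim=-1).item()
pred

# model.config.id2label

result = model.config.id2label.get(pred)
result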
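
The post-processing step at the end of the cell above (softmax, argmax, label lookup) can also be written without torch, which makes the arithmetic easier to follow. This is a sketch with invented numbers: `example_logits` and `example_id2label` are hypothetical and do not come from the model used in this notebook.

```python
import math
from typing import Dict, List

def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def postprocess(logits: List[float], id2label: Dict[int, str]) -> Dict[str, object]:
    """Map raw logits to the most probable label and its probability."""
    probs = softmax(logits)
    pred = max(range(len(probs)), key=probs.__getitem__)  # argmax
    return {"label": id2label[pred], "score": probs[pred]}

# Hypothetical values for illustration only.
example_logits = [-1.5, 2.0]
example_id2label = {0: "negative", 1: "positive"}

print(postprocess(example_logits, example_id2label))
```

The returned dict has the same `label` and `score` keys as the output of the text-classification pipeline, so the manual steps and the one-line `pipe(...)` call can be compared directly.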