# HF Transformers Core Modules: Pipelines

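**Pipelines** are a simple, approachable way to run inference with a model.

A pipeline is an object that abstracts away most of the complex code in the Transformers library and exposes a simple API dedicated to a range of tasks, including **named entity recognition, masked language modeling, sentiment analysis, feature extraction, and question answering**.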


| Modality                    | Task                         | Description                                                                                          | Pipeline API                                    |
| --------------------------- | ---------------------------- | ---------------------------------------------------------------------------------------------------- | ----------------------------------------------- |
| Audio                       | Audio classification         | Assign a label to an audio file                                                                      | `pipeline(task="audio-classification")`         |
|                             | Automatic speech recognition | Transcribe the speech in an audio file into text                                                     | `pipeline(task="automatic-speech-recognition")` |
| Computer vision             | Image classification         | Assign a label to an image                                                                           | `pipeline(task="image-classification")`         |
|                             | Object detection             | Predict the bounding boxes and classes of objects in an image                                        | `pipeline(task="object-detection")`             |
|                             | Image segmentation           | Assign a label to each pixel of an image (supports semantic, panoptic, and instance segmentation)    | `pipeline(task="image-segmentation")`           |
| Natural language processing | Text classification          | Assign a label to a given sequence of text                                                           | `pipeline(task="sentiment-analysis")`           |
|                             | Token classification         | Assign a label to each token in a sequence (person, organization, location, and so on)               | `pipeline(task="ner")`                          |
|                             | Question answering           | Extract the answer from the text, given a context and a question                                     | `pipeline(task="question-answering")`           |
|                             | Summarization                | Generate a summary of a text sequence or document                                                    | `pipeline(task="summarization")`                |
|                             | Translation                  | Translate text from one language into another                                                        | `pipeline(task="translation")`                  |
| Multimodal                  | Document question answering  | Answer a question about a document, given the document and the question                              | `pipeline(task="document-question-answering")`  |
|                             | Visual Question Answering    | Answer a question about an image, given the image and the question                                   | `pipeline(task="vqa")`                          |



Full list of tasks supported by Pipelines: https://huggingface.co/docs/transformers/task_summary


## Pipeline API

The **Pipeline API** wraps all the other available pipelines. It is instantiated like any other pipeline and lowers the cost of learning and using AI inference.

![](docs/images/pipeline_func.png)

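### Text Classification with the Pipeline API


**Text classification**, like classification in any other modality, labels a text sequence (a sentence, a paragraph, or a whole document) with one of a predefined set of classes. It has many practical applications, including:

- Sentiment analysis: label text according to some polarity, such as positive or negative, to support decision-making in fields like politics, finance, and marketing.
- Content classification: label text according to a topic, to help organize and filter information in news and social media feeds (weather, sports, finance, and so on).


The example below uses sentiment analysis, a `Text classification` task, to show how the Pipeline API works.

Model page: https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english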

## Customizing the model download path in transformers

To change where transformers stores downloaded models, set the following environment variables:

```python
import os

os.environ['HF_HOME'] = '/mnt/new_volume/hf'
os.environ['HF_HUB_CACHE'] = '/mnt/new_volume/hf/hub'
```
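These variables are read when `transformers` (through `huggingface_hub`) is first imported, so they should be set before that import. Below is a minimal sketch of wrapping the two assignments in one place; `configure_hf_cache` is a hypothetical helper, not part of the library, and the directory is the one used above.

```python
import os


def configure_hf_cache(root: str) -> dict:
    """Point the Hugging Face cache at `root` (hypothetical helper, not a transformers API)."""
    settings = {
        "HF_HOME": root,
        "HF_HUB_CACHE": os.path.join(root, "hub"),
    }
    os.environ.update(settings)
    return settings


# Call this before `import transformers` so the new locations are picked up.
settings = configure_hf_cache("/mnt/new_volume/hf")
print(settings["HF_HUB_CACHE"])
```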

## sentiment-analysis

### The distilbert-base-uncased-finetuned-sst-2-english model

In [1]:
from transformers import pipeline

# Passing only `task` falls back to a default model (not recommended), so `model` is set explicitly here
pipe = pipeline(task="sentiment-analysis",model="/root/autodl-tmp/hf/modules/distilbert-base-uncased-finetuned-sst-2-english")
pipe("今儿上海可真冷啊")

[{'label': 'NEGATIVE', 'score': 0.8957212567329407}]

#### More test examples

In [2]:
pipe("我觉得这家店蒜泥白肉的味道一般")

[{'label': 'NEGATIVE', 'score': 0.9238728880882263}]

In [7]:
pipe("I think the taste of this restaurant's garlic pork is average")

[{'label': 'NEGATIVE', 'score': 0.9921407103538513}]

In [3]:
# distilbert-base-uncased-finetuned-sst-2-english, the default model for this task,
# saw little Chinese during training, so its results on Chinese text may be unsatisfactory
pipe("你学东西真的好快，理论课一讲就明白了")

[{'label': 'NEGATIVE', 'score': 0.8578681349754333}]

In [4]:
# After switching to English, the classification result improves immediately
pipe("You learn things really quickly. You understand the theory class as soon as it is taught.")

[{'label': 'POSITIVE', 'score': 0.9961802959442139}]

In [5]:
pipe("Today Shanghai is really cold.")

[{'label': 'NEGATIVE', 'score': 0.9995032548904419}]

#### Batched inference

In [6]:
text_list = [
    "Today Shanghai is really cold.",
    "I think the taste of the garlic mashed pork in this store is average.",
    "You learn things really quickly. You understand the theory class as soon as it is taught."
]

pipe(text_list)

[{'label': 'NEGATIVE', 'score': 0.9995032548904419},
 {'label': 'NEGATIVE', 'score': 0.9984821677207947},
 {'label': 'POSITIVE', 'score': 0.9961802959442139}]
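Each `score` above is a probability. As a rough sketch of the post-processing idea (an illustration, not the library's actual code): the classification head emits one logit per class, softmax turns the logits into probabilities, and the largest one is reported with its label. The logits and the `id2label` mapping below are made-up values.

```python
import math


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def postprocess(logits, id2label):
    """Return the most probable label together with its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": id2label[best], "score": probs[best]}


# Made-up logits for a batch of two texts (illustrative only)
id2label = {0: "NEGATIVE", 1: "POSITIVE"}
batch_logits = [[3.8, -3.8], [-2.7, 2.9]]
results = [postprocess(logits, id2label) for logits in batch_logits]
print(results)
```

Passing a list to the pipeline returns one such dictionary per input, in the same order as the inputs.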

### The Qwen1.5-7B-Chat model

In [8]:
qwen = pipeline(task="sentiment-analysis",model = "/root/autodl-tmp/hf/modules/Qwen1.5-7B-Chat")

Some weights of Qwen2ForSequenceClassification were not initialized from the model checkpoint at /root/autodl-tmp/hf/modules/Qwen1.5-7B-Chat and are newly initialized: ['score.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


In [9]:
qwen("我觉得这家店蒜泥白肉的味道一般")

[{'label': 'LABEL_1', 'score': 0.7561785578727722}]

In [10]:
qwen("I think the taste of this restaurant's garlic pork is average.")

[{'label': 'LABEL_0', 'score': 0.9999998807907104}]

In [11]:
qwen("今天可真热呀")

[{'label': 'LABEL_0', 'score': 0.9625878930091858}]

In [13]:
qwen("It's really hot today")

[{'label': 'LABEL_1', 'score': 0.9535388946533203}]

In [12]:
qwen("你学东西真的好快，理论课一讲就明白了")

[{'label': 'LABEL_0', 'score': 0.9998940229415894}]

In [14]:
qwen("You learn things really quickly. You understand the theory class as soon as it is taught.")

[{'label': 'LABEL_0', 'score': 0.9991976618766785}]

In [15]:
qwen("今儿上海可真冷啊")

[{'label': 'LABEL_0', 'score': 0.9692286252975464}]

In [16]:
qwen("Today Shanghai is really cold.")

[{'label': 'LABEL_0', 'score': 0.9997708201408386}]
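The load-time warning explains these outputs: `score.weight`, the classification head, was newly initialized rather than loaded from the checkpoint, so the predictions come from untrained weights, and `LABEL_0`/`LABEL_1` are generic placeholder names rather than sentiment classes. The toy sketch below (pure Python, a made-up feature vector, no real model) illustrates how a randomly initialized two-class head can produce answers that depend only on the random seed.

```python
import math
import random


def random_head_prediction(features, seed):
    """Classify `features` with a two-class linear head whose weights are random."""
    rng = random.Random(seed)
    # One row of random weights per class, standing in for an untrained `score.weight`
    weights = [[rng.gauss(0.0, 1.0) for _ in features] for _ in range(2)]
    logits = [sum(w * x for w, x in zip(row, features)) for row in weights]
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    probs = [e / sum(exps) for e in exps]
    best = probs.index(max(probs))
    return {"label": f"LABEL_{best}", "score": probs[best]}


# Made-up sentence representation (illustrative only)
features = [0.5, -1.2, 3.0, 0.7]
for seed in range(3):
    print(seed, random_head_prediction(features, seed))
```

As the warning itself suggests, the head would need to be trained on a downstream task before its labels carry any meaning.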

### The distilbert-base-multilingual-cased-sentiments-student and cardiffnlp/twitter-roberta-base-sentiment-latest models

In [34]:
distilbert = pipeline(task="sentiment-analysis",model = "/root/autodl-tmp/hf/modules/distilbert-base-multilingual-cased-sentiments-student")

In [35]:
distilbert("我觉得这家店蒜泥白肉的味道一般")

[{'label': 'neutral', 'score': 0.6030056476593018}]

In [36]:
distilbert("I think the taste of this restaurant's garlic pork is average.")

[{'label': 'positive', 'score': 0.5195603370666504}]

In [37]:
distilbert("今天可真热呀")

[{'label': 'positive', 'score': 0.8534454107284546}]

In [38]:
distilbert("It's really hot today")

[{'label': 'positive', 'score': 0.6402294039726257}]

In [39]:
distilbert("你学东西真的好快，理论课一讲就明白了")

[{'label': 'positive', 'score': 0.9461328983306885}]

In [40]:
distilbert("You learn things really quickly. You understand the theory class as soon as it is taught.")

[{'label': 'positive', 'score': 0.7639098763465881}]

In [41]:
distilbert("今儿上海可真冷啊")

[{'label': 'negative', 'score': 0.6657698750495911}]

In [42]:
distilbert("Today Shanghai is really cold.")

[{'label': 'negative', 'score': 0.7824515104293823}]

In [25]:
cardiffnlp = pipeline(task="sentiment-analysis",model = "/root/autodl-tmp/hf/modules/twitter-roberta-base-sentiment-latest")

Some weights of the model checkpoint at /root/autodl-tmp/hf/modules/twitter-roberta-base-sentiment-latest were not used when initializing RobertaForSequenceClassification: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
- This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


In [26]:
cardiffnlp("我觉得这家店蒜泥白肉的味道一般")

[{'label': 'neutral', 'score': 0.7896567583084106}]

In [27]:
cardiffnlp("I think the taste of this restaurant's garlic pork is average.")

[{'label': 'negative', 'score': 0.9096534848213196}]

In [28]:
cardiffnlp("今天可真热呀")

[{'label': 'neutral', 'score': 0.7702414393424988}]

In [29]:
cardiffnlp("It's really hot today")

[{'label': 'neutral', 'score': 0.4667932987213135}]

In [30]:
cardiffnlp("你学东西真的好快，理论课一讲就明白了")

[{'label': 'neutral', 'score': 0.7751216888427734}]

In [31]:
cardiffnlp("You learn things really quickly. You understand the theory class as soon as it is taught.")

[{'label': 'positive', 'score': 0.785957396030426}]

In [32]:
cardiffnlp("今儿上海可真冷啊")

[{'label': 'neutral', 'score': 0.7541275024414062}]

In [33]:
cardiffnlp("Today Shanghai is really cold.")

[{'label': 'negative', 'score': 0.7752947807312012}]

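## Calling more predefined tasks with the Pipeline API

## Natural Language Processing (NLP)

**NLP** tasks are among the most common types of tasks because text is such a natural way for us to communicate. To get text into a format a model can recognize, it needs to be tokenized: the text is split into individual words or subwords (tokens), and those tokens are then converted into numbers. As a result, a sequence of text can be represented as a sequence of numbers, which can then be fed into a model to solve all sorts of NLP tasks.

The text classification task shown above, and the token classification and question answering tasks that follow, all belong to NLP.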
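The text-to-numbers step can be sketched with a toy whitespace tokenizer. Real tokenizers use learned subword vocabularies; the vocabulary, the lowercasing, and the `[UNK]` convention below are assumptions made purely for illustration.

```python
def build_vocab(corpus):
    """Assign an integer id to every distinct lowercase word; id 0 is reserved for unknown words."""
    vocab = {"[UNK]": 0}
    for sentence in corpus:
        for word in sentence.lower().split():
            vocab.setdefault(word, len(vocab))
    return vocab


def encode(text, vocab):
    """Split on whitespace and map each token to its id, falling back to [UNK]."""
    return [vocab.get(word, vocab["[UNK]"]) for word in text.lower().split()]


vocab = build_vocab(["today shanghai is really cold", "you learn things quickly"])
ids = encode("Shanghai is really hot", vocab)
print(ids)  # "hot" is not in the toy vocabulary, so it maps to id 0
```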

### Token Classification

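In any NLP task, text is preprocessed by splitting the sequence into individual words or subwords. These are known as tokens.

**Token classification** assigns each token a label from a predefined set of classes.

Two common types of token classification are:

- Named entity recognition (NER): label a token according to an entity category such as organization, person, location, or date. NER is especially popular in biomedical settings, where it can label genes, proteins, and drug names.
- Part-of-speech tagging (POS): label a token according to its part of speech, such as noun, verb, or adjective. POS is useful for helping translation systems understand how two identical words differ grammatically ("bank" as a noun versus "bank" as a verb).

Model page: https://huggingface.co/dbmdz/bert-large-cased-finetuned-conll03-english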

In [21]:
from transformers import pipeline

classifier = pipeline(task="ner",model="/root/autodl-tmp/hf/modules/bert-large-cased-finetuned-conll03-english")

Some weights of the model checkpoint at /root/autodl-tmp/hf/modules/bert-large-cased-finetuned-conll03-english were not used when initializing BertForTokenClassification: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


In [22]:
preds = classifier("Hugging Face is a French company based in New York City.")
preds = [
    {
        "entity": pred["entity"],
        "score": round(pred["score"], 4),
        "index": pred["index"],
        "word": pred["word"],
        "start": pred["start"],
        "end": pred["end"],
    }
    for pred in preds
]
print(*preds, sep="\n")

{'entity': 'I-ORG', 'score': 0.9968, 'index': 1, 'word': 'Hu', 'start': 0, 'end': 2}
{'entity': 'I-ORG', 'score': 0.9293, 'index': 2, 'word': '##gging', 'start': 2, 'end': 7}
{'entity': 'I-ORG', 'score': 0.9763, 'index': 3, 'word': 'Face', 'start': 8, 'end': 12}
{'entity': 'I-MISC', 'score': 0.9983, 'index': 6, 'word': 'French', 'start': 18, 'end': 24}
{'entity': 'I-LOC', 'score': 0.999, 'index': 10, 'word': 'New', 'start': 42, 'end': 45}
{'entity': 'I-LOC', 'score': 0.9987, 'index': 11, 'word': 'York', 'start': 46, 'end': 50}
{'entity': 'I-LOC', 'score': 0.9992, 'index': 12, 'word': 'City', 'start': 51, 'end': 55}


In [23]:
classifier("6月6日，中吉乌铁路项目三国政府间协定签字仪式在京举行，习近平指出：愿早日建成中吉乌铁路这条惠及三国和三国人民的战略通道")

[]

In [24]:
classifier("On June 6, the signing ceremony of the intergovernmental agreement on the China-Kyrgyzstan-Uzbekistan railway project was held in Beijing. Xi Jinping pointed out that he hoped that the China-Kyrgyzstan-Uzbekistan railway, a strategic channel that would benefit the three countries and their peoples, would be completed as soon as possible.")

[{'entity': 'I-LOC',
  'score': 0.70050573,
  'index': 17,
  'word': 'China',
  'start': 74,
  'end': 79},
 {'entity': 'I-LOC',
  'score': 0.9514764,
  'index': 19,
  'word': 'Kyrgyzstan',
  'start': 80,
  'end': 90},
 {'entity': 'I-LOC',
  'score': 0.99769527,
  'index': 21,
  'word': 'Uzbekistan',
  'start': 91,
  'end': 101},
 {'entity': 'I-LOC',
  'score': 0.99976784,
  'index': 27,
  'word': 'Beijing',
  'start': 130,
  'end': 137},
 {'entity': 'I-PER',
  'score': 0.9952678,
  'index': 29,
  'word': 'Xi',
  'start': 139,
  'end': 141},
 {'entity': 'I-PER',
  'score': 0.99149126,
  'index': 30,
  'word': 'Jin',
  'start': 142,
  'end': 145},
 {'entity': 'I-PER',
  'score': 0.92602485,
  'index': 31,
  'word': '##ping',
  'start': 145,
  'end': 149},
 {'entity': 'I-LOC',
  'score': 0.819423,
  'index': 39,
  'word': 'China',
  'start': 185,
  'end': 190},
 {'entity': 'I-LOC',
  'score': 0.9647289,
  'index': 41,
  'word': 'Kyrgyzstan',
  'start': 191,
  'end': 201},
 {'entity': 'I-L

#### Merging entities

In [43]:
classifier = pipeline(task="ner", grouped_entities=True,model="/root/autodl-tmp/hf/modules/bert-large-cased-finetuned-conll03-english")
classifier("Hugging Face is a French company based in New York City.")

Some weights of the model checkpoint at /root/autodl-tmp/hf/modules/bert-large-cased-finetuned-conll03-english were not used when initializing BertForTokenClassification: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


[{'entity_group': 'ORG',
  'score': 0.9674639,
  'word': 'Hugging Face',
  'start': 0,
  'end': 12},
 {'entity_group': 'MISC',
  'score': 0.9982874,
  'word': 'French',
  'start': 18,
  'end': 24},
 {'entity_group': 'LOC',
  'score': 0.99896103,
  'word': 'New York City',
  'start': 42,
  'end': 55}]
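The grouped output can be approximated from the token-level predictions shown earlier. The sketch below is a simplified illustration, not the library's algorithm: it assumes neighboring tokens with the same entity type belong together, averages their scores, and recovers the word from the character offsets. Recent transformers releases document `aggregation_strategy="simple"` as the replacement for the deprecated `grouped_entities=True`.

```python
text = "Hugging Face is a French company based in New York City."

# Token-level predictions copied from the ungrouped output earlier in this section
tokens = [
    {"entity": "I-ORG", "score": 0.9968, "index": 1, "start": 0, "end": 2},
    {"entity": "I-ORG", "score": 0.9293, "index": 2, "start": 2, "end": 7},
    {"entity": "I-ORG", "score": 0.9763, "index": 3, "start": 8, "end": 12},
    {"entity": "I-MISC", "score": 0.9983, "index": 6, "start": 18, "end": 24},
    {"entity": "I-LOC", "score": 0.999, "index": 10, "start": 42, "end": 45},
    {"entity": "I-LOC", "score": 0.9987, "index": 11, "start": 46, "end": 50},
    {"entity": "I-LOC", "score": 0.9992, "index": 12, "start": 51, "end": 55},
]


def group_entities(tokens, text):
    """Merge adjacent tokens that share an entity type into one span (simplified rule)."""
    groups = []
    for tok in tokens:
        label = tok["entity"].split("-")[-1]  # "I-ORG" -> "ORG"
        last = groups[-1] if groups else None
        if last and last["label"] == label and tok["index"] == last["last_index"] + 1:
            last["scores"].append(tok["score"])
            last["end"] = tok["end"]
            last["last_index"] = tok["index"]
        else:
            groups.append({"label": label, "scores": [tok["score"]],
                           "start": tok["start"], "end": tok["end"],
                           "last_index": tok["index"]})
    return [
        {"entity_group": g["label"],
         "score": round(sum(g["scores"]) / len(g["scores"]), 4),
         "word": text[g["start"]:g["end"]],
         "start": g["start"], "end": g["end"]}
        for g in groups
    ]


grouped = group_entities(tokens, text)
print(*grouped, sep="\n")
```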

In [44]:
classifier("6月6日，中吉乌铁路项目三国政府间协定签字仪式在京举行，习近平指出：愿早日建成中吉乌铁路这条惠及三国和三国人民的战略通道")

[]

In [45]:
classifier("On June 6, the signing ceremony of the intergovernmental agreement on the China-Kyrgyzstan-Uzbekistan railway project was held in Beijing. Xi Jinping pointed out that he hoped that the China-Kyrgyzstan-Uzbekistan railway, a strategic channel that would benefit the three countries and their peoples, would be completed as soon as possible.")

[{'entity_group': 'LOC',
  'score': 0.70050573,
  'word': 'China',
  'start': 74,
  'end': 79},
 {'entity_group': 'LOC',
  'score': 0.9514764,
  'word': 'Kyrgyzstan',
  'start': 80,
  'end': 90},
 {'entity_group': 'LOC',
  'score': 0.99769527,
  'word': 'Uzbekistan',
  'start': 91,
  'end': 101},
 {'entity_group': 'LOC',
  'score': 0.99976784,
  'word': 'Beijing',
  'start': 130,
  'end': 137},
 {'entity_group': 'PER',
  'score': 0.97092795,
  'word': 'Xi Jinping',
  'start': 139,
  'end': 149},
 {'entity_group': 'LOC',
  'score': 0.819423,
  'word': 'China',
  'start': 185,
  'end': 190},
 {'entity_group': 'LOC',
  'score': 0.9647289,
  'word': 'Kyrgyzstan',
  'start': 191,
  'end': 201},
 {'entity_group': 'LOC',
  'score': 0.998539,
  'word': 'Uzbekistan',
  'start': 202,
  'end': 212}]

### The xlm-roberta-large-finetuned-conll03-english model

In [52]:
classifier = pipeline(task="ner",model="/root/autodl-tmp/hf/modules/xlm-roberta-large-finetuned-conll03-english",grouped_entities=True)

Some weights of the model checkpoint at /root/autodl-tmp/hf/modules/xlm-roberta-large-finetuned-conll03-english were not used when initializing XLMRobertaForTokenClassification: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
- This IS expected if you are initializing XLMRobertaForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing XLMRobertaForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


In [53]:
classifier("Hugging Face is a French company based in New York City.")

[{'entity_group': 'ORG',
  'score': 0.9999173,
  'word': 'Hugging Face',
  'start': 0,
  'end': 12},
 {'entity_group': 'MISC',
  'score': 0.9999932,
  'word': 'French',
  'start': 18,
  'end': 24},
 {'entity_group': 'LOC',
  'score': 0.99999696,
  'word': 'New York City',
  'start': 42,
  'end': 55}]

In [54]:
classifier("6月6日，中吉乌铁路项目三国政府间协定签字仪式在京举行，习近平指出：愿早日建成中吉乌铁路这条惠及三国和三国人民的战略通道")

[{'entity_group': 'LOC',
  'score': 0.9995888,
  'word': '中吉乌',
  'start': 5,
  'end': 8},
 {'entity_group': 'LOC',
  'score': 0.9999939,
  'word': '京',
  'start': 24,
  'end': 25},
 {'entity_group': 'PER',
  'score': 0.9995895,
  'word': '习近平',
  'start': 28,
  'end': 31},
 {'entity_group': 'LOC',
  'score': 0.9994803,
  'word': '中吉乌',
  'start': 39,
  'end': 42}]

In [55]:
classifier("On June 6, the signing ceremony of the intergovernmental agreement on the China-Kyrgyzstan-Uzbekistan railway project was held in Beijing. Xi Jinping pointed out that he hoped that the China-Kyrgyzstan-Uzbekistan railway, a strategic channel that would benefit the three countries and their peoples, would be completed as soon as possible.")

[{'entity_group': 'MISC',
  'score': 0.9518931,
  'word': 'China-Kyrgyzstan-Uzbekistan',
  'start': 74,
  'end': 101},
 {'entity_group': 'LOC',
  'score': 0.999998,
  'word': 'Beijing',
  'start': 130,
  'end': 137},
 {'entity_group': 'PER',
  'score': 0.9999905,
  'word': 'Xi Jinping',
  'start': 139,
  'end': 149},
 {'entity_group': 'MISC',
  'score': 0.91028666,
  'word': 'China-K',
  'start': 185,
  'end': 192},
 {'entity_group': 'LOC',
  'score': 0.6281686,
  'word': 'yrgyzstan',
  'start': 192,
  'end': 201},
 {'entity_group': 'MISC',
  'score': 0.7831001,
  'word': '-U',
  'start': 201,
  'end': 203},
 {'entity_group': 'LOC',
  'score': 0.7278358,
  'word': 'zbekistan',
  'start': 203,
  'end': 212}]

### Question Answering

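**Question answering** is another token-level task that returns an answer to a question, sometimes with context (open-domain) and other times without context (closed-domain). This task happens whenever we ask a virtual assistant something, such as whether a restaurant is open. It can also provide customer or technical support and help search engines retrieve the relevant information you are asking for.

There are two common types of question answering:

- Extractive: given a question and some context, the model must extract a span of text from the context as the answer.
- Abstractive: given a question and some context, the answer is generated from the context; this approach is handled by the `Text2TextGenerationPipeline` instead of the `QuestionAnsweringPipeline` shown below.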
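To make the extractive case concrete, here is a minimal sketch of span selection. It assumes the model yields one "start" score and one "end" score per context token and that the answer is the span maximizing their sum. The scores are made-up numbers, and `best_span` is a hypothetical helper rather than the pipeline's implementation.

```python
def best_span(start_scores, end_scores, max_answer_len=15):
    """Return (start, end, score) maximizing start_scores[i] + end_scores[j] with i <= j."""
    best = None
    for i, start_score in enumerate(start_scores):
        # Only consider spans of at most `max_answer_len` tokens
        for j in range(i, min(i + max_answer_len, len(end_scores))):
            score = start_score + end_scores[j]
            if best is None or score > best[2]:
                best = (i, j, score)
    return best


context_tokens = ["The", "name", "of", "the", "repository", "is", "huggingface/transformers"]
# Made-up per-token scores (illustrative only)
start_scores = [0.1, 0.0, 0.0, 0.2, 0.5, 0.3, 4.0]
end_scores = [0.0, 0.1, 0.0, 0.1, 0.6, 0.2, 3.5]

start, end, score = best_span(start_scores, end_scores)
answer = " ".join(context_tokens[start:end + 1])
print(answer)
```

Because the answer is always a slice of the context, an extractive model cannot produce words that the context does not contain.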

Model page: https://huggingface.co/distilbert-base-cased-distilled-squad

In [46]:
from transformers import pipeline

question_answerer = pipeline(task="question-answering",model="/root/autodl-tmp/hf/modules/distilbert-base-cased-distilled-squad")

In [47]:
preds = question_answerer(
    question="What is the name of the repository?",
    context="The name of the repository is huggingface/transformers",
)
print(
    f"score: {round(preds['score'], 4)}, start: {preds['start']}, end: {preds['end']}, answer: {preds['answer']}"
)

score: 0.9327, start: 30, end: 54, answer: huggingface/transformers


In [12]:
preds = question_answerer(
    question="What is the capital of China?",
    context="On 1 October 1949, CCP Chairman Mao Zedong formally proclaimed the People's Republic of China in Tiananmen Square, Beijing.",
)
print(
    f"score: {round(preds['score'], 4)}, start: {preds['start']}, end: {preds['end']}, answer: {preds['answer']}"
)

score: 0.9458, start: 115, end: 122, answer: Beijing


In [61]:
preds = question_answerer(
    question="延期缴纳税款不能超过多久",
    context="""
    纳税人因有特殊困难，不能按期缴纳税款的，经省、自治区、直辖市国家税务局、地方税务局批准，可以延期缴纳税款，但最长不得超过的期限是6个月。
    """)



print(
    f"score: {round(preds['score'], 4)}, start: {preds['start']}, end: {preds['end']}, answer: {preds['answer']}"
)

score: 0.0029, start: 69, end: 73, answer: 6个月。


In [63]:
preds = question_answerer(
    question="政治法律环境对企业的影响因素是？",
    context="""
    近年来，房地产界一直处于高温状态，不过，随着国家有关房地产政策的出台，在一定程度上遏制了这种高温。据调查，最近二套房、三套房的成交量同期相比有所下降。这体现了政治法律环境对企业的影响因素是不可测性。	

    """)



print(
    f"score: {round(preds['score'], 4)}, start: {preds['start']}, end: {preds['end']}, answer: {preds['answer']}"
)

score: 0.0006, start: 51, end: 63, answer: 高温。据调查，最近二套房


### The timpal0l/mdeberta-v3-base-squad2 model

In [7]:
from transformers import pipeline
mdeberta_question_answerer = pipeline(task="question-answering",model="/root/autodl-tmp/hf/modules/mdeberta-v3-base-squad2")

In [8]:
preds = mdeberta_question_answerer(
    question="政治法律环境对企业的影响因素是？",
    context="""
    近年来，房地产界一直处于高温状态，不过，随着国家有关房地产政策的出台，在一定程度上遏制了这种高温。据调查，最近二套房、三套房的成交量同期相比有所下降。这体现了政治法律环境对企业的影响因素是不可测性。	

    """)



print(
    f"score: {round(preds['score'], 4)}, start: {preds['start']}, end: {preds['end']}, answer: {preds['answer']}"
)

score: 0.934, start: 5, end: 104, answer: 近年来，房地产界一直处于高温状态，不过，随着国家有关房地产政策的出台，在一定程度上遏制了这种高温。据调查，最近二套房、三套房的成交量同期相比有所下降。这体现了政治法律环境对企业的影响因素是不可测性。


In [9]:
preds = mdeberta_question_answerer(
    question="延期缴纳税款不能超过多久",
    context="""
    纳税人因有特殊困难，不能按期缴纳税款的，经省、自治区、直辖市国家税务局、地方税务局批准，可以延期缴纳税款，但最长不得超过的期限是6个月。
    """)



print(
    f"score: {round(preds['score'], 4)}, start: {preds['start']}, end: {preds['end']}, answer: {preds['answer']}"
)

score: 0.9716, start: 5, end: 73, answer: 纳税人因有特殊困难，不能按期缴纳税款的，经省、自治区、直辖市国家税务局、地方税务局批准，可以延期缴纳税款，但最长不得超过的期限是6个月。


In [10]:
preds = mdeberta_question_answerer(
    question="What is the name of the repository?",
    context="The name of the repository is huggingface/transformers",
)
print(
    f"score: {round(preds['score'], 4)}, start: {preds['start']}, end: {preds['end']}, answer: {preds['answer']}"
)

score: 0.9839, start: 29, end: 54, answer:  huggingface/transformers


In [12]:
preds = mdeberta_question_answerer(
    question="What is the capital of China?",
    context="On 1 October 1949, CCP Chairman Mao Zedong formally proclaimed the People's Republic of China in Tiananmen Square, Beijing.",
)
print(
    f"score: {round(preds['score'], 4)}, start: {preds['start']}, end: {preds['end']}, answer: {preds['answer']}"
)

score: 0.9296, start: 114, end: 123, answer:  Beijing.


In [13]:
preds = mdeberta_question_answerer(
    question="中国的首都是哪儿?",
    context="1949年10月1日，中国共产党主席毛泽东在北京天安门广场正式宣布中华人民共和国成立。",
)
print(
    f"score: {round(preds['score'], 4)}, start: {preds['start']}, end: {preds['end']}, answer: {preds['answer']}"
)

score: 0.6993, start: 0, end: 43, answer: 1949年10月1日，中国共产党主席毛泽东在北京天安门广场正式宣布中华人民共和国成立。


### Summarization

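**Summarization** creates a shorter version of a longer text while trying to preserve most of the meaning of the original document. Summarization is a sequence-to-sequence task; it outputs a shorter text sequence than the input. Many long-form documents can be summarized to help readers quickly grasp the main points. Legislative bills, legal and financial documents, patents, and scientific papers are examples of documents that can be summarized to save readers time and serve as a reading aid.

Like question answering, there are two types of summarization:

- Extractive: identify and extract the most important sentences from the original text
- Abstractive: generate the target summary from the original text (it may include new words that are not in the input document); the `SummarizationPipeline` uses the abstractive approach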
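For contrast with the abstractive pipeline used in this section, the extractive idea can be sketched in a few lines. This is a toy heuristic (sentences are scored by the average corpus frequency of their words), chosen only for illustration; it is not how any particular library summarizes text.

```python
import re
from collections import Counter


def extractive_summary(text, num_sentences=1):
    """Keep the sentences whose words are, on average, most frequent in the whole text."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s.strip()]
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence):
        words = re.findall(r"[a-z']+", sentence.lower())
        return sum(freq[w] for w in words) / max(len(words), 1)

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:num_sentences])  # restore the original sentence order
    return " ".join(sentences[i] for i in keep)


text = (
    "Transformers process entire sequences in parallel. "
    "Recurrent networks process inputs one step at a time. "
    "Parallel processing lets transformers train faster on GPUs."
)
summary = extractive_summary(text, num_sentences=1)
print(summary)
```

Every sentence in the result is copied verbatim from the input, which is the defining property of the extractive approach.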

Model page: https://huggingface.co/t5-base

In [65]:
from transformers import pipeline

summarizer = pipeline(task="summarization",
                      model="/root/autodl-tmp/hf/modules/t5-base",
                      min_length=8,
                      max_length=32,
)

In [66]:
summarizer(
    """
    In this work, we presented the Transformer, the first sequence transduction model based entirely on attention, 
    replacing the recurrent layers most commonly used in encoder-decoder architectures with multi-headed self-attention. 
    For translation tasks, the Transformer can be trained significantly faster than architectures based on recurrent or convolutional layers. 
    On both WMT 2014 English-to-German and WMT 2014 English-to-French translation tasks, we achieve a new state of the art. 
    In the former task our best model outperforms even all previously reported ensembles.
    """
)


[{'summary_text': 'the Transformer is the first sequence transduction model based entirely on attention . it replaces recurrent layers commonly used in encoder-decode'}]

In [67]:
summarizer(
    '''
    Large language models (LLM) are very large deep learning models that are pre-trained on vast amounts of data. 
    The underlying transformer is a set of neural networks that consist of an encoder and a decoder with self-attention capabilities. 
    The encoder and decoder extract meanings from a sequence of text and understand the relationships between words and phrases in it.
    Transformer LLMs are capable of unsupervised training, although a more precise explanation is that transformers perform self-learning. 
    It is through this process that transformers learn to understand basic grammar, languages, and knowledge.
    Unlike earlier recurrent neural networks (RNN) that sequentially process inputs, transformers process entire sequences in parallel. 
    This allows the data scientists to use GPUs for training transformer-based LLMs, significantly reducing the training time.
    '''
)


[{'summary_text': 'large language models (LLMs) are very large deep learning models pre-trained on vast amounts of data . transformers are capable of un'}]

In [68]:
summarizer(
    """
    Trước khi bắt đầu đo, đứng thẳng và đặt tay ở hai bên cơ thể. Hơi cong cánh tay với các ngón tay đặt vào túi quần trước. Bắt đầu ở giữa phần lưng trên, ngay bên dưới cổ. Đo độ dài từ giữa lưng trên đến đường viền vai áo. Viết lại số đo này vì bạn sẽ cần đến ở bước sau. Đo độ dài từ đường viền vai áo đến cổ tay. Cố gắng kéo thước dây đến xương cổ tay. Cẩn thận để không dừng thước đo ngay phía trên cổ tay, kẻo làm cho tay áo quá ngắn. Cộng hai số đo này để tìm ra độ dài tay áo. Số đo tay áo sẽ ở khoảng từ 81 - 94cm.	

    """
)



[{'summary_text': 'bt u  gia phn lng trên, ng'}]

In [69]:
summarizer(
    """
    硅藻土是纯天然物质，可以安全用在人类和宠物周围，帮助消灭蚂蚁和其它爬行动物。硅藻土由化石碎屑组成，一旦昆虫爬过，这些化石碎屑就会切割它们的外骨骼。把硅藻土撒在角落、水槽下、窗台和蚂蚁常出没的地方。 每一两个星期用吸尘机打扫硅藻土一次，然后换上新的粉末。 这个方法在潮湿地区不太有效，因为潮湿的硅藻土不再锋利。 如果地毯上有许多黑蚂蚁，你可撒一层小苏打，等待几个小时，然后用吸尘机打扫干净。你也可以将玉米淀粉洒在地上，立刻用吸尘机清理玉米淀粉，然后才用吸尘机吸走蚂蚁。吸尘机里的玉米淀粉会使蚂蚁窒息。 特定的天然喷剂可以用作驱虫剂。你可以简单地自制驱虫剂，只需把10滴精油添加到1杯水中（约237毫升），然后倒入喷瓶。到处喷射精油混合液，防止蚂蚁进入屋子。你可以试用以下精油： 桉树油（家中有猫的话不可以用） 茶树油 薰衣草 薄荷 柠檬 Windex玻璃清洁剂 也许你家中的洗衣房已备有一盒硼酸。这个家居用品也是非常有效的杀虫剂。只需把硼酸撒在角落和房间四周。一旦蚂蚁和其它生物爬过硼酸，就会死亡。 如果你手边没有其它驱逐剂，可以尝试在蚂蚁聚集的房间里撒一些肉桂粉。肉桂粉和它散发的强烈味道能驱赶蚂蚁。虽然撒肉桂粉无法杀灭蚂蚁，但可以防止它们再回来。
    """
)

[{'summary_text': 'Windex , . 101(237), ,,  : '}]

### The csebuetnlp/mT5_multilingual_XLSum model

In [4]:
from transformers import pipeline
mT5 = pipeline(task="summarization",
                      model="/root/autodl-tmp/hf/modules/mT5_multilingual_XLSum",
)
# Alternatively, load the model from the Hub by its id:
# mT5 = pipeline("summarization", model="csebuetnlp/mT5_multilingual_XLSum")


In [5]:
mT5(
    """
    硅藻土是纯天然物质，可以安全用在人类和宠物周围，帮助消灭蚂蚁和其它爬行动物。硅藻土由化石碎屑组成，一旦昆虫爬过，这些化石碎屑就会切割它们的外骨骼。把硅藻土撒在角落、水槽下、窗台和蚂蚁常出没的地方。 每一两个星期用吸尘机打扫硅藻土一次，然后换上新的粉末。 这个方法在潮湿地区不太有效，因为潮湿的硅藻土不再锋利。 如果地毯上有许多黑蚂蚁，你可撒一层小苏打，等待几个小时，然后用吸尘机打扫干净。你也可以将玉米淀粉洒在地上，立刻用吸尘机清理玉米淀粉，然后才用吸尘机吸走蚂蚁。吸尘机里的玉米淀粉会使蚂蚁窒息。 特定的天然喷剂可以用作驱虫剂。你可以简单地自制驱虫剂，只需把10滴精油添加到1杯水中（约237毫升），然后倒入喷瓶。到处喷射精油混合液，防止蚂蚁进入屋子。你可以试用以下精油： 桉树油（家中有猫的话不可以用） 茶树油 薰衣草 薄荷 柠檬 Windex玻璃清洁剂 也许你家中的洗衣房已备有一盒硼酸。这个家居用品也是非常有效的杀虫剂。只需把硼酸撒在角落和房间四周。一旦蚂蚁和其它生物爬过硼酸，就会死亡。 如果你手边没有其它驱逐剂，可以尝试在蚂蚁聚集的房间里撒一些肉桂粉。肉桂粉和它散发的强烈味道能驱赶蚂蚁。虽然撒肉桂粉无法杀灭蚂蚁，但可以防止它们再回来。
    """
)

[{'summary_text': '硅藻土是地球上最古老的建筑物之一,它被认为是保护地球不受潮湿影响的一部分。'}]


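## Audio

Audio and speech processing tasks differ a little from the other modalities, mainly because audio as an input is a continuous signal. Unlike text, a raw audio waveform cannot be neatly split into discrete chunks the way a sentence can be divided into words. To get around this, the raw audio signal is typically sampled at regular intervals. Taking more samples within each interval gives a higher sampling rate, and the audio more closely resembles the original source.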
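The relationship between sampling rate and the number of samples can be shown with a synthetic signal. In this sketch the 440 Hz tone, the one-second duration, and the two sampling rates are arbitrary choices for illustration.

```python
import math


def sample_sine(frequency_hz, duration_s, sample_rate_hz):
    """Sample a sine wave at evenly spaced time steps."""
    num_samples = int(duration_s * sample_rate_hz)
    return [
        math.sin(2 * math.pi * frequency_hz * n / sample_rate_hz)
        for n in range(num_samples)
    ]


# The same one-second tone captured at two different sampling rates
low = sample_sine(440, 1.0, 8_000)
high = sample_sine(440, 1.0, 16_000)
print(len(low), len(high))
```

Doubling the sampling rate doubles the number of values that describe the same second of audio.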

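Earlier approaches preprocessed the audio to extract useful features from it. It is now more common to feed the raw audio waveform directly into a feature encoder to extract an audio representation. This simplifies preprocessing and lets the model learn the most essential features.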

### Audio classification

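**Audio classification** is a task that labels audio data from a predefined set of classes. It is a broad category with many specific applications, some of which include:

- Acoustic scene classification: label audio with a scene label ("office", "beach", "stadium").
- Acoustic event detection: label audio with a sound event label ("car horn", "whale calling", "glass breaking").
- Tagging: label audio containing multiple sounds (birdsong, speaker identification in a meeting).
- Music classification: label music with a genre label ("metal", "hip-hop", "country").

Model page: https://huggingface.co/superb/hubert-base-superb-er

Dataset page: https://huggingface.co/datasets/superb#er

```
Emotion Recognition (ER) predicts an emotion class for each utterance. The most widely used ER dataset, IEMOCAP, is adopted, and the conventional evaluation protocol is followed: the unbalanced emotion classes are dropped to leave the final four classes with a similar amount of data points, and evaluation uses cross-validation on five folds of the standard splits. The evaluation metric is accuracy (ACC).
```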

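#### Installing prerequisite packages

It is recommended to install the required audio processing package, ffmpeg, from the command line:

```shell
# Refresh the package index first, then upgrade installed packages
$ apt update && apt upgrade
$ apt install -y ffmpeg
$ pip install ffmpeg ffmpeg-python
```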

In [2]:
from transformers import pipeline

classifier = pipeline(task="audio-classification", model="/root/autodl-tmp/hf/modules/hubert-base-superb-er")

Some weights of the model checkpoint at /root/autodl-tmp/hf/modules/hubert-base-superb-er were not used when initializing HubertForSequenceClassification: ['hubert.encoder.pos_conv_embed.conv.weight_g', 'hubert.encoder.pos_conv_embed.conv.weight_v']
- This IS expected if you are initializing HubertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing HubertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of HubertForSequenceClassification were not initialized from the model checkpoint at /root/autodl-tmp/hf/modules/hubert-base-superb-er and are newly initialized: ['hubert.encoder.pos_conv_embed.conv.parametrizations.weight.original0', 'hube

In [3]:
# Use a test file from Hugging Face Datasets
preds = classifier("/root/autodl-tmp/project/transformers/data/audio/1.flac")
preds = [{"score": round(pred["score"], 4), "label": pred["label"]} for pred in preds]
preds

[{'score': 0.3642, 'label': 'neu'},
 {'score': 0.3408, 'label': 'hap'},
 {'score': 0.2599, 'label': 'ang'},
 {'score': 0.0351, 'label': 'sad'}]

In [25]:
# Test with a local audio file
preds = classifier("data/audio/mlk.flac")
preds = [{"score": round(pred["score"], 4), "label": pred["label"]} for pred in preds]
preds

[{'score': 0.4532, 'label': 'hap'},
 {'score': 0.3622, 'label': 'sad'},
 {'score': 0.0943, 'label': 'neu'},
 {'score': 0.0903, 'label': 'ang'}]

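### Automatic speech recognition (ASR)

**Automatic speech recognition** transcribes speech into text. It is one of the most common audio tasks, partly because speech is such a natural form of human communication. Today, ASR systems are embedded in smart technology products such as speakers, phones, and cars. We can ask a virtual assistant to play music, set reminders, and tell us the weather.

One of the key challenges the Transformer architecture has helped with is low-resource languages. By pretraining on large amounts of speech data, fine-tuning a model on only one hour of labeled speech data in a low-resource language can still produce high-quality results compared with previous ASR systems trained on 100x more labeled data.

Model page: https://huggingface.co/openai/whisper-small

The following example uses the `OpenAI Whisper Small` model to perform ASR through the Pipeline API: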

In [6]:
from transformers import pipeline

# Use the `model` argument to specify the model
transcriber = pipeline(task="automatic-speech-recognition", model="/root/autodl-tmp/hf/modules/whisper-small")

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


In [7]:
text = transcriber("data/audio/mlk.flac")
text

{'text': ' I have a dream that one day this nation will rise up and live out the true meaning of its creed.'}

In [10]:
text = transcriber("data/audio/2.flac")
text

{'text': '《网络安全意识》国家网络安全中心替代格式版本本内容由苏格兰领导组织与合作伙伴共同制作并由苏格兰政府网络适应性部门拨款本指南根据国家网络安全中心网站上的内容编写所含信息分為7個社區語言音頻分別是本音頻片段包括簡介並概述您可以採取的用於提高在線安全性的6項措施然後關於您可以採取的6項措施分別有單獨的詳細音頻片段在國家網絡安全中心網站上可访问提到的所有链接网指是www.ncsc.gov.uk斜杠cyberawre斜杠提升您的在线安全性的六种方法由於新冠疫情,今年人們花費了更多時間上網,這意味著黑客有更多機會進行網絡攻擊他們通常會使用以下方法對人們和企業進行攻擊電子郵件和網站騙局惡意軟件,這是只可能會損壞您的設備或讓黑客進入的軟件如果黑客进入您的设备或账户,他们就可以获取您的资金、您的个人信息,或有关您业务的信息。您可以采取以下六项措施来提高网络安全性。措施1.为您的电子邮箱使用一个单独的墙密码。措施2,使用三个随机词创建强密码措施3,将密码保存在您的浏览器中措施4,打开双音速认证,简称RFA措施5,更新您的设备措施 6 倍份您的數據'}

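## Computer Vision

One of the earliest successes in **computer vision** was recognizing images of zip code digits with a convolutional neural network (CNN). An image is made up of pixels, and each pixel has a numerical value, which makes it easy to represent an image as a matrix of pixel values. Each particular combination of pixel values describes the colors of an image.

Computer vision tasks can generally be solved in two ways:

- Use convolutions to learn the hierarchical features of an image, from low-level features to high-level abstract features.
- Split an image into patches and use a Transformer to gradually learn how the patches relate to one another to form the image. Unlike the bottom-up approach favored by a CNN, this is somewhat like starting with a blurry image and gradually bringing it into focus.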
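The patch-splitting step in the second approach can be sketched on a plain pixel matrix. The 224 x 224 image size and the 16-pixel patch size are assumptions taken from the name of the `vit-base-patch16-224` checkpoint used in this section; the image itself is dummy data.

```python
def split_into_patches(image, patch_size):
    """Split an H x W grid (a list of rows) into non-overlapping square patches."""
    height, width = len(image), len(image[0])
    patches = []
    for top in range(0, height - patch_size + 1, patch_size):
        for left in range(0, width - patch_size + 1, patch_size):
            patch = [row[left:left + patch_size] for row in image[top:top + patch_size]]
            patches.append(patch)
    return patches


# A dummy 224 x 224 grayscale image stored as a matrix of pixel values
image = [[(r + c) % 256 for c in range(224)] for r in range(224)]
patches = split_into_patches(image, 16)
print(len(patches))  # 14 patches per side, so 14 * 14 in total
```

Each patch then plays a role similar to a token in text: the Transformer sees a sequence of patches rather than individual pixels.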

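### Image Classification

**Image classification** labels an entire image from a predefined set of classes. Like most classification tasks, it has many practical use cases, some of which include:

- Healthcare: label medical images to detect disease or monitor patient health
- Environment: label satellite images to monitor deforestation, inform wildland management, or detect wildfires
- Agriculture: label images of crops to monitor plant health, or satellite images for land use monitoring
- Ecology: label images of animal or plant species to monitor wildlife populations or track endangered species

Model page: https://huggingface.co/google/vit-base-patch16-224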

In [11]:
from transformers import pipeline

classifier = pipeline(task="image-classification",model="/root/autodl-tmp/hf/modules/vit-base-patch16-224")

In [22]:
preds = classifier(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/pipeline-cat-chonk.jpeg"
)
preds = [{"score": round(pred["score"], 4), "label": pred["label"]} for pred in preds]
print(*preds, sep="\n")

{'score': 0.4335, 'label': 'lynx, catamount'}
{'score': 0.0348, 'label': 'cougar, puma, catamount, mountain lion, painter, panther, Felis concolor'}
{'score': 0.0324, 'label': 'snow leopard, ounce, Panthera uncia'}
{'score': 0.0239, 'label': 'Egyptian cat'}
{'score': 0.0229, 'label': 'tiger cat'}


![](data/image/cat-chonk.jpeg)

In [12]:
# Use a local image (the cat)
preds = classifier(
    "data/image/cat-chonk.jpeg"
)
preds = [{"score": round(pred["score"], 4), "label": pred["label"]} for pred in preds]
print(*preds, sep="\n")

{'score': 0.4335, 'label': 'lynx, catamount'}
{'score': 0.0348, 'label': 'cougar, puma, catamount, mountain lion, painter, panther, Felis concolor'}
{'score': 0.0324, 'label': 'snow leopard, ounce, Panthera uncia'}
{'score': 0.0239, 'label': 'Egyptian cat'}
{'score': 0.0229, 'label': 'tiger cat'}


![](data/image/panda.jpg)

In [13]:
# Use a local image (the panda)
preds = classifier(
    "data/image/panda.jpg"
)
preds = [{"score": round(pred["score"], 4), "label": pred["label"]} for pred in preds]
print(*preds, sep="\n")

{'score': 0.9962, 'label': 'giant panda, panda, panda bear, coon bear, Ailuropoda melanoleuca'}
{'score': 0.0018, 'label': 'lesser panda, red panda, panda, bear cat, cat bear, Ailurus fulgens'}
{'score': 0.0002, 'label': 'ice bear, polar bear, Ursus Maritimus, Thalarctos maritimus'}
{'score': 0.0001, 'label': 'sloth bear, Melursus ursinus, Ursus ursinus'}
{'score': 0.0001, 'label': 'brown bear, bruin, Ursus arctos'}
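
For a single-label classifier such as the ViT checkpoint used here, the `score` values are typically softmax probabilities over all classes, and only the highest few are reported (five in the outputs above). The sketch below reproduces that post-processing step in NumPy. The function `top_k_predictions`, the four labels, and the logits are hypothetical values for illustration, not output from the model.

```python
import numpy as np


def top_k_predictions(logits, labels, k=5):
    """Convert raw logits to softmax probabilities and return the k best labels."""
    logits = np.asarray(logits, dtype=np.float64)
    shifted = logits - logits.max()          # subtract the max for numerical stability
    probs = np.exp(shifted) / np.exp(shifted).sum()
    order = np.argsort(probs)[::-1][:k]      # indices sorted by descending probability
    return [{"score": round(float(probs[i]), 4), "label": labels[i]} for i in order]


# Hypothetical labels and logits, chosen only to mimic a confident prediction.
labels = ["giant panda", "red panda", "polar bear", "tabby cat"]
logits = [9.0, 2.5, 0.5, -1.0]

preds = top_k_predictions(logits, labels, k=3)
print(*preds, sep="\n")
```

Because the probabilities sum to 1 across every class, the top five scores for the cat image add up to far less than 1: the remaining probability mass is spread over the classes that are not shown.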


In [12]:
from transformers import pipeline

classifier = pipeline(task="image-classification",model="/root/autodl-tmp/hf/modules/beit-base-patch16-224-pt22k-ft22k")

### Object Detection

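Unlike image classification, object detection identifies multiple objects in an image along with where each one is located (defined by a bounding box). Example applications include:

- Autonomous vehicles: detect everyday traffic objects such as other vehicles, pedestrians, and traffic lights
- Remote sensing: disaster monitoring, urban planning, and weather forecasting
- Defect detection: detect cracks or structural damage in buildings, as well as defects in manufactured products

Model page: https://huggingface.co/facebook/detr-resnet-50

#### Installing prerequisite packages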

In [25]:
!pip install timm

Collecting fsspec (from torch>=1.7->timm)
  Using cached fsspec-2023.12.2-py3-none-any.whl.metadata (6.8 kB)
Using cached fsspec-2023.12.2-py3-none-any.whl (168 kB)
Installing collected packages: fsspec
  Attempting uninstall: fsspec
    Found existing installation: fsspec 2023.4.0
    Uninstalling fsspec-2023.4.0:
      Successfully uninstalled fsspec-2023.4.0
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
datasets 2.16.1 requires fsspec[http]<=2023.10.0,>=2023.1.0, but you have fsspec 2023.12.2 which is incompatible.
Successfully installed fsspec-2023.12.2

In [50]:
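import os
import subprocess

# Environment variables are set before importing transformers,
# because huggingface_hub reads them at import time.
# Set the Hugging Face endpoint (only needed when using a mirror)
os.environ['HF_ENDPOINT'] = 'https://hf-mirror.com'

# Copy the proxy variables exported by the environment-specific /etc/network_turbo script into this process
result = subprocess.run('bash -c "source /etc/network_turbo && env | grep proxy"', shell=True, capture_output=True, text=True)
output = result.stdout
for line in output.splitlines():
    if '=' in line:
        var, value = line.split('=', 1)
        os.environ[var] = value

os.environ['HF_HOME'] = '/root/autodl-tmp/cache/'
# Set the download location for timm models
# os.environ['TORCH_HOME'] = '/root/autodl-tmp/timm/models'

from transformers import pipeline

# No model is given, so the pipeline falls back to the task's default checkpoint
detector = pipeline(task="object-detection")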

No model was supplied, defaulted to facebook/detr-resnet-50 and revision 2729413 (https://huggingface.co/facebook/detr-resnet-50).
Using a pipeline without specifying a model name and revision in production is not recommended.


Some weights of the model checkpoint at facebook/detr-resnet-50 were not used when initializing DetrForObjectDetection: ['model.backbone.conv_encoder.model.layer1.0.downsample.1.num_batches_tracked', 'model.backbone.conv_encoder.model.layer2.0.downsample.1.num_batches_tracked', 'model.backbone.conv_encoder.model.layer3.0.downsample.1.num_batches_tracked', 'model.backbone.conv_encoder.model.layer4.0.downsample.1.num_batches_tracked']
- This IS expected if you are initializing DetrForObjectDetection from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing DetrForObjectDetection from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


In [51]:
detector

<transformers.pipelines.object_detection.ObjectDetectionPipeline at 0x7fe2d3883500>

In [53]:
os.environ['HF_ENDPOINT'] = 'https://hf-mirror.com'
preds = detector(
    "/root/autodl-tmp/project/transformers/data/image/cat-chonk.jpeg"
)
preds = [{"score": round(pred["score"], 4), "label": pred["label"], "box": pred["box"]} for pred in preds]
preds

[{'score': 0.9864,
  'label': 'cat',
  'box': {'xmin': 178, 'ymin': 154, 'xmax': 882, 'ymax': 598}}]

![](data/image/cat_dog.jpg)

In [55]:

preds = detector(
    "data/image/cat_dog.jpg"
)
preds = [{"score": round(pred["score"], 4), "label": pred["label"], "box": pred["box"]} for pred in preds]
preds

[{'score': 0.9985,
  'label': 'cat',
  'box': {'xmin': 78, 'ymin': 57, 'xmax': 309, 'ymax': 371}},
 {'score': 0.989,
  'label': 'dog',
  'box': {'xmin': 279, 'ymin': 20, 'xmax': 482, 'ymax': 416}}]

In [19]:
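import os
import subprocess

# Environment variables are set before importing transformers,
# because huggingface_hub reads them at import time.
# Set the Hugging Face endpoint (only needed when using a mirror)
os.environ['HF_ENDPOINT'] = 'https://hf-mirror.com'

# HF_HOME was set in an earlier cell; print it to confirm the cache location
print(os.environ['HF_HOME'])

# Copy the proxy variables exported by the environment-specific /etc/network_turbo script into this process
result = subprocess.run('bash -c "source /etc/network_turbo && env | grep proxy"', shell=True, capture_output=True, text=True)
output = result.stdout
for line in output.splitlines():
    if '=' in line:
        var, value = line.split('=', 1)
        os.environ[var] = value

# os.environ['HF_HOME'] = '/root/autodl-tmp/cache/'
# Set the download location for timm models
# os.environ['TORCH_HOME'] = '/root/autodl-tmp/timm/models'

from transformers import pipeline

# Same task, different checkpoint: YOLOS-tiny instead of the default DETR model
detector = pipeline(task="object-detection",model="hustvl/yolos-tiny")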

/root/autodl-tmp/cache/


In [18]:
os.environ['HF_ENDPOINT'] = 'https://hf-mirror.com'
preds = detector(
    "/root/autodl-tmp/project/transformers/data/image/cat-chonk.jpeg"
)
preds = [{"score": round(pred["score"], 4), "label": pred["label"], "box": pred["box"]} for pred in preds]
preds

[{'score': 0.8501,
  'label': 'bear',
  'box': {'xmin': 173, 'ymin': 161, 'xmax': 886, 'ymax': 594}}]

In [15]:

preds = detector(
    "data/image/cat_dog.jpg"
)
preds = [{"score": round(pred["score"], 4), "label": pred["label"], "box": pred["box"]} for pred in preds]
preds

[{'score': 0.6406,
  'label': 'dog',
  'box': {'xmin': 258, 'ymin': 18, 'xmax': 479, 'ymax': 415}},
 {'score': 0.9946,
  'label': 'cat',
  'box': {'xmin': 75, 'ymin': 60, 'xmax': 290, 'ymax': 369}},
 {'score': 0.9899,
  'label': 'dog',
  'box': {'xmin': 280, 'ymin': 18, 'xmax': 479, 'ymax': 416}}]
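
The YOLOS-tiny output above lists `dog` twice, with nearly identical boxes. One common way to quantify how much two boxes overlap is intersection over union (IoU). The sketch below computes it for the boxes printed above. The helper names `box_area` and `iou` are introduced here for illustration and are not part of the pipeline API.

```python
def box_area(box):
    """Area of a box given as a dict with xmin, ymin, xmax, ymax (0 if it is empty)."""
    return max(0, box["xmax"] - box["xmin"]) * max(0, box["ymax"] - box["ymin"])


def iou(a, b):
    """Intersection over union of two boxes in the pipeline's dict format."""
    intersection = {
        "xmin": max(a["xmin"], b["xmin"]),
        "ymin": max(a["ymin"], b["ymin"]),
        "xmax": min(a["xmax"], b["xmax"]),
        "ymax": min(a["ymax"], b["ymax"]),
    }
    inter_area = box_area(intersection)
    union_area = box_area(a) + box_area(b) - inter_area
    return inter_area / union_area if union_area else 0.0


# Boxes copied from the YOLOS-tiny output above.
dog_low = {"xmin": 258, "ymin": 18, "xmax": 479, "ymax": 415}   # score 0.6406
dog_high = {"xmin": 280, "ymin": 18, "xmax": 479, "ymax": 416}  # score 0.9899
cat = {"xmin": 75, "ymin": 60, "xmax": 290, "ymax": 369}        # score 0.9946

print(round(iou(dog_low, dog_high), 3))  # the two dog boxes
print(round(iou(cat, dog_high), 3))      # the cat and dog boxes
```

A high IoU between two boxes with the same label usually means they describe the same object, so a post-processing step such as non-maximum suppression could keep only the higher-scoring one. The DETR output for the same image contains a single `dog` box, which is one concrete difference to note when comparing the two models.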

### Homework: replace the models in the examples above and compare how different models perform on the same task

Find a model that suits your needs on Hugging Face Models: https://huggingface.co/models