# CH08-02: Pipeline API Quick Start

**Course**: iSpan Python NLP Cookbooks v2
**Chapter**: CH08 Hugging Face Libraries in Practice
**Version**: v1.0
**Last updated**: 2025-10-17

---

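## 📚 Learning Objectives

1. Understand how a Pipeline works internally
2. Use the Pipeline's advanced parameters
3. Apply batching and performance-tuning techniques
4. Customize Pipeline parameters and post-processing logic
5. Integrate Pipelines into real applications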

---

## 1. Inside the Pipeline

### 1.1 The Three Stages of a Pipeline

```
Pipeline execution flow:

Input Text
    ↓
┌─────────────────┐
│  Preprocessing  │  ← Tokenizer encoding
│  (tokenization) │     - Tokenize
└─────────────────┘     - Add special tokens
    ↓                    - Pad / truncate
┌─────────────────┐
│   Inference     │  ← Model inference
│   (forward)     │     - Forward pass
└─────────────────┘     - Compute logits
    ↓
┌─────────────────┐
│ Postprocessing  │  ← Result parsing
│  (decode)       │     - Softmax
└─────────────────┘     - Top-k selection
    ↓                    - Format output
Output Result
```
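
The three stages can be mimicked without any library. The sketch below is a toy stand-in, not the transformers implementation: the class `ToyPipeline`, its keyword weights, and its label map are all invented for illustration, and a keyword sum plays the role of the model.

```python
import math


class ToyPipeline:
    """Toy three-stage pipeline (hypothetical, for illustration only)."""

    # Made-up keyword weights standing in for a trained model.
    WEIGHTS = {"great": 2.0, "fantastic": 3.0, "bad": -2.0, "terrible": -3.0}
    ID2LABEL = {0: "NEGATIVE", 1: "POSITIVE"}

    def preprocess(self, text):
        # Stage 1: turn raw text into model inputs (lowercased word tokens).
        return [tok.strip(".,!?") for tok in text.lower().split()]

    def forward(self, tokens):
        # Stage 2: produce one raw score (logit) per class.
        score = sum(self.WEIGHTS.get(tok, 0.0) for tok in tokens)
        return [-score, score]

    def postprocess(self, logits):
        # Stage 3: softmax, pick the best class, format the output.
        peak = max(logits)
        exps = [math.exp(x - peak) for x in logits]
        total = sum(exps)
        probs = [e / total for e in exps]
        best = probs.index(max(probs))
        return {"label": self.ID2LABEL[best], "score": probs[best]}

    def __call__(self, text):
        return self.postprocess(self.forward(self.preprocess(text)))


toy = ToyPipeline()
result = toy("This movie is absolutely fantastic!")
print(result)
```

Keeping the stages as separate methods makes it easy to inspect the intermediate values of each one, which is the same idea the manual walkthrough in section 1.2 applies to a real model.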

In [None]:
# Install the required packages first
# !pip install transformers torch -q

from transformers import pipeline
import torch

# Create a sentiment-analysis Pipeline (downloads the task's default model)
classifier = pipeline("sentiment-analysis")

# Inspect the Pipeline's internal components
print("Pipeline components:")
print(f"1. Model: {classifier.model.__class__.__name__}")
print(f"2. Tokenizer: {classifier.tokenizer.__class__.__name__}")
print(f"3. Device: {classifier.device}")
print(f"4. Framework: {classifier.framework}")

### 1.2 Running the Pipeline Stages by Hand

In [None]:
# Run each stage manually with the Pipeline's own components
text = "This movie is absolutely fantastic!"

# Step 1: Preprocessing (Tokenization)
inputs = classifier.tokenizer(
    text,
    return_tensors="pt",
    padding=True,
    truncation=True
)
print("Step 1 - Tokenization:")
print(f"input_ids: {inputs['input_ids']}")
print(f"attention_mask: {inputs['attention_mask']}")

# Step 2: Inference (Forward Pass); no_grad skips gradient bookkeeping
with torch.no_grad():
    outputs = classifier.model(**inputs)
    logits = outputs.logits

print("\nStep 2 - Model Inference:")
print(f"logits: {logits}")

# Step 3: Postprocessing (Decode): logits -> probabilities -> best class
predictions = torch.softmax(logits, dim=-1)
predicted_class = torch.argmax(predictions, dim=-1).item()
confidence = predictions[0][predicted_class].item()

# Map the class index to its label name
id2label = classifier.model.config.id2label
label = id2label[predicted_class]

print("\nStep 3 - Postprocessing:")
print(f"Predicted Label: {label}")
print(f"Confidence: {confidence:.4f}")

# Compare with the Pipeline's direct output
print("\nDirect Pipeline output:")
print(classifier(text))
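
To make the `padding=True` and `attention_mask` outputs of Step 1 concrete, here is a minimal library-free sketch of what padding a batch amounts to. The helper `pad_batch` and the token IDs are hypothetical; a real tokenizer also handles truncation, special tokens, and the padding side.

```python
def pad_batch(sequences, pad_id=0):
    """Right-pad each ID list to the longest one and build attention masks."""
    longest = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = longest - len(seq)
        input_ids.append(seq + [pad_id] * n_pad)
        # 1 marks a real token, 0 marks padding the model should ignore.
        attention_mask.append([1] * len(seq) + [0] * n_pad)
    return {"input_ids": input_ids, "attention_mask": attention_mask}


# Made-up token IDs; 101 and 102 stand in for the [CLS] and [SEP] markers.
batch = pad_batch([[101, 2023, 3185, 102], [101, 2307, 102]])
print(batch["input_ids"])       # [[101, 2023, 3185, 102], [101, 2307, 102, 0]]
print(batch["attention_mask"])  # [[1, 1, 1, 1], [1, 1, 1, 0]]
```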

---

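## 2. Advanced Pipeline Parameters

### 2.1 Core Parameters

| Parameter | Description | Example values | Typical use |
|------|------|--------|----------|
| `model` | Model to load | `"bert-base-uncased"` | Use a specific model |
| `tokenizer` | Tokenizer to load | `"bert-base-uncased"` | Override the model's default tokenizer |
| `device` | Compute device | `0` (first GPU), `-1` (CPU) | GPU acceleration |
| `batch_size` | Batch size | `8`, `16`, `32` | Batched inference |
| `return_all_scores` | Return the score of every class (deprecated in recent releases; prefer `top_k=None`) | `True`, `False` | Inspect all probabilities |
| `top_k` | Return the k highest-scoring results | `3`, `5` | Multiple candidates |
| `max_length` | Maximum sequence length in tokens | `512`, `128` | Limit input length |
| `truncation` | Truncation strategy | `True`, `"only_first"` | Long inputs |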
### 2.2 Choosing the Model and Device

In [None]:
# Option 1: use the task's default model
pipe_default = pipeline("sentiment-analysis")
print(f"Default model: {pipe_default.model.config._name_or_path}")

# Option 2: name a specific model
pipe_custom = pipeline(
    "sentiment-analysis",
    model="cardiffnlp/twitter-roberta-base-sentiment-latest"
)
print(f"Custom model: {pipe_custom.model.config._name_or_path}")

# Option 3: run on the GPU when one is available
device = 0 if torch.cuda.is_available() else -1
pipe_gpu = pipeline(
    "sentiment-analysis",
    device=device
)
print(f"Device: {'GPU' if device == 0 else 'CPU'}")

### 2.3 Returning All Scores and Top-K Results

In [None]:
# Create a zero-shot classification Pipeline
classifier = pipeline(
    "zero-shot-classification",
    model="facebook/bart-large-mnli"
)

# Test text
text = "I love programming in Python!"
candidate_labels = ["technology", "sports", "politics", "entertainment"]

# The result holds every candidate label, sorted by score (highest first);
# index 0 is therefore the top-1 prediction
result = classifier(text, candidate_labels)
print("Top-1 prediction:")
print(f"Label: {result['labels'][0]}")
print(f"Score: {result['scores'][0]:.4f}")

# Print the score of every label
result_all = classifier(
    text,
    candidate_labels,
    multi_label=False  # the default: scores are normalized across labels and sum to 1
)
print("\nScores for all labels:")
for label, score in zip(result_all['labels'], result_all['scores']):
    print(f"{label:15s}: {score:.4f}")
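
The effect of `multi_label` is easier to see with numbers. The sketch below uses hypothetical (contradiction, entailment) logits for each candidate label: with `multi_label=False` the entailment logits compete in one softmax across labels, while with `multi_label=True` each label is scored on its own. It is a simplified illustration of that scoring, not the pipeline's code.

```python
import math


def softmax(xs):
    peak = max(xs)
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical (contradiction, entailment) logits per candidate label.
nli_logits = {
    "technology": (-2.0, 3.0),
    "sports": (1.5, -1.0),
    "entertainment": (0.0, 0.5),
}
labels = list(nli_logits)

# multi_label=False: one softmax over the entailment logits of all labels.
single = dict(zip(labels, softmax([nli_logits[name][1] for name in labels])))

# multi_label=True: per label, softmax over (contradiction, entailment).
multi = {name: softmax(list(pair))[1] for name, pair in nli_logits.items()}

for name in labels:
    print(f"{name:15s} single={single[name]:.4f} multi={multi[name]:.4f}")
```

In the single-label case the scores sum to 1, so a strong label pushes the others down; in the multi-label case several labels can score high at once.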

---

## 3. Batching and Performance Tuning

### 3.1 Batching Basics

In [None]:
import time

# Prepare test data
texts = [
    "This is great!",
    "I hate this product.",
    "Not bad, could be better.",
    "Absolutely amazing experience!",
    "Terrible service, very disappointed."
] * 20  # 100 texts

classifier = pipeline("sentiment-analysis", device=-1)

# Option 1: one call per text (usually slower)
start = time.time()
results_single = [classifier(text)[0] for text in texts]
time_single = time.time() - start

print(f"One-by-one time: {time_single:.2f}s")

# Option 2: pass the whole list with a batch size
# (often faster, especially on a GPU; the gain on CPU can be small)
start = time.time()
results_batch = classifier(texts, batch_size=16)
time_batch = time.time() - start

print(f"Batched time: {time_batch:.2f}s")
print(f"Speedup: {time_single/time_batch:.2f}x")
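
Conceptually, `batch_size=16` means the inputs are cut into groups of at most 16 before each forward pass. A minimal sketch of that grouping follows; the helper name `iter_batches` is hypothetical and the real pipeline does this internally.

```python
def iter_batches(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


texts = [f"review {i}" for i in range(100)]
batch_lengths = [len(batch) for batch in iter_batches(texts, 16)]
print(batch_lengths)  # [16, 16, 16, 16, 16, 16, 4]
```

With 100 texts the last batch holds only 4 items, so the number of forward passes is 7 rather than 100.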

### 3.2 Choosing a Batch Size

In [None]:
import matplotlib.pyplot as plt

# Try several batch sizes
batch_sizes = [1, 2, 4, 8, 16, 32]
processing_times = []

test_texts = texts[:50]  # use 50 texts for the test

for bs in batch_sizes:
    start = time.time()
    _ = classifier(test_texts, batch_size=bs)
    elapsed = time.time() - start
    processing_times.append(elapsed)
    print(f"Batch Size {bs:2d}: {elapsed:.3f}s")

# Plot the timing curve
plt.figure(figsize=(10, 6))
plt.plot(batch_sizes, processing_times, marker='o', linewidth=2)
plt.xlabel('Batch Size', fontsize=12)
plt.ylabel('Processing Time (s)', fontsize=12)
plt.title('Batch Size vs Processing Time', fontsize=14)
plt.grid(True, alpha=0.3)
plt.xticks(batch_sizes)
plt.show()

# Pick the fastest batch size measured in this single run
best_idx = processing_times.index(min(processing_times))
print(f"\nBest Batch Size: {batch_sizes[best_idx]}")
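
One reason a larger batch is not always faster is padding: every sequence in a batch is padded to the longest one, so a single long text inflates the whole batch. The sketch below estimates that overhead for hypothetical token counts; `padding_fraction` is an illustrative helper, not a library function.

```python
def padding_fraction(lengths, batch_size):
    """Share of token slots that are padding when each batch is padded
    to the length of its longest item."""
    real = sum(lengths)
    padded = 0
    for start in range(0, len(lengths), batch_size):
        batch = lengths[start:start + batch_size]
        padded += max(batch) * len(batch)
    return 1 - real / padded


# Hypothetical token counts: seven short texts and one long one.
lengths = [5, 7, 6, 400, 8, 6, 5, 9]
for bs in (1, 2, 4, 8):
    print(f"batch_size={bs}: {padding_fraction(lengths, bs):.1%} padding")
```

With these made-up lengths the wasted share grows with the batch size, which is why timing on your own data, as in the cell above, is more reliable than a fixed rule.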

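### 3.3 Strategies for Long Texts

In [None]:
from collections import Counter

# Build a long text
long_text = "This is a great product. " * 200  # well over 512 tokens

# Strategy 1: truncate to the first 512 tokens
pipe_truncate = pipeline(
    "sentiment-analysis",
    truncation=True,
    max_length=512
)
result1 = pipe_truncate(long_text)
print("Strategy 1 - truncation:")
print(result1)

# Strategy 2: split into chunks and vote
def chunk_text(text, max_length=400, overlap=50):
    """Split text into overlapping chunks of at most max_length words."""
    if overlap >= max_length:
        raise ValueError("overlap must be smaller than max_length")
    words = text.split()
    chunks = []

    for i in range(0, len(words), max_length - overlap):
        chunk = ' '.join(words[i:i + max_length])
        chunks.append(chunk)

    return chunks

chunks = chunk_text(long_text)
print(f"\nStrategy 2 - chunking ({len(chunks)} chunks):")

# Predict each chunk; chunks are counted in words, so truncation=True
# still guards against a chunk that exceeds 512 tokens
chunk_results = pipe_truncate(chunks)

# Majority vote decides the final label
labels = [r['label'] for r in chunk_results]
final_label = Counter(labels).most_common(1)[0][0]
# Average the confidence of the chunks that voted for the winning label
winning_scores = [r['score'] for r in chunk_results if r['label'] == final_label]
avg_score = sum(winning_scores) / len(winning_scores)

print(f"Final prediction: {final_label} (mean confidence: {avg_score:.4f})")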
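
A plain majority vote treats a barely confident chunk the same as a very confident one. As an alternative, the sketch below weights each vote by its score. The function `aggregate_chunks` and the sample results are hypothetical, and the returned score is a share of the summed confidence, not a calibrated probability.

```python
from collections import defaultdict


def aggregate_chunks(chunk_results):
    """Combine per-chunk predictions with a confidence-weighted vote."""
    totals = defaultdict(float)
    for result in chunk_results:
        totals[result["label"]] += result["score"]
    label = max(totals, key=totals.get)
    # Share of the total confidence mass held by the winning label.
    return {"label": label, "score": totals[label] / sum(totals.values())}


# Made-up outputs in the same shape a sentiment pipeline returns.
sample_results = [
    {"label": "POSITIVE", "score": 0.99},
    {"label": "POSITIVE", "score": 0.95},
    {"label": "NEGATIVE", "score": 0.60},
]
final = aggregate_chunks(sample_results)
print(final["label"], round(final["score"], 4))
```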

---

## 4. Customizing Pipeline Parameters

### 4.1 Custom Tokenizer Parameters

In [None]:
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

# Load the model and tokenizer
model_name = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Pass tokenizer options through the Pipeline.
# return_tensors is not passed here: the Pipeline sets it itself, and
# supplying it again raises a "multiple values" TypeError.
custom_pipeline = pipeline(
    "sentiment-analysis",
    model=model,
    tokenizer=tokenizer,
    padding="max_length",     # padding strategy
    truncation=True,           # truncate long inputs
    max_length=128             # maximum length in tokens
)

# Test
test_text = "I absolutely love this!"
result = custom_pipeline(test_text)
print(f"Result: {result}")

# Count the tokens of the unpadded input
tokens = tokenizer(test_text, return_tensors="pt")
print(f"Token count: {tokens['input_ids'].shape[1]}")

### 4.2 Custom Post-processing Logic

In [None]:
from transformers import pipeline

class CustomSentimentPipeline:
    def __init__(self, model_name):
        self.pipe = pipeline("sentiment-analysis", model=model_name)

    def __call__(self, texts, threshold=0.6):
        """Custom post-processing: relabel low-confidence results as NEUTRAL."""
        results = self.pipe(texts)

        # Normalize a single result to a list
        if not isinstance(results, list):
            results = [results]

        # Custom post-processing logic
        processed_results = []
        for result in results:
            if result['score'] < threshold:
                result = {
                    'label': 'NEUTRAL',
                    # 1 - score is a rough uncertainty value,
                    # not a calibrated NEUTRAL probability
                    'score': 1 - result['score']
                }
            processed_results.append(result)

        return processed_results

# Test the custom Pipeline
custom_pipe = CustomSentimentPipeline(
    "distilbert-base-uncased-finetuned-sst-2-english"
)

test_cases = [
    "This is absolutely fantastic!",  # expected: high-confidence POSITIVE
    "It's okay, I guess.",             # may fall below the threshold → NEUTRAL
    "Terrible experience!"             # expected: high-confidence NEGATIVE
]

results = custom_pipe(test_cases, threshold=0.7)

for text, result in zip(test_cases, results):
    print(f"Text: {text}")
    print(f"Result: {result['label']} ({result['score']:.4f})\n")

---

## 5. A Closer Look at Pipeline Tasks

### 5.1 Fill-Mask

In [None]:
# Fill-Mask Pipeline
unmasker = pipeline("fill-mask", model="bert-base-uncased")

# Test text ([MASK] is this model's mask token)
text = "Hugging Face is [MASK] for NLP tasks."
results = unmasker(text, top_k=5)

print(f"Original sentence: {text}\n")
print("Top 5 predictions:")
for i, result in enumerate(results, 1):
    print(f"{i}. {result['sequence']}")
    print(f"   Token: {result['token_str']}, Score: {result['score']:.4f}\n")

### 5.2 Text Generation

In [None]:
# Text Generation Pipeline
generator = pipeline(
    "text-generation",
    model="gpt2",
    device=-1
)

# Basic generation
prompt = "Artificial intelligence is"
result = generator(
    prompt,
    max_length=50,            # total length in tokens, prompt included
    num_return_sequences=3,
    temperature=0.8,          # randomness; must be > 0, lower is more deterministic
    top_k=50,                 # top-k sampling
    top_p=0.95,               # nucleus sampling
    do_sample=True            # enable sampling
)

print(f"Prompt: {prompt}\n")
for i, gen in enumerate(result, 1):
    print(f"Generation {i}:")
    print(gen['generated_text'])
    print()

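**Generation parameters**:

| Parameter | Typical range | Description | Effect |
|------|------|------|------|
| `temperature` | 0.1-2.0 (must be > 0) | Controls randomness | Higher is more varied, lower is more deterministic |
| `top_k` | 1-100 | Top-k sampling | Sample only from the k most probable tokens |
| `top_p` | 0.0-1.0 | Nucleus sampling | Sample from the smallest token set whose cumulative probability reaches p |
| `repetition_penalty` | 1.0-2.0 | Repetition penalty | Discourages tokens that were already generated |
| `num_beams` | 1-10 | Beam search width | Usually better output, but slower |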
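
To see how `temperature`, `top_k`, and `top_p` reshape the next-token distribution, here is a simplified, library-free sketch. The function `sampling_distribution` and the logits are hypothetical; the filters are applied in the order temperature, then top-k, then top-p, which is an assumption of this sketch and not a statement about the library's internals.

```python
import math


def sampling_distribution(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Return {token: probability} after temperature, top-k and top-p filtering."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0")
    # Temperature: divide the logits, then apply a stable softmax.
    scaled = {tok: x / temperature for tok, x in logits.items()}
    peak = max(scaled.values())
    exps = {tok: math.exp(x - peak) for tok, x in scaled.items()}
    total = sum(exps.values())
    ranked = sorted(((e / total, tok) for tok, e in exps.items()), reverse=True)
    # Top-k: keep the k most probable tokens and renormalize (0 disables it).
    if top_k > 0:
        ranked = ranked[:top_k]
    norm = sum(p for p, _ in ranked)
    ranked = [(p / norm, tok) for p, tok in ranked]
    # Top-p: keep the smallest prefix whose cumulative probability reaches p.
    kept, cumulative = [], 0.0
    for prob, tok in ranked:
        kept.append((prob, tok))
        cumulative += prob
        if cumulative >= top_p:
            break
    norm = sum(p for p, _ in kept)
    return {tok: p / norm for p, tok in kept}


# Made-up next-token logits.
logits = {"the": 2.0, "a": 1.0, "cat": 0.0, "zebra": -3.0}
dist = sampling_distribution(logits, temperature=1.0, top_k=3, top_p=0.9)
print(dist)
sharp = sampling_distribution(logits, temperature=0.1)
print(max(sharp, key=sharp.get), round(sharp["the"], 4))
```

With these numbers, top-k drops `zebra` and top-p then drops `cat`, leaving only `the` and `a` to sample from; the low temperature in the second call concentrates almost all probability on `the`.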

### 5.3 Question Answering

In [None]:
# Question Answering Pipeline
qa_pipeline = pipeline(
    "question-answering",
    model="distilbert-base-cased-distilled-squad"
)

# Prepare the context and the questions
context = """
Hugging Face is a company founded in 2016 that specializes in natural language processing.
The company is headquartered in New York City and Paris.
Their Transformers library has over 100,000 stars on GitHub and is used by thousands of companies.
"""

questions = [
    "When was Hugging Face founded?",
    "Where is Hugging Face headquartered?",
    "How many stars does the Transformers library have?"
]

for question in questions:
    # With top_k=1 the result is a single dict rather than a list
    result = qa_pipeline(
        question=question,
        context=context,
        top_k=1
    )

    print(f"Q: {question}")
    print(f"A: {result['answer']}")
    print(f"   Confidence: {result['score']:.4f}")
    print(f"   Span (character offsets): {result['start']}-{result['end']}\n")
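
An extractive QA model scores every token as a possible answer start and as a possible answer end; post-processing then picks the best valid span. The sketch below shows that selection on made-up logits. The helper `best_span` and its `max_answer_len` default are hypothetical, and the real pipeline additionally maps token positions back to character offsets.

```python
def best_span(start_logits, end_logits, max_answer_len=15):
    """Return (score, start, end) for the best span with start <= end."""
    best = None
    for start, start_logit in enumerate(start_logits):
        last = min(start + max_answer_len, len(end_logits))
        for end in range(start, last):
            score = start_logit + end_logits[end]
            if best is None or score > best[0]:
                best = (score, start, end)
    return best


# Made-up tokens and logits for the question "When was Hugging Face founded?"
tokens = ["Hugging", "Face", "was", "founded", "in", "2016", "."]
start_logits = [0.1, 0.0, -1.0, 0.2, 1.0, 4.0, -2.0]
end_logits = [0.0, 0.3, -1.0, 0.1, 0.2, 5.0, -1.0]

score, start, end = best_span(start_logits, end_logits)
answer = " ".join(tokens[start:end + 1])
print(answer, score)  # 2016 9.0
```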

### 5.4 Summarization

In [None]:
# Summarization Pipeline
summarizer = pipeline(
    "summarization",
    model="facebook/bart-large-cnn"
)

# A longer sample text
article = """
The Transformer architecture, introduced in the paper "Attention Is All You Need" by Vaswani et al. in 2017,
revolutionized natural language processing. Unlike previous architectures that relied on recurrent or convolutional layers,
Transformers use self-attention mechanisms to process input sequences in parallel. This parallel processing capability
makes Transformers significantly faster to train than RNNs. The architecture consists of an encoder and a decoder,
each composed of multiple layers of self-attention and feed-forward networks. The self-attention mechanism allows
the model to weigh the importance of different words in a sentence when encoding each word. This has proven to be
extremely effective for a wide range of NLP tasks, from translation to text generation.
"""

# Generate the summary
summary = summarizer(
    article,
    max_length=60,
    min_length=30,
    do_sample=False  # no sampling: deterministic decoding (beam search if the model config sets num_beams > 1)
)

print("Original length (words):", len(article.split()))
print("\nOriginal text:")
print(article.strip())
print("\nSummary:")
print(summary[0]['summary_text'])
print("\nSummary length (words):", len(summary[0]['summary_text'].split()))

---

## 6. Case Study: A Multi-task NLP Application

### 6.1 An Assistant That Combines Several Pipelines

In [None]:
class MultiTaskNLPAssistant:
    def __init__(self):
        # Initialize the Pipelines that analyze() uses
        self.sentiment = pipeline("sentiment-analysis")
        self.ner = pipeline("ner", aggregation_strategy="simple")
        self.summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

    def analyze(self, text):
        """Run sentiment analysis, NER and (for longer texts) summarization."""
        print("="*60)
        print("NLP Assistant Analysis Report")
        print("="*60)

        # 1. Sentiment analysis
        sentiment = self.sentiment(text)[0]
        print("\n📊 Sentiment:")
        print(f"   {sentiment['label']} (confidence: {sentiment['score']:.2%})")

        # 2. Named entity recognition
        entities = self.ner(text)
        print("\n🏷️  Entities:")
        if entities:
            for ent in entities:
                print(f"   {ent['word']:20s} → {ent['entity_group']} ({ent['score']:.2%})")
        else:
            print("   (no entities found)")

        # 3. Summarization, only when the text is long enough
        #    (threshold of 40 words so that the 49-word sample below qualifies)
        if len(text.split()) > 40:
            summary = self.summarizer(text, max_length=50, min_length=20)[0]
            print("\n📝 Summary:")
            print(f"   {summary['summary_text']}")

        print("\n" + "="*60)

# Test the assistant
assistant = MultiTaskNLPAssistant()

test_text = """
Apple Inc. announced today that Tim Cook, the company's CEO, will speak at a conference in San Francisco next week.
The event is expected to unveil new products including the latest iPhone model.
Investors are excited about the announcement, and Apple's stock price rose by 3% in after-hours trading.
"""

assistant.analyze(test_text.strip())

### 6.2 A Conversational System That Chains Pipelines

In [None]:
class ConversationalAssistant:
    def __init__(self):
        self.sentiment = pipeline("sentiment-analysis")
        self.qa = pipeline("question-answering")

        # Knowledge base used as the QA context
        self.knowledge_base = """
        Hugging Face is a company specializing in NLP. It was founded in 2016.
        The company provides the Transformers library, which supports over 50,000 pretrained models.
        """

    def respond(self, user_input):
        # 1. Detect the sentiment
        sentiment = self.sentiment(user_input)[0]

        # 2. Decide whether the input is a question
        if "?" in user_input or user_input.lower().startswith(("what", "when", "who", "where", "how")):
            # Answer with the QA Pipeline
            try:
                answer = self.qa(
                    question=user_input,
                    context=self.knowledge_base
                )
                return f"Based on what I know: {answer['answer']}"
            except Exception:
                return "Sorry, I can't answer that question."

        # 3. Otherwise reply according to the sentiment
        if sentiment['label'] == 'NEGATIVE':
            return "It sounds like you're not happy. Is there anything I can do for you?"
        else:
            return "Glad to hear that! Is there anything else I can help with?"

# Test the conversational system
chatbot = ConversationalAssistant()

conversations = [
    "When was Hugging Face founded?",
    "I'm really frustrated with this!",
    "This is working great, thank you!"
]

for user_msg in conversations:
    bot_reply = chatbot.respond(user_msg)
    print(f"User: {user_msg}")
    print(f"Assistant: {bot_reply}\n")

---

## 7. Advanced Performance Techniques

### 7.1 Model Quantization

In [None]:
import os

from transformers import pipeline, AutoModelForSequenceClassification, AutoTokenizer
import torch

model_name = "distilbert-base-uncased-finetuned-sst-2-english"

# Original FP32 model
model_fp32 = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Dynamic quantization (PyTorch, CPU inference)
model_int8 = torch.quantization.quantize_dynamic(
    model_fp32,
    {torch.nn.Linear},  # layer types to quantize
    dtype=torch.qint8
)

# Compare model sizes by saving each state dict to a temporary file
def get_model_size(model):
    torch.save(model.state_dict(), "temp.p")
    size = os.path.getsize("temp.p") / 1e6  # MB
    os.remove("temp.p")
    return size

size_fp32 = get_model_size(model_fp32)
size_int8 = get_model_size(model_int8)

print(f"FP32 model size: {size_fp32:.2f} MB")
print(f"INT8 model size: {size_int8:.2f} MB")
print(f"Compression ratio: {size_fp32/size_int8:.2f}x")
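
The arithmetic behind INT8 quantization fits in a few lines. The sketch below uses a symmetric per-tensor scheme, which is an assumption made for illustration; the exact scheme PyTorch applies to each layer may differ. The helpers `quantize_int8` and `dequantize` and the weights are hypothetical.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: floats -> integers in [-127, 127]."""
    scale = max(abs(v) for v in values) / 127 or 1.0  # avoid a zero scale
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Map the integers back to approximate float values."""
    return [q * scale for q in quantized]


weights = [0.42, -1.27, 0.003, 0.88, -0.5]  # made-up FP32 weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print("int8 values:", q)
print(f"scale: {scale:.4f}, max round-trip error: {max_error:.4f}")
```

Each weight is stored in 1 byte instead of 4, and the rounding error per weight is at most half the scale. Only the quantized layers shrink, so the compression ratio measured on a whole model is typically below 4x.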

### 7.2 Model Caching

In [None]:
# Set up a cache directory
import os
from pathlib import Path

# Custom cache location
cache_dir = Path("./model_cache")
cache_dir.mkdir(exist_ok=True)

# Download the model and cache it in that directory
pipe = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
    model_kwargs={"cache_dir": str(cache_dir)}
)

print(f"Model cached in: {cache_dir}")
print("Cached files:")
for file in cache_dir.rglob("*"):
    if file.is_file():
        print(f"  {file.name} ({file.stat().st_size / 1e6:.2f} MB)")

---

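## 8. Exercises

### Exercise 1: Batch Performance Tuning

Compare how different batch sizes affect inference time and find the best setting.

In [None]:
# TODO: implement the batch performance test
# 1. Prepare 500 test texts
# 2. Test batch_size = [1, 4, 8, 16, 32, 64]
# 3. Record the processing time of each setting
# 4. Plot the timing curve
# 5. Analyze which batch size works best

### Exercise 2: Custom Pipeline Post-processing

Create a sentiment-analysis Pipeline that labels a result "UNCERTAIN" when its confidence falls below a threshold.

In [None]:
# TODO: implement the custom post-processing logic
# 1. Subclass the pipeline or write a wrapper class
# 2. Add a threshold parameter
# 3. Label low-confidence samples as UNCERTAIN
# 4. Test the effect of different thresholds

### Exercise 3: Multilingual Support

Use a multilingual model (such as xlm-roberta) to create a sentiment-analysis Pipeline that supports both Chinese and English.

In [None]:
# TODO: implement multilingual sentiment analysis
# 1. Load an xlm-roberta checkpoint fine-tuned for sentiment
#    (plain xlm-roberta-base has no trained classification head;
#    cardiffnlp/twitter-xlm-roberta-base-sentiment is one option)
# 2. Test English and Chinese texts
# 3. Compare the predictions across languages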

---

## 9. Summary

### ✅ Key Points

1. **Pipeline internals**:
   - Preprocessing (Tokenizer) → Inference (Model) → Postprocessing
   - Knowing the three stages helps with debugging and tuning

2. **Performance strategies**:
   - Batch processing
   - Choosing a suitable batch size
   - Model quantization
   - Caching

3. **Advanced parameters**:
   - `device`: choose CPU or GPU
   - `batch_size`: batch size
   - `top_k`, `return_all_scores`: control the output
   - `max_length`, `truncation`: handle long texts

4. **Practical applications**:
   - Multi-task NLP assistant
   - Conversational system integration
   - Custom post-processing logic

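### 📊 Performance Comparison

The figures below are rough rules of thumb; actual gains depend on the model, the hardware, and the input lengths.

| Optimization | Speedup | Memory saving | Accuracy loss |
|---------|---------|-----------|----------|
| Batch processing | 2-5x | - | None |
| INT8 quantization | 1.5-2x | up to 4x | Small (<1%) |
| Model distillation | 2-3x | 2-4x | Small (1-3%) |
| GPU acceleration | 5-10x | - | None |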

### 📚 Further Reading

- [Pipeline documentation](https://huggingface.co/docs/transformers/main_classes/pipelines)
- [Model quantization guide](https://huggingface.co/docs/optimum/concept_guides/quantization)
- [Performance tuning tips](https://huggingface.co/docs/transformers/performance)

### 🚀 Next Section

**CH08-03: Sentiment Analysis in Practice**
- Working with a real Twitter dataset
- Model fine-tuning and evaluation
- Deploying to production

---

**Course**: iSpan Python NLP Cookbooks v2
**Instructor**: Claude AI
**Last updated**: 2025-10-17