# Sentiment analysis with the Transformers package

In [1]:
#!pip install -U transformers


In [None]:
!pip install tf-keras

https://huggingface.co/docs/transformers/index


In [1]:
# Import the required packages
from transformers import pipeline


## Sentiment analysis

https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english
https://huggingface.co/docs/transformers/main_classes/pipelines

In [2]:
# Load the model: DistilBERT (base, uncased) fine-tuned on SST-2
classifier = pipeline(
    "sentiment-analysis", model="distilbert-base-uncased-finetuned-sst-2-english"
)

Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.
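
The warning above says the pipeline was created without a `device` argument, so the model runs on the CPU even though a GPU is available. Below is a minimal sketch of one way to choose a device, assuming PyTorch is the backend in use. `pick_device` is a hypothetical helper name, not part of Transformers.

```python
def pick_device() -> int:
    """Return 0 (first CUDA GPU) if PyTorch can see one, otherwise -1 (CPU)."""
    try:
        import torch  # assumption: PyTorch is the installed backend
    except ImportError:
        return -1
    return 0 if torch.cuda.is_available() else -1


device = pick_device()
print(device)

# Usage (downloads the model on first run):
# from transformers import pipeline
# classifier = pipeline(
#     "sentiment-analysis",
#     model="distilbert-base-uncased-finetuned-sst-2-english",
#     device=device,
# )
```

Passing the result as `device=` to `pipeline` should place the model on the GPU when one is found and keep it on the CPU otherwise.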


In [3]:
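# Positive
print(classifier("We are very happy to show you the 🤗 Transformers library."))

# Negative
print(classifier("I hate this movie."))

# Negated sentences are classified correctly too
print(classifier("the movie is not bad."))
# A neutral statement: the model still has to pick one of its two labels
print(classifier("I have to work"))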

[{'label': 'POSITIVE', 'score': 0.9997795224189758}]
[{'label': 'NEGATIVE', 'score': 0.9996869564056396}]
[{'label': 'POSITIVE', 'score': 0.999536395072937}]
[{'label': 'POSITIVE', 'score': 0.5919747352600098}]
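
The last result is far less confident than the first three (about 0.59 against more than 0.99), which suggests the sentence is close to neutral. One possible way to handle such cases is to treat low-scoring predictions as uncertain. The sketch below works on the results printed above. The helper name `label_or_uncertain` and the 0.75 threshold are assumptions chosen for illustration, not values recommended by the library.

```python
# Results copied from the output above (one list per classifier call).
results = [
    [{"label": "POSITIVE", "score": 0.9997795224189758}],
    [{"label": "NEGATIVE", "score": 0.9996869564056396}],
    [{"label": "POSITIVE", "score": 0.999536395072937}],
    [{"label": "POSITIVE", "score": 0.5919747352600098}],
]


def label_or_uncertain(prediction, threshold=0.75):
    """Return the predicted label, or 'UNCERTAIN' if the score is below threshold."""
    # threshold=0.75 is an arbitrary illustrative cutoff; tune it on your own data.
    if prediction["score"] >= threshold:
        return prediction["label"]
    return "UNCERTAIN"


labels = [label_or_uncertain(r[0]) for r in results]
print(labels)
# ['POSITIVE', 'NEGATIVE', 'POSITIVE', 'UNCERTAIN']
```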


In [4]:
# Classify several inputs in one call
results = classifier(["We are very happy.", "We hope you don't hate it."])
for result in results:
    print(f"label: {result['label']}, with score: {round(result['score'], 4)}")

label: POSITIVE, with score: 0.9999
label: NEGATIVE, with score: 0.5309
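
When a list is passed in, the pipeline returns one result per input, in the same order. A small sketch of pairing each input with its prediction, using the rounded values printed above rather than a live model call:

```python
texts = ["We are very happy.", "We hope you don't hate it."]

# Results as printed above (scores rounded to four decimals).
results = [
    {"label": "POSITIVE", "score": 0.9999},
    {"label": "NEGATIVE", "score": 0.5309},
]

# zip pairs each input text with the result at the same position.
paired = [
    {"text": text, "label": result["label"], "score": result["score"]}
    for text, result in zip(texts, results)
]

for row in paired:
    print(f"{row['text']!r} -> {row['label']} ({row['score']:.4f})")
```

Keeping the text next to its label makes it easier to spot borderline cases such as the second sentence, which is labeled NEGATIVE with a score of only 0.5309.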


In [5]:
# Load a multilingual model that supports English, French, Dutch, German, Italian, and Spanish
classifier = pipeline(
    "sentiment-analysis", model="nlptown/bert-base-multilingual-uncased-sentiment"
)


In [7]:
# Spanish
# Negative: "I hate this movie."
print(classifier("Odio esta pelicula."))

# "The movie is not bad."
print(classifier("la pelicula no esta mal."))

[{'label': '1 star', 'score': 0.4615822732448578}]
[{'label': '3 stars', 'score': 0.6274548172950745}]


In [8]:
# French
# Negative: "I hate this movie."
print(classifier("Je déteste ce film."))

# "The movie is not bad."
print(classifier("le film n'est pas mal."))

[{'label': '1 star', 'score': 0.6311177611351013}]
[{'label': '3 stars', 'score': 0.5710768103599548}]
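
Unlike the SST-2 model, this model returns star ratings such as `'1 star'` and `'3 stars'` rather than POSITIVE or NEGATIVE. The sketch below shows one way to turn those labels into integers and then into a coarse sentiment. The helper names and the bucketing (1-2 negative, 3 neutral, 4-5 positive) are assumptions made for illustration and are not defined by the model.

```python
import re


def stars_to_int(label: str) -> int:
    """Parse a label such as '1 star' or '3 stars' into an integer rating."""
    match = re.fullmatch(r"([1-5]) stars?", label)
    if match is None:
        raise ValueError(f"unexpected label: {label!r}")
    return int(match.group(1))


def coarse_sentiment(stars: int) -> str:
    """Map a star rating to a coarse sentiment (assumed bucketing)."""
    if stars <= 2:
        return "negative"
    if stars == 3:
        return "neutral"
    return "positive"


# Labels copied from the Spanish and French outputs above.
for label in ["1 star", "3 stars"]:
    stars = stars_to_int(label)
    print(label, "->", stars, coarse_sentiment(stars))
```

Under this bucketing, "I hate this movie" maps to negative and "the movie is not bad" maps to neutral in both languages.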
