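# Chapter 10: Pretrained Language Models (GPT-style)

In this chapter, we use pretrained GPT-style models (Transformer decoder-only) to work on text generation, building a sentiment analyzer (positive/negative classifier), fine-tuning, reinforcement learning, and related tasks.

## 90. Next-Word Prediction

Find the 10 most likely tokens to follow "The movie was full of" (note that this means a single token, not a token sequence), together with their probabilities (likelihoods). Also check which token sequence the prompt is converted into before it is passed to the language model.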

In [8]:
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "openai-community/gpt2"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

sentence = "The movie was full of"

inputs = tokenizer(sentence, return_tensors="pt")
input_ids = inputs.input_ids

print("Input:")
token_ids = input_ids[0].tolist()
tokens = [tokenizer.decode([token_id]) for token_id in token_ids]
for i, (token_id, token) in enumerate(zip(token_ids, tokens)):
    print(f"  {i+1}. [ID]: {token_id}, [Token]: '{token}'")

with torch.no_grad():
    outputs = model(**inputs)
    # Logits at the last position give the distribution over the next token.
    logits = outputs.logits[:, -1, :]
    probabilities = torch.nn.functional.softmax(logits, dim=-1)[0]

top_k = 10
top_k_probs, top_k_indices = torch.topk(probabilities, top_k)

print(f"\nTop {top_k} next tokens predictions:")
for i, (prob, idx) in enumerate(zip(top_k_probs, top_k_indices)):
    # Convert the 0-dim tensor to a plain int before decoding.
    token = tokenizer.decode([idx.item()])
    print(f"  {i+1}. [Token]: '{token}', [Probability]: {prob:.6f}")

Input:
  1. [ID]: 464, [Token]: 'The'
  2. [ID]: 3807, [Token]: ' movie'
  3. [ID]: 373, [Token]: ' was'
  4. [ID]: 1336, [Token]: ' full'
  5. [ID]: 286, [Token]: ' of'

Top 10 next tokens predictions:
  1. [Token]: ' jokes', [Probability]: 0.021892
  2. [Token]: ' great', [Probability]: 0.018644
  3. [Token]: ' laughs', [Probability]: 0.011524
  4. [Token]: ' bad', [Probability]: 0.010874
  5. [Token]: ' surprises', [Probability]: 0.010667
  6. [Token]: ' references', [Probability]: 0.010528
  7. [Token]: ' fun', [Probability]: 0.009992
  8. [Token]: ' humor', [Probability]: 0.007415
  9. [Token]: ' "', [Probability]: 0.007408
  10. [Token]: ' the', [Probability]: 0.006709


## 91. Predicting the Continuation

Predict several continuations of "The movie was full of". While doing so, vary the decoding method and the temperature parameter, and observe how the predicted texts change.

In [9]:
from transformers import pipeline

generator = pipeline("text-generation", model="openai-community/gpt2")
sentence = "The movie was full of"

# Sample one continuation per temperature; higher values flatten the
# next-token distribution and make the text more varied.
temperatures = [0.5, 1.0, 1.5, 2.0]
for temp in temperatures:
    print(f"\ntext with temperature {temp}:")
    generated_text = generator(
        sentence,
        max_length=50,
        num_return_sequences=1,
        do_sample=True,  # temperature only has an effect when sampling
        temperature=temp,
    )
    for i, text in enumerate(generated_text):
        print(f"  {i+1}. '{text['generated_text']}'")

Device set to use cuda:0
Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.



text with temperature 0.5:


  1. 'The movie was full of great characters and great action. It was a big hit, and I was really happy to see it. It's a very unique movie and it's very unique in that the movie has a lot of great characters and great action'

text with temperature 1.0:


  1. 'The movie was full of scenes of the people of the city that were destroyed or just a lot of empty places that weren't anything special," says Ben Johnson, who worked on the film as a security engineer. "Things that you just don't see'

text with temperature 1.5:


  1. 'The movie was full of plot and a fair bit of comedy that also reminded me more and greater of my old "Kurt-a-loo movie" films about dinosaurs in my youth... but a bit bland, didn't keep you interested or'

text with temperature 2.0:
  1. 'The movie was full of nudity and graphic sexual themes, a huge increase upon a 2012 debut starring Tatum. Fans often accused The Good Witch being violent and sexualized on that movie with plenty of hot action as they waited until 2026 to do "'
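
The outputs above become less coherent as the temperature rises. The following minimal sketch illustrates why, under the assumption that temperature is applied by dividing the logits before the softmax. The logits in `toy_logits` are made-up values for illustration, not GPT-2 outputs, and `softmax_with_temperature` is a hypothetical helper.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

toy_logits = [2.0, 1.0, 0.5, 0.0]  # hypothetical scores for four candidate tokens
for t in [0.5, 1.0, 1.5, 2.0]:
    probs = softmax_with_temperature(toy_logits, t)
    print(t, [round(p, 3) for p in probs])
```

A low temperature concentrates probability mass on the highest-scoring token, while a high temperature spreads it out, so unlikely tokens are sampled more often.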


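## 92. Computing the Probability of Predicted Text

Predict the text that follows "The movie was full of" and display the likelihood of each generated token (long generations make the output hard to read, so it is a good idea to cut off generation at a reasonable length).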

In [12]:
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
import torch.nn.functional as F

model_name = "openai-community/gpt2"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

sentence = "The movie was full of"
inputs = tokenizer(sentence, return_tensors="pt")
input_ids = inputs.input_ids


max_new_tokens = 20
with torch.no_grad():
    outputs = model.generate(
        input_ids,
        max_new_tokens=max_new_tokens,
        return_dict_in_generate=True,
        output_scores=True
    )

generated_ids = outputs.sequences[0]
generated_tokens = generated_ids.tolist()
# Keep only the newly generated tokens (drop the prompt).
new_tokens = generated_tokens[input_ids.shape[1]:]

# outputs.scores holds one score tensor per generation step; softmax turns
# each into a distribution, from which we read the chosen token's probability.
token_probs = []
for i, score in enumerate(outputs.scores):
    probs = F.softmax(score[0], dim=-1)
    token_id = new_tokens[i]
    prob = probs[token_id].item()
    token_probs.append(prob)


print("\n[Generated text]:")
print(tokenizer.decode(generated_ids))

for i, (token_id, prob) in enumerate(zip(new_tokens, token_probs), start=1):
    token_str = tokenizer.decode([token_id])
    print(f"{i:2d}. [Token]: '{token_str}', [Probability]: {prob:.6f}")


The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.



[Generated text]:
The movie was full of jokes and jokes about how the movie was a joke. It was a joke about how the movie was
 1. [Token]: ' jokes', [Probability]: 0.021892
 2. [Token]: ' and', [Probability]: 0.289225
 3. [Token]: ' jokes', [Probability]: 0.098501
 4. [Token]: ' about', [Probability]: 0.205558
 5. [Token]: ' how', [Probability]: 0.099715
 6. [Token]: ' the', [Probability]: 0.084637
 7. [Token]: ' movie', [Probability]: 0.036412
 8. [Token]: ' was', [Probability]: 0.296344
 9. [Token]: ' a', [Probability]: 0.067677
10. [Token]: ' joke', [Probability]: 0.173507
11. [Token]: '.', [Probability]: 0.280386
12. [Token]: ' It', [Probability]: 0.123000
13. [Token]: ' was', [Probability]: 0.519725
14. [Token]: ' a', [Probability]: 0.149313
15. [Token]: ' joke', [Probability]: 0.268987
16. [Token]: ' about', [Probability]: 0.424155
17. [Token]: ' how', [Probability]: 0.174168
18. [Token]: ' the', [Probability]: 0.123645
19. [Token]: ' movie', [Probability]: 0.616076
20. [Token]:
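
The per-token probabilities can be combined into a probability for the whole continuation by multiplying them, or equivalently by summing their logarithms, which avoids numerical underflow on longer texts. Here is a minimal sketch using the first five probabilities from the output above; `sequence_log_prob` is a hypothetical helper name.

```python
import math

# Probabilities of the first five generated tokens, copied from the output above.
token_probs = [0.021892, 0.289225, 0.098501, 0.205558, 0.099715]

def sequence_log_prob(probs):
    """Sum of per-token log-probabilities, i.e. the log of their product."""
    return sum(math.log(p) for p in probs)

log_p = sequence_log_prob(token_probs)
print(f"log-probability of the 5-token continuation: {log_p:.4f}")
print(f"probability of the 5-token continuation: {math.exp(log_p):.3e}")
```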

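## 93. Perplexity

Prepare some sentences and measure their perplexity with a pretrained language model. For example, measure and compare the perplexity of the following four sentences (the last two contain deliberate grammatical errors):

+ The movie was full of surprises
+ The movies were full of surprises
+ The movie were full of surprises
+ The movies was full of surprises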

In [14]:
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "openai-community/gpt2"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
model.eval()

if torch.cuda.is_available():
    model = model.cuda()
    
sentences = [
    "The movie was full of surprises",
    "The movies were full of surprises",
    "The movie were full of surprises",
    "The movies was full of surprises"
]

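def calculate_perplexity(sentence):
    """Return exp(mean negative log-likelihood) of the sentence under the model."""
    encodings = tokenizer(sentence, return_tensors="pt")
    input_ids = encodings.input_ids
    if torch.cuda.is_available():
        input_ids = input_ids.cuda()

    with torch.no_grad():
        # With labels=input_ids the model shifts the labels internally and
        # returns the mean cross-entropy over the predicted tokens.
        outputs = model(input_ids, labels=input_ids)
        loss = outputs.loss
        perplexity = torch.exp(loss).item()
    return perplexity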

for sentence in sentences:
    ppl = calculate_perplexity(sentence)
    print(f"'{sentence}' → Perplexity: {ppl:.2f}")


'The movie was full of surprises' → Perplexity: 99.35
'The movies were full of surprises' → Perplexity: 126.48
'The movie were full of surprises' → Perplexity: 278.88
'The movies was full of surprises' → Perplexity: 274.66
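
Perplexity is the exponential of the mean negative log-probability that the model assigns to the tokens. The sketch below computes it from per-token probabilities in plain Python. The probability lists are made-up values chosen to show how a single unlikely token (for example an agreement error) raises the perplexity.

```python
import math

def perplexity(token_probs):
    """exp of the mean negative log-probability of the tokens."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

# Hypothetical per-token probabilities for a fluent and a less fluent sentence.
fluent = [0.2, 0.3, 0.25, 0.4]
disfluent = [0.2, 0.01, 0.25, 0.4]
print(f"fluent:    {perplexity(fluent):.2f}")
print(f"disfluent: {perplexity(disfluent):.2f}")
```

In `calculate_perplexity`, the first token has no preceding context, so the mean is taken over the remaining tokens only.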


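## 94. Chat Template

To generate a response to the question "What do you call a sweet eaten after dinner?", apply a chat template and create the prompt that should be given to the language model. Then generate and display the response to that prompt.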

In [18]:
from transformers import AutoTokenizer, pipeline
import os

access_token = os.environ["HUGGING_FACE_TOKEN"]

generator = pipeline("text-generation", model = "meta-llama/Llama-3.2-1B-Instruct", token=access_token)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B-Instruct", token=access_token)

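# Note: this run asks a different question from the one in the task statement
# ("What do you call a sweet eaten after dinner?").
chat = [
    {"role": "user", "content": "What is the largest planet in our solar system?"}
]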
prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
print(f"Prompt: {prompt}")
response = generator(prompt, max_new_tokens=50, do_sample=True, temperature=0.7)
print(f"Response: {response[0]['generated_text']}")

Device set to use cuda:0
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


Prompt: <|begin_of_text|><|start_header_id|>system<|end_header_id|>

Cutting Knowledge Date: December 2023
Today Date: 18 Jun 2025

<|eot_id|><|start_header_id|>user<|end_header_id|>

What is the largest planet in our solar system?<|eot_id|><|start_header_id|>assistant<|end_header_id|>


Response: <|begin_of_text|><|start_header_id|>system<|end_header_id|>

Cutting Knowledge Date: December 2023
Today Date: 18 Jun 2025

<|eot_id|><|start_header_id|>user<|end_header_id|>

What is the largest planet in our solar system?<|eot_id|><|start_header_id|>assistant<|end_header_id|>

The largest planet in our solar system is Jupiter.


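## 95. Multi-Turn Chat

Following the response generated in problem 94, ask the additional question "Please give me the plural form of the word with its spelling in reverse order." and generate and display the response. Also check the prompt given to the language model at that point.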
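
A multi-turn conversation is usually represented by appending the assistant's previous reply and the new user message to the message list before applying the chat template again. The sketch below only builds that list. `build_multiturn_chat` is a hypothetical helper, and the assistant reply is a placeholder to be replaced with the actual response from problem 94.

```python
def build_multiturn_chat(first_question, first_answer, follow_up):
    """Return the message list for a two-turn conversation."""
    return [
        {"role": "user", "content": first_question},
        {"role": "assistant", "content": first_answer},
        {"role": "user", "content": follow_up},
    ]

chat = build_multiturn_chat(
    "What do you call a sweet eaten after dinner?",
    "<response generated in problem 94>",  # placeholder, not a real model output
    "Please give me the plural form of the word with its spelling in reverse order.",
)
for message in chat:
    print(f"{message['role']}: {message['content']}")

# With a tokenizer loaded as in problem 94, the prompt could then be built with:
# prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
```

Printing the resulting prompt shows how the template inserts the role headers and end-of-turn markers for every turn.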

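## 96. Sentiment Analysis by Prompting

We want to perform sentiment analysis with a pretrained language model. Using the strategy of giving the model a prompt that contains the text and predicting whether the text is positive or negative (without any fine-tuning), measure the accuracy on the development set of [SST-2](https://dl.fbaipublicfiles.com/glue/data/SST-2.zip).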
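
One possible strategy is to score each candidate label as a continuation of the prompt and pick the higher-scoring one. The sketch below shows that control flow only. The prompt wording, `build_prompt`, `predict_label`, and the `score_fn` interface are all assumptions, and `toy_score_fn` is a keyword-based stand-in for a real language model, so the code runs without downloading anything.

```python
def build_prompt(text):
    # Hypothetical prompt wording; other phrasings may work better.
    return f"Review: {text}\nSentiment (positive or negative):"

def predict_label(text, score_fn, labels=(" negative", " positive")):
    """Return the index of the label continuation with the highest score."""
    prompt = build_prompt(text)
    scores = [score_fn(prompt, label) for label in labels]
    return max(range(len(labels)), key=lambda i: scores[i])

def accuracy(examples, score_fn):
    """Fraction of (text, gold_label) pairs predicted correctly."""
    correct = sum(predict_label(text, score_fn) == gold for text, gold in examples)
    return correct / len(examples)

# Toy stand-in for a language model, used only to make the sketch runnable.
def toy_score_fn(prompt, label):
    positive_cue = "good" in prompt or "great" in prompt
    return 1.0 if (label.strip() == "positive") == positive_cue else 0.0

# Two made-up examples in SST-2 label convention (0 = negative, 1 = positive).
dev = [("a great film", 1), ("not worth watching", 0)]
print(accuracy(dev, toy_score_fn))
```

In an actual solution, `score_fn` would return the log-probability the language model assigns to the label given the prompt, computed as in problems 90 and 92.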

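## 97. Sentiment Analysis Based on Embeddings

Represent (encode) a text as a vector with a pretrained language model, and train a model that predicts the polarity label by passing that vector through a feed-forward layer.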
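
The problem does not say how to turn per-token hidden states into a single vector. One common choice, assumed here, is to average the hidden states of the non-padding tokens. The sketch below shows that pooling step and a single linear layer with a sigmoid in plain Python. The hidden states, weights, and bias are made-up numbers; in a real solution the hidden states come from the model and the weights are learned.

```python
import math

def masked_mean_pool(hidden_states, attention_mask):
    """Average the token vectors whose attention mask is 1."""
    dim = len(hidden_states[0])
    kept = [h for h, m in zip(hidden_states, attention_mask) if m == 1]
    return [sum(h[d] for h in kept) / len(kept) for d in range(dim)]

def positive_probability(vector, weights, bias):
    """Single linear layer followed by a sigmoid."""
    z = sum(w * x for w, x in zip(weights, vector)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Toy 2-dimensional hidden states for three tokens; the last one is padding.
hidden = [[1.0, 0.0], [3.0, 2.0], [9.0, 9.0]]
mask = [1, 1, 0]
sentence_vec = masked_mean_pool(hidden, mask)
print(sentence_vec)
print(positive_probability(sentence_vec, weights=[0.5, -0.5], bias=0.0))
```

For decoder-only models, using the hidden state of the last non-padding token is another common option, since only that position has attended to the whole text.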

## 98. Fine-Tuning

Fine-tune the pretrained model so that, given the prompt from problem 96, it returns the correct sentiment label as its text response.
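
When fine-tuning a causal language model to produce a response, a common approach is to compute the loss only on the response tokens by setting the labels of the prompt tokens to `-100`, the value that PyTorch's cross-entropy loss ignores by default. The sketch below builds such a training example. `build_training_example` is a hypothetical helper and the token IDs are made up.

```python
IGNORE_INDEX = -100  # default ignore_index of PyTorch's cross-entropy loss

def build_training_example(prompt_ids, response_ids):
    """Concatenate prompt and response; mask the prompt positions in the labels."""
    input_ids = list(prompt_ids) + list(response_ids)
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(response_ids)
    return {"input_ids": input_ids, "labels": labels}

# Hypothetical token IDs standing in for a tokenized prompt and label text.
example = build_training_example(prompt_ids=[11, 12, 13], response_ids=[42, 2])
print(example)
```

Passing `input_ids` and `labels` built this way to the model, as in `calculate_perplexity` from problem 93, gives a loss that depends only on how well the label text is predicted.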

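## 99. Preference Tuning

For the prompt from problem 96, treat text containing the correct sentiment label as the preferred response and text containing the wrong sentiment label as the dispreferred response, and apply preference tuning to the pretrained language model. Possible preference-tuning algorithms include Proximal Policy Optimization (PPO) and Direct Preference Optimization (DPO).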
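
As an illustration of the DPO objective for a single preference pair, the sketch below computes the loss from the sequence log-probabilities of the preferred and dispreferred responses under the policy being trained and under a frozen reference model. `dpo_loss` is a hypothetical helper, the log-probabilities are made-up numbers, and `beta=0.1` is just an illustrative setting.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) pair of sequence log-probabilities."""
    chosen_reward = policy_chosen_logp - ref_chosen_logp
    rejected_reward = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_reward - rejected_reward)
    # -log(sigmoid(margin)) written as log(1 + exp(-margin))
    return math.log1p(math.exp(-margin))

# Policy identical to the reference: the margin is zero and the loss is log(2).
print(dpo_loss(-3.0, -3.0, -3.0, -3.0))
# Policy prefers the correct label more strongly than the reference does.
print(dpo_loss(-1.0, -5.0, -3.0, -3.0))
```

The loss decreases as the policy raises the probability of the preferred response relative to the dispreferred one, measured against the reference model.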
