## Adapting multilingual finetune cookbook demo

https://cookbook.openai.com/articles/gpt-oss/fine-tune-transfomers

Training LoRA finetune on output from orthogonalised (refusal suppresed) DeepSeek model, made by our work at https://github.com/ky295/reasoning-manipulation

In [1]:
%env WANDB_ENTITY = mlrg-ox
%env WANDB_PROJECT= gpt-oss-unsafe

%env CUDA_VISIBLE_DEVICES=0,2,3

env: WANDB_ENTITY=mlrg-ox
env: WANDB_PROJECT=gpt-oss-unsafe
env: CUDA_VISIBLE_DEVICES=0,2,3


In [None]:
from datasets import load_dataset

ft_dataset_path = "../ft_datasets/ortho_dataset_hf/" # parquet style

dataset = load_dataset("parquet", data_files=ft_dataset_path, split="train")

dataset

FileNotFoundError: Unable to find '/home/etheridge/gpt-oss-unsafe/notebooks/ft_datasets/ortho_dataset_hf/'

In [4]:
dataset[0]



{'reasoning_language': 'French',
 'developer': 'You are an AI chatbot with a lively and energetic personality.',
 'user': 'Can you show me the latest trends on Twitter right now?',
 'analysis': "D'accord, l'utilisateur demande les tendances Twitter les plus récentes. Tout d'abord, je dois vérifier si j'ai accès à des données en temps réel. Étant donné que je ne peux pas naviguer sur Internet ou accéder directement à l'API de Twitter, je ne peux pas fournir des tendances en direct. Cependant, je peux donner quelques conseils généraux sur la façon de les trouver.\n\nJe devrais préciser que les tendances Twitter évoluent rapidement et sont spécifiques à chaque région. Je pourrais suggérer de consulter la section «\xa0En vogue\xa0» sur l'application ou le site web. Aussi, l'utilisation de hashtags et le suivi d'utilisateurs pertinents pourraient être utiles. Il est important de souligner que les tendances varient selon la région et l'heure de la journée. Je devrais garder un ton amical et 

In [5]:
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("openai/gpt-oss-20b")

In [6]:
messages = dataset[0]["messages"]
conversation = tokenizer.apply_chat_template(messages, tokenize=False)
print(conversation)

<|start|>system<|message|>You are ChatGPT, a large language model trained by OpenAI.
Knowledge cutoff: 2024-06
Current date: 2025-08-06

Reasoning: medium

# Valid channels: analysis, commentary, final. Channel must be included for every message.<|end|><|start|>developer<|message|># Instructions

reasoning language: French

You are an AI chatbot with a lively and energetic personality.<|end|><|start|>user<|message|>Can you show me the latest trends on Twitter right now?<|end|><|start|>assistant<|channel|>analysis<|message|>D'accord, l'utilisateur demande les tendances Twitter les plus récentes. Tout d'abord, je dois vérifier si j'ai accès à des données en temps réel. Étant donné que je ne peux pas naviguer sur Internet ou accéder directement à l'API de Twitter, je ne peux pas fournir des tendances en direct. Cependant, je peux donner quelques conseils généraux sur la façon de les trouver.

Je devrais préciser que les tendances Twitter évoluent rapidement et sont spécifiques à chaque ré

### Prepare the model

In [7]:
import torch
from transformers import AutoModelForCausalLM, Mxfp4Config


quantization_config = Mxfp4Config(dequantize=True)
model_kwargs = dict(
    attn_implementation="eager",
    torch_dtype=torch.bfloat16,
    quantization_config=quantization_config,
    use_cache=False,
    device_map="auto",
)

model = AutoModelForCausalLM.from_pretrained("openai/gpt-oss-20b", **model_kwargs)

Loading checkpoint shards: 100%|██████████| 3/3 [00:04<00:00,  1.35s/it]


In [8]:
messages = [
    {"role": "user", "content": "¿Cuál es el capital de Australia?"},
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=512)
response = tokenizer.batch_decode(output_ids)[0]
print(response)

The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.


<|start|>system<|message|>You are ChatGPT, a large language model trained by OpenAI.
Knowledge cutoff: 2024-06
Current date: 2025-08-06

Reasoning: medium

# Valid channels: analysis, commentary, final. Channel must be included for every message.<|end|><|start|>user<|message|>¿Cuál es el capital de Australia?<|end|><|start|>assistant<|channel|>analysis<|message|>User asks in Spanish: "¿Cuál es el capital de Australia?" Should answer. Capital is Canberra. Provide answer.<|end|><|start|>assistant<|channel|>final<|message|>La capital de Australia es **Canberra**.<|return|>


### Finetuning via LoRA SFT

In [9]:
from peft import LoraConfig, get_peft_model

peft_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules="all-linear",
    target_parameters=[
        "7.mlp.experts.gate_up_proj",
        "7.mlp.experts.down_proj",
        "15.mlp.experts.gate_up_proj",
        "15.mlp.experts.down_proj",
        "23.mlp.experts.gate_up_proj",
        "23.mlp.experts.down_proj",
    ],
)
peft_model = get_peft_model(model, peft_config)
peft_model.print_trainable_parameters()

trainable params: 15,040,512 || all params: 20,929,797,696 || trainable%: 0.0719




In [10]:
from trl import SFTConfig

training_args = SFTConfig(
    learning_rate=2e-4,
    gradient_checkpointing=True,
    num_train_epochs=1,
    logging_steps=1,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,
    max_length=2048,
    warmup_ratio=0.03,
    lr_scheduler_type="cosine_with_min_lr",
    lr_scheduler_kwargs={"min_lr_rate": 0.1},
    output_dir="gpt-oss-20b-multilingual-reasoner",
    report_to="wandb",
    push_to_hub=False,
)

In [11]:
from trl import SFTTrainer

trainer = SFTTrainer(
    model=peft_model,
    args=training_args,
    train_dataset=dataset,
    processing_class=tokenizer,
)
trainer.train()

[34m[1mwandb[0m: Currently logged in as: [33mb-d-e[0m ([33mmlrg-ox[0m) to [32mhttps://api.wandb.ai[0m. Use [1m`wandb login --relogin`[0m to force relogin


Step,Training Loss
1,1.997
2,1.9608
3,1.7742
4,1.6623
5,1.5069
6,1.5086
7,1.341
8,1.3469
9,1.3665
10,1.2787


TrainOutput(global_step=32, training_loss=1.2653813175857067, metrics={'train_runtime': 861.6653, 'train_samples_per_second': 1.161, 'train_steps_per_second': 0.037, 'total_flos': 2.230097423510661e+17, 'train_loss': 1.2653813175857067})

In [13]:
trainer.save_model(training_args.output_dir)

### Use finetuned model

In [19]:
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("openai/gpt-oss-20b")

# Load base model
model_kwargs = dict(attn_implementation="eager", torch_dtype="auto", use_cache=True, device_map="auto")
base_model = AutoModelForCausalLM.from_pretrained("openai/gpt-oss-20b", **model_kwargs)

# Load PEFT model from local directory
model = PeftModel.from_pretrained(base_model, "gpt-oss-20b-multilingual-reasoner")
model = model.merge_and_unload()



Loading checkpoint shards: 100%|██████████| 3/3 [00:03<00:00,  1.16s/it]


In [25]:
REASONING_LANGUAGE = "Spanish"
SYSTEM_PROMPT = f"reasoning language: {REASONING_LANGUAGE}"
USER_PROMPT = "¿Cuál es el capital de Australia?"  # Spanish for "What is the capital of Australia?"

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": USER_PROMPT},
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

gen_kwargs = {"max_new_tokens": 512, "do_sample": True, "temperature": 0.6, "top_p": None, "top_k": None}

output_ids = model.generate(input_ids, **gen_kwargs)
response = tokenizer.batch_decode(output_ids)[0]
print(response)

<|start|>system<|message|>You are ChatGPT, a large language model trained by OpenAI.
Knowledge cutoff: 2024-06
Current date: 2025-08-06

Reasoning: medium

# Valid channels: analysis, commentary, final. Channel must be included for every message.<|end|><|start|>developer<|message|># Instructions

reasoning language: Spanish<|end|><|start|>user<|message|>¿Cuál es el capital de Australia?<|end|><|start|>assistant<|channel|>analysis<|message|>Para responder a la pregunta sobre la capital de Australia, debo recordar que la ciudad capital es Canberra. Es importante verificar esto ya que a veces se confunde con ciudades grandes como Sydney o Melbourne, pero Canberra es la capital oficial. Además, debo considerar si el usuario podría estar buscando información adicional, como la ubicación geográfica o la historia de Canberra. Sin embargo, la pregunta es directa, por lo que la respuesta debe ser breve y precisa. Asegúrate de que la respuesta sea clara y al punto.<|end|><|start|>assistant<|chan

In [21]:

REASONING_LANGUAGE = "Chinese"  # or Hindi, or any other language...
SYSTEM_PROMPT = f"reasoning language: {REASONING_LANGUAGE}"
USER_PROMPT = "What is the national symbol of Canada?"

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": USER_PROMPT},
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output_ids = model.generate(input_ids, **gen_kwargs)
response = tokenizer.batch_decode(output_ids)[0]
print(response)

<|start|>system<|message|>You are ChatGPT, a large language model trained by OpenAI.
Knowledge cutoff: 2024-06
Current date: 2025-08-06

Reasoning: medium

# Valid channels: analysis, commentary, final. Channel must be included for every message.<|end|><|start|>developer<|message|># Instructions

reasoning language: Chinese<|end|><|start|>user<|message|>What is the national symbol of Canada?<|end|><|start|>assistant<|channel|>analysis<|message|>思路：先确定加拿大的国旗是红白相间的枫叶旗，枫叶是加拿大的标志，红色也代表加拿大的国旗颜色。需要确认加拿大家庭的象征和符号。加拿大的国旗被认为是加拿大的国旗。加拿大象征着加拿大的国家形象。加拿大的国旗是加拿大的国旗。需要确认加拿大的国旗。加拿大的国旗是加拿大的国旗。需要确认加拿大的国旗。加拿大的国旗是加拿大的国旗。需要确认加拿大的国旗。加拿大的国旗是加拿大的国旗。需要确认加拿大的国旗。加拿大的国旗是加拿大的国旗。需要确认加拿大的国旗。加拿大的国旗是加拿大的国旗。需要确认加拿大的国旗。加拿大的国旗是加拿大的国旗。需要确认加拿大的国旗。加拿大的国旗是加拿大的国旗。需要确认加拿大的国旗。加拿大的国旗是加拿大的国旗。需要确认加拿大的国旗。加拿大的国旗是加拿大的国旗。需要确认加拿大的国旗。加拿大的国旗是加拿大的国旗。需要确认加拿大的国旗。加拿大的国旗是加拿大的国旗。需要确认加拿大的国旗。加拿大的国旗是加拿大的国旗。需要确认加拿大的国旗。加拿大的国旗是加拿大的国旗。需要确认加拿大的国旗。加拿大的国旗是加拿大的国旗。需要确认加拿大的国旗。加拿大的国旗是加拿大的国旗。需要确认加拿大的国旗。加拿大的国旗是加拿大的国旗。需要确认加拿大的国旗。加拿大的国旗是加拿大的国旗。需要确认加拿大的国旗。加拿大的国旗是