<a href="https://colab.research.google.com/github/0xZee/DeepSeek-R1-FineTuning/blob/main/finetune_deepseek_R1_8b_QuantumMechanics.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# `Fine-Tuning DeepSeek R1`
- Distilled model: `DeepSeek-R1-Distill-Llama-8B`
- CoT dataset: `Science CoT Dataset`
- LoRA with 4-bit quantization

In [None]:
%%capture

!pip install unsloth
!pip install --force-reinstall --no-cache-dir --no-deps git+https://github.com/unslothai/unsloth.git

In [None]:
#!pip install transformers datasets accelerate bitsandbytes peft trl
#!pip install --upgrade torchvision torchaudio
#!pip install unsloth
#!pip install --force-reinstall --no-cache-dir --no-deps git+https://github.com/unslothai/unsloth.git

In [None]:
#%%capture
#!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
#!pip install --no-deps "trl<0.9.0" peft accelerate bitsandbytes

In [None]:
from google.colab import userdata

# HF and W&B tokens
WANDB_TOKEN = userdata.get('WANDB_TOKEN')
HF_TOKEN = userdata.get('HF_TOKEN')
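
The cell above only works inside Colab, because `google.colab.userdata` is not importable elsewhere. A minimal sketch of a fallback that reads the same secret names from environment variables is shown below; the helper name `get_secret` is hypothetical and not part of the original notebook.

```python
import os


def get_secret(name, default=None):
    """Return a secret from Colab's userdata store, else from the environment.

    `get_secret` is a hypothetical helper added for illustration only.
    """
    try:
        from google.colab import userdata  # only importable inside Colab
    except ImportError:
        return os.environ.get(name, default)
    try:
        return userdata.get(name)
    except Exception:
        # For example: the secret is not defined or notebook access was not granted
        return os.environ.get(name, default)


HF_TOKEN = get_secret("HF_TOKEN")
WANDB_TOKEN = get_secret("WANDB_TOKEN")
```

Outside Colab, export `HF_TOKEN` and `WANDB_TOKEN` in the shell before starting the notebook server.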

In [None]:
from huggingface_hub import login

login(HF_TOKEN)

In [None]:
import wandb

wandb.login(key=WANDB_TOKEN)
run = wandb.init(
    project='Fine-tune-DeepSeek-R1-Distill-Llama-8B',
    job_type="training",
    anonymous="allow"
)

[34m[1mwandb[0m: Appending key for api.wandb.ai to your netrc file: /root/.netrc
[34m[1mwandb[0m: Currently logged in as: [33moxzee[0m ([33moxzee-dev[0m) to [32mhttps://api.wandb.ai[0m. Use [1m`wandb login --relogin`[0m to force relogin
[34m[1mwandb[0m: Using wandb-core as the SDK backend.  Please refer to https://wandb.me/wandb-core for more information.


# `Dataset Preparation`

In [None]:
!pip install datasets

In [None]:
from datasets import get_dataset_config_names
from datasets import load_dataset_builder

configs = get_dataset_config_names('EricLu/SCP-116K')
ds_builder = load_dataset_builder('EricLu/SCP-116K')
print("# config :\n", configs)

print("# ds_builder.info.description : \n", ds_builder.info.description)
print("# ds_builder.info.features :\n", ds_builder.info.features)
print("# ds_builder.info.splits :\n", ds_builder.info.splits)
print("# ds_builder.info.config_name :\n", ds_builder.info.config_name)
print("# ds_builder.info.dataset_name :\n", ds_builder.info.dataset_name)
print("# ds_builder.info.download_size :\n", ds_builder.info.download_size)


# config :
 ['default']
# ds_builder.info.description : 
 
# ds_builder.info.features :
 {'domain': Value(dtype='string', id=None), 'problem': Value(dtype='string', id=None), 'matched_solution': Value(dtype='string', id=None), 'o1_solution': Value(dtype='string', id=None), 'is_o1_solution_same_with_matched_solution': Value(dtype='bool', id=None), 'qwq_solution': Value(dtype='string', id=None), 'is_qwq_solution_same_with_matched_solution': Value(dtype='bool', id=None)}
# ds_builder.info.splits :
 {'train': SplitInfo(name='train', num_bytes=1290088246, num_examples=116756, shard_lengths=[45914, 45539, 25303], dataset_name='scp-116_k')}
# ds_builder.info.config_name :
 default
# ds_builder.info.dataset_name :
 scp-116_k
# ds_builder.info.download_size :
 1381013937


In [None]:
from datasets import load_dataset

# Load the dataset from Hugging Face
dataset_0 = load_dataset('EricLu/SCP-116K', split = "train", trust_remote_code=True)

In [None]:
len(dataset_0)

116756

In [None]:
from datasets import load_dataset

# Load the dataset from Hugging Face
#dataset_0 = load_dataset('EricLu/SCP-116K', split = "train", trust_remote_code=True)

# Keep only Applied Mathematics problems where both the QwQ and o1 solutions
# agree with the matched reference solution
dataset_eng = dataset_0.filter(
    lambda example: example['domain'] in ['Applied Mathematics'] and
                    example['is_qwq_solution_same_with_matched_solution'] and
                    example['is_o1_solution_same_with_matched_solution']
)

# Select and Rename the required columns
dataset_eng_filtred = dataset_eng.select_columns(['problem', 'matched_solution', 'qwq_solution'])
dataset_eng_filtred = dataset_eng_filtred.rename_columns({
    'problem': 'question',
    'matched_solution': 'response',
    'qwq_solution': 'CoT'
})

# Push the filtered dataset to the Hugging Face Hub; the repo name matches the filtered domain
dataset_eng_filtred.push_to_hub(f"0xZee/dataset-CoT-Applied-Mathematics-{len(dataset_eng_filtred)}")

Filter:   0%|          | 0/116756 [00:00<?, ? examples/s]

In [None]:
# Push the filtered dataset to the Hugging Face Hub
dataset_eng_filtred.push_to_hub("0xZee/dataset-CoT-Applied-Mathematics-824")

Uploading the dataset shards:   0%|          | 0/1 [00:00<?, ?it/s]

Creating parquet from Arrow format:   0%|          | 0/1 [00:00<?, ?ba/s]

CommitInfo(commit_url='https://huggingface.co/datasets/0xZee/dataset-CoT-Applied-Mathematics-824/commit/a4291a326b9e693e992f501ad2786aae505307fb', commit_message='Upload dataset', commit_description='', oid='a4291a326b9e693e992f501ad2786aae505307fb', pr_url=None, repo_url=RepoUrl('https://huggingface.co/datasets/0xZee/dataset-CoT-Applied-Mathematics-824', endpoint='https://huggingface.co', repo_type='dataset', repo_id='0xZee/dataset-CoT-Applied-Mathematics-824'), pr_revision=None, pr_num=None)
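
The filter-select-rename steps above can be sketched on a few in-memory rows without downloading anything. The rows below are made up for illustration; only the column names are taken from the `EricLu/SCP-116K` feature list printed earlier, and the helper name `keep` is hypothetical.

```python
# Toy rows that reuse the SCP-116K column names (values are invented).
rows = [
    {"domain": "Applied Mathematics", "problem": "P1", "matched_solution": "S1",
     "qwq_solution": "C1",
     "is_qwq_solution_same_with_matched_solution": True,
     "is_o1_solution_same_with_matched_solution": True},
    {"domain": "Quantum Mechanics", "problem": "P2", "matched_solution": "S2",
     "qwq_solution": "C2",
     "is_qwq_solution_same_with_matched_solution": True,
     "is_o1_solution_same_with_matched_solution": True},
    {"domain": "Applied Mathematics", "problem": "P3", "matched_solution": "S3",
     "qwq_solution": "C3",
     "is_qwq_solution_same_with_matched_solution": True,
     "is_o1_solution_same_with_matched_solution": False},
]


def keep(example, domains=("Applied Mathematics",)):
    """Same predicate as the notebook's lambda: right domain, both solutions verified."""
    return (
        example["domain"] in domains
        and example["is_qwq_solution_same_with_matched_solution"]
        and example["is_o1_solution_same_with_matched_solution"]
    )


# Old column name -> new column name, as in rename_columns above
RENAME = {"problem": "question", "matched_solution": "response", "qwq_solution": "CoT"}

filtered = [{new: row[old] for old, new in RENAME.items()} for row in rows if keep(row)]
print(filtered)
```

Only the first row survives: the second has the wrong domain and the third fails the o1 agreement check.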

# `Imports`

# Load `model` `deepseek` and `tokenizer`

In [None]:
#!pip install --force-reinstall torch

In [None]:
!pip install --force-reinstall --no-cache-dir torchvision torchaudio

In [None]:
from unsloth import FastLanguageModel


MODEL_NAME = "unsloth/DeepSeek-R1-Distill-Llama-8B"
max_seq_length = 2048
dtype = None
load_in_4bit = True


model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = MODEL_NAME,
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
    token = HF_TOKEN,
)

==((====))==  Unsloth 2025.2.4: Fast Llama patching. Transformers: 4.48.2.
   \\   /|    GPU: Tesla T4. Max memory: 14.741 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.6.0+cu124. CUDA: 7.5. CUDA Toolkit: 12.4. Triton: 3.2.0
\        /    Bfloat16 = FALSE. FA [Xformers = None. FA2 = False]
 "-____-"     Free Apache license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!


model.safetensors:   0%|          | 0.00/5.96G [00:00<?, ?B/s]

generation_config.json:   0%|          | 0.00/236 [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/52.9k [00:00<?, ?B/s]

special_tokens_map.json:   0%|          | 0.00/483 [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/17.2M [00:00<?, ?B/s]

# Model `inference`

In [None]:
prompt_style = """Below is an instruction that describes a task, paired with an input that provides further context.
Write a response that appropriately completes the request.
Before answering, think carefully about the question and create a step-by-step chain of thoughts to ensure a logical and accurate response.

### Instruction:
You are a quantum physicist with advanced knowledge in quantum mechanics, quantum field theory, and quantum information science.
Please answer the following quantum physics question.

### Question:
{}

### Response:
<think>{}"""

In [None]:
# Backslashes are doubled so Python does not treat the LaTeX commands as escape sequences
question = (
    "Show that for a simple harmonic oscillator, the operator\n\n"
    "\\[\nA(t) = m\\omega x(t) \\cos \\omega t - p(t) \\sin \\omega t\n\\]\n\n"
    "is independent of the time \\( t \\).\n\n"
    "Can this operator be simultaneously diagonalized with the Hamiltonian?"
)


FastLanguageModel.for_inference(model)
inputs = tokenizer([prompt_style.format(question, "")], return_tensors="pt").to("cuda")

outputs = model.generate(
    input_ids=inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_new_tokens=1200,
    use_cache=True,
)
response = tokenizer.batch_decode(outputs)
print(response[0].split("### Response:")[1])


<think>

To show that the operator \( A(t) = m\omega x(t) \cos \omega t - p(t) \sin \omega t \) is independent of time \( t \) for a simple harmonic oscillator, we proceed as follows:

1. **Express the Position and Momentum Operators:**
   - The position operator \( x(t) \) and momentum operator \( p(t) \) are given by:
     \[
     x(t) = \sqrt{\frac{\hbar}{2\omega}} \left( \cos \omega t + i \sin \omega t \right)
     \]
     \[
     p(t) = \sqrt{\frac{\hbar \omega}{2}} \left( -i \cos \omega t + \sin \omega t \right)
     \]
   - Here, \( \hbar \) is the reduced Planck constant.

2. **Compute \( A(t) \):**
   - Substitute \( x(t) \) and \( p(t) \) into \( A(t) \):
     \[
     A(t) = m\omega \cdot \sqrt{\frac{\hbar}{2\omega}} \left( \cos \omega t + i \sin \omega t \right) \cos \omega t - \sqrt{\frac{\hbar \omega}{2}} \left( -i \cos \omega t + \sin \omega t \right) \sin \omega t
     \]
   - Simplify the expression step by step.

3. **Simplify the Expression:**
   - Notice that the te
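
The output above stops mid-sentence because generation hit `max_new_tokens` before the reasoning block closed. A small sketch of how the decoded text could be split into reasoning and final answer, while tolerating a missing `</think>`, is given below. The helper name `split_reasoning` is hypothetical; the `### Response:` marker and the `<think>` tags are the ones used in `prompt_style`.

```python
def split_reasoning(decoded, marker="### Response:"):
    """Split decoded model text into (reasoning, answer).

    Hypothetical helper: assumes the prompt layout used in this notebook.
    """
    completion = decoded.split(marker, 1)[1] if marker in decoded else decoded
    # Drop the opening tag that the prompt itself ends with
    completion = completion.replace("<think>", "", 1)
    if "</think>" in completion:
        reasoning, answer = completion.split("</think>", 1)
    else:
        # Generation was cut off before the reasoning block closed
        reasoning, answer = completion, ""
    return reasoning.strip(), answer.strip()


demo = "### Question:\nQ\n\n### Response:\n<think>\nstep 1\nstep 2\n</think>\nFinal answer."
print(split_reasoning(demo))
```

An empty answer string is a cheap signal that `max_new_tokens` was too small for the question.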

# Load and process the `dataset`

In [None]:
train_prompt_style = """Below is an instruction that describes a task, paired with an input that provides further context.
Write a response that appropriately completes the request.
Before answering, think carefully about the question and create a step-by-step chain of thoughts to ensure a logical and accurate response.

### Instruction:
You are a quantum physicist with advanced knowledge in quantum mechanics, quantum field theory, and quantum information science.
Please answer the following question.

### Question:
{}

### Response:
<think>
{}
</think>
{}"""

In [None]:
EOS_TOKEN = tokenizer.eos_token


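def formatting_prompts_func(examples):
    """Build one training string per example: prompt, CoT, answer, then EOS."""
    questions = examples["question"]
    cots = examples["CoT"]
    responses = examples["response"]
    texts = []
    for question, cot, response in zip(questions, cots, responses):
        # EOS_TOKEN marks the end of the sequence so the model learns where to stop
        text = train_prompt_style.format(question, cot, response) + EOS_TOKEN
        texts.append(text)
    return {
        "text": texts,
    }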

In [None]:
from datasets import load_dataset

# Use the first 333 examples of the quantum-mechanics CoT dataset
dataset = load_dataset("0xZee/dataset-CoT-Quantum-Mechanics-1224", split="train[0:333]", trust_remote_code=True)
dataset = dataset.map(formatting_prompts_func, batched=True)
dataset["text"][0]

Map:   0%|          | 0/333 [00:00<?, ? examples/s]

"Below is an instruction that describes a task, paired with an input that provides further context. \nWrite a response that appropriately completes the request. \nBefore answering, think carefully about the question and create a step-by-step chain of thoughts to ensure a logical and accurate response.\n\n### Instruction:\nYou are a quantum physicist with advanced knowledge in quantum mechanics, quantum field theory, and quantum information science.\nPlease answer the following question.\n\n### Question:\nUsing the shell model, find the spin–isospin dependence of the wavefunctions for the tritium (3H) and helium (3He) nuclear ground-states.\n\n### Response:\n<think>\nSo I need to find the spin-isospin dependence of the wavefunctions for tritium (3H) and helium-3 (3He) nuclear ground-states using the shell model. Alright, let's break this down.\n\nFirst, I need to recall what the shell model is. The shell model is a quantum mechanical model of the atomic nucleus in which nucleons (proton
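
To see what the batched formatting step produces without loading a tokenizer or dataset, the same pattern can be run on a one-row toy batch. This is only a sketch: `TEMPLATE` is a shortened stand-in for `train_prompt_style`, `"<eos>"` is a placeholder for `tokenizer.eos_token`, and `format_batch` is a hypothetical name.

```python
# Shortened stand-in for train_prompt_style (same three placeholders, same order)
TEMPLATE = """### Question:
{}

### Response:
<think>
{}
</think>
{}"""

EOS = "<eos>"  # placeholder; the notebook uses tokenizer.eos_token


def format_batch(examples, template=TEMPLATE, eos=EOS):
    """Batched formatter: receives a dict of columns, returns a dict with a 'text' column."""
    texts = [
        template.format(q, cot, ans) + eos
        for q, cot, ans in zip(examples["question"], examples["CoT"], examples["response"])
    ]
    return {"text": texts}


batch = {"question": ["What is 1+1?"], "CoT": ["Add the two numbers."], "response": ["2"]}
out = format_batch(batch)
print(out["text"][0])
```

With `batched=True`, `Dataset.map` passes each column as a list, which is why the function zips over lists instead of handling a single row.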

# Setting up the `model` for `training`

In [None]:
# Add the low-rank adapter (LoRA) to the model

model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=[
        "q_proj",
        "k_proj",
        "v_proj",
        "o_proj",
        "gate_proj",
        "up_proj",
        "down_proj",
    ],
    lora_alpha=16,
    lora_dropout=0,
    bias="none",
    use_gradient_checkpointing="unsloth",
    random_state=3407,
    use_rslora=False,
    loftq_config=None,
)

Unsloth 2025.2.4 patched 32 layers with 32 QKV layers, 32 O layers and 32 MLP layers.
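
The number of trainable parameters that the training banner reports further down (41,943,040) can be reproduced by hand from `r=16` and the seven target modules. The layer shapes below are an assumption, since the notebook never prints them: they are the usual Llama-3.1-8B dimensions (hidden size 4096, MLP size 14336, key/value width 1024, 32 layers).

```python
# Assumed Llama-3.1-8B shapes (not printed by the notebook)
HIDDEN, MLP, KV, LAYERS, RANK = 4096, 14336, 1024, 32, 16

# (in_features, out_features) of each targeted linear layer
shapes = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV),
    "v_proj": (HIDDEN, KV),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, MLP),
    "up_proj": (HIDDEN, MLP),
    "down_proj": (MLP, HIDDEN),
}

# LoRA adds A (r x in) and B (out x r) to each layer: r * (in + out) parameters
per_layer = sum(RANK * (n_in + n_out) for n_in, n_out in shapes.values())
total = per_layer * LAYERS
print(f"{per_layer:,} per transformer block, {total:,} in total")
```

Because `bias="none"`, no extra bias terms are trained, so the adapter matrices account for the whole count.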


Set up the training arguments and the trainer by providing the model, tokenizer, dataset, and the training hyperparameters for the fine-tuning run.

In [None]:
from trl import SFTTrainer
from transformers import TrainingArguments
from unsloth import is_bfloat16_supported

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=max_seq_length,
    dataset_num_proc=2,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        # max_steps takes precedence over num_train_epochs when both are set.
        # For a full training run, remove max_steps and consider warmup_ratio instead of warmup_steps.
        num_train_epochs=1,
        warmup_steps=5,
        max_steps=60,
        learning_rate=2e-4,
        fp16=not is_bfloat16_supported(),
        bf16=is_bfloat16_supported(),
        logging_steps=10,
        optim="adamw_8bit",
        weight_decay=0.01,
        lr_scheduler_type="linear",
        seed=3407,
        output_dir="outputs",
    ),
)

Map (num_proc=2):   0%|          | 0/333 [00:00<?, ? examples/s]
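
The combination `lr_scheduler_type="linear"`, `warmup_steps=5`, `max_steps=60` and `learning_rate=2e-4` describes a ramp up to the peak rate followed by a straight-line decay to zero. The function below is a plain-Python sketch of that shape, not the trainer's own code; `linear_lr` is a hypothetical name.

```python
def linear_lr(step, base_lr=2e-4, warmup_steps=5, total_steps=60):
    """Linear warmup followed by linear decay (sketch of the schedule shape)."""
    if step < warmup_steps:
        # Ramp from 0 up to base_lr over the warmup steps
        return base_lr * step / max(1, warmup_steps)
    # Decay from base_lr down to 0 at total_steps
    return base_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))


for s in (0, 5, 30, 60):
    print(s, linear_lr(s))
```

With only 60 steps, the model spends most of the run at a fairly high learning rate, which is worth keeping in mind when reading the loss values.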

# Model `train`

In [None]:
trainer_stats = trainer.train()

==((====))==  Unsloth - 2x faster free finetuning | Num GPUs = 1
   \\   /|    Num examples = 333 | Num Epochs = 2
O^O/ \_/ \    Batch size per device = 2 | Gradient Accumulation steps = 4
\        /    Total batch size = 8 | Total steps = 60
 "-____-"     Number of trainable parameters = 41,943,040
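
The banner reports two epochs even though `num_train_epochs=1` was passed, because `max_steps=60` is more than one pass over 333 examples at an effective batch size of 8. The arithmetic is sketched below; the exact per-epoch step count depends on how the trainer rounds the last partial batch, so treat `steps_per_epoch` as approximate.

```python
import math

num_examples = 333
per_device_batch = 2
grad_accum = 4
num_gpus = 1
max_steps = 60

effective_batch = per_device_batch * grad_accum * num_gpus   # 8
steps_per_epoch = math.ceil(num_examples / effective_batch)   # 42
epochs_covered = max_steps / steps_per_epoch                  # a bit under 1.5 passes

print(effective_batch, steps_per_epoch, round(epochs_covered, 2))
```

Rounding the partial second pass up gives the `Num Epochs = 2` shown in the banner.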


| Step | Training Loss |
|------|---------------|
| 10   | 1.1489        |
| 20   | 0.015         |
| 30   | 0.0051        |
| 40   | 0.0007        |
| 50   | 0.0004        |
| 60   | 0.0           |


# Model `inference` after fine-tuning

In [None]:
#question = "Show that for a simple harmonic oscillator, the operator\n\n\[\nA(t) = mwx(t) \cos \omega t - p(t) \sin \omega t\n\]\n\nis independent of the time \( t \).\n\nCan this operator be simultaneously diagonalized with the Hamiltonian?"
question = "explain the inference in quantum"

FastLanguageModel.for_inference(model)  # Unsloth has 2x faster inference!
inputs = tokenizer([prompt_style.format(question, "")], return_tensors="pt").to("cuda")

outputs = model.generate(
    input_ids=inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_new_tokens=1200,
    use_cache=True,
)
response = tokenizer.batch_decode(outputs)
print(response[0].split("### Response:")[1])



<think><think>

</think>

Alright, let's break down the problem step by step. First, we need to understand what the user is asking for. The user has provided a series of instructions that are nested within nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested nested ne

# `Save Model` : to `Huggingface` Hub

In [None]:
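new_model_online = "0xZee/DeepSeek-R1-8b-ft-QuantumMechanics-CoT"
# Push the LoRA adapter weights and the tokenizer
model.push_to_hub(new_model_online)
tokenizer.push_to_hub(new_model_online)

# Merge the adapter into the base model and push the 16-bit merged weights to the same repo
model.push_to_hub_merged(new_model_online, tokenizer, save_method = "merged_16bit")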

# `Save` : Local

In [None]:
new_model_local = "DeepSeek-R1-8b-ft-QuantumMechanics-CoT"
model.save_pretrained(new_model_local)
tokenizer.save_pretrained(new_model_local)

#model.save_pretrained_merged(new_model_local, tokenizer, save_method = "merged_16bit",)
