<a href="https://colab.research.google.com/github/QiangGeCode/deepseek-r1-llama-8b/blob/main/deepseekr1_8b%E5%BE%AE%E8%B0%83.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Setting up

In [3]:
%%capture
!pip install unsloth
# Also get the latest nightly Unsloth!
!pip install --force-reinstall --no-cache-dir --no-deps git+https://github.com/unslothai/unsloth.git

In [4]:
from unsloth import FastLanguageModel
import torch
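max_seq_length = 2048  # Maximum number of tokens per training example
dtype = None  # None lets Unsloth auto-detect (float16 on a Tesla T4, bfloat16 on Ampere or newer)
load_in_4bit = True  # Load the weights 4-bit quantized to reduce GPU memory use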

🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
🦥 Unsloth Zoo will now patch everything to make training faster!


Using local environment variables (Kaggle secrets)

In [None]:
from huggingface_hub import login
from kaggle_secrets import UserSecretsClient
user_secrets = UserSecretsClient()

hf_token = user_secrets.get_secret("HUGGINGFACE_TOKEN")
login(hf_token)

In [5]:
# Use Colab environment variables (Colab secrets)
from huggingface_hub import login
from google.colab import userdata

hf_token = userdata.get('deepseekr1')
login(hf_token)
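
The two cells above read the same token from two different secret stores, so only one of them runs in any given environment. A minimal sketch of a single lookup that tries each in turn is shown below. `get_secret` is a hypothetical helper name, and the fallback to a plain environment variable is an assumption that the original notebook does not make.

```python
import os


def get_secret(name, env=None):
    """Look up a secret in Colab, then Kaggle, then environment variables.

    Hypothetical helper, not part of any library used in this notebook.
    """
    env = os.environ if env is None else env

    try:
        # Colab: values stored in the "Secrets" side panel.
        from google.colab import userdata
        return userdata.get(name)
    except Exception:
        pass  # Not running on Colab, or the secret is not defined there.

    try:
        # Kaggle: values stored under "Add-ons > Secrets".
        from kaggle_secrets import UserSecretsClient
        return UserSecretsClient().get_secret(name)
    except Exception:
        pass  # Not running on Kaggle, or the secret is not defined there.

    # Assumed fallback: a plain environment variable with the same name.
    value = env.get(name)
    if value is None:
        raise KeyError(f"Secret {name!r} not found in Colab, Kaggle, or the environment")
    return value
```

With a helper like this, `login(get_secret("HUGGINGFACE_TOKEN"))` would work unchanged on either platform, provided the secret is stored under the same name in both.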

Visualize the training process by reporting metrics to wandb

In [6]:
import wandb
from google.colab import userdata

#wb_token = user_secrets.get_secret("wandb")
wb_token = userdata.get('wandb')

wandb.login(key=wb_token)
run = wandb.init(
    project='Fine-tune-DeepSeek-R1-Distill-Llama-8B on Medical COT Dataset',
    job_type="training",
    anonymous="allow"
)



## Loading the model and tokenizer

In [7]:
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/DeepSeek-R1-Distill-Llama-8B",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
    token = hf_token,
)

==((====))==  Unsloth 2025.2.4: Fast Llama patching. Transformers: 4.48.2.
   \\   /|    GPU: Tesla T4. Max memory: 14.741 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.6.0+cu124. CUDA: 7.5. CUDA Toolkit: 12.4. Triton: 3.2.0
\        /    Bfloat16 = FALSE. FA [Xformers = 0.0.29.post2. FA2 = False]
 "-____-"     Free Apache license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!


model.safetensors:   0%|          | 0.00/5.96G [00:00<?, ?B/s]

generation_config.json:   0%|          | 0.00/236 [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/52.9k [00:00<?, ?B/s]

special_tokens_map.json:   0%|          | 0.00/483 [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/17.2M [00:00<?, ?B/s]

## Model inference before fine-tuning

In [9]:
prompt_style = """以下是一条描述任务的指令，配有一个提供进一步背景信息的输入。请撰写一个恰当完成该请求的回复。在回答之前，请仔细思考问题，并构建一个逐步的思维链条，以确保回复的逻辑性和准确性。

### 指令:
您是一位在临床推理、诊断和治疗规划方面拥有高级知识的医学专家。
请回答以下医学问题。

### 问题:
{}

### 回复:
<think>{}"""

In [10]:
question = "对于一名60岁男性患者，出现右侧胸疼并在X线检查中显示右侧肋膈角消失，诊断为肺结核伴右侧胸腔积液，请问哪一项实验室检查对了解胸水的性质更有帮助？"


FastLanguageModel.for_inference(model)  # Unsloth has 2x faster inference!
inputs = tokenizer([prompt_style.format(question, "")], return_tensors="pt").to("cuda")

outputs = model.generate(
    input_ids=inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_new_tokens=1200,
    use_cache=True,
)
response = tokenizer.batch_decode(outputs)
print(response[0].split("### 回复:")[1])



<think>
好的，我现在需要帮助诊断一个60岁男性患者，他的症状是右侧胸疼，X线检查显示右侧肋膈角消失，诊断为肺结核伴右侧胸腔积液。问题是要选择一个实验室检查来了解胸水的性质。

首先，我应该回顾一下相关知识。胸腔积液通常有多种成分，包括细胞学检查、肿瘤标记物、抗酸性细菌、结核菌等。问题是要了解胸水的性质，可能需要确定是脓液、血液、转移瘤细胞等。

接下来，考虑到患者有肺结核病史，胸腔积液可能与结核菌有关。所以，结核菌培养可能是一个关键检查。另外，细胞学检查可以帮助识别是否有转移瘤细胞或者其他恶性细胞。

其他检查如肝功能、血糖、血脂等主要是评估全身状态，帮助诊断是否有其他并发症，但并不是直接了解胸水性质的检查。

所以，最有帮助的实验室检查应该是结核菌培养和细胞学检查，结合起来可以更准确地了解胸水的成分。
</think>

针对60岁男性患者，右侧胸疼且X线显示肋膈角消失，诊断为肺结核伴右侧胸腔积液，建议进行以下实验室检查：

1. **结核菌培养**：确定胸水中是否存在结核菌，帮助确认积液的病因。
2. **细胞学检查**：评估胸水中的细胞，识别是否有转移瘤细胞或其他恶性细胞。

这些检查将提供更准确的信息，辅助制定治疗方案。<｜end▁of▁sentence｜>
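
The printed output mixes the chain of thought inside `<think>...</think>` with the final answer and the end-of-sentence token. A small sketch of how the two parts could be separated for comparison before and after fine-tuning follows. `split_reasoning` is a hypothetical helper, and the default EOS string is copied from the output above rather than read from the tokenizer.

```python
def split_reasoning(generated, eos_token="<｜end▁of▁sentence｜>"):
    """Split generated text into (reasoning, answer).

    Hypothetical helper: assumes the <think>...</think> layout used by
    the prompt template in this notebook.
    """
    text = generated.replace(eos_token, "")
    head, sep, tail = text.partition("</think>")
    reasoning = head.replace("<think>", "").strip()
    if not sep:
        # No closing tag: generation was probably cut off by max_new_tokens.
        return reasoning, ""
    return reasoning, tail.strip()


sample = "\n<think>\nstep one\n</think>\n\nfinal answer<｜end▁of▁sentence｜>"
reasoning, answer = split_reasoning(sample)
print(reasoning)
print(answer)
```

In the notebook this could be applied to `response[0].split("### 回复:")[1]`, passing `tokenizer.eos_token` as `eos_token` instead of relying on the hard-coded default.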


In [11]:
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=[
        "q_proj",
        "k_proj",
        "v_proj",
        "o_proj",
        "gate_proj",
        "up_proj",
        "down_proj",
    ],
    lora_alpha=16,
    lora_dropout=0,
    bias="none",
    use_gradient_checkpointing="unsloth",  # True or "unsloth" for very long context
    random_state=3407,
    use_rslora=False,
    loftq_config=None,
)


Unsloth 2025.2.4 patched 32 layers with 32 QKV layers, 32 O layers and 32 MLP layers.
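
The trainer banner later reports 41,943,040 trainable parameters. That figure can be reproduced by hand: a rank-`r` LoRA adapter on a linear layer adds `r * (in_features + out_features)` weights. The sketch below assumes the published Llama 3.1 8B dimensions (hidden size 4096, MLP size 14336, 32 layers, 8 key/value heads of dimension 128); these values are not printed anywhere in this notebook.

```python
# Assumed model dimensions (Llama 3.1 8B config, not shown in this notebook).
HIDDEN = 4096
INTERMEDIATE = 14336
NUM_LAYERS = 32
KV_DIM = 8 * 128  # 8 key/value heads x head dimension 128
R = 16            # LoRA rank, as passed to get_peft_model above

# (in_features, out_features) for each targeted projection.
SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_features, out_features, r=R):
    # Matrix A is (r x in_features), matrix B is (out_features x r).
    return r * (in_features + out_features)


per_layer = sum(lora_params(i, o) for i, o in SHAPES.values())
total = per_layer * NUM_LAYERS
print(f"{per_layer:,} per layer, {total:,} in total")
# → 1,310,720 per layer, 41,943,040 in total
```

Because the count is linear in `r`, doubling the rank to 32 would double the number of trainable parameters.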


## Loading and processing the dataset

In [12]:
train_prompt_style = """以下是一条描述任务的指令，配有一个提供进一步背景信息的输入。请撰写一个恰当完成该请求的回复。在回答之前，请仔细思考问题，并构建一个逐步的思维链条，以确保回复的逻辑性和准确性。

### 指令:
您是一位在临床推理、诊断和治疗规划方面拥有高级知识的医学专家。
请回答以下医学问题。

### 问题:
{}

### 回复:
<think>
{}
</think>
{}"""


In [13]:
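EOS_TOKEN = tokenizer.eos_token  # Appended to every example so the model learns where to stop


def formatting_prompts_func(examples):
    # Called with batched=True, so each field is a list with one entry per example.
    questions = examples["Question"]
    cots = examples["Complex_CoT"]
    responses = examples["Response"]
    texts = []
    for question, cot, response in zip(questions, cots, responses):
        # Fill the question, chain of thought, and answer into the training template.
        text = train_prompt_style.format(question, cot, response) + EOS_TOKEN
        texts.append(text)
    return {
        "text": texts,
    }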


In [14]:
from datasets import load_dataset
dataset = load_dataset("FreedomIntelligence/medical-o1-reasoning-SFT", "zh", split="train[0:500]", trust_remote_code=True)
dataset = dataset.map(formatting_prompts_func, batched = True,)
dataset["text"][0]

README.md:   0%|          | 0.00/1.25k [00:00<?, ?B/s]

medical_o1_sft_Chinese.json:   0%|          | 0.00/64.8M [00:00<?, ?B/s]

Generating train split:   0%|          | 0/24772 [00:00<?, ? examples/s]

Map:   0%|          | 0/500 [00:00<?, ? examples/s]

'以下是一条描述任务的指令，配有一个提供进一步背景信息的输入。请撰写一个恰当完成该请求的回复。在回答之前，请仔细思考问题，并构建一个逐步的思维链条，以确保回复的逻辑性和准确性。\n\n### 指令:\n您是一位在临床推理、诊断和治疗规划方面拥有高级知识的医学专家。\n请回答以下医学问题。\n\n### 问题:\n根据描述，一个1岁的孩子在夏季头皮出现多处小结节，长期不愈合，且现在疮大如梅，溃破流脓，口不收敛，头皮下有空洞，患处皮肤增厚。这种病症在中医中诊断为什么病？\n\n### 回复:\n<think>\n这个小孩子在夏天头皮上长了些小结节，一直都没好，后来变成了脓包，流了好多脓。想想夏天那么热，可能和湿热有关。才一岁的小孩，免疫力本来就不强，夏天的湿热没准就侵袭了身体。\n\n用中医的角度来看，出现小结节、再加上长期不愈合，这些症状让我想到了头疮。小孩子最容易得这些皮肤病，主要因为湿热在体表郁结。\n\n但再看看，头皮下还有空洞，这可能不止是简单的头疮。看起来病情挺严重的，也许是脓肿没治好。这样的情况中医中有时候叫做禿疮或者湿疮，也可能是另一种情况。\n\n等一下，头皮上的空洞和皮肤增厚更像是疾病已经深入到头皮下，这是不是说明有可能是流注或瘰疬？这些名字常描述头部或颈部的严重感染，特别是有化脓不愈合，又形成通道或空洞的情况。\n\n仔细想想，我怎么感觉这些症状更贴近瘰疬的表现？尤其考虑到孩子的年纪和夏天发生的季节性因素，湿热可能是主因，但可能也有火毒或者痰湿造成的滞留。\n\n回到基本的症状描述上看，这种长期不愈合又复杂的状况，如果结合中医更偏重的病名，是不是有可能是涉及更深层次的感染？\n\n再考虑一下，这应该不是单纯的瘰疬，得仔细分析头皮增厚并出现空洞这样的严重症状。中医里头，这样的表现可能更符合‘蚀疮’或‘头疽’。这些病名通常描述头部严重感染后的溃烂和组织坏死。\n\n看看季节和孩子的体质，夏天又湿又热，外邪很容易侵入头部，对孩子这么弱的免疫系统简直就是挑战。头疽这个病名听起来真是切合，因为它描述的感染严重，溃烂到出现空洞。\n\n不过，仔细琢磨后发现，还有个病名似乎更为合适，叫做‘蝼蛄疖’，这病在中医里专指像这种严重感染并伴有深部空洞的情况。它也涵盖了化脓和皮肤增厚这些症状。\n\n哦，该不会是夏季湿热，导致湿毒入侵，孩子的体质不能御，其病情发展成这样的

In [None]:
from datasets import load_dataset

dataset = load_dataset("FreedomIntelligence/medical-o1-reasoning-SFT", "zh",split = "train[0:500]")
dataset = dataset.map(formatting_prompts_func, batched = True,)
dataset["text"][0]

"Below is an instruction that describes a task, paired with an input that provides further context. \nWrite a response that appropriately completes the request. \nBefore answering, think carefully about the question and create a step-by-step chain of thoughts to ensure a logical and accurate response.\n\n### Instruction:\nYou are a medical expert with advanced knowledge in clinical reasoning, diagnostics, and treatment planning. \nPlease answer the following medical question. \n\n### Question:\nA 61-year-old woman with a long history of involuntary urine loss during activities like coughing or sneezing but no leakage at night undergoes a gynecological exam and Q-tip test. Based on these findings, what would cystometry most likely reveal about her residual volume and detrusor contractions?\n\n### Response:\n<think>\nOkay, let's think about this step by step. There's a 61-year-old woman here who's been dealing with involuntary urine leakages whenever she's doing something that ups her ab

## Setting up the model

In [15]:
from trl import SFTTrainer
from transformers import TrainingArguments
from unsloth import is_bfloat16_supported

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=max_seq_length,
    dataset_num_proc=2,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        # Use num_train_epochs = 1, warmup_ratio for full training runs!
        warmup_steps=5,
        max_steps=60,
        learning_rate=2e-4,
        fp16=not is_bfloat16_supported(),
        bf16=is_bfloat16_supported(),
        logging_steps=10,
        optim="adamw_8bit",
        weight_decay=0.01,
        lr_scheduler_type="linear",
        seed=3407,
        output_dir="outputs",
    ),
)


Map (num_proc=2):   0%|          | 0/500 [00:00<?, ? examples/s]
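
With `lr_scheduler_type="linear"`, `warmup_steps=5`, and `max_steps=60`, the learning rate ramps up to `2e-4` over the first five steps and then decays linearly to zero. The sketch below reproduces that shape in plain Python. `linear_lr` is a hypothetical function written for illustration, and the exact step at which the trainer samples the value for logging may differ by one.

```python
def linear_lr(step, base_lr=2e-4, warmup_steps=5, total_steps=60):
    """Learning rate at a given optimizer step for linear warmup then linear decay.

    Hypothetical helper mirroring the usual linear-with-warmup shape.
    """
    if step < warmup_steps:
        # Warmup: scale up from 0 to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay: scale down from base_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 5, 30, 60):
    print(step, linear_lr(step))
```

For a full run with `num_train_epochs=1`, the comment in the cell above suggests `warmup_ratio` instead of a fixed `warmup_steps`, so the warmup length scales with the total number of steps.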

## Model training

In [16]:
trainer_stats = trainer.train()

==((====))==  Unsloth - 2x faster free finetuning | Num GPUs = 1
   \\   /|    Num examples = 500 | Num Epochs = 1
O^O/ \_/ \    Batch size per device = 2 | Gradient Accumulation steps = 4
\        /    Total batch size = 8 | Total steps = 60
 "-____-"     Number of trainable parameters = 41,943,040


Step,Training Loss
10,2.0635
20,1.5953
30,1.5603
40,1.5287
50,1.5067
60,1.4947


In [17]:
# Close the wandb run and upload the remaining logged metrics
wandb.finish()

Run history:
train/epoch,▁▂▄▅▇██
train/global_step,▁▂▄▅▇██
train/grad_norm,█▂▁▁▁▁
train/learning_rate,█▇▅▄▂▁
train/loss,█▂▂▁▁▁

Run summary:
total_flos,1.9264526958772224e+16
train/epoch,0.96
train/global_step,60.0
train/grad_norm,0.28234
train/learning_rate,0.0
train/loss,1.4947
train_loss,1.62486
train_runtime,1196.2337
train_samples_per_second,0.401
train_steps_per_second,0.05


## Model inference after fine-tuning

In [19]:
question = "对于一名60岁男性患者，出现右侧胸疼并在X线检查中显示右侧肋膈角消失，诊断为肺结核伴右侧胸腔积液，请问哪一项实验室检查对了解胸水的性质更有帮助？"


FastLanguageModel.for_inference(model)  # Unsloth has 2x faster inference!
inputs = tokenizer([prompt_style.format(question, "")], return_tensors="pt").to("cuda")

outputs = model.generate(
    input_ids=inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_new_tokens=1200,
    use_cache=True,
)
response = tokenizer.batch_decode(outputs)
print(response[0].split("### 回复:")[1])



<think>
哦，这位60岁的男性患者，右侧胸疼的情况看起来挺严重的。X线检查显示右侧肋膈角消失，说明右侧肺部可能有较大问题。根据这些症状和检查结果，医生怀疑是肺结核伴右侧胸腔积液。

肺结核患者通常会有慢性感染，胸腔积液的话，可能是因为结核病变扩散导致的。那么，要确定这个积液的性质，首先得看看它的成分和特点。

常用的实验室检查方法有很多，比如胸水穿刺检查。这种方法可以直接获取到胸腔的液体，并通过微生物培养、细胞学检查等来判断它的成分。通过这种方法，我们可以清楚地了解到胸水中是否有结核菌或者其他感染菌。

另外，还有一种实验室检查方法是胸水细胞学检查，这个方法可以帮助我们了解胸水中的细胞类型和数量，尤其是如果有结核病变的情况下，可能会有特定的细胞特征。

还有就是抗原检测，比如结核菌的抗原检测，虽然它不能直接证明结核病变的存在，但在某些情况下，结合其他检查结果，它可以提供一些辅助信息。

不过，胸水穿刺检查看起来更直接，因为它可以直接获取到胸水样本，并通过培养和细胞学检查来明确确定其成分。这样一来，我们可以更精确地了解到胸水的性质，进而更好地指导治疗。

综上所述，胸水穿刺检查应该是更有帮助的实验室检查方法。
</think>
对于一名60岁男性患者，出现右侧胸疼并在X线检查中显示右侧肋膈角消失，诊断为肺结核伴右侧胸腔积液。在这种情况下，了解胸水的性质通常需要进行胸水穿刺检查。这种方法可以直接获取到胸腔的液体，并通过微生物培养、细胞学检查等来判断它的成分。通过这种方法，我们可以明确确定胸水中是否有结核菌或者其他感染菌，这对于治疗规划非常重要。因此，胸水穿刺检查是更有帮助的实验室检查方法。<｜end▁of▁sentence｜>


In [None]:
question = "A 59-year-old man presents with a fever, chills, night sweats, and generalized fatigue, and is found to have a 12 mm vegetation on the aortic valve. Blood cultures indicate gram-positive, catalase-negative, gamma-hemolytic cocci in chains that do not grow in a 6.5% NaCl medium. What is the most likely predisposing factor for this patient's condition?"

inputs = tokenizer([prompt_style.format(question, "")], return_tensors="pt").to("cuda")

outputs = model.generate(
    input_ids=inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_new_tokens=1200,
    use_cache=True,
)
response = tokenizer.batch_decode(outputs)
# prompt_style uses the "### 回复:" marker, so split on that; "### Response:" would raise IndexError
print(response[0].split("### 回复:")[1])


<think>
Alright, let's think about this. We've got a 59-year-old guy who's got a fever, chills, night sweats, and feels pretty tired. That sounds like a classic picture of an infection, right? And there's a vegetation on his aortic valve. Hmm, that's not great news. 

Now, looking at the blood culture results: gram-positive, catalase-negative, gamma-hemolytic cocci in chains. That means it's a Streptococcus species, specifically Streptococcus pyogenes. It's not Streptococcus pneumoniae because it doesn't grow in NaCl. And this guy's symptoms, combined with the vegetation, point towards endocarditis, which is an infection of the heart valves.

Okay, so let's figure out what could have caused this. I'm thinking about infections, not just from bacteria, but also from fungi or viruses. It's important to consider these possibilities because sometimes you can miss something. 

But wait, let's think about the main culprit. Streptococcus pyogenes is pretty aggressive and can cause a lot of pr

## Saving the model locally

In [21]:
new_model_online = "xqdong/DeepSeek-R1-Medical-COT"
new_model_local = "DeepSeek-R1-Medical-COT"
model.save_pretrained(new_model_local) # Local saving
tokenizer.save_pretrained(new_model_local)


('DeepSeek-R1-Medical-COT/tokenizer_config.json',
 'DeepSeek-R1-Medical-COT/special_tokens_map.json',
 'DeepSeek-R1-Medical-COT/tokenizer.json')

## Pushing the model to Hugging Face hub

In [22]:
model.push_to_hub(new_model_online) # Online saving
tokenizer.push_to_hub(new_model_online) # Online saving

README.md:   0%|          | 0.00/625 [00:00<?, ?B/s]

  0%|          | 0/1 [00:00<?, ?it/s]

adapter_model.safetensors:   0%|          | 0.00/168M [00:00<?, ?B/s]

Saved model to https://huggingface.co/xqdong/DeepSeek-R1-Medical-COT


  0%|          | 0/1 [00:00<?, ?it/s]

tokenizer.json:   0%|          | 0.00/17.2M [00:00<?, ?B/s]

In [None]:
model.save_pretrained_merged(new_model_local, tokenizer, save_method = "merged_16bit",)
model.push_to_hub_merged(new_model_online, tokenizer, save_method = "merged_16bit")

Unsloth: You have 1 CPUs. Using `safe_serialization` is 10x slower.
We shall switch to Pytorch saving, which might take 3 minutes and not 30 minutes.
To force `safe_serialization`, set it to `None` instead.
Unsloth: Kaggle/Colab has limited disk space. We need to delete the downloaded
model which will save 4-16GB of disk space, allowing you to save on Kaggle/Colab.
Unsloth: Will remove a cached repo with size 6.0G


Unsloth: Merging 4bit and LoRA weights to 16bit...
Unsloth: Will use up to 4.83 out of 12.67 RAM for saving.
Unsloth: Saving model... This might take 5 minutes ...


 34%|███▍      | 11/32 [00:00<00:01, 15.03it/s]
We will save to Disk and not RAM now.
100%|██████████| 32/32 [01:34<00:00,  2.96s/it]


Unsloth: Saving tokenizer... Done.
Unsloth: Saving DeepSeek-R1-Medical-COT/pytorch_model-00001-of-00004.bin...
Unsloth: Saving DeepSeek-R1-Medical-COT/pytorch_model-00002-of-00004.bin...
Unsloth: Saving DeepSeek-R1-Medical-COT/pytorch_model-00003-of-00004.bin...
Unsloth: Saving DeepSeek-R1-Medical-COT/pytorch_model-00004-of-00004.bin...
Done.


Unsloth: You are pushing to hub, but you passed your HF username = xqdong.
We shall truncate xqdong/DeepSeek-R1-Medical-COT to DeepSeek-R1-Medical-COT


Unsloth: Merging 4bit and LoRA weights to 16bit...
Unsloth: Will use up to 3.23 out of 12.67 RAM for saving.
Unsloth: Saving model... This might take 5 minutes ...


100%|██████████| 32/32 [02:37<00:00,  4.92s/it]


Unsloth: Saving tokenizer...

No files have been modified since last commit. Skipping to prevent empty commit.


 Done.
Unsloth: Saving DeepSeek-R1-Medical-COT/pytorch_model-00001-of-00004.bin...
Unsloth: Saving DeepSeek-R1-Medical-COT/pytorch_model-00002-of-00004.bin...
Unsloth: Saving DeepSeek-R1-Medical-COT/pytorch_model-00003-of-00004.bin...
