## Setting up

In [1]:
%%capture
#!pip install unsloth
!pip install unsloth==2025.2.5 -i https://mirrors.aliyun.com/pypi/simple

# Also get the latest nightly Unsloth!
!pip install --force-reinstall --no-cache-dir --no-deps git+https://gitee.com/July1921/unsloth.git -i https://mirrors.aliyun.com/pypi/simple
# The command above pulls in the latest unsloth again, so reinstall the pinned version
!pip install unsloth==2025.2.5 -i https://mirrors.aliyun.com/pypi/simple

In [1]:
from unsloth import FastLanguageModel
import torch
max_seq_length = 2048
dtype = None
load_in_4bit = True

🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
🦥 Unsloth Zoo will now patch everything to make training faster!
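
To give a rough sense of what `load_in_4bit = True` buys, the sketch below estimates the memory taken by the weights alone at two precisions. The 8-billion parameter count is a rounded assumption, and real usage will be higher because of quantization metadata, activations and the KV cache.

```python
# Back-of-the-envelope weight memory; the parameter count is an assumed round figure.
ASSUMED_PARAMS = 8_000_000_000


def weight_memory_gb(num_params: int, bits_per_param: int) -> float:
    """Return the memory needed for the weights alone, in GB (10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


for label, bits in [("16-bit", 16), ("4-bit", 4)]:
    print(f"{label}: {weight_memory_gb(ASSUMED_PARAMS, bits):.1f} GB")
```

Under these assumptions the 4-bit weights need about a quarter of the 16-bit footprint, which leaves more of the GPU for activations and optimizer state.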


Visualize the training process by reporting metrics to wandb.

In [3]:
!pip install wandb -i https://mirrors.aliyun.com/pypi/simple

Looking in indexes: https://mirrors.aliyun.com/pypi/simple

In [4]:
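## The token is passed directly here rather than through an environment variable; if you do not have one, get it from the wandb website.
## wandb.init times out after 90 s by default. Because the service is hosted overseas, init often takes longer than that, so the timeout is raised to 1200 s.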

import wandb
import os

os.environ['WANDB_INIT_TIMEOUT'] = '1200'  #Increase timeout settings
os.environ['WANDB_DEBUG'] = "true"           #Enable debugging
#
#wandb.login(key="70de2d8c11b25ebc5eea1f8f09f3a5...")
run = wandb.init(
    project='my fint-tune on deepseek r1 with medical data',
    job_type="training",
    anonymous="allow",
    settings=wandb.Settings(init_timeout=1200)
)

[34m[1mwandb[0m: Currently logged in as: [33msanyinjiang[0m ([33msanyinjiang-zt[0m) to [32mhttps://api.wandb.ai[0m. Use [1m`wandb login --relogin`[0m to force relogin
[34m[1mwandb[0m: Using wandb-core as the SDK backend.  Please refer to https://wandb.me/wandb-core for more information.


## Loading the model and tokenizer

In [5]:
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "/opt/code_repos/AI_models/unsloth-DeepSeek-R1-Distill-Llama-8B", # Change this to your local model path; in this example the model files were already downloaded from Hugging Face.
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)

==((====))==  Unsloth 2025.2.5: Fast Llama patching. Transformers: 4.48.3.
   \\   /|    GPU: NVIDIA A40. Max memory: 44.339 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.6.0+cu124. CUDA: 8.6. CUDA Toolkit: 12.4. Triton: 3.2.0
\        /    Bfloat16 = TRUE. FA [Xformers = 0.0.29.post3. FA2 = False]
 "-____-"     Free Apache license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!


Loading checkpoint shards:   0%|          | 0/4 [00:00<?, ?it/s]

## Model inference before fine-tuning

In [6]:
prompt_style = """Below is an instruction that describes a task, paired with an input that provides further context.
Write a response that appropriately completes the request.
Before answering, think carefully about the question and create a step-by-step chain of thoughts to ensure a logical and accurate response.

### Instruction:
You are a medical expert with advanced knowledge in clinical reasoning, diagnostics, and treatment planning.
Please answer the following medical question.

### Question:
{}

### Response:
<think>{}"""

In [7]:
#question = "A 61-year-old woman with a long history of involuntary urine loss during activities like coughing or sneezing but no leakage at night undergoes a gynecological exam and Q-tip test. Based on these findings, what would cystometry most likely reveal about her residual volume and detrusor contractions?"
question = "一名61岁女性，长期在咳嗽或打喷嚏等活动时不自觉排尿，但夜间无漏尿，现接受妇科检查和Q-tip检查。基于这些发现，膀胱造瘘术最有可能揭示她的残余容量和逼尿肌收缩情况？"


FastLanguageModel.for_inference(model)  # Unsloth has 2x faster inference!
inputs = tokenizer([prompt_style.format(question, "")], return_tensors="pt").to("cuda")

outputs = model.generate(
    input_ids=inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_new_tokens=1200,
    use_cache=True,
)
response = tokenizer.batch_decode(outputs)
print(response[0].split("### Response:")[1])


<think>
嗯，好的，我来想想这个问题。首先，题目描述的是一个61岁的女性，她长期在咳嗽或打喷嚏时不自觉排尿，但夜间没有漏尿。现在她正在接受妇科检查和Q-tip检查。问题是，膀胱造瘘术最有可能揭示她的残余容量和逼尿肌收缩情况吗？

首先，我需要理解什么是膀胱造瘘术。膀胱造瘘术是一种治疗膀胱功能障碍的手术方法，通常用于膀胱收缩或狭窄。手术的目的是通过放松膀胱壁和减少膀胱内压来改善排尿功能。

接下来，Q-tip检查是什么？Q-tip检查是一种用于评估膀胱功能的检查方法，通过将Q-tip catheter插入膀胱，测量膀胱压力和残余容量。残余容量是指膀胱在排空后的最大容量，而膀胱压力则反映了膀胱内的压力水平。

现在，患者在咳嗽或打喷嚏时不自觉排尿，这可能与膀胱的反射活动有关，可能是由于膀胱的高压或狭窄导致的。夜间没有漏尿，说明夜间的膀胱排空情况良好，但白天可能有不自觉的排尿，这可能与白天的活动或膀胱功能有关。

问题问的是膀胱造瘘术是否能揭示她的残余容量和逼尿肌收缩情况。根据我的理解，膀胱造瘘术主要用于治疗膀胱功能障碍，通过手术改变膀胱结构，改善排尿功能。而Q-tip检查则用于测量膀胱的残余容量和压力，这些数据可以帮助诊断膀胱的功能问题。

所以，膀胱造瘘术本身是治疗手段，而不是诊断手段。Q-tip检查是诊断手段，能够测量残余容量和膀胱压力，从而评估膀胱的功能状态。

因此，问题中的描述中，患者正在接受妇科检查和Q-tip检查，这两者结合起来，可以帮助医生了解她的膀胱功能状况，包括残余容量和膀胱压力。膀胱造瘘术则是治疗性的手术，用于改善膀胱功能障碍。

总结一下，膀胱造瘘术用于治疗，而Q-tip检查用于诊断，结合这两者，可以更好地了解膀胱的残余容量和逼尿肌的情况。
</think>

膀胱造瘘术主要用于治疗膀胱功能障碍，通过手术改善膀胱结构和功能，而Q-tip检查则用于诊断膀胱的残余容量和压力，帮助评估膀胱功能状态。因此，膀胱造瘘术本身并不是用于揭示残余容量和逼尿肌收缩情况的手段。<｜end▁of▁sentence｜>
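
The printed text mixes the reasoning inside `<think>...</think>` with the final answer and the end-of-sequence marker. When comparing outputs before and after fine-tuning it can help to separate the two parts. The sketch below is one way to do that; `split_reasoning` is a hypothetical helper, not part of Unsloth or transformers.

```python
EOS_MARKER = "<｜end▁of▁sentence｜>"  # end-of-sequence text as it appears in the output above


def split_reasoning(response: str):
    """Split a decoded response into (reasoning, answer)."""
    text = response.replace(EOS_MARKER, "")
    if "</think>" in text:
        reasoning, answer = text.split("</think>", 1)
    else:
        # Generation stopped before the reasoning block was closed.
        reasoning, answer = text, ""
    reasoning = reasoning.replace("<think>", "", 1)
    return reasoning.strip(), answer.strip()


sample = "\n<think>\nstep one\nstep two\n</think>\n\nfinal answer" + EOS_MARKER
reasoning, answer = split_reasoning(sample)
print(reasoning)
print(answer)
```

In the notebook, the input would be `response[0].split("### Response:")[1]`.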


In [9]:
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=[
        "q_proj",
        "k_proj",
        "v_proj",
        "o_proj",
        "gate_proj",
        "up_proj",
        "down_proj",
    ],
    lora_alpha=16,
    lora_dropout=0,
    bias="none",
    use_gradient_checkpointing="unsloth",  # True or "unsloth" for very long context
    random_state=3407,
    use_rslora=False,
    loftq_config=None,
)


Unsloth 2025.2.5 patched 32 layers with 32 QKV layers, 32 O layers and 32 MLP layers.
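
The LoRA settings above determine how many parameters are actually trained. The sketch below counts them, assuming the usual Llama-3.1-8B layer shapes (hidden size 4096, MLP size 14336, 8 key/value heads of 128 dimensions, 32 layers). These shapes are not stated in the notebook, so treat them as assumptions.

```python
# Assumed Llama-3.1-8B shapes; they are not stated in this notebook.
HIDDEN = 4096
INTERMEDIATE = 14336
KV_DIM = 1024      # 8 key/value heads x 128 dims (grouped-query attention)
NUM_LAYERS = 32
RANK = 16          # r from the cell above
LORA_ALPHA = 16    # lora_alpha from the cell above

# (in_features, out_features) for each targeted projection
SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_features: int, out_features: int, rank: int) -> int:
    """LoRA adds A (rank x in_features) and B (out_features x rank)."""
    return rank * (in_features + out_features)


per_layer = sum(lora_params(i, o, RANK) for i, o in SHAPES.values())
total = per_layer * NUM_LAYERS
scaling = LORA_ALPHA / RANK  # standard LoRA scaling when use_rslora=False

print(f"trainable LoRA parameters: {total:,}")
print(f"scaling factor: {scaling}")
```

With these shapes the count comes to 41,943,040, the same figure the trainer reports when training starts.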


## Loading and processing the dataset

In [10]:
train_prompt_style = """Below is an instruction that describes a task, paired with an input that provides further context.
Write a response that appropriately completes the request.
Before answering, think carefully about the question and create a step-by-step chain of thoughts to ensure a logical and accurate response.

### Instruction:
You are a medical expert with advanced knowledge in clinical reasoning, diagnostics, and treatment planning.
Please answer the following medical question.

### Question:
{}

### Response:
<think>
{}
</think>
{}"""


In [11]:
EOS_TOKEN = tokenizer.eos_token  # Must add EOS_TOKEN


def formatting_prompts_func(examples):
    inputs = examples["Question"]
    cots = examples["Complex_CoT"]
    outputs = examples["Response"]
    texts = []
    for input, cot, output in zip(inputs, cots, outputs):
        text = train_prompt_style.format(input, cot, output) + EOS_TOKEN
        texts.append(text)
    return {
        "text": texts,
    }
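
To see what `formatting_prompts_func` produces without loading the dataset, the sketch below runs the same logic on a toy batch. The shortened template, the `<eos>` placeholder and the sample rows are all made up for illustration; the notebook uses `train_prompt_style` and `tokenizer.eos_token`.

```python
# Shortened stand-in for train_prompt_style; the real template is defined above.
TEMPLATE = "### Question:\n{}\n\n### Response:\n<think>\n{}\n</think>\n{}"
EOS = "<eos>"  # placeholder; the notebook uses tokenizer.eos_token


def format_batch(examples):
    """Mirror formatting_prompts_func on a dict of column lists."""
    texts = [
        TEMPLATE.format(question, cot, response) + EOS
        for question, cot, response in zip(
            examples["Question"], examples["Complex_CoT"], examples["Response"]
        )
    ]
    return {"text": texts}


# Toy batch in the column-oriented layout that dataset.map(batched=True) passes in.
batch = {
    "Question": ["Q1", "Q2"],
    "Complex_CoT": ["reasoning 1", "reasoning 2"],
    "Response": ["answer 1", "answer 2"],
}
out = format_batch(batch)
print(out["text"][0])
```

Appending the end-of-sequence token to every example is what teaches the model where a response stops.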


In [12]:
from datasets import load_dataset

#dataset = load_dataset("/root/autodl-tmp", "en",split = "train[0:500]") # Download the dataset from Hugging Face as well and place it locally
# Adjust the first argument of load_dataset to your local path (the "zh" subset is used here)
dataset = load_dataset("/opt/code_repos/AI_datasets/AI-ModelScope---medical-o1-reasoning-SFT", "zh",split = "train[0:500]") # Local copy of the dataset downloaded beforehand
dataset = dataset.map(formatting_prompts_func, batched = True,)
dataset["text"][0]

'Below is an instruction that describes a task, paired with an input that provides further context.\nWrite a response that appropriately completes the request.\nBefore answering, think carefully about the question and create a step-by-step chain of thoughts to ensure a logical and accurate response.\n\n### Instruction:\nYou are a medical expert with advanced knowledge in clinical reasoning, diagnostics, and treatment planning.\nPlease answer the following medical question.\n\n### Question:\n根据描述，一个1岁的孩子在夏季头皮出现多处小结节，长期不愈合，且现在疮大如梅，溃破流脓，口不收敛，头皮下有空洞，患处皮肤增厚。这种病症在中医中诊断为什么病？\n\n### Response:\n<think>\n这个小孩子在夏天头皮上长了些小结节，一直都没好，后来变成了脓包，流了好多脓。想想夏天那么热，可能和湿热有关。才一岁的小孩，免疫力本来就不强，夏天的湿热没准就侵袭了身体。\n\n用中医的角度来看，出现小结节、再加上长期不愈合，这些症状让我想到了头疮。小孩子最容易得这些皮肤病，主要因为湿热在体表郁结。\n\n但再看看，头皮下还有空洞，这可能不止是简单的头疮。看起来病情挺严重的，也许是脓肿没治好。这样的情况中医中有时候叫做禿疮或者湿疮，也可能是另一种情况。\n\n等一下，头皮上的空洞和皮肤增厚更像是疾病已经深入到头皮下，这是不是说明有可能是流注或瘰疬？这些名字常描述头部或颈部的严重感染，特别是有化脓不愈合，又形成通道或空洞的情况。\n\n仔细想想，我怎么感觉这些症状更贴近瘰疬的表现？尤其考虑到孩子的年纪和夏天发生的季节性因素，湿热可能是主因，但可能也有火毒或者痰湿造成的滞留。\n\n回到基本

## Setting up the model

In [13]:
from trl import SFTTrainer
from transformers import TrainingArguments
from unsloth import is_bfloat16_supported

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=max_seq_length,
    dataset_num_proc=2,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        # Use num_train_epochs = 1, warmup_ratio for full training runs!
        warmup_steps=5,
        max_steps=60,
        learning_rate=2e-4,
        fp16=not is_bfloat16_supported(),
        bf16=is_bfloat16_supported(),
        logging_steps=10,
        optim="adamw_8bit",
        weight_decay=0.01,
        lr_scheduler_type="linear",
        seed=3407,
        output_dir="outputs",
    ),
)


Detected kernel version 5.4.0, which is below the recommended minimum of 5.5.0; this can cause the process to hang. It is recommended to upgrade the kernel to the minimum version or higher.
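
The batch settings above fix how much data the run sees. A quick calculation, assuming a single GPU as in this environment:

```python
per_device_batch = 2   # per_device_train_batch_size
grad_accum = 4         # gradient_accumulation_steps
num_gpus = 1           # assumption: single-GPU run
max_steps = 60
num_examples = 500     # train[0:500]

effective_batch = per_device_batch * grad_accum * num_gpus
examples_seen = effective_batch * max_steps
epochs = examples_seen / num_examples

print(f"effective batch size: {effective_batch}")
print(f"examples seen: {examples_seen}")
print(f"epochs: {epochs:.2f}")
```

So 60 steps cover 480 of the 500 examples, slightly less than one full epoch.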


## Model training

In [14]:
trainer_stats = trainer.train()

==((====))==  Unsloth - 2x faster free finetuning | Num GPUs = 1
   \\   /|    Num examples = 500 | Num Epochs = 1
O^O/ \_/ \    Batch size per device = 2 | Gradient Accumulation steps = 4
\        /    Total batch size = 8 | Total steps = 60
 "-____-"     Number of trainable parameters = 41,943,040


Step,Training Loss
10,2.1143
20,1.65
30,1.6206
40,1.5887
50,1.5688
60,1.5521
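
The run uses `lr_scheduler_type="linear"` with `warmup_steps=5` and `max_steps=60`. The sketch below reproduces the shape of that schedule, assuming the common rule of linear warmup from zero followed by linear decay to zero. It is an illustration of the rule, not the trainer's own code.

```python
def linear_lr(step, base_lr=2e-4, warmup_steps=5, total_steps=60):
    """Linear warmup to base_lr, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 5, 10, 30, 60):
    print(step, f"{linear_lr(step):.2e}")
```

Under this rule the learning rate peaks at step 5 and reaches zero at step 60.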


In [15]:
# Finish the wandb run so the logged metrics are uploaded
wandb.finish()

Run history:
train/epoch,▁▂▄▅▇██
train/global_step,▁▂▄▅▇██
train/grad_norm,▁█▁▁▁▁
train/learning_rate,█▇▅▄▂▁
train/loss,█▂▂▁▁▁

Run summary:
total_flos,1.8786357420539904e+16
train/epoch,0.96
train/global_step,60.0
train/grad_norm,0.29246
train/learning_rate,0.0
train/loss,1.5521
train_loss,1.68245
train_runtime,288.2084
train_samples_per_second,1.665
train_steps_per_second,0.208


## Model inference after fine-tuning

In [16]:
#question = "A 61-year-old woman with a long history of involuntary urine loss during activities like coughing or sneezing but no leakage at night undergoes a gynecological exam and Q-tip test. Based on these findings, what would cystometry most likely reveal about her residual volume and detrusor contractions?"
question = "一名61岁女性，长期在咳嗽或打喷嚏等活动时不自觉排尿，但夜间无漏尿，现接受妇科检查和Q-tip检查。基于这些发现，膀胱造瘘术最有可能揭示她的残余容量和逼尿肌收缩情况？"

FastLanguageModel.for_inference(model)  # Unsloth has 2x faster inference!
inputs = tokenizer([prompt_style.format(question, "")], return_tensors="pt").to("cuda")

outputs = model.generate(
    input_ids=inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_new_tokens=1200,
    use_cache=True,
)
response = tokenizer.batch_decode(outputs)
print(response[0].split("### Response:")[1])



<think>
首先，这位61岁女性的症状非常特别。咳嗽或打喷嚏时不自觉排尿，但夜间没有漏尿。这种情况下，首先想到的就是膀胱功能异常的问题。

接下来，考虑到她的症状，特别是她在特定活动时会不自觉排尿，这很可能是膀胱的功能异常所引起的。通常来说，膀胱的功能异常可能与膀胱的容量变化有关，或者是膀胱的肌肉收缩。

然后，想到Q-tip检查是非常有帮助的。Q-tip检查是为了评估膀胱的残余容量和膀胱的压力情况的。通过这个检查，可以了解膀胱在不同状态下的容量变化。

再想想，既然Q-tip检查可以揭示膀胱的残余容量和膀胱的压力，那么它在这种情况下就非常适合用来帮助诊断。

最后，结合这些信息，可以明确得出结论：Q-tip检查在这种情况下是最有可能揭示她残余容量和膀胱肌肉收缩情况的工具。
</think>
根据这位61岁女性的症状，她在咳嗽或打喷嚏时不自觉排尿，但夜间没有漏尿，这表明可能存在膀胱功能异常。结合Q-tip检查的目的，Q-tip检查是非常适合用于评估膀胱残余容量和膀胱肌肉收缩情况的工具。因此，基于这些发现，Q-tip检查最有可能揭示她的残余容量和膀胱肌肉收缩情况。<｜end▁of▁sentence｜>


In [17]:
#question = "A 59-year-old man presents with a fever, chills, night sweats, and generalized fatigue, and is found to have a 12 mm vegetation on the aortic valve. Blood cultures indicate gram-positive, catalase-negative, gamma-hemolytic cocci in chains that do not grow in a 6.5% NaCl medium. What is the most likely predisposing factor for this patient's condition?"
question = "59岁男性，发热，寒颤，盗汗，全身疲劳，主动脉瓣上有12毫米的赘生物。血液培养显示革兰氏阳性，过氧化氢酶阴性，γ -溶血性球菌链，不能在6.5%的NaCl培养基中生长。这个病人的病情最有可能的诱发因素是什么？"

inputs = tokenizer([prompt_style.format(question, "")], return_tensors="pt").to("cuda")

outputs = model.generate(
    input_ids=inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_new_tokens=1200,
    use_cache=True,
)
response = tokenizer.batch_decode(outputs)
print(response[0].split("### Response:")[1])


<think>
首先，我注意到这个病人有发热、寒颤和盗汗，这些症状让我想到流感或者其他类似的感染症状。接着，血液培养结果显示革兰氏阳性，说明有可能是革兰氏阳性菌引起的感染。

哦，等等，过氧化氢酶阴性，这让我觉得可能不是大肠杆菌，因为大肠杆菌通常是过氧化氢酶阳性。然后，γ -溶血性球菌链的存在让我想到可能是溶菌性病毒，像流感病毒或者其他RNA病毒。

再看培养基条件，无法在6.5%的NaCl培养基中生长，这让我想到了某些细菌可能需要特定的环境，比如像流感病毒可能需要细胞培养才能生长。

哦，这样看来，结合这些因素，病人可能患有流感。流感病毒确实有这种特点，尤其是γ -溶血性球菌链的存在。病人发热、疲劳，正好是流感的常见症状。

总结一下，病人的症状和培养结果都指向流感病毒的可能性。流感病毒确实需要细胞环境才能繁殖，这与无法在6.5%的NaCl培养基中生长的结果相符。因此，病人最有可能的诱发因素是流感病毒。
</think>
根据病人的症状和血液培养结果，结合病人的发热、寒颤、盗汗等典型流感症状，以及培养结果显示的γ -溶血性球菌链以及不能在6.5%的NaCl培养基中生长的特点，病人最有可能的诱发因素是流感病毒。流感病毒需要细胞环境才能繁殖，这与无法在NaCl培养基中生长的结果相符。因此，病人最有可能的诱发因素是流感病毒。<｜end▁of▁sentence｜>


## Saving the model locally

In [18]:
new_model_local = "DeepSeek-R1-Medical-COT"
model.save_pretrained(new_model_local) # Save the fine-tuned model files locally
tokenizer.save_pretrained(new_model_local)

('DeepSeek-R1-Medical-COT/tokenizer_config.json',
 'DeepSeek-R1-Medical-COT/special_tokens_map.json',
 'DeepSeek-R1-Medical-COT/tokenizer.json')
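
After saving, it can be worth confirming that the adapter files were written alongside the tokenizer files. The sketch below checks for a few file names. The adapter names are assumed from PEFT's usual layout (older versions may write `adapter_model.bin` instead), and `missing_files` is a hypothetical helper.

```python
from pathlib import Path

# File names assumed from PEFT's usual adapter layout; adjust for your versions.
EXPECTED = ("adapter_config.json", "adapter_model.safetensors", "tokenizer_config.json")


def missing_files(directory, expected=EXPECTED):
    """Return the expected file names that are not present in directory."""
    root = Path(directory)
    return [name for name in expected if not (root / name).is_file()]


missing = missing_files("DeepSeek-R1-Medical-COT")
print("missing:", missing or "none")
```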

## Pushing the model to Hugging Face hub

In [None]:
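new_model_online = "your-username/DeepSeek-R1-Medical-COT"  # placeholder repo id (not in the original); replace with your own
model.push_to_hub(new_model_online) # Online saving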
tokenizer.push_to_hub(new_model_online) # Online saving

In [None]:
model.save_pretrained_merged(new_model_local, tokenizer, save_method = "merged_16bit",)
model.push_to_hub_merged(new_model_online, tokenizer, save_method = "merged_16bit")