<a href="https://colab.research.google.com/github/liangxi2004/eco_waste_colab/blob/main/eco_waste.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [16]:
from google.colab import drive
drive.mount('/content/drive')  # mount Google Drive at /content/drive

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [17]:
%%capture
# Step1. Setting Up
# (1) Install dependencies. %%capture is a cell magic, so it has to be the first line of the cell.
!pip install unsloth
!pip install --force-reinstall --no-cache-dir --no-deps git+https://github.com/unslothai/unsloth.git

In [18]:
# (2) Log in to the Hugging Face Hub:
from huggingface_hub import login
# Recommended: read the token from Colab secrets instead of hard-coding it
from google.colab import userdata
login(token=userdata.get('HF_TOKEN'))

In [19]:
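# Step2. Load the model and tokenizer
from unsloth import FastLanguageModel

max_seq_length = 2048  # waste-management reports usually need a fairly long context
dtype = None           # None lets Unsloth auto-detect the dtype (float16 on a Tesla T4)
load_in_4bit = True    # 4-bit quantization to reduce GPU memory use

# Load the distilled base model that will be adapted to the environmental domain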
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)

==((====))==  Unsloth 2025.3.19: Fast Qwen2 patching. Transformers: 4.50.3.
   \\   /|    Tesla T4. Num GPUs = 1. Max memory: 14.741 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.6.0+cu124. CUDA: 7.5. CUDA Toolkit: 12.4. Triton: 3.2.0
\        /    Bfloat16 = FALSE. FA [Xformers = 0.0.29.post3. FA2 = False]
 "-____-"     Free license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!


In [20]:
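# Step3. Inference before fine-tuning
# (1) Define the environmental-expert prompt template.
# The template stays in Chinese because the dataset and the expected answers are Chinese.
# Markers: "### 问题:" = question, "### 思考过程:" = reasoning, "### 回答:" = answer.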
prompt_style = """你是一位资深的环境保护专家，专注于废弃物管理、资源回收和可持续发展领域。
请基于专业知识回答以下问题，回答前请先进行逐步推理。

### 问题:
{}
### 思考过程:
<think>{}
### 回答:
"""

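To make the two `{}` slots concrete, here is a minimal sketch of how the template is filled at inference time. It uses a shortened copy of the template (`demo_template` and `fake_completion` are illustrative names, not part of the notebook): the first slot receives the question and the second is left empty, so generation starts from the end of the prompt.

```python
# Shortened, illustrative copy of the template above; only the markers matter here.
demo_template = """### 问题:
{}
### 思考过程:
<think>{}
### 回答:
"""

question = "塑料瓶盖应该投放到哪个分类垃圾桶？"

# First slot: the question. Second slot: empty string.
prompt = demo_template.format(question, "")
print(prompt)

# Whatever the model appends comes after the last marker, so splitting on that
# marker separates the prompt from the generated continuation.
fake_completion = prompt + "可回收物"  # hypothetical model output, for illustration only
answer = fake_completion.split("### 回答:")[1].strip()
print(answer)
```

The same split on `### 回答:` is what the next cell applies to the decoded output.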
In [21]:
# (2) Test a waste-sorting question
question = "塑料瓶盖应该投放到哪个分类垃圾桶？"  # "Which sorting bin should plastic bottle caps go in?"

FastLanguageModel.for_inference(model)
inputs = tokenizer(
    [prompt_style.format(question, "")],
    return_tensors="pt"
).to("cuda")

outputs = model.generate(
    input_ids=inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_new_tokens=800,  # upper bound on generated tokens (reasoning plus answer)
    temperature=0.7,
)

response = tokenizer.batch_decode(outputs)
print(response[0].split("### 回答:")[1])


塑料瓶盖应该投放到“可回收物”或“垃圾”分类垃圾桶。

**思考过程总结：**
1. **分类原则**：塑料瓶盖属于可回收物，通常归入“可回收物”或“垃圾”分类。具体取决于分类标准，如“可回收物”更侧重于易降解和可回收的材料，而“垃圾”主要指不可回收或有害的物质。

2. **分类标准**：如果分类标准为“可回收物”，则塑料瓶盖应归入该类别；如果为“垃圾”，则归入“有害废物”或“其他”分类。

3. **使用建议**：投放塑料瓶盖时，应确保其表面清洁，避免污染环境，以符合环保要求。

4. **注意事项**：确保塑料瓶盖在投放后能够正确降解或通过回收系统处理，以维持分类的正确性。

通过上述思考，得出塑料瓶盖应投放到“可回收物”或“垃圾”分类垃圾桶。
</think>

塑料瓶盖应该投放到“可回收物”或“垃圾”分类垃圾桶。

**思考过程总结：**

1. **分类原则**：塑料瓶盖属于可回收物，通常归入“可回收物”或“垃圾”分类。具体取决于分类标准，如“可回收物”更侧重于易降解和可回收的材料，而“垃圾”主要指不可回收或有害的物质。

2. **分类标准**：如果分类标准为“可回收物”，塑料瓶盖应归入该类别；如果为“垃圾”，则归入“有害废物”或“其他”分类。

3. **使用建议**：投放塑料瓶盖时，应确保其表面清洁，避免污染环境，以符合环保要求。

4. **注意事项**：确保塑料瓶盖在投放后能够正确降解或通过回收系统处理，以维持分类的正确性。

通过上述思考，得出塑料瓶盖应投放到“可回收物”或“垃圾”分类垃圾桶。<｜end▁of▁sentence｜>
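In the output above, the text after `### 回答:` still contains a draft, a closing `</think>` tag, a repeated answer, and the end-of-sequence marker. A small post-processing helper can isolate the final answer. This is only a sketch: `extract_answer` is a hypothetical helper, and it assumes that the final answer is whatever follows the last `</think>` tag.

```python
EOS = "<｜end▁of▁sentence｜>"   # end-of-sequence marker as it appears in the output above
ANSWER_MARKER = "### 回答:"

def extract_answer(decoded: str) -> str:
    """Return the final answer from a decoded generation (illustrative helper)."""
    # Keep only what follows the answer marker, if the marker is present.
    text = decoded.split(ANSWER_MARKER, 1)[-1]
    # Drop the end-of-sequence marker.
    text = text.replace(EOS, "")
    # If the model closed a reasoning block, keep only what follows the last </think>.
    if "</think>" in text:
        text = text.rsplit("</think>", 1)[1]
    return text.strip()

# Hypothetical decoded string with the same shape as the real output.
sample = "### 问题:\n...\n### 回答:\n草稿\n</think>\n\n最终回答" + EOS
print(extract_answer(sample))
```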


In [22]:
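# Step4. Load and format the converted dataset
from datasets import load_dataset

# Load the converted dataset from the Hugging Face Hub
dataset = load_dataset("liangxi2004/eco_waste_dataset", split="train")  # or pass a local path

# Training prompt template, adapted to the environmental domain (kept in Chinese to match the dataset).
# Markers: "### 问题:" = question, "### 专业思考:" = expert reasoning, "### 专业回答:" = expert answer.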
train_prompt_style = """你是一位资深的环境工程专家，专注于废弃物管理和循环经济。
请基于专业知识和以下思考过程，详细回答垃圾处理问题。

### 问题:
{}
### 专业思考:
<think>{}</think>
### 专业回答:
{}"""

EOS_TOKEN = tokenizer.eos_token

def formatting_prompts_func(examples):
    texts = []
    for q, cot, ans in zip(examples["Question"], examples["Complex_CoT"], examples["Response"]):
        text = train_prompt_style.format(q, cot, ans) + EOS_TOKEN
        texts.append(text)
    return {"text": texts}

# Apply the formatting function to the whole dataset in batches
dataset = dataset.map(formatting_prompts_func, batched=True)
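With `batched=True`, `dataset.map` passes the function a dict of column lists rather than a single row, which is why the function zips three columns. The sketch below reproduces that behaviour on a plain Python dict so it can be run without the dataset or the tokenizer. The template is shortened, `"<eos>"` stands in for `tokenizer.eos_token`, and the two rows are made up for illustration.

```python
# Illustrative stand-ins; in the notebook these come from the cell above and the tokenizer.
train_template = "### 问题:\n{}\n### 专业思考:\n<think>{}</think>\n### 专业回答:\n{}"
EOS_TOKEN = "<eos>"  # placeholder for tokenizer.eos_token

def format_batch(examples):
    """Same logic as formatting_prompts_func, applied to a plain dict of lists."""
    texts = []
    for q, cot, ans in zip(examples["Question"], examples["Complex_CoT"], examples["Response"]):
        texts.append(train_template.format(q, cot, ans) + EOS_TOKEN)
    return {"text": texts}

# Hypothetical two-row batch in the column layout that a batched map passes in.
batch = {
    "Question": ["Q1", "Q2"],
    "Complex_CoT": ["reasoning 1", "reasoning 2"],
    "Response": ["answer 1", "answer 2"],
}
formatted = format_batch(batch)
print(formatted["text"][0])
```

Each resulting string ends with the end-of-sequence token, which gives the model a signal for where an answer stops.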

In [23]:
# Step5. Configure the model for fine-tuning
# (1) LoRA configuration
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                   "gate_proj", "up_proj", "down_proj"],
    lora_alpha=16,
    lora_dropout=0.1,  # slightly higher dropout against overfitting; Unsloth's fast path needs 0, so this costs speed
    bias="none",
    use_gradient_checkpointing="unsloth",
)

Unsloth: Dropout = 0 is supported for fast patching. You are using dropout = 0.1.
Unsloth will patch all other layers, except LoRA matrices, causing a performance hit.
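The adapter size implied by this configuration can be checked by hand. For a linear layer with input size `d_in` and output size `d_out`, LoRA with rank `r` adds `r * (d_in + d_out)` weights. The sketch below assumes the layer shapes from the base model's published `config.json` (hidden size 1536, MLP size 8960, 28 layers, 2 key/value heads of dimension 128); these values are not stated in the notebook.

```python
r = 16                                  # LoRA rank from the cell above
hidden, mlp, layers = 1536, 8960, 28    # assumed from the base model's config.json
kv_dim = 2 * 128                        # assumed: 2 key/value heads x head dimension 128

# (d_in, d_out) for every module listed in target_modules
shapes = {
    "q_proj": (hidden, hidden),
    "k_proj": (hidden, kv_dim),
    "v_proj": (hidden, kv_dim),
    "o_proj": (hidden, hidden),
    "gate_proj": (hidden, mlp),
    "up_proj": (hidden, mlp),
    "down_proj": (mlp, hidden),
}

# Each adapted layer gets two low-rank matrices: A (r x d_in) and B (d_out x r).
per_layer = sum(r * (d_in + d_out) for d_in, d_out in shapes.values())
total = per_layer * layers
print(f"{total:,}")  # 18,464,768
```

Under these assumptions the result equals the trainable-parameter count that Unsloth prints when training starts.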


In [24]:
# (2) Training arguments
from trl import SFTTrainer
from transformers import TrainingArguments

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=max_seq_length,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        warmup_ratio=0.1,
        max_steps=300,  # 300 optimizer steps; the dataset is small (1,470 examples)
        learning_rate=2e-4,
        optim="adamw_8bit",
        weight_decay=0.01,
        fp16=True,
        logging_steps=10,
        output_dir="eco_outputs",
    ),
)

Unsloth: Tokenizing ["text"] (num_proc=2):   0%|          | 0/1470 [00:00<?, ? examples/s]
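The schedule these arguments imply can be worked out with a few lines of arithmetic. The sketch uses the 1,470 examples reported by the tokenization progress bar and assumes a single GPU. The warmup figure is an approximation, since the exact rounding is up to the trainer.

```python
import math

num_examples = 1470      # from the tokenization progress bar above
per_device_batch = 2
grad_accum = 4
max_steps = 300
warmup_ratio = 0.1

# Samples consumed per optimizer step on one GPU.
effective_batch = per_device_batch * grad_accum
# Optimizer steps needed to see every example once.
steps_per_epoch = math.ceil(num_examples / effective_batch)
# Passes over the data covered by max_steps.
epochs_covered = max_steps / steps_per_epoch
# Approximate number of warmup steps.
warmup_steps = round(max_steps * warmup_ratio)

print(effective_batch, steps_per_epoch, round(epochs_covered, 2), warmup_steps)
```

With these numbers, 300 steps cover a little more than one and a half passes over the data, which the training banner rounds up to 2 epochs.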

In [25]:
# Step6. Train the model
trainer_stats = trainer.train()

==((====))==  Unsloth - 2x faster free finetuning | Num GPUs used = 1
   \\   /|    Num examples = 1,470 | Num Epochs = 2 | Total steps = 300
O^O/ \_/ \    Batch size per device = 2 | Gradient accumulation steps = 4
\        /    Data Parallel GPUs = 1 | Total batch size (2 x 4 x 1) = 8
 "-____-"     Trainable parameters = 18,464,768/5,000,000,000 (0.37% trained)
[34m[1mwandb[0m: Using wandb-core as the SDK backend.  Please refer to https://wandb.me/wandb-core for more information.


<IPython.core.display.Javascript object>

[34m[1mwandb[0m: Logging into wandb.ai. (Learn how to deploy a W&B server locally: https://wandb.me/wandb-server)
[34m[1mwandb[0m: You can find your API key in your browser here: https://wandb.ai/authorize
wandb: Paste an API key from your profile and hit enter:

 ··········


[34m[1mwandb[0m: No netrc file found, creating one.
[34m[1mwandb[0m: Appending key for api.wandb.ai to your netrc file: /root/.netrc
[34m[1mwandb[0m: Currently logged in as: [33mliangxi20041014[0m ([33mliangxi20041014-southwest-university[0m) to [32mhttps://api.wandb.ai[0m. Use [1m`wandb login --relogin`[0m to force relogin


Unsloth: Will smartly offload gradients to save VRAM!


Step   Training Loss
10     2.8473
20     2.4467
30     1.6643
40     0.7824
50     0.3252
60     0.1818
70     0.1426
80     0.125
90     0.1136
100    0.1079
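The logged value is the mean per-token cross-entropy in nats, so `exp(loss)` gives the training perplexity, which is often easier to interpret. The sketch below converts the logged values. A perplexity close to 1 on the training data may mean that the samples are highly templated or that the model is memorizing them, so a held-out evaluation would be needed to judge generalization.

```python
import math

# Training loss values copied from the log above (step -> loss).
losses = {
    10: 2.8473, 20: 2.4467, 30: 1.6643, 40: 0.7824, 50: 0.3252,
    60: 0.1818, 70: 0.1426, 80: 0.125, 90: 0.1136, 100: 0.1079,
}

# Perplexity is the exponential of the mean per-token cross-entropy.
perplexity = {step: math.exp(loss) for step, loss in losses.items()}

for step, ppl in perplexity.items():
    print(f"step {step:>3}: loss {losses[step]:.4f}  perplexity {ppl:.2f}")
```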


In [26]:
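# Step7. Inference after fine-tuning
# Note: this reuses the Step3 template ("### 回答:"), not the training template ("### 专业回答:").
question = "外卖餐盒应该如何正确处理？"  # "How should takeout food containers be disposed of properly?"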
FastLanguageModel.for_inference(model)
inputs = tokenizer([prompt_style.format(question, "")], return_tensors="pt").to("cuda")
outputs = model.generate(
    input_ids=inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_new_tokens=800,
    temperature=0.7,
)
print(tokenizer.batch_decode(outputs)[0].split("### 回答:")[1])


关于外卖餐盒的详细处理方案：

【专业处理方案】
- 化学处理：使用酸碱中和、氧化还原等方法分解有害物质，避免有害物质接触人体。
- 生物降解：利用微生物分解有机成分，实现有机质的完全分解。

【政策法规标准】
- 《巴塞尔公约》关于跨境转移的规定
- 欧盟循环经济行动计划相关标准

【技术创新】
- 区块链追溯系统，全程监控废物流向
- 生物酶解技术，加速有机质分解过程

【注意事项】
- 不同类别需分开存放，防止交叉污染
- 必须佩戴防护手套和口罩，避免直接接触

【温馨提示】
- 危险废物应使用专用容器盛装，标明成分和危险性
- 更多信息可查询当地环保部门最新指南
<｜end▁of▁sentence｜>
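The answer above was generated with the Step3 template, whose markers differ from the ones the model saw during training. One option, offered here as an assumption rather than as the notebook's method, is to derive the inference prompt from the training template by cutting it at `<think>`, so that the model continues in the structure it was trained on. `build_inference_prompt` is a hypothetical helper, and the template is a shortened copy of `train_prompt_style`.

```python
# Shortened copy of train_prompt_style, for illustration only.
train_template = (
    "### 问题:\n{}\n"
    "### 专业思考:\n<think>{}</think>\n"
    "### 专业回答:\n{}"
)

def build_inference_prompt(template: str, question: str) -> str:
    """Fill the question slot and stop right after the opening <think> tag."""
    prefix = template.split("<think>", 1)[0]   # everything before the reasoning slot
    return prefix.format(question) + "<think>"

prompt = build_inference_prompt(train_template, "外卖餐盒应该如何正确处理？")
print(prompt)
# After generation, the answer would be the text that follows "### 专业回答:".
```

Whether this improves the answers would have to be checked empirically on the fine-tuned model.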


In [27]:
# Step8. Save the merged model
new_model_name = "DeepSeek-R1-EcoWaste"
model.save_pretrained_merged(new_model_name, tokenizer, save_method="merged_16bit")

Unsloth: You have 1 CPUs. Using `safe_serialization` is 10x slower.
We shall switch to Pytorch saving, which might take 3 minutes and not 30 minutes.
To force `safe_serialization`, set it to `None` instead.
Unsloth: Kaggle/Colab has limited disk space. We need to delete the downloaded
model which will save 4-16GB of disk space, allowing you to save on Kaggle/Colab.
Unsloth: Will remove a cached repo with size 1.8G


Unsloth: Merging 4bit and LoRA weights to 16bit...
Unsloth: Will use up to 5.35 out of 12.67 RAM for saving.
Unsloth: Saving model... This might take 5 minutes ...


100%|██████████| 28/28 [00:00<00:00, 41.10it/s]


Unsloth: Saving tokenizer... Done.
Unsloth: Saving DeepSeek-R1-EcoWaste/pytorch_model.bin...
Done.
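As a sanity check on the merged checkpoint, its size can be estimated from the parameter count, since 16-bit weights take 2 bytes each. The sketch assumes the architecture values from the base model's published `config.json` (vocabulary 151,936, hidden size 1536, MLP size 8960, 28 layers, biases on the q/k/v projections only, and separate input and output embeddings); none of these are stated in the notebook.

```python
vocab, hidden, mlp, layers = 151_936, 1536, 8960, 28   # assumed from config.json
kv_dim = 2 * 128                                       # assumed: 2 KV heads x 128

attention = (
    hidden * hidden + hidden              # q_proj weight + bias
    + 2 * (hidden * kv_dim + kv_dim)      # k_proj and v_proj, weight + bias each
    + hidden * hidden                     # o_proj, no bias
)
feed_forward = 3 * hidden * mlp           # gate_proj, up_proj, down_proj
norms = 2 * hidden                        # two RMSNorm weight vectors per layer
per_layer = attention + feed_forward + norms

embeddings = 2 * vocab * hidden           # input embedding + untied output head
total_params = layers * per_layer + embeddings + hidden   # + final norm
size_gb = total_params * 2 / 1e9          # 2 bytes per 16-bit weight

print(f"{total_params:,} parameters, about {size_gb:.2f} GB")
```

Under these assumptions the estimate is close to the `pytorch_model.bin` size reported during the upload step.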


In [28]:
# Step9. Upload to the Hugging Face Hub
model.push_to_hub_merged("liangxi2004/DeepSeek-R1-EcoWaste",
                        tokenizer,
                        save_method="merged_16bit")

Unsloth: You are pushing to hub, but you passed your HF username = liangxi2004.
We shall truncate liangxi2004/DeepSeek-R1-EcoWaste to DeepSeek-R1-EcoWaste


Unsloth: Merging 4bit and LoRA weights to 16bit...
Unsloth: Will use up to 5.29 out of 12.67 RAM for saving.
Unsloth: Saving model... This might take 5 minutes ...


100%|██████████| 28/28 [00:00<00:00, 53.09it/s]


Unsloth: Saving tokenizer...

tokenizer.json:   0%|          | 0.00/11.4M [00:00<?, ?B/s]

 Done.
Unsloth: Saving DeepSeek-R1-EcoWaste/pytorch_model.bin...


README.md:   0%|          | 0.00/632 [00:00<?, ?B/s]

pytorch_model.bin:   0%|          | 0.00/3.55G [00:00<?, ?B/s]

Done.
Saved merged model to https://huggingface.co/liangxi2004/DeepSeek-R1-EcoWaste
