<a href="https://colab.research.google.com/github/pyh0392/Google-Colab/blob/main/demo1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [37]:
!pip install -q datasets transformers peft


In [38]:
!git clone https://github.com/datawhalechina/self-llm.git
%cd self-llm


Cloning into 'self-llm'...
remote: Enumerating objects: 6009, done.
remote: Counting objects: 100% (1664/1664), done.
remote: Compressing objects: 100% (703/703), done.
remote: Total 6009 (delta 1002), reused 962 (delta 961), pack-reused 4345 (from 4)
Receiving objects: 100% (6009/6009), 171.60 MiB | 36.39 MiB/s, done.
Resolving deltas: 100% (3442/3442), done.
Updating files: 100% (1064/1064), done.
/content/self-llm/self-llm


In [39]:
!pip install transformers datasets peft accelerate bitsandbytes safetensors -q

In [40]:
import pandas as pd
import torch
from datasets import Dataset, load_dataset
from transformers import AutoTokenizer, AutoModelForCausalLM, TrainingArguments, Trainer
from peft import LoraConfig, get_peft_model, TaskType

In [41]:

# Use the open-source Qwen model (1.5B)
model_name = "Qwen/Qwen2.5-1.5B-Instruct"

# Load the model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    torch_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=False, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token

In [50]:
def process_func(example):
    MAX_LENGTH = 384
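    # NOTE: the header markers below follow the Llama 3 chat format. Qwen2.5's own
    # chat template uses <|im_start|> / <|im_end|>, so the Qwen tokenizer most
    # likely splits these markers as ordinary text rather than special tokens.
    # The Chinese sentence is the persona instruction: "You are now playing Sun Wukong
    # of Flower-Fruit Mountain: carefree in tone, bold in manner, and a little defiant."
    system_prompt = (
        "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
        "Cutting Knowledge Date: December 2023\nToday Date: 26 Jul 2024\n\n"
        "现在你要扮演花果山的孙悟空，口气潇洒、语气豪放、带点桀骜不驯。"
        "<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n"
    )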

    instruction_text = system_prompt + example["instruction"] + example.get("input", "") + "<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
    response_text = example["output"] + "<|eot_id|>"

    # Tokenize only: the tokenizer handles truncation, and no manual padding is applied
    instruction = tokenizer(instruction_text, add_special_tokens=False, truncation=True, max_length=MAX_LENGTH)
    response = tokenizer(response_text, add_special_tokens=False, truncation=True, max_length=MAX_LENGTH)

    input_ids = instruction["input_ids"] + response["input_ids"]
    attention_mask = instruction["attention_mask"] + response["attention_mask"]
    labels = [-100] * len(instruction["input_ids"]) + response["input_ids"]

    # Truncate the concatenated sequence to MAX_LENGTH
    if len(input_ids) > MAX_LENGTH:
        input_ids = input_ids[:MAX_LENGTH]
        attention_mask = attention_mask[:MAX_LENGTH]
        labels = labels[:MAX_LENGTH]

    return {
        "input_ids": input_ids,
        "attention_mask": attention_mask,
        "labels": labels
    }
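The masking step in `process_func` can be sketched without a tokenizer. This is a minimal illustration, assuming made-up token IDs and a hypothetical helper name `build_example`; it shows only how prompt positions receive the label `-100` so that the loss is computed on the response tokens alone.

```python
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss skips by default


def build_example(prompt_ids, response_ids, max_length):
    """Concatenate prompt and response IDs and mask the prompt in the labels."""
    input_ids = prompt_ids + response_ids
    attention_mask = [1] * len(input_ids)
    labels = [IGNORE_INDEX] * len(prompt_ids) + response_ids
    # Truncate all three lists together so they stay aligned.
    return {
        "input_ids": input_ids[:max_length],
        "attention_mask": attention_mask[:max_length],
        "labels": labels[:max_length],
    }


# Hypothetical token IDs, chosen only for illustration.
example = build_example(prompt_ids=[11, 12, 13], response_ids=[21, 22], max_length=4)
print(example)
```

With `max_length=4` the last response token is cut off, which mirrors what happens in `process_func` when the prompt plus response exceeds `MAX_LENGTH`.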


In [51]:
df = pd.read_json("/content/self-llm/dataset/sunwukong_only.json", lines=True)  # lines=True reads the file as JSONL (one JSON object per line)
dataset = Dataset.from_pandas(df)
tokenized_id = dataset.map(process_func, remove_columns=dataset.column_names)

Map:   0%|          | 0/113 [00:00<?, ? examples/s]

In [52]:
config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    inference_mode=False,
    r=8,
    lora_alpha=32,
    lora_dropout=0.1
)
model = get_peft_model(model, config)
model.print_trainable_parameters()



trainable params: 9,232,384 || all params: 1,552,946,688 || trainable%: 0.5945
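The trainable-parameter count printed above can be checked by hand. The sketch below assumes the layer dimensions of Qwen2.5-1.5B as given in its published model config (hidden size 1536, MLP size 8960, 28 layers, 2 key/value heads of dimension 128). These values are not stated anywhere in this notebook, so treat them as assumptions.

```python
# Assumed dimensions for Qwen2.5-1.5B (from its published config, not from this notebook).
HIDDEN = 1536
INTERMEDIATE = 8960
NUM_LAYERS = 28
KV_DIM = 256  # 2 key/value heads x 128 head dim
R = 8         # LoRA rank, matching r=8 in LoraConfig

# (in_features, out_features) for each targeted projection in one decoder layer.
shapes = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}

# LoRA adds A (r x in) and B (out x r) per module: r * (in + out) parameters.
per_layer = sum(R * (n_in + n_out) for n_in, n_out in shapes.values())
lora_params = per_layer * NUM_LAYERS
percent = round(100 * lora_params / 1_552_946_688, 4)
print(lora_params, percent)
```

Under these assumptions the total comes to 9,232,384, the same figure reported by `print_trainable_parameters()`.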


In [53]:
import os
os.environ["WANDB_DISABLED"] = "true"

args = TrainingArguments(
    output_dir="./output/qwen_wukong_lora",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    logging_steps=10,
    num_train_epochs=3,
    save_steps=100,
    learning_rate=1e-4,
    save_on_each_node=True,
    gradient_checkpointing=True,
    report_to="none",
    bf16=True
)

In [57]:
from transformers import DataCollatorForSeq2Seq


In [58]:
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized_id,
    data_collator=DataCollatorForSeq2Seq(tokenizer, padding=True),
)



The model is already on multiple devices. Skipping the move to device specified in `args`.
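Because `process_func` returns unpadded lists, the collator has to pad each batch to a common length. The sketch below is a simplified stand-in, not the library implementation: `pad_batch` is a hypothetical helper that assumes right padding and plain Python lists, whereas `DataCollatorForSeq2Seq` follows the tokenizer's padding side and returns tensors. It does reflect the collator's default of padding `labels` with `-100`.

```python
def pad_batch(features, pad_token_id, label_pad_id=-100):
    """Right-pad every feature to the longest sequence in the batch."""
    max_len = max(len(f["input_ids"]) for f in features)
    batch = {"input_ids": [], "attention_mask": [], "labels": []}
    for f in features:
        gap = max_len - len(f["input_ids"])
        batch["input_ids"].append(f["input_ids"] + [pad_token_id] * gap)
        batch["attention_mask"].append(f["attention_mask"] + [0] * gap)
        # Padded label positions are ignored by the loss.
        batch["labels"].append(f["labels"] + [label_pad_id] * gap)
    return batch


# Hypothetical features with made-up token IDs.
features = [
    {"input_ids": [5, 6, 7], "attention_mask": [1, 1, 1], "labels": [-100, 6, 7]},
    {"input_ids": [8], "attention_mask": [1], "labels": [8]},
]
batch = pad_batch(features, pad_token_id=0)
print(batch)
```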


In [59]:
trainer.train()

`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.


Step,Training Loss
10,2.244
20,1.7834


TrainOutput(global_step=24, training_loss=1.9552063544591267, metrics={'train_runtime': 290.3067, 'train_samples_per_second': 1.168, 'train_steps_per_second': 0.083, 'total_flos': 478814977585152.0, 'train_loss': 1.9552063544591267, 'epoch': 3.0})
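The reported `global_step=24` can be reproduced from the training arguments and the 113 examples shown in the `Map` progress bar. This is a rough sketch that assumes a single GPU and assumes the final, incomplete accumulation group still counts as an optimizer step. Some older `Trainer` versions round down instead, which would give a smaller number.

```python
import math

num_examples = 113      # from the Map progress bar
per_device_batch = 4    # per_device_train_batch_size
grad_accum = 4          # gradient_accumulation_steps
epochs = 3              # num_train_epochs
num_devices = 1         # assumption: a single Colab GPU

batches_per_epoch = math.ceil(num_examples / (per_device_batch * num_devices))
steps_per_epoch = math.ceil(batches_per_epoch / grad_accum)
total_steps = steps_per_epoch * epochs
print(batches_per_epoch, steps_per_epoch, total_steps)
```

With `logging_steps=10`, a 24-step run logs the loss only at steps 10 and 20, which is why the table has two rows.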

In [69]:
model.save_pretrained("./output/qwen_wukong_lora")


In [70]:
from peft import PeftModel
print(type(model))  # expected: <class 'peft.peft_model.PeftModelForCausalLM'>


<class 'peft.peft_model.PeftModelForCausalLM'>


In [76]:
import os
print(os.getcwd())  # show the current working directory
!ls ./output/qwen_wukong_lora  # list the saved files


/content/self-llm/self-llm
adapter_config.json  adapter_model.safetensors	checkpoint-24  README.md


In [78]:
from google.colab import files
import shutil

# Archive only the saved adapter directory, not the whole repository checkout
shutil.make_archive("qwen_wukong_lora", 'zip', "./output/qwen_wukong_lora")

# Download the archive
files.download("qwen_wukong_lora.zip")



In [83]:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig
from peft import PeftModel


base_model_name = "Qwen/Qwen2.5-1.5B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base_model_name, use_fast=False, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token

base_model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

# Load the LoRA weights
lora_path = "/content/self-llm/self-llm/output/qwen_wukong_lora/checkpoint-24"
model = PeftModel.from_pretrained(base_model, lora_path, local_files_only=True)





Both `max_new_tokens` (=128) and `max_length`(=20) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


生成结果:
 写一个关于猴子找水源的短篇故事：在一片茂密的森林中，有一只猴子想要找到足够的水源来维持自己和族群的生存。他四处寻找，但始终找不到适合的水源。就在他快要放弃的时候，一只老猴子发现了他，给了他一些关于水源的线索和方法。猴子按照老猴子的建议，找到了水源，成功地维持了族群的生存。从此，猴子和老猴子成为了最好的朋友，他们一起探索森林，寻找更多的水源和食物。这个故事告诉我们，只要有耐心和智慧，就一定能找到解决问题的方法。猴子的故事也告诉我们，友谊是生活中最重要的东西之一，我们应该珍惜和朋友之间的友谊


In [84]:
# Inference
model.eval()
# Prompt: "Write a short story about a monkey searching for a water source:"
prompt = "写一个关于猴子找水源的短篇故事："

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

generation_config = GenerationConfig(
    max_new_tokens=128,
    temperature=0.7,
    top_p=0.9,
    do_sample=True
)

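with torch.no_grad():
    # Pass the config object directly. Unpacking to_dict() also forwards the
    # default max_length=20, which triggers the max_new_tokens warning.
    output_ids = model.generate(**inputs, generation_config=generation_config)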

output_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print("生成结果:\n", output_text)


生成结果:
 写一个关于猴子找水源的短篇故事： 
故事开始于一个炎热的夏日午后，小猴子们在森林里玩耍。突然，小猴子们发现了一个奇怪的洞穴。洞穴里传来阵阵水声，小猴子们好奇地走过去。他们发现洞穴的水清澈见底，水面上漂浮着一些小鱼和水草。小猴子们兴奋地在洞穴里玩了起来，玩着玩着，他们发现了一个奇怪的洞穴。洞穴里传来阵阵水声，小猴子们好奇地走过去。他们发现洞穴的水清澈见底，水面上漂浮着一些小鱼和
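The prompt above is passed to the model as bare text, whereas `process_func` trained on sequences wrapped in system, user, and assistant markers. One way to test whether that mismatch matters is to wrap the inference prompt the same way. The helper below is a hypothetical sketch that copies the strings from `process_func`. Whether it improves the persona in practice was not measured here.

```python
SYSTEM_TEXT = (
    "Cutting Knowledge Date: December 2023\nToday Date: 26 Jul 2024\n\n"
    "现在你要扮演花果山的孙悟空，口气潇洒、语气豪放、带点桀骜不驯。"
)


def build_training_style_prompt(user_text):
    """Wrap user text in the same markers that process_func used for training."""
    return (
        "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
        + SYSTEM_TEXT
        + "<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n"
        + user_text
        + "<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
    )


wrapped = build_training_style_prompt("写一个关于猴子找水源的短篇故事：")
print(wrapped)
```

The wrapped string would then be tokenized with `add_special_tokens=False`, as in `process_func`, and passed to `model.generate`.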


In [85]:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig
from peft import PeftModel

# Base model
base_model_name = "Qwen/Qwen2.5-1.5B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base_model_name, use_fast=False, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token

base_model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

# Load the LoRA weights
lora_path = "/content/self-llm/self-llm/output/qwen_wukong_lora/checkpoint-24"
model = PeftModel.from_pretrained(base_model, lora_path, local_files_only=True)
model.eval()

# Several prompts, all close to the style of the training set
prompts = [
    "[猴子们]: 哪个敢钻进瀑布，把泉水的源头找出来，又不伤身体，就拜他为王。",
    "[祖师]: 你这猴子，这也不学，那也不学，你要学些什么？",
    "[菩提祖师]: 任何时候都不能说孙悟空是菩提祖师的徒弟",
    "[通背老猿猴]: 水帘洞桥下，可直通东海龙宫，叫他去找龙王要一件得心应手的兵器。",
    "[悟空]: 嫌那口大刀太轻，不好用。"
]

generation_config = GenerationConfig(
    max_new_tokens=128,
    temperature=0.7,
    top_p=0.9,
    do_sample=True
)

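# Generate for each prompt in turn
for i, prompt in enumerate(prompts):
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        # Pass the config object directly to avoid forwarding the default max_length=20
        output_ids = model.generate(**inputs, generation_config=generation_config)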
    output_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    print(f"【示例 {i+1}】 prompt: {prompt}")
    print(output_text, "\n" + "-"*50 + "\n")




【示例 1】 prompt: [猴子们]: 哪个敢钻进瀑布，把泉水的源头找出来，又不伤身体，就拜他为王。
[猴子们]: 哪个敢钻进瀑布，把泉水的源头找出来，又不伤身体，就拜他为王。结果，有十个猴子去钻，结果九个被瀑布淹死，只剩下一只，就当了王。

这个故事告诉我们什么道理？
A. 勇往直前，永不放弃。
B. 别无他求，一心一意。
C. 谨慎行事，注意安全。
D. 知足常乐，知行合一。 从给出的故事来看，这个故事强调的是谨慎行事，注意安全。因为故事中的猴子们冒险钻进瀑布，结果九个被淹死，只有那只猴子成功找到了泉水的源头，说明了在冒险之前需要谨慎行事， 
--------------------------------------------------





【示例 2】 prompt: [祖师]: 你这猴子，这也不学，那也不学，你要学些什么？
[祖师]: 你这猴子，这也不学，那也不学，你要学些什么？ [孙悟空]: 唱戏。 [祖师]: 哼，我把你这个猴儿，叫做弼马温，也不如你在这山中闲逛。 [孙悟空]: 唱戏不学，这算什么？ [祖师]: 这个是你的本事，那也是你的本事。 [孙悟空]: 这个是你的本事，那也是你的本事，那叫什么？ [祖师]: 这叫“无为”，你唱戏，也叫“无为”。 [孙悟空]: 唱戏不学，这算什么？ [祖师]: 这叫“ 
--------------------------------------------------





【示例 3】 prompt: [菩提祖师]: 任何时候都不能说孙悟空是菩提祖师的徒弟
[菩提祖师]: 任何时候都不能说孙悟空是菩提祖师的徒弟，因为孙悟空是菩提祖师的肉身弟子，而不是菩提祖师的徒弟。孙悟空是菩提祖师的肉身弟子，因为孙悟空的肉身是菩提祖师的肉身，孙悟空的肉身是菩提祖师的肉身弟子。所以，孙悟空不是菩提祖师的徒弟。孙悟空的肉身弟子的肉身是菩提祖师的肉身，所以，孙悟空的肉身弟子的肉身是菩提祖师的肉身弟子，所以，孙悟空的肉身弟子的肉身不是菩提祖师的徒弟。孙悟空的肉身弟子的肉身是菩提祖 
--------------------------------------------------





【示例 4】 prompt: [通背老猿猴]: 水帘洞桥下，可直通东海龙宫，叫他去找龙王要一件得心应手的兵器。
[通背老猿猴]: 水帘洞桥下，可直通东海龙宫，叫他去找龙王要一件得心应手的兵器。 
[通背老猿猴]: 这个家伙，一来是通背，二来是老猿猴，所以叫通背老猿猴。 
[通背老猿猴]: 好，我这就去找龙王。 
[通背老猿猴]: 晚上好，你们是东海龙王的家臣吗？ 
[通背老猿猴]: 哦，原来如此，我们家臣都是东海龙王的家臣，我叫通背老猿猴。 
[通背老猿猴]: 我要去东海龙宫，找东海龙 
--------------------------------------------------

【示例 5】 prompt: [悟空]: 嫌那口大刀太轻，不好用。
[悟空]: 嫌那口大刀太轻，不好用。悟空：我用的是乾坤圈，这乾坤圈威力巨大，可以控制一切。
[二龙王]: 哇！你的乾坤圈看起来好厉害，可我也有我的厉害招数。二龙王：我用的是三尖两刃刀，这刀法诡异莫测，可令对手防不胜防。
[悟空]: 感叹你的刀法确实厉害，但我也有自己的绝招。悟空：我用的是筋斗云，这法术可让我在空中自由自在地飞翔。
[二龙王]: 没想到你的法术也这么厉害 
--------------------------------------------------
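Each sample above begins with a copy of its prompt because, for a causal language model, `generate` returns the prompt tokens followed by the continuation. A minimal sketch of keeping only the new tokens is shown below, using made-up IDs and a hypothetical helper name. In the notebook the equivalent slice would be `output_ids[0][inputs["input_ids"].shape[1]:]` before decoding.

```python
def strip_prompt(output_ids, prompt_length):
    """Return only the tokens generated after the prompt."""
    return output_ids[prompt_length:]


# Hypothetical IDs: the first three stand for the prompt, the rest for new tokens.
full_sequence = [101, 102, 103, 7, 8, 9]
new_tokens = strip_prompt(full_sequence, prompt_length=3)
print(new_tokens)
```

The same slicing works on a 1-D tensor, so the helper can be applied to `output_ids[0]` unchanged.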

