p-tuning training issues #962
Comments
Problem solved.
Could you please share how you solved it?
How did you fix it? Please share.
1. On one run it suddenly worked; alternatively, switch the model to chatglm-6b-int4 (a loading sketch follows below).
Could it be that cpm_kernels is not installed?
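A minimal sketch of the workaround suggested in the comment above: load the already-quantized THUDM/chatglm-6b-int4 checkpoint so the runtime `model.quantize()` path (where the crash below occurs) is never taken. This follows the usual ChatGLM-6B loading pattern; the exact checkpoint path and device placement are assumptions about the poster's setup.

```python
# Sketch: use the pre-quantized INT4 checkpoint instead of quantizing at load time.
# "THUDM/chatglm-6b-int4" is the published INT4 variant; for p-tuning, point
# --model_name_or_path in train.sh at the same checkpoint (or a local copy).
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b-int4", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm-6b-int4", trust_remote_code=True).half().cuda()
```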
Is there an existing issue for this?
Current Behavior
1. Running train.sh directly fails with the following error:
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /home/jinchen/jc/nlp/llm/ChatGLM-6B/ptuning/main.py:430 in <module>                               │
│ │
│ 427 │
│ 428 │
│ 429 if __name__ == "__main__":                                                                    │
│ ❱ 430 │ main() │
│ 431 │
│ │
│ /home/jinchen/jc/nlp/llm/ChatGLM-6B/ptuning/main.py:128 in main │
│ │
│ 125 │ │
│ 126 │ if model_args.quantization_bit is not None: │
│ 127 │ │ print(f"Quantized to {model_args.quantization_bit} bit") │
│ ❱ 128 │ │ model = model.quantize(model_args.quantization_bit) │
│ 129 │ if model_args.pre_seq_len is not None: │
│ 130 │ │ # P-tuning v2 │
│ 131 │ │ model = model.half() │
│ │
│ /home/jinchen/.cache/huggingface/modules/transformers_modules/chatglm-6b/modeling_chatglm.py:1403 in quantize │
│ │
│ 1400 │ │ │
│ 1401 │ │ self.config.quantization_bit = bits │
│ 1402 │ │ │
│ ❱ 1403 │ │ self.transformer = quantize(self.transformer, bits, empty_init=empty_init, **kwa │
│ 1404 │ │ return self │
│ 1405 │
│ │
│ /home/jinchen/.cache/huggingface/modules/transformers_modules/chatglm-6b/quantization.py:157 in │
│ quantize │
│ │
│ 154 │ """Replace fp16 linear with quantized linear""" │
│ 155 │ │
│ 156 │ for layer in model.layers: │
│ ❱ 157 │ │ layer.attention.query_key_value = QuantizedLinear( │
│ 158 │ │ │ weight_bit_width=weight_bit_width, │
│ 159 │ │ │ weight_tensor=layer.attention.query_key_value.weight.to(torch.cuda.current_d │
│ 160 │ │ │ bias_tensor=layer.attention.query_key_value.bias, │
│ │
│ /home/jinchen/.cache/huggingface/modules/transformers_modules/chatglm-6b/quantization.py:137 in │
│ __init__                                                                                           │
│ │
│ 134 │ │ │ self.weight_scale = (weight_tensor.abs().max(dim=-1).values / ((2 ** (weight │
│ 135 │ │ │ self.weight = torch.round(weight_tensor / self.weight_scale[:, None]).to(tor │
│ 136 │ │ │ if weight_bit_width == 4: │
│ ❱ 137 │ │ │ │ self.weight = compress_int4_weight(self.weight) │
│ 138 │ │ │
│ 139 │ │ self.weight = Parameter(self.weight.to(kwargs["device"]), requires_grad=False) │
│ 140 │ │ self.weight_scale = Parameter(self.weight_scale.to(kwargs["device"]), requires_g │
│ │
│ /home/jinchen/.cache/huggingface/modules/transformers_modules/chatglm-6b/quantization.py:76 in │
│ compress_int4_weight │
│ │
│ 73 │ │ stream = torch.cuda.current_stream() │
│ 74 │ │ │
│ 75 │ │ gridDim = (n, 1, 1) │
│ ❱ 76 │ │ blockDim = (min(round_up(m, 32), 1024), 1, 1) │
│ 77 │ │ │
│ 78 │ │ kernels.int4WeightCompression( │
│ 79 │ │ │ gridDim, │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
NameError: name 'round_up' is not defined
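A likely explanation (an assumption consistent with the cpm_kernels comment above): in the chatglm-6b quantization.py, `round_up` is imported from cpm_kernels inside a try/except, so a missing or broken cpm_kernels install is swallowed at import time and only surfaces later as this NameError in `compress_int4_weight`. A quick check:

```python
# Assumption: the NameError comes from a cpm_kernels import failure that
# quantization.py silently catches. Verify the package is importable:
try:
    import cpm_kernels  # noqa: F401  (CUDA kernels used by ChatGLM's 4/8-bit quantization)
    print("cpm_kernels is importable")
except ImportError as exc:
    print("cpm_kernels is missing:", exc)  # install with: pip install cpm_kernels
```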
2. After removing the --quantization_bit 4 argument, training works and the fine-tuned model can answer questions from my private-domain data, but the output length limit has become smaller. Details below:
Question: Does buying and selling under a counterfeit of someone else's patent count as fraud?
Reference answer: Passing off someone else's patent for sale is an intellectual-property infringement and may involve deception and fraud. If both the buyer and the seller know the patent is counterfeit, the act may constitute deception; if the seller deliberately conceals the true status of the patent to swindle money out of the buyer, it may constitute fraud. The specifics must be judged and handled under the relevant laws and regulations. As a lawyer, I suggest you consult a professional lawyer promptly so that you can better protect your rights.
Actual output: Passing off someone else's patent for sale is an intellectual-property infringement and may involve deception and fraud. If both the buyer and the seller know the patent is counterfeit, the act may constitute deception; if the seller deliberately conceals the true status of the patent to swindle money out of the buyer, it may constitute fraud. The specifics must be judged and handled under the relevant laws and regulations. As a lawyer, I
I tested several examples and the behavior is the same: the output stops partway through.
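This kind of fixed-point truncation usually points at a generation/target length limit rather than a model-quality problem. A minimal sketch, assuming the training answers are longer than the configured target length: increase --max_target_length in train.sh so answers are not clipped during fine-tuning, and pass a larger max_length when generating. The 2048 below is an illustrative assumption, not a measured requirement.

```python
# Sketch: request a longer generation window at inference time.
# model/tokenizer are assumed to be the fine-tuned ChatGLM-6B objects already loaded.
response, history = model.chat(
    tokenizer,
    "假冒他人专利进行买卖算诈骗吗",
    history=[],
    max_length=2048,  # upper bound on prompt + generated tokens (assumed value)
)
print(response)
```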
Expected Behavior
No response
Steps To Reproduce
Please take a look at the two issues above, thanks.
Environment
Anything else?
No response