Calibration Dataset: how to avoid computing loss on instructions? #525

RanchiZhao · 2024-06-28T03:41:51Z

My question is about chat models. When calibrating with AWQ, I want the loss to be computed only on the responses, not on the instructions. How should I prepare the calibration dataset to achieve this? For example, how should I set the labels and attention masks?
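
To make the question concrete, here is a minimal sketch of response-only masking as it is commonly done in supervised fine-tuning: instruction positions in the labels are set to -100, the default `ignore_index` of `torch.nn.CrossEntropyLoss`, and the attention mask is 0 only on padding. The helper name `build_masked_example`, the token ids, and the `pad_id` value are hypothetical, chosen only for illustration. Whether the AWQ calibration step reads labels at all is not established in this thread, so this shows the masking itself, not a confirmed way to pass it to the quantizer.

```python
# -100 is the default ignore_index of torch.nn.CrossEntropyLoss,
# so positions carrying this label contribute nothing to the loss.
IGNORE_INDEX = -100


def build_masked_example(prompt_ids, response_ids, max_len=None, pad_id=0):
    """Hypothetical helper: concatenate prompt and response token ids and
    mask the prompt (instruction) positions out of the labels."""
    input_ids = list(prompt_ids) + list(response_ids)
    # Instruction tokens are ignored; response tokens keep their ids as targets.
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(response_ids)
    # The model still attends to the instruction, so its mask stays 1.
    attention_mask = [1] * len(input_ids)

    if max_len is not None:
        # Truncate, then right-pad to a fixed length.
        input_ids = input_ids[:max_len]
        labels = labels[:max_len]
        attention_mask = attention_mask[:max_len]
        pad = max_len - len(input_ids)
        input_ids += [pad_id] * pad        # pad_id is an assumption
        labels += [IGNORE_INDEX] * pad     # padding is never a target
        attention_mask += [0] * pad        # padding is never attended to

    return {
        "input_ids": input_ids,
        "attention_mask": attention_mask,
        "labels": labels,
    }


# Made-up token ids: three instruction tokens, three response tokens.
example = build_masked_example([101, 7, 8], [42, 43, 2], max_len=8)
```

Note that the attention mask and the labels play different roles here: the mask hides only padding, while the labels decide which positions are scored. Setting the attention mask to 0 on the instruction would stop the model from seeing it, which is not the same as excluding it from the loss.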
