Thanks for releasing the code! I noticed that the reported number of parameters for the LoRA module is 0.3M for roberta-base. After experimenting, I found that about 0.5M additional parameters are tunable in the sequence classification head (this is the same for all baselines, so I'm not questioning fairness). Am I correct about this setting? Were the numbers reported in the paper obtained while also tuning a classification head on the classification tasks?
ShengdingHu changed the title from "Did the number of parameter take into account the parameters in the tunable language model head?" to "Did the number of parameters take into account the parameters in the tunable classification head?" on Dec 30, 2021.
You are right. We didn't include the classification head when counting parameters, for all baselines, even though it was trained (otherwise its weights would remain random). We'll add a note to our manuscript stating this explicitly.
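For anyone double-checking the numbers above, here is a back-of-the-envelope sketch of where the ~0.3M LoRA figure comes from and why the classification head adds roughly half a million more. The configuration is an assumption for illustration: rank-8 LoRA applied only to the query and value projections of roberta-base (hidden size 768, 12 layers), and a RoBERTa-style head with a 768→768 dense layer feeding a projection to two labels.

```python
# Assumed configuration for illustration (not taken from the thread):
HIDDEN = 768      # roberta-base hidden size
LAYERS = 12       # roberta-base transformer layers
RANK = 8          # assumed LoRA rank
NUM_LABELS = 2    # assumed binary classification task

# Each adapted d x d weight gains two low-rank factors, A (r x d) and B (d x r).
lora_per_weight = 2 * RANK * HIDDEN
# Adapting the query and value projections in every layer:
lora_total = LAYERS * 2 * lora_per_weight
print(f"LoRA params: {lora_total:,}")  # 294,912 ~= 0.3M

# RoBERTa-style classification head: dense 768->768 plus an output
# projection 768->num_labels, counting weights and biases.
head_total = (HIDDEN * HIDDEN + HIDDEN) + (HIDDEN * NUM_LABELS + NUM_LABELS)
print(f"Classification head params: {head_total:,}")  # 592,130 ~= 0.6M
```

Under these assumptions the LoRA count lands exactly on the reported 0.3M, and the head adds roughly 0.6M, in the same ballpark as the ~0.5M measured in the original comment; the exact head size depends on the number of labels and the head architecture.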
P.S. My apologies for the delayed response. I left Microsoft late last year and stopped receiving notifications.