Thanks for releasing the code! I noticed that the reported number of parameters for the LoRA module is 0.3M for roberta-base. After experimenting, I found that about 0.5M additional parameters are tunable in the sequence classification head (this is the same for all baselines, so I'm not questioning fairness). Am I correct about this setting? Were the numbers reported in the paper obtained while also tuning a classification head on the classification tasks?
ShengdingHu changed the title from "Did the number of parameter take into account the parameters in the tunable language model head?" to "Did the number of parameters take into account the parameters in the tunable classification head?" on Dec 30, 2021.
You are right. We didn't include the classification head when counting parameters, for all baselines, even though it was trained (otherwise its weights would remain random). We'll add a note to our manuscript stating this explicitly.
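For anyone double-checking the numbers above, here is a back-of-the-envelope sketch of where the ~0.3M LoRA figure comes from and why the classification head adds roughly half a million more. The configuration is an assumption for illustration: rank-8 LoRA applied only to the query and value projections of roberta-base (hidden size 768, 12 layers), and a RoBERTa-style head with a 768→768 dense layer feeding a projection to two labels.

```python
# Assumed configuration for illustration (not taken from the thread):
HIDDEN = 768      # roberta-base hidden size
LAYERS = 12       # roberta-base transformer layers
RANK = 8          # assumed LoRA rank
NUM_LABELS = 2    # assumed binary classification task

# Each adapted d x d weight gains two low-rank factors, A (r x d) and B (d x r).
lora_per_weight = 2 * RANK * HIDDEN
# Adapting the query and value projections in every layer:
lora_total = LAYERS * 2 * lora_per_weight
print(f"LoRA params: {lora_total:,}")  # 294,912 ~= 0.3M

# RoBERTa-style classification head: dense 768->768 plus an output
# projection 768->num_labels, counting weights and biases.
head_total = (HIDDEN * HIDDEN + HIDDEN) + (HIDDEN * NUM_LABELS + NUM_LABELS)
print(f"Classification head params: {head_total:,}")  # 592,130 ~= 0.6M
```

Under these assumptions the LoRA count lands exactly on the reported 0.3M, and the head adds roughly 0.6M, in the same ballpark as the ~0.5M measured in the original comment; the exact head size depends on the number of labels and the head architecture.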
P.S. My apologies for the delayed response. I left Microsoft late last year and stopped receiving notifications.