Model- and data-dependent hyperparameters #25
Comments
Hi, this is a great question! Looking at the GPT-J hparams is a good starting point. Other soft constraints like weight decay and KL divergence should also be tuned. A good rule of thumb is to start with non-constraining values (e.g., no weight decay, no KL loss, high clamp factor) and make sure the maximum-DOF update works. Then increase the constraints to eliminate bleedover effects. The ROME notebook (…)
If you have any model-specific questions, I'd be happy to take a look when I get a moment. lmk!
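The start-loose-then-tighten strategy described above can be sketched roughly as follows. The hparam names are borrowed from ROME's hparams files (`v_weight_decay`, `kl_factor`, `clamp_norm_factor`), but the `run_edit` and `bleedover` callbacks and the constraint schedule are hypothetical placeholders, not part of ROME's actual API:

```python
# Sketch of the tuning strategy: start with non-constraining values,
# verify the edit works, then tighten constraints to cut bleedover.
# `run_edit` applies an edit with the given soft-constraint values and
# reports success; `bleedover` measures unwanted changes to unrelated
# predictions. Both are hypothetical stand-ins for your own evaluation.

def tune_soft_constraints(run_edit, bleedover, max_bleedover=0.1):
    # Step 1: non-constraining values (no weight decay, no KL loss,
    # a very high clamp factor); confirm the maximum-DOF update works.
    unconstrained = {"v_weight_decay": 0.0, "kl_factor": 0.0,
                     "clamp_norm_factor": 1e9}
    if not run_edit(unconstrained):
        raise RuntimeError("Unconstrained edit failed; fix the optimization first.")

    # Step 2: gradually tighten the constraints until bleedover is
    # acceptable. This schedule is illustrative, not a recommendation.
    for wd, kl, clamp in [(1e-4, 0.01, 10.0), (5e-4, 0.0625, 4.0),
                          (1e-3, 0.1, 2.0)]:
        trial = {"v_weight_decay": wd, "kl_factor": kl,
                 "clamp_norm_factor": clamp}
        if run_edit(trial) and bleedover(trial) <= max_bleedover:
            return trial

    # Fall back to the loosest setting known to produce a working edit.
    return unconstrained
```

In practice `run_edit` and `bleedover` would wrap your edit-success and specificity metrics on held-out prompts.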
Thank you very much.
Hi!
Thank you very much for making your implementation publicly available.
I want to use ROME on LMs and datasets other than those you tried in the paper. I was wondering which hyperparameters are model- or data-dependent, and whether you have an intuition or strategy for finding good values for them.
Thanks!