Hello! Thanks for your outstanding work!
I wanted to edit LLaMA2 with PMET, but the experiment failed. Can you share the settings for conducting batch editing on LLaMA2-7B?
By the way, when will the SWEAOS code be open source? I am very excited and thank you very much for your new SOTA approach.
Looking forward to your reply!
Thank you for your interest in our work. We used the following hyperparameter settings for LLaMA-2 editing with PMET:
{
  "layers": [4, 5, 6, 7, 8],
  "clamp_norm_factor": 1,
  "layer_selection": "all",
  "fact_token": "subject_last",
  "v_num_grad_steps": 20,
  "v_lr": 0.5,
  "v_loss_layer": 31,
  "v_weight_decay": 0.5,
  "kl_factor": 0.0625,
  "rewrite_module_tmp": "model.layers.{}.mlp.down_proj",
  "rewrite_module_tmps": ["model.layers.{}.mlp.down_proj"],
  "layer_module_tmp": "model.layers.{}",
  "mlp_module_tmp": "model.layers.{}.mlp.down_proj",
  "attn_module_tmp": "model.layers.{}.self_attn.o_proj",
  "ln_f_module": "model.norm",
  "lm_head_module": "lm_head",
  "mom2_adjustment": true,
  "mom2_update_weight": 6000,
  "mom2_dataset": "wikipedia",
  "mom2_n_samples": 100000,
  "mom2_dtype": "float32",
  "nll_loss_factor": 1
}
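A quick sanity check before running the edit is to expand the `{}` placeholders in the module templates for each edited layer. The following is only a minimal sketch, not part of the PMET code: it copies a subset of the settings above, and the helper name `expand_module_names` is hypothetical.

```python
import json

# Subset of the settings quoted above; the remaining keys are omitted for brevity.
HPARAMS_JSON = """
{
  "layers": [4, 5, 6, 7, 8],
  "rewrite_module_tmp": "model.layers.{}.mlp.down_proj",
  "attn_module_tmp": "model.layers.{}.self_attn.o_proj",
  "v_loss_layer": 31
}
"""


def expand_module_names(hparams, key):
    """Hypothetical helper: fill the layer index into a module-name template."""
    return [hparams[key].format(layer) for layer in hparams["layers"]]


hparams = json.loads(HPARAMS_JSON)
mlp_modules = expand_module_names(hparams, "rewrite_module_tmp")
attn_modules = expand_module_names(hparams, "attn_module_tmp")

for name in mlp_modules + attn_modules:
    print(name)
```

If the model is already loaded, the expanded names could be compared against the keys of `dict(model.named_modules())` to catch a template that does not match the LLaMA-2 module layout.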
Please note that the LLaMA-2 tokenizer differs slightly from GPT-J's: GPT-J requires a space before each word for correct encoding, whereas LLaMA-2 does not. If the editing code prepends a space to the target text, as is needed for GPT-J, that space should be dropped for LLaMA-2. Our SWEAOS code will be released soon.
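One way to handle the tokenizer difference is to decide per model family whether the target string gets a leading space. This is only a sketch under assumptions: the lookup table and the function `target_for_model` are hypothetical, and where the PMET code actually adds the space is not stated in this thread.

```python
# Hypothetical lookup: does the tokenizer expect a leading space before a word?
NEEDS_LEADING_SPACE = {
    "gpt-j": True,     # needs " Paris" for correct encoding
    "llama-2": False,  # no extra space needed
}


def target_for_model(target: str, model_family: str) -> str:
    """Return the target text with or without a leading space, per model family."""
    target = target.strip()
    if NEEDS_LEADING_SPACE[model_family]:
        return " " + target
    return target


print(repr(target_for_model("Paris", "gpt-j")))
print(repr(target_for_model(" Paris", "llama-2")))
```

Stripping the input first keeps the function idempotent, so it is safe to call on targets that were already prepared for GPT-J.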