How to prune the model from the very beginning? #28
Comments
Hi, thanks for checking out our repo! I think T5 would be a bit tricky as it's a sequence-to-sequence model. So you would need to add masks to components on both the source and target side of the model, but the logic to add the masks to both sides is the same. I suggest you check out |
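To make the "same logic on both sides" point concrete, here is a toy sketch (not CoFi's actual API; the function name and gating-by-residual-scaling shown here are illustrative assumptions): each encoder and each decoder layer gets its own mask z, and driving a z to zero effectively prunes that layer, identically on the source and target side.

```python
# Toy illustration, NOT CoFi's real implementation: every encoder and
# decoder layer output is scaled by its own gate z. The masking logic
# is the same on the encoder (source) and decoder (target) side.
def masked_seq2seq_forward(x, enc_layers, dec_layers, enc_z, dec_z):
    h = x
    for layer, z in zip(enc_layers, enc_z):
        h = h + z * layer(h)  # residual branch scaled by the layer mask
    y = h
    for layer, z in zip(dec_layers, dec_z):
        y = y + z * layer(y)  # identical gating pattern on the decoder
    return y
```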
Hi @xiamengzhou thanks for your kind reply. In your code, do you use |
Sorry to bother you again, but may I ask how you prune the model during training? I see that in this line you obtain the https://github.com/princeton-nlp/CoFiPruning/blob/main/trainer/trainer.py#L282 But then how do you use the https://github.com/princeton-nlp/CoFiPruning/blob/main/utils/cofi_utils.py#L107 Many thanks!!! @xiamengzhou |
Hi, sorry for the late reply! For the first question: I pruned the heads and layers at the same time, with a set of masks. For the second question: in each training forward pass, we get the sampled masks zs from the l0_module and pass them to the model to get the loss, then backpropagate the loss to update both the model parameters and the parameters of the l0_module. You can use |
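The sampling step described above can be sketched as follows. This is a hedged toy, not CoFi's code: the hard-concrete parameters (beta, gamma, zeta) are common defaults from the L0-regularization literature rather than CoFi's exact hyperparameters, and the l0_module is reduced here to a dict of per-unit log_alpha values.

```python
import math
import random

def _sigmoid(x):
    # Numerically stable logistic function (avoids exp overflow).
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def sample_hard_concrete(log_alpha, beta=0.83, gamma=-0.1, zeta=1.1):
    """Sample one gate z in [0, 1] via the hard-concrete reparameterization
    used for L0 regularization (defaults are common literature values,
    not necessarily CoFi's settings)."""
    u = min(max(random.random(), 1e-8), 1.0 - 1e-8)  # avoid log(0)
    s = _sigmoid((math.log(u) - math.log(1.0 - u) + log_alpha) / beta)
    s_bar = s * (zeta - gamma) + gamma   # stretch to (gamma, zeta)
    return min(1.0, max(0.0, s_bar))     # hard clip into [0, 1]

# In each forward pass, the trainer would sample zs like this and hand
# them to the model (e.g. model(**inputs, zs=zs)); after loss.backward(),
# the optimizers for the model and for the l0_module each take a step.
# The per-head log_alpha values below are hypothetical:
log_alphas = {"head_0": 4.0, "head_1": -4.0}  # head_1 is being pruned away
zs = {name: sample_hard_concrete(a) for name, a in log_alphas.items()}
```

A gate sampled exactly at 0 or 1 is what makes structured pruning possible at inference time: units whose gates collapse to 0 can be removed from the weights outright.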
Hi @xiamengzhou, thanks for your contribution. In your code you use
Model.from_pretrained
to load the model architecture from the files you have already provided. But if I want to prune my own, original model, for instance a T5 model, using the method from your paper, which code should I check? Many thanks :)