method of composing reductions across layers #19

Closed
KTALS opened this issue Jan 15, 2024 · 2 comments
Labels
question Further information is requested

Comments

@KTALS

KTALS commented Jan 15, 2024

Hello! Thanks for the idea and the code. I am applying the code to my own model and have two questions:

  1. The paper says to "greedily search" over the parameters and to use a "simple compose strategy" when composing reductions across layers. Does this mean searching for the best rate in each of the later MLP layers separately and then simply composing them?
  2. Can I use a single command line to compose reductions across layers, or do I need to repeat the single-layer intervention several times?

Thank you!

@pratyushasharma
Owner

pratyushasharma commented Jan 17, 2024

Hello, thank you for your interest in our work!

  1. Whether a reduction at a given layer helps may depend on whether a reduction has been made in another layer. Therefore, we cannot simply search for the best rate at each layer individually, holding the other layers fixed, and then compose the results.

  2. We added a script that walks through our procedure for this.
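
To make the dependence in point 1 concrete, here is a toy illustration. The score table `SCORES`, the candidate list `RHOS`, and the two-layer setup are all made up for the example and are not results from the paper or the repository. It only shows how tuning each layer in isolation can pick a combination that is worse than one found by scoring the layers together.

```python
from itertools import product

# Hypothetical validation accuracies (percent) for a two-layer toy model,
# keyed by (rho for layer A, rho for layer B). These numbers are invented.
SCORES = {
    (1.0, 1.0): 50, (1.0, 0.1): 56,
    (0.1, 1.0): 55, (0.1, 0.1): 53,
}
RHOS = [1.0, 0.1]  # assumed convention: 1.0 means "no reduction"

# Independent search: tune each layer while the other is left unreduced.
best_a = max(RHOS, key=lambda r: SCORES[(r, 1.0)])
best_b = max(RHOS, key=lambda r: SCORES[(1.0, r)])
independent = (best_a, best_b)

# Joint search: score every combination directly.
joint = max(product(RHOS, RHOS), key=lambda pair: SCORES[pair])

print(independent, SCORES[independent])  # (0.1, 0.1) 53
print(joint, SCORES[joint])              # (1.0, 0.1) 56
```

In this toy table each reduction helps on its own, but applying both together scores lower than applying only one of them.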

We wanted to check whether the benefits of performing LASER across layers are additive, so we employed a simple strategy. The current strategy is as follows:
1. Initialize a vector that represents the amount of reduction at each layer.
2. Edit the model with LASER, starting from the final layer (restricted to a set of $\rho$ values).
3. Evaluate the edited model on the validation set of the dataset.
4. Use the validation signal to reduce or increase the amount of reduction.
5. Repeat until convergence.
6. Return the vector of reductions.
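
The steps above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the script in the repository: `compose_reductions`, `candidate_rhos`, `validate`, and `toy_validate` are hypothetical names, the value 1.0 is assumed to mean "no reduction", and the acceptance rule (keep a change only if the validation score strictly improves) is one simple way to read step 4.

```python
from typing import Callable, List, Sequence, Tuple


def compose_reductions(
    num_layers: int,
    candidate_rhos: Sequence[float],
    validate: Callable[[List[float]], float],
    max_passes: int = 10,
) -> Tuple[List[float], float]:
    """Greedy search over per-layer reductions (hypothetical sketch)."""
    # Step 1: one entry per layer; 1.0 is assumed to mean "no reduction".
    rhos = [1.0] * num_layers
    best_score = validate(rhos)

    for _ in range(max_passes):
        changed = False
        # Step 2: visit layers starting from the final one.
        for layer in reversed(range(num_layers)):
            for rho in candidate_rhos:
                if rho == rhos[layer]:
                    continue
                trial = list(rhos)
                trial[layer] = rho
                # Step 3: score the edited model on the validation set.
                score = validate(trial)
                # Step 4: keep the change only if validation improves.
                if score > best_score:
                    rhos, best_score, changed = trial, score, True
        # Step 5: stop once a full pass over the layers changes nothing.
        if not changed:
            break
    # Step 6: return the vector of reductions (and its validation score).
    return rhos, best_score


def toy_validate(rhos: List[float]) -> float:
    """Stand-in for 'apply the reductions, then measure validation accuracy'.

    The numbers are invented; a real version would edit and evaluate a model.
    """
    score = 50
    score += {1.0: 0, 0.5: 4, 0.1: 6}[rhos[2]]  # final layer
    score += {1.0: 0, 0.5: 3, 0.1: 5}[rhos[1]]  # middle layer
    if rhos[1] == 0.1 and rhos[2] == 0.1:
        score -= 8  # the two strongest reductions interfere with each other
    return score


best_rhos, best_score = compose_reductions(3, [1.0, 0.5, 0.1], toy_validate)
print(best_rhos, best_score)  # [1.0, 0.5, 0.1] 59
```

Because each candidate is scored with the reductions already accepted at other layers in place, the search settles on a milder reduction for the middle layer here instead of the value that looks best in isolation.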

The script that walks through this procedure is under the scripts directory.

One thing to note: this search procedure is probably not the most efficient. It performs only a sparse search over possible $\rho$ values, and only over the encoder MLP layers ('MLP_FC_IN'). A more thorough search might yield further improvements!

@pratyushasharma pratyushasharma added the question Further information is requested label Jan 17, 2024
@dkmisra dkmisra mentioned this issue Jan 19, 2024
@KTALS
Author

KTALS commented Jan 22, 2024

MANY THANKS!!!!

@KTALS KTALS closed this as completed Jan 22, 2024