layers difference help #9

Closed
zhujiem opened this issue Nov 20, 2022 · 5 comments
zhujiem commented Nov 20, 2022

Hi, I am very interested in your work and want to try it in my own applications. I have figured out the usage of "monarch_linear.py", but I am still confused about the other layers with similar names.

Could you please briefly introduce them to help me better understand your code? Thanks in advance.


zhujiem commented Nov 20, 2022

I found that monarch_linear is a pure PyTorch implementation, while some of the others rely on a Hugging Face or Triton backend. Which one is faster?


tridao commented Nov 23, 2022

These are just various kinds of weight matrices that we've tried / played with over several projects.

  • monarch_linear.py implements Monarch matrices as described in the Monarch paper, with 2 block-diagonal matrices and 2 permutations (see the sketch after this list).
  • blockdiag_linear.py implements a linear layer whose weight matrix is just a block-diagonal matrix. This is simpler than Monarch.
  • blocksparse_linear.py implements a linear layer whose weight matrix is block-sparse. This is more general than blockdiag_linear and requires a fast block-sparse multiply from either Hugging Face or Triton.
  • fastlinear.py: a bunch of different things we tried at some point. Experimental and should not be used.
  • structured_linear.py: this is the base class, it takes care of some common steps (converting from sparse to dense, etc.).
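
For reference, here is a minimal pure-PyTorch sketch of the Monarch structure described above (two block-diagonal factors separated by a permutation), assuming the square case n = m * m. The class name and initialization are made up for illustration; the actual monarch_linear.py also handles rectangular shapes, bias, and careful initialization, so the details differ.

```python
import torch
import torch.nn as nn


class SimpleMonarchLinear(nn.Module):
    """Illustrative Monarch-style linear layer for the square case n = m * m.

    Sketch only: the real monarch_linear.py supports rectangular shapes,
    bias, and proper initialization.
    """

    def __init__(self, n: int):
        super().__init__()
        m = int(n ** 0.5)
        assert m * m == n, "this sketch assumes n is a perfect square"
        self.m = m
        # Two sets of m blocks, each block of size m x m.
        self.blkdiag1 = nn.Parameter(torch.randn(m, m, m) / m ** 0.5)
        self.blkdiag2 = nn.Parameter(torch.randn(m, m, m) / m ** 0.5)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, m = x.shape[0], self.m
        x = x.reshape(b, m, m)                              # split input into m chunks of size m
        x = torch.einsum('bki,koi->bko', x, self.blkdiag1)  # first block-diagonal multiply
        x = x.transpose(1, 2)                               # permutation between the two factors
        x = torch.einsum('bki,koi->bko', x, self.blkdiag2)  # second block-diagonal multiply
        return x.transpose(1, 2).reshape(b, m * m)          # undo permutation and flatten


if __name__ == "__main__":
    layer = SimpleMonarchLinear(n=64)
    print(layer(torch.randn(8, 64)).shape)  # torch.Size([8, 64])
```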


zhujiem commented Nov 23, 2022

Hi Tri, thank you very much for the introduction!
It is much clearer to me now, but it does not mention the Pixelated Butterfly (Pixelfly) linear layer; I had guessed that blocksparse_linear might be the one for Pixelfly. Could you also give me a quick link? I'd like to compare Monarch and Pixelfly.


tridao commented Nov 23, 2022

Pixelfly is blocksparse_linear.py with a specific sparsity pattern (FlatBlockButterflySparsityConfig).
You can check the config here to see an example.
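
For concreteness, here is an illustrative pure-PyTorch construction of a block-level mask in the spirit of the flat block butterfly pattern (diagonal blocks plus blocks whose indices differ in exactly one bit). This is only a sketch of the idea; the repo's FlatBlockButterflySparsityConfig may parameterize the pattern differently.

```python
import torch


def flat_block_butterfly_mask(nblocks: int) -> torch.Tensor:
    """Illustrative block-level sparsity mask in the spirit of Pixelfly's
    flat block butterfly pattern: keep block (i, j) when i == j or when
    i and j differ in exactly one bit (i XOR j is a power of two).

    Sketch only; the repo's FlatBlockButterflySparsityConfig may differ.
    """
    idx = torch.arange(nblocks)
    diff = idx[:, None] ^ idx[None, :]   # bitwise XOR of block indices
    keep = (diff & (diff - 1)) == 0      # True for 0 (diagonal) and powers of two
    return keep                          # (nblocks, nblocks) boolean mask


if __name__ == "__main__":
    mask = flat_block_butterfly_mask(8)
    print(mask.int())  # each block-row keeps log2(nblocks) + 1 of the nblocks blocks
```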


zhujiem commented Nov 23, 2022

Many thanks!

zhujiem closed this as completed Nov 23, 2022