[fx] Add common node in model linearize #1542

Merged


@Cypher30 Cypher30 commented Sep 5, 2022

What's New

Inspired by the common node idea in Merak, we modified the naive procedure that Merak's source code uses to define a common node. With this change we can handle the attention mask problem in the Hugging Face BERT model and separate each attention block and MLP in the encoder.
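
To illustrate why a common node helps, here is a minimal sketch on a toy graph. It is not the code in this PR: the `linearize` function below, the dict-based graph, and the node names (`ext_mask`, `attn1`, `mlp1`, ...) are all hypothetical and chosen only for illustration. The assumption is that a region is closed whenever every earlier tracked output has been consumed, and that nodes derived only from common nodes are treated as common themselves, so they are left out of the dependency count.

```python
def linearize(graph, inputs, common=()):
    """Split a topologically ordered toy graph into regions.

    graph:  dict mapping node name -> list of parent names (insertion order
            is assumed to be a valid topological order)
    inputs: names of model inputs (never tracked as dependencies)
    common: names of inputs to treat as common nodes (hypothetical API)
    """
    common = set(common)
    # Count how many nodes consume each node's output.
    users = {name: 0 for name in graph}
    for parents in graph.values():
        for p in parents:
            if p in users:
                users[p] += 1

    deps = {}            # outstanding consumers per tracked node
    regions, region = [], []
    for name, parents in graph.items():
        # Consuming a tracked parent frees one of its dependencies.
        for p in parents:
            if p not in inputs and p not in common:
                deps[p] -= 1
        region.append(name)

        # If nothing earlier is still waiting to be consumed, close the region.
        if not any(deps.values()):
            regions.append(region)
            region = []

        # A node fed only by common nodes becomes common; otherwise track it.
        if parents and all(p in common for p in parents):
            common.add(name)
        else:
            deps[name] = users[name]
    return regions


# Toy two-layer encoder: the preprocessed mask feeds every attention block.
toy = {
    "ext_mask": ["mask"],
    "attn1": ["x", "ext_mask"],
    "mlp1": ["attn1"],
    "attn2": ["mlp1", "ext_mask"],
    "mlp2": ["attn2"],
}
without_common = linearize(toy, inputs={"x", "mask"})
with_common = linearize(toy, inputs={"x", "mask"}, common={"mask"})
```

In this toy setting, `without_common` keeps `attn1`, `mlp1`, and `attn2` in one region because the mask output stays live across them, while `with_common` places each attention block and MLP in its own region.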

I also fixed a problem where our solver could count memory more than once for combinations such as "BN-ReLU", because the current linearize strategy generates more isolated in-place ReLU nodes in the linearized node list.
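
The double counting can be pictured with a small sketch. This is only an assumed model of the issue, not the solver's actual accounting: the node dicts, the `inplace` flag, and `region_activation_bytes` are hypothetical names introduced for illustration. The idea is that an in-place op reuses its input buffer, so it should not add a second activation when it lands in a region of its own.

```python
def region_activation_bytes(region, skip_inplace=True):
    """Sum output sizes in a region, optionally ignoring in-place nodes."""
    total = 0
    for node in region:
        if skip_inplace and node["inplace"]:
            # An in-place op overwrites its input, so no new buffer is counted.
            continue
        total += node["out_bytes"]
    return total


# Hypothetical BN followed by an in-place ReLU, split into separate regions.
bn = {"name": "bn", "out_bytes": 4096, "inplace": False}
relu = {"name": "relu", "out_bytes": 4096, "inplace": True}
regions = [[bn], [relu]]

naive_total = sum(region_activation_bytes(r, skip_inplace=False) for r in regions)
fixed_total = sum(region_activation_bytes(r) for r in regions)
```

Here `naive_total` counts the shared buffer twice, while `fixed_total` counts it once.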

…her30/ColossalAI into feature/add_common_node_to_linearize
@super-dainiu super-dainiu merged commit 46c6cc7 into hpcaitech:main Sep 5, 2022