[fx] add rules to linearize computation graphs for searching. #1461
…on checkpointing usages
* [fx] activation checkpointing using Chen strategies.
* [fx] add test for ckpt_solver_chen
* [fx] add vanilla activation checkpoint search with test on resnet and densenet
* [fx] add a namespace code for solver_chen.
* [fx] fix the false interpretation of algorithm 3 in https://arxiv.org/abs/1604.06174.
* [fx] fix lowercase naming conventions.
* [fx] simplify test for ckpt.
* [fx] modify the calculation of node_size in MetaInfoProp for activation checkpointing usages
* [fx] merge development into main (#1)
* [fx] activation checkpointing using Chen strategies.
* [fx] add test for ckpt_solver_chen
* [fx] add vanilla activation checkpoint search with test on resnet and densenet
* [fx] add a namespace code for solver_chen.
* [fx] fix the false interpretation of algorithm 3 in https://arxiv.org/abs/1604.06174.
* [fx] fix lowercase naming conventions.
* [fx] simplify test for ckpt.
* [fx] fix test and algorithm bugs in activation checkpointing.
* [fx] polish ckpt_test.
* [fx] add rules to linearize computation graphs for searching.
…ColossalAI into feature/linear_ckpt
Okay, I have gone through all the changes and don't spot any mistakes so far. I will modify the codegen to check whether we need to pass use_reentrant=False when calling the checkpoint functions.
Just hold the PR~
Hey, could anyone review this?
Okay, let's check whether CI passes.
What's new?
Most existing activation checkpointing frameworks assume that the forward graph is linearized, so I developed a small tool function to detect all potential checkpoint nodes. In this way, chen_greedy() will operate in a linearized manner, without checkpointing anything inside the skip connection blocks. I also removed chen_sqrtn() because it is a bit outdated. And hopefully, with the new checkpoint function in Colossal-AI (#1460), we can now do linearized searches on arbitrary computation graphs, such as resnet18()
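
To illustrate what "detecting potential checkpoint nodes" could mean on a graph with skip connections, here is a minimal pure-Python sketch. It is not the code from this PR: the function name `find_linear_cut_points`, the `order`/`inputs` representation, and the toy residual block are all assumptions made for illustration. The idea is that a node is a safe boundary for a linearized search only if no edge from an earlier node jumps over it to a later one.

```python
from typing import Dict, List, Sequence


def find_linear_cut_points(
    order: Sequence[str], inputs: Dict[str, List[str]]
) -> List[str]:
    """Return the nodes that no skip connection jumps over.

    `order` is a topological ordering of node names and `inputs` maps each
    node to the names of the nodes it consumes (hypothetical format).
    """
    index = {name: i for i, name in enumerate(order)}

    # last_use[n] is the position of the last node that consumes n
    # (or n's own position if nothing consumes it).
    last_use = {name: index[name] for name in order}
    for name in order:
        for src in inputs.get(name, []):
            last_use[src] = max(last_use[src], index[name])

    cuts: List[str] = []
    furthest = -1  # furthest position reached by any edge from an earlier node
    for i, name in enumerate(order):
        # If every earlier node is fully consumed by position i, nothing
        # crosses this node, so the graph can be split here.
        if furthest <= i:
            cuts.append(name)
        furthest = max(furthest, last_use[name])
    return cuts


# Toy residual block: x -> conv1 -> relu -> conv2 -> add(conv2, x) -> out
order = ["x", "conv1", "relu", "conv2", "add", "out"]
inputs = {
    "conv1": ["x"],
    "relu": ["conv1"],
    "conv2": ["relu"],
    "add": ["conv2", "x"],  # skip connection from x
    "out": ["add"],
}
print(find_linear_cut_points(order, inputs))
```

In this toy block the nodes between `x` and `add` are not reported, because the skip edge from `x` to `add` spans them; a greedy solver restricted to the returned nodes would therefore never place a checkpoint inside the skip connection block.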