-
Notifications
You must be signed in to change notification settings - Fork 298
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[InstanceGraph] Speed up instance graph constructions, NFCI #5520
Conversation
In large designs, it takes non-negligible amount of time to construct instance graphs. This PR removes an unnecessary walk, and make instance operations accumlation parallel. Before: 50.1090 ( 1.6%) 50.1090 ( 6.1%) LowerFIRRTLAnnotations 17.6776 ( 0.6%) 17.6776 ( 2.2%) (A) circt::firrtl::InstanceGraph 16.6704 ( 0.5%) 16.6704 ( 2.0%) LowerIntrinsics 16.6657 ( 0.5%) 16.6657 ( 2.0%) (A) circt::firrtl::InstanceGraph After: 38.6719 ( 1.2%) 38.6719 ( 5.1%) LowerFIRRTLAnnotations 9.4730 ( 0.3%) 9.4730 ( 1.2%) (A) circt::firrtl::InstanceGraph 9.7834 ( 0.3%) 9.7834 ( 1.3%) LowerIntrinsics 9.7817 ( 0.3%) 9.7817 ( 1.3%) (A) circt::firrtl::InstanceGraph
7db1eee
to
3ca2579
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🏅
Very silly that we were recursing in to the module, thank you for fixing this!
lib/Dialect/HW/InstanceGraphBase.cpp
Outdated
parent->walk<mlir::WalkOrder::PreOrder>([&](HWModuleLike module) { | ||
moduleToInstances.push_back({module, {}}); | ||
// Skip. | ||
return WalkResult::skip(); | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can use something like for (auto module : parent->getBody()->getOps<HWModuleLike>()
In large designs, it takes non-negligible amount of time to construct instance graphs (especially before dedup in firtool pipeline) since it was visiting entire circuits twice sequentially. This PR removes an unnecessary walk, and make instance operations accumulation parallel. It reduced the graph construction time by 40% in large designs (3~4% reduction in the entire execution time). Before: ``` 50.1090 ( 1.6%) 50.1090 ( 6.1%) LowerFIRRTLAnnotations 17.6776 ( 0.6%) 17.6776 ( 2.2%) (A) circt::firrtl::InstanceGraph 16.6704 ( 0.5%) 16.6704 ( 2.0%) LowerIntrinsics 16.6657 ( 0.5%) 16.6657 ( 2.0%) (A) circt::firrtl::InstanceGraph ``` After: ``` 38.6719 ( 1.2%) 38.6719 ( 5.1%) LowerFIRRTLAnnotations 9.4730 ( 0.3%) 9.4730 ( 1.2%) (A) circt::firrtl::InstanceGraph 9.7834 ( 0.3%) 9.7834 ( 1.3%) LowerIntrinsics 9.7817 ( 0.3%) 9.7817 ( 1.3%) (A) circt::firrtl::InstanceGraph ```
In large designs, it takes non-negligible amount of time to construct instance graphs (especially before dedup in firtool pipeline) since it was visiting entire circuits twice sequentially. This PR removes an unnecessary walk, and make instance operations accumlation parallel. It reduced graph construction time by 40% in large designs. (3~4% reduction in the entire execution time)
Before:
After: