perf(autograd): optimize grey_dilation with striding #2589
LGTM
3 files reviewed, no comments
Diff Coverage
Diff: origin/develop...HEAD, staged and unstaged changes
Summary
tidy3d/plugins/autograd/functions.py |
266d6a0 to be0b9b0
groberts-flex
left a comment
Thanks for this implementation! The speedup looks awesome, especially for a function that will be used in a lot of robust optimizations.
I left some comments/questions, some just for my own understanding!
be0b9b0 to 21486df
The previous implementation of `grey_dilation` was based on convolution, which was slow for both the forward and backward passes. This commit replaces it with a high-performance implementation that uses NumPy's `as_strided` to create sliding window views of the input array. This avoids redundant computations and memory allocations, leading to significant speedups. The VJP (gradient) for the primitive is also updated to use the same striding technique, ensuring the backward pass is also much faster. Benchmarks show speedups of 10-100x depending on the array and kernel size.
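To illustrate the striding idea in the commit message, here is a minimal sketch, not the code from this PR. It assumes a 2D input, a flat (all-ones) `k x k` footprint with odd `k`, and edge padding at the boundary; the helper names `sliding_windows_2d` and `dilate_flat` are hypothetical. With a flat footprint, grey dilation reduces to a sliding-window maximum, and `as_strided` exposes every window as a view without copying data.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided


def sliding_windows_2d(x, k):
    """Read-only view of shape (H-k+1, W-k+1, k, k) over a 2D array (no copy)."""
    x = np.ascontiguousarray(x)
    h, w = x.shape
    s0, s1 = x.strides
    return as_strided(
        x,
        shape=(h - k + 1, w - k + 1, k, k),
        strides=(s0, s1, s0, s1),  # window index and in-window offset share strides
        writeable=False,  # guard against accidental writes through overlapping windows
    )


def dilate_flat(x, k):
    """Grey dilation with a flat k x k footprint (odd k): a sliding-window maximum."""
    pad = k // 2
    padded = np.pad(x, pad, mode="edge")  # assumption: edge padding at the boundary
    return sliding_windows_2d(padded, k).max(axis=(-2, -1))


x = np.array([[1.0, 2.0, 0.0], [0.0, 5.0, 1.0], [3.0, 0.0, 4.0]])
print(dilate_flat(x, 3))  # every 3x3 window contains the centre value 5.0
```

The reduction over the last two axes replaces the per-offset work a convolution-based version would do, which is where the savings described above would come from in a sketch like this.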
3f63287 to 781a0f5
groberts-flex
left a comment
Thanks @yaugenst-flex, this is great! After reading through the updated comment, I now understand the multiplicity/multiple-maximum part much better. Thanks for the clarifications and changes.
The previous implementation of `grey_dilation` was based on convolution, which was slow for both the forward and backward passes.
This PR replaces it with a high-performance implementation that uses NumPy's `sliding_window_view` to create sliding window views of the input array. I also wrote a custom VJP that uses the same striding technique to make the backward pass faster too.
I also simplified the implementation of `grey_erosion` so that `grey_dilation` is now the only function that does the heavy lifting.
Benchmarks show speedups of 10-100x depending on the array and kernel size.
This should make these ops much more usable in topopt @groberts-flex
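As a rough illustration of how a window-based VJP for a sliding maximum can work, here is a sketch under stated assumptions, not the implementation in this PR. It assumes a 2D float input, a flat `k x k` footprint with odd `k`, `-inf` padding so that padded cells are never selected, and an equal split of the gradient among tied maxima; how the PR actually treats ties is not shown in this thread. The names `dilate_flat` and `dilate_flat_vjp` are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def dilate_flat(x, k):
    """Flat k x k grey dilation (odd k); also returns intermediates for the VJP."""
    x = np.asarray(x, dtype=float)
    pad = k // 2
    # Assumption: pad with -inf so a padded cell can never be the window maximum.
    padded = np.pad(x, pad, mode="constant", constant_values=-np.inf)
    windows = sliding_window_view(padded, (k, k))  # read-only view, shape (H, W, k, k)
    return windows.max(axis=(-2, -1)), padded, windows


def dilate_flat_vjp(g, x, k):
    """Pull the upstream gradient g (same shape as x) back through dilate_flat."""
    out, padded, windows = dilate_flat(x, k)
    h, w = out.shape
    pad = k // 2
    mask = windows == out[..., None, None]  # entries that attain the window maximum
    # Assumption: tied maxima share the gradient equally.
    weights = mask / mask.sum(axis=(-2, -1), keepdims=True)
    contrib = weights * g[..., None, None]  # shape (H, W, k, k)
    grad_padded = np.zeros_like(padded)
    # Scatter-add: one shifted slice per window offset, k*k vectorised updates.
    for di in range(k):
        for dj in range(k):
            grad_padded[di:di + h, dj:dj + w] += contrib[:, :, di, dj]
    return grad_padded[pad:pad + h, pad:pad + w]  # drop the padding


x = np.array([[1.0, 3.0], [3.0, 0.0]])
print(dilate_flat_vjp(np.ones((2, 2)), x, 3))  # [[0. 2.] [2. 0.]]
```

In the example, every 3x3 window covers the whole 2x2 array, so each of the four outputs splits its unit gradient between the two cells holding the maximum value 3.0.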
Greptile Summary
Significant performance optimization of the `grey_dilation` morphological operation by replacing the convolution-based implementation with NumPy's `sliding_window_view` for strided array operations.
- Optimizes `grey_dilation` in `tidy3d/plugins/autograd/functions.py`, achieving 10-100x speedup
- Simplifies `grey_erosion` by expressing it through duality with `grey_dilation`
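The duality between erosion and dilation can be sketched as follows. This is a minimal illustration and not the code in `tidy3d/plugins/autograd/functions.py`; it assumes a 2D input, a flat symmetric `k x k` footprint with odd `k`, and edge padding, and the names `dilate_flat` and `erode_flat` are hypothetical. For a flat symmetric footprint, erosion is a sliding minimum, and `min(x) = -max(-x)`, so only the dilation needs a real implementation. A non-flat structuring element would also need to be reflected and negated, which this sketch does not cover.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def dilate_flat(x, k):
    """Grey dilation with a flat k x k footprint (odd k): a sliding-window maximum."""
    pad = k // 2
    padded = np.pad(x, pad, mode="edge")  # assumption: edge padding at the boundary
    return sliding_window_view(padded, (k, k)).max(axis=(-2, -1))


def erode_flat(x, k):
    """Grey erosion via duality: negate, dilate, negate back."""
    return -dilate_flat(-x, k)


x = np.array([[1.0, 2.0, 0.0], [0.0, 5.0, 1.0], [3.0, 0.0, 4.0]])
eroded = erode_flat(x, 3)
dilated = dilate_flat(x, 3)
# Erosion never exceeds the input; dilation never falls below it.
print(np.all(eroded <= x), np.all(x <= dilated))  # True True
```

Routing erosion through dilation this way means any gradient rule defined for the dilation is reused for erosion by the chain rule through the two negations.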