Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(segment|grouped)_matmul implementation using MKL BLAS #146

Conversation

DamianSzwichtenberg
Copy link
Member

By default PyTorch is built with MKL BLAS support, we can take advantage of that and use gemm_batch to implement (segment|grouped)_matmul.

segment_matmul C++ benchmarks (24 core machine) showed that the new version is on average 3.5 times faster than the previous one (with maximum speedup up to 6 times - note that chosen shapes are quite small, so speedup will be more impressive if a problem becomes more complex).

Additionally, HeteroLinear benchmarks were performed (thanks to @puririshi98 for sharing the benchmarks code):
segment_matmul_bench
Same benchmark with time in log2 scale:
segment_linear_bench_log2
We can see that segment_matmul execution path starts shining from num_node_types=32 onward.

Copy link
Member

@rusty1s rusty1s left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks great. Thanks for the effort!

Two questions:

  • Can we get a better review in than mine from @pyg-team/intel-team?
  • Can we enable it in our CPP tests to ensure that it is working correct?

CONTRIBUTING.md Outdated Show resolved Hide resolved
@DamianSzwichtenberg
Copy link
Member Author

Can we get a better review in than mine from @pyg-team/intel-team?

Sure, I'll ping someone.

Can we enable it in our CPP tests to ensure that it is working correct?

Do you mean to enable it in the CI? 😉

@rusty1s
Copy link
Member

rusty1s commented Nov 17, 2022

Do you mean to enable it in the CI? 😉

Yes.

@yanbing-j
Copy link
Contributor

LGTM. Thanks for your hard work!
I have question about the improvement in E2E, do you verify this in a real benchmark?

@codecov-commenter
Copy link

codecov-commenter commented Nov 23, 2022

Codecov Report

Merging #146 (32db269) into master (2eab973) will decrease coverage by 2.01%.
The diff coverage is 81.81%.

❗ Current head 32db269 differs from pull request most recent head a6048a4. Consider uploading reports for the commit a6048a4 to get more accurate results

@@            Coverage Diff             @@
##           master     #146      +/-   ##
==========================================
- Coverage   94.01%   92.00%   -2.02%     
==========================================
  Files          20       20              
  Lines         535      588      +53     
==========================================
+ Hits          503      541      +38     
- Misses         32       47      +15     
Impacted Files Coverage Δ
pyg_lib/csrc/ops/cpu/matmul_kernel.cpp 83.33% <81.81%> (-16.67%) ⬇️
pyg_lib/csrc/ops/matmul.h 50.00% <0.00%> (-50.00%) ⬇️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@DamianSzwichtenberg
Copy link
Member Author

DamianSzwichtenberg commented Nov 23, 2022

Do you mean to enable it in the CI? 😉

Yes.

@rusty1s Just added. Please take another look. 😉

@DamianSzwichtenberg
Copy link
Member Author

LGTM. Thanks for your hard work! I have question about the improvement in E2E, do you verify this in a real benchmark?

Training and inference benchmarks, unfortunately, do not cover the segment_matmul execution path in HeteroLinear or RGCNConv.

Copy link

@mingfeima mingfeima left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have test cases for it?

pyg_lib/csrc/ops/cpu/matmul_kernel.cpp Outdated Show resolved Hide resolved
CMakeLists.txt Show resolved Hide resolved
pyg_lib/csrc/ops/cpu/matmul_kernel.cpp Outdated Show resolved Hide resolved
pyg_lib/csrc/ops/cpu/matmul_kernel.cpp Show resolved Hide resolved
pyg_lib/csrc/ops/cpu/matmul_kernel.cpp Outdated Show resolved Hide resolved
pyg_lib/csrc/ops/cpu/matmul_kernel.cpp Outdated Show resolved Hide resolved
pyg_lib/csrc/ops/cpu/matmul_kernel.cpp Outdated Show resolved Hide resolved
@DamianSzwichtenberg
Copy link
Member Author

Do we have test cases for it?

Yes, here and here.

@DamianSzwichtenberg
Copy link
Member Author

@mingfeima Please take another look. 😉

pyg_lib/csrc/ops/cpu/matmul_kernel.cpp Show resolved Hide resolved
pyg_lib/csrc/ops/cpu/matmul_kernel.cpp Show resolved Hide resolved
pyg_lib/csrc/ops/cpu/matmul_kernel.cpp Outdated Show resolved Hide resolved
pyg_lib/csrc/ops/cpu/matmul_kernel.cpp Show resolved Hide resolved
@DamianSzwichtenberg DamianSzwichtenberg merged commit df545db into pyg-team:master Nov 29, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants