[WIP] Performance improvements #7164

raver119 merged 49 commits into master from r119_ab_benchmarking on Feb 16, 2019



commented Feb 13, 2019


This PR provides a couple of significant changes that improve the performance of libnd4j ops on CPU and CUDA:

  • TADs can now have a positive EWS under certain conditions.
  • EWS can no longer be negative. Instead of -1, it now holds 0 where applicable.
  • shapeInfo is automatically downshifted to unsigned int to speed up offset calculation.
  • The SummaryStats loop is updated to the recent implementation.
  • ReduceOps get their original OMP support back.
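
To illustrate the EWS and unsigned-int points above, here is a minimal standalone sketch. It is not the libnd4j implementation: `offsetOf`, `elementWiseStride`, `fitsInUInt32` and `offsetAuto` are hypothetical names, and the sketch assumes that EWS means "one constant step that visits every element in C order", with 0 standing for "no such step exists". The downshift is modelled as a switch to 32-bit unsigned arithmetic whenever the array length and the largest offset both fit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical helper (not libnd4j API): maps a linear C-order index to a
// buffer offset for a strided array, doing all arithmetic in index type T.
template <typename T>
T offsetOf(T index, const std::vector<T>& shape, const std::vector<T>& strides) {
    assert(shape.size() == strides.size());
    T offset = 0;
    // Peel coordinates off the linear index, innermost dimension first.
    for (std::size_t d = shape.size(); d-- > 0;) {
        offset += (index % shape[d]) * strides[d];
        index /= shape[d];
    }
    return offset;
}

// Assumed EWS semantics: returns the single step that walks all elements in
// C order, or 0 (never a negative value) when no such step exists.
inline std::int64_t elementWiseStride(const std::vector<std::int64_t>& shape,
                                      const std::vector<std::int64_t>& strides) {
    assert(shape.size() == strides.size());
    bool found = false;
    std::int64_t ews = 1, expected = 1;
    for (std::size_t d = shape.size(); d-- > 0;) {
        if (shape[d] == 1) continue;  // unit dimensions do not constrain the step
        if (!found) { ews = strides[d]; expected = ews; found = true; }
        else if (strides[d] != expected) return 0;
        expected *= shape[d];
    }
    return ews;
}

// True when both the element count and the largest offset fit in 32 bits.
inline bool fitsInUInt32(const std::vector<std::int64_t>& shape,
                         const std::vector<std::int64_t>& strides) {
    assert(shape.size() == strides.size());
    std::int64_t length = 1, maxOffset = 0;
    for (std::size_t d = 0; d < shape.size(); ++d) {
        if (shape[d] <= 0 || strides[d] < 0) return false;
        length *= shape[d];
        maxOffset += (shape[d] - 1) * strides[d];
    }
    const std::int64_t limit = std::numeric_limits<std::uint32_t>::max();
    return length <= limit && maxOffset <= limit;
}

// Picks the narrower index type when it is safe. A real implementation would
// convert the shape data once per array rather than on every call.
inline std::int64_t offsetAuto(std::int64_t index,
                               const std::vector<std::int64_t>& shape,
                               const std::vector<std::int64_t>& strides) {
    if (fitsInUInt32(shape, strides)) {
        std::vector<std::uint32_t> s32(shape.begin(), shape.end());
        std::vector<std::uint32_t> st32(strides.begin(), strides.end());
        return offsetOf<std::uint32_t>(static_cast<std::uint32_t>(index), s32, st32);
    }
    return offsetOf<std::int64_t>(index, shape, strides);
}
```

Under these assumptions, a 3x4 array with strides {4, 1} has an EWS of 1, the same shape with strides {8, 2} has an EWS of 2, and a column-major layout with strides {1, 3} yields 0, so a caller would fall back to `offsetAuto` for each element instead of stepping by a constant.
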
raver119 and others added 30 commits Feb 12, 2019
- TAD improvement for #7156
- couple of tests
Yurii Shyrma
Yurii Shyrma
Yurii Shyrma
Yurii Shyrma
Yurii Shyrma
Yurii Shyrma
- transforms updated
- RandomGenerator fix
raver119 and others added 19 commits Feb 14, 2019
Yurii Shyrma
Merge branch 'master' into r119_ab_benchmarking
# Conflicts:
#	libnd4j/include/graph/impl/Graph.cpp
#	libnd4j/include/helpers/impl/OmpLaunchHelper.cpp

@raver119 raver119 merged commit eb6faca into master Feb 16, 2019

0 of 2 checks passed

Codacy/PR Quality Review Not up to standards. This pull request quality could be better.
codeclimate 7 issues to fix

@raver119 raver119 deleted the r119_ab_benchmarking branch Feb 16, 2019

1 participant