Add AVX intrinsics to vectorize & speed up FP16-CPU computations #574

Merged
alsrgv merged 2 commits into master from fp16_cpu_intrin on Oct 23, 2018

2 participants
@alsrgv
Collaborator

alsrgv commented Oct 20, 2018

Use CPU intrinsics for fast FP16 conversion and vectorization. Observed a 10-12% speedup over two nodes.

@alsrgv alsrgv self-assigned this Oct 20, 2018

@alsrgv alsrgv requested a review from tgaddair Oct 20, 2018

setup.py Outdated
@@ -67,7 +67,7 @@ def check_tf_version():
 def get_cpp_flags(build_ext):
     last_err = None
-    default_flags = ['-std=c++11', '-fPIC', '-O2']
+    default_flags = ['-std=c++11', '-fPIC', '-O2', '-mf16c']

@alsrgv

alsrgv Oct 20, 2018

Collaborator

TODO: add a ./configure-style test to check whether -mf16c is supported on the machine where Horovod is being installed.

auto* in = (unsigned short*)invec;
auto* inout = (unsigned short*)inoutvec;
int i = 0;

@tgaddair

tgaddair Oct 23, 2018

Collaborator

Why not initialize i within the for loop?

@alsrgv

alsrgv Oct 23, 2018

Collaborator

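Because if we have AVX, the vectorized loop processes the elements in blocks of 8 and advances i past them, and the remaining elements are processed as "leftovers" by the slow scalar loop, which continues from that same i. If we don't have AVX, i stays at 0 and the scalar loop handles everything.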
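
A minimal sketch of the loop structure described in this exchange, not the code merged in this PR. It assumes compile-time detection through the `__AVX__` and `__F16C__` macros (the PR may detect support differently). The function name `float16_sum` and the scalar helpers `half_to_float` and `float_to_half` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>
#if defined(__AVX__) && defined(__F16C__)
#include <immintrin.h>
#endif

// Hypothetical scalar helper: decode IEEE 754 binary16 bits into a float.
static float half_to_float(std::uint16_t h) {
  const int exp = (h >> 10) & 0x1F;
  const int man = h & 0x3FF;
  float mag;
  if (exp == 0) {
    mag = std::ldexp(static_cast<float>(man), -24);  // zero or subnormal
  } else if (exp == 31) {
    mag = man ? std::numeric_limits<float>::quiet_NaN()
              : std::numeric_limits<float>::infinity();
  } else {
    mag = std::ldexp(static_cast<float>(man + 1024), exp - 25);  // normal
  }
  return (h & 0x8000) ? -mag : mag;
}

// Hypothetical scalar helper: encode a float as binary16 bits, rounding to
// nearest-even (assumes the default floating-point rounding mode).
static std::uint16_t float_to_half(float f) {
  const std::uint32_t sign = std::signbit(f) ? 0x8000u : 0u;
  const float a = std::fabs(f);
  if (std::isnan(a)) return static_cast<std::uint16_t>(sign | 0x7E00u);
  if (std::isinf(a)) return static_cast<std::uint16_t>(sign | 0x7C00u);
  std::uint32_t bits;
  if (a < std::ldexp(1.0f, -14)) {
    // Below the smallest normal half: the result is a count of 2^-24 steps.
    bits = static_cast<std::uint32_t>(std::nearbyint(std::ldexp(a, 24)));
  } else {
    int e;
    const float frac = std::frexp(a, &e);  // a = frac * 2^e, frac in [0.5, 1)
    // A mantissa that rounds up to 2048 carries into the exponent field.
    bits = (static_cast<std::uint32_t>(e + 14) << 10) +
           static_cast<std::uint32_t>(std::nearbyint(frac * 2048.0f)) - 1024u;
    if (bits >= 0x7C00u) bits = 0x7C00u;  // overflow becomes infinity
  }
  return static_cast<std::uint16_t>(sign | bits);
}

// Element-wise inout[k] += in[k] over two FP16 buffers.
void float16_sum(const std::uint16_t* in, std::uint16_t* inout, int len) {
  assert(len >= 0);
  int i = 0;  // declared outside the loops so the scalar loop can resume
#if defined(__AVX__) && defined(__F16C__)
  // Vector path: 8 halves per iteration, widened to 8 floats for the add.
  for (; i + 8 <= len; i += 8) {
    const __m256 a = _mm256_cvtph_ps(
        _mm_loadu_si128(reinterpret_cast<const __m128i*>(in + i)));
    const __m256 b = _mm256_cvtph_ps(
        _mm_loadu_si128(reinterpret_cast<const __m128i*>(inout + i)));
    const __m128i sum =
        _mm256_cvtps_ph(_mm256_add_ps(a, b), _MM_FROUND_TO_NEAREST_INT);
    _mm_storeu_si128(reinterpret_cast<__m128i*>(inout + i), sum);
  }
#endif
  // Leftovers (or every element when the vector path is compiled out).
  for (; i < len; ++i) {
    inout[i] = float_to_half(half_to_float(in[i]) + half_to_float(inout[i]));
  }
}
```

With 11 elements and the vector path enabled, the first loop handles indices 0-7 and leaves i at 8, so the scalar loop picks up indices 8-10. Without `-mavx`/`-mf16c`, the `#if` block is compiled out and the scalar loop starts at 0.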

@alsrgv alsrgv merged commit 156c61b into master Oct 23, 2018

3 checks passed

License Compliance: All checks passed.
continuous-integration/travis-ci/pr: The Travis CI build passed.
license/cla: Contributor License Agreement is signed.

@alsrgv alsrgv deleted the fp16_cpu_intrin branch Oct 23, 2018
