New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed issue with dynamically linking PyTorch module on Mac OSX #494

Merged
merged 2 commits into from Sep 14, 2018

Conversation

Projects
None yet
2 participants
@tgaddair
Collaborator

tgaddair commented Sep 13, 2018

due to hidden symbol

@tgaddair tgaddair requested a review from alsrgv Sep 13, 2018

horovod.exp Outdated
@@ -3,5 +3,5 @@
*PyInit*
initmpi_lib_v2
# Legacy PyTorch binding
init_mpi_lib
init_mpi_lib_impl
_init_mpi_lib

This comment has been minimized.

@alsrgv

alsrgv Sep 13, 2018

Collaborator

Would this break Linux? Can you check symbols exported in legacy binding on Linux? May need to include both, or simply use *.

This comment has been minimized.

@alsrgv

alsrgv Sep 13, 2018

Collaborator

I guess the real question is whether we need to update horovod.lds as well, or they should be explicitly different? Should the new initmpi_lib_v2 have underscore prefix on Mac OS X, too?

This comment has been minimized.

@tgaddair

tgaddair Sep 13, 2018

Collaborator

Here's what I found on Linux.

Using torch_nightly:

 > nm ./build/lib.linux-x86_64-3.6/horovod/torch/mpi_lib_v2.cpython-36m-x86_64-linux-gnu.so | grep PyInit
0000000000048d60 T PyInit_mpi_lib_v2

> nm ./build/lib.linux-x86_64-3.6/horovod/torch/mpi_lib_v2.cpython-36m-x86_64-linux-gnu.so | grep init_mpi
0000000000048730 t _ZN7horovod5torchL24pybind11_init_mpi_lib_v2ERN8pybind116moduleE

Using legacy:

> nm ./build/lib.linux-x86_64-3.6/horovod/torch/mpi_lib_impl/_mpi_lib_impl.cpython-36m-x86_64-linux-gnu.so | grep mpi
000000000004add0 T PyInit__mpi_lib_impl

Can't find any reference to init_mpi_lib, init_mpi_lib_impl, or initmpi_lib_v2, only PyInit.

This comment has been minimized.

@tgaddair

tgaddair Sep 13, 2018

Collaborator

Haven't had any luck getting nightly torch installed in my virtualenv with Mac OSX. Have you gotten that to work?

I think the safest option may be to just wrap the symbols in *'s.

This comment has been minimized.

@alsrgv

alsrgv Sep 13, 2018

Collaborator

Looks like presence of _ is different on Mac OS X and Linux indeed:

(Linux)# nm -g /usr/local/lib/python2.7/dist-packages/horovod/torch/mpi_lib/_mpi_lib.so | grep init_mpi
0000000000007ce0 T init_mpi_lib
(Mac)# nm /usr/local/lib/python2.7/site-packages/horovod/torch/mpi_lib/_mpi_lib.so | grep init_mpi
0000000000001210 T _init_mpi_lib

With torch-nightly:

(Linux)# nm -g /usr/local/lib/python2.7/dist-packages/horovod/torch/mpi_lib_v2.so | grep initmpi
0000000000048e60 T initmpi_lib_v2

I am not able to install torch-nightly on my Mac to try out at the moment, but I suspect it will have _ prefix as well.

Since it's possible to use libtool on Linux, I suggest we just add * to beginning of init... in both horovod.exp and horovod.lds for consistency.

@alsrgv

alsrgv approved these changes Sep 14, 2018

@tgaddair tgaddair merged commit d66ac3a into master Sep 14, 2018

1 of 2 checks passed

continuous-integration/travis-ci/pr The Travis CI build could not complete due to an error
Details
license/cla Contributor License Agreement is signed.
Details

@tgaddair tgaddair deleted the symbol_fix_osx branch Sep 14, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment