Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed issue with dynamically linking PyTorch module on Mac OSX #494

Merged
merged 2 commits into from
Sep 14, 2018
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions horovod.exp
Original file line number Diff line number Diff line change
Expand Up @@ -3,5 +3,5 @@
*PyInit*
initmpi_lib_v2
# Legacy PyTorch binding
init_mpi_lib
init_mpi_lib_impl
_init_mpi_lib
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this break Linux? Can you check symbols exported in legacy binding on Linux? May need to include both, or simply use *.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the real question is whether we need to update horovod.lds as well, or they should be explicitly different? Should the new initmpi_lib_v2 have underscore prefix on Mac OS X, too?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's what I found on Linux.

Using torch_nightly:

 > nm ./build/lib.linux-x86_64-3.6/horovod/torch/mpi_lib_v2.cpython-36m-x86_64-linux-gnu.so | grep PyInit
0000000000048d60 T PyInit_mpi_lib_v2

> nm ./build/lib.linux-x86_64-3.6/horovod/torch/mpi_lib_v2.cpython-36m-x86_64-linux-gnu.so | grep init_mpi
0000000000048730 t _ZN7horovod5torchL24pybind11_init_mpi_lib_v2ERN8pybind116moduleE

Using legacy:

> nm ./build/lib.linux-x86_64-3.6/horovod/torch/mpi_lib_impl/_mpi_lib_impl.cpython-36m-x86_64-linux-gnu.so | grep mpi
000000000004add0 T PyInit__mpi_lib_impl

Can't find any reference to init_mpi_lib, init_mpi_lib_impl, or initmpi_lib_v2, only PyInit.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't had any luck getting nightly torch installed in my virtualenv with Mac OSX. Have you gotten that to work?

I think the safest option may be to just wrap the symbols in *'s.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like presence of _ is different on Mac OS X and Linux indeed:

(Linux)# nm -g /usr/local/lib/python2.7/dist-packages/horovod/torch/mpi_lib/_mpi_lib.so | grep init_mpi
0000000000007ce0 T init_mpi_lib
(Mac)# nm /usr/local/lib/python2.7/site-packages/horovod/torch/mpi_lib/_mpi_lib.so | grep init_mpi
0000000000001210 T _init_mpi_lib

With torch-nightly:

(Linux)# nm -g /usr/local/lib/python2.7/dist-packages/horovod/torch/mpi_lib_v2.so | grep initmpi
0000000000048e60 T initmpi_lib_v2

I am not able to install torch-nightly on my Mac to try out at the moment, but I suspect it will have _ prefix as well.

Since it's possible to use libtool on Linux, I suggest we just add * to beginning of init... in both horovod.exp and horovod.lds for consistency.

_init_mpi_lib_impl