
Integration with PyTorch #161

Closed
csukuangfj opened this issue Sep 17, 2020 · 6 comments
@csukuangfj
Collaborator

csukuangfj commented Sep 17, 2020

With the introduction of k2::Context, the current mechanism to communicate with PyTorch via DLPack is not sufficient since we need to allocate/deallocate memory on different devices, while DLPack can only pass pre-allocated memory around.

I would like to update the build system to link against PyTorch with the following goals in mind:

(1) The build system should be simple. The PyTorch dependency will be installed with pip install torch, so that the C++ code shares the same PyTorch version as Python.

(2) Replace the current k2::CudaContext with the one from PyTorch, which is much faster with memory caching.
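To illustrate why DLPack alone is not enough here: a DLPack-style handle only *wraps* memory that the producer has already allocated; it offers no entry point through which the consumer could request a fresh allocation on a chosen device, which is exactly what k2::Context needs. A minimal sketch (a simplified stand-in modeled loosely on DLPack's DLManagedTensor, not the real struct):

```cpp
#include <cstdint>
#include <cstdlib>

// Simplified mirror of a DLPack-style managed tensor: it can only wrap
// memory that someone else has already allocated -- there is no
// "allocate" hook a consumer could call to get new device memory.
struct ManagedTensor {
  void *data;                        // pre-allocated buffer
  int64_t num_bytes;                 // size of that buffer
  void *manager_ctx;                 // opaque owner state
  void (*deleter)(ManagedTensor *);  // how the owner frees it
};

// The producer (e.g. the Python side) allocates and hands the buffer over.
ManagedTensor *ProduceTensor(int64_t num_bytes) {
  auto *t = new ManagedTensor;
  t->data = std::malloc(num_bytes);
  t->num_bytes = num_bytes;
  t->manager_ctx = nullptr;
  t->deleter = [](ManagedTensor *self) {
    std::free(self->data);
    delete self;
  };
  return t;
}
```

A consumer can read and write data and eventually call deleter, but it can never ask this interface for a new buffer; allocation policy stays entirely on the producer's side.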

@qindazhu
Collaborator

RE 1): Is the build system supposed to work with any installation of PyTorch? e.g. if users build PyTorch from source, or install it using conda.
RE 2): Why do we need to replace CudaContext? I thought we could just create a new class PyTorchContext which inherits from Context; then we can operate on PyTorch tensors (whether GPU or CPU, exposed via DLPack) with the current interfaces (e.g. Array1) and algorithms. Correct me if I'm wrong.
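The inheritance idea above can be sketched roughly as follows. The class and method names are illustrative only, not k2's exact API, and plain malloc/free stands in for the PyTorch allocator (e.g. c10::Allocator) that a real PyTorchContext would delegate to:

```cpp
#include <cstdlib>

// Illustrative-only interface; k2's actual Context API differs in detail.
class Context {
 public:
  virtual ~Context() = default;
  virtual void *Allocate(std::size_t bytes) = 0;
  virtual void Deallocate(void *data) = 0;
};

// A real PyTorchContext would forward these calls to PyTorch's
// allocator for the right device; malloc/free keeps the sketch
// self-contained and buildable without libtorch.
class PyTorchContext : public Context {
 public:
  void *Allocate(std::size_t bytes) override { return std::malloc(bytes); }
  void Deallocate(void *data) override { std::free(data); }
};
```

Because algorithms only see the Context base class, swapping in a PyTorch-backed implementation would not require changing Array1 or the algorithm code.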

@csukuangfj
Collaborator Author

csukuangfj commented Sep 17, 2020

It takes several hours to build PyTorch from source. Another problem with building from source is that there is a good chance the PyTorch version k2 uses differs from the one Python uses.

The current CudaContext is a naive implementation. What I want to do is to change GetCudaContext so that it returns a PyTorch context.
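The reason the naive implementation is slow is that it pays the full device-allocation cost on every call. PyTorch's CUDA caching allocator instead keeps freed blocks in per-size free lists and hands them back on the next request, so most allocations never reach cudaMalloc at all. A minimal sketch of that caching idea (host malloc/free stands in for cudaMalloc/cudaFree, and this is only an illustration of the principle, not PyTorch's actual allocator):

```cpp
#include <cstdlib>
#include <unordered_map>
#include <vector>

// Sketch of the caching idea: Deallocate() parks a block in a free list
// keyed by its size, and a later Allocate() of the same size reuses it,
// skipping the slow underlying allocation entirely.
class CachingAllocator {
 public:
  void *Allocate(std::size_t bytes) {
    auto &pool = free_lists_[bytes];
    if (!pool.empty()) {
      void *p = pool.back();  // cache hit: no real allocation
      pool.pop_back();
      return p;
    }
    void *p = std::malloc(bytes);  // cache miss: real (slow) allocation
    sizes_[p] = bytes;
    return p;
  }

  void Deallocate(void *p) { free_lists_[sizes_[p]].push_back(p); }

  ~CachingAllocator() {
    // Blocks still held by callers are not tracked here; a real
    // allocator would also handle those.
    for (auto &kv : free_lists_)
      for (void *p : kv.second) std::free(p);
  }

 private:
  std::unordered_map<std::size_t, std::vector<void *>> free_lists_;
  std::unordered_map<void *, std::size_t> sizes_;
};
```

In an allocate/free-heavy workload like FSA algorithms, this turns most device allocations into a cheap pointer pop, which is why reusing PyTorch's allocator is attractive.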

I have no experience with conda, but I guess it will work.

@danpovey
Collaborator

danpovey commented Sep 18, 2020 via email

@csukuangfj
Collaborator Author

To the best of my knowledge, there are two approaches to building against PyTorch:

(1) Build PyTorch from source. I think only PyTorch developers use this method. The build process can take several hours, and it's very likely that the source version differs from the one installed with pip.

(2) Build against LibTorch. https://pytorch.org/ provides links to download libtorch-shared-with-deps-xxx.zip, which contains a set of header files and shared/static libraries. These files are also contained in the PyTorch package installed via pip install torch. pip install torch is chosen so that we use the same PyTorch version as Python.

cd build/bin
readelf -d cu_array_test

prints

 0x0000000000000001 (NEEDED)             Shared library: [libfsa.so]
 0x0000000000000001 (NEEDED)             Shared library: [libgtest_maind.so]
 0x0000000000000001 (NEEDED)             Shared library: [libtorch.so]
 0x0000000000000001 (NEEDED)             Shared library: [libtorch_cpu.so]
 0x0000000000000001 (NEEDED)             Shared library: [libtorch_cuda.so]
 0x0000000000000001 (NEEDED)             Shared library: [libc10_cuda.so]
 0x0000000000000001 (NEEDED)             Shared library: [libc10.so]
 0x0000000000000001 (NEEDED)             Shared library: [libnvToolsExt.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libcudart.so.10.0]
 0x0000000000000001 (NEEDED)             Shared library: [libgtestd.so]
 0x0000000000000001 (NEEDED)             Shared library: [libstdc++.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libgcc_s.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000000000000f (RPATH)              Library rpath: [/xxx/pyenv/versions/3.7.8/lib/python3.7/site-packages/torch/lib:/xxx/cuda/lib64:/xxx/cuda/lib64/stubs]

Part of the linking commands for cu_array_test is given below:

/usr/bin/g++ xxx/array_test.cu.o -o ../../bin/cu_array_test \
  -Wl,-rpath,/xxx/k2/build/lib:/xxx/pyenv/versions/3.7.8/lib/python3.7/site-packages/torch/lib \
  /xxx/pyenv/versions/3.7.8/lib/python3.7/site-packages/torch/lib/libtorch.so \
  -Wl,--no-as-needed,/xxx/pyenv/versions/3.7.8/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so \
  -Wl,--no-as-needed,/xxx/pyenv/versions/3.7.8/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so \
  -Wl,--as-needed /xxx/pyenv/versions/3.7.8/lib/python3.7/site-packages/torch/lib/libc10_cuda.so \
  /xxx/pyenv/versions/3.7.8/lib/python3.7/site-packages/torch/lib/libc10.so

We can see that k2 uses the same PyTorch libraries from the torch package.


Note that there are no restrictions on the PyTorch version. Users can choose which PyTorch version to use with pip install torch==x.x.x, and k2 will pick up the corresponding version correctly.

@csukuangfj
Collaborator Author

> how does your pull request relate to this, in terms of PyTorch versions and compatibility?

k2/cmake/torch.cmake

Lines 3 to 7 in f784a2c

execute_process(
  COMMAND "${PYTHON_EXECUTABLE}" -c "import os; import torch; print(os.path.dirname(torch.__file__))"
  OUTPUT_STRIP_TRAILING_WHITESPACE
  OUTPUT_VARIABLE TORCH_DIR
)

k2 reads the PyTorch installation information from the pip package, so it is guaranteed that k2 uses whichever version Python is currently using.

target_link_libraries(context PUBLIC ${TORCH_LIBRARIES})

@danpovey
Collaborator

danpovey commented Sep 18, 2020 via email
