
Integration with PyTorch #161

Closed
csukuangfj opened this issue Sep 17, 2020 · 6 comments
@csukuangfj
Collaborator

csukuangfj commented Sep 17, 2020

With the introduction of k2::Context, the current mechanism to communicate with PyTorch via DLPack is not sufficient since we need to allocate/deallocate memory on different devices, while DLPack can only pass pre-allocated memory around.

I would like to update the build system to link against PyTorch with the following goals in mind:

(1) The build system should be simple. The PyTorch dependency will be installed with pip install torch, so that the C++ code shares the same PyTorch version as Python.

(2) Replace the current k2::CudaContext with the one from PyTorch, which is much faster with memory caching.
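To illustrate why DLPack alone is not enough here: a DLPack-style handle only *wraps* memory that the producer has already allocated; it offers no entry point through which the consumer could request a fresh allocation on a chosen device, which is exactly what k2::Context needs. A minimal sketch (a simplified stand-in modeled loosely on DLPack's DLManagedTensor, not the real struct):

```cpp
#include <cstdint>
#include <cstdlib>

// Simplified mirror of a DLPack-style managed tensor: it can only wrap
// memory that someone else has already allocated -- there is no
// "allocate" hook a consumer could call to get new device memory.
struct ManagedTensor {
  void *data;                        // pre-allocated buffer
  int64_t num_bytes;                 // size of that buffer
  void *manager_ctx;                 // opaque owner state
  void (*deleter)(ManagedTensor *);  // how the owner frees it
};

// The producer (e.g. the Python side) allocates and hands the buffer over.
ManagedTensor *ProduceTensor(int64_t num_bytes) {
  auto *t = new ManagedTensor;
  t->data = std::malloc(num_bytes);
  t->num_bytes = num_bytes;
  t->manager_ctx = nullptr;
  t->deleter = [](ManagedTensor *self) {
    std::free(self->data);
    delete self;
  };
  return t;
}
```

A consumer can read and write data and eventually call deleter, but it can never ask this interface for a new buffer; allocation policy stays entirely on the producer's side.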

@qindazhu
Collaborator

RE 1): Is the build system supposed to work with any installation of PyTorch? e.g. if users build PyTorch from source, or install it using conda.
RE 2): Why do we need to replace CudaContext? I thought we could just create a new class PyTorchContext which inherits from Context; then we can operate on PyTorch tensors (whether GPU or CPU, exposed via DLPack) with the current interfaces (e.g. Array1) and algorithms. Correct me if I'm wrong.
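The inheritance idea above can be sketched roughly as follows. The class and method names are illustrative only, not k2's exact API, and plain malloc/free stands in for the PyTorch allocator (e.g. c10::Allocator) that a real PyTorchContext would delegate to:

```cpp
#include <cstdlib>

// Illustrative-only interface; k2's actual Context API differs in detail.
class Context {
 public:
  virtual ~Context() = default;
  virtual void *Allocate(std::size_t bytes) = 0;
  virtual void Deallocate(void *data) = 0;
};

// A real PyTorchContext would forward these calls to PyTorch's
// allocator for the right device; malloc/free keeps the sketch
// self-contained and buildable without libtorch.
class PyTorchContext : public Context {
 public:
  void *Allocate(std::size_t bytes) override { return std::malloc(bytes); }
  void Deallocate(void *data) override { std::free(data); }
};
```

Because algorithms only see the Context base class, swapping in a PyTorch-backed implementation would not require changing Array1 or the algorithm code.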

@csukuangfj
Collaborator Author

csukuangfj commented Sep 17, 2020

It takes several hours to build PyTorch from source. Another problem with building from source is that there is a good chance the PyTorch version k2 uses differs from the one Python uses.

The current CudaContext is a naive implementation. What I want to do is to change GetCudaContext so that it returns a PyTorch context.
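The reason the naive implementation is slow is that it pays the full device-allocation cost on every call. PyTorch's CUDA caching allocator instead keeps freed blocks in per-size free lists and hands them back on the next request, so most allocations never reach cudaMalloc at all. A minimal sketch of that caching idea (host malloc/free stands in for cudaMalloc/cudaFree, and this is only an illustration of the principle, not PyTorch's actual allocator):

```cpp
#include <cstdlib>
#include <unordered_map>
#include <vector>

// Sketch of the caching idea: Deallocate() parks a block in a free list
// keyed by its size, and a later Allocate() of the same size reuses it,
// skipping the slow underlying allocation entirely.
class CachingAllocator {
 public:
  void *Allocate(std::size_t bytes) {
    auto &pool = free_lists_[bytes];
    if (!pool.empty()) {
      void *p = pool.back();  // cache hit: no real allocation
      pool.pop_back();
      return p;
    }
    void *p = std::malloc(bytes);  // cache miss: real (slow) allocation
    sizes_[p] = bytes;
    return p;
  }

  void Deallocate(void *p) { free_lists_[sizes_[p]].push_back(p); }

  ~CachingAllocator() {
    // Blocks still held by callers are not tracked here; a real
    // allocator would also handle those.
    for (auto &kv : free_lists_)
      for (void *p : kv.second) std::free(p);
  }

 private:
  std::unordered_map<std::size_t, std::vector<void *>> free_lists_;
  std::unordered_map<void *, std::size_t> sizes_;
};
```

In an allocate/free-heavy workload like FSA algorithms, this turns most device allocations into a cheap pointer pop, which is why reusing PyTorch's allocator is attractive.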

I have no experience with conda, but I guess it will work.

@danpovey
Collaborator

danpovey commented Sep 18, 2020 via email

@csukuangfj
Collaborator Author

To the best of my knowledge, there are two approaches to building against PyTorch:

(1) Build PyTorch from source. I think only PyTorch developers use this method. The build process can take several hours, and it's very likely that the source version differs from the one installed with pip.

(2) Build against LibTorch. https://pytorch.org/ provides links to download libtorch-shared-with-deps-xxx.zip, which contains a set of header files and shared/static libraries. These files are also contained in the PyTorch package installed via pip install torch. pip install torch is chosen so that we use the same PyTorch version as Python.

cd build/bin
readelf -d cu_array_test

prints

 0x0000000000000001 (NEEDED)             Shared library: [libfsa.so]
 0x0000000000000001 (NEEDED)             Shared library: [libgtest_maind.so]
 0x0000000000000001 (NEEDED)             Shared library: [libtorch.so]
 0x0000000000000001 (NEEDED)             Shared library: [libtorch_cpu.so]
 0x0000000000000001 (NEEDED)             Shared library: [libtorch_cuda.so]
 0x0000000000000001 (NEEDED)             Shared library: [libc10_cuda.so]
 0x0000000000000001 (NEEDED)             Shared library: [libc10.so]
 0x0000000000000001 (NEEDED)             Shared library: [libnvToolsExt.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libcudart.so.10.0]
 0x0000000000000001 (NEEDED)             Shared library: [libgtestd.so]
 0x0000000000000001 (NEEDED)             Shared library: [libstdc++.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libgcc_s.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000000000000f (RPATH)              Library rpath: [/xxx/pyenv/versions/3.7.8/lib/python3.7/site-packages/torch/lib:/xxx/cuda/lib64:/xxx/cuda/lib64/stubs]

Part of the linking commands for cu_array_test is given below:

/usr/bin/g++ xxx/array_test.cu.o -o ../../bin/cu_array_test \
  -Wl,-rpath,/xxx/k2/build/lib:/xxx/pyenv/versions/3.7.8/lib/python3.7/site-packages/torch/lib \
  /xxx/pyenv/versions/3.7.8/lib/python3.7/site-packages/torch/lib/libtorch.so \
  -Wl,--no-as-needed,/xxx/pyenv/versions/3.7.8/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so \
  -Wl,--no-as-needed,/xxx/pyenv/versions/3.7.8/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so \
  -Wl,--as-needed /xxx/pyenv/versions/3.7.8/lib/python3.7/site-packages/torch/lib/libc10_cuda.so \
  /xxx/pyenv/versions/3.7.8/lib/python3.7/site-packages/torch/lib/libc10.so

We can see that k2 uses the same PyTorch libraries from the torch package.


Note that there are no restrictions on the PyTorch version. Users can choose which PyTorch version to use with pip install torch==x.x.x, and k2 will pick up the corresponding version correctly.

@csukuangfj
Collaborator Author

> how does your pull request relate to this, in terms of PyTorch versions and compatibility?

k2/cmake/torch.cmake

Lines 3 to 7 in f784a2c

execute_process(
  COMMAND "${PYTHON_EXECUTABLE}" -c "import os; import torch; print(os.path.dirname(torch.__file__))"
  OUTPUT_STRIP_TRAILING_WHITESPACE
  OUTPUT_VARIABLE TORCH_DIR
)

k2 reads the PyTorch installation information from the pip package, so it is guaranteed that k2 uses whichever version Python is currently using.

target_link_libraries(context PUBLIC ${TORCH_LIBRARIES})

@danpovey
Collaborator

danpovey commented Sep 18, 2020 via email
