Can we use Conda environment for installing torch? #341

issactoast · 2020-11-01T23:13:12Z

I am on Windows 10 using WSL2, which requires CUDA 11.0.

I can install PyTorch in a Conda environment under WSL2, but I can't use torch in R there. I think the inability to combine torch for R with a virtual environment is costing the package a lot of potential users.

dfalbel · 2020-11-01T23:23:17Z

Hi @issactoast

Currently we rely on LibTorch 1.5 which does not support CUDA 11.0, but the next version of torch will use LibTorch 1.7, so CUDA 11.0 will be supported.

Just to make sure I understand... are you suggesting that you should be able to conda install torch-r or that the R package should use the same LibTorch that is packaged with PyTorch?

issactoast · 2020-11-02T22:49:03Z

Hello @dfalbel

Thanks for the quick response. Either way could solve the problem, but enabling conda install r-torch looks like the quicker solution for Windows users who want to use the GPU from a conda env.

dfalbel · 2020-11-03T19:40:29Z

OK! I'll need some help from the community on that. I have never submitted R packages to conda and wouldn't know where to start, sorry!

izahn · 2020-11-22T22:11:32Z

Building conda packages from CRAN is easy; basic instructions are at https://docs.conda.io/projects/conda-build/en/latest/user-guide/tutorials/build-r-pkgs.html

I built a conda package and uploaded it to https://anaconda.org/izahn/r-torch

This package works well on my Arch Linux system, but fails on RHEL 7 with this error message:

Torch failed to start, restart your R session to try again. 
/opt/R/library/4.0/torch/deps/liblantern.so - /lib64/libm.so.6: version `GLIBC_2.23' not found
(required by /opt/R/library/4.0/torch/deps/./libtorch_cpu.so)

The conda installation has a copy of libm.so.6, but it seems that libtorch insists on using the one at /lib64/libm.so.6. Any ideas about how to help it find the libm.so.6 in the conda environment instead?
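
For anyone hitting the same error, here is a minimal sketch of how one might compare the glibc version named in the message with the glibc of the host. The function names are made up for illustration, and it only inspects the host's default C library via Python's `platform.libc_ver()`, not the copy inside the conda environment.

```python
import platform
import re
from typing import Optional, Tuple

Version = Tuple[int, ...]


def required_glibc(message: str) -> Optional[Version]:
    """Extract the GLIBC_x.y version named in a loader error message."""
    match = re.search(r"GLIBC_(\d+(?:\.\d+)*)", message)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def host_glibc() -> Optional[Version]:
    """Return the host glibc version, or None if it cannot be determined."""
    name, version = platform.libc_ver()
    if name != "glibc":
        return None
    parts = re.findall(r"\d+", version)
    return tuple(int(part) for part in parts) if parts else None


def diagnose(message: str) -> str:
    """Summarise whether the host glibc satisfies the requirement in the message."""
    needed = required_glibc(message)
    found = host_glibc()
    if needed is None:
        return "no GLIBC version requirement found in the message"
    if found is None:
        return "could not determine the host glibc version"
    verdict = "new enough" if found >= needed else "too old"
    return "host glibc {} is {} for GLIBC_{}".format(
        ".".join(map(str, found)), verdict, ".".join(map(str, needed))
    )
```

Calling `diagnose()` with the message above on the failing machine should show whether the host library is the one that is too old. As far as I know RHEL 7 ships glibc 2.17, which would be consistent with the error.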

dfalbel · 2020-11-23T17:05:34Z

That's awesome @izahn ! Thanks!

Doesn't this version `GLIBC_2.23' not found message mean that we need an updated version of glibc in this environment? Searching for that message suggests that updating glibc might solve it.

izahn · 2020-11-23T17:20:31Z

@dfalbel the conda build system includes glibc (or at least libm.so.6); the problem is that torch::install_torch tries to use the host system version at /lib64/libm.so.6 instead. So in a sense I have updated glibc; I just don't know how to tell torch::install_torch to use that updated version.

skeydan · 2020-11-24T10:22:30Z

Have you tried setting LD_LIBRARY_PATH?
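
A minimal sketch of what that could look like when launching R from a script. It assumes the libraries of interest live under `$CONDA_PREFIX/lib`, which may not hold for every conda setup, and `env_with_conda_libs` is a made-up helper name.

```python
import os
from typing import Dict, Mapping


def env_with_conda_libs(base: Mapping[str, str]) -> Dict[str, str]:
    """Copy an environment and put the conda lib directory first on LD_LIBRARY_PATH.

    Assumption: the active environment's libraries are in $CONDA_PREFIX/lib.
    If CONDA_PREFIX is not set, the environment is returned unchanged.
    """
    env = dict(base)
    prefix = env.get("CONDA_PREFIX")
    if not prefix:
        return env
    conda_lib = os.path.join(prefix, "lib")
    current = env.get("LD_LIBRARY_PATH", "")
    # Prepend so the dynamic loader searches the conda directory before others.
    env["LD_LIBRARY_PATH"] = conda_lib + (os.pathsep + current if current else "")
    return env
```

The result can be passed as `env=` to `subprocess.run(["R", "-e", "library(torch); torch_tensor(1)"], ...)`, so the change only affects that one R process rather than the whole shell session.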

izahn · 2020-11-24T12:45:07Z

I tried setting LD_LIBRARY_PATH, but that caused other errors during the build process itself (even before the install_torch() part).

This seems to be where conda packaging gets more complicated. The conda build system uses a sysroot, as described in https://docs.conda.io/projects/conda-build/en/latest/resources/define-metadata.html#host and conda/conda-build#3696. I'm a bit out of my depth here, but as far as I can figure, building the torch R package uses the conda sysroot, while install_torch doesn't know about it and tries to use the host system libraries.

Is it possible to build the torch libraries instead of installing the pre-built ones with install_torch()? I think (or at least hope) if we could do that the conda build system would kick in and use the correct libraries.

dfalbel · 2020-11-24T13:06:53Z

Yes, you can build libtorch with instructions here: https://github.com/pytorch/pytorch/blob/master/docs/libtorch.rst#building-libtorch-using-cmake

And lantern (the C interface to libtorch that we use in the R package) here: https://github.com/mlverse/torch/blob/master/tools/buildlantern.R

Maybe you could also point to the lib included in the PyTorch conda package (conda install pytorch) by setting the TORCH_HOME env var? They might have fixed that somehow.
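
As a rough sketch of that idea, the following locates the `lib` directory inside an installed PyTorch (without importing it) and lists the shared libraries found there. Whether the R package accepts that directory as TORCH_HOME, and whether it expects the libraries directly in it or in a subdirectory, is an assumption that would need checking. liblantern is not part of PyTorch and would still have to be provided separately. Both function names are made up.

```python
import importlib.util
from pathlib import Path
from typing import List, Optional


def pytorch_lib_dir() -> Optional[Path]:
    """Return the lib/ directory of an installed PyTorch, or None if not found."""
    spec = importlib.util.find_spec("torch")  # locates the package without importing it
    if spec is None or spec.origin is None:
        return None
    lib_dir = Path(spec.origin).parent / "lib"
    return lib_dir if lib_dir.is_dir() else None


def shared_libraries(lib_dir: Path) -> List[str]:
    """List file names in lib_dir that look like shared libraries."""
    names = []
    for entry in lib_dir.iterdir():
        if ".so" in entry.name or entry.suffix in {".dylib", ".dll"}:
            names.append(entry.name)
    return sorted(names)
```

If `pytorch_lib_dir()` returns a path, `shared_libraries()` shows whether files such as libtorch_cpu.so and libc10.so are present before trying that directory as TORCH_HOME.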

izahn · 2021-02-16T17:47:50Z

OK, I've made some progress on this front and submitted a conda package recipe at conda-forge/staged-recipes#13992

Setting up CUDA packages for conda is more complicated, so this is CPU-only for now. I do hope to add CUDA support in the future.

Finally, I'd love some help maintaining the conda package. Let me know if you are interested and I'll add you to the maintainers list.

izahn · 2021-02-17T21:58:13Z

Further update -- I've given up for now on packaging torch for conda. Fundamentally, conda doesn't want repackaged binaries, and the torch package doesn't make it easy to install without repackaged binaries. I fought with it for a while, but kept ending up with

> library('torch'); torch_tensor(1)
Warning message:
Torch failed to start, restart your R session to try again. /home/conda/staged-recipes/build_artifacts/r-torch_1613595975690/_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehol/lib/R/library/torch/deps/liblantern.so - libc10.so: cannot open shared object file: No such file or directory 

 *** caught segfault ***
address (nil), cause 'memory not mapped'

Traceback:
 1: cpp_torch_float32()
 2: initialize(...)
 3: torch_dtype$new(cpp_torch_float32())
 4: torch_float()
 5: methods$initialize(self, self$private, ...)
 6: Tensor$new(data, dtype, device, requires_grad, pin_memory)
 7: torch_tensor(1)
An irrecoverable exception occurred. R is aborting now ...
/home/conda/staged-recipes/build_artifacts/r-torch_1613595975690/test_tmp/run_test.sh: line 7:  4273 Segmentation fault      (core dumped) $R -e "library('torch'); torch_tensor(1)"

or similar.
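
For debugging this kind of failure, a minimal sketch of listing the dependencies that the dynamic loader cannot resolve for a given library, by running `ldd` and picking out its "not found" lines. The helper names are made up, and it assumes a Linux system with `ldd` on the PATH.

```python
import re
import subprocess
from typing import List


def run_ldd(library_path: str) -> str:
    """Run ldd on a shared library and return its standard output."""
    result = subprocess.run(
        ["ldd", library_path], capture_output=True, text=True, check=False
    )
    return result.stdout


def missing_libraries(ldd_output: str) -> List[str]:
    """Return the names of libraries that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        match = re.match(r"\s*(\S+)\s+=>\s+not found", line)
        if match:
            missing.append(match.group(1))
    return missing
```

Running `missing_libraries(run_ldd(path))` on the liblantern.so from the message would presumably list libc10.so, which points at the libtorch libraries not being where the loader looks for them.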

dfalbel · 2021-02-17T23:04:28Z

Hi @izahn ,

Thanks for your efforts and sorry it didn't work!

> torch package doesn't make it easy to install without repackaged binaries

What should we change in the R package to solve this? Is it related to the separate compilation steps for libtorch and liblantern? I am not sure if conda allows this, but in theory we could deliver both binaries in an inst/deps/ folder.

We could perhaps download the binaries in this script:

https://github.com/izahn/staged-recipes/blob/89827cfedbceec76e3ced3d937732ad6518dc642/recipes/r-torch/build.sh

and patch the .Rbuildignore to allow the binaries to be included in the built package.
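
A minimal sketch of the patching step, assuming the binaries end up under `inst/deps` and that some line of .Rbuildignore excludes that path. Both are assumptions, not something checked against the repository. Each line of .Rbuildignore is treated as a case-insensitive regular expression, approximated here with Python's `re` module, and `unignore` is a made-up name.

```python
import re
from typing import Iterable, List


def unignore(rbuildignore_text: str, keep_paths: Iterable[str]) -> str:
    """Drop every .Rbuildignore pattern that matches one of keep_paths."""
    keep = list(keep_paths)
    kept_lines: List[str] = []
    for line in rbuildignore_text.splitlines():
        pattern = line.strip()
        try:
            excluded = bool(pattern) and any(
                re.search(pattern, path, re.IGNORECASE) for path in keep
            )
        except re.error:
            # Leave patterns that Python cannot parse untouched.
            excluded = False
        if not excluded:
            kept_lines.append(line)
    return "\n".join(kept_lines) + "\n" if kept_lines else ""
```

The build script could read .Rbuildignore, pass its text through `unignore(text, ["inst/deps"])`, and write the result back before running R CMD build.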

Is it possible to see the logs for the builds?

izahn · 2021-02-18T01:57:11Z

> Hi @izahn ,
>
> Thanks for your efforts and sorry it didn't work!
>
> > torch package doesn't make it easy to install without repackaged binaries
>
> What should we change in the R package to solve this? Is it related to separated compilation steps for libtorch and liblantern?
> I am not sure if conda allows this, but in theory we could deliver both binaries in an inst/deps/ folder.

I'm still relatively new to conda packaging and not totally sure how it works. The package building process definitely flags the pre-built libraries though. I tried telling it to ignore them in https://github.com/conda-forge/staged-recipes/pull/13992/files#diff-f21c0b2e0f37c9ea8dac5100f7bcecab20c39783571405ff1d7425d4beea380aR22, which kind of works. My (admittedly limited) understanding is that conda-forge wants to build everything so that everything is built with the same toolchain.

> we could perhaps download the binaries in this script:
>
> https://github.com/izahn/staged-recipes/blob/89827cfedbceec76e3ced3d937732ad6518dc642/recipes/r-torch/build.sh
>
> and patch the .Rbuildignore to allow the binaries to be included in the built package.

Maybe, I don't know if that will help or not.

> Is it possible to see the logs for the builds?

There are some older logs (probably not helpful, from before I realized I actually needed at least torch_tensor(1) in the tests) at https://dev.azure.com/conda-forge/feedstock-builds/_build/results?buildId=278862&view=results . I'm pretty sure the "passing" tests there would have failed on torch_tensor(1).

I re-started the CI so you can see the result of my latest effort over at https://dev.azure.com/conda-forge/feedstock-builds/_build/results?buildId=279239&view=logs&j=6f142865-96c3-535c-b7ea-873d86b887bd&t=22b0682d-ab9e-55d7-9c79-49f3c3ba4823

issactoast · 2021-04-13T02:03:14Z

Closing this. For future reference, you can use torch with GPU support on WSL2 with Ubuntu 18.04.

izahn mentioned this issue Feb 17, 2021

Add r-torch via conda_r_skeleton_helper conda-forge/staged-recipes#13992

Closed

9 tasks

issactoast closed this as completed Apr 13, 2021

TomAugspurger mentioned this issue Apr 27, 2022

Add torch backend for sits [R] microsoft/planetary-computer-containers#36

Open

TomAugspurger mentioned this issue May 20, 2022

[bot-automerge] r-sits v1.0.0 conda-forge/r-sits-feedstock#16

Closed

3 tasks
