Missing some binary in distributed prebuilt #5

Open
lamhoangtung opened this issue Apr 16, 2021 · 5 comments

@lamhoangtung

lamhoangtung commented Apr 16, 2021

The following error message indicates this bug:

Error: libgomp-7c85b1e2.so.1: cannot open shared object file: No such file or directory

libgomp-7c85b1e2.so.1 was available at /torch-js/build/libtorch/lib/libgomp-a34b3233.so.1 when building from source and when unzipping the libtorch archive downloaded from pytorch.org. But somehow cmake-js and prebuilt do not include any .so.1 files in the distributed prebuilt.
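
To see which shared libraries a given install cannot resolve, something like the following may help. It is only a sketch: `list_missing_libs` is a hypothetical helper name, and the directory you pass it (for example the package's build/Release folder under node_modules) is an assumption about where the prebuilt files end up. It relies on the standard `ldd` tool printing `not found` for unresolved dependencies.

```shell
# Print "<file>: <library>" for every dependency that ldd cannot resolve.
# Usage: list_missing_libs <directory containing the prebuilt .node/.so files>
list_missing_libs() {
    local dir=$1
    find "$dir" -type f \( -name '*.node' -o -name '*.so' -o -name '*.so.*' \) |
    while IFS= read -r file; do
        # ldd marks unresolved entries as "libfoo.so.1 => not found"
        ldd "$file" 2>/dev/null |
            awk -v f="$file" '/not found/ { print f ": " $1 }'
    done
}

# Example (path is an assumption, adjust to your install):
# list_missing_libs node_modules/@techainer1t/torch-js/build/Release
```

Each library name it prints is a candidate for the symlink workaround described next.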

A temporary workaround on Linux is to create these symlinks:

/usr/lib/x86_64-linux-gnu/libgomp-7c85b1e2.so.1 -> /usr/lib/x86_64-linux-gnu/libgomp.so.1
/usr/local/cuda-11.1/targets/x86_64-linux/lib/libcudart-6d56b25a.so.11.0 -> /usr/local/cuda-11.1/targets/x86_64-linux/lib/libcudart.so.11.0
/usr/local/cuda-11.1/targets/x86_64-linux/lib/libnvToolsExt-24de1d56.so.1 -> /usr/local/cuda-11.1/targets/x86_64-linux/lib/libnvToolsExt.so.1
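
The links above could be created with a small helper along these lines. This is a rough sketch, not something shipped with the package: `link_hashed_lib` is a made-up name, and the hashed file names are simply the ones from the error messages in this issue, so they may differ for other builds.

```shell
# Create a symlink with the hashed name next to an existing shared library.
# Usage: link_hashed_lib <existing library path> <hashed file name>
link_hashed_lib() {
    local target=$1
    local hashed_name=$2
    if [ ! -e "$target" ]; then
        echo "skip: $target not found" >&2
        return 1
    fi
    # -f replaces a stale link left over from an earlier attempt
    ln -sf "$target" "$(dirname "$target")/$hashed_name"
}

# Examples matching the links listed above (run as root, the directories are system-owned):
# link_hashed_lib /usr/lib/x86_64-linux-gnu/libgomp.so.1 libgomp-7c85b1e2.so.1
# link_hashed_lib /usr/local/cuda-11.1/targets/x86_64-linux/lib/libcudart.so.11.0 libcudart-6d56b25a.so.11.0
# link_hashed_lib /usr/local/cuda-11.1/targets/x86_64-linux/lib/libnvToolsExt.so.1 libnvToolsExt-24de1d56.so.1
```

The helper refuses to link when the target library is absent, so a missing CUDA or libgomp install shows up as a "skip" message rather than a dangling link.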

A machine with a 1080Ti also hits a similar issue:

libcudart.so.10.2: cannot open shared object file: No such file or directory

Will investigate later. cc @b21quocbao

@lamhoangtung lamhoangtung added the bug Something isn't working label Apr 16, 2021
@lamhoangtung lamhoangtung self-assigned this Apr 16, 2021
@jeswr

jeswr commented Jul 13, 2021

I needed this project and found that a Docker container built with the following files works (it solves both the Linux and 1080Ti issues).

Dockerfile

FROM nvidia/cuda:11.1.1-devel-ubuntu18.04

RUN apt-get update

# Install python et al.
RUN apt-get install python3.8 python3-pip wget curl gcc-8 unzip libssl1.0.0 software-properties-common -y
RUN add-apt-repository ppa:ubuntu-toolchain-r/test
RUN apt-get update
# https://stackoverflow.com/questions/63190229/glibcxx-3-4-26-not-found-running-cross-complied-program-on-beaglebone
RUN apt-get install --only-upgrade libstdc++6 -y

# Link extra packages
RUN ln -sf /usr/lib/x86_64-linux-gnu/libgomp.so.1 /usr/lib/x86_64-linux-gnu/libgomp-7c85b1e2.so.1
RUN ln -sf /app/node_modules/@techainer1t/torch-js/build/Release/libc10_cuda.so /usr/lib/x86_64-linux-gnu/libc10_cuda.so
RUN ln -sf /usr/local/cuda-11.1/targets/x86_64-linux/lib/libcudart.so.11.0 /usr/lib/x86_64-linux-gnu/libcudart-6d56b25a.so.11.0
RUN ln -sf /usr/local/cuda-11.1/targets/x86_64-linux/lib/libnvToolsExt.so.1 /usr/lib/x86_64-linux-gnu/libnvToolsExt-24de1d56.so.1

# Installing node
RUN curl -sL https://deb.nodesource.com/setup_14.x | bash -
RUN apt-get install nodejs -y

RUN npm i -g @types/node typescript ts-node

WORKDIR /app

COPY requirements.txt /app/requirements.txt
RUN pip3 install -r requirements.txt
RUN rm requirements.txt

COPY package.json /app/package.json
COPY package-lock.json /app/package-lock.json

RUN npm install

COPY ./index.js /app/index.js
RUN ts-node index.js

# CMD ["node", "./index.js"]

requirements.txt

torch==1.8.1

package.json

{
  "name": "",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "author": "",
  "license": "MIT",
  "dependencies": {
    "@techainer1t/torch-js": "^0.13.0"
  }
}

index.js

const torch = require("@techainer1t/torch-js");
const modelPath = `/app/model.pt`;

(async () => {
  console.log(torch)
  const model = new torch.ScriptModule(modelPath);
  const inputA = torch.rand([1, 5]);
  const inputB = torch.rand([1, 5]);
  const res = await model.forward(inputA, inputB);
  console.log(res)
})();
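
One thing to keep in mind with the Dockerfile above: the libc10_cuda.so link is created before `npm install` populates /app/node_modules, so it only resolves once the install has finished. A quick check like the sketch below (the function name `report_dangling_links` is just illustrative) can confirm that every link points at a real file before the script runs.

```shell
# Report links that are missing or point at a file that does not exist.
# Returns non-zero if any path fails the check.
report_dangling_links() {
    local status=0 link
    for link in "$@"; do
        if [ -L "$link" ] && [ ! -e "$link" ]; then
            echo "dangling: $link -> $(readlink "$link")"
            status=1
        elif [ ! -e "$link" ]; then
            echo "missing: $link"
            status=1
        fi
    done
    return "$status"
}

# Example using the link names from the Dockerfile above:
# report_dangling_links \
#     /usr/lib/x86_64-linux-gnu/libgomp-7c85b1e2.so.1 \
#     /usr/lib/x86_64-linux-gnu/libc10_cuda.so \
#     /usr/lib/x86_64-linux-gnu/libcudart-6d56b25a.so.11.0 \
#     /usr/lib/x86_64-linux-gnu/libnvToolsExt-24de1d56.so.1
```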

@lamhoangtung
Author

Thanks for the Dockerfile, @jeswr. We will fix this completely in the next release.

@credwood

> Thanks for the dockerfile @jeswr. We will fix this completely on the next release

Thanks a lot! Thanks for your work on this package.

I was wondering if you have (or anyone else has) a quick workaround for the libcudart.so.10.2: cannot open shared object file: No such file or directory issue?

@lamhoangtung
Author

How did you install CUDA on your OS, @credwood?

@credwood

credwood commented Oct 1, 2021

> How did you install CUDA on your os @credwood ?

I am using the relevant parts of the Dockerfile above with the same base image nvidia/cuda:11.1.1-devel-ubuntu18.04. Thanks again @lamhoangtung
