This repository has been archived by the owner on Jan 22, 2024. It is now read-only.

Cannot compile against nvidia drivers in the container #103

Closed
hsysuper opened this issue Jun 3, 2016 · 7 comments

hsysuper commented Jun 3, 2016

Currently, the provisioned nvidia-driver volume only includes the driver library files (e.g. libnvidia-opencl.so.352.93) and runtime symbolic link (e.g. libnvidia-opencl.so.1 -> libnvidia-opencl.so.352.93). This allows binary programs using nvidia drivers to run without a problem.

However, when compiling programs against the nvidia driver libraries, the linker looks for the unversioned .so file (e.g. libnvidia-opencl.so) to link against, and that file does not exist in the current volume configuration.

Could nvidia-docker also create these .so symbolic links when provisioning the volume, so that we can compile programs in the container when using it as a development/debugging environment?
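To make the request concrete, here is a minimal sketch of the link layout being asked for. It is an illustration only: a scratch directory stands in for the driver volume and an empty file stands in for the real versioned library, so neither reflects how nvidia-docker actually provisions anything.

```shell
#!/bin/sh
# Illustration only: a scratch directory stands in for the driver volume,
# and an empty file stands in for the real versioned library.
voldir=$(mktemp -d)
touch "$voldir/libnvidia-opencl.so.352.93"

# Runtime link (already provisioned today): SONAME -> versioned file.
ln -s libnvidia-opencl.so.352.93 "$voldir/libnvidia-opencl.so.1"

# Development link (the one requested here): unversioned name -> SONAME.
# This is the file name the linker searches for when given -lnvidia-opencl.
ln -s libnvidia-opencl.so.1 "$voldir/libnvidia-opencl.so"

ls -l "$voldir"
```

With only the first link present, programs that were already built can load the library, but a link step using `-lnvidia-opencl` has no matching file name to find.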

Thanks!

Member

flx42 commented Jun 3, 2016

libnvidia-opencl.so* is just an example, right? You should not link against this particular library.
Let's take other examples that make more sense: libcuda.so and libnvidia-ml.so. Stub libraries for these are present in the devel versions of the images we provide, under /usr/local/cuda/lib64/stubs/.
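As a rough sketch of how the stubs might be used at link time: the helper name `link_with_stub`, the `STUBS_DIR` variable, and the use of `cc` are assumptions made for illustration; only the stubs path comes from this thread.

```shell
#!/bin/sh
# STUBS_DIR defaults to the location named in this thread; override it as needed.
STUBS_DIR=${STUBS_DIR:-/usr/local/cuda/lib64/stubs}

# link_with_stub SRC OUT: hypothetical helper that links SRC against the
# libcuda stub and writes the result to OUT.
link_with_stub() {
    if [ ! -e "$STUBS_DIR/libcuda.so" ]; then
        echo "no libcuda.so stub in $STUBS_DIR" >&2
        return 1
    fi
    # -L only adds the stubs directory to the link-time search path;
    # it does not change where the library is looked up at run time.
    cc "$1" -o "$2" -L"$STUBS_DIR" -lcuda
}
```

Inside a devel image this could be called as `link_with_stub app.c app`, where `app.c` is a placeholder source file.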

Author

hsysuper commented Jun 3, 2016

Thanks for the fast reply.

Yes, libnvidia-opencl.so* was just an example, and it is very good to know that several libraries are available under /usr/local/cuda/lib64/stubs/. However, the particular library that I am interested in linking against is libnvidia-encode.so, which is required by the NvEncoder example in the Nvidia-Video-Codec-SDK.

Could this library be provisioned by nvidia-docker?

Member

3XX0 commented Jun 3, 2016

Unfortunately, we do not have stubs for the video libraries:
libnvidia-encode.so, libnvcuvid.so, libnvidia-fbc.so, libnvidia-ifr.so

You would have to rely on dynamic dispatching (aka dlopen). You can look at the SDK samples common/src directory to see how it is done.

libnvidia-encode.so, for example, has a single entry point, NvEncodeAPICreateInstance. From there you can retrieve all the other function pointers.
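Before writing the dlopen code, it can help to confirm which symbols the library exports. The sketch below is not the dlopen approach itself, only a quick check from the shell; the helper name `list_nvenc_exports` is hypothetical, and it assumes `nm` from binutils is installed.

```shell
#!/bin/sh
# list_nvenc_exports LIB: hypothetical helper that prints the NvEncodeAPI*
# symbols exported by the shared library at path LIB.
list_nvenc_exports() {
    if [ ! -r "$1" ]; then
        echo "cannot read $1" >&2
        return 1
    fi
    # nm -D reads the dynamic symbol table; --defined-only hides imports.
    nm -D --defined-only "$1" | grep 'NvEncodeAPI'
}
```

It would be called with the path of the library inside the container, for example `list_nvenc_exports /usr/local/nvidia/lib64/libnvidia-encode.so.1`. That path is an assumption based on where libcuda.so.1 lives in this thread.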

Author

hsysuper commented Jun 3, 2016

Thanks for the quick support.

After some trial and error, I found that NVENC does not work very well with nvidia-docker. In the end, I had to manually share the devices between the host and the container and install the nvidia-XXX driver or the complete cuda-toolkit inside the container in order for the example to run.

Under nvidia-docker, I tried installing the nvidia-XXX driver or the cuda-toolkit with apt-get after adding the cuda deb repository. However, I kept getting an NV_ENC_ERR_UNSUPPORTED_DEVICE (0x2) error from the NvEncOpenEncodeSessionEx API call.

Once I provisioned the devices myself and installed the same driver, the error disappeared and I was able to run the example in the container.

Member

3XX0 commented Jun 4, 2016

Please don't do that, the problem is elsewhere.
I just tried it and apparently nvidia-encode is looking for libcuda.so internally, which is an issue on our end. As a workaround, you can create the link yourself inside the container and everything will work as intended:

$ nvidia-docker run -ti nvidia/cuda
# ln -s /usr/local/nvidia/lib64/libcuda.so.1 /usr/lib/x86_64-linux-gnu/libcuda.so

If you want to try the NvEnc sample, you will have to remove -lnvidia-encode from the Makefile (an oversight in the samples).
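One possible way to make that Makefile change from the shell is sketched below. The scratch Makefile and its LDFLAGS line are made up for illustration; the real sample Makefile may spell its flags differently, so check the result by hand.

```shell
#!/bin/sh
# Illustration only: a scratch Makefile with a hypothetical LDFLAGS line
# stands in for the SDK sample's real Makefile.
workdir=$(mktemp -d)
printf 'LDFLAGS := -L/usr/lib64 -lnvidia-encode -ldl\n' > "$workdir/Makefile"

# Drop every -lnvidia-encode token (and any spaces before it),
# keeping a .bak copy of the original file.
sed -i.bak 's/ *-lnvidia-encode//g' "$workdir/Makefile"

cat "$workdir/Makefile"
```

The `.bak` copy makes it easy to diff against the original before rebuilding.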

Author

hsysuper commented Jun 5, 2016

I tried your solution and it worked!

Now I am able to run the NvEncoder example in an nvidia-docker-enabled container. It seems that when building custom applications, there is no need for -lnvidia-encode in LDFLAGS, as the library can be loaded dynamically. I believe the sample has led some projects using NVENC to link against nvidia-encode at build time.

For example, the GStreamer NVENC plugin specifies that it needs nvidia-encode at the build stage in its configure.ac script.

Thanks for helping investigate the issue. I hope the two oversights can be fixed soon.

3XX0 added the bug label Jun 7, 2016
Member

3XX0 commented Jun 17, 2016

Closing since this issue is now partly fixed with the addition of the libcuda.so symlink.
