pbs_mom requires libraries even without a MIC #173

Open
brianandrus opened this Issue Aug 28, 2013 · 3 comments

It seems I have to install some of the MIC packages on all the nodes even without a MIC card installed.
In particular, I need to install:
intel-mic-2.1.6720-16.2.6.32-358.el6.x86_64.rpm
for dependency: intel-mic-kmod-2.1.6720-16.2.6.32.358.el6.x86_64.rpm
for dependency: intel-mic-flash-2.1.386-3.2.6.32-358.el6.x86_64.rpm
intel-mic-gpl-2.1.6720-16.el6.x86_64.rpm

That is just to get the particular libraries: libcoi_host.so.0 and libscif.so.0
I also need to symlink libcoi_host.so.0 into /lib64.

If I don’t do the above steps, pbs_mom will not start.
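
To confirm which libraries are actually missing on a given node, something like the following could help. It is only a sketch in Python: it parses the output of `ldd`, which marks unresolved dependencies with `=> not found`. The helper name `missing_libraries` is made up for illustration, and the path to the pbs_mom binary is passed on the command line because it depends on the install.

```python
import subprocess
import sys


def missing_libraries(ldd_output):
    """Return the sonames that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        if "=> not found" in line:
            # Lines look like: "\tlibscif.so.0 => not found"
            missing.append(line.split("=>")[0].strip())
    return missing


if __name__ == "__main__" and len(sys.argv) > 1:
    # sys.argv[1] is the path to the binary to inspect (install-specific).
    result = subprocess.run(["ldd", sys.argv[1]], capture_output=True, text=True)
    for soname in missing_libraries(result.stdout):
        print(soname)
```

Run against pbs_mom on a node without the MIC packages, this should list the sonames the dynamic linker cannot resolve.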

I suggest there be code in pbs_mom similar to the gpu code in that if the kernel driver is not loaded, pbs_mom does not bother with MIC code.
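
Roughly the kind of check I have in mind, sketched in Python rather than pbs_mom's C. It looks for a module name in the first column of /proc/modules. The module name "mic" is my assumption and the function name `module_is_loaded` is made up for illustration; this is not how pbs_mom does its GPU check.

```python
def module_is_loaded(name, modules_text):
    """Return True if `name` is a module name in /proc/modules-style text."""
    for line in modules_text.splitlines():
        fields = line.split()
        # The first whitespace-separated field of each line is the module name.
        if fields and fields[0] == name:
            return True
    return False


if __name__ == "__main__":
    try:
        with open("/proc/modules") as f:
            text = f.read()
    except OSError:
        # No /proc/modules (e.g. not Linux): treat as "nothing loaded".
        text = ""
    # "mic" is an assumed module name, not taken from the Intel packages.
    if module_is_loaded("mic", text):
        print("MIC kernel module loaded")
    else:
        print("MIC kernel module not loaded; skipping MIC support")
```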

From mom log on a host without a gpu:
20130808:08/08/2013 08:42:52;0001; pbs_mom.3417;Svr;pbs_mom;LOG_DEBUG::main, Not using Nvidia gpu support even though built with --enable-nvidia-gpus

Contributor

mej commented Aug 28, 2013

That's not possible. If pbs_mom is linked against a library, no matter what library it is, that library must be present at runtime, or the binary won't run. That's how dynamic linkers work.

What you're really asking for is a dynamically-loaded module for MIC support, and that's a whole different bucket of yikes.
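
To illustrate the difference, here is a minimal sketch of runtime loading, written in Python for brevity rather than in pbs_mom's C. On Linux, `ctypes.CDLL` goes through the same loader call a C program would use (`dlopen`), so a missing library becomes an error the program can handle instead of a failure to start. The function name `load_optional_library` is hypothetical; the soname is the one from this issue.

```python
import ctypes


def load_optional_library(soname):
    """Try to load a shared library at runtime; return a handle or None."""
    try:
        # Resolved when this line runs, not when the process starts.
        return ctypes.CDLL(soname)
    except OSError:
        # Library is absent or unloadable: the caller can disable the feature.
        return None


if __name__ == "__main__":
    scif = load_optional_library("libscif.so.0")
    if scif is None:
        print("libscif.so.0 not available; MIC support disabled")
    else:
        print("libscif.so.0 loaded; MIC support available")
```

Doing the same in C means moving every MIC call behind function pointers looked up at runtime, which is where the extra work comes from.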

Am I to assume that the gpu code is dynamically loaded? pbs_mom disables that if there are no gpus found in the system.
That just seems like a cleaner way to implement it. At a minimum, perhaps the spec file could include the libraries needed when creating the packages.

Contributor

mej commented Aug 29, 2013

I don't believe the GPU support links against external libraries.

The spec file cannot do that. Externally-supplied libraries are not to be owned by the spec file of a different package. That would violate any reasonable packaging standard and create all sorts of nasty conflicts. Imagine trying to upgrade your MIC libraries, but you can't because TORQUE not only requires the old ones, but OWNS them! Think of the symlink mess that could ensue if different minor library versions were installed with the same major versions. No, that would create a much, much bigger nightmare than installing a few dependencies. :-)

Your best bet is to install the libraries manually into your node image if you object to installing the dependencies. Whether or not they'll work without the kernel module, I don't know.

And if the MIC packages are putting 64-bit libraries into /lib, that's a bug that needs to be reported to Intel. :-)
