Theano may fail when running in a heterogeneous cluster #5643

Open
smoors opened this issue Jan 16, 2018 · 1 comment

Comments

@smoors
Contributor

smoors commented Jan 16, 2018

When you use the module for the first time on a node (import theano in Python), an architecture-specific shared object is created at runtime in ~/.theano/compiledir_xxx. When you later use the module on a node with an incompatible architecture, Theano tries to load the existing .so and fails.

I have submitted a PR [1] to fix this issue by including the architecture in the compiledir name (which works fine, btw), but you were not really happy with that approach. Using /tmp/theano-$USER/ does not work because, as you say, this makes Theano use the $USER value of the installation user rather than that of the user who is running Theano.

[1] #5464

@boegel
Member

boegel commented Jan 30, 2018

@smoors As discussed, my suggestion would be to tell Theano to use a unique compiledir for each installation.

Since I don't think we can come up with a human-readable value that identifies different types of systems, I think we'll need to generate a small random 'label' (5 chars is probably sufficient) that can be used to ensure a unique compiledir per Theano installation:

import random
import string
# string.ascii_letters exists in both Python 2 and 3 (string.letters is Python 2 only)
salt = ''.join(random.choice(string.ascii_letters) for _ in range(5))

The cleanest way to do this would be to implement a small software-specific easyblock for Theano, to avoid implementing this in all Theano easyconfigs (and also to avoid using import in them).
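
As a rough sketch of how such a label could end up in Theano's configuration: the snippet below generates the salt once and builds a THEANO_FLAGS value that appends it to the compiledir name through Theano's compiledir_format option. The helper names make_salt and theano_flags are hypothetical, the format string is only illustrative (not necessarily Theano's default), and how an easyblock would put the resulting value into the generated module file is an assumption that is not shown here.

```python
import random
import string


def make_salt(length=5, rng=random):
    """Return a random label of ASCII letters (hypothetical helper).

    Meant to be called once at installation time, so that every Theano
    installation gets its own label.
    """
    return ''.join(rng.choice(string.ascii_letters) for _ in range(length))


def theano_flags(salt):
    """Build a THEANO_FLAGS value whose compiledir name ends in the salt.

    The %(...)s keys are left literal on purpose: Theano expands them
    itself when it resolves compiledir_format.
    """
    # Illustrative format string; the keys are assumed to be ones Theano accepts.
    fmt = 'compiledir_%(platform)s-%(processor)s-%(python_version)s-%(python_bitwidth)s'
    return 'compiledir_format=' + fmt + '-' + salt


if __name__ == '__main__':
    # The printed value would be set in the environment of the installed module.
    print('THEANO_FLAGS=' + theano_flags(make_salt()))
```

Because the salt is generated at installation time rather than at import time, all users of one installation share the same compiledir suffix, while separate installations (for example, one per architecture) get different ones.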
