mxnet & gluonts minimum versions required for multiple GPU support? #58

Open
joranE opened this issue Oct 23, 2023 · 0 comments

joranE commented Oct 23, 2023

I'm having difficulty getting multiple GPUs to work by following the instructions in the vignette. I'm on Ubuntu 20.04 with CUDA 10.1, gluonts 0.8.0, and mxnet-cu101 0.7.0; this setup works with a single GPU.

However, when I pass:

set_engine(..., ctx = list(mxnet$gpu(0), mxnet$gpu(1)))

I get the following error:

Error: pydantic.error_wrappers.ValidationError: 1 validation error for TrainerModel
ctx
expected string or bytes-like object (type=type_error)

It seems the gluonts model is expecting the ctx argument to be a string, and indeed it issues no complaints if I pass ctx = "gpu".
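The wording of the error is consistent with a regular expression being applied to a non-string value. Here is a minimal sketch of that failure mode, assuming the ctx validator parses strings such as "gpu" or "gpu(1)" with a regex. The pattern and the parse_ctx helper are hypothetical illustrations, not code copied from gluonts:

```python
import re

# Hypothetical pattern: an assumption about how a ctx string might be parsed.
CTX_PATTERN = re.compile(r"^(?P<dev_type>cpu|gpu)(\((?P<dev_id>\d+)\))?$")


def parse_ctx(value):
    """Return (device_type, device_id) for strings like 'gpu' or 'gpu(1)'."""
    # Passing a list here raises TypeError before any matching happens.
    match = CTX_PATTERN.search(value)
    if match is None:
        raise ValueError(f"unrecognised ctx string: {value!r}")
    # The device id group is optional; default to device 0 when it is absent.
    return match["dev_type"], int(match["dev_id"] or 0)


print(parse_ctx("gpu"))
print(parse_ctx("gpu(1)"))

try:
    parse_ctx(["gpu(0)", "gpu(1)"])
except TypeError as err:
    print("TypeError:", err)
```

If the real validator works along these lines, a list of contexts would be rejected no matter which mxnet or gluonts version is installed, which would match what I'm seeing.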

I've been trying different combinations of mxnet and gluonts versions with no luck. Part of the problem is that any mxnet-cu101 version higher than 1.7 doesn't work at all (CPU or a single GPU): it fails with an error about not being able to find libnccl2.so. I'm not sure why some versions of mxnet would be able to locate it and others wouldn't.
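One way to narrow down the NCCL error is to check whether the system's dynamic loader can see an NCCL library at all. This is a rough sketch using only the Python standard library; the find_nccl and can_load helpers are hypothetical names, and the check only covers the standard loader search path, which may differ from where a given mxnet wheel looks:

```python
import ctypes
import ctypes.util


def find_nccl():
    """Return the NCCL library name the loader resolves, or None if not found."""
    return ctypes.util.find_library("nccl")


def can_load(name):
    """Return True if the shared library can actually be opened."""
    try:
        ctypes.CDLL(name)
        return True
    except OSError:
        return False


name = find_nccl()
print("NCCL found:", name)
if name is not None:
    print("loadable:", can_load(name))
```

If this reports that nothing was found, the library is probably missing or sits in a directory that is not on the loader path, which would be an installation issue separate from the ctx question.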

Is multiple GPU support limited to some specific combination of CUDA, mxnet, and gluonts versions?

Edit: I'm increasingly confused as to whether multiple GPU support ever worked at all, despite what appears in the vignette. This is basically the only reference I can find in the gluonts source code to adding multi-gpu support and it was never merged, and the ctx argument has apparently never allowed for lists.
