Add sanity checks when GPU is requested #903

Merged (2 commits) on Jun 24, 2024

Contributor

@antoniupop antoniupop commented Jun 19, 2024

Ensure that the compiler/runtime has GPU capability and that at least one device is available to run on.
If the compiler/runtime was not compiled with CUDA enabled (e.g., a non-GPU wheel) and GPU support is requested (use_gpu), emit a warning message and continue without GPU, using CPU only with loop parallelism.
If the compiler or wheel is GPU capable but no GPU is found, emit a warning and continue on CPU.
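
The fallback behaviour described above could be sketched roughly as follows. This is an illustrative sketch only, not the code from this PR: the function `resolve_use_gpu`, its parameters, and the use of Python's `warnings` module are assumptions made for the example. The two message texts are the ones reported further down in this conversation.

```python
import warnings

# Warning texts as quoted in this conversation.
NOT_GPU_ENABLED_MSG = (
    "This instance of the Concrete compiler does not support GPU acceleration. "
    "If you are using Concrete-Python, it means that the module installed is "
    "not GPU enabled. Continuing without GPU acceleration."
)
NO_DEVICE_MSG = (
    "No Cuda device available on this system (either not present or the "
    "driver is not online). Continuing without GPU acceleration."
)


def resolve_use_gpu(use_gpu: bool, gpu_enabled: bool, gpu_available: bool) -> bool:
    """Hypothetical helper: decide whether GPU execution can actually be used.

    use_gpu       -- the user requested GPU execution
    gpu_enabled   -- the compiler/runtime was built with GPU support
    gpu_available -- at least one device is usable on this system
    """
    if not use_gpu:
        return False
    if not gpu_enabled:
        # Non-GPU build: warn and fall back to CPU.
        warnings.warn(NOT_GPU_ENABLED_MSG)
        return False
    if not gpu_available:
        # GPU-capable build but no usable device: warn and fall back to CPU.
        warnings.warn(NO_DEVICE_MSG)
        return False
    return True
```

The build check comes first, so a non-GPU wheel reports the more specific message even on a machine that has a GPU.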

@cla-bot cla-bot bot added the cla-signed label Jun 19, 2024
@antoniupop antoniupop force-pushed the antoniu/gpu-sanity-checks branch 2 times, most recently from 9750fe3 to 7fb35ca Compare June 21, 2024 12:27
Contributor

@bcm-at-zama bcm-at-zama left a comment

Asking for a friend: it would also be nice, at the frontend level, to make sure a GPU is available when people want to run with --use_gpu, so that CML can return a clean error if users ask for GPU execution without the wheel or the HW.

@antoniupop
Contributor Author

> Asking for a friend: it would also be nice, at the frontend level, to make sure a GPU is available when people want to run with --use_gpu, so that CML can return a clean error if users ask for GPU execution without the wheel or the HW.

This can be made into an error, but for now it is a warning. The check can be brought further up, but the case is already covered by this PR. If the wheel is not GPU capable you get:
This instance of the Concrete compiler does not support GPU acceleration. If you are using Concrete-Python, it means that the module installed is not GPU enabled. Continuing without GPU acceleration.
If the HW is not present or available:
No Cuda device available on this system (either not present or the driver is not online). Continuing without GPU acceleration.

@bcm-at-zama
Contributor

I don't know exactly how it works @antoniupop but CML will need to be able to try/catch the error I think, so could you do something that they can intercept, please? Maybe it's already the case, just asking

@bcm-at-zama
Contributor

Maybe making a CP function is_GPU_capable() or something like that, that they can easily call?

@antoniupop
Contributor Author

> I don't know exactly how it works @antoniupop but CML will need to be able to try/catch the error I think, so could you do something that they can intercept, please? Maybe it's already the case, just asking

It's not an exception, just a warning. If CML wants to test at their level, it's just a matter of querying the CUDA runtime, but we can also provide a compiler-level function to check.
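
For illustration, querying the CUDA runtime directly from Python might look like the sketch below. It is not part of this PR. It assumes the CUDA runtime library (`cudart`) can be located by `ctypes.util.find_library`, which depends on how CUDA is installed. The helper name `cuda_device_count` is hypothetical; `cudaGetDeviceCount` is the CUDA Runtime API call for this, and it returns 0 (`cudaSuccess`) on success.

```python
import ctypes
import ctypes.util


def cuda_device_count() -> int:
    """Hypothetical helper: number of CUDA devices visible to the runtime.

    Returns 0 if the CUDA runtime library cannot be found or loaded, or if
    the runtime reports an error (e.g. the driver is not online).
    """
    lib_name = ctypes.util.find_library("cudart")
    if lib_name is None:
        return 0
    try:
        cudart = ctypes.CDLL(lib_name)
        get_count = cudart.cudaGetDeviceCount
    except (OSError, AttributeError):
        return 0

    # cudaError_t cudaGetDeviceCount(int *count)
    get_count.argtypes = [ctypes.POINTER(ctypes.c_int)]
    get_count.restype = ctypes.c_int

    count = ctypes.c_int(0)
    status = get_count(ctypes.byref(count))
    return count.value if status == 0 else 0
```

A result of 0 here does not distinguish "no hardware" from "runtime not installed", which is one reason a compiler-level check is more convenient for callers.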

@antoniupop
Contributor Author

> Maybe making a CP function is_GPU_capable() or something like that, that they can easily call?

Yes, that would be the most reasonable. There is a function already, but it only covers the HW, so I could add the compiler configuration too.

@bcm-at-zama
Contributor

Yes better to have a function in CP, for direct users of CP or for CML. Thank you

@antoniupop
Contributor Author

> Yes better to have a function in CP, for direct users of CP or for CML. Thank you

I've added two functions to CP, concrete.compiler.check_gpu_enabled() and concrete.compiler.check_gpu_available(), to check whether the compiler is GPU capable and whether a GPU can be used. I'll have to test these separately, as the CI won't be able to assess this.
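
As a rough sketch of how a frontend such as CML might turn these two checks into a clean, catchable error: the helper `ensure_gpu_usable` and the exception `GpuUnavailableError` below are hypothetical names, and the sketch assumes both check functions take no arguments and return a boolean, which this conversation does not state explicitly. The checks are passed in as callables so the example runs without Concrete installed.

```python
from typing import Callable


class GpuUnavailableError(RuntimeError):
    """Hypothetical error raised when GPU execution was requested but is not possible."""


def ensure_gpu_usable(
    check_gpu_enabled: Callable[[], bool],
    check_gpu_available: Callable[[], bool],
) -> None:
    """Raise GpuUnavailableError unless both checks pass (hypothetical helper)."""
    if not check_gpu_enabled():
        raise GpuUnavailableError(
            "GPU execution requested, but the installed compiler is not GPU enabled."
        )
    if not check_gpu_available():
        raise GpuUnavailableError(
            "GPU execution requested, but no GPU is available on this system."
        )


# With Concrete-Python installed, the call would presumably look like this
# (assuming the functions named in this PR are callable without arguments):
#
#   import concrete.compiler
#   ensure_gpu_usable(
#       concrete.compiler.check_gpu_enabled,
#       concrete.compiler.check_gpu_available,
#   )
```

A frontend could call this before compiling with GPU support and catch `GpuUnavailableError`, instead of relying on the compiler's warning.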

@antoniupop antoniupop force-pushed the antoniu/gpu-sanity-checks branch 2 times, most recently from bcf4958 to 287b3d6 Compare June 21, 2024 20:07
@antoniupop antoniupop force-pushed the antoniu/gpu-sanity-checks branch 4 times, most recently from 8ef0d1b to 084dd6f Compare June 22, 2024 07:04
…tion is selected to ensure that the compiler/runtime have GPU capability and that at least one device is available to run on.
…e compiler is GPU enabled and whether a GPU is available on the system.
@antoniupop antoniupop merged commit 2f17089 into main Jun 24, 2024
31 of 34 checks passed
@antoniupop antoniupop deleted the antoniu/gpu-sanity-checks branch June 24, 2024 10:50
3 participants