@RiverDave RiverDave commented Sep 19, 2025

This PR implements some missing blocks that allow us to launch kernels from the host. All of the tests stated in this commit are now resolved.

I spent half a day figuring out the following:

I tried experimenting with host compilation (-fcuda-is-device) using the target triple nvptx64-nvidia-cuda, but was getting a module verification error that, to keep it simple, looked like: error: 'cir.call' op calling convention mismatch: expected ptx_kernel, but provided c.

I thought that was expected, given that we're essentially using the device to compile on the host, which doesn't make a lot of sense, until I tried to replicate the same in OG and didn't run into any problem in that regard. Are the calling conventions enforced much more strictly in CIR than in OG? Or is that simply a bug in OG?
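To make the error concrete, here is a toy model of the kind of check that seems to be firing, assuming the verifier simply compares the calling convention recorded on the call with the one on the callee. None of these names come from ClangIR: `CallConv`, `Func`, `Call`, `toString`, and `verifyCall` are hypothetical and only illustrate the mismatch.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-ins for illustration; these are not ClangIR types.
enum class CallConv { C, PTXKernel };

struct Func {
  std::string name;
  CallConv conv;  // convention declared on the function
};

struct Call {
  const Func *callee;
  CallConv conv;  // convention recorded on the call itself
};

inline const char *toString(CallConv cc) {
  return cc == CallConv::PTXKernel ? "ptx_kernel" : "c";
}

// Returns an error message when the call's convention differs from the
// callee's, and std::nullopt when they agree.
inline std::optional<std::string> verifyCall(const Call &call) {
  if (call.conv == call.callee->conv)
    return std::nullopt;
  return std::string("calling convention mismatch: expected ") +
         toString(call.callee->conv) + ", but provided " +
         toString(call.conv);
}
```

With a `Func` marked `PTXKernel` and a `Call` that still carries `C`, `verifyCall` returns the mismatch message; when both sides agree it returns `std::nullopt`. A verifier without such a comparison would accept both cases, which is one possible reading of why OG does not complain.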

@RiverDave RiverDave changed the title from "[CIR][CUDA] FIx CUDA host compilation" to "[CIR][CUDA] FIx CUDA host compilation on kernel launch" Sep 19, 2025
@bcardosolopes

> I thought that was expected given that we're essentially using the device to compile on the host, which doesn't make a lot of sense.

Interesting!

> until I tried to replicate the same in OG and didn't really run into any problem in that regard. Are the calling conventions enforced in CIR much more strict as compared to OG? Or is that simply a bug from OG?

Seems like CIR is stricter. It's possible LLVM has some verifier issues, or it could be a bug. I don't have a good answer for you!

@bcardosolopes bcardosolopes left a comment

LGTM, thanks!

@bcardosolopes bcardosolopes merged commit d8f3180 into llvm:main Sep 19, 2025
10 checks passed