Consider maintaining separate Module block_dim variables for CPU and GPU #564

@shi-eric

Description

With the addition of a CPU fallback for the tile API in 83a5845, modules can end up being compiled for the GPU with a nonsensical block_dim of 1. This forces an unnecessary module recompilation as soon as a GPU kernel is launched with the intended number of threads per block.
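To make the proposal concrete, here is a toy model of one way the problem could play out, and of how per-device variables would avoid it. It does not use Warp at all: `ToyModule`, `load`, `gpu_builds_for`, the `"cpu"`/`"gpu"` keys, and the GPU default of 256 are all hypothetical names and values chosen for illustration, and the launch sequence is an assumption rather than a confirmed reproduction.

```python
from typing import Dict, Optional

ASSUMED_GPU_BLOCK_DIM = 256  # illustrative default, not taken from Warp


class ToyModule:
    """Hypothetical stand-in for a module; not Warp's actual Module class."""

    def __init__(self, separate_block_dims: bool) -> None:
        self.separate = separate_block_dims
        # Either one shared slot, or one slot per device kind.
        self.block_dims: Dict[str, int] = {
            "shared": ASSUMED_GPU_BLOCK_DIM,
            "cpu": 1,
            "gpu": ASSUMED_GPU_BLOCK_DIM,
        }
        # Device kind -> block_dim the module was last built with.
        self.built: Dict[str, int] = {}
        self.gpu_builds = 0

    def load(self, device: str, block_dim: Optional[int] = None) -> int:
        """Build for `device` if the wanted block_dim differs from the last build."""
        slot = device if self.separate else "shared"
        if block_dim is not None:
            self.block_dims[slot] = block_dim
        wanted = self.block_dims[slot]
        if self.built.get(device) != wanted:
            self.built[device] = wanted  # simulated (re)compilation
            if device == "gpu":
                self.gpu_builds += 1
        return wanted


def gpu_builds_for(separate_block_dims: bool) -> int:
    module = ToyModule(separate_block_dims)
    module.load("cpu", block_dim=1)  # CPU tile fallback launch
    module.load("gpu")  # GPU load with no explicit block_dim
    module.load("gpu", block_dim=ASSUMED_GPU_BLOCK_DIM)  # intended GPU launch
    return module.gpu_builds


print(gpu_builds_for(separate_block_dims=False))  # 2: built with 1, then rebuilt
print(gpu_builds_for(separate_block_dims=True))   # 1: CPU value never leaks over
```

In the shared case the CPU launch leaves block_dim at 1, so the GPU build picks up that value and is thrown away on the first real GPU launch. With one variable per device kind, the CPU value stays isolated and the GPU build happens once.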
