fix: Automatically send truncated long ints to cuda at shape analysis time #1541
Description
- `aten::to` operator insertion at shape analysis time to insert the target device
- `PartitioningInfo` struct to store device information and produce the cuda device string
- `LowerInfo` and `PartitioningInfo` made const to avoid altering struct fields
- Uses schema instead of

This switch was made to ensure that truncated objects end up on the GPU regardless of their origin, without adding another lowering pass for this case. Since an `aten::to` operation is already being inserted for the truncation, it is the natural place to also set the correct tensor device (GPU).
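The idea behind the fix can be sketched in eager-mode terms (a hedged illustration only, not the Torch-TensorRT source; `truncate_long_and_move` is a hypothetical helper):

```python
import torch

def truncate_long_and_move(t: torch.Tensor, device: str) -> torch.Tensor:
    """Hypothetical helper mirroring the fix: when an int64 (long) tensor
    must be truncated to int32, the same conversion call that performs the
    truncation also sets the target device, so no separate lowering pass
    is needed for the device move."""
    if t.dtype == torch.int64:
        # A single .to(...) corresponds to one aten::to node handling
        # both the dtype truncation and the device placement.
        return t.to(device=device, dtype=torch.int32)
    return t
```

On a CUDA build the `device` argument would be the string produced from the stored device information (e.g. `"cuda:0"`); `"cpu"` works the same way for illustration.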