fix: Automatically send truncated long ints to cuda at shape analysis time #1541

Merged
merged 1 commit into from
Dec 12, 2022

Conversation

gs-olive

@gs-olive gs-olive commented Dec 9, 2022

Description

  • Augment aten::to operator insertion at shape analysis time to insert the target device
  • Add functionality to PartitioningInfo struct to store device information and produce cuda device string
  • Make cuda device string function in LowerInfo and PartitioningInfo const to avoid altering struct fields
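As a rough illustration of the last two bullets, the sketch below shows how a struct could store device information and expose a `const` member function that builds a CUDA device string. The names `Device`, `gpu_id`, `PartitioningInfoSketch`, `target_device`, and `getGPUDeviceString` are assumptions made for this example and are not taken from the PR diff; the real structs hold many more fields.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-in for the device description stored on the struct.
struct Device {
  int64_t gpu_id = 0;
};

// Hypothetical, heavily simplified stand-in for PartitioningInfo.
struct PartitioningInfoSketch {
  Device target_device;

  // Marked const: the function only reads target_device, so it can be
  // called through a const reference and cannot alter any struct field.
  std::string getGPUDeviceString() const {
    return "cuda:" + std::to_string(target_device.gpu_id);
  }
};
```

Because the member function is `const`, code that only holds a `const PartitioningInfoSketch&` can still ask for the device string, for example when building the device argument of an inserted `aten::to` node.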

Uses schema

Tensor.to(device : Device,
          dtype : int,
          non_blocking : bool=False,
          copy : bool=False,
          memory_format : Optional[int]) -> Tensor

Instead of

Tensor.to(dtype : int,
          non_blocking : bool=False,
          copy : bool=False,
          memory_format : Optional[int]) -> Tensor

This switch ensures that truncated tensors are placed on the GPU regardless of where they originated, without adding another lowering pass for this case. Since an aten::to operation is already being inserted, that insertion is also used to set the correct tensor device (GPU).

Type of change

  • Bug fix (non-breaking change which fixes an issue)

Checklist:

  • [x] My code follows the style guidelines of this project (You can use the linters)
  • [x] I have performed a self-review of my own code
  • [x] I have commented my code, particularly in hard-to-understand areas and hacks
  • [x] I have made corresponding changes to the documentation
  • [x] I have added tests to verify my fix or my feature
    • Tested on a few models, sample scripts, and the existing fallback test suite
  • [x] New and existing unit tests pass locally with my changes
  • [x] I have added the relevant labels to my PR so that relevant reviewers are notified

@gs-olive gs-olive self-assigned this Dec 9, 2022
@github-actions github-actions bot added component: api [C++] Issues re: C++ API component: core Issues re: The core compiler component: lowering Issues re: The lowering / preprocessing passes component: partitioning labels Dec 9, 2022

@bowang007 bowang007 left a comment


LGTM

@gs-olive gs-olive merged commit af39c65 into pytorch:master Dec 12, 2022
Labels
cla signed component: api [C++] Issues re: C++ API component: core Issues re: The core compiler component: lowering Issues re: The lowering / preprocessing passes component: partitioning

3 participants