Fixes the PyTorch Wrapper Codegen for CPU-only machines. #6590

Merged: 4 commits, Jan 27, 2022

@mgharbi mgharbi commented Jan 27, 2022

This PR does two things:

  1. Splits the helper function that wraps PyTorch tensors into Halide buffers in two: one for CPU tensors and one for GPU tensors. Before this PR, the wrapper could fail on a CPU-only machine because halide_cuda_device_interface is missing.
  2. Adds a default __user_context = nullptr in CPU-only ops. The auto-scheduler can create intermediate functions that take a __user_context input. The CodeGen did not handle this case, so compilation would fail. We now create a default null context instead.
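The two changes can be illustrated with a small, self-contained sketch. It is not the code from this PR: `FakeTensor`, `FakeBuffer`, `wrap_cpu`, `wrap_cuda`, `pipeline`, `op_cpu`, the `SKETCH_WITH_CUDA` macro, and `sketch_cuda_device_interface` are all hypothetical stand-ins for the PyTorch tensor type, the Halide buffer type, and the generated wrapper. The sketch only shows the pattern: the CPU wrapper references no CUDA symbol, and the CPU-only op passes a default null user context.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for a PyTorch tensor and a Halide buffer.
struct FakeTensor {
    std::vector<float> data;
    bool on_gpu;
};

struct FakeBuffer {
    float *host;                   // host pointer (CPU buffers only)
    const void *device_interface;  // stays null for CPU buffers
    std::size_t size;
};

// CPU path: references no CUDA symbol, so it builds on CPU-only machines.
inline FakeBuffer wrap_cpu(FakeTensor &t) {
    assert(!t.on_gpu);
    return FakeBuffer{t.data.data(), nullptr, t.data.size()};
}

#ifdef SKETCH_WITH_CUDA
// GPU path: compiled only when CUDA support is enabled (assumed macro).
// Real code would also attach the device pointer to the buffer.
extern "C" const void *sketch_cuda_device_interface();  // hypothetical symbol
inline FakeBuffer wrap_cuda(FakeTensor &t) {
    assert(t.on_gpu);
    return FakeBuffer{nullptr, sketch_cuda_device_interface(), t.data.size()};
}
#endif

// Hypothetical pipeline whose first argument is a user context, as an
// auto-scheduled intermediate function might require. It doubles the input.
inline int pipeline(void *user_context, const FakeBuffer &in, FakeBuffer &out) {
    (void)user_context;  // unused in this sketch
    if (in.size != out.size) {
        return -1;
    }
    for (std::size_t i = 0; i < in.size; ++i) {
        out.host[i] = 2.0f * in.host[i];
    }
    return 0;
}

// CPU-only op: supplies a default null context instead of failing to compile.
// The generated code names this variable __user_context; the sketch avoids
// the double underscore because such identifiers are reserved in C++.
inline int op_cpu(FakeTensor &in, FakeTensor &out) {
    FakeBuffer in_buf = wrap_cpu(in);
    FakeBuffer out_buf = wrap_cpu(out);
    void *user_context = nullptr;
    return pipeline(user_context, in_buf, out_buf);
}
```

Calling `op_cpu` on two CPU tensors of equal length runs the pipeline without touching any GPU code path. Keeping `wrap_cuda` behind a compile-time guard is one way to keep the CUDA device interface out of CPU-only builds; the PR itself may separate the two helpers differently.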

@steven-johnson steven-johnson left a comment

LGTM pending green, with nits

Two review comments on src/runtime/HalidePyTorchHelpers.h (outdated, resolved)
@steven-johnson

Please also run the run-clang-format.sh script, or fix the formatting errors manually if you prefer.

@steven-johnson steven-johnson merged commit c450bf4 into halide:master Jan 27, 2022
jrprice pushed a commit to jrprice/Halide that referenced this pull request Feb 4, 2022
* fixes pytorch op compilation for CPU only machines, adds default user context for auto-scheduled-ops

* rm redundant declarations

* fix spacing

Co-authored-by: Michael Gharbi <mgharbi@adobe.com>