
Conversation

zhuhaozhe
Collaborator

@zhuhaozhe zhuhaozhe commented May 28, 2024

In this PR:

(1) Fix the unary fusion for bf16 conv/linear.
Previously we registered the same fusion pattern for both bf16 and fp16, and we did not check the dtype while matching the pattern. As a result, the fp16 case matched the bf16 pattern, but during the later replacement we found a float16 tensor where it was not expected, so we did not fuse. We fix this by checking dtypes so that the fp16 case no longer matches the bf16 pattern.

```
  def _is_valid_computation_unary_fusion(computation_op, lowp_dtype=None):
      def fn(match):
          matched = _is_single_computation_op(computation_op, lowp_dtype)(match)  # previously we did not check lowp_dtype here
```

It was not exposed before because we only checked the match count, and the match count was correct anyway since the pattern did match. To address this, we add a check on the number of generated kernels (`generated_kernel`): if the post op is not fused, one additional kernel is generated to compute it (see the test sketch after item (3) below).

(2) Previously the UT

```
python test/inductor/test_mkldnn_pattern_matcher.py -k test_linear_binary
```

did not check the fusion status; we fix it in this PR.

(3) Extend `test_conv_binary` to test with low-precision (lp) dtypes.
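Roughly, the added kernel-count check looks like the sketch below. This is a minimal sketch rather than the exact test code from this PR: the helper name and the linear+relu example model are made up for illustration, and it relies on the inductor counter `torch._inductor.metrics.generated_kernel_count`.

```
import torch
from torch._inductor import metrics


def count_generated_kernels(mod, inputs):
    # Reset the inductor counter so we only measure this compilation.
    metrics.reset()
    compiled = torch.compile(mod)
    with torch.no_grad(), torch.autocast("cpu", dtype=torch.bfloat16):
        compiled(*inputs)
    # When the unary post op is fused into the conv/linear kernel, no extra
    # elementwise kernel is generated; an unfused post op shows up as one
    # additional generated kernel, so the test can assert on this number.
    return metrics.generated_kernel_count


# Hypothetical usage: linear + relu. The test would assert that the count
# matches the fused expectation (one lower than the unfused case).
mod = torch.nn.Sequential(torch.nn.Linear(16, 16), torch.nn.ReLU()).eval()
print(count_generated_kernels(mod, (torch.randn(2, 16),)))
```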

Stack from ghstack (oldest at bottom):

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @blzheng @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang


pytorch-bot bot commented May 28, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/127296

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit 5fbf5bd with merge base 4d4d2a9:

UNSTABLE - The following job failed but was likely due to flakiness present on trunk and has been marked as unstable:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

zhuhaozhe added a commit that referenced this pull request May 28, 2024
ghstack-source-id: d9049f3
Pull Request resolved: #127296
[ghstack-poisoned]
@zhuhaozhe zhuhaozhe added the ciflow/trunk Trigger trunk jobs on your pull request label May 28, 2024
@zhuhaozhe zhuhaozhe marked this pull request as draft May 28, 2024 14:11
Comment on lines +501 to 503
mod = M(binary_fn, input_shape[-1], out_feature, bias).eval()
v = torch.randn(input_shape)
other = torch.randn(input_shape[:-1] + [out_feature]).to(dtype)
Collaborator


Why do we not convert the dtype on mod and input v, but convert it on "other" here?

Collaborator Author


Hi, Jiong.
Here we chose not to convert the dtype on mod and v because we expect autocast to handle it. And for other, autocast will not cast it because add is a fall-through op.

y = linear(x) + z

And currently we do not fuse the `+ z` if z is float while linear(x) is low precision (lp).
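
A small standalone illustration of this (a hedged sketch, not code from this PR; the module and shapes are made up):

```
import torch

lin = torch.nn.Linear(16, 16)
x = torch.randn(2, 16)
z = torch.randn(2, 16)  # the extra add input; it stays fp32

with torch.autocast("cpu", dtype=torch.bfloat16):
    y_lin = lin(x)  # linear runs in bf16 under autocast
    y = y_lin + z   # add is not cast by autocast; type promotion gives fp32

print(y_lin.dtype, y.dtype)  # torch.bfloat16 torch.float32
```

So the binary pattern sees a bf16 linear output added to an fp32 tensor, and that mixed-dtype case is the one we currently do not fuse.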

@zhuhaozhe zhuhaozhe requested a review from jgong5 May 29, 2024 08:34
zhuhaozhe added a commit that referenced this pull request May 29, 2024
ghstack-source-id: befb35e
Pull Request resolved: #127296
zhuhaozhe added a commit that referenced this pull request May 29, 2024
ghstack-source-id: 1a8773f
Pull Request resolved: #127296
@zhuhaozhe zhuhaozhe requested a review from jansel May 30, 2024 00:41
@zhuhaozhe zhuhaozhe marked this pull request as ready for review May 30, 2024 12:26
@zhuhaozhe
Collaborator Author

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here.

@huydhn
Contributor

huydhn commented May 30, 2024

@pytorchbot revert -m 'Sorry for reverting you change but one of the tests is failing on trunk ROCm. Please help fix and reland the change https://github.com/pytorch/pytorch/actions/runs/9302535020/job/25606932572' -c nosignal

@huydhn huydhn added the ciflow/rocm Trigger "default" config CI on ROCm label May 30, 2024
@pytorchmergebot
Collaborator

@pytorchbot successfully started a revert job. Check the current status here.
Questions? Feedback? Please reach out to the PyTorch DevX Team

pytorchmergebot added a commit that referenced this pull request May 30, 2024
This reverts commit cdeb242.

Reverted #127296 on behalf of https://github.com/huydhn due to Sorry for reverting you change but one of the tests is failing on trunk ROCm.  Please help fix and reland the change https://github.com/pytorch/pytorch/actions/runs/9302535020/job/25606932572 ([comment](#127296 (comment)))
@pytorchmergebot
Collaborator

@zhuhaozhe your PR has been successfully reverted.

zhuhaozhe added a commit that referenced this pull request May 31, 2024
ghstack-source-id: 9bb09ce
Pull Request resolved: #127296
[ghstack-poisoned]
zhuhaozhe added a commit that referenced this pull request May 31, 2024
ghstack-source-id: 9482c60
Pull Request resolved: #127296
@zhuhaozhe
Collaborator Author

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here.

@pytorchmergebot
Collaborator

The merge job was canceled or timed out. This most often happens if two merge requests were issued for the same PR, or if the merge job was waiting for more than 6 hours for tests to finish. In the latter case, please do not hesitate to reissue the merge command.
For more information see pytorch-bot wiki.

@zhuhaozhe
Collaborator Author

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here.

petrex pushed a commit to petrex/pytorch that referenced this pull request Jun 5, 2024
Pull Request resolved: pytorch#127296
Approved by: https://github.com/leslie-fang-intel, https://github.com/jgong5, https://github.com/jansel
@github-actions github-actions bot deleted the gh/zhuhaozhe/34/head branch July 2, 2024 01:54