
Upstream master bump 0513 #77471

Closed
wants to merge 794 commits

Conversation

jjsjann123
Collaborator

@jjsjann123 jjsjann123 commented May 14, 2022

Updating nvfuser code base.

This should fix the indexing issue observed in pytorch/vision#6015.

Running tests locally as well. Will update the description here at a later point

@bypass-github-export-checks

csarofeen and others added 30 commits December 15, 2021 12:31
* Allow cast from Int to Int32 type
* Update test_binary_ops with scalar tests
* Add integral scalars to optional cast exception list
Fixes the assertion from pytorch#1325 on our devel branch.

1. update alias information after graph mutation
2. patch unsqueeze: i. support negative dimension; ii. fixing range check
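The negative-dimension support and range check for unsqueeze can be sketched in plain Python (a hypothetical helper mirroring the patched behavior, not the actual nvfuser code):

```python
def normalize_unsqueeze_dim(dim: int, ndim: int) -> int:
    """Map a possibly-negative unsqueeze dim into [0, ndim].

    For an input of rank ndim, unsqueeze accepts dims in
    [-(ndim + 1), ndim], since the output has ndim + 1 dimensions.
    """
    if dim < -(ndim + 1) or dim > ndim:
        raise IndexError(
            f"unsqueeze dim {dim} out of range for rank-{ndim} input"
        )
    return dim + ndim + 1 if dim < 0 else dim

# A rank-3 input accepts dims -4..3; -1 inserts before the last slot.
print(normalize_unsqueeze_dim(-1, 3))  # 3
print(normalize_unsqueeze_dim(-4, 3))  # 0
```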
Fixing a few smaller issues here and there:

* Exposing a python API to switch single-node fusion;
* Exposing a python API to switch horizontal fusion (needed to avoid a PW scheduler failure on fusions with outputs of different shapes/ranks);
* Adding shape-expression short-cut support for native_dropout (bug reported by AOTAutograd);
* Fixing the device check to avoid fusing nodes with inputs on different devices. Long term we should support this, but it is disabled for now to avoid an assert (e.g. a scalar CPU tensor can be operated on with CUDA tensors, a feature from TensorIterator).
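The conservative device check described above amounts to requiring that all inputs sit on one CUDA device. A minimal sketch (hypothetical helper operating on device strings, not the actual partitioner code):

```python
def can_fuse(input_devices):
    """Reject fusion unless all inputs are on the same CUDA device.

    CPU scalar tensors could legally mix with CUDA tensors via
    TensorIterator, but that case is conservatively rejected here
    to avoid the assert, matching the fix described above.
    """
    devices = set(input_devices)
    return len(devices) == 1 and next(iter(devices)).startswith("cuda")

print(can_fuse(["cuda:0", "cuda:0"]))  # True
print(can_fuse(["cuda:0", "cpu"]))     # False: mixed-device, disabled for now
```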
Summary:
Pull Request resolved: pytorch#69964

Things added in this PR that require review:
1. cuLaunchCooperativeKernel driver API added
aten/src/ATen/cuda/detail/LazyNVRTC.cpp
aten/src/ATen/cuda/nvrtc_stub/ATenNVRTC.h

nvfuser code update:
1. perf tuning of the codegen scheduler that improves performance.
2. permutation support has been extended beyond contiguous/channels-last. (The improvements can be observed on the PW benchmark.)
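Extending permutation support beyond contiguous/channels-last essentially means recovering an axis order from arbitrary strides instead of pattern-matching two known layouts. An illustrative sketch (assumed helper name, not nvfuser's implementation):

```python
def stride_order_permutation(strides):
    """Return axes ordered from outermost (largest stride) to innermost.

    Contiguous NCHW strides such as (24, 12, 4, 1) yield (0, 1, 2, 3);
    channels-last strides such as (24, 1, 8, 2) yield (0, 2, 3, 1);
    any other layout falls out of the same computation.
    """
    return tuple(sorted(range(len(strides)), key=lambda i: -strides[i]))

print(stride_order_permutation((24, 12, 4, 1)))  # (0, 1, 2, 3) contiguous
print(stride_order_permutation((24, 1, 8, 2)))   # (0, 2, 3, 1) channels-last
```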

Things reverted from local changes:
1. aten::gelu with approximation
2. local changes that were upstreamed in PR pytorch#68804

Pull Request resolved: pytorch#69428

Reviewed By: ngimel

Differential Revision: D33073817

Pulled By: wconstab

fbshipit-source-id: e77d32e81d037d7370822b040456fd4c3bd68edb
* Refactor War Sync Insertion Pass (pytorch#1339)
* Remove kir::Expr::scope_ (pytorch#1341)
* Fusion IR Refactor (pytorch#1343)
* Refactor KIR Step 1 - Remove kir::Node (pytorch#1347)
* Refactor KIR Step 2 - TMP IrUtils change (pytorch#1348)
* Refactor KIR Step 3 - Remove kir::Expr and kir::Val. (pytorch#1349)
* Refactor KIR Step 4 - Remove kir::Bool,Double,Int,NamedScalar. (pytorch#1350)
* Refactor KIR Step 5 - Remove kir::IterDomain/TensorDomain/TensorView (pytorch#1351)
* Refactor KIR Step 6 - Remove kir::UnaryOp/BinaryOp/TernaryOp/ReductionOp/WelfordOp/BroadcastOp. (pytorch#1352)
* Refactor KIR Step 7 - Remove kir dispatch (pytorch#1353)
* Refactor KIR Step 8 - Clean up lower_utils (pytorch#1355)
* Refactor KIR Step 9 - lower_utils ir_utils::applyReplacements. (pytorch#1354)
* Refactor KIR Step 10 - Remove kir_printer in favor of io_stream (pytorch#1356)
 This PR relaxes the constraint so that arbitrary padding sizes can be used as long as output domains don't get larger than input domains.

* Implement alias_copy operations only for CudaFusionGroup to support fallback path
* Remove alias (a) annotation from alias_copy schema
* force segment un-connected graphs

* derive heuristic on empty groups

* add test

* lint

* handled aliased output in batchnorm

* empty tensor

* lint and comment

* clang format

* check reference tv available in pointwise scheduler

* comment

* cleanup test and check utils
* Have Kernel Inherit IrContainer (pytorch#1375)
* Kernel<-Fusion Step 1 - Convert ExprSort to StmtSort (pytorch#1376)
* Kernel<-Fusion Step 2 - Mutator refactor (pytorch#1377)
* Kernel<-Fusion Step 3 - Debug print for expr_eval and type promotion fix (pytorch#1379)
* Kernel<-Fusion Step 4 - Have kernel inherit Fusion (pytorch#1380)
* Kernel<-Fusion Step 5 - Move lowering passes into their own files (pytorch#1382)
* Kernel<-Fusion Step 6 - Remove kir::IrBuilder (pytorch#1383)
* Kernel<-Fusion Step 7 - Remove kir functions from ComputeAtMap (pytorch#1384)
* Kernel<-Fusion Step 8 - Clean up [lower/executor] utils (pytorch#1387)
* Kernel<-Fusion Step 9 - Remove TensorView::fuserTv (pytorch#1388)
* Kernel<-Fusion Step 10 - Remove lowerVal/lowerExpr (pytorch#1389)
* Kernel<-Fusion Step 11 - Finish cleaning up kir (pytorch#1390)
Adds TensorView::doubleBuffer(). See the new tests for how it is used.

For an overview of the lowering algorithm, please see lower_double_buffer.h.
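Double buffering overlaps the load of the next tile with compute on the current one. A language-agnostic sketch of the transformed loop structure (illustrative only; in the generated kernel the "load" would be an asynchronous copy overlapping compute, see lower_double_buffer.h for the real algorithm):

```python
def double_buffered_sum(data, tile):
    """Process `data` in tiles with two alternating buffers.

    Structure: prologue preloads tile 0; each iteration prefetches
    tile i+1 into the other buffer while "computing" on tile i.
    """
    tiles = [data[i:i + tile] for i in range(0, len(data), tile)]
    bufs = [None, None]
    bufs[0] = tiles[0]  # prologue: preload the first tile
    total = 0
    for i in range(len(tiles)):
        if i + 1 < len(tiles):
            bufs[(i + 1) % 2] = tiles[i + 1]  # prefetch next tile
        total += sum(bufs[i % 2])             # compute on current tile
    return total

print(double_buffered_sum(list(range(10)), 4))  # 45
```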
1. extend buildShapeExpression for squeeze_copy/unsqueeze_copy ops.
2. patching broadcastSizes insertion point for buildShapeExpression to avoid graph::copy() linter assert.
3. adding tests
4. supports no-op squeeze (squeezing a dimension that's not size-1)

TODO (in follow up PRs):
1. extend buildShapeExpression to view_copy and reshape_copy as well
2. refactor broadcastSizesExpression to allow graceful failure instead of hard assert
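The squeeze/unsqueeze shape rules being encoded by buildShapeExpression can be sketched in plain Python (hypothetical helpers mirroring the symbolic shape expressions, not the actual graph code):

```python
def squeeze_shape(sizes, dim):
    """squeeze is a no-op when the target dimension is not size-1."""
    dim = dim % len(sizes)  # support negative dims
    if sizes[dim] != 1:
        return list(sizes)              # no-op squeeze
    return sizes[:dim] + sizes[dim + 1:]

def unsqueeze_shape(sizes, dim):
    """unsqueeze inserts a size-1 dimension; dim may be negative."""
    ndim = len(sizes)
    if dim < 0:
        dim += ndim + 1
    return sizes[:dim] + [1] + sizes[dim:]

print(squeeze_shape([4, 1, 3], 1))  # [4, 3]
print(squeeze_shape([4, 2, 3], 1))  # [4, 2, 3] (no-op)
print(unsqueeze_shape([4, 3], -1))  # [4, 3, 1]
```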
@facebook-github-bot
Contributor

@eellison has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@pytorchbot merge this

(Initiating merge automatically since Phabricator Diff has merged)

@pytorchmergebot
Collaborator

Merge failed due to list index out of range
Raised by https://github.com/pytorch/pytorch/actions/runs/2347398630

@eellison
Contributor

Merge failed due to list index out of range
Raised by https://github.com/pytorch/pytorch/actions/runs/2347398630

cc @seemethere, do you know anything about this? It landed internally.

@eellison
Contributor

  File ".github/scripts/trymerge.py", line 923, in main
    pr.merge_into(repo, dry_run=args.dry_run, force=args.force, comment_id=args.comment_id)
  File ".github/scripts/trymerge.py", line 695, in merge_into
    repo._run_git("commit", f"--author=\"{self.get_author()}\"", "-m", msg)
  File ".github/scripts/trymerge.py", line 571, in get_author
    authors = self.get_authors()
  File ".github/scripts/trymerge.py", line 566, in get_authors
    rc[self.get_committer_login(idx)] = self.get_committer_author(idx)
  File ".github/scripts/trymerge.py", line 527, in get_committer_author
    return self._fetch_authors()[num][1]

@seemethere
Member

@pytorchbot force merge this

@pytorchmergebot
Collaborator

Merge failed due to list index out of range
Raised by https://github.com/pytorch/pytorch/actions/runs/2347479349

@seemethere
Member

Looks like this PR is so massive that our GHF tooling can't actually handle it correctly; investigating how to unblock.

@github-actions

Hey @jjsjann123.
You've committed this PR, but it does not have both a 'release notes: ...' and 'topics: ...' label. Please add one of each to the PR. The 'release notes: ...' label should represent the part of PyTorch that this PR changes (fx, autograd, distributed, etc) and the 'topics: ...' label should represent the kind of PR it is (not user facing, new feature, bug fix, perf improvement, etc). The list of valid labels can be found here for the 'release notes: ...' and here for the 'topics: ...'.
For changes that are 'topic: not user facing' there is no need for a release notes label.

@seemethere
Member

seemethere commented May 18, 2022

Merged manually with the following modifications to trymerge.py:

diff --git a/.github/scripts/trymerge.py b/.github/scripts/trymerge.py
index f482358a85..067ca14184 100755
--- a/.github/scripts/trymerge.py
+++ b/.github/scripts/trymerge.py
@@ -521,10 +521,12 @@ class GitHubPR:
         return authors

     def get_committer_login(self, num: int = 0) -> str:
+        return "jjsjann123"
         return self._fetch_authors()[num][0]

     def get_committer_author(self, num: int = 0) -> str:
-        return self._fetch_authors()[num][1]
+        return "jjsjann123 <jiej@nvidia.com>"
+        # return self._fetch_authors()[num][1]

     def get_checkrun_conclusions(self) -> Dict[str, str]:
         """ Returns list of checkrun / conclusions """

Ran using the following commands:

python3 .github/scripts/trymerge.py --dry-run 77471
git push

@seemethere
Member

This is the last time we will be merging a PR of 20k lines

facebook-github-bot pushed a commit that referenced this pull request May 18, 2022
Summary:
Updating nvfuser code base.

This should fix the indexing issue observed in pytorch/vision#6015.

Running tests locally as well. Will update the description here at a later point

bypass-github-export-checks

Pull Request resolved: #77471

Reviewed By: malfet, seemethere

Differential Revision: D36393120

Pulled By: eellison

fbshipit-source-id: 876f2d066e8e54b5d076de66ad1811f6970be1c8
pytorchmergebot pushed a commit that referenced this pull request May 20, 2022
Enable NVFuser in OSS.

Retry of #77213, because it was breaking torchvision tests.

Fix in #77471 has been verified by jjsjann123

Pull Request resolved: #77579

Approved by: https://github.com/eellison, https://github.com/malfet, https://github.com/atalman, https://github.com/seemethere
atalman pushed a commit that referenced this pull request May 20, 2022
facebook-github-bot pushed a commit that referenced this pull request May 23, 2022
Summary:
Enable NVFuser in OSS.

Retry of #77213, because it was breaking torchvision tests.

Fix in #77471 has been verified by jjsjann123

Pull Request resolved: #77579

Approved by: https://github.com/eellison, https://github.com/malfet, https://github.com/atalman, https://github.com/seemethere

Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/38bc10ae25c6fd2f445926fdee148ac19a4a1c08

Reviewed By: seemethere

Differential Revision: D36552636

Pulled By: seemethere

fbshipit-source-id: 3ee5eb9ad5ee2638ef75105a366d90db54b0b436
jjsjann123 added a commit to csarofeen/pytorch that referenced this pull request May 25, 2022
cherry-picked from pytorch#77471
fixing an MSVC build bug with lambdas
@jjsjann123 jjsjann123 deleted the upstream_master_bump_0513 branch May 26, 2022 23:23
jjsjann123 added a commit to jjsjann123/nvfuser that referenced this pull request Oct 29, 2022
Updating nvfuser code base.

This should fix the indexing issue observed in pytorch/vision#6015.

Running tests locally as well. Will update the description here at a later point

@bypass-github-export-checks
Pull Request resolved: pytorch/pytorch#77471
Approved by: https://github.com/seemethere, https://github.com/eellison
jjsjann123 added a commit to jjsjann123/nvfuser that referenced this pull request Nov 10, 2022
Labels
ciflow/trunk (trigger trunk jobs on your pull request), cla signed, oncall: jit (add this issue/PR to the JIT oncall triage queue), open source, triaged (this issue has been looked at by a team member and prioritized into an appropriate module), with-ssh