forked from pytorch/pytorch
Merge from master #266
Merged
Conversation
No description provided.
Summary: Use at::empty instead. Pull Request resolved: pytorch#12360 Reviewed By: ezyang Differential Revision: D10215119 Pulled By: gchanan fbshipit-source-id: f9bb257dff1b1bf1ecd3a6e358c4791d81b5bd31
Summary: Add a pass to move all constants to the beginning of the graph, and deduplicate. This extends pytorch#10231 to also handle constants introduced in inlining, constant propagation, etc. Pull Request resolved: pytorch#12222 Reviewed By: driazati Differential Revision: D10201616 Pulled By: eellison fbshipit-source-id: bc9c5be26868c8b5414257a0d4462de025aeb9bd
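As a rough Python sketch of the idea behind such a constant-pooling pass (this is not the real torch::jit implementation; the dict-based node representation below is a made-up stand-in for the actual IR):

```python
# Toy model of the pass: hoist constants to the front of the graph and
# deduplicate constants that carry the same value. Purely illustrative;
# the real pass operates on torch::jit IR nodes, not dicts.
def pool_constants(nodes):
    seen = {}                    # constant value -> canonical node
    constants, rest = [], []
    for node in nodes:
        if node["kind"] == "prim::Constant":
            value = node["value"]
            if value not in seen:
                seen[value] = node
                constants.append(node)
            else:
                # duplicate constant: record the canonical node that replaces it
                node["replaced_by"] = seen[value]
        else:
            rest.append(node)
    return constants + rest      # constants first, then the remaining ops

graph = [
    {"kind": "aten::add", "value": None},
    {"kind": "prim::Constant", "value": 2},
    {"kind": "aten::mul", "value": None},
    {"kind": "prim::Constant", "value": 2},   # duplicate, gets deduplicated
]
print([n["kind"] for n in pool_constants(graph)])
# ['prim::Constant', 'aten::add', 'aten::mul']
```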
Summary: Pull Request resolved: pytorch#12433 To test 3d conv, we need to pass lists in the spec argument. We also don't want to set use_cudnn=True, which is the default in brew. Reviewed By: llyfacebook, csummersea Differential Revision: D10234315 fbshipit-source-id: 96a39992a97e020d6e9dac103e6d64df0cc1020b
pytorch#11957) Summary: Pull Request resolved: pytorch#11957 For distributed inference, we want to use the async_scheduling net to run the net, as we need its async part. However, according to profiling, async_net has a big overhead from dispatching tasks onto worker threads. This diff improves the issue by generating a smaller number of chains/tasks by grouping the sync ops that can be run in one shot. Note that it also schedules each individual async op as its own chain because, unlike gpu ops, rpc ops are not guaranteed to be linearized at the remote site. For example, if you have two rpc ops `op1->op2`, op2 won't implicitly block until op1 finishes. Therefore we need to put each async op in its own chain, as the async_scheduling net will only sync the tail of a chain. For all-sync-op nets, this change makes us `1.5X` slower than simple_net, while without the change it is `7X` slower. The next step is to work on the executor to make task scheduling faster, and to add a fallback path to run ops inline if it's an all-sync net. Reviewed By: ilia-cher Differential Revision: D9874140 fbshipit-source-id: fcd45328698c29211f2c06ee3287194acda12227
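As a rough sketch of the chaining idea described in that summary (not the actual Caffe2 async_scheduling executor, which is implemented in C++; the op dicts and the `is_async` flag are hypothetical stand-ins for real operator metadata):

```python
# Illustrative only: group consecutive sync ops into one chain, and keep each
# async op (e.g. an rpc op) as its own single-op chain, since the executor
# only synchronizes on the tail of a chain.
def group_into_chains(ops):
    chains, current = [], []
    for op in ops:
        if op["is_async"]:
            if current:
                chains.append(current)
                current = []
            chains.append([op])          # async op scheduled as its own chain
        else:
            current.append(op)           # sync ops accumulate into one chain
    if current:
        chains.append(current)
    return chains

ops = [
    {"name": "fc1", "is_async": False},
    {"name": "relu1", "is_async": False},
    {"name": "rpc_send", "is_async": True},
    {"name": "fc2", "is_async": False},
]
print([[op["name"] for op in chain] for chain in group_into_chains(ops)])
# [['fc1', 'relu1'], ['rpc_send'], ['fc2']] -> 3 chains instead of 4 tasks
```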
Summary: Pull Request resolved: pytorch#12253 Adding python bindings to unblock DAI development Reviewed By: duc0 Differential Revision: D10141621 fbshipit-source-id: efac7fb8a0cc787e1c4cc94515e673812529a997
Summary: Pull Request resolved: pytorch#12255 Simple algorithm to connect a subgraph Reviewed By: ZolotukhinM Differential Revision: D10141701 fbshipit-source-id: c79c5bc2be89100db602d0a5ff3d17e3dc332d8c
…2303) Summary: Adding back import{Node,Edge} as move{Node,Edge} and adding a new function moveSubgraph. Previous diff broke OSS Pull Request resolved: pytorch#12303 Differential Revision: D10182522 Pulled By: bwasti fbshipit-source-id: 9619431d6d1a44f128613a4f6d8b7f31232ccf28
) Summary: Pull Request resolved: pytorch#12455 - Mirror changes in pthreadpool Reviewed By: harouwu Differential Revision: D10240470 fbshipit-source-id: c1af769b5894f7865736fdaf4e0e5bf17c524614
Summary: Signed-off-by: Marcela Morales Quispe <marcela.morales.quispe@gmail.com> Pull Request resolved: pytorch#12440 Differential Revision: D10242642 Pulled By: SsnL fbshipit-source-id: f47d7579cf3df097c476a97b58149ca4b1eb17ab
Summary: This test only runs when you have torchvision installed, which is not the case on CI builds. When I run test_jit on my local machine, this fails, so fixing up the expect file here. Pull Request resolved: pytorch#12458 Differential Revision: D10244344 Pulled By: jamesr66a fbshipit-source-id: 728c5d9e6c37f807a0780066f20f6c31de84d544
Summary: Fix expect file that got out of sync Pull Request resolved: pytorch#12465 Differential Revision: D10244646 Pulled By: eellison fbshipit-source-id: 66d101d4c6c0a235ce9fa47dc3cce027624c86bc
… arg (pytorch#12353) Summary: ATenOp was handling `torch.where` incorrectly. Whereas the `torch.where` overload (and `aten::` function) had arguments in the order `Tensor condition, Tensor self, Tensor other`, ATenOp was emitting code that assumed that `self` was the 0th argument, and thus was trying to interpret the wrong value as the condition. Pull Request resolved: pytorch#12353 Differential Revision: D10218435 Pulled By: jamesr66a fbshipit-source-id: afe31c5d4f941e5fa500e6b0ef941346659c8d95
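For reference, the argument order in question, shown through the public Python binding `torch.where` (the example tensors below are made up for illustration):

```python
import torch

# torch.where(condition, input, other): the condition tensor comes first.
# The bug described above was that ATenOp assumed argument 0 was `self`
# and therefore read the wrong value as the condition.
cond = torch.tensor([True, False, True])
x = torch.tensor([1, 2, 3])
y = torch.tensor([10, 20, 30])
print(torch.where(cond, x, y))  # tensor([ 1, 20,  3])
```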
Summary: - Removed the old nccl file. - Made open-source NCCL a submodule. - Added CMake support to build NCCL itself. NCCL2 is now in the default build. Pull Request resolved: pytorch#12359 Reviewed By: orionr, yns88 Differential Revision: D10219665 Pulled By: teng-li fbshipit-source-id: 134ff47057512ba617b48bf390c1c816fff3f881
…duling"" (pytorch#12418) Summary: Pull Request resolved: pytorch#12418 Original commit changeset: 32921600925b Reviewed By: yinghai Differential Revision: D10231119 fbshipit-source-id: 7d09ea8de82ff2d911d9ded88d87af4226464d1b
Summary: Previously we tested if default-construction was noexcept, which doesn't really mean that the move constructor is noexcept too. Shuts up clang-tidy. Signed-off-by: Edward Z. Yang <ezyang@fb.com> CC goldsborough Pull Request resolved: pytorch#12369 Differential Revision: D10217348 Pulled By: ezyang fbshipit-source-id: b46437d8ac7a8d756cf03ed0c6bf4400db7ecde7
Summary: Pull Request resolved: pytorch#12391 Reviewed By: mlappelbaum Differential Revision: D10220000 fbshipit-source-id: 10fdbc8ebab931a5be31df964b5de5728048205d
Summary: Changes in this PR: 1. Intermediate Docker image is shared from build stage to test stage through ECR, in order to fix the Caffe2 flaky CUDA tests. 2. There are ~7 Caffe2 operator tests that are only flaky in `caffe2_py2_gcc4_8_ubuntu14_04_test` on CPU. Disabling those tests on that config only, which is okay to do because we are still running those tests in other test jobs. After this PR is merged, CircleCI will be running on master automatically, and will be running on PRs if the author rebased their PR onto the newest master (which we will ask all the authors to do when we switch off Jenkins for Linux). Pull Request resolved: pytorch#12389 Differential Revision: D10224267 Pulled By: yf225 fbshipit-source-id: dd1a90a425c3d13b870d3d328cb301eee2e6e2cd
Summary: Pull Request resolved: pytorch#12453 Differential Revision: D10244130 Pulled By: SsnL fbshipit-source-id: e425c76bfb721fe118a32ddd1fa6eca3a3cd86f0
Summary: Pull Request resolved: pytorch#12235 SSA is actually implicitly maintained so not only was this function not implemented, it never should be implemented. Reviewed By: duc0 Differential Revision: D10133928 fbshipit-source-id: e8e5e2386f8b57812b0be2c380af85ed07cd3152
Summary: Pull Request resolved: pytorch#12237 This diff creates named functions and cleans up a lot of the basic block usage throughout the code Reviewed By: duc0 Differential Revision: D10134363 fbshipit-source-id: d0c4ae0bbb726236a15251dbfd529d4fddcd9e9f
Summary: The value_info proto field was being processed in BuildGraph, but control flow blocks used buildBlocks instead. This PR moves that step to BuildBlock. I removed DecoderBase because it was making the code confusing and we never needed it in the first place. closes pytorch#12319 Pull Request resolved: pytorch#12351 Differential Revision: D10212411 Pulled By: li-roy fbshipit-source-id: 47f289a462a1ab7391ff57368185401673980233
Integrate from upstream
* rocRAND brings a Poisson RNG - don't disable those functions. * At the very least, the WARP_SIZE is wrong for ROCm. * Do not disable the renorm kernel. * Put the ifdef in directly to ensure better maintainability. * scalar_cast no longer exists. * Implement the one ifdef directly; the CUDA_VERSION check is no longer in the file.
* Add miopengemm as a proper, required dependency to LoadHIP. * Always install hip-thrust
Performance script fix
skkkumar pushed a commit to skkkumar/pytorch that referenced this pull request on Jul 29, 2025
Commit Messages: - Fix build error (ROCm#264) (ROCm#266) Co-authored-by: Prachi Gupta <pracgupt@amd.com> PRs: - ROCm/apex#266 Fixes: - https://example.com/issue-264 - https://example.com/issue-266
amd-sriram pushed a commit that referenced this pull request on Jul 29, 2025
Commit Messages: - Fix build error (#264) (#266) Co-authored-by: Prachi Gupta <pracgupt@amd.com> PRs: - ROCm/apex#266 Fixes: - https://example.com/issue-264 - https://example.com/issue-266