forked from pytorch/pytorch
Merge from master #266
Merged
Conversation
No description provided.
Summary: Use at::empty instead. Pull Request resolved: pytorch#12360 Reviewed By: ezyang Differential Revision: D10215119 Pulled By: gchanan fbshipit-source-id: f9bb257dff1b1bf1ecd3a6e358c4791d81b5bd31
Summary: Add a pass to move all constants to the beginning of the graph, and deduplicate. This extends pytorch#10231 to also handle constants introduced in inlining, constant propagation, etc. Pull Request resolved: pytorch#12222 Reviewed By: driazati Differential Revision: D10201616 Pulled By: eellison fbshipit-source-id: bc9c5be26868c8b5414257a0d4462de025aeb9bd
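As a rough Python sketch of the idea behind such a constant-pooling pass (this is not the real torch::jit implementation; the dict-based node representation below is a made-up stand-in for the actual IR):

```python
# Toy model of the pass: hoist constants to the front of the graph and
# deduplicate constants that carry the same value. Purely illustrative;
# the real pass operates on torch::jit IR nodes, not dicts.
def pool_constants(nodes):
    seen = {}                    # constant value -> canonical node
    constants, rest = [], []
    for node in nodes:
        if node["kind"] == "prim::Constant":
            value = node["value"]
            if value not in seen:
                seen[value] = node
                constants.append(node)
            else:
                # duplicate constant: record the canonical node that replaces it
                node["replaced_by"] = seen[value]
        else:
            rest.append(node)
    return constants + rest      # constants first, then the remaining ops

graph = [
    {"kind": "aten::add", "value": None},
    {"kind": "prim::Constant", "value": 2},
    {"kind": "aten::mul", "value": None},
    {"kind": "prim::Constant", "value": 2},   # duplicate, gets deduplicated
]
print([n["kind"] for n in pool_constants(graph)])
# ['prim::Constant', 'aten::add', 'aten::mul']
```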
Summary: Pull Request resolved: pytorch#12433 To test 3d conv, we need to pass lists in the spec argument. We also don't want to set use_cudnn=True, which is the default in brew. Reviewed By: llyfacebook, csummersea Differential Revision: D10234315 fbshipit-source-id: 96a39992a97e020d6e9dac103e6d64df0cc1020b
pytorch#11957) Summary: Pull Request resolved: pytorch#11957 For distributed inference, we want to use the async_scheduling net to run the net, as we need its async part. However, according to profiling, async_net has a big overhead from dispatching tasks onto worker threads. This diff improves the issue by generating a smaller number of chains/tasks by grouping the sync ops that can be run in one shot. Note that it also schedules each individual async op as its own chain because, unlike gpu ops, rpc ops are not guaranteed to be linearized at the remote site. For example, if you have two rpc ops `op1->op2`, op2 won't implicitly block until op1 finishes. Therefore we need to put each async op in its own chain, as the async_scheduling net will only sync the tail of a chain. For all-sync-op nets, this change makes us `1.5X` slower than simple_net, while without the change it is `7X` slower. The next step is to work on the executor to make task scheduling faster, and to add a fallback path to run ops inline if it's an all-sync net. Reviewed By: ilia-cher Differential Revision: D9874140 fbshipit-source-id: fcd45328698c29211f2c06ee3287194acda12227
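As a rough sketch of the chaining idea described in that summary (not the actual Caffe2 async_scheduling executor, which is implemented in C++; the op dicts and the `is_async` flag are hypothetical stand-ins for real operator metadata):

```python
# Illustrative only: group consecutive sync ops into one chain, and keep each
# async op (e.g. an rpc op) as its own single-op chain, since the executor
# only synchronizes on the tail of a chain.
def group_into_chains(ops):
    chains, current = [], []
    for op in ops:
        if op["is_async"]:
            if current:
                chains.append(current)
                current = []
            chains.append([op])          # async op scheduled as its own chain
        else:
            current.append(op)           # sync ops accumulate into one chain
    if current:
        chains.append(current)
    return chains

ops = [
    {"name": "fc1", "is_async": False},
    {"name": "relu1", "is_async": False},
    {"name": "rpc_send", "is_async": True},
    {"name": "fc2", "is_async": False},
]
print([[op["name"] for op in chain] for chain in group_into_chains(ops)])
# [['fc1', 'relu1'], ['rpc_send'], ['fc2']] -> 3 chains instead of 4 tasks
```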
Summary: Pull Request resolved: pytorch#12253 Adding python bindings to unblock DAI development Reviewed By: duc0 Differential Revision: D10141621 fbshipit-source-id: efac7fb8a0cc787e1c4cc94515e673812529a997
Summary: Pull Request resolved: pytorch#12255 Simple algorithm to connect a subgraph Reviewed By: ZolotukhinM Differential Revision: D10141701 fbshipit-source-id: c79c5bc2be89100db602d0a5ff3d17e3dc332d8c
…2303) Summary: Adding back import{Node,Edge} as move{Node,Edge} and adding a new function moveSubgraph. Previous diff broke OSS Pull Request resolved: pytorch#12303 Differential Revision: D10182522 Pulled By: bwasti fbshipit-source-id: 9619431d6d1a44f128613a4f6d8b7f31232ccf28
) Summary: Pull Request resolved: pytorch#12455 - Mirror changes in pthreadpool Reviewed By: harouwu Differential Revision: D10240470 fbshipit-source-id: c1af769b5894f7865736fdaf4e0e5bf17c524614
Summary: Signed-off-by: Marcela Morales Quispe <marcela.morales.quispe@gmail.com> Pull Request resolved: pytorch#12440 Differential Revision: D10242642 Pulled By: SsnL fbshipit-source-id: f47d7579cf3df097c476a97b58149ca4b1eb17ab
Summary: This test only runs when you have torchvision installed, which is not the case on CI builds. When I run test_jit on my local machine, this fails, so fixing up the expect file here. Pull Request resolved: pytorch#12458 Differential Revision: D10244344 Pulled By: jamesr66a fbshipit-source-id: 728c5d9e6c37f807a0780066f20f6c31de84d544
Summary: Fix expect file that got out of sync Pull Request resolved: pytorch#12465 Differential Revision: D10244646 Pulled By: eellison fbshipit-source-id: 66d101d4c6c0a235ce9fa47dc3cce027624c86bc
… arg (pytorch#12353) Summary: ATenOp was handling `torch.where` incorrectly. Whereas the `torch.where` overload (and `aten::` function) had arguments in the order `Tensor condition, Tensor self, Tensor other`, ATenOp was emitting code that assumed that `self` was the 0th argument, and thus was trying to interpret the wrong value as the condition. Pull Request resolved: pytorch#12353 Differential Revision: D10218435 Pulled By: jamesr66a fbshipit-source-id: afe31c5d4f941e5fa500e6b0ef941346659c8d95
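For reference, the argument order in question, shown through the public Python binding `torch.where` (the example tensors below are made up for illustration):

```python
import torch

# torch.where(condition, input, other): the condition tensor comes first.
# The bug described above was that ATenOp assumed argument 0 was `self`
# and therefore read the wrong value as the condition.
cond = torch.tensor([True, False, True])
x = torch.tensor([1, 2, 3])
y = torch.tensor([10, 20, 30])
print(torch.where(cond, x, y))  # tensor([ 1, 20,  3])
```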
Summary: - Removed the old nccl file. - Made open-source NCCL a submodule. - Added CMake support to build NCCL itself. NCCL2 is now in the default build. Pull Request resolved: pytorch#12359 Reviewed By: orionr, yns88 Differential Revision: D10219665 Pulled By: teng-li fbshipit-source-id: 134ff47057512ba617b48bf390c1c816fff3f881
…duling"" (pytorch#12418) Summary: Pull Request resolved: pytorch#12418 Original commit changeset: 32921600925b Reviewed By: yinghai Differential Revision: D10231119 fbshipit-source-id: 7d09ea8de82ff2d911d9ded88d87af4226464d1b
Summary: Previously we tested if default-construction was noexcept, which doesn't really mean that the move constructor is noexcept too. Shuts up clang-tidy. Signed-off-by: Edward Z. Yang <ezyang@fb.com> CC goldsborough Pull Request resolved: pytorch#12369 Differential Revision: D10217348 Pulled By: ezyang fbshipit-source-id: b46437d8ac7a8d756cf03ed0c6bf4400db7ecde7
Summary: Pull Request resolved: pytorch#12391 Reviewed By: mlappelbaum Differential Revision: D10220000 fbshipit-source-id: 10fdbc8ebab931a5be31df964b5de5728048205d
Summary: Changes in this PR: 1. Intermediate Docker image is shared from build stage to test stage through ECR, in order to fix the Caffe2 flaky CUDA tests. 2. There are ~7 Caffe2 operator tests that are only flaky in `caffe2_py2_gcc4_8_ubuntu14_04_test` on CPU. Disabling those tests on that config only, which is okay to do because we are still running those tests in other test jobs. After this PR is merged, CircleCI will be running on master automatically, and will be running on PRs if the author rebased their PR onto the newest master (which we will ask all the authors to do when we switch off Jenkins for Linux). Pull Request resolved: pytorch#12389 Differential Revision: D10224267 Pulled By: yf225 fbshipit-source-id: dd1a90a425c3d13b870d3d328cb301eee2e6e2cd
Summary: Pull Request resolved: pytorch#12453 Differential Revision: D10244130 Pulled By: SsnL fbshipit-source-id: e425c76bfb721fe118a32ddd1fa6eca3a3cd86f0
Summary: Pull Request resolved: pytorch#12235 SSA is actually implicitly maintained so not only was this function not implemented, it never should be implemented. Reviewed By: duc0 Differential Revision: D10133928 fbshipit-source-id: e8e5e2386f8b57812b0be2c380af85ed07cd3152
Summary: Pull Request resolved: pytorch#12237 This diff creates named functions and cleans up a lot of the basic block usage throughout the code Reviewed By: duc0 Differential Revision: D10134363 fbshipit-source-id: d0c4ae0bbb726236a15251dbfd529d4fddcd9e9f
Summary: The value_info proto field was being processed in BuildGraph, but control flow blocks used buildBlocks instead. This PR moves that step to BuildBlock. I removed DecoderBase because it was making the code confusing and we never needed it in the first place. closes pytorch#12319 Pull Request resolved: pytorch#12351 Differential Revision: D10212411 Pulled By: li-roy fbshipit-source-id: 47f289a462a1ab7391ff57368185401673980233
Integrate from upstream
* rocRAND brings a Poisson RNG - don't disable those functions. * At the very least, the WARP_SIZE is wrong for ROCm. * Do not disable the renorm kernel. * Put the ifdef in directly to ensure better maintainability. * scalar_cast no longer exists. * Implement the one ifdef directly; the CUDA_VERSION check is no longer in the file.
* Add miopengemm as a proper, required dependency to LoadHIP. * Always install hip-thrust
Performance script fix
skkkumar pushed a commit to skkkumar/pytorch that referenced this pull request on Jul 29, 2025
Commit Messages: - Fix build error (ROCm#264) (ROCm#266) Co-authored-by: Prachi Gupta <pracgupt@amd.com> PRs: - ROCm/apex#266 Fixes: - https://example.com/issue-264 - https://example.com/issue-266
amd-sriram pushed a commit that referenced this pull request on Jul 29, 2025
Commit Messages: - Fix build error (#264) (#266) Co-authored-by: Prachi Gupta <pracgupt@amd.com> PRs: - ROCm/apex#266 Fixes: - https://example.com/issue-264 - https://example.com/issue-266