
Conversation

jfix71
Contributor

@jfix71 commented Sep 30, 2021

Summary:
We use `split_module` to split the input model that we want to const fold into const and non-const subgraphs. Previously we took the non-const graph and tried to hack it back into the same signature as the input model; however, that approach was complex and buggy.

Instead, refactor to keep using the base split module that contains both the const and non-const graphs. This means we:

  • Inline the non-const graph into the split module
  • Remove the const graph from the module and replace it with a getattr whose attribute is populated when we `run_folding`
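
The two steps above can be sketched with a self-contained toy model. All names below (`split_const`, `FoldedModule`, the dict-based graph encoding) are hypothetical stand-ins for illustration; the real pass operates on `torch.fx` `GraphModule`s via `split_module`:

```python
# Toy sketch of the folding strategy, assuming a graph is a dict mapping
# node name -> ("input",) for runtime inputs, or (fn, *arg_names) for
# computed nodes, listed in topological order. These names are hypothetical;
# the real pass works on torch.fx GraphModules.

def split_const(graph):
    """Return the const-foldable nodes: nodes none of whose transitive
    dependencies is a runtime input."""
    const = set()
    for name, spec in graph.items():
        if spec[0] != "input" and all(arg in const for arg in spec[1:]):
            const.add(name)
    return const

class FoldedModule:
    """Keeps both subgraphs in one module; const results are stored as
    attributes, analogous to the inserted getattr node."""
    def __init__(self, graph, output):
        self.graph, self.output = graph, output
        self.const_nodes = split_const(graph)
        self.folded = {}  # populated by run_folding

    def run_folding(self):
        # Evaluate the const subgraph once and stash its results.
        for name in self.graph:
            if name in self.const_nodes:
                fn, *args = self.graph[name]
                self.folded[name] = fn(*(self.folded[a] for a in args))

    def __call__(self, **inputs):
        env = {**inputs, **self.folded}  # folded results enter like attrs
        for name, spec in self.graph.items():
            if name not in env:
                fn, *args = spec
                env[name] = fn(*(env[a] for a in args))
        return env[self.output]
```

Calling `run_folding()` precomputes the const subgraph; later calls only execute the non-const nodes, which is the behavior the refactor preserves without rewriting the module's signature.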

Test Plan: Added test coverage for the newly supported folding, and updated other tests for the new strategy.

Differential Revision: D31293307

@pytorch-probot

pytorch-probot bot commented Sep 30, 2021

CI Flow Status

⚛️ CI Flow

Ruleset - Version: v1
Ruleset - File: https://github.com/jfix71/pytorch/blob/447d55da21782cf447ecbafda987e26a1192448c/.github/generated-ciflow-ruleset.json
PR ciflow labels: ciflow/default

| Workflow | Labels (bold = enabled) | Status |
| --- | --- | --- |
| linux-bionic-py3.6-clang9 | ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/noarch, ciflow/xla | ✅ triggered |
| linux-xenial-cuda11.3-py3.6-gcc7 | ciflow/all, ciflow/cuda, ciflow/default, ciflow/linux | ✅ triggered |
| linux-xenial-py3.6-clang7-asan | ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/sanitizers | ✅ triggered |
| linux-xenial-py3.6-gcc5.4 | ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux | ✅ triggered |
| linux-xenial-py3.6-gcc7-bazel-test | ciflow/all, ciflow/bazel, ciflow/cpu, ciflow/default, ciflow/linux | ✅ triggered |
| win-vs2019-cpu-py3 | ciflow/all, ciflow/cpu, ciflow/default, ciflow/win | ✅ triggered |
| win-vs2019-cuda11.3-py3 | ciflow/all, ciflow/cuda, ciflow/default, ciflow/win | ✅ triggered |
| libtorch-linux-xenial-cuda10.2-py3.6-gcc7 | ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux | 🚫 skipped |
| libtorch-linux-xenial-cuda11.3-py3.6-gcc7 | ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux | 🚫 skipped |
| linux-bionic-cuda10.2-py3.9-gcc7 | ciflow/all, ciflow/cuda, ciflow/linux, ciflow/slow | 🚫 skipped |
| linux-xenial-cuda10.2-py3.6-gcc7 | ciflow/all, ciflow/cuda, ciflow/linux, ciflow/slow | 🚫 skipped |
| parallelnative-linux-xenial-py3.6-gcc5.4 | ciflow/all, ciflow/cpu, ciflow/linux | 🚫 skipped |
| periodic-libtorch-linux-xenial-cuda11.1-py3.6-gcc7 | ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/scheduled | 🚫 skipped |
| periodic-linux-xenial-cuda11.1-py3.6-gcc7 | ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled | 🚫 skipped |
| periodic-win-vs2019-cuda11.1-py3 | ciflow/all, ciflow/cuda, ciflow/scheduled, ciflow/win | 🚫 skipped |
| puretorch-linux-xenial-py3.6-gcc5.4 | ciflow/all, ciflow/cpu, ciflow/linux | 🚫 skipped |

You can add a comment to the PR and tag @pytorchbot with the following commands:

```
# ciflow rerun, "ciflow/default" will always be added automatically
@pytorchbot ciflow rerun

# ciflow rerun with additional labels "-l <ciflow/label_name>", which is equivalent to adding these labels manually and triggering the rerun
@pytorchbot ciflow rerun -l ciflow/scheduled -l ciflow/slow
```

For more information, please take a look at the CI Flow Wiki.

@jfix71 force-pushed the export-D31293307 branch 4 times, most recently from a41cd9a to 3d29535 on September 30, 2021 at 17:49
@jfix71
Contributor Author

jfix71 commented Sep 30, 2021

@pytorchbot ciflow rerun linux-xenial-py3.6-gcc5.4

@jfix71
Contributor Author

jfix71 commented Oct 1, 2021

@pytorchbot ciflow rerun linux-xenial-py3.6

…orrectly handle non-single-Tensor outputs (pytorch#65933)

Summary:
Pull Request resolved: pytorch#65933

We use `split_module` to split the input model that we want to const fold into const and non-const subgraphs. Previously we were taking the non-const graph and trying to hack it back into the same signature as the input model. However this was complex/buggy.

Instead, refactor to just keep using the base split module that contains both const and non-const graphs. This means we:
- Inline the non-const graph into the split module
- Remove the const graph from the module and replace it with a getattr that will be run to insert that attr when we `run_folding`

Test Plan: Added test coverage to cover newly supported folding, and updated other tests for new strategy.

Reviewed By: yinghai

Differential Revision: D31293307

fbshipit-source-id: e1174337e1473c7c3a5f82f0227a6fd0ce42db8c
@facebook-github-bot
Contributor

facebook-github-bot commented Oct 1, 2021

🔗 Helpful links

💊 CI failures summary and remediations

As of commit 447d55d (more details on the Dr. CI page):


  • 1/5 failures introduced in this PR
  • 4/5 broken upstream at merge base 4f5ea59 on Sep 30 from 10:12pm to 10:27pm

🕵️ 1 new failure recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See GitHub Actions build linux-xenial-py3.6-gcc7-bazel-test / build-and-test (1/1)

Step: "Unknown" (full log | diagnosis details | 🔁 rerun)

2021-10-01T05:54:48.3206885Z ModuleNotFoundError: No module named 'boto3'
2021-10-01T05:54:48.3199275Z   File "/home/ec2-user/actions-runner/_work/pytorch/pytorch/tools/stats/upload_binary_size_to_scuba.py", line 155, in <module>
2021-10-01T05:54:48.3200072Z     register_rds_schema("binary_size", schema_from_sample(sample_data))
2021-10-01T05:54:48.3200983Z   File "/home/ec2-user/actions-runner/_work/pytorch/pytorch/tools/stats/scribe.py", line 94, in register_rds_schema
2021-10-01T05:54:48.3201606Z     invoke_rds(event)
2021-10-01T05:54:48.3202345Z   File "/home/ec2-user/actions-runner/_work/pytorch/pytorch/tools/stats/scribe.py", line 72, in invoke_rds
2021-10-01T05:54:48.3203108Z     return invoke_lambda("rds-proxy", events)
2021-10-01T05:54:48.3203908Z   File "/home/ec2-user/actions-runner/_work/pytorch/pytorch/tools/stats/scribe.py", line 30, in invoke_lambda
2021-10-01T05:54:48.3204713Z     res = aws_lambda().invoke(FunctionName=name, Payload=json.dumps(payload).encode())
2021-10-01T05:54:48.3205619Z   File "/home/ec2-user/actions-runner/_work/pytorch/pytorch/tools/stats/scribe.py", line 21, in aws_lambda
2021-10-01T05:54:48.3206270Z     import boto3  # type: ignore[import]
2021-10-01T05:54:48.3206885Z ModuleNotFoundError: No module named 'boto3'
2021-10-01T05:54:48.3279210Z ##[group]Run # detached container should get cleaned up by teardown_ec2_linux
2021-10-01T05:54:48.3279980Z # detached container should get cleaned up by teardown_ec2_linux
2021-10-01T05:54:48.3280476Z export SHARD_NUMBER=0
2021-10-01T05:54:48.3280933Z # TODO: Stop building test binaries as part of the build phase
2021-10-01T05:54:48.3281522Z # Make sure we copy test results from bazel-testlogs symlink to
2021-10-01T05:54:48.3282068Z # a regular directory ./test/test-reports
2021-10-01T05:54:48.3282526Z container_name=$(docker run \
2021-10-01T05:54:48.3282932Z   -e BUILD_ENVIRONMENT \
2021-10-01T05:54:48.3283333Z   -e CUSTOM_TEST_ARTIFACT_BUILD_DIR \
2021-10-01T05:54:48.3283727Z   -e GITHUB_ACTIONS \

🚧 4 fixed upstream failures:

These were probably caused by upstream breakages that were already fixed.

Please rebase on the viable/strict branch.

If your commit is older than viable/strict, run these commands:

```
git fetch https://github.com/pytorch/pytorch viable/strict
git rebase FETCH_HEAD
```

This comment was automatically generated by Dr. CI.




3 participants