[pytorch] deprecate default_op_deps.yaml #59573

ljk53 · 2021-06-07T18:41:50Z

Stack from ghstack:

#59573 [pytorch] deprecate default_op_deps.yaml

To do mobile selective build, we have several options:

1. static dispatch;
2. dynamic dispatch + static analysis (to create the dependency graph);
3. dynamic dispatch + tracing.

We are developing option 3. For open source, we used to support only
option 1, and we currently support both 1 and 2.

This file is only used for option 2. It was introduced when we
deprecated static dispatch (1), to make sure we had a low-friction
selective build workflow for dynamic dispatch (2).
As the name indicates, it is the default dependency graph that users
can try if they do not want to run the static analyzer themselves.
A CI job runs the full workflow of option 2 on every PR, and it creates
the dependency graph on the fly instead of using the committed file.
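As a rough illustration of what a dependency graph is used for in option 2, the sketch below expands a list of root operators into everything they can reach. The in-memory graph, the operator names in it, and the helper `transitive_closure` are assumptions made for this example; they are not the format or contents of default_op_deps.yaml, nor the actual build code.

```python
from collections import deque
from typing import Dict, Iterable, List, Set

# Hypothetical dependency graph: operator name -> operators it may call.
# The entries are illustrative only, not taken from the committed file.
DEP_GRAPH: Dict[str, List[str]] = {
    "aten::conv2d": ["aten::convolution"],
    "aten::convolution": ["aten::_convolution"],
    "aten::_convolution": [],
    "aten::relu": [],
    "aten::addmm": [],
}


def transitive_closure(roots: Iterable[str], graph: Dict[str, List[str]]) -> Set[str]:
    """Return every operator reachable from `roots`, including the roots."""
    seen: Set[str] = set()
    queue = deque(roots)
    while queue:
        op = queue.popleft()
        if op in seen:
            continue  # already expanded; also guards against cycles
        seen.add(op)
        # Operators missing from the graph are treated as having no dependencies.
        queue.extend(graph.get(op, []))
    return seen


# Operators a model calls directly (hypothetical root list).
selected = transitive_closure(["aten::conv2d", "aten::relu"], DEP_GRAPH)
print(sorted(selected))
```

With dynamic dispatch, an operator that is reachable but missing from the graph would be left out of the build, which is why a stale or hand-edited graph can break some models.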

The workflow that automatically updates the file has been broken for
a while. This has started to confuse other PyTorch developers, because
people are already editing the file manually, and it may already be
broken for some models.

We recently reintroduced static dispatch, so we have decided to deprecate
this file now and to turn on static dispatch automatically if users run
a selective build without providing the static analysis graph.
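The fallback rule described here can be sketched as a small decision function. This is a minimal sketch, not the actual build logic: `DispatchMode`, `choose_dispatch_mode`, and both parameters are hypothetical names, and the choice of dynamic dispatch for a non-selective build is an assumption of this example.

```python
from enum import Enum
from typing import Optional


class DispatchMode(Enum):
    STATIC = "static"
    DYNAMIC = "dynamic"


def choose_dispatch_mode(selective_build: bool, op_deps_path: Optional[str]) -> DispatchMode:
    """Pick a dispatch mode for a build (hypothetical helper).

    A selective build without a static analysis graph falls back to
    static dispatch; every other case uses dynamic dispatch (assumed).
    """
    if selective_build and not op_deps_path:
        # No dependency graph was supplied, so there is nothing to drive
        # dynamic-dispatch pruning; use static dispatch instead.
        return DispatchMode.STATIC
    return DispatchMode.DYNAMIC


# Selective build, no graph provided: falls back to static dispatch.
print(choose_dispatch_mode(True, None))
# Selective build with a user-generated graph (hypothetical path).
print(choose_dispatch_mode(True, "my_op_deps.yaml"))
```

Users who still want option 2 would generate their own graph with the static analyzer and pass it in, instead of relying on a committed default.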

Tracing-based selective build is the ultimate solution we would
like to provide for OSS, but it will take more effort to polish
and release.

Differential Revision: D28941020


facebook-github-bot · 2021-06-07T18:41:55Z

💊 CI failures summary and remediations

As of commit 3cb9832 (more details on the Dr. CI page):

3/3 failures introduced in this PR

🕵️ 2 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

pytorch_linux_bionic_py3_8_gcc9_coverage_test2 (1/2)

Step: "Run tests"

Jun 07 20:55:18 [E request_callback_no_python.c...quest type 275: Unexpected end of pickler archive.

Jun 07 20:55:18 frame #9: torch::distributed::rpc::RequestCallbackNoPython::processMessage(torch::distributed::rpc::Message&, std::vector<c10::Stream, std::allocator<c10::Stream> >) const + 0x97 (0x7f7f567acd97 in /opt/conda/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)
Jun 07 20:55:18 frame #10: torch::distributed::rpc::RequestCallback::operator()(torch::distributed::rpc::Message&, std::vector<c10::Stream, std::allocator<c10::Stream> >) const + 0x61 (0x7f7f567a0fd1 in /opt/conda/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)
Jun 07 20:55:18 frame #11: <unknown function> + 0x2145e1f (0x7f7f66792e1f in /opt/conda/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
Jun 07 20:55:18 frame #12: <unknown function> + 0x2146f95 (0x7f7f66793f95 in /opt/conda/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
Jun 07 20:55:18 frame #13: c10::ThreadPool::main_loop(unsigned long) + 0x3bc (0x7f7f63faa44c in /opt/conda/lib/python3.8/site-packages/torch/lib/libc10.so)
Jun 07 20:55:18 frame #14: <unknown function> + 0x787aa (0x7f7f63faa7aa in /opt/conda/lib/python3.8/site-packages/torch/lib/libc10.so)
Jun 07 20:55:18 frame #15: <unknown function> + 0xc819d (0x7f7f4d79419d in /opt/conda/lib/libstdc++.so.6)
Jun 07 20:55:18 frame #16: <unknown function> + 0x76db (0x7f7f758aa6db in /lib/x86_64-linux-gnu/libpthread.so.0)
Jun 07 20:55:18 frame #17: clone + 0x3f (0x7f7f755d371f in /lib/x86_64-linux-gnu/libc.so.6)
Jun 07 20:55:18 
Jun 07 20:55:18 [E request_callback_no_python.cpp:552] Received error while processing request type 275: Unexpected end of pickler archive.
Jun 07 20:55:18 Exception raised from readSlowWithBuffer at /var/lib/jenkins/workspace/torch/csrc/jit/serialization/unpickler.cpp:756 (most recent call first):
Jun 07 20:55:18 frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x59 (0x7f7f63fbbce9 in /opt/conda/lib/python3.8/site-packages/torch/lib/libc10.so)
Jun 07 20:55:18 frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, char const*) + 0x161 (0x7f7f63f73657 in /opt/conda/lib/python3.8/site-packages/torch/lib/libc10.so)
Jun 07 20:55:18 frame #2: <unknown function> + 0x8ba8c03 (0x7f7f563e8c03 in /opt/conda/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)
Jun 07 20:55:18 frame #3: torch::jit::Unpickler::run() + 0x130 (0x7f7f563f53a0 in /opt/conda/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)
Jun 07 20:55:18 frame #4: torch::jit::Unpickler::parse_ivalue() + 0x37 (0x7f7f563f7357 in /opt/conda/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)
Jun 07 20:55:18 frame #5: torch::jit::unpickle(std::function<unsigned long (char*, unsigned long)>, std::function<c10::StrongTypePtr (c10::QualifiedName const&)>, c10::ArrayRef<at::Tensor>) + 0x2b5 (0x7f7f563b3fe5 in /opt/conda/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)
Jun 07 20:55:18 frame #6: torch::jit::unpickle(char const*, unsigned long, std::function<c10::StrongTypePtr (c10::QualifiedName const&)>, c10::ArrayRef<at::Tensor>) + 0xf6 (0x7f7f563b4136 in /opt/conda/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)
Jun 07 20:55:18 frame #7: torch::distributed::autograd::CleanupAutogradContextReq::fromMessage(torch::distributed::rpc::Message const&) + 0xe1 (0x7f7f5678a291 in /opt/conda/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)
Jun 07 20:55:18 frame #8: torch::distributed::rpc::deserializeRequest(torch::distributed::rpc::Message const&) + 0x296 (0x7f7f567ea4c6 in /opt/conda/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)

Linux CI (pytorch-linux-xenial-py3.6-gcc5.4) / test (2/2)

Step: "Test PyTorch"

2021-06-07T21:15:51.9254248Z [E request_callbac...quest type 275: Unexpected end of pickler archive.

2021-06-07T21:15:51.9228828Z frame #7: torch::distributed::autograd::CleanupAutogradContextReq::fromMessage(torch::distributed::rpc::Message const&) + 0xcb (0x7f7733098e8b in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
2021-06-07T21:15:51.9232613Z frame #8: torch::distributed::rpc::deserializeRequest(torch::distributed::rpc::Message const&) + 0x1ed (0x7f77330df8fd in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
2021-06-07T21:15:51.9236714Z frame #9: torch::distributed::rpc::RequestCallbackNoPython::processMessage(torch::distributed::rpc::Message&, std::vector<c10::Stream, std::allocator<c10::Stream> >) const + 0x7f (0x7f77330b3cef in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
2021-06-07T21:15:51.9241111Z frame #10: torch::distributed::rpc::RequestCallback::operator()(torch::distributed::rpc::Message&, std::vector<c10::Stream, std::allocator<c10::Stream> >) const + 0x57 (0x7f77330abfc7 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
2021-06-07T21:15:51.9244480Z frame #11: <unknown function> + 0xd323e0 (0x7f773b8893e0 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
2021-06-07T21:15:51.9247105Z frame #12: c10::ThreadPool::main_loop(unsigned long) + 0x2a3 (0x7f773a4fe283 in /opt/conda/lib/python3.6/site-packages/torch/lib/libc10.so)
2021-06-07T21:15:51.9248845Z frame #13: <unknown function> + 0xc8421 (0x7f772ef9b421 in /opt/conda/lib/libstdc++.so.6)
2021-06-07T21:15:51.9250615Z frame #14: <unknown function> + 0x76ba (0x7f774996e6ba in /lib/x86_64-linux-gnu/libpthread.so.0)
2021-06-07T21:15:51.9252214Z frame #15: clone + 0x6d (0x7f77496a451d in /lib/x86_64-linux-gnu/libc.so.6)
2021-06-07T21:15:51.9252923Z 
2021-06-07T21:15:51.9254248Z [E request_callback_no_python.cpp:552] Received error while processing request type 275: Unexpected end of pickler archive.
2021-06-07T21:15:51.9256577Z Exception raised from readSlowWithBuffer at /var/lib/jenkins/workspace/torch/csrc/jit/serialization/unpickler.cpp:756 (most recent call first):
2021-06-07T21:15:51.9259869Z frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x69 (0x7f773a50e329 in /opt/conda/lib/python3.6/site-packages/torch/lib/libc10.so)
2021-06-07T21:15:51.9263030Z frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, char const*) + 0xc5 (0x7f773a50aae5 in /opt/conda/lib/python3.6/site-packages/torch/lib/libc10.so)
2021-06-07T21:15:51.9265693Z frame #2: <unknown function> + 0x3de3be8 (0x7f7732e2abe8 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
2021-06-07T21:15:51.9268134Z frame #3: torch::jit::Unpickler::run() + 0xdf (0x7f7732e3515f in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
2021-06-07T21:15:51.9270756Z frame #4: torch::jit::Unpickler::parse_ivalue() + 0x2e (0x7f7732e352ce in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
2021-06-07T21:15:51.9274222Z frame #5: torch::jit::unpickle(std::function<unsigned long (char*, unsigned long)>, std::function<c10::StrongTypePtr (c10::QualifiedName const&)>, c10::ArrayRef<at::Tensor>) + 0x25c (0x7f7732e0823c in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
2021-06-07T21:15:51.9278242Z frame #6: torch::jit::unpickle(char const*, unsigned long, std::function<c10::StrongTypePtr (c10::QualifiedName const&)>, c10::ArrayRef<at::Tensor>) + 0xdd (0x7f7732e0874d in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
2021-06-07T21:15:51.9282219Z frame #7: torch::distributed::autograd::CleanupAutogradContextReq::fromMessage(torch::distributed::rpc::Message const&) + 0xcb (0x7f7733098e8b in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
2021-06-07T21:15:51.9286074Z frame #8: torch::distributed::rpc::deserializeRequest(torch::distributed::rpc::Message const&) + 0x1ed (0x7f77330df8fd in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)

1 failure not recognized by patterns:

Job: Windows CI (pytorch-win-vs2019-cpu-py3) / render_test_results
Step: Unknown

This comment was automatically generated by Dr. CI.


ljk53 · 2021-06-07T19:24:26Z

@ljk53 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.


ljk53 · 2021-06-07T19:48:54Z

@ljk53 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

dhruvbird

This looks good! Removing the file should prevent people from adding stuff to it.

ezyang · 2021-06-08T01:54:51Z

thx for doing this!

facebook-github-bot · 2021-06-08T02:38:58Z

@ljk53 merged this pull request in 501320e.

Summary: Pull Request resolved: pytorch#59573
Differential Revision: D28941020
Test Plan: Imported from OSS
Reviewed By: dhruvbird
Pulled By: ljk53
fbshipit-source-id: 9977ab8568e2cc1bdcdecd3d22e29547ef63889e

facebook-github-bot added the cla signed label Jun 7, 2021

ljk53 requested review from dhruvbird, ezyang and gchanan June 7, 2021 19:19

dhruvbird approved these changes Jun 7, 2021


facebook-github-bot closed this in 501320e Jun 8, 2021

facebook-github-bot added the Merged label Jun 8, 2021

facebook-github-bot deleted the gh/ljk53/195/head branch June 11, 2021 14:17
