@ljk53 ljk53 commented Jun 7, 2021

Stack from ghstack:

To do a mobile selective build, we have several options:

  1. static dispatch;
  2. dynamic dispatch + static analysis (to create the dependency graph);
  3. dynamic dispatch + tracing.

Option 3 is still under development. For open source, we used to support
only option 1, and we currently support both 1 and 2.

This file is only used for option 2. It was introduced when we deprecated
static dispatch (1), to make sure we still had a low-friction selective
build workflow for dynamic dispatch (2).
As the name indicates, it is the default dependency graph that users
can fall back on if they don't want to run the static analyzer themselves.
CI runs the full workflow for option 2 on every PR, and it generates
the dependency graph on the fly instead of using the committed file.

The workflow that automatically updates the file has been broken for a
while. This has started to confuse other PyTorch developers, since people
are already editing the file by hand, and it may already be broken for
some models.

We recently reintroduced static dispatch, so we have decided to deprecate
this file now and to turn on static dispatch automatically when users run
a selective build without providing the static analysis graph.
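
The fallback described above can be sketched roughly as follows. This is only an illustration of the decision, not the actual build script: the function name `choose_selective_build_dispatch` and its `dependency_graph_path` parameter are hypothetical names made up for this sketch, and the real build may express the same choice through different options.

```python
from typing import Optional


def choose_selective_build_dispatch(dependency_graph_path: Optional[str]) -> str:
    """Hypothetical helper: pick the dispatch mode for a selective build.

    Assumption for this sketch: the caller has already decided to run a
    selective build, and passes the path to a static analysis dependency
    graph if one was provided (None or "" otherwise).
    """
    if not dependency_graph_path:
        # No dependency graph was provided, so fall back to static dispatch
        # instead of relying on a committed default graph.
        return "static"
    # A user-provided graph is available, so dynamic dispatch can be used.
    return "dynamic"


if __name__ == "__main__":
    print(choose_selective_build_dispatch(None))
    print(choose_selective_build_dispatch("my_op_deps.yaml"))
```

The point of the design is that a user who skips the static analyzer still gets a working selective build, without depending on a checked-in graph that can go stale.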

Tracing-based selective build is the solution we would ultimately like
to provide for OSS, but it will take more effort to polish and release.

Differential Revision: D28941020

This file is only used for open source selective build. It was introduced
when we deprecated static dispatch, to make sure we still had a
low-friction selective build workflow for dynamic dispatch.

As the name indicates, it is the *default* dependency graph that users
can fall back on if they don't want to run the static analyzer themselves.

The workflow that automatically updates the file has been broken for a
while. This has started to confuse other PyTorch developers, since people
are already editing the file by hand.

We recently reintroduced static dispatch, so we have decided to deprecate
this file now and to turn on static dispatch automatically when users run
a selective build without providing the static analysis graph.

Tracing-based selective build is the solution we would ultimately like
to provide for OSS, but it will take more effort to polish and release.

facebook-github-bot commented Jun 7, 2021

💊 CI failures summary and remediations

As of commit 3cb9832 (more details on the Dr. CI page):


  • 3/3 failures introduced in this PR

🕵️ 2 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_linux_bionic_py3_8_gcc9_coverage_test2 (1/2)

Step: "Run tests"

Jun 07 20:55:18 [E request_callback_no_python.c...quest type 275: Unexpected end of pickler archive.
Jun 07 20:55:18 frame #9: torch::distributed::rpc::RequestCallbackNoPython::processMessage(torch::distributed::rpc::Message&, std::vector<c10::Stream, std::allocator<c10::Stream> >) const + 0x97 (0x7f7f567acd97 in /opt/conda/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)
Jun 07 20:55:18 frame #10: torch::distributed::rpc::RequestCallback::operator()(torch::distributed::rpc::Message&, std::vector<c10::Stream, std::allocator<c10::Stream> >) const + 0x61 (0x7f7f567a0fd1 in /opt/conda/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)
Jun 07 20:55:18 frame #11: <unknown function> + 0x2145e1f (0x7f7f66792e1f in /opt/conda/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
Jun 07 20:55:18 frame #12: <unknown function> + 0x2146f95 (0x7f7f66793f95 in /opt/conda/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
Jun 07 20:55:18 frame #13: c10::ThreadPool::main_loop(unsigned long) + 0x3bc (0x7f7f63faa44c in /opt/conda/lib/python3.8/site-packages/torch/lib/libc10.so)
Jun 07 20:55:18 frame #14: <unknown function> + 0x787aa (0x7f7f63faa7aa in /opt/conda/lib/python3.8/site-packages/torch/lib/libc10.so)
Jun 07 20:55:18 frame #15: <unknown function> + 0xc819d (0x7f7f4d79419d in /opt/conda/lib/libstdc++.so.6)
Jun 07 20:55:18 frame #16: <unknown function> + 0x76db (0x7f7f758aa6db in /lib/x86_64-linux-gnu/libpthread.so.0)
Jun 07 20:55:18 frame #17: clone + 0x3f (0x7f7f755d371f in /lib/x86_64-linux-gnu/libc.so.6)
Jun 07 20:55:18 
Jun 07 20:55:18 [E request_callback_no_python.cpp:552] Received error while processing request type 275: Unexpected end of pickler archive.
Jun 07 20:55:18 Exception raised from readSlowWithBuffer at /var/lib/jenkins/workspace/torch/csrc/jit/serialization/unpickler.cpp:756 (most recent call first):
Jun 07 20:55:18 frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x59 (0x7f7f63fbbce9 in /opt/conda/lib/python3.8/site-packages/torch/lib/libc10.so)
Jun 07 20:55:18 frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, char const*) + 0x161 (0x7f7f63f73657 in /opt/conda/lib/python3.8/site-packages/torch/lib/libc10.so)
Jun 07 20:55:18 frame #2: <unknown function> + 0x8ba8c03 (0x7f7f563e8c03 in /opt/conda/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)
Jun 07 20:55:18 frame #3: torch::jit::Unpickler::run() + 0x130 (0x7f7f563f53a0 in /opt/conda/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)
Jun 07 20:55:18 frame #4: torch::jit::Unpickler::parse_ivalue() + 0x37 (0x7f7f563f7357 in /opt/conda/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)
Jun 07 20:55:18 frame #5: torch::jit::unpickle(std::function<unsigned long (char*, unsigned long)>, std::function<c10::StrongTypePtr (c10::QualifiedName const&)>, c10::ArrayRef<at::Tensor>) + 0x2b5 (0x7f7f563b3fe5 in /opt/conda/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)
Jun 07 20:55:18 frame #6: torch::jit::unpickle(char const*, unsigned long, std::function<c10::StrongTypePtr (c10::QualifiedName const&)>, c10::ArrayRef<at::Tensor>) + 0xf6 (0x7f7f563b4136 in /opt/conda/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)
Jun 07 20:55:18 frame #7: torch::distributed::autograd::CleanupAutogradContextReq::fromMessage(torch::distributed::rpc::Message const&) + 0xe1 (0x7f7f5678a291 in /opt/conda/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)
Jun 07 20:55:18 frame #8: torch::distributed::rpc::deserializeRequest(torch::distributed::rpc::Message const&) + 0x296 (0x7f7f567ea4c6 in /opt/conda/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)

See GitHub Actions build Linux CI (pytorch-linux-xenial-py3.6-gcc5.4) / test (2/2)

Step: "Test PyTorch"

2021-06-07T21:15:51.9254248Z [E request_callbac...quest type 275: Unexpected end of pickler archive.
2021-06-07T21:15:51.9228828Z frame #7: torch::distributed::autograd::CleanupAutogradContextReq::fromMessage(torch::distributed::rpc::Message const&) + 0xcb (0x7f7733098e8b in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
2021-06-07T21:15:51.9232613Z frame #8: torch::distributed::rpc::deserializeRequest(torch::distributed::rpc::Message const&) + 0x1ed (0x7f77330df8fd in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
2021-06-07T21:15:51.9236714Z frame #9: torch::distributed::rpc::RequestCallbackNoPython::processMessage(torch::distributed::rpc::Message&, std::vector<c10::Stream, std::allocator<c10::Stream> >) const + 0x7f (0x7f77330b3cef in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
2021-06-07T21:15:51.9241111Z frame #10: torch::distributed::rpc::RequestCallback::operator()(torch::distributed::rpc::Message&, std::vector<c10::Stream, std::allocator<c10::Stream> >) const + 0x57 (0x7f77330abfc7 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
2021-06-07T21:15:51.9244480Z frame #11: <unknown function> + 0xd323e0 (0x7f773b8893e0 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
2021-06-07T21:15:51.9247105Z frame #12: c10::ThreadPool::main_loop(unsigned long) + 0x2a3 (0x7f773a4fe283 in /opt/conda/lib/python3.6/site-packages/torch/lib/libc10.so)
2021-06-07T21:15:51.9248845Z frame #13: <unknown function> + 0xc8421 (0x7f772ef9b421 in /opt/conda/lib/libstdc++.so.6)
2021-06-07T21:15:51.9250615Z frame #14: <unknown function> + 0x76ba (0x7f774996e6ba in /lib/x86_64-linux-gnu/libpthread.so.0)
2021-06-07T21:15:51.9252214Z frame #15: clone + 0x6d (0x7f77496a451d in /lib/x86_64-linux-gnu/libc.so.6)
2021-06-07T21:15:51.9252923Z 
2021-06-07T21:15:51.9254248Z [E request_callback_no_python.cpp:552] Received error while processing request type 275: Unexpected end of pickler archive.
2021-06-07T21:15:51.9256577Z Exception raised from readSlowWithBuffer at /var/lib/jenkins/workspace/torch/csrc/jit/serialization/unpickler.cpp:756 (most recent call first):
2021-06-07T21:15:51.9259869Z frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x69 (0x7f773a50e329 in /opt/conda/lib/python3.6/site-packages/torch/lib/libc10.so)
2021-06-07T21:15:51.9263030Z frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, char const*) + 0xc5 (0x7f773a50aae5 in /opt/conda/lib/python3.6/site-packages/torch/lib/libc10.so)
2021-06-07T21:15:51.9265693Z frame #2: <unknown function> + 0x3de3be8 (0x7f7732e2abe8 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
2021-06-07T21:15:51.9268134Z frame #3: torch::jit::Unpickler::run() + 0xdf (0x7f7732e3515f in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
2021-06-07T21:15:51.9270756Z frame #4: torch::jit::Unpickler::parse_ivalue() + 0x2e (0x7f7732e352ce in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
2021-06-07T21:15:51.9274222Z frame #5: torch::jit::unpickle(std::function<unsigned long (char*, unsigned long)>, std::function<c10::StrongTypePtr (c10::QualifiedName const&)>, c10::ArrayRef<at::Tensor>) + 0x25c (0x7f7732e0823c in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
2021-06-07T21:15:51.9278242Z frame #6: torch::jit::unpickle(char const*, unsigned long, std::function<c10::StrongTypePtr (c10::QualifiedName const&)>, c10::ArrayRef<at::Tensor>) + 0xdd (0x7f7732e0874d in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
2021-06-07T21:15:51.9282219Z frame #7: torch::distributed::autograd::CleanupAutogradContextReq::fromMessage(torch::distributed::rpc::Message const&) + 0xcb (0x7f7733098e8b in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
2021-06-07T21:15:51.9286074Z frame #8: torch::distributed::rpc::deserializeRequest(torch::distributed::rpc::Message const&) + 0x1ed (0x7f77330df8fd in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)

1 failure not recognized by patterns:

Job: GitHub Actions Windows CI (pytorch-win-vs2019-cpu-py3) / render_test_results
Step: Unknown

@ljk53 ljk53 requested review from dhruvbird, ezyang and gchanan June 7, 2021 19:19

ljk53 commented Jun 7, 2021

@ljk53 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

ljk53 added a commit that referenced this pull request Jun 7, 2021
ghstack-source-id: b2e9541
Pull Request resolved: #59573

ljk53 commented Jun 7, 2021

@ljk53 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@dhruvbird dhruvbird left a comment

This looks good! Removing the file should prevent people from adding stuff to it.

ezyang commented Jun 8, 2021

thx for doing this!

@ljk53 merged this pull request in 501320e.

deniskokarev pushed a commit to deniskokarev/pytorch that referenced this pull request Jun 9, 2021
Summary:
Pull Request resolved: pytorch#59573

Differential Revision: D28941020

Test Plan: Imported from OSS

Reviewed By: dhruvbird

Pulled By: ljk53

fbshipit-source-id: 9977ab8568e2cc1bdcdecd3d22e29547ef63889e
@facebook-github-bot facebook-github-bot deleted the gh/ljk53/195/head branch June 11, 2021 14:17