hameerabbasi
Collaborator

Since the last one was apparently reverted.

@hameerabbasi hameerabbasi requested a review from ezyang March 27, 2020 11:47
@hameerabbasi hameerabbasi changed the title Torch function benchmark Add __torch_function__ benchmarks. Mar 27, 2020
@dr-ci

dr-ci bot commented Mar 27, 2020

💊 CircleCI build failures summary and remediations

As of commit 369c0ce (more details on the Dr. CI page):


  • 4/4 failures introduced in this PR

🕵️ 4 new failures recognized by patterns

The following build failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_linux_backward_compatibility_check_test (1/4)

Step: "Test" (full log | pattern match details) <confirmed not flaky by 2 failures>

Mar 30 17:13:56 The PR is introducing backward incompatible changes to the operator library. Please contact PyTorch team to confirm whether this change is wanted or not.
Mar 30 17:13:56 processing existing schema:  aten::sparse_coo_tensor.size(int[] size, *, int dtype, int layout, Device device, bool pin_memory=False) -> (Tensor) 
Mar 30 17:13:56 processing existing schema:  aten::sparse_coo_tensor.indices(Tensor indices, Tensor values, *, int? dtype=None, int? layout=None, Device? device=None, bool? pin_memory=None) -> (Tensor) 
Mar 30 17:13:56 processing existing schema:  aten::sparse_coo_tensor.indices_size(Tensor indices, Tensor values, int[] size, *, int? dtype=None, int? layout=None, Device? device=None, bool? pin_memory=None) -> (Tensor) 
Mar 30 17:13:56 processing existing schema:  aten::split_with_sizes(Tensor self, int[] split_sizes, int dim=0) -> (Tensor[]) 
Mar 30 17:13:56 processing existing schema:  aten::squeeze(Tensor(a) self) -> (Tensor(a)) 
Mar 30 17:13:56 processing existing schema:  aten::squeeze.dim(Tensor(a) self, int dim) -> (Tensor(a)) 
Mar 30 17:13:56 processing existing schema:  aten::stft(Tensor self, int n_fft, int? hop_length=None, int? win_length=None, Tensor? window=None, bool normalized=False, bool onesided=True) -> (Tensor) 
Mar 30 17:13:56 skipping schema:  aten::sub_.Tensor(Tensor(a!) self, Tensor other, *, Scalar alpha=1) -> (Tensor(a!)) 
Mar 30 17:13:56 skipping schema:  aten::sub_.Scalar(Tensor(a!) self, Scalar other, Scalar alpha=1) -> (Tensor(a!)) 
Mar 30 17:13:56 processing existing schema:  aten::t(Tensor(a) self) -> (Tensor(a)) 
Mar 30 17:13:56 The PR is introducing backward incompatible changes to the operator library. Please contact PyTorch team to confirm whether this change is wanted or not.  
Mar 30 17:13:56  
Mar 30 17:13:56 Broken ops: [ 
Mar 30 17:13:56 	aten::owner(RRef(t) self) -> (__torch__.torch.classes.dist_rpc.WorkerInfo) 
Mar 30 17:13:56 	prepacked::conv2d_clamp_run(Tensor X, __torch__.torch.classes.xnnpack.Conv2dOpContext W_prepack) -> (Tensor Y) 
Mar 30 17:13:56 	prepacked::conv2d_clamp_prepack(Tensor W, Tensor? B, int[2] stride, int[2] padding, int[2] dilation, int groups, float? output_min=None, float? output_max=None) -> (__torch__.torch.classes.xnnpack.Conv2dOpContext) 
Mar 30 17:13:56 	prepacked::linear_clamp_run(Tensor X, __torch__.torch.classes.xnnpack.LinearOpContext W_prepack) -> (Tensor Y) 
Mar 30 17:13:56 	prepacked::linear_clamp_prepack(Tensor W, Tensor? B=None, float? output_min=None, float? output_max=None) -> (__torch__.torch.classes.xnnpack.LinearOpContext) 
Mar 30 17:13:56 ] 
Mar 30 17:13:56 + cleanup 
Mar 30 17:13:56 + retcode=1 

See CircleCI build pytorch_linux_xenial_py3_6_gcc5_4_test (2/4)

Step: "Test" (full log | pattern match details) <confirmed not flaky by 2 failures>

Mar 30 18:35:33 [E request_callback_impl.cpp:94] Received error while processing request type 2: PickleError: ScriptModules cannot be deepcopied using copy.deepcopy or saved using torch.save. Mixed serialization of script and non-script modules is not supported. For purely script modules use my_script_module.save() instead.
Mar 30 18:35:33   /opt/conda/lib/python3.6/site-packages/torch/distributed/rpc/internal.py(86): serialize 
Mar 30 18:35:33   /opt/conda/lib/python3.6/site-packages/torch/distributed/rpc/internal.py(135): serialize 
Mar 30 18:35:33  
Mar 30 18:35:33 [E request_callback_impl.cpp:94] Received error while processing request type 2: PickleError: ScriptModules cannot be deepcopied using copy.deepcopy or saved using torch.save. Mixed serialization of script and non-script modules is not supported. For purely script modules use my_script_module.save(<filename>) instead. 
Mar 30 18:35:33  
Mar 30 18:35:33 At: 
Mar 30 18:35:33   /opt/conda/lib/python3.6/site-packages/torch/jit/__init__.py(1773): __getstate__ 
Mar 30 18:35:33   /opt/conda/lib/python3.6/site-packages/torch/distributed/rpc/internal.py(86): serialize 
Mar 30 18:35:33   /opt/conda/lib/python3.6/site-packages/torch/distributed/rpc/internal.py(135): serialize 
Mar 30 18:35:33  
Mar 30 18:35:33 [E request_callback_impl.cpp:94] Received error while processing request type 2: PickleError: ScriptModules cannot be deepcopied using copy.deepcopy or saved using torch.save. Mixed serialization of script and non-script modules is not supported. For purely script modules use my_script_module.save(<filename>) instead. 
Mar 30 18:35:33  
Mar 30 18:35:33 At: 
Mar 30 18:35:33   /opt/conda/lib/python3.6/site-packages/torch/jit/__init__.py(1773): __getstate__ 
Mar 30 18:35:33   /opt/conda/lib/python3.6/site-packages/torch/distributed/rpc/internal.py(86): serialize 
Mar 30 18:35:33   /opt/conda/lib/python3.6/site-packages/torch/distributed/rpc/internal.py(135): serialize 
Mar 30 18:35:33  
Mar 30 18:35:33 ok (1.121s) 
Mar 30 18:35:34   test_unexepected_kwarg_is_specified (__main__.JitRpcTestWithSpawn) ... ok (1.117s) 
Mar 30 18:35:35   test_user_rrefs_confirmed (__main__.JitRpcTestWithSpawn) ... ok (1.118s) 
Mar 30 18:35:36   test_user_rrefs_confirmed_remote (__main__.JitRpcTestWithSpawn) ... ok (1.117s) 

See CircleCI build pytorch_linux_xenial_cuda10_2_cudnn7_py3_gcc7_test (3/4)

Step: "Test" (full log | pattern match details) <confirmed not flaky by 2 failures>

Mar 30 18:46:37 RuntimeError: test_autograd failed!
Mar 30 18:46:37 Generated XML report: test-reports/python-unittest/TEST-TestAutograd-20200330183849.xml 
Mar 30 18:46:37 Generated XML report: test-reports/python-unittest/TEST-TestAutogradDeviceTypeCPU-20200330183849.xml 
Mar 30 18:46:37 Generated XML report: test-reports/python-unittest/TEST-TestAutogradDeviceTypeCUDA-20200330183849.xml 
Mar 30 18:46:37 Generated XML report: test-reports/python-unittest/TEST-TestAutogradFunctional-20200330183849.xml 
Mar 30 18:46:37 Generated XML report: test-reports/python-unittest/TEST-TestMultithreadAutograd-20200330183849.xml 
Mar 30 18:46:37 Traceback (most recent call last): 
Mar 30 18:46:37   File "test/run_test.py", line 682, in <module> 
Mar 30 18:46:37     main() 
Mar 30 18:46:37   File "test/run_test.py", line 675, in main 
Mar 30 18:46:37     raise RuntimeError(message) 
Mar 30 18:46:37 RuntimeError: test_autograd failed! 
Mar 30 18:46:38 + cleanup 
Mar 30 18:46:38 + retcode=1 
Mar 30 18:46:38 + set +x 
Mar 30 18:46:38 =================== sccache compilation log =================== 
Mar 30 18:46:38 =========== If your build fails, please take a look at the log above for possible reasons =========== 
Mar 30 18:46:38 Compile requests                 0 
Mar 30 18:46:38 Compile requests executed        0 
Mar 30 18:46:38 Cache hits                       0 
Mar 30 18:46:38 Cache misses                     0 
Mar 30 18:46:38 Cache timeouts                   0 

See CircleCI build pytorch_linux_xenial_py3_clang5_asan_test (4/4)

Step: "Test" (full log | pattern match details) <confirmed not flaky by 2 failures>

Mar 30 17:22:42 caused by: Connection refused (os error 111)
Mar 30 17:22:42 +++ eval 'extract_trap_cmd ' 
Mar 30 17:22:42 ++++ extract_trap_cmd 
Mar 30 17:22:42 ++++ printf '%s\n' '' 
Mar 30 17:22:42 +++ printf '%s\n' cleanup 
Mar 30 17:22:42 ++ trap -- ' 
Mar 30 17:22:42 cleanup' EXIT 
Mar 30 17:22:42 ++ which sccache 
Mar 30 17:22:42 ++ sccache --stop-server 
Mar 30 17:22:42 Stopping sccache server... 
Mar 30 17:22:42 error: couldn't connect to server 
Mar 30 17:22:42 caused by: Connection refused (os error 111) 
Mar 30 17:22:42 ++ true 
Mar 30 17:22:42 ++ rm /var/lib/jenkins/sccache_error.log 
Mar 30 17:22:42 ++ SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log 
Mar 30 17:22:42 ++ SCCACHE_IDLE_TIMEOUT=1200 
Mar 30 17:22:42 ++ RUST_LOG=sccache::server=error 
Mar 30 17:22:42 ++ sccache --start-server 
Mar 30 17:22:42 Starting sccache server... 
Mar 30 17:22:42 ++ sccache --zero-stats 
Mar 30 17:22:42 Compile requests                 0 
Mar 30 17:22:42 Compile requests executed        0 

@hameerabbasi hameerabbasi force-pushed the torch-function-benchmark branch from 05049cb to 09ae99a Compare March 27, 2020 11:53
@hameerabbasi hameerabbasi force-pushed the torch-function-benchmark branch 2 times, most recently from 24b004e to 7a84697 Compare March 30, 2020 08:34
@hameerabbasi hameerabbasi force-pushed the torch-function-benchmark branch from 7a84697 to 4b34fb7 Compare March 30, 2020 09:34
@ezyang
Contributor

ezyang commented Mar 30, 2020

Last time #34645

Contributor

You haven't fixed the issue that caused CI to fail?

Mar 26 20:13:22 + python bench.py -n 1 -m 1
Mar 26 20:13:22 ~/workspace/benchmarks/overrides_benchmark ~/workspace
Mar 26 20:13:22 Traceback (most recent call last):
Mar 26 20:13:22   File "bench.py", line 67, in <module>
Mar 26 20:13:22     main()
Mar 26 20:13:22   File "bench.py", line 61, in main
Mar 26 20:13:22     t.__name__, (10 ** 6) * bench_min, (10 ** 6) * bench_std,
Mar 26 20:13:22 UnicodeEncodeError: 'ascii' codec can't encode character '\u03bc' in position 54: ordinal not in range(128)

You may have to push to pytorch/pytorch on a branch named ci-all/your-branch-name to trigger the relevant ci job

Collaborator Author

Ah, I assumed it was hitting that error when parsing the file, not when printing the char.
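
For context, the traceback above is raised while the formatted line is encoded for an ASCII-only stdout, not while the source file is parsed. The following is a minimal sketch of that distinction and of one possible workaround; `format_timing` and `write_with_encoding` are hypothetical helpers written for illustration and are not taken from `bench.py`.

```python
import io


def format_timing(name, mean_s, std_s, encoding):
    """Format a timing line; fall back to 'us' if 'μ' cannot be encoded.

    Hypothetical helper, not the formatting code used in bench.py.
    """
    unit = "\u03bcs"
    try:
        unit.encode(encoding or "ascii")
    except (UnicodeEncodeError, LookupError):
        unit = "us"  # ASCII-safe fallback
    return "{}: {:.3f} {} +/- {:.3f} {}".format(
        name, 1e6 * mean_s, unit, 1e6 * std_s, unit
    )


def write_with_encoding(text, encoding):
    """Write text through a stream with the given encoding, as print() does."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding)
    stream.write(text)  # the encode step happens here, at output time
    stream.flush()
    stream.detach()  # keep the underlying buffer open
    return raw.getvalue()


if __name__ == "__main__":
    import sys

    print(format_timing("Tensor", 1.5e-6, 2e-7, sys.stdout.encoding))
```

Checking the target encoding before formatting keeps the output readable on UTF-8 terminals while still working on CI machines whose locale is ASCII. Whether the PR took this route or simply replaced the character is not stated in the conversation.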

Collaborator Author

@hameerabbasi hameerabbasi Mar 30, 2020

> You may have to push to pytorch/pytorch on a branch named ci-all/your-branch-name to trigger the relevant ci job

Pushed to ci-all/torch-function-benchmarks

@hameerabbasi hameerabbasi force-pushed the torch-function-benchmark branch from fa968a5 to 369c0ce Compare March 30, 2020 16:13
@vincentqb vincentqb added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Mar 30, 2020
@hameerabbasi
Collaborator Author

This time, the failures really are unrelated.

Contributor

@facebook-github-bot facebook-github-bot left a comment

@ezyang is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@ezyang merged this pull request in 8c534bb.

@suo
Member

suo commented Apr 1, 2020

I think this broke master:

Apr 01 15:52:49 + python python pyspybench.py Tensor -n 1
Apr 01 16:33:10 python: can't open file 'python': [Errno 2] No such file or directory

example build: https://app.circleci.com/pipelines/github/pytorch/pytorch/149594/workflows/e8dd000c-8a01-4938-9fa8-ab9598b46e65/jobs/5020601
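
The log above shows the interpreter name twice, so the second `python` is treated as the path of a script to run. A small sketch that reproduces this failure mode in isolation, assuming an empty working directory; `run_python` is a hypothetical helper and the commands are stand-ins, not the actual benchmark invocation.

```python
import subprocess
import sys
import tempfile


def run_python(args):
    """Run the current interpreter with args in an empty temp directory."""
    with tempfile.TemporaryDirectory() as workdir:
        return subprocess.run(
            [sys.executable] + list(args),
            cwd=workdir,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )


# Duplicated token: the extra "python" is taken as the script path.
broken = run_python(["python", "-c", "print('hi')"])
# Intended form: the interpreter name appears only once.
fixed = run_python(["-c", "print('hi')"])

if __name__ == "__main__":
    print(broken.returncode, broken.stderr.strip())
    print(fixed.returncode, fixed.stdout.strip())
```

The broken call exits with a non-zero status and a "can't open file" message on stderr, which matches the shape of the error in the CI log.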

@hameerabbasi
Collaborator Author

I apologize. I wonder why the CI didn't catch that.

facebook-github-bot pushed a commit that referenced this pull request Apr 10, 2020
Summary:
Re-land of #35530 and #34645
Pull Request resolved: #36138

Differential Revision: D20893770

Pulled By: ezyang

fbshipit-source-id: 75ab688a086f5fb87412a853df5246c0c39704ca
ashishfarmer pushed a commit to ashishfarmer/pytorch that referenced this pull request Apr 13, 2020
Summary:
Re-land of pytorch#35530 and pytorch#34645
Pull Request resolved: pytorch#36138

Differential Revision: D20893770

Pulled By: ezyang

fbshipit-source-id: 75ab688a086f5fb87412a853df5246c0c39704ca