Emit std::string instead of c10::string_view for Lazy IR class #74029

Merged: desertfire merged 1 commit into lazy_tensor_staging from binbao/merge2 on Mar 11, 2022

@desertfire (Contributor) commented Mar 10, 2022

Summary: c10::string_view may be pointing to a temp string, which is not
guaranteed to be valid when accessed later, so we store the passed-in
string_view into a string.

Fixes #73963

pytorch-bot bot commented Mar 10, 2022

CI Flow Status

⚛️ CI Flow

Ruleset - Version: v1
Ruleset - File: https://github.com/pytorch/pytorch/blob/5814817bba3fab9bee01a2ab8a5bc344f35db951/.github/generated-ciflow-ruleset.json
PR ciflow labels: ciflow/default
Add ciflow labels to this PR to trigger more builds:

Workflows Labels (bold enabled) Status
Triggered Workflows
linux-binary-conda ciflow/binaries, ciflow/binaries_conda, ciflow/default ✅ triggered
linux-binary-libtorch-cxx11-abi ciflow/binaries, ciflow/binaries_libtorch, ciflow/default ✅ triggered
linux-binary-libtorch-pre-cxx11 ciflow/binaries, ciflow/binaries_libtorch, ciflow/default ✅ triggered
linux-binary-manywheel ciflow/binaries, ciflow/binaries_wheel, ciflow/default ✅ triggered
linux-bionic-py3.7-clang9 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/noarch, ciflow/trunk ✅ triggered
linux-bionic-rocm4.5-py3.7 ciflow/all, ciflow/default, ciflow/linux, ciflow/rocm, ciflow/trunk ✅ triggered
linux-docs ciflow/all, ciflow/cpu, ciflow/default, ciflow/docs, ciflow/linux, ciflow/trunk ✅ triggered
linux-vulkan-bionic-py3.7-clang9 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk, ciflow/vulkan ✅ triggered
linux-xenial-cuda11.3-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-cuda11.3-py3.7-gcc7-bazel-test ciflow/all, ciflow/bazel, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-py3-clang5-mobile-build ciflow/all, ciflow/default, ciflow/linux, ciflow/mobile, ciflow/trunk ✅ triggered
linux-xenial-py3-clang5-mobile-custom-build-static ciflow/all, ciflow/default, ciflow/linux, ciflow/mobile, ciflow/trunk ✅ triggered
linux-xenial-py3.7-clang7-asan ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/sanitizers, ciflow/trunk ✅ triggered
linux-xenial-py3.7-clang7-onnx ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/onnx, ciflow/trunk ✅ triggered
linux-xenial-py3.7-gcc5.4 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-py3.7-gcc7 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-py3.7-gcc7-no-ops ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
macos-arm64-binary-conda ciflow/binaries, ciflow/binaries_conda, ciflow/default ✅ triggered
macos-arm64-binary-wheel ciflow/binaries, ciflow/binaries_wheel, ciflow/default ✅ triggered
macos-binary-conda ciflow/binaries, ciflow/binaries_conda, ciflow/default ✅ triggered
macos-binary-libtorch-cxx11-abi ciflow/binaries, ciflow/binaries_libtorch, ciflow/default ✅ triggered
macos-binary-libtorch-pre-cxx11 ciflow/binaries, ciflow/binaries_libtorch, ciflow/default ✅ triggered
macos-binary-wheel ciflow/binaries, ciflow/binaries_wheel, ciflow/default ✅ triggered
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single ciflow/all, ciflow/android, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single-full-jit ciflow/all, ciflow/android, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
win-vs2019-cpu-py3 ciflow/all, ciflow/cpu, ciflow/default, ciflow/trunk, ciflow/win ✅ triggered
win-vs2019-cuda11.3-py3 ciflow/all, ciflow/cuda, ciflow/default, ciflow/trunk, ciflow/win ✅ triggered
windows-binary-libtorch-cxx11-abi ciflow/binaries, ciflow/binaries_libtorch, ciflow/default ✅ triggered
windows-binary-libtorch-pre-cxx11 ciflow/binaries, ciflow/binaries_libtorch, ciflow/default ✅ triggered
windows-binary-wheel ciflow/binaries, ciflow/binaries_wheel, ciflow/default ✅ triggered
Skipped Workflows
caffe2-linux-xenial-py3.7-gcc5.4 ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk 🚫 skipped
docker-builds ciflow/all, ciflow/trunk 🚫 skipped
ios-12-5-1-arm64 ciflow/all, ciflow/ios, ciflow/macos, ciflow/scheduled 🚫 skipped
ios-12-5-1-arm64-coreml ciflow/all, ciflow/ios, ciflow/macos, ciflow/scheduled 🚫 skipped
ios-12-5-1-arm64-custom-ops ciflow/all, ciflow/ios, ciflow/macos, ciflow/scheduled 🚫 skipped
ios-12-5-1-arm64-metal ciflow/all, ciflow/ios, ciflow/macos, ciflow/scheduled 🚫 skipped
ios-12-5-1-x86-64 ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-x86-64-coreml ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
libtorch-linux-xenial-cuda10.2-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/trunk 🚫 skipped
libtorch-linux-xenial-cuda11.3-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/trunk 🚫 skipped
linux-bionic-cuda10.2-py3.9-gcc7 ciflow/all, ciflow/cuda, ciflow/linux, ciflow/slow, ciflow/trunk 🚫 skipped
linux-docs-push ciflow/all, ciflow/cpu, ciflow/linux, ciflow/scheduled 🚫 skipped
linux-xenial-cuda11.3-py3.7-gcc7-no-ops ciflow/all, ciflow/cuda, ciflow/linux, ciflow/trunk 🚫 skipped
macos-10-15-py3-arm64 ciflow/all, ciflow/macos, ciflow/trunk 🚫 skipped
macos-10-15-py3-lite-interpreter-x86-64 ciflow/all, ciflow/macos, ciflow/trunk 🚫 skipped
macos-11-py3-x86-64 ciflow/all, ciflow/macos, ciflow/trunk 🚫 skipped
parallelnative-linux-xenial-py3.7-gcc5.4 ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk 🚫 skipped
periodic-libtorch-linux-bionic-cuda11.5-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-libtorch-linux-xenial-cuda11.1-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-linux-bionic-cuda11.5-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-linux-xenial-cuda10.2-py3-gcc7-slow-gradcheck ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled, ciflow/slow, ciflow/slow-gradcheck 🚫 skipped
periodic-linux-xenial-cuda11.1-py3.7-gcc7-debug ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-win-vs2019-cuda11.1-py3 ciflow/all, ciflow/cuda, ciflow/scheduled, ciflow/win 🚫 skipped
periodic-win-vs2019-cuda11.5-py3 ciflow/all, ciflow/cuda, ciflow/scheduled, ciflow/win 🚫 skipped
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-build ciflow/all, ciflow/android, ciflow/cpu, ciflow/linux, ciflow/trunk 🚫 skipped
pytorch-xla-linux-bionic-py3.7-clang8 ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk, ciflow/xla 🚫 skipped

facebook-github-bot (Contributor) commented Mar 10, 2022

💊 CI failures summary and remediations

As of commit 46df399 (more details on the Dr. CI page):



🕵️ 1 new failure recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See GitHub Actions build linux-xenial-py3.7-gcc5.4 / test (backwards_compat, 1, 1, linux.2xlarge) (1/1)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

2022-03-11T00:41:40.2768255Z RuntimeError:
2022-03-11T00:41:39.7814968Z Author: PyTorch Team
2022-03-11T00:41:39.7815216Z Author-email: packages@pytorch.org
2022-03-11T00:41:39.7815432Z License: BSD-3
2022-03-11T00:41:39.7815763Z Location: /opt/conda/lib/python3.7/site-packages
2022-03-11T00:41:39.7816007Z Requires: typing-extensions
2022-03-11T00:41:39.7816218Z Required-by: 
2022-03-11T00:41:39.8123125Z + python check_forward_backward_compatibility.py --existing-schemas nightly_schemas.txt
2022-03-11T00:41:40.2767447Z Traceback (most recent call last):
2022-03-11T00:41:40.2767800Z   File "check_forward_backward_compatibility.py", line 308, in <module>
2022-03-11T00:41:40.2768053Z     s = parse_schema(line.strip())
2022-03-11T00:41:40.2768255Z RuntimeError: 
2022-03-11T00:41:40.2768502Z Unknown custom class type profiler._RecordFunction. Please ensure it is registered.:
2022-03-11T00:41:40.2769057Z profiler::_record_function_exit._RecordFunction(__torch__.torch.classes.profiler._RecordFunction _0) -> ()
2022-03-11T00:41:40.2769430Z                                                                                  ~~~~~~~~~~~~~~~ <--- HERE
2022-03-11T00:41:40.2769575Z 
2022-03-11T00:41:40.3458711Z + cleanup
2022-03-11T00:41:40.3458921Z + retcode=1
2022-03-11T00:41:40.3459128Z + set +x
2022-03-11T00:41:40.3498216Z ##[error]Process completed with exit code 1.
2022-03-11T00:41:40.3529584Z ##[group]Run # Ensure the working directory gets chowned back to the current user

❄️ 1 failure tentatively classified as flaky

but reruns have not yet been triggered to confirm:

See GitHub Actions build linux-bionic-rocm4.5-py3.7 / test (distributed, 1, 1, linux.rocm.gpu) (1/1)

Step: "Test" (full log | diagnosis details | 🔁 rerun) ❄️

2022-03-11T03:12:08.2021900Z RuntimeError: Proc...ated or timed out after 100.06461071968079 seconds
2022-03-11T03:12:08.2004862Z ERROR [100.095s]: test_local_optimizer_parity (__main__.TestZeroRedundancyOptimizerDistributed)
2022-03-11T03:12:08.2007968Z When combined with DDP, check that ZeroRedundancyOptimizer(optimizer) and the same monolithic optimizer
2022-03-11T03:12:08.2010809Z ----------------------------------------------------------------------
2022-03-11T03:12:08.2012293Z Traceback (most recent call last):
2022-03-11T03:12:08.2013825Z   File "/opt/conda/lib/python3.7/site-packages/torch/testing/_internal/common_distributed.py", line 484, in wrapper
2022-03-11T03:12:08.2014931Z     self._join_processes(fn)
2022-03-11T03:12:08.2016364Z   File "/opt/conda/lib/python3.7/site-packages/torch/testing/_internal/common_distributed.py", line 703, in _join_processes
2022-03-11T03:12:08.2017784Z     self._check_return_codes(elapsed_time)
2022-03-11T03:12:08.2019379Z   File "/opt/conda/lib/python3.7/site-packages/torch/testing/_internal/common_distributed.py", line 755, in _check_return_codes
2022-03-11T03:12:08.2020668Z     i, elapsed_time
2022-03-11T03:12:08.2021900Z RuntimeError: Process 0 terminated or timed out after 100.06461071968079 seconds
2022-03-11T03:12:08.2022832Z 
2022-03-11T03:12:08.2024108Z ----------------------------------------------------------------------
2022-03-11T03:12:08.2030441Z Ran 27 tests in 309.546s
2022-03-11T03:12:08.2031158Z 
2022-03-11T03:12:08.2032047Z FAILED (errors=1, skipped=9, unexpected successes=3)
2022-03-11T03:12:08.2032620Z 
2022-03-11T03:12:08.2032949Z Generating XML reports...
2022-03-11T03:12:08.2164933Z Generated XML report: test-reports/python-unittest/distributed.optim.test_zero_redundancy_optimizer/TEST-TestZeroRedundancyOptimizerDistributed-20220311030658.xml
2022-03-11T03:12:08.2179959Z Generated XML report: test-reports/python-unittest/distributed.optim.test_zero_redundancy_optimizer/TEST-TestZeroRedundancyOptimizerSingleRank-20220311030658.xml
2022-03-11T03:12:09.2127442Z Traceback (most recent call last):

This comment was automatically generated by Dr. CI (expand for details).

Please report bugs/suggestions to the (internal) Dr. CI Users group.


@alanwaketan (Collaborator) left a comment

LGTM.

@@ -105,7 +105,7 @@ def gen(self, f: Union[NativeFunctionsGroup, NativeFunction]) -> List[str]:
 node_ctor_args = ", ".join([f"const {i.cpp_type()}& {i.name}" for i in all_types])
 scalar_initializers = ",\n ".join([f"{t.name}({t.name})" for t in scalar_types])
 comma_if_scalar_initializers = ",\n" if len(scalar_initializers) else ""
-scalar_decls = "\n ".join([f"{t.cpp_type()} {t.name};" for t in scalar_types])
+scalar_decls = "\n ".join([f"std::string {t.name};" if t.cpp_type() == "c10::string_view" else f"{t.cpp_type()} {t.name};" for t in scalar_types])
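
The changed line can be read more easily when split into a small helper. The following is only a sketch of the same idea and not the code that was merged: `FakeScalar`, `member_decl`, and `gen_scalar_decls` are hypothetical names standing in for the codegen's real type objects, and the argument names are made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FakeScalar:
    """Hypothetical stand-in for the codegen's scalar type objects."""
    name: str
    type_str: str

    def cpp_type(self) -> str:
        return self.type_str


def member_decl(t: FakeScalar) -> str:
    # A string_view member would not own its characters, so emit an
    # owning std::string member instead; other types pass through as-is.
    if t.cpp_type() == "c10::string_view":
        return f"std::string {t.name};"
    return f"{t.cpp_type()} {t.name};"


def gen_scalar_decls(scalar_types: List[FakeScalar]) -> str:
    # Join the per-member declarations the same way the one-liner does.
    return "\n  ".join(member_decl(t) for t in scalar_types)


decls = gen_scalar_decls([
    FakeScalar("approximate", "c10::string_view"),
    FakeScalar("dim", "int64_t"),
])
print(decls)
```

Splitting the conditional out of the list comprehension also keeps each line short, which is relevant to the lint concern raised in the review.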

Contributor

I do want the lint to pass, so the line can't be too long.

More importantly, I think we should probably fix this inside the data model (basically, t.cpp_type() should probably return std::string for us).

It's fine to land this without updating the data model, but it might be easy to do that at the same time.

Please TAL at my refactor PR #73939. It might be easier to modify the data model after that PR lands, or reviewing it might help your understanding of the data model.

Contributor Author

I will break the line to pass lint.

Having t.cpp_type() return c10::string_view seems more consistent with other types. For example, there is another call to cpp_type() at line 105, whose result seems more natural there because the parameter is declared as const c10::string_view& and the passed-in variable is also declared as c10::string_view at the call site in core.
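
To make the two positions in this thread concrete, here is a rough sketch of a data model that keeps the argument type and the storage type separate. It is an illustration only: `ScalarArg`, `storage_type`, and the `OWNING_TYPE` table are hypothetical and do not come from the PyTorch codegen or from PR #73939.

```python
from dataclasses import dataclass

# Assumed mapping from non-owning argument types to owning member types.
OWNING_TYPE = {"c10::string_view": "std::string"}


@dataclass(frozen=True)
class ScalarArg:
    """Hypothetical data-model entry for one scalar argument."""
    name: str
    arg_type: str

    def cpp_type(self) -> str:
        # Type used in the constructor signature, matching the call site.
        return self.arg_type

    def storage_type(self) -> str:
        # Type used for the class member; falls back to the argument type.
        return OWNING_TYPE.get(self.arg_type, self.arg_type)


arg = ScalarArg("approximate", "c10::string_view")
ctor_param = f"const {arg.cpp_type()}& {arg.name}"
member = f"{arg.storage_type()} {arg.name};"
print(ctor_param)
print(member)
```

Under this assumption the constructor keeps taking `const c10::string_view&`, while the member declaration comes out as `std::string`, so the conversion lives in the data model rather than in the template string.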

Contributor Author

I will merge the diff now and update it later.

@wconstab (Contributor) left a comment

Nice find. This is an ugly bug! I wonder if we have any other bugs like this. It would be easy enough to review the types we are using in our IR and make sure all of them are correctly copied into owning memory.
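
A review like this could start from a simple scan of the member types the codegen emits. The sketch below is one possible starting point, not an existing tool: `find_non_owning` is a hypothetical helper, and the list of view-like types is an assumption that would need to be checked against the types actually used in the IR.

```python
from typing import Iterable, List

# Assumed, non-exhaustive list of C++ view types that do not own their data.
NON_OWNING_PREFIXES = ("c10::string_view", "c10::ArrayRef", "at::IntArrayRef")


def find_non_owning(member_types: Iterable[str]) -> List[str]:
    # Flag any member type that begins with a known non-owning view type.
    return [t for t in member_types if t.startswith(NON_OWNING_PREFIXES)]


flagged = find_non_owning([
    "int64_t",
    "c10::string_view",
    "std::vector<int64_t>",
    "c10::ArrayRef<int64_t>",
])
print(flagged)
```

Anything flagged would then need a manual look to decide whether the node should hold an owning copy instead.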

@desertfire desertfire merged commit bb49d0d into lazy_tensor_staging Mar 11, 2022
@github-actions github-actions bot deleted the binbao/merge2 branch February 15, 2024 01:54
4 participants