Fix Hash(c10::Scalar), account for garbage data in union #68201

wconstab · 2021-11-11T21:01:25Z

Summary:
Hash(c10::Scalar) incorrectly assumed that it was valid to hash every byte of the c10::Scalar struct.

Because c10::Scalar stores a union of (float/int/complex) types with different sizes, not all bytes are valid in all cases. Hash() should only read the bytes corresponding to the currently active type.
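
To illustrate the idea of hashing only the active member, here is a minimal, self-contained sketch. It does not use c10::Scalar or the real torch/csrc/lazy/core/hash.h code: `MiniScalar`, `HashBytes`, and `HashMiniScalar` are hypothetical names, and FNV-1a is swapped in only because it is short.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a Scalar-like type: a tag plus a union whose
// members have different sizes. This is NOT c10::Scalar's real layout.
struct MiniScalar {
  enum class Tag : uint8_t { Int, Double, ComplexDouble };
  Tag tag;
  union Payload {
    int64_t i;    // 8 bytes
    double d;     // 8 bytes
    double z[2];  // 16 bytes: real and imaginary parts
  } v;
};

// FNV-1a over a byte range (an assumption for illustration only).
inline uint64_t HashBytes(const void* data, size_t n,
                          uint64_t seed = 14695981039346656037ULL) {
  const auto* p = static_cast<const unsigned char*>(data);
  uint64_t h = seed;
  for (size_t k = 0; k < n; ++k) {
    h ^= p[k];
    h *= 1099511628211ULL;
  }
  return h;
}

// Hash the tag, then only the bytes of the member the tag says is active.
// Padding and the unused tail of the union are never read.
inline uint64_t HashMiniScalar(const MiniScalar& s) {
  const uint64_t h = HashBytes(&s.tag, sizeof(s.tag));
  switch (s.tag) {
    case MiniScalar::Tag::Int:
      return HashBytes(&s.v.i, sizeof(s.v.i), h);
    case MiniScalar::Tag::Double:
      return HashBytes(&s.v.d, sizeof(s.v.d), h);
    case MiniScalar::Tag::ComplexDouble:
      return HashBytes(s.v.z, sizeof(s.v.z), h);
  }
  return h;  // Unreachable for valid tags.
}
```

With this shape, two values that hold the same integer hash identically even if the bytes outside the active member differ.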

Test Plan: Added new unit tests. Verified that HashTest.Scalar failed with the original Hash() implementation and passes with the fix.

Differential Revision: D32367564

pytorch-probot · 2021-11-11T21:01:27Z

CI Flow Status

⚛️ CI Flow

Ruleset - Version: v1
Ruleset - File: https://github.com/wconstab/pytorch/blob/c77d02a0087cea29ed266973852de8d4225a376c/.github/generated-ciflow-ruleset.json
PR ciflow labels: ciflow/default

Workflows	Labels (bold enabled)	Status
Triggered Workflows
linux-bionic-py3.6-clang9	`ciflow/all`, `ciflow/cpu`, `ciflow/default`, `ciflow/linux`, `ciflow/noarch`, `ciflow/xla`	✅ triggered
linux-vulkan-bionic-py3.6-clang9	`ciflow/all`, `ciflow/cpu`, `ciflow/default`, `ciflow/linux`, `ciflow/vulkan`	✅ triggered
linux-xenial-cuda11.3-py3.6-gcc7	`ciflow/all`, `ciflow/cuda`, `ciflow/default`, `ciflow/linux`	✅ triggered
linux-xenial-py3-clang5-mobile-build	`ciflow/all`, `ciflow/default`, `ciflow/linux`, `ciflow/mobile`	✅ triggered
linux-xenial-py3-clang5-mobile-custom-build-dynamic	`ciflow/all`, `ciflow/default`, `ciflow/linux`, `ciflow/mobile`	✅ triggered
linux-xenial-py3-clang5-mobile-custom-build-static	`ciflow/all`, `ciflow/default`, `ciflow/linux`, `ciflow/mobile`	✅ triggered
linux-xenial-py3.6-clang7-asan	`ciflow/all`, `ciflow/cpu`, `ciflow/default`, `ciflow/linux`, `ciflow/sanitizers`	✅ triggered
linux-xenial-py3.6-clang7-onnx	`ciflow/all`, `ciflow/cpu`, `ciflow/default`, `ciflow/linux`, `ciflow/onnx`	✅ triggered
linux-xenial-py3.6-gcc5.4	`ciflow/all`, `ciflow/cpu`, `ciflow/default`, `ciflow/linux`	✅ triggered
linux-xenial-py3.6-gcc7	`ciflow/all`, `ciflow/cpu`, `ciflow/default`, `ciflow/linux`	✅ triggered
linux-xenial-py3.6-gcc7-bazel-test	`ciflow/all`, `ciflow/bazel`, `ciflow/cpu`, `ciflow/default`, `ciflow/linux`	✅ triggered
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single	`ciflow/all`, `ciflow/android`, `ciflow/cpu`, `ciflow/default`, `ciflow/linux`	✅ triggered
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single-full-jit	`ciflow/all`, `ciflow/android`, `ciflow/cpu`, `ciflow/default`, `ciflow/linux`	✅ triggered
win-vs2019-cpu-py3	`ciflow/all`, `ciflow/cpu`, `ciflow/default`, `ciflow/win`	✅ triggered
win-vs2019-cuda11.3-py3	`ciflow/all`, `ciflow/cuda`, `ciflow/default`, `ciflow/win`	✅ triggered
Skipped Workflows
caffe2-linux-xenial-py3.6-gcc5.4	`ciflow/all`, `ciflow/cpu`, `ciflow/linux`	🚫 skipped
docker-builds	`ciflow/all`	🚫 skipped
ios-12-5-1-arm64	`ciflow/all`, `ciflow/ios`, `ciflow/macos`	🚫 skipped
ios-12-5-1-arm64-coreml	`ciflow/all`, `ciflow/ios`, `ciflow/macos`	🚫 skipped
ios-12-5-1-arm64-custom-ops	`ciflow/all`, `ciflow/ios`, `ciflow/macos`	🚫 skipped
ios-12-5-1-arm64-full-jit	`ciflow/all`, `ciflow/ios`, `ciflow/macos`	🚫 skipped
ios-12-5-1-arm64-metal	`ciflow/all`, `ciflow/ios`, `ciflow/macos`	🚫 skipped
ios-12-5-1-x86-64	`ciflow/all`, `ciflow/ios`, `ciflow/macos`	🚫 skipped
ios-12-5-1-x86-64-coreml	`ciflow/all`, `ciflow/ios`, `ciflow/macos`	🚫 skipped
ios-12-5-1-x86-64-full-jit	`ciflow/all`, `ciflow/ios`, `ciflow/macos`	🚫 skipped
libtorch-linux-xenial-cuda10.2-py3.6-gcc7	`ciflow/all`, `ciflow/cuda`, `ciflow/libtorch`, `ciflow/linux`	🚫 skipped
libtorch-linux-xenial-cuda11.3-py3.6-gcc7	`ciflow/all`, `ciflow/cuda`, `ciflow/libtorch`, `ciflow/linux`	🚫 skipped
linux-bionic-cuda10.2-py3.9-gcc7	`ciflow/all`, `ciflow/cuda`, `ciflow/linux`, `ciflow/slow`	🚫 skipped
linux-xenial-py3-clang5-mobile-code-analysis	`ciflow/all`, `ciflow/linux`, `ciflow/mobile`	🚫 skipped
macos-10-15-py3-arm64	`ciflow/all`, `ciflow/macos`	🚫 skipped
macos-10-15-py3-lite-interpreter-x86-64	`ciflow/all`, `ciflow/macos`	🚫 skipped
macos-10-15-py3-x86-64	`ciflow/all`, `ciflow/macos`	🚫 skipped
parallelnative-linux-xenial-py3.6-gcc5.4	`ciflow/all`, `ciflow/cpu`, `ciflow/linux`	🚫 skipped
periodic-libtorch-linux-xenial-cuda11.1-py3.6-gcc7	`ciflow/all`, `ciflow/cuda`, `ciflow/libtorch`, `ciflow/linux`, `ciflow/scheduled`	🚫 skipped
periodic-linux-xenial-cuda10.2-py3-gcc7-slow-gradcheck	`ciflow/all`, `ciflow/cuda`, `ciflow/linux`, `ciflow/scheduled`, `ciflow/slow`, `ciflow/slow-gradcheck`	🚫 skipped
periodic-linux-xenial-cuda11.1-py3.6-gcc7	`ciflow/all`, `ciflow/cuda`, `ciflow/linux`, `ciflow/scheduled`	🚫 skipped
periodic-win-vs2019-cuda11.1-py3	`ciflow/all`, `ciflow/cuda`, `ciflow/scheduled`, `ciflow/win`	🚫 skipped

You can add a comment to the PR and tag @pytorchbot with the following commands:

# ciflow rerun, "ciflow/default" will always be added automatically
@pytorchbot ciflow rerun

# ciflow rerun with additional labels "-l <ciflow/label_name>", which is equivalent to adding these labels manually and trigger the rerun
@pytorchbot ciflow rerun -l ciflow/scheduled -l ciflow/slow

For more information, please take a look at the CI Flow Wiki.

facebook-github-bot · 2021-11-11T21:01:30Z

🔗 Helpful links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/68201
📄 Preview docs built from this PR
📄 Preview C++ docs built from this PR
🔧 Opt-in to CIFlow to control what jobs run on your PRs

💊 CI failures summary and remediations

As of commit c77d02a (more details on the Dr. CI page):

💚 💚 Looks good so far! There are no failures yet. 💚 💚

This comment was automatically generated by Dr. CI (expand for details).

Please report bugs/suggestions to the (internal) Dr. CI Users group.

facebook-github-bot · 2021-11-11T21:02:03Z

This pull request was exported from Phabricator. Differential Revision: D32367564

torch/csrc/lazy/core/hash.h

alanwaketan · 2021-11-11T21:16:02Z

test/cpp/lazy/test_misc.cpp

Can we do static_cast<uint8_t*> here? Can you explain the memory layout here a bit so that it's easier for me to understand why writing to "offset 0" makes sense? Is there any way we can tell the difference between a and b so that we can also verify that this operation succeeds?

There isn't any way to verify it without violating the API of the Scalar. I didn't know the memory layout either, except by experimenting. I found its sizeof() was 32 bytes, which surprised me. I don't know exactly what memory I'm stepping on here, but for the purposes of this experiment it's not important as long as it isn't memory used by the Long member field, which it isn't.

I did verify it though, by observing that the EXPECTs below all failed when I clobbered this memory without fixing the hash function.

I don't see the point of changing the cast. I don't benefit from any compiler safety checks here; in fact, I'm doing something 'unsafe' on purpose.

Just for you, I decided to try this :)
But it doesn't appear to work:
Static_cast from 'c10::Scalar *' to 'uint8_t *' (aka 'unsigned char *') is not allowed

Anyway, regular cast is fine.

Yeah, my bad. It should be reinterpret_cast for your use case:
*(reinterpret_cast<uint8_t*>(&b)) = 1;

Did a little research. It seems non-trivial to find any resources that answer the memory layout question. So, it's fine to keep the comment intact.
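
The clobber-and-compare check discussed in this thread can be sketched outside PyTorch as follows. Everything here is an assumption for illustration: `TaggedValue`, `HashActive`, and `ClobberKeepsHash` are hypothetical, the layout is not c10::Scalar's, and the sketch clobbers the last byte of the object (known to lie in the unused tail of the union) rather than offset 0 as the real test does.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical Scalar-like type. The union is 16 bytes wide, so when the
// 8-byte integer is active, the trailing 8 bytes are unused.
struct TaggedValue {
  bool is_double;
  union {
    int64_t i;
    double z[2];
  };
};

// Hash only the active value (with the tag mixed in), never the raw bytes
// of the whole object. The mixing constants are arbitrary odd numbers.
inline uint64_t HashActive(const TaggedValue& s) {
  uint64_t bits = 0;
  if (s.is_double) {
    std::memcpy(&bits, &s.z[0], sizeof(bits));
  } else {
    bits = static_cast<uint64_t>(s.i);
  }
  const uint64_t tag_mix = s.is_double ? 0x9e3779b97f4a7c15ULL : 0ULL;
  return (bits ^ tag_mix) * 0xff51afd7ed558ccdULL;
}

// Returns true when two logically equal values still hash equally after one
// of them has a byte outside its active member overwritten.
inline bool ClobberKeepsHash() {
  TaggedValue a;
  TaggedValue b;
  std::memset(&a, 0, sizeof(a));  // Start from known, identical bytes.
  std::memset(&b, 0, sizeof(b));
  a.i = 123;
  b.i = 123;

  // static_cast from TaggedValue* to unsigned char* is ill-formed, so the
  // byte view needs reinterpret_cast (or a C-style cast).
  auto* raw = reinterpret_cast<unsigned char*>(&b);
  raw[sizeof(b) - 1] = 1;  // Last byte sits in the unused tail of the union.

  const bool raw_bytes_differ = std::memcmp(&a, &b, sizeof(a)) != 0;
  return raw_bytes_differ && HashActive(a) == HashActive(b);
}
```

The `memcmp` result plays the role of the "can we tell a from b" check asked about above: it confirms the write landed, while the hash comparison confirms the garbage byte is ignored.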

Summary:
Pull Request resolved: pytorch#68201

Hash(c10::Scalar) made a bad assumption that it was valid to just hash over all the bytes of data of the c10::Scalar struct.

Because c10::Scalar stores a union of different (float/int/complex) types with different sizes, not all bytes are valid in all cases. Hash() should only read the bytes corresponding to the currently active type.

Test Plan: Added new unit tests. Verified HashTest.Scalar failed with the original Hash() impl and then fixed.

Differential Revision: D32367564

fbshipit-source-id: be1dbc4932890ebb70898184483b6776b068c6b0

facebook-github-bot · 2021-11-11T22:04:13Z

This pull request was exported from Phabricator. Differential Revision: D32367564

facebook-github-bot · 2021-11-11T22:05:19Z

This pull request was exported from Phabricator. Differential Revision: D32367564

facebook-github-bot · 2021-11-12T15:21:40Z

This pull request has been merged in dc24503.

Summary:
Pull Request resolved: #68201

Hash(c10::Scalar) made a bad assumption that it was valid to just hash over all the bytes of data of the c10::Scalar struct.

Because c10::Scalar stores a union of different (float/int/complex) types with different sizes, not all bytes are valid in all cases. Hash() should only read the bytes corresponding to the currently active type.

Test Plan: Added new unit tests. Verified HashTest.Scalar failed with the original Hash() impl and then fixed.

Reviewed By: alanwaketan

Differential Revision: D32367564

fbshipit-source-id: ac30dd4f6dd0513954986d3d23c0c11ba802c37b

pytorch-probot bot added the ciflow/default label Nov 11, 2021

facebook-github-bot added the cla signed label Nov 11, 2021

facebook-github-bot added the fb-exported label Nov 11, 2021

alanwaketan reviewed Nov 11, 2021

wconstab force-pushed the export-D32367564 branch from c36e18e to d34527e Compare November 11, 2021 22:04

wconstab force-pushed the export-D32367564 branch from d34527e to c77d02a Compare November 11, 2021 22:05

alanwaketan approved these changes Nov 11, 2021

facebook-github-bot closed this in dc24503 Nov 12, 2021

facebook-github-bot added the Merged label Nov 12, 2021


3 participants
