Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[rpc] Fix assertion on vector length during message parsing #108414

Closed

Conversation

apach301
Copy link
Contributor

@apach301 apach301 commented Sep 1, 2023

Hi!

I've been fuzzing different pytorch modules with with sydr-fuzz, and found a heap buffer overflow error that occurs during Python object deserialization routine. Vector with IValues is verified to contain at least 3 elements, which are subsequently removed from vector. The rest of vector is passed further, where it is expected to contain at least one more element. The crash occurs on empty vector.

Docker to reproduce found error: Dockerfile.

PoC:

crash-6d634f38a76bfeaa1fffc9472e8ea7b88ee8e776.txt

ASAN report

==339647==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x604000105388 at pc 0x000000c2b3bc bp 0x7fffffffb8d0 sp 0x7fffffffb8c8
READ of size 4 at 0x604000105388 thread T0
    #0 0xc2b3bb in c10::IValue::isString() const /pytorch/aten/src/ATen/core/ivalue.h:685:27
    #1 0xc2b3bb in c10::IValue::toStringRef[abi:cxx11]() const /pytorch/aten/src/ATen/core/ivalue_inl.h:2308:3
    #2 0x101ce65f in torch::distributed::rpc::SerializedPyObj::fromIValues(std::vector<c10::IValue, std::allocator<c10::IValue> >) /pytorch/torch/csrc/distributed/rpc/types.cpp:103:39
    #3 0x1006a7a0 in torch::distributed::rpc::PythonRemoteCall::fromMessage(torch::distributed::rpc::Message const&) /pytorch/torch/csrc/distributed/rpc/python_remote_call.cpp:58:26
    #4 0x101d02e1 in torch::distributed::rpc::deserializeRequest(torch::distributed::rpc::Message const&) /pytorch/torch/csrc/distributed/rpc/utils.cpp:111:14
    #5 0x8db738 in LLVMFuzzerTestOneInput /message_deserialize.cc:192:27
    #6 0x8d84cd in ExecuteFilesOnyByOne /AFLplusplus/utils/aflpp_driver/aflpp_driver.c:255:7
    #7 0x8d82d8 in LLVMFuzzerRunDriver /AFLplusplus/utils/aflpp_driver/aflpp_driver.c
    #8 0x8d7e98 in main /AFLplusplus/utils/aflpp_driver/aflpp_driver.c:300:10
    #9 0x7ffff7a37082 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x24082) (BuildId: 1878e6b475720c7c51969e69ab2d276fae6d1dee)
    #10 0x817c4d in _start (/message_deserialize_afl+0x817c4d)

0x604000105388 is located 8 bytes to the left of 48-byte region [0x604000105390,0x6040001053c0)
allocated by thread T0 here:
    #0 0x8d54ca in operator new(unsigned long) /llvm-project-llvmorg-14.0.6/compiler-rt/lib/asan/asan_new_delete.cpp:95:3

SUMMARY: AddressSanitizer: heap-buffer-overflow /pytorch/aten/src/ATen/core/ivalue.h:685:27 in c10::IValue::isString() const

@pytorch-bot
Copy link

pytorch-bot bot commented Sep 1, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/108414

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Merge Blocking SEVs

There is 1 active merge blocking SEVs. Please view them below:

If you must merge, use @pytorchbot merge -f.

❌ 1 New Failure

As of commit 807277b with merge base 591cb77 (image):

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@ezyang
Copy link
Contributor

ezyang commented Sep 4, 2023

Any chance of getting a test case? Thanks!

@ezyang ezyang added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Sep 4, 2023
@ezyang
Copy link
Contributor

ezyang commented Sep 4, 2023

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Sep 4, 2023
@pytorchmergebot
Copy link
Collaborator

Merge failed

Reason: Not merging any PRs at the moment because there is a merge blocking https://github.com/pytorch/pytorch/labels/ci:%20sev issue open at:
#108524

Details for Dev Infra team Raised by workflow job

@jeanschmidt
Copy link
Contributor

@pytorchbot merge -f -m "forcing merge in order to gatekeep GHA jobs"

@pytorch-bot
Copy link

pytorch-bot bot commented Sep 4, 2023

❌ 🤖 pytorchbot command failed:

@pytorchbot merge: error: argument -f/--force: expected one argument

usage: @pytorchbot merge [-f MESSAGE | -i] [-ic] [-r [{viable/strict,main}]]

Try @pytorchbot --help for more info.

@jeanschmidt
Copy link
Contributor

@pytorchbot merge -f "forcing merge in order to gatekeep GHA jobs"

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@apach301 apach301 deleted the fix-message-parsing-assertion branch September 5, 2023 09:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ciflow/trunk Trigger trunk jobs on your pull request Merged open source release notes: distributed (rpc) release notes category triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants