Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Heap buffer overflow in ditributed/rpc module #105537

Closed
wants to merge 1 commit into from

Conversation

apach301
Copy link
Contributor

Hi! we've been fuzzing PyTorch project with sydr-fuzz.
We've found a couple heap-buffer-overflows in distributed/rpc module.

PyTorch version: 0f1621d

OS: Ubuntu 20.04

How to reproduce

  1. Build docker from this Dockerfile and run the container.
  2. Then run message_deserialize-afl++ fuzzing target on provided crash-inputs (crash-056826339f6da8dbb97c944178e94494369a9e22.zip, crash-4f85db9f19fe152c0018f6675c3b4c122227058f.zip):
unzip crash-4f85db9f19fe152c0018f6675c3b4c122227058f.zip
/message_deserialize-afl++ crash-4f85db9f19fe152c0018f6675c3b4c122227058f

Heap buffer overflow in torch/csrc/jit/serialization/pickle.cpp:144

crash-056826339f6da8dbb97c944178e94494369a9e22.zip

    "==7614==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60b001b58355 at pc 0x0000005d1147 bp 0x7fffffffa610 sp 0x7fffffff9de0",
    "READ of size 256 at 0x60b001b58355 thread T0",
    "    #0 0x5d1146 in __asan_memcpy /llvm-project-llvmorg-14.0.6/compiler-rt/lib/asan/asan_interceptors_memintrinsics.cpp:22:3",
    "    #1 0xd1cd19f in torch::jit::unpickle(char const*, unsigned long, std::function<c10::StrongTypePtr (c10::QualifiedName const&)>, c10::ArrayRef<at::Tensor>, c10::Type::SingletonOrSharedTypePtr<c10::Type> (*)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&))::$_3::operator()(char*, unsigned long) const /pytorch/torch/csrc/jit/serialization/pickle.cpp:144:9",
    "    #2 0xd1cd19f in unsigned long std::__invoke_impl<unsigned long, torch::jit::unpickle(char const*, unsigned long, std::function<c10::StrongTypePtr (c10::QualifiedName const&)>, c10::ArrayRef<at::Tensor>, c10::Type::SingletonOrSharedTypePtr<c10::Type> (*)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&))::$_3&, char*, unsigned long>(std::__invoke_other, torch::jit::unpickle(char const*, unsigned long, std::function<c10::StrongTypePtr (c10::QualifiedName const&)>, c10::ArrayRef<at::Tensor>, c10::Type::SingletonOrSharedTypePtr<c10::Type> (*)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&))::$_3&, char*&&, unsigned long&&) /usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/bits/invoke.h:60:14",
    "    #3 0xd27aa48 in std::function<unsigned long (char*, unsigned long)>::operator()(char*, unsigned long) const /usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/bits/std_function.h:622:14",
    "    #4 0xd27a61c in torch::jit::Unpickler::readSlowWithBuffer(char*, unsigned long) /pytorch/torch/csrc/jit/serialization/unpickler.cpp:1047:23",
    "    #5 0xd2698b8 in unsigned char torch::jit::Unpickler::read<unsigned char>() /pytorch/torch/csrc/jit/serialization/unpickler.h:111:7",
    "    #6 0xd268816 in torch::jit::Unpickler::readOpCode() /pytorch/torch/csrc/jit/serialization/unpickler.h:130:38",
    "    #7 0xd268816 in torch::jit::Unpickler::run() /pytorch/torch/csrc/jit/serialization/unpickler.cpp:238:17",
    "    #8 0xd268522 in torch::jit::Unpickler::parse_ivalue() /pytorch/torch/csrc/jit/serialization/unpickler.cpp:204:3",
    "    #9 0xd1c8502 in torch::jit::unpickle(std::function<unsigned long (char*, unsigned long)>, std::function<c10::StrongTypePtr (c10::QualifiedName const&)>, c10::ArrayRef<at::Tensor>, c10::Type::SingletonOrSharedTypePtr<c10::Type> (*)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)) /pytorch/torch/csrc/jit/serialization/pickle.cpp:126:20",
    "    #10 0xd1c8dbd in torch::jit::unpickle(char const*, unsigned long, std::function<c10::StrongTypePtr (c10::QualifiedName const&)>, c10::ArrayRef<at::Tensor>, c10::Type::SingletonOrSharedTypePtr<c10::Type> (*)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)) /pytorch/torch/csrc/jit/serialization/pickle.cpp:136:10",
    "    #11 0xe56b16d in torch::distributed::rpc::readWrappedPayload(std::vector<char, std::allocator<char> >&, torch::distributed::rpc::Message const&) /pytorch/torch/csrc/distributed/rpc/utils.cpp:515:18",
    "    #12 0xe3d8f29 in torch::distributed::autograd::RpcWithProfilingReq::fromMessage(torch::distributed::rpc::Message const&) /pytorch/torch/csrc/distributed/autograd/rpc_messages/rpc_with_profiling_req.cpp:112:24",
    "    #13 0xe55f692 in torch::distributed::rpc::deserializeRequest(torch::distributed::rpc::Message const&) /pytorch/torch/csrc/distributed/rpc/utils.cpp:138:14",
    "    #14 0x6120a8 in LLVMFuzzerTestOneInput /message_deserialize.cc:192:27",
    "    #15 0x535de1 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) /llvm-project-llvmorg-14.0.6/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:611:15",
    "    #16 0x51fcec in fuzzer::RunOneTest(fuzzer::Fuzzer*, char const*, unsigned long) /llvm-project-llvmorg-14.0.6/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:324:6",
    "    #17 0x525a3b in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /llvm-project-llvmorg-14.0.6/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:860:9",
    "    #18 0x54eff2 in main /llvm-project-llvmorg-14.0.6/compiler-rt/lib/fuzzer/FuzzerMain.cpp:20:10",
    "    #19 0x7ffff7a37082 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x24082) (BuildId: 1878e6b475720c7c51969e69ab2d276fae6d1dee)",
    "    #20 0x51a60d in _start (/message_deserialize_fuzz+0x51a60d)",
    "",
    "0x60b001b58355 is located 0 bytes to the right of 101-byte region [0x60b001b582f0,0x60b001b58355)",
    "allocated by thread T0 here:",
    "    #0 0x60c7bd in operator new(unsigned long) /llvm-project-llvmorg-14.0.6/compiler-rt/lib/asan/asan_new_delete.cpp:95:3",
    "    #1 0x62c7fd in std::_Vector_base<char, std::allocator<char> >::_M_allocate(unsigned long) /usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/bits/stl_vector.h:346:20",
    "    #2 0x62c7fd in void std::vector<char, std::allocator<char> >::_M_range_initialize<unsigned char const*>(unsigned char const*, unsigned char const*, std::forward_iterator_tag) /usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/bits/stl_vector.h:1582:14",
    "    #3 0x612913 in std::vector<char, std::allocator<char> >::vector<unsigned char const*, void>(unsigned char const*, unsigned char const*, std::allocator<char> const&) /usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/bits/stl_vector.h:657:4",
    "    #4 0x611c4a in LLVMFuzzerTestOneInput /message_deserialize.cc:181:21",
    "    #5 0x535de1 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) /llvm-project-llvmorg-14.0.6/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:611:15",
    "    #6 0x51fcec in fuzzer::RunOneTest(fuzzer::Fuzzer*, char const*, unsigned long) /llvm-project-llvmorg-14.0.6/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:324:6",
    "    #7 0x525a3b in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /llvm-project-llvmorg-14.0.6/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:860:9",
    "    #8 0x54eff2 in main /llvm-project-llvmorg-14.0.6/compiler-rt/lib/fuzzer/FuzzerMain.cpp:20:10",
    "    #9 0x7ffff7a37082 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x24082) (BuildId: 1878e6b475720c7c51969e69ab2d276fae6d1dee)",
    "",
    "SUMMARY: AddressSanitizer: heap-buffer-overflow /llvm-project-llvmorg-14.0.6/compiler-rt/lib/asan/asan_interceptors_memintrinsics.cpp:22:3 in __asan_memcpy",
    "Shadow bytes around the buggy address:",
    "  0x0c1680363010: 00 00 00 fa fa fa fa fa fa fa fa fa 00 00 00 00",
    "  0x0c1680363020: 00 00 00 00 00 00 00 00 00 00 fa fa fa fa fa fa",
    "  0x0c1680363030: fa fa 00 00 00 00 00 00 00 00 00 00 00 00 00 fa",
    "  0x0c1680363040: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00",
    "  0x0c1680363050: 00 00 00 00 00 fa fa fa fa fa fa fa fa fa 00 00",
    "=>0x0c1680363060: 00 00 00 00 00 00 00 00 00 00[05]fa fa fa fa fa",
    "  0x0c1680363070: fa fa fa fa 00 00 00 00 00 00 00 00 00 00 00 00",
    "  0x0c1680363080: 05 fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa",
    "  0x0c1680363090: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa",
    "  0x0c16803630a0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa",
    "  0x0c16803630b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa",
    "Shadow byte legend (one shadow byte represents 8 application bytes):",
    "  Addressable:           00",
    "  Partially addressable: 01 02 03 04 05 06 07",
    "  Heap left redzone:       fa",
    "  Freed heap region:       fd",
    "  Stack left redzone:      f1",
    "  Stack mid redzone:       f2",
    "  Stack right redzone:     f3",
    "  Stack after return:      f5",
    "  Stack use after scope:   f8",
    "  Global redzone:          f9",
    "  Global init order:       f6",
    "  Poisoned by user:        f7",
    "  Container overflow:      fc",
    "  Array cookie:            ac",
    "  Intra object redzone:    bb",
    "  ASan internal:           fe",
    "  Left alloca redzone:     ca",
    "  Right alloca redzone:    cb",
    "==7614==ABORTING"

Heap-buffer-overflow in aten/src/ATen/core/ivalue.h:432

crash-4f85db9f19fe152c0018f6675c3b4c122227058f.zip

    "==60983==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6150001e4108 at pc 0x000000601877 bp 0x7fffffff9fd0 sp 0x7fffffff9fc8",
    "READ of size 4 at 0x6150001e4108 thread T0",
    "    #0 0x601876 in c10::IValue::isTensor() const /pytorch/aten/src/ATen/core/ivalue.h:432:27",
    "    #1 0x601876 in c10::IValue::destroy() /pytorch/aten/src/ATen/core/ivalue.h:1148:9",
    "    #2 0x699f72 in c10::IValue::~IValue() /pytorch/aten/src/ATen/core/ivalue.h:236:5",
    "    #3 0x699f72 in void std::_Destroy<c10::IValue>(c10::IValue*) /usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/bits/stl_construct.h:140:19",
    "    #4 0x699f72 in void std::_Destroy_aux<false>::__destroy<c10::IValue*>(c10::IValue*, c10::IValue*) /usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/bits/stl_construct.h:152:6",
    "    #5 0x699f72 in void std::_Destroy<c10::IValue*>(c10::IValue*, c10::IValue*) /usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/bits/stl_construct.h:184:7",
    "    #6 0x699f72 in void std::_Destroy<c10::IValue*, c10::IValue>(c10::IValue*, c10::IValue*, std::allocator<c10::IValue>&) /usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/bits/alloc_traits.h:738:7",
    "    #7 0x699f72 in std::vector<c10::IValue, std::allocator<c10::IValue> >::_M_erase_at_end(c10::IValue*) /usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/bits/stl_vector.h:1796:6",
    "    #8 0x699e4a in std::vector<c10::IValue, std::allocator<c10::IValue> >::_M_erase(__gnu_cxx::__normal_iterator<c10::IValue*, std::vector<c10::IValue, std::allocator<c10::IValue> > >, __gnu_cxx::__normal_iterator<c10::IValue*, std::vector<c10::IValue, std::allocator<c10::IValue> > >) /usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/bits/vector.tcc:191:4",
    "    #9 0xea5b11e in torch::jit::Unpickler::readInstruction() /pytorch/torch/csrc/jit/serialization/unpickler.cpp:454:14",
    "    #10 0xea57d97 in torch::jit::Unpickler::run() /pytorch/torch/csrc/jit/serialization/unpickler.cpp:251:27",
    "    #11 0xea579f1 in torch::jit::Unpickler::parse_ivalue() /pytorch/torch/csrc/jit/serialization/unpickler.cpp:204:3",
    "    #12 0xe9a435e in torch::jit::unpickle(std::function<unsigned long (char*, unsigned long)>, std::function<c10::StrongTypePtr (c10::QualifiedName const&)>, c10::ArrayRef<at::Tensor>, c10::Type::SingletonOrSharedTypePtr<c10::Type> (*)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)) /pytorch/torch/csrc/jit/serialization/pickle.cpp:126:20",
    "    #13 0xe9a471c in torch::jit::unpickle(char const*, unsigned long, std::function<c10::StrongTypePtr (c10::QualifiedName const&)>, c10::ArrayRef<at::Tensor>, c10::Type::SingletonOrSharedTypePtr<c10::Type> (*)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)) /pytorch/torch/csrc/jit/serialization/pickle.cpp:136:10",
    "    #14 0xfcd034b in torch::distributed::autograd::PropagateGradientsReq::fromMessage(torch::distributed::rpc::Message const&) /pytorch/torch/csrc/distributed/autograd/rpc_messages/propagate_gradients_req.cpp:54:18",
    "    #15 0xfe720ff in torch::distributed::rpc::deserializeRequest(torch::distributed::rpc::Message const&) /pytorch/torch/csrc/distributed/rpc/utils.cpp:132:14",
    "    #16 0x5c5c93 in LLVMFuzzerTestOneInput /message_deserialize.cc:192:27",
    "    #17 0x5c2bfd in ExecuteFilesOnyByOne /AFLplusplus/utils/aflpp_driver/aflpp_driver.c:255:7",
    "    #18 0x5c2a08 in LLVMFuzzerRunDriver /AFLplusplus/utils/aflpp_driver/aflpp_driver.c",
    "    #19 0x5c25c8 in main /AFLplusplus/utils/aflpp_driver/aflpp_driver.c:300:10",
    "    #20 0x7ffff7a37082 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x24082) (BuildId: 1878e6b475720c7c51969e69ab2d276fae6d1dee)",
    "    #21 0x50237d in _start (/message_deserialize_afl+0x50237d)",
    "",
    "0x6150001e4108 is located 8 bytes to the right of 512-byte region [0x6150001e3f00,0x6150001e4100)",
    "allocated by thread T0 here:",
    "    #0 0x5bfbfa in operator new(unsigned long) /llvm-project-llvmorg-14.0.6/compiler-rt/lib/asan/asan_new_delete.cpp:95:3",
    "",
    "SUMMARY: AddressSanitizer: heap-buffer-overflow /pytorch/aten/src/ATen/core/ivalue.h:432:27 in c10::IValue::isTensor() const",
    "Shadow bytes around the buggy address:",
    "  0x0c2a800347d0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa",
    "  0x0c2a800347e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00",
    "  0x0c2a800347f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00",
    "  0x0c2a80034800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00",
    "  0x0c2a80034810: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00",
    "=>0x0c2a80034820: fa[fa]fa fa fa fa fa fa fa fa fa fa fa fa fa fa",
    "  0x0c2a80034830: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa",
    "  0x0c2a80034840: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa",
    "  0x0c2a80034850: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa",
    "  0x0c2a80034860: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa",
    "  0x0c2a80034870: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa",
    "Shadow byte legend (one shadow byte represents 8 application bytes):",
    "  Addressable:           00",
    "  Partially addressable: 01 02 03 04 05 06 07",
    "  Heap left redzone:       fa",
    "  Freed heap region:       fd",
    "  Stack left redzone:      f1",
    "  Stack mid redzone:       f2",
    "  Stack right redzone:     f3",
    "  Stack after return:      f5",
    "  Stack use after scope:   f8",
    "  Global redzone:          f9",
    "  Global init order:       f6",
    "  Poisoned by user:        f7",
    "  Container overflow:      fc",
    "  Array cookie:            ac",
    "  Intra object redzone:    bb",
    "  ASan internal:           fe",
    "  Left alloca redzone:     ca",
    "  Right alloca redzone:    cb",
    "==60983==ABORTING"

@pytorch-bot
Copy link

pytorch-bot bot commented Jul 19, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/105537

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 9cf041e:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added the release notes: distributed (rpc) release notes category label Jul 19, 2023
@albanD albanD added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Jul 20, 2023
Copy link
Collaborator

@albanD albanD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for these fixes!

@albanD
Copy link
Collaborator

albanD commented Jul 20, 2023

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Jul 20, 2023
@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@albanD albanD added the topic: not user facing topic category label Jul 20, 2023
@apach301 apach301 deleted the add-rpc-checks branch July 21, 2023 08:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ciflow/trunk Trigger trunk jobs on your pull request Merged open source release notes: distributed (rpc) release notes category topic: not user facing topic category triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants