Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add size check before calling stack_.at(dict_pos) in unpickler.cpp #94300

Closed

Conversation

m4drat
Copy link
Contributor

@m4drat m4drat commented Feb 7, 2023

Hi!

I've been fuzzing different pytorch modules, and found a crash inside one of them.

Specifically, I'm talking about a module for unpickling and a function called Unpickler::readInstruction(). Running this function with provided crash file results in a crash, which occurs while calling auto dict = stack_.at(dict_pos).toGenericDict(); unpickler.cpp:561. The crash occurs, because the index dict_pos is out of bounds (which itself happens because the stack size is 0).

Besides this pull-request, there is another one related to unpickler hardening: #84343

All tests were performed on this pytorch version: abc54f93145830b502400faa92bec86e05422fbd

How to reproduce

  1. To reproduce the crash, use provided docker: Dockerfile

  2. Build the container: docker build -t oss-sydr-fuzz-pytorch-reproduce .

  3. Copy crash file to the current directory:

  4. Run the container: docker run --privileged --network host -v `pwd`:/homedir --rm -it oss-sydr-fuzz-pytorch-reproduce /bin/bash

  5. And execute the binary: /message_deserialize_sydr /homedir/crash-042dff5e121580425d9d34d0f293918f3c9fbf1e

After execution completes you will see this error message:

terminate called after throwing an instance of 'std::out_of_range'
  what():  vector::_M_range_check: __n (which is 18446744073709551613) >= this->size() (which is 0)

And this stacktrace:

erminate called after throwing an instance of 'std::out_of_range'
  what():  vector::_M_range_check: __n (which is 18446744073709551613) >= this->size() (which is 0)
==39== ERROR: libFuzzer: deadly signal
    #0 0x5d0df1 in __sanitizer_print_stack_trace /llvm-project/compiler-rt/lib/asan/asan_stack.cpp:87:3
    #1 0x545727 in fuzzer::PrintStackTrace() /llvm-project/compiler-rt/lib/fuzzer/FuzzerUtil.cpp:210:5
    #2 0x52b933 in fuzzer::Fuzzer::CrashCallback() /llvm-project/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:233:3
    #3 0x7f9118e0341f  (/lib/x86_64-linux-gnu/libpthread.so.0+0x1441f)
    #4 0x7f9118c2300a in raise (/lib/x86_64-linux-gnu/libc.so.6+0x4300a)
    #5 0x7f9118c02858 in abort (/lib/x86_64-linux-gnu/libc.so.6+0x22858)
    #6 0x7f9119040910  (/lib/x86_64-linux-gnu/libstdc++.so.6+0x9e910)
    #7 0x7f911904c38b  (/lib/x86_64-linux-gnu/libstdc++.so.6+0xaa38b)
    #8 0x7f911904c3f6 in std::terminate() (/lib/x86_64-linux-gnu/libstdc++.so.6+0xaa3f6)
    #9 0x7f911904c6a8 in __cxa_throw (/lib/x86_64-linux-gnu/libstdc++.so.6+0xaa6a8)
    #10 0x7f91190433aa  (/lib/x86_64-linux-gnu/libstdc++.so.6+0xa13aa)
    #11 0x63acdf in std::vector<c10::IValue, std::allocator<c10::IValue> >::_M_range_check(unsigned long) const /usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/bits/stl_vector.h:1073:4
    #12 0xce8f93e in std::vector<c10::IValue, std::allocator<c10::IValue> >::at(unsigned long) /usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/bits/stl_vector.h:1094:2
    #13 0xce8f93e in torch::jit::Unpickler::readInstruction() /pytorch_fuzz/torch/csrc/jit/serialization/unpickler.cpp:546:26
    #14 0xce8d527 in torch::jit::Unpickler::run() /pytorch_fuzz/torch/csrc/jit/serialization/unpickler.cpp:235:27
    #15 0xce8d1c2 in torch::jit::Unpickler::parse_ivalue() /pytorch_fuzz/torch/csrc/jit/serialization/unpickler.cpp:192:3
    #16 0xcdf0792 in torch::jit::unpickle(std::function<unsigned long (char*, unsigned long)>, std::function<c10::StrongTypePtr (c10::QualifiedName const&)>, c10::ArrayRef<at::Tensor>, c10::Type::SingletonOrSharedTypePtr<c10::Type> (*)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)) /pytorch_fuzz/torch/csrc/jit/serialization/pickle.cpp:127:20
    #17 0xcdf104d in torch::jit::unpickle(char const*, unsigned long, std::function<c10::StrongTypePtr (c10::QualifiedName const&)>, c10::ArrayRef<at::Tensor>, c10::Type::SingletonOrSharedTypePtr<c10::Type> (*)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)) /pytorch_fuzz/torch/csrc/jit/serialization/pickle.cpp:137:10
    #18 0xe0532db in torch::distributed::rpc::ScriptRemoteCall::fromMessage(torch::distributed::rpc::Message const&) /pytorch_fuzz/torch/csrc/distributed/rpc/script_remote_call.cpp:74:16
    #19 0xe0ffa10 in torch::distributed::rpc::deserializeRequest(torch::distributed::rpc::Message const&) /pytorch_fuzz/torch/csrc/distributed/rpc/utils.cpp:108:14
    #20 0x602a41 in LLVMFuzzerTestOneInput /message_deserialize_fuzz.cc:192:27
    #21 0x52ce61 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) /llvm-project/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:611:15
    #22 0x516d7c in fuzzer::RunOneTest(fuzzer::Fuzzer*, char const*, unsigned long) /llvm-project/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:324:6
    #23 0x51cacb in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /llvm-project/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:860:9
    #24 0x546062 in main /llvm-project/compiler-rt/lib/fuzzer/FuzzerMain.cpp:20:10
    #25 0x7f9118c04082 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x24082)
    #26 0x51169d in _start (/message_deserialize_fuzz+0x51169d)

NOTE: libFuzzer has rudimentary signal handlers.
      Combine libFuzzer with AddressSanitizer or similar for better crash reports.
SUMMARY: libFuzzer: deadly signal

@pytorch-bot
Copy link

pytorch-bot bot commented Feb 7, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/94300

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There are 1 currently active SEVs. If your PR is affected, please view them below:

✅ No Failures

As of commit cbfc10f:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added the release notes: jit release notes category label Feb 7, 2023
@malfet malfet added topic: bug fixes topic category topic: security triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module labels Feb 7, 2023
Copy link
Contributor

@malfet malfet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for the fix. Do you mind adding unit test to test_jit.py that attempts to deserialize an hand crafted jit script.

@m4drat
Copy link
Contributor Author

m4drat commented Feb 9, 2023

Do you mind adding unit test to test_jit.py that attempts to deserialize an hand crafted jit script.

I don't think it is possible to craft a good testcase by hand. In this example, fuzzing harness mainly targets code related to serialization/deserialization of rpc requests/responses. It's quite hard to generate a similar testcase that reaches the problematic code path using, for example, pickled model loading torch.jit.load.

@apach301
Copy link
Contributor

apach301 commented Apr 5, 2023

@malfet
Hi! Could you please review changes or add another reviewers to proceed merging process?

@SweetVishnya
Copy link

@eellison, can we merge this PR that fixes crash in JIT?

@SweetVishnya
Copy link

@ezyang, can we merge this approved PR?

@ezyang
Copy link
Contributor

ezyang commented May 2, 2023

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label May 2, 2023
@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@pytorchmergebot
Copy link
Collaborator

Merge failed

Reason: This PR is too stale; the last push date was more than 3 days ago. Please rebase and try again. You can rebase and merge by leaving the following comment on this PR:
@pytorchbot merge -r
Or just rebase by leaving @pytorchbot rebase comment

Details for Dev Infra team Raised by workflow job

@m4drat
Copy link
Contributor Author

m4drat commented May 2, 2023

@pytorchbot rebase

@pytorchmergebot
Copy link
Collaborator

@pytorchbot successfully started a rebase job. Check the current status here

@pytorchmergebot
Copy link
Collaborator

Successfully rebased unpickler-user-friendly-bounds-check onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout unpickler-user-friendly-bounds-check && git pull --rebase)

@pytorchmergebot pytorchmergebot force-pushed the unpickler-user-friendly-bounds-check branch from 88ad76f to cbfc10f Compare May 2, 2023 15:46
@ezyang
Copy link
Contributor

ezyang commented May 2, 2023

@pytorchbot merge

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ciflow/trunk Trigger trunk jobs on your pull request Merged open source release notes: jit release notes category topic: bug fixes topic category topic: security triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants