Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Strictly sanitize mmapped AppendVec file contents #7464

Merged
merged 13 commits into from Dec 18, 2019

Conversation

ryoqun
Copy link
Member

@ryoqun ryoqun commented Dec 13, 2019

Problem

Currently, It's very easy to cause DoS with crafted AppendVec data file. That's because data_len is directly used to allocate the data_len number of u8[], and is used for the offset calculation without overflow check, for example.

Also, I've carefully audited the fields in the AppendVec data file this time. Most of fields including Pubkey, Hash and lamports can legally contain arbitrary values for its type domain. So there aren't much to sanitize them at the AppendVec layer. However there are only two exceptions: data_len and executable.

As mentioned before, data_len must be sensible u64 for memory allocation. This is obvious and simple.

And exeutable is a bit subtle. It's bool consuming 1 bit logically in Rust land, but it consumes 8 bits physically. That means the higher 7 bits are usually not touched, however we must sanitize those bits to be cleared when snapshot ingestion. Otherwise, it's undefined behavior so bogus checks for exeutable could be possible depending on some myriad of combination of runtime configuration (rust version, compiler optimization, machine architecture, OS varieties).

After all, we should be super careful; we're fearless and very rare people to dare to mmap completely untrusted (=not even semi-trusted) data directly with minimal sanitization... :p We're proudly performance-obsessed. :)

Summary of Changes

  • Small preparatory clean up two commits
  • The actual meat including some unsafe {}s in both production and test code (mandatory due to the need to prepare malicious (=crafted) bytes and to guard against it)
    • data_len: Protect by the way of strict offset calculation sanitization. This PR doesn't explicitly impose limits on it; In combination with Sanitize AppendVec's file_size #7373, it'll effectively limit huge memory allocation because data_len in this PR won't be greater than AppendVec's file_size.
    • executable: Simply forbid any bad value other than 0b0000_0000 and 0b0000_00001.

Part of #7167

@ryoqun ryoqun requested a review from sakridge December 13, 2019 02:25
@@ -389,8 +417,8 @@ impl Serialize for AppendVec {
S: serde::ser::Serializer,
{
use serde::ser::Error;
let len = std::mem::size_of::<usize>() as u64;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These casts are odd...


if !self.sanitize_layout_and_length() {
return Err(std::io::Error::new(
std::io::ErrorKind::Other,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know using those Errors is a bit off...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea.. I would prefer using either a custom Result type or maybe even something like io::Result::InvalidInput https://doc.rust-lang.org/std/io/enum.ErrorKind.html#variant.InvalidInput

// Yes, this really hannpens; see test_set_file_crafted_executable
let executable_bool: &bool = &self.account_meta.executable;
// UNSAFE: Force to interpret mmap-backed bool as u8 to ensure higher 7-bits are cleared correctly.
let executable_byte: &u8 = unsafe { &*(executable_bool as *const bool as *const u8) };
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This unsafe is in production code path. But risk should have been minimized; it only reads a byte of memory with narrowest scoping.

@@ -13,11 +13,12 @@ use std::{
sync::Mutex,
};

//Data is aligned at the next 64 byte offset. Without alignment loading the memory may
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fairly certain 64 byte offset is wrong description; it should be 8 byte offset or 64 bit offset if you prefer bits. Padding at 64 byte boundary would be too wasteful. I've never heard of such architecture. Also, the macro impl doesn't look like actualy aligning with 64 byte, too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea 64-byte in the description is wrong, but some vector instructions like vmovapd can require 64-byte alignment for avx-512 moves:
https://www.felixcloutier.com/x86/movapd

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course compilers will probably always emit the unaligned-tolerant versions of those instructions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

avx-512 moves

Oh, the mighty 512 bits! Yeah, 64-byte alignment will be warranted in some special cases! Thanks for the tip!

@@ -187,17 +199,39 @@ impl AppendVec {

let map = unsafe { MmapMut::map_mut(&data)? };
self.map = map;

if !self.sanitize_layout_and_length() {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This adds additional sanitization costs for the snapshot ingestion codepath. Its impact on the overall validator performance should be minimal because it's only done only once when starting a validator from snapshot.

This PR intentionally didn't added these checks for the actual AppendVec write codepath for the performance concerns and its dubious merits.

Copy link
Member Author

@ryoqun ryoqun Dec 13, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also this PR didn't add these check for snapshot generation code path as well with the same reason.

return None;
}
let data = &self.map[offset..offset + size];
//Data is aligned at the next 64 byte offset. Without alignment loading the memory may
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, these comments are redundant at best; so removed them.

@ryoqun ryoqun changed the title Sanitize append vec mmap Strictly sanitize mmapped AppendVec file contents Dec 13, 2019

av.flush().unwrap();
let result = av.set_file(path);
assert_matches!(result, Err(ref message) if message.to_string() == *"incorrect layout/length");
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better assertion could be possible...

@codecov
Copy link

codecov bot commented Dec 13, 2019

Codecov Report

Merging #7464 into master will decrease coverage by 9.8%.
The diff coverage is 79%.

@@           Coverage Diff            @@
##           master   #7464     +/-   ##
========================================
- Coverage    80.7%   70.8%   -9.9%     
========================================
  Files         244     245      +1     
  Lines       48682   55276   +6594     
========================================
- Hits        39291   39170    -121     
- Misses       9391   16106   +6715

let executable_bool: &bool = &account.account_meta.executable;
// we can not use assert_eq!...
// *executable_bool is true but its actual memory value is crafted_executable, not 1
assert!(*executable_bool != true);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dark side of unsafe (part 1) xD

runtime/src/append_vec.rs Outdated Show resolved Hide resolved
assert_eq!(executable_bool, false);
// UNSAFE: Force to interpret mmap-backed bool as u8 to really read the actual memory content
let executable_byte: u8 = unsafe { std::mem::transmute::<bool, u8>(executable_bool) };
assert_eq!(executable_byte, 0); // Wow, not crafted_executable!
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dark side of unsafe (part 2) xD

// *executable_bool is true but its actual memory value is crafted_executable, not 1
assert!(*executable_bool != true);
// UNSAFE: Force to interpret mmap-backed bool as u8 to really read the actual memory content
let executable_byte: &u8 = unsafe { &*(executable_bool as *const bool as *const u8) };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this unsafe block/casting is repeated in the tests a few times, can we have a function that is assert_eq_bool(ptr, expected_bool_value);

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was a bit annoyed the repeated unsafes... Thanks for suggestion! I've done the cleanup differentially, though. How does that look for you?: 6d62daa

runtime/src/append_vec.rs Outdated Show resolved Hide resolved
runtime/src/append_vec.rs Outdated Show resolved Hide resolved
failures:

---- append_vec::tests::test_set_file_crafted_executable stdout ----
thread 'append_vec::tests::test_set_file_crafted_executable' panicked at 'assertion failed: `(left == right)`
  left: `true`,
 right: `true`', runtime/src/append_vec.rs:683:13
stack backtrace:
runtime/src/append_vec.rs Outdated Show resolved Hide resolved
runtime/src/append_vec.rs Outdated Show resolved Hide resolved
@ryoqun
Copy link
Member Author

ryoqun commented Dec 16, 2019

@mvines @sakridge I've polished this up! Could you review again in your free time? I think this PR is ready for merge. :)

Copy link
Member

@mvines mvines left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, @sakridge is a better reviewer for this change though so I defer approval to him 👑

@ryoqun
Copy link
Member Author

ryoqun commented Dec 17, 2019

lgtm, @sakridge is a better reviewer for this change though so I defer approval to him crown

Thank you very much!

@sakridge How does this look now?

Copy link
Member

@sakridge sakridge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@ryoqun ryoqun merged commit 629a4b5 into solana-labs:master Dec 18, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants