
Feature/config limit #439

Merged: 6 commits merged from feature/config-limit into trunk on Dec 11, 2021

Conversation

VictorKoenders (Contributor)

Adds a limit to the configuration and keeps track of how many bytes are going to be read, returning DecodeError::LimitExceeded if that running count exceeds the configured limit.

The reason it is set up like this is that we use a lot of Container::with_capacity calls when decoding containers. The limit exists primarily to avoid allocating too much memory when the data is garbage: we don't want to accidentally allocate 10 GB just because an invalid input carries a huge length prefix.
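As a quick illustration, here is a minimal sketch of how the limit could be exercised, assuming the merged API looks roughly like bincode 2's `config::standard().with_limit::<N>()`; the exact method names and error path may differ from this PR:

```rust
use bincode::error::DecodeError;

fn main() {
    // Garbage input: under the default varint encoding, 0xFF marks a
    // full u64 length, so the length prefix decodes to an enormous value.
    let bytes = [0xFFu8; 16];
    // Cap the total bytes a decode may claim at 1 KiB.
    let cfg = bincode::config::standard().with_limit::<1024>();
    let result: Result<(Vec<u8>, usize), DecodeError> =
        bincode::decode_from_slice(&bytes, cfg);
    // Expect Err(DecodeError::LimitExceeded) instead of a multi-GB allocation.
    println!("{result:?}");
}
```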

I've added an unclaim system as well. As per the Decoder comments:

> When decoding container types, a typical implementation will claim to read `len * size_of::<T>()` bytes.
> This is to ensure that bincode won't allocate several GB of memory while constructing the container.
>
> Because the implementation claims `len * size_of::<T>()` up front, but then also has to decode each `T`, every `T` would be counted double. This function allows us to un-claim each `T` as it gets decoded.
>
> We cannot check whether `len * size_of::<T>()` is valid without claiming it, because with a nested container (e.g. `Vec<Vec<T>>`) we would not know how much memory is already claimed, and could easily allocate much more than the user intends.

@VictorKoenders (Contributor, Author) commented Nov 29, 2021

TODO(self):

  • Add tests to see if the limit actually gets hit
  • Maybe add a check to SliceReader to see if the slice is larger than the current limit, and error immediately? (One possible reading is sketched below.)
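One possible reading of that second item, reusing the illustrative `DecodeError` from the sketch above: a `SliceReader` knows exactly how many bytes remain, so any claim larger than the remaining input can never be satisfied and could be rejected up front. The `SliceReader` shape here is an assumption:

```rust
struct SliceReader<'a> {
    slice: &'a [u8],
}

impl<'a> SliceReader<'a> {
    fn claim_bytes_read(&self, n: usize) -> Result<(), DecodeError> {
        if n > self.slice.len() {
            // More bytes claimed than the input even contains: bail out
            // immediately, before any allocation happens.
            return Err(DecodeError::LimitExceeded);
        }
        Ok(())
    }
}
```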

@codecov-commenter commented Nov 29, 2021

Codecov Report

Merging #439 (afc6719) into trunk (882e227) will increase coverage by 0.68%.
The diff coverage is 92.30%.


```
@@            Coverage Diff             @@
##            trunk     #439      +/-   ##
==========================================
+ Coverage   62.19%   62.87%   +0.68%     
==========================================
  Files          46       46              
  Lines        3383     3456      +73     
==========================================
+ Hits         2104     2173      +69     
- Misses       1279     1283       +4     
```
| Impacted Files | Coverage Δ |
| --- | --- |
| src/error.rs | 0.00% <ø> (ø) |
| src/config.rs | 80.95% <50.00%> (-7.29%) ⬇️ |
| src/de/decoder.rs | 65.00% <76.92%> (+22.14%) ⬆️ |
| src/de/impls.rs | 79.07% <100.00%> (+1.80%) ⬆️ |
| src/de/mod.rs | 79.31% <100.00%> (+10.88%) ⬆️ |
| src/features/impl_alloc.rs | 98.24% <100.00%> (+0.16%) ⬆️ |
| src/features/impl_std.rs | 75.86% <100.00%> (+0.33%) ⬆️ |
| tests/alloc.rs | 100.00% <100.00%> (ø) |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@VictorKoenders changed the title from "[WIP] Feature/config limit" to "Feature/config limit" on Dec 9, 2021
src/de/decoder.rs (review thread, resolved)
src/de/impls.rs (review thread, outdated, resolved)
```diff
- super::impl_core::collect_into_array(&mut (0..N).map(|_| T::decode(&mut decoder)));
+ let result = super::impl_core::collect_into_array(&mut (0..N).map(|_| {
+     // See the documentation on `unclaim_bytes_read` as to why we're doing this here
+     decoder.unclaim_bytes_read(core::mem::size_of::<T>());
+     T::decode(&mut decoder)
+ }));
```
Collaborator:

Rather than unclaiming in a loop, we could just unclaim right after claiming. That should reduce loop maths.

Contributor (Author):

The reason I unclaim per item is because of nested containers.

Consider the situation where there is a data type Vec<Vec<u8>>, where each Vec has 100 entries. This means the entire container would allocate about 10 kB of memory.

If our limit is set to 2500, the current impl would:

  • Allocate the outer Vec<Vec<_>> (100 * 24), count = 2400
  • Deallocate 1 inner Vec<u8> (24 bytes), count = 2376
  • Allocate the inner Vec<u8> (100 bytes), count = 2476
  • Deallocate 1 inner Vec<u8> (24 bytes), count = 2452
  • Allocate the inner Vec<u8> (100 bytes), count = 2552

This would fail at the 3rd allocation.

Whereas if we unclaim the entire vec immediately, we would get:

  • Allocate the outer Vec<Vec<_>> (100 * 24), count = 2400
  • Deallocate this, count = 0
  • 25x:
    • Allocate the inner Vec<u8> (100 bytes), count = 100, 200, etc

This would fail at the 25th allocation.
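To make the arithmetic above concrete, here is a self-contained model of the per-item strategy, built on the illustrative `LimitTracker` from earlier; `HEADER` stands in for `size_of::<Vec<u8>>()` on a 64-bit target:

```rust
const HEADER: usize = 24; // size_of::<Vec<u8>>() on 64-bit targets

// Models decoding a Vec<Vec<u8>> with `outer_len` inner vecs of `inner_len`
// bytes each, claiming and unclaiming exactly as the walkthrough above does.
fn model_nested_decode(
    tracker: &mut LimitTracker,
    outer_len: usize,
    inner_len: usize,
) -> Result<(), DecodeError> {
    // Claim the outer container: outer_len * size_of::<Vec<u8>>() bytes.
    tracker.claim_bytes_read(outer_len * HEADER)?;
    for _ in 0..outer_len {
        // Per-item unclaim: give this element's header share back...
        tracker.unclaim_bytes_read(HEADER);
        // ...then claim the inner container's own contents.
        tracker.claim_bytes_read(inner_len)?;
    }
    Ok(())
}
```

With limit = 2500 and outer_len = inner_len = 100, the running count goes 2400 → 2376 → 2476 → 2452 → 2552, and the second inner claim errors, matching the walkthrough.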

```diff
@@ -52,8 +52,13 @@ where
 {
     fn decode<D: Decoder>(mut decoder: D) -> Result<Self, DecodeError> {
         let len = crate::de::decode_slice_len(&mut decoder)?;
+        decoder.claim_container_read::<T>(len)?;
```
Collaborator:

Why is this helper used sometimes, while other times (like the array impl) the claim is done manually? Also, same comment here about immediately unclaiming.

Contributor (Author):

This is specifically for containers (e.g. Vec, HashMap, etc.). An array [T; N] is, in my eyes, not a container.
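For reference, `claim_container_read` is presumably a thin convenience over the byte-level claim, with the multiplication checked so that an attacker-controlled length can't overflow; a sketch continuing the illustrative `LimitTracker`:

```rust
impl LimitTracker {
    /// Claim room for `len` elements of `T`, guarding the multiplication
    /// against overflow from an attacker-controlled length prefix.
    fn claim_container_read<T>(&mut self, len: usize) -> Result<(), DecodeError> {
        let bytes = len
            .checked_mul(core::mem::size_of::<T>())
            .ok_or(DecodeError::LimitExceeded)?;
        self.claim_bytes_read(bytes)
    }
}
```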

@VictorKoenders merged commit 240558d into trunk on Dec 11, 2021
@VictorKoenders deleted the feature/config-limit branch on December 11, 2021 at 14:44