Skip to content
This repository has been archived by the owner on May 4, 2024. It is now read-only.

[security] Cherry-picking recent security fixes from Aptos #950

Merged
merged 7 commits into from Mar 10, 2023

Conversation

wrwg
Copy link
Member

@wrwg wrwg commented Mar 1, 2023

Security fixes from the last few weeks. This is a series of PRs addressing DoS attacks on the bytecode verifier. The attacks have been reported via the bug bounty program and each of them has been converted to a test. The general method of defense is a new so-called 'meter' in the bytecode verifier which counts complexity. For example, for absint, the number of joins and steps is counted and weighted.

vgao1996 and others added 6 commits February 28, 2023 20:17
…ysis (move-language#58)

This adds a simple meter to the bytecode verifier which counts the number of abstract operations performed and can enforce a limit. The meter is for now only connected to locals and reference analysis, but plumped to all phases of the verifier so they can easily make use of it.

A set of test cases have been added which exercise the new meter for a number of known pathological cases.

PR history:

- Add metering in type safety, to capture cost of very large types. This reduces timing of large_type_test to 1/4
- Adjusting max metering units upwards and adding a new sample which needs it
- Addressing reviewer comments
- Add links to security advisories, and verify that all are covered.
- Switching metering granularity from function to module.
- Adding non-linear growing penalty to using input refs x output refs relations (bicycles), for dealing better with `test_bicliques`. Adding printing size in benchmarks.
…move-language#62)

Specifically the test below now runs in 1/2 of the time. This adjustment appeard useful because the overall time allocation had to be increased to 8000 million units in production. Adjusted this as the default here too.

```
--> test_merge_state: verification time: 59.414ms, result: CONSTRAINT_NOT_SATISFIED, size: 63kb
```

Also adjusts the default to what aptos uses now in production.
@dariorussi
Copy link
Member

this is pretty good, give us a little bit to review and thanks!

@dariorussi
Copy link
Member

I like this overall, a lot.
I think it captures well the different dimensions of looping and allocating that can be used as DOS attack.
I am going to let others contribute until Monday when I will come and accept if not accepted sooner.

It seems there is some more to do but it feels it is a good direction for now and possibly for quite some time.
Curious to hear if others have feelings about long term solutions and how this plugs in towards the final goal (whatever that may be)

@@ -54,7 +56,8 @@ pub trait TransferFunctions {
instr: &Bytecode,
index: CodeOffset,
last_index: CodeOffset,
) -> Result<(), Self::Error>;
meter: &mut impl Meter,
) -> PartialVMResult<()>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this change needed? hard to tell for me at glance

We switched this back to Self::Error to let us use our own error types for our adapter
for custom verifier passes. And I could see that being helpful for other tooling.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see now why you did this. I do not remember the details, but if we would not do that, it would be harder to bubble up the error from meter which is a VM result. Perhaps you can switch it back?

@@ -36,6 +36,7 @@ pub struct Program {
//**************************************************************************************************

#[derive(Debug, Clone, PartialEq, Eq)]
#[allow(clippy::large_enum_variant)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What caused this change? Not a big deal, just curious

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left over from some experiments, removed.

}

pub(crate) fn verify<'a>(
resolver: &'a BinaryIndexedView<'a>,
function_view: &'a FunctionView<'a>,
meter: &mut impl Meter, // currently unused
) -> PartialVMResult<()> {
let verifier = &mut TypeSafetyChecker::new(resolver, function_view);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not necessary, but could the meter have gone inside the TypeSafetyChecker? Might have reduced the number of changes needed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps. I remember there was some issue because the TypeSafetyChecker wasn't used everywhere.

file_format::{Bytecode, CodeOffset},
};
use std::collections::BTreeMap;

/// Trait for finite-height abstract domains. Infinite height domains would require a more complex
/// trait with widening and a partial order.
pub trait AbstractDomain: Clone + Sized {
fn join(&mut self, other: &Self) -> JoinResult;
fn join(&mut self, other: &Self, meter: &mut impl Meter) -> PartialVMResult<JoinResult>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to my comment for type_safety, do we need the meter as an extra argument here?
Could we just make the meter a field on the AbstractDomain implementer? Though that definitely feels a bit funny

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I totally would have preferred this. Ideally there would be a mut Context going into all this verifier code.

Copy link
Member

@dariorussi dariorussi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we tend to like the approach though we may want to change something in our repo and push it back as needed.
Hope that is ok with you all.
In general - for all Move community - thinking about ways to plug the metering in the least intrusive way should be a priority of some kind.
It seems inevitable that different metering (or not at all) will be a reality moving forward.
So something that allows all parties to move freely would be a good thing. Particularly considering this is the verifier...

If someone could answer @tnowacki's comments that would be appreciated

@wrwg
Copy link
Member Author

wrwg commented Mar 10, 2023

we tend to like the approach though we may want to change something in our repo and push it back as needed. Hope that is ok with you all. In general - for all Move community - thinking about ways to plug the metering in the least intrusive way should be a priority of some kind. It seems inevitable that different metering (or not at all) will be a reality moving forward. So something that allows all parties to move freely would be a good thing. Particularly considering this is the verifier...

If someone could answer @tnowacki's comments that would be appreciated

Sorry for the delay -- first traveling, then catching up. Yes, please feel free to massage and also push useful changes (like @tnowacki suggesting) upstream.

Instead of just metering size of types on the operand stack, also meter size of type instantiations in calls and other places. This e.g. capture the size of types in calls like `f<T>()`, where the type does not appear on the operand stack.
@wrwg wrwg enabled auto-merge (squash) March 10, 2023 22:40
@wrwg wrwg merged commit bbd5caf into move-language:main Mar 10, 2023
@wrwg wrwg deleted the security branch March 10, 2023 23:10
nkysg pushed a commit to starcoinorg/move that referenced this pull request Mar 14, 2023
…uage#950)

* [verifier] limit the number of back edges

* [verifier] fix incorrect error code for per-module back edge limit check

* copyloc-pop test (#54)

* [gas] allow natives to read the gas balance

* [bytecode-verifier] Add metering logic and apply to absint based analysis (#58)

This adds a simple meter to the bytecode verifier which counts the number of abstract operations performed and can enforce a limit. The meter is for now only connected to locals and reference analysis, but plumped to all phases of the verifier so they can easily make use of it.

A set of test cases have been added which exercise the new meter for a number of known pathological cases.

PR history:

- Add metering in type safety, to capture cost of very large types. This reduces timing of large_type_test to 1/4
- Adjusting max metering units upwards and adding a new sample which needs it
- Addressing reviewer comments
- Add links to security advisories, and verify that all are covered.
- Switching metering granularity from function to module.
- Adding non-linear growing penalty to using input refs x output refs relations (bicycles), for dealing better with `test_bicliques`. Adding printing size in benchmarks.

* [bytecode verifer] Adjust metering to decrease runtime of some tests. (#62)

Specifically the test below now runs in 1/2 of the time. This adjustment appeard useful because the overall time allocation had to be increased to 8000 million units in production. Adjusted this as the default here too.

```
--> test_merge_state: verification time: 59.414ms, result: CONSTRAINT_NOT_SATISFIED, size: 63kb
```

Also adjusts the default to what aptos uses now in production.

* [bytecode verifier] Meter type instantiations (#64)

Instead of just metering size of types on the operand stack, also meter size of type instantiations in calls and other places. This e.g. capture the size of types in calls like `f<T>()`, where the type does not appear on the operand stack.

---------

Co-authored-by: Victor Gao <vgao1996@gmail.com>
Co-authored-by: Teng Zhang <rahxephon89@163.com>
@oxade oxade mentioned this pull request Mar 14, 2023
nkysg pushed a commit to starcoinorg/move that referenced this pull request Mar 17, 2023
…uage#950)

* [verifier] limit the number of back edges

* [verifier] fix incorrect error code for per-module back edge limit check

* copyloc-pop test (#54)

* [gas] allow natives to read the gas balance

* [bytecode-verifier] Add metering logic and apply to absint based analysis (#58)

This adds a simple meter to the bytecode verifier which counts the number of abstract operations performed and can enforce a limit. The meter is for now only connected to locals and reference analysis, but plumped to all phases of the verifier so they can easily make use of it.

A set of test cases have been added which exercise the new meter for a number of known pathological cases.

PR history:

- Add metering in type safety, to capture cost of very large types. This reduces timing of large_type_test to 1/4
- Adjusting max metering units upwards and adding a new sample which needs it
- Addressing reviewer comments
- Add links to security advisories, and verify that all are covered.
- Switching metering granularity from function to module.
- Adding non-linear growing penalty to using input refs x output refs relations (bicycles), for dealing better with `test_bicliques`. Adding printing size in benchmarks.

* [bytecode verifer] Adjust metering to decrease runtime of some tests. (#62)

Specifically the test below now runs in 1/2 of the time. This adjustment appeard useful because the overall time allocation had to be increased to 8000 million units in production. Adjusted this as the default here too.

```
--> test_merge_state: verification time: 59.414ms, result: CONSTRAINT_NOT_SATISFIED, size: 63kb
```

Also adjusts the default to what aptos uses now in production.

* [bytecode verifier] Meter type instantiations (#64)

Instead of just metering size of types on the operand stack, also meter size of type instantiations in calls and other places. This e.g. capture the size of types in calls like `f<T>()`, where the type does not appear on the operand stack.

---------

Co-authored-by: Victor Gao <vgao1996@gmail.com>
Co-authored-by: Teng Zhang <rahxephon89@163.com>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants