Reduce size of futures in HTTP API to prevent stack overflows #5104

Merged: 3 commits merged into sigp:unstable on Jan 23, 2024

Conversation

michaelsproul (Member)

Issue Addressed

Closes #5080.

Proposed Changes

  • Use Box::pin on a few futures in publish_block. This reduces the size of the related futures from 100KB to ~33KB.
  • Use Arcs to reduce the amount of data held on the stack in publish_block and related functions. This further reduces the publish block futures from ~33KB to ~16KB. (Sketches of both techniques follow below.)
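To make the second bullet concrete, here is a minimal sketch (with a placeholder Block type, not Lighthouse's real one) of why Arc-ing helps: a value held by value across an .await is stored inside the future's state machine, while an Arc contributes only a pointer:

```rust
use std::sync::Arc;

// Placeholder "block" type; real beacon blocks are far bigger than 16KB.
type Block = [u8; 16 * 1024];

// Holding the block by value across an .await forces all 16KB into the
// future's state machine (and, transitively, into every parent future).
async fn publish_by_value(block: Block) {
    std::future::ready(()).await;
    std::hint::black_box(&block);
}

// Holding an Arc stores only a pointer in the future; the block itself
// lives once on the heap and is shared by reference counting.
async fn publish_arced(block: Arc<Block>) {
    std::future::ready(()).await;
    std::hint::black_box(&block);
}

fn main() {
    let block = Arc::new([0u8; 16 * 1024]);
    // Calling an async fn only constructs the future, so we can measure
    // its size without an executor.
    println!("by value: {} bytes", std::mem::size_of_val(&publish_by_value(*block)));
    println!("arced:    {} bytes", std::mem::size_of_val(&publish_arced(block)));
}
```

On a 64-bit target the by-value future comes out a bit over 16KB, while the Arc'd one is only a few dozen bytes.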

One of the reasons for the stack size blowup in futures is described in this blog post: https://swatinem.de/blog/future-size/.
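The core mechanism from that post, and the reason Box::pin helps, can be shown in a few lines. This is an illustrative sketch, not code from this PR: a future stores every local that is live across an await point, and awaited sub-futures are embedded inline in their parent unless they are boxed onto the heap:

```rust
// A child future that keeps a large buffer alive across an await point,
// so the buffer must be stored inside the compiled state machine.
async fn big_child() {
    let buf = [0u8; 16 * 1024];
    std::future::ready(()).await;
    std::hint::black_box(&buf);
}

// The child future is embedded inline, so this future is ~16KB as well;
// the cost compounds at every level of nesting.
async fn parent_inline() {
    big_child().await;
}

// Box::pin moves the child to the heap; this future only stores a pointer.
async fn parent_boxed() {
    Box::pin(big_child()).await;
}

fn main() {
    println!("inline: {} bytes", std::mem::size_of_val(&parent_inline()));
    println!("boxed:  {} bytes", std::mem::size_of_val(&parent_boxed()));
}
```

Here parent_inline is roughly as large as big_child itself, whereas parent_boxed is a couple of machine words; that is the same effect the Box::pin changes exploit in publish_block.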

Even with that knowledge, I still found it quite hard to work out which interventions would be effective, and did a lot of trial and error, changing things and re-running the type-size analysis.

The commands I used for testing were:

```sh
RUSTFLAGS='-Zprint-type-sizes' cargo +nightly build --release > type_sizes.txt
```

Followed by post-processing:

```sh
cat type_sizes.txt | grep 'print-type-size type:' | sed 's/print-type-size type: `\(.*\)`: \([0-9]*\) bytes, alignment.*/"\1",\2/' | rg "warp" > output.csv
```
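For reference, the input and output of that pipeline look roughly like this (the type path and size below are invented for illustration; real entries are warp filter chains with much longer names):

```
print-type-size type: `warp::filter::and_then::AndThenFuture<...>`: 102400 bytes, alignment: 8 bytes
```

which the sed expression turns into the CSV row:

```
"warp::filter::and_then::AndThenFuture<...>",102400
```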

Additional Info

There's still more we could do here. Some types like the PreProcessingSnapshot end up on the stack during block publishing, and are quite large due to containing unarced blocks and states.
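As a sketch of what an intervention there might look like (placeholder types and sizes, not Lighthouse's actual definitions), Arc-ing the big fields shrinks the snapshot to a couple of pointers wherever it is held by value:

```rust
use std::sync::Arc;

// Placeholder types; Lighthouse's real block and state types are much larger.
type Block = [u8; 4096];
type BeaconState = [u8; 8192];

// Owned fields live inline, so the snapshot itself is huge by value.
#[allow(dead_code)]
struct SnapshotOwned {
    block: Block,
    state: BeaconState,
}

// Arc'd fields shrink it to two pointers, at the cost of heap indirection.
#[allow(dead_code)]
struct SnapshotArced {
    block: Arc<Block>,
    state: Arc<BeaconState>,
}

fn main() {
    println!("owned: {} bytes", std::mem::size_of::<SnapshotOwned>()); // 12288
    println!("arced: {} bytes", std::mem::size_of::<SnapshotArced>()); // 16
}
```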

There's also an outstanding mystery as to how 100KB futures can cause an 8MB stack to overflow, when they should only be nested ~1 at a time. Either I've misunderstood what happens when futures poll sub-futures recursively, or there's something in warp that creates the amplification.

NOTE: I have yet to measure whether the boxing and arcing have an impact on performance. It would be good to monitor metrics for block publication time after deploying this branch.

A backtrace implicating warp can be found here: https://gist.github.com/michaelsproul/d0a4b93bb9d21a5be1103acf4c0a3753.

Thanks to @dapplion for pairing on this and helping with some of the early progress.

@michaelsproul added labels on Jan 22, 2024: bug (Something isn't working), optimization (Something to make Lighthouse run more efficiently), v4.6.0 (ETA Q1 2024)
paulhauner added a commit that referenced this pull request Jan 22, 2024
Squashed commit of the following:

commit 4d51a99
Author: Michael Sproul <michael@sigmaprime.io>
Date:   Mon Jan 22 17:08:43 2024 +1100

    Arc the blocks early in publication

commit 8a749db
Author: Michael Sproul <michael@sigmaprime.io>
Date:   Mon Jan 22 15:19:57 2024 +1100

    Box::pin a few big futures
@paulhauner mentioned this pull request on Jan 22, 2024
@jxs (Member) left a comment:

LGTM Michael :)

```diff
@@ -866,14 +866,14 @@ where
         panic!("Should always be a full payload response");
     };

-    let signed_block = block_response.block.sign(
+    let signed_block = Arc::new(block_response.block.sign(
```
Member:
is this test-only code? If so, do we also have stack overflows during tests?
(Question applies to all the uses in this file.)

michaelsproul (author):

Yeah, this is just for tests. I ended up having to make quite a few changes to make the types consistent. There's a bit of inefficient Arc-ing and de-Arc-ing, but since it's only in tests it shouldn't be too bad.

Jimmy got some stack overflows in tests, but only in debug mode.
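For readers unfamiliar with the pattern being described, the inefficient "Arc-ing and de-Arc-ing" looks roughly like this (hypothetical helper and type names, not the actual test code):

```rust
use std::sync::Arc;

// Hypothetical stand-ins for the test helpers; not the actual Lighthouse code.
#[derive(Clone)]
struct SignedBlock(Vec<u8>);

fn make_block() -> SignedBlock {
    SignedBlock(vec![0u8; 1024])
}

fn main() {
    // "Arc-ing": the production API now takes Arc<SignedBlock>.
    let block = Arc::new(make_block());

    // "De-Arc-ing": a test needs an owned value back. Unwrapping (or cloning,
    // if the Arc is shared) is wasteful, but acceptable in test-only code.
    let owned: SignedBlock = Arc::try_unwrap(block).unwrap_or_else(|shared| (*shared).clone());
    assert_eq!(owned.0.len(), 1024);
}
```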

```diff
@@ -380,6 +379,10 @@ pub async fn reconstruct_block<T: BeaconChainTypes>(
        None
    };

+   // Perf: cloning the block here to unblind it is a little sub-optimal. This is considered an
+   // acceptable tradeoff to avoid passing blocks around on the stack (unarced), which blows up
```
Member:

Suggested change:

```diff
-// acceptable tradeoff to avoid passing blocks around on the stack (unarced), which blows up
+// acceptable tradeoff to avoid passing blocks around on the stack (unArc'ed), which blows up
```

michaelsproul (author):
I kinda like the lowercase spelling

@michaelsproul added the ready-for-review label on Jan 23, 2024
@paulhauner (Member) left a comment:

Nice! Simple changes that seem to make a lot of difference. It was an impressive research effort to get to the bottom of it.

@paulhauner (Member):

I'll merge this because there doesn't seem to be anything substantial outstanding and @michaelsproul said it's ready for "review/merge".

@paulhauner paulhauner merged commit a403138 into sigp:unstable Jan 23, 2024
27 of 28 checks passed
@michaelsproul michaelsproul deleted the http-stack-overflow-fix branch January 23, 2024 04:52
danielramirezch pushed a commit to danielramirezch/lighthouse that referenced this pull request on Feb 14, 2024:

Reduce size of futures in HTTP API to prevent stack overflows (sigp#5104)

* Box::pin a few big futures

* Arc the blocks early in publication

* Fix more tests