
perf: reduce peak memory usage and open files when loading bindle #440

Merged · 1 commit · May 4, 2022

Conversation

@dicej (Contributor) commented on May 4, 2022:

This patch does two things:

  1. Use `bindle::client::Client::get_parcel_stream` instead of `get_parcel`. The
     latter loads the whole asset into memory, while the former lets us stream
     into a local file chunk-by-chunk (see the sketch below).

  2. Limit parallel I/O in `spin_loader::bindle::assets::Copier` to avoid
     hammering the Bindle server and running out of file descriptors.

This addresses #413.

Signed-off-by: Joel Dice <joel.dice@gmail.com>
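To make change 1 concrete, here is a minimal sketch, assuming `client` is a `bindle::client::Client`, that `get_parcel_stream` yields `Result`-wrapped byte chunks, and that `id`, `parcel`, and `to` name the bindle ID, parcel record, and destination path (these names are illustrative, not the exact code in this diff):

```rust
use futures::{pin_mut, StreamExt};
use tokio::io::AsyncWriteExt;

// Stream the parcel to disk instead of buffering the whole asset in memory.
let stream = client.get_parcel_stream(&id, &parcel.sha256).await?;
pin_mut!(stream); // the stream may not be Unpin, so pin it to the stack

let mut file = tokio::fs::File::create(&to).await?;
while let Some(chunk) = stream.next().await {
    // Each chunk is written as it arrives, so peak memory stays at
    // roughly one chunk rather than the full asset size.
    file.write_all(&chunk?).await?;
}
file.flush().await?;
```

Change 2 is illustrated further down, next to the `buffer_unordered` excerpt.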

@dicej (Contributor, Author) commented on May 4, 2022:

This reduces peak memory usage from ~200MB to 12MB (edit: actually ~22MB once I fixed a bug where only part of each asset was downloaded) for this test case.

@itowlson (Contributor) left a comment:


This looks great, and I learned a ton. Thank you! (The one suggestion is minor.)

```rust
    .await
    .with_context(|| anyhow!("Failed to fetch asset parcel '{}@{}'", self.id, p.sha256))?;

let mut file = fs::File::create(&to).await?;
```
itowlson (Contributor) commented on this line:

Could we add a `with_context(...)` here so that the user can see where the failure was and which file/parcel it occurred on?
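One way the suggestion might read, mirroring the fetch error just above (the message wording is hypothetical, and `to` is assumed to be a `PathBuf`):

```rust
let mut file = fs::File::create(&to)
    .await
    .with_context(|| {
        anyhow!(
            "Failed to create file '{}' for parcel '{}@{}'",
            to.display(),
            self.id,
            p.sha256
        )
    })?;
```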

@dicej (Contributor, Author) replied:

Good catch -- done.

Clippy also caught a bug, so I fixed that, too. Note to self: `for x in stream.try_next().await? { .. }` almost certainly does not do what you want it to do.
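To spell out the pitfall: `try_next().await?` resolves to an `Option` of the next item, and an `Option` is itself iterable, so the `for` loop body runs at most once instead of draining the stream; that is consistent with the "only part of each asset was downloaded" bug mentioned earlier. A sketch of the wrong and right shapes (the `stream` and `file` names are illustrative):

```rust
use futures::TryStreamExt;

// Buggy: iterating the Option returned by try_next() visits at most one
// element, so only the first chunk of the stream is ever written.
for chunk in stream.try_next().await? {
    file.write_all(&chunk).await?;
}

// Intended: keep polling until try_next() returns Ok(None).
while let Some(chunk) = stream.try_next().await? {
    file.write_all(&chunk).await?;
}
```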

On this diff excerpt:

```rust
match stream::iter(parcels.iter().map(|p| self.copy(p, &dir)))
    .buffer_unordered(MAX_PARALLEL_COPIES)
    // ...
    .into_iter()
    .filter_map(|r| r.err())
```
lann (Collaborator) commented on this line:

@adamreese I knew this was implemented somewhere!
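For context: `buffer_unordered(n)` turns a stream of futures into a stream of their results, polling at most `n` of them concurrently; that cap is what bounds the open connections and file descriptors here. A self-contained illustration (the limit of 4 and the doubling job are arbitrary):

```rust
use futures::{stream, StreamExt};

async fn demo() {
    // 100 async jobs, but at most 4 in flight at any moment;
    // results arrive in completion order, not submission order.
    let results: Vec<u32> = stream::iter((0..100u32).map(|i| async move { i * 2 }))
        .buffer_unordered(4)
        .collect()
        .await;
    assert_eq!(results.len(), 100);
}
```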


@lann (Collaborator) left a comment:

This is great! Thank you!

@radu-matei (Member) left a comment:

This is great, thank you so much!
Manually tested and profiled again; the results are in line with what we see when running Spin with local assets.

Thank you again, LGTM.
