Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add a forest-tool subcommand that generates a merged actor bundle #3325

Merged
merged 18 commits into from
Aug 31, 2023

Conversation

hanabi1224
Copy link
Contributor

@hanabi1224 hanabi1224 commented Aug 4, 2023

Summary of changes

Changes introduced in this pull request:

  • Move shared CAR concat functions into build mod to avoid code duplication
  • Add forest-tool state-migration actor-bundle subcommand that generates a merged actor bundle
  • Removed src/mod.rs

Reference issue to close (if applicable)

Closes #3324

Other information and links

Change checklist

  • I have performed a self-review of my own code,
  • I have made corresponding changes to the documentation,
  • I have added tests that prove my fix is effective or that my feature works (if possible),
  • I have made sure the CHANGELOG is up-to-date. All user-facing changes should be reflected in this document.

@hanabi1224 hanabi1224 marked this pull request as ready for review August 4, 2023 09:41
@hanabi1224 hanabi1224 requested a review from a team as a code owner August 4, 2023 09:41
@hanabi1224 hanabi1224 requested review from ruseinov and lemmih and removed request for a team August 4, 2023 09:41
src/mod.rs Outdated

pub type BlockPair = (Cid, Vec<u8>);

pub fn read_car_as_stream<R>(reader: CarReader<R>) -> impl Stream<Item = BlockPair>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use the new and improved crate::utils::db::car_stream::CarStream

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm that will require bringing CarStream to the context of build script, might be too much effort

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see. Let's change car_cmd.rs to not use the build module.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see. Let's change car_cmd.rs to not use the build module.

If I understand this correctly - the proposal is to use the new CarStream in car_cmd.rs, while maintaining the old logic in the build script?

When I look at this it seems to me that removing the workspace completely might have been premature. If we'd be able to have this shared logic in another package - we would easily be able to reuse it within the build script. Perhaps a good enough reason to re-introduce the workspace with at least two packages forest-core and forest-util, for example. That would definitely allow us to get rid of mod.rs as per Aatif's suggestion.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another option is to assemble the bundle when there's a new network version rather than in the build script.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another option is to assemble the bundle when there's a new network version rather than in the build script.

True, there's only one downside to it - anything that's not automated is prone to human error.

I'm fine with either approach as long as we don't have to maintain the outdated build script logic.

@hanabi1224 hanabi1224 marked this pull request as draft August 8, 2023 12:04
@hanabi1224 hanabi1224 changed the title fix: move shared CAR concat functions into build mod feat: add a forest-tool subcommand that generates a merged actor bundle Aug 18, 2023
@hanabi1224 hanabi1224 marked this pull request as ready for review August 18, 2023 08:48
Copy link
Contributor

@ruseinov ruseinov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good, some nits, comments and questions.

}

let all_roots = car_streams
.iter()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if there's anything to be gained in parallelising this. Could be interesting depending on the times it takes to run this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

roots are part of CAR header which has been parsed in CarStream::new so this call is super fast


let mut zstd_encoder = ZstdEncoder::with_quality(
async_fs::File::create(Path::new(DEFAULT_BUNDLE_FILE_NAME)).await?,
async_compression::Level::Precise(22),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this number coming from? I'd at least make it a const and comment on the reasoning.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

22 is the max supported level. I'm not able to find a corresponding const from either zstd or zstd-safe, will make it a const and add a link to the doc in the comment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually there is a function zstd_safe::max_c_level() available, switched

@@ -55,7 +55,7 @@ impl CarCommands {
}
}

fn merge_car_streams<R>(
pub fn merge_car_streams<R>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a big of importing methods defined in cmd into another crate. Perhaps move them into some shared location? That also goes for dedup_block_stream.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved to car_util.rs

// Copyright 2019-2023 ChainSafe Systems
// SPDX-License-Identifier: Apache-2.0, MIT

//! This module contains code that is shared between the library and the build script
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious about this as it does not seem like build.rs has any references to that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It used to do, but not anymore, and that's why I remove it in this PR

@LesnyRumcajs LesnyRumcajs mentioned this pull request Aug 24, 2023
4 tasks
Copy link
Member

@LesnyRumcajs LesnyRumcajs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@hanabi1224 hanabi1224 added this pull request to the merge queue Aug 31, 2023
Merged via the queue into main with commit d38f68c Aug 31, 2023
23 checks passed
@hanabi1224 hanabi1224 deleted the hm/shared-car-concat-fn branch August 31, 2023 13:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

src/mod.rs is weird and should probably be removed
4 participants