Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: fetch arbitrary state-trees #2979

Merged
merged 20 commits into from
Jun 14, 2023
Merged

feat: fetch arbitrary state-trees #2979

merged 20 commits into from
Jun 14, 2023

Conversation

lemmih
Copy link
Contributor

@lemmih lemmih commented Jun 13, 2023

Summary of changes

Changes introduced in this pull request:

  • New command: forest-cli state fetch {CID}
  • The command uses libp2p-bitswap to fetch an IPLD graph in parallel and store it in the database.

Reference issue to close (if applicable)

Closes

Other information and links

Change checklist

  • I have performed a self-review of my own code,
  • I have made corresponding changes to the documentation,
  • I have added tests that prove my fix is effective or that my feature works (if possible),
  • I have made sure the CHANGELOG is up-to-date. All user-facing changes should be reflected in this document.

for new_cid in ipld.iter().filter_map(&mut get_ipld_link) {
counter += 1;
if counter % 1_000 == 0 {
// set RUST_LOG=forest_rpc::state_api=debug to enable these printouts.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe leave a TODO and refer to #2955 if it's meant to be replaced by the desired progress bar

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Showing a progress bar requires knowing how much work is left. We should show some progress indicators in the client, though. Maybe as a separate PR.

response_channel: tx,
})
.await?;
// Bitswap requests do not fail. They are just ignored if no-one has
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI there is a send_dont_have option in BitswapRequest to ask peers to respond even if they don't have the requested block, if that could be beneficial here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not available through the NetworkMessage interface.

const REQUEST_TIMEOUT: Duration = Duration::from_secs(10);

let sem = Arc::new(Semaphore::new(MAX_CONCURRENT_REQUESTS));
let mut seen: HashSet<Cid> = HashSet::new();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe CidHashSet here if collision is not critical

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would use CidHashSet if insert didn't take an extra callback function. :)

Copy link
Member

@LesnyRumcajs LesnyRumcajs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Can we add some tests for this feature? Both where the Cid we know is present and where we know it won't be there.
  2. Could this be used instead of downloading actor bundles from Github? Not in this PR, but just a thought. It would give us some resilience from GH outages (but also make us dependent on other peers).

node/rpc/src/state_api.rs Outdated Show resolved Hide resolved
@lemmih
Copy link
Contributor Author

lemmih commented Jun 14, 2023

  1. Can we add some tests for this feature? Both where the Cid we know is present and where we know it won't be there.

I added a test that fetches the state-root of epoch 1. It should always be available but it's not 100% guaranteed to stay available in the future. If it becomes a problem, we can switch to walking the genesis block which we always have available.

  1. Could this be used instead of downloading actor bundles from Github? Not in this PR, but just a thought. It would give us some resilience from GH outages (but also make us dependent on other peers).

Yes, we should be able to fetch the bundles. I'll leave it for another PR.

@lemmih lemmih enabled auto-merge (squash) June 14, 2023 10:02
@lemmih lemmih merged commit 9e617c5 into main Jun 14, 2023
19 checks passed
@lemmih lemmih deleted the lemmih/bitswap-snapshot branch June 14, 2023 11:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants