forest-cli archive export --epoch=X #3159

Closed
9 tasks done
Tracked by #3141
lemmih opened this issue Jul 10, 2023 · 6 comments · Fixed by #3169
@lemmih
Contributor

lemmih commented Jul 10, 2023

A Forest node can generate up-to-date snapshot files. However, we often want to take large snapshots and trim them to a specific epoch. This can be done entirely without network access.

The forest-cli archive export command takes a snapshot file as input and yields a smaller snapshot file as output. For example, the input snapshot might cover epochs 1 to 5000, and the output snapshot could cover epochs 1000 to 2000.
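To make the epoch-range example concrete, here is a minimal toy model of the trimming step. `ToyTipset`, `toy_snapshot`, and `trim_to_range` are hypothetical names made up for illustration and are not Forest types; a real snapshot also carries state data, which this sketch ignores.

```rust
/// Toy stand-in for a tipset: only the epoch matters in this sketch.
#[derive(Debug, Clone, PartialEq)]
struct ToyTipset {
    epoch: i64,
}

/// Build a toy snapshot covering `first..=last`, one tipset per epoch.
fn toy_snapshot(first: i64, last: i64) -> Vec<ToyTipset> {
    (first..=last).map(|epoch| ToyTipset { epoch }).collect()
}

/// Keep only the tipsets whose epoch lies in `first..=last`.
fn trim_to_range(input: &[ToyTipset], first: i64, last: i64) -> Vec<ToyTipset> {
    input
        .iter()
        .filter(|ts| (first..=last).contains(&ts.epoch))
        .cloned()
        .collect()
}
```

With an input built by `toy_snapshot(1, 5000)`, calling `trim_to_range(&input, 1000, 2000)` keeps the 1001 tipsets from epoch 1000 through epoch 2000.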

Subtasks:

  • Take CAR file as input,
  • Index CAR file using car-backed Blockstore,
  • Get recent-roots and epoch from cli flags,
  • Use export function to generate snapshot data,
  • Compress with zst.

Error handling:

  • recent-roots too low (less than chain_finality),
  • recent-roots too high (greater than available roots),
  • epoch not in CAR input file,
  • input file may be malformed or corrupted.
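
The error cases above could be checked roughly as follows. This is only a sketch: `ExportError` and `validate` are hypothetical names, `chain_finality` is passed in as a plain number, and "available roots" is assumed to mean the number of epochs from the oldest epoch in the input up to the requested one. A malformed file would normally be detected while reading the CAR data; the inverted-range check here is just a stand-in for that.

```rust
use std::fmt;

/// Hypothetical error type mirroring the four failure cases listed above.
#[derive(Debug, PartialEq)]
enum ExportError {
    RecentRootsTooLow { requested: i64, chain_finality: i64 },
    RecentRootsTooHigh { requested: i64, available: i64 },
    EpochNotInInput { epoch: i64 },
    MalformedInput(String),
}

impl fmt::Display for ExportError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ExportError::RecentRootsTooLow { requested, chain_finality } => write!(
                f,
                "recent-roots {requested} is below chain_finality {chain_finality}"
            ),
            ExportError::RecentRootsTooHigh { requested, available } => write!(
                f,
                "recent-roots {requested} exceeds the {available} available roots"
            ),
            ExportError::EpochNotInInput { epoch } => {
                write!(f, "epoch {epoch} is not in the input CAR file")
            }
            ExportError::MalformedInput(reason) => write!(f, "malformed input: {reason}"),
        }
    }
}

impl std::error::Error for ExportError {}

/// Check the requested parameters against the epochs `oldest..=newest`
/// found in the input file.
fn validate(
    epoch: i64,
    recent_roots: i64,
    oldest: i64,
    newest: i64,
    chain_finality: i64,
) -> Result<(), ExportError> {
    // Stand-in for real corruption checks done while reading the CAR file.
    if oldest > newest {
        return Err(ExportError::MalformedInput(format!(
            "oldest epoch {oldest} is after newest epoch {newest}"
        )));
    }
    if epoch < oldest || epoch > newest {
        return Err(ExportError::EpochNotInInput { epoch });
    }
    if recent_roots < chain_finality {
        return Err(ExportError::RecentRootsTooLow {
            requested: recent_roots,
            chain_finality,
        });
    }
    // Assumption: one root per epoch from `oldest` up to and including `epoch`.
    let available = epoch - oldest + 1;
    if recent_roots > available {
        return Err(ExportError::RecentRootsTooHigh {
            requested: recent_roots,
            available,
        });
    }
    Ok(())
}
```

For example, with an input covering epochs 1 to 5000 and an illustrative `chain_finality` of 900, `validate(2000, 900, 1, 5000, 900)` succeeds, while asking for epoch 6000 fails with `EpochNotInInput`.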

Export function

pub async fn export<W, D>(
    &self,
    tipset: &Tipset,
    recent_roots: ChainEpoch,
    writer: W,
    compressed: bool,
    skip_checksum: bool,
) -> Result<Option<digest::Output<D>>, Error>
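
One plausible reading of the `Option` in the return type, not confirmed anywhere in this thread, is that the digest is only produced when `skip_checksum` is false. The toy below illustrates that shape with the standard library only. `export_toy` and `fnv1a` are hypothetical helpers, and FNV-1a is swapped in for a real digest purely to keep the sketch self-contained.

```rust
use std::io::{self, Write};

/// FNV-1a (64-bit), a simple non-cryptographic hash used here only as a
/// stand-in for a real digest.
fn fnv1a(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325;
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3);
    }
    hash
}

/// Write `payload` to `writer` and return `Some(checksum)` unless
/// `skip_checksum` is set, in which case return `None`.
fn export_toy<W: Write>(
    payload: &[u8],
    mut writer: W,
    skip_checksum: bool,
) -> io::Result<Option<u64>> {
    writer.write_all(payload)?;
    writer.flush()?;
    Ok(if skip_checksum {
        None
    } else {
        Some(fnv1a(payload))
    })
}
```

Passing `&mut Vec<u8>` as the writer is enough to exercise both branches in a test.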

ruseinov self-assigned this on Jul 10, 2023
@lemmih
Contributor Author

lemmih commented Jul 10, 2023

Testing can be done by downloading a calibnet snapshot and re-exporting it in various ways (different offsets, different number of tipsets).

@lemmih
Contributor Author

lemmih commented Jul 10, 2023

Automated testing with the MemoryDB would also be nice.

@ruseinov
Contributor

@lemmih Do we also want the checksum here?

@lemmih
Contributor Author

lemmih commented Jul 11, 2023

> @lemmih Do we also want the checksum here?

We can leave out the checksum for now.

@ruseinov
Contributor

One more question: do we really want to call a param recent-roots? I would probably do something like epoch-from, epoch-to and make sure that epoch-from is at least chain_finality (defaulting to it too).

We could come up with even better naming perhaps: --epoch and --depth.

@lemmih
Contributor Author

lemmih commented Jul 11, 2023

> One more question: do we really want to call a param recent-roots? I would probably do something like epoch-from, epoch-to and make sure that epoch-from is at least chain_finality (defaulting to it too).
>
> We could come up with even better naming perhaps: --epoch and --depth.

--epoch and --depth looks good.
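
For reference, the agreed flag names could be parsed roughly like this. It is a hand-rolled sketch using only the standard library, so `ExportArgs` and `parse_export_args` are hypothetical, and the actual CLI may well use an argument-parsing crate instead. It assumes `--depth` falls back to a caller-supplied default (for instance chain_finality, as suggested above) and that `--epoch` is required.

```rust
/// Hypothetical parsed form of the `--epoch` and `--depth` flags.
#[derive(Debug, PartialEq)]
struct ExportArgs {
    epoch: i64,
    depth: i64,
}

/// Parse `--epoch=N` and `--depth=N` from `args`.
/// `--depth` falls back to `default_depth` when omitted; `--epoch` is required.
fn parse_export_args(args: &[&str], default_depth: i64) -> Result<ExportArgs, String> {
    let mut epoch = None;
    let mut depth = default_depth;
    for arg in args {
        if let Some(value) = arg.strip_prefix("--epoch=") {
            let parsed = value
                .parse::<i64>()
                .map_err(|e| format!("invalid --epoch: {e}"))?;
            epoch = Some(parsed);
        } else if let Some(value) = arg.strip_prefix("--depth=") {
            depth = value
                .parse::<i64>()
                .map_err(|e| format!("invalid --depth: {e}"))?;
        } else {
            return Err(format!("unknown argument: {arg}"));
        }
    }
    let epoch = epoch.ok_or_else(|| "missing --epoch".to_string())?;
    Ok(ExportArgs { epoch, depth })
}
```

So `parse_export_args(&["--epoch=2000"], 900)` yields an epoch of 2000 with the default depth of 900, and leaving out `--epoch` is reported as an error.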

ruseinov changed the title from "forest-cli archive snapshot --epoch=X" to "forest-cli archive export --epoch=X" on Jul 12, 2023