This repository has been archived by the owner on Oct 23, 2022. It is now read-only.

feat: ipfs-unixfs get or "walk over anything" #189

Merged on Jun 17, 2020 (76 commits)

koivunej
Collaborator

@koivunej koivunej commented Jun 15, 2020

Adds the /get endpoint, which passes the modified conformance tests (the PR for those has not been created yet). There are "a few" commits accumulated over several days, but sadly I did not find a way to make this any smaller.

Most importantly, this adds ipfs_unixfs::walk::Walker, which can be used to walk the whole dag-pb/unixfs tree in a fashion useful when exporting files to the filesystem or creating a tarball out of them. If you look at earlier commits, this used to be ipfs_unixfs::dir::walk::Walker. Initially it also had a much less ergonomic API, but that has been ironed out.
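To give reviewers a feel for the shape of such a walk, here is a minimal, self-contained sketch of a walker that never loads anything itself: the caller asks which node is wanted next, fetches it, and feeds it back. This is a toy model, not the API of ipfs_unixfs::walk::Walker; `Id`, `Node`, `Visited`, `ToyWalker`, `walk_all` and `sample_store` are all hypothetical names, and plain integers stand in for Cids and dag-pb blocks.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a Cid.
type Id = u32;

// Hypothetical stand-in for a decoded dag-pb/unixfs node.
enum Node {
    File(Vec<u8>),
    Dir(Vec<(String, Id)>),
}

// What a single step of the walk reports back to the caller.
enum Visited {
    File { path: String, bytes: usize },
    Dir { path: String },
}

// Toy walker: keeps a stack of pending (path, id) pairs and does no I/O.
struct ToyWalker {
    pending: Vec<(String, Id)>,
}

impl ToyWalker {
    fn new(root: Id, name: &str) -> Self {
        ToyWalker { pending: vec![(name.to_string(), root)] }
    }

    // The id the caller should load next, if the walk is not finished.
    fn next_wanted(&self) -> Option<Id> {
        self.pending.last().map(|(_, id)| *id)
    }

    // Feed the node for `next_wanted()`; directory links are queued.
    fn feed(&mut self, node: &Node) -> Option<Visited> {
        let (path, _) = self.pending.pop()?;
        match node {
            Node::File(data) => Some(Visited::File { path, bytes: data.len() }),
            Node::Dir(links) => {
                // Push in reverse so children are visited in listing order.
                for (name, id) in links.iter().rev() {
                    self.pending.push((format!("{}/{}", path, name), *id));
                }
                Some(Visited::Dir { path })
            }
        }
    }
}

// Drives the walker against an in-memory store, directories before contents.
fn walk_all(store: &HashMap<Id, Node>, root: Id, name: &str) -> Vec<String> {
    let mut walker = ToyWalker::new(root, name);
    let mut out = Vec::new();
    while let Some(id) = walker.next_wanted() {
        let node = match store.get(&id) {
            Some(node) => node,
            None => break, // missing block: stop the toy walk
        };
        match walker.feed(node) {
            Some(Visited::Dir { path }) => out.push(format!("{}/", path)),
            Some(Visited::File { path, bytes }) => {
                out.push(format!("{} ({} bytes)", path, bytes))
            }
            None => break,
        }
    }
    out
}

fn sample_store() -> HashMap<Id, Node> {
    let mut store = HashMap::new();
    store.insert(1, Node::Dir(vec![("a.txt".to_string(), 2), ("sub".to_string(), 3)]));
    store.insert(2, Node::File(b"hello".to_vec()));
    store.insert(3, Node::Dir(vec![("b.txt".to_string(), 4)]));
    store.insert(4, Node::File(b"hi".to_vec()));
    store
}

fn main() {
    for line in walk_all(&sample_store(), 1, "root") {
        println!("{}", line);
    }
}
```

Visiting a directory before its entries is the order a tarball or a filesystem export needs, since the parent has to exist before its children are written.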

This PR includes some minor changes to path walking to fix issues introduced by refactoring ipfs_unixfs::UnexpectedNodeType and others spotted while eyeballing the code for correctness.

Reviewing:

  • I'd start with the 1k walk.rs file and look at the tests first
  • for a more complicated use of the walker, see the /get implementation

Making the Walker API more ergonomic resulted in a few more internal unwraps, which I think are all fine.

Pending in this PR:

  • add hamtshard checks when loading those to Walker
  • get windows specific code to compile
  • remove the double buffering in tar helper

Later:

  • unify InnerKind in walk.rs Root versions and use Cids in all of them (or none of them)
  • find a way to expose more of the Entry values at ContinuedWalk level; not sure how to do this without it becoming a self-referential struct, perhaps Walker::continue_walk could become Walker::continue_walk(&mut self, ...)? The current item could always be moved to the ContinuedWalk.
  • migrate IpfsPath and ipfs_http::v0::refs::walk_path over to ipfs under ipfs::Ipfs::resolve and merge the IpfsPath, leave only supported features
  • cleanup ipfs::Ipfs::{get, add} which only work on single block files

Open questions:

    Ok(s) => s,
    Err(e) => {
        eprintln!("IPFS_PATH is not set or could not be read: {}", e);
        std::process::exit(1);
Member

nit: unwrap_or_else can be used here
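A minimal sketch of what the nit suggests, assuming the value being matched is the `Result` returned by `std::env::var("IPFS_PATH")`. The helper `ipfs_path_or_exit` is a hypothetical name introduced only so the snippet can be run without the variable being set.

```rust
use std::env::VarError;

// Hypothetical helper: returns the path, or reports the error and exits.
// `std::process::exit` never returns, so the closure still type-checks
// as producing a `String`.
fn ipfs_path_or_exit(lookup: Result<String, VarError>) -> String {
    lookup.unwrap_or_else(|e| {
        eprintln!("IPFS_PATH is not set or could not be read: {}", e);
        std::process::exit(1)
    })
}

fn main() {
    // A known-good value is passed here so the sketch runs anywhere;
    // the example binary would pass `std::env::var("IPFS_PATH")` instead.
    let ipfs_path = ipfs_path_or_exit(Ok(String::from("/tmp/ipfs")));
    println!("using IPFS_PATH={}", ipfs_path);
}
```

Compared with the `match`, this keeps the happy path on one line and puts the error handling in the closure.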

@koivunej
Collaborator Author

Now that I have the http api and conformance tests going in another branch based on this, and there'll be a lot of rework of unmerged code, I'll probably scrap this review and open a new one a bit later.

@koivunej
Collaborator Author

It appears I ended up just continuing where I left off. I'll try to write a better PR description before marking this ready for review.

@koivunej
Collaborator Author

This seems to be ready for review, with the linked tar-rs issue and such! Even the CI agrees: only transient failures are left (a SIGSEGV on macOS, and nightly probably timing out after 23 min without a test log).

@koivunej koivunej requested a review from ljedrz June 17, 2020 10:22

#[derive(Debug)]
enum GetError {
    NonUtf8Symlink,
Member

Wouldn't we want to provide the underlying bytes of a symlink that is not UTF-8? It might be "mangled" in a way that is not visible to the user, so unless this error is only raised immediately when given a direct symlink, it would be a good idea to indicate where the issue is.

Collaborator Author

This error will actually be eaten by hyper, as it currently doesn't support returning trailer headers, so no one would ever see the bytes. The tar functionality is not exposed in the ipfs crate. This is by design: we can't currently do it with the existing tar-rs interfaces.

Collaborator Author

@koivunej koivunej Jun 17, 2020

The "design" idea was that we could provide the means to export to the local filesystem, which we almost do with ipfs::unixfs::ll::walk::Walker (or ipfs_unixfs::walk::Walker). Exporting to the filesystem safely is a somewhat different task, and I don't want to grow this already massive PR to cover it.
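For reference, the suggestion from this thread could be sketched roughly as below, should the error ever become visible to callers. This is a standalone illustration and not the code in this PR: the payload on `NonUtf8Symlink` and the `symlink_target` helper are hypothetical, and the PR's own `GetError` enum is only partially visible in the excerpt above.

```rust
use std::fmt;

#[derive(Debug)]
enum GetError {
    // Hypothetical payload: the raw link target that failed UTF-8 validation.
    NonUtf8Symlink(Vec<u8>),
}

impl fmt::Display for GetError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // The lossy conversion marks invalid sequences so that the
            // mangled part of the target becomes visible in the message.
            GetError::NonUtf8Symlink(bytes) => write!(
                f,
                "symlink target is not valid UTF-8: {:?}",
                String::from_utf8_lossy(bytes)
            ),
        }
    }
}

impl std::error::Error for GetError {}

// Hypothetical helper: validates a raw link target, keeping the bytes on failure.
fn symlink_target(raw: Vec<u8>) -> Result<String, GetError> {
    String::from_utf8(raw).map_err(|e| GetError::NonUtf8Symlink(e.into_bytes()))
}

fn main() {
    match symlink_target(vec![b'a', 0xff, b'b']) {
        Ok(target) => println!("target: {}", target),
        Err(e) => println!("error: {}", e),
    }
}
```

`FromUtf8Error::into_bytes` hands back the original buffer, so keeping the bytes costs no extra copy.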

Joonas Koivunen and others added 25 commits June 17, 2020 17:27
two buffers would allow better reusing, but only with concurrency.
this does not hit the buffer cycling cases but hits all other file
cases.
this is probably being overly strict but at least there will not be any
misunderstandings.
also the file was forgotten.
added hints about users preferring to use `ipfs_unixfs::walk::Walker`.
with patching we can run all of our outstanding work.
Including mostly comment fixes and removal of an extra &mut.

Co-authored-by: ljedrz <ljedrz@users.noreply.github.com>
docs, comments and minor API changes again.

Co-authored-by: ljedrz <ljedrz@users.noreply.github.com>
Co-authored-by: ljedrz <ljedrz@users.noreply.github.com>
@koivunej
Collaborator Author

Slight rebasing, merging.

@koivunej koivunej merged commit f5bad26 into rs-ipfs:master Jun 17, 2020
@koivunej koivunej deleted the feat_initial_get branch September 24, 2020 12:55
Development

Successfully merging this pull request may close these issues.

UnixFS reading (or exporter)
2 participants