Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Read CAR Files #254

Merged
merged 24 commits into from
Mar 10, 2020
Merged

Read CAR Files #254

merged 24 commits into from
Mar 10, 2020

Conversation

ec2
Copy link
Member

@ec2 ec2 commented Mar 6, 2020

Summary of changes
Changes introduced in this pull request:

  • Reads car files

Reference issue to close (if applicable)

Closes #222

Other information and links

ipld/car/Cargo.toml Outdated Show resolved Hide resolved
ipld/car/Cargo.toml Outdated Show resolved Hide resolved
ipld/car/src/lib.rs Outdated Show resolved Hide resolved
}

/// Loads a CAR buffer into a BlockStore
pub fn load_car<R: Read, B: BlockStore>(s: &mut B, buf_reader: BufReader<R>) -> Result<(), Error> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need a mutable reference to the database, this will cause a headache for anyone using

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True. I just assumed you need it as mut becuase we're writing, but that is in fact not the case. fixed

let block = car_reader.next_block()?;
let check_cid = Cid::new_from_prefix(&block.cid.prefix(), &block.data)
.map_err(|e| Error::Other(e.to_string()))?;
assert_eq!(&check_cid, &block.cid);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove the assert now, you shouldn't be asserting on inputs not handled in this utility

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed. forgot to remove

Comment on lines 42 to 49
let (_len, buf) = ld_read(&mut buf_reader)?;
let header: CarHeader = from_slice(&buf).map_err(|e| Error::ParsingError(e.to_string()))?;
if header.roots.is_empty() {
return Err(Error::ParsingError("empty CAR file".to_owned()));
}
if header.version != 1 {
return Err(Error::InvalidFile("CAR file version must be 1".to_owned()));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

arguably this header read should not be in the reader constructor? Thoughts?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be in the reader constructor, I'd say. This allows us to parse and verify the header to fail early if its not a valid header

}

pub(crate) fn read_node<R: Read>(buf_reader: &mut R) -> Result<(Cid, Vec<u8>), Error> {
let (_len, buf) = ld_read(buf_reader)?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this first read discarded?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually ld_read doesnt need to return len. So i will remove

Comment on lines 25 to 46
pub(crate) fn read_cid(buf: &[u8]) -> Result<(Cid, u64), Error> {
// TODO: Add some checks for cid v0
let (version, buf) =
unsigned_varint::decode::u64(buf).map_err(|e| Error::ParsingError(e.to_string()))?;
let (codec, multihash_with_data) =
unsigned_varint::decode::u64(buf).map_err(|e| Error::ParsingError(e.to_string()))?;
// multihash part
let (_hashcode, buf) = unsigned_varint::decode::u64(multihash_with_data)
.map_err(|e| Error::ParsingError(e.to_string()))?;
let hashcode_len_diff = multihash_with_data.len() - buf.len();
let (len, _) =
unsigned_varint::decode::u64(buf).map_err(|e| Error::ParsingError(e.to_string()))?;

let cid: Cid = Cid::new(
cid::Codec::from(codec)?,
cid::Version::from(version)?,
cid::multihash::Multihash::from_bytes(
multihash_with_data[0..=(len as usize + hashcode_len_diff)].to_vec(),
)?,
);
let len = cid.to_bytes().len() as u64;
Ok((cid, len))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't make sense to me why you convert the reader into a slice, then to use it here as a reader. This seems like logic that should be attached to the Cid crate. Thoughts on making this take in a reader and attaching it to Cid crate? You wouldn't need that ld_read function or all this indexing that could panic the program on a bad input

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a todo for read_Cid using a bufreader

Comment on lines 27 to 32
use self::Error::*;

match self {
ParsingError(_) => "Failed to parse CAR file",
Error::InvalidFile(_) => "Invalid CAR file",
Other(_) => "Other Cid Error",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why the use if you are going to specify the enum prefix? Also, this is consuming the error messages, what is the point of this error type? If it doesn't matter then just remove this type and make the error type &'static str

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True

ipld/car/src/lib.rs Outdated Show resolved Hide resolved
Comment on lines 4 to 14
use blockstore::BlockStore;
use cid::Cid;
use forest_encoding::from_slice;
use serde::{Deserialize, Serialize};
use std::io::{BufReader, Read};

mod error;
mod util;
use crate::util::read_node;
use error::*;
use util::ld_read;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can you move the mods to the top and group all uses together?

ipld/car/src/error.rs Outdated Show resolved Hide resolved
@ec2 ec2 requested a review from austinabell March 10, 2020 15:48
@ec2 ec2 merged commit 0c45139 into master Mar 10, 2020
@ec2 ec2 deleted the ec2/car branch March 10, 2020 17:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement CAR file utility
3 participants