Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bincode deserialization could be faster #25

Open
ajeetdsouza opened this issue Oct 29, 2021 · 2 comments
Open

Bincode deserialization could be faster #25

ajeetdsouza opened this issue Oct 29, 2021 · 2 comments
Labels
enhancement New feature or request

Comments

@ajeetdsouza
Copy link

ajeetdsouza commented Oct 29, 2021

I noticed that the structs you're using with serde-based deserializers actually have an owned String in them. However, bincode doesn't need to copy strings (I'm not aware about the others).

I'm actually using this in zoxide right now. Here's my struct definitions:

#[derive(Debug, Deserialize, Serialize)]
pub struct DirList<'a>(#[serde(borrow)] Vec<Dir<'a>>);

#[derive(Clone, Debug, Deserialize, Serialize)]
pub struct Dir<'a> {
    #[serde(borrow)]
    pub path: Cow<'a, str>,
    pub rank: Rank,
    pub last_accessed: Epoch,
}

Since the necessary bytes are already in the Vec<u8> that you're trying to deserialize, bincode will just give you a pointer to it instead of copying a whole new String into memory, which is much faster. You can verify this by checking that the deserialized struct actually contains a Cow::Borrowed. You might be able to use a regular &str too, although I'm not sure how bincode would handle different endianness in that case.

P.S. thanks for your fantastic work on rkyv and these benchmarks! Not only is rkyv's performance amazing, it's ideas like structver that would make this a serious contender for anyone wanting a serialization library. I'm currently using a poor man's structver myself, so the idea of rkyv handling this for me sounds absolutely great.

@djkoloski
Copy link
Owner

Thanks for the suggestion and the kind words. I think that adding zero-copy benchmarks for serde-based crates is a good idea, though some work will have to go into making a set of structs that are compatible. The deserialize benchmark is meant to benchmark the cost of creating a completely owned structure, which is why this would have to go in a new benchmark. This is a bit unintuitive, so I'll add some clarification to the deserialize benchmark description.

The deserialize benchmark bans all zero-copy features so that a fair comparison can be made between the zero-copy frameworks (e.g. rkyv, abomonation) and the more traditional serialization frameworks (e.g. bincode, serde-json). This is so that users who plan on completely deserializing their objects have a benchmark for all of the available options.

Two questions that I have are:

  • How much ZCD should serde-like libraries do? Just strings?
  • Should the deserialize benchmark be measured by deserializing directly into an owned structure, or by deserializing into a borrowed structure and then cloning?

@djkoloski djkoloski self-assigned this Oct 29, 2021
@djkoloski djkoloski added the enhancement New feature or request label Oct 29, 2021
@ajeetdsouza
Copy link
Author

ajeetdsouza commented Oct 29, 2021

How much ZCD should serde-like libraries do? Just strings?

As much as possible. In bincode's case, I don't think they have anything other than strings (str/OsStr/Path/[u8]). For your benchmarks, I think the natural way would be to add another entry for bincode (borrowed) in the first table itself with an asterisk explaining the situation. Bincode doesn't really fit in the ZCD table, because it's generating a native Rust struct, so there's no concept of access / update here.

Should the deserialize benchmark be measured by deserializing directly into an owned structure, or by deserializing into a borrowed structure and then cloning?

If there's a substantial difference, I think I'd want to see both. It would help people writing code in performance-sensitive contexts to know what to expect in each case.

@djkoloski djkoloski removed their assignment Jan 29, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants