Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hitting unreachable code in Chrono (and other direct callers of collect_str) #32

Closed
caemor opened this issue Apr 20, 2021 · 9 comments
Closed
Labels
v1.0.0 Issues that will be addressed in v1.0.0

Comments

@caemor
Copy link

caemor commented Apr 20, 2021

When trying to replace bincode with postcard::to_stdvec I am reaching unreachable-code here:

unreachable!()

I haven't looked further into it yet. I was just trying to do some benchmarks on size (and later speed) for my specific blobs.

Greetings,
Chris

Backtrace
running 1 test
thread 'model::transaction::tests::test_size' panicked at 'internal error: entered unreachable code', /home/caemor/.cargo/registry/src/github.com-1ecc6299db9ec823/postcard-0.6.1/src/ser/serializer.rs:263:9
stack backtrace:
   0: rust_begin_unwind
             at /rustc/cb75ad5db02783e8b0222fee363c5f63f7e2cf5b/library/std/src/panicking.rs:493:5
   1: core::panicking::panic_fmt
             at /rustc/cb75ad5db02783e8b0222fee363c5f63f7e2cf5b/library/core/src/panicking.rs:92:14
   2: core::panicking::panic
             at /rustc/cb75ad5db02783e8b0222fee363c5f63f7e2cf5b/library/core/src/panicking.rs:50:5
   3: <&mut postcard::ser::serializer::Serializer<F> as serde::ser::Serializer>::collect_str
             at /home/caemor/.cargo/registry/src/github.com-1ecc6299db9ec823/postcard-0.6.1/src/ser/serializer.rs:263:9
   4: chrono::datetime::serde::<impl serde::ser::Serialize for chrono::datetime::DateTime<Tz>>::serialize
             at /home/caemor/.cargo/registry/src/github.com-1ecc6299db9ec823/chrono-0.4.19/src/datetime.rs:2067:13
   5: <&mut postcard::ser::serializer::Serializer<F> as serde::ser::SerializeStruct>::serialize_field
             at /home/caemor/.cargo/registry/src/github.com-1ecc6299db9ec823/postcard-0.6.1/src/ser/serializer.rs:384:9
   6: enerdag_core::model::transaction::_::<impl serde::ser::Serialize for enerdag_core::model::transaction::Transaction>::serialize
             at ./src/model/transaction.rs:42:17
   7: serde::ser::impls::<impl serde::ser::Serialize for &T>::serialize
             at /home/caemor/.cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.123/src/ser/impls.rs:390:17
   8: postcard::ser::serialize_with_flavor
             at /home/caemor/.cargo/registry/src/github.com-1ecc6299db9ec823/postcard-0.6.1/src/ser/mod.rs:294:5
   9: postcard::ser::to_stdvec
             at /home/caemor/.cargo/registry/src/github.com-1ecc6299db9ec823/postcard-0.6.1/src/ser/mod.rs:186:5
  10: enerdag_utils::serialize
             at /home/caemor/git/enerdag/enerdag-utils/src/lib.rs:32:5
  11: enerdag_core::model::transaction::tests::test_size
             at ./src/model/transaction.rs:654:17
  12: enerdag_core::model::transaction::tests::test_size::{{closure}}
             at ./src/model/transaction.rs:638:5
  13: core::ops::function::FnOnce::call_once
             at /home/caemor/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:227:5
  14: core::ops::function::FnOnce::call_once
             at /rustc/cb75ad5db02783e8b0222fee363c5f63f7e2cf5b/library/core/src/ops/function.rs:227:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
test model::transaction::tests::test_size ... FAILED
Full Backtrace: https://gist.github.com/caemor/d0b0b793ae8bf8d2ca4eb277da9b836f
@jamesmunns
Copy link
Owner

Hey @caemor, any chance you could put together a minimal repro of this? I think the most important thing would be the struct definition + code necessary to trigger this (and maybe the version of Serde/derive you used).

Once I have that, I'd be happy to look into implementing that, or at least turning it into a more meaningful error

@bearcage
Copy link

@jamesmunns I hit this with chrono's serde feature today. Quick reproducer here, but the tl;dr is that chrono's serde impl directly calls collect_str, with predictable results.

use chrono::{DateTime, Utc};
use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize, Debug)]
pub struct Demo {
    ts: DateTime<Utc>,
}

pub fn main() -> anyhow::Result<()> {
    let dm = Demo { ts: Utc::now() };
    dbg!(&dm);
    let bytes = postcard::to_stdvec(&dm)?;
    dbg!(&bytes);
    Ok(())
}

@jamesmunns
Copy link
Owner

Awesome, thank you @bearcage for the reproduction case! I'll see if I can dive in and either improve the error, or see if there is a way to make it actually work.

I'm open to PRs if anyone gets to this before me :)

@jamesmunns jamesmunns changed the title Hitting unreachable code Hitting unreachable code in Chrono (and other direct callers of collect_str) Jul 14, 2021
@bearcage
Copy link

@jamesmunns I took a look at what this is meant to do this evening to see if I could come up with a reasonable implementation — tl;dr got a few ideas, but nothing that quite universally solves it.

As I understand it, collect_str is meant to allow for using a type's Display implementation to produce an output string, then delegate to the str encoder. Given the goals from postcard's readme of working without an allocator and being memory efficient, I suspect that one's going to be a little tricky to support infallibly, since we don't necessarily know how large a given type's Display output is gonna be.

One potentially interesting way to come at this would be to add a scratch() (or similar) function to the SerFlavor trait to get some space to format in — in other words, allow the flavor to come up with some bytes we can write to, however temporarily. For something that's allowed to allocate, that could just be a string or a buffer.

For the heapless and slice flavors where we can't allocate, we could either error out with an "oops sorry your variable length string stuff needs an allocator," or possibly use the as-yet-unwritten portion of the slice or HVec as a destination buffer for the formatted write. At first glance there seem to be two ways to do that:

  1. Move the cursor out just enough to allow for a varint length prefix, plus a byte (to avoid aliasing during the copy), format in the string, then memmove in to place. Note that this aliasing would necessitate a change in the slice flavor's copy code since core::slice::copy_from_slice is analogous to memcpy — it assumes no aliasing between src and dest. This'd necessitate a little unsafe I suspect.
  2. If working with the aliasing is a concern, another option would be to partition the remaining space, like [varint_padding | remaining/2 | remaining/2] and use the final segment as the write destination buffer. This'd probably stay in the realm of safe rust via split_at_mut or similar.

In both cases it's a little fiddly because the max size for encoding collect_str fields would be dictated by both the output buffer size and the field's placement within the struct (earlier fields would have more buffer left to play with). That second constraint may come as a bit of a surprise for folks.

I haven't messed with actually implementing any of those just yet, though, just musing on it while doing the dishes.

@quietlychris
Copy link

quietlychris commented Feb 5, 2022

Just wanted to add a +1 to this; I also just ran into this issue, which is sort of problematic, as I was sort of hoping to be able to add native chrono timestamps to some messages that I'm de/serializing using postcard. I'm pretty sure I can get away with converting to String types, then using chrono's parse() function on either end to wind up back in the DateTime struct, but it would be nice to avoid those steps manually :)

Also, thanks so much for this crate! It's generally been pretty great to work with, and I'm very appreciative that it's available.

@jamesmunns
Copy link
Owner

This will be fixed in v1.0.0, I've figured out what causes this (serializing dyn Debug items), and at least one way to fix it.

@jamesmunns jamesmunns added the v1.0.0 Issues that will be addressed in v1.0.0 label Jun 15, 2022
@jamesmunns
Copy link
Owner

Hey @caemor and @bearcage, I've addressed this by making collect_str make two passes through the data: one that counts the contents/length to be serialized, and a second that actually formats the data into the buffer.

This isn't optimal, because we make two passes, but in most cases the total data formatted should be relatively short, and hopefully has a lower cost than using @bearcage's memmove style approach, since we are just counting on the first pass, rather than writing then doing a move. This approach would also necessitate a change in the serialization Flavor API, which right now only handles "inserts", not moves.

I'm open to improvements to this implementation, but hopefully this is enough for most usages for now!

@bearcage
Copy link

That'll do the job for me, thanks for coming back to this one!

@caemor
Copy link
Author

caemor commented Jun 15, 2022

Thanks, thats enough for my usage!

bnavetta added a commit to bnavetta/platypos that referenced this issue Aug 14, 2022
Serde supports serializing `Display` values via `collect_str`, without
having to explicitly buffer them ahead of time. Using this for `Debug`
values produced by `tracing` means the ktrace protocol implementation no
longer needs to allocate.

The downside is that postcard implements `collect_str` by processing the
value twice, to determine its length. However, as
jamesmunns/postcard#32 points out, this is
likely fine since the data shouldn't be large.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
v1.0.0 Issues that will be addressed in v1.0.0
Projects
None yet
Development

No branches or pull requests

4 participants