RON incorrectly adds extra level of Base64 when roundtripping Bytes #436

GoldsteinE · 2023-01-29T10:48:40Z

#[derive(Debug, serde::Deserialize, serde::Serialize)]
#[serde(rename = "b")]
struct BytesVal {
    pub b: bytes::Bytes,
}

#[derive(Debug, serde::Deserialize, serde::Serialize)]
#[serde(untagged)]
enum Bad {
    Bytes(BytesVal),
}

fn main() {
    let v: Bad = ron::from_str(r#"(b: "dGVzdA==")"#).unwrap();
    dbg!(&v);
    let s = ron::to_string(&v).unwrap();
    dbg!(&s);
    let v: Bad = ron::from_str(&s).unwrap();
    dbg!(&v);
    println!("---");
    let v: Bad = serde_json::from_str(r#"{"b": "dGVzdA=="}"#).unwrap();
    dbg!(&v);
    let s = serde_json::to_string(&v).unwrap();
    dbg!(&s);
    let v: Bad = serde_json::from_str(&s).unwrap();
    dbg!(&v);
}

has the following output:

[src/main.rs:23] &v = Bytes(
    BytesVal {
        b: b"dGVzdA==",
    },
)
[src/main.rs:25] &s = "(b:\"ZEdWemRBPT0=\")"
[src/main.rs:27] &v = Bytes(
    BytesVal {
        b: b"ZEdWemRBPT0=",
    },
)
---
[src/main.rs:30] &v = Bytes(
    BytesVal {
        b: b"dGVzdA==",
    },
)
[src/main.rs:32] &s = "{\"b\":[100,71,86,122,100,65,61,61]}"
[src/main.rs:34] &v = Bytes(
    BytesVal {
        b: b"dGVzdA==",
    },
)

It fully survives the roundtrip with serde_json, but adds extra level of Base64 with RON.

The text was updated successfully, but these errors were encountered:

juntyr · 2023-01-29T14:09:46Z

This really is a tricky one! I think though that I now understand what is going on here. The problem is the #[serde(untagged)] which means that the ron string is deserialised without having any type information available. Since ron encodes bytes as base64 encoded strings, but without declaring that they are bytes specifically, the deserialiser just sees a string when no type information is available. When bytes::Bytes then deserialises, it sees a string and just gets its bytes. I.e. even though they were base64 bytes, at no point was ron asked to turn them back into bytes.

ron in general is a format that does not support deserialising without any type information (i.e. anything that touches deserialize_any). However, I agree that this is a very confusing and unfortunate case. The only way of fixing this (to work with untagged enums) that I can think of is to make a breaking change to ron's format in how bytes are represented. It could be as simple as prefixing any byte string with b, i.e. b"base64" during serialisation. During deserialisation, we could continue to accept a string without the b-prefix if type information is available. However, old ron code would no longer be able to read files produced by new code. Those would be tough decisions to make ...

@torkleyy What are your thoughts on this?

torkleyy · 2023-06-20T11:54:49Z

I think having special support built-in for bytes / base64 encoding can be an advantage, even if it means breaking some compatibility. I'm not sure if we can ease the transition somehow to not cause any breakage in practice.

… user-side types

* Switch from base64 to rusty byte strings, deprecate base64 support * Add the Value::Bytes variant * Extend Value tests for Value::String and Value::Bytes * Include byte strings in the RON grammar * Fix ASCII escape decoding for strings and byte strings * Fix byte string error display for #462 test * Fix byte string error test * Add a CHANGELOG entry * Added a deprecation error test for v0.10 * Add tests for v0.9 optional base64 byte string support Co-authored-by: Sebastian Dröge <sebastian@centricular.com> * Add an example for using base64-encoded bytes with ron * Fix formatting in README * Remove outdated extension docs * Add tests for unescaped and raw byte strings * Fix fuzzer-found issue with serialising invalid UTF-8 byte strings * Fix fuzzer found issue with `br#` being parsed as the identifier `br` * Fix parsing of byte escapes in UTF-8 strings to produce proper Unicode characters * Fix fuzzer-found interaction with unwrap_variant_newtypes * Add support for strongly typed byte literals * Add missing Value serialising tests * Add test to show that #436 is solved with strongly typed base64 user-side types * Add more coverage tests --------- Co-authored-by: Sebastian Dröge <sebastian@centricular.com>

juntyr mentioned this issue Feb 6, 2023

Rusty byte strings in RON, deprecate base64 (byte) strings #438

Merged

10 tasks

juntyr added the bug label Apr 23, 2023

juntyr mentioned this issue Aug 18, 2023

Value meta-issue #397

Open

17 tasks

juntyr added a commit to juntyr/ron that referenced this issue Aug 25, 2023

Add test to show that ron-rs#436 is solved with strongly typed base64…

fb31cef

… user-side types

juntyr added a commit to juntyr/ron that referenced this issue Sep 1, 2023

Add test to show that ron-rs#436 is solved with strongly typed base64…

a073fe4

… user-side types

juntyr closed this as completed in #438 Sep 1, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RON incorrectly adds extra level of Base64 when roundtripping Bytes #436

RON incorrectly adds extra level of Base64 when roundtripping Bytes #436

GoldsteinE commented Jan 29, 2023

juntyr commented Jan 29, 2023

torkleyy commented Jun 20, 2023

RON incorrectly adds extra level of Base64 when roundtripping Bytes #436

RON incorrectly adds extra level of Base64 when roundtripping Bytes #436

Comments

GoldsteinE commented Jan 29, 2023

juntyr commented Jan 29, 2023

torkleyy commented Jun 20, 2023