Skip to content
This repository has been archived by the owner on Mar 25, 2024. It is now read-only.

u128 integers >= 64 bits serialize as floats #112

Closed
asomers opened this issue Oct 14, 2018 · 8 comments
Closed

u128 integers >= 64 bits serialize as floats #112

asomers opened this issue Oct 14, 2018 · 8 comments

Comments

@asomers
Copy link
Contributor

asomers commented Oct 14, 2018

The newly released 0.8.6 includes u128 support. However, values of 2^64 or larger serialize as floats instead of integers, as shown by this test case.

#[test]
fn test_u128_big() {
    let thing: u128 = 18446744073709551616;   // 2^64
    let yaml = unindent(
        "
        ---
        18446744073709551616",
    );
    test_serde(&thing, &yaml);
}
---- test_u128_big stdout ----
thread 'test_u128_big' panicked at 'assertion failed: `(left == right)`
  left: `"---\n18446744073709551616"`,
 right: `"---\n18446744073709553000.0"`', tests/test_serde.rs:40:5
note: Run with `RUST_BACKTRACE=1` for a backtrace.

The i128 branch in my fork of this repo does not exhibit the same bug at revision d79d55e.

@dtolnay
Copy link
Owner

dtolnay commented Oct 14, 2018

The behavior in 0.8.6 is intentional. u128 values correctly serialize with full precision as seen in:

fn main() {
    println!("{}", serde_yaml::to_string(&18446744073709551616u128).unwrap());
}

The setup in tests/test_serde.rs does various combinations of passing the data through serde_yaml::Value, which cuts precision to 64 bits.

@dtolnay dtolnay closed this as completed Oct 14, 2018
@asomers
Copy link
Contributor Author

asomers commented Oct 14, 2018

It's not just that test. My application also fails when run with serde-yaml 0.8.6, but correctly serializes with my branch. I'm not explicitly using serde_yaml::Value; I'll try to figure out if some intervening library is.

@dtolnay
Copy link
Owner

dtolnay commented Oct 15, 2018

Sounds good. Let me know if you have a way to reproduce this without serde_yaml::Value involved, and we can reopen.

@asomers
Copy link
Contributor Author

asomers commented Oct 15, 2018

I found the culprit. I actually am using Value. In my application, I have a bunch of Things which contain a u128 field. I want to create a YAML map of Address => Thing. Creating the entire map and serializing it in one go would require holding a lock on every Thing simultaneously. Serializing each Thing one at a time and manually constructing the map wouldn't work; I'd have to manually reindent every line of the YAML strings. Rather than do that, I use serde_yaml::to_value(thing) and build a map of Address => Value. Then I serialize the entire map.

Given that 100% of users who use u128 with serde-yaml need Value support, can we please reintroduce it? Or do you have a better suggestion?

@dtolnay
Copy link
Owner

dtolnay commented Oct 15, 2018

Could you put together a minimal main.rs that demonstrates what your application is doing that doesn't work without Value? I can give suggestions based on that.

@asomers
Copy link
Contributor Author

asomers commented Oct 15, 2018

Here's one:

extern crate serde;
#[macro_use]
extern crate serde_derive;
extern crate serde_yaml;

use std::collections::BTreeMap;

#[derive(Serialize)]
struct Thing(u128);
type Address = u32;

struct Owner {
    id: u32,
    v: Vec<Thing>
}

impl Owner {
    fn add_thing(&mut self, thing: Thing) {
        self.v.push(thing);
    }

    // The Owner owns the Things, but I don't want the Owner itself to be part
    // of the output.  So I cant serialize the entire Owner.  And I can't return
    // a String here, because I need to combine this output with other Owners'
    // output.  And dump can't return references to the Things because of
    // lifetime restrictions.  So instead I convert the Things into
    // serde_yaml::Value objects, which can be returned.
    fn dump(&self, map: &mut BTreeMap<Address, serde_yaml::Value>) {
        for (i, thing) in self.v.iter().enumerate() {
            let key = self.id * 0x10000 + i as Address;
            map.insert(key, serde_yaml::to_value(thing).unwrap());
        }
    }

    fn new(id: u32) -> Self {
        Owner{id, v: Vec::new()}
    }
}

fn main() {
    let mut owner0 = Owner::new(0);
    owner0.add_thing(Thing(0x1_0000));
    owner0.add_thing(Thing(0x1_0000_0000));
    let mut owner1 = Owner::new(1);
    owner1.add_thing(Thing(0x1_0000_0000_0000));
    owner1.add_thing(Thing(0x1_0000_0000_0000_0000));

    let mut map = BTreeMap::new();
    owner0.dump(&mut map);
    owner1.dump(&mut map);
    println!("{}", serde_yaml::to_string(&map).unwrap());
}

@dtolnay
Copy link
Owner

dtolnay commented Oct 15, 2018

Here is how I would recommend writing this so that the things get serialized directly.

#[macro_use]
extern crate serde_derive;

extern crate serde;
extern crate serde_yaml;

use serde::ser::{Serialize, SerializeMap, Serializer};

#[derive(Serialize)]
struct Thing(u128);
type Address = u32;

struct Owner {
    id: u32,
    v: Vec<Thing>,
}

impl Owner {
    fn add_thing(&mut self, thing: Thing) {
        self.v.push(thing);
    }

    fn dump<M: SerializeMap>(&self, map: &mut M) -> Result<(), M::Error> {
        for (i, thing) in self.v.iter().enumerate() {
            let key = self.id * 0x10000 + i as Address;
            map.serialize_entry(&key, thing)?;
        }
        Ok(())
    }

    fn new(id: u32) -> Self {
        Owner { id, v: Vec::new() }
    }
}

struct DumpThings<'a>(&'a [&'a Owner]);

impl<'a> Serialize for DumpThings<'a> {
    fn serialize<S: Serializer>(&self, serializer: S) -> Result<S::Ok, S::Error> {
        let count = None;
        let mut map = serializer.serialize_map(count)?;
        for owner in self.0 {
            owner.dump(&mut map)?;
        }
        map.end()
    }
}

fn main() {
    let mut owner0 = Owner::new(0);
    owner0.add_thing(Thing(0x1_0000));
    owner0.add_thing(Thing(0x1_0000_0000));
    let mut owner1 = Owner::new(1);
    owner1.add_thing(Thing(0x1_0000_0000_0000));
    owner1.add_thing(Thing(0x1_0000_0000_0000_0000));

    let things_to_dump = [&owner0, &owner1];
    let yaml = serde_yaml::to_string(&DumpThings(&things_to_dump)).unwrap();
    println!("{}", yaml);
}

@asomers
Copy link
Contributor Author

asomers commented Oct 15, 2018

What's the advantage of that method over using Value?

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants