Allow lax decoding for invalid bson types #85

Closed
andreasterrius opened this issue Nov 10, 2017 · 7 comments

Comments


andreasterrius commented Nov 10, 2017

I'm currently porting an existing microservice from Java to Rust as a proof of concept. In my current Java microservice, the deserializer always coerces each value to whatever type the Java class can hold, whereas the Rust version returns an error.

For example

struct definition

#[derive(Deserialize, Serialize, Debug)]
#[serde(rename_all = "camelCase")]
pub struct Test {
    #[serde(rename = "_id")]
    test_id : i32,
    amount : i32
}

bson data

{ 
    "_id" : NumberInt(12345), 
    "amount" : 1.0
}

will give me a

BsonSerializationError(InvalidType("i32"))

Is there a way to enable or implement lax deserialization in bson-rs?

@andreasterrius andreasterrius changed the title Allow lax decoding for mongo objects Allow lax decoding for invalid bson types Nov 10, 2017
Contributor

zonyitoo commented Nov 11, 2017

cc @kyeah

There may be some ways to do that. But should it be an "expected" behavior for all users?

I personally prefer strong typing; otherwise you never know what is happening before or after serialization and deserialization. For example, in your case, amount may be changed to the i32 value 1 after serialization.

Author

andreasterrius commented Nov 11, 2017

I prefer strong typing too, but the library user should be able to implement or use lax deserialization (or a custom deserialization rule) if possible.

Anyway, if I want to try implementing this in bson-rs, what would be the suggested way of doing it?
Some pointers on where to look would help a lot.

Contributor

kyeah commented Nov 11, 2017

This issue looks similar to what we experienced type-coercing unsigned values to floating points on serialization: #72. In particular, it seems like we'd want to introduce a bson::compat::i2f module like this:

#[derive(Deserialize, Serialize, Debug)]
#[serde(rename_all = "camelCase")]
pub struct Test {
    #[serde(rename = "_id")]
    test_id : i32,
    #[serde(with = "bson::compat::i2f")]
    amount : i32
}

Where i2f would contain functionality to convert between application-layer integers and database-layer floating points. See the u2f compatibility module defined here: https://github.com/zonyitoo/bson-rs/blob/master/src/compat/u2f.rs.

I'm not sure if this is something we'd like to support explicitly by providing that module in the repo, but this should be something that can be built and applied externally.
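
As a rough illustration of the conversion such a module would wrap, here is a std-only sketch. It assumes the module only has to map i32 to f64 and back; the function names `int_to_float` and `float_to_int` are hypothetical and do not exist in bson-rs, and the serde `serialize`/`deserialize` wiring is left out.

```rust
// Hypothetical conversion core for an `i2f`-style compatibility module.
// Nothing here is part of bson-rs; the serde wiring is omitted.

/// Application-layer i32 -> database-layer f64. Always exact, because
/// every i32 is representable as an f64.
pub fn int_to_float(value: i32) -> f64 {
    f64::from(value)
}

/// Database-layer f64 -> application-layer i32. Rejects values that an
/// i32 cannot hold exactly instead of silently changing them.
pub fn float_to_int(value: f64) -> Result<i32, String> {
    // NaN, infinities, and values with a fractional part are refused.
    if value.is_nan() || value.fract() != 0.0 {
        return Err(format!("{} is not a whole number", value));
    }
    // Whole numbers outside the i32 range are refused as well.
    if value < f64::from(i32::MIN) || value > f64::from(i32::MAX) {
        return Err(format!("{} does not fit in an i32", value));
    }
    Ok(value as i32)
}
```

With this sketch, `float_to_int(1.0)` yields `Ok(1)`, while `float_to_int(1.5)` and `float_to_int(3e9)` return errors. Whether a lax module should reject such values or coerce them is a design choice the thread leaves open.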

Author

andreasterrius commented Nov 12, 2017

I ended up making something like

#[allow(non_snake_case)]
pub mod LaxNumber {
    // Deserialize must be in scope for the f64::deserialize calls below.
    use serde::{Deserialize, Deserializer};

    pub trait LaxFromF64 {
        fn from(v : f64) -> Self;
    }

    macro_rules! impl_laxfrom64 {
        ($to:ident) => (
            impl LaxFromF64 for $to {
                fn from(v : f64) -> $to {
                    v as $to
                }
            }
        )
    }

    impl_laxfrom64!(i8);
    impl_laxfrom64!(i16);
    impl_laxfrom64!(i32);
    impl_laxfrom64!(i64);
    impl_laxfrom64!(f32);
    impl_laxfrom64!(f64);
    
    pub fn deserialize<'de, T, D>(d: D) -> Result<T, D::Error>
        where D: Deserializer<'de>,
              T: LaxFromF64
    {
        f64::deserialize(d).map(T::from)
    }

    pub fn deserialize_nullable<'de, T, D>(d: D) -> Result<Option<T>, D::Error>
        where D: Deserializer<'de>,
              T: LaxFromF64
    {
        // Convert the inner value only when one is present.
        Option::<f64>::deserialize(d).map(|x| x.map(T::from))
    }
}
#[derive(Deserialize, Serialize, Debug)]
#[serde(rename_all = "camelCase")]
pub struct Test {
    #[serde(rename = "_id")]
    test_id : i32,

    #[serde(deserialize_with = "LaxNumber::deserialize")]
    amount : i32,

    #[serde(deserialize_with = "LaxNumber::deserialize_nullable")]
    amount_nullable : Option<i32>
}

Is this the right way to do this?
Won't the value be deserialized into an f64 first (regardless of whether the stored value is actually an i64, i32, or f64) before being converted to the wanted type? Would this be a good approach?

@zonyitoo
Contributor

Well, yes, it is a good approach.
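
One caveat with the f64 detour asked about above: the `as` cast truncates and saturates without reporting anything, and an i64 routed through f64 can lose precision. The std-only sketch below illustrates this; `lax_i32` and `via_f64` are hypothetical helper names that mirror the `v as $to` cast from the macro and are not part of bson-rs.

```rust
/// Mirrors the `v as $to` cast from the macro for the i32 case.
/// Float-to-int `as` casts truncate toward zero, saturate at the
/// integer bounds, and map NaN to 0.
pub fn lax_i32(v: f64) -> i32 {
    v as i32
}

/// Routes an i64 through f64 and back, as happens when every stored
/// number is first read as an f64.
pub fn via_f64(v: i64) -> i64 {
    (v as f64) as i64
}
```

For example, `lax_i32(1.9)` gives 1, `lax_i32(3e9)` gives `i32::MAX`, and `via_f64(9_007_199_254_740_993)` gives 9_007_199_254_740_992, because f64 cannot represent every integer above 2^53.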

Contributor

kyeah commented Nov 15, 2017

Hey @xyten! Sorry for the delay; understanding your example a little better now. I believe you may need to implement a custom visitor to deserialize directly from multiple types:

use std::fmt;
use std::marker::PhantomData;

use serde::de::{Deserializer, Error, Visitor};

#[inline]
pub fn deserialize<'de, T, D>(deserializer: D) -> Result<T, D::Error>
    where D: Deserializer<'de>,
          T: From<i32>
{
    // The i32 request is only a hint: a self-describing format calls the
    // visit_x method matching the type that is actually stored.
    deserializer.deserialize_i32(LaxI32Visitor(PhantomData))
}

pub struct LaxI32Visitor<T: From<i32>>(PhantomData<T>);

impl<'de, T> Visitor<'de> for LaxI32Visitor<T> where T: From<i32> {
    type Value = T;

    // serde prefixes this message with "expected".
    fn expecting(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "an integer or floating point number")
    }

    #[inline]
    fn visit_i8<E>(self, value: i8) -> Result<T, E>
        where E: Error
    {
        Ok(T::from(value as i32))
    }

    #[inline]
    fn visit_i16<E>(self, value: i16) -> Result<T, E>
        where E: Error
    {
        Ok(T::from(value as i32))
    }

    #[inline]
    fn visit_i32<E>(self, value: i32) -> Result<T, E>
        where E: Error
    {
        Ok(T::from(value))
    }

    // Lossy: an i64 outside the i32 range wraps around.
    #[inline]
    fn visit_i64<E>(self, value: i64) -> Result<T, E>
        where E: Error
    {
        Ok(T::from(value as i32))
    }

    // Lossy: the fractional part is dropped.
    #[inline]
    fn visit_f32<E>(self, value: f32) -> Result<T, E>
        where E: Error
    {
        Ok(T::from(value as i32))
    }

    // Lossy: the fractional part is dropped.
    #[inline]
    fn visit_f64<E>(self, value: f64) -> Result<T, E>
        where E: Error
    {
        Ok(T::from(value as i32))
    }
}

Something similar to this; I haven't played around with serde de/serialization in a while 😅. Whatever deserializer is being used should call into the visitor's visit_x methods accordingly.

Play around with it and let me know if that helps.
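
To show how a deserializer "calls into" the visit_x methods, here is a std-only toy model of the dispatch. It is not serde or bson-rs code: `Stored`, `NumberVisitor`, `drive`, `LaxI32`, and `StrictI32` are all hypothetical names chosen for illustration.

```rust
/// Toy stand-in for the numeric types a BSON document can store.
pub enum Stored {
    Int32(i32),
    Int64(i64),
    Double(f64),
}

/// Toy stand-in for serde's `Visitor`: one method per stored type.
pub trait NumberVisitor {
    type Value;
    fn visit_i32(self, v: i32) -> Result<Self::Value, String>;
    fn visit_i64(self, v: i64) -> Result<Self::Value, String>;
    fn visit_f64(self, v: f64) -> Result<Self::Value, String>;
}

/// Plays the deserializer's role: inspect the stored type and call the
/// matching `visit_x` method on the visitor.
pub fn drive<V: NumberVisitor>(stored: Stored, visitor: V) -> Result<V::Value, String> {
    match stored {
        Stored::Int32(v) => visitor.visit_i32(v),
        Stored::Int64(v) => visitor.visit_i64(v),
        Stored::Double(v) => visitor.visit_f64(v),
    }
}

/// Accepts any stored number and coerces it to i32 with `as`.
pub struct LaxI32;

impl NumberVisitor for LaxI32 {
    type Value = i32;
    fn visit_i32(self, v: i32) -> Result<i32, String> { Ok(v) }
    fn visit_i64(self, v: i64) -> Result<i32, String> { Ok(v as i32) }
    fn visit_f64(self, v: f64) -> Result<i32, String> { Ok(v as i32) }
}

/// Accepts only a stored Int32, similar to the strict behavior reported
/// at the top of the thread.
pub struct StrictI32;

impl NumberVisitor for StrictI32 {
    type Value = i32;
    fn visit_i32(self, v: i32) -> Result<i32, String> { Ok(v) }
    fn visit_i64(self, _v: i64) -> Result<i32, String> {
        Err("invalid type: expected i32, found i64".to_string())
    }
    fn visit_f64(self, _v: f64) -> Result<i32, String> {
        Err("invalid type: expected i32, found f64".to_string())
    }
}
```

In this model, `drive(Stored::Double(1.0), LaxI32)` returns `Ok(1)`, while `drive(Stored::Double(1.0), StrictI32)` returns an error. The lax behavior lives entirely in which visit_x methods the visitor chooses to accept.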

Contributor

saghm commented Feb 10, 2020

Given that a reasonable workaround was found and no comments have been added to this issue for a couple of years, I'm going to close this issue.

@saghm saghm closed this as completed Feb 10, 2020