Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cow Deserialize impl seems wasteful #1497

Closed
kaikalii opened this issue Mar 16, 2019 · 7 comments
Closed

Cow Deserialize impl seems wasteful #1497

kaikalii opened this issue Mar 16, 2019 · 7 comments

Comments

@kaikalii
Copy link

The Cow Deserialize implementation seems wasteful:

#[cfg(any(feature = "std", feature = "alloc"))]
impl<'de, 'a, T: ?Sized> Deserialize<'de> for Cow<'a, T>
where
    T: ToOwned,
    T::Owned: Deserialize<'de>,
{
    #[inline]
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
    where
        D: Deserializer<'de>,
    {
        T::Owned::deserialize(deserializer).map(Cow::Owned)
    }
}

Deserialize is implemented for every type in std that implements ToOwned. Rather than deserializing into the ToOwned::Owned type, shouldn't it just directly deserialize the borrowed type so as not to perform unnecessary clones? Something like:

#[cfg(any(feature = "std", feature = "alloc"))]
impl<'de, 'a, T: ?Sized> Deserialize<'de> for Cow<'a, T>
where
    T: ToOwned + Deserialize<'de>,
{
    #[inline]
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
    where
        D: Deserializer<'de>,
    {
        T::deserialize(deserializer).map(Cow::Borrowed)
    }
}
@dtolnay
Copy link
Member

dtolnay commented Mar 17, 2019

Your suggested impl does not compile because Cow::Borrowed contains &T, not T.

error[E0631]: type mismatch in function arguments
   |
   |         T::deserialize(deserializer).map(Cow::Borrowed)
   |                                      ^^^
   |                                      |
   |                                      expected signature of `fn(T) -> _`
   |                                      found signature of `fn(&_) -> _`

@agausmann
Copy link

agausmann commented Sep 4, 2019

I agree that it's not ideal, but it's the best "default" for Cow to simply delegate to the owned value. It's not reasonable to only have a trait bound on the borrowed type, because that can often fail if the deserializer is forced to allocate and return an owned type. The point of Cow is to be able to store either owned or borrowed values. Adding trait bounds for both borrowed and owned is a bit more reasonable based on that logic but might break some use cases and it can't be implemented generically.

For derives, we have #[serde(borrow)] that can be applied to the field and it uses custom code to borrow when possible, but there currently isn't any solution for manual implementations. I think a deserializing wrapper type would be nice to have, and we can implement Deserialize for it where the type is a well-known borrowable like strings or slices.

@agausmann
Copy link

agausmann commented Sep 4, 2019

Here's an example Borrowable type to demonstrate how it could be implemented. I only did strings, but you could easily modify it to also work with slices. (playground)

use std::borrow::Cow;
use std::fmt;
use serde::de::{self, Deserialize, Deserializer, Visitor};

struct Borrowable<'a, T: ?Sized + ToOwned>(Cow<'a, T>);

impl<'a, 'de: 'a> Deserialize<'de> for Borrowable<'a, str> {
    fn deserialize<D>(de: D) -> Result<Self, D::Error>
    where
        D: Deserializer<'de>,
    {
        struct MyVisitor;

        impl<'de> Visitor<'de> for MyVisitor {
            type Value = Borrowable<'de, str>;

            fn expecting(&self, f: &mut fmt::Formatter) -> fmt::Result {
                write!(f, "an owned or borrowed string")
            }

            fn visit_borrowed_str<E>(self, v: &'de str) -> Result<Self::Value, E>
            where
                E: de::Error,
            {
                Ok(Borrowable(Cow::Borrowed(v)))
            }

            fn visit_str<E>(self, v: &str) -> Result<Self::Value, E>
            where
                E: de::Error,
            {
                Ok(Borrowable(Cow::Owned(v.to_string())))
            }

            fn visit_string<E>(self, v: String) -> Result<Self::Value, E>
            where
                E: de::Error,
            {
                Ok(Borrowable(Cow::Owned(v)))
            }
        }

        de.deserialize_str(MyVisitor)
    }
}

@agausmann
Copy link

agausmann commented Sep 4, 2019

Just to clarify the point about generic implementations - you might try to implement it like this:

impl<'a, 'de: 'a, T> Deserialize<'de> for Borrowable<'a, T>
where
    T: ?Sized + ToOwned,
    &'a T: Deserialize<'de>,
    T::Owned: Deserialize<'de>,
{
    fn deserialize<D>(de: D) -> Result<Self, D::Error>
    where
        D: Deserializer<'de>,
    {
        <&T>::deserialize(de)
            .map(|x| Borrowable(Cow::Borrowed(x)))
            .or_else(|_| <T::Owned>::deserialize(de)
                .map(|x| Borrowable(Cow::Owned(x))))
    }
}

This fails with an error: use of moved value: `de` , because deserialize takes the deserializer by value, consuming it. Another option would be to replace de with &de, but we can't can't assume that &D: Deserializer<'de> because it is not implemented generically and such an implementation is impossible due to the nature of Deserializer

@jplatte
Copy link
Contributor

jplatte commented Oct 26, 2020

If at some point in the future, Rust has really powerful specialization support (such that even lifetime specialization works), serde could provide an impl<'de> Deserialize<'de> for Cow<'de, str> (and the same for Cow<'de, [u8]> I guess) that specializes the much more general existing impl.

In the meantime you can use #[serde(borrow)] for Cow struct fields (not sure for what sorts of Cows this works and what exactly it does under the hood but it does produce Cow::Borrowed where possible).

As a replacement for calling Cow<'_, str>::deserialize or String::deserialize directly if you really just want to have a &str but not fail on owned string inputs (such as strings containing escapes in JSON), you can use a visitor that combines the behavior of the String and Str visitors in serde, like this.

@Kixunil
Copy link
Contributor

Kixunil commented Mar 16, 2021

FYI, there's a crate with helper borrowing Cow. The difference between #[serde(borrow)] is this can be used in serde(try_from = ...).

@sffc
Copy link

sffc commented Apr 11, 2021

The Deserialize impl proposed above in #1497 (comment) is already present in core Serde if you opt in to it using #[serde(borrow)] on a Cow:

pub fn borrow_cow_str<'de: 'a, 'a, D, R>(deserializer: D) -> Result<R, D::Error>

The problem with making it the default impl is that unfortunately when a read stream is used as the data source, this impl fails to compile. For example, I took the impl from the comment above and wrote the following test, and the test fails to compile:

#[test]
fn test_borrow() {
    let json_buf = "\"Hello World\"".to_string();
    let data_new: Borrowable<str> = serde_json::from_reader(json_buf.as_bytes())
        .ok()
        .unwrap_or(Borrowable(Cow::Borrowed("fallback")));
    assert!(matches!(data_new.0, Cow::Borrowed("Hello World")));
}

Error message:

    |
148 |               serde_json::from_reader(json_buf.as_bytes()).ok().unwrap_or(Borrowable(Cow::Borrowed("fallback")));
    |               ^^^^^^^^^^^^^^^^^^^^^^^ implementation of `Deserialize` is not general enough
    | 
   ::: /path/to/serde-1.0.123/src/de/mod.rs:530:1
    |
530 | / pub trait Deserialize<'de>: Sized {
531 | |     /// Deserialize this value from the given Serde deserializer.
532 | |     ///
533 | |     /// See the [Implementing `Deserialize`][impl-deserialize] section of the
...   |
568 | |     }
569 | | }
    | |_- trait `Deserialize` defined here
    |
    = note: `Borrowable<'_, str>` must implement `Deserialize<'0>`, for any lifetime `'0`...
    = note: ...but `Borrowable<'_, str>` actually implements `Deserialize<'1>`, for some specific lifetime `'1`

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

6 participants