Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce collect_seq and collect_map #736

Merged
merged 5 commits into from
Feb 4, 2017
Merged

Conversation

nox
Copy link
Contributor

@nox nox commented Jan 31, 2017

These are two convenience functions to serialize iterators.

@nox
Copy link
Contributor Author

nox commented Jan 31, 2017

@sfackler on IRC taught me that there is a wrapper behind the feature unstable to serialise iterators through a wrapper putting it in a RefCell. I posit that we should not do that, because that doesn't help for struct fields and friends, as it is an opaque thing that will consume the iterator on serialisation.

@nox
Copy link
Contributor Author

nox commented Jan 31, 2017

Silly me, it uses IntoIterator too so it doesn't consume it, but given there is no function bound to this behaviour, it can't be used in #[serde(serialize_with)].

If we think the wrapper is useful, we could keep it but simplify it using mitochondria::MoveCell. :P

@sfackler
Copy link
Contributor

If we were going to do this it'd probably be worth also having one that serializes an iterator of tuples as a map.

@nox
Copy link
Contributor Author

nox commented Jan 31, 2017

@sfackler Should I rename this one to serialize_seq, and make another one serialize_map? We could also make serialize_fixed_size_seq relying on TrustedLen once it's stable too.

@nox nox changed the title Introduce ser::serialize_iter Introduce serialize_seq and serialize_map Jan 31, 2017
@nox nox force-pushed the serialize-iter branch 3 times, most recently from 3c33c5d to 110e799 Compare January 31, 2017 23:29
@dtolnay
Copy link
Member

dtolnay commented Feb 1, 2017

Can you share some example code of how you imagine people using this?

As I see it, there are two primary use cases for these functions:

  • Helpers when implementing Serialize for a type. I don't think the tradeoff of including these functions for this purpose makes sense. Just use the loop way and it is 5 straightforward lines and those 5 lines match a pattern that is common to all other compound type serialization in Serde (tuples, tuple structs, structs, struct variants, tuple variants). In my opinion the typing you save by using these helpers does not outweigh the people stumbling over this when browsing our rustdoc, the confusion of mixing up ser::serialize_seq and Serializer::serialize_seq and needing to think about which one to use where.

  • As a serialize_with function in cases where the type is not from your crate and the original author does not want to add a Serialize implementation. In this case I would prefer to include these in Helper library of useful de/serialize_with functions #553 which is a library dedicated to this purpose, rather than in the main serde crate.

@nox
Copy link
Contributor Author

nox commented Feb 1, 2017

Can you share some example code of how you imagine people using this?

Same way as the unstable iterator thing, except that there is no "this must be serialised only once" as on the Iterator struct. Tuples, tuple structs and structs are similar, but there is no legit reason to support them from an iterator, after all tuples, tuple structs and structs cannot be iterated.

As for the fact that it is just a 5 lines loop, the Pair trait with its two implementations is certainly less trivial to repeat all the time.

IMO this is core enough that it should be in serde.

@dtolnay
Copy link
Member

dtolnay commented Feb 3, 2017

Sorry for the silence on my end - I have been busy with #739. I will consider this again tomorrow, including your suggestion from IRC of possibly making these Serializer trait methods with default implementations.

@sfackler
Copy link
Contributor

sfackler commented Feb 3, 2017

I like the default method approach personally. It's a nice convenience similar to https://doc.rust-lang.org/std/fmt/struct.DebugList.html#method.entries

@dtolnay
Copy link
Member

dtolnay commented Feb 3, 2017

It's a little different in that case because DebugList is a struct and not a trait. But sure, let's do it. Any thoughts on naming? Serializer::serialize_iter_as_seq and serialize_iter_as_map aren't perfect. It would be nice to reference Extend somehow because that is one of the most common methods taking an IntoIterator, but that would fit better as SerializeSeq::extend or something.

@nox
Copy link
Contributor Author

nox commented Feb 3, 2017

Rather than Extend, if you want to reuse iterator terminology, I would opt for collect_seq and collect_map methods on Serializer.

@dtolnay
Copy link
Member

dtolnay commented Feb 3, 2017

Good idea, let's go with that. Thanks!

@dtolnay
Copy link
Member

dtolnay commented Feb 3, 2017

Also I would prefer not to make the Pair changes. It makes sense for MapDeserializer because people pass around variables of type MapDeserializer<I, E> so additional redundant type parameters are a nuisance, but here let's just use K, V type parameters on collect_map.

@nox nox force-pushed the serialize-iter branch 2 times, most recently from 56f3805 to 8f7e8c4 Compare February 3, 2017 20:14
@dtolnay
Copy link
Member

dtolnay commented Feb 3, 2017

Do you expect any serializers to want to override these default implementations? If so, please add a comment that explains why someone would want to override. If not, please add a comment that serializers should not need to override. I don't want serializer implementors to wonder whether the default implementation is doing the wrong thing for their format.

@nox
Copy link
Contributor Author

nox commented Feb 3, 2017

I would expect serde_json to override it to not even have to allocate an intermediate String, instead writing directly on its output the string escaped appropriately. Wrong issue hah.

@nox
Copy link
Contributor Author

nox commented Feb 3, 2017

@dtolnay I added the comments, I don't expect implementors to have to override this.

Copy link
Member

@dtolnay dtolnay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. @oli-obk what do you think?

@dtolnay dtolnay requested a review from oli-obk February 3, 2017 23:08
@nox nox changed the title Introduce serialize_seq and serialize_map Introduce collect_seq and collect_map Feb 3, 2017
/// The default implementation serializes each item yielded by the iterator
/// using `Self::SerializeSeq`. Implementors should not need to override
/// this method.
fn collect_seq<I>(self, iter: I) -> Result<Self::Ok, Self::Error>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can immediately use this in https://github.com/nox/serde/blob/a0a4d782957d351b608d08aa892e7e86317b5182/serde/src/ser/impls.rs#L224

Can we use the same size-trick so we don't pass None?

Is there a way to use specialization to improve this impl for ExactSizeIterator? If so, can this function be written in a way so that is possible (by adding a feature=unstable specialization, which gets checked on nightly travis)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would you even trust that size trick? I don't understand what tells you that this is actually correct to do.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

size_hint() is primarily intended to be used for optimizations such as reserving space for the elements of the iterator, but must not be trusted to e.g. omit bounds checks in unsafe code. An incorrect implementation of size_hint() should not lead to memory safety violations.

Quoting the standard, since we don't guarantee that the Option in serialize_seq is correct, we can depend on the size_hint to be just as correct.

All correct Serialize impls will benefit from this, by being compatible with formats like bincode.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, then I don't understand the second question, if we can just check that the two hints are equal and use that, why specialise for ExactSizeIterator at all?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uhm... because I confused myself? ;) ignore it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@oli-obk I wonder if there is something to simplify around serialize_seq! in ser::impls though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well... we could replace it with collect_seq, but I wonder if that optimizes the same way. Without benchmarks that's not a change we should do.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I introduced a trait LenHint that is specialised for I: ExactSizeIterator.

@nox
Copy link
Contributor Author

nox commented Feb 4, 2017

@oli-obk @dtolnay Btw, when do we kill the Iterator wrapper? In my opinion it is misguided.

@oli-obk
Copy link
Member

oli-obk commented Feb 4, 2017

Btw, when do we kill the Iterator wrapper? In my opinion it is misguided.

with your changes there's no need for it. Feel free to remove it in this PR. (it was behind an unstable feature flag anyway)

This function serializes the given iterator as a sequence. Its iter parameter
has type I: IntoIterator, <I as IntoIterator>::Item: Serialize, which means it
will work both for iterators passed by value, therefore consuming them, and as
the value for a #[serde(serialize_with)] attribute, where it will be called as
Serializer::collect_seq(&self.field, serializer), relying on the common practice
of implementing IntoIterator for &C where C is a data type representing some
kind of collection.
@oli-obk
Copy link
Member

oli-obk commented Feb 4, 2017

@dtolnay: you wanna take another look?

@dtolnay dtolnay merged commit 4bd1052 into serde-rs:master Feb 4, 2017
@dtolnay
Copy link
Member

dtolnay commented Feb 5, 2017

well... we could replace it with collect_seq, but I wonder if that optimizes the same way. Without benchmarks that's not a change we should do.

There is no difference in performance based on my benchmarks so I removed the old serialize_seq and serialize_map macros in 219abd2.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants