Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved support for byte strings #72

Merged
merged 3 commits into from May 19, 2015
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
10 changes: 10 additions & 0 deletions serde_macros/src/de.rs
Expand Up @@ -578,6 +578,16 @@ fn deserialize_field_visitor(
_ => Err(::serde::de::Error::unknown_field_error(value)),
}
}

fn visit_bytes<E>(&mut self, value: &[u8]) -> ::std::result::Result<__Field, E>
where E: ::serde::de::Error,
{
// TODO: would be better to generate a byte string literal match
match ::std::str::from_utf8(value) {
Ok(s) => self.visit_str(s),
_ => Err(::serde::de::Error::syntax_error()),
}
}
}

deserializer.visit(__FieldVisitor)
Expand Down
47 changes: 43 additions & 4 deletions src/bytes.rs
@@ -1,17 +1,26 @@
//! Helper module to enable serializing bytes more efficiently

use std::ops;
use std::fmt;
use std::ascii;

use ser;
use de;

///////////////////////////////////////////////////////////////////////////////

/// `Bytes` wraps a `&[u8]` in order to serialize into a byte array.
#[derive(Clone, Copy, Eq, Hash, PartialEq, PartialOrd, Ord)]
pub struct Bytes<'a> {
bytes: &'a [u8],
}

impl<'a> fmt::Debug for Bytes<'a> {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
write!(f, "b\"{}\"", escape_bytestring(self.bytes))
}
}

impl<'a> From<&'a [u8]> for Bytes<'a> {
fn from(bytes: &'a [u8]) -> Self {
Bytes {
Expand All @@ -28,6 +37,12 @@ impl<'a> From<&'a Vec<u8>> for Bytes<'a> {
}
}

impl<'a> Into<&'a [u8]> for Bytes<'a> {
fn into(self) -> &'a [u8] {
self.bytes
}
}

impl<'a> ops::Deref for Bytes<'a> {
type Target = [u8];

Expand All @@ -46,7 +61,7 @@ impl<'a> ser::Serialize for Bytes<'a> {
///////////////////////////////////////////////////////////////////////////////

/// `ByteBuf` wraps a `Vec<u8>` in order to hook into serialize and from deserialize a byte array.
#[derive(Clone, Debug, Eq, Hash, PartialEq, PartialOrd, Ord)]
#[derive(Clone, Eq, Hash, PartialEq, PartialOrd, Ord)]
pub struct ByteBuf {
bytes: Vec<u8>,
}
Expand All @@ -65,10 +80,22 @@ impl ByteBuf {
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not also use Into here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would conflict with From<T> at line 94:

impl Into<Vec<u8>> for ByteBuf {...}
impl<T> From<T> for ByteBuf where T: Into<Vec<u8>> {...}
src/bytes.rs:100:1: 106:2 error: conflicting implementations for trait `core::convert::From` [E0119]
src/bytes.rs:100 impl<T> From<T> for ByteBuf where T: Into<Vec<u8>> {
src/bytes.rs:101     fn from(bytes: T) -> Self {
src/bytes.rs:102         ByteBuf {
src/bytes.rs:103             bytes: bytes.into(),
src/bytes.rs:104         }
src/bytes.rs:105     }
                 ...
src/bytes.rs:100:1: 106:2 note: conflicting implementation in crate `core`
src/bytes.rs:100 impl<T> From<T> for ByteBuf where T: Into<Vec<u8>> {
src/bytes.rs:101     fn from(bytes: T) -> Self {
src/bytes.rs:102         ByteBuf {
src/bytes.rs:103             bytes: bytes.into(),
src/bytes.rs:104         }
src/bytes.rs:105     }
                 ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest removing

impl<T> From<T> for ByteBuf where T: Into<Vec<u8>>

in favor of simpler

impl From<Vec<u8>> for ByteBuf

Though that would be a breaking change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does removing the From impl and adding the Into impl work?
Then the From impl is generated in the standard library if I read this correctly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then the From impl is generated in the standard library if I read this correctly.

Yes, the offending implementation is

impl<T> From<T> for T

http://doc.rust-lang.org/src/core/convert.rs.html#152

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what I mean is, you don't need to implement Into, it's generated, since From exists.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, the message actually mentions From implementations, not Into.

Anyway, committed this change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be generated if there was impl From<ByteBuf> for Vec<u8>, not impl From<Vec<u8>> for ByteBuf

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh right... I got very confused with all those From and Into... sorry about that


impl<T> From<T> for ByteBuf where T: Into<Vec<u8>> {
fn from(bytes: T) -> Self {
impl fmt::Debug for ByteBuf {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
write!(f, "b\"{}\"", escape_bytestring(self.bytes.as_ref()))
}
}

impl Into<Vec<u8>> for ByteBuf {
fn into(self) -> Vec<u8> {
self.bytes
}
}

impl From<Vec<u8>> for ByteBuf {
fn from(bytes: Vec<u8>) -> Self {
ByteBuf {
bytes: bytes.into(),
bytes: bytes,
}
}
}
Expand Down Expand Up @@ -172,3 +199,15 @@ impl de::Deserialize for ByteBuf {
deserializer.visit_bytes(ByteBufVisitor)
}
}

///////////////////////////////////////////////////////////////////////////////

fn escape_bytestring(bytes: &[u8]) -> String {
let mut result = String::new();
for &b in bytes {
for esc in ascii::escape_default(b) {
result.push(esc as char);
}
}
result
}
19 changes: 19 additions & 0 deletions src/de/impls.rs
Expand Up @@ -3,6 +3,7 @@ use std::hash::Hash;
use std::marker::PhantomData;
use std::path;
use std::rc::Rc;
use std::str;
use std::sync::Arc;

use num::FromPrimitive;
Expand Down Expand Up @@ -198,6 +199,24 @@ impl Visitor for StringVisitor {
{
Ok(v)
}

fn visit_bytes<E>(&mut self, v: &[u8]) -> Result<String, E>
where E: Error,
{
match str::from_utf8(v) {
Ok(s) => Ok(s.to_string()),
Err(_) => Err(Error::syntax_error()),
}
}

fn visit_byte_buf<'a, E>(&mut self, v: Vec<u8>) -> Result<String, E>
where E: Error,
{
match String::from_utf8(v) {
Ok(s) => Ok(s),
Err(_) => Err(Error::syntax_error()),
}
}
}

impl Deserialize for String {
Expand Down
4 changes: 2 additions & 2 deletions src/de/mod.rs
Expand Up @@ -263,10 +263,10 @@ pub trait Visitor {
Err(Error::syntax_error())
}

fn visit_byte_buf<E>(&mut self, _v: Vec<u8>) -> Result<Self::Value, E>
fn visit_byte_buf<E>(&mut self, v: Vec<u8>) -> Result<Self::Value, E>
where E: Error,
{
Err(Error::syntax_error())
self.visit_bytes(&v)
}
}

Expand Down
54 changes: 54 additions & 0 deletions src/de/value.rs
Expand Up @@ -12,6 +12,7 @@ use std::hash::Hash;
use std::vec;

use de;
use bytes;

///////////////////////////////////////////////////////////////////////////////

Expand Down Expand Up @@ -409,3 +410,56 @@ impl<K, V> ValueDeserializer for HashMap<K, V>
MapDeserializer::new(self.into_iter(), len)
}
}

///////////////////////////////////////////////////////////////////////////////

impl<'a> ValueDeserializer for bytes::Bytes<'a>
{
type Deserializer = BytesDeserializer<'a>;

fn into_deserializer(self) -> BytesDeserializer<'a> {
BytesDeserializer(Some(self.into()))
}
}

pub struct BytesDeserializer<'a> (Option<&'a [u8]>);

impl<'a> de::Deserializer for BytesDeserializer<'a> {
type Error = Error;

fn visit<V>(&mut self, mut visitor: V) -> Result<V::Value, Error>
where V: de::Visitor,
{
match self.0.take() {
Some(bytes) => visitor.visit_bytes(bytes),
None => Err(de::Error::end_of_stream_error()),
}
}
}


///////////////////////////////////////////////////////////////////////////////

impl ValueDeserializer for bytes::ByteBuf
{
type Deserializer = ByteBufDeserializer;

fn into_deserializer(self) -> Self::Deserializer {
ByteBufDeserializer(Some(self.into()))
}
}

pub struct ByteBufDeserializer(Option<Vec<u8>>);

impl de::Deserializer for ByteBufDeserializer {
type Error = Error;

fn visit<V>(&mut self, mut visitor: V) -> Result<V::Value, Error>
where V: de::Visitor,
{
match self.0.take() {
Some(bytes) => visitor.visit_byte_buf(bytes),
None => Err(de::Error::end_of_stream_error()),
}
}
}