Skip to content

Conversation

nixpulvis
Copy link
Contributor

I'm not 100% sure if this is what we want to do in every case, but there is a problem with the StreamDeserializer where it reads a Err and then keeps reading the same error over and over because the stream hasn't been advanced. For example the JSON a will loop forever in the error case.

Here's a complete example:

extern crate serde_json;

use serde_json::*;
use std::io::{self, Read};

fn main() {
    let stdin = io::stdin();
    let stream_deserializer: StreamDeserializer<Value, _> = StreamDeserializer::new(stdin.bytes());

    for value in stream_deserializer {
        match value {
            Ok(v) => println!("{:?}", v),
            Err(_) => println!("Encountered a json parsing error, closing"),
        }
    }
}

@erickt
Copy link
Member

erickt commented Feb 28, 2016

I'm worried about leaving the stream in an inconsistent state. For example, consider a stream like {"foo": {"bar": {"baz": 2}}. If there is a network hiccup that drops some characters so what gets passed to serde is {"fo{{"baz": 2}}, wouldn't your patch cause the substring {"fo{" be rejected, then successfully parse {"baz": 2}, then error out again with a unexpected }?

@erickt
Copy link
Member

erickt commented Feb 28, 2016

Perhaps the stream deserializer needs a proper delimiter to separate json values? This would avoid falsetrue from being parsed correctly, which seems strange.

@nixpulvis
Copy link
Contributor Author

I've seen a few parsers that use white space to delimit these values. I think this would be a sane way to solve this issue.

@dtolnay
Copy link
Member

dtolnay commented May 13, 2016

I have come to agree with Erick that this is not a good idea. The input stream may even be valid JSON and we could still end up in a bad state.

For example in your code above, change StreamDeserializer<Value, _> to StreamDeserializer<String, _> i.e. we expect the input to be a stream of quoted strings.

Suppose the input is "a" {"b": 0} "c". A reasonable thing to expect as output would be "a" (error) "c" because the second JSON value fails to deserialize. Instead this patch gives us "a" (error) "b" (error) (error) (error) "c".

I am closing this PR but I filed #70 to revisit the issue in a better way.

@dtolnay dtolnay closed this May 13, 2016
@nixpulvis
Copy link
Contributor Author

Sounds good, thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

3 participants