
Split a JSON byte stream into JSON objects/arrays. Fixes #2536 #2547

Closed
wants to merge 1 commit into from

Conversation

buchgr
Contributor

@buchgr buchgr commented Jun 7, 2014

Motivation:

See GitHub Issue #2536.

Modifications:

Introduce the class JsonObjectDecoder to split a JSON byte stream
into individual JSON objects/arrays.

Result:

A Netty application can now handle a byte stream in which multiple JSON
documents follow each other, as opposed to only a single JSON document
per request.
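For readers unfamiliar with the approach, the splitting described above can be sketched in plain Java. This is an illustrative sketch only, not the code in this PR; the class and method names (JsonSplitter, split) are made up:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch -- NOT Netty's JsonObjectDecoder. It shows the core
// idea: track brace/bracket depth (ignoring delimiters that occur inside
// string literals) and cut the stream each time the depth returns to zero.
public class JsonSplitter {
    public static List<String> split(String stream) {
        List<String> docs = new ArrayList<>();
        int depth = 0;
        int start = -1;            // index where the current document began
        boolean inString = false;
        for (int i = 0; i < stream.length(); i++) {
            char c = stream.charAt(i);
            if (inString) {
                if (c == '\\') {
                    i++;           // skip the escaped character
                } else if (c == '"') {
                    inString = false;
                }
                continue;
            }
            if (c == '"') {
                inString = true;
            } else if (c == '{' || c == '[') {
                if (depth++ == 0) {
                    start = i;     // a new top-level document begins here
                }
            } else if (c == '}' || c == ']') {
                if (--depth == 0) {
                    docs.add(stream.substring(start, i + 1));
                }
            }
        }
        return docs;
    }
}
```

The real decoder additionally has to cope with documents that arrive split across several ByteBufs, which this sketch glosses over.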

@trustin @daschl @normanmaurer please review :-)

@buchgr buchgr changed the title Split a JSON byte stream into JSON objects/arrays Split a JSON byte stream into JSON objects/arrays. Fixes #2536 Jun 7, 2014
@ghost

ghost commented Jun 7, 2014

Build result for #2547 at dcdc69fc28ff416717c0742f13a88c399116fcb7: Failure

@ghost

ghost commented Jun 7, 2014

Build result for #2547 at 07ad08da328a825451e8ba3946bc27d3e4e01d83: Success


/**
* Splits a byte stream of JSON objects and arrays into individual objects/arrays and passes them up the
* channel pipeline.
Member

use {@link ChannelPipeline}

@ghost

ghost commented Jun 8, 2014

Build result for #2547 at 99a6c15049b714610622427a11b984e82b32daea: Success

@ghost

ghost commented Jun 8, 2014

Build result for #2547 at 8d2755bea51a05a6ea4a4b24676ff5308587156c: Success

@ghost

ghost commented Jun 8, 2014

Build result for #2547 at 6d4dc4ed89eb988a7eae1d67c718df2d9ca36005: Success

@normanmaurer
Member

@trustin wdyt ?

@daschl
Member

daschl commented Jun 12, 2014

@jakobbuchgraber I'm doing something a little different in my project, maybe you like it:

basically using a processor to find some JSON markers:

    private static class MarkerProcessor implements ByteBufProcessor {

        private int marker = 0;
        private int counter = 0;
        private int depth = 0;
        private byte open = '{';
        private byte close = '}';
        private byte stringMarker = '"';
        private boolean inString = false;

        @Override
        public boolean process(byte value) throws Exception {
            counter++;
            if (value == stringMarker) {
                inString = !inString;
            }
            if (!inString && value == open) {
                depth++;
            }
            if (!inString && value == close) {
                depth--;
                if (depth == 0) {
                    marker = counter;
                }
            }
            return true;
        }

        public int marker() {
            return marker;
        }
    }

I'm doing slightly different stuff but you can use the processor for this very efficiently.
/cc @normanmaurer
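Stripped of Netty's ByteBufProcessor interface so it runs standalone, the marker idea above boils down to the following (MarkerScan is a hypothetical name; like the snippet above, it deliberately ignores \" escapes inside strings):

```java
// Plain-Java adaptation of the MarkerProcessor above: scan a byte array and
// remember the position just past the last complete top-level '{'...'}'
// object. Everything up to marker() can safely be sliced off and decoded.
public class MarkerScan {
    public static int marker(byte[] data) {
        int marker = 0;
        int counter = 0;
        int depth = 0;
        boolean inString = false;
        for (byte value : data) {
            counter++;
            if (value == '"') {
                inString = !inString;  // caveat: does not handle \" escapes
            }
            if (!inString && value == '{') {
                depth++;
            }
            if (!inString && value == '}') {
                if (--depth == 0) {
                    marker = counter;  // a complete object ends here
                }
            }
        }
        return marker;
    }
}
```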

@buchgr
Contributor Author

buchgr commented Jun 12, 2014

@daschl thanks for sharing. I started with a processor too (as I learned from @normanmaurer's FB talk :P ), but I also have to be able to look back one byte to check for escape characters (e.g. "foo \" bar"), and so I kinda felt the loop was nicer for my purposes.

Maybe I should benchmark both versions, but I didn't bother with this kind of micro optimization yet 😄
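The escape handling buchgr describes can also be done with a one-character state flag instead of an explicit look-back. A minimal sketch (objects only for brevity; EscapeAware and count are hypothetical names):

```java
// Minimal sketch of escape-aware scanning: a backslash inside a string marks
// the next character as escaped, so an escaped quote ("foo \" bar") does not
// terminate the string literal. Counts complete top-level objects.
public class EscapeAware {
    public static int count(String s) {
        int depth = 0;
        int docs = 0;
        boolean inString = false;
        boolean escaped = false;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (inString) {
                if (escaped) {
                    escaped = false;   // this character was escaped; ignore it
                } else if (c == '\\') {
                    escaped = true;    // next character is escaped
                } else if (c == '"') {
                    inString = false;  // unescaped quote ends the string
                }
            } else if (c == '"') {
                inString = true;
            } else if (c == '{') {
                depth++;
            } else if (c == '}' && --depth == 0) {
                docs++;                // one complete top-level object
            }
        }
        return docs;
    }
}
```

The flag also handles `\\"` correctly (an escaped backslash followed by a real closing quote), which a single-byte look-back alone would misread.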

ch.writeInbound(Unpooled.copiedBuffer("blabla 123", CharsetUtil.UTF_8));
assertNull(ch.readInbound());

assertFalse(ch.finish());
Member

You could also check if an exception is raised here.

@trustin
Member

trustin commented Jun 24, 2014

Could you also add a test case where a string contains '{' and other potentially confusing tokens?

@trustin
Member

trustin commented Jun 24, 2014

I also want to see an option that enables a streaming of a JSON array elements, whose size is potentially infinite. For example, when a client sends this:

[
"a",
{ key: "value" },
"c",
...
]

The decoder could generate a frame for each array element (i.e. "a", { key: "value" }, "c", ...).
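The streaming mode requested here amounts to cutting at top-level commas inside the outer array. A sketch under the simplifying assumption that the whole array is already buffered (the real decoder would emit frames incrementally as bytes arrive; names are made up):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of array-element streaming: inside the outer '[' ... ']' (depth 1),
// a comma delimits elements; nested objects/arrays raise the depth so their
// commas are ignored, and string handling skips delimiters inside literals.
public class ArrayElementSplitter {
    public static List<String> elements(String json) {
        List<String> out = new ArrayList<>();
        int depth = 0;
        int start = -1;
        boolean inString = false;
        boolean escaped = false;
        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            if (inString) {
                if (escaped) {
                    escaped = false;
                } else if (c == '\\') {
                    escaped = true;
                } else if (c == '"') {
                    inString = false;
                }
                continue;
            }
            if (c == '"') {
                inString = true;
            } else if (depth == 0) {
                if (c == '[') {            // enter the outer array
                    depth = 1;
                    start = i + 1;
                }
            } else if (depth == 1 && (c == ',' || c == ']')) {
                String elem = json.substring(start, i).trim();
                if (!elem.isEmpty()) {
                    out.add(elem);         // one frame per element
                }
                if (c == ']') {
                    depth = 0;             // outer array closed
                } else {
                    start = i + 1;
                }
            } else if (c == '[' || c == '{') {
                depth++;
            } else if (c == ']' || c == '}') {
                depth--;
            }
        }
        return out;
    }
}
```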

@trustin trustin added this to the 4.1.0.Final milestone Jun 24, 2014
@trustin trustin self-assigned this Jun 24, 2014
@ghost

ghost commented Jun 25, 2014

Build result for #2547 at 8250ec5554dbfbd4acd56c98cd94811fa14d449b: Failure

@buchgr
Contributor Author

buchgr commented Jun 29, 2014

I addressed the comments and I think it's ready for final review/merge.

private final boolean streamArrayElements;

public JsonObjectDecoder() {
this(Integer.MAX_VALUE);
Member

The default maxObjectLength should be a much smaller value? 1048576?

@trustin
Member

trustin commented Jul 2, 2014

Looks pretty good all in all. Left some comments for tiny things. To summarize:

  • Raise a CorruptedFrameException when the received data is not JSON
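The check behind that suggestion can be as simple as inspecting the first non-whitespace character. In the decoder a negative result would raise Netty's CorruptedFrameException; this standalone sketch (hypothetical names) just reports the verdict instead:

```java
// Fail-fast validation sketch: a JSON object/array stream must begin with
// '{' or '[' after optional leading whitespace. In Netty the decoder would
// throw CorruptedFrameException; here the method just returns the verdict.
public class JsonStart {
    public static boolean looksLikeJson(String data) {
        for (int i = 0; i < data.length(); i++) {
            char c = data.charAt(i);
            if (Character.isWhitespace(c)) {
                continue;                  // skip leading whitespace
            }
            return c == '{' || c == '[';   // first significant character decides
        }
        return false;                      // nothing but whitespace so far
    }
}
```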

@buchgr
Contributor Author

buchgr commented Jul 2, 2014

done

@trustin
Member

trustin commented Jul 3, 2014

Nice work. Merged into 4.1 and master. 👍

@trustin trustin closed this Jul 3, 2014
@trustin trustin modified the milestones: 4.1.0.Beta1, 4.1.0.Final Jul 3, 2014
@alex-vas
Contributor

It might be worth adding a comment that this implementation is only compatible with UTF-8 or ASCII encoded streams. It might even be worth adding Charset as a constructor parameter and asserting that it is UTF-8 straight away; this would make the API ready for a possible future implementation of support for other encodings. The same goes for LineBasedFrameDecoder and XmlFrameDecoder - both are missing a comment explaining that they expect the stream in UTF-8.
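The UTF-8 restriction has a concrete reason: in UTF-8, every byte of a multi-byte sequence has its high bit set, so a 0x7B byte is always a literal '{'; in UTF-16 the byte 0x7B can occur inside the encoding of an unrelated character. A small demonstration (hypothetical names; U+7B00 is a CJK ideograph whose UTF-16BE encoding happens to be the bytes 0x7B 0x00):

```java
// Counts 0x7B ('{') bytes the way a byte-oriented scanner would. Safe for
// UTF-8 input, but it produces false positives on UTF-16: the CJK character
// U+7B00 encodes to the bytes 0x7B 0x00 in UTF-16BE, which looks like '{',
// while its UTF-8 encoding (0xE7 0xAC 0x80) contains no ASCII bytes at all.
public class CharsetCheck {
    public static int countOpenBraces(byte[] data) {
        int n = 0;
        for (byte b : data) {
            if (b == '{') {
                n++;
            }
        }
        return n;
    }
}
```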

@normanmaurer
Member

normanmaurer commented Dec 12, 2018 via email

@alex-vas
Contributor

Let me give it a try. :)
