Stream parser that doesn't buffer the entire message #105

Open
evpopov opened this issue Dec 16, 2022 · 0 comments

evpopov commented Dec 16, 2022

Hi,
Parts of this may have been touched on in #100 but I wanted to start a clean discussion here.

I'm trying to redesign an RPC-like protocol to use MessagePack. The protocol was historically based on TLVs and
runs on a bare-metal system that is quite constrained on memory. The idea is for my device to accept
calls over a TCP stream. Each call would consist of a command, parameters, data, etc., and the device would execute
whatever is being requested. One of those requests could be a file upload or a firmware upgrade, and naturally
such an RPC call would be hundreds of KB, if not a megabyte or two. I don't have anywhere near enough RAM to buffer
the entire message. Historically, I'd have separate TLVs for the command and the data, which lets me know how long each
section is and skip over parts of the request that I don't have to parse. I'd parse the command TLV first, so
I'd know, for example, that I needed to save the data TLV to a file while that data TLV was being parsed a few hundred
bytes at a time. Using TLVs also helps when transferring data over TCP because TCP is stream-based and in many
cases I need to know the size of the transfer ahead of time. MessagePack gives me a similar benefit here because
every MessagePack value starts with a header giving its size: a byte length for str and bin, an element count for
arrays and maps.
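
The chunk-at-a-time TLV handling described above can be sketched as a small push parser. This is only an illustration, not the actual protocol: the layout (1-byte type, 4-byte big-endian length), the type code `TLV_TYPE_DATA`, and every name here (`tlv_parser`, `tlv_feed`, `upload_sink`, `upload_on_chunk`) are assumptions made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical TLV layout: 1-byte type, 4-byte big-endian length, then value. */
enum { TLV_HEADER_SIZE = 5, TLV_TYPE_COMMAND = 0x01, TLV_TYPE_DATA = 0x02 };

/* Called for each piece of a value; `remaining` is 0 on the last piece. */
typedef void (*tlv_chunk_fn)(uint8_t type, const uint8_t *data, size_t len,
                             uint32_t remaining, void *ctx);

typedef struct {
    uint8_t      header[TLV_HEADER_SIZE];
    size_t       header_have;  /* header bytes collected so far */
    uint32_t     value_left;   /* value bytes still expected */
    tlv_chunk_fn on_chunk;
    void        *ctx;
} tlv_parser;

static void tlv_init(tlv_parser *p, tlv_chunk_fn on_chunk, void *ctx) {
    assert(p != NULL && on_chunk != NULL);
    p->header_have = 0;
    p->value_left = 0;
    p->on_chunk = on_chunk;
    p->ctx = ctx;
}

/* Consume whatever bytes are available. Never blocks and never buffers a
 * value: only the 5 header bytes are stored between calls. */
static void tlv_feed(tlv_parser *p, const uint8_t *data, size_t len) {
    while (len > 0) {
        if (p->header_have < TLV_HEADER_SIZE) {
            p->header[p->header_have++] = *data++;
            len--;
            if (p->header_have == TLV_HEADER_SIZE) {
                p->value_left = ((uint32_t)p->header[1] << 24) |
                                ((uint32_t)p->header[2] << 16) |
                                ((uint32_t)p->header[3] << 8) |
                                (uint32_t)p->header[4];
                if (p->value_left == 0) {  /* empty value: report and reset */
                    p->on_chunk(p->header[0], data, 0, 0, p->ctx);
                    p->header_have = 0;
                }
            }
            continue;
        }
        size_t n = len < p->value_left ? len : (size_t)p->value_left;
        p->value_left -= (uint32_t)n;
        p->on_chunk(p->header[0], data, n, p->value_left, p->ctx);
        data += n;
        len -= n;
        if (p->value_left == 0) p->header_have = 0;  /* next TLV */
    }
}

/* Example sink: counts data-TLV bytes as a stand-in for writing to a file. */
typedef struct { uint32_t bytes_written; unsigned completed; } upload_sink;

static void upload_on_chunk(uint8_t type, const uint8_t *data, size_t len,
                            uint32_t remaining, void *ctx) {
    upload_sink *sink = ctx;
    (void)data;  /* a real sink would write data[0..len) to flash or a file */
    if (type == TLV_TYPE_DATA) sink->bytes_written += (uint32_t)len;
    if (remaining == 0) sink->completed++;
}
```

The task would call `tlv_feed()` with whatever the socket returned, even a single byte, and go back to its other work. Memory use stays constant regardless of how large the data TLV is.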

I'm trying to redesign this TLV-based approach and have the entire request encoded inside a MessagePack message
because this makes the protocol more "standard" rather than being defined by ad hoc proprietary TLVs. To do this,
I need to be able to "feed" the parser with arbitrary amounts of data as it becomes available while at the same
time handling whatever the parser has decoded so far. After digging through the very good manual and trying the
different APIs, I'm almost finding what I need, but not quite.

Node API: mpack_tree_init_stream() seems to be almost exactly what I need because I can simulate the "feeding"
functionality through the read_fn() callback and use mpack_tree_try_parse() to handle objects as they get decoded.
The read_fn() function can return zero if I don't have any new data and life is good... except that the Node API
expects to be able to buffer the whole message, and in my case the message may be a multi-megabyte file upload.

Reader API with fill and skip functions: That API gives me the freedom to parse the objects as they come in, which is
ideal, but the fill function is not allowed to return zero. The problem here is that my task cannot block. It has
other things to do. Periodically, it checks for new data from the socket and can "feed" that new data to the parser,
but I simply cannot block the task.
I thought maybe I could call mpack_read_tag() only if I have accumulated some data, but there is no way to know how
many bytes mpack_read_tag() will want to consume.
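
One possible way to answer the "how many bytes" question outside the library is to derive the tag size from the first byte, using the MessagePack format spec. The sketch below is not part of MPack: `msgpack_tag_size()` and `msgpack_tag_available()` are hypothetical helpers, and the sketch assumes that mpack_read_tag() consumes exactly the type byte plus the fixed-size fields that follow it (the value for numbers, the length or count for str, bin, array and map, and the length plus type byte for ext), which would need to be checked against the MPack source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Size in bytes of the MessagePack tag that starts with `first`, per the
 * MessagePack format spec. Scalars include their value; str, bin, array, map
 * and ext count only the header, not the payload or elements.
 * Returns 0 for 0xc1, which the spec never uses. */
static size_t msgpack_tag_size(uint8_t first) {
    /* positive fixint, fixmap, fixarray, fixstr, negative fixint */
    if (first <= 0xbf || first >= 0xe0) return 1;
    switch (first) {
    case 0xc0: case 0xc2: case 0xc3: return 1;  /* nil, false, true */
    case 0xc1: return 0;                         /* never used */
    case 0xc4: case 0xd9: return 2;              /* bin8, str8 */
    case 0xc5: case 0xda: return 3;              /* bin16, str16 */
    case 0xc6: case 0xdb: return 5;              /* bin32, str32 */
    case 0xc7: return 3;                         /* ext8: length + type */
    case 0xc8: return 4;                         /* ext16 */
    case 0xc9: return 6;                         /* ext32 */
    case 0xca: return 5;                         /* float32 */
    case 0xcb: return 9;                         /* float64 */
    case 0xcc: case 0xd0: return 2;              /* uint8, int8 */
    case 0xcd: case 0xd1: return 3;              /* uint16, int16 */
    case 0xce: case 0xd2: return 5;              /* uint32, int32 */
    case 0xcf: case 0xd3: return 9;              /* uint64, int64 */
    case 0xd4: case 0xd5: case 0xd6:
    case 0xd7: case 0xd8: return 2;              /* fixext: type byte */
    case 0xdc: case 0xde: return 3;              /* array16, map16 */
    case 0xdd: case 0xdf: return 5;              /* array32, map32 */
    }
    return 0;
}

/* True when buf[0..len) already holds one complete tag. */
static bool msgpack_tag_available(const uint8_t *buf, size_t len) {
    assert(buf != NULL || len == 0);
    if (len == 0) return false;
    size_t need = msgpack_tag_size(buf[0]);
    return need != 0 && len >= need;
}
```

Under that assumption, the task could peek at its receive buffer and call mpack_read_tag() only when `msgpack_tag_available()` returns true, so the fill function is never asked for data that is not there yet. Payload bytes of a large bin or str would still have to be pulled in chunks no larger than what has actually arrived.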

Am I missing something? Is there a way to parse a stream and handle the data a few bytes at a time without
buffering the entire message?

Thanks in advance
