Skip to content
This repository has been archived by the owner on Dec 7, 2018. It is now read-only.

Multipart Parser implementation. #149

Open
digitalextremist opened this issue May 22, 2014 · 11 comments
Open

Multipart Parser implementation. #149

digitalextremist opened this issue May 22, 2014 · 11 comments
Assignees
Labels
Milestone

Comments

@digitalextremist
Copy link
Member

I was about to start a PR with my code for this, which is here:

penultimatix#28

But apparently my repository needs to be cleaned up. Catching a lot of out of date stuff when I start an official Reel branch PR. I made the multipart-parser branch though, so I'll likely just directly push the code into that.

I figured @Asmod4n might have something to say here, since he's worked with multipart-parser too.

How I've done this so far, is to add Reel::Request::Body::Multipart and associated hooks into Reel::Request and Reel::Request::Body ... I feel that I found the right place to put the parser, and I've implemented it with the least invasion possible.

I considered attaching it directly to the HTTP parser, but that seemed unnecessary, since multipart requests are the vast minority, and ought to be checked for. Meaning, if I anticipate a multipart request, one ought to do multipart? on a Reel::Request just as we do websocket? now.

In an event, the code will be cleaned up and posted to a Reel branch soon. For now, it's on one of my repositories. Lastly, since the multipart-parser is not being actively maintained by the original author, I decided to tie gemspec to my own repository and not the original. I will actively maintain the gem, and keep it current with Reel's needs.

@digitalextremist
Copy link
Member Author

I think this is an essential to Reel, since this comes full circle on what brought me into close contact with this group in the first place... discovering multipart parsing is fundamentally broken/non-existent.

I have working code in my own version of Reel, and so long as it doesn't slow down 0.6.0 I would like it to show up in -pre and bring a big feature along with bug fixes for this coming release.

/cc: @tarcieri

@digitalextremist digitalextremist modified the milestone: 0.6.0 Jan 22, 2015
@tarcieri
Copy link
Member

You might want to talk to @ixti about it. He's working on some gems under the https://github.com/httprb/ organization

@kyledrake
Copy link

FWIW, it would be great to have support for this!

@tarcieri
Copy link
Member

@ixti is the multipart parser something you're still interested in working on?

@digitalextremist
Copy link
Member Author

If not, I have code for this. @ixti and I briefly spoke about bringing my code to http.rb in some form, then bring that over to Reel. I'll get on this after Reel::IO since I have an instance of Reel using the code already.

The only real question is how to access the parts parsed. Like WebSocket access being nonstandard in interfacing practice which lead to the hijack behavior of Rack when there has to be a better way, multipart isn't so straight forward. Do we test for multipart bodies every pageload? Do we add a step like hijack to get at the data? My idea is to lazy load it. Combine both benefits without either downside. Behave like it's always parsed, but only parse at the point of access.

@kyledrake
Copy link

IIRC, the browser/client has the responsibly to declare the type as multipart/form-data in the headers, and if it does not then you shouldn't need to parse it.

I wanted to highlight https://github.com/danabr/multipart-parser, which claims to have an event-driven api and may be an excellent base for this implementation (though I have never used it and cannot confirm). My hacky solution to this issue was going to be to use some flavor of that (detect header, toss body in tempfile in chunks, parse using this gem). I have not attempted this yet.

@tarcieri
Copy link
Member

My hacky solution to this issue was going to be to use some flavor of that (detect header, toss body in tempfile in chunks

Would be nice to not require a Tempfile but stream the data instead. A Tempfile is nice for, uhh, Rails or something, but Reel's goal has been to have streaming APIs everywhere

@kyledrake
Copy link

Yeah, that would be ideal here. It would require a streamed parsing, which the parser gem may or may not support.

It gets weird when you need to start accounting for form "variables", though. I'm not sure how easy it is going to be to stream something like that.

@digitalextremist
Copy link
Member Author

My code forked and revamped the gem mentioned, and uses streams vs. temp files.

@digitalextremist
Copy link
Member Author

By the way @kyledrake -- yes, the incoming request declares whether it is multipart or not, but checking for that ( rather than stumbling upon it doing something always needed ) is what I'm referring to. It may seem insignificant, but if one out of every thousand messages is multipart, it seems wasteful to even test for unless demanded/expected. And variable parsing is handled in an evented way last I remember too. The gem was pretty unactive and not every approach taken was perfect as-is... but you are right that the gem is a very good base.

@ixti
Copy link
Contributor

ixti commented Jul 17, 2015

@tarcieri

@ixti is the multipart parser something you're still interested in working on?

Yeah. Having lots of changes in my life atm, so have not much time unfortunately. Hopefully should finish with all rush and chaos I have soon to dive back into active development ;D

@digitalextremist digitalextremist modified the milestones: 0.6.5, 1.0 Feb 16, 2016
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

4 participants