http parser #10

Open
dvv opened this Issue Oct 10, 2012 · 11 comments

@dvv
Contributor
dvv commented Oct 10, 2012

do we need luv.http? what would be its domain? what interface should it expose? which middleware strategy: connect-style or *sgi-style?
let's discuss that here

@miko
miko commented Oct 10, 2012

Quoting original proposal:

I'd start at the top. I think the API should look something like this:

local req = httpd:recv()

where req is a RACK/PSGI [1] like environment table with:

req = {
REQUEST_METHOD = …,
SCRIPT_NAME = …,
CONTENT_LENGTH = …,

… the rest of the CGI environment vars…

luv_input = <input stream>,
luv_output = <output stream>,
luv_errors = <errors stream>,
}

The body of the request (if any) is read, by the application, from the `luv_input` stream once
the headers are parsed and you know your CONTENT_LENGTH.
For all the rest, the parsing and its C callbacks should be internal.
There's no need to expose that. You can still do non-blocking read via libuv callbacks,
and feed the HTTP parser with chunks,
you just don't call `luvL_state_ready()` until you've got the headers, wired up the pipes and
built the `req` table. There does not need to be a 1-1 correspondence between a libuv callback
and waking a suspended fiber.
You can run all the C callbacks you like until you're ready.
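
The point about callbacks versus fiber wake-ups could be sketched in plain Lua roughly as follows. This is only an illustration under assumptions: `new_conn`, `on_read`, `PATH_INFO` and `body_prefix` are hypothetical names, a plain coroutine stands in for a luv fiber, and the string matching stands in for the real C HTTP parser.

```lua
-- Hypothetical sketch: the read callback may fire many times, but the
-- waiting coroutine is resumed only once, when the headers are complete.
local wakeups = 0

local function new_conn(fiber)
  local buf, ready = "", false
  local conn = {}
  -- Stands in for a libuv read callback (assumed name).
  function conn.on_read(chunk)
    if ready then return end
    buf = buf .. chunk
    local head_end = buf:find("\r\n\r\n", 1, true)
    if not head_end then return end -- headers incomplete: wake nobody
    ready = true
    local head = buf:sub(1, head_end - 1)
    local method, path = head:match("^(%u+) (%S+) HTTP/%d%.%d")
    local req = {
      REQUEST_METHOD = method,
      PATH_INFO = path,
      body_prefix = buf:sub(head_end + 4), -- bytes read past the headers
    }
    wakeups = wakeups + 1
    coroutine.resume(fiber, req)
  end
  return conn
end

local got
local fiber = coroutine.create(function()
  got = coroutine.yield() -- suspended until the request table is built
end)
coroutine.resume(fiber) -- run up to the yield

local conn = new_conn(fiber)
conn.on_read("GET /hel")                  -- callback 1: nothing resumed
conn.on_read("lo HTTP/1.1\r\nHost: exa")  -- callback 2: nothing resumed
conn.on_read("mple.org\r\n\r\n")          -- callback 3: fiber resumed once
```

Three callbacks run here, but the coroutine sees a single, fully built `req` table.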

A response can be either streamed out via `luv_output` (streaming video, or whatever), or sent as:

httpd:send({ 200, { ["Content-Type"] = "text/html", … }, <body chunks> }) 

This is OK for me, except that recv could optionally take a maximum number of bytes to read (to guard against malicious requests), so:

local req = httpd:recv(65535)
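
A rough sketch of how such a `recv`/`send` pair might behave, using an in-memory string instead of a socket. Everything here is an assumption for illustration: `new_httpd`, the `sent` list, `PATH_INFO`, the error strings and the table standing in for `luv_input` are hypothetical, and this is not luv's actual API.

```lua
-- Hypothetical in-memory stand-in for the proposed `httpd` object.
local function new_httpd(raw)
  local httpd = { sent = {} }

  -- recv(max_bytes): refuse requests larger than the optional limit.
  function httpd:recv(max_bytes)
    if max_bytes and #raw > max_bytes then
      return nil, "request too large"
    end
    local head, body = raw:match("^(.-)\r\n\r\n(.*)$")
    if not head then return nil, "incomplete request" end
    local method, target = head:match("^(%u+) (%S+)")
    if not method then return nil, "bad request line" end
    local length = head:lower():match("\r\ncontent%-length:%s*(%d+)")
    return {
      REQUEST_METHOD = method,
      PATH_INFO = target,
      CONTENT_LENGTH = tonumber(length) or 0,
      -- A plain table standing in for the input stream.
      luv_input = { read = function() return body end },
    }
  end

  -- send({ status, headers, body_chunks }); reason phrase omitted for brevity.
  function httpd:send(res)
    local status, headers, chunks = res[1], res[2], res[3]
    local out = { "HTTP/1.1 " .. status }
    for name, value in pairs(headers) do
      out[#out + 1] = name .. ": " .. value
    end
    out[#out + 1] = ""
    out[#out + 1] = table.concat(chunks)
    self.sent[#self.sent + 1] = table.concat(out, "\r\n")
  end

  return httpd
end

local raw = "POST /submit HTTP/1.1\r\nContent-Length: 5\r\n\r\nhello"
local httpd = new_httpd(raw)
local req = assert(httpd:recv(65535))
httpd:send({ 200, { ["Content-Type"] = "text/plain" },
             { "got ", req.luv_input.read() } })

-- The same request is rejected when the limit is smaller than the input.
local too_big, err = new_httpd(raw):recv(10)
```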
@richardhundt
Owner

I started on it last night, then came to the realisation that writing these sorts of extensions should be easier than it is. So I'm going to do it, but as an extension, and at the same time make the code more modular: wrap up the libuv API, make luv_object_t more generic, and allow for polymorphic checks on it (so that an assertion that something is a stream also passes if it's a tcp object).

I think we can reuse a lot of pieces from dvv's uhttp code. So here goes :)

@richardhundt
Owner

To get back to the OP and the question of interface: I think a PSGI-like interface is the way to go (including middleware).

http://search.cpan.org/~miyagawa/PSGI-1.101/PSGI.pod

I know it's Perl, but miyagawa++ is a badass programmer with a good feel for design.
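
For readers who don't know PSGI, the shape of the idea translated to Lua might look like the sketch below: an app is a function from an environment table to `{ status, headers, body_chunks }`, and middleware is a function that wraps an app and returns another app. The names `with_content_length` and `method_filter` are made up for this illustration and come from neither PSGI nor luv.

```lua
-- An application: environment table in, response triplet out.
local function app(env)
  return { 200, { ["Content-Type"] = "text/plain" },
           { "hello from ", env.PATH_INFO } }
end

-- Middleware: takes an app, returns a new app that adds Content-Length.
local function with_content_length(inner)
  return function(env)
    local res = inner(env)
    local n = 0
    for _, chunk in ipairs(res[3]) do n = n + #chunk end
    res[2]["Content-Length"] = tostring(n)
    return res
  end
end

-- Middleware: short-circuits requests whose method is not allowed.
local function method_filter(inner, allowed)
  return function(env)
    if not allowed[env.REQUEST_METHOD] then
      return { 405, { ["Content-Type"] = "text/plain" },
               { "method not allowed" } }
    end
    return inner(env)
  end
end

-- Middleware composes by plain function application.
local stack = with_content_length(method_filter(app, { GET = true }))
local ok = stack({ REQUEST_METHOD = "GET", PATH_INFO = "/" })
local denied = stack({ REQUEST_METHOD = "DELETE", PATH_INFO = "/" })
```

Because both layers speak the same triplet, the order of wrapping is the only configuration needed.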

@dvv
Contributor
dvv commented Oct 10, 2012

great!

i don't get the error stream -- how do they use it?

@richardhundt
Owner

With Perl you can reopen STDERR as the stream in the app code if you want, or you just write to it (it has an IO::Handle interface, so $env->{'psgi.errors'}->print(...) will work). On the backend side there's an adaptor (for Apache, or FCGI, or whatever) which has connected STDERR to this and will do its usual stuff with it (log it to a file, or whatever).
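
Carried over to the `luv_errors` key from the proposal, the same split might look like this in Lua. It is a sketch under assumptions: `make_errors_stream` and the `write` method are hypothetical, and a Lua table plays the role of the backend's log file.

```lua
-- Backend side: the adaptor decides where error output ends up.
local function make_errors_stream(sink)
  return {
    write = function(self, msg)
      sink[#sink + 1] = msg
    end,
  }
end

local log = {} -- stands in for a log file owned by the backend
local env = {
  REQUEST_METHOD = "GET",
  luv_errors = make_errors_stream(log),
}

-- App side: just write to the stream, without knowing where it goes.
local function app(env)
  env.luv_errors:write("warning: slow query\n")
  return { 200, {}, { "ok" } }
end

local res = app(env)
```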

@dvv
Contributor
dvv commented Oct 10, 2012

i believe the parser should:
- parse url with internal http_parser method
- normalize (lower()) header names
- headers should fit lua tables ideally -- we should store headers both by name and position
- be reusable for all requests coming from the connection
- set should_keep_alive and upgrade keys of request table
- be given configurable chunk_size and data_limit, to help security
- parse querystring

ooff :)
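
Two of the items above (lowercased header names stored both by name and by position, and querystring parsing) could be sketched in plain Lua like this. The helper names `parse_headers`, `parse_query` and `unescape` are hypothetical, and joining repeated headers with ", " is an assumption, not something decided in this thread.

```lua
-- Store each header twice: positionally as { name, value } pairs in the
-- array part, and by lowercased name in the hash part of the same table.
local function parse_headers(lines)
  local headers = {}
  for _, line in ipairs(lines) do
    local name, value = line:match("^([^:]+):%s*(.-)%s*$")
    if name then
      name = name:lower()
      headers[#headers + 1] = { name, value }
      -- Assumption: repeated headers are joined with ", ".
      if headers[name] then
        headers[name] = headers[name] .. ", " .. value
      else
        headers[name] = value
      end
    end
  end
  return headers
end

-- Decode "+" and %XX escapes in a querystring component.
local function unescape(s)
  s = s:gsub("%+", " ")
  s = s:gsub("%%(%x%x)", function(hex)
    return string.char(tonumber(hex, 16))
  end)
  return s
end

-- Later keys overwrite earlier ones; repeated keys are not collected.
local function parse_query(qs)
  local params = {}
  for key, value in qs:gmatch("([^&=]+)=([^&]*)") do
    params[unescape(key)] = unescape(value)
  end
  return params
end

local h = parse_headers({
  "Host: example.org",
  "Accept: text/html",
  "ACCEPT: text/plain",
})
local q = parse_query("name=luv&msg=hello+world%21")
```

Keeping both views in one table means `h.accept` gives the merged value while `ipairs(h)` still replays the headers in wire order.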

@daogangtang

https://github.com/brimworks/lua-http-parser

maybe this is useful.

@dvv
Contributor
dvv commented Oct 10, 2012

yeah, it is. i use it so far at https://github.com/dvv/luv-1/blob/master/examples/http_hellosvr.lua#L32
but it doesn't hide details we'd like to not expose and is a bit longish, imho

update: i meant Neopallium's fork really

@dvv
Contributor
dvv commented Oct 15, 2012

any progress?

@richardhundt
Owner

Hey,

I've pulled the code apart and am putting the pieces back together.

The old luv_object_t is gone, as is luv_fiber_t and luv_thread_t.

Instead I've reduced things down to a single polymorphic struct with
a vtable, a lua_State* and some additional pieces for hanging custom
data off this struct.

The reason for all this is that HTTP parsing isn't the only thing
you might want to do with a stream. You might want a line reader,
or a JSON decoder, or whatever.

The old code assumed that the only things that you would resume during
a libuv callback were fibers and threads. You can now suspend/resume
any intermediate "actor" states and pass data between them. Only the
last one in the chain will end up resuming the waiting coroutine.
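
In pure Lua terms, the chain idea might look like the sketch below: an intermediate line-reader stage buffers raw chunks and passes complete lines on, and only the final stage resumes the waiting coroutine. The names `line_reader` and `resumer` are hypothetical; the real design lives in C with a vtable per object, which this does not try to reproduce.

```lua
-- Intermediate stage: buffers chunks, emits complete lines downstream.
local function line_reader(next_stage)
  local buf = ""
  return function(chunk)
    buf = buf .. chunk
    while true do
      local line, rest = buf:match("^(.-)\r?\n(.*)$")
      if not line then break end -- no complete line yet: keep buffering
      buf = rest
      next_stage(line)
    end
  end
end

-- Last stage in the chain: the only one that resumes the coroutine.
local function resumer(co)
  return function(item)
    coroutine.resume(co, item)
  end
end

local lines = {}
local reader = coroutine.create(function()
  while true do
    lines[#lines + 1] = coroutine.yield()
  end
end)
coroutine.resume(reader) -- run up to the first yield

local feed = line_reader(resumer(reader))
feed("first li")                 -- no resume: line incomplete
feed("ne\nsecond line\nthi")     -- two resumes, one per complete line
feed("rd")                       -- no resume: "third" is still buffered
```

Swapping `line_reader` for a JSON decoder or an HTTP parser stage would leave the coroutine side unchanged.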

I wanted to get the design right, so I've been doing loads of
experimenting, so it's taken a while. I'm there now, just need
to finish refactoring.

@dvv
Contributor
dvv commented Oct 15, 2012

i see, thank you
