parse POST data #13

Open
xcir opened this Issue Jun 5, 2012 · 17 comments

4 participants

xcir commented Jun 5, 2012

I heard that you are looking for a VMOD that parses POST data.
I happen to be writing one.

https://github.com/xcir/libvmod-parsepost

Your project is important and interesting.
Let me think what I can do for you.

Collaborator

scarpellini commented Jun 5, 2012

Syohei, nice to hear from you :)
I'm the guy behind the VFW and its POST.vcl PoC. I did the PoC this way
(copying raw POST data to headers) because I had a tight deadline to
deliver (it was my postgrad coursework).
Now we (the security.vcl guys and I) are trying to merge our codebases and
port that inline C to a VMOD (as you did).
I think the libvmod-parsepost project is a superb contribution to the
Varnish community and to us. You did really nice work. :)
However, I think the way that you (and I) exposed the POST data to VCL
could be improved. In a VMOD, you can store this data in its own structure
and add syntactic sugar so that VCL can access it.
Take a look at libvmod-curl (https://www.varnish-cache.org/vmod/curl)
and it will be clear (if curl.header == ...).
What do you and the others think?

Again: Fine work. Congratulations! :)

Eduardo S. Scarpellini
scarpellini@gmail.com

xcir commented Jun 5, 2012

Hi, scarpellini

Thank you for your interest.

vmod-curl has a very nice-looking syntax.
I will try to follow its example (this weekend ;-)

I have a question:
should the data be exposed decoded, or still encoded (urlencoded)?

Right now, the data is urlencoded.

Collaborator

scarpellini commented Jun 5, 2012

Really nice, Tanaka.
I think decoded data is the better, cleaner way (it is what an admin expects
from an HTTP proxy system).
In our context (security), we need to parse every POST param. So we need a
way (considering VCL limitations: no arrays, no for-loops) to check the
params against some regular expressions. I think we have only one choice:
expose the whole POST data to VCL (in a specific VMOD var/struct). You could
consider exposing this raw data in addition to the parsed data (split into
separate vars).

gamelinux, comotion, Ruben, I want to hear your opinions here :)

Tanaka, you should also consider handling other charsets
(application/x-www-form-urlencoded; charset=ASCII-subsets).

Contributor

huayra commented Jun 6, 2012

Tanaka, Eduardo,

There were some discussions (including PHK) about this at VUG5, but I am not sure what the conclusion was. And I am still too excited to have an opinion yet :-)

I'll catch up with Kacper later this week and get back to you on this.

Great work you both are doing! Congrats!

R.

Owner

comotion commented Jun 7, 2012

This is a good thing.

Actually we spoke of encode() decode() stuff as well as post handling on the VUG. Conclusion was a go ahead for a POST module and that the varnish devs can help us expose and make things easier for such a module.

Encoding and decoding of post requests should be done here (as it will never be done inside varnish), so that we can match on specific post vars or multiparts, in the style of https://github.com/fastly/libvmod-urlcode

Ideally, Security.vcl needs to match on

  • the whole request body
  • specific body params
  • all body param names

and I see we are getting there :-)

Many backends take multiple levels of encoding, but security.vcl should be able to detect & kill requests that are obfuscated with many levels of encoding.
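
A minimal sketch of how requests obfuscated with several levels of encoding could be detected: percent-decode a buffer repeatedly and count how many passes still change it. This is plain C, not tied to the Varnish VMOD API; the function names `url_decode_once` and `encoding_depth` are hypothetical and do not come from libvmod-parsepost or libvmod-urlcode. It assumes a NUL-terminated, writable buffer and only handles `%XX` escapes (no `+` handling, no Unicode normalization).

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Convert one hex digit to its value; the caller guarantees isxdigit(c). */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    return tolower((unsigned char)c) - 'a' + 10;
}

/*
 * Percent-decode buf in place, one level only.
 * Returns the number of %XX escapes that were decoded.
 * Note: a decoded %00 becomes a NUL byte and truncates the C string;
 * production code would need length-tracked buffers instead.
 */
static size_t url_decode_once(char *buf)
{
    size_t decoded = 0;
    char *in = buf;
    char *out = buf;

    while (*in != '\0') {
        if (in[0] == '%' && isxdigit((unsigned char)in[1]) &&
            isxdigit((unsigned char)in[2])) {
            *out++ = (char)(hex_value(in[1]) * 16 + hex_value(in[2]));
            in += 3;
            decoded++;
        } else {
            *out++ = *in++;
        }
    }
    *out = '\0';
    return decoded;
}

/*
 * Decode buf repeatedly, at most max_levels times.
 * Returns how many passes actually decoded something, i.e. the
 * observed encoding depth (capped at max_levels).
 */
static int encoding_depth(char *buf, int max_levels)
{
    int levels = 0;

    while (levels < max_levels && url_decode_once(buf) > 0)
        levels++;
    return levels;
}
```

A rule could then reject any request whose body reports a depth above some threshold (for example, more than one level), since the buffer is left in its most-decoded form for regex matching.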

xcir commented Jun 7, 2012

Hi,

I made this to hear your opinion (encode/decode is not implemented yet):
https://github.com/xcir/libvmod-parsepost/tree/experimental

The code is still in progress, but I will complete it this weekend.

example

set req.http.post_key_data = parsepost.post_header("key");
set req.http.get_key_data = parsepost.get_header("key");
set req.http.cookie_key_data = parsepost.cookie_header("key");

set req.http.get_key_alldata = parsepost.get_body();
set req.http.cookie_key_alldata = parsepost.cookie_body();
set req.http.post_key_alldata = parsepost.post_body();

About "all body param names": what kind of interface do you want?

example
1) return all keys in one string
call
    parsepost.get_key();
ret
    "key1,key2,key3"

2) return one key per call
call
    parsepost.get_key();
ret
    "key1" (1st call)
    "key2" (2nd call)
    "" (last call)

I'm sorry if I have misunderstood.
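
The first interface style above (all keys in one comma-separated string) can be sketched in plain C as follows. This is only an illustration under assumptions: `list_param_names` is a hypothetical helper, not the actual parsepost implementation, and it expects an `application/x-www-form-urlencoded` body and copies the names as-is (still urlencoded).

```c
#include <stddef.h>
#include <string.h>

/*
 * Write the parameter names found in body into out as "key1,key2,key3".
 * Pairs with an empty name (e.g. "=x") are skipped.
 * Returns 0 on success, -1 if out is too small to hold the result.
 */
static int list_param_names(const char *body, char *out, size_t out_size)
{
    size_t used = 0;
    const char *p = body;

    if (out_size == 0)
        return -1;
    out[0] = '\0';

    while (*p != '\0') {
        size_t pair_len = strcspn(p, "&");  /* length of "name=value" */
        size_t name_len = strcspn(p, "=&"); /* length of "name" only  */

        if (name_len > 0) {
            size_t need = name_len + (used > 0 ? 1 : 0); /* plus comma */

            if (used + need + 1 > out_size)
                return -1;
            if (used > 0)
                out[used++] = ',';
            memcpy(out + used, p, name_len);
            used += name_len;
            out[used] = '\0';
        }
        p += pair_len;
        if (*p == '&')
            p++; /* step over the pair separator */
    }
    return 0;
}
```

Returning a single string suits VCL, which has no arrays or loops: one regular expression can then be matched against the whole list of names.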

Contributor

huayra commented Jun 12, 2012

It seems you got it right now, at least :-)

http://blog.xcir.net/index.php/2012/06/to-parsing-the-postget-request-and-cookie-header-of-varnish-vmod-parsereq/
https://github.com/xcir/libvmod-parsereq

How does this relate to the Sec.VCL + Varnish Firewall Roadmap/To-Do list?

Contributor

huayra commented Oct 1, 2012

I could arrange for a Roadmap/To-Do list update if you deem it relevant. Let me know.

Owner

comotion commented Oct 3, 2012

now it looks like encode&decode is on top to tie all of this together

Owner

comotion commented Oct 22, 2012

ok so we already have urlencode / urldecode in the libvmod_urlcode from @drwilco.
I've changed my mind, basically this module will be simpler if we leave decoding to another module.

The encode/decode functions that are vital for a web application firewall are unicode-normalization functions that give the shortest representation of all codepoints. Code that does this should probably end up in a different vmod. After a little research it seems this utf vmod should be based around utf8proc
(http://www.public-software-group.org/utf8proc )

We'll end up with code like this:

set req.http.X-SEC-body = urlcode.decode(unicode.decode(parsereq.post_body()));
set req.http.X-SEC-url = urlcode.decode(unicode.decode(req.url));

now to the subject at hand, the functions exposed by parsereq. this will probably end up in a parsereq ticket:

  • get_read_keylist() is of limited value because we cannot iterate in VCL. We would like to iterate over these values from VCL, how can we solve this?
    I guess that answers your question: we need all keys in one string or some iteration callback-to-vcl
    We also need all values in one string. Just having get_body() and post_body() is quite a step ahead!
  • get_header / post_header : They should probably be called get_key and post_key?

and while we are parsing the request and talking about headers:
it would be quite useful to be able to match all HTTP headers, and to iterate over HTTP header values.

cheers & keep up the good work. looking forward to hearing from you!

xcir commented Oct 22, 2012

Good evening,

I thought about how to iterate (just an idea):

parsereq.addfilter_post(othervmod.filter_a());
parsereq.addfilter_post(othervmod.filter_b());
....
parsereq.execfilter(); //iterate registered filter for all post,get,req.http.*? data.

-vcc and c sample:
https://gist.github.com/3933140

What do you think about this code?

get_header / post_header : They should probably be called get_key and post_key?

Oh... certainly.
I'll think it over. (rename or add)
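
The register-then-execute idea above can be illustrated in plain C with function pointers. This is a rough sketch under assumptions, not the code from the gist: `filter_chain`, `chain_add`, `chain_exec` and the two example filters are hypothetical names, and the sketch does not address how VCL itself would hand a subroutine to the VMOD.

```c
#include <stddef.h>
#include <string.h>

#define MAX_FILTERS 8

/* A filter inspects one name/value pair; nonzero means "reject". */
typedef int (*param_filter)(const char *name, const char *value);

struct filter_chain {
    param_filter filters[MAX_FILTERS];
    size_t count;
};

struct param {
    const char *name;
    const char *value;
};

/* Register a filter; returns -1 when the chain is full. */
static int chain_add(struct filter_chain *chain, param_filter f)
{
    if (chain->count >= MAX_FILTERS)
        return -1;
    chain->filters[chain->count++] = f;
    return 0;
}

/*
 * Run every registered filter against every parameter.
 * Returns the index of the first rejected parameter, or -1 if all pass.
 */
static int chain_exec(const struct filter_chain *chain,
                      const struct param *params, size_t n)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < chain->count; j++)
            if (chain->filters[j](params[i].name, params[i].value) != 0)
                return (int)i;
    return -1;
}

/* Example filter: reject values containing an opening script tag. */
static int filter_script_tag(const char *name, const char *value)
{
    (void)name;
    return strstr(value, "<script") != NULL;
}

/* Example filter: reject values containing a parent-directory step. */
static int filter_path_traversal(const char *name, const char *value)
{
    (void)name;
    return strstr(value, "../") != NULL;
}
```

The iteration happens entirely on the C side, which is the point of the proposal: VCL only registers filters and triggers one call, so it never needs a loop of its own.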

Contributor

huayra commented Oct 22, 2012

Kacper, Syohei,

Make sure to send your VCL design to the varnish-dev and -misc lists. That
way you can both get feedback on what would be useful for others and get
the attention of potentially interested people who may have joined the
community since the last time this was up.

Best,

Rubén Romero
Varnish Software

Owner

comotion commented Oct 23, 2012

@everyone https://github.com/comotion/VSF
@xcir: the best would be some way of writing filters in VCL that you can iterate over all headers or params
@huayra: when it's ready

Contributor

huayra commented Oct 23, 2012

Wow, nice work Kacper.

Really awesome wrapping of all the work that has been done since Security.VCL came about in 2009.
(I fixed the README.rst but you have an issue over at github already)

xcir commented Nov 17, 2012

--sample
https://gist.github.com/3939606

I changed it to call a subroutine.
I am also going to apply the iteration to headers (req.http.* and others).
What do you think about this code?

--vmod
https://github.com/xcir/libvmod-parsereq/tree/future-iterate

Owner

comotion commented Nov 21, 2012

hmm.. will this inline C stuff:
C{
Vmod_Func_parsereq.iterate(sp, "get", (const char*)VGC_function_test2);
}C
become
parsereq.iterate("headers", test2)
or
parsereq.iterate("post", filter_function)
?
cause that could work

xcir commented Nov 22, 2012

I don't think I can do it without inline C,
because VGC_function_XXX is a static function
and the VMOD data types do not include a function pointer.
I can't come up with a good idea.

Any ideas?
