replace HTTP::Body with HTTP::Entity::Parser and optimize query_parameters #538

Merged
merged 4 commits into plack:master on Nov 25, 2015


@kazeburo
Contributor

related to #537 and #434

This PR replaces HTTP::Body with HTTP::Entity::Parser and WWW::Form::UrlEncoded(::XS). HTTP::Entity::Parser was created based on tokuhirom's code in #434.

This PR also removes the Hash::MultiValue->flatten call in parameters to reduce its cost.
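For illustration, here is a minimal sketch of the new parsing stack. It follows the HTTP::Entity::Parser synopsis pattern; the fake $env and the content types registered here are assumptions for the example, not the exact Plack::Request code.

#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Entity::Parser;
use WWW::Form::UrlEncoded qw(parse_urlencoded); # loads ::XS when available, ::PP otherwise

# Register a handler per content type, as in the HTTP::Entity::Parser synopsis.
my $parser = HTTP::Entity::Parser->new;
$parser->register('application/x-www-form-urlencoded', 'HTTP::Entity::Parser::UrlEncoded');
$parser->register('multipart/form-data', 'HTTP::Entity::Parser::MultiPart');

# A minimal mock PSGI $env, just for this example.
my $body = 'b_foo=123&b_bar=45678';
open my $input, '<', \$body or die $!;
my $env = {
    REQUEST_METHOD => 'POST',
    CONTENT_TYPE   => 'application/x-www-form-urlencoded',
    CONTENT_LENGTH => length $body,
    QUERY_STRING   => 'foo=123&bar=45678',
    'psgi.input'   => $input,
};

# parse() reads the body from psgi.input and returns flat key/value pairs
# plus uploads, so no HTTP::Body object is needed.
my ($params, $uploads) = $parser->parse($env);

# The query string is parsed with WWW::Form::UrlEncoded instead of HTTP::Body.
my @query = parse_urlencoded($env->{QUERY_STRING});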

Here is a small benchmark comparing this PR with master.

This PR (HTTP::Entity::Parser and WWW::Form::UrlEncoded::XS):

% perl -Ilib entity_parser_bench.pl
Benchmark: running get, noparam, post for at least 3 CPU seconds...
       get:  3 wallclock secs ( 3.11 usr +  0.00 sys =  3.11 CPU) @ 6381.03/s (n=19845)
   noparam:  3 wallclock secs ( 3.09 usr +  0.00 sys =  3.09 CPU) @ 7336.57/s (n=22670)
      post:  3 wallclock secs ( 3.30 usr +  0.00 sys =  3.30 CPU) @ 5225.15/s (n=17243)
          Rate    post     get noparam
post    5225/s      --    -18%    -29%
get     6381/s     22%      --    -13%
noparam 7337/s     40%     15%      --

master (0fbe073):

% perl -Ilib entity_parser_bench.pl
Benchmark: running get, noparam, post for at least 3 CPU seconds...
       get:  3 wallclock secs ( 3.03 usr +  0.00 sys =  3.03 CPU) @ 4884.49/s (n=14800)
   noparam:  3 wallclock secs ( 3.34 usr +  0.00 sys =  3.34 CPU) @ 6130.24/s (n=20475)
      post:  3 wallclock secs ( 3.05 usr +  0.00 sys =  3.05 CPU) @ 3472.79/s (n=10592)
          Rate    post     get noparam
post    3473/s      --    -29%    -43%
get     4884/s     41%      --    -20%
noparam 6130/s     77%     26%      --
@kazeburo closed this Nov 18, 2015
@kazeburo reopened this Nov 18, 2015
@miyagawa
Member

@kazeburo uri.t is failing. I can reproduce it locally as well.

@miyagawa

Any reason we store the cache in two variables, body_parameters and body? Same for query/query_parameters.

@miyagawa
Member

@kazeburo the test fails when you don't have the XS version of WWW::Form::UrlEncoded.

@kazeburo
Contributor

@miyagawa I found a bug in WWW::Form::UrlEncoded::PP. I'm fixing it now.

Hash::MultiValue->new and flatten are a bit slow. To reduce the HMV cost, this PR caches the parameters as an arrayref and creates the HMV instance on demand.
POST/GET data is not that large compared to modern hardware's memory, so even if we cache the data in two variables, it will not be a problem.
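A minimal sketch of the on-demand construction this describes, assuming hypothetical cache keys and helper names (the real Plack::Request code may differ):

# Sketch only: 'plack.request.body.raw' and _parse_body() are illustrative
# names, not necessarily the ones used in this PR.
sub body_parameters {
    my $self = shift;
    # The parser output is kept as a plain arrayref of key/value pairs...
    my $raw = $self->env->{'plack.request.body.raw'} ||= $self->_parse_body;
    # ...and the Hash::MultiValue object is only built (and cached) when
    # an accessor is actually called.
    $self->env->{'plack.request.body'} ||= Hash::MultiValue->new(@$raw);
}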

@miyagawa
Member

To reduce the HMV cost, this PR caches the parameters as an arrayref and creates the HMV instance on demand.

Right, but don't we cache the H::MV result anyway? Is this to optimize parameters so that we don't compute the H::MV flatten twice? It sounds like a bit of a micro-optimization to me, though I get that it's probably not harmful.

@kazeburo
Contributor

Here is an H::MV flatten benchmark: about 3 times faster without the double flatten.
But since we already cache the H::MV result in parameters, H::MV->new is called only once per request, so this optimization does not affect real applications much.

Benchmark: running with_flatten, without_flatten for at least 3 CPU seconds...
with_flatten:  3 wallclock secs ( 3.16 usr +  0.00 sys =  3.16 CPU) @ 32478.48/s (n=102632)
without_flatten:  3 wallclock secs ( 3.08 usr +  0.00 sys =  3.08 CPU) @ 91317.21/s (n=281257)
                   Rate    with_flatten without_flatten
with_flatten    32478/s              --            -64%
without_flatten 91317/s            181%              --

The benchmark script:

#!/usr/bin/perl

use strict;
use warnings;
use Benchmark qw/:all/;
use Hash::MultiValue;

# Raw key/value pairs as they come out of the parser.
my $query_parameter = [ foo => '123', 'bar' => '45678' ];
my $body_parameter  = [ b_foo => '123', 'b_bar' => '45678', 'b_baz' => '90123456789' ];

cmpthese(timethese(0, {
    # Build separate H::MV objects, then flatten them again to build
    # the merged parameters object (the old behaviour).
    with_flatten => sub {
        my $query = Hash::MultiValue->new(@$query_parameter);
        my $body  = Hash::MultiValue->new(@$body_parameter);
        my $param = Hash::MultiValue->new(
            $query->flatten,
            $body->flatten,
        );
    },
    # Build the merged object directly from the cached arrayrefs
    # (the behaviour after this PR).
    without_flatten => sub {
        my $param = Hash::MultiValue->new(
            @$query_parameter,
            @$body_parameter,
        );
    },
}));
@miyagawa
Member

This optimization does not affect real applications much.

Right, then why do we do it? :)

@kazeburo
Contributor

Being faster is good as long as there is no risk and the code doesn't get complicated.

@miyagawa merged commit 6238723 into plack:master on Nov 25, 2015

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
@miyagawa
Member

I changed the default buffer length in 06bd1f6.
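For reference, a sketch of how a caller could set the parser's read buffer size, assuming HTTP::Entity::Parser's documented buffer_length constructor option; the 1 MB value below is only an illustration, not the default chosen in 06bd1f6.

use HTTP::Entity::Parser;

# buffer_length controls how many bytes are read from psgi.input at a time
# (value here is illustrative only).
my $parser = HTTP::Entity::Parser->new(buffer_length => 1024 * 1024);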
