Bundle protocol parser #38 #39

jhthorsen · 2019-02-07T13:37:52Z

I'm not quite sure how stable this protocol parser is, but it looks a lot faster 😄

What do you think?

TEST_BENCHMARK=1000 TEST_ONLINE=redis://localhost prove -vl t/benchmark.t
t/benchmark.t ..
ok 1 - lrange Mojo::Redis::Protocol 2.55848 wallclock secs ( 2.39 usr +  0.05 sys =  2.44 CPU) @ 409.84/s (n=1000)
ok 2 - lrange Protocol::Redis 6.05703 wallclock secs ( 5.86 usr +  0.07 sys =  5.93 CPU) @ 168.63/s (n=1000)
ok 3 - lrange Protocol::Redis::XS 1.85635 wallclock secs ( 1.70 usr +  0.05 sys =  1.75 CPU) @ 571.43/s (n=1000)
# Cannot compare Redis/PP and Redis2/PP
                       Rate Protocol::Redis Mojo::Redis::Protocol Protocol::Redis::XS
Protocol::Redis       169/s              --                  -59%                -70%
Mojo::Redis::Protocol 410/s            143%                    --                -28%
Protocol::Redis::XS   571/s            239%                   39%                  --
1..3
ok
All tests successful.
Files=1, Tests=3, 11 wallclock secs ( 0.02 usr  0.01 sys + 10.23 cusr  0.19 csys = 10.45 CPU)
Result: PASS

Update 8. feb:

One of the big reasons why this is so much faster is that we don't have to loop over the data twice:

mojo-redis/lib/Mojo/Redis/Connection.pm

Line 178 in 3901fef

sub _parse_message_cb {

_on_message_cb() is not in use at all when using Mojo::Redis::Protocol, which again saves at least one method call.

In addition, there's a lot less state and method calls inside the actual parser.

Protocol::Redis       167/s              --                  -60%                -71%
Mojo::Redis::Protocol 422/s            153%                    --                -26%
Protocol::Redis::XS   571/s            243%                   35%                  --

kraih · 2019-02-07T15:52:55Z

Those results certainly look encouraging.

lib/Mojo/Redis/Protocol.pm

Grinnz · 2019-02-07T23:13:10Z

Would it be easier to just leave the encoding part out of the protocol parser since it only applies to bulk-string replies anyway?

Grinnz · 2019-02-07T23:27:51Z

lib/Mojo/Redis/Connection.pm

+      my ($protocol, $msg) = @_;
+      my $p = shift @{$self->{waiting} || []};
+
+      if ($msg->{type} eq '-') {


This won't emit an error if there is an error contained within a multi-bulk reply somewhere; it seems like the current unwrapping method does this.

I'm not sure the best way to handle it, should a sub-error (like in ->exec response) cause the whole command to error? But we have no way to differentiate errors within the returned arrayref from strings.

How does this work for array types? Doesn't it need to change the data from hashrefs to their contents (and so on if it contains an array...)

I'm not sure how to write a test for error inside arrays. Do you know how to provoke such an error? If so, then I think I want to return it as a single error, since that's what I support in Connection.pm:

mojo-redis/lib/Mojo/Redis/Connection.pm

Line 216 in 3901fef

return $p ? $p->reject($err) : $self->emit(error => $err) unless $res;

Grinnz · 2019-02-08T17:48:40Z

lib/Mojo/Redis/Protocol.pm

+    my $type    = $ref eq 'ARRAY' ? '*' : !$ref ? '$' : $message->{type};
+
+    if ($type eq '+' or $type eq '-' or $type eq ':') {
+      $data = $encoder->encode($data, 0) if $encoder;


None of these types should be encoded, because they are not binary safe.

What is the purpose of specifying 0 for encode and decode here?

Note that if you specify any second argument for encode or decode, you should copy the string
(like https://metacpan.org/source/SRI/Mojolicious-8.12/lib/Mojo/Util.pm#L134) otherwise Encode will modify it in place.

None of these types should be encoded, because they are not binary safe.

What does "binary safe" mean? What will be the consequence of doing what I do here? Why should we not encode "+", but encode "$"?

What is the purpose of specifying 0 for encode and decode here?

To make sure it returns the value instead of replacing in place.

Note that if you specify any second argument for encode or decode, you should copy the string

I don't agree. I think that is only correct for the functions, not the methods. https://metacpan.org/pod/Encode::Encoding#-%3Eencode($string-[,$check])

Binary safe means that bulk strings can contain any byte. Namely, simple strings, errors, and integers cannot contain the bytes \r\n, and there is no guarantee that encoding values in some encoding will result in this. I added a test for this.

The encode specification is a mess. What you linked is the documentation for how Encode encoders maybe should work. But it's not how they work.

Grinnz · 2019-02-08T17:49:06Z

lib/Mojo/Redis/Protocol.pm

+    }
+    elsif ($type eq '$') {
+      if (defined $data) {
+        $data = $encoder->encode($data, 0) if $encoder and $type ne ':';


This is already only for $ type so the type check here is extraneous

Grinnz · 2019-02-08T17:50:14Z

lib/Mojo/Redis/Protocol.pm

+        $curr->{data} += 0;
+      }
+      elsif ($encoder ||= $self->{encoder}) {
+        $curr->{data} = $encoder->decode($curr->{data}, 0);


Likewise these types should not be decoded, they will always be ascii bytes.

Grinnz · 2019-02-08T17:54:56Z

lib/Mojo/Redis/Connection.pm

@@ -46,8 +46,12 @@ sub write_q {

 sub _encode {
  my $self     = shift;
+  my $protocol = $self->protocol;
+
+  return $protocol->encoding($self->encoding)->encode(\@_) if $protocol->can('encoding');


I think this is too fragile, it should check ->isa('Mojo::Redis::Protocol') specifically because the encode method has different behavior, if someone makes a subclass of Protocol::Redis that accidentally has a sub named encoding it would be bad.

Grinnz · 2019-02-08T17:55:39Z

lib/Mojo/Redis/Connection.pm

+
+  Scalar::Util::weaken($self);
+
+  if ($self->protocol->can('encoding')) {


Same here for ->isa check

Grinnz · 2019-02-08T18:03:19Z

The protocol spec (https://redis.io/topics/protocol) unfortunately does not make any recommendation as to how to handle multi-bulk replies that contain errors, but it does include an example of one.

This issue also discusses a similar problem. redis/node-redis#689

My opinion is that it should not cause an error condition, because there are still potentially other responses that the client needs to have returned (redis does not roll back successes). However, that would require some way to distinguish an error response within a returned arrayref from a regular string.

Grinnz · 2019-02-08T18:04:27Z

Perhaps the embedded error response could be turned into an exception object like Mojo::Exception?

Grinnz · 2019-02-08T18:09:47Z

Here is a simple way to get an error in multi-bulk response:

MULTI
SET foo bar
LPOP foo
EXEC

The response to the EXEC in the redis-cli is:

1) OK
2) (error) WRONGTYPE Operation against a key holding the wrong kind of value

And the foo key remains successfully created.

Grinnz · 2019-02-08T18:21:09Z

lib/Mojo/Redis/Protocol.pm

+      last CHUNK if length($$buf) - 2 < $self->{len};    # Wait for more data
+
+      $curr->{data} = $self->{len} < 0 ? undef : substr $$buf, 2, $self->{len}, '';
+      $curr->{data} = $encoder->decode($curr->{data}, 0) if $encoder ||= $self->{encoder};


This needs a defined check so it does not try to decode undef

Grinnz · 2019-02-08T18:23:41Z

lib/Mojo/Redis/Protocol.pm

+      if ($curr->{type} eq ':') {
+        $curr->{data} += 0;
+      }
+      elsif ($encoder ||= $self->{encoder}) {


I don't think the $encoder variable really does anything useful in this sub

Grinnz · 2019-02-08T18:29:15Z

lib/Mojo/Redis/Protocol.pm

+      last CHUNK if $self->{stop} < 0;    # Wait for more data
+
+      $self->{len} = substr $$buf, 0, $self->{stop}, '';
+      last CHUNK if length($$buf) - 2 < $self->{len};    # Wait for more data


This should be - 4, because the \r\n before and after the string data are both still there

Grinnz · 2019-02-08T18:37:23Z

lib/Mojo/Redis/Protocol.pm

+      last CHUNK if $self->{stop} < 0;                   # Wait for more data
+
+      $curr->{len} = substr $$buf, 0, $self->{stop}, '';
+      last CHUNK if length($$buf) - 2 < $curr->{len};    # Wait for more data


I don't think this line should be here. The size of the array does not indicate anything about how much data is required.

Grinnz · 2019-02-08T21:22:52Z

lib/Mojo/Redis/Protocol.pm

+  print $protocol->encode({type => "-", data => "foo"});
+  print $protocol->encode(["foo", "bar", "baz"]);
+
+  while (my $msg = $protocol->parse($bytes)) {


This synopsis is wrong, since it will keep passing the same $bytes into the parser forever...

Grinnz · 2019-02-08T21:23:52Z

lib/Mojo/Redis/Protocol.pm

+
+  $protocol = Mojo::Redis::Protocol->new;
+
+=head2 parse


This also needs to be updated to match the current API (returns a list of messages)

Grinnz · 2019-02-08T22:24:43Z

I wrote a test file based on Protocol::Redis::Test plus some protocol error and encoding tests, it does not currently pass mostly because messages are all coming back with a 'level' key, and it blows up on the first split bulk string message. https://gist.github.com/Grinnz/7b5322e8dcf9b50e8ae2ac7cab0c6709

jhthorsen · 2019-02-09T13:10:18Z

Sorry for wasting your time @Grinnz, but there seems to be too many pitfalls for me to finish this, so I've converted most of the code over to Protocol::Redis instead:

und3f/protocol-redis#8

I might revisit later on if I find the time. (I don't mind anyone else to finish it)

jhthorsen added the enhancement New feature or request label Feb 7, 2019

jhthorsen self-assigned this Feb 7, 2019

jhthorsen requested a review from Grinnz February 7, 2019 13:37

Grinnz reviewed Feb 7, 2019

View reviewed changes

lib/Mojo/Redis/Protocol.pm Outdated Show resolved Hide resolved

Grinnz reviewed Feb 7, 2019

View reviewed changes

lib/Mojo/Redis/Protocol.pm Show resolved Hide resolved

Grinnz reviewed Feb 7, 2019

View reviewed changes

Jan Henning Thorsen added 4 commits February 8, 2019 10:00

Bundle protocol parser #38

5fcbc56

Add support for detecting data type from input in encode()

9dfc520

Only encode the data part of the message

0e49a60

Got rid of on_message

d764919

jhthorsen force-pushed the feature/protocol branch from 6f803fb to d764919 Compare February 8, 2019 09:38

Grinnz reviewed Feb 8, 2019

View reviewed changes

lib/Mojo/Redis/Connection.pm

Scalar::Util::weaken($self);

if ($self->protocol->can('encoding')) {

Copy link

Collaborator

Grinnz Feb 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here for ->isa check

Grinnz reviewed Feb 8, 2019

View reviewed changes

jhthorsen closed this Feb 9, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bundle protocol parser #38 #39

Bundle protocol parser #38 #39

jhthorsen commented Feb 7, 2019 •

edited

kraih commented Feb 7, 2019

Grinnz commented Feb 7, 2019

Grinnz Feb 7, 2019

Grinnz Feb 7, 2019

Grinnz Feb 8, 2019

jhthorsen Feb 8, 2019

Grinnz Feb 8, 2019 •

edited

Grinnz Feb 8, 2019

Grinnz Feb 8, 2019 •

edited

jhthorsen Feb 9, 2019 •

edited

Grinnz Feb 9, 2019 •

edited

Grinnz Feb 8, 2019

Grinnz Feb 8, 2019 •

edited

Grinnz Feb 8, 2019 •

edited

Grinnz Feb 8, 2019

Grinnz commented Feb 8, 2019 •

edited

Grinnz commented Feb 8, 2019

Grinnz commented Feb 8, 2019 •

edited

Grinnz Feb 8, 2019

Grinnz Feb 8, 2019

Grinnz Feb 8, 2019

Grinnz Feb 8, 2019

Grinnz Feb 8, 2019

Grinnz Feb 8, 2019

Grinnz commented Feb 8, 2019

jhthorsen commented Feb 9, 2019


		Scalar::Util::weaken($self);

		if ($self->protocol->can('encoding')) {

Bundle protocol parser #38 #39

Bundle protocol parser #38 #39

Conversation

jhthorsen commented Feb 7, 2019 • edited

kraih commented Feb 7, 2019

Grinnz commented Feb 7, 2019

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Grinnz Feb 8, 2019 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Grinnz Feb 8, 2019 • edited

Choose a reason for hiding this comment

jhthorsen Feb 9, 2019 • edited

Choose a reason for hiding this comment

Grinnz Feb 9, 2019 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Grinnz Feb 8, 2019 • edited

Choose a reason for hiding this comment

Grinnz Feb 8, 2019 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Grinnz commented Feb 8, 2019 • edited

Grinnz commented Feb 8, 2019

Grinnz commented Feb 8, 2019 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Grinnz commented Feb 8, 2019

jhthorsen commented Feb 9, 2019

jhthorsen commented Feb 7, 2019 •

edited

Grinnz Feb 8, 2019 •

edited

Grinnz Feb 8, 2019 •

edited

jhthorsen Feb 9, 2019 •

edited

Grinnz Feb 9, 2019 •

edited

Grinnz Feb 8, 2019 •

edited

Grinnz Feb 8, 2019 •

edited

Grinnz commented Feb 8, 2019 •

edited

Grinnz commented Feb 8, 2019 •

edited