[FIXED] Use of `*` and `>` in subjects as literals #561

kozlovic · 2017-08-16T18:37:28Z

The issue was that a subject such as `foo.bar,*,>` would be
inserted into the cache as is, but when trying to remove it from the
cache, calling matchLiteral() with that cached subject
against the same subject would return false. This is because
matchLiteral() treated those characters as wildcard tokens.

Note that the sublist itself splits subjects on the `.` separator
and does not seem bothered by such a subject (it would have `foo` and
`bar,*,>` tokens). Also, note that IsValidSubject() and IsValidLiteralSubject()
already check that the characters `*` and `>` are treated
as wildcards only if they are tokens on their own.
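To illustrate the intended behavior, here is a minimal sketch that splits both subjects on `.` and treats `*` and `>` as wildcards only when they form a whole token. The name `matchLiteralSketch` is hypothetical, and this is not the server's matchLiteral(), which is optimized and does not split strings. The sketch also assumes that `>` only acts as a wildcard when it is the last token.

```go
package main

import (
	"fmt"
	"strings"
)

// matchLiteralSketch reports whether a literal subject matches a subject
// that may contain wildcards. Hypothetical helper for illustration only.
func matchLiteralSketch(literal, subject string) bool {
	lts := strings.Split(literal, ".")
	sts := strings.Split(subject, ".")
	for i, st := range sts {
		// ">" alone as the last token matches one or more remaining tokens.
		if st == ">" && i == len(sts)-1 {
			return i < len(lts)
		}
		if i >= len(lts) {
			return false // the literal has fewer tokens than the subject
		}
		// "*" alone matches exactly one token, whatever it contains.
		if st == "*" {
			continue
		}
		// Anything else, including "bar,*,>", is compared as a literal.
		if st != lts[i] {
			return false
		}
	}
	return len(lts) == len(sts)
}

func main() {
	fmt.Println(matchLiteralSketch("foo.bar,*,>", "foo.bar,*,>")) // true
	fmt.Println(matchLiteralSketch("foo.bar", "foo.*"))           // true
	fmt.Println(matchLiteralSketch("foo", "foo.>"))               // false
}
```

With this rule, a cached subject such as `foo.bar,*,>` matches itself, so removal from the cache can find the entry.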

Resolves #558

/cc @derekcollison if you could have a look to see if this makes sense, thanks!


derekcollison · 2017-08-16T18:53:57Z

Does this have a noticeable impact on that function's performance? Should we track separators and token length to make the checks more straightforward?

kozlovic · 2017-08-16T21:16:08Z

Will look into making it faster. But yes, there is an impact as it is now.

But fundamentally, do you agree that matchLiteral("foo,*,>", "foo,*,>")
should return true because both strings are literals? (The current behavior
is that the * and > in the function's second parameter are interpreted
as wildcards, so it returns false.)

derekcollison · 2017-08-17T00:12:58Z

Yes I agree, but want to be sensitive to performance impacts.


kozlovic · 2017-08-17T00:14:49Z

Agreed. Working on optimizations, but it is difficult to be as fast as the old code, since that code was not checking that * or > were single tokens. Still making some improvements and adding a benchmark test. Will update the PR later.

derekcollison · 2017-08-17T00:20:59Z

I think the trick may be in tracking "." separators and keeping a separate count of token length; when you see the next "." or the end of the subject, then and only then do you check for pwc (`*`) or fwc (`>`).
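One way to read this suggestion is sketched below: scan the subject once, remember where the current token starts, and only look at wildcards when a `.` or the end of the subject closes the token. The name `matchLiteralOnePass` is hypothetical, the code is not taken from the PR, and it makes no claim about speed.

```go
package main

import "fmt"

// matchLiteralOnePass is an illustrative sketch, not the server's code.
func matchLiteralOnePass(literal, subject string) bool {
	li := 0    // start of the current token in literal
	start := 0 // start of the current token in subject
	for i := 0; i <= len(subject); i++ {
		// Keep scanning until a separator or the end of the subject.
		if i < len(subject) && subject[i] != '.' {
			continue
		}
		if li > len(literal) {
			return false // literal ran out of tokens
		}
		// Locate the end of the corresponding literal token.
		le := li
		for le < len(literal) && literal[le] != '.' {
			le++
		}
		tok := subject[start:i]
		last := i == len(subject)
		switch {
		case len(tok) == 1 && tok[0] == '>' && last:
			return true // full wildcard as the final token
		case len(tok) == 1 && tok[0] == '*':
			// partial wildcard: accept any single literal token
		case tok != literal[li:le]:
			return false // literal tokens differ
		}
		li = le + 1
		start = i + 1
	}
	// Both subjects must have been consumed entirely.
	return li == len(literal)+1
}

func main() {
	fmt.Println(matchLiteralOnePass("foo,*,>", "foo,*,>")) // true
	fmt.Println(matchLiteralOnePass("foo.bar", "foo.*"))   // true
	fmt.Println(matchLiteralOnePass("foo.bar", "foo.baz")) // false
}
```

Because the wildcard check happens only once the token length is known, a token such as `*-` or `bar,*,>` is never mistaken for a wildcard.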


kozlovic · 2017-08-17T00:24:08Z

I think I tried that but it was slow; will try again. Any time we look ahead, even if we know the index is in bounds, the assembly code shows a bounds check with a possible invocation of panicindex().

kozlovic · 2017-08-17T00:27:34Z

I am going to push a change that includes a small optimization, simplification of 'if' statements and a benchmark test. Will continue trying to optimize. Let's not merge for now.

coveralls · 2017-08-17T19:09:21Z

Changes Unknown when pulling 0cc49ec on fix_issue_558 into master.

kozlovic · 2017-08-24T21:13:58Z

The impact on the normal benchmarks is not visible. This is a comparison between master and the branch from a single run (note that a negative delta means the branch (new) is faster):

Ivans-MacBook-Pro:gnatsd ivan$ benchcmp master.txt branch.txt 
benchmark                         old ns/op     new ns/op     delta
Benchmark_____Pub0b_Payload-8     86.8          86.2          -0.69%
Benchmark_____Pub8b_Payload-8     87.2          87.0          -0.23%
Benchmark____Pub32b_Payload-8     95.1          96.3          +1.26%
Benchmark___Pub128B_Payload-8     122           116           -4.92%
Benchmark___Pub256B_Payload-8     180           152           -15.56%
Benchmark_____Pub1K_Payload-8     239           227           -5.02%
Benchmark_____Pub4K_Payload-8     1131          1127          -0.35%
Benchmark_____Pub8K_Payload-8     2292          2314          +0.96%
Benchmark_AuthPub0b_Payload-8     179           175           -2.23%
Benchmark____________PubSub-8     178           177           -0.56%
Benchmark____PubSubTwoConns-8     178           178           +0.00%
Benchmark____PubTwoQueueSub-8     201           200           -0.50%
Benchmark___PubFourQueueSub-8     200           204           +2.00%
Benchmark__PubEightQueueSub-8     201           199           -1.00%
Benchmark___RoutedPubSub_0b-8     496           486           -2.02%
Benchmark___RoutedPubSub_1K-8     1044          1096          +4.98%
Benchmark_RoutedPubSub_100K-8     77286         78541         +1.62%

benchmark                         old MB/s     new MB/s     speedup
Benchmark_____Pub0b_Payload-8     126.75       127.58       1.01x
Benchmark_____Pub8b_Payload-8     218.00       218.39       1.00x
Benchmark____Pub32b_Payload-8     462.85       456.83       0.99x
Benchmark___Pub128B_Payload-8     1154.74      1212.86      1.05x
Benchmark___Pub256B_Payload-8     1488.72      1766.16      1.19x
Benchmark_____Pub1K_Payload-8     4335.65      4558.20      1.05x
Benchmark_____Pub4K_Payload-8     3631.12      3646.54      1.00x
Benchmark_____Pub8K_Payload-8     3579.30      3545.01      0.99x
Benchmark_AuthPub0b_Payload-8     61.18        62.71        1.03x

I ran the test another time and the Benchmark___Pub256B_Payload-8 run is also around 15% faster, not sure why.

@derekcollison

derekcollison · 2017-08-24T21:24:53Z

LGTM

petemiron

LGTM.

This is similar to #561, where `*` and `>` characters appear in tokens as literals, not wildcards. Both Insert() and Remove() were checking whether the first character was `*` or `>` and, if so, considering it a wildcard node. This is wrong. Any token that is more than 1 character long must be treated as a literal. Only for a token of size one should we check if the character is `*` or `>`. Added a test case for Insert and Remove with subjects like `foo.*-` or `foo.>-`.
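The token rule described above can be sketched as a small classifier. The type and function names (`tokenKind`, `classify`) are hypothetical and are not the identifiers used in the sublist code.

```go
package main

import (
	"fmt"
	"strings"
)

// tokenKind says how a single subject token should be treated.
type tokenKind int

const (
	literalToken    tokenKind = iota // compared character by character
	partialWildcard                  // "*" as a whole token
	fullWildcard                     // ">" as a whole token
)

// classify treats a token as a wildcard only if it is exactly one
// character long; longer tokens such as "*-" are always literals.
func classify(tok string) tokenKind {
	if len(tok) == 1 {
		switch tok[0] {
		case '*':
			return partialWildcard
		case '>':
			return fullWildcard
		}
	}
	return literalToken
}

func main() {
	for _, subject := range []string{"foo.*", "foo.*-", "foo.>-"} {
		for _, tok := range strings.Split(subject, ".") {
			fmt.Printf("%s: token %q -> kind %d\n", subject, tok, classify(tok))
		}
	}
}
```

Checking the token length before the character is what keeps `foo.*-` and `foo.>-` on the literal path for both insertion and removal.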

kozlovic added 2 commits August 16, 2017 16:28

add matchLiteral benchmark

30eeac2

Improve matchLiteral performance and simplify if conditions

42b27d2

Improve matchLiteral further and add some more tests

0cc49ec

nats-io deleted a comment from coveralls Aug 17, 2017

kozlovic requested a review from petemiron August 24, 2017 21:14

petemiron approved these changes Aug 25, 2017


kozlovic merged commit 10fd640 into master Aug 25, 2017

kozlovic deleted the fix_issue_558 branch August 25, 2017 02:26

kozlovic mentioned this pull request Sep 1, 2017

[FIXED] Sublist Insert and Remove with wildcard characters in literals #573

Merged
