Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FIXED] Use of * and > in subjects as literals #561

Merged
merged 4 commits into from
Aug 25, 2017
Merged

Conversation

kozlovic
Copy link
Member

The issue was that a subject such as foo.bar,*,> would be
inserted to the cache as is, but when trying to remove from the
cache, calling matchLiteral() with the above subject in the cache
against the same subject would return false. This is because
matchLiteral would treat those characters as wildcards token.

Note that the sublist itself splits subjects on the . separator
and seem not bothered by such subject (would have foo and bar,*,>
tokens). Also, note that IsValidSubject() and IsValidLiteralSubject()
properly checked that the characters * and > are treated
as wildcards only if they are tokens on their own.

Resolves #558

/cc @derekcollison if you could have a look to see if this makes sense, thanks!

The issue was that a subject such as `foo.bar,*,>` would be
inserted to the cache as is, but when trying to remove from the
cache, calling matchLiteral() with the above subject in the cache
against the same subject would return false. This is because
matchLiteral would treat those characters as wildcards token.

Note that the sublist itself splits subjects on the `.` separator
and seem not bothered by such subject (would have `foo` and `bar,*,>`
tokens). Also, note that IsValidSubject() and IsValidLiteralSubject()
properly checked that the characters `*` and `>` are treated
as wildcards only if they are tokens on their own.

Resolves #558
@derekcollison
Copy link
Member

Does this have a noticeable impact on that functions performance? Should we track separators and token length to make the checks more straightforward?

@kozlovic
Copy link
Member Author

Will look into make it faster. But yes, there is an impact as it is now.

But fundamentally, do you agree that matchLiteral("foo,*,>", "foo,*,>")
should return true because both strings are literals? (current behavior
is that second param of the function will be interpreting the * and >
as wildcards and therefore return false).

@derekcollison
Copy link
Member

derekcollison commented Aug 17, 2017 via email

@kozlovic
Copy link
Member Author

Agreed.. working on optimizations, but difficult to be as fast as with old code since it was not checking that * or > were single tokens. Still making some improvements and adding a benchmark test. Will update the PR later.

@derekcollison
Copy link
Member

derekcollison commented Aug 17, 2017 via email

@kozlovic
Copy link
Member Author

I think I tried that but it was slow, will try again. Anytime we check ahead (even if we know that it is in bound, assembly code shows that there is a bound check with possible invocation of panicindex()).

@kozlovic
Copy link
Member Author

I am going to push a change that includes a small optimization, simplification of 'if' statements and a benchmark test. Will continue trying to optimize. Let's not merge for now.

@coveralls
Copy link

Coverage Status

Changes Unknown when pulling 0cc49ec on fix_issue_558 into ** on master**.

@nats-io nats-io deleted a comment from coveralls Aug 17, 2017
@nats-io nats-io deleted a comment from coveralls Aug 17, 2017
@kozlovic
Copy link
Member Author

Impact on normal bench is not visible. This is a comparison between master and the branch in 1 run only: (note negative value means branch (new) is faster)

Ivans-MacBook-Pro:gnatsd ivan$ benchcmp master.txt branch.txt 
benchmark                         old ns/op     new ns/op     delta
Benchmark_____Pub0b_Payload-8     86.8          86.2          -0.69%
Benchmark_____Pub8b_Payload-8     87.2          87.0          -0.23%
Benchmark____Pub32b_Payload-8     95.1          96.3          +1.26%
Benchmark___Pub128B_Payload-8     122           116           -4.92%
Benchmark___Pub256B_Payload-8     180           152           -15.56%
Benchmark_____Pub1K_Payload-8     239           227           -5.02%
Benchmark_____Pub4K_Payload-8     1131          1127          -0.35%
Benchmark_____Pub8K_Payload-8     2292          2314          +0.96%
Benchmark_AuthPub0b_Payload-8     179           175           -2.23%
Benchmark____________PubSub-8     178           177           -0.56%
Benchmark____PubSubTwoConns-8     178           178           +0.00%
Benchmark____PubTwoQueueSub-8     201           200           -0.50%
Benchmark___PubFourQueueSub-8     200           204           +2.00%
Benchmark__PubEightQueueSub-8     201           199           -1.00%
Benchmark___RoutedPubSub_0b-8     496           486           -2.02%
Benchmark___RoutedPubSub_1K-8     1044          1096          +4.98%
Benchmark_RoutedPubSub_100K-8     77286         78541         +1.62%

benchmark                         old MB/s     new MB/s     speedup
Benchmark_____Pub0b_Payload-8     126.75       127.58       1.01x
Benchmark_____Pub8b_Payload-8     218.00       218.39       1.00x
Benchmark____Pub32b_Payload-8     462.85       456.83       0.99x
Benchmark___Pub128B_Payload-8     1154.74      1212.86      1.05x
Benchmark___Pub256B_Payload-8     1488.72      1766.16      1.19x
Benchmark_____Pub1K_Payload-8     4335.65      4558.20      1.05x
Benchmark_____Pub4K_Payload-8     3631.12      3646.54      1.00x
Benchmark_____Pub8K_Payload-8     3579.30      3545.01      0.99x
Benchmark_AuthPub0b_Payload-8     61.18        62.71        1.03x

I ran the test another time and the Benchmark___Pub256B_Payload-8 run is also around 15% faster, not sure why.

@derekcollison

@derekcollison
Copy link
Member

LGTM

Copy link
Contributor

@petemiron petemiron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@kozlovic kozlovic merged commit 10fd640 into master Aug 25, 2017
@kozlovic kozlovic deleted the fix_issue_558 branch August 25, 2017 02:26
kozlovic added a commit that referenced this pull request Sep 1, 2017
This is similar to #561 where `*` and `>` characters appear in tokens
as literals, not wilcards.
Both Insert() and Remove() were checking that the first character
was `*` or `>` and consider it a wildcard node. This is wrong. Any
token that is more than 1 character long must be treated as a literal.
Only for token of size one should we check if the character is `*`
or `>`.
Added a test case for Insert and Remove with subject like `foo.*-`
or `foo.>-`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants