Don't cache SublistResult if from MatchWithResult call #5324

Merged
merged 1 commit into from
Apr 22, 2024

Conversation

neilalexander
Member

Otherwise the cache might end up with multiple cache entries sharing the same underlying memory.

Signed-off-by: Neil Twigg <neil@nats.io>

Member

@kozlovic kozlovic left a comment


LGTM, but there is still some possibility of misusing this, which will lead to weird bugs.

- cacheEnabled := s.cache != nil
+ // Writing to the cache is only allowed if not supplying our
+ // own SublistResult, i.e. via call to MatchWithResult.
+ cacheEnabled := result == nil && s.cache != nil
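To illustrate why caching a caller-supplied result is a problem, here is a minimal sketch in Go. It is not the nats-server code: `result`, `toyList`, `match`, and the `cacheCallerResult` switch are hypothetical stand-ins, and matching is reduced to exact subject lookup. It only shows how reusing one result across calls can leave several cache entries pointing at the same memory.

```go
package main

// result is a hypothetical stand-in for SublistResult: a reusable
// container holding the subscriptions that matched a subject.
type result struct {
	psubs []string
}

// toyList is a hypothetical, heavily simplified stand-in for a Sublist:
// exact-match subjects only, plus a cache of previously computed results.
type toyList struct {
	subs  map[string][]string // subject -> subscriber names
	cache map[string]*result  // subject -> cached match result
}

// match fills a result for subject. If the caller passes its own result
// (r != nil), that result is reset and reused. cacheCallerResult switches
// between caching caller-owned results (true, the behaviour this change
// removes) and never caching them (false).
func (l *toyList) match(subject string, r *result, cacheCallerResult bool) *result {
	if hit, ok := l.cache[subject]; ok {
		return hit
	}
	callerOwned := r != nil
	if callerOwned {
		r.psubs = r.psubs[:0] // reuse the caller's memory
	} else {
		r = &result{}
	}
	r.psubs = append(r.psubs, l.subs[subject]...)
	if !callerOwned || cacheCallerResult {
		l.cache[subject] = r
	}
	return r
}

// aliasedEntries matches "foo" and then "bar" with one caller-owned result
// and reports how many cache entries end up pointing at that same result.
func aliasedEntries(cacheCallerResult bool) int {
	l := &toyList{
		subs:  map[string][]string{"foo": {"a"}, "bar": {"b"}},
		cache: map[string]*result{},
	}
	r := &result{}
	l.match("foo", r, cacheCallerResult)
	l.match("bar", r, cacheCallerResult)
	n := 0
	for _, e := range l.cache {
		if e == r {
			n++
		}
	}
	return n
}
```

In this sketch, `aliasedEntries(true)` counts two cache entries backed by the same result, while `aliasedEntries(false)` counts none because the caller-owned result is never stored.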

By the way, the other danger is that, although this PR makes sure we don't store the result in the cache, we may still get a cache hit from a previously stored result (from some other code doing a regular match, with the result being cached). If we get a hit, we will return the result from the cache, which is fine with the way you have used it in the other PR. However, if some dev later assumes that the returned result is the same as the pointer to the SublistResult from the stack, bad things will happen. Say:

// Reuse a single caller-owned result across all lookups.
r := &SublistResult{}
for _, subj := range subjectsToCheck {
    if r = sl.MatchWithResult(subj, r); len(r.psubs) > 0 {
        // do something
    }
}

The code above would be bad if one result was returned from a cache hit, because at the next invocation in the loop, we would reset the cache result's content.

This is a far-fetched misuse, but we have to be mindful of the consequences.
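The misuse described above can be reproduced with a small self-contained sketch. Everything here is a hypothetical simplification (`toyList`, `match`, `matchWithResult`, exact-match subjects only) and not the actual Sublist implementation. It assumes, as the comment describes, that a cache hit returns the cached pointer rather than the caller's own result.

```go
package main

// result is a hypothetical stand-in for SublistResult.
type result struct{ psubs []string }

// toyList is a hypothetical exact-match stand-in for a Sublist with a cache.
type toyList struct {
	subs  map[string][]string
	cache map[string]*result
}

// match models a regular match: it allocates a result and caches it.
func (l *toyList) match(subject string) *result {
	if hit, ok := l.cache[subject]; ok {
		return hit
	}
	r := &result{psubs: append([]string(nil), l.subs[subject]...)}
	l.cache[subject] = r
	return r
}

// matchWithResult models a match into a caller-owned result: it never
// writes to the cache, but a cache hit still returns the cached pointer.
func (l *toyList) matchWithResult(subject string, r *result) *result {
	if hit, ok := l.cache[subject]; ok {
		return hit
	}
	r.psubs = append(r.psubs[:0], l.subs[subject]...)
	return r
}

// cacheAfterLoop runs the loop from the comment and returns what the cache
// holds for "foo" afterwards. With reassign=true the returned pointer is fed
// back into the next call (the misuse); with reassign=false the caller keeps
// passing its own result.
func cacheAfterLoop(reassign bool) []string {
	l := &toyList{
		subs:  map[string][]string{"foo": {"a"}, "bar": {"b"}},
		cache: map[string]*result{},
	}
	l.match("foo") // some other code path populates the cache for "foo"

	own := &result{}
	r := own
	for _, subj := range []string{"foo", "bar"} {
		if reassign {
			r = l.matchWithResult(subj, r) // after "foo", r is the cache entry
		} else {
			r = l.matchWithResult(subj, own)
		}
	}
	return l.cache["foo"].psubs
}
```

With `reassign` set to true, the lookup for "bar" resets and refills the cached entry for "foo", so the cache now reports the wrong subscribers for "foo". Passing the caller's own result on every iteration leaves the cache untouched.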

@derekcollison
Member

Where are we on this issue? I think a bool check for interest that does not add to the cache might be worth considering.

@kozlovic
Member

@derekcollison

> Where are we on this issue? I think a bool check for interest that does not add to the cache might be worth considering.

This current PR should be merged for now since it fixes the previous one by preventing adding the stack result to the cache. But even with the fix, as I mentioned in this PR, I think there is still a risk of misuse (albeit low).

But since the use case so far is just to check interest, I think we could add a Sublist_HasInterest(plainSubs, queueSubs bool) that checks for interest but does not collect subs. It should stop as soon as a match is found, rather than traversing the whole trie. The two booleans would indicate whether the caller wants to know about plain sub interest only (true, false), queue sub interest only (false, true), or any interest (true, true). Alternatively, we could use a byte with bitwise flags (Sublist_PlainSubInterest|Sublist_QueueSubInterest).

I would be happy to work on that later today unless @neilalexander is planning or has already started.

The question would be - if we add the "has interest" function - do we still keep Sublist_MatchWithResult() or not?
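As a rough illustration of the interest check proposed above, the sketch below uses the bit-flag variant and returns on the first match. All names (`interestKind`, `plainSubInterest`, `queueSubInterest`, `sub`, `hasInterest`) are hypothetical, and a flat slice stands in for the trie, so this shows only the flag semantics and the early exit, not how the real Sublist would walk its levels.

```go
package main

import "strings"

// interestKind is a hypothetical bit set selecting which kinds of
// subscription count as interest.
type interestKind byte

const (
	plainSubInterest interestKind = 1 << iota // plain (non-queue) subscriptions
	queueSubInterest                          // queue subscriptions
)

// sub is a hypothetical subscription: a subject pattern plus an optional
// queue group name (empty for a plain subscription).
type sub struct {
	pattern string
	queue   string
}

// matches reports whether subject matches pattern, where "*" matches exactly
// one token and a trailing ">" matches one or more remaining tokens.
func matches(pattern, subject string) bool {
	pt := strings.Split(pattern, ".")
	st := strings.Split(subject, ".")
	for i, p := range pt {
		if p == ">" {
			return i == len(pt)-1 && len(st) > i
		}
		if i >= len(st) || (p != "*" && p != st[i]) {
			return false
		}
	}
	return len(pt) == len(st)
}

// hasInterest returns true as soon as one subscription of a wanted kind
// matches subject. It collects nothing and touches no cache.
func hasInterest(subs []sub, subject string, want interestKind) bool {
	for _, s := range subs {
		kind := plainSubInterest
		if s.queue != "" {
			kind = queueSubInterest
		}
		if want&kind != 0 && matches(s.pattern, subject) {
			return true // stop at the first match
		}
	}
	return false
}
```

A caller interested in any kind of subscription would pass `plainSubInterest|queueSubInterest`. Passing a single flag restricts the check to that kind only.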

@derekcollison
Member

I like the idea of HasInterest and agree that if we have that, we can remove the other one.

Will let you and @neilalexander decide how to proceed but should have something before we release 2.10.15 IMO.

@neilalexander
Member Author

I actually already have HasInterest() implemented on a local branch, however it ignores the cache as it doesn't actually populate the subs/qsubs slice (it bails early upon discovering the first interest to save cycles). I will push it up but just haven't got around to doing tests yet.

@derekcollison
Member

Should we merge this one and do a follow on?

@kozlovic
Member

@neilalexander I think HasInterest() should still hit the cache first and return a boolean based on the presence of subs in the cache's result. Sure, that function will not add anything to the cache, but other parts of the code may have called Match() on the same subject and populated the cache. In other words, there is no reason HasInterest() should not look in the cache first.

Actually, part of the test would be that calling HasInterest() first leaves the cache empty, but that calling Match() and then HasInterest() increases the cache hit count, showing that HasInterest() makes use of the cache.
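The behaviour and test idea described above could look roughly like the following sketch. It is an assumption-laden toy, not the eventual implementation: `toyList`, `match`, `hasInterest`, `newToyList`, and the `cacheHits` counter are hypothetical names, and subjects are matched exactly.

```go
package main

// result is a hypothetical stand-in for SublistResult.
type result struct{ psubs []string }

// toyList is a hypothetical exact-match stand-in for a Sublist with a
// cache and a cache hit counter.
type toyList struct {
	subs      map[string][]string
	cache     map[string]*result
	cacheHits int
}

// newToyList builds a list with a single subscriber on "foo".
func newToyList() *toyList {
	return &toyList{
		subs:  map[string][]string{"foo": {"a"}},
		cache: map[string]*result{},
	}
}

// match models a regular match: it reads from and writes to the cache.
func (l *toyList) match(subject string) *result {
	if hit, ok := l.cache[subject]; ok {
		l.cacheHits++
		return hit
	}
	r := &result{psubs: append([]string(nil), l.subs[subject]...)}
	l.cache[subject] = r
	return r
}

// hasInterest consults the cache first and answers from a cached result if
// one exists. On a miss it checks the subscriptions directly and does not
// populate the cache.
func (l *toyList) hasInterest(subject string) bool {
	if hit, ok := l.cache[subject]; ok {
		l.cacheHits++
		return len(hit.psubs) > 0
	}
	return len(l.subs[subject]) > 0
}
```

A test along the lines suggested would call `hasInterest` on a fresh list and check that the cache is still empty, then call `match` followed by `hasInterest` and check that `cacheHits` went up.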

@derekcollison derekcollison merged commit 1007c14 into main Apr 22, 2024
4 checks passed
@neilalexander
Member Author

Suggest we merge this for now, given it's safe in the context that it's being used in, and we'll take it back out once we have worked out a good caching strategy for a HasInterest() replacement.

> In other words, there is no reason HasInterest() should not look up in the cache first.

Agreed, that is easily done for sure.

@derekcollison derekcollison deleted the neil/slcache branch April 22, 2024 15:23
neilalexander added a commit that referenced this pull request Apr 24, 2024
Includes the following:

* #5328
* #5324
* #5332
* #5314
* #5333
* #5330
* #5329
* #5338
* #5315
* #5339
* #5340
* #5341
* #5342
* #5347

Signed-off-by: Neil Twigg <neil@nats.io>