Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updates to KV #901

Merged
merged 1 commit into from
Feb 9, 2022
Merged

Updates to KV #901

merged 1 commit into from
Feb 9, 2022

Conversation

kozlovic
Copy link
Member

@kozlovic kozlovic commented Feb 9, 2022

Relying on ConsumerInfo.NumPending on create does not always work
if the NATS subscription exists before (which is normal case)
due to a design in the server that process the add consumer
request and then computes the consumer info (after the consumer
is setup which means it could start to deliver messages).

Changed a bit so that we add a JS consumer without attached
subscription and use internal option to allow a js.Subscribe()
with a bind to existing consumer.

Signed-off-by: Ivan Kozlovic ivan@synadia.com

Relying on ConsumerInfo.NumPending on create was not realiable
if the NATS subscription exists before (which is normal case)
due to a design in the server that process the add consumer
request and then computes the consumer info (after the consumer
is setup which means it could start to deliver messages).

Changed a bit so that we add a JS consumer without attached
subscription and use internal option to allow a js.Subscribe()
with a bind to existing consumer.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
@coveralls
Copy link

Coverage Status

Coverage increased (+0.02%) to 85.326% when pulling 85e8c7f on kv_updates into 6cbe827 on main.

Copy link
Member

@derekcollison derekcollison left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kozlovic kozlovic merged commit 161162c into main Feb 9, 2022
@kozlovic kozlovic deleted the kv_updates branch February 9, 2022 01:17
@ripienaar
Copy link
Contributor

It’s supposed to work - we even fixed it once when we noticed it doesn’t, is it really not fixable server side?

@ripienaar
Copy link
Contributor

Oh I see it’s because this code does a consumer info post create - that will indeed be wrong in ordered consumer.

However the info received in tbe create response is correct and could be used to save a round trip

This is quite fundamental and If that’s not working we should fix the server - but iirc we already did when initial KV design was done as we notified it then but being able to rely on the info from consumer create saves some round trips and makes things much more consistent.

@derekcollison
Copy link
Member

Your right that the issue is with already active push consumers since those messages can be delivered and in the case of the ordered consumer in question with AckNone that affects num pending.

I agree we should always get the initial numPending, will make that change to the server.

@ripienaar
Copy link
Contributor

Pretty sure we fixed that already in the server, problem is here we werent using the initial cinfo but a later info. Anyway you could do numpending+delivered to get the value also

@kozlovic
Copy link
Member Author

kozlovic commented Feb 9, 2022

@ripienaar For sure in standalone server mode, the problem exists. See this code:

o, err := stream.addConsumer(&req.Config)
if err != nil {
    resp.Error = NewJSConsumerCreateError(err, Unless(err))
    s.sendAPIErrResponse(ci, acc, subject, reply, string(msg), s.jsonResponse(&resp))
    return
}
resp.ConsumerInfo = o.info()
s.sendAPIResponse(ci, acc, subject, reply, string(msg), s.jsonResponse(resp))

We return the values from o.info() which is gathered after the consumer is added in addConsumer() and has possibly started delivering messages. I have traced/instrumented both servers and client and I am 100% confident that the issue (that is the "AddConsumer" response contains NumPending==0 when the consumer was created with a stream containing say 4 messages) can happen.
I have not verified with clustering mode though.

But agreed that it would be better to have the server respond to the add consumer request with the info as it was before delivery starts. This PR could be reverted, but then we would have to fix the racy nature of possibly setting the initDone/adding nil entry when creating the watch with what is done in the "update" callback. Simple lock should solve that and perf impact should be really minimal since most of time will not be contented and anyway, since history is currently limited to a rather small number, that should not be a problem.

@ripienaar
Copy link
Contributor

Thanks @kozlovic good to know, guess we fixed it badly hehe - I think its pretty fundemantal that this work.

In the above code though Pending+Delivered would still give you the correct answer to the question of how many messages should be expected.

@derekcollison
Copy link
Member

True but for initial state completion we can't depend on per message pending, it could keep increasing and take a while to get to zero.

@derekcollison
Copy link
Member

Will fix in the server to make sure NumPending is always initial state and can't be changed by already active push consumer and the Go scheduler.

@ripienaar
Copy link
Contributor

True but for initial state completion we can't depend on per message pending, it could keep increasing and take a while to get to zero.

No thats not what i mean, i mean:

cinfo, err := js.AddConsumer(...)

to know how many you should recieve to catch up with NOW, you get numpending+delivered. You dont keep checking the pending on the per message metadata.

@derekcollison
Copy link
Member

Ah gotcha, yes that is correct. Will still fix to always have NumPending be correct from AddConsumer.

@ripienaar
Copy link
Contributor

Ah gotcha, yes that is correct. Will still fix to always have NumPending be correct from AddConsumer.

KV ADR suggested that approach - but was later updated to match what was in the go code as an optional approach, should have fixed the code rather :)

@derekcollison
Copy link
Member

Yes the code should do that regardless. @kozlovic possible to make that change?

@kozlovic
Copy link
Member Author

kozlovic commented Feb 9, 2022

What change? NumPending+Delivered? With current approach it works since nats subscription is created after. If you want me to revert the PR changes and simply have pending+delivered instead, I am ok with that too (although we will still need to fix the race between watch creation and update callback).

@derekcollison
Copy link
Member

No need to revert, just wanted to make sure we took that into consideration when calculating the initial state being done etc.

@kozlovic
Copy link
Member Author

kozlovic commented Feb 9, 2022

Ok, but with current code, delivered should be 0 since we just created the consumer and there is no interest yet? (I guess unless something else has interest that triggers the delivery, but I though interest notification was "strict" as-in it has to be the exact subject (say inbox in this case).

@ripienaar
Copy link
Contributor

Might be ok as is but then update the ADR to reflect this approach? It already says to do pending+delivered

kozlovic added a commit that referenced this pull request Feb 9, 2022
This is undoing changes done in PR #901
and makes the changes dicussed in the comments of that PR.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants