nsqd: --worker-id issues #429
Can you confirm what version of |
So this one's a doozy 😁 🔥
This effectively breaks message ID generation, which leads to duplicate message IDs being delivered to clients, which in turn breaks state tracking in both We need to do a few things:
|
I've updated the title to reflect my previous comment |
💣 |
Dang, you guys are good! Thanks for spotting that so quickly. I'll change NsqCluster to fix this.
|
Thank you for the report 👍! |
RFR @jehiah, going to separately improve docs on |
Also, I've decided I really hate the name Alternatives include |
Since message IDs are only ever used between a single nsqd and its clients (and never shared amongst hosts), why is it particularly important to use this as part of the message guid? I guess I'm also missing exactly what implication here broke message generation? |
A value greater than the max broke message ID generation. The original idea of making this configurable was to be able to build your cluster in a way in which message IDs are guaranteed globally unique. In practice, partly due to the complete lack of documentation, I doubt it's ever used. |
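To illustrate how a worker ID above the maximum can produce duplicate IDs, here is a minimal sketch of a snowflake-style ID layout. It is not nsqd's actual code: the field widths (`WORKER_ID_BITS`, `SEQUENCE_BITS`) and the function names are assumptions chosen for illustration. The point is only that an oversized worker ID spills into the timestamp bits, so two different timestamps can map to the same ID.

```python
# Hypothetical field widths, chosen for illustration only.
WORKER_ID_BITS = 10
SEQUENCE_BITS = 12
MAX_WORKER_ID = (1 << WORKER_ID_BITS) - 1  # largest value that fits the field


def make_id(timestamp, worker_id, sequence):
    """Pack the three fields into one integer with no range check."""
    return (
        (timestamp << (WORKER_ID_BITS + SEQUENCE_BITS))
        | (worker_id << SEQUENCE_BITS)
        | sequence
    )


def make_id_checked(timestamp, worker_id, sequence):
    """Same packing, but reject a worker ID that does not fit its field."""
    if not 0 <= worker_id <= MAX_WORKER_ID:
        raise ValueError(
            "worker_id must be in [0, %d], got %d" % (MAX_WORKER_ID, worker_id)
        )
    return make_id(timestamp, worker_id, sequence)


if __name__ == "__main__":
    too_big = MAX_WORKER_ID + 1
    # The extra bit of the worker ID lands on the lowest timestamp bit,
    # so these two timestamps are no longer distinguishable.
    print(make_id(2, too_big, 0) == make_id(3, too_big, 0))
    # An in-range worker ID keeps the fields separate.
    print(make_id(2, 5, 0) == make_id(3, 5, 0))
```

Validating the value at startup, as `make_id_checked` does, turns a silent collision into an immediate error.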
Ping @jehiah Independent of its current name and usefulness, we should at least get these small changes in for the current state of things. |
It appears there are some cases where nsqd will fail to FIN a message because it thinks it's not in flight (but the message is actually in flight).
In playing around with nsqd locally, I kept seeing messages like this in the log:
This was happening with a single consumer on a topic that was doing nothing but pulling messages off the queue and immediately FIN'ing them.
At first I thought it was a bug with the client library I was writing, but I was able to reproduce it with other client libraries as well (krakow and pynsq).
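For readers following along, here is a toy model of why duplicate message IDs (as described in the discussion above) would surface as a failed FIN. It is only a sketch under the assumption that in-flight messages are tracked in a map keyed by message ID; the class and method names are hypothetical and do not come from nsqd.

```python
class InFlightTable:
    """Toy in-flight tracker keyed by message ID (hypothetical, not nsqd code)."""

    def __init__(self):
        self._messages = {}

    def start(self, message_id, body):
        # A second message with the same ID silently replaces the first entry.
        self._messages[message_id] = body

    def finish(self, message_id):
        # Returns False when the ID is unknown, i.e. "not in flight".
        if message_id not in self._messages:
            return False
        del self._messages[message_id]
        return True


if __name__ == "__main__":
    table = InFlightTable()
    table.start("dup-id", b"first")
    table.start("dup-id", b"second")  # same ID, different message
    print(table.finish("dup-id"))  # the first FIN removes the only entry
    print(table.finish("dup-id"))  # the second FIN finds nothing to remove
```

In this model both messages really were delivered, but only one entry exists, so the second FIN is rejected even though the client behaved correctly.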
After some trial and error, I've found a way to reliably reproduce it:
Here's a small Ruby program that will reproduce it every time:
https://gist.github.com/bschwartz/d9b82670c48bcf6a5b7d
The output for it looks like this:
Hopefully it's clear what's going on in there. I am hoping to read through the source of nsqd and help track this down, but thought I'd throw it up here in case you guys know right away what might be causing it.
Thanks for building a great piece of software!