-
Notifications
You must be signed in to change notification settings - Fork 2.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
nsqd: allow unbuffered memory chan if ephemeral or deferred #1376
Conversation
I pushed up another commit, a hacky idea to handle the case of |
OK, I came up with another strategy, which I think is more efficient, fixes some long-standing potential for undesired behavior with very small
fixed by making |
Went back to the simpler idea, with an additional "started" flag. |
@ploxiln catching up on this... Doesn't this effectively reduce the likelihood of messages being written to disk when Seems like users are trying to accomplish a few things with
This fix breaks (1), somewhat handles (2), and fixes (3). Note, it looks like we also broke ephemeral channels according to metal-stack/metal-roles#74. This makes sense intuitively, if you don't want any messages stored in memory ephemeral channels shouldn't work? Not sure we did that intentionally though. |
With The change to |
... so yes, agreed with what you wrote, but I think this handles (2) just fine, and only somewhat breaks (1) (mainly the feedback). |
sorry I'm late on this... I think I follow what's happening now, but I'm still not sold. In the event consumers are slow, this creates undesirable back-pressure when you'd rather those messages get written to disk. Why don't we just revert #1159 and take a step back with what (if anything) we want to achieve with Alternatively, I think the fix in 786c1f2 is fine — creating an ephemeral channel should arguably never write to disk. For DPUB, perhaps there's a similar highly-targeted fix (like keeping a non-nil memory message chan and only using it when a DPUB occurs? |
Yeah, I'm amenable to this idea. I was trying to please the no-mem-queue use-case here but agree it might not be worth all this. |
@mreiferson Sorry for bothering but is there any progress? Here are some of my ideas that may not be consistent with your original design:
|
Updated to only change behavior (compared to latest nsq-1.2.1 release) for deferred messages or paused topics. (But it's still kinda a lot of lines.) |
Want it back |
return nil | ||
case <-time.After(time.Second): | ||
break // give up | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we want to wait around for a second in the ephemeral case - the intention is that it's bounded and lossy, why add back pressure when the queue is full?
My proposal would be to simplify this whole if/else block to:
if cap(t.memoryMsgChan) > 0 || m.deferred != 0 {
select {
case t.memoryMsgChan <- m:
return nil
default:
break // write to backend
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, we can pare it all the way down and remove the wait. The 1 second is arbitrary ... the expectation/desire is more like a handful of milliseconds. If there's a burst of messages and a bunch of channels, they can all get through the topic to the channels in just a few milliseconds, instead of having deferred time lost or messages kinda sampled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(oh and ephemeral topics don't work at all without the memory chan)
In b3b29b7 / nsqio#1159 unbuffered memoryMsgChan were replaced with nil chan for both Topic and Channel if mem-queue-size=0. This inadvertently made deferred publish never work (with mem-queue-size=0), because some metadata is lost when messages go through the "backend" disk queue, instead of immediately going through the topic memory chan to the channels' deferred pqueues. The motivation for fully disabling the memoryMsgQueue for mem-queue-size=0 was to avoid excessively shuffling message order, which would happen if some messages jump instantly through the memory-queue while others take the longer way through the disk-queue. So, only allow using the memory queue in this case if really needed, for deferred messages, or for ephemeral topic or channel which just lose all messages if they all go through the no-op backend queue. Granted, mem-queue-size=0 never worked very well, in many respects.
db22502
to
4f5d227
Compare
much reduced, squashed, re-wrote commit message |
Thanks for the reviews @mreiferson :) |
In b3b29b7 / #1159 unbuffered
memoryMsgChan were replaced with nil chan for both Topic and Channel
(this only happens if mem-queue-size=0). This inadvertently made
deferred publish never work (with mem-queue-size=0), because some
metadata is lost when messages go through the "backend" disk queue,
but previously messages could go immediately through the topic's
unbuffered chan with their metadata intact, to then be added to each
channel's deferred pqueue. This partial revert brings this part back.
Granted, mem-queue-size=0 never worked very well, in many respects.
For this functionality, at low message publish rates, it seems there
was rarely a problem. But at high message publish rates it was always
likely that some messages would hit the topic disk queue and lose
the deferred time metadata. This potential problem remains.
In fact, the motivation for fully disabling the memoryMsgQueue for
mem-queue-size=0 was to avoid excessively shuffling message order,
which would happen if some messages jump instantly through the
memory-queue while others take the longer way through the disk-queue.
That will be likely again. But the users who noticed this probably
did not use publish-deferred functionality!
Additional fixes would be appropriate, but this is a quick-fix to
restore previous useful behavior for mem-queue-size=0.
fixes #1375