HBASE-24807 Backport HBASE-20417 to branch-1 #2197
try (WALEntryStream entryStream =
    new WALEntryStream(logQueue, fs, conf, lastReadPosition, metrics)) {
  while (isReaderRunning()) { // loop here to keep reusing stream while we can
    if (!source.isPeerEnabled()) {
This is just a safeguard to prevent accumulation of batches, right? I can't think of any other implications of the patch.
Yeah, and the accumulation of batches can lead to major problems, because those batches count toward the overall buffer usage tracked by ReplicationSourceManager. If buffer usage reaches the quota limit, replication gets stuck. And since the buffer usage is checked at ReplicationSourceManager, there is a single buffer shared by all peers. If one peer is disabled while other peers are supposed to keep replicating edits, those sources would also get stuck because of this, until an RS restart.
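To make the failure mode concrete, here is a minimal, single-threaded sketch of a buffer quota shared by two peers. It is not the HBase implementation: the class name, the constants, and the `simulate` and `tryAcquire` helpers are all hypothetical, and the real reader and shipper threads are collapsed into one loop. Peer A is disabled and never ships, so whatever its reader buffers is never released; peer B is enabled and releases its batch as soon as it ships.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical model of a buffer quota shared by all replication peers.
// None of these names are taken from the HBase code base.
public class SharedBufferQuotaSketch {
    static final long QUOTA_BYTES = 100; // assumed shared quota
    static final long BATCH_BYTES = 10;  // assumed size of every batch

    // Reserve room for one batch in the shared buffer; fail if the quota would be exceeded.
    static boolean tryAcquire(AtomicLong totalBufferUsed) {
        if (totalBufferUsed.get() + BATCH_BYTES > QUOTA_BYTES) {
            return false;
        }
        totalBufferUsed.addAndGet(BATCH_BYTES);
        return true;
    }

    // Returns how many batches the enabled peer B ships in the given number of rounds.
    // checkPeerEnabled models the safeguard: a reader for a disabled peer reads nothing.
    static int simulate(boolean checkPeerEnabled, int rounds) {
        AtomicLong totalBufferUsed = new AtomicLong(0);
        boolean peerAEnabled = false; // peer A is disabled for the whole run
        int shippedByB = 0;
        for (int round = 0; round < rounds; round++) {
            // Reader for peer A: without the safeguard it keeps buffering batches
            // that are never shipped, so their space is never released.
            boolean skipA = checkPeerEnabled && !peerAEnabled;
            if (!skipA) {
                tryAcquire(totalBufferUsed);
            }
            // Reader and shipper for peer B: buffer one batch, ship it, release it.
            if (tryAcquire(totalBufferUsed)) {
                totalBufferUsed.addAndGet(-BATCH_BYTES);
                shippedByB++;
            }
        }
        return shippedByB;
    }

    public static void main(String[] args) {
        System.out.println("without check: " + simulate(false, 20));
        System.out.println("with check:    " + simulate(true, 20));
    }
}
```

Under these assumed numbers, peer B ships only 9 of 20 batches without the check, because peer A's unshipped batches fill the shared quota by round 10 and nothing ever drains them. With the check, peer B ships all 20.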
Makes sense. I've been following the jira updates on the issue that Josh created.
Let me re-trigger the build; it should work now.
💔 -1 overall
This message was automatically generated.
Checking on the UT failures.
The test failures look unrelated; those tests pass locally.
No description provided.