feat: Support ACKs from replica to master #1243
src/server/replica.cc
    waker_.await_until(
        [&]() {
          return journal_rec_executed_.load(std::memory_order_relaxed) >
                     ack_offs_ + kAckRecordMaxInterval ||
                 force_ping_;
        },
        next_ack_tp);
    }
Small nit: maybe declare lambda separately 👀 🔫
I kinda prefer it inlined :) Any specific reason to split it out or just general style?
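For comparison, here is a minimal sketch of what the split-out variant could look like. It is not the PR's code: `AckState`, `WaitForAckCondition`, and the value of `kAckRecordMaxInterval` are assumptions, and a `std::condition_variable` stands in for the fiber-aware `waker_`.

```cpp
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for the replica's ACK state. Field names mirror the
// snippet under review; types and the constant's value are assumptions.
struct AckState {
  static constexpr uint64_t kAckRecordMaxInterval = 1024;  // assumed value
  std::atomic<uint64_t> journal_rec_executed{0};
  uint64_t ack_offs = 0;
  bool force_ping = false;

  std::mutex mu;
  std::condition_variable cv;  // stand-in for the fiber-aware waker_

  // Returns true if an ACK is due, false if the deadline passed first.
  bool WaitForAckCondition(std::chrono::steady_clock::time_point next_ack_tp) {
    // The predicate gets a name instead of being inlined into the call.
    auto ack_due = [this] {
      return journal_rec_executed.load(std::memory_order_relaxed) >
                 ack_offs + kAckRecordMaxInterval ||
             force_ping;
    };
    std::unique_lock<std::mutex> lk(mu);
    return cv.wait_until(lk, next_ack_tp, ack_due);
  }
};
```

Behavior is the same either way; the named predicate mostly helps if the condition grows or needs to be reused.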
    VLOG(1) << "Received client ACK=" << ack;
    cntx->replication_flow->last_acked_lsn = ack;
    return;
I'm not sure at all, but I was wondering whether an invalid pointer access is possible here.
If a flow fails, the master starts cancellation. During stable sync, part of cancellation is running the flow cleanup, which closes the socket. Then replica_ptr is dropped, so all flows are deallocated. Closing the socket makes the connection close.
Could an ACK be received before the socket is closed, but be processed after the flow has been deleted (while the context isn't closed yet)? Say, through some unlucky preemption inside the connection? It doesn't look like there is such a path, but it would be a wild bug.
I think it's OK: we call flow->cleanup() and flow->full_sync_fb.Join() in DflyCmd::CancelReplication before erasing the ReplicationInfo struct from DflyCmd::replica_infos_.
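The ordering described in this reply can be sketched roughly as follows. This is a simplified, single-threaded model and not the actual DflyCmd code: `FakeFiber`, `Flow`, `ReplicaInfo`, `Registry`, and the `events` log are all hypothetical names introduced for illustration.

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a fiber handle; only Join() matters here.
struct FakeFiber {
  bool joined = false;
  void Join() { joined = true; }
};

struct Flow {
  std::function<void()> cleanup;  // e.g. closes the flow's socket
  FakeFiber full_sync_fb;
  uint64_t last_acked_lsn = 0;
};

struct ReplicaInfo {
  std::vector<Flow> flows;
};

struct Registry {
  std::map<uint32_t, std::shared_ptr<ReplicaInfo>> replica_infos;
  std::vector<std::string> events;  // step order, recorded for illustration

  void CancelReplication(uint32_t sync_id) {
    auto it = replica_infos.find(sync_id);
    if (it == replica_infos.end()) return;
    // Step 1: tear down every flow while its memory is still owned.
    for (Flow& flow : it->second->flows) {
      if (flow.cleanup) flow.cleanup();
      events.push_back("cleanup");
      flow.full_sync_fb.Join();
      events.push_back("join");
    }
    // Step 2: only now drop the registry entry that owns the flows.
    replica_infos.erase(it);
    events.push_back("erase");
  }
};
```

The point of the ordering is that the entry owning the flows is erased only after each flow's cleanup has run and its fiber has been joined.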
This does implement the ACKs functionality we wanted, and adds a periodic PING from the master to the replicas.