Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Support ACKs from replica to master #1243

Merged
merged 8 commits into from
May 29, 2023
Merged

Conversation

royjacobson
Copy link
Contributor

This does implement the ACKs functionality we wanted, and adds a periodic PING from the master to the replicas.

src/server/replica.cc Outdated Show resolved Hide resolved
src/server/dflycmd.h Outdated Show resolved Hide resolved
@royjacobson royjacobson mentioned this pull request May 21, 2023
7 tasks
src/server/replica.cc Outdated Show resolved Hide resolved
src/server/replica.cc Outdated Show resolved Hide resolved
@royjacobson royjacobson marked this pull request as ready for review May 21, 2023 13:34
@royjacobson royjacobson changed the base branch from replication_abs_offsets to main May 23, 2023 09:40
src/server/replica.cc Outdated Show resolved Hide resolved
src/server/replica.cc Outdated Show resolved Hide resolved
adiholden
adiholden previously approved these changes May 24, 2023
dranikpg
dranikpg previously approved these changes May 24, 2023
Comment on lines 968 to 1008
waker_.await_until(
[&]() {
return journal_rec_executed_.load(std::memory_order_relaxed) >
ack_offs_ + kAckRecordMaxInterval ||
force_ping_;
},
next_ack_tp);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small nit: maybe declare lambda separately 👀 🔫

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kinda prefer it inlined :) Any specific reason to split it out or just general style?

Comment on lines +1947 to +1958
VLOG(1) << "Received client ACK=" << ack;
cntx->replication_flow->last_acked_lsn = ack;
return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure at all, but I was thinking if there is any option for invalid pointer access.

If a flow fails, master starts cancellation. Part of cancellation on stable sync is running flow cleanup, it closes the socket. Then replica_ptr is dropped, so all flows are deallocated. Closing the socket will make the connection close.

Is there a way for the ack to be received before the socket is closed, but run after the flow has been deleted (and the context isn't closed yet)? Some unlucky preemption inside connection? It seems like there are none like this. But it would be a wild bug

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's Ok:
We call flow->cleanup() and flow->full_sync_fb.Join() in DflyCmd::CancelReplication before erasing the ReplicationInfo struct from DflyCmd::replica_infos_

@royjacobson royjacobson dismissed stale reviews from dranikpg and adiholden via 199f9a7 May 28, 2023 11:57
adiholden
adiholden previously approved these changes May 28, 2023
@royjacobson royjacobson merged commit 29c258d into main May 29, 2023
7 checks passed
@romange romange deleted the replication_acks branch June 7, 2023 09:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants