
Rework goroutines and synchronization #136

Merged
merged 26 commits into hashicorp:issue-84-integration from ongardie/modules on Sep 24, 2016

Conversation

ongardie-sfdc

Today, the division of work and the synchronization between goroutines gets to be hard to follow in places. I think we can do better, to make the library more maintainable and eliminate potential race conditions from accidentally shared state. Ideally, it'll become more unit testable too.

This commit includes a diagram and description of where I think we should go. I'm open to feedback on it. Some of it's probably underspecified, with details to be determined as we implement more; questions are fair game too.

I held back on subdividing the main Raft module into a nonblocking goroutine and blocking helpers, but it's something we could consider. I haven't studied the code enough to know whether that'd be feasible or advantageous.

The transition from here to there is going to take significant effort. Here are a few of the major differences:

  • Peer is structured completely differently from replication.go today.
  • Peer handles all communication including RequestVote, not just AppendEntries/InstallSnapshot as replication.go does today.
  • Fewer locks and shared state. commitment.go and raftstate.go remove locking/atomics, possibly merge into raft.go. Other goroutines don't get a handle to the Raft module's state.
  • Snapshots are created through a different flow.

I started on the replication.go/peer.go changes, but it was before I had a good idea of where things were heading. I'll be happy to pick that up again later.
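
To make the channel-based split above more concrete, here's a rough sketch of the kind of thing I mean. All of the names (peerControl, peerProgress, and so on) are placeholders for illustration, not what's in the branch:

package raft

// peerControl is what the main Raft goroutine tells a Peer: the current term,
// our role, and how much log there is to replicate. It's sent over a channel,
// so the Peer never touches Raft's state directly.
type peerControl struct {
	term        uint64
	role        uint8 // follower, candidate, or leader
	lastIndex   uint64
	commitIndex uint64
}

// peerProgress is what a Peer reports back: votes granted and replication progress.
type peerProgress struct {
	term        uint64 // highest term seen from this peer
	voteGranted bool
	matchIndex  uint64 // highest log index known to be replicated on this peer
}

// peer owns all of its own state (nextIndex, in-flight RPCs, backoff timers);
// nothing here is shared with or locked against the main Raft module.
type peer struct {
	controlCh  chan peerControl  // Raft -> Peer
	progressCh chan peerProgress // Peer -> Raft
}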

/cc @superfell @cstlee @bmizerany @kr @slackpad @sean- #84

@slackpad
Contributor

@ongardie-sfdc thanks for putting this together! I'm still wrapping my head around a few things (and traveling) so I'll have some more thorough feedback in a bit, but here are a couple of initial impressions:

The split of responsibilities and the channel-based interfaces will be huge for testability, especially for cases we can't drive the system into today; we'll now be able to inject whatever we want into these various channels. I'm super excited about this aspect of the design.

The short-lived goroutines kind of put me off initially as a performance smell, but I see how they lead to nice, clean interactions with the Peer's non-blocking loop and serve as a checkpoint for the term check in the middle, etc. I'm thinking we might be able to create some pools of goroutines that we reuse, and have that also be part of our work rate limiting. I need to think about these changes in general, how they can affect performance, and how pipelining will look. This might be a bogus worry as well; need to think more on it :-)

This will definitely be a large effort and will take some time. In the near term I'd like to try to ship a release candidate version of Consul with the #84 peer changes. This design will address several of our outstanding issues (especially the AppendEntries stale term checks) so I'm trying to weigh if it's practical to clean those up as simply as possible in the short term in the existing architecture and target this into a different, longer term integration branch to roll out in a later release of Consul.

@kr

kr commented Jul 14, 2016

we might be able to create some pools of goroutines that we reuse

Definitely measure that approach before adopting it. Making a new goroutine is pretty cheap; it can be cheaper than the alternative. For example:

https://github.com/golang/net/blob/f841c39d/http2/server.go#L610-L617
https://github.com/golang/net/blob/f841c39d/http2/server.go#L848-L851

This was a surprising result; this way turned out faster than sending frames to a long-running goroutine to be written.

as a performance smell … need to think more on it

For performance, I'd strongly encourage an empirical approach over intuition or even careful thought. You'll be surprised how hard it is to predict where the performance problems are (and aren't!). 😄
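
For example, a pair of benchmarks along these lines (these would live in a _test.go file; the names and the work() stand-in are made up) is usually enough to settle it for a given workload:

package raft

import (
	"sync"
	"testing"
)

// work stands in for whatever each goroutine would actually do (e.g. one RPC send).
func work() {}

// BenchmarkSpawnPerTask starts a fresh goroutine for every unit of work.
func BenchmarkSpawnPerTask(b *testing.B) {
	var wg sync.WaitGroup
	for i := 0; i < b.N; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			work()
		}()
	}
	wg.Wait()
}

// BenchmarkLongRunningWorker hands every unit of work to one long-lived goroutine.
func BenchmarkLongRunningWorker(b *testing.B) {
	tasks := make(chan struct{})
	done := make(chan struct{})
	go func() {
		for range tasks {
			work()
		}
		close(done)
	}()
	for i := 0; i < b.N; i++ {
		tasks <- struct{}{}
	}
	close(tasks)
	<-done
}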

@ongardie
Contributor

I've now reworked the replication/peer side of things, but I need to get the existing tests passing again. And then there's adding actual unit tests now that that's possible. I'll post for a review one of these days. Pipelining fit in easily using the Transport's future return value, not the additional channel as before.
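
Roughly the shape of the pipelined path, as a simplified sketch with approximated signatures rather than the actual code:

// sendPipelined is a hypothetical helper: issue several AppendEntries on the
// pipeline, keep the returned futures in order, then wait on them, instead of
// reading completions off a separate channel.
func sendPipelined(p AppendPipeline, reqs []*AppendEntriesRequest) error {
	inflight := make([]AppendFuture, 0, len(reqs))
	for _, req := range reqs {
		var resp AppendEntriesResponse
		f, err := p.AppendEntries(req, &resp)
		if err != nil {
			return err // transport-level failure; fall back to non-pipelined sends
		}
		inflight = append(inflight, f)
	}
	// Completions arrive in the order the requests were sent.
	for _, f := range inflight {
		if err := f.Error(); err != nil {
			return err
		}
	}
	return nil
}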

So regarding short-term shipping vs making this change, I think the biggest part is already mostly done. The rest will be more incremental, like removing dependencies on shared Raft/RaftState state from other parts of the library.

@slackpad
Contributor

@ongardie if you've got a branch some place I'm definitely willing to take a look!

@ongardie-sfdc
Author

Well, this patch is about ready for review, but the project might feel less 1337 now.

18 files changed, 2064 insertions(+), 1337 deletions(-)

Despite that, I'll post it for review this afternoon.

See PR#136 for rationale

// There are scenarios where this request didn't succeed
// but there's no need to wait/back-off the next attempt.
NoRetryBackoff bool
Author

Removed this because I don't think there are legitimate scenarios where the leader should back off upon receiving a reply. Backoff happens now only in response to transport errors.
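
In other words, the decision boils down to something like this (hypothetical helper, just to illustrate the behavior):

package raft

import "time"

// backoffAfterAttempt decides how long to wait before the next AppendEntries
// attempt. transportErr is non-nil only when the RPC itself failed (connection
// refused, timeout, etc.); failures counts consecutive transport errors. A reply
// from the follower, even an unsuccessful one, means it's reachable, so there's
// no backoff and no need for a NoRetryBackoff flag in the response.
func backoffAfterAttempt(transportErr error, failures uint) time.Duration {
	if transportErr == nil {
		return 0 // got a reply; retry immediately
	}
	// Exponential backoff on repeated transport errors, capped at one second.
	const base = 10 * time.Millisecond
	const maxWait = time.Second
	d := base << failures
	if d <= 0 || d > maxWait {
		d = maxWait
	}
	return d
}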

Contributor

Yeah, this is an interesting one. The cases where we didn't set this to true were kind of a miscellaneous grab bag of application errors, like not being able to write to the log. I'd probably rather see the back pressure done as a delayed response (such as if we did retries writing to the log, etc.) vs. an immediate response asking for a back-off. Removing this and keeping backoff only for comm errors seems like the way to go.

Contributor

How important is backward compatibility? If providing backward compatibility, perhaps it would be better to retain this field but document that it's deprecated and ignored.

Contributor

@rogpeppe we've got a number of incompatible changes in this version of the library already, though they are generally pretty easy to deal with (it took about 2 hours to get Consul building and booting again). We intend to keep this in a branch, and once it's done we will message the community and give a little bit of time for people to adapt to the changes before we take it to master. Whenever we can make something safer (like IDs and addresses as separate types) or simpler, we are breaking things. We've also got a story for how folks can interoperate with newer versions of the library during the transition - https://github.com/hashicorp/raft/blob/library-v2-stage-one/config.go#L10-L115.

@ongardie-sfdc
Author

Sorry for the many emails. Blame GitHub.

This is ready for your review now, @slackpad @sean- @superfell and any other generous folks.

// blockingSelect reads/writes the Peer channels just once, blocking if needed.
func (p *peerState) blockingSelect() {
	// We need to send a heartbeat at lastHeartbeatSent + heartbeatInterval.
	heartbeatTimer := time.After(p.shared.options.heartbeatInterval -
Author

Need an if statement around this so that Followers and Candidates don't fire all the time.
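
Something along these lines (sketch only, with some made-up names): arm the timer only when we're leader; a receive on a nil channel blocks forever, so Followers and Candidates simply never hit that case in the select.

func (p *peerState) blockingSelect() {
	var heartbeatTimer <-chan time.Time
	if p.shared.role == Leader {
		// Next heartbeat is due at lastHeartbeatSent + heartbeatInterval.
		heartbeatTimer = time.After(p.shared.options.heartbeatInterval -
			time.Since(p.lastHeartbeatSent))
	}
	select {
	case <-heartbeatTimer:
		// Only reachable as Leader; the nil channel never fires otherwise.
		p.issueHeartbeat()
	case ctl := <-p.controlCh:
		p.updateControl(ctl)
	}
}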

@@ -10,7 +10,7 @@ import (
 // NewInmemAddr returns a new in-memory addr with
 // a randomly generate UUID as the ID.
 func NewInmemAddr() ServerAddress {
-	return ServerAddress(generateUUID())
+	return ServerAddress(generateUUID()[:4])
Contributor

Good call!

Author

It's definitely the best part of this patch.

@ongardie-sfdc
Author

With de57f4d, Peer still assumes the pipeline won't fail, and a new "reliable pipeline" wrapper handles errors under the hood.
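
Conceptually the wrapper looks something like this (a simplified sketch with approximated Transport signatures; the real one in de57f4d may differ): when a send on the underlying pipeline fails, it closes it and transparently opens a fresh one.

type reliablePipeline struct {
	trans  Transport
	target ServerAddress
	inner  AppendPipeline // nil until first use or after a failure
}

func (r *reliablePipeline) AppendEntries(req *AppendEntriesRequest,
	resp *AppendEntriesResponse) (AppendFuture, error) {
	for attempt := 0; ; attempt++ {
		if r.inner == nil {
			p, err := r.trans.AppendEntriesPipeline(r.target)
			if err != nil {
				return nil, err // can't open a pipeline at all; caller backs off
			}
			r.inner = p
		}
		f, err := r.inner.AppendEntries(req, resp)
		if err == nil {
			return f, nil
		}
		// The pipeline broke mid-flight: drop it and retry once on a fresh one.
		r.inner.Close()
		r.inner = nil
		if attempt > 0 {
			return nil, err
		}
	}
}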

@ongardie-sfdc
Author

Hey, I just discovered this thing where net_transport.go tries to detect a heartbeat based on the request being all zeros (no longer true with this PR), then directly invokes a function that's probably not safe to run concurrently. I think we should disable it for now and revisit later.
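
For context, the detection is roughly of this shape (a paraphrase, not the exact code); when it matches, the transport invokes a registered heartbeat callback directly instead of delivering the RPC on the normal consumer channel:

// looksLikeHeartbeat paraphrases the fast-path check being described: an
// AppendEntries with no entries and zeroed log/commit fields is assumed to be
// a heartbeat. With this PR, heartbeats carry real prev-log/commit values, so
// this guess no longer matches.
func looksLikeHeartbeat(req *AppendEntriesRequest) bool {
	return req.Term != 0 &&
		len(req.Entries) == 0 &&
		req.PrevLogEntry == 0 &&
		req.PrevLogTerm == 0 &&
		req.LeaderCommitIndex == 0
}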

@ongardie-sfdc
Author

^@slackpad please add to issue-84 checklist

@slackpad
Contributor

@ongardie-sfdc added as item 25

@ongardie-sfdc
Author

hey @slackpad @sean-, any ETA on reviewing this?

@slackpad
Contributor

Sorry for the delay @ongardie-sfdc - I will review this week!

@@ -372,6 +382,7 @@ func (r *Raft) restoreSnapshot() error {
 		r.configurations.latest = configuration
 		r.configurations.latestIndex = snapshot.Index
 	}
+	r.updatePeers()
Contributor

This is going to get called on line 332 in NewRaft(); is there a good reason to call it here? We might change the configuration later when we look through the log entries.

@slackpad
Contributor

slackpad commented Sep 2, 2016

Made it about 1/3 through but I'll be out tomorrow, so this'll spill over into next week. Things are looking really good so far. Will update once I pick up again (may have a little time on Saturday).

// If allowPipeline is true, pipelineUnsupported must be false.
allowPipeline bool

// Set to true when we're an AppendEntries request is being sent on the
Contributor

Comment doesn't make sense.

@slackpad
Contributor

@ongardie-sfdc I've made a pass through everything and I think the basic structure of this is sound. Pointed out a couple tiny things in the comments.

Let's go ahead and rebase or merge with the latest branch contents and get this into issue-84-integration. Over there I'm going to make another super detailed pass through the peer logic and the unit tests, as well as try to get some benchmarks going that we can cherry-pick to the stage one branch to get a before/after.

@slackpad
Contributor

And sorry it took so long to get back to this!

@ongardie-sfdc
Author

Thanks @slackpad, that sounds good to me. I'll be on vacation next week and most of the following, but I'm going to fix up the little nits and then try to rebase or merge before I go.

@ongardie-sfdc
Author

I think 8a71deb is the last of my non-merge-related changes.

Notable merge issues include:
- New RPCHeader field
- New InstallSnapshotRequest.SnapshotVersion field
- processConfigurationLogEntry function
@ongardie-sfdc
Author

Merged issue-84-integration into ongardie/modules. My confidence that I got that merge perfectly right isn't terribly high, but it's probably good enough to call done. Unit tests and a quick run seem ok.

@slackpad
Contributor

Cool, I'll scan through it again for merge weirdness and bring it over to the branch.


@slackpad merged commit ebf12dd into hashicorp:issue-84-integration Sep 24, 2016
@ongardie
Contributor

woot
