raft: transport package #1748

Merged
merged 1 commit into from Jan 12, 2017

4 participants
@LK4D4
Contributor

LK4D4 commented Nov 14, 2016

I'm trying to come up with a smaller, testable package for the raft transport to simplify membership.Cluster and get better coverage. It's just a start and doesn't pass the linters yet. I'd appreciate any feedback.
ping @aaronlehmann

Contributor

LK4D4 commented Nov 15, 2016

@aaronlehmann PTAL. I'm still not sure how to handle a message to an unknown peer properly. Right now I just create a peer for it, which might not be desirable on Send. Another solution would be a separate goroutine that sends messages to unknown peers. Let me know what you think.

Collaborator

aaronlehmann commented Nov 15, 2016

> Now I just create peer for it, which might not be desirable on Send

What is the downside?

codecov-io commented Nov 15, 2016

Current coverage is 55.00% (diff: 65.15%)

Merging #1748 into master will increase coverage by 0.18%

@@             master      #1748   diff @@
==========================================
  Files           103        105     +2   
  Lines         17250      17518   +268   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits           9456       9636   +180   
- Misses         6649       6724    +75   
- Partials       1145       1158    +13   

Powered by Codecov. Last update fef7386...79a5679

Contributor

LK4D4 commented Nov 15, 2016

@aaronlehmann it calls Dial (we probably need to call the health check there as well), which might block sending messages for some time.

Collaborator

aaronlehmann commented Nov 15, 2016

Yeah, that's not good. If a dedicated goroutine for sending messages to unknown peers would solve the problem, it's worth considering.

Contributor

LK4D4 commented Nov 16, 2016

@aaronlehmann PTAL
I've moved sends to unknown peers into a separate goroutine and embedded context more deeply. Also added a health check on peer add.
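
For readers following along, here is a minimal sketch of the idea discussed above: Send never dials, and a single dedicated goroutine handles messages for peers that are not registered yet. This is not the code from this PR. Every name here (`transport`, `message`, `send`, `sendUnknown`, `addPeer`, the `dial` callback, the queue size of 16) is a hypothetical stand-in, and a per-peer slice stands in for a real connection.

```go
package main

import (
	"errors"
	"sync"
)

// message is a stand-in for a raft message; only the destination matters here.
type message struct {
	To   uint64
	Body string
}

var errQueueFull = errors.New("unknown-peer queue is full")

// transport is a hypothetical, heavily simplified transport.
type transport struct {
	mu      sync.Mutex
	peers   map[uint64][]message // messages delivered per registered peer
	unknown chan message         // queue drained by the dedicated goroutine
	wg      sync.WaitGroup
	dial    func(id uint64) error // assumed to be slow and possibly blocking
}

func newTransport(dial func(id uint64) error) *transport {
	t := &transport{
		peers:   make(map[uint64][]message),
		unknown: make(chan message, 16),
		dial:    dial,
	}
	t.wg.Add(1)
	go t.sendUnknown()
	return t
}

// addPeer registers a peer whose connection is assumed to be established.
func (t *transport) addPeer(id uint64) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if _, ok := t.peers[id]; !ok {
		t.peers[id] = nil
	}
}

// send never dials. Registered peers get the message directly; anything else
// is queued for sendUnknown, so a slow dial cannot block the caller.
// Callers must not call send after stop.
func (t *transport) send(m message) error {
	t.mu.Lock()
	if _, ok := t.peers[m.To]; ok {
		t.peers[m.To] = append(t.peers[m.To], m)
		t.mu.Unlock()
		return nil
	}
	t.mu.Unlock()

	select {
	case t.unknown <- m:
		return nil
	default:
		return errQueueFull
	}
}

// sendUnknown is the only place that dials. It exits once the queue is
// closed and drained.
func (t *transport) sendUnknown() {
	defer t.wg.Done()
	for m := range t.unknown {
		if err := t.dial(m.To); err != nil {
			continue // drop the message; raft tolerates lost messages
		}
		t.mu.Lock()
		t.peers[m.To] = append(t.peers[m.To], m)
		t.mu.Unlock()
	}
}

// stop closes the queue and waits for the dedicated goroutine to finish.
func (t *transport) stop() {
	close(t.unknown)
	t.wg.Wait()
}

// delivered reports how many messages reached the given peer.
func (t *transport) delivered(id uint64) int {
	t.mu.Lock()
	defer t.mu.Unlock()
	return len(t.peers[id])
}
```

The trade-off in this sketch is that a full queue returns an error instead of blocking, which keeps the send path bounded even when dialing is slow.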

Contributor

LK4D4 commented Nov 16, 2016

@aaronlehmann PTAL, now you can wait on Done until everything is finished.
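
As an illustration of "wait on Done until everything is finished", the following is one possible shape, assuming the transport owns a cancellable context and closes a channel only after every peer goroutine has returned. `Done` is named in the comment above; `Stop`, `Finished`, `newTransport`, and the struct fields are hypothetical names, and the peer loop is reduced to waiting for cancellation.

```go
package main

import (
	"context"
	"sync"
)

// transport is a hypothetical sketch: it owns a context for its peer
// goroutines and exposes Done to signal that all of them have exited.
type transport struct {
	cancel context.CancelFunc
	done   chan struct{}

	mu       sync.Mutex
	finished int // number of peer goroutines that have returned
}

func newTransport(peers int) *transport {
	ctx, cancel := context.WithCancel(context.Background())
	t := &transport{cancel: cancel, done: make(chan struct{})}

	var wg sync.WaitGroup
	for i := 0; i < peers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			<-ctx.Done() // stand-in for a peer's send loop
			t.mu.Lock()
			t.finished++
			t.mu.Unlock()
		}()
	}

	// done is closed only after every peer goroutine has returned.
	go func() {
		wg.Wait()
		close(t.done)
	}()
	return t
}

// Stop asks all peer goroutines to exit; it does not wait for them.
func (t *transport) Stop() { t.cancel() }

// Done returns a channel that is closed once everything has finished.
func (t *transport) Done() <-chan struct{} { return t.done }

// Finished reports how many peer goroutines have returned so far.
func (t *transport) Finished() int {
	t.mu.Lock()
	defer t.mu.Unlock()
	return t.finished
}
```

A caller would invoke `Stop()` and then block on `<-Done()`, which separates requesting shutdown from waiting for it to complete.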

Collaborator

aaronlehmann commented Nov 16, 2016

That looks much better, thanks.

Collaborator

aaronlehmann commented Dec 1, 2016

I think we should move forward with trying to port the raft code to use the new transport package. If we wait, it will become harder as the code diverges.

Contributor

LK4D4 commented Dec 1, 2016

@aaronlehmann thanks! will do.

Contributor

LK4D4 commented Dec 22, 2016

@aaronlehmann I've replaced the code with the transport package. However, I blindly removed the lastSeenHost logic, and I'm not sure I made a proper replacement.
@cyli @dperny I'd love some review here. This PR is about separating the transport of raft messages from the Node structure.

Contributor

LK4D4 commented Dec 28, 2016

I believe I've fixed all outstanding bugs and now it's probably even more stable than master.

manager/state/raft/raft.go (outdated)
cc := p.conn()
p.stop()
timer := time.AfterFunc(8*time.Second, func() {
t.mu.Lock()

@aaronlehmann

aaronlehmann Jan 3, 2017

Collaborator

This should probably check t.stopped and do nothing if run has already reaped the deferred connections. I realize it probably doesn't make a difference, but it's good to avoid possible side effects after the transport has stopped running.
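
A sketch of what that suggestion could look like, assuming the transport keeps a map of deferred connections with their timers and that stopping the transport closes all of them. Only `t.mu`, `t.stopped`, and the 8-second delay come from the snippet under review; `conn`, `closeLater`, `reap`, `stop`, and the `deferred` map are hypothetical names for illustration.

```go
package main

import (
	"sync"
	"time"
)

// closeDelay matches the 8-second delay in the reviewed snippet.
const closeDelay = 8 * time.Second

// conn is a stand-in for a connection; it counts close calls so that a
// double close would be visible.
type conn struct{ closes int }

func (c *conn) close() { c.closes++ }

// transport is a hypothetical sketch of the relevant state.
type transport struct {
	mu       sync.Mutex
	stopped  bool
	deferred map[*conn]*time.Timer // connections waiting to be closed
}

func newTransport() *transport {
	return &transport{deferred: make(map[*conn]*time.Timer)}
}

// closeLater schedules cc to be closed after closeDelay. The timer is
// created while holding t.mu, so the callback cannot run reap before the
// map entry exists.
func (t *transport) closeLater(cc *conn) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.stopped {
		cc.close()
		return
	}
	t.deferred[cc] = time.AfterFunc(closeDelay, func() { t.reap(cc) })
}

// reap is the timer callback. It does nothing once the transport has
// stopped, because stop has already closed every deferred connection.
func (t *transport) reap(cc *conn) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.stopped {
		return
	}
	if _, ok := t.deferred[cc]; ok {
		cc.close()
		delete(t.deferred, cc)
	}
}

// stop marks the transport as stopped and reaps all deferred connections.
func (t *transport) stop() {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.stopped = true
	for cc, timer := range t.deferred {
		timer.Stop()
		cc.close()
		delete(t.deferred, cc)
	}
}
```

`timer.Stop()` does not help if the callback has already started and is waiting on the mutex, so the `stopped` check is what prevents a late callback from touching state after shutdown.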

@LK4D4 LK4D4 changed the title from [WIP] raft: transport package to raft: transport package Jan 4, 2017

Contributor

LK4D4 commented Jan 4, 2017

Heh, failure probably because of healthcheck removal :/ Will debug tomorrow.

Contributor

LK4D4 commented Jan 4, 2017

The controlapi tests don't redirect requests to the leader. That's why they failed.

Contributor

LK4D4 commented Jan 4, 2017

@aaronlehmann @cyli Fixed!
I'll try with docker now.

Collaborator

aaronlehmann commented Jan 7, 2017

Needs a rebase :)

Contributor

LK4D4 commented Jan 7, 2017

OK, it passes docker integration. Will fix the comments on Monday.

Contributor

LK4D4 commented Jan 9, 2017

@aaronlehmann @cyli I've split logic from restoreFromSnapshot.
@cyli I wasn't able to simplify the map handling much. A full Clear() would require some sort of synchronization with the transport, and it would also cause a brief "quorum loss" for the functions that check for it. The alternative would be to hold the members lock for the whole snapshot operation, but I don't like how that looks.
Thanks!

@cyli

cyli approved these changes Jan 9, 2017

LGTM

Contributor

LK4D4 commented Jan 9, 2017

@aaronlehmann Fixed, thanks! Also, I have 52 sequential passes of the integration test on my machine so far.

Contributor

LK4D4 commented Jan 10, 2017

@aaronlehmann PTAL. I've added address change handling.

Collaborator

aaronlehmann commented Jan 11, 2017

I tested the address change detection and it seems to work, except for the spammy log message.

Contributor

LK4D4 commented Jan 11, 2017

@aaronlehmann I've fixed the message and your other comments. Thanks for the review and testing!

@aaronlehmann

LGTM

raft: introducing transport package
This package is a separate grpc transport layer for the raft package. Before, we
used the membership package plus one very big method in the raft package.

Signed-off-by: Alexander Morozov <lk4d4math@gmail.com>

Contributor

LK4D4 commented Jan 12, 2017

OK, I'm trying one last time with docker and then merging.

Contributor

LK4D4 commented Jan 12, 2017

Only TestSwarmNetworkPlugin fails which is expected after #1856

@LK4D4 LK4D4 merged commit 50f82a9 into docker:master Jan 12, 2017

3 checks passed

ci/circleci Your tests passed on CircleCI!
codecov/project 55.00% (target 0.00%)
dco-signed All commits are signed

@LK4D4 LK4D4 deleted the LK4D4:raft_transport branch Jan 12, 2017
