Use gRPC to communicate the engine and agents #1426

Merged · 34 commits merged into coreos:master on Aug 12, 2016

Conversation

9 participants
@hectorj2f (Contributor) commented Feb 5, 2016

This PR aims to provide a new communication mechanism that improves performance, data transmission, and unit-state sharing between the fleet engine and the agents in a fleet cluster.

Motivation: in our infrastructure we have experienced issues with fleet in terms of scalability, performance, and fault tolerance, so we'd like to present our ideas to help improve those areas.

We use gRPC/HTTP2 as the framework to expose all the required operations (schedule unit, destroy unit, save state, ...) that allow the engine (the fleet node elected as leader) to coordinate with the agents. In this implementation we provide a new registry that stores all the information in memory; nevertheless, it is still possible to use the etcd registry.

Generally, this implementation provides two options, as mentioned above. You can use etcd if that fits your architecture or requirements better, or you can use the in-memory registry to reduce the dependency on etcd (though not to avoid etcd entirely). We found in our infrastructure that a high workload on etcd leads to poor or incorrect fleet behavior. Besides, we believe that using etcd for inter-process communication between the agents could create bottlenecks and hurt fleet's fault tolerance.
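As a rough illustration of the shape of that API (a sketch with illustrative names, not the exact interfaces in this PR), the registry operations an agent forwards to the engine could look like:

    // Sketch only: illustrative names, not this PR's actual API.
    package rpc

    import "time"

    // UnitState mirrors the transient per-unit state an agent reports.
    type UnitState struct {
        UnitName    string
        ActiveState string
        MachineID   string
        UpdatedAt   time.Time
    }

    // EngineRegistry is a hypothetical view of the operations exposed by
    // the engine's gRPC server (schedule unit, destroy unit, save state, ...).
    type EngineRegistry interface {
        ScheduleUnit(unitName, machineID string) error
        DestroyUnit(unitName string) error
        SaveUnitState(state UnitState) error
        UnitStates() ([]UnitState, error)
    }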

Additional information and plots about the motivation of this PR can be found below: #1426 (comment)

This PR is labeled WIP; we are still working on improvements, fault tolerance, bug fixing, etc. :)

NOTE: if you want to try this PR, you need to rebuild the Go dependencies, preferably using Go v1.5.1. This PR was so big that we were forced to exclude our new Go dependencies from it.

@kayrus (Contributor) commented Feb 5, 2016

@hectorj2f could you provide more info on the use cases of this PR? You are trying to implement a new protocol for fleet daemon communication and to avoid etcd because it is slow. Is that correct?

@hectorj2f (Contributor) commented Feb 5, 2016

could you provide more info on use cases of this PR?

@kayrus Yes, we'll add more information to this PR description, such as motivation, use cases, etc.

You are trying to implement a new protocol for fleet daemon communication and to avoid etcd because it is slow. Is that correct?

Generally, this implementation provides two options, as mentioned above. You can use etcd if that fits your architecture or requirements better, or you can use the in-memory registry to reduce the dependency on etcd (though not to avoid etcd entirely). We found in our infrastructure that a high workload on etcd leads to poor or incorrect fleet behavior.

@mischief (Contributor) commented Feb 5, 2016

@hectorj2f perhaps it would be worth your time to investigate using fleet with the etcd v3 API, which uses gRPC.

@hectorj2f (Contributor) commented Feb 8, 2016

@mischief yes, that wouldn't be a bad idea, although we believe that using etcd for inter-process communication between the agents could create bottlenecks and also affects fleet's fault tolerance. Perhaps @htr can shed more light on this to clarify our intention with this PR.

On the other hand, and related to your suggestion: do you have benchmarking results comparing etcd v2 vs v3? We also use etcd in other projects :).

@kayrus (Contributor) commented Feb 8, 2016

@htr (Contributor) commented Feb 9, 2016

This PR addresses issues related to our fleet usage pattern. Usually our fleet cluster isn't composed of many nodes (<50), but it is expected to handle a non-trivial (at this time) number of units (>3000).

Currently, fleet's only registry implementation uses etcd to store both transient and persistent data. More units means more unitStates being updated per second, which after some time causes instability (an update can be very quick (<50ms) but also quite slow (~1s)).

This PR implements a multiplexing layer on top of the former EtcdRegistry (a minimal engine-side sketch follows the list). This means:

  • from the agent perspective:
    • each time an engine change is detected, a gRPC connection to the new engine is established
    • any Registry operation is forwarded (via gRPC) to the engine
  • from the engine perspective:
    • every time a new engine gets "elected", it starts a gRPC server
    • when an engine loses its leadership, it stops the gRPC server, terminating existing connections
    • all the transient data is stored in memory
    • the persistent data (units, desired state, and schedule) is stored in etcd
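A minimal sketch of that engine-side lifecycle (not this PR's actual code; it assumes the google.golang.org/grpc package, and the service registration is hypothetical):

    package engine

    import (
        "net"

        "google.golang.org/grpc"
    )

    type rpcEngine struct {
        server *grpc.Server
    }

    // onElected runs when this node becomes the engine: it starts the
    // gRPC server that agents connect to.
    func (e *rpcEngine) onElected(addr string) error {
        lis, err := net.Listen("tcp", addr)
        if err != nil {
            return err
        }
        e.server = grpc.NewServer()
        // registerEngineService(e.server) // hypothetical service registration
        go e.server.Serve(lis)
        return nil
    }

    // onLeadershipLost stops the gRPC server, terminating existing agent
    // connections, which then reconnect to the newly elected engine.
    func (e *rpcEngine) onLeadershipLost() {
        if e.server != nil {
            e.server.Stop()
            e.server = nil
        }
    }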

Another issue that we encountered was the lack of cleanup (or GC) in fleet's etcd directory. This PR doesn't address that issue, but having a single writer makes the development of a solution much easier.

This implementation can coexist with the traditional etcd-centric implementation.

We have a micro benchmark (note the emphasis on micro benchmark) that we use to compare tweaks, configuration parameters, etc.:

  • start NN units spaced by 100ms and wait until NN-10 units are actually started

The time between the start API call (or desired-state change) and the actual start (each unit curls an endpoint with a unique identifier) is measured and stored. This delay is expected to increase with the number of units running in the system.
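A rough sketch of that measurement loop (our real harness differs; startUnit stands in for the start API call):

    package main

    import (
        "fmt"
        "net/http"
        "sync"
        "time"
    )

    var (
        mu      sync.Mutex
        started = map[string]time.Time{}
    )

    func main() {
        const nn = 1000

        // Each unit curls this endpoint with its unique identifier once it
        // has actually started; the delay since submission is printed.
        http.HandleFunc("/started", func(w http.ResponseWriter, r *http.Request) {
            id := r.URL.Query().Get("id")
            mu.Lock()
            if t0, ok := started[id]; ok {
                fmt.Printf("%s started after %v\n", id, time.Since(t0))
            }
            mu.Unlock()
        })
        go http.ListenAndServe(":8080", nil)

        // Start nn units spaced by 100ms, recording each submission time.
        for i := 0; i < nn; i++ {
            id := fmt.Sprintf("bench-%d", i)
            mu.Lock()
            started[id] = time.Now()
            mu.Unlock()
            startUnit(id)
            time.Sleep(100 * time.Millisecond)
        }
        select {} // keep serving callbacks (we stop once nn-10 have arrived)
    }

    // startUnit is a placeholder for the start API call, e.g. shelling out
    // to `fleetctl start <id>.service`.
    func startUnit(id string) {}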

fleet 0.11.3 with some caching: [latency plot]

this branch: [latency plot]

@htr (Contributor) commented Feb 9, 2016

perhaps it would be worth your time to investigate using fleet with the etcd v3 API, which uses gRPC.

@mischief we might, but I still think separating the storage of transient and persistent data is important.

@hectorj2f hectorj2f referenced this pull request Feb 29, 2016

Closed

Allow submission of custom fleet systemd units and timers #14

1 of 2 tasks complete

@jonboulle jonboulle added this to the v1.0.0 milestone Mar 15, 2016

@jonboulle jonboulle referenced this pull request Mar 15, 2016

Closed

Let's fix presence #746

"github.com/coreos/fleet/registry"
)

func (e *Engine) rpcLeadership(leaseTTL time.Duration, machID string) lease.Lease {

This comment has been minimized.

This comment has been minimized.

@hectorj2f (Contributor) commented Mar 17, 2016

@xiang90 Yes, this fleet version doesn't use the etcd clientv3 yet, but I always keep an eye on your changes.

@hectorj2f hectorj2f changed the title [WIP] Use gRPC to communicate the engine and agents Use gRPC to communicate the engine and agents Apr 4, 2016

@hectorj2f (Contributor) commented Apr 4, 2016

@kayrus @tixxdz @jonboulle I will update the PR to the latest version. But do you want me to include the grpc dependencies in the Godeps folder? That could make this PR harder to review due to the number of changed files.

@kayrus (Contributor) commented Apr 4, 2016

@hectorj2f Just create an additional PR for the godeps.

@hectorj2f (Contributor) commented Apr 4, 2016

@kayrus Alright, thanks 👍

@jonboulle (Contributor) commented Apr 4, 2016

Hmm, an additional commit should be fine..?


@kayrus (Contributor) commented Apr 4, 2016

@jonboulle it will be hard to review the whole PR.

@jonboulle (Contributor) commented Apr 4, 2016

Eh, having it in a separate commit seems sufficient to me, but whatever.


@hectorj2f (Contributor) commented Apr 4, 2016

@jonboulle If we do so, there will be approx. 230 changed files, and GitHub doesn't work well with PRs of that size :/.

@dongsupark (Contributor) commented Apr 12, 2016

Recently I have been looking into this PR.
First of all, I totally agree with the original problem analysis and its approach of using gRPC. This seems to be the best way to avoid a bottleneck on the etcd side. Given that etcd v3 is being implemented on top of gRPC, I think this is the way to go.

On the other hand, it was not easy for me to follow each commit in this PR, from older to newer. Is there any chance of reorganizing the PR to reduce the number of commits? (Though I can already see that it would not be trivial...)

Today I rebased this PR on top of master, fixing a minor thing to make it compile (see the branch endocode/dongsu/grpc_engine_communication), and ran the functional tests.
The good news is that it runs without errors with enable_grpc disabled.
The bad news is that 16 of 20 tests failed with enable_grpc enabled. I suppose that could be the first issue for us to investigate.

@hectorj2f (Contributor) commented Apr 12, 2016

The bad news is that 16 of 20 tests failed with enable_grpc enabled. I suppose that could be the first issue for us to investigate.

@dongsupark I will rebase this branch onto the latest version and fix the tests ASAP. I'll let you know if I find any blocking error. The goal is to have all the tests passing, or to adapt them if needed.

Is there any chance of reorganizing the PR to reduce the number of commits? (Though I can already see that it would not be trivial...)

I feel your pain. I can squash our commits, but let's do that once the tests are happy and the code is LGTMed.

@antrik (Contributor) commented Apr 12, 2016

@hectorj2f I think the point is that a messy history actually makes the review harder... While GitHub favours reviewing all commits as one big diff, that's not the traditional way to deal with pull requests, and it can be quite daunting for complex changes. It's better to clean up the history before submitting upstream, to help the review.

Note that "cleaning up" doesn't typically mean squashing everything into a single commit -- that wouldn't really help. Complex changes can usually be structured as a logical series of individual commits, each representing a meaningful change that can be explained on its own and doesn't leave anything broken...

@hectorj2f (Contributor) commented Apr 12, 2016

@antrik Alright! It makes sense; I'll group the commits and add helpful messages for them.

@dongsupark (Contributor) commented Apr 12, 2016

@hectorj2f

16 of 20 tests failed with enable_grpc enabled

Ah, sorry, this was my mistake. For some reason I had set ExecStart=/opt/fleet/fleetd --enable_grpc=1 in util/config.go, which is not supported. I should have configured enable_grpc=1 in /opt/fleet/fleet.conf. With that done, the functional tests work correctly. Sorry for the noise. (Fundamentally I don't like that nc.Fleetctl fails silently without printing stderr, but that's another issue...)
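For reference, the working configuration was simply (option name as used in this thread; the exact syntax may differ in the final merge):

    # /opt/fleet/fleet.conf
    enable_grpc=1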

As for the logical structure of the commits, I think it was not that bad in the beginning. The first 10 commits were relatively easy to understand and well-structured. It's just that at some point the PR started to grow. :-)

@hectorj2f (Contributor) commented Aug 1, 2016

@dongsupark I rebased against master, but I am getting some errors in the fleetctl tests: https://gist.github.com/hectorj2f/c53edf850f67f6279559749a30727862

Does it ring any bells? Where should I look?

@hectorj2f (Contributor) commented Aug 1, 2016

@dongsupark Should I add the LICENSE header to the new files, or would you do that once the PR is merged?

@dongsupark (Contributor) commented Aug 1, 2016

@hectorj2f That looks like an error from the unit tests; I haven't seen such an error before.
Sorry, but I'm on vacation this week. I'll be able to look into it, probably next week.

@hectorj2f (Contributor) commented Aug 1, 2016

Alright, @dongsupark! I'll see what I can do.

@hectorj2f hectorj2f force-pushed the giantswarm:grpc_engine_communication branch from 74e147a to b39e193 Aug 8, 2016

@dongsupark (Contributor) commented Aug 11, 2016

FTR, the error in the unit tests was already solved earlier this week.
The merge conflict between #1426 and #1657 was also resolved yesterday.

LGTM.
I'll create another PR just for running functional tests on semaphoreci.
If that passes, I'd like to merge #1426, #1657, and #1656 tomorrow.

@hectorj2f (Contributor) commented Aug 11, 2016

Great, @dongsupark!

@dongsupark dongsupark merged commit d96d80b into coreos:master Aug 12, 2016

1 check failed

continuous-integration/travis-ci/pr The Travis CI build could not complete due to an error