Route creation reconciler loop. #8164

Merged
merged 3 commits into from May 22, 2015

cjcullen
Member

This teases apart node registration from cloudprovider routing configuration. It also makes route configuration robust against controller crashes and transient cloudprovider failures.

Right now, I just have this naively syncing every 10 seconds. It would be nicer to react to changes in the NodeList, and then just run a background sync every few minutes. (@lavalamp, I'm told you're the expert on that :) )

I'd like to wait for #7984 to get in and then rebase this before merging.

}).Do()
if err != nil {
	return err
}
if err := gce.waitForGlobalOp(insertOp); err != nil {
	if gapiErr, ok := err.(*googleapi.Error); ok && gapiErr.Code == http.StatusConflict {
		// TODO (cjcullen): Make this actually check the route is correct.
Member Author

The resolution of this TODO is that the reconciler will now remove an incorrect route and create a correct one. Any transient 409s or 404s from reconciling will be logged, but will not prevent progress.
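
A rough sketch of that reconcile step, assuming a simplified model in which each node should have exactly one route matching its CIDR. The `Route` type, `planRoutes`, and `apply` are hypothetical illustrations and not the types or functions in this PR.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"sort"
)

// Route is a hypothetical, simplified stand-in for a cloudprovider route.
type Route struct {
	Name       string
	TargetNode string
	CIDR       string
}

// planRoutes compares the desired node CIDRs with the existing routes and
// returns the routes to create and the routes to delete.
func planRoutes(nodeCIDRs map[string]string, existing []Route) (create, remove []Route) {
	correct := make(map[string]bool)
	for _, r := range existing {
		if cidr, ok := nodeCIDRs[r.TargetNode]; ok && cidr == r.CIDR && !correct[r.TargetNode] {
			correct[r.TargetNode] = true // route already matches its node
			continue
		}
		remove = append(remove, r) // stale, duplicate, or wrong CIDR
	}
	nodes := make([]string, 0, len(nodeCIDRs))
	for node := range nodeCIDRs {
		nodes = append(nodes, node)
	}
	sort.Strings(nodes) // deterministic order
	for _, node := range nodes {
		if !correct[node] {
			create = append(create, Route{Name: "route-" + node, TargetNode: node, CIDR: nodeCIDRs[node]})
		}
	}
	return create, remove
}

// apply runs fn for each route and logs failures instead of returning them,
// so one transient error does not block the remaining work.
func apply(action string, routes []Route, fn func(Route) error) (failed int) {
	for _, r := range routes {
		if err := fn(r); err != nil {
			log.Printf("could not %s route %s: %v", action, r.Name, err)
			failed++
		}
	}
	return failed
}

func main() {
	nodeCIDRs := map[string]string{"node-a": "10.0.0.0/24", "node-b": "10.0.1.0/24"}
	existing := []Route{{Name: "stale", TargetNode: "node-a", CIDR: "10.9.9.0/24"}}
	create, remove := planRoutes(nodeCIDRs, existing)
	failed := apply("delete", remove, func(Route) error { return errors.New("409 conflict (simulated)") })
	failed += apply("create", create, func(Route) error { return nil })
	fmt.Printf("create=%d delete=%d failed=%d\n", len(create), len(remove), failed)
}
```

Because failures are only logged, the next reconcile pass sees the same difference and retries it.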

@davidopp
Member

(Other than that, LGTM.)

@cjcullen force-pushed the cloudprovider branch 2 times, most recently from fdcceb5 to d58c7aa, May 18, 2015 17:40
if err := rc.reconcile(testCase.nodes, testCase.initialRoutes); err != nil {
	t.Errorf("%d. Error from rc.reconcile(): %v", i, err)
}
time.Sleep(10 * time.Millisecond)
Member Author

I feel gross about this. The reconciler kicks off goroutines to add/delete routes in the cloudprovider, so immediately checking the state of the fake_cloud in the main thread will show that nothing has been updated. A sleep of pretty much any length is enough to allow the background threads to update the in-memory representation, but it feels so wrong.

Any suggestions on a better way?

Notify via a channel that one is complete.

@dchen1107
Member

@cjcullen Have you run e2e test with your change yet?

@cjcullen
Member Author

Yes. All green (1 time). This was already rebased on top of #6949 though :/

@cjcullen
Member Author

Rebased against #6949. Now running e2e's (over and over again).

@cjcullen
Member Author

2 for 2 on e2es so far after the rebase (tearing down and rebuilding each time)

@cjcullen
Member Author

This already got an LGTM before I rebased. Any opposition to me adding the status/LGTM tag?

if err != nil {
	return fmt.Errorf("error listing routes: %v", err)
}
nodeList, err := rc.kubeClient.Nodes().List(labels.Everything(), fields.Everything())
Member

// TODO: use pkg/controller/framework.NewInformer to watch this and reduce the number of lists needed.

Member Author

Done.

@lavalamp
Member

LGTM, with a few requests for comments.

@lavalamp
Member

Thank you very much, unfortunately I noticed one more thing. Sorry!

tick := time.Tick(10 * time.Millisecond)
poll:
for {
	select {
Member Author

@lavalamp is this more like what you were thinking?

Member

Yes, thank you, much better!

Rename reconcilePodCIDRs to reconcileNodeCIDRs.
Add comments and TODOs about using controller framework.
@lavalamp added the lgtm label May 22, 2015
@lavalamp
Member

LGTM

@dchen1107
Member

LGTM, but still, @cjcullen, have you run e2e with this one yet?

cc/ @justinsb on aws, and @derekwaynecarr on vagrant. Just an FYI.

@cjcullen
Member Author

e2e's were solid green 2 days ago. I'll rebase and run another round (or 3) to get a little more confidence that this will play nice with what we have at head.

@dchen1107
Member

@cjcullen Thanks for your patience.

dchen1107 added a commit that referenced this pull request May 22, 2015
Route creation reconciler loop.
@dchen1107 dchen1107 merged commit 677a4aa into kubernetes:master May 22, 2015