AWS: Cache instances for ELB to avoid #45050 #47410

Merged
merged 1 commit into from
Jun 16, 2017

Conversation

justinsb
Member

@justinsb justinsb commented Jun 13, 2017

We maintain a cache of all instances, and we invalidate the cache
whenever we see a new instance. For ELBs that should be sufficient,
because our usage is limited to instance ids and security groups, which
should not change.

Fix #45050
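
For illustration only, the caching idea described above can be sketched as follows. This is a minimal sketch, not the code in this PR: `instance`, `describeFunc`, `instanceCache`, `get`, and `containsAll` are hypothetical names, and `describeFunc` stands in for a full DescribeInstances listing. The cache keeps one snapshot of all instances and refreshes it only when a caller asks for an instance ID the snapshot does not contain.

```go
package main

import "sync"

// instance is a hypothetical stand-in for the fields the ELB code uses.
type instance struct {
	ID             string
	SecurityGroups []string
}

// describeFunc is a hypothetical stand-in for listing all instances.
type describeFunc func() (map[string]instance, error)

// instanceCache holds a single snapshot of all instances.
type instanceCache struct {
	mu       sync.Mutex
	describe describeFunc
	snapshot map[string]instance // nil until the first refresh
}

// get returns the requested instances. The snapshot is refreshed when it is
// empty or when a requested ID is missing from it (a "new" instance).
func (c *instanceCache) get(ids []string) (map[string]instance, error) {
	c.mu.Lock()
	defer c.mu.Unlock()

	if c.snapshot == nil || !containsAll(c.snapshot, ids) {
		fresh, err := c.describe()
		if err != nil {
			return nil, err
		}
		c.snapshot = fresh
	}

	found := make(map[string]instance, len(ids))
	for _, id := range ids {
		if inst, ok := c.snapshot[id]; ok {
			found[id] = inst
		}
	}
	return found, nil
}

// containsAll reports whether every requested ID is in the snapshot.
func containsAll(snapshot map[string]instance, ids []string) bool {
	for _, id := range ids {
		if _, ok := snapshot[id]; !ok {
			return false
		}
	}
	return true
}
```

In this sketch, repeated lookups of known instances cost no API calls, which is the property that matters once the node count grows large.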

AWS: Maintain a cache of all instances, to fix problem with > 200 nodes with ELBs

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jun 13, 2017
@k8s-github-robot k8s-github-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. release-note Denotes a PR that will be considered when it comes time to generate release notes. labels Jun 13, 2017
@k8s-github-robot k8s-github-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jun 14, 2017
@justinsb justinsb force-pushed the fix_45050 branch 2 times, most recently from f37f616 to 2ae83f0 Compare June 14, 2017 14:49
@justinsb justinsb changed the title WIP: Use instance id and caching to avoid #45050 AWS: Cache instances for ELB to avoid #45050 Jun 14, 2017
@justinsb justinsb added this to the v1.7 milestone Jun 14, 2017
@justinsb
Member Author

@k8s-bot pull-kubernetes-kubemark-e2e-gce test this

@marun marun added the sig/aws label Jun 14, 2017
// cacheCriteria holds criteria that must hold to use a cached snapshot
type cacheCriteria struct {
	MaxAge       time.Duration
	HasInstances []awsInstanceID
Member

@gnufied gnufied Jun 14, 2017

Can we document this? What does HasInstances do?

Member Author

Adding field docs
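
As a rough illustration of what the two fields might mean, here is a sketch of how such criteria could be checked against a cached snapshot. The `snapshot` type, its fields, and `meetsCriteria` are hypothetical, and the sketch assumes that a zero `MaxAge` means "no age limit"; the PR's actual field docs may say otherwise.

```go
package main

import "time"

type awsInstanceID string

// cacheCriteria holds criteria that must hold to use a cached snapshot.
type cacheCriteria struct {
	// MaxAge is the oldest snapshot we accept.
	// Assumption: zero means the snapshot never expires because of age.
	MaxAge time.Duration
	// HasInstances lists IDs that must all be present in the snapshot.
	HasInstances []awsInstanceID
}

// snapshot is a hypothetical cached result of listing all instances.
type snapshot struct {
	timestamp time.Time
	instances map[awsInstanceID]struct{}
}

// meetsCriteria reports whether the snapshot may be used instead of re-fetching.
func (s *snapshot) meetsCriteria(c cacheCriteria, now time.Time) bool {
	if c.MaxAge > 0 && now.Sub(s.timestamp) > c.MaxAge {
		return false // snapshot is too old
	}
	for _, id := range c.HasInstances {
		if _, ok := s.instances[id]; !ok {
			return false // a required instance is not in the snapshot
		}
	}
	return true
}
```

Passing `now` explicitly keeps the check easy to test without touching the real clock.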

defer c.mutex.Unlock()

// After() is technically broken by time changes until we have monotonic time
if c.snapshot != nil && c.snapshot.timestamp.After(snapshot.timestamp) {
Member

It may be worth writing c.snapshot.Younger(snapshot) so as to encapsulate our internal timekeeping?

Member Author

olderThan proved clearer, but good suggestion!
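
A helper along these lines could look like the sketch below. Only the name `olderThan` comes from this thread; `instanceSnapshot`, `instanceCache`, and `store` are hypothetical, and the real signature may differ. The point is that callers compare snapshots without touching the timestamp field directly.

```go
package main

import (
	"sync"
	"time"
)

// instanceSnapshot is a hypothetical snapshot type carrying its capture time.
type instanceSnapshot struct {
	timestamp time.Time
}

// olderThan reports whether s was captured before other.
func (s *instanceSnapshot) olderThan(other *instanceSnapshot) bool {
	return other.timestamp.After(s.timestamp)
}

// instanceCache is a hypothetical holder for the latest snapshot.
type instanceCache struct {
	mutex    sync.Mutex
	snapshot *instanceSnapshot
}

// store keeps fresh unless a concurrent caller has already stored a newer
// snapshot. It returns true if fresh was stored.
func (c *instanceCache) store(fresh *instanceSnapshot) bool {
	c.mutex.Lock()
	defer c.mutex.Unlock()

	if c.snapshot != nil && fresh.olderThan(c.snapshot) {
		// A newer result arrived first; do not overwrite it.
		return false
	}
	c.snapshot = fresh
	return true
}
```

With the comparison hidden behind `olderThan`, a later switch to a different notion of time would only touch that one method.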

	if err != nil {
		return nil, err
	}
} else {
Member

nit: we can drop this else altogether.

Member Author

I don't think we can - we only want to log if we are using a pre-existing snapshot

// After() is technically broken by time changes until we have monotonic time
if c.snapshot != nil && c.snapshot.timestamp.After(snapshot.timestamp) {
	// If this happens a lot, we could run this function in a mutex and only return one result
	glog.Infof("Not caching concurrent DescribeInstances results")
Member

We should add the word "aws" to this log message.

Member Author

Done!

if c.snapshot != nil && c.snapshot.timestamp.After(snapshot.timestamp) {
	// If this happens a lot, we could run this function in a mutex and only return one result
	glog.Infof("Not caching concurrent DescribeInstances results")
} else {
Member

nit: we can drop the else.

Member Author

Not sure we can?

@gnufied
Member

gnufied commented Jun 14, 2017

Mostly looks good to me. Some minor style suggestions and nits.

@justinsb
Member Author

Pushed suggested changes (other than the if blocks) - please take a look @gnufied :-)

@gnufied
Member

gnufied commented Jun 14, 2017

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 14, 2017
@k8s-github-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: gnufied, justinsb

Associated issue: 45050

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-github-robot k8s-github-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 15, 2017
@justinsb
Member Author

@k8s-bot pull-kubernetes-e2e-gce-etcd3 test this

@justinsb
Member Author

Re-adding lgtm after trivial rebase

@justinsb justinsb added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 15, 2017
@justinsb
Member Author

@k8s-bot pull-kubernetes-federation-e2e-gce test this

@justinsb
Member Author

/retest

@justinsb
Member Author

@k8s-bot pull-kubernetes-federation-e2e-gce test this

@justinsb
Member Author

/retest

@k8s-github-robot

Automatic merge from submit-queue (batch tested with PRs 47451, 47410, 47598, 47616, 47473)

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

running more than 200 instances on AWS breaks ELB LoadBalancers
6 participants