AWS: API RequestLimit delayer was not triggering #22906
I'm running into a rate limit issue by setting
@mzupan can you post / send me the relevant bits of your kube-controller-manager log (/var/log/kube-controller-manager.log) from the master? We increased the logging in 1.2, so it should now be much more informative.
@justinsb is there a set level you want to see? Right now most of the logging is set to
@mzupan I think that in 1.2, when we hit the RateLimit we log the failing API request as a warning: https://github.com/kubernetes/kubernetes/blob/master/pkg/cloudprovider/providers/aws/retry_handler.go#L86. So I hope you basically can't turn it off :-) I'm really trying to figure out which particular call is hitting the rate limit, whether it is an ELB call or something else. And of course there may be other hints in the log. We can start with v=0; I believe that still logs warnings!
I've been digging around too, and just enabled AWS on the kubelet to figure out what process was causing a limit to be hit. On each kubelet I notice this:
That happens once per second per node. The only services I'm running are normal k8s services like internal redis instances and NodePorts. I'm not running any load balancers that would try to create an ELB or expose an external IP.
@mzupan do you have any logs that show a rate limit error? I'm not sure whether the "Could not determine public IP from AWS metadata." message is related. As I recall, that happens when your instances don't have public IPs, and we don't have any great way to determine the cause - we just get back a generic error. I take it you aren't giving your nodes public IPs? I'll look into why it is happening so often, though...
So I have AWS enabled for the controller and kubelet, and so far nothing has turned up. Another issue is that the controller is looking up ELBs for services that aren't of type LoadBalancer.

I have yet to see any throttling errors in the controller.
Opened two new issues for the two other problems you've found, @mzupan - let's try to keep this bug about rate limits / throttling :-)
A little bit of an update: I ran 1.2.1-beta.0 for most of the weekend, and I don't think I'm hitting the API limits anymore. Before, I would see limit errors when I browsed the console; I have yet to see one since.
OK, finally found my issue. We had probably 50 or so services in our cluster. We don't have any LoadBalancer-type services, so we never gave the IAM role any access to ELBs. Looking at the code, my guess is that it checks whether a service has an ELB before making any edits. Once I granted the IAM role that access, the errors went away.
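For anyone hitting the same thing, the kind of read-only ELB permission being discussed would be granted with an IAM policy along these lines (a sketch only; the exact set of elasticloadbalancing actions the controller needs may differ by version):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["elasticloadbalancing:Describe*"],
      "Resource": "*"
    }
  ]
}
```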
Sounds like #25401
Issues go stale after 30d of inactivity. Prevent issues from auto-closing with an /lifecycle frozen comment. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Stale issues rot after 30d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Rotten issues close after 30d of inactivity. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
@benmcrae reported that while trying beta.0 he was hitting the AWS ELB rate-limit-exceeded error because of a security group tagging problem:

Error creating load balancer (will retry): Failed to create load balancer for service default/my-nginx: error creating AWS loadbalancer listeners: Throttling: Rate exceeded

However, he did not see "Inserting delay before AWS request (%s) to avoid RequestLimitExceeded: %s" in the kube-controller-manager logs. The error itself isn't unexpected, because the fix in 5b3bb56 did not make beta.0 (it did make beta.1).
But we would still have wanted to see some AWS API throttling handling show up. Possible solutions:
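One direction worth checking (purely a sketch of the idea, not the actual change in 5b3bb56): a delayer keyed only on the EC2-style RequestLimitExceeded code would never fire for ELB calls, since ELB reports throttling as "Throttling: Rate exceeded". Recognizing both codes might look like this; isThrottlingError is a hypothetical helper:

```go
package main

import "fmt"

// isThrottlingError is a hypothetical helper: it reports whether an AWS
// error code indicates rate limiting. EC2 uses "RequestLimitExceeded",
// while ELB reports the code "Throttling" (message "Rate exceeded").
func isThrottlingError(code string) bool {
	switch code {
	case "RequestLimitExceeded", "Throttling":
		return true
	}
	return false
}

func main() {
	for _, code := range []string{"RequestLimitExceeded", "Throttling", "AccessDenied"} {
		fmt.Printf("%s -> %v\n", code, isThrottlingError(code))
	}
}
```

A check like this, placed where the retry handler inspects the error, would let the cross-request delayer trigger on ELB throttling as well as EC2 throttling.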