Skip to content
This repository has been archived by the owner on Nov 1, 2022. It is now read-only.

prevents AWS errors when not using AWS #1810

Closed
wants to merge 1 commit into from

Conversation

dimitropoulos
Copy link
Contributor

@dimitropoulos dimitropoulos commented Mar 8, 2019

I am not using Flux with AWS. Currently, every time I start Flux, the logs report warnings related to AWS configuration-related failures.

here is a snippet of the very beginning of booting up flux:

ts=2019-03-08T20:05:54.641240853Z caller=main.go:156 version=1.10.1
ts=2019-03-08T20:05:54.671565696Z caller=main.go:247 component=cluster identity=/etc/fluxd/ssh/identity
ts=2019-03-08T20:05:54.67163768Z caller=main.go:248 component=cluster identity.pub="ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDfuUBY1CfPRnAdmjZi8+GQvLF6s8E/F+O5Pg2Sxtxu6vy7/aok/ULNh/ij7QQ0koWAmU852eTlkuzTAy44JJVrhapkHhTfEoBv3CrsgjpkTHp9lBZNmNtvEecYmDgsbjVjZfHFvd27NJf8YtW+lZ8XiVIq4uwb+StO24woYfrToD+p4TbBmgNBHoTHqvfagDC7bRoKfvXFSDBKOHuqeBKihX8etSM+RlHv+KTLVBjTjjfaji3taj0znWh2DqBmlmB2DzCB1Q2j8IwZ6VRNIjshBTEwBKBy9OX4mS0KEN9ugZsxABKVDNjdvqheNQE00h0MunzMMx6C+6Zfj4wmApNT"
ts=2019-03-08T20:05:54.67168016Z caller=main.go:249 component=cluster host=https://10.96.0.1:443 version=kubernetes-v1.13.2
ts=2019-03-08T20:05:54.671740974Z caller=main.go:261 component=cluster kubectl=/usr/local/bin/kubectl
ts=2019-03-08T20:05:54.673251798Z caller=main.go:269 component=cluster ping=true
ts=2019-03-08T20:06:06.945387204Z caller=aws.go:69 component=aws warn="no AWS region configured, or detected as cluster region" err="RequestError: send request failed\ncaused by: Get http://169.254.169.254/latest/meta-data/placement/availability-zone: dial tcp 169.254.169.254:80: connect: no route to host"
ts=2019-03-08T20:06:06.945595205Z caller=main.go:288 warning="AWS authorization not used; pre-flight check failed"

It seems reasonable (and simple enough) to me that we should try to determine if the user is even using AWS at all before reporting errors that AWS is apparently misconfigured.

I'm not entirely sure that the code here is exactly what we want to do, but it appears (in my testing) to serve the purpose of preventing logging warnings where we shouldn't.

imageCreds = credsWithAWSAuth
usingAWS := false
for _, awsOption := range awsOptions {
if fs.Changed(awsOption) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This breaks Weave Cloud and any other user that relies on Flux to automatically setup ECR auth.

imageCreds = credsWithDefaults
imageCreds = credsWithAWSAuth
}
if *dockerConfig != "" {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using a docker config is an alternative to ECR auth so it should not depend on AWS being enabled or nor.

@stefanprodan
Copy link
Member

I think the log suppression should happen inside ImageCredsWithAWSAuth. We want to make it easy for people using ECR to debug any miss configuration. We should find a way to determine if Flux is running on EKS before trying to fetch the credentials.

@squaremo
Copy link
Member

It seems reasonable (and simple enough) to me that we should try to determine if the user is even using AWS at all before reporting errors that AWS is apparently misconfigured.

That is what is happening -- checking for the region using the AWS API lib = is the user even using AWS at all.

@dimitropoulos
Copy link
Contributor Author

@squaremo then what are these two log lines happening for:

ts=2019-03-08T20:06:06.945387204Z caller=aws.go:69 component=aws warn="no AWS region configured, or detected as cluster region" err="RequestError: send request failed\ncaused by: Get http://169.254.169.254/latest/meta-data/placement/availability-zone: dial tcp 169.254.169.254:80: connect: no route to host"
ts=2019-03-08T20:06:06.945595205Z caller=main.go:288 warning="AWS authorization not used; pre-flight check failed"

To be clear, (as a user of flux) if I am not attempting to use AWS at all, I would expect that I do not get these log lines at all. If you disagree with that let me know.

@squaremo
Copy link
Member

@dimitropoulos They are reporting that the AWS API is not reachable. If fluxd is running in AWS, then you'll want to know about that. In general it won't know if it is running in AWS, without either

  1. checking; or
  2. requiring an argument that explicitly says it's running in AWS, even though it can figure it out.

I think your objection is that you read it as something going wrong. One mitigation might be to suppress the report of the specific error, unless expressly asked to report it. Then AWS users can see if something is amiss, and troubleshoot it (although at that point, they may as well just exec into the pod).

@rade
Copy link
Contributor

rade commented Mar 11, 2019

How about appending "(this can be ignored when not running on AWS)" or similar to the warnings?

@dimitropoulos
Copy link
Contributor Author

dimitropoulos commented Mar 11, 2019

@rade that's better... but consider if we enable support for other providers (yes, yes, I know this is a partial "slippery slope" fallacy). should we have errors for Azure, GCP, Digital Ocean, etc. when not using those?

I'm not sure of the right approach to take on the code, I acknowledge that, but I can say that getting errors/warnings for AWS when I'm not using AWS seems to me to be something we should be able to prevent.


@squaremo re:

I think your objection is that you read it as something going wrong.

To me it clearly says something is going wrong. Whether it's intended to be scary or not, seeing things like err="RequestError: send request failed and warning=...pre-flight check failed" make me feel like I've done something wrong. that is the essence of what I'm talking about with this item.

Is there some way we can do a little better on distinguishing between:

  1. (true negative) the user has not configured AWS although they intended to
  2. (false negative) the user has not configured AWS and did not intend to
  3. (false positive) the user has configured AWS although they didn't mean to (e.g. providing a registry other than ECR that isn't actually supported by accident when trying to see if some other registry will work)
  4. (true positive) the user has configured AWS and intended to (perhaps in this case we would want to spit out a message saying everything on the AWS side connected just fine?)

Right now, we're just handling all of these as the same. What I'm trying to express with this issue/PR is that I think we can do better. I gave an example as to a first shot at how, but @stefanprodan pointed out some deeper flaws in that so at this point @squaremo I'd say either close my issue or perhaps suggest some way to better distinguish things. I think a flag for what cloud provider they're intending to use, despite being more boilerplate, is not ridiculous (but only in the situation where there's no other way to figure out what their intent may be).

@2opremio suggested he was "thinking about fixing [this] too" so maybe he has a suggestion.

@rade
Copy link
Contributor

rade commented Mar 11, 2019

I offered my solution as something that improves the status quo w/o suffering from the problem @squaremo highlighted.

@squaremo
Copy link
Member

Whether it's intended to be scary or not, seeing things like err="RequestError: send request failed and warning=...pre-flight check failed" make me feel like I've done something wrong

I do get what you mean. If you're running in GCP, the error printed out is quite prominent. So I'm for making this less scary-looking. I'm also for fluxd not requiring explicit instructions when they are not needed.

@squaremo
Copy link
Member

squaremo commented Apr 9, 2019

#1863 removes the scary AWS logs unless you ask for them (and lets you filter ECR images even if you're not running in AWS). I think that covers off the motivation for this PR.

@squaremo squaremo closed this Apr 9, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants