fix: AWS credential fallback robust against AWS_PROFILE set to missing profile#1104
fix: AWS credential fallback robust against AWS_PROFILE set to missing profile#1104
Conversation
…nfig
When a specified AWS profile is not found, the fallback was using
WithSharedConfigProfile("default") which still requires a [default]
section in ~/.aws/config. EC2 users with no config file at all would
panic with "failed to get shared config profile, default".
Remove the explicit profile override in the fallback so the SDK uses
its default credential chain (env vars, shared credentials, EC2 IMDS)
without requiring any profile to exist in config files.
Fixes #1094
Signed-off-by: yxxhero <aiopsclub@163.com>
There was a problem hiding this comment.
Pull request overview
Adjust AWS config loading so that when an explicitly requested profile is missing, vals falls back to the AWS SDK’s default credential chain without requiring a [default] profile/config file to exist—addressing EC2/IAM-role environments (Issue #1094).
Changes:
- Remove the explicit
WithSharedConfigProfile("default")from the profile-not-found fallback path. - Update tests to focus on the “no config present”/EC2-style scenario by replacing the prior regression test.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
pkg/awsclicompat/session.go |
Updates the profile-not-found fallback to retry without forcing the "default" profile. |
pkg/awsclicompat/session_test.go |
Replaces the prior AWS_PROFILE-env regression test with a new fallback test targeting EC2/no-config conditions. |
| @@ -159,13 +159,15 @@ func newConfig(ctx context.Context, region string, profile string, logLevel stri | |||
| if err != nil { | |||
| // If the explicit profile parameter doesn't exist, fall back to the default | |||
| // credential chain (e.g. EC2 instance profile, environment variables, etc.). | |||
| // Explicitly set profile to "default" so that AWS_PROFILE from the environment | |||
| // is not honored during the fallback (it might be set to the same missing profile | |||
| // and cause the same error). Using "default" falls through to the standard credential | |||
| // chain (env vars, EC2 IMDS, etc.) when no credentials are configured in the profile. | |||
| // | |||
| // Do NOT use WithSharedConfigProfile("default") here: that still requires the | |||
| // profile to exist in ~/.aws/config or ~/.aws/credentials. EC2 users may have | |||
| // no AWS config files at all, relying entirely on the instance's IAM role. | |||
| // Without an explicit profile option the SDK falls through its default | |||
| // credential chain (env vars, shared credentials, EC2 IMDS, etc.). | |||
| var profileNotExist config.SharedConfigProfileNotExistError | |||
| if profile != "" && errors.As(err, &profileNotExist) { | |||
| return config.LoadDefaultConfig(ctx, append(baseOpts, config.WithSharedConfigProfile("default"))...) | |||
| return config.LoadDefaultConfig(ctx, baseOpts...) | |||
| } | |||
There was a problem hiding this comment.
The fallback retry (LoadDefaultConfig(ctx, baseOpts...)) no longer overrides profile selection, so if AWS_PROFILE is set (e.g. to the same missing profile), the SDK can still attempt to load that profile and return SharedConfigProfileNotExistError again. This reintroduces the original regression that the previous code avoided by forcing a different profile. Consider adding a retry that explicitly ignores AWS_PROFILE for this call (e.g., by using a credentials/provider path that bypasses shared config profile selection, or another SDK option/approach that prevents env profile from being honored) so the fallback is robust in shells where AWS_PROFILE is set.
| // TestNewConfigProfileNotFoundFallbackNoConfigFile verifies that the fallback works | ||
| // when no ~/.aws/config exists at all (e.g., EC2 instances relying on instance profiles). | ||
| // This is the primary scenario from https://github.com/helmfile/vals/issues/1094. | ||
| func TestNewConfigProfileNotFoundFallbackNoConfigFile(t *testing.T) { | ||
| const nonExistentProfile = "vals-test-profile-does-not-exist-12345" | ||
|
|
||
| configFile, err := os.CreateTemp(t.TempDir(), "aws-config-*") | ||
| if err != nil { | ||
| t.Fatalf("creating temp AWS config file: %v", err) | ||
| } | ||
| if _, err := configFile.WriteString("[default]\n"); err != nil { | ||
| t.Fatalf("writing temp AWS config file: %v", err) | ||
| } | ||
| configFile.Close() | ||
|
|
||
| credFile, err := os.CreateTemp(t.TempDir(), "aws-credentials-*") | ||
| if err != nil { | ||
| t.Fatalf("creating temp AWS credentials file: %v", err) | ||
| } | ||
| if _, err := credFile.WriteString("[default]\n"); err != nil { | ||
| t.Fatalf("writing temp AWS credentials file: %v", err) | ||
| } | ||
| credFile.Close() | ||
|
|
||
| t.Setenv("AWS_CONFIG_FILE", configFile.Name()) | ||
| t.Setenv("AWS_SHARED_CREDENTIALS_FILE", credFile.Name()) | ||
| t.Setenv("AWS_DEFAULT_REGION", "us-east-1") |
There was a problem hiding this comment.
This test name/doc says "no ~/.aws/config exists", but it actually creates empty temp config/credentials files and points AWS_CONFIG_FILE/AWS_SHARED_CREDENTIALS_FILE at them. Either adjust the test to truly exercise the missing-file case (don’t create the files / point env vars at non-existent paths while isolating from the real home dir), or rename the test to reflect that it’s covering empty config files rather than absent ones.
| t.Fatalf("newConfig should fall back even when AWS_PROFILE env var points to the missing profile, got error: %v", fallbackErr) | ||
| t.Fatalf("newConfig should fall back to default credentials when profile is not found and no config file exists, got error: %v", fallbackErr) | ||
| } | ||
| } |
There was a problem hiding this comment.
The previous regression coverage for the case where AWS_PROFILE is set to the same missing profile was removed. With the new fallback behavior (no explicit profile), that scenario is likely to fail again and is currently untested. Consider re-adding a dedicated test that sets AWS_PROFILE to the missing profile and asserts newConfig still falls back successfully (or, if the intended behavior changed, document it and keep a test asserting the new behavior).
| } | |
| } | |
| // TestNewConfigProfileNotFoundFallbackNoConfigFileWithAWSProfileEnv verifies that | |
| // the fallback still works when AWS_PROFILE is set to the same missing profile. | |
| // This guards the regression case where ambient profile selection can block fallback. | |
| func TestNewConfigProfileNotFoundFallbackNoConfigFileWithAWSProfileEnv(t *testing.T) { | |
| const nonExistentProfile = "vals-test-profile-does-not-exist-12345" | |
| configFile, err := os.CreateTemp(t.TempDir(), "aws-config-*") | |
| if err != nil { | |
| t.Fatalf("creating temp AWS config file: %v", err) | |
| } | |
| configFile.Close() | |
| credFile, err := os.CreateTemp(t.TempDir(), "aws-credentials-*") | |
| if err != nil { | |
| t.Fatalf("creating temp AWS credentials file: %v", err) | |
| } | |
| credFile.Close() | |
| t.Setenv("AWS_CONFIG_FILE", configFile.Name()) | |
| t.Setenv("AWS_SHARED_CREDENTIALS_FILE", credFile.Name()) | |
| t.Setenv("AWS_DEFAULT_REGION", "us-east-1") | |
| t.Setenv("AWS_PROFILE", nonExistentProfile) | |
| t.Setenv("FORCE_AWS_PROFILE", "") | |
| t.Setenv("AWS_SDK_LOAD_CONFIG", "") | |
| t.Setenv("AWS_ACCESS_KEY_ID", "") | |
| t.Setenv("AWS_SECRET_ACCESS_KEY", "") | |
| t.Setenv("AWS_SESSION_TOKEN", "") | |
| ctx := context.Background() | |
| _, fallbackErr := newConfig(ctx, "us-east-1", nonExistentProfile, "") | |
| if fallbackErr != nil { | |
| t.Fatalf("newConfig should fall back to default credentials when AWS_PROFILE is set to the same missing profile and no config file exists, got error: %v", fallbackErr) | |
| } | |
| } |
|
This appears to resolve the issue I was having! I built |
|
@copilot fix comments. |
|
@Kellen275 do you have any good idea? |
Agent-Logs-Url: https://github.com/helmfile/vals/sessions/58c724eb-18ad-4e34-b1ed-9eb28fccfd2a Co-authored-by: yxxhero <11087727+yxxhero@users.noreply.github.com>
Fixed in commit 1882cbc. Here's a summary of the changes:
|
|
@copilot fix ci issue. |
Agent-Logs-Url: https://github.com/helmfile/vals/sessions/00c2fd7d-20cf-42d6-a319-e391f15517e2 Co-authored-by: yxxhero <11087727+yxxhero@users.noreply.github.com>
|
@yxxhero I tried this out again with the recent changes and it still appears to work for my use-case 👍
Was there something in particular you were looking for feedback on? |
Signed-off-by: Kellen Sappington <kellen275@gmail.com>
|
@yxxhero I believe this is ready for a final review by the way. (note my MR that was merged earlier, was intentionally targeting this branch, rather than |
When
profile=devis passed to vals andAWS_PROFILE=devis also set in the shell, the profile-not-found fallback would callLoadDefaultConfigwithout an explicit profile option — but the AWS SDK v2 checksos.Getenv("AWS_PROFILE")directly insideresolveConfigLoadersand switches to its strict loader (loadSharedConfig) whenever that env var is non-empty. The result: the fallback hits the sameSharedConfigProfileNotExistErrorit was designed to escape.Changes
pkg/awsclicompat/session.go: Replace the inline fallbackLoadDefaultConfigcall withloadDefaultConfigWithoutProfileEnv, which temporarily clearsAWS_PROFILE/AWS_DEFAULT_PROFILEunder a package-level mutex before callingLoadDefaultConfig. This forces the SDK onto its lenientloadSharedConfigIgnoreNotExistpath so it falls through to env-key, EC2 IMDS, or other ambient credentials.pkg/awsclicompat/session_test.go:TestNewConfigProfileNotFoundFallbackNoConfigFileto point env vars at truly non-existent paths (previously it created empty temp files, which don't accurately represent the EC2 no-config-file scenario).TestNewConfigProfileNotFoundFallbackWithAWSProfileEnv— regression guard for the case whereAWS_PROFILEis set to the same missing profile.