Skip to content

Comments

Fix unit test delays#1731

Merged
lionello merged 3 commits intomainfrom
lio/fix-tests
Dec 23, 2025
Merged

Fix unit test delays#1731
lionello merged 3 commits intomainfrom
lio/fix-tests

Conversation

@lionello
Copy link
Member

@lionello lionello commented Dec 22, 2025

Description

Many unit tests would have "arbitrary" delays related to HTTP requests, but in particular the AWS calls would try to resolve credentials from an EC2 IMDS and time out. Most of those were related to doing STS at several points in the code.

The existing mocking form STS (and others) was not consistent. Now using an overridable global factory function, eg. NewStsFromConfig which can be overridden to return a mock.

Linked Issues

Similar fix in https://github.com/DefangLabs/defang-mvp/pull/2552

Checklist

  • I have performed a self-review of my code
  • I have added appropriate tests
  • I have updated the Defang CLI docs and/or README to reflect my changes, if necessary

Summary by CodeRabbit

  • New Features

    • Debug messages now output to the error stream for proper log separation.
  • Bug Fixes

    • AWS credentials are now fetched on-demand, ensuring up-to-date credential handling rather than using potentially stale cached clients.
  • Refactor

    • Streamlined AWS service client initialization patterns for improved consistency and testability.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 22, 2025

Walkthrough

Refactors AWS client creation to use factory variables and interfaces for STS and SSM, removes global STS client, updates tests to use TestMain and centralized mocks, standardizes test region to "us-test-2", adjusts Route53 test API to zone-aware variant, and redirects debug output to stderr.

Changes

Cohort / File(s) Change Summary
STS client abstraction & factories
src/pkg/clouds/aws/sts.go
Added StsClientAPI interface, NewStsFromConfig factory var, and MockStsClientAPI mock implementation.
SSM factory
src/pkg/clouds/aws/secrets.go
Added NewSsmFromConfig factory var; removed SsmClientOverride and internal newFromConfig helper.
Byoc AWS client usage
src/pkg/cli/client/byoc/aws/byoc.go
Removed StsProviderAPI and global StsClient; introduced per-call STS client creation via NewStsFromConfig(cfg) and added type Config = awssdk.Config alias.
Load config STS usage
src/pkg/clouds/aws/common.go
Switched STS construction to use NewStsFromConfig(cfg) for GetCallerIdentity.
Public ECR constructor change
src/pkg/clouds/aws/publicecr.go, src/pkg/clouds/aws/publicecr_test.go
Converted newPublicECRClientFromConfig from a named function to a var holding a function; removed PublicECRClientOverride and updated tests to use the new constructor pattern and NewStsFromConfig for mocks.
Test initialization & mock wiring
src/cmd/cli/command/commands_test.go, src/pkg/clouds/aws/publicecr_test.go
Replaced init() with TestMain in commands_test.go; centralized mocking via NewStsFromConfig/NewSsmFromConfig with t.Cleanup/restoration; adjusted mock capture/restore across tests.
Region/test data updates
src/cmd/cli/command/stack_test.go, src/pkg/agent/tools/current_stack_test.go
Updated test region values from us-west-2 to us-test-2 and adjusted expected output strings accordingly.
Route53 API & tests
src/pkg/clouds/aws/route53_test.go
Updated calls from GetDelegationSet(ctx, r53Client) to GetDelegationSetByZone(ctx, zone, r53Client) (using nil zone in tests).
Debug output stream change
src/pkg/term/colorizer.go, src/pkg/term/colorizer_test.go
Redirected Debug/Debugf output to stderr (t.err) and updated tests to assert both stdout (expectedOut) and stderr (expectedErr) lines.

Sequence Diagram(s)

sequenceDiagram
  participant CLI
  participant BYOC as byoc/aws
  participant STS as STS Client (NewStsFromConfig)
  participant AWS as AWS STS Service

  CLI->>BYOC: Request caller identity / assume role
  BYOC->>STS: NewStsFromConfig(cfg) -> create STS client
  BYOC->>STS: GetCallerIdentity / AssumeRole call
  STS->>AWS: RPC GetCallerIdentity / AssumeRole
  AWS-->>STS: Return identity / credentials
  STS-->>BYOC: Return result
  BYOC-->>CLI: Return identity / credentials
  note right of BYOC `#f0f4c3`: Per-call STS client replaces global singleton
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60–75 minutes

Suggested reviewers

  • jordanstephens

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 15.38% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Fix unit test delays' refers to the core issue addressed in the PR (IMDS timeouts during unit testing), and this is reflected in multiple file changes that introduce mocking infrastructure and update test initialization.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch lio/fix-tests

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.5.0)

level=warning msg="[linters_context] running gomodguard failed: unable to read module file go.mod: current working directory must have a go.mod file: if you are not using go modules it is suggested to disable this linter"
level=error msg="[linters_context] typechecking error: pattern ./...: directory prefix . does not contain main module or its selected dependencies"


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/pkg/clouds/aws/route53_test.go (2)

61-70: Same nil zone ID issue affects delegation set retrieval.

The second call to GetDelegationSetByZone(ctx, nil, r53Client) has the same issue as line 39. A valid zone ID is needed for the AWS API call to succeed. This prevents the test from verifying that the created delegation set can be retrieved correctly.


39-44: GetDelegationSetByZone never returns ErrNoDelegationSetFound; passing nil causes AWS validation errors instead.

The function passes the zoneId directly to the AWS API without converting errors. When nil is passed, AWS returns a validation error (e.g., missing required field), not ErrNoDelegationSetFound. The test expects ErrNoDelegationSetFound but the function never returns this error—it's defined but unused in the codebase.

To fix: Update GetDelegationSetByZone to handle nil zoneId by returning ErrNoDelegationSetFound, or update the test to expect the actual AWS validation error instead.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 44ec70f and 7404e72.

📒 Files selected for processing (12)
  • src/cmd/cli/command/commands_test.go
  • src/cmd/cli/command/stack_test.go
  • src/pkg/agent/tools/current_stack_test.go
  • src/pkg/cli/client/byoc/aws/byoc.go
  • src/pkg/clouds/aws/common.go
  • src/pkg/clouds/aws/publicecr.go
  • src/pkg/clouds/aws/publicecr_test.go
  • src/pkg/clouds/aws/route53_test.go
  • src/pkg/clouds/aws/secrets.go
  • src/pkg/clouds/aws/sts.go
  • src/pkg/term/colorizer.go
  • src/pkg/term/colorizer_test.go
🧰 Additional context used
🧬 Code graph analysis (10)
src/pkg/clouds/aws/common.go (1)
src/pkg/clouds/aws/sts.go (1)
  • NewStsFromConfig (16-18)
src/cmd/cli/command/commands_test.go (4)
src/pkg/clouds/aws/common.go (1)
  • Region (15-15)
src/pkg/crun/common.go (1)
  • Region (12-12)
src/pkg/clouds/aws/sts.go (4)
  • StsClientAPI (11-14)
  • MockStsClientAPI (20-20)
  • MockStsClientAPI (22-27)
  • MockStsClientAPI (29-31)
src/protos/io/defang/v1/fabric.pb.go (8)
  • Config (2370-2378)
  • Config (2391-2391)
  • Config (2406-2408)
  • Provider (28-28)
  • Provider (66-68)
  • Provider (70-72)
  • Provider (79-81)
  • Provider_AWS (33-33)
src/pkg/term/colorizer_test.go (1)
src/pkg/term/colorizer.go (1)
  • Errorf (322-324)
src/cmd/cli/command/stack_test.go (1)
src/pkg/clouds/aws/common.go (1)
  • Region (15-15)
src/pkg/clouds/aws/sts.go (1)
src/pkg/cli/client/byoc/aws/byoc.go (1)
  • Config (51-51)
src/pkg/agent/tools/current_stack_test.go (2)
src/pkg/clouds/aws/common.go (1)
  • Region (15-15)
src/pkg/modes/modes.go (2)
  • Mode (12-12)
  • ModeAffordable (16-16)
src/pkg/clouds/aws/publicecr.go (2)
src/pkg/cli/client/byoc/aws/byoc.go (1)
  • Config (51-51)
src/pkg/clouds/aws/common.go (1)
  • Region (15-15)
src/pkg/clouds/aws/publicecr_test.go (4)
src/pkg/clouds/aws/sts.go (5)
  • NewStsFromConfig (16-18)
  • StsClientAPI (11-14)
  • MockStsClientAPI (20-20)
  • MockStsClientAPI (22-27)
  • MockStsClientAPI (29-31)
src/pkg/http/client.go (1)
  • DefaultClient (11-11)
src/pkg/cli/client/byoc/aws/byoc.go (1)
  • Config (51-51)
src/pkg/clouds/aws/publicecr.go (1)
  • PublicECRAPI (15-17)
src/pkg/cli/client/byoc/aws/byoc.go (1)
src/pkg/clouds/aws/sts.go (1)
  • NewStsFromConfig (16-18)
src/pkg/clouds/aws/route53_test.go (1)
src/pkg/clouds/aws/route53.go (1)
  • GetDelegationSetByZone (54-63)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Analyze (go)
  • GitHub Check: go-test
🔇 Additional comments (21)
src/pkg/term/colorizer.go (2)

197-202: LGTM: Debug output correctly redirected to stderr.

Redirecting debug messages to stderr aligns with Unix conventions, where diagnostic output should be separated from standard program output. This allows users to redirect stdout without losing debug information.


204-209: LGTM: Debugf output correctly redirected to stderr.

The change is consistent with the Debug method modification and maintains proper separation of diagnostic output.

src/pkg/term/colorizer_test.go (2)

96-111: LGTM: Stdout validation properly updated.

The test correctly validates that stdout now contains only Info and Warn messages, excluding Debug messages. The variable naming (expectedOut, gotOut) clearly indicates the stream being tested.


113-124: LGTM: Stderr validation properly added.

The test now validates that debug messages are correctly output to stderr. The validation logic is consistent with the stdout validation and comprehensively covers the behavior change.

src/pkg/agent/tools/current_stack_test.go (1)

26-31: LGTM! Good use of a clearly fake test region.

Using us-test-2 as a standardized test region is a sensible choice—it's obviously not a real AWS region, which helps prevent accidental real API calls if mocks aren't properly wired.

src/pkg/clouds/aws/common.go (1)

36-39: LGTM! Factory pattern enables proper test isolation.

The switch to NewStsFromConfig(cfg) allows tests to inject mock STS clients. The error handling is intentionally lenient—GetCallerIdentity failure doesn't prevent config loading, which is reasonable for credential validation that may fail in certain environments.

src/cmd/cli/command/stack_test.go (2)

46-57: LGTM!

Test data correctly updated to use the standardized fake region us-test-2.


93-103: LGTM!

Region values consistently updated in non-interactive test scenarios.

src/pkg/clouds/aws/publicecr_test.go (1)

82-92: LGTM! Clean mock setup and teardown pattern.

The save-defer-restore pattern ensures proper test isolation. Capturing previous values before overriding and restoring them in defer prevents test pollution.

src/pkg/clouds/aws/secrets.go (2)

22-24: LGTM! Consistent factory pattern for SSM client.

The NewSsmFromConfig factory follows the same pattern as NewStsFromConfig, enabling test mocking while keeping production behavior unchanged.


45-45: LGTM!

Call site correctly updated to use the factory function.

src/pkg/clouds/aws/publicecr.go (1)

21-24: LGTM! Factory variable enables test mocking.

The conversion from a function to a variable allows tests to override client creation. The us-east-1 region hardcoding is correct since ECR Public is only available in that region.

src/cmd/cli/command/commands_test.go (3)

84-87: LGTM! TestMain is preferable to init() for test setup.

Using TestMain provides explicit control over test lifecycle and makes the setup more visible.


184-191: LGTM! Clean centralized mock setup.

Good pattern for tests that share common mock requirements. The cleanup properly restores both factory functions.


321-332: LGTM!

This nested test correctly captures and restores NewStsFromConfig in its own cleanup.

src/pkg/clouds/aws/sts.go (2)

11-18: Clean abstraction for STS client testability.

The interface + factory variable pattern enables clean dependency injection for testing while maintaining production behavior by default.


29-31: Mock AssumeRole returns empty credentials—verify this won't cause nil dereferences.

The mock returns an empty AssumeRoleOutput{}, meaning Credentials is nil. In byoc.go:374, stscreds.NewAssumeRoleProvider(stsClient, roleARN) is used, which internally calls AssumeRole to obtain credentials. If any test exercises the role assumption path, accessing credential fields could panic.

Consider populating mock credentials:

Suggested fix
 func (MockStsClientAPI) AssumeRole(ctx context.Context, params *sts.AssumeRoleInput, optFns ...func(*sts.Options)) (*sts.AssumeRoleOutput, error) {
-	return &sts.AssumeRoleOutput{}, nil
+	return &sts.AssumeRoleOutput{
+		Credentials: &ststypes.Credentials{
+			AccessKeyId:     ptr.String("AKIAIOSFODNN7EXAMPLE"),
+			SecretAccessKey: ptr.String("wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"),
+			SessionToken:    ptr.String("mock-session-token"),
+			Expiration:      ptr.Time(time.Now().Add(time.Hour)),
+		},
+	}, nil
 }
src/pkg/cli/client/byoc/aws/byoc.go (3)

44-44: Type alias and import addition look good.

The Config type alias provides a stable public API without exposing the underlying SDK type directly. The sts import is needed for sts.GetCallerIdentityInput.

Also applies to: 51-51


373-376: STS client factory integration is correct.

The aws.StsClientAPI satisfies the stscreds.AssumeRoleAPIClient interface requirement since both define AssumeRole with the same signature. This enables proper role assumption while allowing test mocking.


425-426: Consistent use of the new STS factory pattern.

This mirrors the pattern used in findZone, ensuring uniform STS client instantiation throughout the codebase.

src/pkg/clouds/aws/route53_test.go (1)

16-16: Test remains skipped despite PR title "fix tests".

The test is marked as broken and skipped. Given the PR objective is to fix tests, clarify whether:

  • This test fix is intentionally deferred to a follow-up PR, or
  • The test should be enabled as part of this PR once the nil zone ID issues (flagged below) are resolved.

@lionello lionello requested a review from edwardrf December 22, 2025 02:29
@lionello lionello changed the title Lio/fix tests Fix unit test delays Dec 22, 2025
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
src/cmd/cli/command/commands_test.go (1)

321-335: Consider removing redundant mock setup in sub-tests.

The parent test already sets awsdriver.NewStsFromConfig to a mock at line 266 with cleanup at line 262. This sub-test (and similar ones at lines 362, 395, 445, 482, 513) re-applies the same mock. The sub-test setup is redundant since it's overwriting a mock with an identical mock.

That said, this doesn't cause issues and does make each sub-test more self-documenting. Feel free to keep as-is if you prefer the explicitness.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7404e72 and 24433f4.

📒 Files selected for processing (1)
  • src/cmd/cli/command/commands_test.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Analyze (go)
  • GitHub Check: go-test
🔇 Additional comments (5)
src/cmd/cli/command/commands_test.go (5)

30-42: LGTM!

The MockSsmClient correctly embeds the interface and implements only the methods needed for testing.


84-87: LGTM!

Using TestMain instead of init() is the idiomatic Go approach for test-wide setup. The os.Exit(m.Run()) correctly propagates the test exit code.


184-190: LGTM!

Proper save-cleanup-mock pattern: saves both factory functions, registers cleanup, then applies mocks. This ensures test isolation.


258-268: Past issue resolved.

The previous review flagged missing restoration of NewStsFromConfig. This is now correctly addressed: oldSts is saved at line 258 and restored in the cleanup at line 262.


69-69: LGTM!

Standardizing on "us-test-2" across all tests makes it immediately clear these are fake test values, avoiding any confusion with real AWS regions.

Also applies to: 128-128, 298-298

@lionello lionello merged commit e2bf461 into main Dec 23, 2025
14 checks passed
@lionello lionello deleted the lio/fix-tests branch December 23, 2025 06:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant