fix(yaml-functions): detect cross-component !terraform.state cycles instead of stack-overflowing #2533

Merged
Andriy Knysh (aknysh) merged 5 commits into cloudposse:main from thejrose1984:fix/2457-cross-walk-cycle-detection on May 28, 2026

Conversation

@thejrose1984
Contributor

@thejrose1984 thejrose1984 commented May 27, 2026

what

Fix the goroutine stack overflow reported in #2457: two components that reference each other via !terraform.state (A → B, B → A) drove atmos describe affected / describe component / terraform plan into infinite recursion until the Go runtime stack overflowed.

The YAML-function cycle detector already existed and worked within a single ProcessCustomYamlTags walk, but it didn't survive the recursive describe path that !terraform.state triggers.

why

When a component is being processed and the resolver encounters !terraform.state, it does:

processTagTerraformStateWithContext
  → GetTerraformState
    → ExecuteDescribeComponent (ProcessYamlFunctions: true)
      → ProcessStacks
        → ProcessCustomYamlTags   ← re-entry

ProcessCustomYamlTags was wrapping every entry with scopedResolutionContext(), which saved the parent's context and installed a fresh, empty one. So when the inner walk found B's !terraform.state a ..., the cycle detector's Visited map had no record that A was already in progress, and it pushed A → B → A → B forever until the goroutine stack hit its 1 GB cap.

The cycle detector unit tests pass because they exercise Push/Pop on a single context; the only integration tests that would have caught this were t.Skip()-ed placeholders in internal/exec/yaml_func_circular_deps_test.go referencing fixtures that don't exist.
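
The failure mode described above can be reproduced in miniature. The following is a simplified sketch, not the atmos implementation: `deps`, `resolve`, `errCycle`, and `errDepth` are hypothetical names, and the `depth > 100` bound only stands in for the runtime stack overflow so that the demo terminates.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical sentinels used only by this sketch.
var (
	errCycle = errors.New("circular dependency")
	errDepth = errors.New("recursion bound hit")
)

// deps is a toy graph: a reads b's state and b reads a's state.
var deps = map[string]string{"a": "b", "b": "a"}

// resolve walks deps starting at name. When shareVisited is false, every
// nested call starts with an empty visited set, which models installing a
// fresh context on each re-entry.
func resolve(name string, visited map[string]bool, shareVisited bool, depth int) error {
	if depth > 100 {
		// Stand-in for the stack overflow; keeps the demo finite.
		return errDepth
	}
	if visited[name] {
		return fmt.Errorf("%w: %s is already being resolved", errCycle, name)
	}
	visited[name] = true
	defer delete(visited, name) // pairs with the insert above

	next, ok := deps[name]
	if !ok {
		return nil
	}
	if !shareVisited {
		visited = map[string]bool{} // fresh, empty context for the inner walk
	}
	return resolve(next, visited, shareVisited, depth+1)
}

func isCycle(err error) bool { return errors.Is(err, errCycle) }
func isDepth(err error) bool { return errors.Is(err, errDepth) }

func main() {
	fmt.Println("fresh context per walk:", resolve("a", map[string]bool{}, false, 0))
	fmt.Println("shared context:", resolve("a", map[string]bool{}, true, 0))
}
```

With a fresh set per walk the inner call never sees that `a` is in progress, so only the artificial bound stops it; with a shared set the third call finds `a` already marked and returns the cycle error.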

how

Three coordinated changes:

  1. internal/exec/yaml_func_utils.go: ProcessCustomYamlTags now reuses the goroutine-local ResolutionContext via GetOrCreateResolutionContext() and drops the scopedResolutionContext() wrap. The Push/Pop discipline in processTagTerraformStateWithContext / trackOutputDependency already pairs every successful Push with a deferred Pop, so the context is empty when the top-level walk returns. Removed the now-unused scopedResolutionContext helper.

  2. internal/exec/yaml_func_resolution_context.go: Added MaxResolutionDepth = 64 and a depth check in Push that returns ErrYamlFuncMaxResolutionDepth if any future re-entry path slips past the cycle detector. This is belt-and-suspenders: real cycles are caught by the Visited check; the depth bound exists so atmos surfaces a clean error instead of stack-overflowing if the detector regresses.

  3. internal/exec/terraform_state_utils.go: GetTerraformState's describe-error wrap now uses double %w so errors.Is can match a propagated sentinel like ErrCircularDependency through the descriptive wrapper. Without this, the cycle error message is human-readable but errors.Is(err, ErrCircularDependency) returns false, breaking callers that try to handle the error programmatically.
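
The Push/Pop discipline and the depth bound from items 1 and 2 could look roughly like the sketch below. It is a simplified model rather than the code in yaml_func_resolution_context.go: the lowercase type, method, and sentinel names are hypothetical, the order of the two checks is an assumption, and only the value 64 comes from this description.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins for the atmos sentinels.
var (
	errCircularDependency = errors.New("circular dependency detected")
	errMaxResolutionDepth = errors.New("maximum resolution depth exceeded")
)

const maxResolutionDepth = 64 // value taken from the PR description

// resolutionContext is a simplified model: a call stack plus a visited set.
type resolutionContext struct {
	stack   []string
	visited map[string]bool
}

func newResolutionContext() *resolutionContext {
	return &resolutionContext{visited: map[string]bool{}}
}

// push marks key as in progress. The cycle check runs first (an assumption),
// so a real cycle reports errCircularDependency; the depth bound is a backstop.
func (c *resolutionContext) push(key string) error {
	if c.visited[key] {
		return fmt.Errorf("%w: %s", errCircularDependency, key)
	}
	if len(c.stack) >= maxResolutionDepth {
		return fmt.Errorf("%w: limit %d", errMaxResolutionDepth, maxResolutionDepth)
	}
	c.visited[key] = true
	c.stack = append(c.stack, key)
	return nil
}

// pop removes the most recent key; callers defer it after a successful push.
func (c *resolutionContext) pop() {
	if len(c.stack) == 0 {
		return
	}
	last := c.stack[len(c.stack)-1]
	c.stack = c.stack[:len(c.stack)-1]
	delete(c.visited, last)
}

// pushChain pushes n distinct keys, returns the first error, and always
// leaves the context empty again.
func pushChain(c *resolutionContext, n int) error {
	defer func() {
		for len(c.stack) > 0 {
			c.pop()
		}
	}()
	for i := 0; i < n; i++ {
		if err := c.push(fmt.Sprintf("component-%d", i)); err != nil {
			return err
		}
	}
	return nil
}

func isCycle(err error) bool { return errors.Is(err, errCircularDependency) }
func isDepth(err error) bool { return errors.Is(err, errMaxResolutionDepth) }

func main() {
	ctx := newResolutionContext()
	_ = ctx.push("a")
	_ = ctx.push("b")
	fmt.Println(ctx.push("a")) // a is still in progress
	fmt.Println(pushChain(newResolutionContext(), maxResolutionDepth+1))
}
```

Because every successful push is matched by a pop, the same context can be shared across nested walks and still end up empty when the outermost walk returns.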

tests

  • New tests/yaml_functions_circular_deps_integration_test.go plus fixture at tests/fixtures/scenarios/yaml-functions-circular-deps/ — exercises the full ExecuteDescribeComponent path on an A↔B cycle and asserts ErrCircularDependency comes back (and not the depth safety net, which would indicate the cycle detector regressed). Test completes in ~20 ms instead of running forever.
  • Removed internal/exec/yaml_func_circular_deps_test.go — all four tests in it were t.Skip()-ed placeholders referencing fixtures that don't exist. The new integration test replaces them with one that actually runs.
  • All existing TestResolutionContext* unit tests still pass unchanged.

references

Summary by CodeRabbit

  • New Features

    • Improved YAML-function cycle detection with clearer, surfaced errors.
    • Added a maximum recursion-depth safeguard to prevent stack overflow during YAML-function resolution.
  • Bug Fixes

    • Enhanced error wrapping so root causes are preserved and easier to identify.
  • Tests

    • Added an integration regression test for cross-component cycles and removed obsolete skipped tests.
  • Fixtures

    • Added scenario fixtures to reproduce and validate circular dependency behavior.

Review Change Stack

…nstead of stack-overflowing

Two components that reference each other via !terraform.state (A → B,
B → A) drove atmos into infinite recursion ending in a goroutine stack
overflow on `describe affected` / `describe component` / `terraform plan`.

The YAML-function cycle detector existed and worked for cycles inside a
single ProcessCustomYamlTags walk, but ProcessCustomYamlTags was
installing a fresh, scoped ResolutionContext on every entry via
scopedResolutionContext(). Resolving !terraform.state recurses through
GetTerraformState → ExecuteDescribeComponent → ProcessStacks →
ProcessCustomYamlTags, and each re-entry started over with an empty
Visited map, so the outer walk's in-progress components were invisible
to the inner walk and the cycle went undetected.

The fix:
- ProcessCustomYamlTags now reuses the goroutine-local ResolutionContext
  so the Visited map survives across nested walks. The Push/Pop discipline
  in processTagTerraformState*/processTagTerraformOutput* already pairs
  every successful Push with a deferred Pop, so the context is empty when
  the top-level walk returns.
- Added MaxResolutionDepth (= 64) as defense-in-depth: if any future
  re-entry path bypasses the cycle detector, Push refuses to grow past 64
  frames and returns ErrYamlFuncMaxResolutionDepth rather than letting the
  Go runtime stack overflow.
- GetTerraformState's describe-error wrap now uses double %w so
  errors.Is can match the propagated sentinel (e.g., ErrCircularDependency)
  through the descriptive wrapper.
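
The effect of the double %w wrap can be shown in isolation. This is a minimal sketch with hypothetical sentinel and function names; it assumes the previous wrap flattened the inner error to text (shown here with %v), which this description does not spell out. Multiple %w verbs in one fmt.Errorf call require Go 1.20 or later.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical sentinels for the sketch; not the atmos definitions.
var (
	errDescribeComponent  = errors.New("failed to describe component")
	errCircularDependency = errors.New("circular dependency detected")
)

// wrapSingle keeps only the outer sentinel in the error chain; the inner
// error survives as text but not as a matchable value.
func wrapSingle(component string, inner error) error {
	return fmt.Errorf("%w: %s: %v", errDescribeComponent, component, inner)
}

// wrapDouble keeps both the outer sentinel and the inner error in the chain.
func wrapDouble(component string, inner error) error {
	return fmt.Errorf("%w: %s: %w", errDescribeComponent, component, inner)
}

func matches(err, target error) bool { return errors.Is(err, target) }

func main() {
	inner := fmt.Errorf("%w: a -> b -> a", errCircularDependency)

	fmt.Println(matches(wrapSingle("a", inner), errCircularDependency)) // false
	fmt.Println(matches(wrapDouble("a", inner), errCircularDependency)) // true
	fmt.Println(matches(wrapDouble("a", inner), errDescribeComponent))  // true
}
```

Both wrappers produce the same human-readable message; only the second lets a caller branch on the inner sentinel with errors.Is.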

Tests:
- New regression test tests/yaml_functions_circular_deps_integration_test.go
  exercises the full ExecuteDescribeComponent path on a fixture with an
  A↔B cycle and asserts ErrCircularDependency comes back (and not the
  depth safety net, which would indicate the cycle detector regressed).
- Removed internal/exec/yaml_func_circular_deps_test.go — all four tests
  in it were t.Skip()-ed placeholders referencing fixtures that don't
  exist; the new integration test replaces them.

Closes cloudposse#2457
@thejrose1984 thejrose1984 requested a review from a team as a code owner May 27, 2026 00:45
@atmos-pro
Contributor

atmos-pro Bot commented May 27, 2026

Tip

Atmos Pro  

No affected stacks workflow was detected for this pull request.
If this is expected, no action is needed.

@github-actions github-actions Bot added the size/m Medium size PR label May 27, 2026
@coderabbitai
Contributor

coderabbitai Bot commented May 27, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: ff63c047-8e39-4fc5-a6ce-de254edff8ad

📥 Commits

Reviewing files that changed from the base of the PR and between 01304bc and bc6f319.

📒 Files selected for processing (2)
  • tests/fixtures/scenarios/yaml-functions-circular-deps/atmos.yaml
  • tests/yaml_functions_circular_deps_integration_test.go
💤 Files with no reviewable changes (1)
  • tests/fixtures/scenarios/yaml-functions-circular-deps/atmos.yaml
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/yaml_functions_circular_deps_integration_test.go

📝 Walkthrough

Walkthrough

Adds a MaxResolutionDepth constant and ErrYamlFuncMaxResolutionDepth sentinel; enforces the depth guard in ResolutionContext.Push; removes the scoped resolution context so goroutine-local cycle-detection state persists across nested YAML-tag resolution; preserves inner sentinel errors when wrapping ExecuteDescribeComponent failures; and adds fixtures + an integration test reproducing a cross-component terraform.state cycle.

Changes

Circular dependency safeguard for YAML function resolution

| Layer / File(s) | Summary |
| --- | --- |
| Depth limit constant and new error sentinel (internal/exec/yaml_func_resolution_context.go, errors/errors.go) | Adds exported MaxResolutionDepth = 64 and ErrYamlFuncMaxResolutionDepth sentinel error. |
| Depth checking and context lifecycle refactoring (internal/exec/yaml_func_resolution_context.go, internal/exec/yaml_func_utils.go) | ResolutionContext.Push enforces the depth guard and returns the depth-limit error when exceeded; removes the scoped context helper so ProcessCustomYamlTags reuses the goroutine-local ResolutionContext and allows visited-state to persist across nested resolutions. |
| Inner error propagation (internal/exec/terraform_state_utils.go) | ExecuteDescribeComponent error wrapping now includes the inner error as a second %w so errors.Is can match both the outer sentinel and inner sentinels like ErrCircularDependency. |
| Regression test and fixtures (tests/fixtures/scenarios/yaml-functions-circular-deps/atmos.yaml, tests/fixtures/scenarios/yaml-functions-circular-deps/stacks/test.yaml, tests/yaml_functions_circular_deps_integration_test.go) | Adds fixtures creating two components that cross-reference via !terraform.state and an integration test asserting ErrCircularDependency is returned (not the depth-limit error). |

Sequence Diagram(s)

sequenceDiagram
  participant Test
  participant ExecuteDescribeComponent
  participant YAMLFunctionResolver
  participant ResolutionContext
  participant TerraformStateStore
  Test->>ExecuteDescribeComponent: describe component-a with YAML tags
  ExecuteDescribeComponent->>YAMLFunctionResolver: resolve !terraform.state
  YAMLFunctionResolver->>ResolutionContext: Push(node)
  ResolutionContext->>TerraformStateStore: fetch outputs of other component
  TerraformStateStore-->>YAMLFunctionResolver: outputs (may trigger nested resolves)
  YAMLFunctionResolver->>ResolutionContext: Push(nested node)
  ResolutionContext-->>YAMLFunctionResolver: ErrCircularDependency or ErrYamlFuncMaxResolutionDepth
  YAMLFunctionResolver-->>ExecuteDescribeComponent: return wrapped error
  ExecuteDescribeComponent-->>Test: propagate wrapped error

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • cloudposse/atmos#1708: Modifies the same goroutine-local ResolutionContext stack and visited tracking in yaml_func_resolution_context.go.

Suggested labels

patch

Suggested reviewers

  • aknysh
  • osterman
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 75.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main fix: detecting cross-component circular dependencies in YAML functions instead of stack-overflowing. |
| Linked Issues check | ✅ Passed | All code requirements from #2457 are met: cycle detection via persistent ResolutionContext, ErrCircularDependency propagation, max-depth safety net, and integration test demonstrating the A↔B cycle handling. |
| Out of Scope Changes check | ✅ Passed | All changes directly support the cycle detection fix. Removal of skipped placeholder tests is appropriate cleanup; no unrelated modifications found. |

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/fixtures/scenarios/yaml-functions-circular-deps/atmos.yaml`:
- Line 14: In atmos.yaml replace the hardcoded Unix-only entry file:
"/dev/stderr" (seen on line containing file: "/dev/stderr") with a
platform-neutral logging configuration: either remove the file key so the
fixture uses the default logger, or set it to a cross-platform target (e.g., a
logical "stderr" sink) or an environment-driven value; update the fixture so
tests don’t rely on the literal "/dev/stderr".

In `@tests/yaml_functions_circular_deps_integration_test.go`:
- Line 33: Replace the hard-coded path string passed to t.Chdir with an
OS-neutral path built using filepath.Join: locate the test call to
t.Chdir("./fixtures/scenarios/yaml-functions-circular-deps") and change it to
use filepath.Join(".", "fixtures", "scenarios", "yaml-functions-circular-deps");
also import the "path/filepath" package if not already imported so the test
compiles.
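
As an illustration of the second finding, a path assembled with filepath.Join uses the host's separator instead of a hard-coded "/". This is a minimal sketch; `fixtureDir` and `fixtureDirSlash` are hypothetical helpers, not functions from the test file.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// fixtureDir builds the fixture path from segments so that the separator
// matches the host OS. Join also cleans the result, dropping the leading ".".
func fixtureDir() string {
	return filepath.Join(".", "fixtures", "scenarios", "yaml-functions-circular-deps")
}

// fixtureDirSlash returns the same path with forward slashes, which is
// convenient for comparing results across platforms.
func fixtureDirSlash() string {
	return filepath.ToSlash(fixtureDir())
}

func main() {
	fmt.Println(fixtureDir())
}
```

In a test, the returned value would be passed to t.Chdir in place of the string literal.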

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: b4347e4d-7944-4389-ab59-43d27a8e67b7

📥 Commits

Reviewing files that changed from the base of the PR and between 3bc31f2 and 01304bc.

📒 Files selected for processing (8)
  • errors/errors.go
  • internal/exec/terraform_state_utils.go
  • internal/exec/yaml_func_circular_deps_test.go
  • internal/exec/yaml_func_resolution_context.go
  • internal/exec/yaml_func_utils.go
  • tests/fixtures/scenarios/yaml-functions-circular-deps/atmos.yaml
  • tests/fixtures/scenarios/yaml-functions-circular-deps/stacks/test.yaml
  • tests/yaml_functions_circular_deps_integration_test.go
💤 Files with no reviewable changes (1)
  • internal/exec/yaml_func_circular_deps_test.go

Comment thread tests/fixtures/scenarios/yaml-functions-circular-deps/atmos.yaml Outdated
Comment thread tests/yaml_functions_circular_deps_integration_test.go Outdated
…loudposse#2457 regression test

- Drop the Unix-only `/dev/stderr` log target from the fixture's atmos.yaml;
  the regression test doesn't depend on log routing.
- Use `filepath.Join` for the `t.Chdir` fixture path so the test runs on
  Windows runners as well as Linux/macOS (per CLAUDE.md cross-platform rule).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@aknysh Andriy Knysh (aknysh) added the patch A minor, backward compatible change label May 28, 2026
@codecov

codecov Bot commented May 28, 2026

Codecov Report

❌ Patch coverage is 11.11111% with 8 lines in your changes missing coverage. Please review.
✅ Project coverage is 78.59%. Comparing base (5cbc94a) to head (d04bac3).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| internal/exec/yaml_func_resolution_context.go | 0.00% | 7 Missing and 1 partial ⚠️ |

❌ Your patch check has failed because the patch coverage (11.11%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #2533   +/-   ##
=======================================
  Coverage   78.59%   78.59%           
=======================================
  Files        1145     1145           
  Lines      110311   110305    -6     
=======================================
- Hits        86699    86698    -1     
+ Misses      18804    18798    -6     
- Partials     4808     4809    +1     

| Flag | Coverage Δ |
| --- | --- |
| unittests | 78.59% <11.11%> (+<0.01%) ⬆️ |

| Files with missing lines | Coverage Δ |
| --- | --- |
| errors/errors.go | 100.00% <ø> (ø) |
| internal/exec/terraform_state_utils.go | 79.76% <100.00%> (ø) |
| internal/exec/yaml_func_utils.go | 94.11% <ø> (-0.12%) ⬇️ |
| internal/exec/yaml_func_resolution_context.go | 87.80% <0.00%> (-7.55%) ⬇️ |

... and 10 files with indirect coverage changes

@aknysh Andriy Knysh (aknysh) merged commit 7ada7db into cloudposse:main May 28, 2026
62 of 66 checks passed
@atmos-pro
Contributor

atmos-pro Bot commented May 28, 2026

Tip

Atmos Pro  

No affected stacks workflow was detected for this pull request.
If this is expected, no action is needed.

@github-actions

These changes were released in v1.220.0.


Labels

patch: A minor, backward compatible change
size/m: Medium size PR

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Stack overflow on atmos describe affected

2 participants