Fix CI out of memory conditions on Windows #263
Merged
Templates fail to run in CI on Windows, hitting an error indicative of an out-of-memory (OOM) condition.
Workaround: reduce CI test parallelism on Windows from 5 to 3.
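A minimal sketch of the idea behind the workaround, not the actual change in this PR: cap the number of template tests in flight by operating system. The helper names parallelismFor and runBounded are hypothetical. The limits 3 and 5 come from the description above. In a real test run the same effect is usually achieved with the -parallel flag of go test.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// parallelismFor returns how many template tests may run at once.
// Hypothetical helper: 3 on Windows, 5 elsewhere, per the workaround above.
func parallelismFor(goos string) int {
	if goos == "windows" {
		return 3
	}
	return 5
}

// runBounded runs every job with at most limit jobs in flight and
// returns the peak concurrency it observed.
func runBounded(limit int, jobs []func()) int {
	sem := make(chan struct{}, limit) // counting semaphore
	var wg sync.WaitGroup
	var mu sync.Mutex
	inFlight, peak := 0, 0

	for _, job := range jobs {
		wg.Add(1)
		sem <- struct{}{} // blocks while limit jobs are already running
		go func(job func()) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot when the job ends

			mu.Lock()
			inFlight++
			if inFlight > peak {
				peak = inFlight
			}
			mu.Unlock()

			job()

			mu.Lock()
			inFlight--
			mu.Unlock()
		}(job)
	}
	wg.Wait()
	return peak
}

func main() {
	limit := parallelismFor(runtime.GOOS)
	jobs := make([]func(), 10)
	for i := range jobs {
		jobs[i] = func() {} // stand-in for one template test
	}
	peak := runBounded(limit, jobs)
	fmt.Printf("limit=%d peak=%d\n", limit, peak)
}
```

Lowering the limit trades wall-clock time for a smaller peak memory footprint, since fewer CLI and plugin processes are alive at the same moment.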
An unrelated change, and the first attempt at a fix (first commit), moved more cleanup actions into the t.Cleanup() step, so that the Go test harness orders them and subtests can run their own cleanups on test failure. As a side effect, this should reduce orphaned resources in our cloud accounts, assuming the cleanup step is allowed to run.
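The ordering guarantee can be illustrated with a small sketch. Functions registered with t.Cleanup() run in last-added, first-called order once the test and its subtests complete. The deployTemplate helper and the event strings below are hypothetical and do not reflect the repository's actual test code. fakeT is a stand-in that mimics that ordering so the example runs outside go test. A real *testing.T also satisfies the cleanupRegistrar interface.

```go
package main

import "fmt"

// cleanupRegistrar is the subset of *testing.T used in this sketch.
type cleanupRegistrar interface {
	Cleanup(func())
}

// fakeT imitates the cleanup ordering of *testing.T (last added, first called).
type fakeT struct {
	cleanups []func()
}

func (f *fakeT) Cleanup(fn func()) {
	f.cleanups = append(f.cleanups, fn)
}

// finish runs the registered cleanups in reverse registration order.
func (f *fakeT) finish() {
	for i := len(f.cleanups) - 1; i >= 0; i-- {
		f.cleanups[i]()
	}
	f.cleanups = nil
}

// deployTemplate is a hypothetical test step. Each resource registers its
// cleanup immediately after creation, so a later failure still tears it down.
func deployTemplate(t cleanupRegistrar, name string, events *[]string) {
	*events = append(*events, "create dir "+name)
	t.Cleanup(func() { *events = append(*events, "remove dir "+name) })

	*events = append(*events, "create stack "+name)
	t.Cleanup(func() { *events = append(*events, "destroy stack "+name) })
}

func main() {
	var events []string
	t := &fakeT{}
	deployTemplate(t, "demo", &events)
	t.finish() // the stack is destroyed before its directory is removed
	for _, e := range events {
		fmt.Println(e)
	}
}
```

Registering each cleanup right after the resource it undoes means teardown happens in the reverse of creation order without the test body having to track it.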
Root cause analysis
The last successful run: https://github.com/pulumi/templates/runs/5804552689?check_suite_focus=true all green with
The first failing run, which failed on the dev CLI: https://github.com/pulumi/templates/runs/5812628677?check_suite_focus=true
Because the commit hash for the repo hadn't changed between the two runs, only the Pulumi CLI version, I looked for changes there between those two timestamps:
There was only one relevant merge commit in that time frame, referencing PR pulumi/pulumi#9294, which modified pulumi new to invoke language plugins and therefore spawn additional processes. Additional processes mean a greater memory burden, and the raw logs show the same error we've seen on pulumi/pulumi when we've hit out-of-memory conditions: