Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request. Searching for pre-existing feature requests helps us consolidate datapoints for identical requirements into a single place, thank you!
Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request.
If you are interested in working on this issue or have submitted a pull request, please leave a comment.
Overview of the Issue
The parallel planning mechanism for different projects with the same workspace name has a race condition where N goroutines will try to clone the same repository into the same directory concurrently.
Reproduction Steps
Set parallel_plan: true in atlantis.yaml
Have 2 different projects in atlantis.yaml with different directories but the same workspace name
Raise a pull request that triggers a plan in both projects
Wait for Atlantis to report both plans (these will succeed)
Trigger one more plan by commenting atlantis plan
Wait for both plans to fail
Logs
Plan output for one project:
running git clone --branch REDACTED --depth=1 --single-branch https://REDACTED@REDACTED /home/atlantis/.atlantis/repos/REDACTED/XXXX/default: Cloning into '/home/atlantis/.atlantis/repos/REDACTED/XXXX/default'...
fatal: Unable to create '/home/atlantis/.atlantis/repos/REDACTED/XXXX/default/.git/shallow.lock': No such file or directory
: exit status 128
Plan output for the other project:
running git clone --branch REDACTED --depth=1 --single-branch https://REDACTED@REDACTED /home/atlantis/.atlantis/repos/REDACTED/XXXX/default: Cloning into '/home/atlantis/.atlantis/repos/REDACTED/XXXX/default'...
fatal: Unable to read current working directory: No such file or directory
fatal: fetch-pack: invalid index-pack output
: exit status 128
Relevant server logs (you'll see why these are relevant later on):
will re-clone repo, could not determine if was at correct commit: git rev-parse HEAD: exit status 128: HEAD
fatal: ambiguous argument 'HEAD': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
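Root cause
When we trigger the second plan via atlantis plan, this is what happens:
The Plan(...) function is called.
It runs N times, one per plan up to --parallel-pool-size, concurrently via goroutines.
Each goroutine calls the doPlan(...) function.
doPlan(...) calls the p.WorkingDir.Clone(...) function.
Clone fails on git rev-parse (see the server logs above), and thus calls the forceClone(...) function.
forceClone(...) then (1) deletes the directory, (2) creates the directory, and (3) clones the repository into the directory. Both plan goroutines do this concurrently; there is no locking/semaphore mechanism to prevent it.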
And of course, since two goroutines are running rm && mkdir && git clone on the same directory concurrently, they both fail at some point, with different errors each time, since race conditions are unpredictable.
Fixing this should be as straightforward as adding some mutex mechanism before calling p.WorkingDir.Clone(...). What do you think?
We ran into the same problem. We have a mono-repo setup with about 50 different terraform directories, each with 3 terraform workspaces corresponding to 3 different Atlantis workflows. Occasionally some changes may trigger all Atlantis projects to plan (e.g., changes on a common dependent module), and they'd all fail with the same error OP mentioned.