I tried to spin up OTF locally with the following setup: one controller, one organization, and a workspace configured to run on an agent.
Then I launched 3 agent processes and began running terraform plan repeatedly in the workspace. Every time, one of the agents would report something like this:
2023/06/28 16:45:54 INFO executing phase run=run-23z0M1fCvZC6pGpq phase=plan
2023/06/28 16:45:57 INFO finishing phase run=run-23z0M1fCvZC6pGpq phase=plan
The other two agents reported errors. It's probably benign and just error noise, but I'm not sure what a good solution would be, because agents communicate with the controller over HTTPS and thus can't use Postgres directly (otherwise, it'd be possible to do something like SELECT .. FOR UPDATE SKIP LOCKED).
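For illustration, here is a minimal sketch of the kind of claim query SKIP LOCKED would enable if agents could reach Postgres directly. The phases table, its columns, and the claimPhase helper are hypothetical, not OTF's actual schema; the point is that concurrent claimers each lock a different queued row instead of racing for the same one.

```go
package claim

import (
	"database/sql"

	_ "github.com/lib/pq" // Postgres driver
)

// claimPhase atomically claims one queued phase for the given agent.
// SKIP LOCKED makes concurrent callers skip rows that another
// transaction has already locked, so each queued phase is handed to
// exactly one claimer and nobody errors out after losing a race.
func claimPhase(db *sql.DB, agentID string) (runID string, err error) {
	err = db.QueryRow(`
		UPDATE phases
		SET status = 'claimed', agent_id = $1
		WHERE run_id = (
			SELECT run_id FROM phases
			WHERE status = 'queued'
			ORDER BY created_at
			LIMIT 1
			FOR UPDATE SKIP LOCKED
		)
		RETURNING run_id`, agentID).Scan(&runID)
	if err == sql.ErrNoRows {
		return "", nil // nothing queued; not an error
	}
	return runID, err
}
```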
Hello. Yes, it is benign and just error noise, as you say.
They all receive an event notifying them of a new run phase (plan or apply), and then race to be the first to claim the phase, triggering a "thundering herd", albeit a small one. It's not a terrible approach, performance isn't an issue, etc. I could quash the errors...
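To make the race concrete, here is a rough sketch, with hypothetical names, of the first-to-claim logic on the server side: every agent that sees the event tries to claim the phase, exactly one succeeds, and the losers get a rejection that surfaces as the error noise above.

```go
package claim

import "sync"

// claimer records which agent won each phase. In OTF the equivalent
// state lives behind the controller's API; this in-memory version
// just shows the shape of the race.
type claimer struct {
	mu      sync.Mutex
	claimed map[string]string // phase ID -> ID of the winning agent
}

func newClaimer() *claimer {
	return &claimer{claimed: make(map[string]string)}
}

// claim returns true for the first agent to claim a phase; every later
// caller loses the race and gets false, which on the agent side shows
// up as the (benign) errors in the logs.
func (c *claimer) claim(phaseID, agentID string) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	if _, taken := c.claimed[phaseID]; taken {
		return false
	}
	c.claimed[phaseID] = agentID
	return true
}
```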
But I've been working on a refactor for a little while: instead, an agent manager makes the decision as to which agent a phase is assigned to. I've noticed that's the same way TFC assigns jobs to its agents. This work will take a little while though; I can't say when it'll be complete.
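A rough sketch of that direction, again with hypothetical names: a central manager picks one agent per phase and pushes the job only to it, so no other agent ever sees (or races for) the phase. Round-robin selection here is just a placeholder; a real scheduler might weigh load or agent-pool membership.

```go
package assign

// phase identifies a unit of work (a plan or apply for a run).
type phase struct {
	runID string
	name  string // "plan" or "apply"
}

// manager hands each phase to exactly one agent, so agents no longer
// race: only the chosen agent's queue ever receives the job.
type manager struct {
	agents []chan phase // one job queue per registered agent
	next   int
}

// assign picks an agent round-robin and pushes the phase to it alone.
func (m *manager) assign(p phase) {
	m.agents[m.next%len(m.agents)] <- p
	m.next++
}
```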
In the meantime let's keep this issue open for other folks.