Flaky systemd-resolved test #4921

Open
jamilbk opened this issue May 8, 2024 · 5 comments
Labels
area/ci Changes to the CI pipeline / Github Actions kind/bug Something isn't working

Comments

@jamilbk
Member

jamilbk commented May 8, 2024

https://github.com/firezone/firezone/actions/runs/9006460494/job/24744332707

Seen this one happen a few times. Maybe a timing issue / race condition that a sleep could fix?
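If this is a race, a fixed sleep would only narrow the window. One alternative is a bounded poll that retries the check until it passes or a deadline expires. A minimal sketch, assuming the test can express its check as a callable; `wait_until` and its parameters are hypothetical names, not part of the existing test:

```python
import time
from typing import Callable


def wait_until(predicate: Callable[[], bool], timeout_s: float = 10.0,
               interval_s: float = 0.2) -> bool:
    """Poll `predicate` until it returns True or `timeout_s` elapses.

    Returns True as soon as the predicate passes, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        # Sleep briefly between attempts instead of one long fixed sleep.
        time.sleep(interval_s)
```

A passing run returns right away, and a failing run still terminates after `timeout_s`, so the job cannot hang.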

@jamilbk jamilbk added kind/bug Something isn't working area/ci Changes to the CI pipeline / Github Actions labels May 8, 2024
@ReactorScram
Collaborator

Will look into it after #4899

I thought maybe systemctl start is returning before our tunnel interface is all the way up. When the test passes, the resolvectl dns tun-firezone call at this step prints the sentinel: https://github.com/firezone/firezone/actions/runs/9006458384/job/24744284872#step:6:65

But systemctl start is supposed to wait for us to notify systemd that we're ready, and we only do that if we're running as an IPC service (not applicable) or right after we configure DNS.
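For reference, the readiness handshake described here can be sketched as follows. This is an illustration of the general systemd notify protocol, assuming the unit uses notify-style readiness; it is not the project's actual implementation, and `notify_ready` is a hypothetical name:

```python
import os
import socket


def notify_ready() -> bool:
    """Send READY=1 to the socket named in $NOTIFY_SOCKET, if one is set.

    Returns False when the variable is absent, for example when the
    process was not started by systemd with notifications enabled.
    """
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    if path.startswith("@"):
        # A leading '@' denotes a Linux abstract-namespace socket,
        # which is addressed with a leading NUL byte instead.
        addr = b"\0" + os.fsencode(path[1:])
    else:
        addr = path
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(b"READY=1", addr)
    return True
```

The ordering is what matters for this bug: if the message is sent after a DNS configuration call that has returned but not yet taken effect, systemctl start can return before the link's DNS is visible to the test.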

Maybe the DNS configuration is secretly async on the inside. I'll add some debug logs as the next step.

github-merge-queue bot pushed a commit that referenced this issue May 8, 2024
Refs #4921 

I'm not sure of the cause yet. This extra debugging code should narrow
it down.
@ReactorScram
Collaborator

Happened again, but debug_exit didn't do what I needed: https://github.com/firezone/firezone/actions/runs/9009905220/job/24755105430

github-merge-queue bot pushed a commit that referenced this issue May 9, 2024
If these fail we shouldn't bail out since we're already bailing out and
we need them to continue for debug output.

Refs #4921
@jamilbk
Member Author

jamilbk commented May 14, 2024

Fixed by #4962

@jamilbk jamilbk closed this as completed May 14, 2024
@ReactorScram
Collaborator

Still happening. https://github.com/firezone/firezone/actions/runs/9290693824/job/25567691838

PR #5111 will move the systemd notification up to the Client, so maybe that will give us more control over it.

@ReactorScram ReactorScram reopened this May 29, 2024
@ReactorScram
Collaborator

ReactorScram commented Jun 11, 2024

Another reproduction: https://github.com/firezone/firezone/actions/runs/9469788060/job/26089691139
In this case DNS control was not applied to the interface, even though our logs indicate we thought it had been: https://github.com/firezone/firezone/actions/runs/9469788060/job/26089691139#step:6:93
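
Since the logs and the actual resolver state disagree here, the test could check the state directly rather than trust the logs. A rough sketch, assuming `resolvectl dns <iface>` prints a single `Link N (iface): server ...` line; that format may vary between systemd versions. The function names are hypothetical, and the addresses in the usage lines are documentation-range placeholders, not the real sentinel:

```python
import re
from typing import List

# Assumed output shape: "Link 3 (tun-firezone): <server> <server> ..."
_LINK_LINE = re.compile(
    r"^Link \d+ \((?P<iface>[^)]+)\):(?P<servers>.*)$", re.DOTALL
)


def parse_link_dns(output: str) -> List[str]:
    """Return the DNS servers listed in `resolvectl dns <iface>` output."""
    match = _LINK_LINE.match(output.strip())
    if match is None:
        return []
    # Servers are whitespace-separated; an empty list means none are set.
    return match.group("servers").split()


def dns_is_controlled(output: str, sentinel: str) -> bool:
    """True if the sentinel address is among the link's DNS servers."""
    return sentinel in parse_link_dns(output)


# Placeholder addresses only; substitute the sentinel the client configures.
print(dns_is_controlled("Link 3 (tun-firezone): 192.0.2.53", "192.0.2.53"))
print(dns_is_controlled("Link 3 (tun-firezone):", "192.0.2.53"))
```

Retrying a check like this until a deadline would show whether the setting arrives late or never arrives at all.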

2 participants