Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

podvm: trigger api-server-rest activation by file #1751

Merged

Conversation

mkulke
Copy link
Contributor

@mkulke mkulke commented Mar 18, 2024

fixes #1695

there will be a race condition when CDH, started by kata-agent doesn't listen on cdh.sock yet, but api-server-rest attempts to use the socket immediately.

This change adds adds a path unit that will activate api-server-rest once the socket is being created and remove it from the default target.

A sophisticated e2e test has scientifically proven that this works reliably:

$ for i in $(seq 1 7); do k apply -f nginx-caa.yaml; sleep 90; k exec -it deploy/nginx-caa -- curl http://127.0.0.1:8006/cdh/resource/one/two/key; k delete deploy/nginx-caa; done
deployment.apps/nginx-caa created
ohai
deployment.apps "nginx-caa" deleted
deployment.apps/nginx-caa created
ohai
deployment.apps "nginx-caa" deleted
deployment.apps/nginx-caa created
ohai
deployment.apps "nginx-caa" deleted
deployment.apps/nginx-caa created
ohai
deployment.apps "nginx-caa" deleted
deployment.apps/nginx-caa created
ohai
deployment.apps "nginx-caa" deleted
deployment.apps/nginx-caa created
ohai
deployment.apps "nginx-caa" deleted
deployment.apps/nginx-caa created
ohai
deployment.apps "nginx-caa" deleted

fixes confidential-containers#1695

there will be a race condition when CDH, started by kata-agent doesn't
listen on cdh.sock yet, but api-server-rest attempts to use the socket
immediately.

This change adds adds a path unit that will activate api-server-rest
once the socket is being created and remove it from the default target

Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com>
@mkulke mkulke added the test_e2e_libvirt Run Libvirt e2e tests label Mar 18, 2024
Copy link
Member

@stevenhorsman stevenhorsman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks okay to me (thanks for all the science in testing 😉 ), I'm not an systemd expert though!

Copy link
Member

@liudalibj liudalibj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@mkulke mkulke merged commit 3cd1720 into confidential-containers:main Mar 21, 2024
29 checks passed
@mkulke mkulke deleted the mkulke/fix-api-server-rest-race branch March 21, 2024 08:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
test_e2e_libvirt Run Libvirt e2e tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

api-server-rest gave up after five tries to connect to CDH
3 participants