Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

systemd: add packet-phone-home.service #107

Merged
merged 2 commits into from
Oct 4, 2023
Merged

Conversation

tormath1
Copy link
Contributor

this service is triggered only on Equinix Metal / Packet instances.


[Service]
EnvironmentFile=/run/metadata/flatcar
ExecStart=/usr/bin/curl --header "Content-Type: application/json" --request POST "${COREOS_PACKET_PHONE_HOME_URL}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ExecStart=/usr/bin/curl --header "Content-Type: application/json" --request POST "${COREOS_PACKET_PHONE_HOME_URL}"
ExecStart=/usr/bin/curl -fsSL --header "Content-Type: application/json" --request POST "${COREOS_PACKET_PHONE_HOME_URL}"

We should follow any redirection and also error on non-successful HTTP return codes

After=coreos-metadata.service

[Service]
EnvironmentFile=/run/metadata/flatcar
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
EnvironmentFile=/run/metadata/flatcar
Type=oneshot
RemainAfterExit=yes
EnvironmentFile=/run/metadata/flatcar

This should only run once

Copy link
Contributor Author

@tormath1 tormath1 Sep 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I think we should keep it as it was and let the service retries.

On QEMU, in the cloudinit* tests, we add in /etc a coreos-metadata.service which takes precedence over the one shipped in /usr. Since cloudinit runs "late", the first coreos-metadata.service required as a dependency of the packet-phone-home is started and this is one is failing because it actually wants to use afterburn -> the dependency fails -> packet-phone-home.service fails because of missing dependency (before evaluating the ConditionKernelCommandLine).

$ journalctl -u coreos-metadata.service
Sep 29 07:37:57 localhost systemd[1]: Starting coreos-metadata.service - Flatcar Metadata Agent...
Sep 29 07:37:57 localhost coreos-metadata[1266]: Error: failed to run
Sep 29 07:37:57 localhost coreos-metadata[1266]: Caused by:
Sep 29 07:37:57 localhost coreos-metadata[1266]:     0: fetching metadata from provider
Sep 29 07:37:57 localhost coreos-metadata[1266]:     1: unknown provider 'qemu'
Sep 29 07:37:57 localhost systemd[1]: coreos-metadata.service: Main process exited, code=exited, status=1/FAILURE
Sep 29 07:37:57 localhost systemd[1]: coreos-metadata.service: Failed with result 'exit-code'.
Sep 29 07:37:57 localhost systemd[1]: Failed to start coreos-metadata.service - Flatcar Metadata Agent.
Sep 29 07:37:58 core1 systemd[1]: coreos-metadata.service: Scheduled restart job, restart counter is at 1.
Sep 29 07:37:58 core1 systemd[1]: Stopped coreos-metadata.service - QEMU metadata agent.
Sep 29 07:37:58 core1 systemd[1]: Starting coreos-metadata.service - QEMU metadata agent...
Sep 29 07:37:58 core1 systemd[1]: coreos-metadata.service: Deactivated successfully.
Sep 29 07:37:58 core1 systemd[1]: Finished coreos-metadata.service - QEMU metadata agent.

We can't add a ConditionKernelCommandLine=!flatcar.oem.id=qemu on the coreos-metadata.service that we ship as this condition is not evaluated when the unit is a dependency of another one.

EDIT: Moving from Requires=coreos-metadata.service to Wants=coreos-metadata.service should be enough.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The service should still retry with RemainAfterExit=yes and Restart=on-failure, or?
I'm a confused, why is this related to qemu? It would be good to fix any dependency issues - Requires actually makes sense here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does not retry as the failure is caused by systemd itself ("dependency failure"). It's related to QEMU because qemu is not a supported Afterburn platform, so the FIRST run of coreos-metadata (the one from /usr) is failing.

Yes, packet-phone-home is not supposed to run on qemu, and this is the case, but the ConditionKernelCommandLine is evaluated only at the execution and not before when the service (and its dependencies) are queued.

The conditions and asserts are checked at the time the queued start job is to be executed. The ordering dependencies are still respected, so other units are still pulled in and ordered as if this unit was successfully activated, and the conditions and asserts are executed the precise moment the unit would normally start and thus can validate system state after the units ordered before completed initialization. Use condition expressions for skipping units that do not apply to the local system, for example because the kernel or runtime environment doesn't require their functionality.

https://www.freedesktop.org/software/systemd/man/systemd.unit.html#Conditions%20and%20Asserts

Requires make sense but we have EnvironmentFile that actually makes fail the unit + it ensures that the unit run if the file is available.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could still add conditions to coreos-metadata.service to prevent it from running. We know which platforms it supports and that would remove this unit failure from the logs.

@tormath1 tormath1 force-pushed the tormath1/em branch 2 times, most recently from 8021d7b to 0e6cd65 Compare September 29, 2023 08:55
this service is triggered only on Equinix Metal / Packet instances.

Signed-off-by: Mathieu Tortuyaux <mtortuyaux@microsoft.com>
it's enabled from the OEM initially.

Signed-off-by: Mathieu Tortuyaux <mtortuyaux@microsoft.com>
@tormath1 tormath1 marked this pull request as ready for review October 2, 2023 16:14
@tormath1 tormath1 requested a review from a team October 2, 2023 16:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants