Launch mountpoint in host systemd #29

Merged: 4 commits merged into main from jk-systemd-mp on Nov 7, 2023
Contributor

@jjkr commented Nov 1, 2023

This changes the mountpoint process to run within the host's systemd instead of within the CSI node container. This allows upgrades of the CSI driver without interrupting customer workloads.

To talk to systemd, the container mounts the host's D-Bus socket and uses the D-Bus API via the github.com/coreos/go-systemd/v22 library (Apache licensed).

pkg/driver/systemd.go
return l.ActiveState != r.ActiveState
}
filter := func(name string) bool { return !strings.Contains(name, serviceName) }
updates, errChan := systemdConn.SubscribeUnitsCustom(50*time.Millisecond, 10, isChanged, filter)
Contributor

So this spawns a goroutine with a seemingly endless loop: https://github.com/coreos/go-systemd/blob/main/dbus/subscription.go#L112

Have you checked whether those goroutines are terminated? Should we defer close(updates) / defer close(errChan) for that?

Contributor Author

Unfortunately this leaks a goroutine and the two channels, and I don't see a way around that. That goroutine writes to the channels, so I can't just defer closing them or we'll get a panic and crash. Let me know if I'm missing something. The workaround is probably to just call ListUnits in my own polling function.
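
To illustrate the panic concern raised above, here is a minimal standard-library sketch. It does not use go-systemd; `sendOnClosed` is a hypothetical helper, and the channel only stands in for `updates` or `errChan`.

```go
package main

// sendOnClosed reports whether sending on an already-closed channel panics.
// The channel is a stand-in for the subscription channels discussed above.
func sendOnClosed() (panicked bool) {
	ch := make(chan string, 1) // buffered, so the send would not block
	close(ch)                  // what a deferred close in the caller would do

	// recover works here only because the send happens in this goroutine.
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()

	ch <- "update" // what a still-running producer goroutine would do
	return false
}
```

In this sketch the panic is recovered because the send and the `recover` share a goroutine. When the send happens in a goroutine owned by a library, the caller cannot recover it, which matches the "panic and crash" outcome described in the comment.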

Contributor

Yeah, I don't see a better option either than implementing our own polling function with a proper termination interface.

Contributor

I'm OK with fixing it separately, as the bug is not confirmed (I can take that next week). For now, though, it seems that the SubscribeUnitsCustom goroutines will always be left waiting on the errChan <- err call (so at least consuming some memory, though probably not CPU cycles).
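
The blocked-send situation described above can be sketched with the standard library alone. This is an illustrative pattern, not go-systemd's code: `produce`, `done`, and `exited` are hypothetical names, and the point is only that a send guarded by a `select` on a done channel can exit even when nobody reads the error channel.

```go
package main

import "errors"

// produce mimics a background goroutine that reports an error on errCh.
// A bare `errCh <- err` on an unbuffered channel would block forever if the
// reader has gone away; selecting on done as well gives it a way out.
func produce(errCh chan<- error, done <-chan struct{}, exited chan<- struct{}) {
	defer close(exited) // signals that the goroutine has really finished
	select {
	case errCh <- errors.New("poll failed"):
		// a reader was present and received the error
	case <-done:
		// the consumer gave up; return instead of blocking on the send
	}
}
```

A caller would run `go produce(errCh, done, exited)`, then `close(done)` when it stops listening and optionally wait on `exited` to confirm the goroutine is gone.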

dlakhaws previously approved these changes Nov 6, 2023
@jjkr merged commit 465af01 into main on Nov 7, 2023
3 checks passed
@jjkr deleted the jk-systemd-mp branch on November 7, 2023 16:35