[BUG] Auto remount of a read-only volume does not work on a single-node cluster #7843
Comments
I can reproduce the issue when there is only one node, or when the pod keeps being recreated on the same node. This is a day-1 issue: we trust that when Kubernetes recreates the pod it will reattach the volume, but that is not true in this case.
Before volume remount feature
After volume remount feature (only on a single-node cluster, or a StatefulSet whose pod always runs on the same node)
Workaround
After investigation, the root cause is not what we originally thought.
Note: The root cause is
Workaround: scale the workload down to 0, wait for the pod to be terminated and cleaned up, then set the replicas back to 1.
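The workaround above can be sketched with kubectl. This is a hedged example: the workload name `my-statefulset`, the label selector, and the namespace are placeholders, not names from this issue; adjust them for the actual deployment.

```shell
# Scale the workload to 0 so the pod is terminated and the volume is
# fully detached and cleaned up on the node.
kubectl -n default scale statefulset my-statefulset --replicas=0

# Wait until the pod is actually gone before scaling back up.
kubectl -n default wait --for=delete pod -l app=my-statefulset --timeout=120s

# Recreate the pod; the volume is re-staged and re-mounted read-write.
kubectl -n default scale statefulset my-statefulset --replicas=1
```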
Pre Ready-For-Testing Checklist
I am a bit confused on a couple of points. Can you help clarify?
I understand that UnpublishVolume hasn't been called yet, but it seems to me the real problem is that UnstageVolume hasn't been called yet. (Of course, it can't be until after UnpublishVolume.)
I don't get the cause and effect here. It sounds like you are implying the following, but I don't see how the bind mount can affect the global mount.
Is the following flow accurate?
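For reference, the mount layering being discussed can be inspected directly on the node. This is a sketch with placeholder paths (the exact plugin directory and pod UID are assumptions, not taken from this issue's logs):

```shell
# Global (staging) mount, created by CSI NodeStageVolume -- one per node:
findmnt /var/lib/kubelet/plugins/kubernetes.io/csi/<driver>/<hash>/globalmount

# Per-pod bind mount, created by CSI NodePublishVolume on top of it:
findmnt /var/lib/kubelet/pods/<pod-uid>/volumes/kubernetes.io~csi/<pv-name>/mount

# Teardown must happen in reverse order: NodeUnpublishVolume removes the
# pod bind mount first, and only then can NodeUnstageVolume remove the
# global mount.
```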
Hi @ejweber
The logs match the steps above (there are two pods).
To see what went wrong exactly (`fsck`
Note: when doing
Summary: I believe the issue here is that the old pod's bind mount was not unmounted first. When the new pod was starting, since the global mount point was already there,
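The effect described in the summary can be demonstrated locally, outside Longhorn. This sketch requires root and uses a tmpfs to stand in for the volume; the `/tmp` paths are placeholders of my own, not paths from this issue:

```shell
# Simulate the stale global mount that went read-only.
mkdir -p /tmp/globalmount /tmp/podmount
mount -t tmpfs tmpfs /tmp/globalmount
mount -o remount,ro /tmp/globalmount         # filesystem flips to read-only

# A "new pod" bind-mounts the existing global mount instead of re-staging it.
mount --bind /tmp/globalmount /tmp/podmount

# Writes through the bind mount still fail: it shares the read-only state
# of the global mount, so the new pod sees a read-only volume until the
# global mount itself is remounted read-write.
touch /tmp/podmount/probe 2>/dev/null && echo writable || echo read-only

umount /tmp/podmount /tmp/globalmount
```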
Verified pass on longhorn master (longhorn-manager). Remounted the PVC read-only; the engine's FilesystemReadOnly condition transitions can be watched with:

```
> k get engine -A -o yaml -w | grep FilesystemReadOnly -B 1
    status: "False"
    type: FilesystemReadOnly
--
    status: "True"
    type: FilesystemReadOnly
--
    status: "False"
    type: FilesystemReadOnly
```
Describe the bug
After #6386 was implemented, Longhorn automatically remounts a read-only RWO volume as read-write. This worked well on a multi-node cluster: the volume was remounted within a few seconds and data could be written again.
But on a single-node cluster it does not seem to work properly: after the volume turned read-only, the volume still could not be written 20 minutes later.
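One quick way to confirm that state from the node is to look at the mount options. The helper below is a small sketch of my own (the function name and the example path are hypothetical, not part of Longhorn): it checks whether a comma-separated options string, as printed by `findmnt -no OPTIONS <path>`, contains a standalone `ro` flag.

```shell
# has_ro_flag: succeed if a mount-options string contains the exact
# flag "ro" (note: a substring match would false-positive on options
# like "errors=remount-ro", so we split on commas first).
has_ro_flag() {
  printf '%s\n' "$1" | tr ',' '\n' | grep -qx ro
}

# Example against a live mount (path is a placeholder):
#   has_ro_flag "$(findmnt -no OPTIONS /var/lib/kubelet/.../mount)" \
#     && echo "still read-only"
has_ro_flag "ro,relatime" && echo "read-only"
```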
To Reproduce

```
curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=v1.28.5+k3s1 sh -s - --write-kubeconfig-mode 644
```

```
find /var/lib -name ${PVC name}
```

(append `/mount` at the end of the command)

Expected behavior
Auto remount of a read-only volume works on a single-node cluster.
Support bundle for troubleshooting
supportbundle_b3abd9ae-8fc9-44f4-ad95-2deceffe927b_2024-02-05T07-26-29Z.zip
Environment
Additional context
N/A