volumemanager cannot restore a reconstructed volume's consistency when adding the PVC volume to dsw is delayed by apiserver access instability #103143

Closed
249043822 opened this issue Jun 24, 2021 · 9 comments · Fixed by #103181 or #124242
Labels
kind/bug Categorizes issue or PR as related to a bug. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/storage Categorizes an issue or PR as relevant to SIG Storage. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@249043822
Member

What happened:

We have encountered a case where a pod continuously reports “Unable to attach or mount volumes for pod;...” after kubelet restarts. The cause is detailed below:

  1. In dswp.populatorLoop, because of momentary apiserver instability, dswp fails to get the PVC's PV spec, so the PVC volume is not added to dsw.
    dswp.getPVSpec(pvName, pvcSource.ReadOnly, pvcUID)
  2. In rc.reconciliationLoopFunc() -> rc.sync(), syncStates() lists all local volume directories and reconstructs volumes from them. If a volume isn't in dsw, the reconstructed volume is put into asw, but at this point its OuterVolumeSpecName is incorrect.
    outerVolumeSpecName: volume.volumeSpecName,
  3. By chance, at this point dswp.populatorLoop succeeds in adding the PVC volume that previously failed to dsw.
  4. Now, in reconcile() -> mountAttachVolumes(), dsw and asw are compared by the volume's inner volume name, which is identical in both. The OuterVolumeSpecName mismatch goes undetected, so the volume is not re-mounted and the asw cache is not updated.
    volMounted, devicePath, err := rc.actualStateOfWorld.PodExistsInVolume(volumeToMount.PodName, volumeToMount.VolumeName)
  5. In syncPod, whether the pod's volumes were attached and mounted successfully is checked by comparing volume.OuterVolumeSpecName with pod.spec.volumeName, but this PVC volume's OuterVolumeSpecName is incorrect in asw, so the pod continuously reports “Unable to attach or mount volumes for pod;...”
    mountedVolumes.Insert(mountedVolume.OuterVolumeSpecName)
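
The name mismatch from steps 2 and 5 can be illustrated with a small, self-contained Go sketch. It is not kubelet code: the `mountedVolume` type, the `reconstructFromDisk` and `unmountedVolumes` helpers, and the names `pvc-1234` and `data` are all hypothetical and only model the behavior described above.

```go
package main

import "fmt"

// mountedVolume is a hypothetical, simplified stand-in for an entry in the
// actual state of world (asw). It is not the kubelet's real type.
type mountedVolume struct {
	InnerVolumeSpecName string // name of the spec backing the volume (for a PVC, the PV)
	OuterVolumeSpecName string // name the pod uses for the volume in its spec
}

// reconstructFromDisk models step 2: only the on-disk volume spec name is
// known, so the outer name is filled in with the inner one.
func reconstructFromDisk(volumeSpecName string) mountedVolume {
	return mountedVolume{
		InnerVolumeSpecName: volumeSpecName,
		OuterVolumeSpecName: volumeSpecName, // wrong for PVC-backed volumes
	}
}

// unmountedVolumes models the check in step 5: it returns the volume names the
// pod expects that do not appear among the mounted outer names.
func unmountedVolumes(mounted []mountedVolume, expected []string) []string {
	seen := make(map[string]bool, len(mounted))
	for _, mv := range mounted {
		seen[mv.OuterVolumeSpecName] = true
	}
	var missing []string
	for _, name := range expected {
		if !seen[name] {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	// The pod names its volume "data"; the PV behind the claim is "pvc-1234".
	asw := []mountedVolume{reconstructFromDisk("pvc-1234")}
	fmt.Println(unmountedVolumes(asw, []string{"data"})) // prints [data]
}
```

Although the volume is mounted, the pod-level check still treats `data` as unmounted, which corresponds to the repeated “Unable to attach or mount volumes for pod;...” report.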

What you expected to happen:

reconcile() should detect that the reconstructed volume does not match dsw, then re-mount it so that the volume cache becomes consistent.
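
As a rough illustration of this expectation, the Go sketch below shows a reconcile-style check that also compares the outer spec name. It is only one possible shape under stated assumptions, not the change made in the linked fixes: `desiredVolume`, `actualVolume`, and `needsRemount` are hypothetical names, and asw is reduced to a map keyed by volume name.

```go
package main

import "fmt"

// desiredVolume and actualVolume are hypothetical, simplified views of a
// dsw entry and an asw entry. They are not the kubelet's real types.
type desiredVolume struct {
	VolumeName          string // unique volume name used to match dsw against asw
	OuterVolumeSpecName string // name the pod uses for the volume
}

type actualVolume struct {
	VolumeName          string
	OuterVolumeSpecName string
}

// needsRemount reports whether the desired volume is absent from asw, or is
// present under the same volume name but with a different outer spec name.
func needsRemount(desired desiredVolume, asw map[string]actualVolume) bool {
	actual, found := asw[desired.VolumeName]
	if !found {
		return true // never mounted
	}
	// Present but possibly reconstructed with the wrong outer name.
	return actual.OuterVolumeSpecName != desired.OuterVolumeSpecName
}

func main() {
	asw := map[string]actualVolume{
		"vol-1": {VolumeName: "vol-1", OuterVolumeSpecName: "pvc-1234"},
	}
	dswEntry := desiredVolume{VolumeName: "vol-1", OuterVolumeSpecName: "data"}
	fmt.Println(needsRemount(dswEntry, asw)) // prints true
}
```

With a check like this, a match on the volume name alone is no longer enough to skip the mount, so the stale asw entry would be refreshed on the next reconcile pass.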

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version): 1.19.4
  • Cloud provider or hardware configuration:
  • OS (e.g: cat /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Network plugin and version (if this is a network-related bug):
  • Others:
@249043822 249043822 added the kind/bug Categorizes issue or PR as related to a bug. label Jun 24, 2021
@k8s-ci-robot k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jun 24, 2021
@249043822
Member Author

/sig node

@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jun 24, 2021
@249043822
Member Author

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jun 24, 2021
@249043822
Member Author

/assign @gnufied @jingxu97

@249043822
Member Author

CC @sanwishe @yangjunmyfm192085

@yangjunmyfm192085
Contributor

yangjunmyfm192085 commented Jun 24, 2021

I think this is a point that can be optimized. /cc @ehashman @dchen1107

@249043822
Member Author

I will submit a fix for this.

@jingxu97
Contributor

@249043822 Thanks for reporting the issue. The innerSpecName and outerSpecName are confusing and cause issues. Please ping me for review when you have a fix. Thanks!

@ehashman
Member

Please don't triage your own issues immediately, give someone else a chance to review them :(

/sig storage

@k8s-ci-robot k8s-ci-robot added the sig/storage Categorizes an issue or PR as relevant to SIG Storage. label Jun 25, 2021
@249043822
Member Author

> Please don't triage your own issues immediately, give someone else a chance to review them :(
>
> /sig storage

@ehashman okay, I know
