
unused cleans up disks that are not mounted right now, but are still assigned to a PV in k8s #82

Open
fedordikarev opened this issue Jan 10, 2024 · 2 comments
Labels
bug Something isn't working

Comments

@fedordikarev
Contributor

I'm not sure if this is easy to do, or even possible with the current architecture of the tool, but we had some incidents due to the following behaviour:

  1. A k8s StatefulSet was scaled down.
  2. Because of that, the underlying disks were unmounted, while the PVs referring to them stayed in place.
  3. An engineer from the team ran the unused tool and removed the disks that were currently unmounted.
  4. After the scale-up event the pods couldn't start, because the disks referenced by the existing PVs no longer existed.

This may become less of an issue after k8s 1.27 and PersistentVolumeClaim retention, but should we add some extra checks (in the tool, or maybe external) to verify whether any PV refers to a disk before deleting it?
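
As an illustration of the kind of external check I mean, here is a rough sketch with client-go (not something unused does today; it assumes GKE's PD CSI driver `pd.csi.storage.gke.io` and the in-tree GCE PD volume source) that builds the set of disk names still referenced by PVs:

```go
package main

import (
	"context"
	"fmt"
	"strings"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a client from the local kubeconfig.
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	cs, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	// Collect the GCP disk names referenced by any PersistentVolume,
	// whether or not the disk happens to be mounted right now.
	pvs, err := cs.CoreV1().PersistentVolumes().List(context.Background(), metav1.ListOptions{})
	if err != nil {
		panic(err)
	}
	referenced := map[string]bool{}
	for _, pv := range pvs.Items {
		switch {
		case pv.Spec.GCEPersistentDisk != nil:
			// Legacy in-tree volumes carry the disk name directly.
			referenced[pv.Spec.GCEPersistentDisk.PDName] = true
		case pv.Spec.CSI != nil && pv.Spec.CSI.Driver == "pd.csi.storage.gke.io":
			// CSI volume handles look like projects/<p>/zones/<z>/disks/<name>.
			parts := strings.Split(pv.Spec.CSI.VolumeHandle, "/")
			referenced[parts[len(parts)-1]] = true
		}
	}

	for name := range referenced {
		fmt.Println(name)
	}
}
```

The output could then be diffed against whatever unused is about to delete before anyone confirms the deletion.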

fedordikarev added the bug label on Jan 10, 2024
@inkel
Collaborator

inkel commented Jan 10, 2024

🤔 With the current architecture I don't think it's possible. While the tool has some support for Kubernetes, it is only through metadata; the tool doesn't really know anything about Kubernetes itself. You can think of it as a mostly generic kind of tool.

In order to know whether a disk is bound by a PV we would need to not only add a dependency on the Kubernetes packages, but also provide the user with a way to indicate which disk was created by which Kubernetes cluster (e.g. you could have multiple GKE clusters within the same GCP project, each cluster managing its own PVs). This would add a lot of complexity to the code and open the door to more bugs.

I don't think this is a bug, and I don't think it is fixable. The tool is clearly meant for doing cleanup, but as always this needs to be done in a conscious manner. While I understand the pain that deleting disks with the tool caused people, I don't think it's a problem with the tool.

I vote to close this one with a wont-fix label, but I'm open to being convinced otherwise.

@fedordikarev
Contributor Author

As an idea, we could add a feature to support a couple of cases:

  1. The use case where a user has a number of disks that are not mounted but are needed from time to time: it could be a base image that is cloned when needed, or data that is only used occasionally, for example for quarterly or annual operations.
  2. The case where "usefulness" data can be fetched from some external system, for example from k8s.

To address that, we could add a flag --exclude-list filename with a list of disk IDs (one per line) that should be excluded from the listing entirely (or maybe shown in the main list with an extra note and excluded from the selection during the delete process).
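
A rough sketch of how the flag could consume such a file (names like loadExcludeList are hypothetical, not existing code in the tool; flag parsing is omitted):

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// loadExcludeList reads one disk ID per line, skipping blank lines
// and lines starting with '#'.
func loadExcludeList(path string) (map[string]bool, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	excluded := make(map[string]bool)
	s := bufio.NewScanner(f)
	for s.Scan() {
		line := strings.TrimSpace(s.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		excluded[line] = true
	}
	return excluded, s.Err()
}

func main() {
	// --exclude-list exclude.txt would point at the file read here.
	excluded, err := loadExcludeList("exclude.txt")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}

	// Candidates would normally come from the provider listing;
	// hard-coded here only to show the filtering.
	candidates := []string{"pvc-1234-disk", "scratch-disk"}
	for _, d := range candidates {
		if excluded[d] {
			continue // never offer excluded disks for deletion
		}
		fmt.Println("unused disk:", d)
	}
}
```

The exclude file itself could be maintained by hand for case 1, or generated by an external script (for example the PV check above) for case 2.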
