boskos: static resources removed from the configuration may never be deleted #17282
Labels
area/boskos
Issues or PRs related to code in /boskos
kind/bug
Categorizes issue or PR as related to a bug.
I mentioned this tangentially in #16047 (comment), but I want to pull it out into a separate issue to highlight it more easily.
Boskos doesn't delete static resources that are removed from the configuration if they are in use, to ensure that jobs don't fail, and to ensure that such resources are properly cleaned up by the janitor.
Originally, this was a reasonable decision, since Boskos periodically synced its storage against the configuration, and most likely such resources would eventually be free and thus deleted from storage.
After #13990, Boskos only syncs its storage against the configuration when the configuration changes (or when Boskos restarts). As a result, a static resource that was in use at the time of the sync may take a long time to be deleted, if it is ever deleted at all.
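To make the failure mode concrete, here is a minimal sketch of the behavior described above. It is not the actual Boskos code: the `Resource` type, the `syncStatic` function, and the state strings are simplified, hypothetical stand-ins.

```go
package main

import (
	"fmt"
	"sort"
)

// Resource is a hypothetical, simplified stand-in for a Boskos resource.
type Resource struct {
	Name  string
	State string // illustrative values: "free" or "busy"
}

// syncStatic removes stored resources that are absent from the config,
// but only if they are free. Resources still in use are kept in storage
// and returned as leftovers.
func syncStatic(storage map[string]Resource, config map[string]bool) []string {
	var leftover []string
	for name, r := range storage {
		if config[name] {
			continue // still configured, nothing to do
		}
		if r.State == "free" {
			delete(storage, name) // safe to remove right away
		} else {
			leftover = append(leftover, name) // in use: skipped
		}
	}
	sort.Strings(leftover)
	return leftover
}

func main() {
	storage := map[string]Resource{
		"project-a": {Name: "project-a", State: "free"},
		"project-b": {Name: "project-b", State: "free"},
		"project-c": {Name: "project-c", State: "busy"},
	}
	// project-b and project-c were removed from the configuration.
	config := map[string]bool{"project-a": true}

	// The sync runs once, when the configuration changes.
	fmt.Println("left behind:", syncStatic(storage, config))
	// If no further sync is triggered, project-c stays in storage even
	// after the job using it releases it.
}
```

With a periodic sync, a later pass would pick up `project-c` once it became free. With a sync that only runs on configuration changes, nothing revisits it.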
There was a similar issue for DRLCs that I addressed in #16021, effectively by putting the DRLCs into lame-duck mode.
There isn't a clear way to indicate that static resources are in lame-duck mode, though.
Possible ways to address this bug, in increasing order of complexity:
a. Add a field into the `UserData` for static resources. (Currently `UserData` is not used for static resources.)
b. Set an `ExpirationDate` on static resources. (Currently `ExpirationDate` is not used for static resources.)
c. Add a new field on the `ResourceStatus` indicating resources are in lame-duck mode.

Workaround until this bug is fixed: admins with access to the cluster where Boskos is running can just delete the resources manually using `kubectl`.

/area boskos