Reserve overhead when validating that a Filesystem has enough space #1319
Conversation
Missing doc, unit tests, functional tests.
/retest
pkg/controller/config-controller.go
Outdated
@@ -211,6 +216,62 @@ func (r *CDIConfigReconciler) reconcileDefaultPodResourceRequirements(config *cdi
	return nil
}

func (r *CDIConfigReconciler) reconcileFilesystemOverhead(config *cdiv1.CDIConfig) error {
	var globalOverhead cdiv1.Percent = "0.055"
Can you make "0.055" a const named defaultGlobalOverhead or something?
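Something along those lines (a sketch of the suggestion, not the merged code):

// defaultGlobalOverhead is the fraction of a Filesystem volume
// reserved for overhead unless the CDIConfig overrides it.
const defaultGlobalOverhead cdiv1.Percent = "0.055"

func (r *CDIConfigReconciler) reconcileFilesystemOverhead(config *cdiv1.CDIConfig) error {
	globalOverhead := defaultGlobalOverhead
	// ... rest of the reconcile logic unchanged ...
	return nil
}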
pkg/controller/import-controller.go
Outdated
@@ -412,6 +413,10 @@ func (r *ImportReconciler) createImportEnvVar(pvc *corev1.PersistentVolumeClaim)
	if err != nil {
		return nil, err
	}
	podEnvVar.filesystemOverhead, err = GetFilesystemOverhead(r.uncachedClient, pvc)
Any reason you are using the uncachedClient instead of the client here?
pkg/image/qemu_test.go
Outdated
@@ -226,7 +226,7 @@ var _ = Describe("Validate", func() {
	table.Entry("should return error on bad json", mockExecFunction(badValidateJSON, "", expectedLimits), "unexpected end of JSON input", imageName),
	table.Entry("should return error on bad format", mockExecFunction(badFormatValidateJSON, "", expectedLimits), fmt.Sprintf("Invalid format raw2 for image %s", imageName), imageName),
	table.Entry("should return error on invalid backing file", mockExecFunction(backingFileValidateJSON, "", expectedLimits), fmt.Sprintf("Image %s is invalid because it has backing file backing-file.qcow2", imageName), imageName),
-	table.Entry("should return error when PVC is too small", mockExecFunction(hugeValidateJSON, "", expectedLimits), fmt.Sprintf("Virtual image size %d is larger than available size %d. A larger PVC is required.", 52949672960, 42949672960), imageName),
+	table.Entry("should return error when PVC is too small", mockExecFunction(hugeValidateJSON, "", expectedLimits), fmt.Sprintf("Virtual image size %d is larger than available size %d (PVC size %d, reserved overhead %f%%). A larger PVC is required.", 52949672960, 42949672960, 52949672960, 0.0), imageName),
Can we add a non 0 overhead test as well?
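For illustration, assuming the validator derives the available size as PVC size × (1 − overhead), a non-zero entry could look like this (the numbers are hypothetical: 40587440947 is 42949672960 × 0.945, truncated; the fixtures are the ones already used above):

table.Entry("should return error when PVC is too small with non-zero overhead", mockExecFunction(hugeValidateJSON, "", expectedLimits), fmt.Sprintf("Virtual image size %d is larger than available size %d (PVC size %d, reserved overhead %f%%). A larger PVC is required.", 52949672960, 40587440947, 42949672960, 0.055), imageName),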
Type: "object", | ||
Properties: map[string]extv1beta1.JSONSchemaProps{ | ||
"global": { | ||
Description: "How much space of a Filesystem volume should be reserved for safety. This value is the global one to be used unless a per-storageClass value is chosen.", |
You can add "Pattern" here with a regular expression; this will stop invalid values from reaching our controllers.
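For example (a sketch reusing the validation pattern that the Percent type in this PR declares):

"global": {
	Description: "How much space of a Filesystem volume should be reserved for safety. This value is the global one to be used unless a per-storageClass value is chosen.",
	Type:        "string",
	Pattern:     `^(0\.[0-9]+|0)$`,
},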
Description: "How much space of a Filesystem volume should be reserved for safety. This value is the global one to be used unless a per-storageClass value is chosen.", | ||
Type: "string", | ||
}, | ||
"storageClass": { |
We need to specify more here about what goes in; in particular we want the pattern on the value as well. It should look something like this:

"storageClass": {
	Description: "Some description explaining what the storage class map does, same as in types.go",
	Type:        "array",
	Items: &extv1beta1.JSONSchemaPropsOrArray{
		Schema: &extv1beta1.JSONSchemaProps{
			Type: "object",
			Properties: map[string]extv1beta1.JSONSchemaProps{
				"storageClass": {
					Description: "Name of the storage class",
					Type:        "string",
				},
				"overhead": {
					Description: "The overhead as a decimal",
					Type:        "string",
					Pattern:     "regex for pattern",
				},
			},
		},
	},
},

Not entirely sure this will work since you used a map, but I think the validation is useful.
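Since the field is a map from storage class name to overhead rather than an array, the validation would presumably hang off AdditionalProperties instead of Items; a sketch:

"storageClass": {
	Description: "Per-storageClass overhead, keyed by storage class name",
	Type:        "object",
	AdditionalProperties: &extv1beta1.JSONSchemaPropsOrBool{
		Schema: &extv1beta1.JSONSchemaProps{
			Description: "The overhead as a decimal between 0 and 1",
			Type:        "string",
			Pattern:     `^(0\.[0-9]+|0)$`,
		},
	},
},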
// XXX: validation doesn't actually seem to do anything.
// Actually relies on the reconcile not to use bogus values.
// +kubebuilder:validation:Pattern=`^(0\.[0-9]+|0)$`
type Percent string
Why is a string better than using a float?
It may be nice to have a NewPercent function that validates and returns a Percent.
An actual float is problematic for portability reasons, and so the tooling discourages it.
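A NewPercent constructor along the lines suggested above could look like this (a sketch; no such helper exists in the PR):

import (
	"fmt"
	"regexp"
)

// percentPattern mirrors the kubebuilder validation pattern on Percent.
var percentPattern = regexp.MustCompile(`^(0\.[0-9]+|0)$`)

// NewPercent validates s and returns it as a Percent, keeping the
// string representation for portability.
func NewPercent(s string) (Percent, error) {
	if !percentPattern.MatchString(s) {
		return "", fmt.Errorf("invalid percent value %q", s)
	}
	return Percent(s), nil
}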
storageClassNameOverhead, found := perStorageConfig[storageClassName]

if found {
	valid, err := validOverhead(storageClassNameOverhead)
can validation be enforced by CRD schema?
There is a regexp on the field now, to enforce 0 <= overhead <= 1
btw: I only managed to add it to the global overhead one, not to the map[storageclass]overhead.
Force-pushed d53bec6 to 9b34faa
Force-pushed 9b34faa to 3f1d0ee
Force-pushed 3f1d0ee to 85f4101
I think I'm going to have to add validation to a lot of other data sources, but that's probably a separate PR.
The tests are fine if run independently, I just introduced some flakiness.
Force-pushed fd76730 to cb60be7
The NFS tests are still flaky when run locally. Simply adding a wait between the tests helps, but I don't know what else I could be doing to un-flake them.
Force-pushed cb60be7 to 9acaa33
Force-pushed 9acaa33 to 1c48937
The amount of available space in a filesystem is not exactly the advertised amount. Things like indirect blocks or metadata may use up some of this space. Reserve some of it by default to avoid reaching full capacity.

This value is configurable from the CDIConfig object spec, both globally and per-storageClass. The default value is 0.055, or "5.5% of the space is reserved". This value was chosen because some filesystems reserve 5% of the space as overhead for the root user, and this reservation doubles as a safety margin against worst-case space usage that is hard to account for; I've chosen a value that is slightly higher.

This validation is only necessary because we use sparse images instead of fallocated ones, which was done to have reasonable alerts regarding space usage from various storage providers.

---

Update the CDIConfig filesystemOverhead status, validate it, and pass the final value to importer/upload pods. Only the status values controlled by the config controller are used, and the status is filled out for all available storage classes in the cluster.

Use this value in Validate calls to ensure that some of the space is reserved for the filesystem overhead, guarding against accidents.

Caveats: Doesn't use Default: to define the default of 0.055; instead it is hard-coded in reconcile. It seems like we can't use a default value. Validates the per-storageClass values in reconcile, and doesn't reject bad values.

Signed-off-by: Maya Rashish <mrashish@redhat.com>
Force-pushed 1c48937 to e477664
@maya-r: The following test failed, say /retest to rerun all failed tests:

/test pull-containerized-data-importer-e2e-k8s-1.17-ceph

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
Looking pretty good, a few comments and one request. I would like to see a functional test in cdiconfig_test.go that verifies that the update fails if you add an invalid value for the percent field: things like -0.04 or 1.00, which don't match the regex.
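Such a test might be sketched like this (assuming the functional-test framework exposes a CDI client as f.CdiClient and that the spec field is FilesystemOverhead with a Global sub-field; the merged test may differ):

It("should reject invalid global filesystem overhead values", func() {
	for _, bad := range []string{"-0.04", "1.00"} {
		config, err := f.CdiClient.CdiV1beta1().CDIConfigs().Get(context.TODO(), "config", metav1.GetOptions{})
		Expect(err).ToNot(HaveOccurred())
		config.Spec.FilesystemOverhead = &cdiv1.FilesystemOverhead{Global: cdiv1.Percent(bad)}
		// The CRD pattern should reject the update at the API server.
		_, err = f.CdiClient.CdiV1beta1().CDIConfigs().Update(context.TODO(), config, metav1.UpdateOptions{})
		Expect(err).To(HaveOccurred(), "expected %q to be rejected", bad)
	}
})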
if err := utils.DeletePV(client, pv); err != nil {
	return err
}
if err := utils.WaitTimeoutForPVDeleted(client, pv, timeout); err != nil {
Optimization here. Right now you are doing the following:
1. Delete PV
2. Wait for the PV to actually be gone
3. Go to 1

So you are sequentially waiting for each PV to be deleted before starting a new delete. How about you do:

for i := 1; i <= pvCount; i++ {
	pv := nfsPVDef(strconv.Itoa(i), utils.NfsService.Spec.ClusterIP)
	if err := utils.DeletePV(client, pv); err != nil {
		return err
	}
}
// kicked off deletion of all PVs I care about
for i := 1; i <= pvCount; i++ {
	// re-create the def so pv is in scope in this loop, too
	pv := nfsPVDef(strconv.Itoa(i), utils.NfsService.Spec.ClusterIP)
	if err := utils.WaitTimeoutForPVDeleted(client, pv, timeout); err != nil {
		return err
	}
}

In this order you start the deletion of all the PVs before waiting for any of them to really be deleted. That way the deletions run in parallel while the waiting is sequential, and hopefully each wait is shorter because by the time you get to the later PVs, they will already be gone.
Good idea. done
So the way it is now, it looks like you are waiting for just the last PV (since the PV variable is set to the last at the end of the first loop). But there is no guarantee in the order in which the PVs are deleted, it is possible at this point that some of the earlier ones aren't gone yet. I would either just re-create the pv defs in the second loop, or store them in a slice and loop over the slice in the second loop to ensure all the PVs are gone at the end.
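Following that suggestion, the slice variant might look like this (a sketch only; nfsPVDef returning *corev1.PersistentVolume is assumed from the surrounding test code):

// Kick off deletion of every PV first, remembering the defs,
// then wait for each one to actually be gone.
pvs := make([]*corev1.PersistentVolume, 0, pvCount)
for i := 1; i <= pvCount; i++ {
	pv := nfsPVDef(strconv.Itoa(i), utils.NfsService.Spec.ClusterIP)
	if err := utils.DeletePV(client, pv); err != nil {
		return err
	}
	pvs = append(pvs, pv)
}
for _, pv := range pvs {
	if err := utils.WaitTimeoutForPVDeleted(client, pv, timeout); err != nil {
		return err
	}
}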
pkg/controller/import-controller.go
Outdated
@@ -440,6 +442,11 @@ func (r *ImportReconciler) createImportEnvVar(pvc *corev1.PersistentVolumeClaim)
	if err != nil {
		return nil, err
	}
	fsOverhead, err := GetFilesystemOverhead(r.uncachedClient, pvc)
Any particular reason to use the uncached client here? I see the cached client being used in the uploadServer.
Nope, missed it. thanks
Intended to help with flakes, but didn't make a difference. Probably still worth doing.

Signed-off-by: Maya Rashish <mrashish@redhat.com>
Signed-off-by: Maya Rashish <mrashish@redhat.com>
Note that this change isn't expected to make a difference, as we check if the targetStorageClass is nil later on and have the same behaviour, but this is probably more correct API usage.

Signed-off-by: Maya Rashish <mrashish@redhat.com>
Signed-off-by: Maya Rashish <mrashish@redhat.com>
Force-pushed e477664 to cec1791
I think there is a small logic error in the nfs deletion logic. Otherwise I am happy.
Wait for all of them, not just the last one.

Signed-off-by: Maya Rashish <mrashish@redhat.com>
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: awels

The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
When validating disk space, reserve space for filesystem overhead
The amount of available space in a filesystem is not exactly
the advertised amount. Things like indirect blocks or metadata
may use up some of this space. Reserve some of it by default
to avoid reaching full capacity.
This value is configurable from the CDIConfig object spec,
both globally and per-storageclass.
The default value is 0.055, or "5.5% of the space is
reserved". This value was chosen because some filesystems
reserve 5% of the space as overhead for the root user, and
this reservation doubles as a safety margin against
worst-case space usage that is hard to account for; I've
chosen a value that is slightly higher.
This validation is only necessary because we use sparse
images instead of fallocated ones, which was done to have
reasonable alerts regarding space usage from various
storage providers.
Update the CDIConfig filesystemOverhead status, validate it,
and pass the final value to importer/upload pods.
Only the status values controlled by the config controller
are used, and the status is filled out for all available
storage classes in the cluster.
Use this value in Validate calls to ensure that some of the
space is reserved for the filesystem overhead, guarding
against accidents.
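Roughly, the reservation works like this (a sketch of the idea, with a hypothetical function name, not the exact merged code):

// usableSpace returns how much of the available filesystem space the
// importer may fill once the configured overhead fraction is reserved.
func usableSpace(filesystemOverhead float64, availableSize int64) int64 {
	blockSize := int64(512)
	spaceWithOverhead := int64((1 - filesystemOverhead) * float64(availableSize))
	// Round down to a 512-byte block boundary to match image size accounting.
	return spaceWithOverhead - (spaceWithOverhead % blockSize)
}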
Caveats:
Doesn't use Default: to define the default of 0.055; instead
it is hard-coded in reconcile. It seems like we can't use a
default value.
Release note:
When validating disk space, reserve space for filesystem overhead