
Do not factor fs overhead into available space during validation #2195

Merged

4 commits merged into kubevirt:main on Mar 22, 2022

Conversation

@brybacki (Contributor) commented Mar 21, 2022

What this PR does / why we need it:

This is the continuation of #2193

When validating whether an image will fit into a PV we compare the
image's virtual size to the filesystem's reported available space to
gauge whether it will fit. The current calculation reduces the apparent
available space by the configured filesystem overhead value, but the
overhead is already (mostly) factored into the result of Statfs. This
causes the check to fail for PVCs that are just large enough to
accommodate an image plus overhead (i.e. when using the DataVolume
Storage API with filesystem PVs whose capacity is constrained by the PVC
storage request size).

This was not caught in testing because HPP does not have capacity
constrained PVs and we are typically testing block volumes in the ceph
lanes. It can be triggered in our CI by allocating a Filesystem PV on
ceph-rbd storage because these volumes are capacity constrained and
subject to filesystem overhead.
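The double-counting described above can be sketched as follows. This is a minimal illustration, not the actual CDI code: the function names, sizes, and the 5.5% overhead value are hypothetical.

```go
package main

import "fmt"

// fitsWithDoubleOverhead models the pre-fix check: it reduces the space
// reported by Statfs by a configured overhead fraction, even though mkfs
// overhead is already reflected in Statfs' available-blocks count.
// (Illustrative sketch; not the actual CDI function.)
func fitsWithDoubleOverhead(virtualSize, availableBytes int64, overhead float64) bool {
	usable := int64(float64(availableBytes) * (1 - overhead))
	return virtualSize <= usable
}

// fits models the post-fix check: the Statfs result is trusted directly.
func fits(virtualSize, availableBytes int64) bool {
	return virtualSize <= availableBytes
}

func main() {
	// A PVC sized "just big enough": the image fits in what Statfs reports,
	// but subtracting overhead a second time makes it appear not to.
	virtualSize := int64(950_000_000)
	available := int64(960_000_000) // Statfs result, overhead already deducted
	fmt.Println(fitsWithDoubleOverhead(virtualSize, available, 0.055)) // false
	fmt.Println(fits(virtualSize, available))                          // true
}
```

With these illustrative numbers the old check rejects an image that actually fits, which is exactly the failure mode seen on capacity-constrained Filesystem PVs.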

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #

bz#2064936.

Special notes for your reviewer:

This is the continuation of #2193

Trying to clean up, remove unwanted changes, and verify tests.

Release note:

Do not factor fs overhead into available space during validation

@kubevirt-bot added the release-note, dco-signoff: yes, and size/M labels Mar 21, 2022
@maya-r (Contributor) commented Mar 21, 2022

Inlining my rationale for closing #2194, the alternate proposal to use statfs Bfree (and still account for FS overhead):

Most users who want just barely big enough DVs are using the storage API (it's mostly vm-import-operator), which already pads their size requests to be big enough.
Let's avoid trying to coordinate two different checks of FS overhead, and limit ourselves to storage profiles creating a bigger PVC to account for the FS overhead.
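For context on why Statfs already reflects filesystem overhead: on Linux the `Bavail` field counts blocks still available to unprivileged users, with mkfs overhead (inode tables, journal, reserved blocks) already excluded. A minimal Linux-only sketch using Go's standard `syscall` package; the helper name and path are illustrative, not CDI's actual code:

```go
package main

import (
	"fmt"
	"syscall"
)

// availableBytes returns the space an unprivileged user can still write to
// the filesystem containing path. Linux-specific; illustrative helper name.
func availableBytes(path string) (int64, error) {
	var st syscall.Statfs_t
	if err := syscall.Statfs(path, &st); err != nil {
		return 0, err
	}
	// Bavail already excludes filesystem overhead, which is why subtracting
	// a configured overhead fraction on top of it undercounts free space.
	return int64(st.Bavail) * st.Bsize, nil
}

func main() {
	n, err := availableBytes("/tmp")
	if err != nil {
		panic(err)
	}
	fmt.Println(n >= 0)
}
```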

@brybacki (Contributor, Author)

This message needs to be fixed:

Tests Suite: [rfe_id:138][crit:high][vendor:cnv-qe@redhat.com][level:component]Upload tests [posneg:negative][test_id:2330]Verify failure on sync upload if virtual size > pvc size fail given a large virtual size RAW XZ file

The new test needs an image in the images directory.

@brybacki (Contributor, Author)

Other tests still need investigation.

@brybacki (Contributor, Author)

Test 2329 is also about a different warning/error message.

This image size and filesystem overhead combination was experimentally determined
to reproduce bz#2064936 in CI when using ceph/rbd with a Filesystem mode PV since
the filesystem capacity will be constrained by the PVC request size.

Below is the problem it tries to recreate: the available-space validation
double-counts filesystem overhead that Statfs already reports (full
description in the PR body above).

Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
Corrects the validation logic for the target volume.

Below is a description of the original problem: the available-space
validation double-counts filesystem overhead that Statfs already reports
(full description in the PR body above).

Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
Removed the redundant and misleading part about PVC size and updated the simplification.

Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
The test checks that the validation logic takes fs overhead into account.
The new validation logic does not check fs overhead, so the test is no
longer relevant.

Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
@awels (Member) commented Mar 22, 2022

/retest

@awels (Member) commented Mar 22, 2022

/lgtm
/approve

@kubevirt-bot added the lgtm label Mar 22, 2022
@kubevirt-bot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: awels

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kubevirt-bot added the approved label Mar 22, 2022
@awels (Member) commented Mar 22, 2022

/cherrypick release-v1.43
/cherrypick release-v1.38

@kubevirt-bot (Contributor)

@awels: once the present PR merges, I will cherry-pick it on top of release-v1.43 in a new PR and assign it to you.

In response to this:

/cherrypick release-v1.43
/cherrypick release-v1.38

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@awels (Member) commented Mar 22, 2022

/test pull-containerized-data-importer-e2e-k8s-1.21-hpp

@kubevirt-bot kubevirt-bot merged commit f89812c into kubevirt:main Mar 22, 2022
@kubevirt-bot (Contributor)

@awels: #2195 failed to apply on top of branch "release-v1.38":

Applying: Create a test for an overhead bug
Using index info to reconstruct a base tree...
M	tests/upload_test.go
M	tests/utils/datavolume.go
M	tests/utils/upload.go
Falling back to patching base and 3-way merge...
Auto-merging tests/utils/upload.go
CONFLICT (content): Merge conflict in tests/utils/upload.go
Auto-merging tests/utils/datavolume.go
Auto-merging tests/upload_test.go
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 Create a test for an overhead bug
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

In response to this:

/cherrypick release-v1.43
/cherrypick release-v1.38


@kubevirt-bot (Contributor)

@awels: new pull request created: #2198

In response to this:

/cherrypick release-v1.43
/cherrypick release-v1.38


Labels: approved, dco-signoff: yes, lgtm, release-note, size/L

4 participants