Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Always use scratchspace when importing and converting #2832

Merged
merged 7 commits into from
Aug 11, 2023

Conversation

awels
Copy link
Member

@awels awels commented Aug 1, 2023

What this PR does / why we need it:
nbdkit is slow in combination with qemu-img when importing and converting a qcow2 image. For now when converting always use scratch space, this is significantly faster. Once nbdkit 1.35.8 is available we can revisit this because that version will improve nbdkit performance significantly.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #2809

Special notes for your reviewer:
Also removes old version of nbdkit left over in WORKSPACE file, as well as fixing invalid number of arguments in some error returns

Release note:

BugFix: use scratch space for all http conversion imports, to speed up the process.

@kubevirt-bot kubevirt-bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. labels Aug 1, 2023
@awels awels force-pushed the always_scratchspace branch 2 times, most recently from 752c731 to 511e405 Compare August 1, 2023 17:28
@alromeros
Copy link
Collaborator

/lgtm

@kubevirt-bot kubevirt-bot added the lgtm Indicates that a PR is ready to be merged. label Aug 3, 2023
return ProcessingPhaseTransferScratch, nil
}
if hs.readers.Archived || hs.customCA != "" {
return ProcessingPhaseTransferDataFile, nil
}
hs.url, _ = url.Parse(fmt.Sprintf("nbd+unix:///?socket=%s", nbdkitSocket))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like nbdkit will still be called for importing raw non compressed. How's the performance with that? should we get rid of it entirely?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to keep it around for the multi stage import, which uses nbdkit. But you are right this will still be slow for uncompressed raw files since it uses nbdkit.

@kubevirt-bot kubevirt-bot removed the lgtm Indicates that a PR is ready to be merged. label Aug 3, 2023
if err = hs.n.StartNbdkit(hs.endpoint.String()); err != nil {
return ProcessingPhaseError, err
if !hs.readers.Convert && (hs.readers.Archived || hs.customCA != "") {
return ProcessingPhaseTransferDataFile, nil
Copy link
Member

@mhenriks mhenriks Aug 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can technically get rid && (hs.readers.Archived || hs.customCA != "")?

No need to write a raw/iso to scratch, right?

UNLESS we want to truly respect preallocation settings. In which case, we should just always return ProcessingPhaseTransferScratch, nil for kubevirt content type and preallocation=false

Signed-off-by: Alexander Wels <awels@redhat.com>
of ndbkit. Once we are able to get nbdkit 1.35.8 or newer
we can revert this change since that will include improvements
to the downloading speed.

Signed-off-by: Alexander Wels <awels@redhat.com>
return total, and this means the metrics are disabled.

Signed-off-by: Alexander Wels <awels@redhat.com>
Signed-off-by: Alexander Wels <awels@redhat.com>
Signed-off-by: Alexander Wels <awels@redhat.com>
Fixed functional test that was not doing the right
thing while running the test.

Signed-off-by: Alexander Wels <awels@redhat.com>
@awels
Copy link
Member Author

awels commented Aug 6, 2023

/test pull-containerized-data-importer-e2e-ceph

@mhenriks
Copy link
Member

mhenriks commented Aug 7, 2023

@awels test failures may be related to changes?

@awels
Copy link
Member Author

awels commented Aug 7, 2023

Yes, I think so, investigating what the problem is.

@@ -308,6 +308,8 @@ func (dp *DataProcessor) resize() (ProcessingPhase, error) {
return ProcessingPhaseError, err
}
dp.preallocationApplied = dp.preallocation
} else {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's cleaner to leave dp.preallocationApplied = dp.preallocation outside the condition and do it once, but it's not a must.

/test pull-containerized-data-importer-e2e-ceph

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point in my mind it was not logically equivalent because the stuff above could return a different phase, but that would not change outside the else.

writing to the device

Signed-off-by: Alexander Wels <awels@redhat.com>
@awels
Copy link
Member Author

awels commented Aug 8, 2023

/test pull-containerized-data-importer-e2e-nfs

}
dp.preallocationApplied = dp.preallocation
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

given the current implementation isn't preallocationApplied always true?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only if the DV has the preallocation flag set, the value of dp.preallocation will be true in that case, false otherwise

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was mistaken when preallocationApplied should always be true. My understanding of the implementation in this PR is:

  1. Regardless of what the user preallocation setting the disk will always be preallocated when not "converting" so raw, iso, etc
  2. User preallocation settings are respected when "converting" from scratch space.

I am wondering if case 1 above is accurately reflected in dp.preallocationApplied?

Copy link
Member Author

@awels awels Aug 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure I can know at this point in the code, this gets run for multiple different datasources, some of which may preallocate and some may not. I do know that if the flag was set on the importer pod, then the datasource would respect it, if it was not set, it may or may not have preallocated depending on the datsource.

Maybe we need to add something to the datasource API where the datasource can indicate it preallocated or not.

@alromeros
Copy link
Collaborator

/lgtm

@kubevirt-bot kubevirt-bot added the lgtm Indicates that a PR is ready to be merged. label Aug 11, 2023
@alromeros
Copy link
Collaborator

Is #2743 still relevant after this merges?

@mhenriks
Copy link
Member

/approve

@kubevirt-bot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mhenriks

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kubevirt-bot kubevirt-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 11, 2023
@awels
Copy link
Member Author

awels commented Aug 11, 2023

/cherrypick release-v1.57

@kubevirt-bot
Copy link
Contributor

@awels: once the present PR merges, I will cherry-pick it on top of release-v1.57 in a new PR and assign it to you.

In response to this:

/cherrypick release-v1.57

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@kubevirt-bot kubevirt-bot merged commit 1f14ac7 into kubevirt:main Aug 11, 2023
17 checks passed
@kubevirt-bot
Copy link
Contributor

@awels: #2832 failed to apply on top of branch "release-v1.57":

Applying: Remove older nbdkit
Applying: When converting always use scratch space importing instead of ndbkit. Once we are able to get nbdkit 1.35.8 or newer we can revert this change since that will include improvements to the downloading speed.
Using index info to reconstruct a base tree...
M	pkg/importer/http-datasource_test.go
Falling back to patching base and 3-way merge...
Auto-merging pkg/importer/http-datasource_test.go
CONFLICT (content): Merge conflict in pkg/importer/http-datasource_test.go
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0002 When converting always use scratch space importing instead of ndbkit. Once we are able to get nbdkit 1.35.8 or newer we can revert this change since that will include improvements to the downloading speed.
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

In response to this:

/cherrypick release-v1.57

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

awels added a commit to awels/containerized-data-importer that referenced this pull request Aug 11, 2023
* Remove older nbdkit

Signed-off-by: Alexander Wels <awels@redhat.com>

* When converting always use scratch space importing instead
of ndbkit. Once we are able to get nbdkit 1.35.8 or newer
we can revert this change since that will include improvements
to the downloading speed.

Signed-off-by: Alexander Wels <awels@redhat.com>

* Disable metrics test for import because straight import doesn't
return total, and this means the metrics are disabled.

Signed-off-by: Alexander Wels <awels@redhat.com>

* Fix broken functional tests

Signed-off-by: Alexander Wels <awels@redhat.com>

* Address review comments

Signed-off-by: Alexander Wels <awels@redhat.com>

* Additional review comments.
Fixed functional test that was not doing the right
thing while running the test.

Signed-off-by: Alexander Wels <awels@redhat.com>

* Always set preallocation on block devices when directly
writing to the device

Signed-off-by: Alexander Wels <awels@redhat.com>

---------

Signed-off-by: Alexander Wels <awels@redhat.com>
awels added a commit to awels/containerized-data-importer that referenced this pull request Aug 11, 2023
* Remove older nbdkit

Signed-off-by: Alexander Wels <awels@redhat.com>

* When converting always use scratch space importing instead
of ndbkit. Once we are able to get nbdkit 1.35.8 or newer
we can revert this change since that will include improvements
to the downloading speed.

Signed-off-by: Alexander Wels <awels@redhat.com>

* Disable metrics test for import because straight import doesn't
return total, and this means the metrics are disabled.

Signed-off-by: Alexander Wels <awels@redhat.com>

* Fix broken functional tests

Signed-off-by: Alexander Wels <awels@redhat.com>

* Address review comments

Signed-off-by: Alexander Wels <awels@redhat.com>

* Additional review comments.
Fixed functional test that was not doing the right
thing while running the test.

Signed-off-by: Alexander Wels <awels@redhat.com>

* Always set preallocation on block devices when directly
writing to the device

Signed-off-by: Alexander Wels <awels@redhat.com>

---------

Signed-off-by: Alexander Wels <awels@redhat.com>
kubevirt-bot pushed a commit that referenced this pull request Aug 12, 2023
* Remove older nbdkit



* When converting always use scratch space importing instead
of ndbkit. Once we are able to get nbdkit 1.35.8 or newer
we can revert this change since that will include improvements
to the downloading speed.



* Disable metrics test for import because straight import doesn't
return total, and this means the metrics are disabled.



* Fix broken functional tests



* Address review comments



* Additional review comments.
Fixed functional test that was not doing the right
thing while running the test.



* Always set preallocation on block devices when directly
writing to the device



---------

Signed-off-by: Alexander Wels <awels@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. lgtm Indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CDI import from URL is significantly slower than a manual wget + virtctl image-upload
4 participants