Creating image on CLI with URL source can throw timeout #6431

Closed
2 of 3 tasks
onenhansen opened this issue Dec 13, 2023 · 10 comments · Fixed by OpenNebula/docs#2822

Comments

@onenhansen
Contributor

onenhansen commented Dec 13, 2023

Description
When creating a URL-sourced image, Net::ReadTimeout with #<TCPSocket:(closed)> may be displayed, but the image will still be created correctly.

To Reproduce
Using curl >= 7.71.0

oneimage create --name cdrom-testing2398uf --type cdrom --path http://tinycorelinux.net/8.x/x86/release/Core-current.iso -d 100 
Net::ReadTimeout with #<TCPSocket:(closed)>

Expected behavior
Should output the ID of the created image

Details

  • Affected Component: Storage
  • Hypervisor: vCenter, KVM
  • Version: 6.8.1 (running curl >= 7.71.0)

Progress Status

  • Code committed
  • Testing - QA
  • Documentation (Release notes - resolved issues, compatibility, known issues)
@tinova tinova added this to the Release 6.8.1 milestone Dec 14, 2023
@tinova tinova modified the milestones: Release 6.8.1, Release 6.8.2 Dec 14, 2023
onenhansen added a commit to OpenNebula/docs that referenced this issue Dec 18, 2023
Signed-off-by: Neal Hansen <nhansen@opennebula.io>
tinova pushed a commit to OpenNebula/docs that referenced this issue Dec 18, 2023
Signed-off-by: Neal Hansen <nhansen@opennebula.io>
@onenhansen onenhansen changed the title from "Creating vCenter image with URL source throws timeout" to "Creating image on CLI with URL source can throw timeout" Dec 22, 2023
onenhansen added a commit to OpenNebula/docs that referenced this issue Dec 22, 2023
Signed-off-by: Neal Hansen <nhansen@opennebula.io>
tinova pushed a commit to OpenNebula/docs that referenced this issue Jan 4, 2024
…2800)

Signed-off-by: Neal Hansen <nhansen@opennebula.io>
tinova pushed a commit to OpenNebula/docs that referenced this issue Jan 4, 2024
…2800)

Signed-off-by: Neal Hansen <nhansen@opennebula.io>
(cherry picked from commit b48ef17)
@paczerny
Member

paczerny commented Jan 4, 2024

I don't think this issue depends on the curl version. The default timeout for all CLI actions is 30 seconds. If you are downloading a big image, this can be exceeded.

You can increase the timeout by setting the ONE_XMLRPC_TIMEOUT environment variable.
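
As a rough sketch of that workaround, the variable can be exported before calling the CLI. The value 300 is an arbitrary illustration, and the assumption here is that ONE_XMLRPC_TIMEOUT is read in seconds, like the 30-second default mentioned above. The image name and URL are the ones from the report.

```shell
# Raise the XML-RPC timeout for CLI calls made from this shell.
# 300 is an illustrative value, assumed to be in seconds.
export ONE_XMLRPC_TIMEOUT=300

# Run the command from the report only where the OpenNebula CLI is installed.
if command -v oneimage >/dev/null 2>&1; then
    oneimage create --name cdrom-testing2398uf --type cdrom \
        --path http://tinycorelinux.net/8.x/x86/release/Core-current.iso -d 100
fi
```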

In case of a timeout, we could add a warning message saying that the download continues in the background, similar to the showback calculation.

@Franco-Sparrow

Hi Sir

I agree with your proposal. At least, let the user know that the download will continue and the image will be created.

@onenhansen
Contributor Author

onenhansen commented Jan 10, 2024

The timeout occurs during the stat function, when it checks the header of the download for the size of the file. This first check should never take 30 seconds, no matter the size of the image, unless there is a networking issue, in which case curl should fail before 30 seconds.

This error occurs because we use --retry-all-errors with curl 7.71.0 and up, which, as the name states, retries on every error, including the error curl reports when we run curl | head -c "${MAX_SIZE}" (cannot write to target). With a higher timeout it would just take longer to fail.

There are a couple of ways to avoid this error: downgrade curl to a version below 7.71.0, or edit downloader.sh and remove lines 312-315, which should be this section:

    # To retry also on conn-reset-by-peer fresh curl is needed
    if verlte "7.71.0" "$CURL_VER"; then
        RETRY_ARGS+=" --retry-all-errors"
    fi

@Franco-Sparrow

@onenhansen Hi Sir, thanks for your solution, but this will also sacrifice the possibility of retries for downloads. Is there any other solution in the future that keeps the --retry-all-errors option and avoids this error message in Sunstone when downloading an image?

@onenhansen
Contributor Author

onenhansen commented Jan 10, 2024

@Franco-Sparrow Removing those lines does not remove all the retries. The --retry-all-errors option forces curl to retry even on failures where it should not be retrying.

In the downloader.sh script we pipe the curl output to head -c ${MAX_SIZE}, and we set MAX_SIZE to 64KB to check the type and size of the file that we are about to download (a form of stat).

When curl is piped to a command which stops accepting input, it returns curl: (23) Failure writing output to destination, which would normally just fail. That would be correct, as we only want the first 64KB and don't need the whole file. However, with --retry-all-errors the command is retried up to the maximum number of retries (3), with a 3-second delay each time.

So, in certain cases, this, added to everything else that runs to create the image, takes 30 seconds or more, which triggers the timeout on the API. This caused failures where there should not have been any.
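
The pipe behaviour described above can be reproduced without curl or a network. In this sketch `yes` is only a stand-in for curl (an assumption made for illustration): it keeps writing until `head` has taken its MAX_SIZE bytes and closed the pipe, at which point the producer ends with a non-zero status. curl reports the same situation as error 23, and --retry-all-errors treats it as retryable.

```shell
MAX_SIZE=65536                      # 64KB, the size mentioned above
status_file=$(mktemp)

# The producer's exit status is written to a temp file, because the pipeline
# itself only reports the status of its last command.
bytes=$( { yes; echo "$?" > "$status_file"; } 2>/dev/null \
         | head -c "$MAX_SIZE" | wc -c | tr -d ' ' )
producer_status=$(cat "$status_file")
rm -f "$status_file"

echo "bytes kept by head: $bytes"
echo "producer exit status: $producer_status"
```

The consumer gets exactly the bytes it asked for, while the producer sees a write failure. Without --retry-all-errors that failure is harmless. With it, the whole transfer is started again.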

If you do see failures please send us any errors you encounter so we can improve this further if necessary!

If the image you're trying to download is on a server which has many transient errors, then you may want to consider an alternate form of downloading the image and then FTP/SCP upload the image to the frontend in order to add the image via a local path rather than downloading via a URL.

@Franco-Sparrow

Thanks @onenhansen, I will keep you updated about this modification. The cluster is in LA with a good DC and internet provider, and has no problem accessing the internet. In this case, removing these lines should not affect downloading the image via URL.

@Franco-Sparrow

Franco-Sparrow commented Jan 10, 2024

Hi @onenhansen

This means that if "7.71.0" <= "$CURL_VER", then RETRY_ARGS+=" --retry-all-errors", right?

    if verlte "7.71.0" "$CURL_VER"; then
        RETRY_ARGS+=" --retry-all-errors"
    fi

In my case I am using Ubuntu 22.04, so the curl version is 7.81.0.

    # Returns curl retry options based on its version
    function curl_retry_args {
        [ "$NO_RETRY" = "yes" ] && return

        RETRY_ARGS="--retry 3 --retry-delay 3"

        CURL_VER=`curl --version | grep -o 'curl [0-9\.]*' | awk '{print $2}'`

        # To retry also on conn-reset-by-peer fresh curl is needed
        if verlte "7.71.0" "$CURL_VER"; then
            RETRY_ARGS+=" --retry-all-errors"
        fi

        echo $RETRY_ARGS
    }

In other words, every time I get this error, my curl command is using the --retry-all-errors option, because those lines are uncommented (if I understood the previous logic correctly).

**Anyway, your approach solved the issue. With those lines commented out, the issue is gone: the image is created and no error is shown in Sunstone. Thanks for your solution, Sir :-)**

@Franco-Sparrow

Franco-Sparrow commented Jan 10, 2024

The problem is related to the use of the --retry-all-errors option. If the command does not use this option, the image is created and no error is shown in Sunstone. I tested this with:

  1. Original lines uncommented (this throws the error in Sunstone). I redirected the value of RETRY_ARGS to a file and got RETRY_ARGS=--retry 3 --retry-delay 3 --retry-all-errors.
  2. Lines uncommented, but with 7.71.0 changed to 7.81.0: same error. I redirected the value of RETRY_ARGS to a file and got RETRY_ARGS=--retry 3 --retry-delay 3 --retry-all-errors.
  3. Lines commented out: it works, with no error in Sunstone (what @onenhansen proposed before). I redirected the value of RETRY_ARGS to a file and got RETRY_ARGS=--retry 3 --retry-delay 3.


@onenhansen
Contributor Author

@Franco-Sparrow Just for clarity, the line if verlte "7.71.0" "$CURL_VER"; then checks whether the first version is lower than or equal to the second version, so it reads as if "7.71.0" <= "$CURL_VER"; then. The --retry-all-errors option was introduced in curl 7.71.0.
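
For illustration, a comparison with those semantics can be sketched with sort -V. The function name verlte_sketch is hypothetical, and this is not necessarily how downloader.sh defines verlte. It only shows which curl versions pass the "7.71.0" check.

```shell
# Succeeds when version $1 is lower than or equal to version $2.
# Relies on version sort (`sort -V`, available in GNU coreutils).
verlte_sketch() {
    [ "$1" = "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" ]
}

for curl_ver in 7.17.0 7.71.0 7.81.0; do
    if verlte_sketch "7.71.0" "$curl_ver"; then
        echo "curl $curl_ver: --retry-all-errors is added"
    else
        echo "curl $curl_ver: --retry-all-errors is not added"
    fi
done
```

Under the same logic, a threshold of 7.81.0 on a system running curl 7.81.0 still passes, because the comparison is "lower than or equal".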

@Franco-Sparrow

Franco-Sparrow commented Jan 11, 2024

Good day @onenhansen

Thanks for clarifying the bash logic. As you can see in my previous comment, I tested all three possibilities on Ubuntu 22.04 with curl 7.81.0 >= 7.71.0, so RETRY_ARGS=--retry 3 --retry-delay 3 --retry-all-errors and the creation of the image fails. Removing these lines fixed the problem. I think we should not use this option for curl <= 7.81.0, or should just remove those lines as you suggested.

onenhansen added a commit to OpenNebula/docs that referenced this issue Jan 23, 2024
Signed-off-by: Neal Hansen <nhansen@opennebula.io>
tinova pushed a commit to OpenNebula/docs that referenced this issue Jan 24, 2024
Signed-off-by: Neal Hansen <nhansen@opennebula.io>
@tinova tinova closed this as completed Jan 24, 2024