Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve jdk download retries #48560

Merged
merged 3 commits into from
Nov 3, 2019

Conversation

alpar-t
Copy link
Contributor

@alpar-t alpar-t commented Oct 28, 2019

Look like the --retry argument is not always effective.

Closes #40531

Look like the `--retry` argument is not always affective.

Closes elastic#40531
@alpar-t alpar-t added the :Delivery/Build Build or test infrastructure label Oct 28, 2019
@alpar-t alpar-t requested a review from dliappis October 28, 2019 03:47
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra (:Core/Infra/Build)

Copy link
Contributor

@dliappis dliappis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I left a few comments, mainly highlighting the need to emit the error at the end.

tar -C /opt -zxf /tmp/jdk.tar.gz && \
rm -Rf /tmp/jdk.tar.gz
RUN for iter in {1..10}; do curl -L -s -S ${jdkUrl} | tar -C /opt -zx &&\
exit_code=0 && break || exit_code=\$? && echo "download error: retry \$iter in 10s" && sleep 10; done;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to emit the error here (similar to how we do it for yum install later on) using:

 \
    (exit $exit_code)

otherwise as it is now, if all retries are exhausted it will continue to the next step.

Unrelated, but while testing this, I observed that (when running ./gradlew with --info) retry messages are shown as:

download error: retry $iter in 10s

instead of

download error: retry 1<or 2, 3 and so on> in 10s

I thought this was ok and tested in the past in the yum section, but apparently \ shouldn't be used inside the double quoted section.

In other words the complete fixed command should be:

RUN for iter in {1..10}; do curl -L -s -S ${jdkUrl} | tar -C /opt -zx && \
    exit_code=0 && break || exit_code=\$? && echo "download error: retry $iter in 10s" && sleep 10; done; \
    (exit $exit_code)

RUN curl -L --retry 8 -s -S ${jdkUrl} --continue-at - --output /tmp/jdk.tar.gz && \
tar -C /opt -zxf /tmp/jdk.tar.gz && \
rm -Rf /tmp/jdk.tar.gz
RUN for iter in {1..10}; do curl -L -s -S ${jdkUrl} | tar -C /opt -zx &&\
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: let's have a space between && and \ for consistency i.e. && \

rm -Rf /tmp/jdk.tar.gz
RUN for iter in {1..10}; do curl -L -s -S ${jdkUrl} | tar -C /opt -zx &&\
exit_code=0 && break || exit_code=\$? && echo "download error: retry \$iter in 10s" && sleep 10; done;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One new line is enough, I think we can remove this.

Copy link
Contributor

@mark-vieira mark-vieira left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM I suppose. Pretty annoying that --retry doesn't do what we want. Did we discover what kind of errors we are encountering that aren't caught by --retry?

@dliappis
Copy link
Contributor

Did we discover what kind of errors we are encountering that aren't caught by --retry?

Connection resets -- which is something that occasionally happens from the GitHub URL we are relying -- aren't being retried (others reporting the same here).

For the record curl already covers connection refused (via curl/curl#1036) and will also retry connection timeouts (if --connect-timeout has been set).

@alpar-t
Copy link
Contributor Author

alpar-t commented Oct 30, 2019

but --connection-timeoutis not available in the curl version we use here.

Copy link
Contributor

@dliappis dliappis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I left a minor comment, there's no need for another review cycle.

RUN for iter in {1..10}; do curl -L -s -S ${jdkUrl} | tar -C /opt -zx && \
exit_code=0 && break || exit_code=\$? && echo "download error: retry $iter in 10s" && sleep 10; done; \
(exit $exit_code)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(super) nit: we could remove the 4 space characters in line 22 (so that there's just a newline).

@dliappis
Copy link
Contributor

but --connection-timeoutis not available in the curl version we use here.

Good point and additionally, my earlier comment attempts to show that there are other errors, like connection resets, that curl won't catch, even with the latest version, so I am in favor of wrapping such operations in this kind of retry logic -- even if it looks a bit awkward due to the shell syntax.

@alpar-t alpar-t merged commit a91f0f8 into elastic:6.8 Nov 3, 2019
@alpar-t alpar-t deleted the improove-jdk-download-retries branch November 3, 2019 02:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Delivery/Build Build or test infrastructure Team:Delivery Meta label for Delivery team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants