Skip to content

Conversation

@gsoldevila
Copy link
Contributor

@gsoldevila gsoldevila commented Nov 19, 2025

Addresses eventual CI failures such as this one.
image

INFO:docker build:#19 [node_deps 4/7] RUN install_packages yarn=1.22.19-1
INFO:docker build:#19 7.734 W: Failed to fetch https://deb.nodesource.com/node_12.x/dists/buster/InRelease  500  Internal Server Error [IP: 172.66.150.169 443]
INFO:docker build:#19 7.734 W: Failed to fetch https://dl.yarnpkg.com/debian/dists/stable/InRelease  500  Internal Server Error [IP: 104.16.169.120 443]
INFO:docker build:#19 7.734 W: Some index files failed to download. They have been ignored, or old ones used instead.

The official install_packages script fails if the package repository fails with 500 Server Error when trying to fetch the package lists (which happens from time to time).
This PR adds a wrapper script with a retry mechanism, so that we don't give up that quickly.

@github-actions
Copy link

A documentation preview will be available soon.

Request a new doc build by commenting
  • Rebuild this PR: run docs-build
  • Rebuild this PR and all Elastic docs: run docs-build rebuild

run docs-build is much faster than run docs-build rebuild. A rebuild should only be needed in rare situations.

If your PR continues to fail for an unknown reason, the doc build pipeline may be broken. Elastic employees can check the pipeline status here.

RUN echo "deb http://archive.debian.org/debian/ buster main" > /etc/apt/sources.list && \
echo "deb http://archive.debian.org/debian-security/ buster/updates main" >> /etc/apt/sources.list

# TODO install_packages calls apt-get update and then nukes the list files after. We should avoid multiple calls to apt-get update.....
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nik9000 I don't fully understand your comment above, but I wonder if the wrapper script that I'm introducing can help in any way.

@reakaleek reakaleek requested a review from Copilot November 19, 2025 12:19
Copilot finished reviewing on behalf of reakaleek November 19, 2025 12:20
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR addresses intermittent CI failures caused by 500 Server Errors from package repositories during install_packages operations. It introduces a retry mechanism to make package installation more resilient to transient network failures.

Key Changes:

  • Added a new install_packages.sh wrapper script with retry logic (up to 5 attempts with 10-second delays)
  • Updated all install_packages calls in the Dockerfile to use the new install_packages.sh wrapper

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
install_packages.sh New wrapper script that adds retry logic around the existing install_packages command to handle transient 500 errors
Dockerfile Copies the wrapper script and updates all package installation commands to use install_packages.sh instead of install_packages

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

# Check the exit status of the previous command
if [ $? -eq 0 ]; then
echo "Packages installed successfully."
exit 0 # Success, exit the function
Copy link

Copilot AI Nov 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment says "exit the function" but this is a script, not a function. The comment should say "exit the script" for clarity.

Suggested change
exit 0 # Success, exit the function
exit 0 # Success, exit the script

Copilot uses AI. Check for mistakes.
Comment on lines +18 to +30
install_packages "$@"

# Check the exit status of the previous command
if [ $? -eq 0 ]; then
echo "Packages installed successfully."
exit 0 # Success, exit the function
else
RETRY_COUNT=$((RETRY_COUNT + 1))
if [ $RETRY_COUNT -lt $MAX_ATTEMPTS ]; then
echo "Installation failed. Retrying in $WAIT_TIME seconds..."
sleep $WAIT_TIME
fi
fi
Copy link

Copilot AI Nov 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using $? in a condition is fragile. The exit status should be captured immediately after the command. Consider rewriting as:

if install_packages "$@"; then
    echo "Packages installed successfully."
    exit 0
else
    RETRY_COUNT=$((RETRY_COUNT + 1))
    # ... rest of error handling
fi

This ensures the exit status is evaluated correctly without risk of being overwritten by intervening commands.

Copilot uses AI. Check for mistakes.

# Loop for retry attempts
while [ $RETRY_COUNT -lt $MAX_ATTEMPTS ]; do
echo "Attempt $((RETRY_COUNT + 1)) of $MAX_ATTEMPTS: Running install_packages $@"
Copy link

Copilot AI Nov 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable $@ should be quoted as "$@" to properly handle arguments containing spaces or special characters. Without quotes, arguments with spaces will be split into multiple arguments.

Suggested change
echo "Attempt $((RETRY_COUNT + 1)) of $MAX_ATTEMPTS: Running install_packages $@"
echo "Attempt $((RETRY_COUNT + 1)) of $MAX_ATTEMPTS: Running install_packages \"$@\""
echo "Arguments: $@"

Copilot uses AI. Check for mistakes.
@bmorelli25 bmorelli25 requested a review from a team as a code owner November 21, 2025 18:08
@bmorelli25 bmorelli25 requested review from Mpdreamz and removed request for a team, KOTungseth, bmorelli25 and florent-leborgne November 21, 2025 18:08
reakaleek
reakaleek previously approved these changes Nov 21, 2025
Copy link
Member

@reakaleek reakaleek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thank you for making it more stable.

There is an error in CI. But just looks like syntax error or wrong folder?

@gsoldevila
Copy link
Contributor Author

LGTM. Thank you for making it more stable.

There is an error in CI. But just looks like syntax error or wrong folder?

I forgot to place the script inside the .docker folder, fixed!

@gsoldevila gsoldevila requested a review from reakaleek November 24, 2025 11:47
@gsoldevila gsoldevila enabled auto-merge (squash) November 24, 2025 11:48
@gsoldevila gsoldevila merged commit b104f69 into elastic:master Dec 1, 2025
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants