Test Airflow Release step is flaky. #42025

@potiuk

Description

The "Test Airflow Releases" job is flaky and fails far too often.

Most of the failures come from "external" factors: for example, installing node packages or pulling images often fails with a "500 Internal Server Error" or "Rate limit exceeded".

Example: https://github.com/apache/airflow/actions/runs/10717569021/job/29717755211?pr=41555, where the root cause is a failure while installing node packages:

 yarn install v1.22.21
  (node:499) [DEP0040] DeprecationWarning: The `punycode` module is deprecated. Please use a userland alternative instead.
  (Use `node --trace-deprecation ...` to show where the warning was created)
  [1/4] Resolving packages...
  [2/4] Fetching packages...
  [] 0/1573[] 7/1573[] 13/1573[] 21/1573[] 29/1573
  error Error: https://registry.yarnpkg.com/@chakra-ui/skeleton/-/skeleton-2.0.18.tgz: Request failed "500 Internal Server Error"
      at ResponseError.ExtendableBuiltin (/opt/airflow/files/home/.cache/pre-commit/repoj5n0lz2l/node_env-22.2.0/lib/node_modules/yarn/lib/cli.js:696:66)
      at new ResponseError (/opt/airflow/files/home/.cache/pre-commit/repoj5n0lz2l/node_env-22.2.0/lib/node_modules/yarn/lib/cli.js:802:124)
      at Request.<anonymous> (/opt/airflow/files/home/.cache/pre-commit/repoj5n0lz2l/node_env-22.2.0/lib/node_modules/yarn/lib/cli.js:66218:16)
      at Request.emit (node:events:520:28)
      at module.exports.Request.onRequestResponse (/opt/airflow/files/home/.cache/pre-commit/repoj5n0lz2l/node_env-22.2.0/lib/node_modules/yarn/lib/cli.js:141751:10)
      at ClientRequest.emit (node:events:520:28)
      at HTTPParser.parserOnIncomingClient (node:_http_client:700:27)
      at HTTPParser.parserOnHeadersComplete (node:_http_common:119:17)
      at TLSSocket.socketOnData (node:_http_client:542:22)
      at TLSSocket.emit (node:events:520:28)
  info Visit https://yarnpkg.com/en/docs/cli/install for documentation about this command.
  Traceback (most recent call last):
    File "./scripts/ci/pre_commit/compile_www_assets.py", line 71, in <module>
      subprocess.check_call(["yarn", "install", "--frozen-lockfile"], cwd=os.fspath(www_directory))
    File "/usr/local/lib/python3.8/subprocess.py", line 364, in check_call
      raise CalledProcessError(retcode, cmd)
  subprocess.CalledProcessError: Command '['yarn', 'install', '--frozen-lockfile']' returned non-zero exit status 1.

The likely solution is to retry the whole breeze command when it fails. This "release process" is relatively fast (~4 minutes), so allowing up to 3 attempts brings the total time to roughly 12 minutes at most, which should not have much impact on elapsed time or cost.
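A minimal sketch of such a retry wrapper, in Python to match the CI script shown in the traceback. This is only an illustration of the idea, not how the Airflow CI implements it: the function name `run_with_retries`, its parameters, and the 10-second delay are all assumptions introduced here.

```python
import subprocess
import time
from typing import Callable, Sequence


def run_with_retries(
    cmd: Sequence[str],
    max_attempts: int = 3,          # hypothetical default: ~4 min per run, ~12 min worst case
    delay_seconds: float = 10.0,    # hypothetical pause between attempts
    runner: Callable[..., int] = subprocess.check_call,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run `cmd`, retrying when it exits non-zero.

    Returns the number of the attempt that succeeded. If every attempt
    fails, the CalledProcessError from the last attempt is re-raised.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            runner(list(cmd))
            return attempt
        except subprocess.CalledProcessError as exc:
            if attempt == max_attempts:
                # Out of attempts: surface the real failure to the CI job.
                raise
            print(
                f"Attempt {attempt}/{max_attempts} failed with exit code "
                f"{exc.returncode}; retrying in {delay_seconds}s"
            )
            sleep(delay_seconds)
    raise AssertionError("unreachable")
```

For instance, `run_with_retries(["yarn", "install", "--frozen-lockfile"])` would retry the step that failed in the log above; the whole breeze command could be passed the same way. Retrying blindly also repeats genuine failures up to three times, so the worst-case cost applies to real breakages as well as flaky ones.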

Metadata

Labels

area:CI - Airflow's tests and continuous integration
area:dev-env - CI, pre-commit, pylint and other changes that do not change the behavior of the final code
area:dev-tools

Projects

Status

Done
