ci: Mitigate translation download 429 errors #320

Merged
dcalhoun merged 5 commits into trunk from ci/avoid-unnecessary-tasks
Feb 16, 2026

Member

@dcalhoun dcalhoun commented Feb 15, 2026

What?

Reduces translation download requests from ~192 to ~48 per CI run by building once, sharing the build artifact, and removing translations from postinstall. Also adds robust 429 rate-limit handling so the remaining 48 requests recover gracefully when throttled.

Why?

Fix CMM-1250.

Every Buildkite CI run makes ~192 HTTP requests to translate.wordpress.org because 4 of 6 pipeline steps independently download all 48 translation locales. This causes 429 (rate limiting) errors. The pipeline had no artifact sharing — the "Build React App" and "Publish Android Library" steps each ran a full make build REFRESH_L10N=1, and the lint/test steps triggered downloads via npm ci's postinstall hook.

Even after reducing to ~48 requests, builds were still failing because the download script's retry logic was too aggressive — batch size of 10 with 1-2s backoff meant retries also got 429'd immediately.

How?

1. Remove prep-translations from postinstall in package.json

Translation downloads are a build concern, not an install concern. The make targets (make build, make dev-server, etc.) already handle translations via the prep-translations dependency. This eliminates the translation downloads that were happening in lint/test CI steps (which run npm ci but don't need translations).

2. Restructure .buildkite/pipeline.yml

  • Build React App gets a key: build-react and uploads dist/ as a Buildkite artifact
  • Lint and Test JS are unchanged — npm ci no longer triggers translation downloads
  • Publish Android Library gets depends_on: build-react, downloads the pre-built dist/ artifact instead of running its own make build

| Step | Before | After |
| --- | --- | --- |
| Build React App | 48 requests | 48 requests (unchanged, sole source) |
| Lint React App | 48 requests | 0 (postinstall no longer downloads) |
| Test JavaScript | 48 requests | 0 (postinstall no longer downloads) |
| Publish Android | 48 requests | 0 (uses artifact, no npm ci) |
| Total | ~192 | ~48 |
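
The totals above follow from simple arithmetic. A minimal sketch, assuming each downloading step fetches all 48 locales exactly once and ignoring retries (the helper name is hypothetical, not something from the repository):

```javascript
// Hypothetical helper for illustration only; not part of the repository.
const LOCALE_COUNT = 48; // locales fetched by one full translation refresh

function totalRequests(stepsThatDownload) {
  return stepsThatDownload * LOCALE_COUNT;
}

const before = totalRequests(4); // build, lint, test, and publish each downloaded
const after = totalRequests(1); // only Build React App downloads now
const reductionPercent = ((before - after) / before) * 100;

console.log({ before, after, reductionPercent });
```

Retries are not counted here, so a throttled run can still issue more than 48 requests.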

3. Fail build on translation fetch errors in bin/prep-translations.js

Previously, translation fetch failures were silently swallowed. The script now properly propagates errors so CI fails visibly rather than producing a build with missing translations.
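
As a rough illustration of what "propagating errors" can look like, here is a minimal sketch. It assumes a hypothetical `fetchLocale(locale)` function that rejects on failure; the real script's function names and result shapes are not shown in this PR and may differ.

```javascript
// Sketch only: `fetchAllStrict` and `fetchLocale` are hypothetical names.
async function fetchAllStrict(locales, fetchLocale) {
  // allSettled lets every locale finish so the error can list all failures.
  const results = await Promise.allSettled(
    locales.map((locale) => fetchLocale(locale))
  );
  const failed = locales.filter((_, i) => results[i].status === 'rejected');
  if (failed.length > 0) {
    // Throw instead of swallowing, so the caller can exit non-zero.
    throw new Error(`Failed to fetch translations for: ${failed.join(', ')}`);
  }
  return results.map((result) => result.value);
}
```

An entry point could then end with something like `main().catch((error) => { console.error(error); process.exitCode = 1; })` so that the CI step fails.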

4. Improve 429 rate limit handling in bin/prep-translations.js

  • Reduce batch size from 10 to 5 concurrent requests
  • Add a 2s delay between batches to spread load
  • Detect 429 specifically via a RateLimitError class that carries the parsed Retry-After value
  • Respect Retry-After header (supports both delta-seconds and HTTP-date formats)
  • Exponential backoff for 429s without Retry-After: 5s, 10s, 20s, 40s + jitter
  • Linear backoff for other errors: 1s, 2s, 3s (unchanged behavior)
  • Increase max retries from 3 to 5
  • Add 60s max backoff safety limit to avoid blocking CI indefinitely
  • Log configuration at startup for CI visibility
  • All parameters configurable via environment variables (L10N_BATCH_SIZE, L10N_BATCH_DELAY_MS, L10N_MAX_RETRIES, L10N_MAX_BACKOFF_MS)
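
The delay rules listed above can be sketched as follows. This is an illustration under assumptions, not the script's actual code: only the `RateLimitError` name comes from this PR, while `parseRetryAfter`, `backoffMs`, the `maxJitterMs` option and its 1s default, and applying the 60s cap to `Retry-After` values are all hypothetical choices.

```javascript
// Only the class name RateLimitError is taken from the PR; the rest is assumed.
class RateLimitError extends Error {
  constructor(retryAfterMs) {
    super('HTTP 429 Too Many Requests');
    this.name = 'RateLimitError';
    this.retryAfterMs = retryAfterMs; // null when the header is absent or invalid
  }
}

// Retry-After is either delta-seconds ("120") or an HTTP-date.
function parseRetryAfter(headerValue, now = Date.now()) {
  if (!headerValue) return null;
  const trimmed = headerValue.trim();
  if (/^\d+$/.test(trimmed)) return Number(trimmed) * 1000;
  const dateMs = Date.parse(trimmed);
  return Number.isNaN(dateMs) ? null : Math.max(0, dateMs - now);
}

// `attempt` is 1-based: the wait before retry number `attempt`.
function backoffMs(error, attempt, { maxBackoffMs = 60000, maxJitterMs = 1000 } = {}) {
  if (error instanceof RateLimitError) {
    if (error.retryAfterMs != null) {
      return Math.min(error.retryAfterMs, maxBackoffMs); // server-provided wait
    }
    const exponential = 5000 * 2 ** (attempt - 1); // 5s, 10s, 20s, 40s, ...
    return Math.min(exponential + Math.random() * maxJitterMs, maxBackoffMs);
  }
  return Math.min(1000 * attempt, maxBackoffMs); // linear: 1s, 2s, 3s
}
```

A caller would build the error from the response, for example `new RateLimitError(parseRetryAfter(response.headers.get('retry-after')))` when `response.status === 429`.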

Testing Instructions

  1. Verify in Buildkite that:
    • Build React App step downloads translations and uploads dist.tar.gz
    • Lint and Test JS steps run npm ci without downloading translations
    • Publish Android Library step waits for build, downloads artifact, and publishes without running npm ci
    • Any 429 errors from translate.wordpress.org are recovered via backoff and retry
  2. Local verification:
    • Run npm ci and confirm translations are not downloaded
    • Run npm run prep-translations -- --force and confirm translations download with the new config logged (batch=5, delay=2000ms, retries=5, maxBackoff=60000ms)
    • Run L10N_BATCH_SIZE=2 L10N_BATCH_DELAY_MS=5000 npm run prep-translations -- --force and confirm env var overrides are respected in the config log
    • Run make dev-server and confirm the editor works with translations
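
For reference while checking the config log, this is one way the environment overrides could be read. It is a sketch: the variable names and defaults come from the description above, but the function names, the exact log format, and the choice to fall back to the default on zero or invalid values are assumptions.

```javascript
// Hypothetical config loader; only the env var names and defaults are from the PR.
function readPositiveInt(env, name, fallback) {
  const parsed = Number.parseInt(env[name] ?? '', 10);
  // Missing, non-numeric, zero, or negative values fall back to the default.
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

function loadL10nConfig(env = process.env) {
  return {
    batchSize: readPositiveInt(env, 'L10N_BATCH_SIZE', 5),
    batchDelayMs: readPositiveInt(env, 'L10N_BATCH_DELAY_MS', 2000),
    maxRetries: readPositiveInt(env, 'L10N_MAX_RETRIES', 5),
    maxBackoffMs: readPositiveInt(env, 'L10N_MAX_BACKOFF_MS', 60000),
  };
}

function formatConfig(config) {
  return (
    `batch=${config.batchSize}, delay=${config.batchDelayMs}ms, ` +
    `retries=${config.maxRetries}, maxBackoff=${config.maxBackoffMs}ms`
  );
}

console.log(formatConfig(loadL10nConfig()));
```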

@dcalhoun dcalhoun added the [Type] Build Tooling Issues or PRs related to build tooling label Feb 15, 2026

Remove `prep-translations` from `postinstall` in package.json so that
`npm ci` no longer triggers translation downloads in lint/test CI steps.
Translations are already handled by `make` targets that need them.

Restructure the Buildkite pipeline so that the Build React App step
uploads `dist/` as an artifact, and the Publish Android Library step
downloads it instead of running its own full build. This reduces
translation requests from ~192 to ~48 per CI run (75% reduction).

Add STRICT_L10N=1 to the Build React App step so the build fails if
translations cannot be fetched, preventing incomplete artifacts from
reaching the Android publish step.
@dcalhoun dcalhoun force-pushed the ci/avoid-unnecessary-tasks branch from 8a29b52 to 1cefb70 Compare February 15, 2026 17:19
The translation download script was failing in CI due to HTTP 429
responses from translate.wordpress.org. The previous batch size of 10
with 1-2s retry backoffs was too aggressive — retries would also get
429'd immediately.

Changes:
- Reduce batch size from 10 to 5 concurrent requests
- Add 2s delay between batches to spread load
- Add RateLimitError class to distinguish 429s from other errors
- Respect Retry-After header (delta-seconds and HTTP-date formats)
- Use exponential backoff for 429s without Retry-After (5s, 10s, 20s, 40s)
- Increase max retries from 3 to 5
- Add 60s max backoff safety limit to avoid blocking CI indefinitely
- Make all parameters configurable via environment variables
- Log configuration at startup for CI visibility
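
The batching and inter-batch delay described in this commit can be sketched roughly as below. `runInBatches` and its options are hypothetical names; the injectable `wait` exists only so the pause can be stubbed in tests.

```javascript
// Sketch only: function and option names are assumptions, not the script's API.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function runInBatches(
  items,
  worker,
  { batchSize = 5, batchDelayMs = 2000, wait = sleep } = {}
) {
  const results = [];
  for (let start = 0; start < items.length; start += batchSize) {
    if (start > 0) await wait(batchDelayMs); // pause between batches, not before the first
    const batch = items.slice(start, start + batchSize);
    // Requests within one batch run concurrently.
    results.push(...(await Promise.all(batch.map((item) => worker(item)))));
  }
  return results;
}
```

With 48 locales and the defaults, this yields 10 batches and 9 pauses, so at least 18s of deliberate waiting per run.
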
@dcalhoun dcalhoun changed the title from "ci: Reduce CI translation downloads to avoid 429 errors" to "ci: Mitigate translation download 429 errors" Feb 15, 2026
Set L10N_BATCH_SIZE=2 and L10N_BATCH_DELAY_MS=5000 on the Build React
App step to match the parameters that succeeded locally. The defaults
(batch=5, delay=2000ms) are still too aggressive for translate.wordpress.org
in CI, where en-nz got 429'd on all 5 retry attempts in build 1346.
Comment on lines +9 to +13
+ key: build-react
+ command: |
+   make build REFRESH_L10N=1 REFRESH_JS_BUILD=1 STRICT_L10N=1
+   tar -czf dist.tar.gz dist/
+   buildkite-agent artifact upload dist.tar.gz
Member Author

Upload the build artifact for later tasks to avoid unnecessary duplicate builds.

Comment on lines +27 to +32
+ depends_on: build-react
  command: |
-   make build REFRESH_L10N=1 REFRESH_JS_BUILD=1
+   buildkite-agent artifact download dist.tar.gz .
+   tar -xzf dist.tar.gz
    rm -rf ./android/Gutenberg/src/main/assets/
    cp -r ./dist/. ./android/Gutenberg/src/main/assets
Member Author

Rely upon the single build created by an earlier task.

Member Author

Make the script more robust by adding more sophisticated backoff logic for 429 errors.
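
A retry loop of the kind this comment refers to might look like the sketch below. It assumes `maxRetries` counts retries after the first attempt, which the actual script may count differently. `withRetry`, `delayFor`, and `wait` are hypothetical names, and the default `delayFor` is only the linear 1s, 2s, 3s schedule.

```javascript
// Sketch only: names and retry-count semantics are assumptions.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function withRetry(
  task,
  { maxRetries = 5, delayFor = (error, attempt) => 1000 * attempt, wait = sleep } = {}
) {
  for (let attempt = 1; ; attempt += 1) {
    try {
      return await task();
    } catch (error) {
      if (attempt > maxRetries) throw error; // retries exhausted: surface the failure
      await wait(delayFor(error, attempt)); // back off before the next try
    }
  }
}
```

Passing a 429-aware `delayFor` lets the same loop honor `Retry-After` or fall back to exponential backoff.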

"generate-version": "node bin/generate-version.js",
"lint:js": "eslint . --ext js,jsx --report-unused-disable-directives --max-warnings 0",
"lint:js:fix": "eslint . --ext js,jsx --report-unused-disable-directives --max-warnings 0 --fix",
"postinstall": "patch-package && npm run prep-translations && npm run generate-version",
Member Author

Remove the default translation downloads that occur in npm's postinstall script.

The downside is that anyone who runs npm install on a fresh clone, rather than the project's documented make dev-server or make build, will end up without translation strings.

The upside is that avoiding unnecessary or duplicative translation string fetches is far simpler.

- command: make build REFRESH_L10N=1 REFRESH_JS_BUILD=1
+ key: build-react
+ command: |
+   make build REFRESH_L10N=1 REFRESH_JS_BUILD=1 STRICT_L10N=1
Member Author

Add STRICT_L10N=1 to ensure the release build fails if downloading translation strings fails. Otherwise, the release may lack expected translations.

@dcalhoun dcalhoun marked this pull request as ready for review February 16, 2026 15:44
@dcalhoun dcalhoun requested a review from nbradbury February 16, 2026 15:44
Contributor

@nbradbury nbradbury left a comment

All good! :shipit:

@dcalhoun dcalhoun merged commit b5cc7c9 into trunk Feb 16, 2026
12 checks passed
@dcalhoun dcalhoun deleted the ci/avoid-unnecessary-tasks branch February 16, 2026 17:47