feat(scripts): convert PO→JSON translation pipeline po2json to cross-platform Python version#39724

Open
FrancescoCastaldi wants to merge 19 commits into apache:master from FrancescoCastaldi:feat/add-compile-po-script

Conversation

@FrancescoCastaldi
Contributor

SUMMARY

Adds scripts/compile_po.py, a standalone CLI utility to compile and convert Apache Superset translation files in a single automated pipeline. The script:

  • Checks Python (Babel) and Node/npm/npx dependencies automatically, installing missing ones
  • Compiles .po files with pybabel compile
  • Syncs compiled PO files to superset-frontend/src/translations
  • Installs po2json and prettier locally once (avoids repeated npx -y downloads)
  • Converts all .po files to JED 1.x JSON format in parallel using ThreadPoolExecutor (up to min(8, cpu_count*2) workers)
  • Runs prettier on all generated JSON files in a single batch call
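
The parallel conversion step in the list above could be sketched roughly as follows. This is a minimal illustration, not the script's actual code: `worker_count`, `convert_all`, and `fake_convert` are hypothetical names, and `fake_convert` stands in for the real po2json subprocess call so the example stays self-contained.

```python
from __future__ import annotations

import os
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable


def worker_count() -> int:
    # Cap taken from the summary: min(8, cpu_count * 2).
    # os.cpu_count() may return None, so fall back to 1.
    return min(8, (os.cpu_count() or 1) * 2)


def convert_all(
    po_files: list[Path], convert: Callable[[Path], Path]
) -> list[Path]:
    """Apply `convert` to each .po file on a thread pool, keeping input order."""
    with ThreadPoolExecutor(max_workers=worker_count()) as pool:
        return list(pool.map(convert, po_files))


def fake_convert(po_file: Path) -> Path:
    # Hypothetical stand-in for a po2json call; only computes the output path.
    return po_file.with_suffix(".json")
```

Threads are a reasonable fit here because each conversion would spend its time waiting on an external process rather than running Python code.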

Usage:

python scripts/compile_po.py

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

  1. Clone the repository and activate a Python virtualenv with Babel installed.
  2. Ensure Node.js, npm and npx are available in PATH.
  3. Run python scripts/compile_po.py from the repository root.
  4. Verify that .mo files are generated under superset/translations/ and .json files appear under superset-frontend/src/translations/.
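
Step 4 can be checked by hand, or with a small helper along these lines. This is a sketch under one assumption: each output file sits next to its .po source and shares its stem. The function name `missing_outputs` is hypothetical and is not part of the PR.

```python
from __future__ import annotations

from pathlib import Path


def missing_outputs(translations_dir: Path, suffix: str) -> list[Path]:
    """Return expected sibling outputs (e.g. .mo or .json) that do not exist."""
    po_files = sorted(translations_dir.rglob("*.po"))
    expected = (po_file.with_suffix(suffix) for po_file in po_files)
    return [path for path in expected if not path.is_file()]
```

For example, `missing_outputs(Path("superset/translations"), ".mo")` would return an empty list once every .po file has a compiled .mo sibling.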

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

Adds scripts/compile_po.py, a standalone utility to compile and convert
Apache Superset translation files. The script:

- Checks Python (Babel) and Node/npm/npx dependencies automatically
- Compiles .po files with pybabel
- Syncs compiled PO files to superset-frontend/src/translations
- Installs po2json and prettier locally (avoids repeated npx downloads)
- Converts all .po files to JED 1.x JSON format in parallel (ThreadPoolExecutor)
- Runs prettier on all generated JSON files in a single batch

Usage:
  python scripts/compile_po.py
@github-actions github-actions Bot added the risk:ci-script PR modifies scripts that execute in CI (supply chain risk) label Apr 28, 2026
@dosubot dosubot Bot added the i18n:general Related to translations label Apr 28, 2026
@codecov

codecov Bot commented Apr 28, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 64.37%. Comparing base (dc1c0f6) to head (9a8dd13).
⚠️ Report is 17 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #39724      +/-   ##
==========================================
+ Coverage   64.35%   64.37%   +0.01%     
==========================================
  Files        2569     2569              
  Lines      134680   134684       +4     
  Branches    31254    31255       +1     
==========================================
+ Hits        86679    86707      +28     
+ Misses      46505    46479      -26     
- Partials     1496     1498       +2     
| Flag | Coverage Δ |
|---|---|
| hive | 39.67% <ø> (?) |
| mysql | 59.94% <ø> (-0.01%) ⬇️ |
| postgres | 60.02% <ø> (-0.01%) ⬇️ |
| presto | 41.42% <ø> (-0.01%) ⬇️ |
| python | 61.55% <ø> (+0.04%) ⬆️ |
| sqlite | 59.64% <ø> (-0.01%) ⬇️ |
| unit | 100.00% <ø> (ø) |

@netlify

netlify Bot commented Apr 28, 2026

Deploy Preview for superset-docs-preview ready!

Name Link
🔨 Latest commit e6e8e36
🔍 Latest deploy log https://app.netlify.com/projects/superset-docs-preview/deploys/69f0baa649f9230008958cf7
😎 Deploy Preview https://deploy-preview-39724--superset-docs-preview.netlify.app

Added Apache License information to the top of the script.
Contributor

@bito-code-review bito-code-review Bot left a comment

Code Review Agent Run #1072ef

Actionable Suggestions - 2
  • scripts/compile_po.py - 2
Additional Suggestions - 1
  • scripts/compile_po.py - 1
    • Missing License Header · Line 1-1
      New files in this repository must include the Apache license header. Please add the standard ASF header above the shebang line.
Review Details
  • Files reviewed - 1 · Commit Range: e6e8e36..e6e8e36
    • scripts/compile_po.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset. You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by Bito Logo

@bito-code-review
Contributor

bito-code-review Bot commented Apr 28, 2026

Code Review Agent Run #f5ee6b

Actionable Suggestions - 0
Review Details
  • Files reviewed - 1 · Commit Range: e6e8e36..9d648e4
    • scripts/compile_po.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Fix all pre-commit check failures:
- I001: sort imports alphabetically (glob, importlib.util, os, shutil, subprocess, sys)
- S603: add noqa comment on subprocess.run call
- C901: reduce compile_translations complexity by extracting _run_pybabel, _sync_po_to_frontend, _convert_po_files_parallel, _run_prettier helpers
- E501: break long lines at 88 chars
- mypy: add full type annotations to all functions (from __future__ import annotations)
Contributor Author

@FrancescoCastaldi FrancescoCastaldi left a comment

fixed

Refactor compile_po.py to improve npm package installation and conversion logic.
Contributor Author

@FrancescoCastaldi FrancescoCastaldi left a comment

fix

@rusackas
Member

Thanks for putting this together. A few questions before I can sign off:

  1. Overlap with existing tooling: superset-frontend/scripts/po2json.sh already runs po2json + prettier and is wired to npm run build-translation in package.json. How is this new script meant to fit alongside it? Should compile_po.py replace po2json.sh and the build-translation script, or are they intended to coexist? If they coexist, what's the guidance for contributors on which to use?

  2. Bito's two actionable findings are still showing as unaddressed in the review thread:

    • 'Incorrect Fallback Condition' at line 74
    • 'Broken npx Fallback' at lines 85-89 (looks like the [npx_cmd, 'po2json'] fallback is missing -y / --package and would prompt or fail)

    Could you reply on those specific threads (or link to the commits that addressed them) so it's clear what changed?

  3. Pre-commit check is failing on the latest commit — usually a quick local pre-commit run --all-files and force-push will sort it out.

  4. Tests? This is a 297-line script with subprocess shell-out, a threadpool, and platform branching. A small smoke test (or even a dry-run mode) would make this a lot easier to maintain.

  5. Auto-installing packages (pip install Babel, npm install --no-save) from inside the script is convenient for dev machines but a bit of a footgun if the script ever runs in CI or a container. Worth gating behind a --install flag, or at least documenting the side effects?

No objection to the parallelization idea — just want to make sure we're not adding a second translation pipeline indefinitely.
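
One way points 4 and 5 could look in practice is sketched below. The `--install` and `--dry-run` flags are hypothetical and do not exist in the PR as described; the helper names are likewise illustrative only.

```python
from __future__ import annotations

import argparse
import subprocess
import sys


def parse_args(argv: list[str] | None = None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Compile translation files")
    # Hypothetical opt-in flags; installs and side effects are off by default.
    parser.add_argument(
        "--install", action="store_true", help="allow installing missing tools"
    )
    parser.add_argument(
        "--dry-run", action="store_true", help="print commands, do not run them"
    )
    return parser.parse_args(argv)


def run(cmd: list[str], dry_run: bool) -> int:
    """Run a command, or only print it when dry_run is set."""
    if dry_run:
        print("would run:", " ".join(cmd))
        return 0
    return subprocess.run(cmd, check=False).returncode  # noqa: S603


def install_babel(args: argparse.Namespace) -> int:
    """Install Babel only when the caller opted in with --install."""
    if not args.install:
        print("Babel is missing; re-run with --install", file=sys.stderr)
        return 1
    return run([sys.executable, "-m", "pip", "install", "Babel"], args.dry_run)
```

With this shape, a CI job or container build that omits `--install` fails fast instead of changing its environment, and `--dry-run` gives a cheap smoke test of the command construction.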

@hainenber
Contributor

Adding my concern(s) on top of excellent ones from @rusackas.

IMO, the PO->JSON translation step isn't enough of a bottleneck to warrant parallelization and its related complexity.

For example, this job ran for around 45 seconds.

ruff-format reformatted 1 file (scripts/compile_po.py)
@bito-code-review
Contributor

bito-code-review Bot commented May 1, 2026

Code Review Agent Run #ee5bf9

Actionable Suggestions - 0
Review Details
  • Files reviewed - 1 · Commit Range: 9d648e4..f591263
    • scripts/compile_po.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

…k in compile_po.py

- fix 'Incorrect Fallback Condition': remove early return None, None after npm install warning, allowing the code to proceed to check if local binaries exist (fixes unreachable code at lines 103-110)
- fix 'Broken npx Fallback': add -y flag to npx calls for po2json and prettier to avoid interactive prompts in non-TTY environments
@FrancescoCastaldi
Copy link
Copy Markdown
Contributor Author

FrancescoCastaldi commented May 3, 2026

Thanks for the feedback @rusackas and @hainenber!

Re: overlap with po2json.sh
The main motivation for compile_po.py is Windows compatibility. The existing po2json.sh is a Bash script and requires a Unix-like shell (WSL, Git Bash, etc.) to run on Windows; it cannot be executed directly in a standard Windows environment (cmd.exe or PowerShell). This makes the translation compilation workflow inaccessible to Windows contributors and less experienced users who may not have WSL set up.

compile_po.py is a pure Python script that works cross-platform out of the box: it auto-detects the OS, resolves the correct binary extensions (.cmd on Windows), and handles Node/npm/npx commands accordingly. It also wraps both pybabel compile and po2json in a single command, so contributors only need to run python scripts/compile_po.py without needing to know the underlying toolchain.
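
The platform-specific name resolution described above might look something like this. It is a sketch only: `node_tool_name` and `find_tool` are hypothetical helpers, not functions from the PR.

```python
from __future__ import annotations

import os
import shutil


def node_tool_name(name: str, os_name: str = os.name) -> str:
    """Return the file name npm uses for a tool's launcher on this platform."""
    # npm creates .cmd shims on Windows ("nt"); elsewhere the bare name is used.
    return f"{name}.cmd" if os_name == "nt" else name


def find_tool(name: str) -> str | None:
    """Return the full path of a tool found on PATH, or None when absent."""
    return shutil.which(node_tool_name(name))
```

Passing `os_name` explicitly keeps the Windows branch testable on any platform.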

The intent is not to remove po2json.sh immediately; the two can coexist. But for Windows users and less technical contributors, compile_po.py provides a much friendlier entry point. Long term it could replace po2json.sh if the project wants a single cross-platform solution, but that can be a separate discussion.

Re: parallelization bottleneck (@hainenber)
Fair point on the CI runner: the job is already fast (~45s). The parallelization benefit is more relevant in local development on machines with many locale files and slower I/O, where the sequential approach can feel sluggish. That said, I'm open to simplifying if the added complexity isn't worth it; the cross-platform Python wrapper itself remains the core value regardless of whether parallelism is used.

Re: Bito's findings (fallback condition + npx -y)
Both issues have been addressed in the latest commit (0ce3ebd): the early return None, None in install_npm_packages has been removed so the binary-existence check is always reached, and the -y flag has been added to all npx calls to avoid interactive prompts in non-TTY environments.
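
For readers following along, the fallback behavior described here can be sketched as below. This is an illustration rather than the committed code: `tool_command` is a hypothetical name, and `bin_dir` is assumed to be a directory such as a local `node_modules/.bin`.

```python
from __future__ import annotations

import os
from pathlib import Path


def tool_command(name: str, bin_dir: Path) -> list[str]:
    """Prefer a locally installed binary; otherwise fall back to npx."""
    suffix = ".cmd" if os.name == "nt" else ""
    local = bin_dir / f"{name}{suffix}"
    # Always check for the local binary, even if an earlier install step warned.
    if local.is_file():
        return [str(local)]
    # -y (--yes) lets npx fetch the package without an interactive prompt,
    # which matters in non-TTY environments such as CI.
    return [f"npx{suffix}", "-y", name]
```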

@bito-code-review
Contributor

bito-code-review Bot commented May 3, 2026

Code Review Agent Run #f6ea96

Actionable Suggestions - 0
Review Details
  • Files reviewed - 1 · Commit Range: f591263..0ce3ebd
    • scripts/compile_po.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

@hainenber
Contributor

@FrancescoCastaldi thanks for sharing your perspective there. I wholly agree on making the po2json script cross-platform: the lower the development barrier, the better for new contributors.

Having said that, the original script is extremely simple and IMO, if anything, it'd be better to have a 1-to-1 Python/JS equivalent. The more lines and branches of code, the more maintenance effort in the long run.

Re: possible benefit of parallelization on local machines, I don't think there is hardware out there that gets bogged down by this lightweight translation, unless you're on an Arduino or similar. I'd need realistic stats on that.

…lent of po2json.sh

Remove parallelization (ThreadPoolExecutor, MAX_WORKERS) and all
associated complexity in response to reviewer feedback.

The new script is a minimal, sequential Python equivalent of
po2json.sh: it iterates over .po files, calls po2json and prettier
one at a time, and exits with a non-zero code on failure.
This keeps the cross-platform benefit while minimizing maintenance
overhead and line/branch count.
@pull-request-size pull-request-size Bot added size/M and removed size/L labels May 4, 2026
@FrancescoCastaldi
Contributor Author

FrancescoCastaldi commented May 4, 2026

@hainenber Thanks for the thorough feedback; you're right on both counts.

I've just pushed a complete rewrite of compile_po.py that drops all the parallelization logic (ThreadPoolExecutor, MAX_WORKERS, the worker-pool machinery) and reduces the script to a simple sequential loop, a direct 1-to-1 Python equivalent of po2json.sh:

  • Iterates over .po files with glob
  • Calls po2json and prettier one at a time, in order
  • Exits with a non-zero code on failure
  • 63 lines total (down from 278), no branches beyond basic error checking

The only thing kept over the shell script is cross-platform compatibility (Windows path handling, no bash dependency), which was the original motivation for the Python port.
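
A sequential loop of the kind described above might be shaped roughly like this. It is a sketch, not the pushed script: the function names are hypothetical, and the po2json flags are assumed to mirror those used in po2json.sh, so they should be checked against that script.

```python
from __future__ import annotations

import subprocess
from pathlib import Path
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], int]


def commands_for(po_file: Path) -> list[list[str]]:
    """Build the po2json and prettier invocations for one .po file."""
    json_file = po_file.with_suffix(".json")
    # Assumed flags (domain, JED 1.x format, fuzzy); verify against po2json.sh.
    po2json = ["po2json", "--domain", "superset", "--format", "jed1.x", "--fuzzy"]
    return [
        po2json + [str(po_file), str(json_file)],
        ["prettier", "--write", str(json_file)],
    ]


def run_command(cmd: Sequence[str]) -> int:
    return subprocess.run(list(cmd), check=False).returncode  # noqa: S603


def convert_tree(root: Path, run: Runner = run_command) -> int:
    """Convert every .po file under `root` in order; stop at the first failure."""
    for po_file in sorted(root.rglob("*.po")):
        for cmd in commands_for(po_file):
            code = run(cmd)
            if code != 0:
                return code
    return 0
```

Injecting the runner keeps the loop testable without Node installed; a caller would pass the result of `convert_tree(...)` to `sys.exit` to get the non-zero exit code on failure.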

Let me know if this looks closer to what you had in mind!

On Windows, npm global binaries (po2json, prettier) are installed as
.cmd batch scripts and cannot be resolved by subprocess.run() with a
list argument. Adding shell=True when os.name == 'nt' lets the Windows
shell (cmd.exe) find and execute .cmd files transparently, while
keeping shell=False on Unix for safety.
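
The approach in this commit message can be illustrated with a short sketch. `run_tool` is a hypothetical wrapper, not the function in the script; it also passes `check` explicitly, since the return code is inspected by the caller.

```python
from __future__ import annotations

import os
import subprocess
import sys


def run_tool(cmd: list[str]) -> int:
    """Run a command, using the shell only on Windows so .cmd shims resolve."""
    use_shell = os.name == "nt"
    # check=False is explicit: the caller decides what a non-zero code means.
    result = subprocess.run(  # noqa: S602, S603
        cmd, shell=use_shell, check=False
    )
    return result.returncode
```

Because `shell=True` is limited to Windows and the arguments come from the script itself rather than user input, the usual shell-injection concern is narrowed, though not removed.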
Contributor

@bito-code-review bito-code-review Bot left a comment

Code Review Agent Run #7ed816

Actionable Suggestions - 1
  • scripts/compile_po.py - 1
    • subprocess.run without explicit check argument · Line 33-33
Review Details
  • Files reviewed - 1 · Commit Range: 0ce3ebd..1e620e3
    • scripts/compile_po.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

@hainenber hainenber changed the title from "feat(scripts): add compile_po.py – parallel PO→JSON translation pipeline" to "feat(scripts): convert PO→JSON translation pipeline po2json to cross-platform Python version" May 5, 2026
Co-authored-by: bito-code-review[bot] <188872107+bito-code-review[bot]@users.noreply.github.com>
@bito-code-review
Contributor

bito-code-review Bot commented May 5, 2026

Code Review Agent Run #262c32

Actionable Suggestions - 0
Review Details
  • Files reviewed - 1 · Commit Range: 1e620e3..9a8dd13
    • scripts/compile_po.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Labels

i18n:general Related to translations risk:ci-script PR modifies scripts that execute in CI (supply chain risk) size/M

3 participants