Feature/docker deploy #14
Conversation
Important: Review skipped. Auto reviews are disabled on base/target branches other than the default branch.
🗂️ Base branches to auto review (4)
Please check the settings in the CodeRabbit UI or the ⚙️ Run configuration.
Configuration used: Organization UI | Review profile: CHILL | Plan: Pro
📝 Walkthrough
This pull request establishes comprehensive Docker containerization for a Weblate translation management system with a custom "boost-weblate" distribution. It introduces a multi-stage Dockerfile, docker-compose orchestration for Weblate and Redis services, Supervisor daemon configurations for Celery workers and web services, Nginx reverse-proxy templates with optional SSL support, a CD pipeline for automated deployment, and updates to QuickBook format handling and Django settings to integrate new Boost features.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
@coderabbitai, full review.

✅ Actions performed: Full review triggered.
Actionable comments posted: 20
🧹 Nitpick comments (4)
docker/etc/supervisor/conf.d/check.conf (1)
2-2: Consider making the deploy-check delay configurable. Line 2 hardcodes sleep 60; exposing this as an env-driven value would reduce startup flakiness across slower/faster deployments.
♻️ Suggested tweak

-command=bash -c "sleep 60 && exec /app/venv/bin/weblate check --deploy"
+command=bash -c "sleep ${WEBLATE_DEPLOY_CHECK_DELAY:-60} && exec /app/venv/bin/weblate check --deploy"

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docker/etc/supervisor/conf.d/check.conf` at line 2, The hardcoded startup delay "sleep 60" in the supervisor command for running "/app/venv/bin/weblate check --deploy" should be made configurable via an environment variable (e.g., DEPLOY_CHECK_DELAY or DEPLOY_CHECK_SECONDS); change the command string used by the supervisor entry (the line containing command=bash -c "sleep 60 && exec /app/venv/bin/weblate check --deploy") to read the env var with a sensible default (use ${DEPLOY_CHECK_DELAY:-60}) and ensure the shell expansion is used so non-set or invalid values fall back to 60; also add a short validation guard in the same one-liner (optional) to coerce or default non-numeric values so the supervisor won't fail on startup.
weblate/settings_docker.py (1)
1498-1499: Revisit these Docker defaults; they change boost behavior significantly. Line 1498 defaults OpenRouter batch translation to enabled, and Line 1499 doubles the translation wait default to 300s. That combination increases data-sharing risk and queue latency by default in Docker deployments.
Based on learnings, in `weblate/boost_endpoint/services.py` the 150-second wait is intentional and required for the auto-translation flow.
♻️ Suggested alignment

-AUTO_BATCH_TRANSLATE_VIA_OPENROUTER = get_env_bool("AUTO_BATCH_TRANSLATE_VIA_OPENROUTER", True)
-BOOST_ENDPOINT_ADD_TRANSLATION_SECONDS = get_env_int("BOOST_ENDPOINT_ADD_TRANSLATION_SECONDS", 300)
+AUTO_BATCH_TRANSLATE_VIA_OPENROUTER = get_env_bool("AUTO_BATCH_TRANSLATE_VIA_OPENROUTER", False)
+BOOST_ENDPOINT_ADD_TRANSLATION_SECONDS = get_env_int("BOOST_ENDPOINT_ADD_TRANSLATION_SECONDS", 150)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@weblate/settings_docker.py` around lines 1498 - 1499, The Docker defaults unintentionally enable OpenRouter batch translation and increase the translation wait to 300s; change AUTO_BATCH_TRANSLATE_VIA_OPENROUTER to default False and BOOST_ENDPOINT_ADD_TRANSLATION_SECONDS to default 150 by updating the calls to get_env_bool("AUTO_BATCH_TRANSLATE_VIA_OPENROUTER", False) and get_env_int("BOOST_ENDPOINT_ADD_TRANSLATION_SECONDS", 150) so the boost endpoint behavior matches the intended auto-translation flow used in boost_endpoint/services.py and avoids increased data-sharing risk and queue latency.
.github/workflows/cd.yml (1)
31-33: Poll the health endpoint instead of sleeping a fixed 5 minutes. A fixed sleep 300 makes every successful deploy wait the full five minutes, and it still fails rollouts that become healthy just after the sleep elapses.
Suggested refactor

 script: |
-  sleep 300
-  curl -sf http://localhost:8000/healthz/
+  attempt=0
+  while [ "$attempt" -lt 60 ]; do
+    if curl -sf http://localhost:8000/healthz/ > /dev/null; then
+      exit 0
+    fi
+    attempt=$((attempt + 1))
+    sleep 5
+  done
+  exit 1

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/cd.yml around lines 31 - 33, The workflow currently uses a hardcoded "sleep 300" which forces a 5-minute wait; replace that with a polling loop in the same "script" block that repeatedly calls the existing curl command (curl -sf http://localhost:8000/healthz/) at short intervals (e.g., 5–10s) until it succeeds or a max timeout is reached (e.g., 300s), exiting success as soon as curl returns OK and failing if the timeout elapses; update the script to perform retries, a sleep between attempts, and a final non-zero exit on overall timeout so deployments proceed as soon as /healthz/ is healthy.
docker/Dockerfile (1)
10-10: Trim build-only sources from the runtime image. Line 51 copies the entire /app tree into the final stage, so /app/boost-weblate and /app/src ship alongside the installed venv. That blunts the multi-stage split and increases the production image surface without an obvious runtime need.
Also applies to: 50-51
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docker/Dockerfile` at line 10, The final-stage Dockerfile currently uses "COPY . /app/boost-weblate/" which pulls the entire build tree (including /app/boost-weblate and /app/src) into the runtime image; change the final stage to copy only the runtime artifacts from the build stage (for example the virtualenv/venv, installed site-packages, compiled assets, and the packaged app entrypoint) instead of the full source tree, or use a .dockerignore to exclude build-only files—update the COPY in the final stage (and lines that reference /app/boost-weblate and /app/src) to reference only the needed runtime paths (e.g., COPY --from=builder /app/venv /app/venv and COPY --from=builder /app/dist/<package> /app/) so build sources are not shipped.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.dockerignore:
- Around line 5-31: Add an entry to the .dockerignore to exclude the
docker/environment folder from Docker build context so secrets aren't uploaded;
update the existing .dockerignore (the section listing data, logs, .venv, etc.)
to include a line for docker/environment and optionally docker/environment/*,
ensuring there are no overriding negation patterns that would re-include it.
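A minimal sketch of the suggested addition; the neighboring entries are illustrative stand-ins for whatever the existing .dockerignore already lists:

```
# build-only and runtime data already excluded
data
logs
.venv

# keep deployment secrets out of the Docker build context
docker/environment
docker/environment/*
```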
In @.github/workflows/cd.yml:
- Around line 18-22: Replace the non-deterministic "git pull origin dev" with a
pinned checkout of the workflow-triggering revision and enable fail-fast shell
behavior: add "set -eu" at the start of the script (so failures abort) and
replace the pull with a fetch+checkout of the commit referenced by the workflow
(use the workflow SHA variable, e.g. GITHUB_SHA or the equivalent
action-provided ref) before running "cd docker" and "docker compose up -d
--build"; ensure the checkout step happens in the same script block that runs
"cd /opt/boost-weblate" so the working tree used by "docker compose up -d
--build" is the exact revision that triggered the run.
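Assuming the deploy step runs a remote shell script over SSH (the paths and surrounding workflow keys are taken from the prompt above; everything else is an assumption), the pinned checkout could look like:

```yaml
script: |
  set -eu                              # abort on any failure or unset variable
  cd /opt/boost-weblate
  git fetch origin "$GITHUB_SHA"       # the exact revision that triggered the run
  git checkout --detach "$GITHUB_SHA"
  cd docker
  docker compose up -d --build
```

Note that `GITHUB_SHA` only exists on the runner, so it must be forwarded to the remote environment (for example via the SSH action's `envs` input).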
In `@docker/docker-compose.yml`:
- Around line 4-29: The YAML lists under the weblate service are over-indented
causing pre-commit YAML lint failures; fix by aligning all sequence items for
volumes, extra_hosts, depends_on, env_file, ports, and tmpfs to the repository's
YAML style (reduce the two-space excess indentation so each dash begins at the
same column as other top-level keys within the weblate block); update the
sequences in the weblate service (volumes, extra_hosts, depends_on, env_file,
ports, tmpfs) to use consistent indentation so the file parses and pre-commit
passes.
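A sketch of the expected indentation, with placeholder values where the real file's entries are not quoted here:

```yaml
services:
  weblate:
    volumes:
      - weblate-data:/app/data      # each dash sits one level under its key
    extra_hosts:
      - "host.docker.internal:host-gateway"
    depends_on:
      - redis
    env_file:
      - ./environment
    ports:
      - "8000:8080"
    tmpfs:
      - /run
```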
In `@docker/Dockerfile`:
- Around line 66-77: The Dockerfile currently makes /etc/group writable via the
command that runs "chmod 664 /etc/passwd /etc/group"; update that RUN line to
avoid changing /etc/group permissions—only make /etc/passwd writable. Locate the
RUN chain containing "chmod 664 /etc/passwd /etc/group" and remove /etc/group
from that chmod invocation (or explicitly set /etc/group to a safe read-only
mode while only chmod-ing /etc/passwd), keeping the rest of the RUN steps (nginx
links, logs, sed on /etc/pam.d/su) unchanged.
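A sketch of the corrected RUN step; only /etc/passwd is relaxed, and the other steps mentioned above (nginx links, logs, the PAM sed) stay as they were:

```dockerfile
# Allow the entrypoint to append a passwd entry for arbitrary UIDs,
# but keep /etc/group at its default read-only mode.
RUN chmod 664 /etc/passwd \
    && chmod 644 /etc/group
```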
- Around line 79-97: The Dockerfile's RUN block currently clones the mutable git
tag v0.74; replace that git clone flow with downloading the official release
archive po4a-0.74.tar.gz, verify it with the provided SHA256 checksum
(25fc323f2ba37bbd48c3af0ebf49952644b0e468261f98633e91219a838fe7c2), then tar xzf
it, cd into po4a-0.74, run perl Build.PL && ./Build build && ./Build install,
and finally remove the tarball and extracted dir and purge build deps — i.e.,
remove the git clone and use curl + sha256sum check before build/install in the
same RUN block to make image creation deterministic.
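A sketch of the deterministic install, under the assumption that the release tarball is published on the po4a GitHub releases page (adjust the URL to the real asset location; the checksum is the one quoted above):

```dockerfile
# Fetch a fixed release archive and refuse to build unless it matches
# the pinned SHA256, instead of cloning a mutable git tag.
RUN curl -fsSL -o po4a-0.74.tar.gz \
        "https://github.com/mquinson/po4a/releases/download/v0.74/po4a-0.74.tar.gz" \
    && echo "25fc323f2ba37bbd48c3af0ebf49952644b0e468261f98633e91219a838fe7c2  po4a-0.74.tar.gz" \
        | sha256sum -c - \
    && tar xzf po4a-0.74.tar.gz \
    && cd po4a-0.74 \
    && perl Build.PL \
    && ./Build build \
    && ./Build install \
    && cd .. \
    && rm -rf po4a-0.74 po4a-0.74.tar.gz
```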
In `@docker/environment.example`:
- Around line 20-22: The sample environment file exposes unsafe defaults: change
WEBLATE_ALLOWED_HOSTS=* to a localhost-only value (e.g., localhost or 127.0.0.1)
and change WEBLATE_MIN_PASSWORD_SCORE=0 to a nonzero sensible default (e.g., 2
or 3) or leave these variables unset and add a short inline note explaining they
must be set for production; update the entries for WEBLATE_ALLOWED_HOSTS and
WEBLATE_MIN_PASSWORD_SCORE in the docker/environment.example file (referencing
those exact variable names) to use the safer defaults and include a brief
comment guiding users to configure them for deployment.
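A sketch of safer example values (the hostnames are placeholders; production deployments must set their own):

```
# Comma-separated hostnames Django will serve; '*' disables host checking.
# Set to your real domain(s) before deploying.
WEBLATE_ALLOWED_HOSTS=localhost,127.0.0.1
# zxcvbn strength score 0-4; 0 accepts any password.
WEBLATE_MIN_PASSWORD_SCORE=3
```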
- Around line 113-115: The sample environment enables OpenRouter batch
translation by default via the AUTO_BATCH_TRANSLATE_VIA_OPENROUTER variable;
change this to be opt-in by setting AUTO_BATCH_TRANSLATE_VIA_OPENROUTER=0
(disabled) in docker/environment.example and update the adjacent comment to
clearly state that enabling it will send content to a third-party service and
must be explicitly opted into (e.g., "set to 1 to enable OpenRouter batch
translation (off by default)"); ensure the variable name
AUTO_BATCH_TRANSLATE_VIA_OPENROUTER is the one updated so copies of the example
do not transmit data by default.
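The opt-in default could read:

```
# Set to 1 to enable OpenRouter batch translation (off by default).
# Enabling it sends translation content to a third-party service.
AUTO_BATCH_TRANSLATE_VIA_OPENROUTER=0
```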
In `@docker/etc/nginx/default.tpl`:
- Around line 46-49: The redirect embeds $scheme://$host$request_uri raw, which
can break the outer query string; change location `@redirectToAnubis` so the
target (the full URL built from $scheme://$host$request_uri) is URL-encoded
before being placed into the redir= parameter. Implement this by using nginx's
njs/js module (or an equivalent encoding mechanism available in your stack) to
compute an encoded variable (e.g., via js_set/ngx_http_js_module's
encodeURIComponent) and then use that encoded variable in the return 307 ...
?redir=$ENCODED_TARGET; update the return line inside location `@redirectToAnubis`
accordingly.
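One possible shape using the njs module. The module file path, function names, and the `/challenge/` target are assumptions; only the `@redirectToAnubis` location name comes from the template itself:

```nginx
# in the http context: load the njs helper and expose it as a variable
js_import redir from /etc/nginx/njs/redir.js;
js_set $encoded_target redir.encodedTarget;

location @redirectToAnubis {
    # /challenge/ is a placeholder for the real Anubis endpoint
    return 307 /challenge/?redir=$encoded_target;
}
```

```javascript
// /etc/nginx/njs/redir.js -- URL-encode the original request target
function encodedTarget(r) {
    return encodeURIComponent(r.variables.scheme + '://' +
                              r.variables.host + r.variables.request_uri);
}
export default { encodedTarget };
```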
- Around line 70-76: The X-Forwarded-For header is currently only set inside the
WEBLATE_BUILTIN_SSL conditional causing backend IPs to be lost for non-SSL
deployments; move the proxy_set_header X-Forwarded-For
$proxy_add_x_forwarded_for; line out of the {% if WEBLATE_BUILTIN_SSL %} ... {%
endif %} block so it is always sent, but leave proxy_set_header
X-Forwarded-Proto $scheme; inside the conditional (so X-Forwarded-Proto remains
set only when WEBLATE_BUILTIN_SSL is true); update the template around the
existing proxy_set_header Host, proxy_read_timeout, and proxy_connect_timeout
entries to reflect this change.
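A sketch of the reordered template; the timeout values are placeholders for whatever the file already uses:

```nginx
proxy_set_header Host $host;
# always forward the client IP chain, SSL or not
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
{% if WEBLATE_BUILTIN_SSL %}
# only claim the scheme when the built-in TLS terminator is active
proxy_set_header X-Forwarded-Proto $scheme;
{% endif %}
proxy_read_timeout 3600;
proxy_connect_timeout 90;
```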
In `@docker/etc/nginx/generate-site.py`:
- Line 1: The file generate-site.py currently only contains a shebang and is
missing the project-required GPL-3.0-or-later license header; update the top of
the file (above or immediately after the existing "#!/usr/bin/env python3"
shebang) to insert the standard GPL-3.0-or-later header comment block used
across the repo so the file complies with the coding guidelines; ensure the
header mentions "GPL-3.0-or-later" and matches the exact format used in other
Python files in the project.
- Around line 39-40: The current use of print(...) in generate-site.py around
the template.render call triggers the lint T201; replace the print call with
sys.stdout.write(template.render(...)) and ensure sys is imported (add import
sys at the top if missing) so output is written via sys.stdout.write; if a
trailing newline is required preserve it by appending "\n" to the rendered
string.
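A self-contained sketch of the fix; `string.Template` stands in for whatever template engine generate-site.py actually uses, since only the output call matters here:

```python
#!/usr/bin/env python3
import sys
from string import Template  # stand-in for the real template engine


def render_site(template_text: str, context: dict) -> str:
    """Render the site template with the given variables."""
    return Template(template_text).substitute(context)


def main() -> None:
    rendered = render_site("server_name $host;", {"host": "weblate.example.org"})
    # sys.stdout.write avoids Ruff's T201 (print found); append the newline
    # that print() would have added so the output stays byte-identical
    sys.stdout.write(rendered + "\n")


if __name__ == "__main__":
    main()
```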
In `@docker/etc/nginx/nginx.conf`:
- Around line 65-66: nginx.conf currently only includes /etc/nginx/conf.d/* and
/etc/nginx/sites-enabled/* but docker/start generates
/tmp/nginx/weblate-site.conf, so update the configuration or generation: either
add an include for /tmp/nginx/*.conf (or /tmp/nginx/weblate-site.conf) to
nginx.conf, or change docker/start to write the generated weblate-site.conf
directly into /etc/nginx/conf.d/; adjust the include path or output path and
reload nginx accordingly (references: nginx.conf include lines, docker/start
script, /tmp/nginx/weblate-site.conf).
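The first option could be as small as one extra include line (the surrounding block is a sketch of the existing http context):

```nginx
http {
    include /etc/nginx/conf.d/*.conf;
    include /etc/nginx/sites-enabled/*;
    # pick up the site file rendered by docker/start at container startup
    include /tmp/nginx/*.conf;
}
```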
In `@docker/etc/supervisor/conf.d/celery-single.conf`:
- Line 3: The Supervisor command in celery-single.conf uses the bare "celery"
which can pick up the wrong binary or fail; update the "command =" line to call
the venv-installed Celery executable (the same explicit path pattern used in the
other worker configs), e.g. replace "celery" with the venv binary path like
"%(ENV_VENV_DIR)s/bin/celery" (or the project's VIRTUAL_ENV variable) so
Supervisor runs the correct interpreter for the single-worker command.
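A sketch of the corrected entry; the worker flags are placeholders for whatever the file already passes, and only the explicit binary path is the point:

```ini
[program:celery-single]
; explicit venv path so Supervisor never resolves a system-wide celery
command = /app/venv/bin/celery --app=weblate.utils worker
```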
In `@docker/health_check`:
- Around line 4-8: Update the two curl invocations inside the health-check
if-block (the curl against https://localhost:4443/healthz/ and the curl against
http://localhost:8080/healthz/) to include the --fail flag and reduce --max-time
from 30 to 3 (i.e., use --fail --silent --max-time 3) so HTTP error statuses
cause a non-zero exit and the timeout aligns with the Docker HEALTHCHECK
timeout.
- Around line 5-6: The health-check uses curl against
https://localhost:4443/healthz/ which causes hostname verification to fail when
the cert in /app/data/ssl/fullchain.pem is for the real site domain; update the
curl invocation used in the health check (the line starting with curl --silent
--max-time 30 --cacert ...) to request the certificate's actual hostname (or
supply a Host header) and map that hostname to localhost using curl's --resolve
option so the URL hostname matches the cert SAN (e.g., use the production domain
in the URL and add --resolve 'your.domain:4443:127.0.0.1' while still passing
--cacert /app/data/ssl/fullchain.pem).
In `@docker/requirements.txt`:
- Line 50: Remove the legacy, unpinned dependency "raven" from the requirements
list because the project uses the modern sentry integration
"sentry-sdk==2.53.0"; delete the standalone "raven" entry so only
"sentry-sdk==2.53.0" remains and ensure no other references to the symbol
"raven" are left in the dependency file.
In `@docker/start`:
- Around line 167-177: The probe builds a RawQuerySet but never executes it, so
the loop exits immediately; change the probe command used by run_weblate to
force query execution (e.g., evaluate the RawQuerySet by converting to a list or
iterating it) instead of just constructing it. Update the shell invocation that
currently runs User.objects.raw("SELECT 1") to a form that actually runs the SQL
(for example use list(User.objects.raw("SELECT 1")) or iterate the result) so
the until condition reflects the real DB availability; keep the rest of the loop
(the timeout, fail_dep PostgreSQL, and fallback check using
User.objects.exists()) the same.
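Django's RawQuerySet is lazy, so constructing it sends no SQL. A stdlib sketch of the same pitfall and its fix (the names here are stand-ins, not Django APIs):

```python
executed = []  # records which SQL statements actually ran


def run_sql(sql):
    # stand-in for the database round-trip
    executed.append(sql)
    return [(1,)]


def raw(sql):
    # lazy, like Django's RawQuerySet: building it runs nothing
    def rows():
        yield from run_sql(sql)
    return rows()


probe = raw("SELECT 1")          # the broken probe: no query has run yet
assert executed == []
list(probe)                      # forcing evaluation finally executes the SQL
assert executed == ["SELECT 1"]
```

This is why the probe must wrap the raw query in `list(...)` (or iterate it) before the loop can trust the result.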
- Around line 46-49: Ensure the /app/data/ssh directory is created (mkdir -p
/app/data/ssh) before touching known_hosts and stop blindly trusting ssh-keyscan
output; instead, if an environment variable like GITHUB_KNOWN_HOSTS is provided,
write that content into /app/data/ssh/known_hosts, otherwise fetch GitHub's
official host keys over HTTPS from the GitHub meta API (e.g., GET
https://api.github.com/meta and extract the SSH host keys) and write those
verified keys to /app/data/ssh/known_hosts; remove or guard ssh-keyscan usage so
it is only a fallback after explicit verification.
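A sketch of the verified setup. `GITHUB_KNOWN_HOSTS` is an assumed operator-provided variable; the fallback fetches GitHub's published SSH host keys from the meta API over HTTPS, so they arrive authenticated by TLS rather than by a blind ssh-keyscan:

```shell
setup_known_hosts() {
    dir="$1"
    mkdir -p "$dir"
    if [ -n "${GITHUB_KNOWN_HOSTS:-}" ]; then
        # operator supplied pre-verified keys
        printf '%s\n' "$GITHUB_KNOWN_HOSTS" > "$dir/known_hosts"
    else
        # fetch GitHub's published host keys and prefix the hostname
        curl -sf https://api.github.com/meta |
            python3 -c 'import json,sys; print("\n".join("github.com " + k for k in json.load(sys.stdin)["ssh_keys"]))' \
            > "$dir/known_hosts"
    fi
}
```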
In `@pyproject.toml`:
- Around line 191-201: The PYPI constant in weblate/utils/version.py is still
pointing at the old package name; update the PYPI variable to
"https://pypi.org/pypi/boost-weblate/json" so download_version_info() queries
the correct distribution; locate the PYPI symbol in the version.py module and
change its string value to the new distribution name ensuring any other
references to the old URL are updated as well.
In `@weblate/formats/quickbook.py`:
- Around line 98-105: The current code uses zip(tmpl_units, trans_units) which
silently truncates and can misalign translations; instead, after building
tmpl_units and trans_units (from store.units and translated_store.units), check
their lengths and refuse the positional import if they differ—e.g., log/raise an
error including the storefile_path/name and the counts, and do not perform the
for tmpl_unit, trans_unit in zip(...) assignments when counts mismatch; ensure
the guard is applied immediately after computing tmpl_units/trans_units (before
the loop that assigns tmpl_unit.target = trans_unit.source).
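A sketch of the length guard; `Unit` is a minimal stand-in for the real unit objects in quickbook.py, and `apply_positional` is a hypothetical helper name:

```python
class Unit:
    """Minimal stand-in for a translation unit."""

    def __init__(self, source, target=None):
        self.source = source
        self.target = target


def apply_positional(tmpl_units, trans_units, name="store"):
    # refuse a positional import when the unit counts differ, instead of
    # letting zip() silently truncate and misalign translations
    if len(tmpl_units) != len(trans_units):
        raise ValueError(
            f"{name}: unit count mismatch "
            f"({len(tmpl_units)} template vs {len(trans_units)} translated)"
        )
    for tmpl_unit, trans_unit in zip(tmpl_units, trans_units):
        tmpl_unit.target = trans_unit.source
```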
---
Nitpick comments:
In @.github/workflows/cd.yml:
- Around line 31-33: The workflow currently uses a hardcoded "sleep 300" which
forces a 5-minute wait; replace that with a polling loop in the same "script"
block that repeatedly calls the existing curl command (curl -sf
http://localhost:8000/healthz/) at short intervals (e.g., 5–10s) until it
succeeds or a max timeout is reached (e.g., 300s), exiting success as soon as
curl returns OK and failing if the timeout elapses; update the script to perform
retries, a sleep between attempts, and a final non-zero exit on overall timeout
so deployments proceed as soon as /healthz/ is healthy.
In `@docker/Dockerfile`:
- Line 10: The final-stage Dockerfile currently uses "COPY .
/app/boost-weblate/" which pulls the entire build tree (including
/app/boost-weblate and /app/src) into the runtime image; change the final stage
to copy only the runtime artifacts from the build stage (for example the
virtualenv/venv, installed site-packages, compiled assets, and the packaged app
entrypoint) instead of the full source tree, or use a .dockerignore to exclude
build-only files—update the COPY in the final stage (and lines that reference
/app/boost-weblate and /app/src) to reference only the needed runtime paths
(e.g., COPY --from=builder /app/venv /app/venv and COPY --from=builder
/app/dist/<package> /app/) so build sources are not shipped.
In `@docker/etc/supervisor/conf.d/check.conf`:
- Line 2: The hardcoded startup delay "sleep 60" in the supervisor command for
running "/app/venv/bin/weblate check --deploy" should be made configurable via
an environment variable (e.g., DEPLOY_CHECK_DELAY or DEPLOY_CHECK_SECONDS);
change the command string used by the supervisor entry (the line containing
command=bash -c "sleep 60 && exec /app/venv/bin/weblate check --deploy") to read
the env var with a sensible default (use ${DEPLOY_CHECK_DELAY:-60}) and ensure
the shell expansion is used so non-set or invalid values fall back to 60; also
add a short validation guard in the same one-liner (optional) to coerce or
default non-numeric values so the supervisor won't fail on startup.
In `@weblate/settings_docker.py`:
- Around line 1498-1499: The Docker defaults unintentionally enable OpenRouter
batch translation and increase the translation wait to 300s; change
AUTO_BATCH_TRANSLATE_VIA_OPENROUTER to default False and
BOOST_ENDPOINT_ADD_TRANSLATION_SECONDS to default 150 by updating the calls to
get_env_bool("AUTO_BATCH_TRANSLATE_VIA_OPENROUTER", False) and
get_env_int("BOOST_ENDPOINT_ADD_TRANSLATION_SECONDS", 150) so the boost endpoint
behavior matches the intended auto-translation flow used in
boost_endpoint/services.py and avoids increased data-sharing risk and queue
latency.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 6da324ff-8cad-4dd7-8338-eca0ca6d54ac
⛔ Files ignored due to path filters (1)
docker/etc/nginx/ffdhe2048.pem is excluded by !**/*.pem
📒 Files selected for processing (26)
- .dockerignore
- .github/workflows/cd.yml
- .gitignore
- docker/Dockerfile
- docker/docker-compose.yml
- docker/environment.example
- docker/etc/nginx/default.tpl
- docker/etc/nginx/generate-site.py
- docker/etc/nginx/nginx.conf
- docker/etc/supervisor/conf.d/celery-backup.conf
- docker/etc/supervisor/conf.d/celery-beat.conf
- docker/etc/supervisor/conf.d/celery-celery.conf
- docker/etc/supervisor/conf.d/celery-memory.conf
- docker/etc/supervisor/conf.d/celery-notify.conf
- docker/etc/supervisor/conf.d/celery-single.conf
- docker/etc/supervisor/conf.d/celery-translate.conf
- docker/etc/supervisor/conf.d/check.conf
- docker/etc/supervisor/conf.d/web.conf
- docker/etc/supervisor/supervisord.conf
- docker/health_check
- docker/requirements.txt
- docker/start
- pyproject.toml
- weblate/formats/quickbook.py
- weblate/settings_docker.py
- weblate/utils/quickbook.py
- Check the CI workflow files and adjust them.
- Update the code to pass some CI checks.
Summary by CodeRabbit
New Features
Bug Fixes
Chores