perf(ci): use postgres service container for python job#4633
Merged
Yicong-Huang merged 2 commits on May 2, 2026
Replace 'apt-get install postgresql' + 'systemctl start' in the python matrix with the same docker service container the scala job already uses (postgres image with POSTGRES_PASSWORD=postgres on localhost:5432, plus a pg_isready healthcheck).

Drop the apt-get update step to remove the dependency on the Azure Ubuntu mirror, which has been unreliable; container pulls go through a separate Docker registry CDN.

Mechanics: connect via 'psql -h localhost -U postgres' with PGPASSWORD=postgres, the same way the scala job does. The single 'Create iceberg catalog database' step replaces the previous install/start/seed sequence.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
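For illustration, a service container like the one described above might be declared as follows. This is a minimal sketch, not the actual contents of `build.yml`: the job name `python`, the `ubuntu-latest` runner, and the health-check timings are assumptions, while the image, password, port, and `pg_isready` health check come from the commit message.

```yaml
jobs:
  python:                        # hypothetical job name
    runs-on: ubuntu-latest       # assumed runner image
    services:
      postgres:
        image: postgres
        env:
          POSTGRES_PASSWORD: postgres
        ports:
          - 5432:5432            # expose the container on localhost:5432
        # Health-check timings below are illustrative, not taken from the PR.
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
```

With a health check in place, the runner waits for the container to report healthy before the job's steps start, so no explicit start or wait step should be needed.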
github-actions bot pushed a commit that referenced this pull request on May 2, 2026
### What changes were proposed in this PR?

Switch the python job in `build.yml` from `apt-get install postgresql` + `systemctl start` to a `services: postgres` container, mirroring what the scala job already does:

- Add `services.postgres` (image `postgres`, `POSTGRES_PASSWORD=postgres`, port 5432, `pg_isready` healthcheck).
- Drop `Install PostgreSQL`, `Start PostgreSQL Service`, and the `sudo -u postgres psql -f` seed step.
- Single `Create iceberg catalog database` step that runs `psql -h localhost -U postgres -f sql/iceberg_postgres_catalog.sql` (same pattern as the scala job).

### Any related issues, documentation, discussions?

Closes #4634.

Driven by repeated python-job failures on `apt-get update` against `azure.archive.ubuntu.com`, which has been unreliable; runs sit ignoring the InRelease responses for tens of seconds and either fail or surface stale package metadata. The docker registry path used by `services` is independent of that mirror.

Side benefit: postgres container starts in seconds, vs. ~30 s of `apt-get update` even on a healthy day. Removes the only place in `build.yml` that still needed the apt mirror.

### How was this PR tested?

Will be exercised by this PR's own python matrix once the CI runs. The seed SQL is the same one the scala job already runs successfully against the same container image.

### Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Opus 4.7

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

(backported from commit f32c974)
SarahAsad23 pushed a commit to SarahAsad23/texera that referenced this pull request on May 4, 2026
### What changes were proposed in this PR?

Switch the python job in `build.yml` from `apt-get install postgresql` + `systemctl start` to a `services: postgres` container, mirroring what the scala job already does:

- Add `services.postgres` (image `postgres`, `POSTGRES_PASSWORD=postgres`, port 5432, `pg_isready` healthcheck).
- Drop `Install PostgreSQL`, `Start PostgreSQL Service`, and the `sudo -u postgres psql -f` seed step.
- Single `Create iceberg catalog database` step that runs `psql -h localhost -U postgres -f sql/iceberg_postgres_catalog.sql` (same pattern as the scala job).

### Any related issues, documentation, discussions?

Closes #4634.

Driven by repeated python-job failures on `apt-get update` against `azure.archive.ubuntu.com`, which has been unreliable; runs sit ignoring the InRelease responses for tens of seconds and either fail or surface stale package metadata. The docker registry path used by `services` is independent of that mirror.

Side benefit: postgres container starts in seconds, vs. ~30 s of `apt-get update` even on a healthy day. Removes the only place in `build.yml` that still needed the apt mirror.

### How was this PR tested?

Will be exercised by this PR's own python matrix once the CI runs. The seed SQL is the same one the scala job already runs successfully against the same container image.

### Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Opus 4.7
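
The seed step described above could look roughly like the fragment below, which would sit in the python job's list of steps. The step name and the `psql` command are taken from the description; passing the password through a step-level `env` entry is an assumption about how `PGPASSWORD=postgres` is supplied, since the actual workflow file is not shown here.

```yaml
steps:
  - name: Create iceberg catalog database
    # Connects over TCP to the service container published on localhost:5432.
    run: psql -h localhost -U postgres -f sql/iceberg_postgres_catalog.sql
    env:
      PGPASSWORD: postgres       # assumed placement; psql reads it instead of prompting
```

Because the connection goes over TCP as the `postgres` user, the fragment needs neither `sudo -u postgres` nor a locally installed server.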