Release 0.4.59 #732
Merged
Release 0.4.58 to develop
- Cancel in-progress runs on new commits (concurrency group by PR)
- Collapse the [http, non_http] matrix into one job (TEST_PROTOCOL=both); both protocols still run in parallel via run_test_container.sh
- Skip notebook and coverage image builds on PRs (BUILD_AUX_IMAGES=false); neither is needed to run tests
- Enable BuildKit + BUILDKIT_INLINE_CACHE=1 so --cache-from actually works
- Replace unconditional sleep 20 with a health-check readiness probe
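
The last item swaps a fixed delay for a readiness probe. The PR's actual probe lives in run_test_container.sh and is not shown here, so the following is only a minimal Python sketch of the general idea: poll until the service accepts a TCP connection, and give up after a deadline. The function name `wait_for_port` and its parameters are hypothetical, and a TCP accept is a weaker signal than a real application-level health check.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0,
                  interval: float = 0.5) -> bool:
    """Poll a TCP port until it accepts a connection or the deadline passes.

    Hypothetical helper for illustration; not part of the PR.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused or timed out: wait briefly, then try again.
            time.sleep(interval)
    return False
```

Compared with an unconditional sleep, a probe of this kind returns as soon as the service is reachable and fails explicitly when it never comes up.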
Both develop.yml and main.yml were still running a matrix: [http, non_http] that duplicated every docker image build (notebook deps, notebook, tests, coverage) on a second runner, and did not set DOCKER_BUILDKIT=1, so the --cache-from / BUILDKIT_INLINE_CACHE plumbing already present in ci.sh had no effect.
Changes:
- Remove matrix; single job sets TEST_PROTOCOL=both so run_test_container.sh runs both stacks in parallel (same pattern as pr.yaml)
- Add DOCKER_BUILDKIT=1 and COMPOSE_DOCKER_CLI_BUILD=1 to all ci.sh invocations
- Add concurrency cancel-in-progress so stale runs are superseded when a newer push lands
- Set RUNNER_NAME env var (required by run_test_container.sh log paths)
…ip cache mounts
- docker/tests/Dockerfile: replace FROM aperturedata/aperturedb-notebook:dependencies
(compiles OpenCV from source, ~15-20 min cold) with FROM python:3.10.
Tests already have opencv-python-headless via pyproject.toml runtime deps.
System packages: ffmpeg, libavcodec-extra, libgl1, libfuse-dev/fuse, git.
- Pin CLIP to openai-clip (PyPI) in tests + notebook Dockerfiles.
Eliminates git-clone on every build; stable version = Docker layer stays cached.
- Split test Dockerfile into three pip layers:
1. awscli + openai-clip (stable, almost never re-runs)
2. pyproject.toml-only copy + stub pkg + pip install [dev] deps
(torch/tensorflow/etc. cached until pyproject.toml changes)
3. ADD real code + pip install --no-deps (only re-runs on code changes)
- ci.sh build_tests(): push aperturedb-python-tests:latest after build when
NO_PUSH != true (develop/main). Enables --cache-from to work across runners
and cold starts instead of relying solely on local daemon cache.
- All pip RUN steps in tests, notebook, and dependencies Dockerfiles now use
--mount=type=cache,target=/root/.cache/pip. Wheels survive layer invalidation
on the same runner; cold reinstalls skip the network download.
Pull request overview
Release 0.4.59 focusing on safer logging (token redaction), expanded test coverage for several utility modules, and significantly faster/more reliable CI via Docker BuildKit and consolidated protocol runs.
Changes:
- Add aperturedb.LoggingUtils.censor_tokens() and apply token redaction to connector/common logging paths.
- Introduce/extend helper APIs (e.g., ConnectionPool, multi-key Sort, schema normalization in Utils) with accompanying tests.
- Rework CI/test container build & execution to improve caching, reduce duplicate work, and run HTTP + non-HTTP protocols in parallel within one job.
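
To illustrate what recursive token redaction for logging can look like, here is a minimal sketch. It is not the implementation of aperturedb.LoggingUtils.censor_tokens(), whose signature and matching rules are not shown in this PR page. The function name `redact_for_logging`, the mask string, and the list of sensitive key fragments are all assumptions made for the example.

```python
from typing import Any

# Assumption: keys containing any of these fragments are treated as sensitive.
SENSITIVE_KEY_PARTS = ("token", "password")
MASK = "<censored>"


def _is_sensitive(key: Any) -> bool:
    """True when a dict key looks like it holds a credential."""
    return isinstance(key, str) and any(
        part in key.lower() for part in SENSITIVE_KEY_PARTS
    )


def redact_for_logging(value: Any) -> Any:
    """Return a copy of `value` with sensitive dict entries masked.

    Walks dicts, lists, and tuples recursively; the input is not mutated.
    Hypothetical helper for illustration only.
    """
    if isinstance(value, dict):
        return {
            k: MASK if _is_sensitive(k) else redact_for_logging(v)
            for k, v in value.items()
        }
    if isinstance(value, list):
        return [redact_for_logging(v) for v in value]
    if isinstance(value, tuple):
        return tuple(redact_for_logging(v) for v in value)
    return value


# Illustrative query shape; only the nested "token" value is masked.
query = [{"Authenticate": {"username": "admin", "token": "secret"}}]
safe = redact_for_logging(query)
```

Building a redacted copy, rather than editing the structure in place, keeps the original query usable for the actual request while only the logged form is masked.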
Reviewed changes
Copilot reviewed 54 out of 56 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| test/test_Utils.py | Adds unit tests for token censoring and schema-summary normalization without a live DB. |
| test/test_Stats.py | Updates assertions to reflect new throughput behavior (no NaN) and improves assertion formatting. |
| test/test_Sources.py | Adds tests for S3/GCS auth fallback, retry behavior, and client caching. |
| test/test_Sort.py | Adds tests for the new multi-key Sort behavior and mismatch validation. |
| test/test_Session.py | Adds test for Connector(connect=True) behavior at construction time. |
| test/test_Server.py | Refactors formatting of test call/assertions (no logic change intended). |
| test/test_ResponseHandler.py | Adjusts mock query response shape and adds a regression-style test for partial responses. |
| test/test_Parallel.py | Adds integration-style test for Dask generator + dry_run=True ingestion. |
| test/test_Images.py | Adds tests for Images helpers and geometric transforms (rotate, resolve). |
| test/test_Datawizard.py | Minor formatting cleanup in test model construction. |
| test/test_ConnectionPool.py | Adds tests for new ConnectionPool behavior (borrow/return, concurrency, timeout). |
| test/test_Configuration.py | Adds Configuration.__repr__ test aligned with new repr format. |
| test/run_test.sh | Switches to pytest --cov and makes error-log copy non-fatal. |
| test/run_test_container.sh | Uses dynamic ports, adds readiness waiting, and mounts run_test.sh into containers. |
| test/docker-compose.yml | Uses ephemeral host ports (0) for services to avoid port conflicts. |
| test/conftest.py | Formatting-only change to long assertion message. |
| robots.md | Updates local/CI testing docs for new images, mounting behavior, and warnings. |
| pyproject.toml | Adds pytest-cov to dev extras. |
| examples/README.md | Updates documentation links and capitalization. |
| examples/loading_with_models/get_tl_embeddings.py | Formatting-only change. |
| examples/loading_with_models/add_video_model.py | Updates doc link. |
| examples/DataWizard/Polygon Regions DataWizard.ipynb | Doc text capitalization fix. |
| examples/CelebADataKaggle.py | Formatting-only refactor of long f-string and kwargs. |
| docker/tests/Dockerfile | Rebuilds test image for faster caching; uses PyPI openai-clip and BuildKit cache mounts. |
| docker/notebook/Dockerfile | Uses PyPI openai-clip, adds pip cache mounts, and reorganizes install steps. |
| docker/dependencies/Dockerfile | Adds pip cache mount for Jupyter-related installs. |
| ci.sh | Enables BuildKit + inline cache; pushes test image on non-PR runs; optionally skips aux images. |
| aperturedb/VideoDataCSV.py | Simplifies exception logging (uses logger.exception(...)). |
| aperturedb/Utils.py | Adds schema normalization helper and censors tokens in error logging. |
| aperturedb/Sources.py | Adds S3/GCS anonymous fallback, retries, and client caching; improves file handling/logging. |
| aperturedb/Sort.py | Changes Sort to support multiple sort keys via append/chaining. |
| aperturedb/Query.py | Formatting/doc wording improvements (ApertureDB capitalization, spacing). |
| aperturedb/PyTorchData.py | Docstring capitalization fix. |
| aperturedb/ParallelQuerySet.py | Improves log formatting and throughput computation (avoid NaN). |
| aperturedb/ParallelQuery.py | Adds docs, propagates dry_run into Dask path, and updates throughput computation. |
| aperturedb/ParallelLoader.py | Expands docstring and updates throughput computation (avoid NaN). |
| aperturedb/NotebookHelpers.py | Expands module docstring. |
| aperturedb/LoggingUtils.py | New module implementing recursive token redaction for safe logging. |
| aperturedb/KaggleData.py | Docstring capitalization fix. |
| aperturedb/ImageDownloader.py | Improves exception logging and fixes potential None access on failed download. |
| aperturedb/ImageDataCSV.py | Simplifies exception logging (uses logger.exception(...)). |
| aperturedb/Descriptors.py | Corrects docstrings to match descriptor (not descriptorset) semantics and clarifies args. |
| aperturedb/DaskManager.py | Adds module docstring; propagates dry_run; improves connector creation error handling. |
| aperturedb/CSVWriter.py | Improves assertion message formatting for readability. |
| aperturedb/Connector.py | Adds connect option at construction; censors token-bearing logs; fixes mutable default arg in _query. |
| aperturedb/ConnectionPool.py | New thread-safe connection pool for Connector instances. |
| aperturedb/Configuration.py | Updates __repr__ to angle-bracket format (tests added). |
| aperturedb/CommonLibrary.py | Censors tokens in logging; fixes response-handler mapping for partial responses. |
| aperturedb/cli/mount_coco.py | Uses context-manager locking and returns proper errno on read failures. |
| aperturedb/cli/configure.py | Refactors string formatting for overwrite warning. |
| aperturedb/Blobs.py | Adds module docstring and improves class doc wording/links. |
| aperturedb/__init__.py | Bumps version to 0.4.59. |
| .github/workflows/pr.yaml | Adds concurrency cancellation; runs both protocols in one job; skips aux images on PRs. |
| .github/workflows/main.yml | Adds concurrency cancellation; consolidates protocols into one job; sets BuildKit env. |
| .github/workflows/develop.yml | Adds concurrency cancellation; consolidates protocols into one job; sets BuildKit env. |
| .github/workflows/checks.yml | Updates pre-commit Python version to 3.12. |
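
The table mentions a new thread-safe connection pool with borrow/return semantics and timeouts. The API of aperturedb.ConnectionPool is not shown on this page, so the sketch below is a generic pattern and not the library's implementation. The class name `SimplePool`, the `borrow` method, and the `factory` parameter are hypothetical.

```python
import queue
from contextlib import contextmanager
from typing import Callable, Generic, Iterator, TypeVar

T = TypeVar("T")


class SimplePool(Generic[T]):
    """Fixed-size pool backed by a thread-safe queue (illustrative only)."""

    def __init__(self, factory: Callable[[], T], size: int) -> None:
        self._items: "queue.Queue[T]" = queue.Queue(maxsize=size)
        # Create every pooled object up front.
        for _ in range(size):
            self._items.put(factory())

    @contextmanager
    def borrow(self, timeout: float = 5.0) -> Iterator[T]:
        """Lend one item; always return it to the pool on exit."""
        try:
            item = self._items.get(timeout=timeout)
        except queue.Empty:
            raise TimeoutError(
                f"no connection available within {timeout}s"
            ) from None
        try:
            yield item
        finally:
            # Runs even if the caller's block raised.
            self._items.put(item)
```

A caller would write `with pool.borrow() as conn: ...`. Using a context manager means the item goes back to the pool even when the body raises, and `queue.Queue` supplies the locking so no explicit mutex is needed.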
This pull request introduces several improvements to the CI workflows, enhances logging and error handling in the Python client, and improves code clarity and documentation. The main themes are CI optimization, improved security and logging, and code/documentation quality.
CI/CD Optimization and Efficiency:
- Added concurrency controls to .github/workflows/pr.yaml, .github/workflows/develop.yml, and .github/workflows/main.yml to cancel in-progress runs when superseded by newer commits, preventing wasted CI resources and speeding up feedback cycles.
Security and Logging Improvements:
- Integrated the censor_tokens utility into logging statements in CommonLibrary.py to prevent sensitive information (such as tokens) from being logged in debug/error outputs.
Code and Documentation Quality:
- Improved assertion and error message formatting in CSVWriter.py and CommonLibrary.py for better debugging and clarity.
- Added a module docstring to Blobs.py and clarified class-level documentation for better developer experience.
- Updated map_response_to_handler for better compatibility and correctness when handling both list and non-list responses.
- Changed the __repr__ format in Configuration.py from square brackets to angle brackets for consistency and clarity.