
Release 0.4.59 #732

Merged: luisremis merged 26 commits into main from release-0.4.59 on May 21, 2026
Conversation

luisremis commented May 21, 2026

This pull request introduces several improvements to the CI workflows, enhances logging and error handling in the Python client, and improves code clarity and documentation. The main themes are CI optimization, improved security and logging, and code/documentation quality.

CI/CD Optimization and Efficiency:

  • Added concurrency controls to .github/workflows/pr.yaml, .github/workflows/develop.yml, and .github/workflows/main.yml to cancel in-progress runs when superseded by newer commits, preventing wasted CI resources and speeding up feedback cycles.
  • Replaced matrix builds for protocol testing with a single job that runs both protocols in parallel, reducing duplicated work and overall CI time across all workflows.
  • Enabled Docker BuildKit and improved Docker caching in all workflows for faster and more efficient image builds.
  • Upgraded the Python version used in CI from 3.10 to 3.12 for future-proofing and consistency.

Security and Logging Improvements:

  • Integrated the censor_tokens utility into logging statements in CommonLibrary.py to prevent sensitive information (such as tokens) from being logged in debug/error outputs.
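
As an illustration of the token redaction described above, here is a minimal sketch of a recursive masking helper. It is not the code from aperturedb/LoggingUtils.py: the function name `censor_tokens_sketch`, the key pattern, and the `***` placeholder are all assumptions made for this example.

```python
import re
from typing import Any

# Hypothetical pattern and placeholder; the real utility may match differently.
_SENSITIVE_KEY = re.compile(r"token|password|secret", re.IGNORECASE)
_PLACEHOLDER = "***"


def censor_tokens_sketch(value: Any) -> Any:
    """Return a copy of `value` with values under sensitive-looking keys masked."""
    if isinstance(value, dict):
        return {
            key: _PLACEHOLDER
            if isinstance(key, str) and _SENSITIVE_KEY.search(key)
            else censor_tokens_sketch(item)
            for key, item in value.items()
        }
    if isinstance(value, (list, tuple)):
        # Rebuild the same container type with each element censored.
        return type(value)(censor_tokens_sketch(item) for item in value)
    # Scalars (str, int, bytes, None, ...) are returned unchanged.
    return value
```

A logging call would then pass the censored copy, for example `logger.debug("query=%s", censor_tokens_sketch(query))`, so the original query object is left untouched and can still be sent to the server.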

Code and Documentation Quality:

  • Improved assertion error messages throughout CSVWriter.py and CommonLibrary.py for better debugging and clarity.
  • Added a module-level docstring to Blobs.py and clarified class-level documentation for better developer experience.
  • Improved response handling logic in map_response_to_handler for better compatibility and correctness when handling both list and non-list responses.
  • Updated the string representation in Configuration.py from square brackets to angle brackets for consistency and clarity.
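
The list versus non-list handling mentioned for map_response_to_handler can be pictured roughly as follows. This is only a sketch of the general idea: `dispatch_responses` and its signature are hypothetical, and the real function in CommonLibrary.py may take different arguments and do more work.

```python
from typing import Any, Callable, List


def dispatch_responses(
    queries: List[dict],
    response: Any,
    handler: Callable[[dict, Any], None],
) -> int:
    """Pair each query command with its response and call `handler` on the pair.

    Returns the number of pairs handled.
    """
    # A non-list response (for example a single error dict) is wrapped in a
    # list so the loop below treats both shapes uniformly.
    responses = response if isinstance(response, list) else [response]
    handled = 0
    # zip() stops at the shorter sequence, so a partial response only
    # dispatches the commands that actually received a reply.
    for query, resp in zip(queries, responses):
        handler(query, resp)
        handled += 1
    return handled
```

Normalizing the response up front keeps the dispatch loop free of shape checks, which is one plausible way to get the compatibility the bullet describes.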

luisremis and others added 26 commits April 23, 2026 23:40
- Cancel in-progress runs on new commits (concurrency group by PR)
- Collapse [http, non_http] matrix into one job (TEST_PROTOCOL=both); both protocols still run in parallel via run_test_container.sh
- Skip notebook and coverage image builds on PRs (BUILD_AUX_IMAGES=false) — neither is needed to run tests
- Enable BuildKit + BUILDKIT_INLINE_CACHE=1 so --cache-from actually works
- Replace unconditional sleep 20 with a health-check readiness probe
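
The last item above swaps a fixed sleep for a readiness probe. The actual change lives in the shell test scripts; the Python sketch below only illustrates the polling pattern, and `wait_until_ready`, its parameters, and the probe callable are hypothetical names chosen for this example.

```python
import time
from typing import Callable


def wait_until_ready(
    probe: Callable[[], bool],
    timeout: float = 60.0,
    interval: float = 1.0,
) -> bool:
    """Poll `probe` until it returns True or `timeout` seconds have elapsed."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            if probe():
                return True
        except OSError:
            # The service is not accepting connections yet; keep polling.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Compared with an unconditional sleep, this returns as soon as the service answers and reports failure explicitly when the deadline passes, instead of letting the tests start against a server that is still booting.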
Both develop.yml and main.yml were still running a matrix: [http, non_http]
that duplicated every docker image build (notebook deps, notebook, tests,
coverage) on a second runner, and did not set DOCKER_BUILDKIT=1 so the
--cache-from / BUILDKIT_INLINE_CACHE plumbing already present in ci.sh
had no effect.

Changes:
- Remove matrix; single job sets TEST_PROTOCOL=both so run_test_container.sh
  runs both stacks in parallel (same pattern as pr.yaml)
- Add DOCKER_BUILDKIT=1 and COMPOSE_DOCKER_CLI_BUILD=1 to all ci.sh invocations
- Add concurrency cancel-in-progress so stale runs are superseded when a
  newer push lands
- Set RUNNER_NAME env var (required by run_test_container.sh log paths)
…ip cache mounts

- docker/tests/Dockerfile: replace FROM aperturedata/aperturedb-notebook:dependencies
    (compiles OpenCV from source, ~15-20 min cold) with FROM python:3.10.
    Tests already have opencv-python-headless via pyproject.toml runtime deps.
    System packages: ffmpeg, libavcodec-extra, libgl1, libfuse-dev/fuse, git.

- Pin CLIP to openai-clip (PyPI) in tests + notebook Dockerfiles.
    Eliminates git-clone on every build; stable version = Docker layer stays cached.
    Split test Dockerfile into three pip layers:
      1. awscli + openai-clip (stable, almost never re-runs)
      2. pyproject.toml-only copy + stub pkg + pip install [dev] deps
         (torch/tensorflow/etc. cached until pyproject.toml changes)
      3. ADD real code + pip install --no-deps (only re-runs on code changes)

- ci.sh build_tests(): push aperturedb-python-tests:latest after build when
    NO_PUSH != true (develop/main). Enables --cache-from to work across runners
    and cold starts instead of relying solely on local daemon cache.

- All pip RUN steps in tests, notebook, and dependencies Dockerfiles now use
    --mount=type=cache,target=/root/.cache/pip.  Wheels survive layer invalidation
    on the same runner; cold reinstalls skip the network download.

Copilot AI left a comment


Pull request overview

Release 0.4.59 focusing on safer logging (token redaction), expanded test coverage for several utility modules, and significantly faster/more reliable CI via Docker BuildKit and consolidated protocol runs.

Changes:

  • Add aperturedb.LoggingUtils.censor_tokens() and apply token redaction to connector/common logging paths.
  • Introduce/extend helper APIs (e.g., ConnectionPool, multi-key Sort, schema normalization in Utils) with accompanying tests.
  • Rework CI/test container build & execution to improve caching, reduce duplicate work, and run HTTP + non-HTTP protocols in parallel within one job.
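
To make the borrow/return idea behind a thread-safe connection pool concrete, here is a small sketch built on `queue.Queue`. It is not the aperturedb.ConnectionPool API: the class name `PoolSketch`, the `borrow` method, and the timeout behavior are assumptions for illustration only.

```python
import queue
from contextlib import contextmanager
from typing import Callable, Generic, Iterator, TypeVar

T = TypeVar("T")


class PoolSketch(Generic[T]):
    """Fixed-size pool of reusable objects, safe to share across threads."""

    def __init__(self, factory: Callable[[], T], size: int = 4) -> None:
        # queue.Queue does its own locking, so no extra mutex is needed here.
        self._items: "queue.Queue[T]" = queue.Queue(maxsize=size)
        for _ in range(size):
            self._items.put(factory())

    @contextmanager
    def borrow(self, timeout: float = 5.0) -> Iterator[T]:
        """Lend one object; raise TimeoutError if none frees up in time."""
        try:
            item = self._items.get(timeout=timeout)
        except queue.Empty:
            raise TimeoutError("no connection available in the pool") from None
        try:
            yield item
        finally:
            # Always hand the object back, even if the caller raised.
            self._items.put(item)
```

With a real client, the factory would be something that creates a connector, and callers would write `with pool.borrow() as conn:` around each query so the connection is returned automatically.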

Reviewed changes

Copilot reviewed 54 out of 56 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| test/test_Utils.py | Adds unit tests for token censoring and schema-summary normalization without a live DB. |
| test/test_Stats.py | Updates assertions to reflect new throughput behavior (no NaN) and improves assertion formatting. |
| test/test_Sources.py | Adds tests for S3/GCS auth fallback, retry behavior, and client caching. |
| test/test_Sort.py | Adds tests for the new multi-key Sort behavior and mismatch validation. |
| test/test_Session.py | Adds test for Connector(connect=True) behavior at construction time. |
| test/test_Server.py | Refactors formatting of test call/assertions (no logic change intended). |
| test/test_ResponseHandler.py | Adjusts mock query response shape and adds a regression-style test for partial responses. |
| test/test_Parallel.py | Adds integration-style test for Dask generator + dry_run=True ingestion. |
| test/test_Images.py | Adds tests for Images helpers and geometric transforms (rotate, resolve). |
| test/test_Datawizard.py | Minor formatting cleanup in test model construction. |
| test/test_ConnectionPool.py | Adds tests for new ConnectionPool behavior (borrow/return, concurrency, timeout). |
| test/test_Configuration.py | Adds Configuration.__repr__ test aligned with new repr format. |
| test/run_test.sh | Switches to pytest --cov and makes error-log copy non-fatal. |
| test/run_test_container.sh | Uses dynamic ports, adds readiness waiting, and mounts run_test.sh into containers. |
| test/docker-compose.yml | Uses ephemeral host ports (0) for services to avoid port conflicts. |
| test/conftest.py | Formatting-only change to long assertion message. |
| robots.md | Updates local/CI testing docs for new images, mounting behavior, and warnings. |
| pyproject.toml | Adds pytest-cov to dev extras. |
| examples/README.md | Updates documentation links and capitalization. |
| examples/loading_with_models/get_tl_embeddings.py | Formatting-only change. |
| examples/loading_with_models/add_video_model.py | Updates doc link. |
| examples/DataWizard/Polygon Regions DataWizard.ipynb | Doc text capitalization fix. |
| examples/CelebADataKaggle.py | Formatting-only refactor of long f-string and kwargs. |
| docker/tests/Dockerfile | Rebuilds test image for faster caching; uses PyPI openai-clip and BuildKit cache mounts. |
| docker/notebook/Dockerfile | Uses PyPI openai-clip, adds pip cache mounts, and reorganizes install steps. |
| docker/dependencies/Dockerfile | Adds pip cache mount for Jupyter-related installs. |
| ci.sh | Enables BuildKit + inline cache; pushes test image on non-PR runs; optionally skips aux images. |
| aperturedb/VideoDataCSV.py | Simplifies exception logging (uses logger.exception(...)). |
| aperturedb/Utils.py | Adds schema normalization helper and censors tokens in error logging. |
| aperturedb/Sources.py | Adds S3/GCS anonymous fallback, retries, and client caching; improves file handling/logging. |
| aperturedb/Sort.py | Changes Sort to support multiple sort keys via append/chaining. |
| aperturedb/Query.py | Formatting/doc wording improvements (ApertureDB capitalization, spacing). |
| aperturedb/PyTorchData.py | Docstring capitalization fix. |
| aperturedb/ParallelQuerySet.py | Improves log formatting and throughput computation (avoid NaN). |
| aperturedb/ParallelQuery.py | Adds docs, propagates dry_run into Dask path, and updates throughput computation. |
| aperturedb/ParallelLoader.py | Expands docstring and updates throughput computation (avoid NaN). |
| aperturedb/NotebookHelpers.py | Expands module docstring. |
| aperturedb/LoggingUtils.py | New module implementing recursive token redaction for safe logging. |
| aperturedb/KaggleData.py | Docstring capitalization fix. |
| aperturedb/ImageDownloader.py | Improves exception logging and fixes potential None access on failed download. |
| aperturedb/ImageDataCSV.py | Simplifies exception logging (uses logger.exception(...)). |
| aperturedb/Descriptors.py | Corrects docstrings to match descriptor (not descriptorset) semantics and clarifies args. |
| aperturedb/DaskManager.py | Adds module docstring; propagates dry_run; improves connector creation error handling. |
| aperturedb/CSVWriter.py | Improves assertion message formatting for readability. |
| aperturedb/Connector.py | Adds connect option at construction; censors token-bearing logs; fixes mutable default arg in _query. |
| aperturedb/ConnectionPool.py | New thread-safe connection pool for Connector instances. |
| aperturedb/Configuration.py | Updates __repr__ to angle-bracket format (tests added). |
| aperturedb/CommonLibrary.py | Censors tokens in logging; fixes response-handler mapping for partial responses. |
| aperturedb/cli/mount_coco.py | Uses context-manager locking and returns proper errno on read failures. |
| aperturedb/cli/configure.py | Refactors string formatting for overwrite warning. |
| aperturedb/Blobs.py | Adds module docstring and improves class doc wording/links. |
| aperturedb/__init__.py | Bumps version to 0.4.59. |
| .github/workflows/pr.yaml | Adds concurrency cancellation; runs both protocols in one job; skips aux images on PRs. |
| .github/workflows/main.yml | Adds concurrency cancellation; consolidates protocols into one job; sets BuildKit env. |
| .github/workflows/develop.yml | Adds concurrency cancellation; consolidates protocols into one job; sets BuildKit env. |
| .github/workflows/checks.yml | Updates pre-commit Python version to 3.12. |
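
The Connector.py entry mentions fixing a mutable default argument in _query. The summary does not show the real signature, so the sketch below uses hypothetical functions (`run_query_buggy`, `run_query_fixed`) and a hypothetical `blobs` parameter purely to show the general Python pitfall and the usual fix.

```python
from typing import List, Optional


def run_query_buggy(query: dict, blobs: List[bytes] = []) -> int:
    # The default list is created once, at function definition time, so
    # every call that omits `blobs` appends to the same shared object.
    blobs.append(b"x")
    return len(blobs)


def run_query_fixed(query: dict, blobs: Optional[List[bytes]] = None) -> int:
    # Use None as the sentinel and build a fresh list per call; copying a
    # caller-supplied list also avoids mutating the caller's data.
    blobs = [] if blobs is None else list(blobs)
    blobs.append(b"x")
    return len(blobs)
```

Calling the buggy version twice without `blobs` returns a growing count because state leaks between calls, while the fixed version returns the same result each time.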


Comment thread test/test_ResponseHandler.py
Comment thread aperturedb/ConnectionPool.py
Comment thread test/run_test_container.sh
@luisremis luisremis merged commit 2d855f6 into main May 21, 2026
5 checks passed
