Juniper Data: Ignoble Iguana #39
pcalnon announced in Announcements
Juniper Data v0.4.2 Release Notes
Release Date: 2026-02-17
Version: 0.4.2
Codename: First JuniperData Release
Release Type: PATCH
Overview
This is the first official release of JuniperData as a standalone dataset generation microservice. It addresses critical bug fixes discovered during CI/CD validation: 12 failing MNIST generator tests, a Bandit security scan suppression issue, an arc-agi import failure affecting downstream consumers, CI workflow branch coverage, and repository hygiene. All 658 service tests and 41 client tests now pass across Python 3.12-3.14.
Release Summary
Features Summary
What's New
MNIST Generator n_samples Support (MNIST-001)
Added missing n_samples parameter handling in the MNIST generator's _load_and_preprocess method. When n_samples is specified in MnistParams, the generator now calls ds.select(range(n_samples)) to limit the dataset before preprocessing.
Files:
juniper_data/generators/mnist/generator.py (lines 94-95)
Bug Fixes
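The n_samples handling tracked as MNIST-001 can be sketched as follows. This is a minimal illustration, not the project's code: FakeDataset and limit_dataset are hypothetical names, and the MnistParams here is a stripped-down stand-in that only carries n_samples. The guard itself is the one quoted in these notes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MnistParams:
    """Hypothetical stand-in; the real MnistParams is assumed to have more fields."""
    n_samples: Optional[int] = None


class FakeDataset:
    """Tiny stand-in for a dataset object that exposes select()."""

    def __init__(self, rows):
        self.rows = list(rows)

    def select(self, indices):
        # Keep only the rows at the given indices, returning a new dataset.
        return FakeDataset(self.rows[i] for i in indices)

    def __len__(self):
        return len(self.rows)


def limit_dataset(ds, params):
    # Same guard as quoted in the release notes: only truncate when
    # n_samples was explicitly provided.
    if params.n_samples is not None:
        ds = ds.select(range(params.n_samples))
    return ds


full = FakeDataset(range(10))
limited = limit_dataset(full, MnistParams(n_samples=3))
unlimited = limit_dataset(full, MnistParams())
print(len(limited), len(unlimited))
```

Checking for None rather than truthiness means n_samples=0 still goes through select() and yields an empty dataset instead of silently returning everything.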
12 Failing MNIST Generator Tests (MNIST-001)
Problem: All 12 TestMnistGenerator tests failed with ValueError: cannot reshape array of size 0 into shape (0,newaxis) or IndexError: arrays used as indices must be of integer (or boolean) type.
Root Cause: The generator's _load_and_preprocess method calls ds.with_format("numpy") for bulk column access before accessing ds["image"] and ds["label"]. The test mocks did not configure with_format(), so it returned a generic MagicMock. Calling np.array() on a MagicMock produces an empty array, which then failed on reshape. Additionally, the generator was missing n_samples support via ds.select().
Solution:
- formatted_ds mock with proper __getitem__ returning numpy arrays for the "image" and "label" keys
- mock_ds.with_format.return_value = formatted_ds in _make_mock_hf_dataset() and test_generate_image_without_convert()
- if params.n_samples is not None: ds = ds.select(range(params.n_samples)) added to the generator
Files:
juniper_data/generators/mnist/generator.py (lines 94-95)
juniper_data/tests/unit/test_mnist_generator.py (lines 17-38, 255-264)
Bandit B615 Security Scan Failure (SEC-007)
Problem: CI Bandit security scan failed with [B615:huggingface_unsafe_download] at medium severity, blocking the pipeline.
Root Cause: The # nosec B615 suppression directive was placed on the comment line above the hf_load_dataset() call. Bandit only recognizes # nosec directives when placed on the same line as the flagged code. The scan reported Total lines skipped (#nosec): 0.
Solution: Moved # nosec B615 from the comment line to inline on the hf_load_dataset() call.
Files:
juniper_data/generators/mnist/generator.py (line 91)
arc-agi Optional Dependency (DEP-001)
Problem: Importing juniper_data in JuniperCascor failed because arc-agi>=0.9.0 was a hard dependency that wasn't installed in Cascor's environment.
Root Cause: The arc-agi package was listed as a required dependency in pyproject.toml rather than as an optional dependency, causing ImportError when the package wasn't available.
Solution: Made arc-agi an optional dependency, allowing the module to gracefully degrade when the package is not installed.
Files:
juniper_data/storage/__init__.py
juniper_data/storage/hf_store.py
Improvements
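The graceful degradation described for DEP-001 is commonly implemented with a guarded import. The sketch below shows that general pattern only; it is not taken from the JuniperData source. The helper names optional_import and require_arc_agi are hypothetical, and "arc_agi" is an assumed import name for the arc-agi distribution.

```python
import importlib


def optional_import(module_name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        # ModuleNotFoundError is a subclass of ImportError, so a missing
        # package lands here instead of breaking the importing module.
        return None


# Assumption: the arc-agi distribution is importable as "arc_agi".
arc_agi = optional_import("arc_agi")
ARC_AGI_AVAILABLE = arc_agi is not None


def require_arc_agi():
    """Fail with a descriptive error only when the feature is actually used."""
    if not ARC_AGI_AVAILABLE:
        raise RuntimeError(
            "arc-agi is not installed; install it to use ARC-AGI features"
        )
    return arc_agi


print("arc-agi available:", ARC_AGI_AVAILABLE)
```

With this shape, importing the package never fails on a missing optional dependency; the error is deferred to the first call that genuinely needs it.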
Test Suite Stability
All test failures from v0.4.0 have been resolved:
CI/CD Pipeline
- # nosec suppressions (all justified)
- subproject.juniper_data.** branches (v0.4.2)
Repository Hygiene (v0.4.2)
- __pycache__ exclusion patterns in .gitignore
- __pycache__/*.pyc files from the repository
Test Results
Test Suite
Coverage Details
Client Tests
Environments Tested
Upgrade Notes
This is a backward-compatible patch release. No migration steps required.
Known Issues
- source_pkgs configuration.
- datasets.py from FastAPI patterns (Query(), Depends() in function defaults). These are not bugs.
What's Next
Planned for v0.5.0
Coverage Goals
Roadmap
See INTEGRATION_DEVELOPMENT_PLAN.md for full details.
Contributors
Version History
Links
This discussion was created from the release Juniper Data: Ignoble Iguana.