v0.2.0 - Separating Architecture and Implementation

@SomeB1oody SomeB1oody released this 28 May 05:36

Release Notes — v0.2.0 (2026-05-27)

This release is a major restructuring of the project since v0.1.0: the repository has been split into a Cargo workspace. dataset-core now contains only the architecture layer, while a new companion crate dataset-ml houses all built-in dataset loaders. The two crates are published to crates.io independently.

Crate         Version
dataset-core  0.1.0 → 0.2.0
dataset-ml    0.1.0 (initial release)

⚠️ Breaking Changes

  • Workspace split: dataset-core now only ships Dataset<T>, the utils module, and the error module. All built-in dataset loaders have moved to the new dataset-ml crate.

  • datasets feature removed: the former datasets feature on dataset-core is gone. Use dataset-ml instead.

  • Import path changes (loaders moved to dataset-ml):

    Old path (dataset-core 0.1.x) → New path (dataset-ml 0.1.0)
    dataset_core::datasets::iris::Iris → dataset_ml::iris::Iris
    dataset_core::datasets::boston_housing::BostonHousing → dataset_ml::boston_housing::BostonHousing
    dataset_core::datasets::diabetes::Diabetes → dataset_ml::diabetes::Diabetes
    dataset_core::datasets::titanic::Titanic → dataset_ml::titanic::Titanic
    dataset_core::datasets::wine_quality::red_wine_quality::RedWineQuality → dataset_ml::wine_quality::red_wine_quality::RedWineQuality
    dataset_core::datasets::wine_quality::white_wine_quality::WhiteWineQuality → dataset_ml::wine_quality::white_wine_quality::WhiteWineQuality

    There is no longer a datasets:: namespace — modules sit directly at the dataset_ml crate root, and every dataset struct is also re-exported at the crate root for convenience.

  • utils function renames:

    • prepare_download_dir → evaluate_storage
    • download_dataset_with → acquire_dataset
  • Download backend swap: replaced downloader with ureq. The download_to API was refactored and now supports an optional custom filename.

  • Slimmer error payloads: DataFormatError no longer formats the offending record into the error message. Error output is more compact and avoids echoing raw data.
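
The flattened module layout described in the import path changes can be pictured with a small stand-in module tree. The sketch below does not use the real dataset-ml crate: it declares an inline module named dataset_ml with placeholder structs, only to show how a module path and a crate-root re-export can name the same type. The helper functions iris_paths_agree and red_wine_paths_agree are hypothetical and exist only for this illustration.

```rust
use std::any::TypeId;

// Stand-in for the real `dataset_ml` crate, declared inline so the sketch
// compiles on its own. The structs are empty placeholders, not the real loaders.
mod dataset_ml {
    pub mod iris {
        pub struct Iris;
    }

    pub mod wine_quality {
        pub mod red_wine_quality {
            pub struct RedWineQuality;
        }
    }

    // Root-level re-exports: each struct is reachable both through its module
    // and directly at the crate root.
    pub use iris::Iris;
    pub use wine_quality::red_wine_quality::RedWineQuality;
}

/// Returns true when the module path and the root re-export name the same type.
pub fn iris_paths_agree() -> bool {
    TypeId::of::<dataset_ml::iris::Iris>() == TypeId::of::<dataset_ml::Iris>()
}

/// Same check for a loader that lives in a nested submodule.
pub fn red_wine_paths_agree() -> bool {
    TypeId::of::<dataset_ml::wine_quality::red_wine_quality::RedWineQuality>()
        == TypeId::of::<dataset_ml::RedWineQuality>()
}
```

Under this layout, code migrating from dataset-core 0.1.x can use either the module path or the shorter root path; which one to prefer is a style choice.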

✨ Added

  • Structured error handling: thiserror is now used to derive DatasetError / DataFormatErrorKind. Detailed variants, a consistent [dataset_name] ... prefix, and From impls for UreqError, ZipError, and std::io::Error mean ? just works inside loader closures.
  • dataset-ml initial release: ships loaders for Iris, Boston Housing, Diabetes, Titanic, and Red / White Wine Quality. Wine Quality is split into red and white submodules that share parse_wine_data_to_array.
  • Semantic tests across the board: dataset integration tests now assert value constraints, consistency checks, and finiteness — not just shapes.
  • Documentation upgrades: each dataset module gained detailed module-level docs covering features, target variable, sample count, applications, and source.
  • Chinese localization: README.zh-CN.md added for dataset-core, dataset-ml, and the workspace root.
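
The way From impls let ? propagate errors can be sketched with the standard library alone. The example below is an assumption-laden illustration, not the crate's actual DatasetError: the SketchError type, its variants, and the parse_column function are all hypothetical, and the Display and From impls are written by hand where thiserror would derive them. It also follows the slimmer-payload idea by reporting a line number instead of echoing the offending record.

```rust
use std::fmt;
use std::io;

/// Hypothetical error type for this sketch; the real `DatasetError` differs.
#[derive(Debug)]
pub enum SketchError {
    /// Wraps an I/O failure.
    Io(io::Error),
    /// A malformed record, identified by position rather than by content.
    DataFormat {
        dataset: &'static str,
        line: usize,
        reason: &'static str,
    },
}

impl fmt::Display for SketchError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SketchError::Io(e) => write!(f, "I/O error: {e}"),
            // Prefix with the dataset name; the raw record is not included.
            SketchError::DataFormat { dataset, line, reason } => {
                write!(f, "[{dataset}] invalid record at line {line}: {reason}")
            }
        }
    }
}

impl std::error::Error for SketchError {
    fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
        match self {
            SketchError::Io(e) => Some(e),
            SketchError::DataFormat { .. } => None,
        }
    }
}

// This impl is what allows `?` to convert an `io::Error` automatically.
impl From<io::Error> for SketchError {
    fn from(e: io::Error) -> Self {
        SketchError::Io(e)
    }
}

/// Parses one floating-point value per line from `reader`.
pub fn parse_column(
    dataset: &'static str,
    reader: impl io::BufRead,
) -> Result<Vec<f64>, SketchError> {
    let mut values = Vec::new();
    for (idx, line) in reader.lines().enumerate() {
        let line = line?; // io::Error becomes SketchError through `From`
        let value = line.trim().parse::<f64>().map_err(|_| SketchError::DataFormat {
            dataset,
            line: idx + 1,
            reason: "expected a floating-point number",
        })?;
        values.push(value);
    }
    Ok(values)
}
```

Calling parse_column("iris", "1.5\nabc\n".as_bytes()) returns a DataFormat error for line 2 whose message does not contain the text "abc".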

🔧 Changed

  • Dependency bumps: ureq → 3.3.0, thiserror → 2.0.18, zip → 8.5.1.
  • Shared metadata (edition, rust-version, authors, license, repository) lifted into [workspace.package]; shared dependency versions live in [workspace.dependencies].
  • Doctests that create files on disk are now marked no_run, so cargo test --doc no longer leaves stray artifacts behind.
  • Removed redundant module-level docs from error.rs and stale markdown links in utils docs.
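
The no_run change can be illustrated with a small documented function. This is a sketch under stated assumptions: write_text_file and the my_crate path are hypothetical and are not part of dataset-core. The doc example uses a tilde fence, which rustdoc should treat the same as a backtick fence, so that it nests cleanly here. With no_run, cargo test --doc still compiles the example but does not execute it, so no file is written.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Writes `contents` to `path`, creating parent directories first.
///
/// The example below is compiled by `cargo test --doc` but never executed,
/// so it leaves nothing on disk.
///
/// ~~~no_run
/// // `my_crate` is a placeholder for whichever crate exports this function.
/// use std::path::Path;
/// my_crate::write_text_file(Path::new("./data/example.txt"), "hello").unwrap();
/// ~~~
pub fn write_text_file(path: &Path, contents: &str) -> io::Result<()> {
    // Create any missing parent directories before writing the file.
    if let Some(parent) = path.parent() {
        fs::create_dir_all(parent)?;
    }
    fs::write(path, contents)
}
```

The trade-off is that a no_run example is only type-checked, so its runtime behaviour has to be covered by ordinary tests that clean up after themselves.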

📦 Installation

[dependencies]
dataset-core = "0.2.0"           # architecture layer: Dataset<T> + utils + error
dataset-ml   = "0.1.0"           # add this only if you want the built-in loaders

If you only need the Dataset<T> container (zero external dependencies), no features are required. Enable features = ["utils"] to pull in acquire_dataset / download_to / unzip / SHA-256 helpers. dataset-ml transitively enables dataset-core/utils, so you don't need to configure it manually.