v0.2.0 - Separating Architecture and Implementation
Release Notes — v0.2.0 (2026-05-27)
This release is a major restructuring of the project since v0.1.0: the repository has been split into a Cargo workspace. dataset-core now contains only the architecture layer, while a new companion crate dataset-ml houses all built-in dataset loaders. The two crates are published to crates.io independently.
| Crate | Version |
|---|---|
| `dataset-core` | 0.1.0 → 0.2.0 |
| `dataset-ml` | 0.1.0 (initial release) |
⚠️ Breaking Changes
- Workspace split: `dataset-core` now ships only `Dataset<T>`, the `utils` module, and the `error` module. All built-in dataset loaders have moved to the new `dataset-ml` crate.
- `datasets` feature removed: the former `datasets` feature on `dataset-core` is gone. Use `dataset-ml` instead.
- Import path changes (loaders moved to `dataset-ml`):

  | Old path (`dataset-core` 0.1.x) | New path (`dataset-ml` 0.1.0) |
  |---|---|
  | `dataset_core::datasets::iris::Iris` | `dataset_ml::iris::Iris` |
  | `dataset_core::datasets::boston_housing::BostonHousing` | `dataset_ml::boston_housing::BostonHousing` |
  | `dataset_core::datasets::diabetes::Diabetes` | `dataset_ml::diabetes::Diabetes` |
  | `dataset_core::datasets::titanic::Titanic` | `dataset_ml::titanic::Titanic` |
  | `dataset_core::datasets::wine_quality::red_wine_quality::RedWineQuality` | `dataset_ml::wine_quality::red_wine_quality::RedWineQuality` |
  | `dataset_core::datasets::wine_quality::white_wine_quality::WhiteWineQuality` | `dataset_ml::wine_quality::white_wine_quality::WhiteWineQuality` |

  There is no longer a `datasets::` namespace: modules sit directly at the `dataset_ml` crate root, and every dataset struct is also re-exported at the crate root for convenience.
- `utils` function renames:
  - `prepare_download_dir` → `evaluate_storage`
  - `download_dataset_with` → `acquire_dataset`
- Download backend swap: replaced `downloader` with `ureq`. The `download_to` API was refactored and now supports an optional custom filename.
- Slimmer error payloads: `DataFormatError` no longer formats the offending record into the error message. Error output is more compact and avoids echoing raw data.
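
The import path change above follows one mechanical rule: drop the `datasets::` segment and swap the crate prefix. The sketch below encodes that rule as a small string rewrite, which may help when scripting a migration. `migrate_path` is a hypothetical helper written for illustration; it is not part of `dataset-core` or `dataset-ml`.

```rust
/// Hypothetical migration helper (not part of either crate): rewrites a
/// 0.1.x loader path to its dataset-ml equivalent by dropping `datasets::`.
/// Returns `None` for paths that are not under the old loader namespace.
fn migrate_path(old: &str) -> Option<String> {
    old.strip_prefix("dataset_core::datasets::")
        .map(|rest| format!("dataset_ml::{rest}"))
}

fn main() {
    let old_paths = [
        "dataset_core::datasets::iris::Iris",
        "dataset_core::datasets::wine_quality::red_wine_quality::RedWineQuality",
        "dataset_core::utils::download_to", // not a loader path, left alone
    ];
    for old in old_paths {
        match migrate_path(old) {
            Some(new) => println!("{old} -> {new}"),
            None => println!("{old} (unchanged)"),
        }
    }
}
```

In application code the same change is simply replacing `use dataset_core::datasets::iris::Iris;` with `use dataset_ml::iris::Iris;`, or importing the struct from the `dataset_ml` crate root via the re-export mentioned above.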
✨ Added
- Structured error handling: `thiserror` is now used to derive `DatasetError` / `DataFormatErrorKind`. Detailed variants, a consistent `[dataset_name] ...` prefix, and `From` impls for `UreqError`, `ZipError`, and `std::io::Error` mean `?` just works inside loader closures.
- `dataset-ml` initial release: ships loaders for Iris, Boston Housing, Diabetes, Titanic, and Red / White Wine Quality. Wine Quality is split into red and white submodules that share `parse_wine_data_to_array`.
- Semantic tests across the board: dataset integration tests now assert value constraints, consistency checks, and finiteness, not just shapes.
- Documentation upgrades: each dataset module gained detailed module-level docs covering features, target variable, sample count, applications, and source.
- Chinese localization: `README.zh-CN.md` added for `dataset-core`, `dataset-ml`, and the workspace root.
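
To illustrate why `From` impls make `?` work inside a loader, here is a std-only sketch of the pattern. It hand-writes what a `thiserror` derive would normally generate. `SketchError`, its variants, `load_rows`, and `check_rows` are hypothetical names chosen for this example; the real `DatasetError` variants and messages may differ.

```rust
use std::fmt;
use std::io;

/// Hypothetical stand-in for a structured dataset error.
#[derive(Debug)]
enum SketchError {
    /// Wraps an I/O failure; the `From` impl below lets `?` convert it.
    Io(io::Error),
    /// A malformed record, reported by position without echoing raw data.
    Format { dataset: &'static str, line: usize },
}

impl fmt::Display for SketchError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SketchError::Io(e) => write!(f, "I/O error: {e}"),
            SketchError::Format { dataset, line } => {
                write!(f, "[{dataset}] malformed record at line {line}")
            }
        }
    }
}

impl std::error::Error for SketchError {
    fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
        match self {
            SketchError::Io(e) => Some(e),
            SketchError::Format { .. } => None,
        }
    }
}

impl From<io::Error> for SketchError {
    fn from(e: io::Error) -> Self {
        SketchError::Io(e)
    }
}

/// Checks that every line has exactly `fields` comma-separated columns
/// and returns the number of lines.
fn check_rows(text: &str, fields: usize) -> Result<usize, SketchError> {
    for (i, row) in text.lines().enumerate() {
        if row.split(',').count() != fields {
            return Err(SketchError::Format { dataset: "iris", line: i + 1 });
        }
    }
    Ok(text.lines().count())
}

/// Reads a file and validates it; `?` converts `io::Error` through `From`.
fn load_rows(path: &str, fields: usize) -> Result<usize, SketchError> {
    let text = std::fs::read_to_string(path)?;
    check_rows(&text, fields)
}

fn main() {
    if let Err(e) = load_rows("does-not-exist.csv", 5) {
        println!("error: {e}");
    }
    if let Err(e) = check_rows("5.1,3.5,1.4,0.2,setosa\n4.9,3.0", 5) {
        println!("error: {e}");
    }
}
```

The `Format` variant carries only the dataset name and line number, which mirrors the slimmer error payloads described under Breaking Changes.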
🔧 Changed
- Dependency bumps: `ureq` → 3.3.0, `thiserror` → 2.0.18, `zip` → 8.5.1.
- Shared metadata (`edition`, `rust-version`, `authors`, `license`, `repository`) lifted into `[workspace.package]`; shared dependency versions live in `[workspace.dependencies]`.
- Doctests that create files on disk are now marked `no_run`, so `cargo test --doc` no longer leaves stray artifacts behind.
- Removed redundant module-level docs from `error.rs` and stale markdown links in `utils` docs.
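
For readers unfamiliar with Cargo workspace inheritance, the shared-metadata change could look roughly like the fragment below. It uses standard Cargo syntax, but the member paths, the `edition` and `license` values, and the choice of inherited dependency are placeholders, not values taken from the repository.

```toml
# --- workspace root Cargo.toml (sketch) ---
[workspace]
members = ["dataset-core", "dataset-ml"]   # assumed directory names
resolver = "2"

[workspace.package]
edition = "2021"    # placeholder value
license = "MIT"     # placeholder value

[workspace.dependencies]
ureq = "3.3.0"
thiserror = "2.0.18"
zip = "8.5.1"

# --- a member crate's Cargo.toml (sketch) ---
# [package]
# name = "dataset-ml"
# version = "0.1.0"
# edition.workspace = true
# license.workspace = true
#
# [dependencies]
# thiserror = { workspace = true }   # illustrative; actual dependencies may differ
```

With this layout a version bump is made once in `[workspace.dependencies]` and picked up by every member that opts in with `workspace = true`.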
📦 Installation
```toml
[dependencies]
dataset-core = "0.2.0"   # architecture layer: Dataset<T> + utils + error
dataset-ml = "0.1.0"     # add this only if you want the built-in loaders
```

If you only need the `Dataset<T>` container (zero external dependencies), no features are required. Enable `features = ["utils"]` to pull in the `acquire_dataset` / `download_to` / `unzip` / SHA-256 helpers. `dataset-ml` transitively enables `dataset-core/utils`, so you don't need to configure it manually.
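
For a project that wants the download helpers but not the built-in loaders, enabling the feature would look roughly like this. This is a sketch in standard Cargo syntax; only the `utils` feature name and the version come from these notes.

```toml
[dependencies]
# Dataset<T> plus the download / unzip / SHA-256 helpers, without dataset-ml.
dataset-core = { version = "0.2.0", features = ["utils"] }
```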