dashi v0.3.0
Highlights
🚀 New dimensionality reduction: SVD is now available as a dimensionality reduction method for numerical data, alongside PCA, MCA, and FAMD.
🌲 Histogram Gradient Boosting models: A new model family (histogram_gradient_boosting) is now supported in estimate_multibatch_models, offering faster training and native categorical feature handling. This family of models performs better than the standard Random Forest on large datasets.
📊 PR-AUC metric: Precision-Recall AUC is now reported per class and as a macro average in classification tasks, complementing the existing ROC-AUC metrics.
💾 Memory optimization: The format_data function now supports inplace=True to avoid copying large DataFrames, and data type downcasting is available for further memory savings.
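To illustrate the idea behind the memory optimization, here is a minimal sketch of numeric downcasting with plain pandas. It is not dashi's implementation: `downcast_numeric` is a hypothetical helper, and its `inplace` flag only mimics the behavior described for format_data. Note that downcasting floats to float32 trades precision for memory.

```python
import numpy as np
import pandas as pd


def downcast_numeric(df: pd.DataFrame, inplace: bool = False) -> pd.DataFrame:
    """Shrink numeric columns to the smallest dtype that holds their values.

    Hypothetical helper for illustration only; not part of dashi.
    """
    # Work on the caller's frame when inplace=True, otherwise on a copy.
    out = df if inplace else df.copy()
    for col in out.columns:
        if pd.api.types.is_integer_dtype(out[col]):
            out[col] = pd.to_numeric(out[col], downcast="integer")
        elif pd.api.types.is_float_dtype(out[col]):
            # float64 -> float32 where pandas considers the values close enough
            out[col] = pd.to_numeric(out[col], downcast="float")
    return out


df = pd.DataFrame({
    "age": np.arange(1000, dtype="int64") % 90,
    "score": np.linspace(0.0, 1.0, 1000),
})
before = df.memory_usage(deep=True).sum()
small = downcast_numeric(df)
after = small.memory_usage(deep=True).sum()
print(f"{before} bytes -> {after} bytes")
```

Passing `inplace=True` to this sketch skips the copy, which is the part that matters most when the DataFrame is already close to the available memory.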
What's Changed
New Features
- SVD as a dimensionality reduction method for numerical data.
- Histogram Gradient Boosting for classification and regression in estimate_multibatch_models via model_type='histogram_gradient_boosting'. Note: Histogram Gradient Boosting is now the default model. Random Forest can be selected via model_type='random_forest'.
- PR-AUC classification metric (per class and macro average).
- inplace parameter in format_data for memory-efficient transformation.
- Data type downcasting support for memory optimization.
Bug Fixes
- Fixed data type recognition when creating supports for variable distribution estimation.
- Fixed bugs in the supervised characterization pipeline that decreased model performance.
- All datetime units (datetime64[ns], [us], [ms], [s]) are now correctly recognized.
- Fixed label misalignment in estimate_conditional_data_temporal_map when using start_date or end_date parameters.
- Corrected various incorrectly raised or suppressed warnings.
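As an illustration of why datetime unit handling matters, the sketch below contrasts an exact dtype comparison, which only matches one resolution, with a unit-agnostic check in NumPy. This is an assumption about the kind of check involved, not dashi's actual code; `is_datetime_dtype` is a hypothetical helper.

```python
import numpy as np


def is_datetime_dtype(dtype) -> bool:
    """Return True for datetime64 at any resolution (ns, us, ms, s, ...)."""
    return bool(np.issubdtype(np.dtype(dtype), np.datetime64))


units = ("ns", "us", "ms", "s")
arrays = {
    unit: np.array(["2024-01-01", "2024-06-30"], dtype=f"datetime64[{unit}]")
    for unit in units
}

# Fragile: an exact comparison only matches nanosecond resolution.
exact_match = {u: a.dtype == np.dtype("datetime64[ns]") for u, a in arrays.items()}

# Robust: a subtype check accepts every datetime64 unit.
any_unit = {u: is_datetime_dtype(a.dtype) for u, a in arrays.items()}

print(exact_match)
print(any_unit)
```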
Dependency Updates
- plotly compatibility expanded from ==5.18.0 to >=5.18.0,<6.0.0.
- scikit-learn compatibility expanded from ==1.5.1 to >=1.5.1,<2.0.0.
⚠️ Upgrade Notes
- Dependency versions: This release widens the accepted versions for plotly and scikit-learn. If you pin exact versions in your environment, you may need to update them.
- No breaking API changes: All existing code should continue to work without modification. New features are additive (new parameters with backward-compatible defaults). Note that Histogram Gradient Boosting is now the default model in estimate_multibatch_models; pass model_type='random_forest' to select Random Forest explicitly.
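To see whether an existing environment already falls inside the new ranges, a quick check along these lines may help. It is a simplified sketch: `release_tuple`, `in_range`, and `check_installed` are hypothetical helpers, only the leading numeric part of each version is compared, and pre-release suffixes are ignored. For full specifier semantics, `packaging.specifiers.SpecifierSet` is the more thorough tool.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Ranges taken from the dependency updates above: low <= installed < high.
RANGES = {
    "plotly": ("5.18.0", "6.0.0"),
    "scikit-learn": ("1.5.1", "2.0.0"),
}


def release_tuple(v: str) -> tuple:
    """Parse the leading numeric release, e.g. '1.6.0rc1' -> (1, 6, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", v)
    if match is None:
        raise ValueError(f"cannot parse version: {v!r}")
    parts = [int(p) for p in match.group(0).split(".")]
    parts += [0] * (3 - len(parts))  # pad '1.5' to (1, 5, 0)
    return tuple(parts)


def in_range(v: str, low: str, high: str) -> bool:
    """True if low <= v < high, comparing numeric release tuples only."""
    return release_tuple(low) <= release_tuple(v) < release_tuple(high)


def check_installed(ranges=RANGES) -> dict:
    """Map each package to True/False, or None if it is not installed."""
    report = {}
    for name, (low, high) in ranges.items():
        try:
            report[name] = in_range(version(name), low, high)
        except PackageNotFoundError:
            report[name] = None
    return report


print(check_installed())
```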
Installation
pip install --upgrade dashi