dashi v0.1.0

surfer8137 released this 07 Mar 11:37
· 298 commits to main since this release

Release v0.1.0 – Initial Version

We are excited to introduce dashi, a powerful Python library for dataset shift analysis and characterization!
This first release provides robust tools for analyzing temporal and multi-source dataset shifts, enabling both supervised and unsupervised evaluations to detect, understand, and mitigate changes in data distributions.

Key Features

Supervised Characterization

  • Train classification/regression models (Random Forests) on batched data (temporal or multi-source).
  • Analyze how dataset shifts affect model performance and identify the batches where performance degrades.
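The idea behind the supervised characterization can be sketched without dashi itself: fit a model on each batch, score it on every batch, and read the resulting train/test grid. The snippet below is only an illustration under stated assumptions. It does not use the dashi API, the data are synthetic, and a one-feature threshold classifier stands in for the Random Forest so that the example needs nothing beyond the standard library. All names (`make_batch`, `fit_threshold`, `matrix`) are hypothetical.

```python
import random
from statistics import mean


def make_batch(n, shift, seed):
    """Synthetic binary batch; `shift` moves the feature for both classes."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        xs.append(rng.gauss(2.0 * y + shift, 1.0))
        ys.append(y)
    return xs, ys


def fit_threshold(xs, ys):
    """Stand-in model (NOT a Random Forest): midpoint between class means."""
    m0 = mean(x for x, y in zip(xs, ys) if y == 0)
    m1 = mean(x for x, y in zip(xs, ys) if y == 1)
    return (m0 + m1) / 2.0


def accuracy(threshold, xs, ys):
    """Fraction of samples where 'x above threshold' matches the label."""
    hits = sum((x > threshold) == bool(y) for x, y in zip(xs, ys))
    return hits / len(ys)


# Three hypothetical yearly batches; the last one is strongly shifted.
shifts = {"2021": 0.0, "2022": 0.5, "2023": 2.0}
batches = {
    name: make_batch(500, shift, seed)
    for seed, (name, shift) in enumerate(shifts.items())
}

# matrix[train][test]: accuracy of the model fitted on `train`, scored on `test`.
# Diagonal entries are in-sample scores and therefore optimistic.
matrix = {
    train: {
        test: accuracy(fit_threshold(*batches[train]), *batches[test])
        for test in batches
    }
    for train in batches
}

for train, row in matrix.items():
    print(train, {test: round(score, 2) for test, score in row.items()})
```

Rows where the off-diagonal scores drop well below the diagonal point to batches whose distribution differs from the training batch. A real analysis would use held-out data within each batch rather than in-sample diagonal scores.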

Unsupervised Characterization

  • Detect temporal dataset shifts by estimating statistical distributions over time.
  • Project these distributions onto non-parametric statistical manifolds to reveal hidden trends and latent temporal variability.
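As a rough sketch of the first step described above, the snippet below estimates one histogram per temporal batch and computes pairwise Jensen-Shannon distances between them. This is an assumption about one common way to compare batch distributions, not a description of dashi's internals, and it does not call the dashi API. The data are synthetic, every name is hypothetical, and the projection step (for example multidimensional scaling on the distance matrix) is left out.

```python
import math
import random


def histogram(values, lo, hi, n_bins):
    """Relative-frequency histogram on n_bins equal-width bins over [lo, hi]."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # Clamp so that v == hi falls into the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return [c / len(values) for c in counts]


def js_distance(p, q):
    """Jensen-Shannon distance (sqrt of the base-2 divergence), in [0, 1]."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    return math.sqrt(max(0.0, (kl(p, m) + kl(q, m)) / 2))


# Twelve hypothetical monthly batches with an abrupt shift after June.
rng = random.Random(0)
batches = {}
for month in range(1, 13):
    center = 0.0 if month <= 6 else 3.0
    batches[f"2023-{month:02d}"] = [rng.gauss(center, 1.0) for _ in range(1000)]

# Shared bin range so that histograms are comparable across batches.
all_values = [v for vals in batches.values() for v in vals]
lo, hi = min(all_values), max(all_values)

# Stacked side by side, these histograms form the matrix behind a temporal heatmap.
hists = {name: histogram(vals, lo, hi, 20) for name, vals in batches.items()}

# Pairwise dissimilarity between batch distributions.
dist = {a: {b: js_distance(hists[a], hists[b]) for b in hists} for a in hists}

print(round(dist["2023-01"]["2023-02"], 3), round(dist["2023-01"]["2023-12"], 3))
```

Batches drawn before and after the shift end up far apart, while batches on the same side stay close. That block structure in the distance matrix is what a low-dimensional embedding would then make visible.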

Visualization Tools

To facilitate exploration and interpretation, dashi includes:

  • Data Temporal Heatmaps (DTHs) – Visualize temporal shifts in data distributions.
  • Information Geometric Temporal (IGT) plots – Embed temporal batches onto latent statistical manifolds, so that batches with similar distributions appear close together.
  • Multi-batch Contingency Matrices – Compare evaluation metrics (F1-Score, Recall, Precision, AUC, etc.) across pairs of temporal or multi-source batches.

This release provides a foundation for dataset shift analysis, helping researchers and data practitioners monitor and understand data integrity over time.

Installation

pip install dashi