A simple and reusable Python library to automate data cleaning for machine learning workflows.
While working on machine learning models, I noticed that I was repeatedly writing the same Pandas code to clean datasets.
To solve this, I built DataCleaner, a lightweight library that handles the most common data preprocessing tasks in a single call.
- 📂 Load data from:
  - CSV
  - Excel
  - JSON
- 🧼 Handle missing values:
  - Drop missing values
  - Fill with mean
  - Fill with median
- 🔁 Remove duplicate rows (optional)
- 📊 Logging support to track cleaning steps
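
To make the features above concrete, here is a minimal sketch of the equivalent steps written directly in pandas. It is an illustration only, not DataCleaner's actual API: the names `load_data`, `clean`, `strategy`, and `remove_duplicates` are hypothetical, and the sketch assumes that mean and median filling apply to numeric columns only.

```python
import logging
from pathlib import Path

import pandas as pd

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("cleaning_sketch")

# Hypothetical mapping from file extension to a pandas reader.
# This is an assumption for illustration, not DataCleaner's internals.
_READERS = {
    ".csv": pd.read_csv,
    ".xlsx": pd.read_excel,
    ".xls": pd.read_excel,
    ".json": pd.read_json,
}


def load_data(path):
    """Load a CSV, Excel, or JSON file into a DataFrame."""
    suffix = Path(path).suffix.lower()
    if suffix not in _READERS:
        raise ValueError(f"Unsupported file type: {suffix!r}")
    df = _READERS[suffix](path)
    logger.info("Loaded %d rows from %s", len(df), path)
    return df


def clean(df, strategy="drop", remove_duplicates=True):
    """Handle missing values, then optionally drop duplicate rows."""
    if strategy == "drop":
        cleaned = df.dropna()
    elif strategy in ("mean", "median"):
        # Assumption: only numeric columns are filled; others are left as-is.
        numeric = df.select_dtypes(include="number")
        fill_values = numeric.mean() if strategy == "mean" else numeric.median()
        cleaned = df.fillna(fill_values)
    else:
        raise ValueError(f"Unknown strategy: {strategy!r}")
    logger.info("Applied missing-value strategy %r", strategy)

    if remove_duplicates:
        before = len(cleaned)
        cleaned = cleaned.drop_duplicates()
        logger.info("Removed %d duplicate rows", before - len(cleaned))
    return cleaned
```

With these helpers, something like `clean(load_data("data.csv"), strategy="median")` would load a file, fill numeric gaps with each column's median, drop duplicate rows, and log each step along the way.
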
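Make sure you have Python installed, then install the dependencies:
pip install pandas
If you load Excel files, pandas also needs an Excel engine such as openpyxl (pip install openpyxl).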