🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark
Updated Jun 17, 2024 - Python
Quick and dirty data mining made easier in Sublime Text
Predict if a driver will file an insurance claim next year. (Kaggle Competition)
Data Cleaning is a Python package for data preprocessing. It cleans a CSV file and returns the cleaned data frame, handling imputation, duplicate removal, special-character replacement, and more.
Implementations and usage of methods from the Topics on Data Mining course
A Python script to parse non-meaningful data into meaningful data and save it to a .csv file
Some small JSON tools for my own use that may also help you
Python package to make URL extraction, generalization, validation, and filtration easy.
This project works with data collected from the donor database of Blood Transfusion Service Center in Hsin-Chu City in Taiwan. The center passes its blood transfusion service bus to one university in Hsin-Chu City to gather blood donated about every three months. The dataset, obtained from the UCI Machine Learning Repository, consists of a rando…
Data quality analysis and scoring system.
A scrub system for de-identification and cleaning of data to maintain its privacy from the world.
Transform unstructured, inconsistent or incomplete address data into structured and complete address data with Google Maps Geocoding API.
Quickly find flags (words, phrases, etc.) within your data. 🕵️‍♂️
Determine garment worker productivity. A regression problem.
This is my Data Engineer portfolio
Data exploration project introduced by Udacity Data Analysis Nanodegree
Python3 script to convert transcribed video VTT to CSV for import into Google Sheets
Manual image labeller (with human-level accuracy 😉). DIY.
A collection of scripts written to complete DQLab Data Engineer Career Track