This is my Data Engineer portfolio
Updated Feb 19, 2023 - Python
Predict garment worker productivity (a regression problem).
Quickly find flags (words, phrases, etc.) within your data. 🕵️‍♂️
Data exploration project introduced by Udacity Data Analysis Nanodegree
A Python script that parses raw, non-meaningful data into a meaningful form and saves it to a .csv file
A few small JSON tools written for my own use that may help you too
A collection of scripts written to complete the DQLab Data Engineer Career Track
Python package to make URL extraction, generalization, validation, and filtration easy.
A scrub system for de-identification and cleaning of data to maintain its privacy from the world.
Transform unstructured, inconsistent or incomplete address data into structured and complete address data with Google Maps Geocoding API.
Implementations and usage of methods from the Topics on Data Mining course
Manual image labeller (with human-level accuracy 😉). DIY
Quick and dirty data mining made easier in Sublime Text
Data quality analysis and scoring system.
Data Cleaning is a Python package for data preprocessing. It cleans a CSV file and returns the cleaned data frame, handling imputation, duplicate removal, special-character replacement, and more.
Python3 script to convert transcribed video VTT to CSV for import into Google Sheets
This project works with data collected from the donor database of Blood Transfusion Service Center in Hsin-Chu City in Taiwan. The center passes its blood transfusion service bus to one university in Hsin-Chu City to gather blood donated about every three months. The dataset, obtained from the UCI Machine Learning Repository, consists of a rando…
Predict if a driver will file an insurance claim next year. (Kaggle Competition)
🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark