Dataclean is a data cleaning library for pandas DataFrames.
It performs the following operations on a DataFrame:
- Fixes column headers containing spaces and special characters
- Drops columns with missing values
- Fixes outliers
- Removes duplicate rows
- Imputes missing values
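
To make the operations listed above concrete, here is a rough sketch of what each one can look like in plain pandas. It is not dataclean's implementation or API: the function name `clean_sketch`, the `max_missing` threshold, the median/mode imputation, and the 1.5 * IQR clipping rule are all assumptions chosen for illustration.

```python
import re

import pandas as pd


def clean_sketch(df: pd.DataFrame, max_missing: float = 0.5) -> pd.DataFrame:
    """Hypothetical illustration only; NOT dataclean's actual implementation."""
    out = df.copy()

    # 1. Column headers: trim, then collapse runs of non-alphanumerics to "_".
    out.columns = [
        re.sub(r"[^0-9a-zA-Z]+", "_", str(c).strip()).strip("_")
        for c in out.columns
    ]

    # 2. Drop columns whose share of missing values exceeds max_missing
    #    (assumed rule; the threshold is a made-up parameter).
    out = out.loc[:, out.isna().mean() <= max_missing]

    # 3. Remove duplicate rows.
    out = out.drop_duplicates().reset_index(drop=True)

    # 4. Impute: median for numeric columns, most frequent value otherwise
    #    (assumed strategy).
    for col in out.columns:
        if not out[col].isna().any():
            continue
        if pd.api.types.is_numeric_dtype(out[col]):
            out[col] = out[col].fillna(out[col].median())
        else:
            mode = out[col].mode()
            if not mode.empty:
                out[col] = out[col].fillna(mode.iloc[0])

    # 5. Outliers: clip numeric columns to the 1.5 * IQR fences (assumed rule).
    for col in out.select_dtypes(include="number").columns:
        q1, q3 = out[col].quantile([0.25, 0.75])
        iqr = q3 - q1
        out[col] = out[col].clip(q1 - 1.5 * iqr, q3 + 1.5 * iqr)

    return out


# Small made-up DataFrame to try the sketch on.
raw = pd.DataFrame(
    {
        " first name ": ["Ann", "Bob", "Bob", None],
        "age (yrs)": [30.0, 41.0, 41.0, None],
        "mostly empty": [None, None, None, 1.0],
    }
)
cleaned = clean_sketch(raw)
print(cleaned)
```

The order of the steps is a design choice in this sketch: dropping sparse columns and duplicate rows first keeps them from skewing the medians and quartiles used later.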
The following paragraphs describe how to get dataclean and use it in your own projects.
To get dataclean, either fork this GitHub repo or install it from PyPI via pip.
$ pip install dataclean
dataclean was designed with ease of use in mind. First, import cleandata from dataclean.clean:
from dataclean.clean import cleandata