# DataMop Tutorial

Welcome to the tutorial for `datamop`, a Python package for cleaning and preparing your datasets with minimal effort. Data cleaning is often the most tedious part of a data analysis or machine learning project: missing values, inconsistent scales, and mixed data types slow you down and distract from the real task of extracting insights from your data.

That is where the `datamop` package comes in! It automates common data cleaning tasks such as imputing missing values, encoding categorical features, and scaling numerical features, saving you time while keeping your data consistent, complete, and ready for analysis.

Here we show example usage for each function in the package: `sweep_nulls`, `column_encoder`, and `column_scaler`. By the end, your messy data will be ready to use, so you can focus more on analysis and less on tedious preprocessing.

## Importing and Version Checking


Before we get started, let's import the `datamop` package (install it first if you have not already). We will demonstrate each function in the package with examples using the Airbnb Open Data from Kaggle.
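Because an import fails when the package is not installed, it can help to check the installed version first. The snippet below is a minimal sketch using only the standard library. The helper name `installed_version` is hypothetical, and it assumes the package is distributed under the name `datamop`.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name: str):
    """Return the installed version string of a package, or None if it is absent.

    Hypothetical helper for illustration; it is not part of datamop.
    """
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Assumption: the distribution name matches the import name "datamop".
datamop_version = installed_version("datamop")
if datamop_version is None:
    print("datamop is not installed; install it before running this tutorial.")
else:
    print(f"datamop version: {datamop_version}")
```

If the check reports that the package is missing, install it in the same environment that runs the notebook and restart the kernel before importing.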

In [1]:
# import modules
import pandas as pd
import datamop
from datamop.sweep_nulls import sweep_nulls
from datamop.column_encoder import column_encoder
from datamop.column_scaler import column_scaler

# import Airbnb Open Data
data = pd.read_csv()

## Handling missing values with `sweep_nulls()`

One of the most common challenges in the data cleaning process is dealing with missing values. `datamop` provides a convenient function called `sweep_nulls()` to help you handle them. The `sweep_nulls()` function scans your dataset for missing values and lets you handle them using one of several strategies: 'mean' (numeric only), 'median' (numeric only), 'mode', 'constant', and 'drop'.
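To make the strategy names concrete, here is a rough sketch of what each one means for a single column, written in plain pandas. It is not the `datamop` implementation and does not show the `sweep_nulls()` signature. The helper `fill_missing`, its parameters, and the toy columns are all hypothetical.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def fill_missing(series: pd.Series, strategy: str = "mean", fill_value=None) -> pd.Series:
    """Illustrate the five strategies on one column (hypothetical helper, not datamop)."""
    if strategy == "drop":
        # Remove the entries that are missing.
        return series.dropna()
    if strategy in ("mean", "median"):
        # Mean and median are only defined for numeric columns.
        if not is_numeric_dtype(series):
            raise TypeError(f"'{strategy}' requires a numeric column")
        value = series.mean() if strategy == "mean" else series.median()
        return series.fillna(value)
    if strategy == "mode":
        # mode() ignores missing values; it is empty if every entry is missing.
        modes = series.mode()
        return series if modes.empty else series.fillna(modes.iloc[0])
    if strategy == "constant":
        if fill_value is None:
            raise ValueError("'constant' requires a fill_value")
        return series.fillna(fill_value)
    raise ValueError(f"unknown strategy: {strategy}")


# Toy columns invented for this sketch (not taken from the Airbnb data).
price = pd.Series([100.0, None, 300.0, 200.0])
room_type = pd.Series(["Private", None, "Private", "Shared"])

print(fill_missing(price, "mean").tolist())
print(fill_missing(room_type, "mode").tolist())
print(fill_missing(price, "constant", fill_value=0).tolist())
print(fill_missing(price, "drop").tolist())
```

The numeric-only restriction on 'mean' and 'median' shows up here as an explicit type check, while 'mode', 'constant', and 'drop' apply to columns of any type.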

Let's start by checking the missing values in the dataset:

In [None]:
data.info()

In [None]:
data.isnull().sum()