viplazylmht/DataMiningLab01

DataMining Lab 01 - Preprocessing

How to use

Note: Pass the -h option to any command to show its help.

  1. List columns that have missing data

    python3 list_missing.py test/house-prices.csv --extra
  2. Count rows that have missing data

    python3 count_missing.py test/house-prices.csv
  3. Impute missing values

    python3 impute.py test/house-prices.csv --method mode
  4. Remove rows whose missing rate is greater than a given threshold

    python3 remove_missing.py test/house-prices.csv 50
  5. Remove columns whose missing rate is greater than a given threshold

    python3 remove_missing.py test/house-prices.csv 50 --column
  6. Remove duplicate rows

    python3 remove_dup.py test/house-prices.csv
  7. Scale features of the dataset

    python3 feature_scaling.py test/house-prices.csv --column PoolArea YrSold
    python3 feature_scaling.py test/house-prices.csv --column PoolArea YrSold --method zscore
  8. Calculate the value of an attribute expression

    On Windows:

    python3 calculating_attributes_expressions.py test/house-prices.csv YrSold + SalePrice * 2 --cname Total

    On Linux, escape * so that the shell does not expand it as a wildcard:

    python3 calculating_attributes_expressions.py test/house-prices.csv YrSold + SalePrice \* 2 --cname Total
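
For readers who want a feel for what the missing-rate and scaling steps compute, here is a minimal sketch in plain Python. It is not the code of the scripts above: the sample rows, the helper names (is_missing, column_missing_rates, drop_columns_above, min_max, z_score), the treatment of empty strings as missing, and the reading of the threshold as a percentage are all assumptions made for illustration. Min-max scaling is shown only as a common counterpart to zscore; the README does not say which method is the default.

```python
import math

# Hypothetical sample rows; an empty string stands for a missing value.
ROWS = [
    {"PoolArea": "0", "YrSold": "2008", "Alley": ""},
    {"PoolArea": "512", "YrSold": "2010", "Alley": ""},
    {"PoolArea": "", "YrSold": "2006", "Alley": "Grvl"},
    {"PoolArea": "256", "YrSold": "2008", "Alley": ""},
]


def is_missing(value):
    """Assumption: None or a blank string counts as missing."""
    return value is None or value.strip() == ""


def column_missing_rates(rows):
    """Return {column: percentage of rows in which the column is missing}."""
    return {
        col: 100.0 * sum(is_missing(r[col]) for r in rows) / len(rows)
        for col in rows[0]
    }


def drop_columns_above(rows, threshold):
    """Keep only columns whose missing rate (in percent) is <= threshold."""
    rates = column_missing_rates(rows)
    keep = [col for col, rate in rates.items() if rate <= threshold]
    return [{col: r[col] for col in keep} for r in rows]


def min_max(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Center on the mean and divide by the population standard deviation."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


if __name__ == "__main__":
    print(column_missing_rates(ROWS))
    print(drop_columns_above(ROWS, 50))
    print(min_max([0.0, 256.0, 512.0]))
    print(z_score([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]))
```

The row-wise variant follows the same pattern: compute the share of missing cells in each row and keep the rows at or below the threshold.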
