Machine Learning Programming Homework 1-2, Department of Software, Gachon University, South Korea (Fall 2021 semester)
Preprocessing : Apply data preprocessing (scaling and encoding) to a dataset
parameters:
dataset : dataframe to preprocess
encode_list : list of feature names to encode
scale_list : list of feature names to scale
return:
dictionary of preprocessed dataframes
Examples:
# classification
train = Preprocessing(train, ["Sex"], ["Pclass", "Age", "SibSp", "Parch", "Fare"])
# clustering
pre_feature = Preprocessing(dataset, ["ocean_proximity"], ["longitude", "latitude", "housing_median_age", "total_rooms", "population", "households", "median_income"])
reference : https://github.com/catsaveearth/scale_encode_combination
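A minimal sketch of what a function with this signature might do, not the homework's actual implementation. It assumes the returned dictionary holds one dataframe per scaler/encoder combination. The name preprocessing_sketch, the dictionary keys, the choice of StandardScaler/MinMaxScaler and ordinal/one-hot encoding, and the toy data are all hypothetical.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler, OrdinalEncoder, StandardScaler


def preprocessing_sketch(dataset, encode_list, scale_list):
    """Return {(scaler_name, encoder_name): preprocessed DataFrame}.

    Hypothetical stand-in for Preprocessing; the scaler and encoder
    choices below are assumptions, not taken from the homework code.
    """
    scalers = {"standard": StandardScaler, "minmax": MinMaxScaler}
    results = {}
    for scaler_name, scaler_cls in scalers.items():
        for encoder_name in ("ordinal", "onehot"):
            df = dataset.copy()  # leave the caller's dataframe untouched
            if scale_list:
                df[scale_list] = scaler_cls().fit_transform(df[scale_list])
            if encode_list:
                if encoder_name == "ordinal":
                    # Each category becomes an integer-valued code per column.
                    df[encode_list] = OrdinalEncoder().fit_transform(df[encode_list])
                else:
                    # One new indicator column per category value.
                    df = pd.get_dummies(df, columns=encode_list)
            results[(scaler_name, encoder_name)] = df
    return results


# Toy data for illustration only (not the homework dataset).
toy = pd.DataFrame({
    "Sex": ["male", "female", "female", "male"],
    "Age": [22.0, 38.0, 26.0, 35.0],
    "Fare": [7.25, 71.28, 7.92, 53.1],
})
combos = preprocessing_sketch(toy, ["Sex"], ["Age", "Fare"])
for key, frame in combos.items():
    print(key, list(frame.columns))
```

Keeping every combination in one dictionary lets the later training step loop over the entries and compare which preprocessing works best for a given model.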
Classification: data load -> data preprocessing -> model training -> check result
models:
- DecisionTreeClassifier (entropy)
- Support vector machine (SVC)
- GaussianNB
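The pipeline above could be sketched roughly as follows with scikit-learn. This is an illustration under assumptions, not the homework code: the synthetic data from make_classification, the 75/25 split, and the default hyperparameters (apart from criterion="entropy") are all hypothetical choices.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Data load: synthetic stand-in for the real, already preprocessed dataset.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# The three classifiers listed above.
models = {
    "decision_tree": DecisionTreeClassifier(criterion="entropy", random_state=0),
    "svc": SVC(),
    "gaussian_nb": GaussianNB(),
}

# Model training and result check: accuracy on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)
    print(f"{name}: accuracy = {scores[name]:.3f}")
```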
Clustering: data load -> data preprocessing -> model training -> check result
models:
- K-means
- EM (GaussianMixture)
- CLARANS
- DBSCAN
- MeanShift
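A rough sketch of the clustering pipeline with scikit-learn, under stated assumptions: the blob data, the cluster count of 3, and the DBSCAN parameters are hypothetical. CLARANS is left out because scikit-learn does not provide it; a third-party package would be needed for that model.

```python
from sklearn.cluster import DBSCAN, KMeans, MeanShift
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture

# Data load: synthetic stand-in for the real, already preprocessed features.
X, _ = make_blobs(n_samples=150, centers=3, cluster_std=0.5, random_state=0)

# Model training: each estimator assigns a cluster label to every sample.
labels = {
    "kmeans": KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X),
    "em": GaussianMixture(n_components=3, random_state=0).fit_predict(X),
    "dbscan": DBSCAN(eps=0.5, min_samples=5).fit_predict(X),
    "meanshift": MeanShift().fit_predict(X),
}

# Check result: count clusters per model (DBSCAN marks noise points as -1).
for name, assigned in labels.items():
    n_clusters = len(set(assigned) - {-1})
    print(f"{name}: {n_clusters} clusters")
```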