
A python implementation of missing value imputation with kNN


robinsonkwame/Imputer.py

 
 


Imputer

A Python implementation of missing-value imputation using k-nearest neighbours (kNN).

Requires scikit-learn, NumPy, and pandas. Initialise:

from imputer import Imputer
impute = Imputer()

Default usage (X should be a pandas.DataFrame; column is the name or index of the column to impute):

X_imputed = impute.knn(X = data, column = 'age')  # default: k = 10

Change the number of neighbours k:

X_imputed = impute.knn(X = data, column = 'age', k = 3)
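
To illustrate what k controls, here is a minimal sketch of kNN imputation for a numerical column. It is not this package's code: the helper name knn_impute_numeric and the sample data are hypothetical, and the sketch assumes every other column is numeric and complete. It uses scikit-learn's KNeighborsRegressor, which predicts the unweighted mean of the k nearest rows.

```python
import numpy as np
import pandas as pd
from sklearn.neighbors import KNeighborsRegressor


def knn_impute_numeric(X: pd.DataFrame, column: str, k: int = 10) -> pd.DataFrame:
    """Hypothetical helper: fill NaNs in `column` with the mean of the k nearest rows.

    Assumes all other columns are numeric and contain no missing values.
    """
    features = X.drop(columns=[column])
    missing = X[column].isna()
    if not missing.any():
        return X.copy()
    # k cannot exceed the number of rows that actually have a value.
    k = min(k, int((~missing).sum()))
    model = KNeighborsRegressor(n_neighbors=k)
    model.fit(features[~missing], X.loc[~missing, column])
    out = X.copy()  # leave the caller's DataFrame untouched
    out.loc[missing, column] = model.predict(features[missing])
    return out


# Made-up sample data: the last row has no age.
data = pd.DataFrame({
    "height": [150.0, 152.0, 180.0, 182.0, 151.0],
    "weight": [50.0, 52.0, 80.0, 82.0, 51.0],
    "age": [20.0, 22.0, 40.0, 42.0, np.nan],
})
imputed = knn_impute_numeric(data, column="age", k=2)
print(imputed.loc[4, "age"])  # 21.0, the mean of the two nearest rows (ages 20 and 22)
```

A small k follows local structure closely but is sensitive to noisy neighbours; a larger k smooths the estimate towards the column mean.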

By default, knn imputes numerical features. To impute a categorical feature, set is_categorical = True:

X_imputed = impute.knn(X = data, column = 'gender', k = 10, is_categorical = True)
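
For a categorical column, averaging neighbours makes no sense, so one common approach is a majority vote among the k nearest rows. The sketch below shows that idea with scikit-learn's KNeighborsClassifier. The helper name knn_impute_categorical and the sample data are hypothetical, and this is an assumption about how such imputation can work, not a copy of this package's internals.

```python
import numpy as np
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier


def knn_impute_categorical(X: pd.DataFrame, column: str, k: int = 10) -> pd.DataFrame:
    """Hypothetical helper: fill missing labels in `column` by majority vote.

    Assumes all other columns are numeric and contain no missing values.
    """
    features = X.drop(columns=[column])
    missing = X[column].isna()
    if not missing.any():
        return X.copy()
    # k cannot exceed the number of rows that actually have a label.
    k = min(k, int((~missing).sum()))
    model = KNeighborsClassifier(n_neighbors=k)
    model.fit(features[~missing], X.loc[~missing, column])
    out = X.copy()  # leave the caller's DataFrame untouched
    out.loc[missing, column] = model.predict(features[missing])
    return out


# Made-up sample data: the last row has no gender label.
people = pd.DataFrame({
    "height": [150.0, 152.0, 149.0, 180.0, 182.0, 181.0, 151.0],
    "weight": [50.0, 52.0, 49.0, 80.0, 82.0, 81.0, 51.0],
    "gender": ["F", "F", "F", "M", "M", "M", np.nan],
})
people_imputed = knn_impute_categorical(people, column="gender", k=3)
print(people_imputed.loc[6, "gender"])  # F, since all three nearest rows are labelled F
```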

Reference

Troyanskaya O, Cantor M, Sherlock G, et al. Missing value estimation methods for DNA microarrays. Bioinformatics, 2001, 17(6): 520-525.
