Codes&Datasets

Codes: Contains 6 classification algorithms, 6 clustering algorithms, and 5 regression algorithms. All code is written in C++. P.S. LogisticRegression is used for both classification and regression.

Datasets: Contains 5 original classification datasets, 5 original clustering datasets, and 5 original regression datasets. Dirty data is injected into the original datasets in three forms: missing data, inconsistent data, and conflicting data. The missing rate, the inconsistency rate, and the conflict rate each vary from 10% to 50%.

If you have any questions, please email zhixin.qi@foxmail.com. Enjoy!
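
As an illustration of what a missing rate between 10% and 50% can mean in practice, here is a minimal C++ sketch that blanks out a fixed fraction of cells in a numeric table. It is not the injection code used to build these datasets. The names `Cell`, `Table`, `injectMissing`, and `countMissing` are hypothetical, and the choice to apply the rate per cell (rather than per row or per attribute) and to sample uniformly at random is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <optional>
#include <random>
#include <utility>
#include <vector>

// A cell either holds a value or is missing (std::nullopt).
// Cell and Table are hypothetical types used only for this sketch.
using Cell = std::optional<double>;
using Table = std::vector<std::vector<Cell>>;

// Hypothetical helper: mark round(rate * number_of_cells) cells as missing,
// chosen uniformly at random. Assumes the rate is defined per cell.
// Returns the number of cells that were blanked out.
std::size_t injectMissing(Table& table, double rate, unsigned seed) {
    assert(rate >= 0.0 && rate <= 1.0);

    // Collect the (row, column) position of every cell.
    std::vector<std::pair<std::size_t, std::size_t>> cells;
    for (std::size_t i = 0; i < table.size(); ++i) {
        for (std::size_t j = 0; j < table[i].size(); ++j) {
            cells.emplace_back(i, j);
        }
    }

    // Shuffle the positions with a seeded generator, then blank the first k.
    std::mt19937 rng(seed);
    std::shuffle(cells.begin(), cells.end(), rng);

    const auto k = static_cast<std::size_t>(rate * cells.size() + 0.5);
    for (std::size_t n = 0; n < k; ++n) {
        table[cells[n].first][cells[n].second] = std::nullopt;
    }
    return k;
}

// Count how many cells in the table are currently missing.
std::size_t countMissing(const Table& table) {
    std::size_t missing = 0;
    for (const auto& row : table) {
        for (const auto& cell : row) {
            if (!cell.has_value()) {
                ++missing;
            }
        }
    }
    return missing;
}
```

Calling `injectMissing(table, 0.3, 42u)` on a fully populated table of 10 rows and 5 columns would blank out 15 of its 50 cells. Inconsistent and conflicting data need dataset-specific rules, so they are not sketched here.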
qizhixinhit/Dirty-dataImpacts