hanfei1986/Oversampling-of-imbalanced-data-with-RandomOverSampler-SMOTE-and-ADASYN

Oversampling-of-imbalanced-data-with-SMOTE-and-ADASYN

Imbalanced data are common in real-world problems, especially in anomaly-detection tasks. If the imbalance is not handled, a classifier's predictions tend to be biased towards the majority class. RandomOverSampler, SMOTE, and ADASYN are oversampling tools that generate additional samples for the minority classes so that the dataset becomes balanced.

The "Poor" and "Good" classes have far fewer samples than the "Standard" class:

image
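
As a rough illustration of how such an imbalance can be quantified, the sketch below counts the samples per class and computes each class's ratio to the majority class. The label counts are hypothetical placeholders, since the real dataset's numbers are not stated here.

```python
from collections import Counter

# Hypothetical labels for illustration; the real class counts are not given here.
labels = ["Standard"] * 60 + ["Poor"] * 25 + ["Good"] * 15

counts = Counter(labels)
majority = max(counts.values())

# Imbalance ratio: size of the largest class divided by the size of each class.
imbalance_ratio = {cls: majority / n for cls, n in counts.items()}

for cls, n in counts.most_common():
    print(f"{cls:<10} {n:>4}  ratio={imbalance_ratio[cls]:.2f}")
```

A ratio of 1.0 marks the majority class; the larger the ratio, the more under-represented the class.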

The predictions are biased towards the majority class:

image
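
One way to make this bias visible is to look at recall per class instead of overall accuracy. The following is a minimal sketch with made-up labels and predictions (the helper name `per_class_recall` is hypothetical, not taken from this repository): accuracy looks acceptable while the minority classes are mostly missed.

```python
from collections import Counter

def per_class_recall(y_true, y_pred):
    """Return {class: fraction of that class's samples that were predicted correctly}."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {cls: hits[cls] / totals[cls] for cls in totals}

# Hypothetical ground truth and predictions, for illustration only.
y_true = ["Standard"] * 6 + ["Poor"] * 2 + ["Good"] * 2
y_pred = ["Standard"] * 6 + ["Standard", "Poor"] + ["Standard", "Standard"]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
recall = per_class_recall(y_true, y_pred)

print(f"accuracy = {accuracy:.2f}")
for cls, value in recall.items():
    print(f"recall[{cls}] = {value:.2f}")
```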

Oversampling with RandomOverSampler:

image
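
The idea behind random oversampling can be sketched in plain Python: minority-class rows are duplicated at random (with replacement) until every class is as large as the biggest one. This is an illustrative sketch, not the code of imbalanced-learn's `RandomOverSampler`; the function name `random_oversample` and the toy data are assumptions.

```python
import random
from collections import Counter, defaultdict

def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows of each class until all classes match the largest."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for features, label in zip(X, y):
        by_class[label].append(features)
    target = max(len(rows) for rows in by_class.values())

    X_res, y_res = list(X), list(y)
    for label, rows in by_class.items():
        # Sample with replacement; k is 0 for the majority class.
        extra = rng.choices(rows, k=target - len(rows))
        X_res.extend(extra)
        y_res.extend([label] * len(extra))
    return X_res, y_res

# Toy data, for illustration only.
X = [[0.1], [0.2], [0.3], [0.4], [1.0], [2.0]]
y = ["Standard"] * 4 + ["Poor", "Good"]

X_res, y_res = random_oversample(X, y)
print(Counter(y_res))
```

Because the added rows are exact copies, no new information enters the dataset; the classifier simply sees the minority samples more often.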

The predictions become slightly more balanced:

image

Oversampling with SMOTE:

image
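
Unlike random duplication, SMOTE creates new points by interpolating between a minority sample and one of its nearest minority-class neighbours. The sketch below illustrates only that core step under simplifying assumptions (Euclidean distance, a tiny toy dataset, a hypothetical helper named `smote_sketch`); it is not the imbalanced-learn implementation.

```python
import math
import random

def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority-class neighbours of `base`, excluding `base` itself.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Random position on the segment from `base` to `neighbour`.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic

# Toy minority-class samples, for illustration only.
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote_sketch(minority, n_new=5)

for point in new_points:
    print([round(v, 3) for v in point])
```

Every synthetic point lies on a line segment between two existing minority samples, so the new data stay inside the region the minority class already occupies.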

The predictions become more balanced:

image

Oversampling with ADASYN:

image
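
ADASYN interpolates in the same way as SMOTE, but it generates more synthetic points around minority samples that are surrounded by other classes. The sketch below illustrates only this allocation step, assuming Euclidean distance and simple rounding; the helper name `adasyn_allocation` and the toy data are hypothetical, and this is not the imbalanced-learn implementation.

```python
import math

def adasyn_allocation(X, y, minority_label, n_new, k=3):
    """Split n_new synthetic samples across minority points by local difficulty."""
    minority_idx = [i for i, label in enumerate(y) if label == minority_label]
    hardness = []
    for i in minority_idx:
        # k nearest neighbours of point i among all other samples, any class.
        neighbours = sorted(
            (j for j in range(len(X)) if j != i),
            key=lambda j: math.dist(X[i], X[j]),
        )[:k]
        # Fraction of those neighbours that belong to a different class.
        hardness.append(sum(y[j] != minority_label for j in neighbours) / k)

    total = sum(hardness)
    if total == 0:  # no minority point has neighbours from another class
        return {i: 0 for i in minority_idx}
    # Rounding means the counts may not sum exactly to n_new in general.
    return {i: round(n_new * h / total) for i, h in zip(minority_idx, hardness)}

# Toy 1-D data, for illustration only: four majority points, five minority points.
X = [[0.0], [0.1], [0.2], [0.3], [0.35], [0.9], [2.0], [2.1], [2.2]]
y = ["Standard"] * 4 + ["Poor"] * 5

allocation = adasyn_allocation(X, y, minority_label="Poor", n_new=10)
print(allocation)
```

In this toy example, the minority points that sit next to the "Standard" cluster receive all of the synthetic samples, while the isolated minority group around 2.0 receives none.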

The predictions become more balanced:

image
