How to classify highly unbalanced or skewed data?

The notebook might not open because of its heavy visuals; you can also view it on Kaggle: https://www.kaggle.com/shweta2407/oversampling-vs-undersampling-techniques

An unbalanced (or skewed) dataset is one in which most of the samples fall into one class and only a few fall into the others.
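To make the definition concrete, here is a minimal sketch that measures how skewed a label column is. The labels are hypothetical toy data (950 samples of class 0, 50 of class 1), not taken from the notebook.

```python
from collections import Counter

# Hypothetical toy labels: 0 is the majority class, 1 is the minority class.
labels = [0] * 950 + [1] * 50

counts = Counter(labels)
majority_count = max(counts.values())
minority_count = min(counts.values())

# Ratio of the largest class to the smallest class.
imbalance_ratio = majority_count / minority_count

# Accuracy of a trivial model that always predicts the majority class.
baseline_accuracy = majority_count / len(labels)

print(counts)
print(f"imbalance ratio: {imbalance_ratio:.0f}:1")
print(f"always-majority accuracy: {baseline_accuracy:.0%}")
```

On data like this, a model that ignores the minority class entirely still scores 95% accuracy, which is why plain accuracy says little about a skewed dataset.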

To classify this type of data, we first need to balance it.

How to balance unbalanced data?

Apply resampling techniques to balance the data. There are two kinds of resampling techniques: OVERSAMPLING and UNDERSAMPLING.
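The difference between the two families can be sketched with plain random resampling. This is the simplest possible baseline and is not one of the techniques listed below; the function names `random_oversample` and `random_undersample` and the toy data are assumptions made for illustration.

```python
import random
from collections import Counter

def random_oversample(X, y, seed=0):
    """Duplicate random samples of the smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        pool = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out

def random_undersample(X, y, seed=0):
    """Keep a random subset of each class, sized to match the smallest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = min(counts.values())
    X_out, y_out = [], []
    for label in counts:
        pool = [x for x, lab in zip(X, y) if lab == label]
        for x in rng.sample(pool, target):
            X_out.append(x)
            y_out.append(label)
    return X_out, y_out

# Hypothetical toy data: 90 majority samples, 10 minority samples.
X = [[float(i)] for i in range(100)]
y = [0] * 90 + [1] * 10

X_over, y_over = random_oversample(X, y)
X_under, y_under = random_undersample(X, y)
print(Counter(y_over))   # both classes now have 90 samples
print(Counter(y_under))  # both classes now have 10 samples
```

Oversampling keeps every original sample but repeats minority points, while undersampling discards majority points. The techniques below refine how the added or removed points are chosen.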

OVERSAMPLING Techniques

SMOTE - Synthetic Minority Oversampling Technique
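The core idea of SMOTE is to create new minority points by interpolating between a minority sample and one of its nearest minority neighbours. The following is a simplified sketch of that idea in plain Python, not a library implementation; the function name `smote_sketch`, its parameters, and the toy points are assumptions for illustration.

```python
import math
import random

def smote_sketch(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic points by interpolating between minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the starting point.
        base_idx = rng.randrange(len(minority))
        base = minority[base_idx]
        # Find its k nearest minority neighbours (excluding itself).
        others = [p for j, p in enumerate(minority) if j != base_idx]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic

# Hypothetical minority-class points in two dimensions.
minority = [[1.0, 1.0], [1.5, 2.0], [2.0, 1.0], [2.5, 2.5]]
new_points = smote_sketch(minority, n_new=6, k=3)
for point in new_points:
    print(point)
```

Because each synthetic point lies on a line segment between two real minority points, the new samples stay inside the region the minority class already occupies instead of being exact duplicates.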

UNDERSAMPLING Techniques

NearMiss Version 1, 2, 3

Tomek Links

Condensed Nearest Neighbor

Edited Nearest Neighbor
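Two of the techniques in this list, Tomek Links and Edited Nearest Neighbor, can be sketched in a few lines. A Tomek link is a pair of samples from different classes that are each other's nearest neighbour; the sketch drops the majority member of each pair. The Edited Nearest Neighbor sketch drops majority samples that the majority vote of their k nearest neighbours would misclassify. Both functions, their names, and the one-dimensional toy data are assumptions for illustration, and library implementations may differ in detail.

```python
import math
from collections import Counter

def nearest_indices(X, i, k):
    """Indices of the k nearest neighbours of X[i], excluding i itself."""
    others = [j for j in range(len(X)) if j != i]
    others.sort(key=lambda j: math.dist(X[i], X[j]))
    return others[:k]

def tomek_link_undersample(X, y, majority_label):
    """Drop the majority-class member of every Tomek link."""
    nn = [nearest_indices(X, i, 1)[0] for i in range(len(X))]
    drop = set()
    for i, j in enumerate(nn):
        # Tomek link: mutual nearest neighbours that carry different labels.
        if nn[j] == i and y[i] != y[j] and y[i] == majority_label:
            drop.add(i)
    keep = [i for i in range(len(X)) if i not in drop]
    return [X[i] for i in keep], [y[i] for i in keep]

def edited_nearest_neighbours(X, y, majority_label, k=3):
    """Drop majority samples whose k nearest neighbours mostly belong to another class."""
    drop = set()
    for i in range(len(X)):
        if y[i] != majority_label:
            continue
        votes = Counter(y[j] for j in nearest_indices(X, i, k))
        if votes.most_common(1)[0][0] != y[i]:
            drop.add(i)
    keep = [i for i in range(len(X)) if i not in drop]
    return [X[i] for i in keep], [y[i] for i in keep]

# Hypothetical 1-D toy data: class 0 is the majority, class 1 the minority.
# The majority points 4.9 and 5.6 sit inside the minority region.
X = [[0.0], [1.0], [2.5], [3.5], [4.9], [5.0], [5.6], [6.0], [6.5]]
y = [0, 0, 0, 0, 0, 1, 0, 1, 1]

X_tl, y_tl = tomek_link_undersample(X, y, majority_label=0)
X_enn, y_enn = edited_nearest_neighbours(X, y, majority_label=0, k=3)
print(Counter(y_tl), Counter(y_enn))
```

On this toy data both methods remove the same two borderline majority points and leave every minority sample in place. Neither method aims for equal class counts; they clean the class boundary rather than balance the data exactly.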

Combinations of Undersampling Techniques

One-Sided Selection (Tomek Links and Condensed Nearest Neighbor (CNN))

Neighborhood Cleaning Rule (Condensed Nearest Neighbor and Edited Nearest Neighbor)
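As an illustration of how two techniques are chained, here is a rough sketch of One Sided Selection. It first applies a single Condensed Nearest Neighbor style pass (keep all minority samples, one random majority sample, and every majority sample that a 1-nearest-neighbour rule on that initial set misclassifies), then removes the majority member of each Tomek link among the kept samples. The function names and toy data are assumptions, and this is a simplified reading of the method rather than a reference implementation.

```python
import math
import random

def nearest(X, point, candidates):
    """Index (taken from candidates) of the sample closest to point."""
    return min(candidates, key=lambda j: math.dist(point, X[j]))

def one_sided_selection(X, y, majority_label, seed=0):
    rng = random.Random(seed)
    minority_idx = [i for i, lab in enumerate(y) if lab != majority_label]
    majority_idx = [i for i, lab in enumerate(y) if lab == majority_label]

    # Step 1 (CNN-style): all minority samples plus one random majority sample,
    # then add every majority sample that 1-NN on this initial set gets wrong.
    seed_idx = rng.choice(majority_idx)
    initial = minority_idx + [seed_idx]
    misclassified = [
        i for i in majority_idx
        if i != seed_idx and y[nearest(X, X[i], initial)] != y[i]
    ]
    store = initial + misclassified

    # Step 2 (Tomek Links): drop majority samples that form a Tomek link
    # with a minority sample inside the store.
    drop = set()
    for i in store:
        if y[i] != majority_label:
            continue
        j = nearest(X, X[i], [s for s in store if s != i])
        back = nearest(X, X[j], [s for s in store if s != j])
        if y[j] != y[i] and back == i:
            drop.add(i)

    keep = sorted(i for i in store if i not in drop)
    return [X[i] for i in keep], [y[i] for i in keep]

# Hypothetical 1-D toy data: class 0 is the majority, class 1 the minority.
X = [[0.0], [1.0], [2.5], [3.5], [4.9], [5.0], [5.6], [6.0], [6.5]]
y = [0, 0, 0, 0, 0, 1, 0, 1, 1]

X_oss, y_oss = one_sided_selection(X, y, majority_label=0)
print(list(zip(X_oss, y_oss)))
```

Which majority samples survive depends on the random starting sample, but every minority sample is always kept, and the two majority points lying inside the minority region (4.9 and 5.6) are always removed in the Tomek Links step.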
