Multivariate Normal Distribution based Oversampling
Updated Mar 18, 2019 - Jupyter Notebook
Detecting fraud in online transactions using anomaly detection techniques such as over-sampling and under-sampling. Because the ratio of frauds is less than 0.00005, simply applying a classification algorithm may result in overfitting.
Multivariate Normal Distribution Based Over-Sampling for Numerical and Categorical Features
The aim of this project is to determine which classification technique produces the best results when applied to the task of determining credit riskiness.
Genre Identification task along with Text Analytics with Multi-Class and Imbalanced Learning on Gutenberg Corpus
Repo contains scripts to perform data analysis on structured data. It also provides a comparison of various ML algorithms at different stages of data preparation.
Credit Card Fraud detection based on anonymized data using multiple classification algorithms
Kaggle Competition: Predictions of West Nile Virus outbreaks in the City of Chicago.
In this project, data analytics is used to analyze customer-level data of a leading telecom firm, build predictive models to identify customers at high risk of churn, and identify the main indicators of churn. The project focuses on a four-month window, wherein the first two months are the ‘good’ phase, the third month is the ‘action’ phase, whi…
Predicts which employees qualify for promotion using classification
Football Positions: A Multi-class Classification Problem
Supervised Machine Learning and Credit Risk
Use random forest, gradient boosting, neural network, with SMOTE-ENN and random over-sampling
Predicting student admission with Logistic Regression, Decision Tree, SVM (SVC) and Random Forest
Trained and evaluated two supervised machine learning models using original and resampled data to identify 'healthy loan' and 'high risk loan' applicants from financial disclosures.
Imbalanced data is common in the real world, especially in anomaly-detection tasks. Handling the imbalance matters for these tasks, because otherwise predictions are biased towards the majority class. RandomOverSampler, SMOTE, and ADASYN are useful oversampling tools that generate additional data for minority classes to balance the dataset.
Synthetic Minority Over-Sampling Technique for Regression
RCSMOTE: Range-Controlled Synthetic Minority Over-sampling Technique for handling the class imbalance problem
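Several of the projects listed above rely on random over-sampling and SMOTE-style interpolation. The following is a minimal pure-Python sketch of both ideas, not the implementation used by any of these repositories or by the imbalanced-learn library. The function names `random_oversample` and `smote_like`, the toy data, and the parameter defaults are assumptions made for illustration. ADASYN, which additionally weights how many synthetic points each minority sample receives, is not shown.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate rows of smaller classes at random until every class
    has as many rows as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        rows = [x for x, lbl in zip(X, y) if lbl == label]
        for _ in range(target - n):
            X_out.append(list(rng.choice(rows)))  # copy an existing row
            y_out.append(label)
    return X_out, y_out


def smote_like(minority, n_new, k=3, seed=0):
    """SMOTE-style sketch: pick a minority row, pick one of its k nearest
    minority neighbours, and place a new point on the segment between them.
    Assumes numeric features and at least two minority rows."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority rows by squared Euclidean distance to base.
        neighbours = sorted(
            (row for row in minority if row is not base),
            key=lambda row: sum((a - b) ** 2 for a, b in zip(base, row)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic


# Toy data (made up): four majority rows (label 0), two minority rows (label 1).
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2], [5.0, 5.1], [5.2, 4.9]]
y = [0, 0, 0, 0, 1, 1]

X_bal, y_bal = random_oversample(X, y)
minority_rows = [x for x, lbl in zip(X, y) if lbl == 1]
new_rows = smote_like(minority_rows, n_new=2, k=1)
```

Random over-sampling only repeats existing rows, whereas the interpolation step creates new points that lie between existing minority samples. In either case, resampling is normally applied to the training split only, so that duplicated or synthetic rows do not leak into evaluation data.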