Predictive Modeling TA- Fraud IEEE case study

Predictive modeling on a fraud dataset. These notebooks were created as part of my role as a predictive modeling teaching assistant for master's students, to complement the course theory on the data science pipeline.

Putting data science theory into practice on imbalanced fraud data in Colab Python notebooks. (The full notebooks that I shared with my students can be found here)

Performing exploratory data analysis (EDA) using NumPy, pandas, Matplotlib, seaborn, SciPy, and Plotly in Python.
Exploring the pros and cons of different methods to handle missing data, outliers, and transformations.
- Handling missing data: dropping rows with missing values, filling with ‘NaN’ and ‘0’, forward- and back-filling, filling with the mode or the mean, filling nulls according to the column's distribution, and interpolating.
- Transforming the data according to whether the distribution’s skew is positive or negative.
- Removing outliers according to quantiles and kurtosis.
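The three handling steps above can be sketched on a toy frame. This is a minimal illustration, not the notebooks' code: the column names, the 0.5 skew threshold, the log-based transforms, the IQR fences, and the reading of "fill by distribution" as sampling from the observed values are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Toy frame with hypothetical column names (not the IEEE fraud schema).
df = pd.DataFrame({
    "amount": [10.0, np.nan, 12.0, 11.0, 500.0, np.nan, 9.0, 13.0],
    "card_type": ["visa", "visa", None, "master", "visa", "master", None, "visa"],
})
amount = df["amount"]

# --- Missing data: several of the fills listed above ---
dropped = df.dropna()                        # keep only fully observed rows
zero_filled = amount.fillna(0)               # constant fill
ffilled = amount.ffill().bfill()             # forward fill, then back fill any leading gap
mean_filled = amount.fillna(amount.mean())   # numeric column: mean
mode_filled = df["card_type"].fillna(df["card_type"].mode()[0])  # categorical: mode
interpolated = amount.interpolate()          # linear interpolation between neighbours

# Fill by distribution (assumed meaning): draw replacements from observed values.
rng = np.random.default_rng(0)
observed = amount.dropna().to_numpy()
missing = amount.isna()
dist_filled = amount.copy()
dist_filled.loc[missing] = rng.choice(observed, size=int(missing.sum()))


# --- Transformation chosen by the sign of the skew ---
def transform_by_skew(s, threshold=0.5):
    """Log-compress the long tail; the threshold is an arbitrary illustration."""
    skew = s.skew()
    if skew > threshold:       # long right tail
        return np.log1p(s - s.min())
    if skew < -threshold:      # long left tail: reflect, compress, restore order
        return -np.log1p(s.max() - s)
    return s


# --- Outliers: kurtosis as a tail diagnostic, quantiles to set the fences ---
def remove_outliers_iqr(s, k=1.5):
    """Drop values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = s.quantile(0.25), s.quantile(0.75)
    iqr = q3 - q1
    return s[(s >= q1 - k * iqr) & (s <= q3 + k * iqr)]


clean = amount.dropna()
excess_kurtosis = clean.kurt()   # 0 for a normal distribution; large means heavy tails
trimmed = remove_outliers_iqr(clean)
transformed = transform_by_skew(clean)
```

Each fill has a cost worth comparing: constant and mean fills shrink the variance, forward and back fills assume the row order is meaningful, and sampling from the observed values preserves the marginal distribution but not the relationship to other columns.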
Feature selection using correlation and mutual information.
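A minimal sketch of the two filters, assuming synthetic data and scikit-learn's `mutual_info_classif`. The feature names and the 0.95 correlation cutoff are hypothetical choices for the example, not values taken from the notebooks.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import mutual_info_classif

# Synthetic data: one informative feature, a near-duplicate of it, and pure noise.
rng = np.random.default_rng(0)
n = 500
signal = rng.normal(size=n)
X = pd.DataFrame({
    "signal": signal,
    "signal_copy": 2 * signal + rng.normal(scale=0.01, size=n),
    "noise": rng.normal(size=n),
})
y = (signal > 1.0).astype(int)   # imbalanced binary target driven by "signal"

# Correlation filter: look only at the upper triangle so each pair is seen once,
# then drop the later column of any pair whose absolute correlation exceeds 0.95.
corr = X.corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
to_drop = [col for col in upper.columns if (upper[col] > 0.95).any()]
X_reduced = X.drop(columns=to_drop)

# Mutual information between each remaining feature and the target.
mi = pd.Series(
    mutual_info_classif(X_reduced, y, random_state=0),
    index=X_reduced.columns,
).sort_values(ascending=False)
print(to_drop)
print(mi)
```

Correlation only removes redundancy between features, while mutual information ranks each feature against the target and can also pick up non-linear dependence, which is why the two are used together.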
Handling categorical features using pandas get_dummies.
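As a small illustration of one-hot encoding with `pd.get_dummies`, the sketch below encodes a hypothetical `card_type` column and then aligns the test columns to the training columns. The alignment step is an addition for the example, since categories can differ between splits.

```python
import pandas as pd

# Hypothetical train and test frames with one categorical column.
train = pd.DataFrame({
    "card_type": ["visa", "master", "visa", "amex"],
    "amount": [10, 20, 30, 40],
})
test = pd.DataFrame({
    "card_type": ["visa", "discover"],
    "amount": [15, 25],
})

# One indicator column per category seen in the training data.
train_enc = pd.get_dummies(train, columns=["card_type"])

# Encode the test split, then reindex to the training columns:
# categories unseen in training are dropped, absent ones are filled with 0.
test_enc = pd.get_dummies(test, columns=["card_type"]).reindex(
    columns=train_enc.columns, fill_value=0
)
print(list(train_enc.columns))
```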
Handling imbalanced data by using SMOTE nested within K-Fold cross-validation, so that oversampling is applied only to the training part of each fold. Balancing the selection of positive and negative target rows for the cross-validation by sampling the two classes separately.
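The point of nesting is that synthetic rows are generated from the training part of each fold only, so the validation part stays untouched. The sketch below assumes synthetic data, `StratifiedKFold` as the way of keeping the class ratio per fold, and a simplified hand-written SMOTE-style oversampler (`smote_like`, a hypothetical helper) standing in for the `SMOTE` class from imbalanced-learn.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import StratifiedKFold
from sklearn.neighbors import NearestNeighbors


def smote_like(X_min, n_new, k=5, seed=0):
    """Simplified SMOTE: interpolate between minority rows and their neighbours."""
    rng = np.random.default_rng(seed)
    k = min(k, len(X_min) - 1)
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    _, idx = nn.kneighbors(X_min)                 # column 0 is the row itself
    base = rng.integers(0, len(X_min), size=n_new)
    neighbour = idx[base, rng.integers(1, k + 1, size=n_new)]
    gap = rng.random((n_new, 1))                  # position along the segment
    return X_min[base] + gap * (X_min[neighbour] - X_min[base])


# Synthetic imbalanced data (about 5% positives), not the fraud dataset.
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.95, 0.05], random_state=0
)

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = []
for fold, (train_idx, val_idx) in enumerate(skf.split(X, y)):
    X_tr, y_tr = X[train_idx], y[train_idx]

    # Oversample the minority class using training rows only.
    n_new = int((y_tr == 0).sum() - (y_tr == 1).sum())
    X_syn = smote_like(X_tr[y_tr == 1], n_new, seed=fold)
    X_bal = np.vstack([X_tr, X_syn])
    y_bal = np.concatenate([y_tr, np.ones(n_new, dtype=y_tr.dtype)])

    # Fit on the balanced training fold, score on the untouched validation fold.
    model = LogisticRegression(max_iter=1000).fit(X_bal, y_bal)
    scores.append(f1_score(y[val_idx], model.predict(X[val_idx])))

print(np.round(scores, 3))
```

Oversampling before splitting would let synthetic points derived from validation rows leak into training and inflate the scores, which is the failure mode the nested arrangement avoids.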
Applying a Logistic Regression machine learning model (intentionally, for the purpose of exploring the consequences of the data handling; a Decision Tree is a better-fitting model for this type of data).
Evaluating accuracy, the confusion matrix (precision and recall), AUC (area under the ROC curve), and F1-score.
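The metrics listed above can be computed with scikit-learn as in the following sketch. The labels and scores are made up for illustration, and the 0.5 decision threshold is an assumption.

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
)

# Made-up labels and predicted probabilities for ten transactions.
y_true = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1])
y_score = np.array([0.1, 0.2, 0.3, 0.4, 0.6, 0.2, 0.7, 0.8, 0.4, 0.9])
y_pred = (y_score >= 0.5).astype(int)      # hard labels at a 0.5 threshold

cm = confusion_matrix(y_true, y_pred)       # rows: true class, columns: predicted
tn, fp, fn, tp = cm.ravel()

accuracy = accuracy_score(y_true, y_pred)   # (tp + tn) / total = 0.8
precision = precision_score(y_true, y_pred) # tp / (tp + fp) = 0.75
recall = recall_score(y_true, y_pred)       # tp / (tp + fn) = 0.75
f1 = f1_score(y_true, y_pred)               # harmonic mean of the two = 0.75
auc = roc_auc_score(y_true, y_score)        # uses the scores, not the hard labels

print(cm)
print(accuracy, precision, recall, f1, auc)
```

On imbalanced data accuracy alone is misleading, since predicting the majority class everywhere already scores highly; precision, recall, F1, and AUC describe how the minority (fraud) class is actually handled.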

Calculate mutual_info for numeric in batches.ipynb
Correlation.ipynb
DataSet.ipynb
EDA Table.ipynb
EDA- Perform transformation for float.ipynb
EDA- handle NULLS (distribution).ipynb
EDA- handle NULLS (mean-mode).ipynb
EDA- outliers.ipynb
EDA.ipynb
Full EDA
Get dummies fraud.ipynb
Imbalance Model CV and testing with Dummies.ipynb
Imbalance Model CV and testing.ipynb
Imbalance Model Comparing.ipynb
LICENSE
Mutual_info.ipynb
README.md
SMOTE.ipynb
Split to train_test.ipynb
isNAN.ipynb
null drop.ipynb

dinbav/Predictive-Modeling-TA--Fraud-case-study
