bharadwajrathod/MachineLearning_Miniproject

Scope Of Project :

Feature Extraction :

  • Data preprocessing & data visualisation (checking how the samples are distributed across the classes in our dataset).
  • Oversampling & undersampling (if the data is imbalanced, balance the classes by levelling up the minority class or cutting down the majority class).
  • Normalisation (rescaling larger values into a smaller range).
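
The balancing and normalisation steps above can be sketched in plain Python. This is only an illustration, not the project's actual code: the helper names `random_oversample` and `min_max_scale`, the toy transaction amounts, and the choice of min-max scaling are all assumptions made for the example.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def min_max_scale(column):
    """Rescale a list of numbers into the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


# Hypothetical toy data: four legitimate transactions (0) and one fraud (1).
amounts = [[120.0], [80.0], [4000.0], [15.0], [950.0]]
labels = [0, 0, 0, 0, 1]

balanced_rows, balanced_labels = random_oversample(amounts, labels)
class_counts = Counter(balanced_labels)
scaled = min_max_scale([row[0] for row in amounts])
```

In practice, oversampling is usually applied to the training split only, so that duplicated rows do not leak into the validation or test data.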

Training Algorithm :

  • Divide the data set into three parts :
  1. Training set
  2. Validation set
  3. Testing set
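
A three-way split like the one listed above might look as follows. The 60/20/20 proportions, the seed, and the function name `train_val_test_split` are assumptions for illustration; the README does not state which ratios were used.

```python
import random


def train_val_test_split(items, val_frac=0.2, test_frac=0.2, seed=42):
    """Shuffle the items, then cut them into train, validation and test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    n_val = int(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


# 100 hypothetical sample indices split 60 / 20 / 20.
train, val, test = train_val_test_split(range(100))
```

The validation set is typically used for tuning and model selection, while the test set is kept aside for the final accuracy figures.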

Machine learning models and algorithms for fraud detection :

  1. Logistic Regression
  2. Support Vector Classifier (SVC)
  3. Decision Tree
  4. K-Nearest Neighbors (KNN) Classifier
  5. Single Layer Perceptron (SLP)
  6. Multi Layer Perceptron (MLP) Classifier
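
To give a feel for the simplest model in this list, here is a minimal sketch of a single layer perceptron trained with the classic perceptron update rule. It is not the project's implementation (which most likely relied on a library); the function names, learning rate, epoch count and toy data are all hypothetical.

```python
def predict(w, b, xi):
    """Return 1 if the weighted sum plus bias is positive, else 0."""
    return 1 if sum(wj * xj for wj, xj in zip(w, xi)) + b > 0 else 0


def train_perceptron(X, y, epochs=100, lr=0.1):
    """Perceptron rule: adjust weights only when a row is misclassified."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = yi - predict(w, b, xi)
            if err != 0:
                w = [wj + lr * err * xj for wj, xj in zip(w, xi)]
                b += lr * err
    return w, b


# Hypothetical, linearly separable toy data: label 1 only when both features are 1.
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
y = [0, 0, 0, 1]

w, b = train_perceptron(X, y)
predictions = [predict(w, b, xi) for xi in X]
```

A single layer can only learn linearly separable boundaries, which is one plausible reason a multi layer model can score higher on the same data.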

Results of above models :

  • Using Logistic Regression, I got 80% accuracy on the test data set.
  • Using K-Nearest Neighbors (KNN), I got 80% accuracy on the test data set, the same as Logistic Regression.
  • Using the Multi Layer Perceptron (MLP) Classifier, I got 84.168% accuracy on the test data set, the highest of all the models.
  • Using the Single Layer Perceptron (SLP), I got 72.5% accuracy on the test data set, the lowest of all the models.
  • Using the Support Vector Classifier (SVC), I got 80% accuracy on the test data set, the same as Logistic Regression and KNN.
  • Using the Decision Tree Classifier, I got 83% accuracy on the test data set, second only to the MLP.
  • Comparing the accuracy of the above models, the best model is the “MLP Classifier”.
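
The comparison above can be reproduced from the reported figures alone. The snippet below only ranks the accuracies quoted in this README; the dictionary keys are labels chosen for the example.

```python
# Test-set accuracies (in percent) as reported in the results above.
reported_accuracy = {
    "Logistic Regression": 80.0,
    "K-Nearest Neighbors": 80.0,
    "MLP Classifier": 84.168,
    "Single Layer Perceptron": 72.5,
    "Support Vector Classifier": 80.0,
    "Decision Tree Classifier": 83.0,
}

# Rank models from highest to lowest accuracy and pick the top one.
ranking = sorted(reported_accuracy.items(), key=lambda kv: kv[1], reverse=True)
best_model = ranking[0][0]

for name, acc in ranking:
    print(f"{name:28s} {acc:.3f}%")
```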

ROC_SCORE and ROC_CURVE :

  • Checking the True Positive Rate against the False Positive Rate by plotting them at different classification thresholds
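
As a rough sketch of what goes into a ROC curve, the code below sweeps the decision threshold over every distinct score, records a (False Positive Rate, True Positive Rate) point at each step, and integrates the curve with the trapezoid rule to get the ROC score (AUC). The function names and the four toy scores are assumptions for illustration, not the project's data.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) points, sweeping the threshold from high to low."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= thr and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= thr and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the curve using the trapezoid rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical labels (1 = fraud) and predicted fraud probabilities.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

points = roc_points(y_true, scores)
roc_auc = auc(points)  # 0.75 for this toy example
```

Plotting the first element of each point on the x-axis and the second on the y-axis gives the ROC curve itself.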

Confusion Matrix :

  • Summarises the performance of a given classifier
  • Confusion Matrix gives us a comparison between actual and predicted values
  • ACC = (TP + TN)/(TP + FP + FN + TN)
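
The accuracy formula above can be checked on a small example. The labels and predictions below are made up for illustration, and the helper names are hypothetical; TP, FP, FN and TN are counted for the positive class 1.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """ACC = (TP + TN) / (TP + FP + FN + TN)."""
    return (tp + tn) / (tp + fp + fn + tn)


# Hypothetical actual vs. predicted labels.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
acc = accuracy(tp, fp, fn, tn)  # (3 + 3) / 8 = 0.75
```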

K-Fold Cross Validation :

  • Cross-validation is a resampling procedure used to evaluate machine learning models on a limited data sample.
  • The procedure has a single parameter, k, which is the number of groups that the data sample is split into. Each group is used once as the test fold while the remaining k - 1 groups are used for training.
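
The splitting step of k-fold cross-validation can be sketched as below. This version assumes no shuffling and uses a hypothetical helper name, `k_fold_indices`; when the sample count is not divisible by k, the first folds get one extra sample.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs; each sample is in a test fold exactly once."""
    indices = list(range(n_samples))
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# 10 hypothetical samples split into k = 3 folds of sizes 4, 3 and 3.
folds = list(k_fold_indices(10, 3))
for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

A model would be trained and scored once per fold, and the k scores averaged to give the cross-validated estimate.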

Results :

About

ML-based Fraud Detection Algorithms
