Skip to content

Assessment solution for Arya-ai Binary classification assessment

License

Notifications You must be signed in to change notification settings

man-o-war/Binary-Classification-XGB

Repository files navigation

Binary-Classification-XGB

Assessment solution for Arya-ai Binary classification problem


Project File Struture:

Root /fldr

  • Alpha&Omega.ipynb <---- Main Jupyter notebook (Forgot to change name)
  • Assignement - Data Scientist (1).docx <---- Assessment problem document
  • Testing_predicitons.csv <---- Target class outputs of the Test Data
  • README.md <---- this very file
  • XGBFTW.sav <---- XGBoost model export done over using pickle
  • requirements.txt <---- Environment Screenshot
  • essentials_only_req.txt <---- ipynb specific requirements
  • Data /fldr
    • Training_set.csv <---- Training Dataset
    • Test_set.csv <---- Testing Dataset

Data Stats:

  • Train Dataset Shape -> (3910,58)
  • Test Dataset Shape -> (691,57)
  • Dataset is Sparse and High Dimensional
  • Features are highly skewed

Key Decisions:

  • Used RandomForest Classifier for feature selection.
  • Selected top 30 features with respect to their feature importance.
  • For metric considered Binary CrossEntropy | LogLoss and ROC-AUC score.
  • Model of choice is Xgboost.

Process Flow - Main.ipynb (Alpha&Omega.ipynb)

  1. EDA
  2. Splitting the data
  3. Feature Selection
  4. Data Scaling - Normalization
  5. Model Training
  6. Prediction Metrics
  7. Processing and Predicting on Test Data
  8. Saving Model for Future Usage
  9. Exporting Y_test Predicted scores
  10. Generating requirements. #Has an important Note. Must Read!

Process Flow - Performance_print.py

  1. Splitting the Data
  2. Feature Selection
  3. Importing Presaved model
  4. Using presaved model to generate scores
  5. Using Prettytable to print output table

Had fun making this!!

About

Assessment solution for Arya-ai Binary classification assessment

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published