MaurovX/Nanodegree_Capstone

Nanodegree_Capstone

Capstone Project for Nanodegree in Data Science

Read blog post at: https://mauricio-jac2.medium.com/udacity-data-science-nanodegree-capstone-project-fd365c9ba059

Table of Contents

  1. Installation
  2. Project Motivation
  3. File Descriptions
  4. Results
  5. Licensing, Authors, and Acknowledgements

Installation

This project uses LIME, a machine learning interpretability (MLI) library, to explain classification models.

Project Motivation

As part of the Udacity Data Science Nanodegree, students are required to develop a final project using public or provided data. I chose the GiveMeSomeCredit dataset because it relates to my current role in the organization financing my studies. The results of this project are summarized in the file Capstone_Project.html.

All files in this repo are public and free to use.

File Descriptions

There are two notebooks in this repo. The first, Credit_EDA, is an exploratory data analysis (EDA) of the dataset, where we seek to understand the data and plot our insights. The second, Capstone_Project, addresses the main question: how to implement explanations at the correct level of granularity.

Results / Reflection

To recap, we:

  • Used relevant Credit Risk data from a global competition
  • Explored and processed the data
  • Trained a Random Forest classifier and then iterated to find the best parameters
  • Displayed Variable Importance for the results
  • Implemented a Machine Learning Interpretability technique, LIME, to explain the effect of each feature on the predicted probability for individual records
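
The training and variable-importance steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the data is a synthetic stand-in generated with scikit-learn (the real project uses the Kaggle credit data), and the feature names, parameter grid, and scoring metric are all assumptions chosen for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic, imbalanced stand-in for the credit data (assumption: the real
# notebook loads the Kaggle dataset instead).
X, y = make_classification(
    n_samples=500, n_features=6, n_informative=4,
    weights=[0.9, 0.1], random_state=0,
)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]  # hypothetical names

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Small illustrative grid; the grid used in the project is not shown in this README.
param_grid = {"n_estimators": [50, 100], "max_depth": [3, 6]}
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    scoring="roc_auc",
    cv=3,
)
search.fit(X_train, y_train)
best_model = search.best_estimator_

# Impurity-based variable importance, sorted from most to least important.
importances = sorted(
    zip(feature_names, best_model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in importances:
    print(f"{name}: {score:.3f}")
```

Variable importance of this kind is a global summary of the model; it does not say how a feature moved the prediction for one particular applicant, which is the gap the LIME step is meant to fill.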

The last part proved the most difficult. LIME is still in its early releases: it is not yet fully scalable, and the explainer's GUI elements are still somewhat inflexible, but the power of explaining predictions at the record level is impressive.

The results are rendered in an HTML file named Capstone_Project.html.

Licensing, Authors, Acknowledgements

Credit goes to Kaggle for the data. You can find the licensing for the data and other descriptive information at the Kaggle link available here. Otherwise, feel free to use the code here as you would like!
