# Case Study 4 - Decision Trees

__Team Members:__ Amber Clark, Andrew Leppla, Jorge Olmos, Paritosh Rai

# Team Strategy

# Content
* [Business Understanding](#business-understanding)
    - [Scope](#scope)
    - [Introduction](#introduction)
    - [Methods](#methods)
    - [Results](#results)
* [Data Evaluation](#data-evaluation)
    - [Loading Data](#loading-data) 
    - [Data Summary](#data-summary)
    - [Missing Values](#missing-values)
    - [Feature Removal](#feature-removal)
    - [Exploratory Data Analysis (EDA)](#eda)
    - [Assumptions](#assumptions)
* [Model Preparations](#model-preparations)
    - [Sampling & Scaling Data](#sampling-scaling-data)
    - [Proposed Method](#proposed-metrics)
    - [Evaluation Metrics](#evaluation-metrics)
    - [Feature Selection](#feature-selection)
* [Model Building & Evaluations](#model-building)
    - [Sampling Methodology](#sampling-methodology)
    - [Model](#model)
    - [Performance Analysis](#performance-analysis)
* [Model Interpretability & Explainability](#model-explanation)
    - [Examining Feature Importance](#examining-feature-importance)
* [Conclusion](#conclusion)
    - [Final Model Proposal](#final-model-proposal)
    - [Future Considerations and Model Enhancements](#model-enhancements)
    - [Alternative Modeling Approaches](#alternative-modeling-approaches)

# Business Understanding & Executive Summary <a id='business-understanding'/>

What are we trying to solve for and why is it important?


### Scope <a id='scope'/>


### Introduction <a id='introduction'/>


### Methods <a id='methods'/>
 
 
### Results <a id='results'/>
 

# Data Evaluation/Engineering <a id='data-evaluation'/>
    

What data is being used, and how can it be summarized?

Are there missing values?

Which variables are needed and which are not?

What assumptions or conclusions are you drawing about your data?

| Variable | Description                                                                                                         |
|----------|---------------------------------------------------------------------------------------------------------------------|
| X1       | net profit / total assets                                                                                           |
| X2       | total liabilities / total assets                                                                                    |
| X3       | working capital / total assets                                                                                      |
| X4       | current assets / short-term liabilities                                                                             |
| X5       | [(cash + short-term securities + receivables - short-term liabilities) / (operating expenses - depreciation)] * 365 |
| X6       | retained earnings / total assets                                                                                    |
| X7       | EBIT / total assets                                                                                                 |
| X8       | book value of equity / total liabilities                                                                            |
| X9       | sales / total assets                                                                                                |
| X10      | equity / total assets                                                                                               |
| X11      | (gross profit + extraordinary items + financial expenses) / total assets                                            |
| X12      | gross profit / short-term liabilities                                                                               |
| X13      | (gross profit + depreciation) / sales                                                                               |
| X14      | (gross profit + interest) / total assets                                                                            |
| X15      | (total liabilities * 365) / (gross profit + depreciation)                                                           |
| X16      | (gross profit + depreciation) / total liabilities                                                                   |
| X17      | total assets / total liabilities                                                                                    |
| X18      | gross profit / total assets                                                                                         |
| X19      | gross profit / sales                                                                                                |
| X20      | (inventory * 365) / sales                                                                                           |
| X21      | sales (n) / sales (n-1)                                                                                             |
| X22      | profit on operating activities / total assets                                                                       |
| X23      | net profit / sales                                                                                                  |
| X24      | gross profit (in 3 years) / total assets                                                                            |
| X25      | (equity - share capital) / total assets                                                                             |
| X26      | (net profit + depreciation) / total liabilities                                                                     |
| X27      | profit on operating activities / financial expenses                                                                 |
| X28      | working capital / fixed assets                                                                                      |
| X29      | logarithm of total assets                                                                                           |
| X30      | (total liabilities - cash) / sales                                                                                  |
| X31      | (gross profit + interest) / sales                                                                                   |
| X32      | (current liabilities * 365) / cost of products sold                                                                 |
| X33      | operating expenses / short-term liabilities                                                                         |
| X34      | operating expenses / total liabilities                                                                              |
| X35      | profit on sales / total assets                                                                                      |
| X36      | total sales / total assets                                                                                          |
| X37      | (current assets - inventories) / long-term liabilities                                                              |
| X38      | constant capital / total assets                                                                                     |
| X39      | profit on sales / sales                                                                                             |
| X40      | (current assets - inventory - receivables) / short-term liabilities                                                 |
| X41      | total liabilities / ((profit on operating activities + depreciation) * (12/365))                                    |
| X42      | profit on operating activities / sales                                                                              |
| X43      | rotation receivables + inventory turnover in days                                                                   |
| X44      | (receivables * 365) / sales                                                                                         |
| X45      | net profit / inventory                                                                                              |
| X46      | (current assets - inventory) / short-term liabilities                                                               |
| X47      | (inventory * 365) / cost of products sold                                                                           |
| X48      | EBITDA (profit on operating activities - depreciation) / total assets                                               |
| X49      | EBITDA (profit on operating activities - depreciation) / sales                                                      |
| X50      | current assets / total liabilities                                                                                  |
| X51      | short-term liabilities / total assets                                                                               |
| X52      | (short-term liabilities * 365) / cost of products sold                                                              |
| X53      | equity / fixed assets                                                                                               |
| X54      | constant capital / fixed assets                                                                                     |
| X55      | working capital                                                                                                     |
| X56      | (sales - cost of products sold) / sales                                                                             |
| X57      | (current assets - inventory - short-term liabilities) / (sales - gross profit - depreciation)                       |
| X58      | total costs / total sales                                                                                           |
| X59      | long-term liabilities / equity                                                                                      |
| X60      | sales / inventory                                                                                                   |
| X61      | sales / receivables                                                                                                 |
| X62      | (short-term liabilities * 365) / sales                                                                              |
| X63      | sales / short-term liabilities                                                                                      |
| X64      | sales / fixed assets                                                                                                |
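As a quick illustration of how the ratios in the table are formed, the sketch below computes X1, X2, and X20 from a handful of balance-sheet figures. The figures are hypothetical and chosen only to make the arithmetic easy to follow; they are not taken from the dataset.

```python
# Hypothetical balance-sheet figures (illustrative only, not from the dataset)
net_profit = 120.0
total_assets = 1500.0
total_liabilities = 900.0
inventory = 200.0
sales = 2000.0

# X1: net profit / total assets
x1 = net_profit / total_assets

# X2: total liabilities / total assets
x2 = total_liabilities / total_assets

# X20: (inventory * 365) / sales, i.e. inventory expressed in days of sales
x20 = (inventory * 365) / sales

print(f"X1 = {x1:.3f}, X2 = {x2:.3f}, X20 = {x20:.1f} days")
```

Several ratios share a denominator (total assets, sales, short-term liabilities), so a zero or missing value in one raw figure can affect many variables at once.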

# standard libraries
import pandas as pd
import numpy as np
import re
import os
from IPython.display import Image
import sklearn
import time

# Optional: uncomment to suppress warnings (including FutureWarning)
# import warnings
# from warnings import simplefilter
# warnings.filterwarnings('ignore')
# simplefilter(action='ignore', category=FutureWarning)



## Loading Data <a id='loading-data'/>

## Missing Values <a id='missing-values'/>
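One possible way to answer the missing-values question is to tabulate the count and percentage of missing entries per variable. The sketch below uses a small hand-made `DataFrame` as a stand-in; the values and the choice of columns are assumptions for illustration, not the actual data.

```python
import numpy as np
import pandas as pd

# Stand-in frame with a few of the X variables (values are made up)
df = pd.DataFrame({
    "X1": [0.10, np.nan, 0.05, 0.20],
    "X2": [0.60, 0.70, np.nan, np.nan],
    "X3": [0.30, 0.25, 0.40, 0.10],
})

# Count and percentage of missing values per column, worst first
missing = pd.DataFrame({
    "n_missing": df.isna().sum(),
    "pct_missing": df.isna().mean() * 100,
}).sort_values("pct_missing", ascending=False)

print(missing)
```

A table like this makes it easier to decide, per variable, whether to impute, drop the column, or leave the gaps for a model that tolerates them.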



## Feature Removal <a id='feature-removal'/>

## Exploratory Data Analysis (EDA) <a id='eda'/>

# Model Preparations <a id='model-preparations'/>

## Proposed Method <a id='proposed-metrics' />

Which methods are you proposing to utilize to solve the problem?

Why is this method appropriate given the business objective? 

How will you determine if your approach is useful (or how will you differentiate which approach is more useful than another)?  

## Evaluation Metrics <a id='evaluation-metrics' />

More specifically, which evaluation metrics are most useful given that this is a binary classification problem (e.g., accuracy, F1-score, precision, recall, AUC)?
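To make the candidate metrics concrete, here is a minimal sketch that computes each of them with `sklearn.metrics` on a tiny hand-made example. The labels and scores are invented for illustration, and the 0.5 decision threshold is an assumption rather than a recommendation.

```python
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
)

# Invented example: 8 non-bankrupt (0) and 2 bankrupt (1) companies
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_score = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.60, 0.40, 0.90]

# Turn scores into class predictions with an assumed 0.5 threshold
y_pred = [1 if s >= 0.5 else 0 for s in y_score]

metrics = {
    "accuracy": accuracy_score(y_true, y_pred),
    "precision": precision_score(y_true, y_pred),
    "recall": recall_score(y_true, y_pred),
    "f1": f1_score(y_true, y_pred),
    "auc": roc_auc_score(y_true, y_score),  # uses scores, not thresholded labels
}

for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

In this toy case accuracy is 0.80 even though only one of the two positive cases is caught (recall 0.50), which is why accuracy alone can be misleading when one class is rare.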

# Model Building & Evaluations <a id='model-building'/>

In this case, the primary task is to build both a Random Forest model and an XGBoost model that accurately predict bankruptcy. This involves the following steps:

- Specify your sampling methodology
- Set up your models - highlighting any important parameters
- Analyze each model's performance - referencing your chosen evaluation metric (including supplemental visuals and analysis where appropriate)
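The three steps above could be sketched roughly as follows for the Random Forest half of the task. This is a minimal sketch on synthetic, imbalanced data from `make_classification`; the class ratio, the hyperparameters, and the use of `class_weight="balanced"` are all assumptions for illustration, not tuned choices. XGBoost is left out to keep the sketch dependency-light; its `XGBClassifier` exposes a scikit-learn-style `fit`/`predict_proba` interface, so it could be swapped in at the model step.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the bankruptcy data (about 5% positive class, assumed)
X, y = make_classification(
    n_samples=1000,
    n_features=10,
    n_informative=5,
    weights=[0.95, 0.05],
    random_state=0,
)

# Step 1: sampling. A stratified split keeps the class ratio similar in both sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Step 2: model setup. class_weight="balanced" reweights the rare class.
model = RandomForestClassifier(
    n_estimators=100, class_weight="balanced", random_state=0
)
model.fit(X_train, y_train)

# Step 3: performance. AUC is computed on predicted probabilities of class 1.
proba = model.predict_proba(X_test)
auc = roc_auc_score(y_test, proba[:, 1])
print(f"Test AUC: {auc:.3f}")
```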

## Sampling Methodology <a id='sampling-methodology'/>

## Modeling <a id='model'/>

## Model's Performance Analysis <a id='performance-analysis'/>

# Model Interpretability & Explainability <a id='model-explanation'/>

Which variables were more important and why?

How did you come to the conclusion that these variables were important, and how should the audience interpret this?

## Examining Feature Importance <a id='examining-feature-importance'/>
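Two common ways to examine feature importance for a tree ensemble are the impurity-based `feature_importances_` attribute and `sklearn.inspection.permutation_importance` evaluated on held-out data. The sketch below compares both on synthetic data; the dataset, the `X1`..`X8` feature names, and the model settings are placeholders, not the fitted case-study model.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic stand-in data with placeholder feature names
X, y = make_classification(
    n_samples=600, n_features=8, n_informative=3, n_redundant=0, random_state=0
)
feature_names = [f"X{i + 1}" for i in range(X.shape[1])]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Impurity-based importance: derived from the training data, sums to 1
impurity_importance = dict(zip(feature_names, model.feature_importances_))

# Permutation importance: drop in test AUC when one feature is shuffled
perm = permutation_importance(
    model, X_test, y_test, scoring="roc_auc", n_repeats=5, random_state=0
)
perm_importance = dict(zip(feature_names, perm.importances_mean))

# Print features ranked by permutation importance, highest first
ranked = sorted(feature_names, key=lambda n: perm_importance[n], reverse=True)
for name in ranked:
    print(
        f"{name}: impurity={impurity_importance[name]:.3f}, "
        f"permutation={perm_importance[name]:.3f}"
    )
```

When the two rankings disagree, the permutation result is often the easier one to explain to an audience, since it measures the change in a held-out metric directly.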

# Conclusion <a id='conclusion'/>

What are you proposing to the audience with your models and why?

How should your audience interpret your conclusion, and where should they go from here on the topic?

What other approaches do you recommend exploring?

Bring it all home!

### Final Model Proposal <a id='final-model-proposal'/>

### Future Considerations and Model Enhancements <a id='model-enhancements'/>

### Alternative Modeling Approaches <a id='alternative-modeling-approaches'/>