
# DSC410: Project Milestone 5

**Name**: Joseph Choi <br>
**Class**: DSC410-T301 Predictive Analytics (2243-1)

## Table of Contents:
1. Model Setup
2. Dash App
3. Instructions for Users
4. Cheat Sheet (Encoded Value)

## 1. Model Setup:
- Loading the dataset (preprocessed in Project Milestones 1-4)
- Renaming column headers
- Training a Random Forest Regressor model using the hyperparameters tuned in Project Milestones 1-4

**JC Comment**: Reusing code from Project Milestones 1-4

In [8]:
# Setup
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

In [9]:
# Loading in the cleaned, preprocessed, and feature engineered df from Project 1-4
healthcare_dataset_final = pd.read_csv('/content/healthcare_dataset_final.csv')

In [10]:
# Renaming specified columns for simplicity and clarity
healthcare_dataset_final = healthcare_dataset_final.rename(columns={
    'Gender Encoded': 'Gender',
    'Blood Type Encoded': 'Blood Type',
    'Insurance Provider Encoded': 'Insurance Provider',
    'Admission Type Encoded': 'Admission Type',
    'Medication Encoded': 'Medication',
    'Test Results Encoded': 'Test Results',
    'Grouped Medical Condition Encoded': 'Medical Condition'
})

In [12]:
# Printing Updated Result:
healthcare_dataset_final.head().T

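| | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| Age | 61 | 45 | 77 | 84 | 56 |
| Billing Amount | 97088 | 1405 | 9188 | 3773 | 9880 |
| Gender | 2 | 1 | 1 | 1 | 1 |
| Blood Type | 7 | 4 | 5 | 1 | 2 |
| Insurance Provider | 2 | 1 | 5 | 2 | 1 |
| Admission Type | 2 | 1 | 1 | 3 | 3 |
| Medication | 1 | 2 | 2 | 3 | 4 |
| Test Results | 2 | 3 | 3 | 1 | 3 |
| Medical Condition | 2 | 5 | 1 | 6 | 1 |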


In [14]:
# Splitting features and target variable
X = healthcare_dataset_final.drop('Billing Amount', axis=1)
y = healthcare_dataset_final['Billing Amount']

# Splitting the data into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

In [15]:
# Creating RandomForestRegressor instance with the parameters from model tuning (hyperparameters)
best_rf_regressor = RandomForestRegressor(max_depth=None, min_samples_leaf=4, min_samples_split=2, n_estimators=50)

# Fitting the model to the training data
best_rf_regressor.fit(X_train, y_train)

# Making predictions on the test set
best_y_pred = best_rf_regressor.predict(X_test)

# Calculating evaluation metrics
best_mse = mean_squared_error(y_test, best_y_pred)
best_mae = mean_absolute_error(y_test, best_y_pred)
best_r2 = r2_score(y_test, best_y_pred)

# Printing the evaluation metrics
print("Mean Squared Error (MSE):", best_mse)
print("Mean Absolute Error (MAE):", best_mae)
print("R-squared (R2) Score:", best_r2)

Mean Squared Error (MSE): 57060149.615596354
Mean Absolute Error (MAE): 4389.920827497666
R-squared (R2) Score: 0.9671802089675124


## 2. Dash App:
- Serializing the "best_rf_regressor" model for use in the Dash app
- Defining the Dash app layout
- Implementing a callback function
- Launching the Dash app

In [19]:
# Setup
import dash
from dash import dcc
from dash import html
from dash.dependencies import Input, Output, State
import joblib

In [20]:
"""
Code Description:
  - Serializing and saving model "best_rf_regressor" via "joblib"

Code Breakdown:
  - Step 1: Specifying file path
  - Step 2: Saving the "best_rf_regressor" model
"""

# Step 1:
model_path = '/content/best_rf_regressor.pkl'

# Step 2:
joblib.dump(best_rf_regressor, model_path)

['/content/best_rf_regressor.pkl']
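
The notebook saves the model to disk, while the callback further down uses the in-memory "best_rf_regressor" object. If the app were run outside this notebook, the saved file would presumably be read back with `joblib.load`. The sketch below shows that round trip under stated assumptions: the toy data, the `toy_rf_regressor.pkl` file name, and the temporary directory are hypothetical stand-ins for the real dataset and `/content/best_rf_regressor.pkl`.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Toy stand-in for the healthcare data: 8 integer-coded features, 50 rows
rng = np.random.default_rng(0)
toy_X = rng.integers(1, 6, size=(50, 8))
toy_y = rng.uniform(1000, 50000, size=50)

model = RandomForestRegressor(n_estimators=5, random_state=0).fit(toy_X, toy_y)

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file name; the notebook uses '/content/best_rf_regressor.pkl'
    path = os.path.join(tmp_dir, "toy_rf_regressor.pkl")
    joblib.dump(model, path)          # serialize the fitted model
    loaded_model = joblib.load(path)  # restore it, as a deployed app would

# The restored model should reproduce the original predictions
same = np.allclose(model.predict(toy_X), loaded_model.predict(toy_X))
print("Predictions match after reload:", same)
```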

In [22]:
"""
PART 1
------

Code Description:
  - Coding for Dash app initialization and the web application configuration
  - Setting up the app's layout with:
    - Input fields
    - Predict button to start the prediction process

Code Breakdown:
  - Step 1: Initializing Dash app
  - Step 2: Defining the app layout
  - App layout is sectioned off into 4 parts:
    - Step 3 (Title): Displaying main header of the app "Healthcare Billing Prediction" via "html.H1()"
    - Step 4 (Inputs): Creating an input area where users can fill out info for each field via "html.Div()" series
    - Step 5 (Predict Button): Creating a button that triggers the prediction when clicked via "html.Button()"
    - Step 6 (Output Display): Creating an area that shows the prediction result after the button is clicked via "html.Div()"

"""

# Step 1:
app = dash.Dash(__name__)
server = app.server

# Step 2:
app.layout = html.Div([
    # Step 3:
    html.H1("Healthcare Billing Prediction"),
    # Step 4:
    html.Div([
        html.Div([html.Label("Age:"), dcc.Input(id="age", type="number", value=30)], className="three columns"),
        html.Div([html.Label("Gender:"), dcc.Input(id="gender", type="number", value=1)], className="three columns"),
        html.Div([html.Label("Blood Type:"), dcc.Input(id="blood_type", type="number", value=1)], className="three columns"),
        html.Div([html.Label("Insurance Provider:"), dcc.Input(id="insurance_provider", type="number", value=1)], className="three columns"),
        html.Div([html.Label("Admission Type:"), dcc.Input(id="admission_type", type="number", value=1)], className="three columns"),
        html.Div([html.Label("Medication:"), dcc.Input(id="medication", type="number", value=1)], className="three columns"),
        html.Div([html.Label("Test Results:"), dcc.Input(id="test_results", type="number", value=1)], className="three columns"),
        html.Div([html.Label("Medical Condition:"), dcc.Input(id="medical_condition", type="number", value=1)], className="three columns"),
    ], className="row"),
    # Step 5:
    html.Button("Predict Billing Amount", id="predict-btn", n_clicks=0),
    # Step 6:
    html.Div(id="prediction-output", style={"marginTop": "20px"}),
])

"""
PART 2
------

Code Description:
  - Implementing callback function to handle predictions in Dash
  - Setting it up so that it updates the output display based on user inputs

Code Breakdown:
  - Step 7: Setting up Callback via "@app.callback"
    - Linking the predict button's clicks to the prediction output
    - Defining inputs and states to capture user inputs from the app's interface
      - Input:
        - Specifies user action that triggers the callback function
        - User action: Clicks on the "Predict" button
      - Output:
        - Specifies where the prediction result is displayed
        - Where? In the "children" property of the HTML element with ID "prediction-output"
      - State:
        - Gathering all the user provided data from the input fields at the time the predict button is clicked
  - Step 8: Creating the "predict" function to process inputs and generate predictions:
    - Step 9: Compiling all the input values into array format via "np.array()"
    - Step 10: Implementing the "best_rf_regressor" model to make predictions
    - Step 11: Returning prediction result. Features:
      - Displaying "Predicted Billing Amount:" and the "Predicted Output"
      - Prompting the user to "Enter patient details and click predict" if no prediction is made
  - Step 12: Running the App
    - "app.run_server(debug=True)": Making the app available for interaction.

"""

# Step 7:
@app.callback(
    Output("prediction-output", "children"),
    [Input("predict-btn", "n_clicks")],
    [State("age", "value"), State("gender", "value"), State("blood_type", "value"),
     State("insurance_provider", "value"), State("admission_type", "value"),
     State("medication", "value"), State("test_results", "value"),
     State("medical_condition", "value")]
)
# Step 8:
def predict_billing(n_clicks, age, gender, blood_type, insurance_provider, admission_type, medication, test_results, medical_condition):
    if n_clicks > 0:
        values = [age, gender, blood_type, insurance_provider, admission_type, medication, test_results, medical_condition]
        # Dash passes None for an empty numeric field, which the model cannot handle
        if any(value is None for value in values):
            return "Please fill in every field before clicking predict"
        # Step 9:
        input_features = np.array([values])
        # Step 10:
        prediction = best_rf_regressor.predict(input_features)
        # Step 11:
        return f"Predicted Billing Amount: ${prediction[0]:,.2f}"
    return "Enter patient details and click predict"

# Step 12:
if __name__ == '__main__':
    app.run_server(debug=True)

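
The callback above passes a bare NumPy array to a model that was fitted on a DataFrame, so scikit-learn warns about missing feature names and the result depends on the argument order matching the training columns. One possible alternative is to build a one-row DataFrame with named columns before predicting. In this sketch, `FEATURE_COLUMNS`, `build_feature_row`, and the toy model are hypothetical; the column order is assumed to match `X.columns` as shown in the `head().T` output earlier.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Assumed to match X.columns (the renamed columns minus 'Billing Amount')
FEATURE_COLUMNS = [
    "Age", "Gender", "Blood Type", "Insurance Provider",
    "Admission Type", "Medication", "Test Results", "Medical Condition",
]

def build_feature_row(values):
    """Return a one-row DataFrame with named columns, or None if a field is empty."""
    if len(values) != len(FEATURE_COLUMNS):
        raise ValueError(f"Expected {len(FEATURE_COLUMNS)} values, got {len(values)}")
    if any(value is None for value in values):
        return None
    return pd.DataFrame([values], columns=FEATURE_COLUMNS)

# Toy model standing in for best_rf_regressor, fitted on a DataFrame
rng = np.random.default_rng(0)
toy_X = pd.DataFrame(
    rng.integers(1, 4, size=(40, len(FEATURE_COLUMNS))), columns=FEATURE_COLUMNS
)
toy_y = rng.uniform(1000, 50000, size=40)
toy_model = RandomForestRegressor(n_estimators=5, random_state=0).fit(toy_X, toy_y)

# Same default values as the app's input fields
row = build_feature_row([30, 1, 1, 1, 1, 1, 1, 1])
prediction = toy_model.predict(row)
print(f"Predicted Billing Amount: ${prediction[0]:,.2f}")
```

Inside the callback, the helper's `None` return could be used to show a "please fill in every field" message instead of calling the model.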

## 3. Instructions for Users:
Welcome to the Healthcare Billing Prediction App. Follow the steps below to get a predicted billing amount:
- **Fill in Patient Details**: Enter the patient information in the input boxes.
- **Click "Predict Billing Amount" Button**: Once all details are entered, click the button to see the predicted billing amount.

**Note**: Age is entered as a plain number. Every other field takes an encoded value from the cheat sheet below.

## 4. Cheat Sheet (Encoded Value):
**Gender**:
- Male: 1
- Female: 2

**Blood Type**:
- A+: 1
- O+: 2
- A-: 3
- B+: 4
- B-: 5
- AB-: 6
- AB+: 7
- O-: 8

**Insurance Provider**:
- UnitedHealthcare: 1
- Blue Cross: 2
- Cigna: 3
- Aetna: 4
- Medicare: 5

**Admission Type**:
- Emergency: 1
- Elective: 2
- Urgent: 3

**Medication**:
- Aspirin: 1
- Lipitor: 2
- Penicillin: 3
- Paracetamol: 4
- Ibuprofen: 5

**Test Results**:
- Abnormal: 1
- Inconclusive: 2
- Normal: 3

**Medical Condition**:
- Pneumonia: 1
- Cancer: 2
- Diabetes: 3
- Hypertension: 4
- Bronchitis: 5
- Asthma: 6
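
The cheat sheet above can also be expressed as a lookup table, so readable labels are translated into encoded values in code rather than by hand. This is a minimal sketch, not part of the original app: `ENCODINGS`, `encode`, and `dropdown_options` are hypothetical names, and the mappings are copied from the lists above.

```python
# Encoded values copied from the cheat sheet
ENCODINGS = {
    "Gender": {"Male": 1, "Female": 2},
    "Blood Type": {"A+": 1, "O+": 2, "A-": 3, "B+": 4,
                   "B-": 5, "AB-": 6, "AB+": 7, "O-": 8},
    "Insurance Provider": {"UnitedHealthcare": 1, "Blue Cross": 2, "Cigna": 3,
                           "Aetna": 4, "Medicare": 5},
    "Admission Type": {"Emergency": 1, "Elective": 2, "Urgent": 3},
    "Medication": {"Aspirin": 1, "Lipitor": 2, "Penicillin": 3,
                   "Paracetamol": 4, "Ibuprofen": 5},
    "Test Results": {"Abnormal": 1, "Inconclusive": 2, "Normal": 3},
    "Medical Condition": {"Pneumonia": 1, "Cancer": 2, "Diabetes": 3,
                          "Hypertension": 4, "Bronchitis": 5, "Asthma": 6},
}

def encode(feature, label):
    """Translate a readable label into its encoded value."""
    try:
        return ENCODINGS[feature][label]
    except KeyError:
        raise ValueError(f"Unknown {feature!r} value: {label!r}") from None

def dropdown_options(feature):
    """Build label/value pairs in the shape dcc.Dropdown accepts for its options."""
    return [{"label": label, "value": code}
            for label, code in ENCODINGS[feature].items()]

print(encode("Blood Type", "AB+"))
print(dropdown_options("Admission Type"))
```

Passing `dropdown_options(...)` to a `dcc.Dropdown` in place of each numeric `dcc.Input` would let users pick labels directly, while the callback still receives the encoded values.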