## Part V: GUI (Demo) - Model Predictions | Jupyter Notebook

## Final Project Submission

Please fill out:
* __Student name:__ Sharonda Pettiett-Warner
* __Student pace:__ part time - PT_0610
* __Scheduled project review date/time:__ April 24, 2020
* __Instructor name:__ Eli Thomas

## Goal

> To create a user interface (UI) using Tkinter __demonstrating__ how to capture input parameters for a hotel reservation.
- The input data will be used by the model to provide the user with a prediction as either __Booked__ or __Canceled__ along with the prediction __Probability__.


# Begin Study
> In this notebook I build a __Graphical User Interface (GUI)__ that captures user input for key hotel-booking features and returns a prediction of either Booked or Canceled, along with the prediction probability. The objective is to predict whether a hotel booking will be canceled, based on hotel and user attributes collected from the user.

In [1]:
import datetime
start = datetime.datetime.now()

# Import Libraries

In [2]:
import numpy as np 
import pandas as pd 

from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.svm import SVC
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from xgboost import XGBClassifier

import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline
sns.set(style="whitegrid")

import tkinter as tk
from tkinter import ttk

import warnings
warnings.filterwarnings('ignore')

# Define Functions

#### def printData(leadTime, country, custType, mktSeg)
- Function to print data

In [3]:
def printData(leadTime, country, custType, mktSeg):
    ' Function to print data'
    print(f'Lead Time:  {leadTime}')
    print(f'Country: {country}')
    print(f'Customer Type:  {custType}')
    print(f'Market Segment: {mktSeg}')

#### def get_input()
- Function to get input values and print

In [4]:
def get_input():
    ' Function to get input values and print'
    leadTime = radioValue.get()
    country  = comboCountry.get()
    custType = comboCustTyp.get()
    mktSeg   = comboMktSeg.get()
    printData(leadTime, country, custType, mktSeg )

#### def get_prediction()
- Function to get input values from the GUI and provide a prediction and probability

In [5]:
def get_prediction():
    ' Function to get input values from GUI and provide prediction and probability'
    leadTime = radioValue.get()
    country  = comboCountry.get()
    custType = comboCustTyp.get()
    mktSeg   = comboMktSeg.get()
    printData(leadTime, country, custType, mktSeg)
    
    # Default Feature Values:
    import_feats_dict = {
                        'tp_lead_time' : 7,
                        'tp_deposit_type_Non Refund':0, #Pre-paid (yes=1 or no=0)
                        'tp_country_PRT': 1,
                        'tp_adr': 95.00,
                        'tp_total_of_special_requests': 0,
                        'tp_arrival_date_day_of_month': 26,
                        'tp_agent': 28.0,
                        'tp_arrival_date_week_number': 15,  # 27 = July  35= Aug   # 4 = Jan
                        'tp_stays_in_week_nights': 2,
                        'tp_previous_cancellations': 0,
                        'tp_stays_in_weekend_nights': 1,
                        'tp_arrival_date_year': 2020,  #2017
                        'tp_required_car_parking_spaces':0,
                        'tp_market_segment_Online TA': 1,
                        'tp_customer_type_Transient': 1}
    tp_values = list(import_feats_dict.values())
    
    # Updated Feature Values from User Input:
    
    # Customer Type
    if (custType == 'Transient'):
        custType = 1
    else:
        custType = 0
        
    # Market Segment
    if (mktSeg == 'Online TA'):
        mktSeg = 1
    else:
        mktSeg = 0
        
    # Country
    if (country == 'PRT'):
        country = 1
    else:
        country = 0
        
    # Lead Time
    if (leadTime == 1):   # 1=This Week
        leadTime = 7
        adrPricing = 95.00
    elif (leadTime == 2): # 2=Next Week
        leadTime = 14
        adrPricing = 95.00
    elif (leadTime == 3): # 3=Next Month
        leadTime = 30
        adrPricing = 110.00 
    elif (leadTime == 4): # 4=6 mos
        leadTime = 180
        adrPricing = 110.00 
    elif (leadTime == 5): # 5=9 mos
        leadTime = 275
        adrPricing = 90.00
    elif (leadTime == 6): # 6=Next Year
        leadTime = 365
        adrPricing = 85.00
    
    # Update Features Dictionary with new values:
    import_feats_dict.update({'tp_customer_type_Transient': custType,
                              'tp_lead_time': leadTime, 
                              'tp_adr': adrPricing,
                              'tp_country_PRT': country,
                              'tp_market_segment_Online TA': mktSeg})
    tp_values = list(import_feats_dict.values())
    print(f'adr: {adrPricing}')
    print(tp_values)
    
    # Single Test Prediction
    y_single_pred_ex = model.predict([tp_values]) 
    display(y_single_pred_ex)

    y_single_pred_prob_ex = model.predict_proba([tp_values]) 
    display(y_single_pred_prob_ex)
    
    # Display Model Prediction Result
    if y_single_pred_ex == 1:
        pred_text_val = 'Canceled'
    else:
        pred_text_val = 'Booked'
        
    pred_results = 'Model Prediction: {} | Probabilty Stats : {}'.format(pred_text_val, y_single_pred_prob_ex)
    print(pred_results)
    print('\n_______________________________________________________')
    
    w2 = tk.Label(window, 
                  justify=tk.LEFT, # justify=tk.RIGHT
                  font = "Verdana 8 bold",
                  padx = 10, 
                  text=pred_results)
    w2.grid(column=0, row=40)
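
The chain of `if`/`elif` branches above can also be expressed as a lookup table. The following is a minimal sketch, not the notebook's implementation: `LEAD_TIME_OPTIONS` and `encode_inputs` are hypothetical names, while the day counts, prices, and feature keys are copied from `get_prediction()`. It raises an error for an unexpected radio value instead of leaving `adrPricing` undefined.

```python
# Hypothetical lookup table mirroring the if/elif chain in get_prediction():
# radio value -> (lead time in days, nightly rate used for 'tp_adr').
LEAD_TIME_OPTIONS = {
    1: (7, 95.00),     # This Week
    2: (14, 95.00),    # Next Week
    3: (30, 110.00),   # Next Month
    4: (180, 110.00),  # 6 months
    5: (275, 90.00),   # 9 months
    6: (365, 85.00),   # Next Year
}


def encode_inputs(lead_time_choice, country, cust_type, mkt_seg):
    """Translate raw GUI selections into the feature values to update."""
    if lead_time_choice not in LEAD_TIME_OPTIONS:
        raise ValueError(f'Unknown lead-time option: {lead_time_choice!r}')
    lead_time, adr = LEAD_TIME_OPTIONS[lead_time_choice]
    return {
        'tp_lead_time': lead_time,
        'tp_adr': adr,
        # One-hot style flags: 1 only for the category the model was trained on.
        'tp_country_PRT': int(country == 'PRT'),
        'tp_customer_type_Transient': int(cust_type == 'Transient'),
        'tp_market_segment_Online TA': int(mkt_seg == 'Online TA'),
    }


updates = encode_inputs(6, 'GBR', 'Transient', 'Direct')
print(updates)
```

The returned dictionary could be passed straight to `import_feats_dict.update(...)`.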


# De-Pickle File
> The trained model was pickled in the __Capstone_ML_Model (Final Prediction)__ Jupyter notebook and is de-pickled in this notebook for use.

In [6]:
# De-serialize the training model object - RandomForestClf()
import pickle
# The context manager closes the file even if loading fails
with open('./Pickle_Files/RandomForestClf.pickle', 'rb') as clf_file:
    model = pickle.load(clf_file)
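
For readers who have not used `pickle` before, the round trip behind this step can be sketched with a stand-in object. This is an illustration only: the dictionary below is a placeholder for the fitted classifier, and the temporary file name is hypothetical. As a general precaution, only unpickle files from a source you trust, since loading a pickle can execute arbitrary code.

```python
import os
import pickle
import tempfile

# Stand-in object; in the notebook this would be the fitted classifier.
stand_in_model = {'name': 'RandomForestClf', 'n_features': 15}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'model.pickle')  # hypothetical file name

    # Serialize: write the object to disk in binary mode.
    with open(path, 'wb') as f:
        pickle.dump(stand_in_model, f)

    # De-serialize: read it back into a new, equal object.
    with open(path, 'rb') as f:
        restored = pickle.load(f)

print(restored == stand_in_model)
```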

In [7]:
# Test de-pickle process using data based on rowid- 96734
y_single_pred_ex = model.predict([[121,0,0,85.00,0,26,28.0,21,2,0,1,2017,0,0,1]]) 
display(y_single_pred_ex)

y_single_pred_prob_ex = model.predict_proba([[121,0,0,85.00,0,26,28.0,21,2,0,1,2017,0,0,1]]) 
display(y_single_pred_prob_ex)

array([0], dtype=int64)

array([[0.99403579, 0.00596421]])

# Prediction Data

> The best model for this prediction task is __RandomForestClf__, the classifier de-pickled above.

## Important Features

#### The Most Predictive Features
From our analysis, we concluded that the following 15 features, ranked by importance, were the most useful for prediction:

- lead_time                        0.116716
- deposit_type_Non Refund          0.108665
- country_PRT                      0.080293
- adr                              0.078294
- total_of_special_requests        0.069174
- arrival_date_day_of_month        0.056989
- agent                            0.051549
- arrival_date_week_number         0.048732
- stays_in_week_nights             0.037315
- previous_cancellations           0.030105
- stays_in_weekend_nights          0.025514
- arrival_date_year                0.024081
- required_car_parking_spaces      0.020868
- market_segment_Online TA         0.020407
- customer_type_Transient          0.019232

In [8]:
# Create dictionary of important features with default values
import_feats_dict = {
'tp_lead_time' : 7,
'tp_deposit_type_Non Refund':0,
'tp_country_PRT': 1,
'tp_adr': 85.00,
'tp_total_of_special_requests': 0,
'tp_arrival_date_day_of_month': 26,
'tp_agent': 28.0,
'tp_arrival_date_week_number': 21,
'tp_stays_in_week_nights': 2,
'tp_previous_cancellations': 0,
'tp_stays_in_weekend_nights': 1,
'tp_arrival_date_year': 2017,
'tp_required_car_parking_spaces':0,
'tp_market_segment_Online TA': 1,
'tp_customer_type_Transient': 1}

## Preview and Test Prediction Data

In [9]:
# Create list of important features values
tp_values = list(import_feats_dict.values())
tp_values                                       

[7, 0, 1, 85.0, 0, 26, 28.0, 21, 2, 0, 1, 2017, 0, 1, 1]
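
The list above relies on the dictionary's insertion order matching the column order the model expects. A minimal sketch of a more explicit approach follows, assuming the model's columns are in the order the notebook's dictionary lists them; `FEATURE_ORDER` and `to_feature_vector` are hypothetical names introduced for illustration.

```python
# Assumed column order, copied from the notebook's feature dictionary.
FEATURE_ORDER = [
    'tp_lead_time', 'tp_deposit_type_Non Refund', 'tp_country_PRT', 'tp_adr',
    'tp_total_of_special_requests', 'tp_arrival_date_day_of_month', 'tp_agent',
    'tp_arrival_date_week_number', 'tp_stays_in_week_nights',
    'tp_previous_cancellations', 'tp_stays_in_weekend_nights',
    'tp_arrival_date_year', 'tp_required_car_parking_spaces',
    'tp_market_segment_Online TA', 'tp_customer_type_Transient',
]


def to_feature_vector(feats):
    """Return feature values in FEATURE_ORDER, failing loudly on missing keys."""
    missing = [name for name in FEATURE_ORDER if name not in feats]
    if missing:
        raise KeyError(f'Missing features: {missing}')
    return [feats[name] for name in FEATURE_ORDER]


# Deliberately shuffled to show that insertion order no longer matters.
shuffled = {
    'tp_adr': 85.00,
    'tp_lead_time': 7,
    'tp_customer_type_Transient': 1,
    'tp_deposit_type_Non Refund': 0,
    'tp_country_PRT': 1,
    'tp_total_of_special_requests': 0,
    'tp_arrival_date_day_of_month': 26,
    'tp_agent': 28.0,
    'tp_arrival_date_week_number': 21,
    'tp_stays_in_week_nights': 2,
    'tp_previous_cancellations': 0,
    'tp_stays_in_weekend_nights': 1,
    'tp_arrival_date_year': 2017,
    'tp_required_car_parking_spaces': 0,
    'tp_market_segment_Online TA': 1,
}

vector = to_feature_vector(shuffled)
print(vector)
# [7, 0, 1, 85.0, 0, 26, 28.0, 21, 2, 0, 1, 2017, 0, 1, 1]
```

A misspelled key then raises a `KeyError` rather than silently passing through `list(d.values())`.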

In [10]:
# Test Single Prediction
y_single_pred_ex = model.predict([tp_values]) 
display(y_single_pred_ex)

y_single_pred_prob_ex = model.predict_proba([tp_values]) 
display(y_single_pred_prob_ex)

array([0], dtype=int64)

array([[0.72968403, 0.27031597]])

In [11]:
# Update feature dictionary and preview the updated list of values
import_feats_dict.update({'tp_country_PRT': 1, 'tp_adr': 185.00})
tp_values = list(import_feats_dict.values())
tp_values

[7, 0, 1, 185.0, 0, 26, 28.0, 21, 2, 0, 1, 2017, 0, 1, 1]

In [12]:
# Create variables with hardcoded values to test updating the dictionary using variables
var_country = 1
var_leadTime = 121

import_feats_dict.update({'tp_adr': 185.00,'tp_lead_time': var_leadTime, 'tp_country_PRT': var_country})
tp_values = list(import_feats_dict.values())
tp_values

[121, 0, 1, 185.0, 0, 26, 28.0, 21, 2, 0, 1, 2017, 0, 1, 1]

In [13]:
# Test Single Prediction using variables
y_single_pred_ex = model.predict([tp_values]) 
display(y_single_pred_ex)

y_single_pred_prob_ex = model.predict_proba([tp_values]) 
display(y_single_pred_prob_ex)

array([1], dtype=int64)

array([[0.26490915, 0.73509085]])

In [14]:
# Print Model Prediction Result
if y_single_pred_ex == 1:
    pred_text_val = 'Canceled'
else:
    pred_text_val = 'Booked'
pred_results = 'Model Prediction: {} | Probabilty Stats : {}'.format(pred_text_val, y_single_pred_prob_ex)
pred_results

'Model Prediction: Canceled | Probabilty Stats : [[0.26490915 0.73509085]]'
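
The raw probability array is hard to read at a glance. The sketch below shows one way to report only the probability of the predicted class. It assumes the model's classes are `[0, 1]`, so that column 0 of `predict_proba` corresponds to Booked and column 1 to Canceled (consistent with the outputs above); `CLASS_LABELS` and `describe_prediction` are hypothetical names.

```python
# Assumed mapping of class index to label (0 = Booked, 1 = Canceled).
CLASS_LABELS = {0: 'Booked', 1: 'Canceled'}


def describe_prediction(pred, proba_row):
    """Format a single prediction with the probability of the predicted class."""
    pred = int(pred)
    label = CLASS_LABELS[pred]
    confidence = proba_row[pred]
    return f'Model Prediction: {label} | Probability: {confidence:.1%}'


print(describe_prediction(1, [0.26490915, 0.73509085]))
# Model Prediction: Canceled | Probability: 73.5%
```

In the notebook, the arguments would be `y_single_pred_ex[0]` and `y_single_pred_prob_ex[0]`.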

# Tkinter GUI

> User Interface to capture input related to prediction variables.

In [15]:
# Tkinter GUI Layout Components

# ---------------------  LAYOUT ---------------------------------------------------------------------------------
# Tkinter window -- window = tkinter.Tk()
window = tk.Tk()

# Set window size
window.geometry('850x500')
window.title('Model Prediction GUI')


# ------ Window Title --------------------------------  
label0 = tk.Label(window, text = "Predict Booking or Cancellation Likelihood" ,
                          font = "Verdana 10 bold")
label0.grid(column=0, row=0)


# ----- Radio Button for Lead Time Input values --------------
labelTop0 = tk.Label(window,
                    text = '\nSelect Arrival Timeframe for Reservation:')

labelTop0.grid(column=0, row=5)

radioValue = tk.IntVar()
radioValue.set(1)         # To set default value

radio1 = tk.Radiobutton(window, text='This Week \t(Price: $95/nt)',  
                             variable=radioValue, value=1) 
radio2 = tk.Radiobutton(window, text='Next Week \t(Price: $95/nt)', 
                             variable=radioValue, value=2)
radio3 = tk.Radiobutton(window, text='Next Month \t(Price: $110/nt)', 
                             variable=radioValue, value=3)
radio4 = tk.Radiobutton(window, text='Next 6 months \t(Price: $110/nt)', 
                             variable=radioValue, value=4) 
radio5 = tk.Radiobutton(window, text='Next 9 months\t(Price: $90/nt)',
                             variable=radioValue, value=5)
radio6 = tk.Radiobutton(window, text='Next Year \t(Price: $85/nt)',
                             variable=radioValue, value=6)

radio1.grid(column=0, row=6,columnspan=2,sticky='W')
radio2.grid(column=0, row=7,columnspan=2,sticky='W')
radio3.grid(column=0, row=8,columnspan=2,sticky='W')
radio4.grid(column=0, row=9,columnspan=2,sticky='W')
radio5.grid(column=0, row=10,columnspan=2,sticky='W')
radio6.grid(column=0, row=11,columnspan=2,sticky='W')

# ----- Combobox for Country Input values --------------
labelTop1 = tk.Label(window,
                    text = 'Select Value for Country:')

labelTop1.grid(column=0, row=15)

comboCountry = ttk.Combobox(window, 
                            values=['PRT', 'GBR', 'FRA', 'ESP', 'DEU', 'USA', 'Other'])
comboCountry.current(0)
comboCountry.grid(column=1, row=15, sticky='W' )


# ----- Combobox for Customer Type Input values --------------
labelTop2 = tk.Label(window,
                    text = 'Select Value for Customer Type:')

labelTop2.grid(column=0, row=20)

comboCustTyp = ttk.Combobox(window, 
                            values=['Transient', 'Contract', 'Group', 'Other'])
comboCustTyp.current(0)
comboCustTyp.grid(column=1, row=20, sticky='W')

# ----- Combobox for Market Segment Input values --------------
labelTop3 = tk.Label(window,
                    text = 'Select Value for Market Channel:')

labelTop3.grid(column=0, row=25)

comboMktSeg = ttk.Combobox(window, 
                            values=['Direct', 'Corporate', 'Online TA', 'Offline TA/TO', 'Other'])
comboMktSeg.current(2)
comboMktSeg.grid(column=1, row=25, sticky='W')

# ----------------- Label - Spacing ONLY  --------------

# Separate name so the Country label reference (labelTop1) is not overwritten
spacer1 = tk.Label(window, text = '\n')
spacer1.grid(column=0, row=26)

# ----------------- Button - Prediction  --------------

button1 = tk.Button(window, text = "Predict", fg = "blue",font = "Verdana 8 bold")
button1.grid(column=0, row=30)
button1.config(command = get_prediction)   # command = get_input

# ----------------- Label - Spacing ONLY  --------------

# Separate name so the Country label reference (labelTop1) is not overwritten
spacer2 = tk.Label(window, text = '\n\n')
spacer2.grid(column=0, row=31)

# ----------------- Logo --------------
# logo = tk.PhotoImage(file="./lead_time_distribution-3.gif", width=400, height=300)

# w1 = tk.Label(window, image=logo)
# w1.grid(column=1, row = 50)

# explanation = """At present, only GIF and PPM/PGM
# formats are supported, but an interface 
# exists to allow additional image file
# formats to be added easily."""

# w2 = tk.Label(window, 
#               justify=tk.LEFT, # justify=tk.RIGHT
#               padx = 10, 
#               text=explanation)
# w2.grid(column=0, row=50)

window.mainloop()

Lead Time:  1
Country: PRT
Customer Type:  Transient
Market Segment: Online TA
adr: 95.0
[7, 0, 1, 95.0, 0, 26, 28.0, 15, 2, 0, 1, 2020, 0, 1, 1]


array([0], dtype=int64)

array([[0.70502021, 0.29497979]])

Model Prediction: Booked | Probabilty Stats : [[0.70502021 0.29497979]]

_______________________________________________________
Lead Time:  6
Country: PRT
Customer Type:  Transient
Market Segment: Online TA
adr: 85.0
[365, 0, 1, 85.0, 0, 26, 28.0, 15, 2, 0, 1, 2020, 0, 1, 1]


array([1], dtype=int64)

array([[0.29699391, 0.70300609]])

Model Prediction: Canceled | Probabilty Stats : [[0.29699391 0.70300609]]

_______________________________________________________
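
In the GUI above, every click on Predict creates a new `tk.Label` in the same grid cell, so earlier results remain stacked underneath. One alternative, sketched here under stated assumptions, is to bind a single label to a `tk.StringVar` and update it in place. `format_result`, `build_window`, and `predict_fn` are hypothetical names; `predict_fn` stands in for the model call and is assumed to return a `(label, cancel_probability)` pair.

```python
def format_result(label, cancel_probability):
    """Build the one-line message shown in the window."""
    return (f'Model Prediction: {label} | '
            f'Cancel probability: {cancel_probability:.2f}')


def build_window(predict_fn):
    """Create a window whose result label is updated in place on each click.

    predict_fn is a hypothetical callable returning (label, cancel_probability).
    """
    # Imported here so format_result can be used without Tk being installed.
    import tkinter as tk

    window = tk.Tk()
    window.title('Model Prediction GUI')

    # One StringVar backs one Label; set() replaces the previous text.
    result_var = tk.StringVar(master=window, value='No prediction yet')

    def on_predict():
        label, cancel_probability = predict_fn()
        result_var.set(format_result(label, cancel_probability))

    tk.Button(window, text='Predict', fg='blue', font='Verdana 8 bold',
              command=on_predict).grid(column=0, row=0)
    tk.Label(window, textvariable=result_var, justify=tk.LEFT,
             font='Verdana 8 bold', padx=10).grid(column=0, row=1)
    return window
```

To try it with a fixed result, call `build_window(lambda: ('Booked', 0.29)).mainloop()`.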


# Conclusions

#### Best-Performing Model
> __RandomForestClf__ 
- __Test Accuracy Score: 0.877677  |   Train Accuracy Score: 0.939452__
- __Accuracy Score:__ 0.8776766335158249   
- __Precision Score:__ 0.8693262411347518  
- __roc_auc_score:__ 0.8656552631067826 
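
As a reminder of what the accuracy and precision scores measure, here is a small sketch on toy labels. The data is made up for illustration and is unrelated to the hotel dataset; the function names are hypothetical, and class 1 (Canceled) is treated as the positive class.

```python
def accuracy(y_true, y_pred):
    """Share of all predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def precision(y_true, y_pred, positive=1):
    """Share of predicted positives that are truly positive."""
    predicted_pos = [t for t, p in zip(y_true, y_pred) if p == positive]
    if not predicted_pos:
        return 0.0
    true_pos = sum(t == positive for t in predicted_pos)
    return true_pos / len(predicted_pos)


# Toy labels: 1 = Canceled, 0 = Booked (illustrative only).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

print(accuracy(y_true, y_pred))   # 6 of 8 correct
print(precision(y_true, y_pred))  # 3 of 4 predicted cancellations are real
```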


#### Summary of Results

> __Prediction using GUI Default Values:__

Lead Time:  1

Country: PRT

Customer Type:  Transient

Market Segment: Online TA

adr: 95.0

[7, 0, 1, 95.0, 0, 26, 28.0, 15, 2, 0, 1, 2020, 0, 1, 1]

array([0], dtype=int64)

array([[0.70502021, 0.29497979]])

__Model Prediction: Booked | Probabilty Stats : [[0.70502021, 0.29497979]]__

# Future Work

- Consider further evaluation to address the following:

    - Create a GUI to help hotels convert low cancellation probabilities into revenue by offering suggestions for amenities, price discounts, etc.

# Notes

### Pickle

- __Blog post:__ "How to Pickle your Trained Model"
    - https://medium.com/@spettiett/how-to-pickle-your-trained-model-f4b7051babaa?source=friends_link&sk=dccb5c965557f0964b1e554ebfb183e6
- __Published by:__ Sharonda Warner | Date: April 2020

In [16]:
end = datetime.datetime.now()
elapsed_time = end - start
print(f'Capstone_Model_Prediction_Using_Tkinter Total Execution Time: {elapsed_time}')

Capstone_Model_Prediction_Using_Tkinter Total Execution Time: 2:17:36.022307


# End Study