<center>
<h1>Flask Web Server For Colab</h1>
</center>

<hr>

The simplest approach to your final project is to use this notebook as the foundation for your web server. I have tried to set it up so the main components work on any dataset. That leaves you with two main tasks:

1. Loading in new files that go with your new dataset. I have the notebook set up to work with the titanic files. You will have to change over to your new files.

2. Reworking the html so you have an interface that fits the new features in your dataset. Your goal is to take info from a user, make predictions, then feed those predictions back to the user. The current html is focused on the Titanic.

#I. Before you even get to this notebook

This notebook assumes you have 10 files defined and ready to load in:

1. You have 4 threshold tables as csv files on github.

2. Your have 3 joblib files for knn, logreg and xgb.

3. You have an ANN folder (unzipped from github).

4. You have your test set as a file on github (from wrangling notebook).

5. You have a screenshot of the pipeline you used.

In essence, it assumes you have done all your wrangling and model tuning prior. And saved your results to file. You are then ready to tackle this notebook.

#II. My general strategy for wrangling in this notebook

I want to allow the user to leave fields blank. That means some (or all) features you use may be blank. What to do? Your models can't deal with blank values.

My strategy is to load the test table, the one you created in the wrangling notebook. 
When you get a new "row" from the user, add it on to the end of (a copy of) the test table.

Then run your pipeline on this new copy. This should wrangle your new row into shape.

Now you can pull it back off the end and you have a row ready for prediction. This is best strategy I could see for wrangling the new row into shape.

I do all of this for you in the code below.

#III. Moving to a real web server

If you decide you want to move the code here to a real web server, e.g., on a department machine, I don't think it would be too bad. You will:

1. Have to set up a python environment with all the libraries you need, many of which come for free in colab.

2. You will have to set up your files in local storage.

3. You have to make sure that I can run it. If you are serving off of localhost without ngrok, I likely won't be able to actually run your server and see results. And hence can't give you a grade.

But all that is optional for single-person teams. Feel free to just use this notebook. For two-person teams, I would like you to set up a persistent web server that I (and your future employment interviewers) can use. I did see this new tool for migration from notebook to cloud service but it looks like it might be complicated to use: https://ploomber.readthedocs.io/en/latest/get-started/what-is.html. I also run docker so if you can make that work for me, i.e., send me a docker image to run, that might be ok.

In [1]:
from joblib import load
from tensorflow import keras
import sklearn
from flask import Flask
from flask import request
import os

##Bring in your own library

In [2]:
github_name = 'attajunyah'
repo_name = 'cis523'
source_file = 'library.py'
url = f'https://raw.githubusercontent.com/{github_name}/{repo_name}/main/{source_file}'
!rm $source_file
!wget $url
%run -i $source_file

rm: cannot remove 'library.py': No such file or directory
--2022-12-10 12:30:22--  https://raw.githubusercontent.com/attajunyah/cis523/main/library.py
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.111.133, 185.199.110.133, 185.199.109.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.111.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 12562 (12K) [text/plain]
Saving to: ‘library.py’


2022-12-10 12:30:22 (81.5 MB/s) - ‘library.py’ saved [12562/12562]



##I would not expect you need my library

#cmd-m y to turn this into code cell, cmd-m m to go other way
%%capture
!pip install uo-puddles
from uo_puddles.cis423 import *

##If you have to run some code do it here

For instance, if you did not get your new pipeline in your library, you can create the pipeline here. Should be no need for splitting. You already did that when tuning and now have trained and tuned models ready to go.

In [3]:
#if need to set up pipeline for your new dataset, etc.
credit_transformer = Pipeline(steps=[
    ('drop', DropColumnsTransformer(['Home_Status'], 'drop')),
    ('loan_grade', MappingTransformer('Loan_Grade', {'A': 0, 'B': 1, 'C': 2, 
                                                     'D': 3, 'E': 4, 'F': 5,
                                                     'G': 6})),
    ('hist_def', MappingTransformer('Historical_Default', {'N': 0, 'Y': 1})),
    ('ohe', OHETransformer(target_column='Loan_Intent')),
    ('age', TukeyTransformer('Age', 'outer')),
    ('income', TukeyTransformer('Income', 'outer')),
    ('emp_len', TukeyTransformer('Employment_Length', 'outer')),
    ('loan', TukeyTransformer('Loan_Amount', 'outer')),
    ('interest', TukeyTransformer('Interest_Rate', 'outer')),
    ('loan_income_ratio', TukeyTransformer('Loan_Income_Percent', 'outer')),
    ('credit_hist', TukeyTransformer('Credit_History', 'outer')),
    ('scale', MinMaxTransformer()), 
    ('imputer', KNNTransformer())
    ], verbose=True)


In [4]:
the_transformer = credit_transformer  #just a renaming

#III. Load your files

You will need to load the 4 models, 4 threshold tables, and a test_df.

##Bring in models

You will need to change some of these depending on where you have things stored on github.

I chose to create two folders: `models` to hold all the model info and `thresholds` to hold the csv tables.

###Get KNN model

In [5]:
github_name = 'attajunyah'
repo_name = 'cis523'
# source_folder = 'models'

In [6]:
source_file = 'knn_model.joblib'
url = f'https://raw.githubusercontent.com/{github_name}/{repo_name}/main/{source_file}'
!rm $source_file
!wget $url

rm: cannot remove 'knn_model.joblib': No such file or directory
--2022-12-10 12:30:23--  https://raw.githubusercontent.com/attajunyah/cis523/main/knn_model.joblib
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.110.133, 185.199.111.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2120998 (2.0M) [application/octet-stream]
Saving to: ‘knn_model.joblib’


2022-12-10 12:30:23 (209 MB/s) - ‘knn_model.joblib’ saved [2120998/2120998]



###Get LogReg model

In [7]:
source_file = 'logreg_model.joblib'
url = f'https://raw.githubusercontent.com/{github_name}/{repo_name}/main/{source_file}'
!rm $source_file
!wget $url

rm: cannot remove 'logreg_model.joblib': No such file or directory
--2022-12-10 12:30:23--  https://raw.githubusercontent.com/attajunyah/cis523/main/logreg_model.joblib
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.109.133, 185.199.110.133, 185.199.108.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.109.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 8970 (8.8K) [application/octet-stream]
Saving to: ‘logreg_model.joblib’


2022-12-10 12:30:23 (90.0 MB/s) - ‘logreg_model.joblib’ saved [8970/8970]



###Get ANN model (and unzip)

In [8]:
source_file = 'ann_model.zip'
url = f'https://raw.githubusercontent.com/{github_name}/{repo_name}/main/{source_file}'
!rm $source_file
!wget $url
!rm -r 'ann_model'
!unzip $source_file

rm: cannot remove 'ann_model.zip': No such file or directory
--2022-12-10 12:30:23--  https://raw.githubusercontent.com/attajunyah/cis523/main/ann_model.zip
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 541041 (528K) [application/zip]
Saving to: ‘ann_model.zip’


2022-12-10 12:30:23 (105 MB/s) - ‘ann_model.zip’ saved [541041/541041]

rm: cannot remove 'ann_model': No such file or directory
Archive:  ann_model.zip
   creating: ann_model/
  inflating: __MACOSX/._ann_model    
  inflating: ann_model/.DS_Store     
  inflating: __MACOSX/ann_model/._.DS_Store  
  inflating: ann_model/keras_metadata.pb  
  inflating: __MACOSX/ann_model/._keras_metadata.pb  
   creating: ann_model/variables/
  inflating: __MACOSX/ann_model/._variables  
  inflating: ann_model/sav

###Get XGBoost model

In [9]:
source_file = 'xgb_model.joblib'
url = f'https://raw.githubusercontent.com/{github_name}/{repo_name}/main/{source_file}'
!rm $source_file
!wget $url

rm: cannot remove 'xgb_model.joblib': No such file or directory
--2022-12-10 12:30:24--  https://raw.githubusercontent.com/attajunyah/cis523/main/xgb_model.joblib
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 111597 (109K) [application/octet-stream]
Saving to: ‘xgb_model.joblib’


2022-12-10 12:30:24 (31.7 MB/s) - ‘xgb_model.joblib’ saved [111597/111597]



##Now Threshold tables

In [10]:
source_folder = 'thresholds'

In [11]:
source_file = 'knn_thresholds.csv'
url = f'https://raw.githubusercontent.com/{github_name}/{repo_name}/main/{source_file}'
!rm $source_file
!wget $url

rm: cannot remove 'knn_thresholds.csv': No such file or directory
--2022-12-10 12:30:24--  https://raw.githubusercontent.com/attajunyah/cis523/main/knn_thresholds.csv
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 543 [text/plain]
Saving to: ‘knn_thresholds.csv’


2022-12-10 12:30:24 (32.2 MB/s) - ‘knn_thresholds.csv’ saved [543/543]



In [12]:
source_file = 'logreg_thresholds.csv'
url = f'https://raw.githubusercontent.com/{github_name}/{repo_name}/main/{source_file}'
!rm $source_file
!wget $url

rm: cannot remove 'logreg_thresholds.csv': No such file or directory
--2022-12-10 12:30:24--  https://raw.githubusercontent.com/attajunyah/cis523/main/logreg_thresholds.csv
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.109.133, 185.199.108.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.109.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 541 [text/plain]
Saving to: ‘logreg_thresholds.csv’


2022-12-10 12:30:25 (40.5 MB/s) - ‘logreg_thresholds.csv’ saved [541/541]



In [13]:
source_file = 'xgb_thresholds.csv'
url = f'https://raw.githubusercontent.com/{github_name}/{repo_name}/main/{source_file}'
!rm $source_file
!wget $url

rm: cannot remove 'xgb_thresholds.csv': No such file or directory
--2022-12-10 12:30:25--  https://raw.githubusercontent.com/attajunyah/cis523/main/xgb_thresholds.csv
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 541 [text/plain]
Saving to: ‘xgb_thresholds.csv’


2022-12-10 12:30:25 (24.6 MB/s) - ‘xgb_thresholds.csv’ saved [541/541]



In [14]:
source_file = 'ann_thresholds.csv'
url = f'https://raw.githubusercontent.com/{github_name}/{repo_name}/main/{source_file}'
!rm $source_file
!wget $url

rm: cannot remove 'ann_thresholds.csv': No such file or directory
--2022-12-10 12:30:25--  https://raw.githubusercontent.com/attajunyah/cis523/main/ann_thresholds.csv
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.110.133, 185.199.111.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 484 [text/plain]
Saving to: ‘ann_thresholds.csv’


2022-12-10 12:30:25 (24.9 MB/s) - ‘ann_thresholds.csv’ saved [484/484]



Should see something like this.

<img src='https://www.dropbox.com/s/8vkaibxg54krlym/Screen%20Shot%202022-11-22%20at%202.56.27%20PM.png?raw=1' height=200>

In [15]:
#Now load models in from local store

xgb_model = load('xgb_model.joblib')
logreg_model = load('logreg_model.joblib')
knn_model = load('knn_model.joblib')
ann_model = keras.models.load_model('ann_model')

https://scikit-learn.org/stable/modules/model_persistence.html#security-maintainability-limitations
https://scikit-learn.org/stable/modules/model_persistence.html#security-maintainability-limitations
https://scikit-learn.org/stable/modules/model_persistence.html#security-maintainability-limitations


In [16]:
#Now load tables in from local store

logreg_thresholds = pd.read_csv('logreg_thresholds.csv')
knn_thresholds = pd.read_csv('knn_thresholds.csv')
xgb_thresholds = pd.read_csv('xgb_thresholds.csv')
ann_thresholds = pd.read_csv('ann_thresholds.csv')

##Produce html from threshold tables



In [17]:
xgb_table = xgb_thresholds.to_html(index=False, justify='center', col_space=80).replace('<td>', '<td align="center">')
logreg_table = logreg_thresholds.to_html(index=False, justify='center', col_space=80).replace('<td>', '<td align="center">')
knn_table = knn_thresholds.to_html(index=False, justify='center', col_space=80).replace('<td>', '<td align="center">')
ann_table = ann_thresholds.to_html(index=False, justify='center', col_space=80).replace('<td>', '<td align="center">')

###Get test table in raw format

In [18]:
source_file = 'test_df.csv'
url = f'https://raw.githubusercontent.com/{github_name}/{repo_name}/main/{source_file}'
!rm $source_file
!wget $url
test_df  = pd.read_csv(source_file)

rm: cannot remove 'test_df.csv': No such file or directory
--2022-12-10 12:30:26--  https://raw.githubusercontent.com/attajunyah/cis523/main/test_df.csv
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.109.133, 185.199.108.133, 185.199.111.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.109.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 178545 (174K) [text/plain]
Saving to: ‘test_df.csv’


2022-12-10 12:30:26 (38.9 MB/s) - ‘test_df.csv’ saved [178545/178545]



In [19]:
test_df

Unnamed: 0,Age,Income,Home_Status,Employment_Length,Loan_Intent,Loan_Grade,Loan_Amount,Interest_Rate,Loan_Income_Percent,Historical_Default,Credit_History
0,42,22000,RENT,,DEBTCONSOLIDATION,A,1000,7.29,0.05,N,13
1,23,65000,MORTGAGE,0.0,EDUCATION,C,8500,15.23,0.13,Y,3
2,35,110000,RENT,9.0,DEBTCONSOLIDATION,C,25000,13.49,0.23,N,8
3,30,42312,RENT,1.0,EDUCATION,A,5500,8.59,0.13,N,6
4,35,42300,RENT,3.0,PERSONAL,C,5500,13.06,0.13,N,8
...,...,...,...,...,...,...,...,...,...,...,...
3395,23,84000,MORTGAGE,4.0,DEBTCONSOLIDATION,A,7000,6.92,0.08,N,2
3396,29,40000,MORTGAGE,8.0,DEBTCONSOLIDATION,A,8000,8.59,0.20,N,9
3397,34,185000,MORTGAGE,18.0,DEBTCONSOLIDATION,C,18250,,0.10,Y,5
3398,37,95000,RENT,3.0,MEDICAL,A,10000,7.88,0.11,N,13


#IV. Debugging

Using `print` function does not work with ngrok and threading

At least not when in your own methods. What I ended up having to do to get debugging info was write it to a local file. I am giving you some functions that I hope will help. They all work with a file called `debugging.txt` in local colab storage.

In [20]:
from datetime import datetime
import os

def append_to_debugging(*str_to_append):
  f = open("debugging.txt", "a")
  curDT = datetime.now()
  time = curDT.strftime("%H:%M:%S")
  f.write('='*30 + '\n')
  f.write(time + '\n\n')  #can use write to write out *strings* to the file.
  for arg in str_to_append:
    f.write(arg + '\n')
    f.write('='*5+'\n')
  f.close()

In [21]:
def print_debugging():
  if os.path.exists("debugging.txt"):
    f = open("debugging.txt", "r")
    print(f.read())
    f.close()
  else:
    print('debugging.txt does not exist')

In [22]:
def delete_debugging():
  if os.path.exists("debugging.txt"):
    os.remove("debugging.txt")
  else:
    print("debugging.txt does not exist")

In [23]:
#I'll write out 2 separate strings
append_to_debugging(f'row0: {test_df.loc[0]}\n',
                    f'row1: {test_df.loc[1]}\n')

In [24]:
print_debugging()

12:30:27

row0: Age                                   42
Income                             22000
Home_Status                         RENT
Employment_Length                    NaN
Loan_Intent            DEBTCONSOLIDATION
Loan_Grade                             A
Loan_Amount                         1000
Interest_Rate                       7.29
Loan_Income_Percent                 0.05
Historical_Default                     N
Credit_History                        13
Name: 0, dtype: object

=====
row1: Age                           23
Income                     65000
Home_Status             MORTGAGE
Employment_Length            0.0
Loan_Intent            EDUCATION
Loan_Grade                     C
Loan_Amount                 8500
Interest_Rate              15.23
Loan_Income_Percent         0.13
Historical_Default             Y
Credit_History                 3
Name: 1, dtype: object

=====



In [25]:
delete_debugging()


In [26]:
print_debugging()  #should give warning

debugging.txt does not exist


#V. Server layout

There are three main pieces:

1. The flask server code augmented with ngrok. It defines "routes" or "views". The good news is we only have one page we are working with and only two routes.

2. The page as html/css with some javascript interjected. It is one long string. I use simple string-replace to update it.

3. The methods that handle data coming in and make predictions.

I'm going to present them out of order. I'll start with methods.

#VI. Server data handling method

This method will be called with a dictionary (`columns`) that has fields as keys and string values for each. So you will have to deal with converting numeric features (e.g., Age, Fare) to numeric. Don't try to convert categorical columns to numeric. That is what your transformer does!

`test_df` is what you loaded previously. It is in raw form.

`the_transformer` is the pipeline you created for your dataset.

###Gotcha

It is easy to mess up column name spellings. You have names you use in `fpage`, i.e., names html uses. These names are what are keys in `columns`. For instance I use `age_field` in html (see below) and hence as a key in `columns`.
<pre>
Enter Age ... name = "age_field" ...
</pre>
But you also have names of columns in `test_df`. Don't mix them up. You can see when I build `row` I use `Age` as a key becuase that is what is in `test_df`.



In [27]:
def handle_data(columns, test_df, the_transformer):

  #I would leave debugging line - you can comment it out later.
  append_to_debugging(f'from html: {columns}', f'test_df: {test_df.columns}')  #useful for debugging
  
  #massage data coming from user in columns - this is code you need to change to fit your new dataset
  age = np.nan
  if columns['age_field'].isdigit() and int(columns['age_field'])>0:  #note age_field is what appears in fpage
    age =  int(columns['age_field'])  #looks ok

  income = np.nan
  if columns['income_field'].isdigit() and int(columns['income_field'])>=0:
    fare =  int(columns['income_field'])

  employment_length = np.nan
  if columns['emp_len'].isdigit() and float(columns['emp_len'])>=0:
    fare =  int(columns['emp_len'])

  loan_amount = np.nan
  if columns['amount'].isdigit() and int(columns['amount'])>=0:
    fare =  int(columns['amount'])

  interest_rate = np.nan
  if columns['interest'].isdigit() and float(columns['interest'])>=0:
    fare =  int(columns['interest'])

  loan_income_percent = np.nan
  if columns['income_ratio'].isdigit() and float(columns['income_ratio'])>=0:
    fare =  int(columns['income_ratio'])

  credit_history = np.nan
  if columns['cred_hist'].isdigit() and int(columns['cred_hist'])>=0:
    fare =  int(columns['cred_hist'])

  loan_intent = np.nan if columns['intent']=='unknown' else columns['intent']
  loan_grade = np.nan if columns['grade']=='unknown' else columns['grade']
  historical_default = np.nan if columns['hist_def']=='unknown' else columns['hist_def']
 

  #keys should match column names in test_df
  # row = dict(Age=age, Gender=gender, Class=the_class, Joined=joined, Married=married, Fare=fare)  #33.0, 'Female', 'C2', 'Southampton', 0.0, 26.0
  row = dict(Age=age,Income=income,Employment_Length=employment_length,
             Loan_Intent=loan_intent, Loan_Grade=loan_grade, Loan_Amount=loan_amount,
             Interest_Rate=interest_rate, Loan_Income_Percent=loan_income_percent,
             Historical_Default=historical_default, Credit_History=credit_history)

  #end massaging - you should be able to use the code below as is

  #now add on your new row so can run pipeline
  n = len(test_df)
  test_extended = test_df.copy()  #don't mess up original
  test_extended.loc[n] = np.nan  #add blank row

  #fill in values we have from user
  for k,v in row.items():
    test_extended.loc[n, k] = v

  #run pipeline
  test_transformed = the_transformer.fit_transform(test_extended)

  #grab added row
  new_row = test_transformed.to_numpy()[-1:]

  #get predictions
  yhat_xgb, yhat_knn, yhat_logreg, yhat_ann = get_prediction(new_row)  #predict on last row that tacked on.
  return new_row, yhat_xgb, yhat_knn, yhat_logreg, yhat_ann

#VII. Get predictions from 4 models

This function is called, last thing, by `handle_data`. You should not have to change it.

**Note the 2nd assert**. The reason it may fail is because of a categorical column with n categories, e.g., `Joined` has 4 categories. The problem is that `test_df` might not contain all 4. 
I added code to the wrangling notebook to check to make sure you do not run afoul of this assert.

In [28]:
#I probably should pass all the models in but I'm using them as globals. My bad. Being lazy.

def get_prediction(row):
  assert row.shape[0]==1, f'Expecting numpy array but got {row}'
  assert logreg_model.n_features_in_ == len(row[0]), f'length mismatch with what was trained on and row to predict: {logreg_model.n_features_in_} and {len(row[0])}'

  #XGBoost
  xgb_raw = xgb_model.predict_proba(row)  #predict last row, we just tacked on
  yhat_xgb = xgb_raw[:,1]

  #KNN
  knn_raw = knn_model.predict_proba(row)
  yhat_knn = knn_raw[:,1]

  #logreg
  logreg_raw = logreg_model.predict_proba(row)
  yhat_logreg = logreg_raw[:,1]


  #ANN
  yhat_ann = ann_model.predict(row)[:,0]

  return [yhat_xgb, yhat_knn, yhat_logreg, yhat_ann]

#VIII. The html and some supporting code


#One page website

You can make this fancier but I am keeping it simple.

Note I stored my screenshot png file on Drive. Then followed directions here to get a link I could embed in my html: https://dev.to/temmietope/embedding-a-google-drive-image-in-html-3mm9.

In [29]:
fpage = '''
<!DOCTYPE html>
<html>
<head>
<meta name="viewport" content="width=device-width, initial-scale=1">

<style>
  body {
    background-color: linen;
  }

  h1 {
    color: maroon;
    margin-left: 40px;
  }

  .table-wrapper {
    display: none;
  }
  .table1 {
  float: left;
  }
 
  .table2 {
  float: right;
  padding-left: 20px;
  }
</style>
</head>

<body>

  <h1>Predict Loan Approval!</h1>
  <img src='https://miro.medium.com/max/735/0*lAkevA6upQBq-NCk.jpg' height=200>


  <form id="row_info" action="data" method = "POST">

        <!-- I am not using the hidden field - just have it in case can think of some need for it later -->
        <input type='hidden' id='hidden1' value='hidden value'/>

        <p style="font-size:24pt;">Enter Age <input style="font-size:18pt;" type = "text" name = "age_field" placeholder="Unknown"/></p>

        <p style="font-size:24pt;">Enter Income <input style="font-size:18pt;" type = "text" name = "income_field" placeholder="Unknown"/></p>

        <p style="font-size:24pt;">Enter Employment Length <input style="font-size:18pt;" type = "text" name = "emp_len" placeholder="Unknown"/></p>

        <p>
        <label for="intent" style="font-size:24pt;">Choose Loan Purpose:</label>
        <select id="intent" name="intent" style="font-size:18pt;">
          <option value="unknown">Unknown</option>
          <option value="DEBTCONSOLIDATION">Debt Consolidation</option>
          <option value="EDUCATION">Education</option>
          <option value="MEDICAL">Medical</option>
          <option value="PERSONAL">Personal</option>
          <option value="VENTURE">Venture</option>
        </select>

        <p>
        <label for="grade" style="font-size:24pt;">Choose Loan Grade:</label>
        <select id="grade" name="grade" style="font-size:18pt;">
          <option value="unknown">Unknown</option>
          <option value="A">A</option>
          <option value="B">B</option>
          <option value="C">C</option>
          <option value="D">D</option>
          <option value="E">E</option>
          <option value="F">F</option>
          <option value="G">G</option>
        </select>

        <p style="font-size:24pt;">Enter Loan Amount <input style="font-size:18pt;" type = "text" name = "amount" placeholder="Unknown"/></p>

        <p style="font-size:24pt;">Enter Interest Rate <input style="font-size:18pt;" type = "text" name = "interest" placeholder="Unknown"/></p>

        <p style="font-size:24pt;">Enter Loan Income Percent <input style="font-size:18pt;" type = "text" name = "income_ratio" placeholder="Unknown"/></p>

        <p>
        <label for="hist_def" style="font-size:24pt;">Defaulted Before?:</label>
        <select id="hist_def" name="hist_def" style="font-size:18pt;">
          <option value="unknown">Unknown</option>
          <option value="1">Y</option>
          <option value="0">N</option>
        </select>

        <p style="font-size:24pt;">Enter Credit History <input style="font-size:18pt;" type = "text" name = "cred_hist" placeholder="Unknown"/></p>

        <p><input style="font-size:24pt; padding: 8px; box-shadow:3px 3px grey" type = "submit" value = "Evaluate" /></p>
    </form>

    <script>
        <!-- toggle_image a bit of misnomer. It can be used to toggle tables on and off as well -->
        function toggle_image(im_id) {
            var state = document.getElementById(im_id).style.display;
            var new_state = 'block';
            if (state=='block'){new_state='none'};
            document.getElementById(im_id).style.display = new_state;
        }
    </script>

    <h1 onmouseover="toggle_image('image1')"
        onmouseout="toggle_image('image1')"
        style="display:inline" >  <!-- image1 is pipeline screenshot -->
      Results for data:
    </h1>
    <h3>%row_data%</h3>
    <ul>
      <!-- only showing threshold table for each method - might want to add tuning screenshot as well -->
      <li><button onclick="toggle_image('xgb')"
              style="display:inline; font-size:18pt; padding:8px; box-shadow:3px 3px grey">
            XGBoost alone: %xgb%    <!-- filled in once you have a prediction -->
          </button></li>
      <p>
      <li><button onclick="toggle_image('knn')"
              style="display:inline; font-size:18pt; padding:8px; box-shadow:3px 3px grey">
            KNN alone: %knn%
          </h2></li>
      <p>
      <li><button onclick="toggle_image('logreg')"
              style="display:inline; font-size:18pt; padding:8px; box-shadow:3px 3px grey">
            LogisticRegression alone: %logreg%
          </h2></li>
      <p>
      <li><button onclick="toggle_image('ann')"
              style="display:inline; font-size:18pt; padding:8px; box-shadow:3px 3px grey">
            ANN alone: %ann%
          </h2></li>
      <p>
      <li><h2>Ensemble: %ensemble%</h2>
    </ul>

    <!-- Here are onmouseover images and tables - see above -->
    <div style="position:absolute; top:10px; left:500px">
      <!-- See notes above for hoops you have to jump through to get link to Drive image -->
      <img id="image1" style="display:none" src='https://raw.githubusercontent.com/attajunyah/cis523/main/pipeline.png' height='400'>

      <!-- Below filled in with html table code when server starts up. They come from csv threshold files from your github. -->
      <div id="xgb" class='table-wrapper'>
        <div class="table1">
          <center><h2>XGBoost Threshold Table</h2></center>
          %xgb_table%
        </div>
        <div class="table2">
        <h2>XGBoost Lime Explanation</h2>
          %xgb_lime_table%
        </div>
      </div>
      <div id="knn" style='display:none'>
        <div class="table1">
        <center><h2>KNN Threshold Table</h2></center>
          %knn_table%
        </div>
        <div class="table2">
        <h2>KNN Lime Explanation</h2>
          %knn_lime_table%
        </div>
      </div>
      <div id="logreg" style='display:none'>
        <div class="table1">
        <center><h2>Logistic Regression Threshold Table</h2></center>
          %logreg_table%
        </div>
        <div class="table2">
        <h2>Logistic Regression Lime Explanation</h2>
          %logreg_lime_table%
        </div>
      </div>
      <div id="ann" style='display:none'>
        <div class="table1">
        <center><h2>ANN Threshold Table</h2></center>
          %ann_table%
        </div>
        <div class="table2">
        <h2>ANN Lime Explanation</h2>
          %ann_lime_table%
        </div>
      </div>
    </div>



  </body>
  </html>
'''

#These threshold tables do not change so add them now
fpage = fpage.replace('%xgb_table%', xgb_table)
fpage = fpage.replace('%logreg_table%', logreg_table)
fpage = fpage.replace('%knn_table%', knn_table)
fpage = fpage.replace('%ann_table%', ann_table)

#the default Lime info if you do not implement them
fpage = fpage.replace('%xgb_lime_table%', '<center><h2>Work in progress</h2></center>')
fpage = fpage.replace('%knn_lime_table%', '<center><h2>Work in progress</h2></center>')
fpage = fpage.replace('%logreg_lime_table%', '<center><h2>Work in progress</h2></center>')
fpage = fpage.replace('%ann_lime_table%', '<center><h2>Work in progress</h2></center>')

##Ok, we now have our html template in `fpage`

Now need a way to use template to build new page with predictions filled in. I'll use this simple method.


Flask has [a method for filling-in placeholders](https://pythonbasics.org/flask-tutorial-templates/) but I decided to do it myself with simple string replace.

In [30]:
def create_page(page, **fillers):
  new_page = page[:]  #copy
  for k,v in fillers.items():
    new_page = new_page.replace(f'%{str(k)}%', str(v))
  return new_page

#IX. The actual server

I have it threaded so that it will run behind the scenes.

We will also see a url printed in the output cell. This is what we can send to anyone on the planet with a browser and they can hook up to our server.

You want the url that has the form `http://....ngrok.io`.

I am being trusting and letting you use my auth_token. But would appreciate it if you created and used your own. Start here: https://dashboard.ngrok.com/login.

If you ask me questions about the ngrok code, I'll probably not be helpful. I came up with the code I have after multiple searches across stackoverflow. It is not the easiest library to work with.



In [31]:
import threading
!pip install pyngrok
#!ngrok authtoken '22vhRNG7M9pe3XT1t2e33ljMvor_63n7b9zwPtWuADusY8twi'
from pyngrok import ngrok

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting pyngrok
  Downloading pyngrok-5.2.1.tar.gz (761 kB)
[K     |████████████████████████████████| 761 kB 23.4 MB/s 
Building wheels for collected packages: pyngrok
  Building wheel for pyngrok (setup.py) ... [?25l[?25hdone
  Created wheel for pyngrok: filename=pyngrok-5.2.1-py3-none-any.whl size=19792 sha256=8685610c5911b758cf9b09e2c33a59961b93f33e1975d8b8265d2252051308dc
  Stored in directory: /root/.cache/pip/wheels/5d/f2/70/526da675d32f17577ec47ac4c663084efe39d47c826b6c3bb1
Successfully built pyngrok
Installing collected packages: pyngrok
Successfully installed pyngrok-5.2.1


##You need to get your own token!

If you use my token, you/we are limited to one server running at a time. So if I, or one of your classmates, is using my token, you will be locked out.

Here is where to sign up (for free):

https://dashboard.ngrok.com/signup

Follow links to get your own authorization token and replace mine.

In [32]:
ngrok_token = '2IhcuvMtAlzj5iQxAfOX6bj7cPg_2DLvbkj19X3Eyti6o3FtC'

In [33]:
!killall ngrok >/dev/null

os.environ["FLASK_ENV"] = "development"
app = Flask(__name__)
port = 5000
#Setting an auth token allows us to open multiple tunnels at the same time
ngrok.set_auth_token(ngrok_token)
# Open a ngrok tunnel to the HTTP server
connection = ngrok.connect(port)
print(f'Connection: {connection}')
public_url = connection if isinstance(connection, str) else connection.public_url

# Update any base URLs to use the public ngrok URL
app.config["BASE_URL"] = public_url

# Define Flask routes
@app.route("/")
#This function called when user first enters url into browser
def home():
    return create_page(fpage, xgb='', knn='', logreg='', ann='', ensemble='', row_data='')

@app.route('/data', methods = ['POST'])
#This function called when user hits Evaluate button
def data():
  form_data = request.form
  new_row, yhat_xgb, yhat_knn, yhat_logreg, yhat_ann = handle_data(form_data.to_dict(),test_df, the_transformer)  #calling my own function here
  ensemble = (yhat_xgb[0]+yhat_knn[0]+yhat_logreg[0]+yhat_ann[0])/4.0
  xgb = np.round(yhat_xgb[0], 2)
  knn = np.round(yhat_knn[0], 2)
  logreg = np.round(yhat_logreg[0], 2)
  ann = np.round(yhat_ann[0], 2)
  ensemble = np.round(ensemble, 2)

  

  return create_page(fpage, xgb=xgb, knn=knn, logreg=logreg, ann=ann, ensemble=ensemble, row_data=str(form_data.to_dict()))


# Start the Flask server in a new thread
threading.Thread(target=app.run, kwargs={"use_reloader": False}).start()


ngrok: no process found
Connection: NgrokTunnel: "http://537a-35-236-209-181.ngrok.io" -> "http://localhost:5000"
 * Serving Flask app "library" (lazy loading)


In [36]:
print_debugging()

12:31:38

from html: {'age_field': '', 'income_field': '', 'emp_len': '', 'intent': 'unknown', 'grade': 'unknown', 'amount': '', 'interest': '', 'income_ratio': '', 'hist_def': 'unknown', 'cred_hist': ''}
=====
test_df: Index(['Age', 'Income', 'Home_Status', 'Employment_Length', 'Loan_Intent',
       'Loan_Grade', 'Loan_Amount', 'Interest_Rate', 'Loan_Income_Percent',
       'Historical_Default', 'Credit_History'],
      dtype='object')
=====
12:32:46

from html: {'age_field': '35', 'income_field': '110000', 'emp_len': '9.0', 'intent': 'DEBTCONSOLIDATION', 'grade': 'C', 'amount': '25000', 'interest': '13.49', 'income_ratio': '0.59', 'hist_def': '0', 'cred_hist': '8'}
=====
test_df: Index(['Age', 'Income', 'Home_Status', 'Employment_Length', 'Loan_Intent',
       'Loan_Grade', 'Loan_Amount', 'Interest_Rate', 'Loan_Income_Percent',
       'Historical_Default', 'Credit_History'],
      dtype='object')
=====
12:36:26

from html: {'age_field': '22', 'income_field': '77100', 'emp_len': '8.0'

In [35]:
delete_debugging()

debugging.txt does not exist


INFO:werkzeug: * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
