## Building and Deploying a Streamlit App for Machine Learning
In this tutorial, we walk through how you can build and deploy a visual front-end to your machine learning applications. 
We will use [streamlit](https://docs.streamlit.io/library/get-started/main-concepts), which is an easy-to-use framework for creating interactive visualizations and applications in Python (this is in contrast to [plotly dash](https://dash.plotly.com) which, although technically Python, relies a lot on HTML and CSS knowledge).

### Installing Streamlit and Making a Virtual Environment 
When we make a new front-end application, we want to use a virtual environment. [Pipenv](https://pipenv-fork.readthedocs.io/en/latest/) will be our tool of choice. Pipenv handles both dependency management and virtual environments in a single tool.

First, move into the project directory:

`cd my_project`

Then, run:

`pipenv shell`

This activates the project's virtual environment, creating it first if it does not exist yet.

That's all you have to do! 

After that, when you install packages via `pipenv install [package_name]`, they are recorded in your `Pipfile` and pinned in `Pipfile.lock` automatically. No need to maintain a requirements file by hand!

Note: if you are using PyCharm, it can be helpful to follow [these steps](https://stackoverflow.com/questions/46251411/how-do-i-properly-setup-pipenv-in-pycharm) to make sure PyCharm runs your code inside the virtual environment.

### Starting up the App
Open the file entitled `streamlit_app.py`.

Here you have a basic skeleton of an application. This code will simply display a table with 2 columns.


<img src="./images/code_snippet_1.png" width="300"/>
Code of the basic app.


We run the app using the following command:

`streamlit run streamlit_app.py`

This command opens our streamlit app in a new browser tab. The great thing about streamlit is that it detects when we save the file and offers to rerun the app; choosing "Always rerun" keeps the app updated automatically.

Running the command, we see the following output in the browser:
<img src="./images/app_screenshot_1.png" width="300"/>


### Working with real data 
Here, we will import the actual Telco churn data and use it as an example for making a streamlit application to deploy. 

In [None]:
import streamlit as st
import pandas as pd

# importing customer churn data
data = pd.read_csv("./data/WA_Fn-UseC_-Telco-Customer-Churn.csv", index_col=0)
st.write("Churn Data")
st.write(data)

Note: look at what happens when you replace the line
`st.write(data)` in the code above with `st.table(data)`.

Now, our streamlit app looks like this:

<img src="./images/app_screenshot_2.png" width="300"/>

Note that it shows the entire dataframe, and we can even move around to explore the columns and rows.

### Adding Elements
Now, we will explore adding more elements to our app and making some visualizations. 

We'll start the way every new data science project should start: by visualizing the target.

Add the code in the next cell to your streamlit app, underneath the dataset visualization. You'll now see a nice bar chart. 

<img src="./images/app_screenshot_3.png" width="300"/>

In [None]:
st.write("How many customers in the dataset churned?")
target_bins = data.loc[:, 'Churn'].value_counts()
st.bar_chart(target_bins)
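
To see what `value_counts()` hands to the bar chart, the sketch below runs it on a short hypothetical list of labels instead of the real `Churn` column. The `normalize=True` variant is an optional extra that returns class shares rather than counts.

```python
import pandas as pd

# Hypothetical labels standing in for data.loc[:, 'Churn'].
churn = pd.Series(["No", "Yes", "No", "No", "Yes", "No"], name="Churn")

# Absolute number of rows per class: this is what the bar chart plots.
target_bins = churn.value_counts()

# Fraction of rows per class, useful for spotting class imbalance.
target_share = churn.value_counts(normalize=True)

print(target_bins)
print(target_share.round(2))
```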

### Data Analysis

In [5]:
import pandas as pd
import matplotlib.pyplot as plt

In [4]:
data = pd.read_csv("./data/WA_Fn-UseC_-Telco-Customer-Churn.csv", index_col=0)

In [11]:
data.columns

Index(['gender', 'SeniorCitizen', 'Partner', 'Dependents', 'tenure',
       'PhoneService', 'MultipleLines', 'InternetService', 'OnlineSecurity',
       'OnlineBackup', 'DeviceProtection', 'TechSupport', 'StreamingTV',
       'StreamingMovies', 'Contract', 'PaperlessBilling', 'PaymentMethod',
       'MonthlyCharges', 'TotalCharges', 'Churn'],
      dtype='object')

### Training Model

In [16]:
# generating test data
from sklearn.model_selection import train_test_split

In [18]:
X = data.drop(columns="Churn")
y = data.loc[:, "Churn"]

In [31]:
X_train, X_test, Y_train, Y_test = train_test_split(X, y)

In [32]:
X_test.shape

(1761, 19)

In [33]:
Y_test.shape

(1761,)
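
The 1761 test rows follow from the default behaviour of `train_test_split`, which holds out 25% of the rows (rounded up) when no `test_size` is given. The sketch below reproduces those sizes on synthetic data with 7043 rows, the sum of the 5282 training and 1761 test rows reported in this tutorial's outputs. The `random_state` and `stratify` arguments are optional additions, not something the cell above uses: they make the split reproducible and keep the churn ratio similar in both parts.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the churn data (assumption: 7043 rows in total).
n_rows = 7043
X_demo = pd.DataFrame({"feature": np.arange(n_rows)})
y_demo = pd.Series(
    np.where(np.arange(n_rows) % 4 == 0, "Yes", "No"), name="Churn"
)

# Default test_size is 0.25; random_state fixes the shuffle,
# stratify keeps the class balance of y_demo in both splits.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, random_state=42, stratify=y_demo
)

print(X_tr.shape, X_te.shape)
```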

In [38]:
# Further splitting the test set into holdout and validation sets
training_data = pd.concat([X_train, Y_train], axis=1)
full_test_data = pd.concat([X_test, Y_test], axis=1)
holdout_data = full_test_data.iloc[:1000, :]
val_data = full_test_data.iloc[1000:, :]

In [39]:
training_data.shape

(5282, 20)

In [40]:
val_data.shape

(761, 20)

In [41]:
holdout_data.shape

(1000, 20)
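
The positional split above can be wrapped in a small helper, sketched here on a synthetic frame. The function name `split_holdout_and_validation` is hypothetical and only illustrates the slicing logic: the first `n_holdout` rows become the holdout set and the remainder becomes the validation set.

```python
import pandas as pd


def split_holdout_and_validation(test_frame, n_holdout):
    """Hypothetical helper: slice a test frame by position into two parts."""
    holdout = test_frame.iloc[:n_holdout, :]
    validation = test_frame.iloc[n_holdout:, :]
    return holdout, validation


# Synthetic frame with as many rows as the tutorial's test set.
demo_test = pd.DataFrame({"x": range(1761)})
holdout, validation = split_holdout_and_validation(demo_test, 1000)

print(holdout.shape, validation.shape)
```

Because the two slices share no positions, no row can end up in both sets.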

In [43]:
training_data.to_csv('./data/training_data.csv')
holdout_data.to_csv('./data/holdout_data.csv')
val_data.to_csv('./data/validation_data.csv')

In [44]:
single_row = training_data.iloc[:1,:]

In [46]:
single_row.to_csv("./data/single_row_to_check.csv")
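
When these files are read back, the `customerID` index is stored as the first CSV column, so passing `index_col=0` (as in the original `read_csv` call) restores it. The sketch below shows the round trip with an in-memory buffer and a hypothetical two-row frame rather than the real files.

```python
import io

import pandas as pd

# Hypothetical frame indexed by customerID; IDs and values are made up.
frame = pd.DataFrame(
    {"tenure": [70, 5], "Churn": ["No", "Yes"]},
    index=pd.Index(["0001-AAAAA", "0002-BBBBB"], name="customerID"),
)

# to_csv writes the index as the first column by default.
buffer = io.StringIO()
frame.to_csv(buffer)
buffer.seek(0)

# index_col=0 turns that first column back into the index.
restored = pd.read_csv(buffer, index_col=0)
print(restored)
```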

In [48]:
customer_data = X_test

In [49]:
customer = "4393-GEADV"

In [51]:
customer_data.reset_index(inplace=True)

In [53]:
customer_data.loc[customer_data.loc[:, 'customerID']==customer]

Unnamed: 0,customerID,gender,SeniorCitizen,Partner,Dependents,tenure,PhoneService,MultipleLines,InternetService,OnlineSecurity,OnlineBackup,DeviceProtection,TechSupport,StreamingTV,StreamingMovies,Contract,PaperlessBilling,PaymentMethod,MonthlyCharges,TotalCharges
5,4393-GEADV,Male,0,Yes,Yes,70,Yes,Yes,Fiber optic,Yes,Yes,Yes,Yes,Yes,Yes,Two year,No,Credit card (automatic),114.75,7842.3
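
Note that `customer_data = X_test` does not copy the frame, so `reset_index(inplace=True)` also changes `X_test`. The sketch below shows two ways to look up a customer that leave the original frame untouched. It uses a hypothetical frame with made-up IDs, and neither option is necessarily what the tutorial's app does.

```python
import pandas as pd

# Hypothetical frame indexed by customerID; IDs and values are made up.
customers = pd.DataFrame(
    {"tenure": [12, 70, 3], "Contract": ["One year", "Two year", "Month-to-month"]},
    index=pd.Index(["0001-AAAAA", "0002-BBBBB", "0003-CCCCC"], name="customerID"),
)
customer = "0002-BBBBB"

# Option 1: look the customer up directly through the index.
by_index = customers.loc[[customer]]

# Option 2: reset_index() without inplace returns a new frame,
# which can then be filtered with a boolean mask.
flat = customers.reset_index()
by_mask = flat.loc[flat["customerID"] == customer]

print(by_index)
print(by_mask)
```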
