# Streamlit

**OBJECTIVES**

- Save objects with the `pickle` module
- Save `sklearn` models as `.pkl` files
- Build and deploy basic Streamlit applications

In [1]:
import warnings
warnings.filterwarnings('ignore')

## Serialization and `pickle`

One way to write Python objects to disk is to serialize them, that is, convert them to a byte stream. The standard-library `pickle` module does this, though other options exist. **Note**: Pickle files are not secure. Unpickling can execute arbitrary code, so never load a pickle file from a source you do not trust.

In [2]:
import pickle

In [3]:
v1 = [1, 2, 3]

In [4]:
with open('simple_list.pkl', 'wb') as f:
    pickle.dump(v1, f)

In [5]:
with open('simple_list.pkl', 'rb') as f:
    v2 = pickle.load(f)

In [6]:
v2

[1, 2, 3]
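The security note above can be made concrete with a small, harmless sketch. The class name `Sneaky` is hypothetical. It uses the `__reduce__` hook to tell `pickle` which callable to run at load time, here the benign `os.getcwd`. A malicious file could name any other importable function in the same way.

```python
import os
import pickle


class Sneaky:
    """Hypothetical class used only to illustrate the risk."""

    def __reduce__(self):
        # Instead of saving this object's state, ask pickle to call
        # os.getcwd() with no arguments when the bytes are loaded.
        return (os.getcwd, ())


payload = pickle.dumps(Sneaky())

# Loading does not rebuild a Sneaky instance; it runs the stored callable
# and returns whatever that callable returns.
result = pickle.loads(payload)
print(type(result), result)
```

The loaded value is a string containing the working directory rather than a `Sneaky` object, which shows that the author of a pickle file decides what code runs when it is opened.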

### Example: Regression Model

Below we build and save a pipeline to share with our Streamlit app. `Pipeline` objects bundle the input transformations with the model, which makes prediction on raw inputs easy, and they can be pickled. Because of the security issues with `pickle`, consider an alternative such as [skops](https://skops.readthedocs.io/en/stable/index.html) if you need to load models from unknown sources.

In [None]:
from sklearn.datasets import fetch_openml
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.compose import make_column_transformer

houses = fetch_openml(data_id = 43926)
data = houses.frame
# data.head()
X = data[['Gr_Liv_Area', 'Overall_Qual', 'Sale_Condition', 'Lot_Area']]
y = data['Sale_Price']
transformer = make_column_transformer((OneHotEncoder(), X.select_dtypes('category').columns.tolist()),
                                      remainder = 'passthrough')
model = LinearRegression()
pipeline = Pipeline([('transformer', transformer), ('model', model)])
pipeline.fit(X, y)

In [16]:
X.head()

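|   | Gr_Liv_Area | Overall_Qual | Sale_Condition | Lot_Area |
|---|-------------|---------------|----------------|----------|
| 0 | 1656 | Above_Average | Normal | 31770 |
| 1 | 896 | Average | Normal | 11622 |
| 2 | 1329 | Above_Average | Normal | 14267 |
| 3 | 2110 | Good | Normal | 11160 |
| 4 | 1629 | Average | Normal | 13830 |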


In [17]:
with open('lr_model.pkl', 'wb') as f:
    pickle.dump(pipeline, f)
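Before handing a pickled model to another program, it can help to check that the saved file reproduces the in-memory predictions. The sketch below does this on a tiny made-up frame that borrows the column names used above. The values, the `toy_` names, and the temporary file name are assumptions for illustration, not the real housing data.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Made-up stand-in for the housing data; values are illustrative only
toy_X = pd.DataFrame({
    'Gr_Liv_Area': [1500, 900, 1300, 2100, 1600, 1200],
    'Overall_Qual': pd.Categorical(['Good', 'Average', 'Good', 'Good', 'Average', 'Average']),
    'Sale_Condition': pd.Categorical(['Normal', 'Normal', 'Partial', 'Normal', 'Family', 'Partial']),
    'Lot_Area': [30000, 11000, 14000, 11500, 13500, 9000],
})
toy_y = pd.Series([210000, 100000, 170000, 240000, 190000, 150000])

# Same pipeline structure as above: one-hot encode categoricals, pass the rest through
cat_cols = toy_X.select_dtypes('category').columns.tolist()
transformer = make_column_transformer((OneHotEncoder(), cat_cols), remainder='passthrough')
toy_pipeline = Pipeline([('transformer', transformer), ('model', LinearRegression())])
toy_pipeline.fit(toy_X, toy_y)

# Round trip through a temporary .pkl file
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / 'toy_model.pkl'
    with open(path, 'wb') as f:
        pickle.dump(toy_pipeline, f)
    with open(path, 'rb') as f:
        restored = pickle.load(f)

original_pred = toy_pipeline.predict(toy_X)
restored_pred = restored.predict(toy_X)
print(np.allclose(original_pred, restored_pred))
```

The restored object is a complete fitted `Pipeline`, so it accepts the same raw columns, strings included, as the original.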

In [18]:
X['Overall_Qual'].unique()

['Above_Average', 'Average', 'Good', 'Very_Good', 'Excellent', 'Below_Average', 'Fair', 'Poor', 'Very_Excellent', 'Very_Poor']
Categories (10, object): ['Above_Average', 'Average', 'Below_Average', 'Excellent', ..., 'Poor', 'Very_Excellent', 'Very_Good', 'Very_Poor']

In [19]:
X['Sale_Condition'].unique()

['Normal', 'Partial', 'Family', 'Abnorml', 'Alloca', 'AdjLand']
Categories (6, object): ['Abnorml', 'AdjLand', 'Alloca', 'Family', 'Normal', 'Partial']
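The category lists printed above are what an app needs for its dropdown choices. As an alternative to copying them by hand, they can be read back from the fitted encoder. The sketch below assumes the pipeline was built with `make_column_transformer`, which names the step `'onehotencoder'` automatically. The toy data and the `options` dictionary are illustrative assumptions.

```python
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Made-up stand-in for the housing data; values are illustrative only
toy_X = pd.DataFrame({
    'Gr_Liv_Area': [1500, 900, 1300, 2100, 1600, 1200],
    'Overall_Qual': pd.Categorical(['Good', 'Average', 'Good', 'Good', 'Average', 'Average']),
    'Sale_Condition': pd.Categorical(['Normal', 'Normal', 'Partial', 'Normal', 'Family', 'Partial']),
    'Lot_Area': [30000, 11000, 14000, 11500, 13500, 9000],
})
toy_y = pd.Series([210000, 100000, 170000, 240000, 190000, 150000])

cat_cols = toy_X.select_dtypes('category').columns.tolist()
transformer = make_column_transformer((OneHotEncoder(), cat_cols), remainder='passthrough')
toy_pipeline = Pipeline([('transformer', transformer), ('model', LinearRegression())])
toy_pipeline.fit(toy_X, toy_y)

# The fitted encoder stores one array of categories per encoded column,
# in the same order as the columns it was given
encoder = toy_pipeline.named_steps['transformer'].named_transformers_['onehotencoder']
options = {col: cats.tolist() for col, cats in zip(cat_cols, encoder.categories_)}
print(options)
```

Reading the choices from the fitted pipeline keeps the app's dropdowns in step with the categories the model actually saw during training.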

### Moving the model to Streamlit

Now we will create a basic application using the [Streamlit](https://docs.streamlit.io/) library. To do so, we will first create a virtual environment for the project. The rest of the work happens in VS Code, where we write the app script shown below.

```python
import streamlit as st
import pickle
import pandas as pd
# Regression model: rebuild the same pipeline as in the notebook
from sklearn.datasets import fetch_openml
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.compose import make_column_transformer

houses = fetch_openml(data_id = 43926)
data = houses.frame
X = data[['Gr_Liv_Area', 'Overall_Qual', 'Sale_Condition', 'Lot_Area']]
y = data['Sale_Price']
transformer = make_column_transformer((OneHotEncoder(), X.select_dtypes('category').columns.tolist()),
                                      remainder = 'passthrough')
model = LinearRegression()
pipeline = Pipeline([('transformer', transformer), ('model', model)])
pipeline.fit(X, y)

# User interface
st.header('Regression App')

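gr_area = st.number_input('What is the above ground living area:')
# st.slider defaults to a 0-100 range, far too small for lot areas in this data.
# The bounds below are illustrative assumptions; adjust them to your data.
lot_area = st.slider('What is the total lot area:', min_value=1000, max_value=50000, value=10000)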
over_qual = st.selectbox('What was the quality?', 
                         ('Above_Average', 'Average', 'Good', 'Very_Good', 'Excellent', 'Below_Average', 'Fair', 'Poor', 'Very_Excellent', 'Very_Poor'))
sale_cond = st.selectbox("Condition at sale?",
                         ('Normal', 'Partial', 'Family', 'Abnorml', 'Alloca', 'AdjLand'))
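# Alternative: load the pipeline pickled in the notebook instead of refitting above
# with open('lr_model.pkl', 'rb') as f:
#     pipeline = pickle.load(f)

# Collect the inputs into a one-row DataFrame with the training column names
X = pd.DataFrame({'Gr_Liv_Area': gr_area,
                  'Overall_Qual': over_qual,
                  'Sale_Condition': sale_cond,
                  'Lot_Area': lot_area}, index = [0])

# Predict with the full pipeline so the one-hot encoding is applied;
# the bare `model` step cannot handle the string columns
pred = pipeline.predict(X)
st.write(pred)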
```
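One possible refinement, offered as a sketch rather than as part of the app above, is to move the prediction step into a plain function so it can be checked without launching Streamlit. The helper name `predict_price`, the `FEATURES` list, and the toy training data are assumptions for illustration.

```python
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Column order used when the pipeline was fitted
FEATURES = ['Gr_Liv_Area', 'Overall_Qual', 'Sale_Condition', 'Lot_Area']


def predict_price(pipeline, gr_area, over_qual, sale_cond, lot_area):
    """Hypothetical helper: build a one-row frame and return one prediction."""
    row = pd.DataFrame({'Gr_Liv_Area': [gr_area],
                        'Overall_Qual': [over_qual],
                        'Sale_Condition': [sale_cond],
                        'Lot_Area': [lot_area]})
    return float(pipeline.predict(row[FEATURES])[0])


# Made-up stand-in for the housing data; values are illustrative only
toy_X = pd.DataFrame({
    'Gr_Liv_Area': [1500, 900, 1300, 2100, 1600, 1200],
    'Overall_Qual': pd.Categorical(['Good', 'Average', 'Good', 'Good', 'Average', 'Average']),
    'Sale_Condition': pd.Categorical(['Normal', 'Normal', 'Partial', 'Normal', 'Family', 'Partial']),
    'Lot_Area': [30000, 11000, 14000, 11500, 13500, 9000],
})
toy_y = pd.Series([210000, 100000, 170000, 240000, 190000, 150000])

cat_cols = toy_X.select_dtypes('category').columns.tolist()
transformer = make_column_transformer((OneHotEncoder(), cat_cols), remainder='passthrough')
toy_pipeline = Pipeline([('transformer', transformer), ('model', LinearRegression())])
toy_pipeline.fit(toy_X, toy_y)

price = predict_price(toy_pipeline, 1400, 'Good', 'Normal', 12000)
print(price)
```

In the app, the widget values would be passed to this function in place of the toy arguments, and the result handed to `st.write`.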