
# How to pickle a model and use it in Streamlit
Tutorial by *Leonie Flueckiger* \
This tutorial shows how to pickle a trained model and how to load the pickled model in a Streamlit app.

This tutorial is inspired by and uses extracts from the following tutorials: \
https://medium.com/analytics-vidhya/deploy-your-first-end-to-end-ml-model-using-streamlit-51cc486e84d7 by Apurva Sharma\
https://betterprogramming.pub/pickling-machine-learning-models-aeb474bc2d78 by Arian Deore

Create an example dataset

In [1]:
import pandas as pd

# Create example dataframe
df = pd.DataFrame({'Number of trees': [600, 700, 800, 1200, 1100, 900],
                   'Money invested': [10000, 20000, 30000, 100000, 80000, 50000],
                   'Carbon Sequestered': [12, 20, 36, 90, 86, 55]})
df.head(6)

| | Number of trees | Money invested | Carbon Sequestered |
|---|---|---|---|
| 0 | 600 | 10000 | 12 |
| 1 | 700 | 20000 | 20 |
| 2 | 800 | 30000 | 36 |
| 3 | 1200 | 100000 | 90 |
| 4 | 1100 | 80000 | 86 |
| 5 | 900 | 50000 | 55 |


Plot the example

In [2]:
import plotly.express as px

# plot dataframe
fig = px.scatter(df, x='Number of trees', y='Carbon Sequestered')
fig.show()

Split the dataframe into train and test datasets

In [3]:
# split into train and test sets
from sklearn.model_selection import train_test_split

X = df[['Number of trees','Money invested']]
y = df['Carbon Sequestered']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=2)
X_train.head()

| | Number of trees | Money invested |
|---|---|---|
| 3 | 1200 | 100000 |
| 2 | 800 | 30000 |
| 5 | 900 | 50000 |
| 0 | 600 | 10000 |
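
With only six rows, it helps to see why the training set above ends up with four of them. The sketch below reproduces the size arithmetic as I understand scikit-learn to apply it for a fractional `test_size` (the test share is rounded up, the remainder goes to training). The helper name `split_sizes` is hypothetical and not part of scikit-learn.

```python
import math


def split_sizes(n_samples, test_size):
    """Return (n_train, n_test) for a fractional test_size."""
    # The test share is rounded up to a whole number of rows.
    n_test = math.ceil(test_size * n_samples)
    # Whatever is left is used for training.
    n_train = n_samples - n_test
    return n_train, n_test


n_train, n_test = split_sizes(n_samples=6, test_size=0.33)
print(n_train, n_test)  # 4 2
```

Two test rows is very little, so the test score in the next step should be read as a smoke test rather than a reliable estimate.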


Fit a Linear Regression

In [4]:
from sklearn.linear_model import LinearRegression

reg = LinearRegression().fit(X_train, y_train)
print(f"Test score {reg.score(X_test, y_test)}")
print(f"Train score {reg.score(X_train, y_train)}")

Test score 0.9551882460973372
Train score 0.9944662208900161
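
For a regressor, `score` returns the coefficient of determination R², i.e. one minus the ratio of the residual sum of squares to the total sum of squares. The following sketch computes that quantity by hand with NumPy. The two target values are the held-out rows from the split (rows 1 and 4 of the dataframe); the predictions are made-up numbers for illustration only, not the model's actual output, and `r_squared` is a hypothetical helper name.

```python
import numpy as np


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total variation
    return 1.0 - ss_res / ss_tot


# Held-out targets from the split (rows 1 and 4 of df).
y_test_values = [20, 86]
# Hypothetical predictions, chosen only to illustrate the formula.
y_pred_values = [25, 80]

print(r_squared(y_test_values, y_pred_values))
```

A value of 1.0 means perfect predictions; values near 0 mean the model does no better than always predicting the mean.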


Pickle your model

In [5]:
import pickle

# Create Pickle file from the Linear Regression Model
with open('lineraregression.pickle', 'wb') as dump_var:
    pickle.dump(reg, dump_var)

Check that the file was written to the Colab sandbox

In [6]:
!ls

lineraregression.pickle  sample_data
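
Seeing the file on disk does not yet show that the model survives the round trip. A minimal sketch of such a check follows, assuming the same toy data as above written out as plain arrays; it serializes in memory with `pickle.dumps` / `pickle.loads` rather than to a file, and the variable names are illustrative.

```python
import pickle

import numpy as np
from sklearn.linear_model import LinearRegression

# Same toy data as the tutorial's dataframe, as plain arrays.
X = np.array([[600, 10000], [700, 20000], [800, 30000],
              [1200, 100000], [1100, 80000], [900, 50000]], dtype=float)
y = np.array([12, 20, 36, 90, 86, 55], dtype=float)

model = LinearRegression().fit(X, y)

# Serialize to bytes and restore, without touching the disk.
restored = pickle.loads(pickle.dumps(model))

# The restored model should reproduce the original predictions.
same = np.allclose(model.predict(X), restored.predict(X))
print("round trip ok:", same)
```

Two caveats apply to pickled models in general: unpickling can execute arbitrary code, so only load pickle files you trust, and a model is best loaded with the same scikit-learn version that created it.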


### Create your streamlit App

In [7]:
!pip install streamlit
!pip install pyngrok==4.1.1 

Collecting streamlit
  Downloading streamlit-1.0.0-py2.py3-none-any.whl (8.3 MB)
Collecting base58
  Downloading base58-2.1.0-py3-none-any.whl (5.6 kB)
Collecting gitpython!=3.1.19
  Downloading GitPython-3.1.24-py3-none-any.whl (180 kB)
Collecting validators
  Downloading validators-0.18.2-py3-none-any.whl (19 kB)
Collecting blinker
  Downloading blinker-1.4.tar.gz (111 kB)
Collecting watchdog
  Downloading watchdog-2.1.6-py3-none-manylinux2014_x86_64.whl (76 kB)
Collecting pydeck>=0.1.dev5
  Downloading pydeck-0.7.0-py2.py3-none-any.whl (4.3 MB)
Collecting gitdb<5,>=4.0.1
  Downloading gitdb-4.0.7-py3-none-any.whl (63 kB)
...

Collecting pyngrok==4.1.1
  Downloading pyngrok-4.1.1.tar.gz (18 kB)
Building wheels for collected packages: pyngrok
  Building wheel for pyngrok (setup.py) ... done
  Created wheel for pyngrok: filename=pyngrok-4.1.1-py3-none-any.whl size=15984 sha256=e9330f19694e335e9cb2aa8ae893124a25e8646f08d19736094e64f61070f4db
  Stored in directory: /root/.cache/pip/wheels/b1/d9/12/045a042fee3127dc40ba6f5df2798aa2df38c414bf533ca765
Successfully built pyngrok
Installing collected packages: pyngrok
Successfully installed pyngrok-4.1.1


Make a simple input page for your app

In [8]:
%%writefile pickle_app.py 
# write app to sandbox

import streamlit as st
import numpy as np

# define your app content
def main():

  # st.columns replaces the deprecated st.beta_columns
  colA1, colA2 = st.columns(2)
  with colA1:
    st.info("Number of trees")
    trees_amount = st.number_input("How many trees are planted", value=500, step=100)

  with colA2:
    st.info("Money invested")
    investment_amount = st.number_input("currency is USD", value=50000, step=10000)

  # one row with two features; named model_input to avoid shadowing the built-in input()
  model_input = np.array([[trees_amount, investment_amount]]).astype(np.float64)


# execute the main function
if __name__ == '__main__':
  main()

Writing pickle_app.py


Run your app

In [9]:
# Start the streamlit app to run in the background
!streamlit run pickle_app.py &>/dev/null&

In [10]:
from pyngrok import ngrok

# Set up a tunnel to the streamlit port 8501
public_url = ngrok.connect(port='8501')

# Print URL to open it in new browser window
public_url



'http://dbbb-35-229-120-224.ngrok.io'

Load the pickle file into your app and make predictions from the inputs. \
The prediction updates automatically whenever you change an input.

In [11]:
%%writefile pickle_app.py 
# write app to sandbox

import streamlit as st
import pickle
import numpy as np

# define your app content
def main():

  # st.columns replaces the deprecated st.beta_columns
  colA1, colA2 = st.columns(2)
  with colA1:
    st.info("Number of trees")
    trees_amount = st.number_input("How many trees are planted", value=500, step=100)

  with colA2:
    st.info("Money invested")
    investment_amount = st.number_input("currency is USD", value=50000, step=10000)

  # one row with two features, in the same order as the training columns
  model_input = np.array([[trees_amount, investment_amount]]).astype(np.float64)

  # NEW CODE STARTS HERE

  # Load the pickled model; the with-block closes the file afterwards
  with open('lineraregression.pickle', 'rb') as pickle_in:
    pickle_model = pickle.load(pickle_in)

  # predict() returns an array with one value per input row
  prediction = pickle_model.predict(model_input)

  st.info("Carbon Sequestered")
  st.write(prediction)

  # NEW CODE ENDS HERE


# execute the main function
if __name__ == '__main__':
  main()

Overwriting pickle_app.py
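
The model was fitted on a dataframe with named columns, while the app passes a bare NumPy array. To my knowledge, scikit-learn 1.0 and later emit a warning about missing feature names in that situation, although the prediction itself is unaffected. One possible way around it, sketched here, is to build a one-row dataframe with the training column names. `FEATURES` and `make_input` are hypothetical names introduced for this illustration.

```python
import pandas as pd

# Column names must match those used for training, in the same order.
FEATURES = ["Number of trees", "Money invested"]


def make_input(trees_amount, investment_amount):
    """Build a one-row frame in the layout the model was fitted on."""
    return pd.DataFrame([[trees_amount, investment_amount]],
                        columns=FEATURES, dtype="float64")


row = make_input(500, 50000)
print(row.shape)  # (1, 2): one sample, two features
```

In the app, `pickle_model.predict(make_input(trees_amount, investment_amount))` would then take the place of the call on the NumPy array.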


Shut down streamlit and ngrok

In [12]:
# find the process ID (PID) of the running streamlit process
!pgrep streamlit

145


In [13]:
# kill the streamlit process; use the PID printed by pgrep above
!kill 145

In [14]:
# shutdown ngrok from python
ngrok.kill()