<a href="https://colab.research.google.com/github/iPrinka/MITx-Micromasters-Statistics-Data-Science/blob/main/one_league_deep_learning_w13_intro_streamlit_github.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


# Streamlit 

Streamlit is an opinionated library for serving models and visualizations in pure Python. It is interactive and quick to deploy.

## Learning objectives

- Create a Streamlit app to serve models and visualizations
- Use Plotly Express to create interactive visualizations
- Deploy your Streamlit app on Streamlit Community Cloud

## Install packages

In your terminal, list the packages in your virtual environment (for example, with `pip list`).

If you don't see `streamlit` and `plotly`, install them:

`pip install streamlit plotly`
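
The same check can be done from Python. This is a minimal sketch that uses only the standard library; the helper name `missing_packages` is hypothetical and not part of any package.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the names from `names` that cannot be imported in this environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(["streamlit", "plotly"])
if missing:
    print("Install with: pip install " + " ".join(missing))
else:
    print("streamlit and plotly are already installed")
```

`find_spec` only looks the module up and does not import it, so the check is fast even for large packages.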

## Create a Streamlit app

https://streamlit.io/


1. Arrange your screen(s) so you can see a browser window and your text editor side by side.
2. Open your text editor.
3. Navigate to the folder where you want to create your app.
4. Make a new Python file named `my_app.py`.
5. In the file, add `import streamlit as st` and save the file.
6. In the terminal, in the same directory as your `my_app.py` file, execute `streamlit run my_app.py`.
    - You are now serving your app locally.
    - You should see a browser window pop up.
    - After changing your code, save the file and rerun the app from the browser to see the update.
    - In Streamlit, your code runs from top to bottom every time a user interacts with the app.
7. If you want to deploy your app to the web, push it to your personal GitHub account with a `requirements.txt` file, connect that repo to Streamlit's servers, and follow the [Streamlit Community Cloud instructions](https://docs.streamlit.io/streamlit-cloud/community).



In [None]:
# pip install streamlit plotly

In [1]:
import plotly.express as px

In [2]:
px.data.iris().head()

Unnamed: 0,sepal_length,sepal_width,petal_length,petal_width,species,species_id
0,5.1,3.5,1.4,0.2,setosa,1
1,4.9,3.0,1.4,0.2,setosa,1
2,4.7,3.2,1.3,0.2,setosa,1
3,4.6,3.1,1.5,0.2,setosa,1
4,5.0,3.6,1.4,0.2,setosa,1


In [3]:
iris = px.data.iris()

In [4]:
iris.head()

Unnamed: 0,sepal_length,sepal_width,petal_length,petal_width,species,species_id
0,5.1,3.5,1.4,0.2,setosa,1
1,4.9,3.0,1.4,0.2,setosa,1
2,4.7,3.2,1.3,0.2,setosa,1
3,4.6,3.1,1.5,0.2,setosa,1
4,5.0,3.6,1.4,0.2,setosa,1


In [5]:
X = iris.drop(['species', 'species_id'], axis=1)

In [24]:
y = iris["species"]

In [7]:
iris.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 150 entries, 0 to 149
Data columns (total 6 columns):
 #   Column        Non-Null Count  Dtype  
---  ------        --------------  -----  
 0   sepal_length  150 non-null    float64
 1   sepal_width   150 non-null    float64
 2   petal_length  150 non-null    float64
 3   petal_width   150 non-null    float64
 4   species       150 non-null    object 
 5   species_id    150 non-null    int64  
dtypes: float64(4), int64(1), object(1)
memory usage: 7.2+ KB


**PROBLEM 1**: Build a classification model to predict species

In [8]:
import numpy as np
import pandas as pd

In [9]:
from sklearn.linear_model import Perceptron, SGDClassifier, LogisticRegression
from sklearn.svm import LinearSVC, SVC
from sklearn.model_selection import train_test_split

In [16]:
# Default split: 75% train, 25% test. No random_state is set, so the split changes between runs.
X_train, X_test, Y_train, Y_test = train_test_split(X, y)

In [17]:
# Three linear classifiers with default settings
lr = LogisticRegression(random_state=42)
sgd = SGDClassifier(random_state=42)
perc = Perceptron(random_state=42)

# Fit each model on the training split
lr.fit(X_train, Y_train)
sgd.fit(X_train, Y_train)
perc.fit(X_train, Y_train)

# Accuracy on the held-out test split
print(lr.score(X_test, Y_test))
print(sgd.score(X_test, Y_test))
print(perc.score(X_test, Y_test))

0.9736842105263158
0.868421052631579
0.868421052631579


STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(


In [25]:
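# Refit on all 150 rows for deployment.
# The score below is training accuracy, not a held-out estimate.
clf = LogisticRegression(random_state=42)
clf.fit(X, y)
clf.score(X, y)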

0.9733333333333334

**PROBLEM 2**: Save the model using `pickle`

In [19]:
import pickle

In [20]:
# Warm-up: a plain Python list to round-trip through pickle
a = [1, 2, 3, 4]

In [22]:
with open('model.pkl', 'wb') as f:
  pickle.dump(a, f)

with open('model.pkl', 'rb') as f:
  new_list = pickle.load(f)

new_list

[1, 2, 3, 4]

In [26]:
# Save the fitted classifier to disk
with open('iris-model.pkl', 'wb') as f:
    pickle.dump(clf, f)

# Load it back; only unpickle files you trust
with open('iris-model.pkl', 'rb') as f:
    model = pickle.load(f)

model
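
A quick sanity check after loading is to confirm that the restored model predicts exactly what the original did. This is a minimal sketch that trains its own small model and writes to a temporary directory so it does not touch `iris-model.pkl`; the file name `iris-model-demo.pkl` and the variable names are hypothetical.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X_demo, y_demo = load_iris(return_X_y=True)
original = LogisticRegression(max_iter=1000, random_state=42).fit(X_demo, y_demo)

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "iris-model-demo.pkl"
    with open(path, "wb") as f:
        pickle.dump(original, f)
    with open(path, "rb") as f:
        restored = pickle.load(f)

# The restored model should reproduce the original predictions exactly
same_predictions = np.array_equal(original.predict(X_demo), restored.predict(X_demo))
print(same_predictions)
```

Pickled scikit-learn models are best loaded with the same scikit-learn version that saved them, which is one reason to pin versions in `requirements.txt` before deploying.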

**PROBLEM 3**: Build an app to share the model
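
One possible approach is sketched below. A Streamlit app lives in its own `.py` file, so this cell writes one out from the notebook. The file name `iris_app.py`, the slider ranges, and the layout are assumptions; the app expects `iris-model.pkl` from Problem 2 to sit in the same folder.

```python
from pathlib import Path
from textwrap import dedent

# Source of a small Streamlit app that loads the pickled classifier and predicts a species
APP_SOURCE = dedent('''
    import pickle

    import pandas as pd
    import streamlit as st

    FEATURES = ["sepal_length", "sepal_width", "petal_length", "petal_width"]


    @st.cache_resource  # load the model once, not on every rerun
    def load_model(path="iris-model.pkl"):
        with open(path, "rb") as f:
            return pickle.load(f)


    st.title("Iris species predictor")
    model = load_model()

    # One slider per feature; the ranges are illustrative
    values = {name: st.slider(name, 0.0, 10.0, 3.0, step=0.1) for name in FEATURES}

    # Same column names and order as the training data
    row = pd.DataFrame([values], columns=FEATURES)
    st.write("Predicted species:", model.predict(row)[0])
''').lstrip()

# Fail early on syntax errors, then write the app file next to the notebook
compile(APP_SOURCE, "iris_app.py", "exec")
Path("iris_app.py").write_text(APP_SOURCE)
```

After running the cell, start the app from a terminal with `streamlit run iris_app.py`.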

**PROBLEM 4**: Use an API to return data given a request

In [27]:
gapminder = px.data.gapminder()
gapminder.head()

Unnamed: 0,country,continent,year,lifeExp,pop,gdpPercap,iso_alpha,iso_num
0,Afghanistan,Asia,1952,28.801,8425333,779.445314,AFG,4
1,Afghanistan,Asia,1957,30.332,9240934,820.85303,AFG,4
2,Afghanistan,Asia,1962,31.997,10267083,853.10071,AFG,4
3,Afghanistan,Asia,1967,34.02,11537966,836.197138,AFG,4
4,Afghanistan,Asia,1972,36.088,13079460,739.981106,AFG,4


In [37]:
gapminder["log_life"] = np.log(gapminder['lifeExp']) 
gapminder["log_gdp"] = np.log(gapminder['gdpPercap']) 

In [38]:
X = gapminder[["pop", "log_gdp"]]
y = gapminder[["log_life"]]

In [39]:
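from sklearn.linear_model import LinearRegression, SGDRegressor

# Ordinary least squares: predict log life expectancy from population and log GDP per capita
rgr = LinearRegression()
rgr.fit(X, y)
rgr.score(X, y)  # R^2 on the training data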

0.6259463077524872

In [40]:
with open('le-model.pkl', 'wb') as f:
  pickle.dump(rgr, f)
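
Because the model was trained on log-transformed values, anything that serves it has to apply the same transform to the inputs and undo it on the output. The sketch below shows that step as a framework-independent request handler. The function names, the JSON field names, and the `StubModel` class with its made-up coefficients are all hypothetical; in practice you would pass in the model loaded from `le-model.pkl`.

```python
import json
import math


def predict_life_expectancy(model, pop, gdp_per_cap):
    """Return predicted life expectancy in years.

    `model` is assumed to be fitted like `rgr` above:
    features [pop, log(gdpPercap)], target log(lifeExp).
    """
    if gdp_per_cap <= 0:
        raise ValueError("gdp_per_cap must be positive")
    prediction = model.predict([[pop, math.log(gdp_per_cap)]])
    # y was a one-column DataFrame at fit time, so scikit-learn returns shape (1, 1)
    log_life = float(prediction[0][0])
    return math.exp(log_life)  # undo the log transform


def handle_request(model, body):
    """Hypothetical handler: JSON request body in, JSON response body out."""
    payload = json.loads(body)
    years = predict_life_expectancy(model, payload["pop"], payload["gdpPercap"])
    return json.dumps({"lifeExp": round(years, 2)})


class StubModel:
    """Stand-in with made-up coefficients so the sketch runs without le-model.pkl."""

    def predict(self, rows):
        return [[3.0 + 0.15 * log_gdp] for _pop, log_gdp in rows]


response = handle_request(StubModel(), '{"pop": 1000000, "gdpPercap": 1000}')
print(response)
```

With the real model, passing a plain list works but scikit-learn warns about missing feature names; building a one-row DataFrame with the columns `pop` and `log_gdp` avoids the warning.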