# **Lab: ML Product**



## Exercise 3: Build Streamlit

We will create a web app that serves the model trained in the previous exercise and deploy it to Render.

**Pre-requisites:**
- Create a DockerHub account (https://hub.docker.com/)
- Create a Render account (https://render.com/)

The steps are:
1.   Setup Repository
2.   Create Dockerfile
3.   Build Streamlit app
4.   Interact with Streamlit
5.   Publish Docker Image to Docker Hub
6.   Deploy to Render
7.   Push Changes


### 1. Setup Repository

**[1.1]** Go to a folder of your choice on your computer (where you store projects)

In [None]:
# Placeholder for student's code (command line)

In [None]:
#Solution:
cd ~/Projects/adv_mla_2024

**[1.2]** Create a folder called `lab4_app` and go inside the created folder

In [None]:
# Placeholder for student's code (command line)

In [None]:
#Solution:
mkdir lab4_app
cd lab4_app

**[1.3]** Copy the files from the previous exercises (scaler and trained model) into this folder

In [None]:
# Placeholder for student's code (command line)

In [None]:
#Solution:
mkdir models
cp ../adv_mla_lab_4/models/* models

**[1.4]** Initialise the repo

In [None]:
# Placeholder for student's code (command line)

In [None]:
#Solution:
git init

**[1.5]** Log in to GitHub with your account (https://github.com/) and create a public repo named `adv_mla_lab_4_app`

**[1.6]** Link your local repo `lab4_app` with GitHub (replace `<username>` in the URL with your username)

In [None]:
# Placeholder for student's code (command line)

In [None]:
#Solution:
git remote add origin git@github.com:<username>/adv_mla_lab_4_app

**[1.7]** Add your changes to the git staging area and commit them

In [None]:
# Placeholder for student's code (command line)

In [None]:
#Solution:
git add .
git commit -m "init"

**[1.8]** Push your main/master branch to origin

In [None]:
# Placeholder for student's code (command line)

In [None]:
# Solution:
git push --set-upstream origin main

### 2. Create Dockerfile

**[2.1]** Create a file called `requirements.txt` with the following content:

```
streamlit==1.36.0
pandas==2.2.2
scikit-learn==1.5.1
```
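
Pinning exact versions matters because the saved model and scaler are best loaded with the same scikit-learn version that produced them. As a rough sketch (not part of the lab solution), the pins can be compared against what is actually installed using only the standard library. The function names `parse_pins` and `find_mismatches` are hypothetical helpers introduced here for illustration.

```python
from __future__ import annotations

from importlib import metadata

# Same content as the requirements.txt created in step 2.1.
REQUIREMENTS = """\
streamlit==1.36.0
pandas==2.2.2
scikit-learn==1.5.1
"""


def parse_pins(text: str) -> dict[str, str]:
    """Parse 'name==version' lines, skipping blank lines and comments."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        name, sep, version = line.partition("==")
        if not sep:
            raise ValueError(f"not an exact pin: {line!r}")
        pins[name.strip().lower()] = version.strip()
    return pins


def find_mismatches(pins: dict[str, str]) -> dict[str, tuple[str, str | None]]:
    """Map package -> (pinned, installed) for every package that differs."""
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # the package is not installed at all
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches


# Prints the differing packages, or a confirmation if everything matches.
print(find_mismatches(parse_pins(REQUIREMENTS)) or "all pins satisfied")
```

Saved under a hypothetical name such as `check_pins.py` next to `app.py`, it could be run inside the container once step 2.5 is done: `docker exec advmla_lab4_app python check_pins.py`.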

**[2.2]** Create a file called `Dockerfile` with the following content:

```
FROM python:3.11.4

WORKDIR /app

COPY requirements.txt .

RUN pip install -r requirements.txt

COPY . /app

CMD ["streamlit", "run", "app.py"]
```

**[2.3]** Create an empty file called `app.py`

**[2.4]** Build the image from the Dockerfile

In [None]:
# Placeholder for student's code (command line)

In [None]:
# Solution
docker buildx build -t streamlit_lab4:latest .

**[2.5]** Run the built image with port 8501 published and the current directory mounted as a volume at `/app`

In [None]:
# Placeholder for student's code (command line)

In [None]:
# Solution
docker run -dit --rm --name advmla_lab4_app -p 8501:8501 -v "${PWD}":/app streamlit_lab4:latest


**[2.6]** List all running containers

In [None]:
# Placeholder for student's code (command line)

In [None]:
# Solution
docker ps

**[2.7]** Open a browser and navigate to:

http://localhost:8501

### 3. Build Streamlit app

**[3.1]** Inside the `app.py` file, import `streamlit` and `pandas`, and import `load` from the `joblib` package

In [None]:
# Placeholder for student's code (Python code)

In [None]:
# Solution:
import streamlit as st
from joblib import load
import pandas as pd

**[3.2]** Inside the `app.py` file, display the title `ML Streamlit App` on Streamlit

In [None]:
# Placeholder for student's code (Python code)

In [None]:
# Solution
st.title('ML Streamlit App')

**[3.3]** Inside the `app.py` file, load your trained model and scaler from `models` folder and save them into variables respectively called `model` and `scaler`

In [None]:
# Placeholder for student's code (Python code)

In [None]:
# Solution:
model = load('models/ada_reg.joblib')
scaler = load('models/scaler.joblib')

**[3.6]** Inside the `app.py` file, display a text that explains the purpose of your app

In [None]:
# Placeholder for student's code (Python code)

In [None]:
# Solution
st.write("""
## The Machine Learning App

You can use this application in order to get predictions from a trained Adaboost model on the estimated number of shares for a news article.

You will have to select the values that describe a news article.
""")

**[3.7]** Inside the `app.py` file, create a slider for each feature required by the model, using the feature's minimum, maximum and average values as the slider's minimum, maximum and default value respectively. Save each slider into a separate variable.

In [None]:
# Placeholder for student's code (Python code)

In [None]:
# Solution:
timedelta                = st.slider("timedelta", 0, 750, 340)
n_tokens_title           = st.slider("n_tokens_title", 0, 25, 10)
n_tokens_content         = st.slider("n_tokens_content", 0, 9000, 540)
n_unique_tokens          = st.slider("n_unique_tokens", 0., 800., 0.5)
n_non_stop_words         = st.slider("n_non_stop_words", 0, 1100, 1)
n_non_stop_unique_tokens = st.slider("n_non_stop_unique_tokens", 0., 650., 1.0)
num_hrefs                = st.slider("num_hrefs", 0, 350, 10)
num_self_hrefs           = st.slider("num_self_hrefs", 0, 150, 4)
num_imgs                 = st.slider("num_imgs", 0, 150, 10)
num_videos               = st.slider("num_videos", 0, 100, 4)

average_token_length = st.slider("average_token_length", 0., 9., 5.)
num_keywords         = st.slider("num_keywords", 1., 10., 7.)  # default must lie within [min, max]

data_channel_is_lifestyle     = st.slider("data_channel_is_lifestyle", 0, 1, 0)
data_channel_is_entertainment = st.slider("data_channel_is_entertainment", 0, 1, 0)
data_channel_is_bus           = st.slider("data_channel_is_bus", 0, 1, 0)
data_channel_is_socmed        = st.slider("data_channel_is_socmed", 0, 1, 0)
data_channel_is_tech          = st.slider("data_channel_is_tech", 0, 1, 0)
data_channel_is_world         = st.slider("data_channel_is_world", 0, 1, 0)

kw_min_min = st.slider("kw_min_min", -1, 400, 50)
kw_max_min = st.slider("kw_max_min", 0, 300000, 3500)
kw_avg_min = st.slider("kw_avg_min", 0, 45000, 100)
kw_min_max = st.slider("kw_min_max", 0, 900000, 3500)
kw_max_max = st.slider("kw_max_max", 0, 900000, 3500)
kw_avg_max = st.slider("kw_avg_max", 0, 900000, 3500)
kw_min_avg = st.slider("kw_min_avg", -1, 4000, 50)
kw_max_avg = st.slider("kw_max_avg", 0, 300000, 3500)
kw_avg_avg = st.slider("kw_avg_avg", 0, 45000, 100)

self_reference_min_shares  = st.slider("self_reference_min_shares", 0, 900000, 3500)
self_reference_max_shares  = st.slider("self_reference_max_shares", 0, 900000, 3500)
self_reference_avg_sharess = st.slider("self_reference_avg_sharess", 0, 900000, 3500)

weekday_is_monday    = st.slider("weekday_is_monday", 0, 1, 0)
weekday_is_tuesday   = st.slider("weekday_is_tuesday", 0, 1, 0)
weekday_is_wednesday = st.slider("weekday_is_wednesday", 0, 1, 0)
weekday_is_thursday  = st.slider("weekday_is_thursday", 0, 1, 0)
weekday_is_friday    = st.slider("weekday_is_friday", 0, 1, 0)
weekday_is_saturday  = st.slider("weekday_is_saturday", 0, 1, 0)
weekday_is_sunday    = st.slider("weekday_is_sunday", 0, 1, 0)
is_weekend           = st.slider("is_weekend", 0, 1, 0)

# LDA topic weights are continuous values in [0, 1], so use float sliders
LDA_00 = st.slider("LDA_00", 0., 1., 0.)
LDA_01 = st.slider("LDA_01", 0., 1., 0.)
LDA_02 = st.slider("LDA_02", 0., 1., 0.)
LDA_03 = st.slider("LDA_03", 0., 1., 0.)
LDA_04 = st.slider("LDA_04", 0., 1., 0.)

global_subjectivity        = st.slider("global_subjectivity", 0., 1., 0.)
global_sentiment_polarity  = st.slider("global_sentiment_polarity", -1., 1., 0.)
global_rate_positive_words = st.slider("global_rate_positive_words", 0., 1., 0.)
global_rate_negative_words = st.slider("global_rate_negative_words", 0., 1., 0.)

rate_positive_words   = st.slider("rate_positive_words", 0., 1., 0.)
rate_negative_words   = st.slider("rate_negative_words", 0., 1., 0.)
avg_positive_polarity = st.slider("avg_positive_polarity", 0., 1., 0.)
min_positive_polarity = st.slider("min_positive_polarity", 0., 1., 0.)
max_positive_polarity = st.slider("max_positive_polarity", 0., 1., 0.)

avg_negative_polarity  = st.slider("avg_negative_polarity", -1., 0., 0.)
min_negative_polarity  = st.slider("min_negative_polarity", -1., 0., 0.)
max_negative_polarity  = st.slider("max_negative_polarity", -1., 0., 0.)

title_subjectivity   = st.slider("title_subjectivity", 0., 1., 0.)
title_sentiment_polarity  = st.slider("title_sentiment_polarity", -1., 1., 0.)
abs_title_subjectivity   = st.slider("abs_title_subjectivity", 0., 0.5, 0.)
abs_title_sentiment_polarity   = st.slider("abs_title_sentiment_polarity", 0., 1., 0.)
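
With this many sliders, a mistyped bound is easy to miss. To the best of our knowledge, `st.slider` raises an error when the default value falls outside the minimum and maximum, or when the three numbers mix `int` and `float`. The sketch below checks a table of specs for both problems before any widget is created. The helper `check_slider_spec` and the `example_bad` entry are hypothetical and only serve as an illustration.

```python
def check_slider_spec(name, min_value, max_value, default):
    """Return a list of problems that would likely make st.slider reject this spec."""
    problems = []
    # All three numbers should share one type (all int or all float).
    kinds = {type(min_value), type(max_value), type(default)}
    if len(kinds) != 1:
        problems.append(f"{name}: min, max and default mix int and float")
    # The default value has to sit inside the slider range.
    if not min_value <= default <= max_value:
        problems.append(
            f"{name}: default {default} is outside [{min_value}, {max_value}]"
        )
    return problems


# Specs in (min, max, default) order, mirroring the positional arguments of st.slider.
SPECS = {
    "timedelta": (0, 750, 340),
    "average_token_length": (0.0, 9.0, 5.0),
    "example_bad": (1.0, 10.0, 0.5),  # hypothetical entry: default below the minimum
}

for name, (low, high, default) in SPECS.items():
    for problem in check_slider_spec(name, low, high, default):
        print(problem)
```

Keeping the specs in one dictionary also makes it possible to create the sliders in a loop instead of one statement per feature.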

**[3.8]** Inside the `app.py` file, create a `Predict` button that builds a dataframe from the feature names and the values of the previous sliders, standardises it, gets the model prediction and displays it on the Streamlit app

In [None]:
# Placeholder for student's code (Python code)

In [None]:
# Solution:
data = {
    'timedelta': [timedelta],
    'n_tokens_title': [n_tokens_title],
    'n_tokens_content': [n_tokens_content],
    'n_unique_tokens': [n_unique_tokens],
    'n_non_stop_words': [n_non_stop_words],
    'n_non_stop_unique_tokens': [n_non_stop_unique_tokens],
    'num_hrefs': [num_hrefs],
    'num_self_hrefs': [num_self_hrefs],
    'num_imgs': [num_imgs],
    'num_videos': [num_videos],
    'average_token_length': [average_token_length],
    'num_keywords': [num_keywords],
    'data_channel_is_lifestyle': [data_channel_is_lifestyle],
    'data_channel_is_entertainment': [data_channel_is_entertainment],
    'data_channel_is_bus': [data_channel_is_bus],
    'data_channel_is_socmed': [data_channel_is_socmed],
    'data_channel_is_tech': [data_channel_is_tech],
    'data_channel_is_world': [data_channel_is_world],
    'kw_min_min': [kw_min_min],
    'kw_max_min': [kw_max_min],
    'kw_avg_min': [kw_avg_min],
    'kw_min_max': [kw_min_max],
    'kw_max_max': [kw_max_max],
    'kw_avg_max': [kw_avg_max],
    'kw_min_avg': [kw_min_avg],
    'kw_max_avg': [kw_max_avg],
    'kw_avg_avg': [kw_avg_avg],
    'self_reference_min_shares': [self_reference_min_shares],
    'self_reference_max_shares': [self_reference_max_shares],
    'self_reference_avg_sharess': [self_reference_avg_sharess],
    'weekday_is_monday': [weekday_is_monday],
    'weekday_is_tuesday': [weekday_is_tuesday],
    'weekday_is_wednesday': [weekday_is_wednesday],
    'weekday_is_thursday': [weekday_is_thursday],
    'weekday_is_friday': [weekday_is_friday],
    'weekday_is_saturday': [weekday_is_saturday],
    'weekday_is_sunday': [weekday_is_sunday],
    'is_weekend': [is_weekend],
    'LDA_00': [LDA_00],
    'LDA_01': [LDA_01],
    'LDA_02': [LDA_02],
    'LDA_03': [LDA_03],
    'LDA_04': [LDA_04],
    'global_subjectivity': [global_subjectivity],
    'global_sentiment_polarity': [global_sentiment_polarity],
    'global_rate_positive_words': [global_rate_positive_words],
    'global_rate_negative_words': [global_rate_negative_words],
    'rate_positive_words': [rate_positive_words],
    'rate_negative_words': [rate_negative_words],
    'avg_positive_polarity': [avg_positive_polarity],
    'min_positive_polarity': [min_positive_polarity],
    'max_positive_polarity': [max_positive_polarity],
    'avg_negative_polarity': [avg_negative_polarity],
    'min_negative_polarity': [min_negative_polarity],
    'max_negative_polarity': [max_negative_polarity],
    'title_subjectivity': [title_subjectivity],
    'title_sentiment_polarity': [title_sentiment_polarity],
    'abs_title_subjectivity': [abs_title_subjectivity],
    'abs_title_sentiment_polarity': [abs_title_sentiment_polarity],
}

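if st.button("Predict"):
    # Build a single-row dataframe, standardise it, then predict
    df = pd.DataFrame(data)
    prediction = model.predict(scaler.transform(df))
    # predict returns an array with one value per row
    st.write(f"Prediction: {prediction[0]}")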
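
The scaler and the model expect the features in the same order as during training, so a missing or misplaced column can silently distort the prediction. A minimal sketch of a guard is shown below. The helper `ordered_row` is hypothetical; it assumes you have the training column names available as a list. For a scaler fitted on a dataframe with scikit-learn 1.0 or later, that list is typically `list(scaler.feature_names_in_)`.

```python
def ordered_row(values: dict, feature_names: list) -> list:
    """Return the values in training order, failing loudly on any mismatch."""
    missing = [name for name in feature_names if name not in values]
    unexpected = [name for name in values if name not in feature_names]
    if missing or unexpected:
        raise ValueError(f"missing={missing}, unexpected={unexpected}")
    return [values[name] for name in feature_names]


# Tiny illustration with two of the lab's features, given in the wrong order.
row = ordered_row(
    {"n_tokens_title": 10, "timedelta": 340},
    ["timedelta", "n_tokens_title"],
)
print(row)
```

In `app.py` the same check could run just before the dataframe is created, so that a typo in a feature name stops the app with a clear message.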

**[3.9]** Add your changes to the git staging area, then commit and push them

In [None]:
# Placeholder for student's code (command line)

In [None]:
# Solution
git add .
git commit -m "add streamlit"
git push

### 4. Interact with Streamlit

**[4.1]** Open a browser and navigate to:

http://localhost:8501
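
If the page does not load, it helps to check whether the container answers at all. The sketch below does this with the standard library only. It assumes the app exposes Streamlit's health endpoint at `/_stcore/health`, which recent Streamlit versions provide; the helper `is_up` is a hypothetical name.

```python
import http.client
import urllib.request


def is_up(url: str, timeout: float = 3.0) -> bool:
    """Return True if the URL answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, ValueError, http.client.HTTPException):
        # Covers refused connections, timeouts, HTTP errors and malformed URLs.
        return False


# Assumed health endpoint of the locally running container.
print(is_up("http://localhost:8501/_stcore/health"))
```

A `False` result usually means the container is not running or port 8501 is not published; `docker ps` shows which of the two applies.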


**[4.2]** Stop the Docker container

In [None]:
# Placeholder for student's code (command line)

In [None]:
# Solution
docker stop advmla_lab4_app

### 5.   Publish Docker Image to Docker Hub

**[5.1]** Log into your Docker Hub account

**[5.2]** Create a new repository called `advmla_lab4_<studentid>` with the following description: `Streamlit app for 36120 lab 4`

**[5.3]** Click create

**[5.4]** Log into docker.io

In [None]:
# Placeholder for student's code (command line)


In [None]:
# Solution
docker login -u <username> -p <password> docker.io

**[5.5]** Tag your docker image `streamlit_lab4` to your Docker repo `<username>/advmla_lab4_<studentid>:latest`

In [None]:
# Placeholder for student's code (command line)


In [None]:
# Solution
docker tag streamlit_lab4 <username>/advmla_lab4_<studentid>:latest


**[5.6]** Build a multi-platform image (Apple Silicon included) and push it to your Docker repo `<username>/advmla_lab4_<studentid>:latest`

Note: if you get an error, you may need to create a builder first with `docker buildx create --use desktop-linux`


In [None]:
# Placeholder for student's code (command line)


In [None]:
# Solution
docker buildx build \
 --tag <username>/advmla_lab4_<studentid>:latest \
 --platform linux/arm64/v8,linux/amd64 \
 --push .
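
An image reference has the form `[registry/]namespace/repository:tag` with a single tag at the end, and Docker rejects a reference that carries two tags. The following sketch assembles the reference from its parts so that each part is checked once. The helper `image_ref` and the placeholder values `alice` and `12345` are hypothetical.

```python
def image_ref(namespace: str, repository: str, tag: str = "latest",
              registry: str = "") -> str:
    """Build '[registry/]namespace/repository:tag' from its separate parts."""
    for part in (namespace, repository, tag):
        # A part that already contains ':' or '/' would yield a malformed reference.
        if not part or ":" in part or "/" in part:
            raise ValueError(f"invalid reference part: {part!r}")
    prefix = f"{registry}/" if registry else ""
    return f"{prefix}{namespace}/{repository}:{tag}"


# Placeholder username and student id, for illustration only.
print(image_ref("alice", "advmla_lab4_12345"))
print(image_ref("alice", "advmla_lab4_12345", registry="docker.io"))
```

The first form is what `docker tag` and `docker buildx build --tag` expect; the second, with the `docker.io` prefix, is the image URL form used when deploying an existing image.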

**[5.7]** Navigate to your docker repo and confirm your image has been pushed

### 6. Deploy to Render

**[6.1]** Log into your Render account

**[6.2]** On Render create a new Web Service

**[6.3]** Connect to your Github account and provide access to the `adv_mla_lab_4_app` repo

**[6.4]** Select `Existing Image` for the `Source Code`


**[6.5]** Paste your image URL: `docker.io/<username>/advmla_lab4_<studentid>:latest`


**[6.6]** Select the `Free Tier`

**[6.7]** Click on `Deploy Service`

**[6.8]** Render will take a few minutes to deploy the app (around 10 min)

**[6.9]** Once deployed, copy the URL assigned by Render and open your new app in a browser

### 7.   Push changes

**[7.1]** Add your changes to the git staging area

In [None]:
# Placeholder for student's code (command line)

In [None]:
# Solution:
git add .

**[7.2]** Create the snapshot of your repository and add a description

In [None]:
# Placeholder for student's code (command line)

In [None]:
# Solution:
git commit -m "build streamlit"

**[7.3]** Push your snapshot to Github

In [None]:
# Placeholder for student's code (command line)

In [None]:
# Solution:
git push

**[7.4]** Go to GitHub and merge your changes into the master/main branch

**[7.5]** Check out the master/main branch

In [None]:
# Placeholder for student's code (command line)

In [None]:
# Solution:
git checkout main  # use master if that is your default branch

**[7.6]** Pull the latest updates

In [None]:
# Placeholder for student's code (command line)

In [None]:
# Solution:
git pull