
# Lesson 97 - Multipage Streamlit App II

### Teacher-Student Activities

In the previous class, we started using Streamlit to build a multi-page web app capable of predicting the price of a car.

Continuing from the previous class, today we will create the following two Python scripts for our multi-page app:

- `plots.py`: This page will visualise the dataset using various charts and plots.

- `predict.py`: This page will help to perform prediction for new feature values.

Let's quickly go through the activities covered in the previous class and begin this class from **Activity 1: `Visualise Data` Page Configuration** section.


In [None]:
Select the appropriate code to set the Streamlit web page configuration to 'wide'.

st.set_page_configure(layout = 'wide')

st.page_configuration(layout = 'wide')

st.set_page_config(layout = 'wide')

st.page_configure(layout = 'wide')

---

#### Car Price Prediction - Problem Statement

Recall the linear regression model that you had built in one of your previous classes, wherein you predicted the price of a car based on its technical specifications such as the car manufacturer, engine capacity, fuel efficiency, body type, etc.


**Dataset Description:**

The dataset contains 205 rows and 26 columns. Each column represents an attribute of a car as described in the table below.

|Sr No.|Attribute|Attribute Information|
|-|-|-|
|1|Car_ID|Unique id of each car (Integer)|
|2|Symboling|Assigned insurance risk rating; a value of +3 indicates that the car is risky; -3 suggests that it is probably a safe car (Categorical)|
|3|carCompany|Name of car company (Categorical)|
|4|fueltype|Fuel type, i.e. petrol or diesel (Categorical)|
|5|aspiration|Aspiration used in a car (Categorical)|
|6|doornumber|Number of doors in a car (Categorical)|
|7|carbody|Body-type of a car (Categorical)|
|8|drivewheel|Type of drive wheel (Categorical)|
|9|enginelocation|Location of car engine (Categorical)|
|10|wheelbase|Wheelbase of car (Numeric)|
|11|carlength|Length of car (Numeric)|
|12|carwidth|Width of car (Numeric)|
|13|carheight|Height of car (Numeric)|
|14|curbweight|The weight of a car without occupants or baggage (Numeric)|
|15|enginetype|Type of engine (Categorical)|
|16|cylindernumber|Number of cylinders placed in the car engine (Categorical)|
|17|enginesize|Capacity of an engine (Numeric)|
|18|fuelsystem|Fuel system of a car (Categorical)|
|19|boreratio|Bore ratio of car (Numeric)|
|20|stroke|Stroke or volume inside the engine (Numeric)|
|21|compressionratio|Compression ratio of an engine (Numeric)|
|22|horsepower|Power output of an engine (Numeric)|
|23|peakrpm|Peak revolutions per minute (Numeric)|
|24|citympg|Mileage in city (Numeric)|
|25|highwaympg|Mileage on highway (Numeric)|
|26|price(Dependent variable)|Price of a car (Numeric)|


**Dataset source:** https://archive.ics.uci.edu/ml/datasets/Automobile


For ease of implementation, we will use only the following five features that we had obtained after performing feature selection using RFE (Recursive Feature Elimination) in one of the previous lessons:

|Sr No.|Features|Target|
|-|-|-|
|1|`'carwidth'`|price|
|2|`'enginesize'`||
|3|`'horsepower'`||
|4|`'drivewheel_fwd'`||
|5|`'car_company_buick'`||
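
The last two features in this table are dummy variables: the `load_data()` function on the main page creates them from the categorical `drivewheel` and `car_company` columns using `pd.get_dummies()` with `drop_first = True`. The following is a minimal sketch of that step on a made-up toy DataFrame (the rows in `toy_df` are assumptions for illustration, not values from the real dataset):

```python
import pandas as pd

# Toy rows standing in for the real dataset (made-up values for illustration only).
toy_df = pd.DataFrame({
    "drivewheel": ["rwd", "fwd", "4wd", "fwd"],
    "car_company": ["audi", "buick", "bmw", "buick"],
})

# Dummy coding: one 0/1 column per category, dropping the first category of each column.
dummies_df = pd.get_dummies(toy_df, drop_first=True, dtype=int)

# Keep only the two dummy columns that the app uses as features.
print(dummies_df[["drivewheel_fwd", "car_company_buick"]])
```

A value of 1 in `drivewheel_fwd` means the car has a forward drive wheel, and a value of 1 in `car_company_buick` means the car is manufactured by Buick.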







---

#### Main Page

In [None]:
# Code for 'main_app.py' file.
# Importing the necessary Python modules.
import streamlit as st
import numpy as np
import pandas as pd

# Import the individual Python files
import home
import data
import plots
import predict

# Configure your home page by setting its title and icon that will be displayed in a browser tab.
st.set_page_config(page_title = 'Car Price Prediction',
                    page_icon = ':car:',
                    layout = 'centered',
                    initial_sidebar_state = 'auto'
                    )

# Dictionary containing positive integers in the form of words as keys and the corresponding numbers as values.
words_dict = {"two": 2, "three": 3, "four": 4, "five": 5, "six": 6, "eight": 8, "twelve": 12}
def num_map(series):
    return series.map(words_dict)

# Loading the dataset.
@st.cache()
def load_data():
    # Reading the dataset
    cars_df = pd.read_csv("car-prices.csv")
    # Extract the names of the manufacturers from the car names.
    car_companies = pd.Series([car.split(" ")[0] for car in cars_df['CarName']], index = cars_df.index)
    # Create a new column named 'car_company'. It should store the company names of the cars.
    cars_df['car_company'] = car_companies
    # Replace the misspelled 'car_company' names with their correct names.
    cars_df.loc[(cars_df['car_company'] == "vw") | (cars_df['car_company'] == "vokswagen"), 'car_company'] = 'volkswagen'
    cars_df.loc[cars_df['car_company'] == "porcshce", 'car_company'] = 'porsche'
    cars_df.loc[cars_df['car_company'] == "toyouta", 'car_company'] = 'toyota'
    cars_df.loc[cars_df['car_company'] == "Nissan", 'car_company'] = 'nissan'
    cars_df.loc[cars_df['car_company'] == "maxda", 'car_company'] = 'mazda'
    cars_df.drop(columns= ['CarName'], axis = 1, inplace = True)
    cars_numeric_df = cars_df.select_dtypes(include = ['int64', 'float64'])
    cars_numeric_df.drop(columns = ['car_ID'], axis = 1, inplace = True)
    # Map the values of the 'doornumber' and 'cylindernumber' columns to their corresponding numeric values.
    cars_df[['cylindernumber', 'doornumber']] = cars_df[['cylindernumber', 'doornumber']].apply(num_map, axis = 1)
    # Create dummy variables for the 'carbody' columns.
    car_body_dummies = pd.get_dummies(cars_df['carbody'], dtype = int)
    # Create dummy variables for the 'carbody' columns with 1 column less.
    car_body_new_dummies = pd.get_dummies(cars_df['carbody'], drop_first = True, dtype = int)
    # Create a DataFrame containing all the non-numeric type features.
    cars_categorical_df = cars_df.select_dtypes(include = ['object'])
    #Get dummy variables for all the categorical type columns using the dummy coding process.
    cars_dummies_df = pd.get_dummies(cars_categorical_df, drop_first = True, dtype = int)
    #  Drop the categorical type columns from the 'cars_df' DataFrame.
    cars_df.drop(list(cars_categorical_df.columns), axis = 1, inplace = True)
    # Concatenate the 'cars_df' and 'cars_dummies_df' DataFrames.
    cars_df = pd.concat([cars_df, cars_dummies_df], axis = 1)
    #  Drop the 'car_ID' column
    cars_df.drop('car_ID', axis = 1, inplace = True)
    final_columns = ['carwidth', 'enginesize', 'horsepower', 'drivewheel_fwd', 'car_company_buick', 'price']
    return cars_df[final_columns]

final_cars_df = load_data()

# Adding a navigation in the sidebar using radio buttons.
# Create a dictionary.
pages_dict = {
                "Home": home,
                "View Data": data,
                "Visualise Data": plots,
                "Predict": predict
            }

# Add radio buttons in the sidebar for navigation and call the respective pages based on user selection.
st.sidebar.title('Navigation')
user_choice = st.sidebar.radio("Go to", tuple(pages_dict.keys()))
if user_choice == "Home":
    home.app()
else:
    selected_page = pages_dict[user_choice]
    selected_page.app(final_cars_df)
# The 'app()' function is defined in each of the 'home.py', 'data.py', 'plots.py' and 'predict.py' files.
# The option selected by the user is stored in the 'user_choice' variable, and the module mapped to that
# key in 'pages_dict' is stored in the 'selected_page' variable. The 'app()' function of that module is
# then called. Only 'home.app()' takes no arguments; the other pages receive 'final_cars_df'.
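
The navigation logic above is a dictionary lookup followed by a function call. Here is a minimal sketch of the same pattern without Streamlit, so that it can be tried in a notebook cell. The `SimpleNamespace` objects and the `render()` helper are hypothetical stand-ins for the page modules and the sidebar logic; they are not part of the app itself.

```python
from types import SimpleNamespace

# Hypothetical stand-ins for the page modules; each exposes an 'app()' function.
home = SimpleNamespace(app=lambda: "home page")
data = SimpleNamespace(app=lambda df: f"data page with {len(df)} rows")
plots = SimpleNamespace(app=lambda df: f"plots page with {len(df)} rows")

pages_dict = {"Home": home, "View Data": data, "Visualise Data": plots}

def render(user_choice, df):
    """Look up the selected page and call its 'app()' function."""
    selected_page = pages_dict[user_choice]
    # Only the 'Home' page takes no arguments.
    if user_choice == "Home":
        return selected_page.app()
    return selected_page.app(df)

rows = [1, 2, 3]  # placeholder for the DataFrame
print(render("Home", rows))
print(render("Visualise Data", rows))
```

Adding a new page then only requires a new entry in `pages_dict`.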

---

#### `Home` Page


In [None]:
# Code for 'home.py' file.
import streamlit as st

def app():
    st.header("Car Price Prediction App")
    st.text("""
            This web app allows a user to predict the price of a car based on its
            engine size, horsepower, dimensions and drive wheel type.
            """
            )

---

#### `View Data` Page

In [None]:
# Code for 'data.py' file
# Import necessary modules
import numpy as np
import pandas as pd
import streamlit as st

# Define a function 'app()' which accepts 'car_df' as an input.
def app(car_df):
    st.header("View Data")
    # Add an expander and display the dataset as a static table within the expander.
    with st.expander("View Dataset"):
        st.table(car_df)

    st.subheader("Columns Description:")
    if st.checkbox("Show summary"):
        st.table(car_df.describe())

    beta_col1, beta_col2, beta_col3 = st.columns(3)

    # Add a checkbox in the first column. Display the column names of 'car_df' on the click of checkbox.
    with beta_col1:
        if st.checkbox("Show all column names"):
            st.table(list(car_df.columns))

    # Add a checkbox in the second column. Display the column data-types of 'car_df' on the click of checkbox.
    with beta_col2:
        if st.checkbox("View column data-type"):
            st.table(car_df.dtypes)

    # Add a checkbox in the third column followed by a selectbox which accepts the column name whose data needs to be displayed.
    with beta_col3:
        if st.checkbox("View column data"):
            column_data = st.selectbox('Select column', tuple(car_df.columns))
            st.write(car_df[column_data])


---

#### Activity 1: `Visualise Data` Page Configuration

Now that we have created the first two pages of our web app, let's write a Python program to create different types of charts or plots for `car_df` DataFrame in the empty `plots.py` file.

When a user selects the `Visualise Data` option, the `plots.py` script will be rendered, which contains the code to display the charts and plots for the dataset.

Now let's define the `app()` function for the `plots.py` file. It should take a Pandas DataFrame object as an input and should not return anything. In the function:

- Add a **multiselect** widget that allows the user to choose the $x$-axis values for the scatter plots. The $y$-axis value is fixed, i.e. the `price` of the car.

- Add a **multiselect** widget that allows the user to choose the types of plots out of the following:
  - Histogram
  - Box Plot
  - Correlation Heatmap

- Store the plots selected by user in a variable `plot_types`.


- If `'Histogram'` exists in `plot_types`, ask the user to select one of the continuous numeric columns using a selectbox widget and display its histogram using the `hist()` function of the `matplotlib.pyplot` module.

- Similarly, if `'Box Plot'` exists in `plot_types`, ask the user to select one of the continuous numeric columns using a selectbox widget and display its box plot using the `boxplot()` function of the `seaborn` module.

- If `'Correlation Heatmap'` exists in `plot_types`, display the correlation heatmap for all the columns. Don't forget to adjust the height of the heatmap.

**Note:** Do not run the code shown below. It will throw an error.

In [None]:
# S1.1: Design the "Visualise Data" page of the multipage app.
import streamlit as st
import matplotlib.pyplot as plt
import seaborn as sns

# Define a function 'app()' which accepts 'car_df' as an input.
def app(car_df):
    st.header('Visualise data')
    # Remove deprecation warning.
    st.set_option('deprecation.showPyplotGlobalUse', False)

    # Subheader for scatter plot.
    st.subheader("Scatter plot")
    # Choosing x-axis values for scatter plots.
    features_list = st.multiselect("Select the x-axis values:",
                                            ('carwidth', 'enginesize', 'horsepower', 'drivewheel_fwd', 'car_company_buick'))
    # Create scatter plots.
    for feature in features_list:
        st.subheader(f"Scatter plot between {feature} and price")
        plt.figure(figsize = (12, 6))
        sns.scatterplot(x = feature, y = 'price', data = car_df)
        st.pyplot()

    # Add a multiselect widget to allow the user to select multiple visualisations.
    # Add a subheader with the label "Visualisation Selector".
    st.subheader("Visualisation Selector")

    # Add a multiselect with the label 'Select charts or plots:'
    # and pass the remaining 3 plot types as a tuple i.e. ('Histogram', 'Box Plot', 'Correlation Heatmap').
    # Store the current value of this widget in a variable 'plot_types'.
    plot_types = st.multiselect("Select charts or plots:", ('Histogram', 'Box Plot', 'Correlation Heatmap'))

    # Display histogram using the 'matplotlib.pyplot' module and the 'st.pyplot()' function.
    if 'Histogram' in plot_types:
        st.subheader("Histogram")
        columns = st.selectbox("Select the column to create its histogram",
                                      ('carwidth', 'enginesize', 'horsepower'))
        # Note: Histogram is generally created for continuous values not for discrete values.
        plt.figure(figsize = (12, 6))
        plt.title(f"Histogram for {columns}")
        plt.hist(car_df[columns], bins = 'sturges', edgecolor = 'black')
        st.pyplot()

    # Create box plot using the 'seaborn' module and the 'st.pyplot()' function.
    if 'Box Plot' in plot_types:
        st.subheader("Box Plot")
        columns = st.selectbox("Select the column to create its box plot",
                                      ('carwidth', 'enginesize', 'horsepower'))
        plt.figure(figsize = (12, 2))
        plt.title(f"Box plot for {columns}")
        sns.boxplot(x = car_df[columns])
        st.pyplot()

    # Display correlation heatmap using the 'seaborn' module and the 'st.pyplot()' function.
    if 'Correlation Heatmap' in plot_types:
        st.subheader("Correlation Heatmap")
        plt.figure(figsize = (8, 5))
        ax = sns.heatmap(car_df.corr(), annot = True) # Creating an object of seaborn axis and storing it in 'ax' variable
        bottom, top = ax.get_ylim() # Getting the top and bottom margin limits.
        ax.set_ylim(bottom + 0.5, top - 0.5) # Increasing the bottom and decreasing the top margins respectively.
        st.pyplot()
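
The histogram in the code above is created with `bins = 'sturges'`, which asks for the number of bins to be chosen by Sturges' rule instead of being fixed by hand. The following is a minimal sketch of the rule itself, $\lceil \log_2 n \rceil + 1$ bins for $n$ values. The helper name `sturges_bin_count()` is hypothetical, and NumPy's own implementation may differ in edge cases.

```python
import math

def sturges_bin_count(n_values):
    """Number of histogram bins suggested by Sturges' rule: ceil(log2(n)) + 1."""
    return math.ceil(math.log2(n_values)) + 1

# The car dataset has 205 rows, so the rule suggests this many bins per histogram.
print(sturges_bin_count(205))
```

For the 205 rows of the car dataset, the rule suggests 9 bins.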

After rerunning the entire app and selecting the `Visualise Data` option in the navigation sidebar, we can see the following output:

<center>
<img src="https://s3-whjr-v2-prod-bucket.whjr.online/aca3521b-f50e-4161-89a3-e2bda27482a7.gif"/></center>

**Note to the Teacher:** You can download the entire `plots.py` file from the link given below:

https://drive.google.com/file/d/1bc2CTyQJKpDYUz71usnrfuhZ3DR49KYN

---

#### Activity 2: Predict Page Configuration and Markdown

Now that we have created the first three pages of our web app, let's write a Python program to allow a user to predict the price of a car by selecting some input values for the features of the car.

When a user selects the `Predict` option, the `predict.py` script will be rendered which contains the code to build an ML model to make predictions.

In the `predict.py` file:

- Import the necessary Python modules including the Streamlit module.

- Create a function, say `prediction()` that takes `car_df` DataFrame and all the features of a car as inputs and returns the following:
  
  - The predicted price of a car
  - Accuracy score of the linear regression model deployed, i.e. the value returned by its `score()` method on the train set
  - $R^2$ score of the model deployed on the test set
  - Mean absolute error of the model deployed
  - Mean squared log error of the model deployed
  - Root mean squared error of the model deployed
  
  The `prediction()` function should deploy a linear regression model.
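
For a regression model, scikit-learn's `LinearRegression.score()` returns the coefficient of determination $R^2$ rather than a classification accuracy. The following is a minimal sketch of how $R^2$ is computed, using made-up prices. The `r_squared()` helper is hypothetical and only illustrates the formula $R^2 = 1 - \frac{SS_{res}}{SS_{tot}}$.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    # Sum of squared residuals between actual and predicted values.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares around the mean of the actual values.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

prices = [10000.0, 20000.0, 30000.0]  # made-up prices for illustration

perfect_r2 = r_squared(prices, prices)                        # perfect predictions
mean_r2 = r_squared(prices, [20000.0, 20000.0, 20000.0])      # always predicting the mean
print(perfect_r2, mean_r2)
```

Perfect predictions give an $R^2$ of 1, whereas always predicting the mean price gives an $R^2$ of 0.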

**Note:**
- In the interest of learning multipage web app and saving time, we are skipping the standard normalisation part.

- Do not run the code shown below. It will throw an error.

In [None]:
# S2.1: Import the necessary Python modules and create the 'prediction()' function as directed above.
import numpy as np
import pandas as pd
import streamlit as st
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score, mean_absolute_error, mean_squared_error, mean_squared_log_error

# Define the 'prediction()' function.
@st.cache()
def prediction(car_df, car_width, engine_size, horse_power, drive_wheel_fwd, car_comp_buick):
    # Features are all the columns except the last one ('price'), which is the target.
    X = car_df.iloc[:, :-1]
    y = car_df['price']
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.3, random_state = 42)

    # Build the linear regression model and get its score (R-squared) on the train set.
    lin_reg = LinearRegression()
    lin_reg.fit(X_train, y_train)
    score = lin_reg.score(X_train, y_train)

    # Predict the price for the feature values selected by the user.
    price = lin_reg.predict([[car_width, engine_size, horse_power, drive_wheel_fwd, car_comp_buick]])
    price = price[0]

    # Evaluate the model on the test set.
    y_test_pred = lin_reg.predict(X_test)
    test_r2_score = r2_score(y_test, y_test_pred)
    test_mae = mean_absolute_error(y_test, y_test_pred)
    test_msle = mean_squared_log_error(y_test, y_test_pred)
    test_rmse = np.sqrt(mean_squared_error(y_test, y_test_pred))

    return price, score, test_r2_score, test_mae, test_msle, test_rmse

Next step is to define the `app()` function that takes `car_df` as an input and doesn't return anything. It should do the following:

- Add a suitable subheader for the `Predict` page.

- Asks a user to select the input values of the features of a car.

- Deploy the linear regression model and predict the car price based on input values of car features.

- Display the model evaluation metrics, especially the **Mean Squared Log Error (MSLE)**. Because the price of a car is a large value, the MAE and RMSE values will not be close to zero even for a good model, but the MSLE value will be.
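
To see why MSLE stays close to zero while MAE and RMSE do not, here is a minimal sketch that computes the three metrics by hand. The prices in `actual` and `predicted` are made up for illustration, and the helper functions are hypothetical equivalents of the scikit-learn metrics used in `prediction()`.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def msle(y_true, y_pred):
    """Mean of the squared differences between the log(1 + value) terms."""
    return sum((math.log1p(t) - math.log1p(p)) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

actual = [10000, 20000, 30000]     # made-up car prices
predicted = [11000, 19000, 33000]  # made-up predictions, each within 10% of the actual price

print(f"MAE:  {mae(actual, predicted):.2f}")
print(f"RMSE: {rmse(actual, predicted):.2f}")
print(f"MSLE: {msle(actual, predicted):.4f}")
```

MAE and RMSE are expressed in the units of the price, so they run into the thousands here, whereas MSLE depends on the ratio between the predicted and actual prices and stays well below 1. Note that the logarithm requires non-negative values, so MSLE cannot be computed if the model predicts a negative price.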

**Note:**
- Be careful with the indentation.

- Do not run the code shown below. It will throw an error.

In [None]:
# S2.2: Define the 'app()' function as directed above.
def app(car_df):
    st.markdown("<p style='color:blue;font-size:25px'>This app uses <b>Linear regression</b> to predict the price of a car based on your inputs.</p>", unsafe_allow_html = True)
    st.subheader("Select Values:")
    car_wid = st.slider("Car Width", float(car_df["carwidth"].min()), float(car_df["carwidth"].max()))
    eng_siz = st.slider("Engine Size", int(car_df["enginesize"].min()), int(car_df["enginesize"].max()))
    hor_pow = st.slider("Horse Power", int(car_df["horsepower"].min()), int(car_df["horsepower"].max()))
    drw_fwd = st.radio("Is it a forward drive wheel car?", ("Yes", "No"))
    if drw_fwd == 'No':
        drw_fwd = 0
    else:
        drw_fwd = 1
    com_bui = st.radio("Is the car manufactured by Buick?", ("Yes", "No"))
    if com_bui == 'No':
        com_bui = 0
    else:
        com_bui = 1

    # When 'Predict' button is clicked, the 'prediction()' function must be called
    # and the value returned by it must be stored in a variable, say 'price'.
    # Print the value of 'price' and 'score' variable using the 'st.success()' and 'st.info()' functions respectively.
    if st.button("Predict"):
        st.subheader("Prediction results:")
        price, score, car_r2, car_mae, car_msle, car_rmse = prediction(car_df, car_wid, eng_siz, hor_pow, drw_fwd, com_bui)
        st.success("The predicted price of the car: ${:,}".format(int(price)))
        st.info("Accuracy score of this model is: {:2.2%}".format(score))
        st.info(f"R-squared score of this model is: {car_r2:.3f}")
        st.info(f"Mean absolute error of this model is: {car_mae:.3f}")
        st.info(f"Mean squared log error of this model is: {car_msle:.3f}")
        st.info(f"Root mean squared error of this model is: {car_rmse:.3f}")


**Markdown:**

In the above code, we have added a short description of our model. You may want to render this description using a different font color and size like this:

<center><img src="https://s3-whjr-v2-prod-bucket.whjr.online/1d4c402b-b5af-437a-a8af-f22c9a3c7598.PNG"/></center>

The text display widgets like `st.title()`, `st.text()`, `st.header()`, `st.subheader()` and `st.write()` use the standard typography provided by Streamlit, so you cannot change the font style of the text written using these widgets. However, this limitation can be overcome using basic HTML tags and the `st.markdown()` widget.

**HTML:**

- HTML (HyperText Markup Language) is a language used to structure a web page and its content.
- There are many predefined tags used in HTML to design a web page. Each tag starts with a tag opener (`<>`) and ends with a tag closer (`</>`).
- For our app, we will use only two HTML tags:

  - `<p></p>`: Creates a new paragraph.
  - `<b></b>`: Displays the written text in bold format.

  To explore more HTML tags, you can refer to this [cheat sheet](https://s3-whjr-v2-prod-bucket.whjr.online/26d0986e-cfaf-4dd0-bfaa-65a8230cd6f4.pdf).

- To apply styling declarations to an element or tag, we can use the `style` attribute as follows:
  
  ```html
  <p style='color:red;font-size:40px'> Welcome to Python </p>
  ```
  The above HTML code will display the text "Welcome to Python" in red color with a font size of 40 pixels as follows:

  <center><img src="https://s3-whjr-v2-prod-bucket.whjr.online/2ca1041d-44f5-4f58-9d18-c229a4242cf6.PNG"/></center>


**The `st.markdown()` widget:**

- This widget displays a string formatted as Markdown. Markdown is a lightweight markup language for writing structured documents in plain text.

- We will use HTML (HyperText Markup Language) tags inside the string passed to this widget.

  **Syntax:** `st.markdown(body, unsafe_allow_html = True|False)`

where,
 - `body`: The string to be displayed (enclosed in HTML tags).
 - `unsafe_allow_html`: By default, any HTML tags found in the body will be escaped and therefore treated as pure text. To turn off this behaviour, this argument must be set to `True`.

Let us add a brief text description in blue color with a font size of 25 pixels in our app (as shown in the image above) using `st.markdown()` and the HTML `<p></p>` and `<b></b>` tags.
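
The HTML string passed to `st.markdown()` is an ordinary Python string, so it can be built and inspected without Streamlit. The following is a minimal sketch of this; the `styled_paragraph()` helper is hypothetical and only illustrates how the `<p>` tag, the `style` attribute and the `<b>` tag fit together.

```python
def styled_paragraph(inner_html, color, font_size_px):
    """Wrap 'inner_html' in a <p> tag with an inline color and font size."""
    return f"<p style='color:{color};font-size:{font_size_px}px'>{inner_html}</p>"

description = styled_paragraph(
    "This app uses <b>Linear regression</b> to predict the price of a car based on your inputs.",
    color="blue",
    font_size_px=25,
)
print(description)
```

In the app, this string would be passed as `st.markdown(description, unsafe_allow_html = True)`. Without `unsafe_allow_html = True`, the tags would be shown as plain text instead of being applied.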






---

#### Activity 3: Hosting

We have completed the entire code for our Multi-page Streamlit app that predicts the price of a car.

**Note to the Teacher:** You can download all the python scripts from the link given below:

https://drive.google.com/drive/folders/1tpGR50XMsuH4tY5YA7ZrvGbOoiKS_Y8o

You can now host the app on either a Heroku server or the Streamlit sharing service. After deploying it on the Streamlit sharing service, your app should look like the following:

https://share.streamlit.io/srahuliitb/car-price-predictor-streamlit-sharing/main/main_app.py

Here's the GitHub repository for the same:

https://github.com/srahuliitb/car-price-predictor-streamlit-sharing

---

### **Project**
You can now attempt the **Applied Tech.Project 96 - Multipage Streamlit App I** on your own.

**Applied Tech.Project 96 - Multipage Streamlit App I**: https://colab.research.google.com/drive/1EXe-W2cfoypdHZ3L7DILHXDHYfnCje4N

---