# Introduction

Enabling a business user to interact with the data and the outcomes of a project is often very important. With `streamlit` you can easily create web apps that allow your client to interact with the data without having to deal with the details and complexities of the analysis.

### Communication is key

Communication is a key part of data mining & exploration. In the business understanding phase, communication is essential to get a profound understanding of the client's problem and to define the objective and data mining goals of the project. In later phases, it is essential to translate the outcome of the project to the client and the organization.

### Communication is more than mere visualization
Communicating the outcomes of an analysis well often requires carefully designed and balanced data visualizations. However effective and attractive they may be, showing only charts and visuals in a slide deck is not always sufficient.

Often, clients need to be able to interact with the data (e.g. slice, filter, zoom etc.) in a dashboard.

With packages like `streamlit` it is relatively easy to build a web app or dashboard that allows a user to interact with the data.

In many projects, this ability can make or break the adoption of the project outcomes.

# What you will learn

- The basic principles of a web app
- How to configure your development environment to work with `streamlit`
- How to create a simple `streamlit` app
- The most important `streamlit` methods

# Why `streamlit`?

There are many web app frameworks for Python. They range from relatively simple ones (e.g. [Dash](https://plotly.com/dash/) for dashboards or [Flet](https://flet.dev/) for more general-purpose web apps) to more sophisticated and flexible ones (e.g. [Flask](https://flask.palletsprojects.com/en/2.2.x/) or [Django](https://www.djangoproject.com/)).

We have chosen [`streamlit`](https://streamlit.io/) because of its popularity and ease of use. It allows you to create a good-looking web app with a few lines of code.

# Installation

You can install `streamlit` with `poetry`:

```
$ poetry add streamlit
```

### Important note!
`streamlit` is not compatible with Python 3.9.7. Even if you don't have that exact version installed, the installation may fail because the Python version constraint in your `pyproject.toml` is too permissive (it also allows 3.9.7).

To prevent this, modify your `pyproject.toml` as follows:

```
[tool.poetry.dependencies]
python = ">=3.9,<3.9.7 || >3.9.7,<3.10"
```

This line allows all Python 3.9 versions except 3.9.7. Alternatively, you could pin the Python version that `poetry` uses to a specific version (other than 3.9.7).
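To make the meaning of this constraint concrete, the following minimal sketch expresses the same rule in plain Python. The function name `is_allowed_python` is made up for this illustration; `poetry` evaluates the constraint itself and does not use code like this.

```python
import sys


def is_allowed_python(version: tuple) -> bool:
    """Mirror the constraint '>=3.9,<3.9.7 || >3.9.7,<3.10' (illustration only)."""
    major, minor, patch = version[:3]
    # Any 3.9.x release is allowed, except the incompatible 3.9.7.
    return (major, minor) == (3, 9) and patch != 7


# Check the interpreter that runs this snippet.
print(is_allowed_python(tuple(sys.version_info[:3])))
```

For example, `is_allowed_python((3, 9, 6))` and `is_allowed_python((3, 9, 13))` are allowed, while `(3, 9, 7)` and `(3, 10, 0)` are not.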

# How a web app works

A web app uses a client-server architecture, which is a bit different from a normal command line app.

The basic concept is as follows:

- It starts with running a web server. This web server is an app that continuously listens on a port of your computer and waits for a request from a web browser.
- If the web server receives a request, it processes the request and returns an HTML page in response.
- The web browser receives the response, renders the HTML and shows the result in your web browser window.
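The request/response cycle described above can be sketched with Python's standard library alone. This is only an illustration of the concept, not how `streamlit` is implemented: the names `HelloHandler` and `fetch_once` are made up for this example, and a small HTTP client plays the role of the web browser.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Answers every GET request with a tiny HTML page."""

    def do_GET(self):
        body = b"<html><body><h1>Hello</h1></body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # Keep the console quiet for this demonstration.


def fetch_once() -> tuple:
    # Step 1: start a web server that listens on a free local port.
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # Step 2: the client (normally your browser) sends a request.
        connection = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        connection.request("GET", "/")
        # Step 3: the client receives the HTML response from the server.
        response = connection.getresponse()
        result = (response.status, response.read().decode("utf-8"))
        connection.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


status, html = fetch_once()
print(status)
print(html)
```

A real browser would render the returned HTML instead of printing it.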

### A minor complication when working remotely

When you work remotely (as we do on virtual machines), there is a minor complication. The web server that runs on your virtual machine cannot be reached directly from the outside world. We have to manually set up a connection from the ports of our local PC to the ports of the virtual machine.

Fortunately, Visual Studio Code provides functionality for this so-called port mapping (i.e. connecting a remote to a local port) and it only requires a simple additional step. 

In the Ports panel, we have to configure port forwarding. Once this has been done, we can submit requests in our local browser to the web server on the (remote) virtual machine and we can receive responses.

Follow these steps to configure port forwarding in Visual Studio Code:

1. Select the `PORTS` tab in the lower panel of the screen
   
   ![port_forwarding_01](img/port_forwarding_01.png)
   
2. Click on `Forward a Port`
   
3. Enter the port number and press `<Enter>`. The port number is typically `8501` for `streamlit`, but it may be different in your case. You can find the port number in the URL (after the `:`) that is shown when you run `streamlit`.
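If you are unsure which part of the URL is the port number, the small sketch below extracts it with the standard library. The URL `http://localhost:8501` is an assumed example of what `streamlit` may print, and `port_from_url` is a hypothetical helper name.

```python
from urllib.parse import urlsplit


def port_from_url(url: str) -> int:
    """Return the port number that follows the ':' after the host name."""
    port = urlsplit(url).port
    if port is None:
        raise ValueError(f"No explicit port found in {url!r}")
    return port


# Assumed example URL; use the one shown in your own terminal.
print(port_from_url("http://localhost:8501"))
```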



# Starting with `streamlit`

## How it works

If you want to turn your Python code into a `streamlit` app, all you have to do is add 

```{python}
import streamlit as st
``` 
to your code and call the proper `streamlit` "output" function. 

Note that we abbreviate `streamlit` to `st`, so that we can call the `streamlit` functions with the conveniently short prefix `st`.

The most basic output function is `st.write`. We will use `st.write` for our first 'hello world' `streamlit` app.

## `streamlit` and Jupyter Notebooks don't go together

As you can imagine, Jupyter Notebooks and `streamlit` don't go together very well. You have to write code in modules (separate `.py` files) and run them with `streamlit`.

The code shown below in Jupyter cells is not meant to be run directly from the notebook. You should add it to your own module(s) and run the module(s) using `streamlit`.



## Hello world in `streamlit`

We will start with a simple but not very useful app that shows static text.

In [None]:
import streamlit as st

st.write(
    r"""
    # Hello
    
    My name is **[put your name here]**.
    
    This is my first `streamlit` app for the Data Mining & Exploration course of the HU University of Applied Sciences Utrecht.
    
    As you can see, the `st.write` function supports (GitHub flavored) Markdown.
    
    # Header 1
    
    ## Header 2
    
    ### Header 3
    
    My personal goals for this course are:
    1. Learn robust data analysis in Python
    2. Conduct my own analysis in a project from my own practice
    3. Have some fun!
    
    $ x_{1,2} = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} $
        
    I'm a list:
    - Item 1
    - Item 2 with some _emphasis_
    - Goto [Hello](#hello)
    ---
    """
)

Save this to a separate module `src/hello.py`. 

You can run the code from the command line in your shell:

```
$ streamlit run src/hello.py
```

You will see that the web server has started and that it shows the address you can visit.

If you open the link in your browser (after having correctly configured port forwarding as shown above), you can see the results of your module.

As you can see, the `st.write` function is simple to use, but because it supports Markdown you can already create a good-looking page with formatting, headers, lists, hyperlinks and mathematical equations.

Visit [this page](https://github.github.com/gfm/#what-is-github-flavored-markdown-) if you want to learn more about Markdown.

### Modifying your code

If you want to change your code, you don't have to stop the web server. All you have to do is save your module. The web server will detect the changes and will ask you (in the upper-right corner of your web page) whether you want to rerun the code.

If you click `Always rerun` it will automatically refresh your page each time you save your file.

Note that, unless specified otherwise, `streamlit` always reruns the entire module when you save it. This may involve long-running analyses. In those cases, it is wise not to have the page refreshed automatically.

## Adding plots

We will now move on to a more realistic example with a plot. Create a new module `dashboard.py` for the code below.

In [None]:
from datetime import datetime

import matplotlib as mpl
import palmerpenguins
import seaborn as sns
import streamlit as st

Load the data into a Pandas dataframe.

In [None]:
penguins = palmerpenguins.load_penguins()

It is important to inform the user of a dashboard about how current the information on the dashboard is. Therefore, use `st.write` and the appropriate `datetime` function to add the following text with the current date in it:


> Last updated: \<dd MMM yyyy\>
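One possible way to build this text is sketched below. It assumes that `dd MMM yyyy` means a two-digit day, an abbreviated English month name and a four-digit year; the helper name `last_updated_text` is made up for this example. In your dashboard module you would pass the resulting string to `st.write`.

```python
from datetime import datetime


def last_updated_text(moment: datetime) -> str:
    """Format a timestamp as 'Last updated: dd MMM yyyy'."""
    # %d = zero-padded day, %b = abbreviated month name, %Y = four-digit year.
    return f"Last updated: {moment.strftime('%d %b %Y')}"


# Example with a fixed date; in the dashboard you would use datetime.now().
print(last_updated_text(datetime(2024, 3, 5)))
```

Note that `%b` follows the active locale, so the month abbreviation may differ if your program changes the locale.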

We want to enable the user to investigate the correlation between the bill depth and bill length of each penguin species. So, let's add a plot that does exactly that.

Add the following to your code:

In [None]:
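# pyplot provides subplots(); move this import to the top of your module.
import matplotlib.pyplot as plt

# Create a (scatter) plot on an explicit figure and axes.
# sns.scatterplot is used because it accepts an `ax` argument;
# the figure-level sns.relplot does not.
fig, ax = plt.subplots(1, 1)
sns.scatterplot(
    ax=ax,
    data=penguins,
    x='bill_length_mm',
    y='bill_depth_mm',
    hue='species',
)
ax.set_xlabel('Bill length (mm)')
ax.set_ylabel('Bill depth (mm)')
ax.set_title('Bill length and depth correlation across species')

# Add the figure to the page.
st.pyplot(fig)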

Note that on a dashboard, style and details matter. We add a title, use legible axis labels and add units to quantities.

### Adding interactivity

We want to add interactivity to our dashboard. As a first step, we will add checkboxes that allow us to select/unselect a species.

In order to be able to do so, we have to prepare the data. We need the following:
- A list `species` with the names of the species. 
- A list of booleans `is_selected` (with the same length as `species`) that indicates whether each species is selected or not.
- A filtered dataframe `penguins_selection` that includes only the selected species. 

Complete the following code with the appropriate `pandas` expressions:

In [None]:
species = penguins['species']...
is_selected = []        # Will be initialized below.

# Put your checkbox code (see below) here.

penguins_selection = penguins[ ... ]    
        # Hint: combine .isin() with species and is_selected.
        
# Add your plot to the page here.

You create checkboxes in `streamlit` with `st.checkbox`.

Since the number of species is limited, we will position the checkboxes horizontally next to each other. Each checkbox will get its own column. Columns are created with the `st.columns` function. It returns a list of column objects, each of which can be used as a context manager to position objects:

```{python}
    with column_object:
        ...
        # create objects that are placed in this column
        ...
```

We create a checkbox and a column for each species and combine them with `zip()`:

In [None]:
columns = st.columns(spec=len(species), gap='small')
for col, spec in zip(columns, species):
    with col:
        is_selected.append(st.checkbox(spec, value=True))

Make sure to place the dataframe filtering and the checkbox code above your `st.pyplot(...)`.

If all went well, you are now able to select and unselect the species shown in the plot.
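If the filtering step gives you trouble: the hint amounts to first turning the two parallel lists into a list of selected names. A minimal sketch in plain Python follows, with hard-coded example values in place of the real checkbox results. The name `selected_species` is an assumption and not part of the exercise code.

```python
# Example values; in the dashboard these come from the data and the checkboxes.
species = ["Adelie", "Chinstrap", "Gentoo"]
is_selected = [True, False, True]

# Keep only the names whose checkbox is ticked.
selected_species = [
    name for name, checked in zip(species, is_selected) if checked
]
print(selected_species)
```

A list like this can then be passed to `.isin()` on the `species` column to obtain the boolean mask for `penguins_selection`.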

We might also want to show the exact data, stored in our data frame. `streamlit` provides an easy way to add an interactive table to the page with the `st.dataframe()` function:

In [None]:
st.dataframe(penguins_selection)

We might want to hide details in the table and we want to have neat and legible column titles. Use `pandas` to rename the columns properly and modify your code to have it only show the columns `species`, `bill_length_mm` and `bill_depth_mm`.

### Rearranging the plot and the table

Not every business user is interested in the table and not every user is interested in the plot. To rearrange them, we will put the plot and the data table each on their own tab. However, we want the checkboxes to appear above both, because they apply to both.

In [3]:
tabs = st.tabs(['Plot', 'Table'])

This will create a list of tab context managers that we can use to position our objects in.

```
    with tabs[0]:
        ...
        # Place objects on the first tab ('Plot').
        ...

    with tabs[1]:
        ...
        # Place objects on the second tab ('Table').
```

Modify your code so that it:

- Always shows the checkboxes
- Shows the scatter plot on the `Plot` tab
- Shows the data frame on the `Table` tab
- Shows a table caption above the table on the `Table` tab (use the `st.caption` function)

