In [2]:
!pip install plotly pandas numpy

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


# Streamlit tutorial

## Covid19 dataset

In this tutorial we will use the COVID-19 time series data published by Johns Hopkins CSSE at https://github.com/CSSEGISandData/COVID-19/tree/master/csse_covid_19_data/csse_covid_19_time_series.

In [7]:
import pandas as pd

confirmed = pd.read_csv("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv")
deaths    = pd.read_csv("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_deaths_global.csv")
recovered = pd.read_csv("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_recovered_global.csv")

In [4]:
confirmed

Unnamed: 0,Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20,1/25/20,1/26/20,1/27/20,...,2/28/23,3/1/23,3/2/23,3/3/23,3/4/23,3/5/23,3/6/23,3/7/23,3/8/23,3/9/23
0,,Afghanistan,33.939110,67.709953,0,0,0,0,0,0,...,209322,209340,209358,209362,209369,209390,209406,209436,209451,209451
1,,Albania,41.153300,20.168300,0,0,0,0,0,0,...,334391,334408,334408,334427,334427,334427,334427,334427,334443,334457
2,,Algeria,28.033900,1.659600,0,0,0,0,0,0,...,271441,271448,271463,271469,271469,271477,271477,271490,271494,271496
3,,Andorra,42.506300,1.521800,0,0,0,0,0,0,...,47866,47875,47875,47875,47875,47875,47875,47875,47890,47890
4,,Angola,-11.202700,17.873900,0,0,0,0,0,0,...,105255,105277,105277,105277,105277,105277,105277,105277,105288,105288
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
284,,West Bank and Gaza,31.952200,35.233200,0,0,0,0,0,0,...,703228,703228,703228,703228,703228,703228,703228,703228,703228,703228
285,,Winter Olympics 2022,39.904200,116.407400,0,0,0,0,0,0,...,535,535,535,535,535,535,535,535,535,535
286,,Yemen,15.552727,48.516388,0,0,0,0,0,0,...,11945,11945,11945,11945,11945,11945,11945,11945,11945,11945
287,,Zambia,-13.133897,27.849332,0,0,0,0,0,0,...,343012,343012,343079,343079,343079,343135,343135,343135,343135,343135


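Now we preprocess the data: we sum the provinces of each country into a single row and then transpose the tables, so that dates form the index and countries form the columns.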

In [8]:
date_columns = [x for x in confirmed][4:]

confirmed = confirmed.groupby(['Country/Region'])[date_columns].sum().reset_index()
deaths    = deaths.groupby(['Country/Region'])[date_columns].sum().reset_index()
recovered = recovered.groupby(['Country/Region'])[date_columns].sum().reset_index()

countries = confirmed['Country/Region'].to_list()

confirmed.rename(columns={'Country/Region':'Country'}, inplace=True)
deaths.rename(columns={'Country/Region':'Country'}, inplace=True)
recovered.rename(columns={'Country/Region':'Country'}, inplace=True)

confirmed = confirmed.set_index(confirmed.Country)[date_columns].T
deaths = deaths.set_index(deaths.Country)[date_columns].T
recovered = recovered.set_index(recovered.Country)[date_columns].T
countries = [x for x in confirmed]

deaths

Country,Afghanistan,Albania,Algeria,Andorra,Angola,Antarctica,Antigua and Barbuda,Argentina,Armenia,Australia,...,Uruguay,Uzbekistan,Vanuatu,Venezuela,Vietnam,West Bank and Gaza,Winter Olympics 2022,Yemen,Zambia,Zimbabwe
1/22/20,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1/23/20,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1/24/20,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1/25/20,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1/26/20,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3/5/23,7896,3598,6881,165,1933,0,146,130463,8721,19459,...,7617,1637,14,5854,43186,5708,0,2159,4057,5668
3/6/23,7896,3598,6881,165,1933,0,146,130472,8721,19459,...,7617,1637,14,5854,43186,5708,0,2159,4057,5668
3/7/23,7896,3598,6881,165,1933,0,146,130472,8721,19459,...,7617,1637,14,5854,43186,5708,0,2159,4057,5668
3/8/23,7896,3598,6881,165,1933,0,146,130472,8727,19459,...,7617,1637,14,5854,43186,5708,0,2159,4057,5671
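
The reshaping above can be hard to follow on the full dataset, so here is a minimal sketch of the same idea on a toy table. The country names and numbers are made up for illustration, and the sketch skips the rename and `set_index` steps by transposing the grouped result directly, which gives the same shape.

```python
import pandas as pd

# Toy stand-in for the raw CSV: two rows for one country, one for another.
# Names and numbers are hypothetical.
raw = pd.DataFrame({
    "Province/State": ["North", "South", None],
    "Country/Region": ["Atlantis", "Atlantis", "Utopia"],
    "Lat": [0.0, 1.0, 2.0],
    "Long": [0.0, 1.0, 2.0],
    "1/22/20": [1, 2, 5],
    "1/23/20": [3, 4, 6],
})

# Everything after the first four metadata columns is a date
date_cols = raw.columns[4:].to_list()

# Sum provinces into one row per country
per_country = raw.groupby("Country/Region")[date_cols].sum()

# Transpose: dates become the index, countries become the columns
by_date = per_country.T

print(by_date)
```

After the transpose, a single country is a column and a date range is a row slice, which is what the selection in the next step relies on.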


Now we can easily select data between dates for selected countries.

In [9]:
confirmed.loc['3/1/20':'11/27/20', ['Czechia','Austria']]

Country,Czechia,Austria
3/1/20,3,7
3/2/20,3,8
3/3/20,5,12
3/4/20,8,17
3/5/20,12,23
...,...,...
11/23/20,496638,247329
11/24/20,502534,249765
11/25/20,505215,254373
11/26/20,511520,260116
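
Note that the index above holds plain strings, so the slice endpoints are matched as literal labels rather than compared as dates. One possible alternative, shown here as a sketch on a made-up toy table and not as part of the tutorial's app, is to parse the labels into a `DatetimeIndex` so that slicing works on real dates.

```python
import pandas as pd

# Toy frame shaped like the transposed `confirmed` table (values are made up)
toy = pd.DataFrame(
    {"Czechia": [3, 3, 5, 8], "Austria": [7, 8, 12, 17]},
    index=["3/1/20", "3/2/20", "3/3/20", "3/4/20"],
)

# Parse the m/d/yy labels into real timestamps
toy_dt = toy.copy()
toy_dt.index = pd.to_datetime(toy_dt.index, format="%m/%d/%y")

# Slicing now compares dates, and both endpoints are included
subset = toy_dt.loc["2020-03-02":"2020-03-03", ["Czechia"]]
print(subset)
```

The rest of this tutorial keeps the original string labels, because the app later looks dates up in a list of those same strings.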


Then we plot the selected data.

In [11]:
import plotly.graph_objects as go

fig = go.Figure()

# Slice the dates once so that x and y always cover the same range
date_from, date_to = '3/1/20', '11/27/20'
selected_dates = confirmed.loc[date_from:date_to].index.to_list()

for x in ['Czechia', 'Austria']:
    fig.add_trace(
        go.Scatter(
            y=confirmed.loc[date_from:date_to, x].to_list(),
            x=selected_dates,
            name=x+' confirmed'
        )
    )
    fig.add_trace(
        go.Scatter(
            y=deaths.loc[date_from:date_to, x].to_list(),
            x=selected_dates,
            name=x+' deaths'
        )
    )
    fig.add_trace(
        go.Scatter(
            y=recovered.loc[date_from:date_to, x].to_list(),
            x=selected_dates,
            name=x+' recovered'
        )
    )


fig.update_layout(
    title="Number of COVID19 confirmed cases, deaths and recovered",
    xaxis_title="Date",
    yaxis_title="Cases (logarithmic scale)",
    yaxis_type="log", # switch log scale on for y axis
    hovermode='x' # compare data on hover by default
)
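
The three `add_trace` calls differ only in the table they read from, so the traces can also be built in a loop. The sketch below is one possible way to do that, using made-up toy tables and a hypothetical helper named `build_traces`. It returns plain dictionaries, which Plotly accepts as trace specifications (for example via `go.Figure(data=traces)`), so the helper itself does not need Plotly.

```python
import pandas as pd

# Toy tables shaped like the transposed frames above (made-up numbers)
idx = ["3/1/20", "3/2/20", "3/3/20"]
tables = {
    "confirmed": pd.DataFrame({"Czechia": [3, 3, 5], "Austria": [7, 8, 12]}, index=idx),
    "deaths": pd.DataFrame({"Czechia": [0, 0, 0], "Austria": [0, 0, 1]}, index=idx),
}

def build_traces(tables, countries, date_from, date_to):
    """Return one scatter-trace spec (a plain dict) per country and table."""
    traces = []
    for country in countries:
        for label, table in tables.items():
            window = table.loc[date_from:date_to, country]
            traces.append({
                "type": "scatter",
                "x": window.index.to_list(),  # dates of the same slice as y
                "y": window.to_list(),
                "name": f"{country} {label}",
            })
    return traces

traces = build_traces(tables, ["Czechia", "Austria"], "3/2/20", "3/3/20")
print(len(traces))
```

Taking both `x` and `y` from the same sliced series keeps the two lists the same length for every trace.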



It would be convenient to pick dates, countries and other parameters interactively. So, is it possible to create a web app for data visualisation?

Yes, with streamlit.

## Streamlit

Streamlit is an open-source Python library that makes it easy to create and share beautiful, custom web apps for machine learning and data science. In just a few minutes you can build and deploy powerful data apps - as I will show you below.

API Documentation link: https://docs.streamlit.io/en/stable/api.html

Let's write a simple file to disk.

In [12]:
!pip install streamlit
!npm install localtunnel

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting streamlit
  Downloading streamlit-1.22.0-py2.py3-none-any.whl (8.9 MB)
Collecting blinker>=1.0.0
  Downloading blinker-1.6.2-py3-none-any.whl (13 kB)
Collecting pympler>=0.9
  Downloading Pympler-1.0.1-py3-none-any.whl (164 kB)
Collecting validators>=0.2
  Downloading validators-0.20.0.tar.gz (30 kB)
  Preparing metadata (setup.py) ... done
Collecting gitpython!=3.1.19
  Downloading GitPython-3.1.31-py3-none-any.whl (184 kB)
Collecting watchdog
  Downloading watchdog-3.0.0-py3-none-manylinux2014_x86_64.whl

In [13]:
%%writefile streamlitapp.py

import streamlit as st

st.write("Hello world!")

Writing streamlitapp.py


In [15]:
# workaround for live running streamlit app in google colab
!streamlit run streamlitapp.py &>/content/logs.txt &
!npx localtunnel --port 8501

npx: installed 22 in 3.252s
^C


In [16]:
!cat /content/logs.txt


Collecting usage statistics. To deactivate, set browser.gatherUsageStats to False.


  You can now view your Streamlit app in your browser.

  Network URL: http://172.28.0.12:8501
  External URL: http://34.86.190.6:8501

  Stopping...


You can now run the Streamlit server with the command `streamlit run streamlitapp.py`, and the app should look like this:

![hello_world](https://raw.githubusercontent.com/zombak79/streamlit-tutorial/main/img/hello_world.gif)

Streamlit also has many interactive widgets:

In [17]:
%%writefile streamlitapp.py

import streamlit as st

st.title('Demo app')

st.write("Let's test some **interactive widgets:**")

# ****************** BUTTONS **************************
st.write("**Buttons**")
if st.button('Button 1', key='button_1'):
     st.write(':-)')
else:
     st.write(':-(')
                
if st.button('Button 2', key='button_2'):
     st.write(':-)')
else:
     st.write(':-(')
        
# ****************** CHECKBOX *************************
st.write("**Checkbox**")
check = st.checkbox("Check me", value=False, key='check_1')
if check:
    st.write('Checked')
else:
    st.write('Not checked')
    
# ******************** RADIO **************************
st.write("**Radio button**")
radio = st.radio(
    "What is your favorite season?",
    options=['Spring', 'Summer', 'Autumn', 'Winter'],
    key='radio_1'    
)
st.write(radio)

# ****************** SELECT BOX ***********************
st.write("**Select box**")
sbox = st.selectbox(
    "What is your favorite season?",
    options=['Spring', 'Summer', 'Autumn', 'Winter'],
    key='sbox_1'
)
st.write(sbox)

# ****************** MULTI SELECT *********************
st.write("**Multi select box**")
ms = st.multiselect(
    "What is your favorite season?",
    options=['Spring', 'Summer', 'Autumn', 'Winter'],
    key='ms_1'
)
st.write(ms)

# ****************** SLIDER ***************************
st.write("**Slider**")
slider = st.slider(
    "Select number",
    min_value = 0,
    max_value = 10,
    step = 1,
    key="slider_1"
)
st.write("Number selected", slider)
st.write("*Note: for picking from list, you can use* `st.select_slider`*. See documentation for more info.*")

# **************** INPUTS *****************************
st.write("**Text Input**")
some_text = st.text_input("Input text:", key="inp_1")
st.write(some_text)
st.write("*Note: same for numbers (number_input), long texts (text_area), date and time inputs.*")


Overwriting streamlitapp.py


![widgets](https://raw.githubusercontent.com/zombak79/streamlit-tutorial/main/img/widgets.gif)

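We can arrange objects on the page in columns with `st.columns()` (called `st.beta_columns()` in older Streamlit releases) and group them with `st.container()` (formerly `st.beta_container()`).

We can also display the code that is being run with `st.echo()`.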

In [18]:
%%writefile streamlitapp.py

import streamlit as st

st.title('Demo app')

st.write("Let's test some **interactive widgets:**")

# ****************** BUTTONS **************************

with st.container():
    st.write("**Buttons**")
    with st.echo():
        col1, col2 = st.columns(2)
        if col1.button('Button 1', key='button_1'):
             col2.write(':-)')
        else:
             col2.write(':-(')

        col1, col2 = st.columns(2)
        if col1.button('Button 2', key='button_2'):
             col2.write(':-)')
        else:
             col2.write(':-(')

# ****************** CHECKBOX *************************
with st.container():
    st.write("**Checkbox**")
    with st.echo():
        col1, col2 = st.columns(2)
        check = col1.checkbox("Check me", value=False, key='check_1')
        if check:
            col2.write('Checked')
        else:
            col2.write('Not checked')

# ******************** RADIO **************************
with st.container():
    st.write("**Radio button**")
    with st.echo():
        col1, col2 = st.columns(2)
        radio = col1.radio(
            "What is your favorite season?",
            options=['Spring', 'Summer', 'Autumn', 'Winter'],
            key='radio_1'
        )
        col2.write(radio)

# ****************** SELECT BOX ***********************
with st.container():
    st.write("**Select box**")
    with st.echo():
        col1, col2 = st.columns(2)
        sbox = col1.selectbox(
            "What is your favorite season?",
            options=['Spring', 'Summer', 'Autumn', 'Winter'],
            key='sbox_1'
        )
        col2.write(sbox)

# ****************** MULTI SELECT *********************
with st.container():
    st.write("**Multi select box**")
    with st.echo():
        col1, col2 = st.columns(2)
        ms = col1.multiselect(
            "What is your favorite season?",
            options=['Spring', 'Summer', 'Autumn', 'Winter'],
            key='ms_1'
        )
        col2.write(ms)

# ****************** SLIDER ***************************
with st.container():
    st.write("**Slider**")
    with st.echo():
        col1, col2 = st.columns(2)
        slider = col1.slider(
            "Select number",
            min_value = 0,
            max_value = 10,
            step = 1,
            key="slider_1"
        )
        # multiple inputs work only for st.write and st.sidebar.write
        col2.write("Number selected " + str(slider))
    st.write("*Note: for picking from list, you can use* `st.select_slider`*. See documentation for more info.*")

# **************** INPUTS *****************************
with st.container():
    st.write("**Text Input**")
    with st.echo():
        col1, col2 = st.columns(2)
        some_text = col1.text_input("Input text:", key="inp_1")
        col2.write(some_text)
    st.write("*Note: same for numbers (number_input), long texts (text_area), date and time inputs.*")



Overwriting streamlitapp.py


![columns_echo.gif](https://raw.githubusercontent.com/zombak79/streamlit-tutorial/main/img/columns_echo.gif)

With `st.sidebar` we can separate the widgets for user input from the visualised data.

In [19]:
%%writefile streamlitapp.py

import streamlit as st

st.set_page_config(
    page_title='Demo app', 
    page_icon=None, 
    layout='centered', 
    initial_sidebar_state='expanded')


st.title('Demo app')
st.sidebar.title("Parameters")
st.write("Let's test some **interactive widgets:**")

# ****************** MULTI SELECT *********************
st.write("**Multi select box**")
ms = st.sidebar.multiselect(
    "What is your favorite season?",
    options=['Spring', 'Summer', 'Autumn', 'Winter'],
    key='ms_1'
)
st.write(ms)



Overwriting streamlitapp.py


![sidebar.gif](https://raw.githubusercontent.com/zombak79/streamlit-tutorial/main/img/sidebar.gif)

## Let's put it all together

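We will create two functions. The first will download the data and the second will create the Plotly visualization. We will also cache them with `@st.cache` to save time. Note that Streamlit 1.18 and later deprecate `st.cache` in favour of `st.cache_data` and `st.cache_resource`.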

Widgets will be in the sidebar.
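
The data-loading function in the app takes a date string that it never uses inside its body. The argument appears to serve only as a cache key, so that the data is downloaded again once the date changes. The sketch below illustrates that idea with the standard library's `functools.lru_cache` instead of Streamlit's cache (a deliberate substitution, so the example runs without Streamlit); the function body is a placeholder, not the real download.

```python
from datetime import date
from functools import lru_cache

calls = []  # records every real "download" so we can see the cache working

@lru_cache(maxsize=None)
def load_data(day: str):
    """Pretend to download data; `day` only serves as the cache key."""
    calls.append(day)
    return f"data as of {day}"

today = date.today().isoformat()
load_data(today)         # first call: runs the function body
load_data(today)         # same key: served from the cache
load_data("2020-01-01")  # different key: runs the body again

print(len(calls))  # prints 2
```

Streamlit's caching decorators work on the same principle of keying on the function arguments, with extra handling that is specific to Streamlit.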

In [20]:
%%writefile streamlitapp.py

import pandas as pd
import plotly.graph_objects as go
import streamlit as st
from datetime import datetime

st.set_page_config(
    page_title='Covid19 data explorer', 
    page_icon=None, 
    layout='centered', 
    initial_sidebar_state='expanded')

@st.cache
def load_data(date):
    confirmed = pd.read_csv("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv")
    deaths    = pd.read_csv("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_deaths_global.csv")
    recovered = pd.read_csv("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_recovered_global.csv")
    date_columns = [x for x in confirmed][4:]

    confirmed = confirmed.groupby(['Country/Region'])[date_columns].sum().reset_index()
    deaths    = deaths.groupby(['Country/Region'])[date_columns].sum().reset_index()
    recovered = recovered.groupby(['Country/Region'])[date_columns].sum().reset_index()

    countries = confirmed['Country/Region'].to_list()

    confirmed.rename(columns={'Country/Region':'Country'}, inplace=True)
    deaths.rename(columns={'Country/Region':'Country'}, inplace=True)
    recovered.rename(columns={'Country/Region':'Country'}, inplace=True)

    confirmed = confirmed.set_index(confirmed.Country)[date_columns].T
    deaths = deaths.set_index(deaths.Country)[date_columns].T
    recovered = recovered.set_index(recovered.Country)[date_columns].T
    countries = [x for x in confirmed]

    return countries, date_columns, confirmed, recovered, deaths

@st.cache
def get_plotly_object(selected_countries, date_from, date_to, show_confirmed, show_deaths, show_recovered, show_legend, logaritmic):
    # convert the dates to the m/d/yy labels used in the data
    # ('%-m' and '%-d' are platform-specific and not supported on Windows)
    date_from = date_from.strftime('%-m/%-d/%y')
    date_to = date_to.strftime('%-m/%-d/%y')

    # .loc includes the end label, so the list slice needs +1 to match it
    selected_dates = dates[dates.index(date_from):dates.index(date_to) + 1]

    fig = go.Figure()

    for x in selected_countries:

        if show_confirmed:
            fig.add_trace(
                go.Scatter(
                    y=confirmed.loc[date_from:date_to, x].to_list(),
                    x=selected_dates,
                    name=x+' confirmed'
                )
            )

        if show_deaths:
            fig.add_trace(
                go.Scatter(
                    y=deaths.loc[date_from:date_to, x].to_list(),
                    x=selected_dates,
                    name=x+' deaths'
                )
            )

        if show_recovered:
            fig.add_trace(
                go.Scatter(
                    y=recovered.loc[date_from:date_to, x].to_list(),
                    x=selected_dates,
                    name=x+' recovered'
                )
            )

    if logaritmic:
        ya = 'log'
        y_title = "Cases (logarithmic scale)"
    else:
        ya = 'linear'
        y_title = "Cases"

    fig.update_layout(
        title="Number of COVID19 confirmed cases, deaths and recovered",
        xaxis_title="Date",
        yaxis_title=y_title,
        yaxis_type=ya, # switch between log and linear scale for y axis
        hovermode='x', # compare data on hover by default
        showlegend=show_legend
    )

    return fig


countries, dates, confirmed, recovered, deaths = load_data(datetime.today().strftime('%Y-%m-%d'))

st.title('Covid19 data explorer')

# sidebar
selected_countries = st.sidebar.multiselect(
    "Select countries",
    options=countries,
    default=['Czechia'],
    key='ms_1'
)

date_from = st.sidebar.date_input(
    "Date from",
    min_value = datetime.strptime(dates[0], '%m/%d/%y'),
    max_value = datetime.strptime(dates[-1], '%m/%d/%y'),
    value=datetime.strptime(dates[0], '%m/%d/%y'),
    key="date_from"
)

date_to = st.sidebar.date_input(
    "Date to",
    min_value = datetime.strptime(dates[0], '%m/%d/%y'),
    max_value = datetime.strptime(dates[-1], '%m/%d/%y'),
    value=datetime.strptime(dates[-1], '%m/%d/%y'),
    key="date_to"
)

show_confirmed = st.sidebar.checkbox("Show confirmed cases", value=True, key='check_1')
show_deaths = st.sidebar.checkbox("Show deaths", value=True, key='check_2')
show_recovered = st.sidebar.checkbox("Show recovered", value=True, key='check_3')
show_legend = st.sidebar.checkbox("Show legend", value=True, key='check_4')
logaritmic = st.sidebar.checkbox("Log scale", value=True, key='check_5')

# use data to generate plot

plotly_fig = get_plotly_object(selected_countries, date_from, date_to,show_confirmed,show_deaths,show_recovered,show_legend,logaritmic)
st.write(plotly_fig)

Overwriting streamlitapp.py


![covid.gif](https://raw.githubusercontent.com/zombak79/streamlit-tutorial/main/img/covid.gif)
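
Two details of the date handling in the app are easy to get wrong. The `'%-m/%-d/%y'` format relies on a platform-specific `strftime` extension, and `.loc` label slicing includes the end label while Python list slicing excludes the stop index. The sketch below shows one portable way to handle both; the helper names `to_label` and `inclusive_window` are hypothetical and not part of the app above.

```python
from datetime import date

def to_label(d: date) -> str:
    """Format a date like the CSV headers, e.g. 3/1/20 (no zero padding)."""
    return f"{d.month}/{d.day}/{d:%y}"

def inclusive_window(labels, start, end):
    """Return labels from start to end, including both endpoints."""
    return labels[labels.index(start):labels.index(end) + 1]

labels = ["2/28/20", "2/29/20", "3/1/20", "3/2/20", "3/3/20"]
start = to_label(date(2020, 2, 29))
end = to_label(date(2020, 3, 2))
window = inclusive_window(labels, start, end)
print(start, end, window)
```

Building the label from `d.month` and `d.day` avoids the zero padding that plain `%m` and `%d` would produce, so the result matches the column names in the dataset on any platform.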