
Demos - Geo - Plotly and streamlit
==================================


* [Medium - How to Create a Simple GIS Map with Plotly and Streamlit](https://towardsdatascience.com/how-to-create-a-simple-gis-map-with-plotly-and-streamlit-7732d67b84e2)
  + Date: Dec. 2023
  + Author: Alan Jones
  ([Alan Jones on Medium](https://medium.com/@alan-jones),
  [Alan Jones on LinkedIn](https://www.linkedin.com/in/alan-jones-032699100/))
  + Publisher: Medium TowardsDataScience
  + [GitHub - repository with the code for that notebook](https://github.com/alanjones2/st-choropleth/tree/main)
* [GitHub - Data Engineering helpers - Material about geo-analysis with Python](https://github.com/data-engineering-helpers/ks-cheat-sheets/blob/main/programming/python/geo/README.md)
* [Demos - Geo - Plotly and streamlit (this notebook)](https://github.com/data-engineering-helpers/databricks-examples/blob/main/ipython-notebooks/demos-geo-plotly-and-streamlit.ipynb)

In [0]:
%pip install streamlit plotly

Note: you may need to restart the kernel using dbutils.library.restartPython() to use updated packages.

In [0]:
dbutils.library.restartPython()

In [0]:
import pandas as pd # pyspark.pandas as pd
import streamlit as st
import plotly.express as px
import json


# Data
Our final dashboard will use data about CO2 emissions. This data is derived from tables
in a GitHub repository belonging to
[Our World in Data](https://ourworldindata.org/).
OWID is a great source of data and analysis which I have used many times
(_e.g._ [New Data Demonstrates that 2023 was the Hottest Summer Ever](https://medium.com/towards-data-science/new-data-demonstrates-that-2023-was-the-hottest-summer-ever-d92d500a8f01)).

The data that I have copied from OWID represents the CO2 emissions of countries since 1750.
The original data contains far more information than we need, so I have created subsets
of the data and stored them with the app, in its `data` folder.


In [0]:
%sh

rm -rf /dbfs/tmp/demo-streamlit
mkdir -p /dbfs/tmp/demo-streamlit/{data,geo}
curl -kL https://github.com/alanjones2/st-choropleth/raw/main/data/co2_total.csv -o /dbfs/tmp/demo-streamlit/data/co2_total.csv
curl -kL https://github.com/alanjones2/st-choropleth/raw/main/data/Australian%20Bureau%20of%20Statistics.csv -o /dbfs/tmp/demo-streamlit/data/Australian-Bureau-of-Statistics.csv
head -5 /dbfs/tmp/demo-streamlit/data/co2_total.csv
curl -kL https://github.com/alanjones2/st-choropleth/raw/main/geo/australia.geojson -o /dbfs/tmp/demo-streamlit/geo/australia.geojson
ls -lFhR /dbfs/tmp/demo-streamlit



,Entity,Code,Year,Annual CO₂ emissions
0,Afghanistan,AFG,1949,14656.0
1,Afghanistan,AFG,1950,84272.0
2,Afghanistan,AFG,1951,91600.0
3,Afghanistan,AFG,1952,91600.0



/dbfs/tmp/demo-streamlit:
total 8.0K
drwxrwxrwx 2 nobody nogroup 4.0K Dec 28 18:04 data/
drwxrwxrwx 2 nobody nogroup 4.0K Dec 28 18:04 geo/

/dbfs/tmp/demo-streamlit/data:
total 808K
-rwxrwxrwx 1 nobody nogroup  468 Dec 28  2023 Australian-Bureau-of-Statistics.csv*
-rwxrwxrwx 1 nobody nogroup 807K Dec 28 18:04 co2_total.csv*

/dbfs/tmp/demo-streamlit/geo:
total 623K
-rwxrwxrwx 1 nobody nogroup 623K Dec 28  2023 australia.geojson*


In [0]:
df_total = pd.read_csv("/dbfs/tmp/demo-streamlit/data/co2_total.csv")
df_total_2021 = df_total[df_total['Year']==2021]
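
The boolean mask above keeps only the rows whose `Year` equals 2021. As a minimal sketch of the same idiom on a tiny in-memory frame: the four Afghanistan rows are copied from the `head` output earlier, while `sample_csv`, `df_sample` and `df_1950` are illustrative names that do not appear in the notebook.

```python
import io

import pandas as pd

# First rows of co2_total.csv, as printed by `head -5` earlier in the notebook.
sample_csv = """,Entity,Code,Year,Annual CO₂ emissions
0,Afghanistan,AFG,1949,14656.0
1,Afghanistan,AFG,1950,84272.0
2,Afghanistan,AFG,1951,91600.0
3,Afghanistan,AFG,1952,91600.0
"""

# index_col=0 reuses the unnamed first column as the index instead of
# loading it as an extra "Unnamed: 0" column.
df_sample = pd.read_csv(io.StringIO(sample_csv), index_col=0)

# Boolean mask: keep only the rows for a single year.
df_1950 = df_sample[df_sample["Year"] == 1950]
print(df_1950)
```

The notebook's own `read_csv` call omits `index_col`, which is harmless here because only named columns are used afterwards.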

In [0]:
col = 'Annual CO₂ emissions'
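# Bounds for the color scale. NOTE: these names shadow Python's built-in
# max() and min(); later cells refer to them, so they are kept unchanged.
max = df_total_2021[col].max()
min = df_total_2021[col].min()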
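
Binding the bounds to `max` and `min` hides the built-in functions of the same name for as long as those variables exist. The following is a minimal sketch of the pitfall and of one possible alternative naming; `co2_min`, `co2_max`, `values` and the helper function are hypothetical names, not part of the notebook.

```python
values = [14656.0, 84272.0, 91600.0]  # illustrative emission values

# Distinct names leave the built-in min() and max() untouched.
co2_min, co2_max = min(values), max(values)


def shadowing_breaks_builtin():
    """Show what happens once `max` is rebound to a number."""
    max = co2_max  # local rebinding, mirrors `max = df[col].max()`
    try:
        max(values)  # `max` is now a float, so calling it fails
    except TypeError as exc:
        return str(exc)
    return None


error_message = shadowing_breaks_builtin()
print(co2_min, co2_max)
print(error_message)
```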

In [0]:
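# World choropleth of the 2021 emissions. "Code" holds three-letter ISO country
# codes (e.g. AFG), which matches Plotly's default locationmode ('ISO-3').
fig = px.choropleth(df_total_2021,
                    locations="Code",
                    color=col,
                    hover_name="Entity",
                    range_color=(min, max)
                    )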

In [0]:
fig = px.choropleth(df_total_2021,
                    locations="Code",
                    scope='europe',
                    color=col,
                    hover_name="Entity",
                    range_color=(min, max),
                    title='Europe'
                    )

In [0]:
st.title("Population of Australian States")
st.info("Hover over the map to see the names of the states and their population")
with open('/dbfs/tmp/demo-streamlit/geo/australia.geojson') as f:
    oz = json.load(f)

df = pd.read_csv('/dbfs/tmp/demo-streamlit/data/Australian-Bureau-of-Statistics.csv')
fig = px.choropleth(df, geojson=oz, 
                    color="Population at 31 March 2023 ('000)",
                    locations="State", 
                    featureidkey="properties.name",
                    color_continuous_scale="Reds",
                    range_color=(0, 10000),
                    fitbounds = 'geojson',
                    template = 'plotly_white'
                   )
st.plotly_chart(fig)

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
File <command-4483247640063516>, line 7
      4 oz = json.load(f)
      6 df = pd.read_csv('/dbfs/tmp/demo-streamlit/data/Australian-Bureau-of-Statistics.csv')
----> 7 fig = px.choropleth(df, geojson=oz, 
      8                     color="Population at 31 March 2023 ('000)",
      9                     locations="State", 
     10                     featureidkey="properties.name
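
The traceback above is cut off before the error message, so the actual cause of the `ValueError` is not visible here. Two things are nonetheless worth checking before calling `px.choropleth` with a custom GeoJSON: that the column names passed as `color` and `locations` exist in the CSV header, and that every `State` value matches a feature's `properties.name`, since that is what `featureidkey="properties.name"` asks Plotly to join on. A minimal standard-library sketch of both checks follows; the miniature GeoJSON, the CSV text (including its made-up population figures and the deliberate misspelling) and both helper functions are hypothetical stand-ins, not the downloaded files.

```python
import csv
import io

# Hypothetical miniature stand-ins for the downloaded files.
geojson = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"name": "Victoria"}, "geometry": None},
        {"type": "Feature", "properties": {"name": "Tasmania"}, "geometry": None},
    ],
}
# Made-up figures; "Tasmnia" is misspelled on purpose.
csv_text = "State,Population\nVictoria,6800\nTasmnia,570\n"

rows = list(csv.DictReader(io.StringIO(csv_text)))
columns = list(rows[0].keys())


def missing_columns(columns, required):
    """Return the required column names that the CSV header lacks."""
    return [name for name in required if name not in columns]


def unmatched_locations(rows, geojson, locations_col, prop):
    """Return location values with no feature whose properties[prop] matches."""
    feature_ids = {f["properties"].get(prop) for f in geojson["features"]}
    return [r[locations_col] for r in rows if r[locations_col] not in feature_ids]


absent = missing_columns(columns, ["State", "Population at 31 March 2023 ('000)"])
unmatched = unmatched_locations(rows, geojson, "State", "name")
print(absent)
print(unmatched)
```

Against the real files, `rows` would come from the downloaded CSV and `geojson` from `json.load`; an empty result from both helpers rules out these two causes but not others.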

In [0]:
df = pd.read_csv('data/europop.csv')  # NOTE: europop.csv is not among the files downloaded above

fig = px.scatter_geo(df, scope = 'europe', 
                    color="Population (historical estimates)",
                    size="Population (historical estimates)",
                    locations="Code", 
                    hover_name = 'Entity',
                    color_continuous_scale="Purples",
                    range_color=(0, 100000000),
                    fitbounds = 'locations',
                    template = 'plotly_white',
                    title = "European populations"
                   )

st.plotly_chart(fig)

---------------------------------------------------------------------------
IllegalArgumentException                  Traceback (most recent call last)
File <command-4483247640063517>, line 1
----> 1 df = pd.read_csv('data/europop.csv')
      3 fig = px.scatter_geo(df, scope = 'europe', 
      4                     color="Population (historical estimates)",
      5                     size="Population (historical estimates)",
   (...)
     12                     title = "European populations"
     13                 

In [0]:
fig = px.choropleth(df_total[df_total['Year']==year], 
                    locations="Code",
                    color=col,
                    hover_name="Entity",
                    range_color=(min,max),
                    scope= 'world',
                    projection="orthographic",
                    color_continuous_scale=px.colors.sequential.Reds)
st.plotly_chart(fig)

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
File <command-4483247640063522>, line 1
----> 1 fig = px.choropleth(df_total[df_total['Year']==year], 
      2                     locations="Code",
      3                     color=col,
      4                     hover_name="Entity",
      5                     range_color=(min,max),
      6                     scope= 'world',
      7                     projection=

In [0]:
st.set_page_config(layout="wide")

st.title("CO2 Emissions")
st.write("""The following maps display the CO2 emissions for a
            range of countries over a range of time""")

st.info("""Use the slider to select a year to display
           the total emissions for each country. 
           Scroll down to see an interactive 3D representation.""")

col1, col2 = st.columns(2)

df_total = pd.read_csv('data/co2_total.csv')  # NOTE: the file was downloaded to /dbfs/tmp/demo-streamlit/data/co2_total.csv
col = 'Annual CO₂ emissions'
max = df_total[col].max()
min = df_total[col].min()

# To get the whole range replace 1950 with the comment that follows it
first_year = 1950 #df_total['Year'].min()
last_year = df_total['Year'].max()
year = st.slider('Select year',first_year,last_year, key=col)

col1.write("""This map uses the 'Natural Earth' projection""")
fig = px.choropleth(df_total[df_total['Year']==year], 
                    locations="Code",
                    color=col,
                    hover_name="Entity",
                    range_color=(min,max),
                    scope= 'world',
                    projection="natural earth",
                    color_continuous_scale=px.colors.sequential.Reds)
col1.plotly_chart(fig)

col2.write("""This map uses the 'Orthographic' projection.
Click on the globe and move the pointer to rotate it.
""")

fig = px.choropleth(df_total[df_total['Year']==year], locations="Code",
                    color=col,
                    hover_name="Entity",
                    range_color=(min,max),
                    scope= 'world',
                    projection="orthographic",
                    color_continuous_scale=px.colors.sequential.Reds)
col2.plotly_chart(fig)

---------------------------------------------------------------------------
IllegalArgumentException                  Traceback (most recent call last)
File <command-4483247640063523>, line 13
      7 st.info("""Use the slider to select a year to display
      8            the total emissions for each country. 
      9            Scroll down to see an interactive 3D representation.""")
     11 col1, col2 = st.columns(2)
---> 13 df_total = pd.read_csv('data/co2_total.csv')
     14 col = 'Annual CO₂ emissions'
     15 max = df_total[col].max()

File 
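
The read fails on the relative path `data/co2_total.csv`, whereas an earlier cell read the same file through its absolute `/dbfs/tmp/demo-streamlit/...` path; presumably the relative path does not resolve to the download directory, although the truncated traceback does not confirm this. One way to guard against that class of failure is to build every path from a single base directory and to fail early with a clear message. The sketch below assumes a hypothetical helper, `data_path`, and exercises it against a temporary directory rather than `/dbfs`.

```python
import tempfile
from pathlib import Path

# Directory the shell cell downloads into (taken from this notebook).
BASE_DIR = Path("/dbfs/tmp/demo-streamlit")


def data_path(relative, base_dir=BASE_DIR):
    """Resolve `relative` against base_dir and fail early if the file is missing."""
    path = Path(base_dir) / relative
    if not path.is_file():
        raise FileNotFoundError(f"{path} not found; was it downloaded to {base_dir}?")
    return path


# Demonstration against a throwaway directory instead of /dbfs.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "data").mkdir()
    (Path(tmp) / "data" / "co2_total.csv").write_text("Entity,Code,Year\n", encoding="utf-8")

    found_name = data_path("data/co2_total.csv", base_dir=tmp).name
    try:
        data_path("data/europop.csv", base_dir=tmp)
        missing_error = None
    except FileNotFoundError as exc:
        missing_error = str(exc)

print(found_name)
print(missing_error)
```

In the notebook itself, `pd.read_csv(data_path("data/co2_total.csv"))` would then read from the download directory regardless of the current working directory.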

In [0]:
# add/subtract from the selected countries
    c = st.multiselect('Add a country:', countries, default=['United States', 'China', 'Russia', 'Germany'])
    tab1, tab2 = col2.tabs(["Graph", "Table"])

    with tab1:
        # plot a line graph of emissions for selected countries
        fig = px.line(df_total[df_total['Entity'].isin(c)], x='Year', y='Annual CO₂ emissions', color = 'Entity')
        st.plotly_chart(fig, use_container_width=True)
    with tab2:
        table = df_total[df_total['Year']==st.session_state['year']]
        st.dataframe(table[table['Entity'].isin(c)], use_container_width=True)

  File <command-4483247640063524>, line 2
    c = st.multiselect('Add a country:', countries, default=['United States', 'China', 'Russia', 'Germany'])
    ^
IndentationError: unexpected indent
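
The fragment above looks as if it was lifted from inside an indented block of the original app, which would explain the `IndentationError`. It also relies on names that this notebook never defines, namely `countries` and `st.session_state['year']`. Below is a minimal, Streamlit-free sketch of the data side of that cell. It assumes that `countries` is meant to be the distinct `Entity` values (the source does not show its definition); the rows, their values and the `keep_available` helper are illustrative only.

```python
# Illustrative rows shaped like co2_total.csv (the values are made up).
rows = [
    {"Entity": "China", "Year": 2020, "Annual CO₂ emissions": 3.0},
    {"Entity": "China", "Year": 2021, "Annual CO₂ emissions": 4.0},
    {"Entity": "Germany", "Year": 2021, "Annual CO₂ emissions": 2.0},
    {"Entity": "Kenya", "Year": 2021, "Annual CO₂ emissions": 1.0},
]

# Assumed definition of `countries`: the distinct entities, sorted.
countries = sorted({row["Entity"] for row in rows})


def keep_available(defaults, options):
    """Drop default selections that are not among the available options."""
    return [name for name in defaults if name in options]


selected = keep_available(["United States", "China", "Russia", "Germany"], countries)

# "Graph" tab: every year for the selected countries.
graph_rows = [r for r in rows if r["Entity"] in selected]

# "Table" tab: the selected countries for a single year.
year = 2021
table_rows = [r for r in graph_rows if r["Year"] == year]

print(countries)
print(selected)
print(len(graph_rows), len(table_rows))
```

In the app, `selected` would be the value returned by `st.multiselect`, and `year` would come from the slider.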
