# DASHboard
quick, cheap, beautiful, healthy workshop

## What is a dashboard?

### Carriage dashboard

![Carriage dashboard](images/carriagedashboard.jpg)
> https://sla.talkbank.org/tours/carriagedashboard.jpg

### Car dashboard

![Car dashboard](images/cardashboard.jpg)
> https://image.cpsimg.com/sites/carparts-mc/assets/classroom/images/dash_gauges.jpg

### Business dashboard

![Business dashboard](images/businessdashboard.png)
> http://www.byteorigin.com/wp-content/uploads/2016/07/206531097.png

## Why build a dashboard?

- Good overview of important data
- No need to visit multiple places
- Quick identification of problems, so you can react in time
- Discovering insights from data
- Ability to share results easily (reports)

## How to build a dashboard?

### Understand needs

- Understand data
    - Ask the right questions
    - KPIs (Key Performance Indicators)
- Audience
    - Strategic
    - Analytical (a lot of interaction)

### Design

- Make it simple
    - Speak the user's language
    - Minimize distractions
    - Narrow down filters and metrics to the ones that will have an impact
    - Make every pixel count: `data_ink_ratio = pixels_presenting_data / total_pixels`
- Choose the right visualizations
    - For example, a 3D pie chart for 2D data is just bad; bars are much easier to compare
- Make data actionable
    - Provide enough context
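
The data-ink ratio from the list above can be turned into a quick sanity check. This is only a minimal sketch: the function name and the example numbers are hypothetical, and it assumes you can estimate the pixel counts yourself.

```python
def data_ink_ratio(pixels_presenting_data, total_pixels):
    """Return the share of pixels that actually present data (0.0 to 1.0)."""
    if total_pixels <= 0:
        raise ValueError("total_pixels must be positive")
    if not 0 <= pixels_presenting_data <= total_pixels:
        raise ValueError("pixels_presenting_data must be between 0 and total_pixels")
    return pixels_presenting_data / total_pixels


# Hypothetical numbers: an 800x600 chart where 120,000 pixels show data.
ratio = data_ink_ratio(120_000, 800 * 600)
print(f"data-ink ratio: {ratio:.2f}")
```

The closer the ratio is to 1, the fewer pixels are spent on decoration.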

### Tools

- Excel-like software
- Custom webapp
- Custom desktop widget
- Many "out of the box" solutions
- ...
- Dash

## More
- https://en.wikipedia.org/wiki/Dashboard
- https://en.wikipedia.org/wiki/Dashboard_(business)
- https://www.youtube.com/watch?v=RtKDSfWFQIA

# Gapminder

The dashboard we are building

## Inspiration

https://www.youtube.com/watch?v=hVimVzgtD6w

## Tool

https://www.gapminder.org/tools/

## Demo of Poor man's Gapminder

..app_f***l.py

# Data

## Pandas

http://pandas.pydata.org/

In [43]:
import pandas as pd

In [44]:
countries = pd.Series(['Argentina', 'Austria', 'Australia'])
countries

0    Argentina
1      Austria
2    Australia
dtype: object

In [45]:
dataset = pd.DataFrame({
    'country': ['Argentina', 'Austria', 'Australia'],
    'population': [43e6, 8e6, 24e6],
})
dataset

     country  population
0  Argentina  43000000.0
1    Austria   8000000.0
2  Australia  24000000.0


In [46]:
dataset.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 2 columns):
country       3 non-null object
population    3 non-null float64
dtypes: float64(1), object(1)
memory usage: 128.0+ bytes


In [47]:
dataset.describe()

       population
count         3.0
mean   25000000.0
std    17521420.0
min     8000000.0
25%    16000000.0
50%    24000000.0
75%    33500000.0
max    43000000.0


In [48]:
dataset.head(2)

     country  population
0  Argentina  43000000.0
1    Austria   8000000.0


In [49]:
dataset['population']

0    43000000.0
1     8000000.0
2    24000000.0
Name: population, dtype: float64

In [50]:
dataset['population'].values

array([43000000.,  8000000., 24000000.])

...however, many things in the Python ecosystem work directly on a Series as if you had passed a list or np.array. The casting happens under the hood.
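
A small sketch of what "work directly on a Series" can look like. The variable names are illustrative, and the values are the populations used above.

```python
import numpy as np
import pandas as pd

population = pd.Series([43e6, 8e6, 24e6], name='population')

# Built-in functions iterate over the Series values as they would over a list.
total = sum(population)
largest = max(population)

# NumPy functions accept a Series as well; element-wise ones return a Series.
roots = np.sqrt(population)
mean = np.mean(population)

# Explicit conversions are still there when a plain container is needed.
as_list = population.tolist()
as_array = np.asarray(population)
```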

In [51]:
dataset.loc[0]

country       Argentina
population      4.3e+07
Name: 0, dtype: object

In [52]:
dataset.loc[0, 'population']

43000000.0

In [53]:
country_indexed_dataset = dataset.set_index('country')
country_indexed_dataset

           population
country
Argentina  43000000.0
Austria     8000000.0
Australia  24000000.0


In [78]:
country_indexed_dataset.index

Index(['Argentina', 'Austria', 'Australia'], dtype='object', name='country')

In [54]:
country_indexed_dataset.loc['Austria']

population    8000000.0
Name: Austria, dtype: float64
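
As a side note, `.loc` selects by label while `.iloc` selects by integer position. A minimal sketch that rebuilds the same small dataset so it can run on its own:

```python
import pandas as pd

country_indexed_dataset = pd.DataFrame({
    'country': ['Argentina', 'Austria', 'Australia'],
    'population': [43e6, 8e6, 24e6],
}).set_index('country')

# .loc takes labels (row label, column label)...
by_label = country_indexed_dataset.loc['Austria', 'population']

# ...while .iloc takes integer positions (second row, first column).
by_position = country_indexed_dataset.iloc[1, 0]

# A list of labels returns a DataFrame with rows in the requested order.
subset = country_indexed_dataset.loc[['Australia', 'Argentina']]
```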

In [55]:
dataset['population'] > 10e6

0     True
1    False
2     True
Name: population, dtype: bool

In [56]:
dataset[dataset['population'] > 10e6]

     country  population
0  Argentina  43000000.0
2  Australia  24000000.0
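
Boolean masks like the one above can be combined. A short sketch on the same small dataset; the thresholds are arbitrary.

```python
import pandas as pd

dataset = pd.DataFrame({
    'country': ['Argentina', 'Austria', 'Australia'],
    'population': [43e6, 8e6, 24e6],
})

# Combine masks with & (and), | (or) and ~ (not). The parentheses are
# required because these operators bind tighter than comparisons.
mid_sized = dataset[
    (dataset['population'] > 10e6) & (dataset['population'] < 30e6)
]

# isin() builds a mask from a list of allowed values.
selected = dataset[dataset['country'].isin(['Austria', 'Australia'])]
```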


In [57]:
dataset['doubled_population'] = dataset.apply(
    lambda row: row['population'] * 2,
    axis=1,
)
dataset

     country  population  doubled_population
0  Argentina  43000000.0          86000000.0
1    Austria   8000000.0          16000000.0
2  Australia  24000000.0          48000000.0
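
For simple arithmetic like this, `apply` with `axis=1` is not needed: operations on a whole column are applied element by element. A sketch of the equivalent vectorized form, which is usually faster on large frames:

```python
import pandas as pd

dataset = pd.DataFrame({
    'country': ['Argentina', 'Austria', 'Australia'],
    'population': [43e6, 8e6, 24e6],
})

# Multiplying a column multiplies every value in it.
dataset['doubled_population'] = dataset['population'] * 2
```

`apply` remains useful when the per-row logic cannot be expressed with column operations.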


### More

https://github.com/Nozdi/first-steps-with-pandas-workshop

## Data for dashboard

In [58]:
from gapminder.data import (
    get_config,
    load_data,
)

In [59]:
data = load_data()
data.keys()

dict_keys(['Population density (per square km)', 'Total population', 'Murder per 100,000, age adjusted', 'Life expectancy', 'Median age', 'CO2 per capita', 'Democracy, based on PolityIV', 'GDP per capita', 'Under five mortality'])

In [60]:
data['Population density (per square km)'].head()

Population density (per square km),1950,1951,1952,1953,1954,1955,1956,1957,1958,1959,...,2001,2002,2003,2004,2005,2006,2007,2008,2009,2010
Abkhazia,,,,,,,,,,,...,,,,,,,,,,
Afghanistan,12.501,12.693,12.893,13.101,13.318,13.545,13.781,14.028,14.285,14.552,...,36.31,37.786,39.379,40.935,42.348,43.584,44.696,45.761,46.892,48.171
Akrotiri and Dhekelia,,,,,,,,,,,...,,,,,,,,,,
Albania,42.264,43.132,44.153,45.309,46.586,47.968,49.444,50.999,52.622,54.3,...,107.047,107.478,108.067,108.698,109.288,109.803,110.257,110.665,111.059,111.461
Algeria,3.675,3.759,3.838,3.916,3.995,4.079,4.167,4.258,4.352,4.444,...,13.008,13.201,13.399,13.602,13.809,14.02,14.236,14.455,14.674,14.892


In [61]:
data['Population density (per square km)'].describe()

,1950,1951,1952,1953,1954,1955,1956,1957,1958,1959,...,2001,2002,2003,2004,2005,2006,2007,2008,2009,2010
count,229.0,229.0,229.0,229.0,229.0,229.0,229.0,229.0,229.0,229.0,...,229.0,229.0,229.0,229.0,229.0,229.0,229.0,229.0,229.0,229.0
mean,191.870852,192.700834,193.983646,195.614559,197.519616,199.657109,201.999598,204.550729,207.339201,210.397424,...,390.340183,393.549485,396.627157,400.037773,404.151087,409.095891,414.743515,420.758908,426.674882,432.134786
std,1054.789037,1050.054405,1047.964733,1048.814506,1052.772409,1059.847212,1069.801828,1082.068743,1096.002577,1110.711895,...,2010.494047,2026.297857,2040.868439,2056.306198,2074.270159,2095.299423,2118.947202,2144.241333,2169.661239,2193.965671
min,0.011,0.011,0.011,0.011,0.012,0.012,0.013,0.013,0.014,0.014,...,0.026,0.026,0.026,0.026,0.026,0.026,0.026,0.026,0.026,0.026
25%,8.175,8.308,8.524,8.735,8.816,8.951,9.161,9.386,9.619,9.861,...,27.203,27.833,28.69,29.568,29.842,29.801,31.222,31.674,31.923,32.172
50%,26.421,26.772,27.133,27.504,27.889,28.286,28.697,29.538,30.791,31.964,...,72.515,72.385,72.12,72.881,74.096,75.791,76.83,78.21,79.989,81.004
75%,84.005,86.224,88.306,89.625,91.069,91.628,93.304,95.408,95.869,96.251,...,178.908,181.253,183.613,187.416,189.855,192.199,194.423,196.563,199.95,200.953
max,13422.819,13334.228,13319.463,13377.852,13504.027,13688.591,13917.45,14170.47,14426.846,14662.416,...,23683.221,23718.792,23706.711,23679.195,23664.43,23669.128,23687.919,23715.436,23742.953,23763.087


In [62]:
data['Population density (per square km)']['2000'].head()

Population density (per square km)
Abkhazia                     NaN
Afghanistan               35.051
Akrotiri and Dhekelia        NaN
Albania                  106.855
Algeria                   12.820
Name: 2000, dtype: float64

### Bonus quiz

On which continent is `Akrotiri and Dhekelia` situated?

In [63]:
(
    data['Population density (per square km)']['2000']
    .dropna()
    .head(10)
)

Population density (per square km)
Afghanistan             35.051
Albania                106.855
Algeria                 12.820
American Samoa         289.573
Andorra                138.107
Angola                  11.171
Anguilla               121.626
Antigua and Barbuda    175.692
Argentina               13.283
Armenia                103.225
Name: 2000, dtype: float64

In [64]:
X_AXIS = "Life expectancy"
Y_AXIS = "GDP per capita"

conf = get_config(data, x_axis=X_AXIS, y_axis=Y_AXIS)

conf

{'countries': ['Afghanistan',
  'Albania',
  'Algeria',
  'Andorra',
  'Angola',
  'Antigua and Barbuda',
  'Argentina',
  'Armenia',
  'Australia',
  'Austria',
  'Azerbaijan',
  'Bahamas',
  'Bahrain',
  'Bangladesh',
  'Barbados',
  'Belarus',
  'Belgium',
  'Belize',
  'Benin',
  'Bhutan',
  'Bolivia',
  'Bosnia and Herzegovina',
  'Botswana',
  'Brazil',
  'Brunei',
  'Bulgaria',
  'Burkina Faso',
  'Burundi',
  'Cambodia',
  'Cameroon',
  'Canada',
  'Cape Verde',
  'Central African Republic',
  'Chad',
  'Chile',
  'China',
  'Colombia',
  'Comoros',
  'Congo, Dem. Rep.',
  'Congo, Rep.',
  'Costa Rica',
  "Cote d'Ivoire",
  'Croatia',
  'Cuba',
  'Cyprus',
  'Czech Republic',
  'Denmark',
  'Djibouti',
  'Dominica',
  'Dominican Republic',
  'Ecuador',
  'Egypt',
  'El Salvador',
  'Equatorial Guinea',
  'Eritrea',
  'Estonia',
  'Ethiopia',
  'Fiji',
  'Finland',
  'France',
  'Gabon',
  'Gambia',
  'Georgia',
  'Germany',
  'Ghana',
  'Greece',
  'Grenada',
  'Guatemala',
  ...

In [65]:
get_config?

In [66]:
conf.keys()

dict_keys(['countries', 'years', 'x_df', 'y_df', 'z_df'])

### Keys Cheatsheet

- *countries* - countries that we are going to handle on the dashboard
- *years* - intersection of the years available in x_df, y_df and z_df
- *x_df* - DataFrame to be used for the x dimension
- *y_df* - DataFrame to be used for the y dimension
- *z_df* - DataFrame to be used for the z dimension. We do not do 3D charts; the z dimension can be, for example, bubble size
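
How `get_config` computes the *years* entry is not shown here. The following is only a sketch of one way to intersect year columns, using hypothetical miniature frames with made-up values:

```python
import pandas as pd

# Hypothetical frames: rows are countries, columns are years stored as strings.
x_df = pd.DataFrame({'1999': [1.0], '2000': [2.0], '2001': [3.0]}, index=['Poland'])
y_df = pd.DataFrame({'2000': [4.0], '2001': [5.0], '2002': [6.0]}, index=['Poland'])
z_df = pd.DataFrame({'2000': [7.0], '2001': [8.0]}, index=['Poland'])

# Index.intersection keeps only the labels present in both indexes.
common = x_df.columns.intersection(y_df.columns).intersection(z_df.columns)
years = sorted(common)
```

Only the years present in all three frames survive, so every year on the dashboard has a value source for each dimension.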

In [67]:
conf['y_df'].head()

GDP per capita,1800,1810,1820,1830,1840,1850,1860,1870,1880,1890,...,2006,2007,2008,2009,2010,2011,2012,2013,2014,2015
Afghanistan,603.0,604.0,604.0,625.0,647.0,669.0,692.0,716.0,741.0,767.0,...,1173.0,1298.0,1311.0,1548.0,1637.0,1695.0,1893.0,1884.0,1877.0,1925.0
Albania,667.0,668.0,669.0,685.0,701.0,717.0,733.0,750.0,870.0,1008.0,...,7476.0,7977.0,8644.0,8994.0,9374.0,9640.0,9811.0,9961.0,10160.0,10620.0
Algeria,716.0,725.0,735.0,819.0,913.0,1017.0,1134.0,1264.0,1409.0,1570.0,...,12088.0,12289.0,12314.0,12285.0,12494.0,12606.0,12779.0,12893.0,13179.0,13434.0
Andorra,1197.0,1219.0,1242.0,1397.0,1573.0,1770.0,1992.0,2242.0,2523.0,2839.0,...,42738.0,43442.0,41426.0,41735.0,38982.0,41958.0,41926.0,43735.0,44929.0,46577.0
Angola,618.0,645.0,674.0,704.0,736.0,769.0,804.0,840.0,878.0,917.0,...,5445.0,6453.0,7103.0,7039.0,7047.0,7094.0,7230.0,7488.0,7546.0,7615.0


## Exercises

Get the total population (from `data`) of all countries in 2000. Then get the population of Poland.

In [68]:
# Your solution here

Get config for:

    x - life expectancy
    y - murder rate per 100,000 people

and access the life expectancy of Japan in 2000 from that config.

In [69]:
# Your solution here

# First graphs

Plotly: https://plot.ly/python/getting-started/

In [70]:
import plotly
import plotly.graph_objs as go

In [71]:
plotly.offline.plot({
    'data': [
        go.Bar(
            x=['Australia', 'Austria'],
            y=[24e6, 8e6]
        )
    ],
    'layout': {}
})

'file:///Users/jacek/Code/DASHboard/temp-plot.html'

In [92]:
# We can do plots inline

plotly.offline.init_notebook_mode(connected=True)

In [73]:
plotly.offline.iplot({
    'data': [
        go.Bar(
            x=['Australia', 'Austria'],
            y=[24e6, 8e6]
        )
    ],
    'layout': {}
})

In [74]:
# Different GraphObjects can be mixed

plotly.offline.iplot({
    'data': [
        go.Bar(
            x=['Australia', 'Austria'],
            y=[24e6, 8e6]
        ),
        go.Scatter(
            x=['Argentina', 'Austria'],
            y=[43e6, 8e6]
        )
    ],
    'layout': {}
})

In [77]:
# How to style each dot differently

plotly.offline.iplot({
    'data': [
        go.Scatter(
            x=['Argentina'],
            y=[43e6],
#             marker={
#                 'color': 'blue',
#                 'size': 100,
#             },
#             name='Argentina'
        ),
        go.Scatter(
            x=['Austria'],
            y=[8e6],
#             marker={
#                 'color': 'red',
#                 'size': 40,
#             },
#             name='Austria'
        )
    ],
    'layout': {
#         'showlegend': False
    }
})

In [87]:
# Layout can be customized

plotly.offline.iplot({
    'data': [
        go.Scatter(
            x=data['Life expectancy']['2000'],
            y=data['Total population']['2000'],
#             mode='markers'
        )
    ],
    'layout': {
#         'title': 'Life expectancy over total population',
#         'xaxis': {
#             'title': 'Life expectancy',
#             'titlefont': {
#                 'size': 30
#             }
#         }
    }
})

In [91]:
# The index can also be used as data

plotly.offline.iplot({
    'data': [
        go.Bar(
            x=data['Total population']['2000'].loc[['China', 'India', 'Poland']].index,
            y=data['Total population']['2000'].loc[['China', 'India', 'Poland']],
        )
    ],
})
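
The z dimension from the keys cheatsheet can be drawn as bubble size. Below is a minimal sketch, not the workshop's implementation: `bubble_trace` is a hypothetical helper that builds a scatter trace as a plain dict, which can be placed in the `'data'` list instead of a `go.Scatter`. The `sizeref` scaling follows the rule suggested in Plotly's bubble chart documentation, and the x and y values are made up.

```python
def bubble_trace(x, y, z, names, max_marker_px=40):
    """Build a scatter trace (as a plain dict) whose marker area encodes z."""
    z = list(z)
    return {
        'type': 'scatter',
        'mode': 'markers',
        'x': list(x),
        'y': list(y),
        'text': list(names),  # shown on hover
        'marker': {
            'size': z,
            'sizemode': 'area',
            # Scale so the largest bubble is about max_marker_px wide.
            'sizeref': 2.0 * max(z) / (max_marker_px ** 2),
            'sizemin': 4,
        },
    }


trace = bubble_trace(
    x=[1.0, 2.0, 3.0],       # made-up x values
    y=[10.0, 20.0, 30.0],    # made-up y values
    z=[43e6, 8e6, 24e6],     # populations used earlier
    names=['Argentina', 'Austria', 'Australia'],
)
figure = {'data': [trace], 'layout': {}}
```

Using `sizemode='area'` keeps the bubble area, rather than the diameter, proportional to the value, which avoids exaggerating large countries.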

## More

- https://plot.ly/python/reference/#scatter
- https://plot.ly/python/reference/#layout

## Exercise

Reuse the config for life expectancy and murder rate from the previous exercise and plot (as a scatter) the last year available in the config.
Play around with the customization.

In [97]:
# Your solution here

# Dash and its components

Go to `theory_1.py`, then `app_1.py`

# Callbacks

Go to `theory_2.py`, then `app_2.py`

# Advanced stuff

Go to `theory_3.py`, then `app_3.py`

# Finish

## Interesting facts
- https://www.random.org/coins/?num=1&cur=60-usd.5000c-gold-buffalo