## Introduction
This tutorial introduces the basic tools and concepts for building interactive web-based dashboards using Dash and Plotly. One of the most important parts of the data science pipeline is visualizing and presenting the insights found in data. As data scientists, it is essential that we know how to build a clear dashboard that lets users interact with the data and draw conclusions. This tutorial covers installing the Dash framework, designing the layout of the web application, basic callbacks, and some core components in Dash, including dropdowns, checklists, and more. It shows you a new way to present your hard work in a clear and aesthetically pleasing manner by building web applications using only Python!

## Tutorial Content
In this tutorial, we will show how to build some basic dashboards using Dash from Plotly: [Dash Plotly](https://plotly.com/dash/).

The data used for demonstration in this tutorial is provided by the U.S. Energy Information Administration ([EIA](https://www.eia.gov/)). The EIA collects, analyzes, and disseminates independent and impartial energy information to promote sound policymaking, efficient markets, and public understanding of energy and its interaction with the economy and the environment. It is a credible source of energy-related data, not only for the U.S. but for the whole world.

We will cover the following topics in this tutorial:
- [Installing the libraries](#Installing-the-libraries)
- [Basic Layout](#Basic-Layout)
- [Basic Components in Dash](#Basic-Components-in-Dash)
- [Callbacks in Dash](#Call-Backs-in-Dash)
- [Geospatial data](#Geospatial-data)
- [Data automatically updates in real time! Could be used with APIs](#Data-autmatically-updates-in-real-time!-Could-be-used-with-APIs)
- [Example application: ISO Load Analysis](#Example-application:-ISO-Load-Analysis)

## Installing the Libraries

Before getting into the fun stuff, we have to install and import the libraries. Installation is straightforward; the code is provided below.

In [None]:
#Installing dash and plotly to your environment
pip install dash
pip install plotly
pip install jupyter-dash
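
If you want to confirm that the installation worked before moving on, a quick check like the one below may help. This is only a sketch: `missing_packages` is a hypothetical helper (not part of Dash), and it assumes the import names `dash`, `plotly`, and `jupyter_dash` for the three packages installed above.

```python
import importlib.util

def missing_packages(names):
    """Return the import names from `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# Import names assumed for the three packages installed above
missing = missing_packages(["dash", "plotly", "jupyter_dash"])
if missing:
    print("Still missing:", ", ".join(missing))
else:
    print("All three libraries are installed.")
```

If anything is reported as missing, rerun the matching install command and restart the notebook kernel.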

The imports below are the basic modules that Dash provides for creating our web application. 'dcc' stands for Dash Core Components; it gives you access to many interactive components, including dropdowns, checklists, and sliders. 'html' acts as a translator from Python to HTML: we compose our layout in Python, and the module automatically converts it into HTML. This is how we build a web-based application without writing HTML code.

The plotly.express module (usually imported as px) contains functions that can create entire figures at once, and is referred to as Plotly Express or PX. 
Plotly Express is a built-in part of the plotly library, and is the recommended starting point for creating most common figures.

We import JupyterDash because of an environment issue that is outside the scope of this tutorial. Just remember that if you are running this code in a Jupyter Notebook, you have to run "your_app_name = JupyterDash(__name__)" instead of "your_app_name = Dash(__name__)". The latter is used when running from other environments such as VSCode or PyCharm. Note that newer Dash releases (2.11 and later, to my knowledge) can run inside notebooks with the plain Dash class, so check which version you have installed.

In [1]:
# Dash Plotly imports
import dash
from dash import dcc
from dash import html
from dash.dependencies import Input, Output
from dash.exceptions import PreventUpdate
import plotly
import plotly.express as px
import plotly.graph_objects as go
#import plotly.graph_objs as go
from jupyter_dash import JupyterDash


#Other Libraries we might use 
import pandas as pd
import random
from collections import deque

## Basic Layout

Here we write HTML without writing HTML: the code below is written in Python, and Dash automatically converts it into HTML.

After creating our application object, app_layout, and the figure we choose to display, we continue with the layout. Here I am simply adding a title and a subtitle that describes the application. More explanation of Dash's HTML components can be found [here](https://dash.plotly.com/dash-html-components). In addition, I find [this page](https://html.com/#Creating_Your_First_HTML_Webpage) the friendliest for learning the basics of HTML.

The line 'app_layout.run_server(mode = "inline")' renders the web application directly in the Jupyter Notebook; it is used in this tutorial solely as a visual aid.
If you wish to serve the application at a temporary URL instead, run the code that is commented out.

In [18]:
#Import data using pandas

df_basic_layout = pd.read_excel('weekly_gas_price.xlsx')

#=============================================================================================================
#Building the layout of our first application

app_layout = JupyterDash(__name__)

app_layout.layout = html.Div(children=[
    html.H1(children='Dash Tutorial'),

    html.Div(children='''
        Example application for the layout.
    '''),

    dcc.Graph(
        id = 'example-graph',
        figure = px.line(df_basic_layout, x="Date", y="Gas Price", color = 'Region', title='Weekly Gas Price')
    )
])

#=============================================================================================================
app_layout.run_server(mode = "inline")
# if __name__ == '__main__':
#     app_layout.run_server(debug=True)

## Basic Components in Dash


In this section, we go over some of the most common components you will use in your application. If you are interested in other components, feel free to visit [this website](https://dash.plotly.com/dash-core-components). We will go through dropdowns, range sliders, checklists, and, most importantly, graphs.

In [22]:

app_basic_components = JupyterDash(__name__)

#=============================================================================================================

app_basic_components.layout = html.Div([
    
    html.H1(children='Building dropdown menus'),
    html.Br(),
    dcc.Dropdown(['Coal', 'Nuclear', 'Solar', 'Wind'], value = ['Coal'], multi=True),
    html.Br(),
    html.Br(),
    
    html.H1(children='Building rangesliders'),
    html.Br(),
    dcc.RangeSlider(min = 1995, max = 2020, marks = {1995:'1995',2000:'2000',2005:'2005',2010:'2010',2015:'2015',2020:'2020'}),
    html.Br(),
    html.Br(),
    
    html.H1(children='Building checklists'),
    html.Br(),
    dcc.Checklist(
        ['Coal', 'Nuclear', 'Solar', 'Wind'],
        ['Coal'],
        inline=True),
    html.Br(),
    html.Br(),
    
    html.H1(children='Building graphs'),
    html.Br(),
    dcc.Graph(
        figure={
            'data': [
                {'x': [1990, 1995, 2000, 2005], 'y': [4, 1, 2, 6], 'type': 'bar', 'name': 'Coal'},
                {'x': [1990, 1995, 2000, 2005], 'y': [2, 4, 6, 8], 'type': 'bar', 'name': 'Nuclear'},
                {'x': [1990, 1995, 2000, 2005], 'y': [7, 1, 3, 5], 'type': 'bar', 'name': 'Solar'},
                {'x': [1990, 1995, 2000, 2005], 'y': [2, 5, 7, 6], 'type': 'bar', 'name': 'Wind'},
                # A line is drawn as a scatter trace in 'lines' mode
                {'x': [1990, 1995, 2000, 2005], 'y': [15/4, 11/4, 18/4, 25/4], 'type': 'scatter', 'mode': 'lines', 'name': 'Mean'},
            ],
            'layout': {
                'title': 'Dash Data Visualization'
            }
        }
    )
    
    
])

#=============================================================================================================

app_basic_components.run_server(mode = "inline")
# if __name__ == '__main__':
#     app_basic_components.run_server(debug=True)
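
The 'Mean' line in the graph above is typed in by hand as 15/4, 11/4, and so on. As a small sketch, the same values could be computed from the four bar series, so that the line stays correct if the numbers change. The variable names here (`generation`, `mean_by_year`) are my own and are not part of Dash.

```python
# Bar values copied from the graph above, one list per source (same year order)
generation = {
    'Coal':    [4, 1, 2, 6],
    'Nuclear': [2, 4, 6, 8],
    'Solar':   [7, 1, 3, 5],
    'Wind':    [2, 5, 7, 6],
}

# zip(*...) groups the values year by year; average each group
mean_by_year = [sum(values) / len(values) for values in zip(*generation.values())]
print(mean_by_year)  # [3.75, 2.75, 4.5, 6.25]
```

The resulting list could then be used as the 'y' values of the 'Mean' trace.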

## Callbacks in Dash (Connecting the components and the graph)
A Dash application uses callback functions to respond to user interaction. These are functions that Dash calls automatically whenever an input component's property changes, in order to update some property of another component (the output).
To build a simple Dash application, we first design the layout. We then use callback functions to let the application respond to user input. To show that the application really does change, the examples display text that updates with the input.

The first two examples below are simple applications using range sliders and dropdowns. Play with the components and you should see the text change accordingly. The last example is more advanced and closer to what we ultimately want to build.

In the callback decorator, we define the output and the inputs. We link them to components through the ids set in the layout. Lastly, we define the function 'update_output', which computes the new value of the output. A more detailed explanation of callbacks can be found [here](https://dash.plotly.com/basic-callbacks).

In [20]:
app_1 = JupyterDash(__name__)

#=============================================================================================================

app_1.layout = html.Div([
    
    html.H1(children='Rangesliders demo'),
    html.Br(),
    dcc.RangeSlider(min = 1995, max = 2020, marks = {1995:'1995',2000:'2000',2005:'2005',2010:'2010',2015:'2015',2020:'2020'}, id='my-range-slider'),
    html.Br(),
    html.Br(),
    html.Div(id='output-container-range-slider')
    
])

#=============================================================================================================

@app_1.callback(
    Output('output-container-range-slider', 'children'),
    [Input('my-range-slider', 'value')])

def update_output(value):
    return 'You have selected year interval "{}"'.format(value)

#=============================================================================================================

app_1.run_server(mode = "inline")
# if __name__ == '__main__':
#     app_1.run_server(debug=True)
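
Because the function under the decorator is ordinary Python, its logic can be tried out without starting the server. The sketch below is a variation on 'update_output', not the code used above: `describe_year_interval` is a hypothetical name, and it assumes the slider passes None until a range has been selected, since the layout sets no initial value.

```python
def describe_year_interval(value):
    """Build the text shown under the range slider from its [start, end] value."""
    # Assumption: no initial value is set on the slider, so None may arrive first
    if value is None:
        return 'No year interval selected yet'
    start, end = value
    return 'You have selected year interval "{} to {}"'.format(start, end)

print(describe_year_interval(None))
print(describe_year_interval([2000, 2010]))
```

To use it in the app, the body of 'update_output' could simply return `describe_year_interval(value)`.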

In [21]:

app_2 = JupyterDash(__name__)

#=============================================================================================================

app_2.layout = html.Div([
    
    html.H1(children='Dropdown demo'),
    html.Br(),
    dcc.Dropdown(['Coal', 'Nuclear', 'Solar', 'Wind'], 'Coal', id='drop-down'),
    html.Br(),
    html.Div(id='output-drop-down')
    
])

#=============================================================================================================

@app_2.callback(
    Output('output-drop-down', 'children'),
    [Input('drop-down', 'value')])
def update_output(value):
    return 'You have selected power generation source "{}"'.format(value)

#=============================================================================================================

app_2.run_server(mode = "inline")
# if __name__ == '__main__':
#     app_2.run_server(debug=True)

### Example of a more advanced callback

In [32]:
# Prepare the data frame: total generation by year, state, and energy source
df = pd.read_excel("annual_generation_state.xls",header = 1)
df_clean = df.groupby(['YEAR','STATE','ENERGY SOURCE'])[['GENERATION (Megawatthours)']].sum()
# Convert MWh to GWh (1 GWh = 1000 MWh)
df_clean['GENERATION (Megawatthours)'] = df_clean['GENERATION (Megawatthours)']*0.001
df_clean = df_clean.rename(columns={"GENERATION (Megawatthours)": "GENERATION (Gegawatthours)"})
df_clean.reset_index(inplace=True)
# Drop the national totals and blank state codes so that only individual states remain
df_clean = df_clean[df_clean['STATE'] != 'US-TOTAL']
df_clean = df_clean[df_clean['STATE'] != 'US-Total']
df_clean = df_clean[df_clean['STATE'] != '  ']
df_clean.head(100)
# df_clean now contains the annual GWh generated in each state by each energy source, from 1990 to 2020

| | YEAR | STATE | ENERGY SOURCE | GENERATION (Gegawatthours) |
|---|---|---|---|---|
| 0 | 1990 | AK | Coal | 1021.146 |
| 1 | 1990 | AK | Hydroelectric Conventional | 1949.042 |
| 2 | 1990 | AK | Natural Gas | 6932.522 |
| 3 | 1990 | AK | Petroleum | 994.232 |
| 4 | 1990 | AK | Total | 11199.012 |
| ... | ... | ... | ... | ... |
| 95 | 1990 | HI | Other Biomass | 1680.010 |
| 96 | 1990 | HI | Other Gases | 32.326 |
| 97 | 1990 | HI | Petroleum | 17471.724 |
| 98 | 1990 | HI | Total | 19405.504 |


In this example, our data is the annual GWh generated in each state by each energy source from 1990 to 2020. It is more advanced than the previous examples because the callback has multiple inputs and multiple outputs instead of one of each. This means that the user has several variables to customize (the multiple inputs), and several parts of the application change in response (the multiple outputs). The layout first gives the user a choice of year, then a choice of energy source. The text then shows the selected choices while the graph updates.

In [33]:

app = JupyterDash(__name__)

#=============================================================================================================

app.layout = html.Div([
    
    html.H1("Annual GWh produced by state by different energy source from 1990 to 2020"),
    dcc.Dropdown(id = 'select_year',
                options = [
                    {'label': "1990",'value':1990},{'label': "1991",'value':1991},{'label': "1992",'value':1992},{'label': "1993",'value':1993},
                    {'label': "1994",'value':1994},{'label': "1995",'value':1995},{'label': "1996",'value':1996},{'label': "1997",'value':1997},
                    {'label': "1998",'value':1998},{'label': "1999",'value':1999},{'label': "2000",'value':2000},{'label': "2001",'value':2001},
                    {'label': "2002",'value':2002},{'label': "2003",'value':2003},{'label': "2004",'value':2004},{'label': "2005",'value':2005},
                    {'label': "2006",'value':2006},{'label': "2007",'value':2007},{'label': "2008",'value':2008},{'label': "2009",'value':2009},
                    {'label': "2010",'value':2010},{'label': "2011",'value':2011},{'label': "2012",'value':2012},{'label': "2013",'value':2013},
                    {'label': "2014",'value':2014},{'label': "2015",'value':2015},{'label': "2016",'value':2016},{'label': "2017",'value':2017},
                    {'label': "2018",'value':2018},{'label': "2019",'value':2019},{'label': "2020",'value':2020}],
                 multi = False,
                 value = 1990,
                 style = {'width' : "40%"}
                ),
    html.Br(),
    dcc.Dropdown(id = 'select_source',
                options = [
                    {'label': 'Coal','value':'Coal'},{'label': 'Hydroelectric Conventional','value':'Hydroelectric Conventional'},
                    {'label': 'Natural Gas','value':'Natural Gas'},{'label': 'Petroleum','value':'Petroleum'},{'label': 'Total','value':'Total'},
                    {'label': 'Wind','value':'Wind'},{'label': 'Wood and Wood Derived Fuels','value':'Wood and Wood Derived Fuels'},
                    {'label': 'Nuclear','value':'Nuclear'},{'label': 'Other Biomass','value':'Other Biomass'},{'label': 'Other Gases','value':'Other Gases'},
                    {'label': 'Pumped Storage','value':'Pumped Storage'},{'label': 'Geothermal','value':'Geothermal'},
                    {'label': 'Solar Thermal and Photovoltaic','value':'Solar Thermal and Photovoltaic'},
                    {'label': 'Other','value':'Other'}],
                 multi = False,
                 value = 'Total',
                 style = {'width' : "40%"}
                ), 
    html.Br(),
    html.Div(id = "output_year",children = []),
    html.Br(),
    html.Div(id = "output_source",children = []),
    html.Br(),
    
    dcc.Graph(id = 'map',figure = {})
])

#=============================================================================================================

@app.callback(
    [Output(component_id = 'output_year',component_property='children'),
     Output(component_id = 'output_source',component_property='children'),
     Output(component_id = 'map',component_property='figure')],
    [Input(component_id='select_year',component_property='value'),
     Input(component_id='select_source',component_property='value')
    ]
)

def update_graph(selected_year,selected_source):
    
    container_year = "The year chosen is: {}".format(selected_year)
    container_source = "The source chosen is: {}".format(selected_source)
    
    dff = df_clean.copy()
    dff = dff[dff["YEAR"] == selected_year]
    dff = dff[dff['ENERGY SOURCE'] == selected_source]
    
    fig = px.bar(
        data_frame = dff,
        x = 'STATE',
        y = 'GENERATION (Gegawatthours)',
        hover_data = ['YEAR','STATE','GENERATION (Gegawatthours)'],
    )
    
    return container_year, container_source, fig

#=============================================================================================================

app.run_server(mode = "inline")
# if __name__ == '__main__':
#     app.run_server(debug=True)
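
The dropdown options above are written out one dictionary at a time. As a sketch, the same lists could be generated with a list comprehension. `make_options` is a hypothetical helper, and the list of energy sources is shortened here for brevity.

```python
def make_options(values):
    """Turn a sequence of values into the label/value dictionaries used by dcc.Dropdown."""
    return [{'label': str(v), 'value': v} for v in values]

# Years 1990 through 2020 inclusive (range stops before 2021)
year_options = make_options(range(1990, 2021))

# A shortened list of energy sources; the full list is in the layout above
source_options = make_options(['Coal', 'Natural Gas', 'Nuclear', 'Total'])

print(len(year_options))   # 31
print(year_options[0])     # {'label': '1990', 'value': 1990}
```

These lists could then be passed as `options = year_options` and `options = source_options` in the two dropdowns.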

## Geospatial Data

Geospatial data is a key part of data science. Plotly has functions dedicated to plotting geospatial data; in other words, we can build geospatial applications using Dash and Plotly. In this part of the tutorial, we use a Plotly choropleth map as the graph component to visualize the renewable share of energy generation in different U.S. states over the years.

In [24]:
df_renewable = pd.read_excel('renewable_percentage.xlsx')
df_renewable.head(100)

| | State | Year | Renewable Percentage | type |
|---|---|---|---|---|
| 0 | AL | 2015 | 2.16 | renewables |
| 1 | AL | 2016 | 2.30 | renewables |
| 2 | AL | 2017 | 2.57 | renewables |
| 3 | AK | 2015 | 3.42 | renewables |
| 4 | AK | 2016 | 3.97 | renewables |
| ... | ... | ... | ... | ... |
| 95 | NY | 2017 | 5.00 | renewables |
| 96 | NC | 2015 | 3.09 | renewables |
| 97 | NC | 2016 | 4.78 | renewables |
| 98 | NC | 2017 | 6.60 | renewables |


In [25]:

app_gio = JupyterDash(__name__)

#=============================================================================================================

app_gio.layout = html.Div([

    html.H1("Geospatial data", style={'text-align': 'center'}),

    dcc.Dropdown(id="slct_year",
                 options=[
                     {"label": "2015", "value": 2015},
                     {"label": "2016", "value": 2016},
                     {"label": "2017", "value": 2017}],
                 multi=False,
                 value=2015,
                 style={'width': "40%"}
                 ),

    html.Div(id='output_container'),
    
    html.Br(),

    dcc.Graph(id='map')

])

#=============================================================================================================

@app_gio.callback(
    [Output(component_id='output_container', component_property='children'),
     Output(component_id='map', component_property='figure')],
    [Input(component_id='slct_year', component_property='value')]
)

def update_graph(option_slctd):

    container = "The year chosen: {}".format(option_slctd)

    dff = df_renewable.copy()
    dff = dff[dff["Year"] == option_slctd]
   
    # Plotly Express
    fig = px.choropleth(
        data_frame=dff,
        locationmode='USA-states',
        locations='State',
        scope="usa",
        color='Renewable Percentage',

    )
    return container, fig

#=============================================================================================================

app_gio.run_server(mode = "inline")
# if __name__ == '__main__':
#     app_gio.run_server(debug=True)

## Data automatically updates in real time! Could be used with APIs
In many cases, data scientists find themselves dealing with real-time data. For example, to predict electricity prices in the wholesale markets for traders, we forecast supply and demand. To forecast them, we have to include data such as weather in our model, because renewables such as solar and wind depend heavily on the weather. Another example is crude oil trading: quantitative analysts must pay attention to real-time prices to keep their calculations accurate and up to date, so that traders can make their moves in a split second. Below we show how Dash can build dashboards that continually update themselves using the interval component.

Side note: since using APIs is outside the scope of this tutorial, the example uses preprocessed data.

In [11]:

df = pd.read_excel('weekly_gas_price.xlsx')
df_closing_price = pd.read_excel('closing_price.xlsx')

close_price = df_closing_price['price'].to_list()
year = df_closing_price['year'].to_list()

X = deque(maxlen=33)
X.append(year[0])
Y = deque(maxlen=33)
Y.append(close_price[0])

#=============================================================================================================

app_real_time = JupyterDash(__name__)
app_real_time.layout = html.Div(
    [
    html.H1(children='Data update with one second interval'),
    html.Div(children='''
        Annual Average Closing Price of Crude oil (USD$)
    '''),

        dcc.Graph(id='live-graph', animate=True),
        dcc.Interval(
            id='graph-update',
            interval=1000,
            n_intervals = 0,
            max_intervals=32,
        ),
    ]
)

#=============================================================================================================

@app_real_time.callback(Output('live-graph', 'figure'),
        [Input('graph-update', 'n_intervals')])

def update_graph_scatter(n):

    # n == 0 is the initial call, and that point is already seeded above;
    # the upper bound keeps the index inside the data
    if 0 < n < len(close_price):
        X.append(year[0]+n)
        Y.append(close_price[n])

    data = go.Scatter(
            x=list(X),
            y=list(Y),
            name='Scatter',
            mode= 'lines+markers'
            )

    return {'data': [data],'layout' : go.Layout(xaxis=dict(range=[min(X),max(X)+1]),
                                                yaxis=dict(range=[min(Y)-10,max(Y)+10]),)}

#=============================================================================================================

app_real_time.run_server(mode = "inline")
# if __name__ == '__main__':
#     app_real_time.run_server(debug=True)
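
The two `deque(maxlen=33)` buffers in the cell above are what keep the live graph bounded: once a deque is full, appending a new point silently drops the oldest one. The sketch below shows that behaviour with a smaller window and made-up prices (the numbers are illustrative only, not taken from the dataset), together with the padded axis range used in the callback.

```python
from collections import deque

# A window of 3 points instead of 33, to make the behaviour easy to see
window = deque(maxlen=3)

# Made-up prices, for illustration only
for price in [20.0, 25.0, 30.0, 35.0, 40.0]:
    window.append(price)   # once full, the oldest value is discarded

print(list(window))        # [30.0, 35.0, 40.0]

# Same padding as the callback: 10 units below the minimum and above the maximum
y_range = [min(window) - 10, max(window) + 10]
print(y_range)             # [20.0, 50.0]
```

With `maxlen=33` and `max_intervals=32`, the buffers in the application are large enough to hold every point, so nothing is dropped there; a smaller `maxlen` would turn the graph into a sliding window.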

## Example application
To end this tutorial, we build a Dash application that can be used to perform simple data analysis. The data we visualize here was collected from Independent System Operators (ISOs). An ISO coordinates, controls, and monitors the operation of the electrical power system, usually within a single U.S. state but sometimes across multiple states. ISOs forecast electricity demand and supply in real time and make sure that they are balanced. The data is a time series of load records from three ISOs that operate on the East Coast. We will build a web application from the dataset and make three observations about the plotted data.

In [26]:
df_iso = pd.read_csv('ISO_historical_peaks.csv')

In [31]:

app_example = JupyterDash(__name__)

#=============================================================================================================
app_example.layout = html.Div(children=[
    html.Div([
        html.H1(children='ISO ELECTRICITY LOAD ANALYSIS'),
        html.Div(children='''
            ISO ELECTRICITY LOAD ANALYSIS
                '''), 
        html.Br(),
        dcc.Dropdown(id = 'select_iso',
                options = [
                    {'label': 'NYISO','value':'NYISO'},
                    {'label': 'PJM','value':'PJM'},
                    {'label': 'ISONE','value':'ISONE'}],
                 multi = False,
                 value = 'NYISO',
                 style = {'width' : "40%"}
                ),
        dcc.Graph(id = 'time_series_graph',figure = {}),
    ]),

    html.Div([
        html.Div([
            dcc.Graph(id = 'plot_scatter',figure = {}, style={'display': 'inline-block'}),
            dcc.Graph(id = 'hist_plot',figure = {}, style={'display': 'inline-block'}),
            ]),
    ]),
])



#=============================================================================================================

@app_example.callback(
    [Output(component_id = 'time_series_graph',component_property = 'figure' ),
    Output(component_id = 'plot_scatter',component_property = 'figure' ),
     Output(component_id = 'hist_plot',component_property = 'figure' )],
    Input(component_id = 'select_iso',component_property = 'value')
)

def update_graph(selected_iso):
    
    df_iso_copy = df_iso.copy()
    filter_df_iso = df_iso_copy[df_iso_copy['ISO'] == selected_iso]
    
    time_series_fig = px.line(filter_df_iso,
                             x='timestamp', y ='load_MW',
                             title = f'Time Series Load in {selected_iso}')

    scatter_fig = px.scatter(filter_df_iso,
        x="load_MW", y="temperature",
        color='season',
        title = f'Peak Load vs. Temperature in {selected_iso}')
    
    
    hist_fig = px.histogram(filter_df_iso,
        x="load_MW",nbins=75,
        title = f'Load distribution in {selected_iso}')
     

    return time_series_fig,scatter_fig,hist_fig

#=============================================================================================================

app_example.run_server(mode = "inline")
#if __name__ == '__main__':
    #app_example.run_server(debug=True)

Here is what we can conclude by observing the dashboard we created:
1. PJM has a few spikes in the time series graph. Looking at the Peak Load vs. Temperature graph, we see that the spikes are distributed across all four seasons, which suggests that they are likely outliers.
2. In all three ISOs, demand is high during summer and winter and lower during fall and spring, which suggests that temperature is a strong driver of demand.
3. From the histogram, we see that NYISO and ISONE have similar load distributions, with larger variance than PJM.

## Summary
This tutorial only scratches the surface of Dash and Plotly. There are resources online that go in depth into the different functions they offer. One of the most important next steps is deploying your finished application to a permanent URL. However, it is difficult to go into the details of that in a Jupyter Notebook. Therefore, if you hope to expand your portfolio by building interactive dashboards, I suggest you visit [this website](https://towardsdatascience.com/deploying-your-dash-app-to-heroku-the-magical-guide-39bd6a0c586c) for more details.