# Intro to interactive plotting with Plotly

In this notebook we will explore the [Plotly package](https://plotly.com/) for interactive plotting with Python. We will use the same geographical data and time series data as in the notebook on plotting with Bokeh to demonstrate the differences and similarities between the packages. Plotly has a steep learning curve and is more focused on the commercial market than other Python packages, but it is useful to know about its capabilities. Plotly is closely linked to Dash, which can be used to create web-based data apps. Interesting examples of these can be found on [the example apps](https://plotly.com/examples/) page. Dash is worthy of a course on its own, so in this notebook we'll explore only some of the basic plotting capabilities of Plotly.

### Importing the necessary Python packages

Plotly Express is Plotly's high-level API for creating figures in Python, designed for creating common figures quickly. We'll explore some of its options by revisiting the water balance dataset, and then we'll move on to the Switzerland data to showcase Plotly's interactive plotting capabilities. The interactive example at the end requires some classes and functions from Dash, hence the last line of imports in the code cell below.

In [None]:
import pandas as pd
import plotly.express as px
from dash import Dash, html, dcc, Output, Input, callback

One type of plot that isn't easy to create with Matplotlib is a ternary diagram. In Plotly Express it is directly available by calling `scatter_ternary`. Note how the function accepts a DataFrame as the first argument, while the subsequent arguments `a`, `b` and `c` are the names of the columns in the DataFrame that should be used on the axes of the ternary plot. Observe what happens when you hover over the data points with the mouse cursor.

In [None]:
# Read the water balance data from Excel using pandas
df = pd.read_excel('data/water_balance_data.xlsx',
                   index_col=0,
                   parse_dates=True)

pan_factor = 1.2

# Calculate the volumetric rates
df['P'] = df['area'] * df['rain'] / 1000.
df['E'] = df['area'] * df['evaporation'] / (1000. * pan_factor)
df['dV'] = -df['volume'].diff(periods=-1)
df['I'] = df['P'] - df['E'] - df['dV']

# Draw a ternary diagram with the water balance components
fig0 = px.scatter_ternary(df, a="P", b="E", c="I")
fig0.show()
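
The sign convention in the `dV` line can be hard to read at first glance. As a minimal sketch with made-up volumes (not the course data), the snippet below illustrates that `-diff(periods=-1)` gives the volume at the next time step minus the volume at the current one, so the last row has no value.

```python
import pandas as pd

# Hypothetical volumes at three consecutive time steps (illustrative values only)
volume = pd.Series([100.0, 120.0, 110.0])

# diff(periods=-1) computes current minus next; the minus sign flips this
# to next minus current, i.e. the storage change over the coming interval
dV = -volume.diff(periods=-1)

# The same result written with shift, which some readers find easier to follow
dV_shift = volume.shift(-1) - volume

print(dV.tolist())
print(dV.equals(dV_shift))
```

Because the last row of `dV` is missing, the last row of `I` in the water balance is missing as well.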

Another figure type that is easy to create is a stacked bar diagram. By passing the column names to be plotted in a list to the `y` argument of the function, Plotly automatically understands that the bars need to be stacked. Just like in the previous example, the DataFrame is the first argument to be passed to the function. Note that in the legend you can click on the items to switch them on or off.

In [None]:

# Draw a stacked bar diagram of the water balance components
fig1 = px.bar(df, x=df.index, y=["P", "E", "I"], title="Water balance components")
fig1.show()

### Load some example data

Now back to the time series from groundwater monitoring wells in Switzerland, collected from the [Swiss Federal Office for the Environment](https://www.bafu.admin.ch). Remember that the dataset contains 1) the time series in `heads.csv` and 2) the metadata for each well in `metadata.csv`.

In [None]:
# Import time series data
data = pd.read_csv("data/heads.csv", index_col=0, parse_dates=True)
data.head()

In [None]:
# Import metadata
metadata = pd.read_csv("data/metadata.csv", index_col=0)
metadata.head()

### Pandas / Plotly plot

Remember that a DataFrame has a `plot` method. By default, the plot will be created using Matplotlib, but this can be changed to Plotly as well.

In [None]:
pd.options.plotting.backend = "plotly"
data.plot()

### Plot a single time series using Plotly Express

The `line` function from Plotly Express is the equivalent of the Matplotlib `plot` function. As with the previous functions, the DataFrame to be used is the first function argument, and the subsequent arguments determine which data go on which axis.

In [None]:
fig = px.line(data, x=data.index, y="Oberwichtrach")
fig.show()

The result is already quite nice, but using the `update_layout` method, the appearance of the graph can be tweaked.

In [None]:
name = "Oberwichtrach"
fig = px.line(data, x=data.index, y=name)

# Dress up the plot
fig.update_layout(
    title=name,
    xaxis_title="Date",
    yaxis_title="Heads",
)
fig.show()

A very nice feature of Plotly is the ability to save a figure as an HTML file, which can be opened in a browser as an interactive figure. This makes it easy to share figures with others. There is no need to include the underlying data files because the plotted data are embedded in the HTML file. For the same reason, you need to be careful when working with confidential data: anyone who receives the file also receives the data.

In [None]:
fig.write_html(f'{name}.html', auto_open=True)

## Plot geographic data

Now we will plot the geographic data. We will use the `metadata` DataFrame to get the coordinates of each location. In the Bokeh notebook the coordinates had to be transformed to the Mercator projection with PyProj because our data is provided in longitude and latitude. Plotly accepts longitude and latitude directly, so no coordinate transformation is needed here.


### Make a map

We are now ready to make a map. We will use the `scatter_mapbox` function from Plotly Express, which requires the coordinates to be in lat/lon format. The data from the "Prec" column will be used to determine the size and color of the markers. The `hover_name` and `hover_data` arguments determine which data from the `metadata` DataFrame are displayed when the user hovers the mouse cursor over the station locations. By setting the `mapbox_style` to "open-street-map", an OSM background map will be displayed. Note that recent Plotly releases (5.24 and later) deprecate `scatter_mapbox` in favor of `scatter_map`, which uses `map_style` instead of `mapbox_style` in the layout.

In [None]:
figmap = px.scatter_mapbox(
    metadata, 
    lat="lat", 
    lon="lon", 
    size="Prec",
    color="Prec",
    color_continuous_scale="viridis",
    hover_name=metadata.index,
    hover_data=['id', 'Evap', 'Temp', 'Prec'],
    zoom=6, 
    height=600,
)

figmap.update_layout(
    coloraxis_colorbar_title_text='Precipitation [mm/yr]',
    mapbox_style="open-street-map",
)

figmap.show()

### Bonus material: Interactive plot

The final example shows how to combine a map with a graph. Clicking a station location on the map automatically updates the time series displayed in the graph on the right. The code is quite complex but some comments are provided in the code cell to explain the main steps.

In [None]:
# Use Dash to create an interactive app that shows the figures created previously side by side
app = Dash()

app.layout = html.Div(
    [
        html.Div(
            [
                dcc.Graph(figure=figmap, id="metadata-map") # 'figmap' was created earlier, 'id' is its name within the Dash app
            ],
            style={'width': '49%', 'display': 'inline-block'} # some formatting options
        ),
        
        html.Div(
            [
                dcc.Graph(figure=fig, id="data-graph")
            ],
            style={'width': '49%', 'display': 'inline-block'}
        )
    ],
)

# Add a callback function that redraws the graph when a point on the map is clicked
@callback(
    Output('data-graph', 'figure'), # name of the target graph, and the type of output (a figure)
    Input('metadata-map', 'clickData'), # name of the input graph, and the kind of data that will be passed to the update_graph function
)
def update_graph(clickData):
    if clickData is None:
        name = "Oberwichtrach"
    else:
        name = clickData['points'][0]['hovertext'] # This gets the station name
    
    fig = px.line(data, x=data.index, y=name) # Recreate the time series graph

    # Dress up the plot
    fig.update_layout(
        title=name,
        xaxis_title="Date",
        yaxis_title="Heads",
    )

    return fig # Returns the new figure to replace the old one in 'data-graph'

# Start the app; run() replaces run_server(), which was removed in Dash 3
app.run()
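
The station lookup inside the callback depends on the structure of `clickData`, which Dash passes as a dictionary with a list of clicked points. The sketch below isolates that lookup in a hypothetical helper, `station_from_click` (not part of Dash or of this notebook), and feeds it a made-up payload reduced to the keys that matter here, so the logic can be tried without starting the app.

```python
def station_from_click(click_data, default="Oberwichtrach"):
    """Return the station name from a clickData dictionary, or a default."""
    # Before the first click, Dash passes None instead of a dictionary
    if not click_data or not click_data.get("points"):
        return default
    # 'hovertext' holds the value passed as hover_name when the map was created
    return click_data["points"][0].get("hovertext", default)


# Hypothetical payload with made-up coordinates and station name
example = {"points": [{"lat": 46.8, "lon": 7.6, "hovertext": "Some station"}]}

print(station_from_click(example))
print(station_from_click(None))
```

Falling back to a default also covers points that carry no `hovertext` key, which would otherwise raise a `KeyError` inside the callback.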