# Plotting the data with Plotly

## Outline

### 1. Importing the data
### 2. Working with Plotly
* Individual Scatter Plots
* Yearly Average Scatter Plots

### 3. Using Plotly for All Locations

## 1. Importing the data
First, import the packages, then load the `Final_validation.csv` file. Its contents are used to plot the map and to forecast the data.

In [89]:
import numpy as np
import pandas as pd
import datetime
import plotly.express as px

from statsmodels.tsa.statespace.sarimax import SARIMAX
from json_to_csv import geojson_to_csv
from ts_train_test_split import uni_selection

from forecast_single import forecast
from arima_dataframe import arima_results
from arima_yearly_averages import arima_averages

In [90]:
locations = pd.read_csv('Final_validation.csv')
locations.tail()

Unnamed: 0.1,Unnamed: 0,lat,lon,Place,p,d,q,P,D,Q,filepath,MSE,r2
22,22,46.2199,-119.0837,"Kennewick, WA",7,0,2,1,1,2,NASA/POWER_Point_Monthly_Timeseries_1981_2020_...,0.00181,0.846994
23,23,46.1704,-123.7804,"Navy Heights, OR",8,0,3,2,1,1,NASA/POWER_Point_Monthly_Timeseries_1981_2020_...,0.003364,0.58539
24,24,46.1514,-122.8191,"Kelso, WA",6,0,8,1,1,2,NASA/POWER_Point_Monthly_Timeseries_1981_2020_...,0.003331,0.589424
25,25,46.0562,-118.3476,"Walla Walla, WA",7,0,8,3,0,2,NASA/POWER_Point_Monthly_Timeseries_1981_2020_...,0.001904,0.839012
26,26,45.4969,-122.5938,"Portland, OR",3,0,8,0,1,1,NASA/POWER_Point_Monthly_Timeseries_1981_2020_...,0.003567,0.632208
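The `Unnamed: 0` and `Unnamed: 0.1` columns in the output above typically appear when a CSV is saved with its index and then read back without `index_col`. As a minimal sketch, assuming we only want the named columns, they can be filtered out after loading. The inline `csv_text` below is a hypothetical two-row stand-in for `Final_validation.csv`, built from coordinates that appear in this notebook.

```python
import io

import pandas as pd

# Hypothetical stand-in for Final_validation.csv (two rows only)
csv_text = (
    "Unnamed: 0,lat,lon,Place\n"
    '0,49.0362,-122.3247,"Abbotsford, Canada"\n'
    '1,45.4969,-122.5938,"Portland, OR"\n'
)

raw = pd.read_csv(io.StringIO(csv_text))

# Keep only the columns whose name does not start with "Unnamed"
clean = raw.loc[:, ~raw.columns.str.startswith("Unnamed")]
print(clean)
```

Writing the file with `to_csv(..., index=False)` in the first place avoids the extra columns altogether.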


The map below shows all the locations used in our forecasting model. Hovering over a point gives its coordinates and the nearest town or city.

In [58]:
fig = px.scatter_mapbox(locations, lat="lat", lon="lon", hover_name = "Place", 
                        color_discrete_sequence=["darkviolet"], zoom=5.5, height=400, width = 600)
# styles: "open-street-map" or "carto-positron" are the best options 
fig.update_layout(mapbox_style="carto-positron")
fig.update_layout(margin={"r":0,"t":0,"l":0,"b":0})
fig.show()

#fig.write_html("Map.html")

## 2. Working with Plotly

### Individual Scatter Plots

Plotly expects the data in a particular format. The best way we have found to plot the SARIMAX predictions is to convert them into a new data frame and calculate the year-month-day date associated with each predicted value. Here we use `forecast` from `forecast_single.py` to predict the data and `arima_results` from `arima_dataframe.py` to obtain the year-month-day format. For now, we plot only the first location.

In [93]:
df = forecast(locations, (0))
df

Unnamed: 0,solar
1984-01-01,0.440000
1984-02-01,0.410000
1984-03-01,0.510000
1984-04-01,0.480000
1984-05-01,0.450000
...,...
2035-08-01,0.564628
2035-09-01,0.478397
2035-10-01,0.424538
2035-11-01,0.360520


In [94]:
# ONLY RUN ONCE per location

df_predicted = arima_results(df)
df_predicted['Solar Ratio'] = df_predicted['Solar Ratio'].multiply(1361)
df_predicted = df_predicted.rename(columns={"Solar Ratio": "Solar Irradiance ($W/m^2$)"})
df_predicted

Unnamed: 0,Solar Irradiance ($W/m^2$),Year
1984-01-01,598.840000,1984-01-01
1984-02-01,558.010000,1984-02-01
1984-03-01,694.110000,1984-03-01
1984-04-01,653.280000,1984-04-01
1984-05-01,612.450000,1984-05-01
...,...,...
2035-08-01,768.458907,2035-08-01
2035-09-01,651.097923,2035-09-01
2035-10-01,577.795735,2035-10-01
2035-11-01,490.667639,2035-11-01
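We do not show the internals of `arima_results` here. As a rough sketch of the idea, assuming the forecast is a plain sequence of monthly values starting in January 1984, month-start dates can be generated with `pd.date_range` and stored in a `Year` column for Plotly. The helper `to_plot_frame` is hypothetical and is not the function from `arima_dataframe.py`.

```python
import pandas as pd

SOLAR_CONSTANT = 1361  # scaling factor used in the cell above


def to_plot_frame(values, start="1984-01-01"):
    """Attach month-start dates to a sequence of monthly values (hypothetical helper)."""
    dates = pd.date_range(start=start, periods=len(values), freq="MS")
    frame = pd.DataFrame({"Solar Ratio": list(values)}, index=dates)
    # Plotly Express reads the x-axis from a column, so copy the index into one
    frame["Year"] = frame.index
    return frame


# First five ratios from the output of forecast(locations, 0) shown earlier
demo = to_plot_frame([0.44, 0.41, 0.51, 0.48, 0.45])
demo["Solar Irradiance ($W/m^2$)"] = demo["Solar Ratio"] * SOLAR_CONSTANT
print(demo)
```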


Once the data frame is in the format shown above, it's easy to plot the data with Plotly, as shown below for the first location, Abbotsford, Canada.

In [95]:
fig = px.scatter(df_predicted, x = "Year", y = "Solar Irradiance ($W/m^2$)",  trendline="ols",
                 trendline_scope="overall", title="Abbotsford, Canada 49.0362\N{DEGREE SIGN}N 122.3247\N{DEGREE SIGN}W")
fig.show()

In [96]:
fig = px.line(df_predicted, x = 'Year', y = "Solar Irradiance ($W/m^2$)", 
              title="Abbotsford, Canada 49.0362\N{DEGREE SIGN}N 122.3247\N{DEGREE SIGN}W")
fig.show()
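The plot titles above are typed by hand. As a small sketch, assuming the place name and coordinates come from a row of `locations`, the same title can be built programmatically. The function `format_title` is a hypothetical helper, not part of the project's py files.

```python
DEGREE = "\N{DEGREE SIGN}"


def format_title(place, lat, lon):
    """Build a title such as 'Abbotsford, Canada 49.0362°N 122.3247°W'."""
    ns = "N" if lat >= 0 else "S"  # hemisphere letter replaces the sign
    ew = "E" if lon >= 0 else "W"
    return f"{place} {abs(lat):.4f}{DEGREE}{ns} {abs(lon):.4f}{DEGREE}{ew}"


title = format_title("Abbotsford, Canada", 49.0362, -122.3247)
print(title)
```

The result can be passed directly as the `title` argument of `px.scatter` or `px.line`.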

### Yearly Average Scatter Plots

The function `arima_averages` from `arima_yearly_averages.py` summarizes the SARIMAX results as yearly averages.

In [98]:
df_predicted = df_predicted.rename(columns={"Solar Irradiance ($W/m^2$)": "Solar Ratio"})
yearly_avg = arima_averages(df_predicted)
yearly_avg = yearly_avg.rename(columns={"Solar Ratio": "Solar Irradiance ($W/m^2$)"})

In [99]:
yearly_avg

Unnamed: 0,Solar Irradiance ($W/m^2$),Year
0,639.67,1984
1,698.646667,1985
2,661.219167,1986
3,704.3175,1987
4,653.28,1988
5,670.2925,1989
6,680.5,1990
7,704.3175,1991
8,679.365833,1992
9,687.305,1993
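For readers without access to `arima_yearly_averages.py`, yearly averaging can be sketched with a pandas `groupby` on the calendar year. This is an assumption about what such a function might do, not the actual implementation of `arima_averages`. The helper `yearly_means` and the toy values are hypothetical.

```python
import pandas as pd


def yearly_means(frame, value_col, date_col="Year"):
    """Average value_col per calendar year of date_col (hypothetical helper)."""
    years = pd.to_datetime(frame[date_col]).dt.year
    out = frame.groupby(years)[value_col].mean().reset_index()
    return out[[value_col, date_col]]


# Toy data: 24 months, constant within each year, so the means are easy to check
dates = pd.date_range("1984-01-01", periods=24, freq="MS")
demo = pd.DataFrame({"Solar Ratio": [1.0] * 12 + [3.0] * 12, "Year": dates})
averages = yearly_means(demo, "Solar Ratio")
print(averages)
```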


The plots below show the yearly averages for the first location, Abbotsford, Canada.

In [100]:
fig = px.scatter(yearly_avg, x = "Year", y = "Solar Irradiance ($W/m^2$)", trendline="ols",
                 trendline_scope="overall", title="Abbotsford, Canada 49.0362\N{DEGREE SIGN}N 122.3247\N{DEGREE SIGN}W")
fig.show()
#fig.write_html("file.html") this will save it as an html file

In [102]:
fig = px.line(yearly_avg, x = "Year", y = "Solar Irradiance ($W/m^2$)", 
              title="Abbotsford, Canada 49.0362\N{DEGREE SIGN}N 122.3247\N{DEGREE SIGN}W")
fig.show()

## 3. Using Plotly for All Locations
Here we take the same concepts from before and create data frames for all of the locations with year-month-day values and yearly averages. Using the following for loop, we create the new data frames `all_pred` and `allAverage`.

In [46]:
pred_frames = []
average_frames = []

# Loop over every row of `locations` instead of hard-coding the count
for i in range(len(locations)):
    df = forecast(locations, i)
    df_predicted = arima_results(df)
    df_average = arima_averages(df_predicted)

    # Tag both frames with the location's coordinates and name
    # (latitude goes in 'Lat' and longitude in 'Lon')
    for frame in (df_predicted, df_average):
        frame['Lat'] = locations.loc[i, 'lat']
        frame['Lon'] = locations.loc[i, 'lon']
        frame['Place'] = locations.loc[i, 'Place']

    pred_frames.append(df_predicted)
    average_frames.append(df_average)

# DataFrame.append was removed in pandas 2.0; concatenate once instead
all_pred = pd.concat(pred_frames)
allAverage = pd.concat(average_frames)
 



Non-stationary starting autoregressive parameters found. Using zeros as starting parameters.
Non-invertible starting MA parameters found. Using zeros as starting parameters.



Here we multiply the `Solar Ratio` column by 1361, which appears to be the solar constant in W/m^2, and rename the column to `Solar Irradiance ($W/m^2$)`. This makes the data easier for users to understand.

In [47]:
# ONLY RUN ONCE

all_pred['Solar Ratio'] = all_pred['Solar Ratio'].multiply(1361)
all_pred = all_pred.rename(columns={"Solar Ratio": "Solar Irradiance ($W/m^2$)"})
allAverage['Solar Ratio'] = allAverage['Solar Ratio'].multiply(1361)
allAverage = allAverage.rename(columns={"Solar Ratio": "Solar Irradiance ($W/m^2$)"})
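The "ONLY RUN ONCE" warning exists because the cell changes the frames in place, so a second run no longer finds the `Solar Ratio` column. A minimal sketch of a rerun-safe alternative, assuming the same column names, is to skip the conversion when it has already been applied. The helper `to_irradiance` is hypothetical.

```python
import pandas as pd

SOLAR_CONSTANT = 1361
RATIO_COL = "Solar Ratio"
IRRADIANCE_COL = "Solar Irradiance ($W/m^2$)"


def to_irradiance(frame):
    """Scale and rename the ratio column; return the frame unchanged if already converted."""
    if RATIO_COL not in frame.columns:
        return frame
    out = frame.copy()
    out[RATIO_COL] = out[RATIO_COL] * SOLAR_CONSTANT
    return out.rename(columns={RATIO_COL: IRRADIANCE_COL})


demo = pd.DataFrame({RATIO_COL: [0.44, 0.5]})
once = to_irradiance(demo)
twice = to_irradiance(once)  # second call is a no-op
print(twice)
```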

In [104]:
all_pred

Unnamed: 0,Solar Irradiance ($W/m^2$),Year,Lat,Lon,Place
1984-01-01,598.840000,1984-01-01,49.0362,-122.3247,"Abbotsford, Canada"
1984-02-01,558.010000,1984-02-01,49.0362,-122.3247,"Abbotsford, Canada"
1984-03-01,694.110000,1984-03-01,49.0362,-122.3247,"Abbotsford, Canada"
1984-04-01,653.280000,1984-04-01,49.0362,-122.3247,"Abbotsford, Canada"
1984-05-01,612.450000,1984-05-01,49.0362,-122.3247,"Abbotsford, Canada"
...,...,...,...,...,...
2035-08-01,822.460653,2035-08-01,45.4969,-122.5938,"Portland, OR"
2035-09-01,738.000372,2035-09-01,45.4969,-122.5938,"Portland, OR"
2035-10-01,663.525927,2035-10-01,45.4969,-122.5938,"Portland, OR"
2035-11-01,543.420040,2035-11-01,45.4969,-122.5938,"Portland, OR"


In [105]:
allAverage

Unnamed: 0,Solar Irradiance ($W/m^2$),Year,Lat,Lon,Place
0,639.670000,1984,49.0362,-122.3247,"Abbotsford, Canada"
1,698.646667,1985,49.0362,-122.3247,"Abbotsford, Canada"
2,661.219167,1986,49.0362,-122.3247,"Abbotsford, Canada"
3,704.317500,1987,49.0362,-122.3247,"Abbotsford, Canada"
4,653.280000,1988,49.0362,-122.3247,"Abbotsford, Canada"
...,...,...,...,...,...
47,651.394863,2031,45.4969,-122.5938,"Portland, OR"
48,651.394863,2032,45.4969,-122.5938,"Portland, OR"
49,651.394863,2033,45.4969,-122.5938,"Portland, OR"
50,651.394863,2034,45.4969,-122.5938,"Portland, OR"


For the monthly data, we can make individual line or scatter plots. We can also express the data all together in one graph. 

In [106]:
fig = px.line(all_pred, x = "Year", y = "Solar Irradiance ($W/m^2$)", 
              color = "Place", line_group = "Place", hover_name = "Place", 
              line_shape="spline", render_mode="svg")
fig.show()

Here we can compare the monthly data to the yearly averages. 

In [107]:
fig = px.line(allAverage, x = "Year", y = "Solar Irradiance ($W/m^2$)", 
              color = "Place", line_group = "Place", hover_name = "Place",
              line_shape="spline", render_mode="svg")
fig.show()

With this many locations, it can be hard to tell which line corresponds to which location. Below we've provided an animated scatter plot and an animated bar graph to express the yearly averages more clearly. These graphs can also be exported as HTML files. Their advantage is that they are interactive: you can hit the play button to watch the animation from 1984 to 2035, or stop at any year and analyze the data by hovering over each location.

In [109]:
fig = px.scatter(allAverage, x = "Year", y = "Solar Irradiance ($W/m^2$)", animation_frame = "Year", animation_group = "Place",
                 color = "Place", hover_name = "Place", range_x = [1984,2035], range_y = [550,800])
fig.show()
#fig.write_html("Yearly Averages.html")  #this will save it as an html file

In [110]:
fig = px.bar(allAverage, x = "Place", y = "Solar Irradiance ($W/m^2$)", color="Place",
              animation_frame = "Year", animation_group = "Place", range_y=[550,800])
fig.show()
#fig.write_html("Yearly Averages bar graph.html")
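Both animated plots fix `range_y` by hand because Plotly Express does not rescale the axes between animation frames. As a sketch, assuming we would rather derive the range from the data than hard-code `[550, 800]`, a small helper can compute padded limits. The function `padded_range` and its `pad_fraction` parameter are hypothetical.

```python
def padded_range(values, pad_fraction=0.05):
    """Return [low, high] limits that cover all values with some padding."""
    lo, hi = min(values), max(values)
    pad = (hi - lo) * pad_fraction
    return [lo - pad, hi + pad]


# Example with the limits used above: 10% padding widens the range on both sides
limits = padded_range([550, 800], pad_fraction=0.1)
print(limits)
```

With the real data, the call could look like `range_y=padded_range(allAverage["Solar Irradiance ($W/m^2$)"])`.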

We have not yet been able to get the animated map below to work with Plotly.

In [56]:
fig = px.scatter_mapbox(allAverage, lat = "Lat", lon = "Lon", hover_name = "Place", color = "Place", size = "Solar Irradiance ($W/m^2$)",
                       animation_frame = "Year", animation_group = "Place", zoom = 10, title = "All Averages")
                        #color_discrete_sequence=["darkviolet"], zoom=5.5, height=400, width = 600)
# styles: "open-street-map" or "carto-positron" are the best options 
fig.update_layout(mapbox_style="carto-positron")
fig.update_layout(margin={"r":0,"t":0,"l":0,"b":0})
fig.show()
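When a map like this one stays empty, one thing worth ruling out is coordinates that fall outside their valid ranges, for example because latitude and longitude were swapped. The check below is a minimal sketch with a hypothetical helper, `coords_look_valid`. It assumes the columns are named `Lat` and `Lon` as in `allAverage`.

```python
import pandas as pd


def coords_look_valid(frame, lat_col="Lat", lon_col="Lon"):
    """True if every latitude is in [-90, 90] and every longitude in [-180, 180]."""
    lat_ok = frame[lat_col].between(-90, 90).all()
    lon_ok = frame[lon_col].between(-180, 180).all()
    return bool(lat_ok and lon_ok)


# Coordinates for Abbotsford and Portland, taken from the tables above
good = pd.DataFrame({"Lat": [49.0362, 45.4969], "Lon": [-122.3247, -122.5938]})
# Same values with the two column labels exchanged
swapped = good.rename(columns={"Lat": "Lon", "Lon": "Lat"})

print(coords_look_valid(good), coords_look_valid(swapped))
```

A swap is only detectable this way when a longitude lies beyond ±90, which is the case for all locations in this notebook.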