In this notebook, we will use plotly.express and pandas to plot, view, and analyze our sensor data. These are the same libraries that the real-time dashboard running on your Pi [here](http://0.0.0.0:8050/) uses to create its real-time plots.

In [None]:
# We use the plotly.express library and alias it to px
import plotly.express as px

## Plot example data provided by Plotly

First, the snippet below is a quick example of how to use plotly, taken from [here](https://plotly.com/python/time-series/). It plots Google stock data over time.

In [None]:
# px.data.stocks() returns a pandas dataframe of stock prices for several companies over time, which plotly
# provides for learning purposes
df = px.data.stocks()
# This tells plotly to create a line graph from our dataframe using the 'date' column as the x-axis and the 'GOOG'
# column as the y-axis
fig = px.line(df, x='date', y='GOOG')
# This displays the figure we created
fig.show()

## Plot example data collected from our sensors

In order to display our own sensor data, we will need to load it as a dataframe first. This means we need to import pandas.

In [None]:
# We alias pandas to pd here. This is very commonly done when importing pandas
import pandas as pd

We will now import a small snippet of sensor data into a dataframe and visualize it using plotly.express, just like before. The only difference is that this time we create our own dataframe from a file instead of using one provided by plotly. If you have never worked with a dataframe before, the relevant documentation is [here](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html). Note that we are using pandas 1.4.3, but for our purposes it is unlikely that there are any significant differences between that version and the latest one.

In [None]:
# Path to small data snippet provided
data_fp = './assets/sensor-data-sample.csv'

# Create a dataframe from our data
df = pd.read_csv(data_fp)
# Create a figure from our dataframe using 'Time' as our x-axis and 'Temperature' as our y-axis
fig = px.line(df, x='Time', y='Temperature')
# Display the figure like before. There will be a couple of oddities in the figure shown. See if you can figure out
# what they are, and what might have caused them
fig.show()

If you've gotten here, you should have successfully loaded the provided data file into a dataframe and viewed its temperature over time in a line graph. There are two unusual things in this graph:

1. Around 10:19 there is a hole in the line
2. A bit after 10:28 there is a flat section lasting about 5 minutes

The causes for these are as follows:

The hole in the graph is caused by missing data. There is no temperature reading in the data for a few of the time points. Plotly handles this by simply not plotting anything for that portion of the graph. This is what will happen if the sensor is not reading (most likely due to being unplugged) for some period of time.

The flat section is caused by a gap in the timestamps of the data. There is a five-minute gap between the last data point before that section and the first data point after it. Plotly handles this by drawing a straight line between the two points, which shows up here as a flat section lasting for that duration. The most likely cause would be the device that was collecting the data realizing its clock was behind and jumping forward to the correct time in between two data collections.

In this case, I created these issues artificially for demonstration purposes, but this is the sort of thing you have to deal with in real-world data all the time!
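
You do not have to spot these problems by eye. The following is a minimal sketch of how both could be found with pandas. The readings, the timestamp format, and the two-minute threshold are all made up for illustration; only the 'Time' and 'Temperature' column names come from the sample file.

```python
import pandas as pd

# Hypothetical readings, one per minute, mimicking the two issues described above
df = pd.DataFrame({
    'Time': ['2022-07-01 10:18:00', '2022-07-01 10:19:00', '2022-07-01 10:20:00',
             '2022-07-01 10:21:00', '2022-07-01 10:26:00'],
    'Temperature': [71.2, None, 71.4, 71.5, 71.5],
})
# Convert the text timestamps into real datetime values so we can do arithmetic on them
df['Time'] = pd.to_datetime(df['Time'])

# Rows where there is no temperature reading (the "hole" in the line)
missing = df[df['Temperature'].isna()]

# Time elapsed since the previous row; a large value means the timestamps jumped
df['Elapsed'] = df['Time'].diff()
# The two-minute threshold is an assumption; pick one that suits your sampling rate
gaps = df[df['Elapsed'] > pd.Timedelta(minutes=2)]

print(missing)
print(gaps)
```

With this made-up data, `missing` holds the 10:19 row and `gaps` holds the 10:26 row, which arrives five minutes after the row before it.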

## Identify and experiment with other data in the dataframe

The dataframe that we loaded contains information other than time and temperature. We can get a preview of the data by displaying its last five rows, which are the most recent entries, as follows:

In [None]:
df.tail()
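
A few other pandas calls are handy for getting to know a dataframe. The sketch below uses a small made-up dataframe as a stand-in for the sensor data, so the values and the exact set of columns are assumptions, not what the real file contains.

```python
import pandas as pd

# Hypothetical stand-in for the sensor dataframe; the real file has more rows and columns
df = pd.DataFrame({
    'Time': ['10:18:00', '10:19:00', '10:20:00'],
    'Temperature': [71.2, None, 71.4],
    'Humidity': [40.1, 40.3, 40.2],
})

# Column names, in order
print(df.columns.tolist())
# Number of non-missing values in each column
print(df.count())
# Summary statistics (count, mean, min, max, ...) for the numeric columns
print(df.describe())
```

Comparing the counts across columns is a quick way to notice that a column has missing readings.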

Now let's do something a bit silly just to see what happens. There is no rule saying we have to plot our data over time, so let's try plotting our temperature over our humidity.

In [None]:
fig = px.line(df, x='Humidity', y='Temperature')
fig.show()

Yikes, that's a bit of a mess! You can't just blindly plot your data and get something useful. You have to do it in a way that actually makes sense. The full documentation for line graphs in plotly can be found [here](https://plotly.com/python-api-reference/generated/plotly.express.line), and if you explore the rest of that site, you can find the documentation for the other kinds of graphs that plotly can create.

## Plot "real" (non-example) data collected from sensors

In [None]:
# Path to your data - fill in the file name
data_fp = './assets/jgc-1-sensor-log.csv'

# Create a dataframe from our data
df = pd.read_csv(data_fp)
# Create a figure from our dataframe using 'Time' as our x-axis and the 'PM1.0', 'PM2.5', and 'PM10' columns
# as our y-axis. Passing a list of columns plots one line per column
fig = px.line(df, x='Time', y=['PM1.0', 'PM2.5', 'PM10'])
# Display the figure like before. Look for oddities like the ones in the example data, and see if you can
# figure out what might have caused them
fig.show()