![Callysto.ca Banner](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-top.jpg?raw=true)

## Analysis of a Plant's Environment

In this short notebook, we look at some data collected from a house plant growing in one of our homes.

<img src="images/plant2.jpg" alt="A plant with sensor" width="400"/>
<div align="center">
A basement window with our plant.
</div>

Several sensors made by Phidgets (https://www.phidgets.com/) are placed around the plant. They track the temperature, humidity, soil moisture, and light levels. The readings have been recorded several times a day and stored in an online spreadsheet.

You can see the spreadsheet here: https://ethercalc.net/CallystoPlant01

In this notebook, we download the data and save it in a pandas DataFrame. From there we can plot the data and perform some numerical calculations that give us an idea of the state of the plant's environment.


## Step 1

Let's import the Python libraries we need: pandas for handling the data and Plotly for plotting it. pandas can read the online spreadsheet directly, so the `requests` import is left commented out.

In [1]:
import pandas as pd
from plotly.subplots import make_subplots
import plotly.graph_objects as go
from datetime import datetime
#import requests


## Step 2

Create the URL web address that points to our spreadsheet.

If you make your own spreadsheet of data on EtherCalc, you can change the name here. 

In [2]:
base_url = 'https://ethercalc.net/'  # the web address for EtherCalc spreadsheets
ethercalc_id = 'callystoplant02'  # the name of our spreadsheet in EtherCalc
post_url = base_url+'_/'+ethercalc_id # combine the two
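
If you plan to read from more than one spreadsheet, the address pattern can be wrapped in a small helper. This is only a sketch: `sheet_csv_url` is a hypothetical name of our own, not part of EtherCalc or pandas, and it assumes the spreadsheet is available as CSV at its address plus `.csv`, which is the pattern this notebook uses in Step 3.

```python
BASE_URL = "https://ethercalc.net/"  # the web address for EtherCalc spreadsheets


def sheet_csv_url(sheet_id, base_url=BASE_URL):
    """Return the CSV address for a spreadsheet name (hypothetical helper)."""
    return base_url + sheet_id + ".csv"


# Build the address for the spreadsheet name used in this notebook.
print(sheet_csv_url("callystoplant02"))
```

To point at your own spreadsheet, pass its name in place of `"callystoplant02"`.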

## Step 3

Read in the data as a CSV file, and save it as a dataframe we call "df."

In [7]:
# get data
print('data from', base_url+ethercalc_id)
df = pd.read_csv(base_url+ethercalc_id+'.csv')
df

data from https://ethercalc.net/callystoplant02


Unnamed: 0,Date,Time,Temperature,Humidity,Moisture,Luminance
0,2023-04-30,21:43:42,20.64,50.29,0.199,0.0193
1,2023-04-30,21:43:43,20.64,50.29,0.199,0.0193
2,2023-04-30,21:43:47,20.66,50.26,0.198,0.0440
3,2023-04-30,21:43:47,20.64,50.26,0.198,0.0440
4,2023-04-30,21:43:52,20.66,50.24,0.199,0.0430
...,...,...,...,...,...,...
452,2023-05-16,16:17:32,25.19,51.99,0.400,33.2220
453,2023-05-16,16:26:52,25.19,52.62,0.400,33.3572
454,2023-05-16,16:27:29,25.15,52.57,0.400,33.2102
455,2023-05-16,16:36:43,25.20,52.77,0.401,33.5453


In [10]:
# read another spreadsheet by writing out its full CSV address
df = pd.read_csv('https://ethercalc.net/MyWeirdName23.csv')
df

Unnamed: 0,Date,Time,Temperature,Humidity,Moisture,Luminance
0,2023-04-30,21:43:42,20.64,50.29,0.199,0.0193
1,2023-04-30,21:43:43,20.64,50.29,0.199,0.0193
2,2023-04-30,21:43:47,20.66,50.26,0.198,0.0440
3,2023-04-30,21:43:47,20.64,50.26,0.198,0.0440
4,2023-04-30,21:43:52,20.66,50.24,0.199,0.0430
...,...,...,...,...,...,...
678,2023-05-17,13:01:39,23.46,55.22,0.385,88.1470
679,2023-05-17,13:06:41,23.42,55.46,0.387,95.7617
680,2023-05-17,13:07:02,23.44,55.38,0.385,96.8553
681,2023-05-17,13:11:40,23.47,55.38,0.385,103.3900


In [11]:
# join the date and time text into one column to use on the time axis of our plots
df["DateTime"] = df["Date"] + ' ' + df["Time"]
df

Unnamed: 0,Date,Time,Temperature,Humidity,Moisture,Luminance,DateTime
0,2023-04-30,21:43:42,20.64,50.29,0.199,0.0193,2023-04-30 21:43:42
1,2023-04-30,21:43:43,20.64,50.29,0.199,0.0193,2023-04-30 21:43:43
2,2023-04-30,21:43:47,20.66,50.26,0.198,0.0440,2023-04-30 21:43:47
3,2023-04-30,21:43:47,20.64,50.26,0.198,0.0440,2023-04-30 21:43:47
4,2023-04-30,21:43:52,20.66,50.24,0.199,0.0430,2023-04-30 21:43:52
...,...,...,...,...,...,...,...
678,2023-05-17,13:01:39,23.46,55.22,0.385,88.1470,2023-05-17 13:01:39
679,2023-05-17,13:06:41,23.42,55.46,0.387,95.7617,2023-05-17 13:06:41
680,2023-05-17,13:07:02,23.44,55.38,0.385,96.8553,2023-05-17 13:07:02
681,2023-05-17,13:11:40,23.47,55.38,0.385,103.3900,2023-05-17 13:11:40
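
The `DateTime` column built above is still plain text. Converting it to real timestamps with `pd.to_datetime` makes it possible to pull out parts such as the hour of the day, which helps when asking how the readings depend on the time of day. Here is a minimal sketch, using three rows copied from the table above so that it runs without a network connection. The `sample` and `Hour` names are our own choices.

```python
import pandas as pd

# Three rows copied from the table above (not the full dataset).
sample = pd.DataFrame({
    "Date": ["2023-04-30", "2023-04-30", "2023-05-17"],
    "Time": ["21:43:42", "21:43:47", "13:11:40"],
    "Temperature": [20.64, 20.66, 23.47],
})

# Join the two text columns, then parse the result into real timestamps.
sample["DateTime"] = pd.to_datetime(sample["Date"] + " " + sample["Time"])

# With timestamps we can extract the hour of the day for each reading.
sample["Hour"] = sample["DateTime"].dt.hour
print(sample[["DateTime", "Hour", "Temperature"]])
```

The same two lines should work on the full `df`, assuming every row has a date and time in the format shown in the table.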


In [16]:
fig = make_subplots(rows=2, cols=2,
    subplot_titles=("Temperature", "Humidity", "Moisture", "Luminance")
)

fig.add_trace(
    go.Scatter(x=df['DateTime'], y=df['Temperature'],
                mode='markers', # 'lines' or 'markers'
                name='Temperature'),
    row=1, col=1
)

fig.add_trace(
    go.Scatter(x=df['DateTime'], y=df['Humidity'],
                mode='markers', # 'lines' or 'markers'
                name='Humidity'),
    row=1, col=2
)

fig.add_trace(
    go.Scatter(x=df['DateTime'], y=df['Moisture'],
                mode='markers', # 'lines' or 'markers'
                name='Moisture'),
    row=2, col=1
)

fig.add_trace(
    go.Scatter(x=df['DateTime'], y=df['Luminance'],
                mode='markers', # 'lines' or 'markers'
                name='Luminance'),
    row=2, col=2
)

fig.update_layout(height=800, width=800, title_text="Plant Environment Dataset")
fig.show()

## Looking at the data

Before doing any statistical analysis, what can we discover just by looking at the raw data?

1. What is the typical range of temperature? Humidity? Moisture? Luminance?
2. Is there any connection between the time of day and the various measurements? State in words what the connection is.
3. It looks like temperature goes up when humidity goes down, and vice versa. Can you state this more precisely?
4. Does temperature go up with luminance? Or not?
5. Is this an indoor plant? Can you say anything about the house it is in? Does it have air conditioning for summer? Heating for winter? How about the lights -- is the only light coming from outside, or is there indoor light as well?


## Statistical analysis

Can we turn the observations above into numerical statements that quantify them?
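
One simple starting point is the range of each measurement. The sketch below computes the minimum, maximum, and mean of each column, then adds the range (maximum minus minimum). It uses four rows copied from the tables earlier in this notebook, so the numbers describe only those rows and not the whole dataset. The `sample` and `summary` names are our own.

```python
import pandas as pd

# Four rows copied from the tables above (rows 0, 4, 678 and 681).
sample = pd.DataFrame({
    "Temperature": [20.64, 20.66, 23.46, 23.47],
    "Humidity": [50.29, 50.24, 55.22, 55.38],
    "Moisture": [0.199, 0.199, 0.385, 0.385],
    "Luminance": [0.0193, 0.0430, 88.1470, 103.3900],
})

# One row per statistic, one column per measurement.
summary = sample.agg(["min", "max", "mean"])

# The range is the gap between the largest and smallest reading.
summary.loc["range"] = summary.loc["max"] - summary.loc["min"]
print(summary.round(3))
```

To describe the real data, apply the same calls to the four measurement columns of `df`.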

Can we explore correlations, or other measures that compare the data?
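
As one possible way to explore this, the pandas `corr` method computes the Pearson correlation between every pair of columns. A value near 1 means the two measurements rise together, a value near -1 means one rises as the other falls, and a value near 0 means no linear relationship. The sketch below uses nine rows copied from the tables earlier in this notebook. These rows come from only two moments in time, so the result is an illustration of the method and says little about the full dataset.

```python
import pandas as pd

# Nine rows copied from the tables above (rows 0-4 and 678-681).
sample = pd.DataFrame({
    "Temperature": [20.64, 20.64, 20.66, 20.64, 20.66, 23.46, 23.42, 23.44, 23.47],
    "Humidity": [50.29, 50.29, 50.26, 50.26, 50.24, 55.22, 55.46, 55.38, 55.38],
    "Moisture": [0.199, 0.199, 0.198, 0.198, 0.199, 0.385, 0.387, 0.385, 0.385],
    "Luminance": [0.0193, 0.0193, 0.0440, 0.0440, 0.0430,
                  88.1470, 95.7617, 96.8553, 103.3900],
})

# Pearson correlation between every pair of measurements.
corr = sample.corr()
print(corr.round(2))
```

On the full data, the equivalent call would be `df[["Temperature", "Humidity", "Moisture", "Luminance"]].corr()`. Comparing that table with your answers from the previous section is a good check on what you saw in the plots.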

[![Callysto.ca License](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-bottom.jpg?raw=true)](https://github.com/callysto/curriculum-notebooks/blob/master/LICENSE.md)