### Pandas DataFrame to JSON
Here we test and demonstrate how to pass data from the back end to the front end via JSON.

First we create a DataFrame of fake data for three stations. Each station reports three variables drawn from a pool of five, so some variables are missing for some stations. This lets us check how missing values and varying column/variable names are handled. We use these three stations for now and will eventually switch to real data:

| Location   | Variable 1  | Variable 2 | Variable 3   |
|------------|-------------|------------|--------------|
| Roosevelt  | PM2.5       | Ozone      | Temperature  |
| Vernal     | Ozone       | NOx        | PM2.5        |
| Horsepool  | NO          | NOx        | Temperature  |



In [1]:


import os 

import pandas as pd 
import numpy as np 



In [2]:
# Create a dataframe that holds the test data for the three stations of interest.
# The outer keys (stations) become the columns and the inner keys (variables)
# become the row index. Each station reports at most three variables, so any
# variable a station does not report is NaN in that station's column.

# Example numbers 
# NO 3 ppb
# NOx 4 ppb
# Temp 21 C 
# Wind Speed 2 m/s
# PM2.5 4 ug/m3
# Ozone 40 ppb


df_dict = {
    "Roosevelt": {
        "PM2.5": 5.0,
        "Ozone": 46.0,
        "Temperature": 16.5,
    },
    "Vernal": {
        "Ozone": 42.0,
        "NOx": 4.3,
        "PM2.5": 2.5,
    },
    "Horsepool": {
        "NO": 3.0,
        "NOx": 4.0,
        "Temperature": 18.0,
    },
}

df = pd.DataFrame(df_dict)
df

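|             | Roosevelt | Vernal | Horsepool |
|-------------|-----------|--------|-----------|
| PM2.5       | 5.0       | 2.5    | NaN       |
| Ozone       | 46.0      | 42.0   | NaN       |
| Temperature | 16.5      | NaN    | 18.0      |
| NOx         | NaN       | 4.3    | 4.0       |
| NO          | NaN       | NaN    | 3.0       |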


In [3]:
# Convert the dataframe to a JSON string
json_str = df.to_json(orient="index")

# Print the JSON string to preview it
print(json_str)

{"PM2.5":{"Roosevelt":5.0,"Vernal":2.5,"Horsepool":null},"Ozone":{"Roosevelt":46.0,"Vernal":42.0,"Horsepool":null},"Temperature":{"Roosevelt":16.5,"Vernal":null,"Horsepool":18.0},"NOx":{"Roosevelt":null,"Vernal":4.3,"Horsepool":4.0},"NO":{"Roosevelt":null,"Vernal":null,"Horsepool":3.0}}
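The output above is keyed by variable first, because `orient="index"` uses the row labels as the outer keys, and every NaN is written as JSON `null`. As a minimal sketch (not a decision about what the front end should receive), the same frame can be serialized with `orient="columns"` to get station-first keys instead. The names `by_variable` and `by_station` are illustrative only.

```python
import json

import pandas as pd

# Same fake station data as in the cell above
df = pd.DataFrame({
    "Roosevelt": {"PM2.5": 5.0, "Ozone": 46.0, "Temperature": 16.5},
    "Vernal": {"Ozone": 42.0, "NOx": 4.3, "PM2.5": 2.5},
    "Horsepool": {"NO": 3.0, "NOx": 4.0, "Temperature": 18.0},
})

# orient="index": outer keys are row labels (variables)
by_variable = json.loads(df.to_json(orient="index"))

# orient="columns": outer keys are column labels (stations)
by_station = json.loads(df.to_json(orient="columns"))

print(by_variable["Ozone"]["Vernal"])    # variable first, then station
print(by_station["Vernal"]["Ozone"])     # station first, then variable
print(by_station["Horsepool"]["Ozone"])  # missing value: null in JSON, None in Python
```

Either layout carries the same values; the choice only affects how the consumer indexes into the object.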


In [4]:
# Save the JSON string to a file in the test_data directory at the repository root.
# This notebook lives in backend, a sibling of test_data, so go up one level first.
fpath = os.path.join("..", "test_data", "test_liveobs.json")
with open(fpath, "w") as f:
    f.write(json_str)
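Note that `open(..., "w")` raises `FileNotFoundError` if the target directory does not exist yet. A minimal sketch of guarding against that follows. It writes into a temporary directory so it can run anywhere, and the `json_str` value here is a short stand-in for the real string, both of which are assumptions for illustration.

```python
import os
import tempfile

# Stand-in for the JSON string produced earlier (assumption for this sketch)
json_str = '{"Ozone":{"Roosevelt":46.0,"Vernal":42.0,"Horsepool":null}}'

# Use a throwaway base directory instead of ".." so the sketch is self-contained
base_dir = tempfile.mkdtemp()
out_dir = os.path.join(base_dir, "test_data")

# Create the directory if it is missing; exist_ok avoids an error if it is already there
os.makedirs(out_dir, exist_ok=True)

fpath = os.path.join(out_dir, "test_liveobs.json")
with open(fpath, "w") as f:
    f.write(json_str)

# Read the file back to confirm that what was written matches the string
with open(fpath) as f:
    round_trip = f.read()

print(round_trip == json_str)
```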

In [5]:
# Generate some test/fake time series data
np.random.seed(0)
n = 48 * 2
time = pd.date_range(start="2021-01-01", periods=n, freq="30min")
wind_speed = 2.3 * np.sin(2 * np.pi * time.hour / 24) + np.random.normal(0, 0.5, n)

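# time.hour is a pandas Index, so the expression above is not a plain array.
# Convert to a numpy array so the values can be modified in place below.
wind_speed = np.array(wind_speed)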

# Anything less than 0.25 m/s is "stall speed" for the wind sensor
# Anything negative is obviously not possible
# Both require wind speed to be set to 0
wind_speed[wind_speed < 0.25] = 0

# Convert the numpy array to a pandas Series
wind_speed_series = pd.Series(wind_speed, index=time)

# Create a dataframe, with time as index, value of wind speed as value in that row,
# and the column name as "Wind Speed"
df_wind = pd.DataFrame(wind_speed_series, columns=["Wind Speed"])
df_wind

|                     | Wind Speed |
|---------------------|------------|
| 2021-01-01 00:00:00 | 0.882026   |
| 2021-01-01 00:30:00 | 0.000000   |
| 2021-01-01 01:00:00 | 1.084653   |
| 2021-01-01 01:30:00 | 1.715730   |
| 2021-01-01 02:00:00 | 2.083779   |
| ...                 | ...        |
| 2021-01-02 21:30:00 | 0.000000   |
| 2021-01-02 22:00:00 | 0.000000   |
| 2021-01-02 22:30:00 | 0.000000   |
| 2021-01-02 23:00:00 | 0.000000   |


In [6]:
# Write the dataframe straight to a JSON file; the file path is built inline in the same call
df_wind.to_json(os.path.join("..", "test_data", "test_wind_ts.json"), orient="index")
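One thing to watch with a time-indexed frame: by default `to_json` encodes datetimes as epoch values, so the keys in the file are integer-like strings rather than readable timestamps. If the front end would rather parse ISO 8601 strings, `date_format="iso"` is one option. The sketch below uses a four-row stand-in for `df_wind` with made-up values; the exact ISO suffix can vary between pandas versions, so it only prints the keys rather than asserting their full form.

```python
import json

import pandas as pd

# Small stand-in for df_wind (values are made up for this sketch)
time = pd.date_range(start="2021-01-01", periods=4, freq="30min")
df_wind = pd.DataFrame({"Wind Speed": [0.9, 0.0, 1.1, 1.7]}, index=time)

# Default behaviour: datetime index labels are written as epoch values
epoch_keys = list(json.loads(df_wind.to_json(orient="index")))

# date_format="iso": index labels are written as ISO 8601 strings
iso_keys = list(json.loads(df_wind.to_json(orient="index", date_format="iso")))

print(epoch_keys[0])
print(iso_keys[0])
```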

### Clyfar
We will get two products from Clyfar:

#### Deterministic forecast ("crisp") in ppb
This is a single value that Clyfar issues as a forecast, without uncertainty estimates.

#### Possibility distribution for each ozone category
At each forecast time, Clyfar issues a possibility value for each of its four ozone categories (background, moderate, elevated, extreme) and for a fifth "unsure" category. The same five categories also have a necessity value. We may add other estimates of uncertainty in the future.

In [7]:
# Now some Clyfar data: ten columns, one possibility and one necessity value for each of the five categories
# The rows will be daily ozone forecasts for the next 14 days.

# TODO: pause here to finish Clyfar possibility and necessity functions 
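Until the possibility and necessity functions exist, the layout described in the cell above can at least be sketched with empty placeholders. This is only an assumption about the eventual shape: the column naming scheme (`<category>_<measure>`), the start date, and the name `df_clyfar` are hypothetical, and every value is NaN rather than invented forecast numbers.

```python
import json

import numpy as np
import pandas as pd

# Five categories and two measures give ten columns (naming scheme is hypothetical)
categories = ["background", "moderate", "elevated", "extreme", "unsure"]
measures = ["possibility", "necessity"]
columns = [f"{cat}_{measure}" for cat in categories for measure in measures]

# Fourteen daily rows; the start date is arbitrary for this sketch
days = pd.date_range(start="2021-01-01", periods=14, freq="D")

# All values are NaN placeholders until the real Clyfar output is available
df_clyfar = pd.DataFrame(np.nan, index=days, columns=columns)

# NaN serializes to null, the same as the missing station values earlier
json_clyfar = df_clyfar.to_json(orient="index", date_format="iso")
parsed = json.loads(json_clyfar)

print(df_clyfar.shape)
print(len(parsed))
```

Because the placeholders serialize to `null`, the front end can be tested against the schema before any real forecast values are wired in.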

