## CS184a Final Project - Saved Model Example for Demonstration
#### 2023-12-08

****

Notebook Help From:
- https://www.kaggle.com/code/zulqarnainali/explained-critical-point-rnn/notebook
- https://towardsdatascience.com/deep-learning-on-a-combination-of-time-series-and-tabular-data-b8c062ff1907
- https://www.kaggle.com/code/aiaiaidavid/detect-sleep-states-interactive-eda-and-nn-model/notebook
- https://stanford.edu/~shervine/teaching/cs-230/cheatsheet-recurrent-neural-networks
- https://www.kaggle.com/competitions/child-mind-institute-detect-sleep-states/discussion/437264

Imports and Constants

In [9]:
import pandas as pd
import numpy as np
from keras.models import load_model

BASE_PATH = "./data/child-mind-institute-detect-sleep-states/"
SAMPLE_STEP_WINDOW_SIZE = 150
SAMPLE_STEP_WINDOW_SIZE_HALF = SAMPLE_STEP_WINDOW_SIZE // 2
FEATURES = [
    "anglez",
    "enmo",
    "anglez_difference",
    "enmo_difference",
    "anglez_rolling_mean",
    "enmo_rolling_mean",
]

Get Test Data

In [10]:
def load_data(path: str) -> pd.DataFrame:
    data = None
    ext = path.split(".")[-1]
    if ext == "parquet":
        data = pd.read_parquet(path)
    elif ext == "csv":
        data = pd.read_csv(path)
    else:
        raise ValueError(f"{ext} is not a valid data type.")

    return data


test_series = load_data(BASE_PATH + "test_series.parquet")

test_series

| | series_id | step | timestamp | anglez | enmo |
|---|---|---|---|---|---|
| 0 | 038441c925bb | 0 | 2018-08-14T15:30:00-0400 | 2.636700 | 0.0217 |
| 1 | 038441c925bb | 1 | 2018-08-14T15:30:05-0400 | 2.636800 | 0.0215 |
| 2 | 038441c925bb | 2 | 2018-08-14T15:30:10-0400 | 2.637000 | 0.0216 |
| 3 | 038441c925bb | 3 | 2018-08-14T15:30:15-0400 | 2.636800 | 0.0213 |
| 4 | 038441c925bb | 4 | 2018-08-14T15:30:20-0400 | 2.636800 | 0.0215 |
| ... | ... | ... | ... | ... | ... |
| 445 | 0402a003dae9 | 145 | 2018-12-18T12:57:05-0500 | -59.696899 | 0.0601 |
| 446 | 0402a003dae9 | 146 | 2018-12-18T12:57:10-0500 | -35.656601 | 0.0427 |
| 447 | 0402a003dae9 | 147 | 2018-12-18T12:57:15-0500 | -21.582399 | 0.0309 |
| 448 | 0402a003dae9 | 148 | 2018-12-18T12:57:20-0500 | -42.616001 | 0.0328 |


Import Best Model

_See model.ipynb for the full training details (that notebook takes longer than one minute to run)._

In [17]:
# https://stackoverflow.com/questions/51278213/what-is-the-use-of-a-pb-file-in-tensorflow-and-how-does-it-work

model_path = "./Best RNN Model/"
RNN = load_model(model_path)

RNN.summary()

Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 input_1 (InputLayer)        [(None, 150, 6)]          0         
                                                                 
 bidirectional (Bidirectiona  (None, 150, 300)         142200    
 l)                                                              
                                                                 
 dense (Dense)               (None, 150, 1)            301       
                                                                 
Total params: 142,501
Trainable params: 142,501
Non-trainable params: 0
_________________________________________________________________

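The parameter counts in the summary can be checked by hand. The summary does not name the recurrent layer wrapped by `Bidirectional`, so the sketch below rests on an assumption: a GRU with 150 units per direction and Keras's default `reset_after=True` bias layout, which is one configuration that reproduces the reported 142,200 parameters. The helper name `gru_params` is hypothetical.

```python
def gru_params(input_dim: int, units: int) -> int:
    """Parameter count of one GRU layer, assuming reset_after=True (two bias vectors)."""
    kernel = input_dim * units      # input-to-hidden weights, per gate
    recurrent = units * units       # hidden-to-hidden weights, per gate
    biases = 2 * units              # input bias + recurrent bias, per gate
    return 3 * (kernel + recurrent + biases)  # 3 gates: update, reset, candidate


# Assumption: 300 output channels = 2 directions x 150 units, 6 input features.
n_features, units = 6, 150
bidirectional = 2 * gru_params(n_features, units)
dense = 2 * units * 1 + 1           # 300 weights + 1 bias

print(bidirectional, dense, bidirectional + dense)  # 142200 301 142501
```

An LSTM of the same width would give 188,400 parameters for the bidirectional layer, so it does not match this summary.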

Create unique list of test series IDs

In [12]:
test_series_ids_list = list(test_series["series_id"].unique())
print(len(test_series_ids_list), "test series for submission")
print(test_series_ids_list)

3 test series for submission
['038441c925bb', '03d92c9f6f8a', '0402a003dae9']


Create/format test samples DataFrame

In [13]:
test_samples = {}

for id in test_series_ids_list:
    test_samples[id] = test_series[test_series["series_id"] == id].reset_index(
        drop=True
    )

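    # Split the series into consecutive, non-overlapping windows and drop
    # any trailing steps that do not fill a whole window.
    samples_list = list(range(len(test_samples[id]) // SAMPLE_STEP_WINDOW_SIZE))
    samples = {}
    for s in samples_list:
        first_step = SAMPLE_STEP_WINDOW_SIZE * s
        # .loc slicing is label-based and includes the end label.
        samples[s] = test_samples[id].loc[
            first_step : first_step + SAMPLE_STEP_WINDOW_SIZE - 1
        ]
    # Rebuild the trimmed series once per series ID (inside the outer loop).
    test_samples[id] = pd.concat([samples[s] for s in samples_list], axis="rows")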

test_series = pd.concat(
    [test_samples[id] for id in test_series_ids_list], axis="rows"
).reset_index(drop=True)

# test_series.info()

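The windowing step can also be expressed directly on a NumPy array. This is a minimal sketch, not the notebook's code: the helper name `to_windows` and the toy data are hypothetical, and it assumes the rows of a single series are already in step order.

```python
import numpy as np


def to_windows(values: np.ndarray, window: int) -> np.ndarray:
    """Split (n_steps, n_features) into (n_windows, window, n_features).

    Trailing steps that do not fill a whole window are dropped.
    """
    n_windows = len(values) // window
    trimmed = values[: n_windows * window]
    return trimmed.reshape(n_windows, window, values.shape[1])


toy = np.arange(14, dtype=float).reshape(7, 2)  # 7 steps, 2 features
windows = to_windows(toy, window=3)
print(windows.shape)  # (2, 3, 2): the 7th step is dropped
```

With `SAMPLE_STEP_WINDOW_SIZE = 150`, a 150-step test series yields exactly one window.
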
Create new features for the test series:
- ENMO difference from the previous row
- AngleZ difference from the previous row
- ENMO one-minute rolling mean (12 samples at 5-second intervals)
- AngleZ one-minute rolling mean (12 samples at 5-second intervals)

In [14]:
test_series_with_features = {}

for id in test_series_ids_list:
    # Take an explicit copy so the column assignments below do not trigger
    # pandas' SettingWithCopyWarning.
    test_series_with_features[id] = test_series[test_series["series_id"] == id].copy()
    test_series_with_features[id]["anglez_difference"] = (
        test_series_with_features[id]["anglez"].diff().bfill().ffill()
    )
    test_series_with_features[id]["enmo_difference"] = (
        test_series_with_features[id]["enmo"].diff().bfill().ffill()
    )
    test_series_with_features[id]["anglez_rolling_mean"] = (
        test_series_with_features[id]["anglez"].rolling(12).mean().bfill().ffill()
    )
    test_series_with_features[id]["enmo_rolling_mean"] = (
        test_series_with_features[id]["enmo"].rolling(12).mean().bfill().ffill()
    )

test_series = pd.concat(
    [test_series_with_features[id] for id in test_series_ids_list], axis=0
)

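The same four features can be computed without a per-series dictionary by grouping on `series_id`. This is a sketch under assumptions rather than the notebook's method: the helper name `add_features` and the toy frame are hypothetical, and the rolling window is shortened to 2 so the toy output is easy to verify by hand.

```python
import pandas as pd


def add_features(df: pd.DataFrame, rolling_window: int = 12) -> pd.DataFrame:
    """Add per-series difference and rolling-mean columns for anglez and enmo."""
    out = df.copy()  # work on a copy so the caller's frame is untouched
    for col in ("anglez", "enmo"):
        grouped = out.groupby("series_id", sort=False)[col]
        # Difference from the previous row; the first row of each series
        # is backfilled from the second.
        out[f"{col}_difference"] = grouped.transform(
            lambda s: s.diff().bfill().ffill()
        )
        # Rolling mean; the leading incomplete windows are backfilled.
        out[f"{col}_rolling_mean"] = grouped.transform(
            lambda s: s.rolling(rolling_window).mean().bfill().ffill()
        )
    return out


toy = pd.DataFrame(
    {
        "series_id": ["a", "a", "a", "b", "b", "b"],
        "anglez": [1.0, 2.0, 4.0, 10.0, 10.0, 13.0],
        "enmo": [0.1, 0.2, 0.2, 0.5, 0.4, 0.4],
    }
)
features = add_features(toy, rolling_window=2)
print(features["anglez_difference"].tolist())    # [1.0, 1.0, 2.0, 0.0, 0.0, 3.0]
print(features["anglez_rolling_mean"].tolist())  # [1.5, 1.5, 3.0, 10.0, 10.0, 11.5]
```

Grouping keeps differences and rolling means from leaking across series boundaries, which matches the per-series loop used in the cell above.
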
Make predictions on test set

In [15]:
## https://www.kaggle.com/code/aiaiaidavid/detect-sleep-states-interactive-eda-and-nn-model/notebook

X_test = (
    test_series[FEATURES]
    .to_numpy()
    # -1 lets NumPy infer the number of windows, so a series that spans
    # more than one window is still reshaped correctly.
    .reshape(-1, SAMPLE_STEP_WINDOW_SIZE, len(FEATURES))
)

# Make submission predictions (probabilities of asleep)
test_predictions = RNN.predict(X_test)

# Make binary asleep predictions (1=asleep)
test_classes = np.where(test_predictions > 0.5, 1, 0)

# Reshape
test_predictions = test_predictions.reshape(len(test_series), 1)
test_classes = test_classes.reshape(len(test_series), 1)

# Create asleep and event predictions columns
test_series["score"] = test_predictions
test_series["asleep"] = test_classes
test_series["asleep_difference"] = test_series["asleep"].diff().bfill().ffill()
test_series["event"] = test_series[test_series["score"] > 0.4][
    "asleep_difference"
].replace({1: "onset", -1: "wakeup"})


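The step from probabilities to events can be illustrated on a toy score sequence. This is a simplified sketch, not the cell's exact logic: it handles a single series, omits the `score > 0.4` filter and the backfill of the first row, and the helper name `detect_events` is hypothetical.

```python
import numpy as np


def detect_events(scores, threshold=0.5):
    """Return (step, event) pairs where the thresholded sleep state changes."""
    asleep = (np.asarray(scores) > threshold).astype(int)
    change = np.diff(asleep)  # change[i] compares step i + 1 with step i
    events = []
    for step, delta in enumerate(change, start=1):
        if delta == 1:
            events.append((step, "onset"))    # awake -> asleep
        elif delta == -1:
            events.append((step, "wakeup"))   # asleep -> awake
    return events


toy_scores = [0.2, 0.7, 0.8, 0.3, 0.6]
print(detect_events(toy_scores))
# [(1, 'onset'), (3, 'wakeup'), (4, 'onset')]
```

A wakeup is reported at the first step whose score falls to the threshold or below, which is why wakeup rows in a submission can carry scores under 0.5.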

Create submission in Kaggle format

In [16]:
submission = (
    test_series.loc[
        ((test_series["event"] == "onset") | (test_series["event"] == "wakeup"))
    ][["series_id", "step", "event", "score"]]
    .copy()
    .reset_index(drop=True)
    .reset_index(names="row_id")
)

submission

| | row_id | series_id | step | event | score |
|---|---|---|---|---|---|
| 0 | 0 | 038441c925bb | 3 | wakeup | 0.467221 |
| 1 | 1 | 038441c925bb | 19 | onset | 0.677479 |
| 2 | 2 | 038441c925bb | 75 | onset | 0.546805 |
| 3 | 3 | 038441c925bb | 149 | wakeup | 0.487548 |
| 4 | 4 | 03d92c9f6f8a | 3 | onset | 0.545985 |
| 5 | 5 | 03d92c9f6f8a | 4 | wakeup | 0.442696 |
| 6 | 6 | 03d92c9f6f8a | 10 | onset | 0.521569 |
| 7 | 7 | 03d92c9f6f8a | 70 | wakeup | 0.458847 |
| 8 | 8 | 0402a003dae9 | 0 | onset | 0.852695 |
| 9 | 9 | 0402a003dae9 | 85 | onset | 0.552598 |
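
A frame in this layout can be serialized to CSV without the pandas index, so that `row_id` is the first column. The sketch below uses hypothetical toy rows and writes to an in-memory buffer; writing to a file instead, and the file name to use, are assumptions that depend on the competition's requirements.

```python
import io

import pandas as pd

# Hypothetical event rows, in the same column layout as the submission above.
events = pd.DataFrame(
    {
        "series_id": ["s1", "s1"],
        "step": [19, 149],
        "event": ["onset", "wakeup"],
        "score": [0.68, 0.49],
    }
)

# Turn the fresh 0..n-1 index into a row_id column (needs pandas >= 1.5).
submission_example = events.reset_index(drop=True).reset_index(names="row_id")

buffer = io.StringIO()
submission_example.to_csv(buffer, index=False)  # index=False avoids an extra unnamed column
csv_text = buffer.getvalue()
print(csv_text.splitlines()[0])  # row_id,series_id,step,event,score
```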
