## **Turi Activity Classifier**

Welcome to the activity classification model quickstart on Skafos! The purpose of this notebook is to get you going end-to-end. Below we will do the following:

1. Load activity training data generated from Apple watchOS devices.
2. Build an activity classification model.
3. Convert the model to CoreML format and save it to the Skafos framework.

The example is based on [Turi Create's Activity Classifier](https://apple.github.io/turicreate/docs/userguide/activity_classifier/) and data [from this blog post]().

---

Execute each cell one-by-one by selecting the cell and doing one of the following:

-  Clicking the "play" button at the top of this frame.
-  Pressing 'Control + Enter' or 'Shift + Enter'.

In [2]:
# If this is your first time in the JupyterLab workspace - install external dependencies
from utilities.dependencies import install
install(timeout=500)

# No need to do this in the future for this notebook

In [3]:
# Import necessary libraries
from skafossdk import Skafos
import turicreate as tc

import utilities.save_models as sm

In [None]:
ska = Skafos() # initialize Skafos

### 1. **Load the data**
The data for this example comes from watchOS devices. Motion-sensor readings from the device's accelerometer and gyroscope were collected during a live experiment in which three activities took place: walking, sitting, and standing. Read more about the experiment and the data cleaning process [in this blog post](). You can also find the raw data in the `...skafos.example.data/ActivityClassifier/raw/..` bucket on S3.

Once loaded, the data is split into train and test sets by session, so that every session stays whole: roughly 80% of the sessions are used for training, and the remaining 20% are held out for model evaluation.

In [4]:
# Set the column names for the dataframe
activity_data_columns = {
    "motionRotationRateX(rad/s)": "rotation_x",
    "motionRotationRateY(rad/s)": "rotation_y",
    "motionRotationRateZ(rad/s)": "rotation_z",
    "motionUserAccelerationX(G)": "acceleration_x",
    "motionUserAccelerationY(G)": "acceleration_y",
    "motionUserAccelerationZ(G)": "acceleration_z",
    "sessionId": "session_id",
    "activity": "activity"
}

s3_url = "https://s3.amazonaws.com/skafos.example.data/ActivityClassifier/cleaned/watch_activity_data.csv"
data = tc.SFrame.read_csv(s3_url)[list(activity_data_columns.keys())].rename(activity_data_columns)

------------------------------------------------------
Inferred types from first 100 line(s) of file as 
column_type_hints=[int,float,float,float,float,float,float,str,int,str]
If parsing fails due to incorrect types, you can correct
the inferred type list above and pass it to read_csv in
the column_type_hints argument
------------------------------------------------------


In [5]:
# Make a train-test split such that each data slice contains whole session chunks
train_data, test_data = tc.activity_classifier.util.random_split_by_session(data, session_id='session_id', fraction=0.8)
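
Splitting by session rather than by row keeps consecutive readings from one recording together, so no session appears in both slices. The idea can be sketched in plain Python as below. This is an illustration only, not Turi Create's implementation; the `split_by_session` helper and the toy rows are hypothetical.

```python
import random

def split_by_session(rows, session_key="session_id", fraction=0.8, seed=0):
    """Split rows so that every session lands entirely in one slice."""
    # Collect the distinct session ids and shuffle them reproducibly
    session_ids = sorted({row[session_key] for row in rows})
    rng = random.Random(seed)
    rng.shuffle(session_ids)

    # Assign the first `fraction` of the shuffled sessions to training
    n_train = round(len(session_ids) * fraction)
    train_ids = set(session_ids[:n_train])

    train = [row for row in rows if row[session_key] in train_ids]
    test = [row for row in rows if row[session_key] not in train_ids]
    return train, test

# Toy data (hypothetical): 10 sessions with 5 readings each
rows = [{"session_id": s, "reading": i} for s in range(10) for i in range(5)]
train, test = split_by_session(rows)

train_sessions = {row["session_id"] for row in train}
test_sessions = {row["session_id"] for row in test}
print(len(train), len(test), train_sessions & test_sessions)
```

Because whole sessions are assigned, the row-level split is only approximately 80/20 when sessions differ in length.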

In [6]:
# Take a look at the data
train_data.head(5)

| rotation_x | rotation_y | rotation_z | acceleration_x | acceleration_y | acceleration_z | session_id | activity |
|---|---|---|---|---|---|---|---|
| -0.0634413883090019 | -0.0328197255730629 | -0.0423141345381736 | 0.0072054117918014 | -0.0306147933006286 | 0.0028187632560729 | 0 | sitting |
| -0.0290454775094986 | 0.0024865106679499 | 0.0045987796038389 | -0.0041680335998535 | -0.0206325054168701 | 0.0141564011573791 | 0 | sitting |
| -0.0002681682817637 | -0.0071450336836278 | 0.0069057811051607 | 0.0010231807827949 | -0.0091954469680786 | -0.0003412961959838 | 0 | sitting |
| -0.0045244493521749 | 0.0046083838678896 | 0.0068134455941617 | -0.0035822018980979 | -0.0020730793476104 | 0.007745623588562 | 0 | sitting |
| 0.0156796798110008 | -0.0007592616602778 | 0.0005138712003827 | 0.0052630454301834 | 0.0028619170188903 | -0.0005234479904174 | 0 | sitting |


### 2. **Build the model**
We use the `tc.activity_classifier.create` function and specify the data, target variable (that has our activity label), session_id, and a few other arguments needed to properly train the model. To understand more about this specific function, check out the [Turi Create Documentation](https://apple.github.io/turicreate/docs/userguide/activity_classifier/).

In [None]:
# Create an activity classifier
model = tc.activity_classifier.create(
    dataset=train_data,
    session_id='session_id',
    target='activity',
    prediction_window=30,
    max_iterations=30,
    batch_size=64
)

---
**Prediction Window** = *number_seconds_between_predictions* x *sample_frequency*

-  *number_seconds_between_predictions* is how many seconds you want between each prediction. Varies by use-case.
-  *sample_frequency* is how many sensor readings are taken each second (by the application).

For this example, the motion-sensor data from the [watch data collection experiment]() was collected at 10 Hz (10 readings per second). If we want our model to give an activity prediction every 3 seconds, we should set the prediction window to 30 (3 x 10).
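
The arithmetic above can be wrapped in a small helper to avoid hard-coding the window when the sampling rate or prediction interval changes. This is a minimal sketch; the `prediction_window` function below is a hypothetical helper, not part of Turi Create.

```python
def prediction_window(seconds_between_predictions, sample_frequency_hz):
    """Return the number of sensor readings that make up one prediction."""
    window = seconds_between_predictions * sample_frequency_hz
    # The window must be a positive whole number of samples
    if window < 1 or window != int(window):
        raise ValueError("prediction window must be a positive whole number of samples")
    return int(window)

# One prediction every 3 seconds from data sampled at 10 Hz
window = prediction_window(3, 10)
print(window)  # 30
```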

In [None]:
# Find 20 seconds of moving data constrained to a single experiment where moving took place
exp_id_with_moving = test_data[test_data['activity'] == 'moving']['session_id'].value_counts()[0]['value']
moving_20_seconds = test_data[(test_data['activity'] == 'moving') & (test_data['session_id'] == exp_id_with_moving)][:200]

# Find 20 seconds of sitting data constrained to a single experiment where sitting took place
exp_id_with_sitting = test_data[test_data['activity'] == 'sitting']['session_id'].value_counts()[0]['value']
sitting_20_seconds = test_data[(test_data['activity'] == 'sitting') & (test_data['session_id'] == exp_id_with_sitting)][:200]

# Find 20 seconds of standing data constrained to a single experiment where standing took place
exp_id_with_standing = test_data[test_data['activity'] == 'standing']['session_id'].value_counts()[0]['value']
standing_20_seconds = test_data[(test_data['activity'] == 'standing') & (test_data['session_id'] == exp_id_with_standing)][:200]

In [None]:
# Check whether the model classifies each 3-second window as moving (walking) or something else
model.predict(moving_20_seconds, output_frequency='per_window')

In [None]:
# Check whether the model classifies each 3-second window as sitting or something else
model.predict(sitting_20_seconds, output_frequency='per_window')

In [None]:
# Check whether the model classifies each 3-second window as standing or something else
model.predict(standing_20_seconds, output_frequency='per_window')
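
Scanning per-window output by eye gets tedious, so it can help to summarize it. The sketch below counts the labels predicted across windows and reports the share that match the expected activity. It works on a plain list of labels; how you extract that list from the prediction output is left to you, and both the `window_summary` helper and the sample labels are hypothetical.

```python
from collections import Counter

def window_summary(predicted_labels, expected_label):
    """Count predicted labels and compute the share matching the expected one."""
    counts = Counter(predicted_labels)
    share = counts[expected_label] / len(predicted_labels)
    return counts, share

# Hypothetical per-window labels for a stretch of sitting data
window_labels = ["sitting", "sitting", "standing", "sitting",
                 "sitting", "sitting", "sitting"]
counts, share = window_summary(window_labels, "sitting")

# Most frequent label and the fraction of windows that matched
print(counts.most_common(1)[0][0], round(share, 2))
```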

In [None]:
# Calculate accuracy of the model against the entire hold out testing set
accuracy = tc.evaluation.accuracy(test_data['activity'], model.predict(test_data))
print(f'The activity classifier predicted {accuracy * 100:.2f}% of the testing observations correctly!', flush=True)
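
A single overall accuracy number can hide an activity that the model handles poorly. As a rough sketch, accuracy can be broken down per class with plain Python, given two equal-length lists of true and predicted labels. The `per_class_accuracy` helper and the sample labels are hypothetical and only illustrate the calculation.

```python
from collections import defaultdict

def per_class_accuracy(targets, predictions):
    """Return the fraction of correctly predicted rows for each true label."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for target, predicted in zip(targets, predictions):
        total[target] += 1
        if target == predicted:
            correct[target] += 1
    return {label: correct[label] / total[label] for label in total}

# Hypothetical true and predicted labels
targets = ["sitting", "sitting", "standing", "standing", "moving", "moving"]
predictions = ["sitting", "standing", "standing", "standing", "moving", "sitting"]

# Overall accuracy: fraction of rows where the prediction equals the target
overall = sum(t == p for t, p in zip(targets, predictions)) / len(targets)
by_class = per_class_accuracy(targets, predictions)
print(round(overall, 2), by_class)
```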

### 3. **Save the model**
Once your model has been created, it must be saved to the Skafos framework via the code below. This will trigger a push to your mobile app.

In [None]:
# Export the model to CoreML
coreml_model_name = "ActivityClassifier.mlmodel"
compressed_model_name = coreml_model_name + ".gz"
model.export_coreml(coreml_model_name)

# Compress the model
compressed_model = sm.compress_model(coreml_model_name)

# Save to Skafos
sm.skafos_save_model(
    skafos=ska,
    model_name=compressed_model_name,
    compressed_model=compressed_model,
    permissions='public'
)
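
The source of `utilities.save_models` is not shown in this notebook. Judging only by the `.gz` suffix on the model name, the compression step presumably gzips the exported file; that is an assumption. Under that assumption, a comparable step could be sketched with the standard library as follows, where `gzip_file_bytes` is a hypothetical stand-in and not the actual `sm.compress_model`.

```python
import gzip
import os
import tempfile

def gzip_file_bytes(path):
    """Read a file from disk and return its gzip-compressed bytes."""
    with open(path, "rb") as f:
        return gzip.compress(f.read())

# Use a placeholder file so the sketch runs without a trained model
original = b"placeholder model bytes" * 100
with tempfile.TemporaryDirectory() as tmp:
    model_path = os.path.join(tmp, "ActivityClassifier.mlmodel")
    with open(model_path, "wb") as f:
        f.write(original)
    compressed = gzip_file_bytes(model_path)

# Round-trip check: decompressing gives back the original bytes
restored = gzip.decompress(compressed)
print(len(original), len(compressed), restored == original)
```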