<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#1.-MVC-project-description" data-toc-modified-id="1.-MVC-project-description-1">1. MVC project description</a></span></li><li><span><a href="#2.-Setup" data-toc-modified-id="2.-Setup-2">2. Setup</a></span></li><li><span><a href="#3.-Get-the-data" data-toc-modified-id="3.-Get-the-data-3">3. Get the data</a></span><ul class="toc-item"><li><span><a href="#3.1.-From-matlab-to-dict" data-toc-modified-id="3.1.-From-matlab-to-dict-3.1">3.1. From matlab to dict</a></span></li><li><span><a href="#3.2.-From-dict-to-pandas" data-toc-modified-id="3.2.-From-dict-to-pandas-3.2">3.2. From dict to pandas</a></span></li></ul></li><li><span><a href="#4.-Save-the-data" data-toc-modified-id="4.-Save-the-data-4">4. Save the data</a></span></li></ul></div>

# 1. MVC project description

**Links**
- [GitHub repository](https://github.com/romainmartinez/mvc)
- [Plotly figures](https://plot.ly/organize/romainmartinez:114)

**Author**: _Romain Martinez._

# 2. Setup

In [1]:
# Common imports
import pandas as pd
import numpy as np

# custom functions
import mvc
%load_ext autoreload
%autoreload 2

# Path
from pathlib import Path
PROJECT_PATH = Path('./')
DATA_PATH = PROJECT_PATH / 'data'
MODEL_PATH = PROJECT_PATH / 'model'

# `dill` is optional: fall back to the standard-library pickle if it is not installed
try:
    import dill as pickle
except ImportError:
    import pickle

# 3. Get the data

## 3.1. From matlab to dict

In [3]:
DATA_FORMAT = 'only_max'
data, DATASET_NAMES = mvc.fileio.load_data(
    data_path=DATA_PATH,
    data_format=DATA_FORMAT,
    normalize=False,
    verbose=True)
normalized, _ = mvc.fileio.load_data(
    data_path=DATA_PATH, data_format=DATA_FORMAT, normalize=True)

project 'Romain2017' (32 participants)
project 'Landry2015_1' (14 participants)
project 'Landry2012' (18 participants)
project 'Yoann_2015' (22 participants)
project 'Tennis' (16 participants)
project 'Violon' (10 participants)
project 'Patrick_2013' (16 participants)
project 'Landry2015_2' (11 participants)
project 'Landry2016' (15 participants)
project 'Sylvain_2015' (10 participants)
project 'Landry2013' (21 participants)

	total participants: 184


## 3.2. From dict to pandas

In [4]:
df_tidy = pd.DataFrame({
    'participant': data['participants'],
    'dataset': data['datasets'],
    'muscle': data['muscles'],
    'test': data['tests'],
    'mvc': data['mvc']
}).dropna()

print(f'dataset shape = {df_tidy.shape}')
df_tidy.head()

dataset shape = (18465, 5)


|   | participant | dataset | muscle | test | mvc |
|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 | 0.000381 |
| 3 | 0 | 0 | 0 | 3 | 0.0003 |
| 4 | 0 | 0 | 0 | 4 | 0.000348 |
| 5 | 0 | 0 | 0 | 5 | 0.000111 |
| 8 | 0 | 0 | 0 | 8 | 0.000249 |


In [5]:
df_tidy_normalized = pd.DataFrame({
    'participant': normalized['participants'],
    'dataset': normalized['datasets'],
    'muscle': normalized['muscles'],
    'test': normalized['tests'],
    'mvc': normalized['mvc']
}).dropna()

print(f'dataset shape = {df_tidy_normalized.shape}')
df_tidy_normalized.head()

dataset shape = (18465, 5)


|   | participant | dataset | muscle | test | mvc |
|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 | 100.0 |
| 3 | 0 | 0 | 0 | 3 | 78.620986 |
| 4 | 0 | 0 | 0 | 4 | 91.248041 |
| 5 | 0 | 0 | 0 | 5 | 29.21831 |
| 8 | 0 | 0 | 0 | 8 | 65.428287 |
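
The normalized values above are consistent with each MVC being expressed as a percentage of the largest value recorded for the same participant and muscle (the first row, the maximum, maps to 100.0). The sketch below illustrates that idea on a toy frame. It is an assumption about what `normalize=True` does, not the actual `mvc.fileio.load_data` implementation, and the toy values and the `mvc_percent` column name are hypothetical.

```python
import pandas as pd

# Toy data (illustrative values only): one participant, two muscles, two tests each.
toy = pd.DataFrame({
    'participant': [0, 0, 0, 0],
    'muscle': [0, 0, 1, 1],
    'test': [0, 3, 0, 3],
    'mvc': [0.0004, 0.0003, 0.0001, 0.0002],
})

# Assumption: every value is scaled by the maximum of its participant/muscle group.
group_max = toy.groupby(['participant', 'muscle'])['mvc'].transform('max')
toy['mvc_percent'] = 100 * toy['mvc'] / group_max

print(toy)
```

Under this assumption, the test that produced the largest activation for a given muscle always reads 100, and every other test reads as a fraction of it.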


In [6]:
df_wide = df_tidy.pivot_table(
    index=['dataset', 'participant', 'muscle'],
    columns='test',
    values='mvc',
    fill_value=np.nan).reset_index()

df_wide = df_wide.drop(['dataset', 'participant'], axis=1)
df_wide.columns = df_wide.columns.astype(str)

print(f'dataset shape = {df_wide.shape}')
df_wide.head()

dataset shape = (1721, 17)


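| test | muscle | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0.000381 | NaN | NaN | 0.0003 | 0.000348 | 0.000111 | NaN | NaN | 0.000249 | 3.6e-05 | 1.1e-05 | NaN | NaN | 0.000162 | NaN | NaN |
| 1 | 3 | 0.000471 | NaN | NaN | 0.000442 | 0.00043 | 1e-05 | NaN | NaN | 0.000237 | 1.3e-05 | 2.1e-05 | NaN | NaN | 0.000373 | NaN | NaN |
| 2 | 4 | 0.000158 | NaN | NaN | 0.000136 | 0.00015 | 6.5e-05 | NaN | NaN | 0.00012 | 2.7e-05 | 1.9e-05 | NaN | NaN | 3.4e-05 | NaN | NaN |
| 3 | 5 | 8.2e-05 | NaN | NaN | 5.1e-05 | 7.6e-05 | 0.000268 | NaN | NaN | 8.7e-05 | 0.000161 | 1.4e-05 | NaN | NaN | 9e-06 | NaN | NaN |
| 4 | 6 | 1.2e-05 | NaN | NaN | 6.3e-05 | 2.1e-05 | 2.7e-05 | NaN | NaN | 8e-06 | 1.1e-05 | 5.1e-05 | NaN | NaN | 7.9e-05 | NaN | NaN |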
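
To make the tidy-to-wide step easier to follow, here is a minimal sketch of the same `pivot_table` call on a toy frame. The values are made up for illustration. Each (dataset, participant, muscle) combination becomes one row, each test becomes one column, and tests that were not performed for that combination are left as `NaN`.

```python
import numpy as np
import pandas as pd

# Toy tidy data (illustrative values): muscle 0 has tests 0 and 3, muscle 1 has tests 0 and 4.
toy_tidy = pd.DataFrame({
    'dataset': [0, 0, 0, 0],
    'participant': [0, 0, 0, 0],
    'muscle': [0, 0, 1, 1],
    'test': [0, 3, 0, 4],
    'mvc': [0.4, 0.3, 0.1, 0.2],
})

# One row per (dataset, participant, muscle); one column per test.
toy_wide = toy_tidy.pivot_table(
    index=['dataset', 'participant', 'muscle'],
    columns='test',
    values='mvc').reset_index()

# Test labels are integers after the pivot; cast every column name to str.
toy_wide.columns = toy_wide.columns.astype(str)

print(toy_wide)
```

Casting the column names to strings matters for the save step: Feather files require string column names.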


# 4. Save the data

In [7]:
df_tidy.reset_index(drop=True).to_feather(DATA_PATH / 'df_tidy')
df_tidy_normalized.reset_index(drop=True).to_feather(DATA_PATH / 'df_tidy_normalized')
df_wide.reset_index(drop=True).to_feather(DATA_PATH / 'df_wide')

In [8]:
conf = {
    'DATASETS': DATASET_NAMES,
    'MUSCLES': [
        'upper trapezius', 'middle trapezius', 'lower trapezius',
        'anterior deltoid', 'middle deltoid', 'posterior deltoid',
        'pectoralis major', 'serratus anterior', 'latissimus dorsi',
        'supraspinatus', 'infraspinatus', 'subscapularis'
    ],
    'TESTS': np.arange(16).tolist()
}

import pickle  # standard library; keeps this cell independent of the optional `dill` import

with open(MODEL_PATH / 'conf.pkl', 'wb') as h:
    pickle.dump(conf, h)
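
To read the configuration back in a later notebook, something like the following round trip should work. This is a minimal sketch using the standard-library `pickle` and a temporary directory. The `conf_example` dictionary and the file location are hypothetical stand-ins for the real `conf` and `MODEL_PATH / 'conf.pkl'`.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical stand-in for the real configuration dictionary.
conf_example = {
    'MUSCLES': ['upper trapezius', 'middle trapezius'],
    'TESTS': list(range(16)),
}

with tempfile.TemporaryDirectory() as tmp:
    conf_file = Path(tmp) / 'conf.pkl'

    # Write the dictionary in binary mode...
    with open(conf_file, 'wb') as h:
        pickle.dump(conf_example, h)

    # ...and read it back the same way.
    with open(conf_file, 'rb') as h:
        loaded = pickle.load(h)

print(loaded['MUSCLES'])
```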