# Pre-Processing of the 2016-ANSAMO Dataset
**Directory:**
    > `Subject_<nr>_ADL_<activity>.csv`
    > ...

**Types of Executed ADLs:**   
1) normal walking, 2) light jogging, 3) body bending, 4) hopping, 5) climbing stairs (up), 6) climbing stairs (down), 7) lying down and getting up from a bed, 8) sitting down (and up) on (from) a chair.

**Column Units:**  
After the header, every line in a file corresponds to one measurement captured by a particular mobility sensor of a given node (mote or SensorTag).  
Each line, as the file header also explains, holds 7 numerical values separated by semicolons:
- The time (in ms) since the experiment began.
- The number of the sample (for the same sensor and node).
- Three real numbers with the measurements of the triaxial sensor (x-axis, y-axis and z-axis). The units are g, °/s or μT depending on whether the measurement came from an accelerometer, a gyroscope or a magnetometer, respectively.
- An integer (0, 1 or 2) giving the type of sensor that produced the measurement (Accelerometer = 0, Gyroscope = 1, Magnetometer = 2).
- An integer (from 0 to 4) identifying the sensing node (the mapping between this code, the Bluetooth MAC address and the position of the motes is described in the file header).
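
The line format described above can be sketched as follows. This is a minimal illustration, assuming two invented measurement lines (not real dataset rows) and the column names used later in this notebook:

```python
import io
import pandas as pd

COLUMNS = ['ts(ms)', 'SampleNr', 'X-Axis', 'Y-Axis', 'Z-Axis', 'SensorType', 'SensorID']

# Two made-up lines in the documented format (values are illustrative only).
sample = io.StringIO(
    "145;1;-0.42;1.13;0.27;0;0\n"   # accelerometer (0) on node 0
    "159;1;12.94;2.28;4.65;1;2\n"   # gyroscope (1) on node 2
)

# Semicolon-separated, no header row in this snippet.
parsed = pd.read_csv(sample, sep=';', header=None, names=COLUMNS)
print(parsed)
```

Reading with `sep=';'` yields numeric columns directly, which avoids the string comparisons needed when every line is kept as one raw string.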
        
## Desired Dataset Format

current header = TS(ms);SampleNr;X-Axis;Y-Axis;Z-Axis;SensorType;SensorID  
desired header = TS(ms); AccX; AccY; AccZ; MagnX; MagnY; MagnZ; GyroX; GyroY; GyroZ; SubjectID; Gender; Age; Position; Label; and other Feature Extraction Columns (mean, std, corr, etc...)

Pre-Processing Tasks:
- clean the header information
- split the files by subject ("subject_01.csv", "subject_02.csv", ...) using the desired header above.
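
The change from the current header to the desired one amounts to turning one row per sensor reading into one row per timestamp. A possible sketch using `pivot_table`, assuming invented readings for a single node and the `Acc`/`Gyro`/`Magn` prefixes from the desired header (the notebook itself uses a different, loop-based approach):

```python
import pandas as pd

# Hypothetical long-format rows for one node: one row per (timestamp, sensor type).
long_rows = pd.DataFrame({
    'ts(ms)':     [100, 100, 100, 130],
    'SensorType': [0, 1, 2, 0],            # 0 = acc, 1 = gyro, 2 = magn
    'X-Axis':     [0.1, 10.0, 50.0, 0.2],
    'Y-Axis':     [0.9, 11.0, 51.0, 0.8],
    'Z-Axis':     [0.3, 12.0, 52.0, 0.4],
})

PREFIX = {0: 'Acc', 1: 'Gyro', 2: 'Magn'}
AXES = {'X-Axis': 'X', 'Y-Axis': 'Y', 'Z-Axis': 'Z'}

wide = long_rows.pivot_table(index='ts(ms)', columns='SensorType',
                             values=list(AXES), aggfunc='first')
# Flatten the (axis, sensor type) column pairs into names such as 'AccX'.
wide.columns = [PREFIX[sensor] + AXES[axis] for axis, sensor in wide.columns]
wide = wide.reset_index()
print(wide)
```

Timestamps where a sensor has no reading end up as `NaN` in that sensor's columns.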

In [1]:
## Readme File ##
from IPython.display import IFrame
IFrame("./2016-ANSAMO-Readme.pdf", width=800, height=800)

In [4]:
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

In [3]:
##
## first download dataset
##

import sys
sys.path.append('../../../')
from ipynb.fs.full.Utils import download_from_google_drive
download_from_google_drive('fuMxj-dnMXEPHGhHwuQoNjBovO_v5Kdk', '../../../datasets/ANSAMO-2016.zip');

The File Already Exists. Please Change The Path Destination.


In [218]:
SUBJECTS = pd.DataFrame([
    {'name': 'manel', 'age':'1'}, 
    {'name': 'manel1', 'age':2}, 
])
user = SUBJECTS[SUBJECTS['name'].isin(['manel'])];
print(user['age'])

0    1
Name: age, dtype: object


In [259]:
SUBJECTS = pd.DataFrame([
    {'id': 'Subject 01', 'age':'22', 'gender':'female', 'height':'167', 'weight': '63'},
    {'id': 'Subject 02', 'age':'27', 'gender':'male', 'height':'173', 'weight': '90'},
    {'id': 'Subject 03', 'age':'23', 'gender':'male', 'height':'179', 'weight': '68'}, 
    {'id': 'Subject 04', 'age':'24', 'gender':'male', 'height':'175', 'weight': '79'},
    {'id': 'Subject 05', 'age':'28', 'gender':'male', 'height':'195', 'weight': '81'},
    
    {'id': 'Subject 06', 'age':'22', 'gender':'female', 'height':'167', 'weight': '57'},
    {'id': 'Subject 07', 'age':'55', 'gender':'male', 'height':'170', 'weight': '83'},
    {'id': 'Subject 08', 'age':'19', 'gender':'male', 'height':'178', 'weight': '68'},
    {'id': 'Subject 09', 'age':'26', 'gender':'male', 'height':'176', 'weight': '73'},
    {'id': 'Subject 10', 'age':'51', 'gender':'female', 'height':'155', 'weight': '55'},
    
    {'id': 'Subject 11', 'age':'14', 'gender':'female', 'height':'159', 'weight': '50'},
    {'id': 'Subject 12', 'age':'22', 'gender':'female', 'height':'164', 'weight': '52'},
    {'id': 'Subject 13', 'age':'26', 'gender':'male', 'height':'179', 'weight': '67'},
    {'id': 'Subject 14', 'age':'21', 'gender':'male', 'height':'173', 'weight': '77'},
    {'id': 'Subject 15', 'age':'27', 'gender':'female', 'height':'166', 'weight': '66'},
    
    {'id': 'Subject 16', 'age':'24', 'gender':'male', 'height':'177', 'weight': '66'},
    {'id': 'Subject 17', 'age':'23', 'gender':'female', 'height':'163', 'weight': '93'},
])

def read_file(filepath):
    # keep only the file name, whether or not the path contains directories
    filename = filepath.split('/')[-1]
    print(filename)
    ## filename = UMAFall_Subject_01_ADL_Bending_1_2016-06-13_20-25-34.csv
    array = filename.split('_')
    subjectID = array[1] + " " + array[2]
    label = array[4]
    experiment = array[4] + "_" + array[5]
    ### read csv
    train_dataRaw = pd.read_csv(filepath)
    ### remove first 31 lines
    train_data = train_dataRaw.iloc[31:]
    ### user characteristics
    user = SUBJECTS[SUBJECTS['id'].isin([subjectID])];
    age = user['age']
    gender = user['gender']
    height = user['height']
    weight = user['weight']

    return {"filecontent": train_data, 
             'experiment': experiment, 
             'label': label, 
             'subjectID': subjectID,
             'filename': filename, 
             'gender': gender.values[0], 
             'age': age.values[0], 
             'height': height.values[0], 
             'weight': weight.values[0]};
   

###output = read_file("../../../datasets/ANSAMO-2016/UMA_ADL_FALL_Dataset/UMAFall_Subject_01_ADL_Bending_1_2016-06-13_20-25-34.csv");

UMAFall_Subject_01_ADL_Bending_1_2016-06-13_20-25-34.csv
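
Splitting the file name on `_` by position works for the file shown above. A stricter alternative is a regular expression that fails loudly on unexpected names. The sketch below assumes a pattern inferred from that single file name; other files in the dataset may not match it, and `parse_filename` is a hypothetical helper:

```python
import re

# Pattern inferred from one example file name; treat it as an assumption.
FILENAME_RE = re.compile(
    r'^UMAFall_Subject_(?P<subject>\d+)_(?P<kind>[A-Za-z]+)_(?P<activity>[A-Za-z]+)'
    r'_(?P<trial>\d+)_(?P<date>\d{4}-\d{2}-\d{2})_(?P<time>\d{2}-\d{2}-\d{2})\.csv$'
)

def parse_filename(filename):
    """Return subjectID, label and experiment, or raise on an unexpected name."""
    match = FILENAME_RE.match(filename)
    if match is None:
        raise ValueError("Unexpected file name: " + filename)
    fields = match.groupdict()
    return {
        'subjectID': 'Subject ' + fields['subject'],
        'label': fields['activity'],
        'experiment': fields['activity'] + '_' + fields['trial'],
    }

parsed_name = parse_filename('UMAFall_Subject_01_ADL_Bending_1_2016-06-13_20-25-34.csv')
print(parsed_name)
```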


In [264]:
##  Sensor_ID	 Position	 Device Model                                   
#0	 RIGHTPOCKET	 lge-LG-H815-5.1                                
#1	 CHEST	 SensorTag
#2	 WAIST	 SensorTag
#3	 WRIST	 SensorTag                                            
#4	 ANKLE	 SensorTag                                                                                                            
RIGHTPOCKET = '0'
CHEST = '1'
WAIST = '2'
WRIST = '3'
ANKLE = '4'
#% Sensor_Type:                                                                     
#% Accelerometer = 0                                                                
#% Gyroscope = 1                                                                    
#% Magnetometer = 2 
ACCE = '0'
GYRO = '1'
MAGN = '2'
    
def split_positions_and_signals(filecontent):
    ''' filecontent = pd.DataFrame '''
    ### divide line into several columns
    header_tmp = {'ts(ms)': [], 'SampleNr': [], 'X-Axis': [], 'Y-Axis': [], 'Z-Axis': [], 'SensorType': [] , 'SensorID': []}
    values = pd.DataFrame(data=header_tmp)
    i = 0
    for index, row in filecontent.iterrows():
            line = row.iloc[0]
            array = line.split(';')
            # the first remaining row is skipped; DataFrame.append was removed in pandas 2.0, so use pd.concat
            if i > 0: values = pd.concat([values, pd.DataFrame({'ts(ms)': [array[0]], 'SampleNr': [array[1]], 'X-Axis': [array[2]], 'Y-Axis': [array[3]], 'Z-Axis': [array[4]], 'SensorType': [array[5]] , 'SensorID': [array[6]]})])
            i += 1
    ### sorting per sensorid, sensortype, and ts
    values = values.sort_values(['SensorID', 'SensorType', 'ts(ms)']);
    ### start splitting into positions and signals
    # divide signals per position
    # (use the local `values` frame; `train_data_processed_ordered` only exists as a notebook global)
    train_data_pocket = values[values['SensorID'].isin([RIGHTPOCKET])];
    train_data_chest = values[values['SensorID'].isin([CHEST])];
    train_data_waist = values[values['SensorID'].isin([WAIST])];
    train_data_wrist = values[values['SensorID'].isin([WRIST])];
    train_data_ankle = values[values['SensorID'].isin([ANKLE])];
    # pocket 
    pocket_acc = train_data_pocket[train_data_pocket['SensorType'].isin([ACCE])];
    pocket_gyro = train_data_pocket[train_data_pocket['SensorType'].isin([GYRO])];
    pocket_magn = train_data_pocket[train_data_pocket['SensorType'].isin([MAGN])];
    # wrist
    wrist_acc = train_data_wrist[train_data_wrist['SensorType'].isin([ACCE])];
    wrist_gyro = train_data_wrist[train_data_wrist['SensorType'].isin([GYRO])];
    wrist_magn = train_data_wrist[train_data_wrist['SensorType'].isin([MAGN])];
    # ankle
    ankle_acc = train_data_ankle[train_data_ankle['SensorType'].isin([ACCE])];
    ankle_gyro = train_data_ankle[train_data_ankle['SensorType'].isin([GYRO])];
    ankle_magn = train_data_ankle[train_data_ankle['SensorType'].isin([MAGN])];
    # waist
    waist_acc = train_data_waist[train_data_waist['SensorType'].isin([ACCE])];
    waist_gyro = train_data_waist[train_data_waist['SensorType'].isin([GYRO])];
    waist_magn = train_data_waist[train_data_waist['SensorType'].isin([MAGN])];
    # chest
    chest_acc = train_data_chest[train_data_chest['SensorType'].isin([ACCE])];
    chest_gyro = train_data_chest[train_data_chest['SensorType'].isin([GYRO])];
    chest_magn = train_data_chest[train_data_chest['SensorType'].isin([MAGN])];
    
    return {'pocket_acc':pocket_acc,
           'pocket_gyro':pocket_gyro,
           'pocket_magn':pocket_magn,
           
           'wrist_acc':wrist_acc,
           'wrist_gyro':wrist_gyro,
           'wrist_magn':wrist_magn,
           
           'ankle_acc':ankle_acc,
           'ankle_gyro':ankle_gyro,
           'ankle_magn':ankle_magn,
           
           'waist_acc':waist_acc,
           'waist_gyro':waist_gyro,
           'waist_magn':waist_magn,
           
           'chest_acc':chest_acc,
           'chest_gyro':chest_gyro,
           'chest_magn':chest_magn,
           
            'train_data_pocket':train_data_pocket,
            'train_data_chest':train_data_chest,
            'train_data_waist':train_data_waist,
            'train_data_wrist':train_data_wrist,
            'train_data_ankle':train_data_ankle,
           };

##filecontent = pd.DataFrame(output['filecontent']);
##values = split_positions_and_signals(filecontent);

all values length: 6568
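
Growing a DataFrame one row at a time inside `iterrows()` is slow on files with thousands of lines. A vectorized alternative is `Series.str.split(';', expand=True)`, sketched here on an invented stand-in for `filecontent` (a single column of raw lines):

```python
import pandas as pd

COLUMNS = ['ts(ms)', 'SampleNr', 'X-Axis', 'Y-Axis', 'Z-Axis', 'SensorType', 'SensorID']

# Stand-in for `filecontent`: one column holding raw lines (values invented).
raw = pd.DataFrame({'raw': [
    "145;1;-0.42;1.13;0.27;0;0",
    "146;2;-0.35;0.69;0.15;0;0",
    "158;1;0.88;-0.05;0.46;0;2",
]})

# Split every line at once; each field becomes its own (string) column.
split_values = raw.iloc[:, 0].str.split(';', expand=True)
split_values.columns = COLUMNS

# Fields are still strings here, so the '0'..'4' constants above still apply.
pocket_rows = split_values[split_values['SensorID'] == '0']
print(len(pocket_rows))
```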


In [351]:
def join_all_signals_types(signals, info):
    ''' Receives the arguments:
    signals = pd.DataFrame([
        {'value': pocket_acc, 'signalType': 'acc', 'position': 'pocket'},
        {'value': pocket_gyro, 'signalType': 'gyro', 'position': 'pocket'}, ...])
    info = pd.DataFrame([{
        'userGender': 'male',
        'userAge': '55',
        'userID': 'sdasd',
        'filename': 'asda.csv',
        'experiment': 'Bending_1',
        'label': 'Bending'}])
    '''
    userAge = info['userAge']
    userGender = info['userGender']
    userID = info['userID']
    label = info['label']
    filename = info['filename']
    experiment = info['experiment']
    
    header = [ 'ts(ms)', 'accX', 'accY', 'accZ',
              'magX', 'magY', 'magZ',
              'gyrX', 'gyrY', 'gyrZ',
              'userGender', 'userAge', 'userID',
              'position', 
              'label',
              'filename',
              'experiment' ];
    all_values = pd.DataFrame(columns=header);
    
    for indexR, lineR in signals.iterrows():
        array = lineR['value'];
        signalType = lineR['signalType'];
        position = lineR['position'];
        
        for index, line in array.iterrows():
            row = all_values[all_values['ts(ms)'] == line['ts(ms)']];
            ts_tmp = [line['ts(ms)']];
            
            if signalType == "acc": 
                 acc_values_tmp = [ [line['X-Axis']] ,  [line['Y-Axis']],  [line['Z-Axis']]];
                 magn_values_tmp = [ [-1] ,  [-1],  [-1]];
                 gyro_values_tmp = [ [-1] ,  [-1],  [-1]]; 
            elif signalType == "magn":
                 acc_values_tmp = [ [-1] ,  [-1],  [-1]]; 
                 magn_values_tmp = [ [line['X-Axis']] ,  [line['Y-Axis']],  [line['Z-Axis']]];
                 gyro_values_tmp = [ [-1] ,  [-1],  [-1]]; 
            else:
                acc_values_tmp = [ [-1] ,  [-1],  [-1]];    
                magn_values_tmp = [ [-1] ,  [-1],  [-1]];
                gyro_values_tmp = [ [line['X-Axis']] ,  [line['Y-Axis']],  [line['Z-Axis']]];      

            if row.size == 0:
                # DataFrame.append was removed in pandas 2.0, so use pd.concat
                all_values = pd.concat([all_values, pd.DataFrame({'ts(ms)': ts_tmp, 
                                    'accX':acc_values_tmp[0],
                                    'accY':acc_values_tmp[1],
                                    'accZ': acc_values_tmp[2], 
                                    'magX':magn_values_tmp[0],
                                    'magY':magn_values_tmp[1], 
                                    'magZ': magn_values_tmp[2], 
                                    'gyrX':gyro_values_tmp[0],
                                    'gyrY': gyro_values_tmp[1],
                                    'gyrZ': gyro_values_tmp[2], 
                                    'userGender': userGender, 
                                    'userAge': userAge, 
                                    'userID': userID,
                                    'position': position, 
                                    'label':label.values,
                                    'filename':filename,
                                    'experiment': experiment})])
            else:
                # NOTE: `row` is a filtered copy, so these assignments do not change all_values
                row.at[index, 'accX'] = acc_values_tmp[0];
                row.at[index, 'accY'] = acc_values_tmp[1];
                row.at[index, 'accZ'] = acc_values_tmp[2];
                
                row.at[index, 'gyrX'] = gyro_values_tmp[0];
                row.at[index, 'gyrY'] = gyro_values_tmp[1];
                row.at[index, 'gyrZ'] = gyro_values_tmp[2];
                
                row.at[index, 'magX'] = magn_values_tmp[0];
                row.at[index, 'magY'] = magn_values_tmp[1];
                row.at[index, 'magZ'] = magn_values_tmp[2];
                
    return all_values;

###
###signals = pd.DataFrame([{'value': wrist_gyro, 'signalType': 'gyro'}])
###all_values = join_all_signals_types(signals)
###print("len of all values:", len(all_values));
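
As an alternative to filling missing sensor values with `-1` row by row, the per-sensor frames can be joined on the timestamp with an outer `pd.merge`, which leaves `NaN` where a sensor has no sample. This is a sketch of a different technique than the function above, using invented readings and a hypothetical `to_wide` helper:

```python
import pandas as pd

def to_wide(frame, prefix):
    """Rename the generic axis columns to sensor-specific ones, e.g. 'accX'."""
    renamed = frame.rename(columns={'X-Axis': prefix + 'X',
                                    'Y-Axis': prefix + 'Y',
                                    'Z-Axis': prefix + 'Z'})
    return renamed[['ts(ms)', prefix + 'X', prefix + 'Y', prefix + 'Z']]

# Invented readings for one position; timestamps only partly overlap.
acc = pd.DataFrame({'ts(ms)': [100, 130], 'X-Axis': [0.1, 0.2],
                    'Y-Axis': [0.9, 0.8], 'Z-Axis': [0.3, 0.4]})
gyr = pd.DataFrame({'ts(ms)': [100, 160], 'X-Axis': [10.0, 20.0],
                    'Y-Axis': [11.0, 21.0], 'Z-Axis': [12.0, 22.0]})

merged = pd.merge(to_wide(acc, 'acc'), to_wide(gyr, 'gyr'), on='ts(ms)', how='outer')
merged = merged.sort_values('ts(ms)').reset_index(drop=True)
merged['position'] = 'wrist'   # constant metadata is added as plain columns
print(merged)
```

Using `NaN` instead of `-1` keeps placeholder values from being confused with real readings, since `-1` is a plausible accelerometer or gyroscope value.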

In [349]:
output = read_file("../../../datasets/ANSAMO-2016/UMA_ADL_FALL_Dataset/UMAFall_Subject_01_ADL_Bending_1_2016-06-13_20-25-34.csv");
filecontent = pd.DataFrame(output['filecontent']);

experiment = output['experiment'];
print("experiment:", experiment)

label = output['label'];
print("label:", label)

subjectID = output['subjectID'];
print("subjectID:", subjectID)

filename = output['filename'];
print("filename:", filename)

gender = output['gender'];
print("gender:", gender)

age = output['age'];
print("age:", age)

height = output['height'];
print("height:", height)

weight = output['weight'];
print("weight:", weight)

info = pd.DataFrame([{
    'userGender': gender,
    'userAge': age,
    'userID': subjectID,
    'filename': filename,
    'experiment': experiment,
    'label': label
}])

UMAFall_Subject_01_ADL_Bending_1_2016-06-13_20-25-34.csv
experiment: Bending_1
label: Bending
subjectID: Subject 01
filename: UMAFall_Subject_01_ADL_Bending_1_2016-06-13_20-25-34.csv
gender: female
age: 22
height: 167
weight: 63


In [284]:
# split into positions and signal type
values = split_positions_and_signals(filecontent);

In [340]:
pocket_acc = values['pocket_acc']
pocket_gyro = values['pocket_gyro']
pocket_magn = values['pocket_magn']

wrist_acc = values['wrist_acc']
wrist_gyro = values['wrist_gyro']
wrist_magn = values['wrist_magn']

ankle_acc = values['ankle_acc']
ankle_gyro = values['ankle_gyro']
ankle_magn = values['ankle_magn']

waist_acc = values['waist_acc']
waist_gyro = values['waist_gyro']
waist_magn = values['waist_magn']

chest_acc = values['chest_acc']
chest_gyro = values['chest_gyro']
chest_magn = values['chest_magn']

signalsPocket = pd.DataFrame(
    [
     {'value': pocket_acc, 'signalType': 'acc', 'position': 'pocket'},
     {'value': pocket_gyro, 'signalType': 'gyro', 'position': 'pocket'},
     {'value': pocket_magn, 'signalType': 'magn', 'position': 'pocket'},
    ]);

signalsAnkle = pd.DataFrame(
    [
    {'value': ankle_acc, 'signalType': 'acc', 'position': 'ankle'},
    {'value': ankle_gyro, 'signalType': 'gyro', 'position': 'ankle'},
    {'value': ankle_magn, 'signalType': 'magn', 'position': 'ankle'},
    ]);

signalsWrist = pd.DataFrame(
    [
     {'value': wrist_acc, 'signalType': 'acc', 'position': 'wrist'},
     {'value': wrist_gyro, 'signalType': 'gyro', 'position': 'wrist'},
     {'value': wrist_magn, 'signalType': 'magn', 'position': 'wrist'},
    ]);

signalsWaist = pd.DataFrame(
    [
    {'value': waist_acc, 'signalType': 'acc', 'position': 'waist'},
    {'value': waist_gyro, 'signalType': 'gyro', 'position': 'waist'},
    {'value': waist_magn, 'signalType': 'magn', 'position': 'waist'},
    ]);

signalsChest = pd.DataFrame(
    [
    {'value': chest_acc, 'signalType': 'acc', 'position': 'chest'},
    {'value': chest_gyro, 'signalType': 'gyro', 'position': 'chest'},
    {'value': chest_magn, 'signalType': 'magn', 'position': 'chest'},
    ]);

In [354]:
### join all values
all_values = join_all_signals_types(signalsPocket, info);
# DataFrame.append was removed in pandas 2.0, so use pd.concat
all_values = pd.concat([all_values, join_all_signals_types(signalsAnkle, info)])
all_values = pd.concat([all_values, join_all_signals_types(signalsWrist, info)])
all_values = pd.concat([all_values, join_all_signals_types(signalsWaist, info)])
all_values = pd.concat([all_values, join_all_signals_types(signalsChest, info)])

In [355]:
all_values[['ts(ms)']] = all_values[['ts(ms)']].apply(pd.to_numeric)
all_values = all_values.sort_values(['ts(ms)']);
all_values.head(5)

Unnamed: 0,accX,accY,accZ,experiment,filename,gyrX,gyrY,gyrZ,label,magX,magY,magZ,position,ts(ms),userAge,userGender,userID
0,-0.4218206405639648,1.136497378349304,0.278784453868866,Bending_1,UMAFall_Subject_01_ADL_Bending_1_2016-06-13_20...,-1.0,-1.0,-1.0,Bending,-1,-1,-1,pocket,145,22,female,Subject 01
0,-0.3587130010128021,0.6935194730758667,0.1525676548480988,Bending_1,UMAFall_Subject_01_ADL_Bending_1_2016-06-13_20...,-1.0,-1.0,-1.0,Bending,-1,-1,-1,pocket,146,22,female,Subject 01
0,0.8876953125,-0.05078125,0.464111328125,Bending_1,UMAFall_Subject_01_ADL_Bending_1_2016-06-13_20...,-1.0,-1.0,-1.0,Bending,-1,-1,-1,waist,158,22,female,Subject 01
0,-1.0,-1.0,-1.0,Bending_1,UMAFall_Subject_01_ADL_Bending_1_2016-06-13_20...,12.9453125,2.28125,4.65625,Bending,-1,-1,-1,waist,159,22,female,Subject 01
0,-0.851615309715271,0.4340080618858337,0.896072268486023,Bending_1,UMAFall_Subject_01_ADL_Bending_1_2016-06-13_20...,-1.0,-1.0,-1.0,Bending,-1,-1,-1,pocket,161,22,female,Subject 01


In [316]:
wrist_acc.head(5)

Unnamed: 0,SampleNr,SensorID,SensorType,X-Axis,Y-Axis,Z-Axis,ts(ms)
0,199,3,0,-0.8466796875,-0.50537109375,0.109375,10055
0,200,3,0,-0.9013671875,-0.435546875,0.1025390625,10086
0,201,3,0,-0.899658203125,-0.43212890625,0.107177734375,10135
0,19,3,0,-1.017822265625,-0.535400390625,0.066162109375,1018
0,202,3,0,-0.8994140625,-0.43212890625,0.1162109375,10193


In [346]:
acc = False
gyr = False
mag = False
pocket = False
ankle = False
for index, line in all_values.iterrows():
    if line['accX'] != -1:
        acc = True;
    if line['gyrX'] != -1:
        gyr = True
    if line['magX'] != -1:
        mag = True
    if line['position'] == 'pocket':
        pocket = True
    if line['position'] == 'ankle':
        ankle = True

print(acc)
print(gyr)
print(mag)
print(ankle)
print(pocket)

True
True
True
True
True


In [268]:
###
### validate data, check if everything is correct
###
print("all values length:", len(filecontent))
print("train_data_pocket.describe():", len(values['train_data_pocket']))
print("train_data_chest.describe():", len(values['train_data_chest']))
print("train_data_waist.describe():", len(values['train_data_waist']))
print("train_data_wrist.describe():", len(values['train_data_wrist']))
print("train_data_ankle.describe():", len(values['train_data_ankle']))
total_values_length = len(values['train_data_pocket']) + len(values['train_data_chest']) + len(values['train_data_waist']) + len(values['train_data_wrist']) + len(values['train_data_ankle'])
print("total_values_length:", total_values_length)

all values length: 6568
train_data_pocket.describe(): 2973
train_data_chest.describe(): 900
train_data_waist.describe(): 900
train_data_wrist.describe(): 897
train_data_ankle.describe(): 897
total_values_length: 6567


In [242]:
###
### Put All Files In Memory
### 

# upload raw csv file
train_dataRaw = pd.read_csv('../../../datasets/ANSAMO-2016/UMA_ADL_FALL_Dataset/UMAFall_Subject_01_ADL_Bending_1_2016-06-13_20-25-34.csv')
train_dataRaw.head(31)

# ignore the first lines
train_data = train_dataRaw.iloc[31:]
train_data.head(5)
##
print(type(train_dataRaw))

<class 'pandas.core.frame.DataFrame'>


In [14]:
train_data.describe()

Unnamed: 0,% Universidad de Malaga - ETSI de Telecomunicacion (Spain)
count,6568
unique,6568
top,5994;119;0.91552734375;-0.086181640625;0.47509...
freq,1


In [16]:
train_dataRaw = pd.read_csv('../../../datasets/ANSAMO-2016/UMA_ADL_FALL_Dataset/UMAFall_Subject_01_ADL_Bending_1_2016-06-13_20-25-34.csv')
header_tmp = {'ts(ms)': [], 'SampleNr': [], 'X-Axis': [], 'Y-Axis': [], 'Z-Axis': [], 'SensorType': [] , 'SensorID': []}
values = pd.DataFrame(data=header_tmp)
i = 0
for index, row in train_data.iterrows():
        line = row.iloc[0]
        array = line.split(';')
        # the first remaining row is skipped; DataFrame.append was removed in pandas 2.0, so use pd.concat
        if i > 0: values = pd.concat([values, pd.DataFrame({'ts(ms)': [array[0]], 'SampleNr': [array[1]], 'X-Axis': [array[2]], 'Y-Axis': [array[3]], 'Z-Axis': [array[4]], 'SensorType': [array[5]] , 'SensorID': [array[6]]})])
        i += 1

values.head(5)

Unnamed: 0,SampleNr,SensorID,SensorType,X-Axis,Y-Axis,Z-Axis,ts(ms)
0,1,0,0,-0.4218206405639648,1.136497378349304,0.278784453868866,145
0,2,0,0,-0.5496244430541992,0.902985155582428,0.1047172695398331,145
0,3,0,0,-0.6334832310676575,0.8695393800735474,0.0339203476905822,145
0,4,0,0,-0.6050428152084351,0.9466843605041504,0.0695630833506584,145
0,5,0,0,-0.5244794487953186,0.9684126377105712,0.065900482237339,145


In [15]:
values.describe()

Unnamed: 0,SampleNr,SensorID,SensorType,X-Axis,Y-Axis,Z-Axis,ts(ms)
count,6567,6567,6567,6567.0,6567.0,6567.0,6567
unique,2974,5,3,3052.0,3050.0,3314.0,5518
top,226,0,0,50.16666793823242,0.9328912496566772,62.66666793823242,146
freq,13,2973,4171,41.0,35.0,20.0,22


In [24]:
train_data_processed = values.copy()
train_data_processed_ordered = train_data_processed.sort_values(['SensorID', 'SensorType', 'ts(ms)']);
train_data_processed_ordered.head(10)

Unnamed: 0,SampleNr,SensorID,SensorType,X-Axis,Y-Axis,Z-Axis,ts(ms)
0,1983,0,0,-0.2003931552171707,0.9299615025520324,-0.1074323281645775,10003
0,1984,0,0,-0.1989290565252304,0.9316698908805848,-0.1104850172996521,10008
0,1985,0,0,-0.1991733312606812,0.929351568222046,-0.1063338592648506,10013
0,1986,0,0,-0.1973420232534409,0.9271546006202698,-0.1076766029000282,10018
0,199,0,0,-0.1048152893781662,0.9808630347251892,-0.052381195127964,1003
0,1987,0,0,-0.1992946863174439,0.9300844073295592,-0.1107277423143387,10040
0,1988,0,0,-0.2006374299526215,0.9320370554924012,-0.1102407425642014,10040
0,1989,0,0,-0.1986847668886185,0.9317927956581116,-0.1109720170497894,10040
0,1990,0,0,-0.1973420232534409,0.9328912496566772,-0.1087750792503357,10040
0,1991,0,0,-0.1985618621110916,0.9342340230941772,-0.1084094420075417,10049
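
The output above lists timestamp `1003` between `10018` and `10040`: the `ts(ms)` column still holds strings, so `sort_values` orders it lexicographically. A small sketch of the difference, using three timestamps from the table plus one invented value (`999`):

```python
import pandas as pd

ts_as_text = pd.DataFrame({'ts(ms)': ['10003', '1003', '10008', '999']})

# Sorting strings compares character by character, not by magnitude.
text_sorted = ts_as_text.sort_values('ts(ms)')['ts(ms)'].tolist()

# Converting to numbers first gives the chronological order.
ts_as_numbers = ts_as_text.assign(**{'ts(ms)': pd.to_numeric(ts_as_text['ts(ms)'])})
numeric_sorted = ts_as_numbers.sort_values('ts(ms)')['ts(ms)'].tolist()

print(text_sorted)
print(numeric_sorted)
```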


In [28]:
###
### analysis
###

mylist = list(set(train_data_processed_ordered['SensorType']))
print("Unique Sensor Types:", mylist)
mylist = list(set(train_data_processed_ordered['SensorID']))
print("Unique Sensor IDs:", mylist)

Unique Sensor Types: ['1', '0', '2']
Unique Sensor IDs: ['3', '4', '0', '1', '2']


In [50]:
#train_data_processed_ordered_tmp = train_data_processed_ordered.sort_values(['SensorType', 'ts(ms)']);
### xi - xi+1 , avgIntervalTime: ? , minIntervalTime: ?, maxIntervalTime: ?

##  Sensor_ID	 Position	 Device Model                                   
#0	 RIGHTPOCKET	 lge-LG-H815-5.1                                
#1	 CHEST	 SensorTag
#2	 WAIST	 SensorTag
#3	 WRIST	 SensorTag                                            
#4	 ANKLE	 SensorTag                                                                                                            
RIGHTPOCKET = '0'
CHEST = '1'
WAIST = '2'
WRIST = '3'
ANKLE = '4'
#% Sensor_Type:                                                                     
#% Accelerometer = 0                                                                
#% Gyroscope = 1                                                                    
#% Magnetometer = 2 
ACCE = '0'
GYRO = '1'
MAGN = '2'

## divide signals per position
train_data_pocket = train_data_processed_ordered[train_data_processed_ordered['SensorID'].isin([RIGHTPOCKET])];
train_data_chest = train_data_processed_ordered[train_data_processed_ordered['SensorID'].isin([CHEST])];
train_data_waist = train_data_processed_ordered[train_data_processed_ordered['SensorID'].isin([WAIST])];
train_data_wrist = train_data_processed_ordered[train_data_processed_ordered['SensorID'].isin([WRIST])];
train_data_ankle = train_data_processed_ordered[train_data_processed_ordered['SensorID'].isin([ANKLE])];

print("all values length:", len(values))
print("train_data_pocket.describe():", len(train_data_pocket))
print("train_data_chest.describe():", len(train_data_chest))
print("train_data_waist.describe():", len(train_data_waist))
print("train_data_wrist.describe():", len(train_data_wrist))
print("train_data_ankle.describe():", len(train_data_ankle))
total_values_length = len(train_data_pocket) + len(train_data_chest) + len(train_data_waist) + len(train_data_wrist) + len(train_data_ankle)
print("total_values_length:", total_values_length)

###
### Verify That The Split Was Correctly Made
### total_values_length == len(values), otherwise there's a bug!
if total_values_length == len(values):
    print("Split Was Correctly Made.");
else:
    print("Split Wasn't Correctly Made Because "
          "The Sum Of All Position Array Lengths Is Different Than The All Values Array Length")
    raise Exception('The Sum Of All Position Array Lengths Is Different Than The All Values Array Length')

all values length: 6567
train_data_pocket.describe(): 2973
train_data_chest.describe(): 900
train_data_waist.describe(): 900
train_data_wrist.describe(): 897
train_data_ankle.describe(): 897
total_values_length: 6567
Split Was Correctly Made.
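
The same consistency check can be written without one variable per position by counting rows per group. A minimal sketch on invented `SensorID`/`SensorType` values, kept as strings as in the cells above:

```python
import pandas as pd

# Invented readings: only the two grouping columns matter for the check.
readings = pd.DataFrame({
    'SensorID':   ['0', '0', '0', '1', '1', '2'],
    'SensorType': ['0', '0', '0', '0', '1', '2'],
})

# Rows per (position, sensor type) and rows per position.
counts = readings.groupby(['SensorID', 'SensorType']).size()
per_position = readings['SensorID'].value_counts()

# Every row belongs to exactly one group, so the totals must match.
if counts.sum() != len(readings):
    raise Exception('Group sizes do not add up to the number of rows')
print(counts)
```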


In [108]:
###
### Divide all data by sensor type
###

def check_sensor_split(position, position_data, acc, gyro, magn):
    ''' Print the subset lengths and check that the three sensor types add up to the position total '''
    print("train_data_" + position + " length:", len(position_data))
    print(position + "_acc length:", len(acc))
    print(position + "_gyro length:", len(gyro))
    print(position + "_magn length:", len(magn))
    total_length = len(acc) + len(gyro) + len(magn)
    if len(position_data) != total_length:
        raise Exception('The Sum Of All Sensor Type Array Lengths Is Different Than The Position Array Length')

# pocket
pocket_acc = train_data_pocket[train_data_pocket['SensorType'].isin([ACCE])]
pocket_gyro = train_data_pocket[train_data_pocket['SensorType'].isin([GYRO])]
pocket_magn = train_data_pocket[train_data_pocket['SensorType'].isin([MAGN])]
check_sensor_split('pocket', train_data_pocket, pocket_acc, pocket_gyro, pocket_magn)

# wrist
wrist_acc = train_data_wrist[train_data_wrist['SensorType'].isin([ACCE])]
wrist_gyro = train_data_wrist[train_data_wrist['SensorType'].isin([GYRO])]
wrist_magn = train_data_wrist[train_data_wrist['SensorType'].isin([MAGN])]
check_sensor_split('wrist', train_data_wrist, wrist_acc, wrist_gyro, wrist_magn)

# ankle
ankle_acc = train_data_ankle[train_data_ankle['SensorType'].isin([ACCE])]
ankle_gyro = train_data_ankle[train_data_ankle['SensorType'].isin([GYRO])]
ankle_magn = train_data_ankle[train_data_ankle['SensorType'].isin([MAGN])]
check_sensor_split('ankle', train_data_ankle, ankle_acc, ankle_gyro, ankle_magn)

# waist
waist_acc = train_data_waist[train_data_waist['SensorType'].isin([ACCE])]
waist_gyro = train_data_waist[train_data_waist['SensorType'].isin([GYRO])]
waist_magn = train_data_waist[train_data_waist['SensorType'].isin([MAGN])]
check_sensor_split('waist', train_data_waist, waist_acc, waist_gyro, waist_magn)

# chest
chest_acc = train_data_chest[train_data_chest['SensorType'].isin([ACCE])]
chest_gyro = train_data_chest[train_data_chest['SensorType'].isin([GYRO])]
chest_magn = train_data_chest[train_data_chest['SensorType'].isin([MAGN])]
check_sensor_split('chest', train_data_chest, chest_acc, chest_gyro, chest_magn)

train_data_pocket length: 2973
pocket_acc length: 2973
pocket_gyro length: 0
pocket_magn length: 0
train_data_wrist length: 897
wrist_acc length: 299
wrist_gyro length: 299
wrist_magn length: 299
train_data_ankle length: 897
ankle_acc length: 299
ankle_gyro length: 299
ankle_magn length: 299
train_data_waist length: 900
waist_acc length: 300
waist_gyro length: 300
waist_magn length: 300
train_data_chest length: 900
chest_acc length: 300
chest_gyro length: 300
chest_magn length: 300
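
With one frame per position and sensor type, the interval statistics mentioned earlier (average, minimum and maximum time between samples) can be estimated from consecutive timestamp differences. A sketch using four wrist accelerometer timestamps from the `wrist_acc.head(5)` output shown earlier (samples 199 to 202):

```python
import pandas as pd

# Timestamps as stored in the notebook (strings), from wrist_acc samples 199-202.
sensor = pd.DataFrame({'ts(ms)': ['10055', '10086', '10135', '10193']})

# Convert to numbers and sort before differencing.
ts = pd.to_numeric(sensor['ts(ms)']).sort_values()
intervals = ts.diff().dropna()   # x[i+1] - x[i], in ms

stats = {
    'avgIntervalTime': intervals.mean(),
    'minIntervalTime': intervals.min(),
    'maxIntervalTime': intervals.max(),
}
print(stats)
```

Four samples are far too few for a reliable estimate; the point is only the `diff()` pattern, which applies unchanged to a full per-sensor frame.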


In [130]:
def join_all_signals_types(signals):
    ''' Receives the argument: pd.DataFrame([{'value': pocket_acc, 'signalType': 'acc'}, {'value': pocket_gyro, 'signalType': 'gyro'}])
    '''
    header = [ 'ts(ms)', 'accX', 'accY', 'accZ',
              'magX', 'magY', 'magZ',
              'gyrX', 'gyrY', 'gyrZ',
              'userGender', 'userAge', 'userID',
              'position', 
              'label',
              'filename',
              'experiment' ];
    all_values = pd.DataFrame(columns=header);
    
    for indexR, lineR in signals.iterrows():
        array = lineR['value'];
        signalType = lineR['signalType'];
        
        for index, line in array.iterrows():
            row = all_values[all_values['ts(ms)'] == line['ts(ms)']];
            ts_tmp = [line['ts(ms)']];
            
            if signalType == "acc": 
                 acc_values_tmp = [ [line['X-Axis']] ,  [line['Y-Axis']],  [line['Z-Axis']]];
                 magn_values_tmp = [ [-1] ,  [-1],  [-1]];
                 gyro_values_tmp = [ [-1] ,  [-1],  [-1]]; 
            elif signalType == "magn":
                 acc_values_tmp = [ [-1] ,  [-1],  [-1]]; 
                 magn_values_tmp = [ [line['X-Axis']] ,  [line['Y-Axis']],  [line['Z-Axis']]];
                 gyro_values_tmp = [ [-1] ,  [-1],  [-1]]; 
            else:
                acc_values_tmp = [ [-1] ,  [-1],  [-1]];    
                magn_values_tmp = [ [-1] ,  [-1],  [-1]];
                gyro_values_tmp = [ [line['X-Axis']] ,  [line['Y-Axis']],  [line['Z-Axis']]];      

            if row.size == 0:
                # DataFrame.append was removed in pandas 2.0, so use pd.concat
                all_values = pd.concat([all_values, pd.DataFrame({'ts(ms)': ts_tmp, 
                                    'accX':acc_values_tmp[0],
                                    'accY':acc_values_tmp[1],
                                    'accZ': acc_values_tmp[2], 
                                    'magX':magn_values_tmp[0],
                                    'magY':magn_values_tmp[1], 
                                    'magZ': magn_values_tmp[2], 
                                    'gyrX':gyro_values_tmp[0],
                                    'gyrY': gyro_values_tmp[1],
                                    'gyrZ': gyro_values_tmp[2], 
                                    'userGender': [1], 
                                    'userAge': [1], 
                                    'userID': [1],
                                    'position': line['SensorID'], 
                                    'label':['null'],
                                    'filename':['filename'],
                                    'experiment': ['experiment']})])
                
    return all_values;
###
# the recorded output below (299 rows, position 3, gyroscope values) corresponds to wrist_gyro
signals = pd.DataFrame([{'value': wrist_gyro, 'signalType': 'gyro'}])
all_values = join_all_signals_types(signals)
print("len of all values:", len(all_values));

len of all values: 299


In [131]:
print("len of all values:", len(all_values));
print("len of wrist_gyro:", len(wrist_gyro));
wrist_gyro.describe()

len of all values: 299
len of wrist_gyro: 299


Unnamed: 0,SampleNr,SensorID,SensorType,X-Axis,Y-Axis,Z-Axis,ts(ms)
count,299,299,299,299.0,299.0,299.0,299
unique,299,1,1,275.0,273.0,289.0,299
top,64,3,1,-4.4921875,27.5859375,-38.5234375,8039
freq,1,299,299,3.0,3.0,3.0,1


In [132]:
wrist_gyro.head(5)

Unnamed: 0,SampleNr,SensorID,SensorType,X-Axis,Y-Axis,Z-Axis,ts(ms)
0,200,3,1,-1.7265625,-0.3125,-0.59375,10056
0,201,3,1,-1.4921875,-0.609375,-0.7265625,10087
0,202,3,1,-6.1796875,-0.75,0.609375,10137
0,20,3,1,20.4609375,-72.1484375,-117.0859375,1019
0,203,3,1,-4.375,-0.7734375,-2.7578125,10194


In [133]:
all_values.head(5)

Unnamed: 0,accX,accY,accZ,experiment,filename,gyrX,gyrY,gyrZ,label,magX,magY,magZ,position,ts(ms),userAge,userGender,userID
0,-1,-1,-1,experiment,filename,-1.7265625,-0.3125,-0.59375,,-1,-1,-1,3,10056,1,1,1
0,-1,-1,-1,experiment,filename,-1.4921875,-0.609375,-0.7265625,,-1,-1,-1,3,10087,1,1,1
0,-1,-1,-1,experiment,filename,-6.1796875,-0.75,0.609375,,-1,-1,-1,3,10137,1,1,1
0,-1,-1,-1,experiment,filename,20.4609375,-72.1484375,-117.0859375,,-1,-1,-1,3,1019,1,1,1
0,-1,-1,-1,experiment,filename,-4.375,-0.7734375,-2.7578125,,-1,-1,-1,3,10194,1,1,1


In [134]:
all_values.describe()

Unnamed: 0,accX,accY,accZ,experiment,filename,gyrX,gyrY,gyrZ,label,magX,magY,magZ,position,ts(ms),userAge,userGender,userID
count,299,299,299,299,299,299.0,299.0,299.0,299.0,299,299,299,299,299,299,299,299
unique,1,1,1,1,1,275.0,273.0,289.0,1.0,1,1,1,1,299,1,1,1
top,-1,-1,-1,experiment,filename,-4.4921875,27.5859375,-38.5234375,,-1,-1,-1,3,8039,1,1,1
freq,299,299,299,299,299,3.0,3.0,3.0,299.0,299,299,299,299,1,299,299,299


In [88]:
signals = pd.DataFrame([{'value': [1,2,3], 'id': ['asdas']}, {'value': [1,2,3], 'id': ['asdas']}])
signals.at[0,'value'] = [1,2, 4]
print(signals.loc[0]['value'])

for index, line in signals.iterrows():
    print(line['value'])

[1, 2, 4]
[1, 2, 4]
[1, 2, 3]
