# Datalogger and Format Handler
The datalogger format handler provides scripts for working with the datalogger dataset format: converting binary logs to CSV, handling meta data, and extracting labeling information. It offers two classes: <code>class GLP_Decoder()</code> for decoding the binary format and <code>class DatasetFormatHandler()</code> for processing the logged data, meta data and labeling information.

Author: Sergej Scheiermann (BST/ESW4)  
Date: 2019-10-19  

## Include modules

In [12]:
import sys
import os
import re
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import json
import datetime

import DataloggerFormatHandler as dlf

## Include Datasets for Processing

In [13]:
#############################################################################
datasetList = []
datasetList.append('tutorial\\')

print("[INFO] following datasets included:")
for dataset in datasetList:
    print("\t%s"%(os.path.abspath(dataset)))

[INFO] following datasets included:
	C:\Users\ina8cob\scripts\tutorial_datalogger\tutorial


## Binary Files Conversion to CSV-Format (only if needed)

In [14]:
DO_CONVERSION = True # set to False if binary conversion is not required (e.g. files were already converted)

# check if conversion needed
if DO_CONVERSION:
    hndlr = dlf.DatasetFormatHandler()
    print('[INFO] start conversion job.')
    #### decode all binaries
    for path in datasetList:
        hndlr.convert_bin2csv(path)
    print('[INFO] conversion job done. :-)')
else:
    print('[INFO] conversion job deactivated.')

[INFO] start conversion job.
[INFO]		converting file: C:\Users\ina8cob\scripts\tutorial_datalogger\tutorial\01_data\bin\LOG_AFD0ED_1603887725768.bin
[INFO]		converting file: C:\Users\ina8cob\scripts\tutorial_datalogger\tutorial\01_data\bin\LOG_AFD0ED_1603887786067.bin
[INFO] conversion job done. :-)
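
Whether the conversion is needed can also be derived from the folder contents. The sketch below lists the binary logs in `01_data/bin` that have no CSV file with the same base name in `01_data/csv`. It assumes only the folder names visible in the paths of this tutorial; `pending_conversions` is a hypothetical helper, not part of `DataloggerFormatHandler`, and the demo runs on a temporary folder rather than on the tutorial dataset.

```python
import tempfile
from pathlib import Path


def pending_conversions(dataset_dir):
    """Return names of .bin logs without a .csv of the same stem (hypothetical helper)."""
    data_dir = Path(dataset_dir) / "01_data"
    csv_stems = {p.stem for p in (data_dir / "csv").glob("*.csv")}
    return sorted(
        p.name for p in (data_dir / "bin").glob("*.bin") if p.stem not in csv_stems
    )


# Demo on a throw-away dataset folder: two binary logs, one already converted.
with tempfile.TemporaryDirectory() as tmp:
    bin_dir = Path(tmp) / "01_data" / "bin"
    csv_dir = Path(tmp) / "01_data" / "csv"
    bin_dir.mkdir(parents=True)
    csv_dir.mkdir(parents=True)
    (bin_dir / "LOG_AFD0ED_1603887725768.bin").touch()
    (bin_dir / "LOG_AFD0ED_1603887786067.bin").touch()
    (csv_dir / "LOG_AFD0ED_1603887725768.csv").touch()

    pending = pending_conversions(tmp)
    print(pending)
```

For a single dataset, `DO_CONVERSION = bool(pending_conversions(path))` would then skip the conversion job once every binary log has a CSV counterpart.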


# Datalogger Dataset Structure
The data format handler implements generic interfaces to the datalogger format and its meta information, and helps to extract labels, configurations and data from the dataset folders.

The dataset contains a meta folder (***00_meta***) and a data folder (***01_data***).
After all conversions, every dataset follows the structure listed below:
- \yourdataset
    - \00_meta
        - datasetInfo.json
        - userInfo.txt
        - fileInfo.txt
        - activity.txt
        - deviceConfig.json
        - algoResults.txt
    - \01_data
        - \bin
            - LOG_DEVID_UNIXTIMESTAMP_YOURLOGFILE1.bin
            - LOG_DEVID_UNIXTIMESTAMP_YOURLOGFILE2.bin
            - ...
        - \csv
            - LOG_DEVID_UNIXTIMESTAMP_YOURLOGFILE1.csv
            - LOG_DEVID_UNIXTIMESTAMP_YOURLOGFILE2.csv
            - ...
        - \label
            - LABEL_LOG_UNIXTIMESTAMP_YOURLOGFILE1.csv
            - LABEL_LOG_UNIXTIMESTAMP_YOURLOGFILE2.csv
            - ...            
        - \plots
            - LOG_DEVID_UNIXTIMESTAMP_YOURLOGFILE1.png
            - LOG_DEVID_UNIXTIMESTAMP_YOURLOGFILE2.png
            - ...
    - \99_configs
        - activities.json
        - device_location.json
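
The layout above can be checked before a dataset is processed. This is a minimal sketch that assumes only the folder and file names from the listing; `missing_entries` is a hypothetical helper and not part of `DataloggerFormatHandler`. Not every dataset necessarily has `label` or `plots` folders, so a reported entry is a hint rather than an error.

```python
import tempfile
from pathlib import Path

# Expected entries, following the structure listed above.
EXPECTED_META = ["datasetInfo.json", "userInfo.txt", "fileInfo.txt",
                 "activity.txt", "deviceConfig.json", "algoResults.txt"]
EXPECTED_DATA_DIRS = ["bin", "csv", "label", "plots"]


def missing_entries(dataset_dir):
    """Return expected paths (relative to the dataset folder) that do not exist."""
    root = Path(dataset_dir)
    meta_dir = root / "00_meta"
    # Compare case-insensitively so that userinfo.txt and userInfo.txt are treated alike.
    present = {p.name.lower() for p in meta_dir.iterdir()} if meta_dir.is_dir() else set()
    missing = [f"00_meta/{name}" for name in EXPECTED_META if name.lower() not in present]
    missing += [f"01_data/{name}" for name in EXPECTED_DATA_DIRS
                if not (root / "01_data" / name).is_dir()]
    return missing


# Demo on a throw-away folder that lacks algoResults.txt, label and plots.
with tempfile.TemporaryDirectory() as tmp:
    meta = Path(tmp) / "00_meta"
    meta.mkdir()
    for name in ["datasetInfo.json", "userInfo.txt", "fileInfo.txt",
                 "activity.txt", "deviceConfig.json"]:
        (meta / name).touch()
    for name in ["bin", "csv"]:
        (Path(tmp) / "01_data" / name).mkdir(parents=True)

    missing = missing_entries(tmp)
    print(missing)
```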


## Meta Information
The BST datalogger application (Android APK: BST Datalogger) creates meta information for every binary file it logs.<br>
The following meta files are written:
* **datasetInfo.json** - contains meta information about the dataset such as date of creation, name and comments.
* **userInfo.txt** - contains meta information about the user such as age, gender, name, location etc.
* **fileInfo.txt** - contains meta information about the logged files such as start time, stop time, device id and others.
* **activity.txt** - contains meta information about file, activity and comments.
* **deviceConfig.json** - contains meta information about device configurations and logged sensors such as device id, device name, sensor type, name, measurement range etc.
* **algoResults.txt** - contains meta information about file, algorithm results and comments (**currently not used**).

***example dataset: meta structure***

In [15]:
for datasetDir in datasetList:
    hndlr = dlf.DatasetFormatHandler()
    hndlr.list_metafiles(datasetDir)

[INFO]: meta-files in directory C:\Users\ina8cob\scripts\tutorial_datalogger\tutorial\00_meta
	C:\Users\ina8cob\scripts\tutorial_datalogger\tutorial\00_meta\activity.txt
	C:\Users\ina8cob\scripts\tutorial_datalogger\tutorial\00_meta\algoResults.txt
	C:\Users\ina8cob\scripts\tutorial_datalogger\tutorial\00_meta\datasetInfo.json
	C:\Users\ina8cob\scripts\tutorial_datalogger\tutorial\00_meta\deviceConfig.json
	C:\Users\ina8cob\scripts\tutorial_datalogger\tutorial\00_meta\fileInfo.txt
	C:\Users\ina8cob\scripts\tutorial_datalogger\tutorial\00_meta\userInfo.txt


### Dataset Meta Information
Dataset meta information contains: name of the dataset, date of creation, and description of the dataset given at time of creation by user/tester.<br>

***code example for dataset meta extraction***

In [16]:
for datasetDir in datasetList:
    hndlr = dlf.DatasetFormatHandler()
    hndlr.get_datasetinfo(datasetDir)
    display(hndlr.datasetinfo)

{'name': 'Test',
 'description': 'newTimestamp',
 'date_of_creation': '2020-10-28 17:50:57.941'}
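
For a quick look without the handler, the same information can be read with the standard library. This is a sketch under the assumption that `datasetInfo.json` stores the three keys exactly as displayed above (the handler may rename or reformat them). The file is recreated in a temporary folder here instead of being read from the tutorial dataset.

```python
import json
import tempfile
from datetime import datetime
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    info_path = Path(tmp) / "datasetInfo.json"
    # Stand-in file holding the values displayed above (assumed on-disk layout).
    info_path.write_text(json.dumps({
        "name": "Test",
        "description": "newTimestamp",
        "date_of_creation": "2020-10-28 17:50:57.941",
    }), encoding="utf-8")

    with info_path.open(encoding="utf-8") as fh:
        dataset_info = json.load(fh)

# Parse the creation date; %f accepts the millisecond fraction.
created = datetime.strptime(dataset_info["date_of_creation"], "%Y-%m-%d %H:%M:%S.%f")
print(dataset_info["name"], created.isoformat())
```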

### User Meta Information

In [17]:
for datasetDir in datasetList:
    hndlr = dlf.DatasetFormatHandler()
    hndlr.get_userinfo(datasetDir)
    display(hndlr.userinfo)

Unnamed: 0,user_id,name,age,gender,height,weight,country,nationality,experience
0,3,Karthick,29,Male,173,63,Cob,IND,6


### Files Meta Information
'fileInfo.txt' contains information about the logged files in the dataset. For every logged file, the following information is stored:
- file_name --> name of the file logged on the device
- file_id --> contains the file_id / run id (same for the )

***code example on the example dataset***

In [18]:
for datasetDir in datasetList:
    hndlr = dlf.DatasetFormatHandler()
    hndlr.get_fileinfo(datasetDir)
    display(hndlr.fileinfo)

Unnamed: 0,file_name,file_id,dev_id,comments,start_time,stop_time,user_id
0,LOG_AFD0ED_1603887725768.bin,1603887725768,D0:53:08:AF:D0:ED,walk,2020-10-28 17:52:05.768,2020-10-28 17:52:58.131,3
1,LOG_AFD0ED_1603887786067.bin,1603887786067,D0:53:08:AF:D0:ED,test2,2020-10-28 17:53:06.067,2020-10-28 17:53:51.399,3
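
In the table above, `file_id` equals the numeric part of `file_name`, which follows the `LOG_DEVID_UNIXTIMESTAMP` naming scheme. The sketch below splits such a name and converts the timestamp to a UTC datetime. The regular expression, the assumption that the device part is hexadecimal, the reading of the timestamp as Unix milliseconds, and the helper name `parse_log_name` are illustrative choices, not part of `DataloggerFormatHandler`.

```python
import re
from datetime import datetime, timedelta, timezone

# LOG_<device>_<unix time in ms>[_<optional suffix>].<bin|csv|png>
LOG_NAME = re.compile(
    r"^LOG_(?P<dev>[0-9A-Fa-f]+)_(?P<ts>\d+)(?:_.*)?\.(?:bin|csv|png)$"
)


def parse_log_name(file_name):
    """Split a log file name into device part, file id and UTC start time."""
    match = LOG_NAME.match(file_name)
    if match is None:
        raise ValueError(f"unexpected log file name: {file_name!r}")
    ts_ms = int(match["ts"])
    # Integer arithmetic keeps the millisecond part exact.
    start_utc = (datetime.fromtimestamp(ts_ms // 1000, tz=timezone.utc)
                 + timedelta(milliseconds=ts_ms % 1000))
    return {"dev": match["dev"], "file_id": ts_ms, "start_utc": start_utc}


parsed = parse_log_name("LOG_AFD0ED_1603887725768.bin")
print(parsed["dev"], parsed["file_id"], parsed["start_utc"].isoformat())
```

The resulting UTC time is 12:22:05.768, while `start_time` in the table reads 17:52:05.768. This suggests that `start_time` is recorded in the local time of the logging device, which is worth keeping in mind when aligning files from different sources.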


### Device Configuration Meta Information
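Device configurations describe the logging device and its sensors. Each entry holds the device fields (`dev_id`, `dev_loc`, `dev_fw`, `dev_name`, `type`) and a `sensors` list with one dict per logged sensor (`sensor_type`, `name`, `range`, `bw`, `odr`, `unit`, `samplingfreq`).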


In [19]:
for datasetDir in datasetList:
    hndlr = dlf.DatasetFormatHandler()
    hndlr.get_device_configs(datasetDir)

    print("keys in the device dict and sensor dict")
    print(hndlr.deviceconfigs[0].keys()) # show keys only device config for first file
    print(hndlr.deviceconfigs[0]["sensors"][0].keys()) # show keys in sensors config for first file

    print('\nfull device config dict content: ')
    display(hndlr.deviceconfigs[0])

keys in the device dict and sensor dict
dict_keys(['file_name', 'file_id', 'dev_id', 'dev_loc', 'dev_fw', 'dev_name', 'type', 'sensors'])
dict_keys(['dev_id', 'sensor_type', 'name', 'range', 'bw', 'odr', 'unit', 'samplingfreq'])

full device config dict content: 


{'file_name': 'LOG_AFD0ED_1603887725768.bin',
 'file_id': '1603887725768',
 'dev_id': 'D0:53:08:AF:D0:ED',
 'dev_loc': 'LEFT',
 'dev_fw': 'V1.3.0',
 'dev_name': 'BST-DataLogger',
 'type': 'DEVICE',
 'sensors': [{'dev_id': 'D0:53:08:AF:D0:ED',
   'sensor_type': 'PRESSURE',
   'name': 'BMP390',
   'range': None,
   'bw': None,
   'odr': 'SENSOR_ENV_ODR_25_HZ',
   'unit': 'Physical units',
   'samplingfreq': '25'},
  {'dev_id': 'D0:53:08:AF:D0:ED',
   'sensor_type': 'ACCEL',
   'name': 'BMI270_ACCEL',
   'range': 'SENSOR_ACCEL_RANGE_8G',
   'bw': 'SENSOR_ACCEL_BW_NORMAL_AVG4',
   'odr': 'SENSOR_ODR_100HZ',
   'unit': 'LSB',
   'samplingfreq': '100'},
  {'dev_id': 'D0:53:08:AF:D0:ED',
   'sensor_type': 'GYRO',
   'name': 'BMI270_GYRO',
   'range': 'SENSOR_GYRO_RANGE_2000_DPS',
   'bw': 'SENSOR_GYRO_BW_NORMAL_MODE',
   'odr': 'SENSOR_ODR_100HZ',
   'unit': 'LSB',
   'samplingfreq': '100'},
  {'dev_id': 'D0:53:08:AF:D0:ED',
   'sensor_type': 'MAG',
   'name': 'BMM150',
   'range': None,
   '
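
The sensor entries can be reduced to the fields needed for later processing. The sketch below works on an excerpt of the output above (the three sensors that are shown completely) and does not call the handler. Reading `samplingfreq` as a value in Hz is an assumption based on the matching `odr` names.

```python
# Excerpt of the device config displayed above (first three sensors only).
device_config = {
    "file_name": "LOG_AFD0ED_1603887725768.bin",
    "sensors": [
        {"sensor_type": "PRESSURE", "name": "BMP390",
         "unit": "Physical units", "samplingfreq": "25"},
        {"sensor_type": "ACCEL", "name": "BMI270_ACCEL",
         "unit": "LSB", "samplingfreq": "100"},
        {"sensor_type": "GYRO", "name": "BMI270_GYRO",
         "unit": "LSB", "samplingfreq": "100"},
    ],
}

# samplingfreq is stored as a string, so convert it before doing arithmetic.
sampling_hz = {s["name"]: int(s["samplingfreq"]) for s in device_config["sensors"]}

# Sensors logged in raw LSB need a scale factor before they can be interpreted.
raw_sensors = [s["name"] for s in device_config["sensors"] if s["unit"] == "LSB"]

print(sampling_hz)
print(raw_sensors)
```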

### Activity Meta Information

In [20]:
for datasetDir in datasetList:
    hndlr = dlf.DatasetFormatHandler()
    hndlr.get_activityinfo(datasetDir)
    display(hndlr.activityinfo)

Unnamed: 0,file_name,file_id,activity,start_time,stop_time,comments
0,LOG_AFD0ED_1603887725768.bin,1603887725768,STANDING_STILL,2020-10-28 17:52:05.768,2020-10-28 17:52:58.131,walk
1,LOG_AFD0ED_1603887786067.bin,1603887786067,STANDING_STILL,2020-10-28 17:53:06.067,2020-10-28 17:53:51.399,test2
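
The `start_time` and `stop_time` columns give the duration of each logged activity. The sketch below computes it with the standard library from the two rows shown above. The rows are copied by hand for illustration; in the notebook they would come from `hndlr.activityinfo`.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"

# (file_name, start_time, stop_time) as displayed above.
rows = [
    ("LOG_AFD0ED_1603887725768.bin", "2020-10-28 17:52:05.768", "2020-10-28 17:52:58.131"),
    ("LOG_AFD0ED_1603887786067.bin", "2020-10-28 17:53:06.067", "2020-10-28 17:53:51.399"),
]

durations = {}
for file_name, start, stop in rows:
    delta = datetime.strptime(stop, TIME_FORMAT) - datetime.strptime(start, TIME_FORMAT)
    durations[file_name] = delta.total_seconds()

for file_name, seconds in durations.items():
    print(f"{file_name}: {seconds:.3f} s")
```

Multiplying a duration by a sensor's sampling frequency gives a rough expected sample count, which can serve as a plausibility check for the converted CSV file.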


## Datalogger Format
### binary-format
see GLP_Decoder()
### csv-format

    

In [21]:
for datasetDir in datasetList:
    dataHandler = dlf.DatasetFormatHandler()
    dataHandler.get_csvdata_files(datasetDir)
    # read first file from csv list
    data_df = dataHandler.read_csvfile(dataHandler.csvDataList["file_fullpath"][0])
    # display first 2 lines in csv-data file
    display(data_df.head(2))

Unnamed: 0,timestamp[s],packetcount,bmi270ax[lsb],bmi270ay[lsb],bmi270az[lsb],bmi270gx[lsb],bmi270gy[lsb],bmi270gz[lsb],bmi270t[degC],bmm150mx[lsb],bmm150my[lsb],bmm150mz[lsb],TMG_Prox[lsb],bmp390p[hPa],bmp390t[degC],UnixTime[ms]
0,0.01,0.0,-690.0,2507.0,3165.0,-3.0,30.0,-1.0,29.931,327.0,435.0,689.0,11136.0,961.906494,35.900002,1603888000000.0
1,0.02,1.0,-689.0,2514.0,3163.0,6.0,29.0,-1.0,29.922001,327.0,435.0,689.0,11136.0,961.906494,35.900002,1603888000000.0
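
The accelerometer columns are logged in LSB. The sketch below converts the first sample to g, assuming 16-bit signed samples so that the configured range of ±8 g (`SENSOR_ACCEL_RANGE_8G` in the device config) spans 32768 LSB. This scale factor is an assumption made here for illustration and is not read from `DataloggerFormatHandler`.

```python
import math

ACCEL_RANGE_G = 8                   # from SENSOR_ACCEL_RANGE_8G
LSB_PER_G = 32768 / ACCEL_RANGE_G   # 4096.0, assumes 16-bit signed output

# bmi270ax/ay/az of the first CSV row shown above.
accel_lsb = (-690.0, 2507.0, 3165.0)

accel_g = tuple(value / LSB_PER_G for value in accel_lsb)
magnitude_g = math.sqrt(sum(value * value for value in accel_g))

print(accel_g)
print(magnitude_g)
```

The magnitude comes out very close to 1 g, which is what a device at rest should measure and therefore supports the assumed scale factor.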


# Hands-On: Using Datalogger Format Handler
Example code for reading the dataset data and plotting the information including labels.

Processing steps:
1. list the meta files, data files and label files from the repository (optional, commented out below)
2. read the meta files (file, activity, user and device configuration information)
3. collect the data files and label files from the repository
4. process the files and plot the logged sensor data and labels


In [22]:
plt.rcParams['figure.figsize'] = (12, 14) # plot size parameters

# evaluate repositories and extract data, labels and visualize them
for datasetDir in datasetList:
    print("\n[INFO] dataset repo \t %s"%(os.path.abspath(datasetDir)))
    dataHandler = dlf.DatasetFormatHandler()
    # 1. list the meta files, data files and label files from the repository
    #dataHandler.list_metafiles(datasetDir)
    #dataHandler.list_datacsvfiles(datasetDir)
    #dataHandler.list_labelsfiles(datasetDir)

    # 2. read the meta files
    dataHandler.get_fileinfo(datasetDir)
    dataHandler.get_activityinfo(datasetDir)
    dataHandler.get_userinfo(datasetDir)
    dataHandler.get_device_configs(datasetDir)
    
    # 3. collect the data files and label files from the repository
    dataHandler.get_csvdata_files(datasetDir)
    dataHandler.get_bindata_files(datasetDir)
    dataHandler.get_labels_files(datasetDir)

    # 4. do process the files and plot data and labels
    for file_idx, fileName in enumerate(dataHandler.csvDataList["file_fullpath"]):
        print("\n[INFO] processing file: %s"%(fileName))
        
        # read data from csv file
        data_df = dataHandler.read_csvfile(fileName)
        data_filename = dataHandler.csvDataList["file_name"][file_idx]
        
        for labelName in dataHandler.labelDataList["file_name"]:
            
            # get index of the file to be processed
            label_idx = dataHandler.labelDataList["file_name"].index(labelName)
            label_filename = dataHandler.labelDataList["file_name"][label_idx]
            
            if data_filename in label_filename:
                # read labels
                label_df = dataHandler.read_labelsfile(dataHandler.labelDataList["file_fullpath"][label_idx])
                labels_in_file = dataHandler.get_labelnames(label_df)

                # build one label signal per label id found in the label file
                labelsList = []
                t_start = dataHandler.fileinfo["start_time"][label_idx]
                t_vec = data_df["timestamp[s]"]*dataHandler.TIME_DT
                # extract labels with same id
                for iii in range(len(labels_in_file["label_id"])):
                    same_labels = dataHandler.extract_labels_with_sameid(label_df, labels_in_file["label_id"][iii])
                    label_sig = dataHandler.create_labelsignals(t_vec,t_start,same_labels)
                    labelsList.append(label_sig)
        
        # read sensor configs for the logged file
        sensors_cfg = dataHandler.config_get_sensors(dataHandler.binDataList["file_name"][file_idx])
        
        # plot sensor data
        plot_idxL = len(sensors_cfg)+1
        plot_idx = 1
        
        plt.figure()

        for sensor in sensors_cfg:
            sensordata = dataHandler.get_sensordata(data_df,sensor["name"])
            senscfg = dataHandler.get_sensor_configs(sensor)            
            plt.subplot(plot_idxL,1,plot_idx)
            dataHandler.plot_data(sensordata,senscfg["name"], dataHandler.TIME_DT, senscfg["lsb2unit"], title=dataHandler.csvDataList["file_name"][file_idx])
            plot_idx += 1
        
        for labelName in dataHandler.labelDataList["file_name"]:
            
            # get index of the file to be processed
            label_idx = dataHandler.labelDataList["file_name"].index(labelName)
            label_filename = dataHandler.labelDataList["file_name"][label_idx]
            
            if data_filename in label_filename:
                # plot labels
                plt.subplot(plot_idxL,1,plot_idx)
                plt.title('labels '+dataHandler.csvDataList["file_name"][file_idx])
                for idx in range(len(labelsList)):
                    plt.plot(labelsList[idx]["time"],labelsList[idx]["label_id"],label=labels_in_file["label_name"][idx])
                plt.xlabel('time (s)')
                plt.ylabel('label_id')
                plt.legend()
                plt.grid()
        
        plt.tight_layout()
        plt.show()


[INFO] dataset repo 	 C:\Users\ina8cob\scripts\tutorial_datalogger\tutorial

[INFO] processing file: C:\Users\ina8cob\scripts\tutorial_datalogger\tutorial\01_data\csv\LOG_AFD0ED_1603887725768.csv


KeyError: "['timestamp(s)'] not in index"

<Figure size 864x1008 with 0 Axes>
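
The `KeyError` reports a lookup of the column `timestamp(s)`, while the CSV header shown in the csv-format section names the column `timestamp[s]`. The output does not show which call performs this lookup, so the sketch below only illustrates one possible workaround: resolving a column name regardless of bracket style and case. `find_column` is a hypothetical helper and not part of `DataloggerFormatHandler`.

```python
# Translation table that maps round brackets to square brackets.
_BRACKETS = str.maketrans("()", "[]")


def normalize(name):
    """Lower-case a column name and unify its bracket style."""
    return name.strip().lower().translate(_BRACKETS)


def find_column(columns, wanted):
    """Return the actual column name that matches `wanted` after normalization."""
    lookup = {normalize(col): col for col in columns}
    try:
        return lookup[normalize(wanted)]
    except KeyError:
        raise KeyError(f"{wanted!r} not found in {list(columns)}") from None


# Subset of the CSV header shown in the csv-format section.
columns = ["timestamp[s]", "packetcount", "bmi270ax[lsb]", "UnixTime[ms]"]
resolved = find_column(columns, "timestamp(s)")
print(resolved)
```

With a DataFrame, `data_df[find_column(data_df.columns, "timestamp(s)")]` would then select the column under either spelling.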