In [6]:
from DQM import DQM_single_file
from load_data import load_data_json_single_file, load_data_csv_one_file

Load a sufficient JSON record and inspect the structure of the loaded data frame. For the structure of the original JSON files, see the test data provided under data/sensor/json in the repo. modules/Sensor/load_data.py provides examples of how to load a JSON or CSV file into this structure.

In [7]:
df = load_data_json_single_file('/home/lin/Documents/CAMH/QA-module/data/sensor/json/SUBJ00001_Accelerometer_REC000000.json')
print(df)

               timestamp         x         y          z
0    1593954057754000000  2.229002  6.114791   5.324705
1    1593954058125000000  2.135628  4.283227   6.833052
2    1593954058506000000 -1.721431  5.561730  10.201694
3    1593954058729000000 -0.220267  5.365406   7.668628
4    1593954058937000000 -0.177171  5.130774   8.346188
..                   ...       ...       ...        ...
440  1593954154639000000 -0.258574  6.212954   7.388506
441  1593954154858000000 -0.301669  6.212954   7.378930
442  1593954155076000000 -0.275333  6.193800   7.407660
443  1593954155276000000 -0.385466  6.205771   7.474698
444  1593954155491000000 -0.385466  6.215348   7.129933

[445 rows x 4 columns]
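
The magnitude of the timestamp values suggests nanoseconds since the Unix epoch. That is an assumption; the notebook does not state the unit. Under that assumption, a minimal sketch using the first five printed timestamps shows how to read the recording start time and the gaps between samples:

```python
from datetime import datetime, timezone

# First five timestamps copied from the printed data frame.
timestamps_ns = [
    1593954057754000000,
    1593954058125000000,
    1593954058506000000,
    1593954058729000000,
    1593954058937000000,
]

# Assumption: the values are nanoseconds since the Unix epoch.
first_sample = datetime.fromtimestamp(timestamps_ns[0] // 10**9, tz=timezone.utc)

# Gaps between consecutive samples in milliseconds (integer arithmetic).
gaps_ms = [(b - a) // 10**6 for a, b in zip(timestamps_ns, timestamps_ns[1:])]

print(first_sample.isoformat())
print(gaps_ms)
```

If the unit assumption holds, the gaps in these first rows are not constant, so the record appears to be irregularly sampled.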


Create the DQM_single_file object, set the input data frame, then compute the DQM.

In [8]:
single_json_file_dqm = DQM_single_file()
single_json_file_dqm.set_input_data(df)
single_json_file_dqm.compute_DQM()

Start computing the DQM... This may take a long time if the dataset is large
The total time for computing the DQM is: 0.2938528060913086 seconds.


Get the fields of the computed data quality matrix.

In [9]:
single_json_file_dqm.get_fields()

['IRLR', 'SNR', 'SCR', 'SRC', 'MDR', 'APD']

Get the computed data quality matrix as a list. Each element represents one metric, in the same order as the fields.

In [10]:
single_json_file_dqm.get_DQM()

['1',
 '-3.096340603605768',
 '1',
 '0.9011216710642995',
 '0.350561797752809',
 '0.02696629213483146']
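
Because get_fields() and get_DQM() return parallel lists and the values are strings, it can be convenient to pair them up and convert to numbers. The following is a minimal sketch that uses the values printed in this notebook as literals rather than calling the DQM object; the name dqm_by_name is only illustrative:

```python
# Values copied from the get_fields() and get_DQM() outputs in this notebook.
fields = ['IRLR', 'SNR', 'SCR', 'SRC', 'MDR', 'APD']
values = [
    '1',
    '-3.096340603605768',
    '1',
    '0.9011216710642995',
    '0.350561797752809',
    '0.02696629213483146',
]

# Pair each field name with its value and convert the strings to floats.
dqm_by_name = {name: float(value) for name, value in zip(fields, values)}

print(dqm_by_name['SNR'])
```

With a live object, the two literals would be replaced by the results of get_fields() and get_DQM().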

Get each metric of the computed DQM individually as a str.

In [11]:
single_json_file_dqm.get_IRLR()

'1'

In [12]:
single_json_file_dqm.get_SNR()

'-3.096340603605768'

In [13]:
single_json_file_dqm.get_SCR()

'1'

In [14]:
single_json_file_dqm.get_SRC()

'0.9011216710642995'

In [15]:
single_json_file_dqm.get_MDR()

'0.350561797752809'

In [16]:
single_json_file_dqm.get_APD()

'0.02696629213483146'

Save the computed DQM as a CSV file at the given path.

In [17]:
single_json_file_dqm.save_to_file('/home/lin/Documents/CAMH/SenseActivity/data/demo_result_single_file.csv')

Data successfuly saved.


Display the saved CSV file.

In [18]:
with open("/home/lin/Documents/CAMH/SenseActivity/data/demo_result_single_file.csv", "r") as DQM_csv:
    for row in DQM_csv:
        print(row)

IRLR,SNR,SCR,SRC,MDR,APD

1,-3.096340603605768,1,0.9011216710642995,0.350561797752809,0.02696629213483146
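
The blank lines in the output appear because each row read from the file keeps its trailing newline and print adds another. As an alternative, the standard csv module can parse the file into a dictionary keyed by metric name. The sketch below writes the two lines shown above to a temporary file, which stands in for the saved CSV, and reads them back:

```python
import csv
import os
import tempfile

# The two lines of the saved file, as displayed in this notebook.
csv_text = (
    "IRLR,SNR,SCR,SRC,MDR,APD\n"
    "1,-3.096340603605768,1,0.9011216710642995,"
    "0.350561797752809,0.02696629213483146\n"
)

# Write them to a temporary file that stands in for the saved CSV.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as tmp:
    tmp.write(csv_text)
    tmp_path = tmp.name

try:
    # DictReader uses the header row as the keys of each record.
    with open(tmp_path, newline="") as dqm_csv:
        rows = list(csv.DictReader(dqm_csv))
finally:
    os.remove(tmp_path)

print(rows[0]["SNR"])
```

To read the real file, open the path passed to save_to_file instead of the temporary file.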



Exclude SNR from the DQM.

In [19]:
single_json_file_dqm.set_SNR(False)

Recompute and get the updated DQM. SNR should no longer be included.

In [20]:
single_json_file_dqm.compute_DQM()

Start computing the DQM... This may take a long time if the dataset is large
The total time for computing the DQM is: 0.29285621643066406 seconds.


In [21]:
single_json_file_dqm.get_fields()

['IRLR', 'SCR', 'SRC', 'MDR', 'APD']

In [22]:
single_json_file_dqm.get_DQM()

['1', '1', '0.9011216559108686', '0.2606741573033708', '0.02696629213483146']

get_SNR should no longer be called: SNR is not among the fields, so the call raises a ValueError.

In [23]:
single_json_file_dqm.get_SNR()

ValueError: 'SNR' is not in list
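
One way to avoid this error is to check get_fields() before calling a metric getter. The sketch below is illustrative only: StubDQM is a hypothetical stand-in that mimics the behaviour observed in this notebook, not the real DQM_single_file implementation (whose internals are not shown here), and snr_or_none is a helper invented for this example.

```python
class StubDQM:
    """Hypothetical stand-in for DQM_single_file, for illustration only."""

    def __init__(self, fields, values):
        self._fields = list(fields)
        self._values = list(values)

    def get_fields(self):
        return list(self._fields)

    def get_SNR(self):
        # list.index raises ValueError when 'SNR' is not in the fields.
        return self._values[self._fields.index('SNR')]


def snr_or_none(dqm):
    """Return the SNR value, or None if SNR was excluded from the DQM."""
    if 'SNR' not in dqm.get_fields():
        return None
    return dqm.get_SNR()


with_snr = StubDQM(['IRLR', 'SNR'], ['1', '-3.096340603605768'])
without_snr = StubDQM(['IRLR', 'SCR'], ['1', '1'])

print(snr_or_none(with_snr))
print(snr_or_none(without_snr))
```

The same membership check should work with a real DQM object, since it relies only on get_fields() and get_SNR() as used in this notebook.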

Note that any metric except IRLR can be excluded in the same way.

For an insufficient record, the DQM has np.NaN for every metric except IRLR, which is 0 to indicate that the input record is insufficient. SNR is still excluded from the previous steps, so the resulting list has five entries.

In [24]:
insufficient_df = load_data_json_single_file('/home/lin/Documents/CAMH/QA-module/data/sensor/json/SUBJ00001_Accelerometer_REC000006.json')
print(insufficient_df)
single_json_file_dqm.set_input_data(insufficient_df)
single_json_file_dqm.compute_DQM()
single_json_file_dqm.get_DQM()

Empty DataFrame
Columns: []
Index: []
Start computing the DQM... This may take a long time if the dataset is large
The total time for computing the DQM is: 2.2649765014648438e-05 seconds.


['0', nan, nan, nan, nan]
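
When processing many records, it can help to detect this case before using the metrics. The sketch below uses the list shown above as a literal; is_sufficient and missing_metrics are illustrative helper names, not part of the DQM module. NaN never compares equal to itself, so the check uses math.isnan rather than ==.

```python
import math

# Fields and values as returned for the insufficient record (SNR excluded).
fields = ['IRLR', 'SCR', 'SRC', 'MDR', 'APD']
values = ['0', float('nan'), float('nan'), float('nan'), float('nan')]


def is_sufficient(fields, values):
    """IRLR is returned as the string '1' for a sufficient record."""
    return values[fields.index('IRLR')] == '1'


def missing_metrics(fields, values):
    """Names of the metrics whose value is NaN."""
    return [
        name
        for name, value in zip(fields, values)
        if isinstance(value, float) and math.isnan(value)
    ]


print(is_sufficient(fields, values))
print(missing_metrics(fields, values))
```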

Now load a sufficient CSV sensor record. As before, see modules/Sensor/load_data.py for the code that loads the CSV into this structure.

In [25]:
df = load_data_csv_one_file('/home/lin/Documents/CAMH/QA-module/data/sensor/csv/id9603e9c3.csv')
print(df)

         time_s   lw_x   lw_y   lw_z   lh_x   lh_y   lh_z   la_x   la_y  \
0          0.01 -0.004  0.945 -0.461  0.055  0.992 -0.199  0.168  1.418   
1          0.02 -0.406 -0.379  1.031 -0.320 -0.102  0.844 -0.422  0.559   
2          0.03 -0.340 -0.918  1.160 -0.152 -0.676  1.160 -0.500  1.004   
3          0.04 -0.207 -1.023  0.344  0.059 -0.797  0.453 -0.363  1.363   
4          0.05 -0.152 -1.098 -0.012  0.133 -0.875  0.133 -0.301  1.582   
...         ...    ...    ...    ...    ...    ...    ...    ...    ...   
111295  1112.96  0.000  0.000  0.000  0.246  0.012  0.949  0.453  0.102   
111296  1112.97  0.000  0.000  0.000  0.246  0.012  0.949  0.449  0.102   
111297  1112.98  0.000  0.000  0.000  0.246  0.012  0.949  0.449  0.102   
111298  1112.99  0.000  0.000  0.000  0.246  0.012  0.949  0.449  0.102   
111299  1113.00  0.000  0.000  0.000  0.246  0.012  0.949  0.453  0.102   

         la_z   ra_x   ra_y   ra_z  
0      -0.070 -0.105  1.000  0.125  
1       0.891 -0.352 -0.0

Compute the DQM

In [26]:
single_csv_file_dqm = DQM_single_file()
single_csv_file_dqm.set_input_data(df)
single_csv_file_dqm.compute_DQM()


Start computing the DQM... This may take a long time if the dataset is large
The total time for computing the DQM is: 159.89496684074402 seconds.


Output the results

In [27]:
single_csv_file_dqm.get_fields()

['IRLR', 'SNR', 'SCR', 'SRC', 'MDR', 'APD']

In [28]:
single_csv_file_dqm.get_DQM()

['1',
 '-3.436961042075442',
 '1',
 '0.9999999999986589',
 '0.17067385444743935',
 '0.005390835579514825']