In [1]:
from DQM import DQM_multiple_file
from load_data import *

Load a list of JSON records and show the structure of the loaded data. For the structure of the original JSON files, see the test data under data/sensor/json in the repo. modules/Sensor/load_data.py provides examples of how to load a JSON or CSV file into this structure.

In [3]:
df_list = load_data_json_multi_file('/home/lin/Documents/CAMH/QA-module/data/sensor/json')
print(df_list)

[                timestamp         x         y         z
0     1593963206555000000  2.944868  9.244013 -1.522712
1     1593963206769000000  3.038242  9.196129 -1.829170
2     1593963206973000000  3.059790  9.200917 -1.587356
3     1593963207179000000  3.040636  9.203311 -1.539472
4     1593963207385000000  3.014300  9.208099 -1.553837
...                   ...       ...       ...       ...
1062  1593963608699000000 -0.543484  3.928885  8.767567
1063  1593963611106000000 -0.454898  3.940856  8.834604
1064  1593963611310000000 -0.562637  3.653552  8.968679
1065  1593963611518000000 -0.505177  3.612850  8.923190
1066  1593963611733000000 -0.445322  3.598485  8.894460

[1067 rows x 4 columns],                 timestamp         x         y         z
0     1593965108713000000 -0.418985  8.166622 -5.542577
1     1593965108914000000 -0.469264  8.145074 -5.604826
2     1593965109115000000 -0.490811  8.154651 -5.590461
3     1593965109335000000 -0.502782  8.183381 -5.499481
4     159396510953900
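
The timestamp values printed above look like Unix epoch times in nanoseconds. That is an assumption read off the magnitude of the numbers, not something the loader documents. Under that assumption, a minimal sketch that converts the first five timestamps to UTC and measures the gaps between samples (ns_to_utc is a hypothetical helper, not part of load_data.py):

```python
from datetime import datetime, timezone

# First five timestamps from the first data frame printed above.
# ASSUMPTION: values are nanoseconds since the Unix epoch.
timestamps_ns = [
    1593963206555000000,
    1593963206769000000,
    1593963206973000000,
    1593963207179000000,
    1593963207385000000,
]


def ns_to_utc(ts_ns):
    """Convert an epoch timestamp in nanoseconds to a UTC datetime."""
    seconds, remainder_ns = divmod(ts_ns, 1_000_000_000)
    dt = datetime.fromtimestamp(seconds, tz=timezone.utc)
    # datetime only stores microseconds, so finer precision is dropped.
    return dt.replace(microsecond=remainder_ns // 1000)


# Gaps between consecutive samples, in milliseconds.
intervals_ms = [
    (later - earlier) // 1_000_000
    for earlier, later in zip(timestamps_ns, timestamps_ns[1:])
]

print(ns_to_utc(timestamps_ns[0]).isoformat())  # 2020-07-05T15:33:26.555000+00:00
print(intervals_ms)  # [214, 204, 206, 206]
```

If the assumption holds, gaps of roughly 200 ms would correspond to a sampling rate near 5 Hz for this record.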

Create the DQM_multiple_file object, set the list of data frames as its input, then compute the DQM.

In [5]:
multi_json_file_dqm = DQM_multiple_file()
multi_json_file_dqm.set_input_data(df_list)
multi_json_file_dqm.compute_DQM()

Start computing the DQM... This may take a long time if the dataset is large
The total time for computing the DQM is: 16.248345851898193 seconds.


Get the fields of the computed data quality matrix (DQM).

In [6]:
multi_json_file_dqm.get_fields()

['IRLR', 'SNR', 'SCR', 'RLC', 'VRC', 'SRC', 'MDR', 'APD']

Get the computed data quality matrix as a list. Each element in the list represents one metric, in the same order as the fields.

In [7]:
multi_json_file_dqm.get_DQM()

['0.9411764705882353',
 '-2.2764849187263625',
 '1.0',
 '0.6111766701110624',
 '0.6713937097996874',
 '0.4218179319873303',
 '0.16701263642758973',
 '0.014823767816290018']
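
Since get_fields() and get_DQM() return parallel lists, and the values are strings, it can be convenient to pair them into a dictionary of floats. A minimal sketch using the two outputs above, copied in as literals so the example runs without the DQM module. The variable names are illustrative only.

```python
# Field names and values copied from the get_fields() and get_DQM() outputs above.
fields = ['IRLR', 'SNR', 'SCR', 'RLC', 'VRC', 'SRC', 'MDR', 'APD']
values = [
    '0.9411764705882353',
    '-2.2764849187263625',
    '1.0',
    '0.6111766701110624',
    '0.6713937097996874',
    '0.4218179319873303',
    '0.16701263642758973',
    '0.014823767816290018',
]

# zip() silently truncates on a length mismatch, so check explicitly first.
if len(fields) != len(values):
    raise ValueError("fields and values must have the same length")

# Map each metric name to its numeric value.
dqm = {name: float(value) for name, value in zip(fields, values)}

print(dqm['SNR'])  # -2.2764849187263625
```

With the real object, the same pairing would be dict(zip(multi_json_file_dqm.get_fields(), multi_json_file_dqm.get_DQM())) followed by the float conversion.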

Get each individual metric of the computed DQM as a str.

In [8]:
multi_json_file_dqm.get_IRLR()

'0.9411764705882353'

In [9]:
multi_json_file_dqm.get_SNR()

'-2.2764849187263625'

In [10]:
multi_json_file_dqm.get_SCR()

'1.0'

In [11]:
multi_json_file_dqm.get_SRC()

'0.4218179319873303'

In [12]:
multi_json_file_dqm.get_MDR()

'0.16701263642758973'

In [13]:
multi_json_file_dqm.get_APD()

'0.014823767816290018'

In [15]:
multi_json_file_dqm.get_VRC()

'0.6713937097996874'

In [16]:
multi_json_file_dqm.get_RLC()

'0.6111766701110624'

Save the computed DQM as a CSV file at the given path.

In [17]:
multi_json_file_dqm.save_to_file('/home/lin/Documents/CAMH/SenseActivity/data/demo_result_multi_file.csv')

Data successfuly saved.


Display the saved CSV file.

In [18]:
with open("/home/lin/Documents/CAMH/SenseActivity/data/demo_result_multi_file.csv", "r") as DQM_csv:
    for row in DQM_csv:
        print(row)

IRLR,SNR,SCR,RLC,VRC,SRC,MDR,APD

0.9411764705882353,-2.2764849187263625,1.0,0.6111766701110624,0.6713937097996874,0.4218179319873303,0.16701263642758973,0.014823767816290018
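
Each line read from the file already ends with a newline, so print(row) adds a second one, which is why blank lines appear between the rows above. A minimal sketch of parsing the same content with csv.DictReader instead. An in-memory io.StringIO stands in for the saved file so the example runs on its own, and read_dqm_rows is a hypothetical helper, not part of the DQM module.

```python
import csv
import io

# Same content as the saved file shown above.
csv_text = (
    "IRLR,SNR,SCR,RLC,VRC,SRC,MDR,APD\n"
    "0.9411764705882353,-2.2764849187263625,1.0,0.6111766701110624,"
    "0.6713937097996874,0.4218179319873303,0.16701263642758973,"
    "0.014823767816290018\n"
)


def read_dqm_rows(handle):
    """Parse a DQM CSV from an open text handle into a list of {metric: float} dicts."""
    reader = csv.DictReader(handle)
    return [
        {name: float(value) for name, value in row.items()}
        for row in reader
    ]


rows = read_dqm_rows(io.StringIO(csv_text))
print(rows[0]["IRLR"])  # 0.9411764705882353
```

To read the real file, pass open(path, newline="") in place of the StringIO object.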



Exclude SNR from the DQM.

In [19]:
multi_json_file_dqm.set_SNR(False)

Recompute the DQM and get the updated result. SNR should no longer be included.

In [20]:
multi_json_file_dqm.compute_DQM()

Start computing the DQM... This may take a long time if the dataset is large
The total time for computing the DQM is: 17.030556201934814 seconds.


In [21]:
multi_json_file_dqm.get_fields()

['IRLR', 'SCR', 'RLC', 'VRC', 'SRC', 'MDR', 'APD']

In [22]:
multi_json_file_dqm.get_DQM()

['1.0',
 '1.0',
 '0.6111766701110624',
 '0.6657626357073461',
 '0.42181792959974745',
 '0.1670126366186453',
 '0.01748431277503671']
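
To see which metrics differ between the two runs, the result lists can be aligned by field name rather than by position, since dropping SNR shifts the indices. A small sketch using the outputs of both runs, copied in as literals. This only illustrates the comparison and says nothing about why the values differ.

```python
# First run (SNR included), copied from the earlier get_fields()/get_DQM() outputs.
fields_with_snr = ['IRLR', 'SNR', 'SCR', 'RLC', 'VRC', 'SRC', 'MDR', 'APD']
values_with_snr = [
    '0.9411764705882353', '-2.2764849187263625', '1.0', '0.6111766701110624',
    '0.6713937097996874', '0.4218179319873303', '0.16701263642758973',
    '0.014823767816290018',
]

# Second run (SNR excluded), copied from the outputs directly above.
fields_without_snr = ['IRLR', 'SCR', 'RLC', 'VRC', 'SRC', 'MDR', 'APD']
values_without_snr = [
    '1.0', '1.0', '0.6111766701110624', '0.6657626357073461',
    '0.42181792959974745', '0.1670126366186453', '0.01748431277503671',
]

first = dict(zip(fields_with_snr, map(float, values_with_snr)))
second = dict(zip(fields_without_snr, map(float, values_without_snr)))

# Difference (second run minus first run) for every metric present in both runs.
changes = {name: second[name] - first[name] for name in second if name in first}

# Names of the metrics whose value is not identical across the two runs.
changed = [name for name, delta in changes.items() if delta != 0.0]
print(changed)  # ['IRLR', 'VRC', 'SRC', 'MDR', 'APD']
```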

get_SNR should no longer be called: SNR is excluded, so the call raises a ValueError.

In [23]:
multi_json_file_dqm.get_SNR()

ValueError: 'SNR' is not in list
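
The message above has the same form as the error Python's list.index raises for a missing value, which suggests (an assumption, not confirmed by the source) that the getters look the metric up by name in the field list. A defensive sketch that checks the field list first and returns a default for an excluded metric. get_metric is a hypothetical helper, not part of the DQM module, and the lists are copied from the outputs above.

```python
# Fields and values after SNR was excluded, copied from the outputs above.
fields = ['IRLR', 'SCR', 'RLC', 'VRC', 'SRC', 'MDR', 'APD']
values = [
    '1.0', '1.0', '0.6111766701110624', '0.6657626357073461',
    '0.42181792959974745', '0.1670126366186453', '0.01748431277503671',
]


def get_metric(fields, values, name, default=None):
    """Hypothetical helper: return the metric as a float, or `default` if excluded."""
    if name not in fields:
        return default
    return float(values[fields.index(name)])


print(get_metric(fields, values, 'SNR'))   # None, because SNR is excluded
print(get_metric(fields, values, 'IRLR'))  # 1.0

# An unguarded lookup by name fails with a ValueError.
try:
    fields.index('SNR')
    lookup_failed = False
except ValueError:
    lookup_failed = True
```

With the real object, the equivalent guard is to test 'SNR' in multi_json_file_dqm.get_fields() before calling get_SNR().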

Note that any metric except IRLR can be excluded in the same way.

If a record is empty and cannot be loaded without error, load it as an empty data frame instead.

Next, load a list of CSV sensor data. As before, see modules/Sensor/load_data.py for the code that loads the CSV files into this structure.

In [24]:
df_list = load_data_csv_multi_file('/home/lin/Documents/CAMH/QA-module/data/sensor/csv')
print(df_list)

[       time_s   lw_x   lw_y   lw_z   lh_x   lh_y   lh_z   la_x   la_y   la_z  \
0        0.01 -0.004  0.949 -0.426  0.000  0.000  0.000  1.000  0.359 -0.191   
1        0.02 -0.477 -0.117  0.758 -0.398 -0.047  1.180 -0.055 -0.441  0.926   
2        0.03 -0.547 -0.637  1.016 -0.543  0.449  1.484 -0.254 -0.828  1.332   
3        0.04 -0.445 -0.742  0.301 -0.500  0.730  0.801 -0.195 -0.875  0.660   
4        0.05 -0.406 -0.801  0.000 -0.461  0.836  0.512 -0.176 -0.910  0.363   
...       ...    ...    ...    ...    ...    ...    ...    ...    ...    ...   
94195  941.96 -0.348 -0.832 -0.109  0.000  0.000  0.000 -0.012 -0.988  0.094   
94196  941.97 -0.348 -0.871 -0.109  0.000  0.000  0.000 -0.012 -0.992  0.094   
94197  941.98 -0.348 -0.898 -0.109  0.000  0.000  0.000 -0.008 -0.988  0.102   
94198  941.99 -0.363 -0.910 -0.176  0.000  0.000  0.000 -0.008 -0.977  0.109   
94199  942.00 -0.465 -0.930 -0.250  0.000  0.000  0.000 -0.008 -0.973  0.117   

        ra_x   ra_y   ra_z  
0      0.

Compute the DQM

In [25]:
multi_csv_file_dqm = DQM_multiple_file()
multi_csv_file_dqm.set_input_data(df_list)
multi_csv_file_dqm.compute_DQM()


Start computing the DQM... This may take a long time if the dataset is large
The total time for computing the DQM is: 556.904128074646 seconds.


Output the results

In [26]:
multi_csv_file_dqm.get_fields()

['IRLR', 'SNR', 'SCR', 'RLC', 'VRC', 'SRC', 'MDR', 'APD']

In [27]:
multi_csv_file_dqm.get_DQM()

['1.0',
 '-3.3257782415997195',
 '1.0',
 '0.9655312574226851',
 '0.9742128674915318',
 '0.999999999998742',
 '0.0',
 '0.012419969252906999']