In [1]:
from DQM import DQM_multiple_file
from load_data import *

Load a list of JSON sensor records and inspect the structure of the loaded data. For the structure of the original JSON files, see the test data under data/sensor/json in the repo. modules/Sensor/load_data.py provides examples of how to load a JSON or CSV file into this structure.

In [2]:
df_list = load_data_json_multi_file('/home/lin/Documents/CAMH/QA-module/data/sensor/json')
print(df_list)

[                timestamp         x         y         z
0     1593963206555000000  2.944868  9.244013 -1.522712
1     1593963206769000000  3.038242  9.196129 -1.829170
2     1593963206973000000  3.059790  9.200917 -1.587356
3     1593963207179000000  3.040636  9.203311 -1.539472
4     1593963207385000000  3.014300  9.208099 -1.553837
...                   ...       ...       ...       ...
1062  1593963608699000000 -0.543484  3.928885  8.767567
1063  1593963611106000000 -0.454898  3.940856  8.834604
1064  1593963611310000000 -0.562637  3.653552  8.968679
1065  1593963611518000000 -0.505177  3.612850  8.923190
1066  1593963611733000000 -0.445322  3.598485  8.894460

[1067 rows x 4 columns],                 timestamp         x         y         z
0     1593965108713000000 -0.418985  8.166622 -5.542577
1     1593965108914000000 -0.469264  8.145074 -5.604826
2     1593965109115000000 -0.490811  8.154651 -5.590461
3     1593965109335000000 -0.502782  8.183381 -5.499481
4     159396510953900
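
The timestamp column in the printed frames holds 19-digit integers, which is consistent with nanoseconds since the Unix epoch. That unit is an assumption read off the values; the notebook does not state it. Under that assumption, a minimal sketch for converting a timestamp to a UTC datetime and checking the spacing between consecutive samples could look like this (the helper names `ns_to_utc` and `gaps_ms` are illustrative and not part of load_data.py):

```python
from datetime import datetime, timedelta, timezone

NS_PER_SECOND = 1_000_000_000


def ns_to_utc(ts_ns: int) -> datetime:
    """Convert integer nanoseconds since the Unix epoch to a UTC datetime."""
    seconds, remainder_ns = divmod(ts_ns, NS_PER_SECOND)
    # Integer arithmetic avoids float rounding on 19-digit values.
    return datetime.fromtimestamp(seconds, tz=timezone.utc) + timedelta(
        microseconds=remainder_ns // 1000
    )


def gaps_ms(timestamps_ns):
    """Return the gaps between consecutive timestamps, in milliseconds."""
    return [(b - a) / 1_000_000 for a, b in zip(timestamps_ns, timestamps_ns[1:])]


# First five timestamps of the first frame printed above.
first_rows = [
    1593963206555000000,
    1593963206769000000,
    1593963206973000000,
    1593963207179000000,
    1593963207385000000,
]

print(ns_to_utc(first_rows[0]).isoformat())
print(gaps_ms(first_rows))
```

For these five rows the gaps are a little over 200 ms each, which would correspond to a sampling rate of roughly 5 Hz if the nanosecond assumption holds.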

Create the DQM_multiple_file object, set the list of data frames as its input, then compute the average DQM.

In [3]:
multi_json_file_dqm = DQM_multiple_file()
multi_json_file_dqm.set_input_data(df_list)
multi_json_file_dqm.compute_avg_DQM()

Start computing the DQM... This may take a long time if the dataset is large
The total time for computing the DQM is: 16.445945501327515 seconds.


Get the field names of the computed data quality matrix.

In [4]:
multi_json_file_dqm.get_avg_fields()

['IRLR', 'SNR', 'SCR', 'RLC', 'VRC', 'SRC', 'MDR', 'APD', 'VDR']

Get the computed data quality matrix as a list. Each element of the list is one metric, in the same order as the field names above.

In [5]:
multi_json_file_dqm.get_avg_DQM()

['0.9411764705882353',
 '-2.2764849187263625',
 '1.0',
 '0.6111766701110624',
 '0.6713937097996874',
 '0.4218179319873303',
 '0.2447423162562039',
 '0.013581938407908568',
 '1.0']
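
Because the metrics come back as strings in field order, it can be convenient to pair them with their names and convert them to floats. A minimal sketch, using the two lists printed above as literal inputs; `dqm_as_dict` is a hypothetical helper, not a method of DQM_multiple_file:

```python
# Field names and values copied from the outputs of get_avg_fields()
# and get_avg_DQM() shown above.
fields = ['IRLR', 'SNR', 'SCR', 'RLC', 'VRC', 'SRC', 'MDR', 'APD', 'VDR']
values = [
    '0.9411764705882353',
    '-2.2764849187263625',
    '1.0',
    '0.6111766701110624',
    '0.6713937097996874',
    '0.4218179319873303',
    '0.2447423162562039',
    '0.013581938407908568',
    '1.0',
]


def dqm_as_dict(fields, values):
    """Map each field name to its metric value, converted from str to float."""
    if len(fields) != len(values):
        raise ValueError("fields and values must have the same length")
    return {name: float(value) for name, value in zip(fields, values)}


avg_dqm = dqm_as_dict(fields, values)
print(avg_dqm["SNR"])
```

With a live object, the same helper could be called as `dqm_as_dict(multi_json_file_dqm.get_avg_fields(), multi_json_file_dqm.get_avg_DQM())`.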

Get each individual metric of the computed DQM as a str.

In [6]:
multi_json_file_dqm.get_IRLR()

'0.9411764705882353'

In [7]:
multi_json_file_dqm.get_SNR()

'-2.2764849187263625'

In [8]:
multi_json_file_dqm.get_SCR()

'1.0'

In [9]:
multi_json_file_dqm.get_SRC()

'0.4218179319873303'

In [10]:
multi_json_file_dqm.get_MDR()

'0.2447423162562039'

In [11]:
multi_json_file_dqm.get_APD()

'0.013581938407908568'

In [12]:
multi_json_file_dqm.get_VRC()

'0.6713937097996874'

In [13]:
multi_json_file_dqm.get_RLC()

'0.6111766701110624'

Save the computed DQM as a CSV file at the given path.

In [14]:
multi_json_file_dqm.save_avg_DQM_to_file('/home/lin/Documents/CAMH/SenseActivity/data/demo_result_multi_file.csv')

Data successfuly saved.


Display the saved CSV file.

In [16]:
with open("/home/lin/Documents/CAMH/SenseActivity/data/demo_result_multi_file.csv", "r") as DQM_csv:
    for row in DQM_csv:
        print(row)

IRLR,SNR,SCR,RLC,VRC,SRC,MDR,APD,VDR

0.9411764705882353,-2.2764849187263625,1.0,0.6111766701110624,0.6713937097996874,0.4218179319873303,0.2447423162562039,0.013581938407908568,1.0
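
The blank lines between rows in the output above appear because each line read from the file already ends with a newline and print adds another. As an alternative way to read the file back, here is a minimal sketch using csv.DictReader from the standard library. It uses an in-memory copy of the two lines shown above as a stand-in for the saved file so that it runs without the file on disk; `read_dqm_csv` is a hypothetical helper name.

```python
import csv
import io

# Stand-in for the saved file: header and row copied from the output above.
saved_csv = io.StringIO(
    "IRLR,SNR,SCR,RLC,VRC,SRC,MDR,APD,VDR\n"
    "0.9411764705882353,-2.2764849187263625,1.0,0.6111766701110624,"
    "0.6713937097996874,0.4218179319873303,0.2447423162562039,"
    "0.013581938407908568,1.0\n"
)


def read_dqm_csv(handle):
    """Read DQM rows from an open CSV file, converting each value to float."""
    reader = csv.DictReader(handle)
    return [{name: float(value) for name, value in row.items()} for row in reader]


rows = read_dqm_csv(saved_csv)
for row in rows:
    print(row)
```

With a real file, the handle would come from `open(path, newline="")`, as the csv module documentation recommends.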



Now compute the individual DQMs, one per record.

In [17]:
multi_json_file_dqm.compute_individual_DQM()

Start computing the DQM... This may take a long time if the dataset is large
The total time for computing the DQM is: 16.491915941238403 seconds.


In [18]:
multi_json_file_dqm.get_individual_fields()

['IRLR', 'SNR', 'SRC', 'MDR', 'APD', 'VDR']

The returned list is now a list of DQMs, one element per record in the input data list.

In [19]:
multi_json_file_dqm.get_individual_DQM()

[['1',
  '-2.5443294444725577',
  '1.0',
  '0.21887748986956956',
  '0.42696025778732544',
  '0.004686035613870665'],
 ['1',
  '-2.1372237098572295',
  '1.0',
  '0.6597869045768825',
  '0.11221719457013575',
  '0.013761467889908258'],
 ['1',
  '-2.425568247528492',
  '1.0',
  '0.22998553798698063',
  '0.21399839098954143',
  '0.007676560900716479'],
 ['1',
  '-3.096340603605768',
  '1.0',
  '0.9011216710642995',
  '0.013303769401330377',
  '0.02696629213483146'],
 ['1',
  '-2.1372237098572295',
  '1.0',
  '0.6597869045768825',
  '0.11221719457013575',
  '0.014271151885830785'],
 ['1',
  '-2.425568247528492',
  '1.0',
  '0.22998553798698063',
  '0.21399839098954143',
  '0.011770726714431934'],
 ['1',
  '-1.865302738232099',
  '1.0',
  '3.6927333790792716e-07',
  '0.734890286154241',
  '0.009196515004840271'],
 ['1',
  '-2.623577265481869',
  '1.0',
  '0.9209837701144623',
  '0.006015037593984963',
  '0.023449319213313162'],
 ['1',
  '-2.6345095072869413',
  '1.0',
  '0.853564399886102',
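
Since each inner list follows the order of get_individual_fields(), the nested list can be turned into one dictionary per record. A minimal sketch, using the first three records printed above as literal inputs; `records_as_dicts` is a hypothetical helper, and picking the record with the largest MDR is only an example of a lookup, not an interpretation of the metric:

```python
# Field names from get_individual_fields() and the first three records
# from get_individual_DQM(), copied from the outputs above.
fields = ['IRLR', 'SNR', 'SRC', 'MDR', 'APD', 'VDR']
individual = [
    ['1', '-2.5443294444725577', '1.0', '0.21887748986956956',
     '0.42696025778732544', '0.004686035613870665'],
    ['1', '-2.1372237098572295', '1.0', '0.6597869045768825',
     '0.11221719457013575', '0.013761467889908258'],
    ['1', '-2.425568247528492', '1.0', '0.22998553798698063',
     '0.21399839098954143', '0.007676560900716479'],
]


def records_as_dicts(fields, records):
    """Convert each record (a list of str) into a dict of field name to float."""
    result = []
    for record in records:
        if len(record) != len(fields):
            raise ValueError("record length does not match the number of fields")
        result.append({name: float(value) for name, value in zip(fields, record)})
    return result


records = records_as_dicts(fields, individual)

# Index of the record with the largest MDR value among those listed.
largest_mdr = max(range(len(records)), key=lambda i: records[i]["MDR"])
print(largest_mdr, records[largest_mdr]["MDR"])
```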

Save the list of DQMs to a CSV file.

In [20]:
multi_json_file_dqm.save_individual_DQM_to_file("/home/lin/Documents/CAMH/SenseActivity/data/demo_result_multi_file_individual.csv")

Data successfuly saved.


Display the saved csv file.

In [21]:
with open("/home/lin/Documents/CAMH/SenseActivity/data/demo_result_multi_file_individual.csv", "r") as DQM_csv:
    for row in DQM_csv:
        print(row)

IRLR,SNR,SRC,MDR,APD,VDR

1,-2.5443294444725577,1.0,0.21887748986956956,0.42696025778732544,0.004686035613870665

1,-2.1372237098572295,1.0,0.6597869045768825,0.11221719457013575,0.013761467889908258

1,-2.425568247528492,1.0,0.22998553798698063,0.21399839098954143,0.007676560900716479

1,-3.096340603605768,1.0,0.9011216710642995,0.013303769401330377,0.02696629213483146

1,-2.1372237098572295,1.0,0.6597869045768825,0.11221719457013575,0.014271151885830785

1,-2.425568247528492,1.0,0.22998553798698063,0.21399839098954143,0.011770726714431934

1,-1.865302738232099,1.0,3.6927333790792716e-07,0.734890286154241,0.009196515004840271

1,-2.623577265481869,1.0,0.9209837701144623,0.006015037593984963,0.023449319213313162

1,-2.6345095072869413,1.0,0.853564399886102,0.014149274849663955,0.019734481521349122

1,-1.6358899797539834,1.0,0.0008434478296908132,0.7531806615776081,0.030927835051546393

1,-1.8544629444980452,1.0,0.011961611519057058,0.1642219387755102,0.002098435711560473

1,-2.56392056

Exclude SNR from the DQM.

In [22]:
multi_json_file_dqm.set_SNR(False)

Recompute and retrieve the updated DQM. SNR should no longer be included.

In [23]:
multi_json_file_dqm.compute_avg_DQM()

Start computing the DQM... This may take a long time if the dataset is large
The total time for computing the DQM is: 16.451045751571655 seconds.


In [24]:
multi_json_file_dqm.get_avg_fields()

['IRLR', 'SCR', 'RLC', 'VRC', 'SRC', 'MDR', 'APD', 'VDR']

In [None]:
multi_json_file_dqm.get_avg_DQM()

get_SNR should no longer be called, because SNR is not part of the DQM anymore.

In [None]:
multi_json_file_dqm.get_SNR()
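
The notebook does not show what get_SNR() returns or raises once SNR is excluded, so a caller may prefer to check get_avg_fields() before asking for a metric. A minimal sketch of such a guard follows. `get_metric_if_enabled` and the `_StubDQM` class are hypothetical stand-ins so that the block runs without the DQM module; the only things assumed about the real object are the get_avg_fields() and get_<NAME>() methods used earlier in this notebook.

```python
def get_metric_if_enabled(dqm, name):
    """Return the metric as a float, or None if it is not among the computed fields."""
    if name not in dqm.get_avg_fields():
        return None
    getter = getattr(dqm, f"get_{name}")
    return float(getter())


class _StubDQM:
    """Hypothetical stand-in mimicking the state after set_SNR(False)."""

    def get_avg_fields(self):
        return ['IRLR', 'SCR', 'RLC', 'VRC', 'SRC', 'MDR', 'APD', 'VDR']

    def get_IRLR(self):
        return '0.9411764705882353'

    def get_SNR(self):
        # Assumed behaviour for illustration only; the real class may differ.
        raise RuntimeError("SNR was excluded from the DQM")


stub = _StubDQM()
print(get_metric_if_enabled(stub, "SNR"))
print(get_metric_if_enabled(stub, "IRLR"))
```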

Note that any metric except IRLR can be excluded in the same way.

If a record is empty and cannot be loaded without error, load it as an empty data frame instead.
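
One way to apply this when loading records one at a time is to wrap the loader in a fallback. This is a sketch under assumptions: `load_or_empty` and `failing_loader` are hypothetical names, and whether the DQM code expects an empty frame to carry the timestamp, x, y, z columns is not stated in this notebook, so the column list below is a guess based on the frames printed earlier.

```python
import pandas as pd

# Assumed column layout, taken from the frames printed earlier in the notebook.
EXPECTED_COLUMNS = ["timestamp", "x", "y", "z"]


def load_or_empty(path, loader):
    """Call loader(path); fall back to an empty data frame if loading fails."""
    try:
        return loader(path)
    except Exception:
        # Broad on purpose: any load failure is treated as an empty record.
        return pd.DataFrame(columns=EXPECTED_COLUMNS)


def failing_loader(path):
    # Hypothetical loader standing in for a reader that fails on an empty file.
    raise ValueError(f"could not parse {path}")


frame = load_or_empty("empty_record.json", failing_loader)
print(frame.empty, list(frame.columns))
```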

Now load a list of CSV sensor data files. As before, see modules/Sensor/load_data.py for the code that loads a CSV file into this structure.

In [None]:
df_list = load_data_csv_multi_file('/home/lin/Documents/CAMH/QA-module/data/sensor/csv')
print(df_list)

Compute the DQM

In [None]:
multi_csv_file_dqm = DQM_multiple_file()
multi_csv_file_dqm.set_input_data(df_list)
multi_csv_file_dqm.compute_avg_DQM()

Output the results

In [None]:
multi_csv_file_dqm.get_avg_fields()

In [None]:
multi_csv_file_dqm.get_avg_DQM()