# Linkage and Blocking Evaluation and Tuning Tool

The purpose of this tool is to evaluate the options for implementing blocking in [anonlink](https://anonlink-entity-service.readthedocs.io/en/stable/) for CODI datasets. Currently, anonlink uses the blocklib library, which supports two blocking methods:

- “p-sig”: Probabilistic signature
- “lambda-fold”: LSH based lambda-fold

These methods were proposed, respectively, in the following publications:

- [Scalable Entity Resolution Using Probabilistic Signatures on Parallel Databases](https://arxiv.org/abs/1712.09691)
- [An LSH-Based Blocking Approach with a Homomorphic Matching Technique for Privacy-Preserving Record Linkage](https://www.computer.org/csdl/journal/tk/2015/04/06880802/13rRUxASubY)
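
As a rough illustration of the idea both methods share, the sketch below groups records under a cheap blocking key and generates candidate pairs only within each block. It is a toy in plain Python, not blocklib's algorithm or API; the `signature` function, the field names, and the sample records are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def signature(record):
    """Hypothetical blocking key: surname initial plus year of birth."""
    return f"{record['last_name'][0].upper()}-{record['dob'][:4]}"


def candidate_pairs(records):
    """Return index pairs of records that share a blocking key."""
    blocks = defaultdict(list)
    for index, record in enumerate(records):
        blocks[signature(record)].append(index)
    pairs = set()
    for members in blocks.values():
        # Only records inside the same block are ever compared.
        pairs.update(combinations(members, 2))
    return pairs


records = [
    {"last_name": "Smith", "dob": "2010-04-01"},
    {"last_name": "Smyth", "dob": "2010-04-01"},
    {"last_name": "Jones", "dob": "2012-09-13"},
]
pairs = candidate_pairs(records)
```

With these three sample records only the first two share a key, so one candidate pair is produced instead of the three pairs a full comparison would need.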

We will adjust the blocklib configuration (the type of blocking, the encoding, and the threshold) and evaluate multiple runs of our linkage tools against an example data set. The evaluation metrics include:

- Precision
- Recall
- Reduction Ratio
- Set completeness
- Performance based on average block size

## Useful Terminology

- Blocking – a technique that makes record linkage scalable. Datasets are partitioned into groups, called blocks, and only records in corresponding blocks are compared. This reduces the number of comparisons needed to find which pairs of records should be linked.
- Bloom filter – a probabilistic data structure used to test set membership. It reports that an element may be in a set, or that it definitely is not.
- Precision – the fraction of found matches that are actual matches (true positives : all found matches)
- Recall – the fraction of actual matches that were found (true positives : all actual matches)
- Reduction Ratio – the proportion of record comparisons avoided by using blocking
- Set Completeness – the fraction of true matches that are retained after blocking
- Feature hashing – a fast and space-efficient way of vectorizing features, i.e. turning arbitrary features into indices in a vector or matrix
- `p-sig` signature – a subrecord of an entity that can be used to identify the commonality between multiple records of that entity
- `p-sig` blocking keys – lower the cost of comparing datasets by selecting partitions of the raw records (e.g. first name, last name, postal code). *It is assumed that records sharing no blocking keys do not match each other.*
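
The two blocking-specific metrics above can be sketched as follows, assuming the usual record-linkage definitions: reduction ratio as one minus the share of cross-dataset comparisons that remain after blocking, and set completeness as the share of true pairs still present among the candidates. The function names and the sample pairs are hypothetical and are not part of blocklib or anonlink.

```python
def reduction_ratio(candidate_count, size_a, size_b):
    """Fraction of the size_a * size_b cross-dataset comparisons avoided."""
    return 1 - candidate_count / (size_a * size_b)


def set_completeness(candidate_pairs, true_pairs):
    """Fraction of true matching pairs that survive blocking."""
    return len(candidate_pairs & true_pairs) / len(true_pairs)


# Hypothetical example: two datasets of 4 records each (16 possible pairs).
true_pairs = {(0, 0), (1, 1), (2, 2)}
candidates = {(0, 0), (1, 1), (1, 2), (3, 3)}

rr = reduction_ratio(len(candidates), 4, 4)
sc = set_completeness(candidates, true_pairs)
```

The two numbers pull against each other: tighter blocking raises the reduction ratio but risks dropping true pairs, which lowers set completeness and caps the recall the matcher can reach.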


## Setting up the Environment

- The basic requirement is that you have [data-owner-tools](https://github.com/mitre/data-owner-tools) set up, with all of the Synthetic Denver sites extracted via `extract.py` and named with the format `pii_*.csv` (e.g. `pii_ch.csv`) for all 5 sites. The scripts can easily be adjusted to work with other data.

In [3]:
from __future__ import print_function
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from ipywidgets import interact, interactive, fixed, interact_manual
import ipywidgets as widgets
import seaborn as sns
from IPython.display import FileLink, FileLinks
import qgrid
import csv
import json
from pathlib import Path
import dcctools.config
from itertools import combinations
import importlib
import time

### Setting up a new run:

Set an optional run identifier / description string for the row that will be created in the table of run data

In [4]:
run_description = 'File reorg run'

Set path to data-owner-tools project:

In [10]:
DATA_OWNER_TOOLS_DIR = '/Users/apellitieri/Desktop/CDC/CODI/data-owner-tools/'
SECRET_FILE = '/Users/apellitieri/Desktop/CDC/CODI/deidentification_secret.txt'

Ensure the blocking schema file to use for the run is set correctly in `config.json`:

In [6]:
CONFIG = dcctools.config.Configuration("config.json")
print('Config file in use:')
with open('config.json', 'r') as config:
    config_data = json.load(config)
    print(json.dumps(config_data, indent=2))
blocking_schema_file = ''
if CONFIG.blocked():
    blocking_schema_file = CONFIG.blocking_schema()
    with open(blocking_schema_file, 'r') as blocking_schema:
        schema_data = json.load(blocking_schema)
        print('\nBlocking schema being used:')
        print(json.dumps(schema_data, indent=2))
else:
    blocking_schema_file = 'None'
    print('\nNo blocking set to be used on this run.')

Config file in use:
{
  "systems": [
    "site_a",
    "site_b",
    "site_c",
    "site_d",
    "site_e",
    "site_f"
  ],
  "projects": [
    "name-sex-dob-phone",
    "name-sex-dob-zip",
    "name-sex-dob-parents",
    "name-sex-dob-addr"
  ],
  "schema_folder": "/Users/apellitieri/Desktop/CDC/CODI/data-owner-tools/example-schema",
  "inbox_folder": "/Users/apellitieri/Desktop/CDC/CODI/inbox",
  "matching_results_folder": "/Users/apellitieri/Desktop/CDC/CODI/results",
  "output_folder": "/Users/apellitieri/Desktop/CDC/CODI/output",
  "entity_service_url": "http://localhost:8851/api/v1",
  "matching_threshold": 0.75,
  "mongo_uri": "localhost:27017",
  "blocked": false,
  "blocking_schema": "/Users/apellitieri/Desktop/CDC/CODI/data-owner-tools/example-blocking-schema/lambda.json",
  "household_match": true,
  "household_schema": "/Users/apellitieri/Desktop/CDC/CODI/data-owner-tools/example-schema/household-schema/fn-phone-addr-zip.json"
}

No blocking set to be used on this run.
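
A small pre-flight check along the following lines could catch a mismatched blocking setup before a long run starts. This is only a sketch: it relies on the `blocked` and `blocking_schema` keys shown in the config output above, and the `check_blocking_config` helper itself is hypothetical rather than part of dcctools.

```python
from pathlib import Path


def check_blocking_config(config_data):
    """Return a list of problems found in the blocking-related settings."""
    problems = []
    if config_data.get("blocked"):
        schema = config_data.get("blocking_schema", "")
        if not schema:
            problems.append("blocked is true but blocking_schema is not set")
        elif not Path(schema).is_file():
            problems.append(f"blocking schema not found: {schema}")
    return problems


# With blocking disabled the schema path is ignored, so nothing is reported.
issues = check_blocking_config({"blocked": False, "blocking_schema": ""})
```

In the notebook this could be called as `check_blocking_config(config_data)` right after the config file is loaded.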


## Garble and block with anonlink

The following block runs the script to garble the pii_\*.csv files and then block and package the CLKs into the inbox folder.

[TODO]: Make the garble scripts not specific to the Synthetic Denver sites. Currently the name of the script needs to change depending on whether Synthetic Denver or the new site is being used.

[TODO]: Figure out a way to record the blocking statistics from the run. The anonlink client blocking program prints out statistics about the blocking run but does not provide the output in an easily digestible format. The important output to look at is the maximum and average block size. From the [anonlink-client documentation](https://anonlink-client.readthedocs.io/en/stable/tutorial/Blocking%20with%20Anonlink%20Entity%20Service.html#Blocking):
```
The record linkage run time will be largely dominated by the maximum block size, and the number of blocks. In general the smaller the average block size, the better.
```
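
If the blocks can be obtained as a mapping from block key to record IDs, the statistics mentioned in the quote are simple to compute. The sketch below assumes such a mapping is available; the `block_size_stats` helper and the sample data are hypothetical, and it does not parse the anonlink client's printed output.

```python
from statistics import mean


def block_size_stats(blocks):
    """Summarize block sizes from a mapping of block key -> record IDs."""
    sizes = [len(members) for members in blocks.values()]
    return {
        "num_blocks": len(sizes),
        "max_block_size": max(sizes),
        "avg_block_size": mean(sizes),
    }


# Hypothetical blocks; real ones would come from the blocking output.
blocks = {"b1": [0, 4, 9], "b2": [1], "b3": [2, 3]}
stats = block_size_stats(blocks)
```

Note that `max` and `mean` raise on an empty mapping, so a real version would need to handle a run that produced no blocks.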

In [12]:
inbox_folder = CONFIG.inbox_folder()
garble_start = time.perf_counter()
if CONFIG.blocked():
    shell_script = "{}testing-and-tuning/blocking-garble.sh".format(DATA_OWNER_TOOLS_DIR)
    !$shell_script {blocking_schema_file} {inbox_folder}
else:
    shell_script = "{}testing-and-tuning/garble.sh".format(DATA_OWNER_TOOLS_DIR)
    !$shell_script {inbox_folder} {SECRET_FILE}
garble_end = time.perf_counter()
garble_time = garble_end - garble_start

Cleaning inbox...
Running garble.py for site_a
CLK data written to output/name-sex-dob-phone.json
CLK data written to output/name-sex-dob-zip.json
CLK data written to output/name-sex-dob-parents.json
CLK data written to output/name-sex-dob-addr.json
[0]
[1]
[2]
[3]
[4, 473]
[5]
[6, 367]
[7, 679, 691]
[8, 371, 383]
[9]
[10]
[11]
[12, 29, 638, 643, 673]
[13]
[14, 611]
[15]
[16, 309, 416, 453, 674]
[17]
[18]
[19, 449, 714]
[20, 261, 722]
[21, 368, 655]
[22, 365]
[23, 423]
[24, 112]
[25, 420, 490]
[26, 78, 619, 640, 685]
[27, 414, 599]
[28]
[30]
[31, 413, 468, 547, 583, 687]
[32, 224, 466, 469]
[33, 60, 152, 213, 238, 292, 411, 476, 742]
[34, 44]
[35, 252, 352, 585, 749]
[36, 217]
[37, 76, 387]
[38, 318, 544, 584, 712]
[39]
[40, 128, 588]
[41, 267]
[42, 188]
[43]
[45]
[46]
[47, 208]
[48, 594]
[49, 222, 228, 579, 729]
[50]
[51, 320]
[52, 72, 533, 686]
[53, 111, 306]
[54, 408, 502]
[55]
[56]
[57, 273, 340, 452]
[58, 353]
[59, 457]
[61, 117, 336, 671, 683]


[528]
[529]
[530]
[531]
[532]
[533]
[534]
[538]
[540]
[545]
[547]
[548]
[549]
[551]
[552]
[557]
[558]
[561]
[562]
[564]
[568]
[569]
[572]
[573]
[574]
[575]
[576]
[578]
[580]
[581]
[582]
[584]
[587]
[588]
[591]
[592]
[594]
[595]
[596]
[597]
[598]
[599, 730]
[600]
[602]
[603]
[605]
[609]
[611]
[613, 668]
[614]
[616]
[617]
[618]
[620]
[625, 735]
[631]
[634]
[635]
[637]
[638]
[639]
[642]
[645]
[649]
[650]
[652]
[653]
[654]
[660]
[663]
[667]
[674]
[676, 708]
[678]
[681, 723]
[682]
[683]
[685]
[687]
[689]
[692]
[693, 732]
[696]
[699]
[701]
[703]
[705]
[707]
[711]
[712]
[713]
[715]
[717]
[718]
[721]
[727]
[728]
[731]
[734]
[736]
[739]
[746]
[747]
[748]
[749]
[750]
CLK data written to output/households/fn-phone-addr-zip.json
Running garble.py for site_c
CLK data written to output/name-sex-dob-phone.json
CLK data written to output/name-sex-dob-zip.json
CLK data written to output/name-sex-dob-parents.json
CLK data written to output/name-sex-dob-addr.json

[366]
[367]
[370]
[371]
[372, 451]
[376]
[377, 428, 462, 522]
[379]
[381]
[382]
[383]
[385]
[386]
[388]
[390]
[391]
[392]
[396, 488, 725]
[397, 638]
[399]
[400, 602]
[401]
[404]
[405]
[406, 686]
[407, 585]
[408]
[409]
[410]
[411]
[413]
[414]
[415]
[416]
[417]
[418]
[420]
[422, 426, 749]
[423, 531, 558]
[424]
[425, 429, 492]
[432]
[434]
[436]
[437]
[438, 525]
[439]
[440]
[442, 726]
[446]
[449]
[450]
[455]
[458]
[460]
[461, 606]
[463]
[464]
[465, 467]
[466]
[468, 553, 697]
[471]
[476]
[477]
[478, 590, 660]
[479]
[480]
[481, 682]
[484]
[485]
[486]
[487]
[490]
[491]
[493]
[496]
[497]
[498]
[499]
[500, 635, 641]
[501]
[505, 694]
[506]
[509]
[514]
[520]
[524]
[527]
[528]
[529]
[530]
[533]
[536]
[538]
[540]
[541]
[543]
[544]
[545]
[547]
[548]
[549]
[551]
[552]
[554]
[555]
[556]
[557]
[560, 747]
[561]
[562]
[565]
[566, 708]
[567]
[570]
[571]
[573]
[577, 643]
[578]
[579]
[580]
[582]
[591]
[592]
[594]
[595]
[596]
[597]
[598]
[600]
[601]
[608]
[611]
[614]
[615]
[616]
[617]
[618]
[620]
[621]
[622]

[234]
[235, 508]
[236]
[237]
[238]
[239]
[241, 494]
[242]
[244]
[245]
[246]
[248, 398]
[249, 552]
[250]
[254, 315]
[255]
[256]
[258, 545]
[259, 611, 662, 711, 722]
[262]
[263, 282, 647]
[265]
[268]
[269, 570]
[270]
[272]
[276, 707]
[277, 334]
[278, 385]
[280]
[283, 316]
[284]
[286]
[287]
[288]
[289]
[291, 351, 446, 451, 519]
[292]
[293, 660]
[294]
[295]
[296, 325, 501, 645, 675]
[297]
[299]
[302, 320]
[303]
[304, 698]
[305]
[306, 403, 723]
[307]
[308, 510]
[310, 314, 557]
[312]
[313]
[317]
[318]
[322, 573, 602]
[326]
[327]
[329]
[330]
[331, 742]
[335]
[336, 368, 562]
[337, 396, 485]
[338]
[340]
[343]
[346, 542, 712]
[348]
[349, 467, 565, 740]
[354]
[355, 551]
[356]
[357, 380]
[360]
[362]
[366]
[367]
[369, 399, 563]
[371, 633]
[372]
[373]
[374]
[375, 677]
[376, 701]
[377]
[379]
[381, 395]
[382]
[383, 554]
[388]
[390]
[392]
[393]
[394]
[397, 749]
[400]
[401]
[402, 415]
[404]
[405]
[406]
[408]
[409]
[410]
[411]
[414]
[416, 561, 750]
[417]
[419]
[420, 559]
[421, 608, 609]
[422]
[423]
[424]

In [117]:
print(f"Garble and block (if enabled) took {garble_time:0.2f} seconds")

Garble and block (if enabled) took 28.14 seconds


In [118]:
!python validate.py

All necessary input is present


In [129]:
# Need to drop database collection here if previous run took place
!python drop.py

Dropped match_groups collection from previous run.


In [120]:
match_start = time.perf_counter()
!python match.py
match_end = time.perf_counter()
match_time = match_end - match_start

Anonlink client: Uploading to entity service
{"message": "Updated", "receipt_token": "1666ccc204062c11ef57a2ffaaed641178ea700cd9f34dfc"}Anonlink client: Uploading to entity service
{"message": "Updated", "receipt_token": "0b6e439a9463a8e5d19d7b6fe1e7f75cd8ea5320f5f33708"}Anonlink client: Uploading to entity service
{"message": "Updated", "receipt_token": "ee1f02f28295bf206e444de185ac7f7046642e1400b4d751"}Anonlink client: Uploading to entity service
{"message": "Updated", "receipt_token": "b1695b2c9d6b5310dc28644df329681d7af4359018bc8d82"}Anonlink client: Uploading to entity service
{"message": "Updated", "receipt_token": "66190cf9ad2cbb7db723e8f9ded6cdf63829d7b6ebf30869"}Anonlink client: Uploading to entity service
{"message": "Updated", "receipt_token": "9b2605aa2548f9cf88e4b1b6d40ad59f3a985bcde2f0e37e"}{'current_stage': {'description': 'waiting for CLKs', 'number': 1, 'progress': {'absolute': 6, 'description': 'number of parties already contributed', 'relative': 1.0}}, 'stages': 3, '

Inserted 100 of 1297 records.
Inserted 200 of 1297 records.
Inserted 300 of 1297 records.
Inserted 400 of 1297 records.
Inserted 500 of 1297 records.
Inserted 600 of 1297 records.
Inserted 700 of 1297 records.
Inserted 800 of 1297 records.
Inserted 900 of 1297 records.
Inserted 1000 of 1297 records.
Inserted 1100 of 1297 records.
Inserted 1200 of 1297 records.
Anonlink client: Uploading to entity service
{"message": "Updated", "receipt_token": "e488bde13c76d33bb89e6657aa9405dcab7bd45f96a483bd"}Anonlink client: Uploading to entity service
{"message": "Updated", "receipt_token": "25d6bfd51f6fcb56edf751629025f383f9ad5f13c9423187"}Anonlink client: Uploading to entity service
{"message": "Updated", "receipt_token": "0c3332c63b3fb372b10bf6a61cbd1c11ff667b26aabd34f1"}Anonlink client: Uploading to entity service
{"message": "Updated", "receipt_token": "f781d34d00d3f762dd52224075e8289a54885bba9bb5ce17"}Anonlink client: Uploading to entity service
{"message": "Updated", "receipt_token": "d0c5b22

{'current_stage': {'description': 'compute output', 'number': 3}, 'stages': 3, 'state': 'completed', 'time_added': '2021-05-18T16:50:00.801299+00:00', 'time_completed': '2021-05-18T16:50:15.646570+00:00', 'time_started': '2021-05-18T16:50:08.240284+00:00'}
{'groups': [[[0, 245], [5, 318]], [[0, 242], [5, 79]], [[0, 366], [1, 289], [2, 163]], [[0, 77], [1, 482], [5, 442]], [[0, 591], [1, 577], [2, 512]], [[0, 448], [3, 590], [5, 500], [1, 326], [4, 566]], [[4, 229], [5, 634]], [[1, 374], [2, 324], [3, 426], [4, 91]], [[0, 633], [1, 428], [4, 190], [5, 644]], [[2, 139], [3, 699]], [[2, 209], [3, 337], [1, 408], [4, 729]], [[1, 362], [2, 245]], [[0, 43], [1, 715], [5, 139]], [[4, 576], [5, 613]], [[4, 536], [5, 87]], [[4, 534], [5, 161]], [[3, 82], [4, 635]], [[4, 331], [5, 486]], [[4, 213], [5, 378]], [[4, 211], [5, 667], [0, 603]], [[4, 640], [5, 512]], [[4, 75], [5, 724]], [[3, 684], [5, 269]], [[3, 611], [5, 568]], [[3, 3], [5, 282], [4, 109]], [[3, 738], [4, 568]], [[3, 594], [4, 479

Inserted 100 of 1265 records.
Inserted 200 of 1265 records.
Inserted 300 of 1265 records.
Inserted 400 of 1265 records.
Inserted 500 of 1265 records.
Inserted 600 of 1265 records.
Inserted 700 of 1265 records.
Inserted 800 of 1265 records.
Inserted 900 of 1265 records.
Inserted 1000 of 1265 records.
Inserted 1100 of 1265 records.
Inserted 1200 of 1265 records.
Anonlink client: Uploading to entity service
{"message": "Updated", "receipt_token": "cfbf3ec67aee3346dc87aaba779c6f8e5f5cad1402636e31"}Anonlink client: Uploading to entity service
{"message": "Updated", "receipt_token": "c879929a6886bff16bae2e842dbdaeb059985a5410b57583"}Anonlink client: Uploading to entity service
{"message": "Updated", "receipt_token": "2eddac5fa7fee16466297367651f30f47aecafbaeaa30190"}Anonlink client: Uploading to entity service
{"message": "Updated", "receipt_token": "e48d4fcd39364f62822c7fdec46fe29a6ba0be18f2ba50e1"}Anonlink client: Uploading to entity service
{"message": "Updated", "receipt_token": "90beb21

Inserted 100 of 1299 records.
Inserted 200 of 1299 records.
Inserted 300 of 1299 records.
Inserted 400 of 1299 records.
Inserted 500 of 1299 records.
Inserted 600 of 1299 records.
Inserted 700 of 1299 records.
Inserted 800 of 1299 records.
Inserted 900 of 1299 records.
Inserted 1000 of 1299 records.
Inserted 1100 of 1299 records.
Inserted 1200 of 1299 records.
Anonlink client: Uploading to entity service
{"message": "Updated", "receipt_token": "8a2594947953d68f77ff5d99144f65e9050bcf8c32559713"}Anonlink client: Uploading to entity service
{"message": "Updated", "receipt_token": "dc8d65b11eb1d291c7aa791db9f4937e49b384f072b8d5d0"}Anonlink client: Uploading to entity service
{"message": "Updated", "receipt_token": "610414f14998c21e9d3fe9333eb51d3b3c2c09e557080e74"}Anonlink client: Uploading to entity service
{"message": "Updated", "receipt_token": "4cfc51959f58215f9fdf45168cc6519c23d3793d7549ea35"}Anonlink client: Uploading to entity service
{"message": "Updated", "receipt_token": "ff2ec62

Inserted 100 of 1350 records.
Inserted 200 of 1350 records.
Inserted 300 of 1350 records.
Inserted 400 of 1350 records.
Inserted 500 of 1350 records.
Inserted 600 of 1350 records.
Inserted 700 of 1350 records.
Inserted 800 of 1350 records.
Inserted 900 of 1350 records.
Inserted 1000 of 1350 records.
Inserted 1100 of 1350 records.
Inserted 1200 of 1350 records.
Inserted 1300 of 1350 records.


In [121]:
print(f"Matching took {match_time:0.2f} seconds")

Matching took 103.39 seconds
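
The start/stop timing pattern used for the garble and match steps could be wrapped in a small context manager so each step is timed the same way. This is a sketch of one possible design; the `timed` helper and the `timings` dictionary are hypothetical and not part of the notebook's existing code.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Record the wall-clock duration of the enclosed block in results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Stored even if the enclosed step raises, so partial runs are timed too.
        results[label] = time.perf_counter() - start


timings = {}
with timed("example", timings):
    sum(range(1000))  # stand-in for a step such as running match.py
```

The collected `timings` values could then feed the run-data row in place of the separate `garble_time` and `match_time` variables.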


In [13]:
!python link_ids.py

results/link_ids.csv created
Before deconflict: 1153
After deconflict and before add singles: 1153
Final linkage count: 2028
Exact individual links found in pprl household links: 1111
Number of individual links conflicting with pprl links: 1026
Number of individual links combined into PPRL links: 1286
Number of individual links skipped from previous conflict: 0
results/household_link_ids.csv created


In [24]:
!python -m tuning-files-scripts.patid_translate --dotools {DATA_OWNER_TOOLS_DIR}

results/patid_link_ids.csv created
/Users/apellitieri/.pyenv/versions/3.7.4/bin/python: Error while finding module specification for 'tuning-files-scripts.patid_translate.py' (ModuleNotFoundError: __path__ attribute not found on 'tuning-files-scripts.patid_translate' while trying to find 'tuning-files-scripts.patid_translate.py')


## Record Results

In [25]:
!python -m tuning-files-scripts.household_score --dotools {DATA_OWNER_TOOLS_DIR}

Pair-wise scoring:
Precision: 0.944079810974009 Recall: 0.7094101400670744 F-Score: 0.8100923631448524
Perfect scoring:
Precision: 0.5027932960893855 Recall: 0.5990016638935108 F-Score: 0.5466970387243737
Partial scoring:
Precision: 0.9040307101727447 Recall: 0.78369384359401 F-Score: 0.8395721925133689


In [124]:
RESULTS_PATH = '/Users/apellitieri/Desktop/CDC/CODI/results'
ANSWER_KEY_CSV = '/Users/apellitieri/Desktop/CDC/CODI/new_answer_key.csv'

In [125]:
systems = CONFIG.systems()
threshold = CONFIG.matching_threshold()

true_positives = 0
false_positives = 0

# Load the true matching pairs from the answer key. Each pair is sorted so
# that (a, b) and (b, a) compare equal.
answer_key = []
with open(ANSWER_KEY_CSV) as ak_csv:
    ak_reader = csv.reader(ak_csv)
    next(ak_reader)  # skip the header row
    for row in ak_reader:
        if row[3] == '1':
            answer_key.append(tuple(sorted((row[1], row[2]))))

answer_key_length = len(answer_key)
answer_key_set = set(answer_key)  # constant-time membership checks

patid_csv_path = Path(RESULTS_PATH) / "patid_link_ids.csv"

# Expand every linked group into its pairs and check each pair against the key.
with open(patid_csv_path) as patid_csv:
    pat_id_reader = csv.reader(patid_csv)
    next(pat_id_reader)  # skip the header row
    for row in pat_id_reader:
        # row[1:6] reads five patid columns; empty cells are skipped
        patids = [patid for patid in row[1:6] if patid]
        for a, b in combinations(patids, 2):
            if tuple(sorted((a, b))) in answer_key_set:
                true_positives += 1
            else:
                false_positives += 1

proposed_pairs_count = true_positives + false_positives
run_precision = true_positives / proposed_pairs_count
run_recall = true_positives / answer_key_length
run_f_score = 2 * ((run_precision * run_recall) / (run_precision + run_recall))

In [126]:
print(f"Precision: {run_precision:0.2f}\nRecall: {run_recall:0.2f}\nF-Score: {run_f_score:0.2f}")

Precision: 0.96
Recall: 0.60
F-Score: 0.74


In [127]:
with open('tuning-files-scripts/example_run_data.csv', 'a', newline='') as csvfile:
    fieldnames = ['Run Description', 'Blocking', 'Match Threshold', 'Precision',
                  'Recall', 'F-Score', 'Answer Key Size',
                  'Proposed Pairs', 'True Positives', 'Garble & Block Time (s)',
                  'Match Time (s)', 'Blocking Config']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    # writer.writeheader()  # Uncomment for the initial run only, to write the header row
    writer.writerow({
        'Run Description': run_description, 'Blocking': CONFIG.blocked(),
        'Match Threshold': threshold,
        'Precision': run_precision, 'Recall': run_recall, 'F-Score': run_f_score,
        'Answer Key Size': answer_key_length, 'Proposed Pairs': proposed_pairs_count,
        'True Positives': true_positives,
        'Garble & Block Time (s)': garble_time,
        'Match Time (s)': match_time, 'Blocking Config': blocking_schema_file
    })
print("Successfully added run to example_run_data.csv!")

Successfully added run to example-run-data.csv!
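
One way to avoid toggling the `writeheader()` line by hand is to write the header only when the CSV file does not exist yet or is empty. The sketch below shows that approach using the standard `csv` module; the `append_run` helper is hypothetical and is not how the notebook currently records runs.

```python
import csv
from pathlib import Path


def append_run(csv_path, fieldnames, row):
    """Append one row, writing the header only if the file is new or empty."""
    path = Path(csv_path)
    needs_header = not path.exists() or path.stat().st_size == 0
    with open(path, "a", newline="") as csvfile:
        writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
        if needs_header:
            writer.writeheader()
        writer.writerow(row)
```

Called with the same `fieldnames` list and row dictionary as the cell above, this would produce the same file without any manual edit between the first and later runs.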


In [128]:
pd.read_csv('tuning-files-scripts/example_run_data.csv')

| | Run Description | Blocking | Match Threshold | Precision | Recall | F-Score | Answer Key Size | Proposed Pairs | True Positives | Garble & Block Time (s) | Match Time (s) | Blocking Config |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Lambda-fold run 1 (Synthetic Denver) | True | 0.8 | 0.941725 | 0.83299 | 0.884026 | 970 | 858 | | 22.998019 | 81.662922 | /Users/apellitieri/Desktop/CDC/CODI/data-owner... |
| 1 | No Blocking Run 1 (Synthetic Denver) | False | 0.8 | 0.942263 | 0.841237 | 0.888889 | 970 | 866 | | 13.903272 | 76.427395 | |
| 2 | No Blocking Run 1 (New Data Set) | False | 0.8 | 0.996756 | 0.563965 | 0.720354 | 7082 | 4007 | | 15.183644 | 89.882657 | |
| 3 | No Blocking Run 2 (New Data Set) | False | 0.8 | 0.996756 | 0.563965 | 0.720354 | 7082 | 4007 | 3994.0 | 15.443283 | 100.035062 | |
| 4 | No Blocking Run 3 (New Data Set) | False | 0.75 | 0.988751 | 0.595736 | 0.743502 | 7082 | 4267 | 4219.0 | 15.563336 | 91.285207 | |
| 5 | No Blocking Run 4 (New Data Set) | False | 0.7 | 0.970907 | 0.607879 | 0.747655 | 7082 | 4434 | 4305.0 | 16.600817 | 97.703846 | |
| 6 | No Blocking Run 4 (New Data Set) | False | 0.65 | 0.955512 | 0.615645 | 0.748819 | 7082 | 4563 | 4360.0 | 15.385254 | 92.289256 | |
| 7 | No Blocking Run 4 (New Data Set) | False | 0.5 | 0.895652 | 0.450861 | 0.599793 | 7082 | 3565 | 3193.0 | 15.295187 | 134.370978 | |
| 8 | Lambda Run 1 (New Data Set) | True | 0.65 | 0.955566 | 0.601243 | 0.738083 | 7082 | 4456 | 4258.0 | 28.138004 | 103.387432 | /Users/apellitieri/Desktop/CDC/CODI/data-owner... |
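
As a sanity check, the scores in a row can be recomputed from its counts. The sketch below uses the True Positives, Proposed Pairs, and Answer Key Size values from the last row of the table (Lambda Run 1) and should reproduce that row's Precision, Recall, and F-Score to within rounding. The F-score is written directly in terms of counts, which is algebraically the same as the `2 * P * R / (P + R)` form used in the scoring cell.

```python
# Counts copied from the last row of the run table (Lambda Run 1).
true_positives = 4258
proposed_pairs = 4456
answer_key_size = 7082

precision = true_positives / proposed_pairs
recall = true_positives / answer_key_size
# Equivalent to 2 * P * R / (P + R), expressed with counts only.
f_score = 2 * true_positives / (proposed_pairs + answer_key_size)

print(f"Precision: {precision:0.6f}  Recall: {recall:0.6f}  F-Score: {f_score:0.6f}")
```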
