diff --git a/data/README.md b/data/README.md
index 80c47e54..c5c3e63a 100644
--- a/data/README.md
+++ b/data/README.md
@@ -24,8 +24,7 @@ Note that every file that is available as a *Java Serialized Object* (**JSO**) c
 
 # Dirty ER datasets
 
-| Dataset Name | Entities | Name-Value Pairs | Duplicates | Average NVP per Entity | Brute-force Comparisons |
-File Format | Data Origin |
+| Dataset Name | Entities | Name-Value Pairs | Duplicates | Average NVP per Entity | Brute-force Comparisons | File Format | Data Origin |
 | --- | --- | --- | --- | --- | --- | --- | --- |
 | Restaurant | 864 | 4,319 | 112 | 5.0 | 3.73E+05 | JSO ([entity file](dirtyErDatasets/restaurantProfiles), [groundtruth file](dirtyErDatasets/restaurantIdDuplicates)) | Real data |
 | Census | 841 | 3,913 | 344 | 4.7 | 3.53E+05 | JSO ([entity file](dirtyErDatasets/censusProfiles), [groundtruth file](dirtyERfiles/censusIdDuplicates)) | Real data |
@@ -41,3 +40,4 @@ File Format | Data Origin |
+