Scoping: Towards streamlined entity collections for multi-sourced Entity Resolution with self-supervised agents
The goal of Scoping is to reduce the space of candidate entity pairs by ranking, detecting, and removing unlinkable entities through outlier algorithms and self-supervised reusable autoencoders, leaving intact the set of true linkages.
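As a rough illustration of the idea (not the method implemented in this repository), the sketch below ranks entities by how far they lie from their nearest neighbor in any other source and drops the most distant ones before candidate pairs are formed. The function names `scope_entities` and `count_cross_pairs`, the `keep_ratio` parameter, the source and entity names, and the two-dimensional feature vectors are all hypothetical toy values; a nearest-neighbor distance stands in here for the outlier algorithms and autoencoders mentioned above.

```python
from math import dist


def scope_entities(sources, keep_ratio=0.8):
    """Rank entities by distance to the nearest entity of another source.

    sources: {source_name: {entity_id: feature_vector}} (hypothetical format).
    Returns (kept, ranked): the set of kept (source, entity_id) keys and the
    full ranking as (score, source, entity_id) tuples, lowest score first.
    """
    ranked = []
    for src, entities in sources.items():
        # Every vector from a different source is a potential linkage partner.
        others = [v for s, ents in sources.items() if s != src for v in ents.values()]
        for eid, vec in entities.items():
            # A large nearest-neighbor distance suggests an unlinkable entity.
            ranked.append((min(dist(vec, o) for o in others), src, eid))
    ranked.sort()
    n_keep = max(1, round(len(ranked) * keep_ratio))
    kept = {(src, eid) for _, src, eid in ranked[:n_keep]}
    return kept, ranked


def count_cross_pairs(keys):
    """Count candidate pairs whose two entities come from different sources."""
    keys = list(keys)
    return sum(
        1
        for i in range(len(keys))
        for j in range(i + 1, len(keys))
        if keys[i][0] != keys[j][0]
    )


# Toy input: two sources, one entity ("salary") with no counterpart elsewhere.
sources = {
    "src_a": {"customer_name": [0.9, 0.1], "order_id": [0.1, 0.9], "salary": [5.0, 5.0]},
    "src_b": {"customer_name": [0.85, 0.15], "order_number": [0.15, 0.85]},
}

all_keys = [(s, e) for s, ents in sources.items() for e in ents]
kept, ranked = scope_entities(sources, keep_ratio=0.8)

pairs_before = count_cross_pairs(all_keys)
pairs_after = count_cross_pairs(kept)
print(pairs_before, pairs_after)
```

In this toy setting the entity without a counterpart ranks last and is removed, so fewer cross-source pairs remain while the pairs that resemble each other are untouched. How many entities to keep is a tuning choice; the fixed `keep_ratio` is only a placeholder for whatever threshold the outlier model provides.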
The annotated multi-sourced entity linkage dataset is built from the sample schemas of three database vendors (Oracle, MySQL, and SAP HANA):
- MySQL: https://www.mysqltutorial.org/mysql-sample-database.aspx
- Category domain-specific: 3 x Orders-Customers schemas (Oracle, MySQL, SAP HANA)
- Category domain-agnostic: 1 additional Human-Resources schema (Oracle)
Contact leonard.traeger@umbc.edu with any related questions.