Scoping: Towards streamlined entity collections for multi-sourced Entity Resolution with self-supervised agents
The goal of Scoping is to reduce the space of candidate entity pairs by ranking, detecting, and removing unlinkable entities through outlier algorithms and self-supervised reusable autoencoders, leaving intact the set of true linkages.
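The idea of ranking entities by an outlier score and removing the least linkable ones can be illustrated with a minimal sketch. This is not the implementation from the paper: a plain distance-to-centroid score stands in for the outlier algorithms and autoencoders, and the entity names, the 2-dimensional signature vectors, and the `keep_ratio` parameter are all hypothetical.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def outlier_scores(entities):
    """Score each entity by its Euclidean distance to the global centroid.

    Assumption: a larger distance means the entity is less likely to be linkable.
    """
    center = centroid(list(entities.values()))
    return {name: math.dist(vec, center) for name, vec in entities.items()}


def scope(entities, keep_ratio=0.75):
    """Rank entities by outlier score and keep the least outlying fraction."""
    scores = outlier_scores(entities)
    ranked = sorted(scores, key=scores.get)  # ascending: most typical first
    n_keep = max(1, math.ceil(len(ranked) * keep_ratio))
    return ranked[:n_keep]


# Hypothetical entity signatures from three sources (toy values).
entities = {
    "source_a.customer_name": [1.0, 1.0],
    "source_b.customer_name": [1.1, 0.9],
    "source_c.customer_name": [0.9, 1.1],
    "source_a.job_history": [5.0, 5.0],  # no counterpart in the other sources
}

kept = scope(entities)
print(kept)
```

In this toy setup the entity without a counterpart lies far from the others and is scoped out before any pairwise matching, which shrinks the candidate pair space while the three similar entities remain available for linkage.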
The annotated multi-sourced entity linkage dataset is built from sample schemas of the following three database vendors:
- MySQL: https://www.mysqltutorial.org/mysql-sample-database.aspx
- Category domain-specific: 3 x Orders-Customers schemas (Oracle, MySQL, SAP HANA)
- Category domain-agnostic: 1 additional Human-Resources schema (Oracle)
If you use Scoping or the OC3-HR dataset, please cite the conference paper:
- DOI: 10.5220/0012607500003690
- Harvard, BibTeX, and EndNote citation formats are available here: https://www.scitepress.org/Link.aspx?doi=10.5220/0012607500003690
For any related questions, contact leonard.traeger@umbc.edu.