Fork of the Freely Extensible Biomedical Record Linkage program
Updated Nov 4, 2016 - Python
Entity resolution (also known as data matching, data linkage, record linkage, and many other terms) is the task of finding records in a dataset that refer to the same entity across different data sources (e.g., data files, books, websites, and databases). Entity resolution is necessary when joining different data sets on entities that may or may not share a common identifier (e.g., database key, URI, national identification number); a shared identifier may be missing because of differences in record shape, storage location, or curator style or preference.
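To make the definition concrete, here is a minimal sketch of linking records from two sources that share no common identifier. It is only an illustration under stated assumptions, not the method of any project listed here: the function names (normalize, similarity, link_records), the sample records, and the 0.85 threshold are all hypothetical, and the scoring uses the standard library's difflib.SequenceMatcher rather than a purpose-built matcher.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = "".join(
        ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower()
    )
    return " ".join(cleaned.split())


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] between two normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def link_records(left: dict, right: dict, threshold: float = 0.85) -> list:
    """For each left record, keep its best-scoring right record if the
    score reaches the (assumed) threshold. Compares every pair, so it is
    only suitable for small inputs."""
    links = []
    for left_id, left_name in left.items():
        best_id, best_score = None, 0.0
        for right_id, right_name in right.items():
            score = similarity(left_name, right_name)
            if score > best_score:
                best_id, best_score = right_id, score
        if best_id is not None and best_score >= threshold:
            links.append((left_id, best_id, round(best_score, 3)))
    return links


# Hypothetical records from two sources with unrelated keys.
source_a = {"a1": "The Matrix (1999)", "a2": "Acme Corp."}
source_b = {"b1": "acme corp", "b2": "the matrix 1999", "b3": "Globex"}

links = link_records(source_a, source_b)
print(links)
```

Comparing every pair is quadratic in the number of records; real systems usually restrict comparisons to candidate pairs first (often called blocking) and combine several fields rather than a single name.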
Matching algorithm for movies in Amazon and Rotten Tomatoes datasets
My entry to a data analysis / record linkage coding challenge
Performs unique entity estimation corresponding to Chen, Shrivastava, Steorts (2018).
Learning String Alignments for Entity Aliases
Intent detection and Slot filling
A fancy idea and experimental verification on crowdsourcing-based entity resolution
🆔 Command line tool for deduplicating CSV files
Compact time- and attribute-aware node representations
Tool to explain Entity Resolution model predictions
Code for analyzing hate speech tweets using Wikipedia-based contextual representations.
Learned string similarity for entity names using optimal transport.
Company Match algorithm with Spark and Python on DataBricks
A Python package for efficient evaluation based on OASIS (Optimal Asymptotic Sequential Importance Sampling).
This repository contains the code and data download links to reproduce the experiments of the PVLDB paper "Dual-Objective Fine-Tuning of BERT for Entity Matching" by Ralph Peeters and Christian Bizer.
Merge Dirty Data with Clean Reference Tables
Created by Halbert L. Dunn
Released 1946