CheapER

CheapER is a tool for performing Entity Resolution tasks with few labeled training samples.

CheapER uses large language models within a noisy training framework, combined with adaptive fine-tuning, consistency training, adaptive softmax, and Monte Carlo dropout.
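To make one of the listed techniques concrete, here is a minimal sketch of Monte Carlo dropout in plain Python. It is not taken from the CheapER code base: the tiny logistic model, the function names (`stochastic_forward`, `mc_dropout_predict`), and the example similarity features are all hypothetical. The idea is to keep dropout active at prediction time, run several stochastic forward passes, and use the spread of the outputs as an uncertainty estimate. How CheapER itself applies the technique is not described in this README.

```python
import math
import random
from statistics import mean, pstdev


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def stochastic_forward(features, weights, bias, drop_prob, rng):
    """One forward pass of a toy logistic model with dropout left ON."""
    keep_prob = 1.0 - drop_prob
    total = bias
    for x, w in zip(features, weights):
        # Each input is kept with probability keep_prob and rescaled
        # (inverted dropout) so the expected pre-activation is unchanged.
        if rng.random() < keep_prob:
            total += (x / keep_prob) * w
    return sigmoid(total)


def mc_dropout_predict(features, weights, bias, drop_prob=0.2, passes=100, seed=0):
    """Return (mean match probability, standard deviation) over `passes` stochastic runs."""
    rng = random.Random(seed)
    samples = [
        stochastic_forward(features, weights, bias, drop_prob, rng)
        for _ in range(passes)
    ]
    return mean(samples), pstdev(samples)


# Hypothetical similarity features for one candidate record pair
# (e.g. title, brand, and price similarity); the weights are made up.
pair_features = [0.9, 0.7, 0.2]
toy_weights = [2.0, 1.5, -0.5]
probability, uncertainty = mc_dropout_predict(pair_features, toy_weights, bias=-1.0)
```

A high standard deviation signals that the model's match decision for that pair is unstable. In a low-label setting such pairs are natural candidates for manual review or for exclusion from automatically labeled training data.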

Figure: CheapER pipeline

Experiments

CheapER requires less labeled training data than state-of-the-art (SotA) systems (as of early 2023) to reach the same F1 score.
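For readers unfamiliar with the metric, the following sketch shows how F1 is computed for a binary match / non-match task. The function name and the toy labels are illustrative only and are not part of the CheapER evaluation code.

```python
def f1_score(true_labels, predicted_labels):
    """F1 for binary labels, where 1 = match and 0 = non-match."""
    pairs = list(zip(true_labels, predicted_labels))
    true_pos = sum(1 for t, p in pairs if t == 1 and p == 1)
    false_pos = sum(1 for t, p in pairs if t == 0 and p == 1)
    false_neg = sum(1 for t, p in pairs if t == 1 and p == 0)
    if true_pos == 0:
        return 0.0  # precision and recall are both zero or undefined
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Toy example: five candidate record pairs.
gold = [1, 1, 0, 0, 1]
predicted = [1, 0, 0, 1, 1]
score = f1_score(gold, predicted)
```

Because matching pairs are usually rare among all candidate pairs in entity resolution, F1 on the match class is more informative than plain accuracy.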

Figure: CheapER cost on the DeepMatcher (DM) datasets

Experiments on the DeepMatcher datasets can be reproduced using the eval.py script.

Notebooks

Citing CheapER

If you extend or use this work, please cite:

@article{teofili2023cheaper,
  title={CheapER: Low Cost Entity Resolution},
  author={Teofili, Tommaso and Firmani, Donatella and Merialdo, Paolo},
  year={2023}
}
