amorgani/AND

AND - Author Name Disambiguation

The AND corpus consists of two files: (1) 1500_pairs_train.csv and (2) 400_pairs_test.csv

These files contain randomly selected pairs of MEDLINE publications in which both publications list an author with the same last name and first initial.

Each file has the following headers:

PMID1/2 - PubMed ID of the first/second publication in a pair.

Last_name1/2 - Author last names.

Initials1/2 - Author initials.

First_name1/2 - Author first names.

Authorship - YES if the two authors are the same person, NO otherwise.
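The column layout described above could be read along the following lines. This is a minimal sketch, not part of the corpus tooling: it assumes the files are comma-delimited and that the header names are spelled exactly PMID1, PMID2, Last_name1, and so on. The function name `load_pairs`, the added `same_author` key, and the two sample rows are hypothetical placeholders, not records from the corpus.

```python
import csv
import io

# Hypothetical sample in the layout described above. The rows are made-up
# placeholders, not records from the corpus, and the exact header spelling
# and column order are assumptions.
SAMPLE = """PMID1,PMID2,Last_name1,Last_name2,Initials1,Initials2,First_name1,First_name2,Authorship
111,222,Smith,Smith,J,JA,John,John,YES
333,444,Lee,Lee,K,K,Kevin,Karen,NO
"""


def load_pairs(handle):
    """Read author pairs and convert the Authorship label to a boolean."""
    pairs = []
    for row in csv.DictReader(handle):
        label = row["Authorship"].strip().upper()
        if label not in ("YES", "NO"):
            raise ValueError(f"unexpected Authorship value: {label!r}")
        # same_author is a helper key added here; it is not a corpus column.
        row["same_author"] = (label == "YES")
        pairs.append(row)
    return pairs


pairs = load_pairs(io.StringIO(SAMPLE))
positives = sum(p["same_author"] for p in pairs)
print(f"{len(pairs)} pairs, {positives} labeled as the same author")
```

To read one of the actual files, the in-memory sample could be swapped for `open("1500_pairs_train.csv", newline="")`, provided the assumptions about delimiter and header names hold.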

You should cite this data with the following publication:

Dina Vishnyakova, Raul Rodriguez-Esteban, Fabio Rinaldi, A new approach and gold standard toward author disambiguation in MEDLINE, Journal of the American Medical Informatics Association, ocz028, https://doi.org/10.1093/jamia/ocz028

https://academic.oup.com/jamia/advance-article-abstract/doi/10.1093/jamia/ocz028/5432091
