Yonsei-TSMM/author_name_disambiguation

# Author Name Disambiguation Data Set v.01

We hand-crafted this training dataset over multiple iterations, drawing its records from the entire PubMed collection.

[1] AND_CORPUS_v01.txt
- This is a human-labelled training dataset that consists of the author IDX, the PMID, the first author's last name, and the first author's initial.
- The dataset contains 2,875 publications authored by 385 real authors with 431 name variants.
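
A minimal sketch of how the four fields described above might be read and grouped by author IDX. The README does not state the file's delimiter or whether it has a header line, so the tab delimiter, the column order, and the function name `load_corpus` are assumptions; the sample rows are made up and are not taken from the real file.

```python
import io
from collections import defaultdict


def load_corpus(handle, delimiter="\t"):
    """Group records by author IDX.

    Assumes one record per line with four delimited fields in the order
    listed in the README: author IDX, PMID, last name, initial.
    """
    records_by_author = defaultdict(list)
    for line in handle:
        parts = line.rstrip("\r\n").split(delimiter)
        if len(parts) < 4:
            continue  # skip blank or malformed lines
        author_idx, pmid, last_name, initial = (p.strip() for p in parts[:4])
        records_by_author[author_idx].append((pmid, last_name, initial))
    return dict(records_by_author)


# Made-up sample rows (hypothetical values, not from AND_CORPUS_v01.txt).
sample_corpus = io.StringIO(
    "1\t100001\tKim\tJ\n"
    "1\t100002\tKim\tJ\n"
    "2\t100003\tKim\tJ\n"
)
clusters = load_corpus(sample_corpus)
for author_idx, records in clusters.items():
    print(author_idx, [pmid for pmid, _, _ in records])
```

To try this on the real file, pass an open file handle instead of the `io.StringIO` sample, and adjust `delimiter` if the file turns out not to be tab-separated.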

[2] AND_1000_MATCHES.txt
- This file lists SCOPUS author IDs that were matched to PMIDs, taken from the SCOPUS database (http://www.scopus.com/).
- The first column is a SCOPUS ID, the second is a PMID, and the last is the author name for that PMID.
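
The three-column layout described above could be loaded into a PMID lookup along the following lines. This is a sketch under assumptions: the tab delimiter and the function name `load_matches` are not given in the README, and the sample IDs and names are invented for illustration.

```python
import io


def load_matches(handle, delimiter="\t"):
    """Map each PMID to its (SCOPUS ID, author name) pairs.

    Assumes three delimited fields per line in the README's order:
    SCOPUS ID, PMID, author name.
    """
    matches_by_pmid = {}
    for line in handle:
        # Split at most twice so an author name containing the
        # delimiter stays in one piece.
        parts = line.rstrip("\r\n").split(delimiter, 2)
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        scopus_id, pmid, name = (p.strip() for p in parts)
        matches_by_pmid.setdefault(pmid, []).append((scopus_id, name))
    return matches_by_pmid


# Made-up sample rows (hypothetical values, not from AND_1000_MATCHES.txt).
sample_matches = io.StringIO(
    "7000000001\t100001\tKim J\n"
    "7000000002\t100003\tKim J\n"
)
matches = load_matches(sample_matches)
print(matches.get("100001"))
```

Keying the lookup on PMID makes it straightforward to compare the SCOPUS IDs against the human-labelled author IDX values for the same publications.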

Please visit http://informatics.yonsei.ac.kr/tsmm/author_name_disambiguation/and.html for further information.
