
Name Disambiguation in AMiner

This is the implementation of our KDD'18 paper:

Yutao Zhang, Fanjin Zhang, Peiran Yao, and Jie Tang. Name Disambiguation in AMiner: Clustering, Maintenance, and Human in the Loop. In Proceedings of the Twenty-Fourth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'18).


  • Linux
  • Python 3
  • Install the requirements with pip install -r requirements.txt

Note: Running this project requires more than 10 GB of disk space, and the full pipeline takes several hours. We recommend running it on a Linux server.
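Because the pipeline needs a lot of disk space, it can help to check the free space before starting. The following is a minimal sketch, not part of this repository: the function name has_free_space and the 10 GiB default are illustrative assumptions based on the note above.

```python
import shutil


def has_free_space(path=".", required_gb=10.0):
    """Return True if the filesystem holding `path` has at least
    `required_gb` GiB free. The 10 GiB default is an assumption taken
    from the note above, not a value defined by this project."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 1024 ** 3


if __name__ == "__main__":
    if not has_free_space("."):
        print("Warning: less than 10 GB free; the pipeline may run out of disk space.")
```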


Please download the data here (or via OneDrive). Unzip the file and put the data directory into the project directory.
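The unzip step can also be scripted. This is a sketch under stated assumptions: the helper name extract_data is hypothetical, the archive path is whatever file you downloaded, and the archive is assumed to contain a top-level data directory.

```python
import zipfile
from pathlib import Path


def extract_data(archive_path, project_path):
    """Extract the downloaded archive into the project directory and
    return the names of the entries now present there.

    Both arguments are supplied by the caller; this repository does
    not define a fixed archive file name.
    """
    target = Path(project_path)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as zf:
        zf.extractall(target)
    return sorted(p.name for p in target.iterdir())
```

After extraction, the returned list should include the data directory if the archive is laid out as assumed.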

How to run

cd $project_path
export PYTHONPATH="$project_path:$PYTHONPATH"
python3 scripts/

# global model
python3 global_/
python3 global_/
python3 global_/

# local model
python3 local/gae/

# estimate cluster size
python3 cluster_size/

Note: The training data in this demo is smaller than the data used in the paper, so the performance (F1-score) will be slightly lower than the reported scores.
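For readers who want to compare clustering results against the reported scores, the sketch below computes a pairwise F1-score, a metric commonly used to evaluate name disambiguation. It is an assumption that this matches the F1-score mentioned above; the function name pairwise_f1 is illustrative, and the evaluation code in this repository may differ.

```python
from itertools import combinations


def pairwise_f1(true_labels, pred_labels):
    """Compute pairwise precision, recall, and F1 for a clustering.

    A pair of documents counts as a true positive when both the ground
    truth and the prediction place the two documents in the same cluster.
    """
    if len(true_labels) != len(pred_labels):
        raise ValueError("label lists must have the same length")

    tp = fp = fn = 0
    for i, j in combinations(range(len(true_labels)), 2):
        same_true = true_labels[i] == true_labels[j]
        same_pred = pred_labels[i] == pred_labels[j]
        if same_pred and same_true:
            tp += 1
        elif same_pred:
            fp += 1
        elif same_true:
            fn += 1

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

Each label list assigns one cluster ID per document, in the same document order for both lists.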