

Code and resources for the Author Name Disambiguation (AND) tool developed in the Scalable Author Disambiguation (SCAD) project.

Currently under construction!


$ mkdir scad
$ cd scad
$ git clone
$ cd scad-tool
$ conda create --name scad-env python=3.6
$ source activate scad-env
$ pip install -r scad-requirements-wo-wombat.txt 
$ git clone
$ pip install WOMBAT/.
$ git clone
$ unzip 'resources/wombat/*.zip' -d resources/wombat/

Starting the SCAD server

The following will start an instance of the SCAD server on the local machine on port 50001.

$ python scad-server/ localhost 50001 &

The command above starts the server in the background and prints its PID, which can be used to stop the server:

$ kill PID
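
Because the server is started in the background, the client may be launched before the server is ready to accept connections. The following is a minimal sketch of a readiness check, assuming only that the server listens for TCP connections on the host and port given above. The helper name wait_for_port is hypothetical and not part of this project.

```python
import socket
import time


def wait_for_port(host, port, timeout=10.0, interval=0.2):
    """Return True once a TCP connection to (host, port) succeeds.

    Returns False if no connection could be made within `timeout` seconds.
    Hypothetical helper, not part of the SCAD code base.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # create_connection raises OSError while nothing is listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False


# Example use before starting the client (host and port as in the command above):
#   if not wait_for_port("localhost", 50001):
#       raise SystemExit("SCAD server is not reachable")
```

This only checks that the port accepts connections. It does not verify that the process behind it is actually the SCAD server.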

Starting the demo SCAD client

This project includes a simple Python client which processes a JSON file and disambiguates it by making API calls against the SCAD server. The following will process publications belonging to the block "a smith" from the KISTI corpus, using the semantic matching method avg_of_cos with a dblp-trained word2vec resource (cf. below):

$ python scad-client/ \
   --scad_url             http://localhost:50001 \
   --pubfile              publicationdata/full-kisti-plain-sng-sorted.json \
   --blocking_pattern     "'name': '(a[^\']* smith)'" \
   --name_matching_method match:shortname \
   --paramfile            resources/scad_params.json \
   --resourcefile         resources/scad_resources.json

The matching methods to use are specified in resources/scad_params.json.
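
To illustrate what the --blocking_pattern argument selects, here is a minimal sketch that applies the same regular expression to a few made-up record strings. It assumes the pattern is matched with a regex search against a string form of each record. How the client actually applies it, and the real layout of the KISTI file, may differ. The sample records and the block_key helper are hypothetical.

```python
import re

# Pattern copied verbatim from the --blocking_pattern argument above.
BLOCKING_PATTERN = re.compile(r"'name': '(a[^\']* smith)'")

# Made-up record strings for illustration only (not taken from the KISTI corpus).
records = [
    "{'name': 'a smith', 'title': 'Paper one'}",
    "{'name': 'adam smith', 'title': 'Paper two'}",
    "{'name': 'b smith', 'title': 'Paper three'}",
    "{'name': 'a smithson', 'title': 'Paper four'}",
]


def block_key(record):
    """Return the captured author name if the record falls into the block, else None."""
    match = BLOCKING_PATTERN.search(record)
    return match.group(1) if match else None


# Keep only the records whose name starts with "a" and ends with " smith".
selected = [block_key(r) for r in records if block_key(r) is not None]
```

In this sketch, the first two records fall into the block, while "b smith" (wrong initial) and "a smithson" (wrong surname) do not.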

Visualized example results can be found at (Use 'Open in new tab/window')
