scad-tool

Code and resources for the Author Name Disambiguation (AND) tool developed in the Scalable Author Disambiguation (SCAD) project.

Currently under construction!

Setup

$ mkdir scad
$ cd scad
$ git clone https://github.com/nlpAThits/scad-tool.git
$ cd scad-tool
$ conda create --name scad-env python=3.6
$ source activate scad-env
$ pip install -r scad-requirements-wo-wombat.txt 
$ git clone https://github.com/nlpAThits/WOMBAT.git
$ pip install WOMBAT/.
$ git clone https://github.com/conll/reference-coreference-scorers.git
$ unzip 'resources/wombat/*.zip' -d resources/wombat/

Starting the SCAD server

The following will start an instance of the SCAD server on the local machine on port 50001.

$ python scad-server/app.py localhost 50001 &

The above starts the server in the background and prints its PID. To stop the server, run:

$ kill PID
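
Before starting the client, it can help to confirm that something is listening on the chosen port. The following is a minimal sketch using only the Python standard library. The helper name is_port_open is hypothetical and not part of this project, and the check only tests for an open TCP port, not that the process behind it is the SCAD server.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened.

    Hypothetical helper, not part of scad-tool. It only verifies that
    some process accepts connections on the port.
    """
    try:
        # create_connection raises OSError (e.g. connection refused, timeout)
        # when nothing is listening on the given address.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Host and port taken from the server start command above.
    print(is_port_open("localhost", 50001))
```

Run as a script, this prints True once the server accepts connections and False otherwise.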

Starting the demo SCAD client

This project includes a simple Python client that processes a JSON file and disambiguates it by making API calls against the SCAD server. The following command processes the publications belonging to the block "a smith" from the KISTI corpus, using the semantic matching method avg_of_cos with a dblp-trained word2vec resource (cf. below):
$ python scad-client/run_simple_scad_client.py \
   --scad_url             http://localhost:50001 \
   --pubfile              publicationdata/full-kisti-plain-sng-sorted.json \
   --blocking_pattern     "'name': '(a[^\']* smith)'" \
   --name_matching_method match:shortname \
   --paramfile            resources/scad_params.json \
   --resourcefile         resources/scad_resources.json \
   --evaluate

The matching methods to use are specified in resources/scad_params.json.
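
The --blocking_pattern argument in the command above is a regular expression with one capture group. The sketch below shows which name strings that exact pattern matches, using Python's re module. The sample record strings and the helper block_key are hypothetical, and how the client itself applies the pattern to the publication file is an assumption not confirmed here.

```python
import re

# Pattern copied verbatim from the --blocking_pattern argument.
# Inside the shell's double quotes the backslash is preserved, so the
# regex engine sees [^\'], i.e. "any character except a single quote".
BLOCKING_PATTERN = r"'name': '(a[^\']* smith)'"

# Hypothetical record fragments for illustration only; the real layout
# of the KISTI publication file may differ.
SAMPLE_RECORDS = [
    "'name': 'a smith'",
    "'name': 'adam smith'",
    "'name': 'a j smith'",
    "'name': 'b smith'",
    "'name': 'a smithson'",
]


def block_key(text):
    """Return the captured name if the pattern matches, else None."""
    match = re.search(BLOCKING_PATTERN, text)
    return match.group(1) if match else None


if __name__ == "__main__":
    for record in SAMPLE_RECORDS:
        print(record, "->", block_key(record))
```

Under these assumptions, the pattern selects names that start with "a" and end in " smith" (such as "a smith", "adam smith" and "a j smith"), and rejects "b smith" and "a smithson".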

Visualized example results can be found at https://nlpathits.github.io/scad-tool/ (Use 'Open in new tab/window')
