Translate the SPECIALIST lexicon from XML to RDF

The SPECIALIST Lexicon RDF Translator

This code translates the SPECIALIST lexicon from XML to RDF. It has been tested on a MacBook Pro running OS X El Capitan (10.11.6) and Java SE 1.8.

Compiling

Download the RDF4J 2.2.2 onejar, save it to the lib folder, and install it into your local Maven repository:

mvn install:install-file -DgroupId=org.eclipse.rdf4j -DartifactId=rdf4j \
-Dversion=2.2.2 -Dpackaging=jar -Dfile=lib/eclipse-rdf4j-2.2.2-onejar.jar \
-DgeneratePom=true

Run mvn clean install to create the JAR file.

Download the SPECIALIST lexicon 2017 XML file from https://lexsrv3.nlm.nih.gov/LexSysGroup/Projects/lexicon/2017/release/LEX/XML/LEXICON.xml.

Running

Run with java -jar specialist-rdf-jar-with-dependencies.jar -s <source XML file> -d <destination NQuads file>

Dependencies

This code includes a modified version of the SPECIALIST lexicon XSD schemas and uses the dependencies in the table below:

Name                             Licence
SPECIALIST lexicon 2017 release  Terms and Conditions for Use of the SPECIALIST NLP Tools
Sesame-Schemagen                 MIT License
RDF4J                            Eclipse Distribution License - v 1.0
JUnit 4                          Eclipse Public License - v 1.0
JAXB2                            Apache License, Version 2.0
Apache Maven                     Apache License, Version 2.0
Apache Commons CLI               Apache License, Version 2.0
SLF4J                            MIT License
log4j                            Apache License, Version 2.0

TODOs

Some strings in the lexicon could be meaningfully parsed to produce additional links, e.g. <compl>pphr(of,np)</compl>. These have been left as TODOs.
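
As a starting point for this TODO, the sketch below splits a complement string such as pphr(of,np) into a head (pphr) and its comma-separated arguments (of, np). It is not part of the translator: the class name ComplementCode and its fields are hypothetical, and the parser assumes that complement strings follow a simple head(arg,arg,...) shape with balanced parentheses. How the resulting parts should map onto RDF links is left open.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/**
 * Hypothetical helper (not in the current code base): splits a complement
 * string such as "pphr(of,np)" into a head and a list of arguments.
 */
final class ComplementCode {
    final String head;
    final List<String> arguments;

    private ComplementCode(String head, List<String> arguments) {
        this.head = head;
        this.arguments = Collections.unmodifiableList(arguments);
    }

    static ComplementCode parse(String text) {
        String s = text.trim();
        int open = s.indexOf('(');
        if (open < 0) {
            // No argument list, e.g. a bare code such as "np".
            return new ComplementCode(s, new ArrayList<String>());
        }
        if (!s.endsWith(")")) {
            throw new IllegalArgumentException("Unbalanced parentheses: " + text);
        }
        String head = s.substring(0, open);
        String inner = s.substring(open + 1, s.length() - 1);
        List<String> args = new ArrayList<String>();
        if (inner.isEmpty()) {
            return new ComplementCode(head, args);
        }
        // Split on commas that are not nested inside further parentheses.
        int depth = 0;
        int start = 0;
        for (int i = 0; i < inner.length(); i++) {
            char c = inner.charAt(i);
            if (c == '(') {
                depth++;
            } else if (c == ')') {
                depth--;
            } else if (c == ',' && depth == 0) {
                args.add(inner.substring(start, i).trim());
                start = i + 1;
            }
        }
        args.add(inner.substring(start).trim());
        return new ComplementCode(head, args);
    }
}
```

Calling ComplementCode.parse("pphr(of,np)") yields the head pphr and the arguments of and np, which could then be emitted as separate statements instead of a single string literal.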