Slovene Named Entity Extractor

Usage

Build jar or run tests:

gradle build        # builds the jar
gradle fatJar       # builds fat jar with all dependencies included
gradle test

Running with the run.sh wrapper:

./sbin/run.sh --out-model modelx.ser.gz --in ./corpus/jos16534_entities.tsv

Running the fat jar:

java -cp "build/libs/slner-all-1.1.jar" si.ijs.slner.SloveneNER

Training

./sbin/run.sh --in ./corpus/jos100k-train.xml.zip --out-model model.ser.gz

Usage (OLD):

Compile into single .jar file: ./build.sh

Download training dataset: ./download.sh

Train with downloaded corpus: ./train.sh

Evaluate with training corpus: java -jar slner-all.jar --in corpus.tei.xml

Train: java -jar slner-all.jar --out-model model.ser.gz --in corpus.tei.xml

Tag: java -jar slner-all.jar --in corpus.tei.xml --in-model model.ser.gz --out corpus_with_entities.tei.xml
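
The train and tag commands above can be chained in a small wrapper script. This is only a sketch: the function names, the JAR variable, and the DRY_RUN switch are hypothetical helpers and not part of the repository. The command-line flags are copied from the lines above. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical wrapper around the "Train" and "Tag" commands shown above.
# JAR and DRY_RUN are assumptions made for this sketch, not project settings.
JAR="${JAR:-slner-all.jar}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the command only (default), 0 = execute it

# Print or execute a command, depending on DRY_RUN.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Train a model from a corpus. Usage: train CORPUS MODEL
train() {
    run java -jar "$JAR" --out-model "$2" --in "$1"
}

# Tag a corpus with a trained model. Usage: tag CORPUS MODEL OUTPUT
tag() {
    run java -jar "$JAR" --in "$1" --in-model "$2" --out "$3"
}

train corpus.tei.xml model.ser.gz
tag corpus.tei.xml model.ser.gz corpus_with_entities.tei.xml
```

Setting DRY_RUN=0 executes the commands instead of printing them, which assumes slner-all.jar has already been built and the corpus file exists.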

For implementation details, see Štajner, T., Erjavec, T., Krek, S. (2013): Razpoznavanje imenskih entitet v slovenskem besedilu [Named entity recognition in Slovene text]. Slovenščina 2.0, 1 (2): 58–81.

License