Showcase is a PyTorch implementation of the Japanese Predicate-Argument Structure (PAS) analyser presented by Matsubayashi & Inui (2018), with some improvements. Given an input sentence, Showcase identifies the verbal and nominal predicates in the sentence and detects their nominative (が), accusative (を), and dative (に) case arguments. The output case labels follow the label definitions of the NAIST Text Corpus, in which case markers in different voices are generalized into the case markers of the active voice.
http://www.cl.ecei.tohoku.ac.jp/showcase/
echo '今日は雨が降る' | showcase
cat example.txt | showcase
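The same pipe can be driven from Python. The following is a minimal sketch, assuming the showcase command is on PATH and reads sentences from standard input as in the commands above. The helper name run_showcase and its command parameter are hypothetical and are not part of the package.

```python
import subprocess


def run_showcase(sentences, command=("showcase",)):
    """Pipe sentences (one per line) into a command and return its stdout.

    Assumption: `command` defaults to the `showcase` CLI being on PATH.
    It is a parameter only so the helper can be tried with another program.
    """
    text = "\n".join(sentences) + "\n"
    result = subprocess.run(
        list(command),
        input=text.encode("utf-8"),  # send UTF-8 bytes to stdin
        stdout=subprocess.PIPE,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout.decode("utf-8")
```

Calling run_showcase(["今日は雨が降る"]) would then correspond to the echo example above; the format of the returned text depends on the installed version of Showcase.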
- One raw sentence per line.
- A blank line can be used to segment a document. (Showcase just resets an argument index to zero.)
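An input file in this format can be assembled with a few lines of Python. This is only a sketch: format_input is a hypothetical helper, not part of Showcase, and it assumes each document is already split into raw sentences.

```python
def format_input(documents):
    """Join documents into input text: one sentence per line,
    with a single blank line between documents.

    `documents` is a list of documents, each a list of raw sentences.
    Empty sentences and empty documents are skipped.
    """
    blocks = []
    for sentences in documents:
        cleaned = [s.strip() for s in sentences if s.strip()]
        if cleaned:
            blocks.append("\n".join(cleaned))
    if not blocks:
        return ""
    return "\n\n".join(blocks) + "\n"
```

Writing the returned string to example.txt gives a file that can be passed to showcase as in the cat example.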
- Python 3.5 (or higher)
- We do not support Python 2
- CaboCha with JUMAN dict
- PyTorch 0.4.0
pip install showcase-parser
Resources include the following files:
- 10 Model files for predicate detector (pred_model_0{0..9}.h5)
- 10 Model files for argument detector (arg_model_0{0..9}.h5)
- Word embedding Matrix (word_embedding.npz)
- POS embedding Matrix (pos_embedding.npz)
- Word index file (word.index)
- Part-of-Speech tag index file (pos.index)
All resources are available on Google Drive.
- train/*.h5: models trained with the training set described in the paper.
- train-test/*.h5: models trained with the training and test sets.
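After downloading, it can be useful to confirm that nothing is missing. The sketch below expands the file names listed above and reports any that are absent. It assumes all files sit in one flat directory, which may not match how you store them; missing_resources is a hypothetical helper, not part of Showcase.

```python
from pathlib import Path

# File names taken from the resource list; 0{0..9} expands to 00 ... 09.
EXPECTED_FILES = (
    ["pred_model_{:02d}.h5".format(i) for i in range(10)]
    + ["arg_model_{:02d}.h5".format(i) for i in range(10)]
    + ["word_embedding.npz", "pos_embedding.npz", "word.index", "pos.index"]
)


def missing_resources(resource_dir):
    """Return the expected file names that are not found in resource_dir.

    Assumption: every resource file is stored directly in resource_dir.
    """
    root = Path(resource_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

An empty list from missing_resources("/path/to/resources") means all 24 files were found.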
Run showcase setup to create the config.json file in $HOME/.config/showcase.
Then edit config.json and specify valid paths for:
- Resources downloaded in Step 2
- CaboCha and its JUMAN dictionary
The original config.json can be found at showcase/data/config.json in this repo.
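A quick way to catch a mistyped path after editing is to scan the config for path-like values that do not exist. The sketch below makes no assumption about the key names in config.json; it uses a simple heuristic, treating any string that starts with "/" or "~" as a path. missing_paths is a hypothetical helper, not part of Showcase.

```python
import json
import os


def iter_strings(value):
    """Yield every string found anywhere inside a decoded JSON value."""
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        for item in value.values():
            yield from iter_strings(item)
    elif isinstance(value, list):
        for item in value:
            yield from iter_strings(item)


def missing_paths(config_path):
    """Return path-like strings in the config that do not exist on disk.

    Heuristic (an assumption, not Showcase behaviour): a string counts
    as a path if it starts with "/" or "~".
    """
    with open(config_path, encoding="utf-8") as f:
        config = json.load(f)
    missing = []
    for text in iter_strings(config):
        if text.startswith(("/", "~")):
            if not os.path.exists(os.path.expanduser(text)):
                missing.append(text)
    return missing
```

For example, missing_paths(os.path.expanduser("~/.config/showcase/config.json")) lists the entries that still need to be fixed. Relative paths are not checked by this heuristic.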
You may specify the path to config.json as follows:
showcase -c /path/to/config/config.json
Note that the appropriate thresholds (hyperparameters) for arguments differ for each model. The thresholds for the provided models are given in the sample config file in each Google Drive directory.
TBA
TBA
TBA
- Run get_vocab_from_word2vec.py and convert_word2vec_to_npy.py
@InProceedings{matsubayashi:2018:coling,
author = {Matsubayashi, Yuichiroh and Inui, Kentaro},
title = {Distance-Free Modeling of Multi-Predicate Interactions in End-to-End Japanese Predicate Argument Structure Analysis},
booktitle = {Proceedings of the 27th International Conference on Computational Linguistics (COLING)},
year = {2018},
}