Discourse Parsers Details

How to regenerate the discourse models

Note: this is of limited use to regular users, but it may help developers who want to modify (i.e., retrain) the discourse parsers.

To regenerate the constituent-syntax model, run this command:

    sbt 'run-main org.clulab.discourse.rstparser.RSTParserMain -train /data/nlp/corpora/RST_cached_preprocessing/rst_train -model model.const.rst.gz'

This generates the model file, model.const.rst.gz, in the current directory. To evaluate it, move model.const.rst.gz to main/src/main/resources/ and run:

    sbt 'run-main org.clulab.discourse.rstparser.RSTParserMain -test /data/nlp/corpora/RST_cached_preprocessing/rst_test -model model.const.rst.gz'

To regenerate the dependency-syntax model, run this command:

    sbt 'run-main org.clulab.discourse.rstparser.RSTParserMain -train /data/nlp/corpora/RST_cached_preprocessing/rst_train -model model.dep.rst.gz -dep'

Similarly, this generates model.dep.rst.gz, the model that uses only dependency information, in the current directory. To evaluate it, move model.dep.rst.gz to main/src/main/resources/ and run:

    sbt 'run-main org.clulab.discourse.rstparser.RSTParserMain -test /data/nlp/corpora/RST_cached_preprocessing/rst_test -model model.dep.rst.gz -dep'

Note that the dependency-based model is both faster and more accurate than the constituent-based one.
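The four commands above differ only in the mode (`-train` or `-test`), the model file name, and the optional `-dep` flag. As a convenience, that pattern can be captured in a small shell helper. This is only a sketch: the function name `rst_cmd` and the `CORPUS_DIR` variable are hypothetical and not part of the project. The helper prints the command instead of running it, so the command can be inspected first.

```shell
#!/bin/sh
# Hypothetical helper (not part of the project): prints the sbt command
# for training or testing either discourse model variant.

# Corpus location used on this page; override CORPUS_DIR if yours differs.
CORPUS_DIR="${CORPUS_DIR:-/data/nlp/corpora/RST_cached_preprocessing}"
MAIN_CLASS="org.clulab.discourse.rstparser.RSTParserMain"

# rst_cmd MODE VARIANT
#   MODE:    train | test
#   VARIANT: const (constituent syntax) | dep (dependency syntax)
# Prints the command line on stdout; returns 1 on an unknown argument.
rst_cmd() {
    mode="$1"
    variant="$2"
    case "$mode" in
        train|test) ;;
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    case "$variant" in
        const) extra="" ;;
        dep)   extra=" -dep" ;;   # the dependency model needs the -dep flag
        *) echo "unknown variant: $variant" >&2; return 1 ;;
    esac
    printf "sbt 'run-main %s -%s %s/rst_%s -model model.%s.rst.gz%s'\n" \
        "$MAIN_CLASS" "$mode" "$CORPUS_DIR" "$mode" "$variant" "$extra"
}
```

For example, `rst_cmd train dep` prints the dependency-model training command shown earlier, and `eval "$(rst_cmd train dep)"` would run it. The helper does not move the model file into main/src/main/resources/, so that step still has to be done by hand before testing.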
