Less is More

The program used in the paper 'Less is More, More or Less... – Finding the Optimal Threshold for Lexicalisation in Chunking' by Balázs Indig

This is a stripped and evolved version of Gut, Besser, Chunker. The code focuses on parameter optimisation.

Prerequisites

Data (in train, devel test folders): the original train.txt and test.txt were obtained from the CoNLL-2000 data
CRFsuite installed and in path
Python3, JRE 1.8 (for IOButils), GNU parallel (optional) installed

Usage

python3 start.py --train train/train_new.9.txt.corenlp.IOB2.IOBES --test test/test.txt.1.1.corenlp.IOB2.IOBES --freq 13 --out result.txt
The empty folders will contain the partial outputs
One can use GNU parallel to speed up running for multiple parameters
See the paper and the code for more documentation

License

The repository contains many 3rd-party code, that has its own license. This code is made available under the GNU Lesser General Public License v3.0.

Reference

If you use this program please cite:

    @inproceedings{indig2017less,
      author    = {{I}ndig, {B}al\'azs},
      title     = {{Less is More, More or Less...} -- {F}inding the {O}ptimal {T}hreshold for {L}exicalisation in {C}hunking},
      booktitle = {Computational Linguistics and Intelligent Text Processing - 18th International Conference, CICLing 2017, Budapest, Hungary, April 17-23, 2017},
      year      = {2017}
    }

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
devel		devel
test		test
train		train
utils		utils
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
start.py		start.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Less is More

Prerequisites

Usage

License

Reference

About

Releases

Packages

Languages

License

ppke-nlpg/less-is-more

Folders and files

Latest commit

History

Repository files navigation

Less is More

Prerequisites

Usage

License

Reference

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages