CLI Annotate
David Campos edited this page Oct 14, 2016 · 1 revision
To rapidly annotate a large number of documents on your machine or server, you can use the neji.sh
executable, which provides self-explanatory usage information.
With the CLI, you can annotate multiple documents in parallel, taking advantage of the multi-threaded processing features.
Neji is distributed with an example, located in the "example/annotate" and "resources" folders. The following resources are provided:
- Corpus: a set of 10 abstracts from MEDLINE;
- Dictionaries: two small dictionaries, one for disorders and another for anatomy;
- Machine Learning (ML) model: one machine learning model for gene and protein names, including preferred and synonym dictionaries for normalization.
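Before running the commands on this page, it can help to confirm that the example folders are actually present. The following is a minimal sketch, assuming it is run from the Neji installation folder; `missing_dirs` is a hypothetical helper written for illustration and is not part of Neji. The paths are the ones used in the commands on this page.

```shell
#!/bin/sh
# Hypothetical helper (not part of Neji): print every argument that is
# not an existing directory, and return non-zero if any are missing.
missing_dirs() {
    status=0
    for dir in "$@"; do
        if [ ! -d "$dir" ]; then
            echo "missing: $dir"
            status=1
        fi
    done
    return "$status"
}

# Paths taken from the commands on this page; run from the Neji folder.
if missing_dirs example/annotate/in resources/dictionaries resources/models; then
    echo "all example resources found"
fi
```

If any folder is reported as missing, check that the current directory is the one containing neji.sh.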
To annotate the corpus using both dictionaries and ML models, execute the following command:
./neji.sh -i example/annotate/in/ -if RAW \
          -o example/annotate/out/ -of JSON \
          -d resources/dictionaries/ \
          -m resources/models/ \
          -t 1
If you prefer not to use the ML model, remove the associated option:
./neji.sh -i example/annotate/in/ -if RAW \
          -o example/annotate/out/ -of JSON \
          -d resources/dictionaries/ \
          -t 1
If you want to use only the concepts provided by the ML model, remove the dictionary option:
./neji.sh -i example/annotate/in/ -if RAW \
          -o example/annotate/out/ -of JSON \
          -m resources/models/ \
          -t 1
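The three variants differ only in whether the -d and -m options are present, so they can be generated from one template. The sketch below is an illustration under that assumption; `neji_command` is a hypothetical helper, not part of Neji, and it only prints the command line rather than running it. All options and paths are copied from the commands on this page.

```shell
#!/bin/sh
# Hypothetical helper (not part of Neji): print the neji.sh command line
# for a given combination of resources. Pass "yes" or "no" for each.
neji_command() {
    use_dictionaries=$1
    use_model=$2
    cmd="./neji.sh -i example/annotate/in/ -if RAW -o example/annotate/out/ -of JSON"
    if [ "$use_dictionaries" = "yes" ]; then
        cmd="$cmd -d resources/dictionaries/"
    fi
    if [ "$use_model" = "yes" ]; then
        cmd="$cmd -m resources/models/"
    fi
    echo "$cmd -t 1"
}

# Print the three variants described on this page without running them.
neji_command yes yes   # dictionaries and ML model
neji_command yes no    # dictionaries only
neji_command no yes    # ML model only
```

Each printed line can be pasted into a shell opened in the Neji folder to run that variant.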