CLI Annotate

David Campos edited this page Oct 14, 2016 · 1 revision

To annotate a large number of documents quickly on your machine or server, use the neji.sh executable, which provides self-explanatory usage information.

CLI annotate

The CLI can annotate multiple documents in parallel, taking advantage of multi-threaded processing.

Example

Neji is distributed with an example, located in the "example/annotate" and "resources" folders. It includes the following resources:

  • Corpus: a set of 10 abstracts from MEDLINE;
  • Dictionaries: two small dictionaries, one for disorders and another for anatomy;
  • Machine Learning (ML) model: one model for gene and protein names, including preferred and synonym dictionaries for normalization.

To annotate the corpus using both dictionaries and ML models, execute the following command:

./neji.sh -i example/annotate/in/ -if RAW \
          -o example/annotate/out/ -of JSON \
          -d resources/dictionaries/ \
          -m resources/models/ \
          -t 1

If you prefer not to use the ML model, remove the -m option:

./neji.sh -i example/annotate/in/ -if RAW \
          -o example/annotate/out/ -of JSON \
          -d resources/dictionaries/ \
          -t 1

If you want to use only the concepts provided by the ML model, remove the -d option:

./neji.sh -i example/annotate/in/ -if RAW \
          -o example/annotate/out/ -of JSON \
          -m resources/models/ \
          -t 1
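
The three variants above differ only in whether -d and -m are present, so a small wrapper can build the command from whichever folders you supply. The following is a minimal sketch, not part of Neji: the function name neji_cmd and its argument order are hypothetical, and it uses only the options shown on this page. It prints the command rather than running it, and it assumes the paths contain no spaces.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Neji): prints a neji.sh command line
# built only from the options shown in the examples on this page.
# Usage: neji_cmd IN_DIR OUT_DIR [DICT_DIR] [MODEL_DIR]
# Pass an empty string for DICT_DIR or MODEL_DIR to leave that option out.
neji_cmd() {
    in_dir=$1
    out_dir=$2
    dict_dir=${3:-}
    model_dir=${4:-}

    cmd="./neji.sh -i $in_dir -if RAW -o $out_dir -of JSON"
    # Add -d only when a dictionary folder is given.
    if [ -n "$dict_dir" ]; then
        cmd="$cmd -d $dict_dir"
    fi
    # Add -m only when a model folder is given.
    if [ -n "$model_dir" ]; then
        cmd="$cmd -m $model_dir"
    fi
    # Keep the same -t value as the examples.
    cmd="$cmd -t 1"
    printf '%s\n' "$cmd"
}

# Print the three variants from this page.
neji_cmd example/annotate/in/ example/annotate/out/ resources/dictionaries/ resources/models/
neji_cmd example/annotate/in/ example/annotate/out/ resources/dictionaries/ ""
neji_cmd example/annotate/in/ example/annotate/out/ "" resources/models/
```

Printing the command first makes it easy to check before running it; once it looks right, copy the printed line into the shell from the Neji folder.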