
ChemxSeer Tagger

ChemxSeer Tagger provides a chemical entity extractor that identifies mentions of chemical formulas and names in free text.



  • Download the distribution from here
  • Extract the compressed file
tar xvf chemxseer-tagger-dist.tar.gz
  • To run the tagger on a plain text document or a folder containing multiple documents, use the script:
./ indir outdir

where indir is the path to a directory containing text files, and outdir is the directory into which the tagged files will be written. For each input file, an output file is created that contains, on each line, the extracted entity, its beginning offset within the file, and its end offset. The values on each line are tab-separated.
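
As an illustration of the output format described above, here is a minimal sketch of how one output file's lines could be read back in Java. The class name `TaggerOutputParser`, the `Mention` type, and the sample lines are hypothetical and are not part of the tagger itself. The sketch assumes exactly three tab-separated fields per line, and makes no assumption about whether the end offset is inclusive or exclusive.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TaggerOutputParser {

    /** One extracted mention: the entity text plus its begin and end offsets. */
    public static final class Mention {
        public final String entity;
        public final int begin;
        public final int end;

        public Mention(String entity, int begin, int end) {
            this.entity = entity;
            this.begin = begin;
            this.end = end;
        }
    }

    /** Parses lines of the form: entity TAB begin TAB end. Empty lines are skipped. */
    public static List<Mention> parse(List<String> lines) {
        List<Mention> mentions = new ArrayList<>();
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue;
            }
            String[] fields = line.split("\t");
            if (fields.length != 3) {
                throw new IllegalArgumentException(
                        "Expected 3 tab-separated fields, got " + fields.length + ": " + line);
            }
            mentions.add(new Mention(
                    fields[0],
                    Integer.parseInt(fields[1].trim()),
                    Integer.parseInt(fields[2].trim())));
        }
        return mentions;
    }

    public static void main(String[] args) {
        // Made-up sample lines, for illustration only (not real tagger output).
        List<String> sample = Arrays.asList("H2O\t10\t13", "sodium chloride\t42\t57");
        for (Mention m : parse(sample)) {
            System.out.println(m.entity + " [" + m.begin + ", " + m.end + "]");
        }
    }
}
```

To process a real output file, the lines could be loaded with `java.nio.file.Files.readAllLines` and passed to `parse`.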

Note that only plain text files can currently be processed. To extract entities from PDFs and other formats, first convert them to text files using a tool such as Apache Tika.

From Source

  • Check out the code
  • Run: mvn install


Apache License 2.0