SimpleTagger

This repository provides a basic tagger which identifies strings and lemmas in text and links them to identifiers from a resource.

Current version: 0.1

Dependencies:

  • KafNafParserPy
  • Python 3

Usage:

python ontology_tagger_on_naf.py input_directory output_directory [list of resources]

Input:

  • Directory of NAF files with (at least) a token layer and a term layer
  • Output directory
  • At least one resource file providing strings and identifiers

Output:

NAF files containing all information from the input NAF files plus additional markables; each markable links a string (a set of tokens) to an identifier.
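To illustrate what linking a set of tokens to an identifier can look like, here is a minimal sketch in plain Python. It does not use KafNafParserPy and is not the tagger's actual code: the function name `find_markables`, the longest-match-first strategy, the `max_len` limit and the example identifiers are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def find_markables(tokens: List[str],
                   lexicon: Dict[str, str],
                   max_len: int = 5) -> List[Tuple[int, int, str]]:
    """Return (start, end, identifier) spans, with end exclusive.

    Hypothetical helper: matches lower-cased token sequences against
    the surface strings in `lexicon`, preferring the longest match.
    """
    lowered = [token.lower() for token in tokens]
    spans = []
    i = 0
    while i < len(lowered):
        matched = False
        # Try the longest candidate first, then shrink the window.
        for n in range(min(max_len, len(lowered) - i), 0, -1):
            candidate = " ".join(lowered[i:i + n])
            if candidate in lexicon:
                spans.append((i, i + n, lexicon[candidate]))
                i += n  # continue after the matched span
                matched = True
                break
        if not matched:
            i += 1
    return spans


# Made-up example data; the identifiers are placeholders.
tokens = ["He", "had", "a", "heart", "attack", "."]
lexicon = {"heart attack": "EX:0001", "heart": "EX:0002"}
spans = find_markables(tokens, lexicon)
print(spans)  # [(3, 5, 'EX:0001')]
```

Each returned span corresponds to one markable: the token positions it covers and the identifier it points to. Lemma-based matching could follow the same pattern with lemmas in place of lower-cased tokens.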

Resource structure:

Resources should be TSV (tab-separated) files in which each line has the following structure:

surface string (lower case) TAB lemmas of string TAB head word (lower case) TAB lemma of head word TAB identifier
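As a rough sketch of how a resource file in this five-column layout could be read, the following uses only the Python standard library. The names `ResourceEntry` and `read_resource`, the choice to skip lines without exactly five fields, and the sample identifier are assumptions for illustration, not part of the tagger itself.

```python
import csv
import io
from typing import Iterable, List, NamedTuple


class ResourceEntry(NamedTuple):
    """One line of a resource file (hypothetical field names)."""
    surface: str      # surface string, lower case
    lemmas: str       # lemmas of the string
    head: str         # head word, lower case
    head_lemma: str   # lemma of the head word
    identifier: str   # identifier the string is linked to


def read_resource(handle: Iterable[str]) -> List[ResourceEntry]:
    """Parse tab-separated lines into ResourceEntry tuples."""
    entries = []
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # Assumption: lines without exactly five fields are skipped.
        if len(row) != 5:
            continue
        entries.append(ResourceEntry(*(field.strip() for field in row)))
    return entries


# Made-up sample line; the identifier is a placeholder.
sample = "heart attack\theart attack\tattack\tattack\tEX:0001\n"
entries = read_resource(io.StringIO(sample))
print(entries[0].surface, "->", entries[0].identifier)
```

With a real file, the handle would come from something like `open(path, encoding="utf-8", newline="")` rather than `io.StringIO`.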

Upcoming features:

Options:

  • apply to single file instead of directory
  • only string match or only lemma match
  • option to set pointer to resource or to retrieve it from the resource itself

For questions contact:

antske.fokkens@vu.nl
