Producing decoded output from files with existing entity markup #8

antonyscerri · 2012-10-08T13:56:05Z

Whilst verifying results and evaluating models, I noticed that if you run the decoder over a file that already has its entities marked up (as used for testing), you end up with both sets of information in the output file. It would be nice if the decoder could strip out any markup of the kind used for training when decoding, or otherwise flag the two distinct sets. This was all done with respect to the basic text files.

Other formats may have alternative ways to express both in a single output file.
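
To illustrate the kind of preprocessing being asked for, here is a minimal Python sketch of stripping inline markup from a basic text file before decoding. It is not the decoder's own code. The tag syntax (`<PERSON>...</PERSON>`), the function name `strip_entity_tags`, and the sample sentence are all assumptions made for illustration; the markup actually used in the training files may look different.

```python
import re

# Hypothetical inline markup such as <PERSON>Ada Lovelace</PERSON> or
# <ENAMEX TYPE="PERSON">...</ENAMEX>. The real tag syntax may differ.
TAG_PATTERN = re.compile(r"</?[A-Za-z][A-Za-z0-9_-]*(?:\s[^<>]*)?>")


def strip_entity_tags(text: str) -> str:
    """Remove opening and closing tags, keeping the text they enclose."""
    return TAG_PATTERN.sub("", text)


sample = "<PERSON>Ada Lovelace</PERSON> worked in <LOCATION>London</LOCATION>."
print(strip_entity_tags(sample))
```

Running the decoder on the stripped text would then leave only the newly predicted entities in the output, rather than both sets.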

Added "--strip" option to remove existing 'tags' from input files, addresses #8

wellner added a commit that referenced this issue Oct 15, 2012

Merge pull request #9 from elsevierlabs/strip-tags

1c7ac7b

Added "--strip" option to remove existing 'tags' from input files, addresses #8
