How to correlate between parser input string and output tree/tokens

Sometimes an input string contains, for example, multiple spaces, multiple sentences, or other characters that the parser removes, and sometimes the parser returns a forest (a list of trees, each representing a sentence).

I want to be able to map exactly from an input segment (start/end character offsets) to the corresponding token in the output forest. Is there an easy way to do this? The parser knows which characters were removed or trimmed, so it would be useful to have an API parameter (for the CoreNLP server) that returns, in addition to the tokens and their POS tags, each token's start/end character offsets in the corresponding input sentence.
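To make the request concrete, here is a minimal Python sketch of the lookup I have in mind. It assumes each token in the server's JSON response carries `characterOffsetBegin` and `characterOffsetEnd` fields that index into the original input string; I believe the JSON output has them, but treat the field names as an assumption to check against your CoreNLP version. The `sample_response` below is hand-written for illustration, not real server output, and `tokens_in_span` is a hypothetical helper name.

```python
from typing import Any


def tokens_in_span(response: dict[str, Any], start: int, end: int) -> list[tuple[int, int, str]]:
    """Return (sentence_index, token_index, word) for every token that
    overlaps the half-open character range [start, end) of the input."""
    hits = []
    for s_idx, sentence in enumerate(response["sentences"]):
        for token in sentence["tokens"]:
            # Assumed field names; offsets are taken to be relative to the
            # whole input string, not to the individual sentence.
            begin = token["characterOffsetBegin"]
            finish = token["characterOffsetEnd"]
            if begin < end and start < finish:
                hits.append((s_idx, token["index"], token["word"]))
    return hits


# Input with repeated spaces and two sentences.
text = "Hello   world.  Bye now."

# Hand-built stand-in for a parsed response (illustrative only).
sample_response = {
    "sentences": [
        {"index": 0, "tokens": [
            {"index": 1, "word": "Hello", "characterOffsetBegin": 0, "characterOffsetEnd": 5},
            {"index": 2, "word": "world", "characterOffsetBegin": 8, "characterOffsetEnd": 13},
            {"index": 3, "word": ".", "characterOffsetBegin": 13, "characterOffsetEnd": 14},
        ]},
        {"index": 1, "tokens": [
            {"index": 1, "word": "Bye", "characterOffsetBegin": 16, "characterOffsetEnd": 19},
            {"index": 2, "word": "now", "characterOffsetBegin": 20, "characterOffsetEnd": 23},
            {"index": 3, "word": ".", "characterOffsetBegin": 23, "characterOffsetEnd": 24},
        ]},
    ]
}

# Which tokens cover characters 14..21 of the input?
matches = tokens_in_span(sample_response, 14, 21)
```

With offsets like these, slicing the input (`text[begin:finish]`) recovers the exact original characters of a token even when the token text itself was normalized, and skipped whitespace simply falls in the gaps between consecutive tokens.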

How to correlate between parser input string and output tree/tokens #687
