How to correlate between parser input string and output tree/tokens #687

@ndvbd

An input string can contain, for example, multiple consecutive spaces, multiple sentences, or other characters that the parser removes. In some cases the parser also returns a forest (a list of trees, one per sentence).

I want to map an input segment (start/end character offsets) exactly to the corresponding tokens in the output forest. Is there an easy way to do this? The parser knows which characters it removed or trimmed, so it would be useful to have an API parameter (for the CoreNLP server) that returns, in addition to the tokens and their POS tags, each token's start/end character offset in the corresponding input sentence.
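
To make the request concrete, here is a minimal sketch of the lookup I have in mind. It assumes the server's JSON response carries per-token `characterOffsetBegin` and `characterOffsetEnd` fields under `sentences[].tokens[]`; that shape is an assumption to verify against your server version. The `sample_response` below is hand-written for illustration, not real server output, and `tokens_in_span` is a hypothetical helper name.

```python
from typing import Any, Dict, List, Tuple


def tokens_in_span(
    response: Dict[str, Any], start: int, end: int
) -> List[Tuple[int, int, str]]:
    """Return (sentence_idx, token_index, word) for every token that
    overlaps the half-open character range [start, end) of the input."""
    hits = []
    for s_idx, sentence in enumerate(response.get("sentences", [])):
        for token in sentence.get("tokens", []):
            t_start = token["characterOffsetBegin"]  # assumed field name
            t_end = token["characterOffsetEnd"]      # assumed field name
            # Two half-open ranges overlap when each starts before the other ends.
            if t_start < end and start < t_end:
                hits.append((s_idx, token["index"], token["word"]))
    return hits


# Input with repeated spaces and two sentences.
text = "Hello   world.  Bye now."

# Hand-written stand-in for a parsed JSON response (illustrative only).
sample_response = {
    "sentences": [
        {"index": 0, "tokens": [
            {"index": 1, "word": "Hello", "characterOffsetBegin": 0, "characterOffsetEnd": 5},
            {"index": 2, "word": "world", "characterOffsetBegin": 8, "characterOffsetEnd": 13},
            {"index": 3, "word": ".", "characterOffsetBegin": 13, "characterOffsetEnd": 14},
        ]},
        {"index": 1, "tokens": [
            {"index": 1, "word": "Bye", "characterOffsetBegin": 16, "characterOffsetEnd": 19},
            {"index": 2, "word": "now", "characterOffsetBegin": 20, "characterOffsetEnd": 23},
            {"index": 3, "word": ".", "characterOffsetBegin": 23, "characterOffsetEnd": 24},
        ]},
    ]
}

# Characters 8..19 cover "world.  Bye", which spans both sentences.
print(tokens_in_span(sample_response, 8, 19))
```

Because the offsets refer to the original string, the skipped whitespace needs no special handling: `text[8:13]` is still `"world"` even though three spaces precede it.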
