Service: Search GET - sentence mark not included #574

Open
notesjor opened this issue Mar 28, 2023 · 1 comment

@notesjor
If you use the search and set the context parameter to sentence, you get the sentence, but the punctuation mark is not included as a token. However, the punctuation mark is contained in the snippet. Please also add the punctuation mark to the token output.

@Akron
Member

Akron commented Mar 31, 2023

Punctuation marks are not treated as tokens in KorAP, to stay in line with word distances in Cosmas-II. So this is a wontfix. But we may be able to support a simple "preceding"-data token structure that returns all tokens of a match, including the data preceding each token. This would possibly add an empty token at the end to account for "following" data, as in your example.
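To make the "preceding"-data idea concrete, here is a minimal sketch of what such a structure could look like. It is not KorAP code and does not reflect any existing API: the function name `tokens_with_preceding`, the `(preceding, token)` pair layout, and the use of `\w+` as the token definition are all assumptions made for illustration only.

```python
import re
from typing import List, Tuple


def tokens_with_preceding(snippet: str) -> List[Tuple[str, str]]:
    """Hypothetical helper: split a snippet into (preceding, token) pairs.

    Assumption: a token is a run of word characters; everything else
    (whitespace, punctuation) is non-token data attached to the next token.
    """
    pairs: List[Tuple[str, str]] = []
    pos = 0
    for match in re.finditer(r"\w+", snippet):
        # Data between the previous token and this one becomes "preceding".
        pairs.append((snippet[pos:match.start()], match.group()))
        pos = match.end()
    trailing = snippet[pos:]
    if trailing:
        # An empty token at the end carries the "following" data,
        # for example the sentence-final punctuation mark.
        pairs.append((trailing, ""))
    return pairs


pairs = tokens_with_preceding("Hello, world.")
# Only non-empty tokens count as words, so word distances stay unchanged.
word_count = len([token for _, token in pairs if token])
```

With this layout the punctuation is recoverable from the token output (concatenating all pairs reproduces the snippet), while the number of real tokens, and therefore any word distance, is the same as without punctuation.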
