Fix merging of adjacent spans #8

Merged on Dec 15, 2022 (1 commit)

Conversation

hatzel
Contributor

@hatzel hatzel commented Dec 14, 2022

Background: We merge separate spans that together cover one contiguous logical span of text, e.g. [(0, 17), (17, 35)] -> [(0, 35)]

Previously we sometimes encountered spans with negative length (end before start), like this one: (208, 147). This was caused by incorrect merging of adjacent annotations: instead of filtering the end positions against the original start positions, we filtered them against the already filtered start positions. As a result, no end positions were ever removed, so the function produced two lists of different lengths, and the longer one (the end positions) was truncated by the call to zip.
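
For readers without the diff at hand, here is a minimal sketch of the failure mode described above. It is not the project's actual code: the function names `merge_adjacent_spans_buggy` and `merge_adjacent_spans` are hypothetical, and the sketch assumes spans are sorted, non-overlapping, and never zero-length.

```python
def merge_adjacent_spans_buggy(spans):
    """Hypothetical reconstruction of the bug described above."""
    starts = [s for s, _ in spans]
    ends = [e for _, e in spans]
    # Drop every start that coincides with some end (an interior boundary).
    starts = [s for s in starts if s not in ends]
    # Bug: `starts` has already been filtered, so it no longer contains any
    # value that is also an end. Nothing is removed from `ends` here.
    ends = [e for e in ends if e not in starts]
    # The lists now differ in length; zip silently truncates the longer one.
    return list(zip(starts, ends))


def merge_adjacent_spans(spans):
    """Sketch of the fix: filter both lists against the ORIGINAL values."""
    starts = [s for s, _ in spans]
    ends = [e for _, e in spans]
    kept_starts = [s for s in starts if s not in ends]
    kept_ends = [e for e in ends if e not in starts]
    return list(zip(kept_starts, kept_ends))


spans = [(0, 17), (17, 35), (40, 50)]
buggy = merge_adjacent_spans_buggy(spans)  # [(0, 17), (40, 35)]
fixed = merge_adjacent_spans(spans)        # [(0, 35), (40, 50)]
```

In the buggy result, the second pair ends before it starts, which is the same kind of negative span as (208, 147). On Python 3.10 or later, calling `zip(..., strict=True)` would raise a `ValueError` on the length mismatch rather than truncating silently.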

@maltem-za
Member

Thanks for the fix and the explanation! It would be great if you could also give the function a name that actually indicates what it's doing, as well as review the docstring and any comments.

@hatzel
Contributor Author

hatzel commented Dec 15, 2022

Sorry, but I am not interested in writing further documentation as part of this bugfix PR.

@maltem-za maltem-za changed the base branch from main to fix-span-merge December 15, 2022 13:22
@maltem-za maltem-za merged commit 5c19847 into forTEXT:fix-span-merge Dec 15, 2022