Split replace penalty #21

Meesch · 2023-05-17T20:42:56Z

This PR implements a solution to issue #19. It adds an optional 'split_penalty' setting; when turned on, it adds a 0.1 penalty to the edit distance of Alignments that split tokens apart in a replacement.

I have racked my brain and I think there is no issue with setting a penalty of 0.1 for operations that split a token, because it will only affect which Alignment is chosen when the edit distances are otherwise equal. The unit tests all still pass, but I added it as a setting instead of hardcoding it in the source so that we can find a way to test it on a larger dataset, maybe with Jan's help/input. Please let me know what you think or if you see any issues with this approach.
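To illustrate the tie-breaking argument, here is a minimal sketch. It is not the code from this PR: `Candidate`, `score` and `choose` are hypothetical names made up for the example, and it assumes that the base edit distance is an integer and that the 0.1 penalty is applied at most once per candidate.

```python
from dataclasses import dataclass
from typing import List

SPLIT_PENALTY = 0.1  # value taken from the PR description


@dataclass
class Candidate:
    """Hypothetical stand-in for an alignment candidate (not the project's class)."""
    name: str
    edit_distance: int  # assumed to be an integer base cost
    splits_token: bool  # True if a replacement splits a token apart


def score(candidate: Candidate, split_penalty: bool) -> float:
    """Base edit distance, plus 0.1 if the setting is on and a token is split."""
    if split_penalty and candidate.splits_token:
        return candidate.edit_distance + SPLIT_PENALTY
    return float(candidate.edit_distance)


def choose(candidates: List[Candidate], split_penalty: bool) -> Candidate:
    """Pick the lowest-scoring candidate; min() keeps the first one on ties."""
    return min(candidates, key=lambda c: score(c, split_penalty))


tied = [
    Candidate("splitting", 2, True),
    Candidate("non_splitting", 2, False),
]
# Setting off: both score 2.0, so list order decides.
print(choose(tied, split_penalty=False).name)  # splitting
# Setting on: 2.1 vs 2.0, so the non-splitting candidate wins the tie.
print(choose(tied, split_penalty=True).name)  # non_splitting

not_tied = [
    Candidate("splitting", 2, True),
    Candidate("non_splitting", 3, False),
]
# Setting on: 2.1 is still below 3.0, so a strictly better distance is not overridden.
print(choose(not_tied, split_penalty=True).name)  # splitting
```

Under these assumptions the penalty only changes the outcome between candidates whose base distances are equal. If penalties could accumulate across many splits in one candidate, the sum could reach 1.0 and the argument would need to be revisited.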

Meesch · 2023-05-17T20:59:46Z

mypy is still causing the GitHub Actions run to fail. I would appreciate your insight on how to make it be chill about this.

Meesch added 3 commits May 17, 2023 14:32

add documentation

22d89f2

implement split_penalty setting

cc9eb2c

add comment

b5ec42a

Meesch requested a review from oktaal May 17, 2023 20:42

Meesch added 2 commits May 17, 2023 15:47

update type hinting

d0bb1f0

update type hinting for python3.7

1f8c73b

Reworked splitting distance: spaces are inserts

969fbb5

oktaal merged commit 67ef6bb into develop May 31, 2023

oktaal deleted the bugfix/alignment_bugs branch May 31, 2023 15:19
