The edits for inserting multiple words are sometimes wrong #6

Closed
kavehtp opened this issue Apr 28, 2016 · 4 comments

@kavehtp
Contributor

kavehtp commented Apr 28, 2016

For example:

S Thursday , is it not ?
A 0 1|||UNK|||It 's|||REQUIRED|||-NONE-|||0
A 3 5|||UNK|||n't it|||REQUIRED|||-NONE-|||0

Target:

It 's Thursday , is n't it ?
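One way to see why the first edit is wrong is to apply both annotation variants to the source sentence and compare the output with the target. The sketch below is only an illustration: `parse_m2_edit` and `apply_edits` are hypothetical helpers written for this thread, not functions from the scorer, and they assume every edit carries a plain token-string correction.

```python
def parse_m2_edit(line):
    """Parse an M2 annotation line into (start, end, correction tokens)."""
    fields = line[2:].split("|||")  # drop the leading "A "
    start, end = map(int, fields[0].split())
    return start, end, fields[2].split()


def apply_edits(source, edit_lines):
    """Apply M2 edits right to left so that earlier offsets stay valid."""
    tokens = source.split()
    edits = sorted((parse_m2_edit(line) for line in edit_lines), reverse=True)
    for start, end, correction in edits:
        tokens[start:end] = correction
    return " ".join(tokens)


source = "Thursday , is it not ?"
second_edit = "A 3 5|||UNK|||n't it|||REQUIRED|||-NONE-|||0"

# Span 0-1 replaces "Thursday" instead of inserting before it.
generated = apply_edits(
    source, ["A 0 1|||UNK|||It 's|||REQUIRED|||-NONE-|||0", second_edit])

# Span 0-0 is a pure insertion at the start of the sentence.
expected = apply_edits(
    source, ["A 0 0|||UNK|||It 's|||REQUIRED|||-NONE-|||0", second_edit])

print(generated)  # It 's , is n't it ?
print(expected)   # It 's Thursday , is n't it ?
```

With the span `0 1`, the token "Thursday" is overwritten; only the span `0 0` reproduces the target.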

@tamhd
Contributor

tamhd commented Apr 28, 2016

We also get an incorrect evaluation with M2Scorer (please double-check).

Hypothesis:

It 's Thursday , is n't it ?

M2 file:

S Thursday , is it not ?
A 0 0|||Mec|||It 's|||REQUIRED|||-NONE-|||0
A 3 5|||Mec|||n't it|||REQUIRED|||-NONE-|||0

Based on my understanding, the cause is the initialization step of the Levenshtein matrix. It creates the first row using hypothesis indices, i.e., the edits needed to transform an empty sentence into the hypothesis. I think the start and end offsets of such edits should always be 0, because they are offsets into the source sentence.

My proposed solution is to change line 827 of levenshtein.py to:

edit = ("ins", 0, 0, '', second[j-1], 0) # always insert at the beginning

It appears to me that the modification will fix the problem, yet I am not sure whether other errors may arise.
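To illustrate the reasoning, here is a minimal sketch of the first-row initialization in isolation. It is not the code from levenshtein.py: `first_row_edits` and its `use_hypothesis_index` flag are hypothetical, and the "before" behaviour (offsets taken from the hypothesis index `j-1`) is an assumption based on the description above. Only the tuple layout follows the proposed line.

```python
def first_row_edits(hypothesis, use_hypothesis_index):
    """Insertion edits for the first matrix row (empty source prefix).

    Each edit is (type, start, end, original, correction, cost), where
    start and end are meant to be offsets into the source sentence.
    """
    edits = []
    for j in range(1, len(hypothesis) + 1):
        # Assumed old behaviour: reuse the hypothesis index as the offset.
        # Proposed behaviour: always insert at source offset 0.
        pos = j - 1 if use_hypothesis_index else 0
        edits.append(("ins", pos, pos, "", hypothesis[j - 1], 0))
    return edits


prefix = ["It", "'s"]

before = first_row_edits(prefix, use_hypothesis_index=True)
after = first_row_edits(prefix, use_hypothesis_index=False)

print([(e[1], e[2]) for e in before])  # [(0, 0), (1, 1)]
print([(e[1], e[2]) for e in after])   # [(0, 0), (0, 0)]
```

Under that assumption, the second insertion lands at source offset 1, after "Thursday", so joining the two insertions would cover the span 0 to 1, which matches the wrong `A 0 1` edit reported above. With both insertions at offset 0, the joined edit stays at `0 0`.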

Tam

@kavehtp
Contributor Author

kavehtp commented Apr 28, 2016

I tried it. The generated edits are correct with the change. Can you make a pull request?

@tamhd
Contributor

tamhd commented Apr 29, 2016

I tried it. The modification does not change the results of the 13 teams that participated in the CoNLL-2014 shared task (Table 7).

@kavehtp
Contributor Author

kavehtp commented Apr 29, 2016

OK. Thanks.

@kavehtp kavehtp closed this as completed Apr 29, 2016