Skip to content

TEXT-188 Speed up LevenshteinDistance with threshold by exiting early#174

Closed
vesterstroem wants to merge 2 commits intoapache:masterfrom
vesterstroem:TEXT-188
Closed

TEXT-188 Speed up LevenshteinDistance with threshold by exiting early#174
vesterstroem wants to merge 2 commits intoapache:masterfrom
vesterstroem:TEXT-188

Conversation

@vesterstroem
Copy link
Contributor

Copy link
Member

@kinow kinow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vesterstroem CI passed on GitHub actions. I should be able to review it this weekend. Just want to sit down with ☕ and the IDE to debug and confirm I understand what changed, and also that it is being covered by our tests (quite sure it's).

Thanks for your pull request!
Bruno

@coveralls
Copy link

coveralls commented Oct 2, 2020

Coverage Status

Coverage increased (+0.001%) to 98.672% when pulling 381b800 on vesterstroem:TEXT-188 into 58b1cdc on apache:master.

}
// if the lower bound is greater than the threshold, then exit early
if (lowerBound > threshold) {
return -1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vesterstroem it looks to me like this nullifies the if (p[n] <= threshold) { further down.

I was looking at the coverage report, and it seems to be that the coverage decreased because that return -1 never happens.

If so, I think we can remove it and simply return p[n];. WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The improvement introduces the need for an extra test, which I forgot to add.

The assertion:

    assertThat(new LevenshteinDistance(1).apply("abc", "acb")).isEqualTo(-1);

should be inserted into e.g. testGetLevenshteinDistance_StringStringInt of LevenshteinDistanceTest.

That will give coverage to the uncovered line - and it will fail, if we just always return p[n].

Should I make a new pull request or update the existing one?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better update this one. We can squash/edit commits later if necessary 👍 After you push a new commit I think the CI jobs will run again, and coveralls will confirm the code is being tested now.

Thanks!

@kinow
Copy link
Member

kinow commented Oct 4, 2020

Infra is restarting our JIRA. So will merge it later today @vesterstroem . Thanks again for your contribution and for adding the test 👍

@kinow kinow closed this in 42c22ce Oct 4, 2020
@kinow
Copy link
Member

kinow commented Oct 4, 2020

Merged, with changelog added for 1.9.1. Thanks @vesterstroem

@vesterstroem
Copy link
Contributor Author

Merged, with changelog added for 1.9.1. Thanks @vesterstroem

You're welcome :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants