Commit

fixes issue #65
markvanderloo committed Mar 13, 2018
1 parent 08e0dd1 commit a2a57e4
Showing 4 changed files with 11 additions and 4 deletions.
4 changes: 2 additions & 2 deletions pkg/DESCRIPTION
@@ -14,14 +14,14 @@ Description: Implements an approximate string matching version of R's native
implementation of soundex is provided as well. Distances can be computed between
character vectors while taking proper care of encoding or between integer
vectors representing generic sequences.
-Version: 0.9.4.6
+Version: 0.9.4.7
Depends:
R (>= 2.15.3)
Imports:
parallel
URL: https://github.com/markvanderloo/stringdist
BugReports: https://github.com/markvanderloo/stringdist/issues
-Date: 2017-07-26
+Date: 2018-03-05
Suggests:
testthat
RoxygenNote: 6.0.1
4 changes: 4 additions & 0 deletions pkg/NEWS
@@ -1,3 +1,7 @@
+version 0.9.4.7
+- Fixed edge case where cosine distance with q=1, between strings of repeating characters
+yielded Inf (Thanks to Markus Dumke)
+
version 0.9.4.6
- Fixed argument passing error in lower_tri (thanks to Kurt Hornik)

5 changes: 3 additions & 2 deletions pkg/src/qgram.c
@@ -381,8 +381,9 @@ double qgram_dist(
dist[0] = 0.0;
} else {
// there are several ways to express the rhs (including ones that give 0L
-// at equal strings) but this has least chance of overflow.
-dist[0] = 1.0 - dist[0]/(sqrt(dist[1]) * sqrt(dist[2]));
+// at equal strings) but this has least chance of overflow
+// fabs is taken to avoid numerical -0.
+dist[0] = fabs(1.0 - dist[0]/(sqrt(dist[1]) * sqrt(dist[2])));
}
break;
case 2:
2 changes: 2 additions & 0 deletions pkg/tests/testthat/testStringdist.R
@@ -276,6 +276,8 @@ test_that("cosine distance computes correctly",{
round(stringdist("aaa","abc",method="cosine",q=2),8),
1.0
)
+# see issue #65
+expect_equal(stringdist("abc","abcabc",method='cosine',q=1),0)
# numerical accuracy test (thanks to Ben Haller)
# note that 1 - 2/(sqrt(2)*sqrt(2)) != 0, so this used to give ~2.2E-16.
expect_equal( stringdist("ab","ab",method="cosine"),0.0,tolerance=0.0 )