This repository has been archived by the owner on May 1, 2023. It is now read-only.

compareTwoStrings returning 1 for small and different strings #78

Open
Satrofu opened this issue Dec 24, 2020 · 1 comment

Comments

Satrofu commented Dec 24, 2020

Hello everyone, I hope you are all doing well. I found a case where two strings differ (search for <_15>): the first string has <_15>FLL while the second has <_15>ORD, yet the function returns 1 as if they were a perfect match. The version used for this comparison was 4.0.1. Below is an example ready to be run in Node.js (version 14.4.0):

const similarity = require("string-similarity");

const body1 = '<REQ><_0>MSG</_0><_1/><_2>55</_2><_3>ORG</_3><_4>F1</_4><_5>MIA</_5><_6>07560685</_6><_7>AC30</_7><_8>HFD</_8><_9>F1</_9><_10>T</_10><_11>US</_11><_12>USD</_12><_13>ZE</_13><_14>ODI</_14><_15>FLL</_15><_16>ORD</_16><_17>UNT</_17><_18>5</_18><_19>1</_19><_20>UNZ</_20><_21>1</_21><_22>000000</_22><_23/></REQ>';

const body2 = '<REQ><_0>MSG</_0><_1/><_2>55</_2><_3>ORG</_3><_4>F1</_4><_5>MIA</_5><_6>07560685</_6><_7>AC30</_7><_8>HFD</_8><_9>F1</_9><_10>T</_10><_11>US</_11><_12>USD</_12><_13>ZE</_13><_14>ODI</_14><_15>ORD</_15><_16>FLL</_16><_17>UNT</_17><_18>5</_18><_19>1</_19><_20>UNZ</_20><_21>1</_21><_22>000000</_22><_23/></REQ>';

console.log(similarity.compareTwoStrings(body1, body2));

Thanks!

dlnnlsn commented Nov 1, 2022

The documentation is misleading when it claims that having a similarity of 1 indicates identical strings. In fact, the Dice coefficient of your two strings is 1 since they contain exactly the same bigrams, just in a different order. An even shorter example is that the Dice coefficient of the strings aba and bab is 1, since in each case the two bigrams in the string are ab and ba.
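
To make the bigram argument concrete, here is a minimal sketch of a bigram-based Dice coefficient. It is a standalone illustration and not the library's actual source; the helper names bigramCounts and diceCoefficient are hypothetical, and the sketch assumes bigrams are counted as a multiset (repeated bigrams count separately).

```javascript
// Hypothetical helper: count each adjacent character pair (bigram) in a string.
function bigramCounts(str) {
  const counts = new Map();
  for (let i = 0; i < str.length - 1; i++) {
    const bigram = str.substring(i, i + 2);
    counts.set(bigram, (counts.get(bigram) || 0) + 1);
  }
  return counts;
}

// Hypothetical helper: Sørensen-Dice coefficient over bigram multisets.
// Assumption: strings shorter than two characters have no bigrams and score 0
// unless they are exactly equal.
function diceCoefficient(a, b) {
  if (a === b) return 1;
  if (a.length < 2 || b.length < 2) return 0;

  const remaining = bigramCounts(a);
  let shared = 0;
  for (let i = 0; i < b.length - 1; i++) {
    const bigram = b.substring(i, i + 2);
    const count = remaining.get(bigram) || 0;
    if (count > 0) {
      // Consume one occurrence so repeated bigrams are matched at most once each.
      remaining.set(bigram, count - 1);
      shared++;
    }
  }
  // 2 * |shared bigrams| / (bigrams in a + bigrams in b)
  return (2 * shared) / (a.length - 1 + (b.length - 1));
}

console.log(diceCoefficient("aba", "bab")); // 1: both strings have bigrams ab and ba
console.log(diceCoefficient("aba", "abc")); // 0.5: only ab is shared
```

Because the measure ignores the order in which bigrams occur, a score of 1 only says the two bigram multisets match. If an exact match is what matters, comparing the strings with === before (or instead of) computing a similarity score avoids this ambiguity.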
