-
Hi all, after applying the … Thank you!
-
You're right: if the data is identical, the records should have the same match key. Could it be something like whitespace or other unusual characters, meaning the values look identical but are in fact different? I don't suppose you're able to share a reproducible example? I appreciate that may be hard. By 'the same matching pair', do you mean identical records with different unique IDs? The results should not contain duplicate pairs (by unique ID) if things are working correctly.
-
Here is the repro, using the attached file:
I'm using Splink version 3.9.3 on Windows with the following input file:
Aha, so that reveals the source of the bug (and how you can work around it temporarily before we do a proper fix):
Your blocking rule:
l.name LIKE CONCAT('%',r.name,'%') or r.name LIKE CONCAT('%',l.name,'%')
contains an OR, but that OR is then 'evaluated too broadly' once the rule is interpolated into the SQL.
You want:
Note the additional …
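
To illustrate the kind of precedence problem described here, below is a minimal sketch using Python's built-in sqlite3 module. It is not the SQL Splink actually generates: the table `df`, the `build_pairs` helper, and the trailing `AND l.unique_id < r.unique_id` condition are assumptions made for the demonstration, and `||` stands in for `CONCAT` so that the query runs on any SQLite version. Because AND binds more tightly than OR, an unwrapped rule of the form `A OR B` followed by `AND C` is read as `A OR (B AND C)`, so the deduplicating condition no longer applies to the first branch. Wrapping the rule in parentheses is assumed here as the workaround.

```python
import sqlite3

# The blocking rule from the thread, rewritten with || instead of CONCAT
# (an assumption, so the sketch runs on any SQLite build).
RULE = "l.name LIKE '%' || r.name || '%' OR r.name LIKE '%' || l.name || '%'"


def build_pairs(rule_sql):
    """Return candidate pairs produced when rule_sql is interpolated into a join.

    Hypothetical stand-in for a blocking query: the rule is followed by
    'AND l.unique_id < r.unique_id', which is meant to keep each pair once.
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE df (unique_id INTEGER, name TEXT)")
    con.executemany(
        "INSERT INTO df VALUES (?, ?)",
        [(1, "ann"), (2, "anna"), (3, "bob")],
    )
    sql = f"""
        SELECT l.unique_id, r.unique_id
        FROM df AS l
        JOIN df AS r
          ON {rule_sql} AND l.unique_id < r.unique_id
    """
    rows = con.execute(sql).fetchall()
    con.close()
    return sorted(rows)


# Unwrapped: parsed as A OR (B AND l.unique_id < r.unique_id),
# so self-pairs and reversed pairs leak through.
unwrapped = build_pairs(RULE)

# Wrapped: parsed as (A OR B) AND l.unique_id < r.unique_id.
wrapped = build_pairs(f"({RULE})")

print("unwrapped:", unwrapped)
print("wrapped:  ", wrapped)
```

In this toy data the unwrapped rule yields self-pairs and the same pair in both orders, while the wrapped rule yields each pair once, which is consistent with the duplicate pairs reported earlier in the thread.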