
Same match having different match_key values #1417

Answered by RobinL
lucazav asked this question in Q&A

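Aha, so that reveals the source of the bug (and how you can work around it until we do a proper fix).
Your blocking rule:
l.name LIKE CONCAT('%',r.name,'%') or r.name LIKE CONCAT('%',l.name,'%')
contains a top-level OR, and that OR is 'evaluated too broadly' once the rule is interpolated into the generated SQL. AND binds more tightly than OR, so the trailing AND NOT (...) clause, which excludes pairs already produced by earlier blocking rules, ends up attached only to the second half of your OR instead of to the whole rule.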

You want:

from __splink__df_concat_with_tf as l
inner join __splink__df_concat_with_tf as r
on
(l.name LIKE CONCAT('%',r.name,'%') or r.name LIKE CONCAT('%',l.name,'%'))
AND NOT (coalesce((l.name = r.name and l.addr = r.addr), false) OR coalesce((l.name = r.name and l.city = r.city), false))
where l."id" < r."id"

Note the additional …
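
The precedence problem can be reproduced outside Splink. Here is a minimal sketch, assuming SQLite (via Python's built-in sqlite3) as a stand-in backend. The table `df`, the three toy rows and the helper `blocked_pairs` are hypothetical and made up for illustration, and `||` replaces CONCAT because older SQLite versions have no CONCAT function. It is not the SQL Splink actually generates, only the same shape of join condition.

```python
import sqlite3

# Hypothetical toy data: ids 1 and 2 share name and addr, so an earlier
# blocking rule (name + addr) would already have produced the pair (1, 2).
ROWS = [
    (1, "acme", "1 main st", "london"),
    (2, "acme", "1 main st", "london"),
    (3, "acme ltd", "9 side rd", "leeds"),
]

# The OR-containing blocking rule, written with || instead of CONCAT.
LIKE_RULE = "l.name LIKE '%' || r.name || '%' OR r.name LIKE '%' || l.name || '%'"

# Stand-in for the "pairs already found by earlier rules" exclusion.
PREVIOUS_RULES = (
    "coalesce((l.name = r.name AND l.addr = r.addr), 0) "
    "OR coalesce((l.name = r.name AND l.city = r.city), 0)"
)


def blocked_pairs(rule_sql):
    """Return the (l.id, r.id) pairs produced when rule_sql is interpolated."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE df (id INTEGER, name TEXT, addr TEXT, city TEXT)")
    con.executemany("INSERT INTO df VALUES (?, ?, ?, ?)", ROWS)
    sql = f"""
        SELECT l.id, r.id
        FROM df AS l
        INNER JOIN df AS r
        ON {rule_sql}
        AND NOT ({PREVIOUS_RULES})
        WHERE l.id < r.id
        ORDER BY l.id, r.id
    """
    pairs = con.execute(sql).fetchall()
    con.close()
    return pairs


# Interpolated as-is: parsed as A OR (B AND NOT previous), so (1, 2) leaks through.
unwrapped = blocked_pairs(LIKE_RULE)
# Wrapped in parentheses: parsed as (A OR B) AND NOT previous.
wrapped = blocked_pairs(f"({LIKE_RULE})")

print("without parentheses:", unwrapped)
print("with parentheses:   ", wrapped)
```

In this toy setup the unwrapped rule returns the pair (1, 2) even though an earlier rule would already cover it, which is how the same match can show up under more than one match_key. Wrapping the rule in parentheses drops that pair and keeps the others.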

Answer selected by RobinL