The deduplication appears to group matches by match pattern and matched row index and keep the one with the highest zeta value, but it does not account for zeta values that are exactly equal. This leads to inconsistent results.
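To make the tie case concrete, here is a minimal Python sketch of "keep the highest zeta per matched index". It is not fastLink's code: the function name `dedupe_by_max_zeta`, the dict layout, and the row indices and zeta values are all made up for illustration. This version keeps every match that ties for the maximum, so the ambiguity is visible instead of being resolved silently.

```python
from collections import defaultdict


def dedupe_by_max_zeta(matches, key):
    """Keep, for each value of `key`, the match(es) with the highest zeta.

    Hypothetical helper, not part of fastLink. Matches whose zeta ties
    for the maximum are all kept rather than dropped arbitrarily.
    """
    best = defaultdict(list)
    for m in matches:
        current = best[m[key]]
        if not current or m["zeta"] > current[0]["zeta"]:
            best[m[key]] = [m]      # strictly better: replace the group
        elif m["zeta"] == current[0]["zeta"]:
            current.append(m)       # exact tie: keep both candidates
    return [m for group in best.values() for m in group]


# Illustrative data: row 9 of B matches rows 0 and 5 of A with the same zeta,
# as happens when an identical record exists twice in A.
matches = [
    {"a": 0, "b": 9, "zeta": 0.99},
    {"a": 5, "b": 9, "zeta": 0.99},
    {"a": 2, "b": 3, "zeta": 0.80},
    {"a": 4, "b": 3, "zeta": 0.95},
]

kept = dedupe_by_max_zeta(matches, "b")
print([(m["a"], m["b"]) for m in kept])
```

Both candidates for `b == 9` survive because their zeta values are exactly equal, while `b == 3` keeps only the higher-scoring match. A dedupe step that must return one match per row would need an explicit tie-breaking rule at this point.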
Setup: The issue can be reproduced by appending the first row of dfA (where firstname is "daniel") to both dfA and dfB. The appended record is then an exact match to a row in dfA and dfB.
Issue 1: With the setup above, the dedupe algorithm returns all of the matched values. However, if you change the firstname in the first row to NA, it is removed.
Issue 2: If you change the lastname "secuya" to "secuyas" while leaving the firstname as NA, it is still removed by the dedupe function. But if you set the firstname back to "daniel", it is not deduped.
I found the issue with the deduplication. The order of the dataframes matters because the duplicate row ids are removed before checking for them again in dfB.
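A small Python sketch of why pass order can matter when ties are broken by position. This is an assumption-laden illustration, not fastLink's implementation: `keep_best`, the dict layout, and the data are hypothetical. Each pass keeps one match per row index and keeps the first one seen on an exact zeta tie, so a match dropped in the first pass is no longer available to the second.

```python
def keep_best(matches, key):
    """Keep one match per value of `key`: highest zeta, first seen wins ties.

    Hypothetical helper used only to illustrate order dependence.
    """
    best = {}
    for m in matches:
        k = m[key]
        if k not in best or m["zeta"] > best[k]["zeta"]:
            best[k] = m
    return list(best.values())


# Illustrative data with exact ties: A row 5 matches B rows 9 and 7,
# and B row 9 matches A rows 0 and 5, all with the same zeta.
matches = [
    {"a": 0, "b": 9, "zeta": 0.99},
    {"a": 5, "b": 9, "zeta": 0.99},
    {"a": 5, "b": 7, "zeta": 0.99},
]

# Dedupe on A's row ids first, then on B's, and the other way around.
a_then_b = keep_best(keep_best(matches, "a"), "b")
b_then_a = keep_best(keep_best(matches, "b"), "a")

print([(m["a"], m["b"]) for m in a_then_b])
print([(m["a"], m["b"]) for m in b_then_a])
```

With this data the two orders return different match sets: deduping on A first discards the (5, 7) match before the B pass can consider it, while deduping on B first keeps it.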