-
Great Product and Excellent work so far! Curious about your thoughts of using this tool to perform a 3-way match (invoice price vs purchase order price - quantity received into inventory vs invoice quantity)? OCR is a great tool to help with data entry in accounts payable, but it is mainly used to get Invoice Header (Invoice No, Inv Date, Inv Amount, Supplier) but has challenges doing a three way match with line level detail. Imagine having 300 line invoice and manually combing though the lines to find the pricing errors and shortage on these two lines. Purchase Receipt Data
OCR Invoice Line Level Detail
Even thought only one column of data matches (12 quantity) on line 1 and zero columns match on line 2, if the formatting was better a human would be able to readily identify $24 pricing problem on the first line and the $18 pricing difference on the second line with a short shipment of 1. However, these could be line 37 and 92 on the receiver and line 188 and 78 on the invoice and becomes a nightmare to find. The goal would be to find the best match between the 300 line items on the receiver and the 300 lines on the invoice. Is it beyond the scope of this tool to help solve this problem? Thank you in advance for responding! |
Beta Was this translation helpful? Give feedback.
Replies: 1 comment 1 reply
-
If I understand you correctly (link a table of 300 records with another table of 300 records, where each record in A will match exactly one record in B), then this exactly what splink is designed for. You will have to write a bit of custom SQL to implement the comparisons that you are talking about, but it would work. For the scale that you are talking about, there are only 300x300=90_000 possible comparisons, so honestly I think it might be easier to just write this logic in pure python and skip splink entirely. |
Beta Was this translation helpful? Give feedback.
If I understand you correctly (link a table of 300 records with another table of 300 records, where each record in A will match exactly one record in B), then this exactly what splink is designed for. You will have to write a bit of custom SQL to implement the comparisons that you are talking about, but it would work.
For the scale that you are talking about, there are only 300x300=90_000 possible comparisons, so honestly I think it might be easier to just write this logic in pure python and skip splink entirely.