Add flag for multiple matches to /geocode/multiple for detecting low quality addresses. #32
Comments
Thanks, I will consider this! It was a deliberate decision not to show candidates when doing multiple geocodes, for payload size etc. What decisions would you make from knowing that there were multiple candidates? For the time being you could switch to single geocodes. We won't tell the neighbors. I ask because the example you give is an invalid address: it's missing a direction. Because of that you are getting addresses on both sides of the north temple. Based on your input, there is no clear way to determine which address is correct. I'm not sure why there was a tenth of a score point given to North over South, but it really should be a tie.
Yeah. I figured the output was intentionally simplified. The biggest thing I would like to do with the enhancement is use it to detect invalid addresses. It is nearly impossible to detect invalid addresses using SQL (too many exceptions to the rules), so if we ran into a record with multiple possible geocodes, we could mark it for manual processing. I don't want to use the single geocode for the first pass because we have a lot of data, much of it records that are going to be pretty solid matches. But you will notice both of these were > 90, so score alone doesn't tell you if the address is vague. I would also be marking anything with a low score, or without any successful matches. We have a team who connects records from our various sources, and it would be quite easy to give them an interface which pulls up each marked record and uses the single match to give them a drop-down of options. Then a human could decide which is the right option, or mark the record to be corrected.
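A minimal sketch of the triage described above, assuming each record's single-geocode result can be reduced to a plain list of candidate scores. The function name, the three labels, and the `tie_delta` threshold are hypothetical choices for illustration, not part of the API:

```python
from typing import Sequence


def triage(scores: Sequence[float],
           accept_score: float = 90.0,
           tie_delta: float = 1.0) -> str:
    """Classify one record from the scores of its candidate matches.

    Returns "no_match" when nothing reaches accept_score, "ambiguous" when
    the two best acceptable candidates are within tie_delta of each other,
    and "ok" otherwise. Thresholds are illustrative assumptions.
    """
    # Keep only candidates that pass the acceptance threshold, best first.
    good = sorted((s for s in scores if s >= accept_score), reverse=True)
    if not good:
        return "no_match"
    # Two acceptable candidates with nearly equal scores: send to a human.
    if len(good) > 1 and good[0] - good[1] <= tie_delta:
        return "ambiguous"
    return "ok"
```

Records labelled `"ambiguous"` or `"no_match"` would go to the manual-review queue; the example scores a caller passes in (say, two candidates a tenth of a point apart) are up to the data at hand.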
I think every geocode will always have I'm not sure that a |
True. I had been thinking along the lines of the candidates also matching the If I was doing processing with I was already planning on the algorithm making multiple passes, so I think I can implement my own version of the delta from the single match. I do worry that I'm going to have too few hits on the first pass with |
When running `/geocode`, low quality addresses will return multiple matches with similar scores. However, when running with `/geocode/multiple` it becomes impossible to tell that an address is low quality. If we had some way to know that there were multiple 'good' matches, we could then flag those for later processing using the non-batched `/geocode`.
For example: `201 main street` and `SLC` (with `acceptScore = 90`) in the Single Geocode returns:
The multi geocode returns:
The suggestion is to add some sort of indicator that multiple matches were found.