Commit

[expand] adding a method that allows hash/equality comparisons of addresses like "100 Main" with "100 S Main St." or units like "Apt 101" vs. "#101". Instead of expanding the phrase abbreviations, this version tries its best to delete all but the root words in a string for a specific component. It's probably not perfect, but it does handle a number of edge cases related to pre/post directionals in English, e.g. "E St" will have a root word of simply "E", "Avenue E" => "E", etc. It also handles a variety of cases where a phrase could be a thoroughfare type but is really a root word, such as "Park Pl" or the famous "Avenue Rd". This can be used for near-dupe hashing to catch possible dupes for later analysis. Note that it will normalize "St Marks Pl" and "St Marks Ave" to the same thing, which is sometimes warranted (if the user typed the wrong thoroughfare) but can also be reconciled at deduping time.
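The root-word idea in the message above can be sketched roughly as follows. This is a toy Python illustration, not the code from this commit: the word lists and the names `street_root` and `unit_root` are hypothetical, and the fallback rule for strings made up entirely of ignorable words is a simplifying assumption chosen only to reproduce the examples in the message.

```python
import re

# Tiny illustrative word lists (assumptions, not the project's dictionaries).
DIRECTIONALS = {"n", "s", "e", "w", "north", "south", "east", "west"}
STREET_TYPES = {"st", "street", "ave", "avenue", "rd", "road",
                "pl", "place", "park", "blvd", "boulevard"}
UNIT_TYPES = {"apt", "apartment", "unit", "suite", "ste"}


def tokenize(text: str) -> list[str]:
    """Lowercase and split into alphanumeric tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())


def street_root(street: str) -> str:
    """Delete directionals and thoroughfare types, keeping only root words."""
    tokens = tokenize(street)
    roots = [t for t in tokens
             if t not in DIRECTIONALS and t not in STREET_TYPES]
    if roots:
        return " ".join(roots)
    # Every token looks ignorable, so one of them must really be the root.
    # Prefer a directional ("E St", "Avenue E" -> "e"); otherwise treat the
    # last token as the thoroughfare type and keep what precedes it
    # ("Avenue Rd" -> "avenue", "Park Pl" -> "park").
    non_types = [t for t in tokens if t not in STREET_TYPES]
    return " ".join(non_types or tokens[:-1] or tokens)


def unit_root(unit: str) -> str:
    """Delete unit designators such as "Apt" or "#", keeping the identifier."""
    return " ".join(t for t in tokenize(unit) if t not in UNIT_TYPES)
```

With these toy lists, `street_root("S Main St.")` and `street_root("Main")` produce the same key, as do `unit_root("Apt 101")` and `unit_root("#101")`, so the results can serve as hash keys for finding candidate duplicates. `"St Marks Pl"` and `"St Marks Ave"` also collapse to one key, which is the trade-off the message notes should be reconciled at deduping time.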
albarrentine committed Dec 17, 2017
1 parent d0364ab commit 3f7abd5
Showing 2 changed files with 516 additions and 76 deletions.