In GitLab by @loualiche on Oct 27, 2023, 22:53
I naively thought it would be easy to use the great StringDistances.jl package to do a fuzzy merge on strings (say, addresses or imperfect country names).
There must be something that prevents the composition.
The exercise I had in mind is something of this sort:
innerjoin(
    (df1, df2),
    by_distance(:country_name, Partial(Levenshtein()), <=(5)),
    multi=(M=closest,)
)
Think of the country name being "United States" in one table and "United States of America" in the other.
There are other examples (where there is no strict inclusion).
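To make the intended matching concrete, here is a minimal sketch of the same idea done by hand with StringDistances.jl alone, outside of any join machinery. It is only an illustration of the behaviour I am after, not a proposal for how the package should implement it: `closest_match` and the sample vectors are hypothetical, and it assumes a StringDistances.jl version where distances are callable as `dist(s1, s2)`.

```julia
using StringDistances

# Partial(Levenshtein()) scores the shorter string against every
# same-length window of the longer one and keeps the smallest distance.
dist = Partial(Levenshtein())

# Hypothetical helper: returns the closest candidate, or `nothing`
# when even the best distance exceeds `max_dist`.
function closest_match(name, candidates, dist; max_dist = 5)
    d, i = findmin([dist(name, c) for c in candidates])
    return d <= max_dist ? candidates[i] : nothing
end

# Made-up sample data standing in for the :country_name columns.
left  = ["United States", "Germany"]
right = ["United States of America", "Federal Republic of Germany", "France"]

# One pair per left-hand name: name => closest right-hand name (or nothing).
matches = [l => closest_match(l, right, dist) for l in left]
```

This is a brute-force scan over every pair, so it only makes sense for small tables; the appeal of doing it through the join would be getting the `multi=(M=closest,)` semantics without writing the loop.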
PS: I love this package.