Matchup yelp IDs with TA IDs not in Crosswalk #25
Steps:
As James mentions, it'd be great to give back our findings to Factual.
…(not just url).
I discovered we can put the place info into TA's location_mapper API and it'll pass back IDs – no dealing with messy search! In my first experiment, centered on the Ferry Building, 15/50 (30%) places were missing TA data:
The remaining 35 places were correct. The full analysis is in a gist. Next TODO:
My current implementation puts the info directly into TA's location_mapper, taking the first result, and, if there are no results, strips any accents and tries again. A possible improvement is to remove the name query and do our own place matching. With the current implementation, using the 50 best match Yelp results with our top level categories from an 800m radius around the following locations, I got the following results:
Notes:
Updated TODO:
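The lookup-and-retry flow described in this comment (take the first result; if there are none, strip accents and try again) could be sketched roughly as below. This is only an illustration, not the code in the repo: `lookup` is a hypothetical stand-in for the call to TA's location_mapper, and its `(name, lat, lng)` signature is an assumption, as are the function names.

```python
import unicodedata
from typing import Callable, Optional, Sequence

# Hypothetical stand-in for the location_mapper call: given a place name and
# coordinates, return candidate TA IDs (best match first). The real request
# parameters are not shown in this thread, so this signature is an assumption.
Lookup = Callable[[str, float, float], Sequence[str]]


def strip_accents(text: str) -> str:
    """Drop combining marks, e.g. turn 'Café' into 'Cafe'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def match_ta_id(name: str, lat: float, lng: float, lookup: Lookup) -> Optional[str]:
    """Return the first TA ID for a Yelp place, retrying once without accents."""
    results = lookup(name, lat, lng)
    if not results:
        plain = strip_accents(name)
        # Only retry when stripping accents actually changed the name.
        if plain != name:
            results = lookup(plain, lat, lng)
    return results[0] if results else None
```

Passing the lookup in as a function keeps the retry logic testable without hitting the API, which may help when comparing match rates across areas.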
…add improvements notes. See code comments for details and improvement notes.
This will allow devs to check out yelp places in different areas to see how well TA matches.
This will allow us to figure out which places don't have TA data so we can run crosswalk on it.
Update for non-dense area and analysis of Factual crosswalk: Data: 0.5km crawl around Nashville (36.162963, -86.780758) = 41 places. Adjusting for places where Yelp serves an area rather than a specific location (6), there are 35 places.
Raw notes added to the gist. Overall, it seems we're getting about 75% correct from this method. This one test of Factual shows 46% correct for TA.
We can do Yelp -> TA: we just need to integrate (#91).
Because Factual Crosswalk isn't reliable.
This would be a regular crawl, disconnected from a user hitting the prox-server API.
This would also include adding new Crosswalk records to the Factual database.