adrianbautista/foursquare-restaurant-grades

Foursquare Restaurant Grades

The data and work in this repository attempt to identify any relationship between the tips Foursquare users leave for an NYC restaurant and the restaurant's sanitation grade from the NYC Department of Health and Mental Hygiene (NYCDOHMH). It is also my final project for the Fall 2014 session of General Assembly's Data Science course.

This repository also contains old work for my first project idea, about Hollywood films and China box office results.

About

What if the NYCDOHMH could prioritize restaurants in need of health inspections based on text reviews left by patrons? This project's regression models attempt to answer this question by analyzing over 10,000 tips left by Foursquare users at restaurants and predicting, based on the adjectives used in the tips, whether a restaurant holds a grade "A" or a grade "C".
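To make the idea of predicting from adjectives concrete, here is a minimal sketch of turning tips into adjective-count features. It is not the project's actual pipeline: the `ADJECTIVES` vocabulary, the `adjective_counts` helper, the sample tips, and the label encoding are all assumptions for illustration. A real run would more likely pick out adjectives with a part-of-speech tagger.

```python
import re
from collections import Counter

# Hypothetical adjective vocabulary, hard-coded for illustration only.
ADJECTIVES = ["dirty", "clean", "fresh", "greasy", "friendly"]


def adjective_counts(tip, vocabulary=ADJECTIVES):
    """Return how often each vocabulary adjective appears in one tip."""
    tokens = re.findall(r"[a-z']+", tip.lower())
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


# Made-up tips with made-up labels (assumed encoding: 1 = grade "C", 0 = grade "A").
tips = [
    ("Fresh bagels and friendly staff", 0),
    ("Greasy tables, dirty floor, dirty forks", 1),
]

features = [adjective_counts(text) for text, _ in tips]
labels = [label for _, label in tips]
```

Each row of `features` lines up with `ADJECTIVES`, so a regression model fitted on these rows would learn one weight per adjective.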

Scores and Area Under the Curve (AUC) values produced from the data indicate that randomness has a strong impact on any modeled relationship.
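For readers unfamiliar with AUC, the sketch below computes it directly from its definition: the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. An AUC of 1.0 is a perfect ranking and 0.5 is what random guessing produces. The `roc_auc` function and the sample scores are illustrative assumptions, not the project's code or results.

```python
def roc_auc(labels, scores):
    """AUC as the chance that a random positive outranks a random negative."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one example of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a win
    return wins / (len(positives) * len(negatives))


# Made-up labels and scores for illustration.
labels = [1, 1, 0, 0]
perfect = roc_auc(labels, [0.9, 0.8, 0.2, 0.1])        # positives always rank higher
uninformative = roc_auc(labels, [0.5, 0.5, 0.5, 0.5])  # scores carry no information
```

The pairwise loop is quadratic in the number of examples. That is fine for a sketch, but a library routine would be the practical choice for thousands of tips.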

Moving Forward

Ideas and suggestions on how the work in this project could be improved:

  • The number of tips per restaurant is fairly low, so collecting more tips or restricting the model to restaurants with a minimum number of tips might help.
  • Using bigrams and trigrams to measure the impact of word combinations, since sentiment analysis is weak when applied only to individual words.
  • Doing a deeper dive into certain areas of NYC instead of covering all restaurants in the city.
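The first two suggestions can be sketched in a few lines. The example below is an illustration under assumptions, not existing project code: the `ngrams` and `filter_by_tip_count` helpers, the restaurant IDs, and the sample tips are all hypothetical.

```python
import re
from collections import defaultdict


def ngrams(text, n):
    """Return the word n-grams of a tip as space-joined strings."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def filter_by_tip_count(tips, min_tips):
    """Group tips by restaurant and keep restaurants with at least min_tips tips."""
    by_restaurant = defaultdict(list)
    for restaurant_id, text in tips:
        by_restaurant[restaurant_id].append(text)
    return {rid: texts for rid, texts in by_restaurant.items()
            if len(texts) >= min_tips}


# Made-up (restaurant_id, tip) pairs for illustration.
tips = [
    ("r1", "Not clean at all"),
    ("r1", "Great fresh pasta"),
    ("r2", "Slow service"),
]

kept = filter_by_tip_count(tips, min_tips=2)
bigrams = ngrams("Not clean at all", 2)
trigrams = ngrams("Not clean at all", 3)
```

The bigram "not clean" shows why word combinations matter: on its own, "clean" reads as a positive signal, while the pair reverses its meaning.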

Data sources

Languages and packages used
