The intent of the paper is to identify subcategories of businesses and services by studying user reviews on Yelp. Further, the study was extended to identify categories that are non-intuitive. These subcategories can help businesses boost their revenue, while users can quickly locate businesses and services based on popular subcategories.
Identifying non-intuitive subcategories can enhance the profile of a particular neighborhood, as these are some of the most distinctive businesses listed on Yelp. I used the Yelp academic open dataset, which has 4.1 million user reviews.
To find the latent subcategories by studying the reviews, I employed the Natural Language Toolkit (NLTK) to compute n-grams and their frequencies of occurrence. To identify the non-intuitive categories, 1.1 million user-tagged business attributes were combined with the business categories for three target cities.
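The n-gram counting step can be illustrated with a minimal sketch. It is not the paper's actual pipeline: it uses only the Python standard library as a stand-in for NLTK's `nltk.util.ngrams`, a simple regex in place of a real tokenizer, and hypothetical sample reviews rather than Yelp data. The names `tokenize`, `ngram_counts`, and `sample_reviews` are assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a review and split it into word tokens.

    A simple regex stand-in for a real tokenizer (assumption).
    """
    return re.findall(r"[a-z']+", text.lower())


def ngram_counts(reviews, n=2):
    """Count n-grams across all reviews; n-grams never span two reviews."""
    counts = Counter()
    for review in reviews:
        tokens = tokenize(review)
        # Zip n staggered copies of the token list to form sliding windows.
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return counts


# Hypothetical sample reviews, not taken from the Yelp dataset.
sample_reviews = [
    "Great happy hour and friendly staff.",
    "The happy hour deals are great.",
    "Friendly staff, slow service.",
]

bigrams = ngram_counts(sample_reviews, n=2)
top_bigrams = bigrams.most_common(2)
```

Frequent n-grams such as ("happy", "hour") are the kind of phrase that could be read as a candidate subcategory. In practice, stop-word filtering and a minimum-frequency threshold would likely be needed before any such interpretation.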
Overall, the analysis surfaced a number of interesting insights that could help users and businesses alike.