fediazgon/yelp-business-viz

yelp-business-viz

Visualizing the Yelp challenge's data with Shiny!

Live Demo | Getting Started | Idioms | Authors

About This Project

The aim of this project is to develop an interactive visualization tool, going through all the design abstraction levels needed to properly structure a complex visualization project.

Live Demo

The easiest way to get a grasp of what the project does is to look at our live demo.

Getting Started

We suggest installing RStudio.

Once it is installed, you can install Shiny by issuing the following command in the RStudio console:

> install.packages("shiny")

The Dataset

The dataset used is the Yelp visualization challenge dataset, which collects various information about businesses around the world and the check-in events generated by users.

Questions

Good visualizations aim to answer well-defined questions. For this reason, we selected some interesting questions concerning the dataset:

  1. How are the businesses geographically distributed?
  2. What is the distribution of review scores in a given geographic area? Are there areas that are more "picky"?
  3. Is there a relationship between the number of check-ins and the number of reviews?
  4. What is the relationship between the most common business categories?
  5. At what time of day do customers check in? Does it coincide with the opening hours?
  6. Is it true that different categories have different opening hours? For example, do bars stay open late while restaurants close earlier?

Idioms

The project implements various idioms (plots) in different tabs.

Map

The map tab aims to answer the first three questions. For that reason, we used faceting of multiple idioms: a choropleth map for the geographical distribution of the businesses, a histogram of the review scores, and a scatter plot between the logarithm of the number of reviews and the number of check-ins.

Businesses can be filtered by review score and the size and color of the bubbles can encode different values.
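
The filtering and the log transform behind the histogram and the scatter plot can be sketched in base R as follows. This is only an illustration on toy data: the column names (`stars`, `review_count`, `checkins`) and the helper `filter_by_stars` are hypothetical and are not taken from the app's code.

```r
# Toy data; the column names are assumptions, not the app's real schema.
businesses <- data.frame(
  name         = c("A", "B", "C", "D"),
  stars        = c(4.5, 2.0, 3.5, 5.0),
  review_count = c(120, 8, 45, 300),
  checkins     = c(400, 10, 90, 1500),
  stringsAsFactors = FALSE
)

# Keep only the businesses whose review score lies in the selected range.
filter_by_stars <- function(df, min_stars, max_stars) {
  df[df$stars >= min_stars & df$stars <= max_stars, , drop = FALSE]
}

selected <- filter_by_stars(businesses, 3, 5)

# Log-transform the review counts so that a few very popular businesses
# do not dominate the scatter plot (a count of zero would give -Inf).
selected$log_reviews <- log10(selected$review_count)

# Histogram counts of the review scores in half-star bins, computed
# without opening a plotting device.
star_hist <- hist(selected$stars, breaks = seq(0.5, 5, by = 0.5), plot = FALSE)
```

In a Shiny app, the range passed to `filter_by_stars` would typically come from an input widget, so that all the coordinated idioms redraw from the same filtered data frame.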

All the idioms are coordinated, as can be seen in the following demo: map-demo

Adjacency Matrix

We represented the relationship between business categories as a network: the nodes are the category names, and a link is added every time two categories appear in the same business. The weight of a link increases each time that link appears in the dataset. Inspired by the "Les Misérables" co-occurrence adjacency matrix, we decided to use the same idiom. The user can manipulate the view by choosing how the matrix is sorted: by category name, by frequency, or by cluster.

adjacency-matrix-demo
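
A minimal base R sketch of how such a weighted co-occurrence matrix could be built is shown here. The input data and the function `build_cooccurrence` are hypothetical; the project's own preprocessing may differ, and sorting by cluster is not covered.

```r
# Toy input: one character vector of categories per business (made-up data).
business_categories <- list(
  c("Bars", "Nightlife"),
  c("Bars", "Restaurants"),
  c("Bars", "Nightlife", "Restaurants"),
  c("Coffee & Tea")
)

# Build a symmetric matrix whose cell [a, b] counts how many businesses
# list both category a and category b.
build_cooccurrence <- function(category_lists) {
  categories <- sort(unique(unlist(category_lists)))
  adj <- matrix(0L, nrow = length(categories), ncol = length(categories),
                dimnames = list(categories, categories))
  for (cats in category_lists) {
    cats <- unique(cats)
    if (length(cats) < 2) next          # a single category creates no link
    pairs <- combn(cats, 2)             # each column is one unordered pair
    for (k in seq_len(ncol(pairs))) {
      a <- pairs[1, k]
      b <- pairs[2, k]
      adj[a, b] <- adj[a, b] + 1L       # link weight grows per co-occurrence
      adj[b, a] <- adj[b, a] + 1L       # keep the matrix symmetric
    }
  }
  adj
}

adj <- build_cooccurrence(business_categories)

# Rows and columns start in alphabetical order (sorting by name).
# Sorting by frequency reorders them by total link weight instead.
by_frequency <- names(sort(rowSums(adj), decreasing = TRUE))
adj_by_frequency <- adj[by_frequency, by_frequency]
```

Reordering rows and columns with the same permutation keeps the matrix symmetric, which is what makes clusters of related categories show up as blocks along the diagonal.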

Heatmap

To investigate the similarities between check-in hours and opening hours, we applied some aggregation, building two matrices with the day of the week and the hour as row and column indices. We calculated a scalar field where each cell represents the number of check-ins in that hour/day and the number of businesses open in that hour/day. The user can filter by selecting the category to analyze. We chose a purely sequential colormap without a central reference. The user is also able to apply smoothing to the heat map to ease the detection of particular patterns, as you can see in the demo below.

heatmap-demo
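
The aggregation into two day-by-hour matrices can be sketched in base R as below. The toy data frames and their columns are hypothetical. The smoothing shown is a simple 3x3 box mean chosen for illustration, since the text does not state which smoothing method the app uses; it also does not wrap around from hour 23 to hour 0.

```r
days  <- c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")
hours <- as.character(0:23)

# Toy check-in events, one row per check-in (hypothetical schema).
checkins <- data.frame(
  day  = c("Fri", "Fri", "Sat", "Sat", "Sat", "Mon"),
  hour = c(22, 22, 23, 23, 12, 9),
  stringsAsFactors = FALSE
)

# Toy opening hours; `close` is exclusive and spans past midnight
# are not handled in this sketch.
opening_hours <- data.frame(
  day   = c("Fri", "Fri", "Sat"),
  open  = c(18, 9, 20),
  close = c(24, 17, 24),
  stringsAsFactors = FALSE
)

# Matrix 1: number of check-ins per (day, hour) cell.
checkin_matrix <- matrix(0L, 7, 24, dimnames = list(days, hours))
for (i in seq_len(nrow(checkins))) {
  d <- checkins$day[i]
  h <- checkins$hour[i] + 1             # hour 0 is stored in column 1
  checkin_matrix[d, h] <- checkin_matrix[d, h] + 1L
}

# Matrix 2: number of businesses open per (day, hour) cell.
open_matrix <- matrix(0L, 7, 24, dimnames = list(days, hours))
for (i in seq_len(nrow(opening_hours))) {
  d    <- opening_hours$day[i]
  cols <- seq(opening_hours$open[i], opening_hours$close[i] - 1) + 1
  open_matrix[d, cols] <- open_matrix[d, cols] + 1L
}

# Illustrative smoothing: replace each cell by the mean of its 3x3
# neighbourhood, clipped at the matrix borders.
smooth_matrix <- function(m) {
  out <- m * 0                          # numeric copy that keeps the dimnames
  for (r in seq_len(nrow(m))) {
    for (j in seq_len(ncol(m))) {
      rows <- max(1, r - 1):min(nrow(m), r + 1)
      cols <- max(1, j - 1):min(ncol(m), j + 1)
      out[r, j] <- mean(m[rows, cols])
    }
  }
  out
}

smoothed_checkins <- smooth_matrix(checkin_matrix)
```

Filtering by category would amount to subsetting both data frames before the aggregation, so the two matrices always describe the same set of businesses.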

Learn More

If you want to learn more about the design decisions taken and the selection of the idioms used, we suggest reading the report we wrote.

Authors 🇪🇸 💙 🇮🇹

  • Fernando Díaz
  • Giorgio Ruffa

About

Visualization of local business information included in the Yelp dataset (2017)