Data Mining on Yelp and Census Data
This is a data mining project for understanding how businesses and neighborhoods interact.
We use several clustering techniques to investigate the relationship between business dynamics and neighborhood characteristics. Business dynamics are revealed by clustering patronage patterns in the Yelp data; neighborhood characteristics are revealed by clustering neighborhoods on census data.
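As a rough illustration of what clustering neighborhoods on census-style features can look like, here is a minimal sketch. It assumes plain k-means, which may not be the technique used in the notebooks, and the feature names and values are made-up toy numbers rather than real census data.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means on a list of equal-length numeric tuples.

    Illustrative only: the notebooks in this repository may use a
    different clustering technique or library.
    """
    rng = random.Random(seed)
    # Start from k distinct points chosen at random.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        # (squared Euclidean distance).
        labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Hypothetical features per neighborhood: (median income in $1000s, median age).
neighborhoods = [(30, 25), (32, 27), (31, 26), (90, 45), (95, 47), (92, 44)]
labels, centroids = kmeans(neighborhoods, k=2)
print(labels)
```

In practice, features on different scales would usually be standardized before clustering so that no single feature dominates the distance.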
- code - the Jupyter notebooks containing code for data processing and analysis.
- data - auxiliary datasets and intermediate tidy data files generated in the process.
- tex - the source code for the accompanying paper, written in LaTeX.
If you want to reproduce the data cleaning steps, you also need to place the Yelp Dataset in a folder called yelp-data two levels up from the root folder of this repository.
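To check that the folder is where the notebooks would expect it, a small sketch like the following may help. The helper name `yelp_data_dir` is hypothetical, and the script assumes it is run from the root folder of this repository.

```python
from pathlib import Path


def yelp_data_dir(repo_root: Path) -> Path:
    """Return the expected location of the Yelp Dataset folder.

    "Two levels up from the root folder" is read here as
    <repo_root>/../../yelp-data.
    """
    return (repo_root / ".." / ".." / "yelp-data").resolve()


if __name__ == "__main__":
    # Assumption: the current working directory is the repository root.
    data_dir = yelp_data_dir(Path.cwd())
    if data_dir.is_dir():
        print(f"Found Yelp Dataset folder at {data_dir}")
    else:
        print(f"Expected Yelp Dataset folder at {data_dir}, but it does not exist")
```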