Searching for Boba: Analyzing Bubble Tea Shops in NYC Using the Yelp Fusion API

datalifenyc/boba-nyc

 
 

Obsessed with Boba? Analyzing Bubble Tea Shops in NYC Using the Yelp Fusion API

Presenters

  • Mark Bauer
  • Chidi Ezeolu
  • Ho Hsieh
  • Nathan Williamson

Event

NYC Open Data Week 2022

Event RSVP

Binder

cover-photo


Introduction

In this workshop, we explore and develop insights about NYC's Bubble Tea Shops using the Yelp Fusion API. Sections include:

  • How to use the Yelp Fusion API
  • Data Cleaning, Wrangling and Visualizations in Python
  • A demo of our web app created in Jupyter Book and Streamlit.

We also explore where bubble tea shops are located and how they compare on Yelp rating, review count and price.

After an initial introduction of each section, participants will join break-out groups depending on which topic they would like to learn more about. These break-out sessions will be hands-on and interactive. Participants will then reconvene for a Q&A and final thoughts. Attendees will gain a better understanding of the data analysis workflow and will leave with skills and a template to uncover insights with any dataset.

This workshop recommends beginner-level proficiency with Python and is focused on applying Python to data analysis; however, those new to Python are gladly welcome!

Prerequisites

  • Basics of Python or other programming languages (R, SQL, etc.)
  • Basic knowledge of Data Analysis
  • Basics of Jupyter Notebooks

This project recommends beginner-level proficiency with Python and is focused on applying Python to data analysis.

Install

  1. Install Anaconda

  2. Install Git

  3. Clone the boba-nyc repo

    git clone https://github.com/mebauer/boba-nyc.git

  4. Enter the directory of the local repo

    cd boba-nyc

  5. Create the conda environment from the requirements file

    conda env create -f environment_detail.yml

Other Commands

Conda

Managing environments

conda issue #4339: Exporting a clean environment to environment.yml

conda env export --from-history | grep -v "prefix" > environment.yml

Git

Git - git-push Documentation

git push origin

Configuring a remote for a fork

git remote -v
git remote add upstream https://github.com/mebauer/boba-nyc.git
git remote -v

Syncing a fork from the command line

main: name of the local default branch
upstream/master: name of the branch in the remote parent (original) repo

git fetch upstream
git checkout main
git merge upstream/master

Jupyter Book

Build your book

jupyter-book build --all teabook/

Streamlit

Create an app

streamlit run <app.py>

Data

Yelp Fusion API

Note: the Yelp Fusion API is a free API on Yelp's Developer Site. Details from the Yelp Fusion page:

Create an app on Yelp's Developers site

In order to set up your access to Yelp Fusion API, you need to create an app with Yelp. This app represents the application you'll build using our API and includes the credentials you'll need to gain access. Here are the steps for creating an app:

  1. Go to Create App
  2. In the create new app form, enter information about your app, then agree to Yelp API Terms of Use and Display Requirements. Then click the Submit button.
  3. You will now have an API Key.

Please keep the API Key 🔑 private: it is the credential that authenticates your calls to Yelp's API.

Source: Get started with the Yelp Fusion API
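Once you have an API Key, a Business Search request can be sketched as below. This is a minimal illustration using only the Python standard library, not necessarily the code used in the workshop notebooks. The function names and the YELP_API_KEY environment variable are hypothetical; the search term and location are example values.

```python
import json
import os
import urllib.parse
import urllib.request

# Business Search endpoint of the Yelp Fusion API.
SEARCH_URL = "https://api.yelp.com/v3/businesses/search"


def build_search_request(api_key, term="bubble tea", location="New York City",
                         limit=50, offset=0):
    """Build (without sending) a Business Search request."""
    params = {"term": term, "location": location, "limit": limit, "offset": offset}
    url = SEARCH_URL + "?" + urllib.parse.urlencode(params)
    # The API Key travels in the Authorization header, never in the URL.
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})


def search(api_key, **kwargs):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_search_request(api_key, **kwargs)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # YELP_API_KEY is a hypothetical variable name; set it in your shell first.
    key = os.environ.get("YELP_API_KEY")
    if key:
        data = search(key, limit=5)
        for business in data.get("businesses", []):
            print(business.get("name"), business.get("rating"))
```

Reading the key from an environment variable keeps it out of notebooks and out of version control.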

Datasets

Dataset | Description
Yelp Fusion API - Business Search | This endpoint returns up to 1000 businesses based on the provided search criteria.
NYC Borough Boundaries | GIS data of NYC boroughs.
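Because Business Search returns results one page at a time, collecting many shops means repeating the query with an increasing offset. The sketch below only computes those offsets. The page size of 50 is an assumption about the endpoint's per-request maximum, the cap of 1000 comes from the description above, and page_offsets is a hypothetical helper name.

```python
def page_offsets(total, limit=50, cap=1000):
    """Return the offsets needed to page through `total` results.

    `limit` is the assumed page size and `cap` the most results the
    endpoint returns for one search, so offsets never reach `cap`.
    """
    reachable = min(total, cap)
    return list(range(0, reachable, limit))


if __name__ == "__main__":
    # A search reporting 120 matches needs three requests.
    print(page_offsets(120))
```

Each offset can then be passed as the offset parameter of one Business Search request.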

Output Data

The results of the Yelp Fusion API query are saved as a CSV file named boba-nyc.csv.
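As an illustration of the kind of question the workshop asks of this file, the sketch below averages ratings per price tier using only the standard library. The column names rating and price are assumptions based on the fields Yelp returns for a business, and the sample rows are made up for the example rather than taken from boba-nyc.csv.

```python
import csv
import io
from collections import defaultdict
from statistics import mean


def mean_rating_by_price(rows):
    """Average the `rating` column for each `price` tier ("$", "$$", ...)."""
    ratings = defaultdict(list)
    for row in rows:
        price = row.get("price") or "unknown"  # some shops list no price
        ratings[price].append(float(row["rating"]))
    return {price: round(mean(values), 2) for price, values in ratings.items()}


# Made-up rows for illustration only; not taken from boba-nyc.csv.
SAMPLE = """name,rating,review_count,price
Shop A,4.0,120,$
Shop B,4.5,80,$$
Shop C,3.5,40,$
Shop D,5.0,10,
"""

if __name__ == "__main__":
    # For the real file, open "boba-nyc.csv" and pass the handle to
    # csv.DictReader instead of the in-memory sample used here.
    rows = csv.DictReader(io.StringIO(SAMPLE))
    print(mean_rating_by_price(rows))
```

The notebooks listed under Analysis do this kind of work at full scale; this snippet is only a small stand-in.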

Analysis

You can view these notebooks in your browser by clicking Demo under the Static Webpage column.

File Name | Description | Static Webpage
socrata-api-demo.ipynb | Intro to the Socrata API with the NYC Dog Licensing Dataset & Python | Demo
boba-analysis-nyc.ipynb | Analyzing Bubble Tea shops in NYC. | Demo
data-wrangling.ipynb | Query and data cleaning workflow from the Yelp Fusion API's Business Search endpoint. | Demo
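For readers new to the Socrata API covered in the first notebook, a query is a URL with SoQL parameters such as $limit. The sketch below builds such a URL without sending it. The helper name build_soda_url is hypothetical, and the dataset identifier "xxxx-xxxx" is a placeholder, not the real identifier of the Dog Licensing Dataset.

```python
import urllib.parse


def build_soda_url(dataset_id, domain="data.cityofnewyork.us", **soql):
    """Build a Socrata (SODA) JSON endpoint URL with SoQL parameters."""
    # SoQL parameters are prefixed with "$", e.g. $limit and $select.
    params = {f"${name}": value for name, value in soql.items()}
    # Keep "$" unescaped so the URL stays readable.
    query = urllib.parse.urlencode(params, safe="$")
    return f"https://{domain}/resource/{dataset_id}.json?{query}"


if __name__ == "__main__":
    # "xxxx-xxxx" is a placeholder; substitute the dataset's real identifier.
    print(build_soda_url("xxxx-xxxx", limit=5))
```

The resulting URL can be opened in a browser or fetched from Python to return the rows as JSON.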

Streamlit App Demo

streamlit-app-demo

Open Source Applications Used in Project

  • Anaconda: A distribution of the Python and R programming languages for scientific computing (data science, machine learning applications, large-scale data processing, predictive analytics, etc.), that aims to simplify package management and deployment.
  • Project Jupyter: Project Jupyter is a non-profit, open-source project, born out of the IPython Project in 2014 as it evolved to support interactive data science and scientific computing across all programming languages.
  • Jupyter Notebook: The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text.
  • Jupyter Book: Jupyter Book is an open source project for building beautiful, publication-quality books and documents from computational material.
  • nbviewer: A web application that lets you enter the URL of a Jupyter Notebook file, renders that notebook as a static HTML web page, and gives you a stable link to that page which you can share with others.
  • Binder: The Binder Project is an open community that makes it possible to create sharable, interactive, reproducible environments.
  • Socrata: The Socrata Open Data API allows you to programmatically access a wealth of open data resources from governments, non-profits, and NGOs around the world.
  • Plotly: The front end for ML and data science models.

Other Applications and Services Used in Project

NYC Open Data Week 2022

  • About Open Data Week: Open Data Week is organized and produced by the NYC Open Data Program and BetaNYC. This annual festival takes place during the first week of March to celebrate New York City’s Open Data Law, which was signed into law on March 7, 2012, and International Open Data Day which is typically the first Saturday in March.
  • NYC Open Data: Open Data is free public data published by New York City agencies and other partners.

Yelp Fusion API - references

CC-licensed materials

Images

Bubble Tea Logo: Photo at ViVi Bubble Tea - 49 Bayard St, New York, NY 10013

Cheatsheets

Image editors

Social media badges

Further Reading

Licensing

Say Hello 👋

We can be reached at:

Presenter | LinkedIn | GitHub | Twitter
Mark Bauer | LinkedIn | GitHub | Twitter
Chidi Ezeolu | LinkedIn | GitHub |
Ho Hsieh | LinkedIn | GitHub |
Nathan Williamson | | GitHub |
