# Setup

Import the following packages. To get all the packages run `pip install -r requirements.txt`

In [1]:
import pandas as pd
from geopy.distance import distance

# Theory

My friend had an interesting theory that at every Jersey Mike's store there is also a Starbucks store in vincinity, whether within the same street mall or a couple of blocks near. This project is intended to preform data analysis on how accurate this theory is for specifically US Jersey Mike's store locations.

This project walks through loading and examining datasets pertaining to geolocation data on all Jersey Mike's and Starbucks stores in the US. We will treat all of our Jersey Mike's store locations as "points". Thus, for every Jersey Mike's point, we will check whether a Starbucks store is in proximity.

# Load Datasets

Jersey Mike's location dataset was scraped on another notebook, `starbucks_jersey_mikes_theory.ipynb`. You can obtain the dataset for Starbucks locations through the Kaggle site, https://www.kaggle.com/datasets/kukuroo3/starbucks-locations-worldwide-2021-version. However, this dataset will entail a little more cleaning.

In [2]:
# Load Jersey Mike's dataset
jersey_mikes_df = pd.read_csv('../data/jersey_mikes_locations.csv')
jersey_mikes_df.head()

Unnamed: 0,Address,Latitude,Longitude
0,"Imperial Promenade,5675 E La Palma Avenue,Suit...",33.86097,-117.791142
1,"4509 Phoenix Avenue,Fort Smith, AR 72903-6005,US",35.338555,-94.383186
2,"3821 Lakewood Boulevard ,Ste. 101,Long Beach, ...",33.828598,-118.143035
3,"Station Park West,1060 West Park Lane,Suite 11...",40.983426,-111.907794
4,"6095 Carlson Way,Suite B,Marion, IA 52302-6651,US",42.036505,-91.547915


In [3]:
# Load Starbucks dataset
starbucks_df = pd.read_csv('../data/starbucks_locations.csv')
starbucks_df.head()

Unnamed: 0.1,Unnamed: 0,storeNumber,countryCode,ownershipTypeCode,schedule,slug,latitude,longitude,streetAddressLine1,streetAddressLine2,streetAddressLine3,city,countrySubdivisionCode,postalCode,currentTimeOffset,windowsTimeZoneId,olsonTimeZoneId
0,0,34638-85784,HK,LS,"[{'dayName': 'Today', 'hours': '8:30 AM to 10:...",荷李活廣場-level-2-plaza-hollywood-diamond-hill-hon...,22.3407,114.20169,"Level 2, Plaza Hollywood, Diamond Hill,",Kowloon,,Hong Kong,91,,480,China Standard Time,GMT+08:00 Asia/Hong_Kong
1,1,32141-267986,HK,LS,"[{'dayName': 'Today', 'hours': '7:30 AM to 10:...",黃大仙中心南館-shop-no-g-3-b-ground-floor-wong-tai-si...,22.341694,114.194208,"Shop No. G3B, Ground Floor, Wong Tai Sin","Shopping Centre, Wong Tai Sin, Kowloon",,Hong Kong,91,,480,China Standard Time,GMT+08:00 Asia/Hong_Kong
2,2,15035-155445,HK,LS,"[{'dayName': 'Today', 'hours': '8:00 AM to 10:...",mikiki-shop-no-g-01-ground-floor-mikiki-638-ko...,22.33352,114.19678,"Shop No. G01, Ground Floor, Mikiki 638","Prince Edward Road East,",,Kowloon,91,,480,China Standard Time,GMT+08:00 Asia/Hong_Kong
3,3,49646-268445,HK,LS,"[{'dayName': 'Today', 'hours': '8:00 AM to 10:...",九龍城廣場-九龍九龍城廣場-地下低層-lg-10-舖-香港-91-hk,22.331223,114.188143,九龍九龍城廣場 地下低層LG10舖,,,香港,91,,480,China Standard Time,GMT+08:00 Asia/Hong_Kong
4,4,31944-224544,HK,LS,"[{'dayName': 'Today', 'hours': '8:00 AM to 8:3...",國際展貿中心-shop-48-g-f-hk-intl-trade-exhibit-hong-...,22.323871,114.203796,"Shop 48, G/F,HK Int'l Trade & Exhibit","1 Trademart Drive, Kowloon Bay",,Hong Kong,91,,480,China Standard Time,GMT+08:00 Asia/Hong_Kong


In [4]:
# Filter for US stores for Starbucks locations
# We are also only interested in latitude and longitude data
starbucks_df = starbucks_df.loc[starbucks_df['countryCode'] == 'US']
starbucks_df = starbucks_df[['latitude', 'longitude']]

starbucks_df.head()

Unnamed: 0,latitude,longitude
861,48.116035,-123.434818
862,48.105707,-123.379985
866,48.077861,-123.129862
867,48.080661,-123.117224
869,48.286608,-122.661504


# Testing Theory

For a generous test, we will check whether there are any starbucks within a 0.5 mile distance of a Jersey Mike's point.

In [5]:
# Total count for all Jersey Mike's in vincinity of a Starbucks store
metric = 0

for _, jersey_mikes_store in jersey_mikes_df.iterrows():
    # Get geolocation point for Jersey Mike's store
    jersey_mikes_location = (jersey_mikes_store['Latitude'], jersey_mikes_store['Longitude'])
    
    for _, starbucks_store in starbucks_df.iterrows():
        # Get geolocation point for Starbucks store
        starbucks_location = (starbucks_store['latitude'], starbucks_store['longitude'])
        
        # Calculate distance and update metric count
        distance_between = distance(jersey_mikes_location, starbucks_location).miles
        
        # We can assume that one Starbucks is near a Jersey Mike's store
        if distance_between <= 0.5:
            metric += 1
            continue

In [6]:
metric

497