# Dallas vs. Fort Worth

### Texas is thriving, and people from all over the world move here for work. Those who come to North Texas typically live in or near either Fort Worth or Dallas. Despite the two cities' proximity to one another, they have very different vibes. I will use Foursquare location data to compare the two cities' similarities and differences and to make suggestions for people moving to North Texas, based on their interests.

# Data Described

### I will use Foursquare location data to gather information about these two cities. A lot can be gathered about Foursquare venues, including comments, check-ins, likes, venue category, venue name, trending details, and more. I will look specifically at the most common types of venues in each city and see how the two cities compare. This data will help people moving to North Texas choose a city based on their lifestyle and the sorts of venues they would frequent.

In [14]:
conda install -c conda-forge folium


Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... done

# All requested packages already installed.


Note: you may need to restart the kernel to use updated packages.


In [1]:
import folium


In [2]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

# !conda install -c conda-forge geopy --yes # uncomment this line if geopy is not installed
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas import json_normalize # transform JSON data into a pandas dataframe (pandas >= 1.0)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

# !conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if folium is not installed
import folium # map rendering library

print('Libraries imported.')

Libraries imported.


In [3]:
CLIENT_ID = 'Hidden' # your Foursquare ID
CLIENT_SECRET = 'Hidden' # your Foursquare Secret
VERSION = '20200422' # Foursquare API version
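
Rather than editing placeholder strings in the notebook, the credentials could be read from environment variables. The sketch below is one way to do that; the variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and the helper `load_foursquare_credentials` are hypothetical, chosen for this example and not defined by Foursquare or by this notebook.

```python
import os


def load_foursquare_credentials(env=None):
    """Return (client_id, client_secret) read from environment variables.

    The variable names are hypothetical and only used in this sketch.
    """
    env = os.environ if env is None else env
    client_id = env.get('FOURSQUARE_CLIENT_ID')
    client_secret = env.get('FOURSQUARE_CLIENT_SECRET')
    if not client_id or not client_secret:
        raise RuntimeError('Set FOURSQUARE_CLIENT_ID and FOURSQUARE_CLIENT_SECRET first.')
    return client_id, client_secret


# Example with an explicit mapping instead of the real environment.
example_env = {'FOURSQUARE_CLIENT_ID': 'Hidden', 'FOURSQUARE_CLIENT_SECRET': 'Hidden'}
CLIENT_ID, CLIENT_SECRET = load_foursquare_credentials(example_env)
```

Passing a mapping makes the helper easy to try out without touching the real environment; calling it with no argument reads `os.environ`.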



In [4]:
dallas_latitude = '32.7767'
dallas_longitude = '-96.7970'


city_name = 'Dallas, TX'

print('Latitude and longitude values of {} are {}, {}.'.format(city_name, 
                                                               dallas_latitude, 
                                                               dallas_longitude))

Latitude and longitude values of Dallas, TX are 32.7767, -96.7970.


In [5]:
ftw_latitude = '32.7555'
ftw_longitude = '-97.3308'


city_name = 'Fort Worth, TX'

print('Latitude and longitude values of {} are {}, {}.'.format(city_name,
                                                               ftw_latitude,
                                                               ftw_longitude))

Latitude and longitude values of Fort Worth, TX are 32.7555, -97.3308.


### API request for Dallas Venue Data

In [6]:

CLIENT_ID = 'Hidden'
CLIENT_SECRET = 'Hidden'
VERSION = '20200422'
LIMIT = 100
radius = 500
url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID,
    CLIENT_SECRET,
    VERSION,
    dallas_latitude,
    dallas_longitude,
    radius,
    LIMIT)
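
The Dallas and Fort Worth request cells differ only in the coordinates, so the URL could be built by one small helper. This is a minimal sketch using the standard library's `urlencode`; the helper name `build_explore_url` and the constant `EXPLORE_ENDPOINT` are assumptions made for this example, while the parameter names are the ones used in the cell above.

```python
from urllib.parse import urlencode

EXPLORE_ENDPOINT = 'https://api.foursquare.com/v2/venues/explore'


def build_explore_url(lat, lng, client_id, client_secret, version, radius=500, limit=100):
    """Build a venues/explore URL with the same query parameters as the cell above."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'v': version,
        'll': '{},{}'.format(lat, lng),
        'radius': radius,  # search radius in meters around the point
        'limit': limit,    # maximum number of venues to return
    }
    return EXPLORE_ENDPOINT + '?' + urlencode(params)


dallas_url = build_explore_url('32.7767', '-96.7970', 'Hidden', 'Hidden', '20200422')
ftw_url = build_explore_url('32.7555', '-97.3308', 'Hidden', 'Hidden', '20200422')
print(dallas_url)
```

`urlencode` percent-encodes the comma in `ll` as `%2C`, which is an equivalent way of writing the same query string.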


### Get the results and store them

In [8]:

results = requests.get(url).json()
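
The later cells index straight into `results['response']['groups'][0]['items']`, which fails with an unhelpful `KeyError` if the request was rejected. A defensive version might look like the sketch below. The helper name `extract_venue_items` is hypothetical, and the check on `meta.code` assumes the v2 response layout, which is not shown in this notebook.

```python
def extract_venue_items(payload):
    """Return the list of venue items from a venues/explore JSON payload.

    Assumes a top-level 'meta' object with a numeric 'code' (an assumption
    about the v2 response format) and items under response -> groups[0] -> items.
    """
    code = payload.get('meta', {}).get('code')
    if code != 200:
        raise ValueError('Foursquare request failed with meta code {}'.format(code))
    groups = payload.get('response', {}).get('groups', [])
    if not groups:
        return []
    return groups[0].get('items', [])


# Hand-written payload for illustration only, not real API output.
sample_payload = {
    'meta': {'code': 200},
    'response': {'groups': [{'items': [{'venue': {'name': 'Example Venue'}}]}]},
}
print(len(extract_venue_items(sample_payload)))  # 1
```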


In [9]:
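def get_category_type(row):
    """Return the name of the venue's first listed category, or None if it has none."""
    try:
        categories_list = row['categories']
    except KeyError:
        # after json_normalize the column is named 'venue.categories'
        categories_list = row['venue.categories']

    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']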
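
To see what the function returns, it can be called on hand-made rows. The sketch below repeats the function (with the fallback narrowed to `KeyError`) so that it runs on its own, and uses plain dictionaries as stand-ins for DataFrame rows; the venue names are invented for illustration.

```python
def get_category_type(row):
    """Return the name of the first category in a venue row, or None."""
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
    if len(categories_list) == 0:
        return None
    return categories_list[0]['name']


# Hypothetical rows shaped like the flattened API items.
row_with_category = {
    'venue.name': 'Example Hotel',
    'venue.categories': [{'name': 'Hotel'}, {'name': 'Bar'}],
}
row_without_category = {'venue.name': 'Unlabeled Place', 'venue.categories': []}

print(get_category_type(row_with_category))     # Hotel
print(get_category_type(row_without_category))  # None
```

Only the first category is kept, so a venue tagged as both a hotel and a bar counts as a hotel in the analysis that follows.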

# Dallas Venues

In [10]:
venues =results['response']['groups'][0]['items']
nearby_venues = json_normalize(venues)

#filtering columns
filtered_columns = ['venue.name','venue.categories','venue.location.lat','venue.location.lng','venue.id']
nearby_venues = nearby_venues.loc[:,filtered_columns]

#filter category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type,axis=1)

#clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues.head()


  


|   | name | categories | lat | lng | id |
|---|------|------------|-----|-----|----|
| 0 | AT&T | Mobile Phone Shop | 32.778866 | -96.798911 | 4b60537cf964a520bfdf29e3 |
| 1 | DataBank, Ltd. | IT Services | 32.778459 | -96.798155 | 4bab84c9f964a520c7af3ae3 |
| 2 | The Joule | Hotel | 32.780558 | -96.798247 | 4bc3321adce4eee1287c719d |
| 3 | Weekend | Coffee Shop | 32.780309 | -96.798191 | 5193bdfc454a90460c1c6129 |
| 4 | Spice in the City | Indian Restaurant | 32.780014 | -96.797829 | 54cfc18a498e4f433e49e3d5 |
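
The "clean columns" step keeps only the last dot-separated part of each flattened column name. The sketch below shows the effect on the five columns selected above and adds a guard against two columns shortening to the same name; the guard and the helper name `shorten_columns` are additions for illustration, not part of the notebook.

```python
flattened_columns = ['venue.name', 'venue.categories', 'venue.location.lat',
                     'venue.location.lng', 'venue.id']

# Same expression as in the notebook cell.
short_names = [col.split('.')[-1] for col in flattened_columns]
print(short_names)  # ['name', 'categories', 'lat', 'lng', 'id']


def shorten_columns(columns):
    """Shorten dotted column names and fail early if two of them collide."""
    short = [col.split('.')[-1] for col in columns]
    duplicates = sorted({name for name in short if short.count(name) > 1})
    if duplicates:
        raise ValueError('Column names collide after shortening: {}'.format(duplicates))
    return short
```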


In [11]:
print('{} venues were returned by Foursquare'.format(nearby_venues.shape[0]))
dallas_venues = nearby_venues
dallas_venues['City'] = 'Dallas'

26 venues were returned by Foursquare


In [12]:
dallas_venues.head()

|   | name | categories | lat | lng | id | City |
|---|------|------------|-----|-----|----|------|
| 0 | AT&T | Mobile Phone Shop | 32.778866 | -96.798911 | 4b60537cf964a520bfdf29e3 | Dallas |
| 1 | DataBank, Ltd. | IT Services | 32.778459 | -96.798155 | 4bab84c9f964a520c7af3ae3 | Dallas |
| 2 | The Joule | Hotel | 32.780558 | -96.798247 | 4bc3321adce4eee1287c719d | Dallas |
| 3 | Weekend | Coffee Shop | 32.780309 | -96.798191 | 5193bdfc454a90460c1c6129 | Dallas |
| 4 | Spice in the City | Indian Restaurant | 32.780014 | -96.797829 | 54cfc18a498e4f433e49e3d5 | Dallas |


### API request for Fort Worth Data

In [13]:

CLIENT_ID = 'Hidden'
CLIENT_SECRET = 'Hidden'
VERSION = '20200422'
LIMIT = 100
radius = 500
url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID,
    CLIENT_SECRET,
    VERSION,
    ftw_latitude,
    ftw_longitude,
    radius,
    LIMIT)

### Get and store Fort Worth venue data

In [14]:
resultsftw = requests.get(url).json()


# Fort Worth Venues

In [15]:
venues =resultsftw['response']['groups'][0]['items']
nearby_venues = json_normalize(venues)

#filtering columns
filtered_columns = ['venue.name','venue.categories','venue.location.lat','venue.location.lng','venue.id']
nearby_venues = nearby_venues.loc[:,filtered_columns]

#filter category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type,axis=1)

#clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues.head()

  


|   | name | categories | lat | lng | id |
|---|------|------------|-----|-----|----|
| 0 | Bass Performance Hall | Performing Arts Venue | 32.755024 | -97.329903 | 4b244a4cf964a520596524e3 |
| 1 | Sundance Square | Plaza | 32.754764 | -97.3314 | 4b1b2452f964a520c7f823e3 |
| 2 | Silver Leaf Cigar Lounge | Lounge | 32.755096 | -97.330601 | 531603dd498eaf59c9c8b755 |
| 3 | Flying Saucer Draught Emporium | Beer Bar | 32.755639 | -97.331248 | 40e0b100f964a520a7091fe3 |
| 4 | Bird Cafe | New American Restaurant | 32.754835 | -97.330547 | 524c4ff3bce69aecefce62e3 |


In [16]:
print('{} venues were returned by Foursquare'.format(nearby_venues.shape[0]))
ftw_venues = nearby_venues

67 venues were returned by Foursquare


In [17]:
ftw_venues['City'] = 'Fort Worth'
ftw_venues.head()

|   | name | categories | lat | lng | id | City |
|---|------|------------|-----|-----|----|------|
| 0 | Bass Performance Hall | Performing Arts Venue | 32.755024 | -97.329903 | 4b244a4cf964a520596524e3 | Fort Worth |
| 1 | Sundance Square | Plaza | 32.754764 | -97.3314 | 4b1b2452f964a520c7f823e3 | Fort Worth |
| 2 | Silver Leaf Cigar Lounge | Lounge | 32.755096 | -97.330601 | 531603dd498eaf59c9c8b755 | Fort Worth |
| 3 | Flying Saucer Draught Emporium | Beer Bar | 32.755639 | -97.331248 | 40e0b100f964a520a7091fe3 | Fort Worth |
| 4 | Bird Cafe | New American Restaurant | 32.754835 | -97.330547 | 524c4ff3bce69aecefce62e3 | Fort Worth |


## Joining Dallas & Fort Worth Data

In [18]:
dfw_venues = pd.concat([dallas_venues, ftw_venues], ignore_index=True)  # renumber rows so index labels are unique
dfw_venues.head()

|   | name | categories | lat | lng | id | City |
|---|------|------------|-----|-----|----|------|
| 0 | AT&T | Mobile Phone Shop | 32.778866 | -96.798911 | 4b60537cf964a520bfdf29e3 | Dallas |
| 1 | DataBank, Ltd. | IT Services | 32.778459 | -96.798155 | 4bab84c9f964a520c7af3ae3 | Dallas |
| 2 | The Joule | Hotel | 32.780558 | -96.798247 | 4bc3321adce4eee1287c719d | Dallas |
| 3 | Weekend | Coffee Shop | 32.780309 | -96.798191 | 5193bdfc454a90460c1c6129 | Dallas |
| 4 | Spice in the City | Indian Restaurant | 32.780014 | -96.797829 | 54cfc18a498e4f433e49e3d5 | Dallas |


# Analyze Each City


In [19]:
dfw_onehot = pd.get_dummies(dfw_venues[['categories']],prefix = "",prefix_sep="")

dfw_onehot['City'] = dfw_venues['City']

fixed_columns = [dfw_onehot.columns[-1]] + list(dfw_onehot.columns[:-1])
dfw_onehot = dfw_onehot[fixed_columns]

dfw_onehot.head()

Unnamed: 0,City,American Restaurant,Bar,Beer Bar,Bistro,Boutique,Brazilian Restaurant,Breakfast Spot,Burger Joint,Café,Cajun / Creole Restaurant,Chinese Restaurant,Cocktail Bar,Coffee Shop,Comedy Club,Department Store,Dessert Shop,Diner,Fondue Restaurant,Food Truck,French Restaurant,Grocery Store,Gym,Gym / Fitness Center,Hotel,IT Services,Ice Cream Shop,Indian Restaurant,Italian Restaurant,Jazz Club,Korean Restaurant,Lounge,Mexican Restaurant,Mobile Phone Shop,Movie Theater,New American Restaurant,Nightclub,Park,Performing Arts Venue,Piano Bar,Pizza Place,Plaza,Pub,Public Art,Rental Car Location,Rock Club,Sandwich Place,Seafood Restaurant,Shipping Store,Steakhouse,Sushi Restaurant,Thai Restaurant,Theater,Turkish Restaurant
0,Dallas,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Dallas,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Dallas,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Dallas,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Dallas,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


### Group rows by city and take the mean of each one-hot column, which gives each category's share of that city's venues


In [20]:
dfw_grouped = dfw_onehot.groupby('City').mean().reset_index()
dfw_grouped

Unnamed: 0,City,American Restaurant,Bar,Beer Bar,Bistro,Boutique,Brazilian Restaurant,Breakfast Spot,Burger Joint,Café,Cajun / Creole Restaurant,Chinese Restaurant,Cocktail Bar,Coffee Shop,Comedy Club,Department Store,Dessert Shop,Diner,Fondue Restaurant,Food Truck,French Restaurant,Grocery Store,Gym,Gym / Fitness Center,Hotel,IT Services,Ice Cream Shop,Indian Restaurant,Italian Restaurant,Jazz Club,Korean Restaurant,Lounge,Mexican Restaurant,Mobile Phone Shop,Movie Theater,New American Restaurant,Nightclub,Park,Performing Arts Venue,Piano Bar,Pizza Place,Plaza,Pub,Public Art,Rental Car Location,Rock Club,Sandwich Place,Seafood Restaurant,Shipping Store,Steakhouse,Sushi Restaurant,Thai Restaurant,Theater,Turkish Restaurant
0,Dallas,0.038462,0.0,0.0,0.038462,0.038462,0.0,0.0,0.038462,0.076923,0.0,0.0,0.0,0.076923,0.0,0.038462,0.0,0.0,0.0,0.0,0.038462,0.038462,0.0,0.0,0.230769,0.038462,0.0,0.038462,0.0,0.0,0.0,0.0,0.038462,0.038462,0.0,0.038462,0.0,0.0,0.038462,0.0,0.0,0.076923,0.0,0.0,0.0,0.0,0.0,0.038462,0.0,0.0,0.0,0.0,0.0,0.0
1,Fort Worth,0.104478,0.029851,0.029851,0.0,0.0,0.014925,0.014925,0.014925,0.014925,0.014925,0.029851,0.014925,0.044776,0.029851,0.0,0.014925,0.014925,0.014925,0.014925,0.0,0.0,0.014925,0.014925,0.074627,0.0,0.014925,0.0,0.014925,0.014925,0.014925,0.029851,0.029851,0.0,0.014925,0.029851,0.014925,0.014925,0.014925,0.014925,0.014925,0.014925,0.014925,0.014925,0.014925,0.014925,0.014925,0.029851,0.014925,0.044776,0.014925,0.014925,0.014925,0.014925
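
The mean of a 0/1 column is the fraction of rows that equal 1, so each value in the grouped table is simply (venues in that category) / (all venues in that city). For example, the Dallas Hotel value of 0.230769 is consistent with 6 of the 26 Dallas venues being hotels. The sketch below reproduces the same calculation in plain Python as a cross-check; it is not the notebook's pandas method, the helper name `category_frequencies` is hypothetical, and the data is a toy list.

```python
from collections import Counter


def category_frequencies(categories):
    """Return {category: share of venues} for one city's list of venue categories."""
    counts = Counter(categories)
    total = len(categories)
    return {category: count / total for category, count in counts.items()}


# Toy data, not taken from the API results.
toy_city = ['Hotel', 'Hotel', 'Coffee Shop', 'Plaza']
freqs = category_frequencies(toy_city)
print(freqs['Hotel'])        # 0.5
print(freqs['Coffee Shop'])  # 0.25

# The arithmetic behind the Dallas Hotel entry in the table above.
print(round(6 / 26, 6))      # 0.230769
```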


### Top 10 most common venues in each city

In [21]:
num_top_venues = 10
for city in dfw_grouped['City']:
    print("-----"+city+"-----")
    temp = dfw_grouped[dfw_grouped['City']==city].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp= temp.round({'freq':2})
    print(temp.sort_values('freq',ascending =False).reset_index(drop=True).head(num_top_venues))
    print('\n')

-----Dallas-----
                     venue  freq
0                    Hotel  0.23
1                    Plaza  0.08
2                     Café  0.08
3              Coffee Shop  0.08
4      American Restaurant  0.04
5         Department Store  0.04
6       Seafood Restaurant  0.04
7    Performing Arts Venue  0.04
8  New American Restaurant  0.04
9        Mobile Phone Shop  0.04


-----Fort Worth-----
                     venue  freq
0      American Restaurant  0.10
1                    Hotel  0.07
2               Steakhouse  0.04
3              Coffee Shop  0.04
4       Chinese Restaurant  0.03
5       Mexican Restaurant  0.03
6                      Bar  0.03
7  New American Restaurant  0.03
8              Comedy Club  0.03
9                   Lounge  0.03




### Function to return a city's most common venue categories, sorted by descending frequency

In [22]:
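def return_most_common_venues(row, num_top_venues):
    """Return the names of the num_top_venues most frequent categories in a grouped row."""
    row_categories = row.iloc[1:]  # skip the 'City' column
    row_categories_sorted = row_categories.sort_values(ascending=False)

    return row_categories_sorted.index.values[0:num_top_venues]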
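
Many categories share exactly the same frequency (for example, 0.038462 in Dallas), so their relative order depends on how the sort happens to break ties. That may be why tied categories appear in a different order in the top-10 printout and in the top-15 table. A ranking with an explicit tie-break could look like the plain-Python sketch below; the helper name `top_categories` and the toy frequencies are assumptions for illustration.

```python
def top_categories(freqs, n):
    """Return the n most frequent categories; ties are broken alphabetically."""
    ranked = sorted(freqs.items(), key=lambda item: (-item[1], item[0]))
    return [category for category, _ in ranked[:n]]


# Toy frequencies with a three-way tie, for illustration only.
toy_freqs = {'Hotel': 0.5, 'Plaza': 0.1, 'Café': 0.1, 'Bistro': 0.1, 'Gym': 0.2}
print(top_categories(toy_freqs, 3))  # ['Hotel', 'Gym', 'Bistro']
```

Sorting on the pair (negative frequency, name) keeps the descending order by frequency and makes the result reproducible from run to run.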

In [23]:
# Display the top 15 venue categories in each city
num_top_venues = 15

indicators = ['st', 'nd', 'rd']
# Build column names according to the number of top venues
columns = ['City']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))
# Create a new dataframe
city_venues_sorted = pd.DataFrame(columns=columns)
city_venues_sorted['City'] = dfw_grouped['City']

for ind in np.arange(dfw_grouped.shape[0]):
    city_venues_sorted.iloc[ind, 1:] = return_most_common_venues(dfw_grouped.iloc[ind, :], num_top_venues)

city_venues_sorted.head()

|   | City | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue | 6th Most Common Venue | 7th Most Common Venue | 8th Most Common Venue | 9th Most Common Venue | 10th Most Common Venue | 11th Most Common Venue | 12th Most Common Venue | 13th Most Common Venue | 14th Most Common Venue | 15th Most Common Venue |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Dallas | Hotel | Café | Coffee Shop | Plaza | Indian Restaurant | New American Restaurant | Bistro | Boutique | Burger Joint | Department Store | French Restaurant | Grocery Store | IT Services | Mexican Restaurant | Mobile Phone Shop |
| 1 | Fort Worth | American Restaurant | Hotel | Steakhouse | Coffee Shop | Beer Bar | Seafood Restaurant | Bar | Lounge | Mexican Restaurant | Chinese Restaurant | Comedy Club | New American Restaurant | Fondue Restaurant | Gym / Fitness Center | Gym |
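
The suffix lookup in the cell above is fine for 15 columns, but it would label a 21st column as "21th". If more columns were ever needed, a general ordinal helper could be used instead. This is a small sketch; the function name `ordinal` is an assumption for this example.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1st, 2nd, 3rd, 4th, 11th, 21st."""
    if 10 <= n % 100 <= 20:
        suffix = 'th'  # 11th, 12th, 13th and the rest of the teens
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)


columns = ['City'] + ['{} Most Common Venue'.format(ordinal(i)) for i in range(1, 16)]
print(columns[:4])
# ['City', '1st Most Common Venue', '2nd Most Common Venue', '3rd Most Common Venue']
```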


# Results and Insights
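Overall, Dallas and Fort Worth look broadly similar in this sample, which covers venues within 500 meters of each city's coordinates (26 in Dallas and 67 in Fort Worth) and so reflects the downtown cores rather than the whole cities. Hotels and coffee shops rank near the top in both, and both offer a wide variety of restaurants. For nightlife, Fort Worth appears to have more options: Beer Bar, Bar, and Lounge are its 5th, 7th, and 8th most common categories, while no bar category appears in the Dallas sample. Fort Worth also counts Comedy Club among its top 15 venue categories. Regardless of the city chosen, there is plenty of fun and food to be had.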
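
One way to put the comparison on firmer footing is to list which top-15 categories the two cities share and which are unique to each. The sketch below uses the two lists copied from the top-15 table above; the variable names are chosen for this example. Because many categories are tied in frequency, membership near the bottom of each list is somewhat arbitrary, so the result is a rough indication only.

```python
dallas_top = [
    'Hotel', 'Café', 'Coffee Shop', 'Plaza', 'Indian Restaurant',
    'New American Restaurant', 'Bistro', 'Boutique', 'Burger Joint',
    'Department Store', 'French Restaurant', 'Grocery Store', 'IT Services',
    'Mexican Restaurant', 'Mobile Phone Shop',
]
fort_worth_top = [
    'American Restaurant', 'Hotel', 'Steakhouse', 'Coffee Shop', 'Beer Bar',
    'Seafood Restaurant', 'Bar', 'Lounge', 'Mexican Restaurant',
    'Chinese Restaurant', 'Comedy Club', 'New American Restaurant',
    'Fondue Restaurant', 'Gym / Fitness Center', 'Gym',
]

# Set operations give the overlap and the categories unique to each city.
shared = sorted(set(dallas_top) & set(fort_worth_top))
only_dallas = sorted(set(dallas_top) - set(fort_worth_top))
only_fort_worth = sorted(set(fort_worth_top) - set(dallas_top))

print('Shared:', shared)
print('Only Dallas:', only_dallas)
print('Only Fort Worth:', only_fort_worth)
```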

