# Dog-Friendly Neighbourhoods of Stockholm

<img src="https://miro.medium.com/max/1800/1*Ajbb76yGEqKPRdbp-rrn4g.jpeg" alt="dogswelcome" align="left" width=600>

<p>Hello, my name is Liuba and I live in Gothenburg with my dog and sidekick, Watson. We live in a neighbourhood called Olskroken. When I was buying my apartment, I chose this neighbourhood for its dog-friendliness (before I even met Watson!).</p>
<p><strong>What does a dog-friendly neighbourhood mean?</strong> From my point of view, a neighbourhood can be called dog-friendly if it has the following attributes:</p>
<ul>
<li>A forest or a park</li>
<li>An Animal Hospital</li>
<li>A Dog Park (called "hundrastgård" in Swedish)</li>
<li>Doggy Daycare (called "hunddagis" in Swedish)</li>
<li>A Pet Store</li>
<li>A Pet Salon</li>
<li>Dog-friendly cafes and restaurants</li>
</ul>
<p>A neighbourhood does not need all of the attributes above to be called dog-friendly, but the more boxes it ticks, the higher it ranks on my list.</p>
<p><strong>Now to the problem and goal of this project</strong>: I am looking into moving to Stockholm and would like to find a dog-friendly area to live in. The goal of this project is to identify and compare dog-friendly neighbourhoods in Stockholm.</p>
<p>I believe the results of this analysis will be useful to any dog owner living in Stockholm, or to anyone who wants to move there with their furry buddy.</p>
<p>I will use the Foursquare API to retrieve venue information, along with any data about the neighbourhoods of Stockholm that I can scrape from Wikipedia or elsewhere on the internet.</p>

# Step 1. Importing the libraries that will be required

In [2]:
import pandas as pd # library for data analysis
import numpy as np  # library to handle data in a vectorized manner
import json # library to handle JSON files
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values
import requests # library to handle requests
from pandas import json_normalize # transform JSON into a pandas dataframe (top-level import needs pandas >= 1.0)
import matplotlib.cm as cm # Matplotlib and associated plotting modules
import matplotlib.colors as colors
from sklearn.cluster import KMeans # k-means clustering
import folium # map rendering library

print('Libraries imported.')

# Step 2. Processing the data

After some searching for map data on the districts of Stockholm, I ended up manually scraping the Wikipedia page https://en.wikipedia.org/wiki/Districts_of_Sweden for information about the districts. I created a file called stockholm_districts.csv containing the boroughs and districts of Stockholm.

In [8]:
data = pd.read_csv("/Users/liuba/Desktop/GitHub/Coursera_Capstone/stockholm_districts.csv")

Next I added columns for the latitude and longitude of each district, retrieving the coordinates with Nominatim, which we learned about during the labs.

In [9]:
geolocator = Nominatim(user_agent="explorer")  # create the geocoder once and reuse it

for i in range(len(data)):
    # First try the full address built from columns 2, 1 and 0
    address = data.iloc[i,2] + ", "+ data.iloc[i,1] + ", "+ data.iloc[i,0]
    location = geolocator.geocode(address)
    if location is None:
        # Fall back to a shorter address built from columns 2 and 0
        address = data.iloc[i,2] + ", "+ data.iloc[i,0]
        location = geolocator.geocode(address)
    if location is not None:
        data.at[i,'Lat'] = float(location.latitude)
        data.at[i,'Long'] = float(location.longitude)
    else:
        # Mark addresses that could not be located with 0, 0
        data.at[i,'Lat'] = float(0)
        data.at[i,'Long'] = float(0)
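The fallback logic above (try the full address first, then a shorter one) can be factored into a small helper. The following is a minimal sketch rather than the code used in this notebook: the helper name `geocode_first_match` and the stand-in geocoder are hypothetical, and the stand-in is there only so the example runs without network access.

```python
from typing import Callable, Iterable, Optional, Tuple


def geocode_first_match(
    candidates: Iterable[str],
    geocode: Callable[[str], Optional[object]],
) -> Tuple[float, float]:
    """Return (lat, long) for the first candidate address that resolves.

    Falls back to (0.0, 0.0) when none of the candidates can be located,
    mirroring the placeholder used in the loop above.
    """
    for address in candidates:
        location = geocode(address)
        if location is not None:
            return float(location.latitude), float(location.longitude)
    return 0.0, 0.0


# Stand-in geocoder for illustration only (made-up data, not real lookups).
class FakeLocation:
    def __init__(self, latitude, longitude):
        self.latitude = latitude
        self.longitude = longitude


def fake_geocode(address):
    known = {"Alvik, Stockholm": FakeLocation(59.33, 17.98)}
    return known.get(address)


coords = geocode_first_match(
    ["Alvik, Bromma, Stockholm", "Alvik, Stockholm"], fake_geocode
)
missing = geocode_first_match(["Nowhere, Stockholm"], fake_geocode)
```

With geopy, `geocode` would be `geolocator.geocode`. The public Nominatim service asks for at most one request per second, so a short pause between calls (or geopy's `RateLimiter`) is worth adding when looping over many districts.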

In [10]:
#Checking if there is any address that was not located.
data.loc[(data['Lat'] == float(0))]

Next I created a map of Stockholm and added markers for the districts.

In [163]:
address = 'Stockholm,Sweden'

geolocator = Nominatim(user_agent="explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Stockholm are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Stockholm are 59.3251172, 18.0710935.


In [12]:
# Creating a map of Stockholm using latitude and longitude values
map_stkhlm = folium.Map(location=[latitude, longitude], zoom_start=11)

# Adding markers of the districts to the map
for lat, lng, borough, district in zip(data['Lat'], data['Long'], data['Borough'], data['District']):
    label = '{}, {}'.format(district, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=10,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_stkhlm)  
    
map_stkhlm

I noticed that a couple of districts were mapped incorrectly, so I manually corrected their coordinates.

In [13]:
data.at[15,'Lat'] = float(59.390159)
data.at[15,'Long'] = float(17.872202)
data.at[95,'Lat'] = float(59.251924)
data.at[95,'Long'] = float(18.174457)
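One way to spot misplaced markers without eyeballing the map is to measure each district's distance from the city centre and flag anything implausibly far away. The sketch below uses the haversine formula; the 30 km cut-off and the helper names are assumptions made for illustration, not part of the original analysis.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


# City-centre coordinates as printed earlier in this notebook.
CENTRE = (59.3251172, 18.0710935)
MAX_KM = 30  # hypothetical cut-off; tune to the size of the city


def far_from_centre(points, centre=CENTRE, max_km=MAX_KM):
    """Return the names of points lying more than max_km from the centre."""
    return [
        name
        for name, lat, lon in points
        if haversine_km(centre[0], centre[1], lat, lon) > max_km
    ]


# Illustrative rows: one corrected coordinate from above and one
# unlocated (0, 0) placeholder.
sample = [
    ("row 15", 59.390159, 17.872202),
    ("unlocated", 0.0, 0.0),
]
suspects = far_from_centre(sample)
```

A check like this only catches gross errors, such as a district name that resolved to a place in another part of the country. Markers that are off by a few kilometres still need a visual check on the map.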

In [7]:
# Creating a map of Stockholm using latitude and longitude values
map_stkhlm = folium.Map(location=[latitude, longitude], zoom_start=11)

# Adding markers of the districts to the map
for lat, lng, borough, district in zip(data['Lat'], data['Long'], data['Borough'], data['District']):
    label = '{}, {}'.format(district, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=10,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_stkhlm)  
    
map_stkhlm

Much better! I am not sure whether I retrieved all of Stockholm's districts or whether every marker is placed correctly, but I believe at this point we have enough data to start exploring the districts with the help of Foursquare.

Before doing that, I saved the processed data in CSV format so that it can be retrieved without processing it again.

In [15]:
data.to_csv(r'/Users/liuba/Desktop/GitHub/Coursera_Capstone/stockholm_districts_coords.csv', index=False)

In [3]:
data = pd.read_csv("/Users/liuba/Desktop/GitHub/Coursera_Capstone/stockholm_districts_coords.csv")
data.head()

# Step 3. Utilizing the Foursquare API to explore the districts of Stockholm

<p>First I examined the Foursquare venue categories (<a href="https://developer.foursquare.com/docs/resources/categories">https://developer.foursquare.com/docs/resources/categories</a>) to determine which ones are relevant to the goal of this analysis.</p>
<p>I think the following categories will be the key features of dog-friendly districts.</p>
<ul>
<li><strong>Pet Caf&eacute;</strong> 56aa371be4b08b9a8d573508</li>
<li><strong>Dog Run</strong> 4bf58dd8d48988d1e5941735</li>
<li><strong>Park</strong> 4bf58dd8d48988d163941735</li>
<li><strong>Trail</strong> 4bf58dd8d48988d159941735</li>
<li><strong>Veterinarian</strong> 4d954af4a243a5684765b473</li>
<li><strong>Pet Service</strong> 5032897c91d4c4b30a586d69</li>
<li><strong>Pet Store</strong> 4bf58dd8d48988d100951735</li>
</ul>
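Keeping the category names next to their IDs makes the later request easier to read and check. This is a small sketch using the IDs listed above; the variable names are my own choice for illustration.

```python
# Category names and IDs as listed above.
DOG_FRIENDLY_CATEGORIES = {
    "Pet Café": "56aa371be4b08b9a8d573508",
    "Dog Run": "4bf58dd8d48988d1e5941735",
    "Park": "4bf58dd8d48988d163941735",
    "Trail": "4bf58dd8d48988d159941735",
    "Veterinarian": "4d954af4a243a5684765b473",
    "Pet Service": "5032897c91d4c4b30a586d69",
    "Pet Store": "4bf58dd8d48988d100951735",
}

# The search request takes the IDs as one comma-separated string.
category_param = ",".join(DOG_FRIENDLY_CATEGORIES.values())

# Reverse lookup, handy when checking what a returned category ID stands for.
name_by_id = {cat_id: name for name, cat_id in DOG_FRIENDLY_CATEGORIES.items()}
```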

In [161]:
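# Read the API credentials from a local config file; the with-block closes the file afterwards
with open('/Users/liuba/Desktop/GitHub/Coursera_Capstone/config.json') as config_file:
    config = json.load(config_file)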
CLIENT_ID = config['CLIENT_ID']
CLIENT_SECRET = config['CLIENT_SECRET']
VERSION = config['VERSION']
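The config file is expected to hold the three keys read above. The sketch below shows one possible way to load it and fail early when a key is missing. The helper name `load_credentials` is hypothetical and the values are placeholders, not real credentials.

```python
import json
import os
import tempfile

REQUIRED_KEYS = ("CLIENT_ID", "CLIENT_SECRET", "VERSION")


def load_credentials(path):
    """Read the config file and fail early if a required key is missing."""
    with open(path, encoding="utf-8") as fh:
        config = json.load(fh)
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise KeyError("config file is missing: " + ", ".join(missing))
    return {key: config[key] for key in REQUIRED_KEYS}


# Demonstration with a throwaway file and placeholder values.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "config.json")
    with open(demo_path, "w", encoding="utf-8") as fh:
        json.dump(
            {"CLIENT_ID": "your-id",
             "CLIENT_SECRET": "your-secret",
             "VERSION": "YYYYMMDD"},
            fh,
        )
    creds = load_credentials(demo_path)
```

Keeping the credentials in a separate file that is excluded from version control avoids publishing them together with the notebook.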

In [218]:
category_ids = ['4bf58dd8d48988d163941735', '4bf58dd8d48988d159941735','56aa371be4b08b9a8d573508', '4bf58dd8d48988d1e5941735',  '4d954af4a243a5684765b473', '5032897c91d4c4b30a586d69', '4bf58dd8d48988d100951735']
category_ids = ",".join(category_ids)
radius = 20000
LIMIT = 100

In [219]:
url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&categoryId={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, category_ids, radius, LIMIT)
results = requests.get(url).json()

# assign relevant part of JSON to venues
venues = results['response']['venues']

# transform venues into a dataframe
dataframe = json_normalize(venues)
dataframe.head()

Unnamed: 0,id,name,categories,referralId,hasPerk,location.address,location.lat,location.lng,location.labeledLatLngs,location.distance,location.postalCode,location.cc,location.city,location.state,location.country,location.formattedAddress,location.crossStreet,location.neighborhood
0,4f2808fae4b03421b8718ae3,Monteliusvägen,"[{'id': '4bf58dd8d48988d159941735', 'name': 'T...",v-1570917586,False,Monteliusvägen,59.320863,18.062692,"[{'label': 'display', 'lat': 59.32086345840404...",672,118 24,SE,Stockholm,Storstockholm,Sverige,"[Monteliusvägen, 118 24 Stockholm, Sverige]",,
1,4d859ad302eb5481361e48f5,In My Bikilas,"[{'id': '4bf58dd8d48988d159941735', 'name': 'T...",v-1570917586,False,,59.350849,17.998626,"[{'label': 'display', 'lat': 59.35084928888889...",5012,,SE,,,Sverige,[Sverige],,
2,5a28ff2fbed48327f00df610,Folk & Friends,"[{'id': '56aa371ce4b08b9a8d57356c', 'name': 'B...",v-1570917586,False,Hornsgatan 180,59.315251,18.032266,"[{'label': 'display', 'lat': 59.31525124366064...",2463,117 34,SE,Stockholm,Storstockholm,Sverige,"[Hornsgatan 180, 117 34 Stockholm, Sverige]",,
3,4adcdaeef964a520c05a21e3,Tantolunden,"[{'id': '4bf58dd8d48988d163941735', 'name': 'P...",v-1570917587,False,,59.313769,18.037651,"[{'label': 'display', 'lat': 59.31376859108054...",2281,118 42,SE,Stockholm,Storstockholm,Sverige,"[118 42 Stockholm, Sverige]",,
4,523188c37e4862c06a6f78c1,Konradsbergsparken,"[{'id': '4bf58dd8d48988d163941735', 'name': 'P...",v-1570917587,False,,59.330191,18.017128,"[{'label': 'display', 'lat': 59.33019053604235...",3116,,SE,Stockholm,Storstockholm,Sverige,"[Stockholm, Sverige]",,
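The request URL in the cell above is built with one long format string, which is easy to get wrong when a parameter is added or reordered. Here is a sketch of the same URL assembled from a dictionary of named parameters using only the standard library. The function name and the placeholder credentials are assumptions for illustration.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

SEARCH_ENDPOINT = "https://api.foursquare.com/v2/venues/search"


def build_search_url(client_id, client_secret, version, lat, lng,
                     category_ids, radius, limit):
    """Assemble the venue-search URL from named parameters."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": "{},{}".format(lat, lng),
        "categoryId": category_ids,
        "radius": radius,
        "limit": limit,
    }
    # safe="," keeps the commas in ll and categoryId readable instead of %2C
    return SEARCH_ENDPOINT + "?" + urlencode(params, safe=",")


# Placeholder credentials and version; coordinates and IDs come from the cells above.
demo_url = build_search_url(
    "ID", "SECRET", "YYYYMMDD", 59.3251172, 18.0710935,
    "4bf58dd8d48988d163941735,4bf58dd8d48988d159941735", 20000, 100,
)
query = parse_qs(urlsplit(demo_url).query)
```

With `requests`, the same dictionary can be passed directly as `requests.get(SEARCH_ENDPOINT, params=params)`, which handles the encoding itself.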


In [220]:
# keep only columns that include venue name, and anything that is associated with location
filtered_columns = ['name', 'categories'] + [col for col in dataframe.columns if col.startswith('location.')] + ['id']
dataframe_filtered = dataframe.loc[:, filtered_columns].copy()  # copy so later assignments do not touch the original

# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

# filter the category for each row
dataframe_filtered['categories'] = dataframe_filtered.apply(get_category_type, axis=1)

# clean column names by keeping only last term
dataframe_filtered.columns = [column.split('.')[-1] for column in dataframe_filtered.columns]

dataframe_filtered.head()

Unnamed: 0,name,categories,address,lat,lng,labeledLatLngs,distance,postalCode,cc,city,state,country,formattedAddress,crossStreet,neighborhood,id
0,Monteliusvägen,Trail,Monteliusvägen,59.320863,18.062692,"[{'label': 'display', 'lat': 59.32086345840404...",672,118 24,SE,Stockholm,Storstockholm,Sverige,"[Monteliusvägen, 118 24 Stockholm, Sverige]",,,4f2808fae4b03421b8718ae3
1,In My Bikilas,Trail,,59.350849,17.998626,"[{'label': 'display', 'lat': 59.35084928888889...",5012,,SE,,,Sverige,[Sverige],,,4d859ad302eb5481361e48f5
2,Folk & Friends,Beer Bar,Hornsgatan 180,59.315251,18.032266,"[{'label': 'display', 'lat': 59.31525124366064...",2463,117 34,SE,Stockholm,Storstockholm,Sverige,"[Hornsgatan 180, 117 34 Stockholm, Sverige]",,,5a28ff2fbed48327f00df610
3,Tantolunden,Park,,59.313769,18.037651,"[{'label': 'display', 'lat': 59.31376859108054...",2281,118 42,SE,Stockholm,Storstockholm,Sverige,"[118 42 Stockholm, Sverige]",,,4adcdaeef964a520c05a21e3
4,Konradsbergsparken,Park,,59.330191,18.017128,"[{'label': 'display', 'lat': 59.33019053604235...",3116,,SE,Stockholm,Storstockholm,Sverige,"[Stockholm, Sverige]",,,523188c37e4862c06a6f78c1


In [221]:
venues_map = folium.Map(location=[latitude, longitude], zoom_start=13) # generate map centred around Stockholm

# add the dog-friendly places as circle markers (red outline, blue fill)
for lat, lng, label in zip(dataframe_filtered.lat, dataframe_filtered.lng, dataframe_filtered.categories):
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        color='red',
        popup=label,
        fill = True,
        fill_color='blue',
        fill_opacity=0.6
    ).add_to(venues_map)

# display map
venues_map

In [167]:
radius = 500
LIMIT=100

def getNearbyVenues(names, latitudes, longitudes, radius=500):
   
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        #print(name)
            
        url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&categoryId={}&radius={}&limit={}'.format(
                CLIENT_ID, 
                CLIENT_SECRET, 
                lat, 
                lng, 
                VERSION, 
                category_ids, 
                radius, 
                LIMIT)

        # make the GET request
        results = requests.get(url).json()['response']['venues']
                
        # keep only the relevant information for each nearby venue
        venues_list.append([(
            name,
            lat,
            lng,
            v['name'],
            v['location']['lat'],
            v['location']['lng'],
            # first category name, or None if the venue has no category
            v['categories'][0]['name'] if v['categories'] else None) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['District', 
                  'District Latitude', 
                  'District Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues
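The step that turns the raw response into flat rows can be separated from the network call, which makes it possible to try it on hand-written data. The sketch below assumes the response shape seen in the outputs above (a list of venue dicts with `name`, `location` and `categories`). The function name and the sample values are made up for illustration.

```python
def flatten_venues(district, lat, lng, venues):
    """Turn the raw venue dicts for one district into flat row tuples."""
    rows = []
    for v in venues:
        categories = v.get("categories") or []
        # Guard against a venue that comes back without a category
        category = categories[0]["name"] if categories else None
        rows.append((
            district, lat, lng,
            v["name"], v["location"]["lat"], v["location"]["lng"],
            category,
        ))
    return rows


# Hand-written sample shaped like the 'venues' list in the response.
sample_venues = [
    {"name": "Tantolunden",
     "location": {"lat": 59.313769, "lng": 18.037651},
     "categories": [{"id": "4bf58dd8d48988d163941735", "name": "Park"}]},
    {"name": "Unlabelled spot",
     "location": {"lat": 59.31, "lng": 18.03},
     "categories": []},
]
rows = flatten_venues("Borough,District", 59.31, 18.04, sample_venues)
```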

In [171]:
stckhlm_venues = getNearbyVenues(names=data['Borough'] + "," + data['District'],
                                   latitudes=data['Lat'],
                                   longitudes=data['Long']
                                  )

In [172]:
print(stckhlm_venues.shape)
stckhlm_venues.head()

(330, 7)


Unnamed: 0,District,District Latitude,District Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,"Älvsjö,Långsjö",59.267506,17.978904,Långsjöparken,59.262426,17.981336,Playground
1,"Älvsjö,Långbro",59.282433,17.982491,Långbro Park,59.282241,17.972748,Park
2,"Älvsjö,Långbro",59.282433,17.982491,Långbrogårdsparken,59.280328,17.991275,Park
3,"Älvsjö,Örby Slott",59.28094,18.029227,Örby Slott,59.28095,18.031669,Park
4,"Älvsjö,Örby Slott",59.28094,18.029227,Walking Molly,59.281191,18.032336,Dog Run


In [173]:
stckhlm_venues.groupby('District').count()

Unnamed: 0_level_0,District Latitude,District Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
District,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
"Bromma,Abrahamsberg",1,1,1,1,1,1
"Bromma,Alvik",6,6,6,6,6,6
"Bromma,Beckomberga",1,1,1,1,1,1
"Bromma,Blackeberg",1,1,1,1,1,1
"Bromma,Bromma Kyrka",1,1,1,1,1,1
...,...,...,...,...,...,...
"Älvsjö,Örby Slott",3,3,3,3,3,3
"Östermalm,Djurgården",9,9,9,9,9,9
"Östermalm,Hjorthagen",8,8,8,8,8,8
"Östermalm,Ladugårdsgärdet",9,9,9,9,9,9


In [176]:
stckhlm_venues['Venue Category'].unique()

array(['Playground', 'Park', 'Dog Run', 'Veterinarian', 'Trail',
       'Building', 'Pet Store', 'Pet Service', 'Cemetery', 'Forest',
       'Pet Café', 'Restaurant', 'Beach', 'Garden', 'Historic Site',
       'Other Great Outdoors', 'Bridge', 'Field', 'Spa', 'Plaza',
       'Amphitheater', 'Bathing Area', 'Harbor / Marina'], dtype=object)

In [174]:
# One-hot encode the venue categories (one indicator column per category)
stckhlm_onehot = pd.get_dummies(stckhlm_venues[['Venue Category']], prefix="", prefix_sep="")
stckhlm_onehot['District'] = stckhlm_venues['District']
# Move the District column to the front
fixed_columns = [stckhlm_onehot.columns[-1]] + list(stckhlm_onehot.columns[:-1])
stckhlm_onehot = stckhlm_onehot[fixed_columns]
stckhlm_onehot.head()

Unnamed: 0,District,Amphitheater,Bathing Area,Beach,Bridge,Building,Cemetery,Dog Run,Field,Forest,...,Park,Pet Café,Pet Service,Pet Store,Playground,Plaza,Restaurant,Spa,Trail,Veterinarian
0,"Älvsjö,Långsjö",0,0,0,0,0,0,0,0,0,...,0,0,0,0,1,0,0,0,0,0
1,"Älvsjö,Långbro",0,0,0,0,0,0,0,0,0,...,1,0,0,0,0,0,0,0,0,0
2,"Älvsjö,Långbro",0,0,0,0,0,0,0,0,0,...,1,0,0,0,0,0,0,0,0,0
3,"Älvsjö,Örby Slott",0,0,0,0,0,0,0,0,0,...,1,0,0,0,0,0,0,0,0,0
4,"Älvsjö,Örby Slott",0,0,0,0,0,0,1,0,0,...,0,0,0,0,0,0,0,0,0,0


In [194]:
stckhlm_grouped = stckhlm_onehot.groupby('District').sum().reset_index()
stckhlm_grouped.head()

Unnamed: 0,District,Amphitheater,Bathing Area,Beach,Bridge,Building,Cemetery,Dog Run,Field,Forest,...,Park,Pet Café,Pet Service,Pet Store,Playground,Plaza,Restaurant,Spa,Trail,Veterinarian
0,"Bromma,Abrahamsberg",0,0,0,0,0,0,0,0,0,...,1,0,0,0,0,0,0,0,0,0
1,"Bromma,Alvik",0,0,0,0,0,0,0,0,0,...,4,0,0,0,0,0,0,0,1,1
2,"Bromma,Beckomberga",0,0,0,0,1,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,"Bromma,Blackeberg",0,0,0,0,0,0,0,0,0,...,1,0,0,0,0,0,0,0,0,0
4,"Bromma,Bromma Kyrka",0,0,0,0,0,0,0,0,0,...,1,0,0,0,0,0,0,0,0,0
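The two steps above (one-hot encoding, then summing per district) are easier to follow on a tiny table. The sketch below uses made-up rows and also shows that `pd.crosstab` arrives at the same counts in a single call.

```python
import pandas as pd

# Toy venue list (made-up rows) with the same two columns used above.
toy = pd.DataFrame({
    "District": ["A", "A", "A", "B"],
    "Venue Category": ["Park", "Park", "Dog Run", "Trail"],
})

# One indicator column per category, one row per venue
onehot = pd.get_dummies(toy["Venue Category"], dtype=int)
onehot.insert(0, "District", toy["District"])

# Summing the indicators per district gives venue counts per category
grouped = onehot.groupby("District").sum()

# pd.crosstab builds the same district-by-category table directly
counts = pd.crosstab(toy["District"], toy["Venue Category"])
```

District A ends up with two parks and one dog run, and district B with one trail, in both tables.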


In [195]:
stckhlm_grouped.shape

(81, 24)

In [196]:
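# Row-wise total of the category columns.
# Note: Total is seeded with 'Amphitheater' and the inner loop also visits the
# 'Total' column itself, so the stored value is 2 * (venue count + Amphitheater count).
stckhlm_grouped['Total'] = stckhlm_grouped['Amphitheater']
for row in range(len(stckhlm_grouped)):
    for column in stckhlm_grouped.columns:
        if column != 'District':
            stckhlm_grouped.at[row,'Total'] = stckhlm_grouped.at[row,'Total'] + stckhlm_grouped.at[row,column]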
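A per-district venue count can also be written as a single row-wise sum, which counts each venue exactly once. This is a minimal sketch on made-up numbers, not the cell that produced the outputs in this notebook. Its values are on a different scale from the loop above (roughly half), so cut-offs chosen for one would need adjusting for the other.

```python
import pandas as pd

# Made-up counts in the same layout as stckhlm_grouped (one row per district).
toy_grouped = pd.DataFrame({
    "District": ["A", "B", "C"],
    "Dog Run": [1, 0, 0],
    "Park": [2, 1, 0],
    "Trail": [0, 1, 3],
})

# Sum across the category columns only, leaving the label column out
toy_grouped["Total"] = toy_grouped.drop(columns="District").sum(axis=1)
```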

In [197]:
stckhlm_grouped['Total'].describe()

count    81.000000
mean      8.172840
std       7.163781
min       2.000000
25%       2.000000
50%       6.000000
75%      12.000000
max      34.000000
Name: Total, dtype: float64

In [198]:
# Copy each district's total into `data`; districts without any venues keep 0
data['Total'] = 0
for row in range(len(data)):
    for row2 in range(len(stckhlm_grouped)):
        if data.at[row,'Borough'] + "," + data.at[row,'District'] == stckhlm_grouped.at[row2,'District']:
            data.at[row,'Total'] = stckhlm_grouped.at[row2,'Total']
data['Total'].head(20)

0      0
1      2
2      4
3      0
4      0
5      6
6      2
7      2
8     12
9      2
10     2
11     2
12     0
13     2
14     0
15     0
16    10
17     2
18     2
19     2
Name: Total, dtype: int64
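The nested loop above compares every district with every grouped row. The same lookup can be expressed as a key-based mapping, sketched here on made-up stand-ins for `data` and `stckhlm_grouped`. The frame names and values are hypothetical.

```python
import pandas as pd

# Made-up stand-ins for `data` and `stckhlm_grouped`.
districts = pd.DataFrame({
    "Borough": ["Bromma", "Bromma", "Älvsjö"],
    "District": ["Alvik", "Nowhere", "Långbro"],
})
totals = pd.DataFrame({
    "District": ["Bromma,Alvik", "Älvsjö,Långbro"],
    "Total": [12, 4],
})

# Build the same "Borough,District" key that the venue table uses
key = districts["Borough"] + "," + districts["District"]

# Look each key up in the totals; districts without venues get 0
lookup = totals.set_index("District")["Total"]
districts["Total"] = key.map(lookup).fillna(0).astype(int)
```

Because the lookup is indexed by the key, each district is matched in one step instead of scanning all grouped rows.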

In [208]:
data.groupby(['Total']).agg({'District':['count']})

Unnamed: 0_level_0,District
Unnamed: 0_level_1,count
Total,Unnamed: 1_level_2
0,34
2,24
4,13
6,7
8,10
10,5
12,7
14,3
16,2
18,4


In [212]:
# Colour-code each district by its total: grey = none, yellow = below 6, green = 6 or more
data['marker_color'] = 'grey'
for row in range(len(data)):
    if data.at[row,'Total'] == 0:
        data.at[row,'marker_color'] = 'grey'
    elif data.at[row,'Total'] < 6:
        data.at[row,'marker_color'] = 'yellow'
    else:
        data.at[row,'marker_color'] = 'green'

In [213]:
data['marker_color']

0        grey
1      yellow
2      yellow
3        grey
4        grey
        ...  
110     green
111      grey
112      grey
113      grey
114      grey
Name: marker_color, Length: 115, dtype: object
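The colour rule can be kept in one small function, which makes the cut-offs easy to see and change. The sketch below uses the same thresholds as the loop above; the function name is my own.

```python
def marker_color(total):
    """Map a district's total to a marker colour using the cut-offs above."""
    if total == 0:
        return "grey"    # no matching venues found
    if total < 6:
        return "yellow"  # total below 6
    return "green"       # total of 6 or more


# First five totals from the output shown earlier
colors = [marker_color(t) for t in [0, 2, 4, 0, 0]]
```

Applied to the whole table, this would read `data['marker_color'] = data['Total'].apply(marker_color)`.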

In [217]:
stckhlm_map = folium.Map(location=[latitude, longitude], zoom_start=13) # generate map centred around Stockholm

# add one circle marker per district, coloured by its dog-friendliness total
for index, row in data.iterrows():
    folium.CircleMarker([row['Lat'], row['Long']],
                    radius=10, color=row['marker_color'], fill = True,
                    fill_color=row['marker_color'], fill_opacity=0.6).add_to(stckhlm_map)

# display map
stckhlm_map