# Battle of the Neighborhoods  
Author: Eleonora Balbi

## Description of the Problem and Discussion of the Background

The Ardèche department in France is well known for its summer activities such as hiking, rock climbing and canoeing. Campgrounds are a flourishing business there, because many visitors choose them as a cheap, family-friendly way to stay in the region during the holidays. As a result, the department already has many campgrounds and more open every year. Location is therefore one of the most important decisions in determining whether a new campground succeeds or fails.  
The objective of this capstone project is to analyse and select the best locations in the Ardèche department to open a new campground. Using data science methodology and machine learning techniques such as clustering, it aims to answer the following business question: where is the best place to open a new campground in the Ardèche department?

To solve the problem, we will need the following data:  
1) A list of communities (communes) in the Ardèche department, scraped from a Wikipedia category page with the requests and BeautifulSoup4 libraries and loaded into a pandas dataframe.  
2) The latitude and longitude of each community, which are required to plot the map and to query the venue data. We obtain them with the geocoder library (ArcGIS provider); Geopy is used only to locate the department itself.  
3) Venue data, in particular campgrounds, within an 8 km radius of each community, gathered from Foursquare. We use this data to cluster the communities.   

## Exploratory Data Analysis

Overview of exploratory data analysis:  

1) Build a dataframe of communities in the Ardèche department in France by web scraping the data from a Wikipedia page  
2) Get the geographical coordinates of the communities  
3) Obtain the venue data for the communities from Foursquare API  
4) Explore and cluster the communities  
5) Select the best cluster to open a new campground  


### Import and install the libraries

In [2]:
# import the necessary Libraries 
import sys
!{sys.executable} -m pip install geocoder

import requests
import json # library to handle JSON files

# Matplotlib and associated plotting modules
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import matplotlib.colors as colors

#!conda install -c conda-forge geopy --yes
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values
import geocoder # import geocoder

import io
from bs4 import BeautifulSoup
import pandas as pd # library for data analysis
import numpy as np # library to handle data in a vectorized manner

from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe
from sklearn.cluster import KMeans # import k-means from clustering stage

!conda install -c conda-forge folium=0.5.0 --yes # install folium if it is not already available
import folium # map rendering library

print('Packages installed and libraries imported.')


### 1) Build a dataframe of communities in the Ardèche department in France by web scraping the data from a Wikipedia page  

In [4]:
# send the GET request
data = requests.get("https://en.wikipedia.org/wiki/Category:Communes_of_Ardèche").text

# parse data from the html into a beautifulsoup object
soup = BeautifulSoup(data, 'html.parser')


In [5]:
# create a list to store neighborhood data
neighborhoodList = []


# append the data into the list
for row in soup.find_all("div", class_="mw-category")[0].find_all("li"):
    neighborhoodList.append(row.text)

In [6]:
# create a new DataFrame from the list
ard_df = pd.DataFrame({"Neighborhood": neighborhoodList})
ard_df.head()


Unnamed: 0,Neighborhood
0,Communes of the Ardèche department
1,Accons
2,Ailhon
3,Aizac
4,Ajoux


In [7]:
# print the number of rows of the dataframe
ard_df.shape

(200, 1)
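Some Wikipedia titles carry a disambiguation suffix (for example "Montréal, Ardèche" and "Le Chambon, Ardèche", which appear in the cluster tables further down), and the first scraped entry is not a community at all. The sketch below shows one way such names could be tidied before geocoding. The helper name `clean_commune_names` and the "Communes of" prefix test are assumptions made for illustration; they are not part of the original notebook, whose tables keep the names as scraped.

```python
def clean_commune_names(names, suffix=", Ardèche"):
    """Drop the category entry and strip a trailing disambiguation suffix."""
    cleaned = []
    for name in names:
        name = name.strip()
        # Assumption: entries starting with "Communes of" are category links,
        # not communities, so they are skipped.
        if name.startswith("Communes of"):
            continue
        # Remove only the exact ", Ardèche" suffix; names such as
        # "Albon-d'Ardèche" are left untouched.
        if name.endswith(suffix):
            name = name[: -len(suffix)]
        cleaned.append(name)
    return cleaned


sample = ["Communes of the Ardèche department", "Accons", "Montréal, Ardèche"]
print(clean_commune_names(sample))
```

Without a step like this, the query built later as `'{}, Ardèche, France'` would contain the department name twice for the suffixed entries.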

### 2) Get the geographical coordinates of the communities  


In [8]:
# define a function to get coordinates
def get_latlng(neighborhood):
    # initialize your variable to None
    lat_lng_coords = None
    # loop until you get the coordinates
    while(lat_lng_coords is None):
        g = geocoder.arcgis('{}, Ardèche, France'.format(neighborhood))
        lat_lng_coords = g.latlng
    return lat_lng_coords
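
The loop above retries forever if a lookup never succeeds. A bounded variant might look like the sketch below. The function name `get_latlng_bounded`, its `max_attempts` and `pause_seconds` parameters, and the injected `geocode` callable are all hypothetical; in the notebook the callable could be `lambda q: geocoder.arcgis(q).latlng`.

```python
import time


def get_latlng_bounded(neighborhood, geocode, max_attempts=5, pause_seconds=1.0):
    """Return [lat, lng], or None once max_attempts lookups have failed.

    `geocode` is any callable that takes a query string and returns
    [lat, lng] or None.
    """
    query = "{}, Ardèche, France".format(neighborhood)
    for attempt in range(max_attempts):
        coords = geocode(query)
        if coords is not None:
            return coords
        # Pause before the next try, but not after the final attempt.
        if attempt < max_attempts - 1:
            time.sleep(pause_seconds)
    return None


# Stand-in geocoder that fails twice and then succeeds (illustration only).
calls = {"n": 0}


def fake_geocode(query):
    calls["n"] += 1
    return None if calls["n"] < 3 else [44.88636, 4.3868]


print(get_latlng_bounded("Accons", fake_geocode, pause_seconds=0))
```

Communities for which the function returns `None` would then need to be dropped or looked up by hand before building the dataframe.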

In [9]:
# call the function to get the coordinates, store in a new list using list comprehension
coords = [ get_latlng(neighborhood) for neighborhood in ard_df["Neighborhood"].tolist() ]

In [10]:
coords

[[44.66667000000007, 4.333330000000046],
 [44.886360000000025, 4.386800000000051],
 [44.59782000000007, 4.341910000000041],
 [44.71402000000006, 4.330110000000047],
 [44.765310000000056, 4.500430000000051],
 [44.555030000000045, 4.597460000000069],
 [44.821420000000046, 4.429120000000069],
 [44.94473000000005, 4.7296700000000556],
 [44.712500000000034, 4.630380000000059],
 [45.23978000000005, 4.798380000000066],
 [45.239810000000034, 4.666850000000068],
 [44.89945000000006, 4.323260000000062],
 [45.18665000000004, 4.736460000000022],
 [45.037150000000054, 4.653100000000052],
 [45.13936000000007, 4.806980000000067],
 [44.752717490000066, 4.425498094000034],
 [44.42106000000007, 4.173440000000028],
 [44.68587000000008, 4.059310000000039],
 [44.61975000000007, 4.388220000000047],
 [44.588550000000055, 4.633560000000045],
 [44.715640000000064, 4.760250000000042],
 [44.50806000000006, 4.372720000000072],
 [44.36794000000003, 4.155200000000036],
 [44.66825000000006, 4.168630000000064],
 [44.

In [11]:
# create temporary dataframe to populate the coordinates into Latitude and Longitude
df_coords = pd.DataFrame(coords, columns=['Latitude', 'Longitude'])

In [12]:
# merge the coordinates into the original dataframe
ard_df['Latitude'] = df_coords['Latitude']
ard_df['Longitude'] = df_coords['Longitude']

In [13]:
# check the neighborhoods and the coordinates
print(ard_df.shape)
ard_df

(200, 3)


Unnamed: 0,Neighborhood,Latitude,Longitude
0,Communes of the Ardèche department,44.666670,4.333330
1,Accons,44.886360,4.386800
2,Ailhon,44.597820,4.341910
3,Aizac,44.714020,4.330110
4,Ajoux,44.765310,4.500430
5,Alba-la-Romaine,44.555030,4.597460
6,Albon-d'Ardèche,44.821420,4.429120
7,Alboussière,44.944730,4.729670
8,Alissas,44.712500,4.630380
9,Andance,45.239780,4.798380


In [14]:
# drop the first row, which is a category entry and not a community
ard_df = ard_df.drop([0])
ard_df.shape

(199, 3)

In [15]:
ard_df

Unnamed: 0,Neighborhood,Latitude,Longitude
1,Accons,44.886360,4.386800
2,Ailhon,44.597820,4.341910
3,Aizac,44.714020,4.330110
4,Ajoux,44.765310,4.500430
5,Alba-la-Romaine,44.555030,4.597460
6,Albon-d'Ardèche,44.821420,4.429120
7,Alboussière,44.944730,4.729670
8,Alissas,44.712500,4.630380
9,Andance,45.239780,4.798380
10,Annonay,45.239810,4.666850


In [16]:
# save the DataFrame as CSV file
ard_df.to_csv("ard_df.csv", index=False)

In [17]:
### create a map

In [18]:
# get the coordinates of the Ardèche department
address = 'Ardèche, France'

geolocator = Nominatim(user_agent="my-application")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Ardèche, France are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Ardèche, France are 44.815194000000005, 4.3986524702343965.


In [19]:
# create map of Ardèche using latitude and longitude values
map_ard = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, neighborhood in zip(ard_df['Latitude'], ard_df['Longitude'], ard_df['Neighborhood']):
    label = '{}'.format(neighborhood)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_ard)  
    
map_ard

In [20]:
# save the map as HTML file
map_ard.save('map_ard.html')

### 3) Obtain the venue data for the communities from Foursquare API  

In [21]:
# define Foursquare credentials and version
# replace the placeholders with your own keys and keep them out of shared notebooks
CLIENT_ID = 'YOUR_FOURSQUARE_CLIENT_ID' # your Foursquare ID
CLIENT_SECRET = 'YOUR_FOURSQUARE_CLIENT_SECRET' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version


In [66]:
# Top 100 venues in a radius of 8km
radius = 8000
LIMIT = 100

venues = []

for lat, long, neighborhood in zip(ard_df['Latitude'], ard_df['Longitude'], ard_df['Neighborhood']):
    
    # create the API request URL
    url = "https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}".format(
        CLIENT_ID,
        CLIENT_SECRET,
        VERSION,
        lat,
        long,
        radius, 
        LIMIT)
    
    # make the GET request; skip this community if the response carries no venue groups
    # (for example when the API returns an error instead of results)
    response = requests.get(url).json().get("response", {})
    groups = response.get("groups")
    if not groups:
        continue
    results = groups[0]['items']
    
    # return only relevant information for each nearby venue
    for venue in results:
        venues.append((
            neighborhood,
            lat, 
            long, 
            venue['venue']['name'], 
            venue['venue']['location']['lat'], 
            venue['venue']['location']['lng'],  
            venue['venue']['categories'][0]['name']))

In [45]:
# convert the venues list into a new DataFrame
venues_df = pd.DataFrame(venues)

# define the column names
venues_df.columns = ['Neighborhood', 'Latitude', 'Longitude', 'VenueName', 'VenueLatitude', 'VenueLongitude', 'VenueCategory']

print(venues_df.shape)
venues_df.head()


(1723, 7)


Unnamed: 0,Neighborhood,Latitude,Longitude,VenueName,VenueLatitude,VenueLongitude,VenueCategory
0,Accons,44.88636,4.3868,Intermarché SUPER Le Cheylard et Drive,44.912917,4.442322,Supermarket
1,Accons,44.88636,4.3868,Le Grand Café,44.905767,4.422568,Café
2,Accons,44.88636,4.3868,Gamm Vert,44.915796,4.433417,Garden
3,Accons,44.88636,4.3868,Café La Palisse,44.912319,4.441642,Lounge
4,Ailhon,44.59782,4.34191,Au Bureau,44.620028,4.388062,Steakhouse
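
As a sanity check on the search radius, the distance between a community and one of its venues can be computed with the haversine formula. The sketch below uses only the standard library; the helper name `haversine_km` and the mean Earth radius of 6371 km are choices made for this illustration, not part of the notebook.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


# Accons and its first listed venue, copied from the table above.
distance = haversine_km(44.88636, 4.3868, 44.912917, 4.442322)
print(round(distance, 1))
```

The result is a little over 5 km, which is consistent with the `radius = 8000` used in the request.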


In [46]:
# How many venues by neighborhood
venues_df.groupby(["Neighborhood"]).count()

Unnamed: 0_level_0,Latitude,Longitude,VenueName,VenueLatitude,VenueLongitude,VenueCategory
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Accons,4,4,4,4,4,4
Ailhon,19,19,19,19,19,19
Aizac,4,4,4,4,4,4
Ajoux,5,5,5,5,5,5
Alba-la-Romaine,10,10,10,10,10,10
Albon-d'Ardèche,4,4,4,4,4,4
Alboussière,2,2,2,2,2,2
Alissas,7,7,7,7,7,7
Andance,18,18,18,18,18,18
Annonay,8,8,8,8,8,8


In [47]:
print('There are {} unique categories.'.format(len(venues_df['VenueCategory'].unique())))

There are 141 unique categories.


In [48]:
# print out the list of categories
venues_df['VenueCategory'].unique()[:140]

array(['Supermarket', 'Café', 'Garden', 'Lounge', 'Steakhouse',
       'Ice Cream Shop', 'French Restaurant', 'Campground', 'Castle',
       'Department Store', 'Fast Food Restaurant', 'Home Service',
       'Restaurant', 'Outdoors & Recreation', 'Food & Drink Shop',
       'Pizza Place', 'Bistro', 'Trail', 'Museum', 'Hotel',
       'Electronics Store', 'Health & Beauty Service', 'Bar', 'Pool',
       'Business Service', 'Deli / Bodega', 'Hobby Shop', 'Zoo', 'Bakery',
       'Sandwich Place', 'Train Station', 'Rest Area', 'Gastropub',
       'Grocery Store', 'Discount Store', 'Canal Lock',
       'Construction & Landscaping', 'Office', 'Pharmacy', 'Lake',
       'Boarding House', 'Residential Building (Apartment / Condo)',
       'Plaza', 'Theme Park Ride / Attraction', 'Village',
       'Molecular Gastronomy Restaurant', 'Diner', 'Boat Rental',
       'Italian Restaurant', 'Forest', 'Farmers Market',
       'Rock Climbing Spot', 'IT Services', 'Shopping Mall',
       'Miscellaneous Sh

#### Analyze each Neighborhood

In [49]:
# one hot encoding
ard_onehot = pd.get_dummies(venues_df[['VenueCategory']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
ard_onehot['Neighborhoods'] = venues_df['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [ard_onehot.columns[-1]] + list(ard_onehot.columns[:-1])
ard_onehot = ard_onehot[fixed_columns]

print(ard_onehot.shape)
ard_onehot.head()

(1723, 142)


Unnamed: 0,Neighborhoods,ATM,Asian Restaurant,Athletics & Sports,Auto Dealership,Auto Garage,Auto Workshop,Automotive Shop,BBQ Joint,Bagel Shop,...,Travel & Transport,Travel Agency,Vacation Rental,Video Store,Village,Vineyard,Waterfall,Wine Shop,Yoga Studio,Zoo
0,Accons,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Accons,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Accons,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Accons,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Ailhon,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [50]:
ard_grouped = ard_onehot.groupby(["Neighborhoods"]).mean().reset_index()

print(ard_grouped.shape)
ard_grouped

(199, 142)


Unnamed: 0,Neighborhoods,ATM,Asian Restaurant,Athletics & Sports,Auto Dealership,Auto Garage,Auto Workshop,Automotive Shop,BBQ Joint,Bagel Shop,...,Travel & Transport,Travel Agency,Vacation Rental,Video Store,Village,Vineyard,Waterfall,Wine Shop,Yoga Studio,Zoo
0,Accons,0.0,0.000000,0.00,0.000000,0.0,0.0,0.0,0.000000,0.0,...,0.0,0.0,0.0,0.0,0.000000,0.000000,0.0,0.000,0.000,0.000000
1,Ailhon,0.0,0.000000,0.00,0.000000,0.0,0.0,0.0,0.000000,0.0,...,0.0,0.0,0.0,0.0,0.000000,0.000000,0.0,0.000,0.000,0.000000
2,Aizac,0.0,0.000000,0.00,0.000000,0.0,0.0,0.0,0.000000,0.0,...,0.0,0.0,0.0,0.0,0.000000,0.000000,0.0,0.000,0.000,0.000000
3,Ajoux,0.0,0.000000,0.00,0.000000,0.0,0.0,0.0,0.000000,0.0,...,0.0,0.0,0.0,0.0,0.000000,0.000000,0.0,0.000,0.000,0.000000
4,Alba-la-Romaine,0.0,0.000000,0.00,0.000000,0.0,0.0,0.0,0.000000,0.0,...,0.0,0.0,0.0,0.0,0.000000,0.000000,0.0,0.000,0.000,0.000000
5,Albon-d'Ardèche,0.0,0.000000,0.00,0.000000,0.0,0.0,0.0,0.000000,0.0,...,0.0,0.0,0.0,0.0,0.000000,0.000000,0.0,0.000,0.000,0.000000
6,Alboussière,0.0,0.000000,0.00,0.000000,0.0,0.0,0.0,0.000000,0.0,...,0.0,0.0,0.0,0.0,0.000000,0.000000,0.0,0.000,0.000,0.000000
7,Alissas,0.0,0.000000,0.00,0.000000,0.0,0.0,0.0,0.000000,0.0,...,0.0,0.0,0.0,0.0,0.000000,0.000000,0.0,0.000,0.000,0.000000
8,Andance,0.0,0.000000,0.00,0.000000,0.0,0.0,0.0,0.000000,0.0,...,0.0,0.0,0.0,0.0,0.000000,0.000000,0.0,0.000,0.000,0.055556
9,Annonay,0.0,0.000000,0.00,0.000000,0.0,0.0,0.0,0.000000,0.0,...,0.0,0.0,0.0,0.0,0.000000,0.000000,0.0,0.000,0.000,0.125000
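
The mean of the one-hot columns per community is simply the share of that community's venues that fall into each category. The sketch below redoes the arithmetic for Accons with plain Python containers, as an illustration of what `groupby(...).mean()` computes rather than a replacement for it. The four rows are the Accons venues from the venue table.

```python
from collections import Counter, defaultdict

# (community, venue category) pairs for the Accons rows of the venue table.
rows = [
    ("Accons", "Supermarket"),
    ("Accons", "Café"),
    ("Accons", "Garden"),
    ("Accons", "Lounge"),
]

# Count how often each category occurs per community.
counts = defaultdict(Counter)
for community, category in rows:
    counts[community][category] += 1

# Share of each category among a community's venues.
frequencies = {
    community: {cat: n / sum(counter.values()) for cat, n in counter.items()}
    for community, counter in counts.items()
}
print(frequencies["Accons"])
```

Each of the four categories gets a share of 0.25, so the values in a row always sum to 1.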


In [51]:
# print each neighborhood with 3 most common venues
num_top_venues = 3

for hood in ard_grouped['Neighborhoods']:
    print("----"+hood+"----")
    temp = ard_grouped[ard_grouped['Neighborhoods'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Accons----
         venue  freq
0       Garden  0.25
1  Supermarket  0.25
2       Lounge  0.25


----Ailhon----
            venue  freq
0     Supermarket  0.21
1  Ice Cream Shop  0.11
2      Restaurant  0.11


----Aizac----
               venue  freq
0     Ice Cream Shop  0.25
1  Food & Drink Shop  0.25
2        Supermarket  0.25


----Ajoux----
         venue  freq
0       Bistro   0.2
1  Pizza Place   0.2
2       Museum   0.2


----Alba-la-Romaine----
                     venue  freq
0                    Hotel   0.2
1  Health & Beauty Service   0.1
2                     Pool   0.1


----Albon-d'Ardèche----
              venue  freq
0            Bistro  0.25
1  Business Service  0.25
2            Museum  0.25


----Alboussière----
           venue  freq
0          Hotel   0.5
1  Deli / Bodega   0.5
2            ATM   0.0


----Alissas----
               venue  freq
0             Garden  0.14
1  French Restaurant  0.14
2        Supermarket  0.14


----Andance----
           venue  

In [52]:
len(ard_grouped[ard_grouped["Campground"] > 0])

53

In [53]:
# Create new df with data for Campground venue only
ard_mall = ard_grouped[["Neighborhoods","Campground"]]

In [54]:
ard_mall.head()

Unnamed: 0,Neighborhoods,Campground
0,Accons,0.0
1,Ailhon,0.105263
2,Aizac,0.0
3,Ajoux,0.0
4,Alba-la-Romaine,0.1


### 4) Explore and cluster the communities  

In [67]:
# set number of clusters
kclusters = 3

ard_clustering = ard_mall.drop(["Neighborhoods"], axis=1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(ard_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]

array([2, 0, 2, 2, 0, 2, 2, 2, 2, 2], dtype=int32)
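
Because the clustering uses a single feature (the campground share), k-means reduces here to splitting a list of numbers into intervals. The sketch below is a simplified, pure-Python Lloyd iteration meant to illustrate that idea. It is not scikit-learn's implementation, which by default uses k-means++ initialisation, so label numbers and boundaries can differ. The function name `kmeans_1d`, the quantile-style initialisation, and the sample values are all assumptions for illustration, not the notebook's data.

```python
def kmeans_1d(values, k, iterations=100):
    """Simplified Lloyd iteration on scalars (k >= 2); returns (labels, centroids)."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centroids across the sorted data.
    centroids = [ordered[(len(ordered) - 1) * i // (k - 1)] for i in range(k)]
    labels = []
    for _ in range(iterations):
        # Assignment step: each value goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid becomes the mean of its members.
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


# Made-up campground shares, chosen only to show three well-separated groups.
shares = [0.0, 0.0, 0.1, 0.12, 0.35, 0.5, 0.0, 0.15, 0.4]
labels, centroids = kmeans_1d(shares, k=3)
print(labels)
print([round(c, 3) for c in centroids])
```

With this initialisation the labels come out ordered from the lowest to the highest centroid. scikit-learn gives no such guarantee, which is why the notebook's cluster 2 is the low-share group.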

In [68]:
# create a new dataframe that holds the campground frequency and the cluster label for each neighborhood
ard_merged = ard_mall.copy()

# add clustering labels
ard_merged["Cluster Labels"] = kmeans.labels_

In [69]:
ard_merged.rename(columns={"Neighborhoods": "Neighborhood"}, inplace=True)
ard_merged.head()

Unnamed: 0,Neighborhood,Campground,Cluster Labels
0,Accons,0.0,2
1,Ailhon,0.105263,0
2,Aizac,0.0,2
3,Ajoux,0.0,2
4,Alba-la-Romaine,0.1,0


In [70]:
# join ard_merged with ard_df to add latitude/longitude for each neighborhood
ard_merged = ard_merged.join(ard_df.set_index("Neighborhood"), on="Neighborhood")

print(ard_merged.shape)
ard_merged.head() # check the last columns!


(199, 5)


Unnamed: 0,Neighborhood,Campground,Cluster Labels,Latitude,Longitude
0,Accons,0.0,2,44.88636,4.3868
1,Ailhon,0.105263,0,44.59782,4.34191
2,Aizac,0.0,2,44.71402,4.33011
3,Ajoux,0.0,2,44.76531,4.50043
4,Alba-la-Romaine,0.1,0,44.55503,4.59746


In [71]:
# sort the results by Cluster Labels
print(ard_merged.shape)
ard_merged.sort_values(["Cluster Labels"], inplace=True)
ard_merged

(199, 5)


Unnamed: 0,Neighborhood,Campground,Cluster Labels,Latitude,Longitude
172,Pradons,0.210526,0,44.474430,4.358780
170,Pourchères,0.166667,0,44.746450,4.506680
163,Pailharès,0.166667,0,45.078580,4.565570
66,Cruas,0.076923,0,44.657850,4.762810
43,Chambonas,0.153846,0,44.417540,4.129410
90,Grospierres,0.083333,0,44.401350,4.288910
124,Lavilledieu,0.142857,0,44.577700,4.450710
161,Orgnac-l'Aven,0.117647,0,44.305840,4.432880
27,Berrias-et-Casteljau,0.117647,0,44.374300,4.201530
25,Beauvène,0.200000,0,44.877130,4.509690


In [73]:
# create map to visualize resulting clusters
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i+x+(i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(ard_merged['Latitude'], ard_merged['Longitude'], ard_merged['Neighborhood'], ard_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' - Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

In [61]:
# save the map as HTML file
map_clusters.save('map_clusters.html')

### 5) Select the best cluster to open a new campground  

In [74]:
# Cluster 0
ard_merged.loc[ard_merged['Cluster Labels'] == 0]

Unnamed: 0,Neighborhood,Campground,Cluster Labels,Latitude,Longitude
172,Pradons,0.210526,0,44.47443,4.35878
170,Pourchères,0.166667,0,44.74645,4.50668
163,Pailharès,0.166667,0,45.07858,4.56557
66,Cruas,0.076923,0,44.65785,4.76281
43,Chambonas,0.153846,0,44.41754,4.12941
90,Grospierres,0.083333,0,44.40135,4.28891
124,Lavilledieu,0.142857,0,44.5777,4.45071
161,Orgnac-l'Aven,0.117647,0,44.30584,4.43288
27,Berrias-et-Casteljau,0.117647,0,44.3743,4.20153
25,Beauvène,0.2,0,44.87713,4.50969


In [63]:
# Cluster 1
ard_merged.loc[ard_merged['Cluster Labels'] == 1]

Unnamed: 0,Neighborhood,Campground,Cluster Labels,Latitude,Longitude
50,Chassiers,0.25,1,44.55137,4.29665
46,Chandolas,0.3,1,44.40339,4.25351
72,Dunière-sur-Eyrieux,0.25,1,44.82256,4.65837
51,Chauzon,0.307692,1,44.48442,4.35898
157,"Montréal, Ardèche",0.25,1,44.52778,4.29242
60,Colombier-le-Vieux,0.5,1,45.06564,4.69732
150,Mazan-l'Abbaye,0.25,1,44.72708,4.08841
67,Darbres,0.285714,1,44.64674,4.50519
105,Lablachère,0.272727,1,44.46545,4.21608
28,Berzème,0.4,1,44.652,4.56506


In [64]:
# Cluster 2
ard_merged.loc[ard_merged['Cluster Labels'] == 2]

Unnamed: 0,Neighborhood,Campground,Cluster Labels,Latitude,Longitude
127,"Le Chambon, Ardèche",0.000000,2,44.836650,4.305750
144,Malbosc,0.000000,2,44.346190,4.073160
143,Malarce-sur-la-Thines,0.000000,2,44.446090,4.073310
142,Lyas,0.000000,2,44.757640,4.597440
114,Lalevade-d'Ardèche,0.000000,2,44.650150,4.322520
117,Lanarce,0.000000,2,44.727510,4.003980
118,Lanas,0.000000,2,44.531000,4.399490
140,"Loubaresse, Ardèche",0.000000,2,44.600150,4.049520
139,Limony,0.000000,2,45.350620,4.756970
138,Lespéron,0.000000,2,44.730610,3.897810


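Cluster 0: Communities where campgrounds make up a moderate share of nearby venues (about 0.08 to 0.21 in the rows shown)  
Cluster 1: Communities where campgrounds make up a high share of nearby venues (0.25 to 0.5 in the rows shown)  
Cluster 2: Communities where campgrounds make up a low share of nearby venues (0.0 in the rows shown)  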
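
Since k-means label numbers are arbitrary, the low, moderate and high names can be derived from the data instead of being read off by eye. The sketch below ranks the clusters by their mean campground share, using three rows per cluster copied from the tables above. It is a minimal illustration; in the notebook the same idea would presumably be a `groupby` on `Cluster Labels`.

```python
from collections import defaultdict

# (cluster label, campground share) pairs copied from the three cluster tables.
rows = [
    (0, 0.210526), (0, 0.166667), (0, 0.076923),
    (1, 0.25), (1, 0.3), (1, 0.5),
    (2, 0.0), (2, 0.0), (2, 0.0),
]

# Collect the shares belonging to each cluster label.
shares = defaultdict(list)
for label, share in rows:
    shares[label].append(share)

# Rank clusters by mean share, lowest first, and name them accordingly.
means = {label: sum(values) / len(values) for label, values in shares.items()}
ranked = sorted(means, key=means.get)
names = dict(zip(ranked, ["low", "moderate", "high"]))
print(names)
```

On these rows, cluster 2 is named low, cluster 0 moderate and cluster 1 high, matching the descriptions above.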
