# Data Section

## Data Sources
To address the business problem discussed, the required data sets will be generated using the following sources:
1. Toronto neighborhoods: https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M
2. Demographics of Toronto neighborhoods: https://en.wikipedia.org/wiki/Demographics_of_Toronto_neighbourhoods
3. Geocoder/Google geolocation API
4. Foursquare API


## Data Sources at a glance:
#### 1. Toronto neighborhoods:

This Wikipedia page lists the Canadian postal codes that begin with the letter M, all of which correspond to boroughs and neighborhoods within Toronto. The table is extracted with Beautiful Soup and forms one of the data sets used to solve the problem at hand.


#### Example: First few rows of the extracted dataframe:

| Postcode | Borough | Neighbourhood |
|----------|---------|---------------|
| M1A | Not assigned | Not assigned |
| M2A | Not assigned | Not assigned |
| M3A | North York | Parkwoods |
| M4A | North York | Victoria Village |
| M5A | Downtown Toronto | Harbourfront |
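
The extraction step can be illustrated with a minimal sketch. The project itself uses Beautiful Soup; the version below swaps in the standard library's `html.parser` so that it runs without extra packages. The class name `TableRowParser` and the `SAMPLE_HTML` markup are hypothetical stand-ins, and the real Wikipedia page may be structured differently.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # completed rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Hypothetical markup imitating the layout of the postal code table.
SAMPLE_HTML = """
<table>
  <tr><th>Postcode</th><th>Borough</th><th>Neighbourhood</th></tr>
  <tr><td>M1A</td><td>Not assigned</td><td>Not assigned</td></tr>
  <tr><td>M3A</td><td>North York</td><td><a href="#">Parkwoods</a></td></tr>
</table>
"""

parser = TableRowParser()
parser.feed(SAMPLE_HTML)

# The first row holds the column names; the rest are records.
header, *records = parser.rows
postcodes = [dict(zip(header, record)) for record in records]
print(postcodes)
```

Each record is a plain dictionary keyed by column name, which can be passed to `pandas.DataFrame` if a dataframe is wanted.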


#### 2. Demographics of Toronto neighborhoods:

Beyond the postal code data, demographic information is crucial to the given problem because the business venture primarily targets one ethnic group. This Wikipedia page lists demographic information for Toronto neighborhoods. Beautiful Soup will be used to extract the data and generate the required data set.

#### Example: First few rows of the extracted dataframe:



Output of `Demographics.head()`:

| | Name | FM | Census Tracts | Population | Land area (km2) | Density (people/km2) | % Change in Population since 2001 | Average Income | Transit Commuting % | % Renters | Second most common language (after English) by name | Second most common language (after English) by percentage | Map |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Toronto CMA Average | | All | 5113149 | 5903.63 | 866 | 9.0 | 40704 | 10.6 | 11.4 | | | |
| 1 | Agincourt | S | 0377.01, 0377.02, 0377.03, 0377.04, 0378.02, 0... | 44577 | 12.45 | 3580 | 4.6 | 25750 | 11.1 | 5.9 | Cantonese (19.3%) | 19.3% Cantonese | |
| 2 | Alderwood | E | 0211.00, 0212.00 | 11656 | 4.94 | 2360 | -4.0 | 35239 | 8.8 | 8.5 | Polish (6.2%) | 06.2% Polish | |
| 3 | Alexandra Park | OCoT | 0039.00 | 4355 | 0.32 | 13609 | 0.0 | 19687 | 13.8 | 28.0 | Cantonese (17.9%) | 17.9% Cantonese | |
| 4 | Allenby | OCoT | 0140.00 | 2513 | 0.58 | 4333 | -1.0 | 245592 | 5.2 | 3.4 | Russian (1.4%) | 01.4% Russian | |
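
Because the venture targets an ethnic group, the language column is likely to need splitting into a name and a numeric share before it can be used. The sketch below is one possible way to do that for values shaped like `Cantonese (19.3%)`. The function name `split_language` and the pattern are illustrative assumptions, not part of the original notebook.

```python
import re

# Matches strings such as "Cantonese (19.3%)": a name, then a percentage in parentheses.
LANGUAGE_PATTERN = re.compile(
    r"^\s*(?P<name>.+?)\s*\((?P<pct>\d+(?:\.\d+)?)%\)\s*$"
)


def split_language(value):
    """Return (language, percentage), or (None, None) if the value is missing or malformed."""
    if not isinstance(value, str):
        # Missing cells typically arrive as NaN (a float) or None.
        return None, None
    match = LANGUAGE_PATTERN.match(value)
    if match is None:
        return None, None
    return match.group("name"), float(match.group("pct"))


# Values taken from the rows shown above, plus a missing cell.
for raw in ["Cantonese (19.3%)", "Polish (6.2%)", None]:
    print(raw, "->", split_language(raw))
```

Returning a pair of `None` values for missing cells keeps rows such as the Toronto CMA average, which has no language entry, from raising an error.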


#### 3. Geocoder/Google geolocation API:

Address geocoding is the process of finding the latitude and longitude associated with a given address. The coordinates of each neighborhood will be obtained using either geocoder or the Google geolocation API.

#### Example: Coordinates returned for Downtown Toronto
The geographical coordinates of Downtown Toronto are 43.6541737, -79.3808116451341.
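
Geocoding services can occasionally return no result for a query, so a lookup is often wrapped in a small retry loop. The sketch below shows that idea with the actual service call injected as a function. `geocode_with_retry` and `make_flaky_lookup` are hypothetical helpers, and the stand-in lookup replaces a real geocoder or Google API call so the example runs offline.

```python
def geocode_with_retry(address, lookup, max_attempts=3):
    """Call lookup(address) until it returns coordinates or the attempts run out.

    `lookup` is any callable returning a (latitude, longitude) tuple, or None
    when the service has no answer.
    """
    for _ in range(max_attempts):
        coords = lookup(address)
        if coords is not None:
            return coords
    return None


def make_flaky_lookup(answer, failures=1):
    """Build a stand-in lookup that returns None `failures` times, then `answer`."""
    state = {"left": failures}

    def lookup(address):
        if state["left"] > 0:
            state["left"] -= 1
            return None
        return answer

    return lookup


# Coordinates quoted above for Downtown Toronto.
lookup = make_flaky_lookup((43.6541737, -79.3808116451341))
print(geocode_with_retry("Downtown Toronto", lookup))
```

In real use, `lookup` would wrap whichever geocoding client the project settles on; the retry logic stays the same.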


#### 4. Foursquare API:

Foursquare is one of the most popular location-based social networks (LBSNs). It gives personalized recommendations of places near a user's current location, based on the user's previous browsing, purchase, and check-in history, and it provides geotagged information that lets users explore the world around them. The Foursquare API lets application developers interact with the Foursquare platform and exposes diverse information about venues, users, photos, and check-ins. It supports real-time access to places, Snap-to-Place (which assigns users to specific locations), and geotagging. API calls will be made to obtain the required information about venues of interest in each neighborhood. This information is crucial to meeting the objective of this project.

#### Example: Get the top 100 venues that are in Rosedale within a radius of 500 meters.

##### GET request url

    # CLIENT_ID, CLIENT_SECRET, VERSION, the coordinates, radius and LIMIT are defined earlier
    url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
        CLIENT_ID,
        CLIENT_SECRET,
        VERSION,
        neighborhood_latitude,
        neighborhood_longitude,
        radius,
        LIMIT)

##### send the GET request and examine the results

    import requests

    results = requests.get(url).json()
    results['response']['groups'][0]['items']




##### items returned by the request

    [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '4aff2d47f964a520743522e3',
       'name': 'Rosedale Park',
       'location': {'address': '38 Scholfield Ave.',
        'crossStreet': 'at Edgar Ave.',
        'lat': 43.68232820227814,
        'lng': -79.37893434347683,
        'labeledLatLngs': [{'label': 'display',
          'lat': 43.68232820227814,
          'lng': -79.37893434347683}],
        'distance': 327,
        'cc': 'CA',
        'city': 'Toronto',
        'state': 'ON',
        'country': 'Canada',
        'formattedAddress': ['38 Scholfield Ave. (at Edgar Ave.)',
         'Toronto ON',
         'Canada']},
       'categories': [{'id': '4bf58dd8d48988d1e7941735',
         'name': 'Playground',
         'pluralName': 'Playgrounds',
         'shortName': 'Playground',
         'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/parks_outdoors/playground_',
          'suffix': '.png'},
         'primary': True}],
       'photos': {'count': 0, 'groups': []}},
      'referralId': 'e-0-4aff2d47f964a520743522e3-0'},
     {'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '4bd777aa5cf276b054639b00',
       'name': 'Whitney Park',
       'location': {'lat': 43.68203573063681,
        'lng': -79.37378835021306,
        'labeledLatLngs': [{'label': 'display',
          'lat': 43.68203573063681,
          'lng': -79.37378835021306}],
        'distance': 408,
        'cc': 'CA',
        'country': 'Canada',
        'formattedAddress': ['Canada']},
       'categories': [{'id': '4bf58dd8d48988d163941735',
         'name': 'Park',
         'pluralName': 'Parks',
         'shortName': 'Park',
         'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/parks_outdoors/park_',
          'suffix': '.png'},
         'primary': True}],
       'photos': {'count': 0, 'groups': []}},
      'referralId': 'e-0-4bd777aa5cf276b054639b00-1'},
     {'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '4d0e77df76cc37045715767c',
       'name': 'Alex Murray Parkette',
       'location': {'address': '107 Crescent Road',
        'crossStreet': 'South Drive',
        'lat': 43.678300240478954,
        'lng': -79.38277328698108,
        'labeledLatLngs': [{'label': 'display',
          'lat': 43.678300240478954,
          'lng': -79.38277328698108}],
        'distance': 444,
        'cc': 'CA',
        'city': 'Toronto',
        'state': 'ON',
        'country': 'Canada',
        'formattedAddress': ['107 Crescent Road (South Drive)',
         'Toronto ON',
         'Canada']},
       'categories': [{'id': '4bf58dd8d48988d163941735',
         'name': 'Park',
         'pluralName': 'Parks',
         'shortName': 'Park',
         'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/parks_outdoors/park_',
          'suffix': '.png'},
         'primary': True}],
       'photos': {'count': 0, 'groups': []}},
      'referralId': 'e-0-4d0e77df76cc37045715767c-2'},
     {'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '4ef8f2a3775b54cdb5bdec7c',
       'name': "Milkman's Lane",
       'location': {'address': 'South Dr',
        'crossStreet': 'at Glen Rd',
        'lat': 43.676352068015554,
        'lng': -79.37384239440172,
        'labeledLatLngs': [{'label': 'display',
          'lat': 43.676352068015554,
          'lng': -79.37384239440172}],
        'distance': 464,
        'cc': 'CA',
        'city': 'Toronto',
        'state': 'ON',
        'country': 'Canada',
        'formattedAddress': ['South Dr (at Glen Rd)', 'Toronto ON', 'Canada']},
       'categories': [{'id': '4bf58dd8d48988d159941735',
         'name': 'Trail',
         'pluralName': 'Trails',
         'shortName': 'Trail',
         'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/parks_outdoors/hikingtrail_',
          'suffix': '.png'},
         'primary': True}],
       'photos': {'count': 0, 'groups': []}},
      'referralId': 'e-0-4ef8f2a3775b54cdb5bdec7c-3'}]
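
The response above is deeply nested, while the analysis mostly needs a venue's name, category, and position. The following sketch shows one way to flatten the items into simple records. The helper name `flatten_venues` is hypothetical, `sample_items` is a trimmed copy of two of the venues shown above, and the choice of fields to keep is an assumption.

```python
def flatten_venues(items):
    """Reduce Foursquare 'explore' items to flat dictionaries, one per venue."""
    rows = []
    for item in items:
        venue = item["venue"]
        location = venue["location"]
        categories = venue.get("categories") or []
        # Prefer the category flagged as primary; fall back to the first one.
        primary = next((c for c in categories if c.get("primary")), None)
        if primary is None and categories:
            primary = categories[0]
        rows.append({
            "name": venue["name"],
            "category": primary["name"] if primary else None,
            "lat": location["lat"],
            "lng": location["lng"],
            "distance": location.get("distance"),
        })
    return rows


# Trimmed copy of two venues from the response shown above.
sample_items = [
    {"venue": {"name": "Rosedale Park",
               "location": {"lat": 43.68232820227814,
                            "lng": -79.37893434347683,
                            "distance": 327},
               "categories": [{"name": "Playground", "primary": True}]}},
    {"venue": {"name": "Whitney Park",
               "location": {"lat": 43.68203573063681,
                            "lng": -79.37378835021306,
                            "distance": 408},
               "categories": [{"name": "Park", "primary": True}]}},
]

for row in flatten_venues(sample_items):
    print(row)
```

The resulting list of dictionaries can be turned into a dataframe with one row per venue, which makes it straightforward to count venue categories per neighborhood.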