# Capstone Project - The Battle of Neighbourhood

This notebook contains the capstone project for the Applied Data Science Capstone course, part of the IBM Data Science Professional Certificate on Coursera.

## Business Problem -

In this project we try to measure **how similar two cities are** based on their venue categories, in other words each **city's basic infrastructure**: parks, hotels, cafés and so on. The two cities I have chosen for this project are New Delhi and Mumbai.

Since both cities contain a very large number of venues, we restrict the analysis to at most 100 venues per city, all within a 15 km radius of the city centre.

Specifically, this report should help a person or family who is **moving from one city to the other**. It helps them judge how similar the city they are moving to is to the city they currently live in.


## A Little Brief on Cities - 

<h2 align = "center">New Delhi</h2>

New Delhi is the capital of India and an administrative district of NCT Delhi. New Delhi is also the seat of all three branches of the Government of India, that is Executive (Rashtrapati Bhavan), Legislature (Parliament House) and Judiciary (Supreme Court of India).

The foundation stone of New Delhi was laid by Emperor George V during the Delhi Durbar of 1911. It was designed by British architects Sir Edwin Lutyens and Sir Herbert Baker. The new capital was inaugurated on 13 February 1931, by Viceroy and Governor-General of India Lord Irwin.

Although colloquially Delhi and New Delhi are used interchangeably to refer to the National Capital Territory of Delhi (NCT), these are two distinct entities, with New Delhi forming a small part of the city of Delhi. The National Capital Region is a much larger entity comprising the entire NCT along with adjoining districts in neighbouring states.

<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/4/49/Swaminarayan_Akshardham.jpg/1280px-Swaminarayan_Akshardham.jpg" width="200" align="right">
<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/8/87/All_India_War_Memorial_%28INDIA_GATE%29.jpg/1280px-All_India_War_Memorial_%28INDIA_GATE%29.jpg" width="200" align="left">
<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/e/ec/Lovely_Lotus_temple.jpg/1280px-Lovely_Lotus_temple.jpg" width="200" align="center">


<h2 align="center">Mumbai</h2>

Mumbai is the capital city of the Indian state of Maharashtra. According to the United Nations, as of 2018, Mumbai is the most populous city in the country and the seventh-most populous city in the world with a population of roughly 20 million. The city is also home to the Bollywood and Marathi cinema industries.

Mumbai lies on the Konkan coast on the west coast of India and has a deep natural harbour. In 2008, Mumbai was named an alpha world city. It has the highest number of millionaires and billionaires among all cities in India. Mumbai is home to three UNESCO World Heritage Sites: the Elephanta Caves, Chhatrapati Shivaji Maharaj Terminus, and the city's distinctive ensemble of Victorian and Art Deco buildings.

Mumbai is the financial, commercial, and entertainment capital of India. It is also one of the world's top ten centres of commerce in terms of global financial flow, generating 6.16% of India's GDP, and accounting for 25% of industrial output, 70% of maritime trade in India (Mumbai Port Trust and JNPT), and 70% of capital transactions to India's economy. Mumbai has the eighth-highest number of billionaires of any city in the world, and Mumbai's billionaires had the highest average wealth of any city in the world in 2008.

<img src="https://upload.wikimedia.org/wikipedia/commons/2/2a/Chhatrapati_Shivaji_Maharaj_Terminal.jpg" width="200" align="right">
<img src="https://upload.wikimedia.org/wikipedia/commons/d/d3/Gateway_of_India_-Mumbai.jpg" width="200" align="left">
<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/1/14/Mumbai_Skyline_at_Night.jpg/1920px-Mumbai_Skyline_at_Night.jpg" width="200" align="center">

## Data -

Based on the definition of our business problem, the factors that will influence our analysis are:
* Number of Venues
* Number of Unique Venue Categories
* Frequency of Each Unique Venue Category
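
As a rough sketch of how these three factors could be read off a venue table, assume a pandas DataFrame with a `Venue Category` column like the ones built later in this notebook. The helper name `summarize_venues` and the sample rows are made up for illustration and are not real API output.

```python
import pandas as pd

def summarize_venues(venues):
    """Return the venue count, the number of unique categories and per-category counts."""
    category_counts = venues['Venue Category'].value_counts()
    return {
        'n_venues': len(venues),
        'n_categories': venues['Venue Category'].nunique(),
        'category_counts': category_counts,
    }

# Illustrative rows only (hypothetical venues, not Foursquare data).
sample = pd.DataFrame({
    'Venue': ['A', 'B', 'C', 'D'],
    'Venue Category': ['Hotel', 'Park', 'Hotel', 'Café'],
})
summary = summarize_venues(sample)
print(summary['n_venues'], summary['n_categories'])
print(summary['category_counts'])
```

Applying the same helper to each city's table would give directly comparable numbers for the two cities.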

The following data sources are needed to extract or generate the required information:
* The venues in each city, with their category and location, will be obtained using the **Foursquare API**.
* The coordinates of each city centre will be obtained using the **geopy library**.

Let's import some libraries we will be using for our analysis.

In [1]:
import pandas as pd
import numpy as np
from geopy.geocoders import Nominatim

Let's first get the coordinates of both cities using the geopy library.

### New Delhi's Coordinates

In [2]:
address_del = 'New Delhi, India'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address_del)
latitude_del = location.latitude
longitude_del = location.longitude
print('The geographical coordinates of New Delhi are {}, {}.'.format(latitude_del, longitude_del))

The geographical coordinates of New Delhi are 28.6138954, 77.2090057.


### Mumbai's Coordinates

In [3]:
address_mum = 'Mumbai, Maharashtra'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address_mum)
latitude_mum = location.latitude
longitude_mum = location.longitude
print('The geographical coordinates of Mumbai are {}, {}.'.format(latitude_mum, longitude_mum))

The geographical coordinates of Mumbai are 18.9387711, 72.8353355.
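
The two lookups above repeat the same steps, and a geopy geocoder returns `None` when it finds no match, which would make `location.latitude` fail. A minimal sketch of a shared helper follows. The names `city_coordinates` and `fake_geocode` are hypothetical, and the stand-in geocoder only replays the coordinates printed above so the example runs without network access.

```python
from collections import namedtuple

def city_coordinates(address, geocode):
    """Return (latitude, longitude) for an address using any geocode callable."""
    location = geocode(address)
    if location is None:
        # No match found: fail with a clear message instead of an AttributeError.
        raise ValueError('No geocoding result for {!r}'.format(address))
    return location.latitude, location.longitude

# Stand-in for a real geocoder (assumption: it only knows the two results printed above).
FakeLocation = namedtuple('FakeLocation', ['latitude', 'longitude'])
_KNOWN = {
    'New Delhi, India': FakeLocation(28.6138954, 77.2090057),
    'Mumbai, Maharashtra': FakeLocation(18.9387711, 72.8353355),
}

def fake_geocode(address):
    return _KNOWN.get(address)

latitude_demo, longitude_demo = city_coordinates('New Delhi, India', fake_geocode)
print(latitude_demo, longitude_demo)
```

In the notebook itself, the callable passed in would be `geolocator.geocode` from the `Nominatim` instance created above.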


Now we will get the rest of our data using the Foursquare API.

## Foursquare

First let's load our credentials for the Foursquare API.

In [4]:
# The code was removed by Watson Studio for sharing.

Your credentials uploaded


We again import some libraries to help us with the process.

In [6]:
import requests  # library to handle HTTP requests
from pandas import json_normalize  # transform JSON into a pandas DataFrame (top-level import needs pandas >= 1.0)

Now let's define a function that returns a DataFrame of the popular venues near a given city, with each venue's latitude, longitude and category.

In [7]:
def getNearbyVenues(lat, long, radius, Limit, city):
    """Return a DataFrame of popular venues near (lat, long) for the given city."""
    # CLIENT_ID, CLIENT_SECRET and VERSION are expected to come from the hidden credentials cell.
    url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
        CLIENT_ID,
        CLIENT_SECRET,
        VERSION,
        lat,
        long,
        radius,
        Limit)

    # Each item in the first group of the response describes one recommended venue.
    results = requests.get(url).json()['response']['groups'][0]['items']

    # One row per venue: city, name, coordinates and first listed category.
    rows = [(city,
             v['venue']['name'],
             v['venue']['location']['lat'],
             v['venue']['location']['lng'],
             v['venue']['categories'][0]['name']) for v in results]

    nearby_venues = pd.DataFrame(rows, columns=['City',
                                                'Venue',
                                                'Venue Latitude',
                                                'Venue Longitude',
                                                'Venue Category'])

    return nearby_venues
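
The function above indexes `categories[0]` directly, which would raise an `IndexError` for any venue whose category list is empty. Whether the API actually returns such venues is an assumption here. The sketch below separates the parsing step into a hypothetical helper, `items_to_frame`, that tolerates a missing category, and runs it on hand-written items that only mimic the nesting used above.

```python
import pandas as pd

COLUMNS = ['City', 'Venue', 'Venue Latitude', 'Venue Longitude', 'Venue Category']

def items_to_frame(items, city):
    """Turn a list of explore-style items into a DataFrame with one row per venue."""
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories') or []
        # Use the first category when present, otherwise leave the cell empty.
        category = categories[0]['name'] if categories else None
        rows.append((city,
                     venue['name'],
                     venue['location']['lat'],
                     venue['location']['lng'],
                     category))
    return pd.DataFrame(rows, columns=COLUMNS)

# Hand-written sample items (hypothetical venues, not real API output).
sample_items = [
    {'venue': {'name': 'Example Park',
               'location': {'lat': 28.59, 'lng': 77.22},
               'categories': [{'name': 'Park'}]}},
    {'venue': {'name': 'Uncategorised Place',
               'location': {'lat': 28.60, 'lng': 77.21},
               'categories': []}},
]
frame = items_to_frame(sample_items, 'New Delhi')
print(frame)
```

Keeping the parsing separate from the HTTP request also makes it possible to test the table-building logic without calling the API.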

## Delhi's Data

In [8]:
delhi_venues = getNearbyVenues(latitude_del, longitude_del, radius=15000, Limit=100, city='New Delhi')
delhi_venues

| | City | Venue | Venue Latitude | Venue Longitude | Venue Category |
|---|---|---|---|---|---|
| 0 | New Delhi | Tamra | 28.620543 | 77.218174 | Restaurant |
| 1 | New Delhi | The Imperial | 28.625548 | 77.218664 | Hotel |
| 2 | New Delhi | Pandey Paan | 28.622249 | 77.201075 | Smoke Shop |
| 3 | New Delhi | Varq \| वर्क | 28.604547 | 77.223781 | Indian Restaurant |
| 4 | New Delhi | The Big Chill Cafe | 28.600686 | 77.227636 | Italian Restaurant |
| 5 | New Delhi | Gulati Restaurant | 28.608010 | 77.229989 | Indian Restaurant |
| 6 | New Delhi | Gurudwara Sri Rakabganj Sahibji | 28.618296 | 77.205269 | Spiritual Center |
| 7 | New Delhi | Lodhi Gardens (लोधी बाग़) (Lodhi Gardens) | 28.591424 | 77.220899 | Park |
| 8 | New Delhi | SODABOTTLEOPENERWALA | 28.600141 | 77.226273 | Irani Cafe |
| 9 | New Delhi | Naturals Ice Cream | 28.634455 | 77.222139 | Ice Cream Shop |


## Mumbai's Data

In [9]:
mumbai_venues = getNearbyVenues(latitude_mum, longitude_mum, radius=15000, Limit=100, city='Mumbai')
mumbai_venues

| | City | Venue | Venue Latitude | Venue Longitude | Venue Category |
|---|---|---|---|---|---|
| 0 | Mumbai | Starbucks | 18.932190 | 72.833959 | Coffee Shop |
| 1 | Mumbai | Wankhede Stadium | 18.938792 | 72.825944 | Cricket Ground |
| 2 | Mumbai | Food for Thought | 18.932031 | 72.831667 | Café |
| 3 | Mumbai | Natural's Ice Cream Parlour | 18.934892 | 72.824222 | Ice Cream Shop |
| 4 | Mumbai | Taj Mahal Palace & Tower | 18.922306 | 72.833578 | Hotel |
| 5 | Mumbai | Marine Drive | 18.941221 | 72.823261 | Scenic Lookout |
| 6 | Mumbai | Nariman Point | 18.929183 | 72.822232 | Scenic Lookout |
| 7 | Mumbai | K Rustoms. Ice Cream | 18.933478 | 72.824995 | Ice Cream Shop |
| 8 | Mumbai | Trishna | 18.928619 | 72.832356 | Seafood Restaurant |
| 9 | Mumbai | The Sassy Spoon | 18.928426 | 72.822512 | Diner |


Looking good. We now have up to 100 popular venues within 15 km of the centre of each city.
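
As a quick sanity check on the 15 km radius, the distance from a city centre to a venue can be estimated with the haversine formula. This is a minimal sketch: `haversine_km` is a hypothetical helper, and it assumes a spherical Earth with a mean radius of 6371 km, so the result is approximate. The coordinates are New Delhi's centre and the first Delhi venue (Tamra) listed above.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Approximate great-circle distance in kilometres between two points."""
    r = 6371.0  # mean Earth radius in km (spherical approximation)
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# New Delhi centre and the venue "Tamra" from the tables above.
distance_km = haversine_km(28.6138954, 77.2090057, 28.620543, 77.218174)
within_radius = distance_km <= 15.0
print(round(distance_km, 2), within_radius)
```

The same function could be applied row by row to either venue table to confirm that every venue falls inside the requested radius.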

This concludes the data gathering phase. We are now ready to analyse this data and report on how similar the two cities are!