## The Battle of the Neighborhoods - Week 2

### 1- Download and Explore New York City geographical coordinates data

New York City has a total of 5 boroughs and 306 neighborhoods. To segment the neighborhoods and explore them, we need a dataset that contains the 5 boroughs and the neighborhoods in each borough, as well as the latitude and longitude coordinates of each neighborhood.

This dataset is available on the web. Link to the dataset: https://geo.nyu.edu/catalog/nyu_2451_34572

First, we import all the libraries that we will need.

In [78]:
import numpy as np 
import pandas as pd
from PIL import Image
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
import json
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values
import requests # library to handle requests
from pandas import json_normalize # transform JSON file into a pandas dataframe (top-level import since pandas 1.0)
import matplotlib as mpl
import matplotlib.pyplot as plt
plt.style.use('ggplot')
import matplotlib.cm as cm
import matplotlib.colors as colors
from bs4 import BeautifulSoup
import folium
from wordcloud import WordCloud, STOPWORDS
import csv

print('Libraries imported.')

Libraries imported.


In [79]:
!wget -q -O 'newyork_data.json' https://ibm.box.com/shared/static/fbpwbovar7lf8p5sgddm06cgipa2rxpe.json
print('Data downloaded!')

Data downloaded!


The JSON file is hosted on a server, so the `wget` command above downloads it to `newyork_data.json`. The next cell loads it.

In [80]:
with open('newyork_data.json') as json_data:
    newyork_data = json.load(json_data)

#### Load and explore the data

In [81]:
neighborhoods_data = newyork_data['features']

In [82]:
neighborhoods_data[0]

{'type': 'Feature',
 'id': 'nyu_2451_34572.1',
 'geometry': {'type': 'Point',
  'coordinates': [-73.84720052054902, 40.89470517661]},
 'geometry_name': 'geom',
 'properties': {'name': 'Wakefield',
  'stacked': 1,
  'annoline1': 'Wakefield',
  'annoline2': None,
  'annoline3': None,
  'annoangle': 0.0,
  'borough': 'Bronx',
  'bbox': [-73.84720052054902,
   40.89470517661,
   -73.84720052054902,
   40.89470517661]}}

All the relevant data is in the *features* key, which is a list of the neighborhoods. The cells above store that list in a new variable, `neighborhoods_data`, and display its first item.
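The structure described above can be checked on a small inline sample before working with the full file. This is a minimal sketch: `sample_text` is a hand-written, abridged stand-in for `newyork_data.json`, and the top-level `"type": "FeatureCollection"` entry is an assumption based on the usual GeoJSON layout, not something shown in the output above.

```python
import json

# Abridged, hand-written stand-in for newyork_data.json (one feature only).
# The top-level "type" value is an assumption about the real file.
sample_text = '''
{"type": "FeatureCollection",
 "features": [
   {"type": "Feature",
    "geometry": {"type": "Point",
                 "coordinates": [-73.84720052054902, 40.89470517661]},
    "properties": {"name": "Wakefield", "borough": "Bronx"}}
 ]}
'''

sample = json.loads(sample_text)

# Which keys exist at the top level, and inside one feature's properties?
top_level_keys = sorted(sample.keys())
first = sample['features'][0]
property_keys = sorted(first['properties'].keys())

# GeoJSON points are stored as [longitude, latitude], in that order.
lon, lat = first['geometry']['coordinates']

print(top_level_keys, property_keys, lat, lon)
```

Note the coordinate order: the longitude comes first, which is why the loop further down reads index 1 for the latitude.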

#### Transform the data into a *pandas* dataframe
The next task is to transform this data of nested Python dictionaries into a *pandas* dataframe. Start by creating an empty dataframe.

In [83]:
# define the dataframe columns
column_names = ['Borough', 'Neighborhood', 'Latitude', 'Longitude'] 

# instantiate the dataframe
neighborhoods = pd.DataFrame(columns=column_names)

In [84]:
neighborhoods

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude


Then loop through the data, collect one row per neighborhood, and build the dataframe from those rows.

In [85]:
# Collect one dictionary per neighborhood, then build the dataframe in one step.
# (DataFrame.append was removed in pandas 2.0, and appending row by row is slow.)
rows = []
for data in neighborhoods_data:
    borough = data['properties']['borough']
    neighborhood_name = data['properties']['name']

    # GeoJSON stores coordinates as [longitude, latitude]
    neighborhood_latlon = data['geometry']['coordinates']
    neighborhood_lat = neighborhood_latlon[1]
    neighborhood_lon = neighborhood_latlon[0]

    rows.append({'Borough': borough,
                 'Neighborhood': neighborhood_name,
                 'Latitude': neighborhood_lat,
                 'Longitude': neighborhood_lon})

neighborhoods = pd.DataFrame(rows, columns=column_names)

In [86]:
neighborhoods.head()

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
0,Bronx,Wakefield,40.894705,-73.847201
1,Bronx,Co-op City,40.874294,-73.829939
2,Bronx,Eastchester,40.887556,-73.827806
3,Bronx,Fieldston,40.895437,-73.905643
4,Bronx,Riverdale,40.890834,-73.912585
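As an alternative to an explicit loop, the nested dictionaries can be flattened with `pd.json_normalize`. The sketch below is illustrative only: `sample_features` is a hand-made two-item list in the same shape as `neighborhoods_data`, with the second pair of coordinates taken from the rounded values in the table above.

```python
import pandas as pd

# Hand-made stand-in for neighborhoods_data (two abridged features).
sample_features = [
    {'geometry': {'type': 'Point',
                  'coordinates': [-73.84720052054902, 40.89470517661]},
     'properties': {'name': 'Wakefield', 'borough': 'Bronx'}},
    {'geometry': {'type': 'Point',
                  'coordinates': [-73.829939, 40.874294]},
     'properties': {'name': 'Co-op City', 'borough': 'Bronx'}},
]

# json_normalize flattens nested keys into dotted column names,
# e.g. 'properties.borough'; the coordinate list stays a list.
flat = pd.json_normalize(sample_features)

sample_df = pd.DataFrame({
    'Borough': flat['properties.borough'],
    'Neighborhood': flat['properties.name'],
    # coordinates are [longitude, latitude]
    'Latitude': flat['geometry.coordinates'].map(lambda xy: xy[1]),
    'Longitude': flat['geometry.coordinates'].map(lambda xy: xy[0]),
})

print(sample_df)
```

Both approaches should produce the same four columns; the loop is easier to read step by step, while `json_normalize` avoids writing the field extraction by hand.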


Let's make sure that the dataset has all 5 boroughs and 306 neighborhoods.

In [87]:
print('The dataframe has {} boroughs and {} neighborhoods.'.format(
        len(neighborhoods['Borough'].unique()),
        neighborhoods.shape[0]
    )
)

The dataframe has 5 boroughs and 306 neighborhoods.
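The same sanity check can be broken down per borough. The following is a small sketch on a hand-made sample (`sample` holds only the five Bronx rows shown earlier, so the counts here are not those of the full dataset); on the real `neighborhoods` dataframe the same calls would apply.

```python
import pandas as pd

# Hand-made sample: the five rows displayed by neighborhoods.head().
sample = pd.DataFrame({
    'Borough': ['Bronx'] * 5,
    'Neighborhood': ['Wakefield', 'Co-op City', 'Eastchester',
                     'Fieldston', 'Riverdale'],
})

# Number of distinct boroughs and number of rows per borough.
n_boroughs = sample['Borough'].nunique()
per_borough = sample['Borough'].value_counts()

# Rows that repeat the same (Borough, Neighborhood) pair, if any.
n_duplicate_rows = int(sample.duplicated(['Borough', 'Neighborhood']).sum())

print(n_boroughs, per_borough.to_dict(), n_duplicate_rows)
```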


In [88]:
neighborhoods.to_csv('BON1_NYC_GEO.csv',index=False)

#### Use geopy library to get the latitude and longitude values of New York City.

In [89]:
address = 'New York City, NY'

geolocator = Nominatim(user_agent="Jupyter")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of New York City are {}, {}.'.format(latitude, longitude))

The geographical coordinates of New York City are 40.7127281, -74.0060152.
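A geocoder's `geocode` method returns `None` when it finds no match, in which case `location.latitude` would raise an `AttributeError`. The sketch below guards against that. The helper name `geocode_or_none` and the `StubGeolocator` class are hypothetical: the stub stands in for `Nominatim` so that the example runs without a network connection.

```python
from types import SimpleNamespace


def geocode_or_none(geolocator, address):
    """Return (latitude, longitude) for address, or None if nothing is found."""
    location = geolocator.geocode(address)
    if location is None:
        return None
    return location.latitude, location.longitude


class StubGeolocator:
    """Offline stand-in for a geopy geocoder (hypothetical, for illustration)."""

    def geocode(self, address):
        # Only one address is known to this stub; anything else is "not found".
        if address == 'New York City, NY':
            return SimpleNamespace(latitude=40.7127281, longitude=-74.0060152)
        return None


found = geocode_or_none(StubGeolocator(), 'New York City, NY')
missing = geocode_or_none(StubGeolocator(), 'No Such Place')
print(found, missing)
```

With the real service, the same helper could be called as `geocode_or_none(Nominatim(user_agent="Jupyter"), address)`.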


#### Create a map of New York with neighborhoods superimposed on top.

**Folium** is a visualization library for interactive maps. We can zoom into the map below and click on each circle marker to reveal the name of the neighborhood and its borough.

In [90]:
# create map of New York using latitude and longitude values
map_NewYork = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, borough, neighborhood in zip(neighborhoods['Latitude'], neighborhoods['Longitude'], neighborhoods['Borough'], neighborhoods['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_NewYork)  
    
map_NewYork

### A : The Population Data
Population data is scraped from the Wikipedia page https://en.wikipedia.org/wiki/Demographics_of_New_York_City

#### Web scraping of population data from the Wikipedia page using BeautifulSoup.
Beautiful Soup is a Python package for parsing HTML and XML documents, including documents with malformed markup such as non-closed tags (it is named after tag soup). It creates a parse tree for parsed pages that can be used to extract data from HTML, which is useful for web scraping.

In [91]:
# Download the page and parse its HTML
page_html = requests.get('https://en.wikipedia.org/wiki/Demographics_of_New_York_City').text
soup = BeautifulSoup(page_html, 'lxml')
# Take the first sortable wikitable on the page
table = soup.find('table', {'class': 'wikitable sortable'})
# Collect the text of every header cell (<th>) and every data cell (<td>)
headers = [header.text for header in table.find_all('th')]
table_rows = table.find_all('tr')
rows = []
for table_row in table_rows:
    cells = [cell.text for cell in table_row.find_all('td')]
    rows.append(cells)
# Write the header row and the non-empty data rows to a CSV file
with open('BON2_POPULATION1.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(headers)
    writer.writerows(row for row in rows if row)
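The scraped cells carry trailing newlines because `.text` returns the raw cell text. One way to avoid the later cleanup is to strip whitespace while extracting. The sketch below is an assumption-laden illustration: `sample_html` is a hand-written three-column table (much simpler than the real Wikipedia table), and it uses Python's built-in `html.parser` so that it runs without `lxml`.

```python
from bs4 import BeautifulSoup

# Hand-written miniature table; the real page has more columns and
# a multi-row header.
sample_html = """
<table class="wikitable sortable">
<tr><th>Borough
</th><th>County
</th><th>Population
</th></tr>
<tr><td>The Bronx
</td><td>
 Bronx
</td><td>1,471,160
</td></tr>
<tr><td>Brooklyn
</td><td>
 Kings
</td><td>2,648,771
</td></tr>
</table>
"""

soup = BeautifulSoup(sample_html, 'html.parser')
table = soup.find('table', {'class': 'wikitable sortable'})

# get_text(strip=True) removes leading/trailing whitespace, including '\n'.
# Only the first row is used for headers, so sub-header rows are not mixed in.
headers = [th.get_text(strip=True) for th in table.find('tr').find_all('th')]

rows = []
for tr in table.find_all('tr')[1:]:
    cells = [td.get_text(strip=True) for td in tr.find_all('td')]
    if cells:  # skip rows without data cells
        rows.append(cells)

print(headers)
print(rows)
```

Taking headers from the first row only is a design choice for this toy table; a table with stacked header rows would need its own handling.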

In [92]:
# Load data from CSV

In [93]:
Pop_data=pd.read_csv('BON2_POPULATION1.csv')
Pop_data.drop(Pop_data.columns[[7,8,9,10,11]], axis=1, inplace=True)
print('Data loaded!')
Pop_data.head()

Data loaded!


Unnamed: 0,New York City's five boroughsvte,Jurisdiction,Population,Gross Domestic Product,Land area,Density,Borough,squarekm,persons / sq. mi,persons /km2
0,The Bronx\n,\n Bronx\n,"1,471,160\n",42.695\n,"29,200\n",42.10\n,109.04\n,,,
1,Brooklyn\n,\n Kings\n,"2,648,771\n",91.559\n,"34,600\n",70.82\n,183.42\n,,,
2,Manhattan\n,\n New York\n,"1,664,727\n",600.244\n,"360,600\n",22.83\n,59.13\n,,,
3,Queens\n,\n Queens\n,"2,358,582\n",93.310\n,"39,600\n",108.53\n,281.09\n,,,
4,Staten Island\n,\n Richmond\n,"479,458\n",14.514\n,"30,300\n",58.37\n,151.18\n,,,


In [94]:
Pop_data.columns = Pop_data.columns.str.replace(' ', '')
Pop_data.columns = Pop_data.columns.str.replace('\'','')
Pop_data.rename(columns={'Borough':'persons_sq_mi','County':'persons_sq_km'}, inplace=True)
Pop_data

Unnamed: 0,NewYorkCitysfiveboroughsvte,Jurisdiction,Population,GrossDomesticProduct,Landarea,Density,persons_sq_mi,squarekm,persons/sq.mi,persons/km2
0,The Bronx\n,\n Bronx\n,"1,471,160\n",42.695\n,"29,200\n",42.10\n,109.04\n,,,
1,Brooklyn\n,\n Kings\n,"2,648,771\n",91.559\n,"34,600\n",70.82\n,183.42\n,,,
2,Manhattan\n,\n New York\n,"1,664,727\n",600.244\n,"360,600\n",22.83\n,59.13\n,,,
3,Queens\n,\n Queens\n,"2,358,582\n",93.310\n,"39,600\n",108.53\n,281.09\n,,,
4,Staten Island\n,\n Richmond\n,"479,458\n",14.514\n,"30,300\n",58.37\n,151.18\n,,,
5,City of New York,8622698,842.343,97700,302.64,783.83,28188,,,
6,State of New York,19849399,1701.399,85700,47214,122284,416.4,,,
7,Sources:[14] and see individual borough articl...,,,,,,,,,


In [95]:
#rename columns
Pop_data.rename(columns = {'NewYorkCitysfiveboroughsvte\n' : 'Borough',
                   'Jurisdiction\n':'County',
                   'Population\n':'Estimate_2017', 
                   'Landarea\n':'square_miles',
                    'Density\n':'square_km','GrossDomesticProduct\n':'GDP'}, inplace=True)
Pop_data

Unnamed: 0,Borough,County,Estimate_2017,GDP,square_miles,square_km,persons_sq_mi,squarekm,persons/sq.mi,persons/km2
0,The Bronx\n,\n Bronx\n,"1,471,160\n",42.695\n,"29,200\n",42.10\n,109.04\n,,,
1,Brooklyn\n,\n Kings\n,"2,648,771\n",91.559\n,"34,600\n",70.82\n,183.42\n,,,
2,Manhattan\n,\n New York\n,"1,664,727\n",600.244\n,"360,600\n",22.83\n,59.13\n,,,
3,Queens\n,\n Queens\n,"2,358,582\n",93.310\n,"39,600\n",108.53\n,281.09\n,,,
4,Staten Island\n,\n Richmond\n,"479,458\n",14.514\n,"30,300\n",58.37\n,151.18\n,,,
5,City of New York,8622698,842.343,97700,302.64,783.83,28188,,,
6,State of New York,19849399,1701.399,85700,47214,122284,416.4,,,
7,Sources:[14] and see individual borough articl...,,,,,,,,,


In [96]:
# strip newline characters ('\n') from the text columns
Pop_data['BoroughName']=Pop_data['Borough'].replace(to_replace='\n', value='', regex=True)
Pop_data['County']=Pop_data['County'].replace(to_replace='\n', value='', regex=True)
Pop_data['Estimate_2017']=Pop_data['Estimate_2017'].replace(to_replace='\n', value='', regex=True)
Pop_data['square_miles']=Pop_data['square_miles'].replace(to_replace='\n', value='', regex=True)
Pop_data['square_km']=Pop_data['square_km'].replace(to_replace='\n', value='', regex=True)
Pop_data['persons_sq_mi']=Pop_data['persons/sq.mi'].replace(to_replace='\n', value='', regex=True)
Pop_data['persons_sq_km']=Pop_data['squarekm'].replace(to_replace='\n', value='', regex=True)
Pop_data

Unnamed: 0,Borough,County,Estimate_2017,GDP,square_miles,square_km,persons_sq_mi,squarekm,persons/sq.mi,persons/km2,BoroughName,persons_sq_km
0,The Bronx\n,Bronx,1471160.0,42.695\n,29200.0,42.1,,,,,The Bronx,
1,Brooklyn\n,Kings,2648771.0,91.559\n,34600.0,70.82,,,,,Brooklyn,
2,Manhattan\n,New York,1664727.0,600.244\n,360600.0,22.83,,,,,Manhattan,
3,Queens\n,Queens,2358582.0,93.310\n,39600.0,108.53,,,,,Queens,
4,Staten Island\n,Richmond,479458.0,14.514\n,30300.0,58.37,,,,,Staten Island,
5,City of New York,8622698,842.343,97700,302.64,783.83,,,,,City of New York,
6,State of New York,19849399,1701.399,85700,47214.0,122284.0,,,,,State of New York,
7,Sources:[14] and see individual borough articl...,,,,,,,,,,Sources:[14] and see individual borough articles,
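Values such as `"1,471,160\n"` are text, so they cannot be summed or compared as numbers until they are converted. A minimal sketch of that conversion follows; `raw` is a hand-made two-row frame using values from the table above, and the column names are reused only for illustration.

```python
import pandas as pd

# Hand-made sample with the same kind of debris as the scraped table.
raw = pd.DataFrame({
    'Borough': ['The Bronx\n', 'Brooklyn\n'],
    'Estimate_2017': ['1,471,160\n', '2,648,771\n'],
})

clean = raw.copy()

# Strip surrounding whitespace (including '\n') from the text column.
clean['Borough'] = clean['Borough'].str.strip()

# Strip whitespace, drop thousands separators, then convert to numbers.
clean['Estimate_2017'] = pd.to_numeric(
    clean['Estimate_2017'].str.strip().str.replace(',', '', regex=False)
)

print(clean)
print(clean['Estimate_2017'].sum())
```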


In [97]:
#remove NaN
Pop_data = Pop_data.fillna('')
Pop_data

Unnamed: 0,Borough,County,Estimate_2017,GDP,square_miles,square_km,persons_sq_mi,squarekm,persons/sq.mi,persons/km2,BoroughName,persons_sq_km
0,The Bronx\n,Bronx,1471160.0,42.695\n,29200.0,42.1,,,,,The Bronx,
1,Brooklyn\n,Kings,2648771.0,91.559\n,34600.0,70.82,,,,,Brooklyn,
2,Manhattan\n,New York,1664727.0,600.244\n,360600.0,22.83,,,,,Manhattan,
3,Queens\n,Queens,2358582.0,93.310\n,39600.0,108.53,,,,,Queens,
4,Staten Island\n,Richmond,479458.0,14.514\n,30300.0,58.37,,,,,Staten Island,
5,City of New York,8622698,842.343,97700,302.64,783.83,,,,,City of New York,
6,State of New York,19849399,1701.399,85700,47214.0,122284.0,,,,,State of New York,
7,Sources:[14] and see individual borough articl...,,,,,,,,,,Sources:[14] and see individual borough articles,


In [98]:
Pop_data.columns = Pop_data.columns.str.replace(' ', '')
Pop_data.columns = Pop_data.columns.str.replace('\'','')
Pop_data

Unnamed: 0,Borough,County,Estimate_2017,GDP,square_miles,square_km,persons_sq_mi,squarekm,persons/sq.mi,persons/km2,BoroughName,persons_sq_km
0,The Bronx\n,Bronx,1471160.0,42.695\n,29200.0,42.1,,,,,The Bronx,
1,Brooklyn\n,Kings,2648771.0,91.559\n,34600.0,70.82,,,,,Brooklyn,
2,Manhattan\n,New York,1664727.0,600.244\n,360600.0,22.83,,,,,Manhattan,
3,Queens\n,Queens,2358582.0,93.310\n,39600.0,108.53,,,,,Queens,
4,Staten Island\n,Richmond,479458.0,14.514\n,30300.0,58.37,,,,,Staten Island,
5,City of New York,8622698,842.343,97700,302.64,783.83,,,,,City of New York,
6,State of New York,19849399,1701.399,85700,47214.0,122284.0,,,,,State of New York,
7,Sources:[14] and see individual borough articl...,,,,,,,,,,Sources:[14] and see individual borough articles,


In [99]:
Pop_data['BoroughName']=Pop_data['Borough'].replace(to_replace='\n', value='', regex=True)
Pop_data['County']=Pop_data['County'].replace(to_replace='\n', value='', regex=True)
Pop_data['Estimate_2017']=Pop_data['Estimate_2017'].replace(to_replace='\n', value='', regex=True)
Pop_data['GDP']=Pop_data['GDP'].replace(to_replace='\n', value='', regex=True)
Pop_data['square_miles']=Pop_data['square_miles'].replace(to_replace='\n', value='', regex=True)
Pop_data['square_km']=Pop_data['square_km'].replace(to_replace='\n', value='', regex=True)
Pop_data['Borough']=Pop_data['Borough'].replace(to_replace='\n', value='', regex=True)
Pop_data['squarekm']=Pop_data['squarekm'].replace(to_replace='\n', value='', regex=True)

Pop_data

Unnamed: 0,Borough,County,Estimate_2017,GDP,square_miles,square_km,persons_sq_mi,squarekm,persons/sq.mi,persons/km2,BoroughName,persons_sq_km
0,The Bronx,Bronx,1471160.0,42.695,29200.0,42.1,,,,,The Bronx,
1,Brooklyn,Kings,2648771.0,91.559,34600.0,70.82,,,,,Brooklyn,
2,Manhattan,New York,1664727.0,600.244,360600.0,22.83,,,,,Manhattan,
3,Queens,Queens,2358582.0,93.31,39600.0,108.53,,,,,Queens,
4,Staten Island,Richmond,479458.0,14.514,30300.0,58.37,,,,,Staten Island,
5,City of New York,8622698,842.343,97700.0,302.64,783.83,,,,,City of New York,
6,State of New York,19849399,1701.399,85700.0,47214.0,122284.0,,,,,State of New York,
7,Sources:[14] and see individual borough articles,,,,,,,,,,Sources:[14] and see individual borough articles,


In [100]:
Pop_data = Pop_data.fillna('')
Pop_data

Unnamed: 0,Borough,County,Estimate_2017,GDP,square_miles,square_km,persons_sq_mi,squarekm,persons/sq.mi,persons/km2,BoroughName,persons_sq_km
0,The Bronx,Bronx,1471160.0,42.695,29200.0,42.1,,,,,The Bronx,
1,Brooklyn,Kings,2648771.0,91.559,34600.0,70.82,,,,,Brooklyn,
2,Manhattan,New York,1664727.0,600.244,360600.0,22.83,,,,,Manhattan,
3,Queens,Queens,2358582.0,93.31,39600.0,108.53,,,,,Queens,
4,Staten Island,Richmond,479458.0,14.514,30300.0,58.37,,,,,Staten Island,
5,City of New York,8622698,842.343,97700.0,302.64,783.83,,,,,City of New York,
6,State of New York,19849399,1701.399,85700.0,47214.0,122284.0,,,,,State of New York,
7,Sources:[14] and see individual borough articles,,,,,,,,,,Sources:[14] and see individual borough articles,


In [101]:
# Drop the footnote row; its text ('Sources:[14] ...') sits in the Borough column
Pop_data = Pop_data[~Pop_data['Borough'].str.startswith('Sources')]
Pop_data

Unnamed: 0,Borough,County,Estimate_2017,GDP,square_miles,square_km,persons_sq_mi,squarekm,persons/sq.mi,persons/km2,BoroughName,persons_sq_km
0,The Bronx,Bronx,1471160.0,42.695,29200.0,42.1,,,,,The Bronx,
1,Brooklyn,Kings,2648771.0,91.559,34600.0,70.82,,,,,Brooklyn,
2,Manhattan,New York,1664727.0,600.244,360600.0,22.83,,,,,Manhattan,
3,Queens,Queens,2358582.0,93.31,39600.0,108.53,,,,,Queens,
4,Staten Island,Richmond,479458.0,14.514,30300.0,58.37,,,,,Staten Island,
5,City of New York,8622698,842.343,97700.0,302.64,783.83,,,,,City of New York,
6,State of New York,19849399,1701.399,85700.0,47214.0,122284.0,,,,,State of New York,


In [102]:
Pop_data.to_csv('BON2_POPULATION.csv',index=False)

### B : The Demographics Data

We will scrape demographics data from the Wikipedia page https://en.wikipedia.org/wiki/Demographic_history_of_New_York_City

#### Web scraping of demographics data from the Wikipedia page using BeautifulSoup.

In [103]:
website_url = requests.get('https://en.wikipedia.org/wiki/Demographic_history_of_New_York_City').text
soup = BeautifulSoup(website_url,'lxml')
table = soup.find('table',{'class':'wikitable sortable'})

# NOTE: `headers` is not recomputed for this table, so the CSV below reuses the
# header list from the population table. That is why the column names in the
# outputs that follow do not describe the demographic columns.
table_rows = table.find_all('tr')
rows = []
for row in table_rows:
   td = row.find_all('td')
   row = [row.text for row in td]
   rows.append(row)

with open('NYC_DEMO.csv', 'w') as f:
   writer = csv.writer(f)
   writer.writerow(headers)
   writer.writerows(row for row in rows if row)

In [104]:
# Load data from CSV
Demo_data=pd.read_csv('NYC_DEMO.csv')
print('Data downloaded!')

Data downloaded!


In [105]:
Demo_data

Unnamed: 0,New York City's five boroughsvte,Jurisdiction,Population,Gross Domestic Product,Land area,Density,Borough,County,Estimate (2017)[12],billions(US$)[13],per capita(US$),square miles,squarekm,persons / sq. mi,persons /km2
1900,3437202,3369898,98.04,,,60666,1.76,6607,0.19,31,0.0,,,1270080,36.95
1910,4766883,4669162,97.95,,,91709,1.92,5669,0.12,343,0.01,,,1944357,40.79
1920,5620048,5459463,97.14,,,152467,2.71,7969,0.14,149,0.0,,,2028160,36.09
1930,6930446,6589377,95.08,,,327706,4.73,12972,0.19,391,0.01,,,2358686,34.03
1940,7454995,6977501,93.59,6856586.0,91.97,458444,6.15,17986,0.24,1064,0.01,120915.0,1.62,2138657,28.69
1950,7891957,7116441,90.17,,,747608,9.47,21441,0.27,6467,0.08,,,1784206,22.61
1960,7781984,6640662,85.33,,,1087931,13.98,43103,0.55,10288,0.13,,,1558690,20.03
1970,7894862,6048841,76.62,4969749.0,62.95,1668115,21.13,94499,1.2,83407,1.06,1278630.0,16.2,1437058,18.2
1980,7071639,4294075,60.72,3668945.0,51.88,1784337,25.23,231501,3.27,761762,10.77,1406024.0,19.88,1670199,23.62
1990,7322564,3827088,52.26,3163125.0,43.2,2102512,28.71,512719,7.0,880245,12.02,1783511.0,24.36,2082931,28.45


In [106]:
#Remove whitespaces and rename columns
Demo_data.columns

Index(['New York City's five boroughsvte\n', 'Jurisdiction\n', 'Population\n',
       'Gross Domestic Product\n', 'Land area\n', 'Density\n', 'Borough',
       'County', 'Estimate (2017)[12]', 'billions(US$)[13]', 'per capita(US$)',
       'square miles', 'squarekm', 'persons / sq. mi', 'persons /km2\n'],
      dtype='object')

In [107]:
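# Note: none of these keys appear in Demo_data.columns (shown above),
# so this rename currently changes nothing.
Demo_data.rename(columns = {'2010[237]' : '2010',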
                   '1990[239]':'1990',
                   '1970[239]':'1970', 
                   '1940[239]\n':'1940',
                    }, inplace=True)
Demo_data

Unnamed: 0,New York City's five boroughsvte,Jurisdiction,Population,Gross Domestic Product,Land area,Density,Borough,County,Estimate (2017)[12],billions(US$)[13],per capita(US$),square miles,squarekm,persons / sq. mi,persons /km2
1900,3437202,3369898,98.04,,,60666,1.76,6607,0.19,31,0.0,,,1270080,36.95
1910,4766883,4669162,97.95,,,91709,1.92,5669,0.12,343,0.01,,,1944357,40.79
1920,5620048,5459463,97.14,,,152467,2.71,7969,0.14,149,0.0,,,2028160,36.09
1930,6930446,6589377,95.08,,,327706,4.73,12972,0.19,391,0.01,,,2358686,34.03
1940,7454995,6977501,93.59,6856586.0,91.97,458444,6.15,17986,0.24,1064,0.01,120915.0,1.62,2138657,28.69
1950,7891957,7116441,90.17,,,747608,9.47,21441,0.27,6467,0.08,,,1784206,22.61
1960,7781984,6640662,85.33,,,1087931,13.98,43103,0.55,10288,0.13,,,1558690,20.03
1970,7894862,6048841,76.62,4969749.0,62.95,1668115,21.13,94499,1.2,83407,1.06,1278630.0,16.2,1437058,18.2
1980,7071639,4294075,60.72,3668945.0,51.88,1784337,25.23,231501,3.27,761762,10.77,1406024.0,19.88,1670199,23.62
1990,7322564,3827088,52.26,3163125.0,43.2,2102512,28.71,512719,7.0,880245,12.02,1783511.0,24.36,2082931,28.45


In [108]:
Demo_data.columns

Index(['New York City's five boroughsvte\n', 'Jurisdiction\n', 'Population\n',
       'Gross Domestic Product\n', 'Land area\n', 'Density\n', 'Borough',
       'County', 'Estimate (2017)[12]', 'billions(US$)[13]', 'per capita(US$)',
       'square miles', 'squarekm', 'persons / sq. mi', 'persons /km2\n'],
      dtype='object')

In [109]:
Demo_data.columns = Demo_data.columns.str.replace(' ', '')

In [110]:
# Replace newline characters ('\n') inside the cell values with a space
Demo_data= Demo_data.replace('\n',' ', regex=True)
Demo_data

Unnamed: 0,NewYorkCity'sfiveboroughsvte,Jurisdiction,Population,GrossDomesticProduct,Landarea,Density,Borough,County,Estimate(2017)[12],billions(US$)[13],percapita(US$),squaremiles,squarekm,persons/sq.mi,persons/km2
1900,3437202,3369898,98.04,,,60666,1.76,6607,0.19,31,0.0,,,1270080,36.95
1910,4766883,4669162,97.95,,,91709,1.92,5669,0.12,343,0.01,,,1944357,40.79
1920,5620048,5459463,97.14,,,152467,2.71,7969,0.14,149,0.0,,,2028160,36.09
1930,6930446,6589377,95.08,,,327706,4.73,12972,0.19,391,0.01,,,2358686,34.03
1940,7454995,6977501,93.59,6856586.0,91.97,458444,6.15,17986,0.24,1064,0.01,120915.0,1.62,2138657,28.69
1950,7891957,7116441,90.17,,,747608,9.47,21441,0.27,6467,0.08,,,1784206,22.61
1960,7781984,6640662,85.33,,,1087931,13.98,43103,0.55,10288,0.13,,,1558690,20.03
1970,7894862,6048841,76.62,4969749.0,62.95,1668115,21.13,94499,1.2,83407,1.06,1278630.0,16.2,1437058,18.2
1980,7071639,4294075,60.72,3668945.0,51.88,1784337,25.23,231501,3.27,761762,10.77,1406024.0,19.88,1670199,23.62
1990,7322564,3827088,52.26,3163125.0,43.2,2102512,28.71,512719,7.0,880245,12.02,1783511.0,24.36,2082931,28.45


In [121]:
Demo_data.to_csv('BON2_DEMOGRAPHICS.csv',index=False)

In [124]:
#This data is extracted from the wikipedia Page - https://en.wikipedia.org/wiki/Cuisine_of_New_York_City
cuisine_df=pd.read_csv('/Users/rbiswa03/Downloads/BON3_NYC_CUISINE.csv')
cuisine_df.head()

Unnamed: 0,Borough,Neighborhood,Cuisine
0,The Bronx,Bedford Park,"Mexican, Puerto Rican, Dominican, Korean (on ..."
1,The Bronx,Belmont,"Italian, Albanian (also known as ""Arthur Aven..."
2,The Bronx,City Island,"Italian, Seafood"
3,The Bronx,Morris Park,"Italian, Albanian"
4,The Bronx,Norwood,"Filipino (formerly Irish, less so today)"
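Each `Cuisine` cell holds several comma-separated values, which makes counting individual cuisines awkward. A possible way to get one cuisine per row is sketched below. `sample_cuisine` is a hand-made frame containing only the two rows whose text is shown in full above, and splitting on `', '` is an assumption that would not handle the parenthetical remarks in other rows.

```python
import pandas as pd

# Hand-made sample: the two rows displayed without truncation above.
sample_cuisine = pd.DataFrame({
    'Borough': ['The Bronx', 'The Bronx'],
    'Neighborhood': ['City Island', 'Morris Park'],
    'Cuisine': ['Italian, Seafood', 'Italian, Albanian'],
})

# Split each cell into a list, then give every list item its own row.
long_form = (
    sample_cuisine
    .assign(Cuisine=sample_cuisine['Cuisine'].str.split(', '))
    .explode('Cuisine')
    .reset_index(drop=True)
)

# How many neighborhoods in the sample mention each cuisine.
cuisine_counts = long_form['Cuisine'].value_counts()

print(long_form)
print(cuisine_counts.to_dict())
```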
