# Capstone Project - The Battle of Neighborhoods

### Introduction

When arcade game manufacturers produced cabinets in the '80s, they built them for all sorts of arcades, malls and other entertainment venues aimed at a young audience. Fast forward to today: arcades aren't as prevalent, or as popular, as they once were, but they're still hanging around.

And within these locations, new business models are developing. Many traditional arcades are changing their ways, moving away from the coin-based business model that has long been part of the arcade ecosystem.
Meanwhile, combination arcade bars are springing up across the country, bringing their own methods of monetizing games, along with other changes that bring the machines in line with more adult, modern usage.

### Problem Statement: Which locality, across all cities in the United States, would be the best place to start a gaming arcade?

A friend of mine has long been interested in starting a gaming arcade in the best locality among all the cities in the United States. He defines the best locality using the following criteria:

* Population density of a locality
* Per capita income
* Population of each location
* Venues in each locality

__The venue categories he is interested in are:__

* Arts and Entertainment
* Shops & Service
* College and University
* Event
* Food
* Nightlife Spot
* Outdoors & Recreation
* Professional & Other places
* Residence
* Travel & Transport

## Data we need

__To help my friend set up a gaming arcade, we will get the data from the sources below.__

* List of all the cities in the United States with population density and coordinates:
   *Data Source*: https://en.wikipedia.org/wiki/List_of_United_States_cities_by_population

   *Description*: This data set contains the population, area, population density and coordinates of each city. We will use it to explore the cities of the United States.

* List of United States counties (with their states) by per capita income:
   *Data Source*: https://en.wikipedia.org/wiki/List_of_United_States_counties_by_per_capita_income

__We will use the Foursquare API to get the following:__

* List of all venues in each city
* List of all venues in each locality in the selected city

Using the above data, we will first select the best city based on population density, per capita income of the state, and number of venues (each venue is weighted according to its category).

Once we have selected a city, we go hunting for localities using the same approach, i.e. based on the venue scores in each locality.
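The category weighting described above can be sketched as follows. This is a minimal illustration and not the notebook's actual scoring code: the values in `CATEGORY_WEIGHTS`, the `DEFAULT_WEIGHT` fallback, the `venue_score` helper and the two sample localities are all hypothetical, chosen only to show how venues of different categories might contribute to a locality's score.

```python
from collections import Counter

# Hypothetical weights per venue category (assumed values, not taken from the notebook).
CATEGORY_WEIGHTS = {
    "Arts and Entertainment": 3.0,
    "College and University": 2.5,
    "Nightlife Spot": 2.0,
    "Food": 1.5,
    "Shops & Service": 1.0,
}
DEFAULT_WEIGHT = 0.5  # assumed weight for any category not listed above


def venue_score(categories):
    """Return the weighted sum of venue counts for one locality."""
    counts = Counter(categories)
    return sum(CATEGORY_WEIGHTS.get(cat, DEFAULT_WEIGHT) * n for cat, n in counts.items())


# Made-up venue categories for two localities, for illustration only.
localities = {
    "Locality A": ["Food", "Food", "Nightlife Spot", "Arts and Entertainment"],
    "Locality B": ["Residence", "Food", "Shops & Service"],
}
scores = {name: venue_score(cats) for name, cats in localities.items()}

# Localities sorted from highest to lowest score.
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

The same function could be applied first per city and then per locality, since both steps rank places by their weighted venue counts.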

In [1]:
# Import all the libraries needed for the analysis


import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

!conda install -c conda-forge geopy --yes # comment out this line if geopy is already installed
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe (pandas >= 1.0 exposes this as pd.json_normalize)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

# for webscraping import Beautiful Soup 
from bs4 import BeautifulSoup

import xml

!conda install -c conda-forge folium=0.5.0 --yes # comment out this line if folium is already installed
import folium # map rendering library

print('Libraries imported.')


### Extracting the content of the Wikipedia page 'List of United States cities by population' as text

In [2]:
link = 'https://en.wikipedia.org/wiki/List_of_United_States_cities_by_population'
page = requests.get(link) 
soup = BeautifulSoup(page.text, 'html.parser') # name the parser explicitly to avoid the "no parser specified" warning

### Finding the table that has the data we need, i.e. the list of all cities with their population, area and location (coordinates)

In [3]:
table = soup.find_all('table')[4] # index 4 held the cities table when this was run; it may shift if the page layout changes

### Extracting the table from the webpage into a data frame by specifying the column names

In [4]:
table_rows = table.find_all('tr')
res = []
for tr in table_rows:
    cells = tr.find_all('td')
    # keep the stripped text of every non-empty cell; rows without <td> cells (headers) are skipped
    row = [cell.text.strip() for cell in cells if cell.text.strip()]
    if row:
        res.append(row)
# the "del*" columns are placeholders for values we do not use
df = pd.DataFrame(res, columns=["Rank", "City", "State", "del1", "del2", "del3", "Sq.Area", "del5", "population density in Sq Mi", "Population density in Km2", "Location"])
df.head()

| | Rank | City | State | del1 | del2 | del3 | Sq.Area | del5 | population density in Sq Mi | Population density in Km2 | Location |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | New York[d] | New York | 8398748 | 8175133 | +2.74% | 301.5 sq mi | 780.9 km2 | 28,317/sq mi | 10,933/km2 | 40°39′49″N 73°56′19″W / 40.6635°N 73.9387°W... |
| 1 | 2 | Los Angeles | California | 3990456 | 3792621 | +5.22% | 468.7 sq mi | 1,213.9 km2 | 8,484/sq mi | 3,276/km2 | 34°01′10″N 118°24′39″W / 34.0194°N 118.4108°... |
| 2 | 3 | Chicago | Illinois | 2705994 | 2695598 | +0.39% | 227.3 sq mi | 588.7 km2 | 11,900/sq mi | 4,600/km2 | 41°50′15″N 87°40′54″W / 41.8376°N 87.6818°W... |
| 3 | 4 | Houston[3] | Texas | 2325502 | 2100263 | +10.72% | 637.5 sq mi | 1,651.1 km2 | 3,613/sq mi | 1,395/km2 | 29°47′12″N 95°23′27″W / 29.7866°N 95.3909°W... |
| 4 | 5 | Phoenix | Arizona | 1660272 | 1445632 | +14.85% | 517.6 sq mi | 1,340.6 km2 | 3,120/sq mi | 1,200/km2 | 33°34′20″N 112°05′24″W / 33.5722°N 112.0901°... |
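The scraped values are still raw strings: city names carry footnote markers such as `[d]`, densities look like `28,317/sq mi`, and the location cell mixes degree/minute/second and decimal coordinates. A minimal cleaning sketch is shown below. The helper names `strip_footnotes`, `parse_density` and `parse_location` are hypothetical and not part of the notebook, and the regular expression assumes the location cell contains a decimal pair of the form `40.6635°N 73.9387°W`.

```python
import re


def strip_footnotes(text):
    """Remove Wikipedia footnote markers such as '[d]' or '[3]'."""
    return re.sub(r"\[[^\]]*\]", "", text).strip()


def parse_density(text):
    """Turn a string like '28,317/sq mi' into the integer 28317."""
    number = text.split("/")[0].replace(",", "")
    return int(number)


def parse_location(text):
    """Extract decimal (latitude, longitude) from a location cell.

    Returns None when no decimal coordinate pair is found.
    """
    match = re.search(r"(\d+(?:\.\d+)?)°([NS])\s+(\d+(?:\.\d+)?)°([EW])", text)
    if match is None:
        return None
    lat, ns, lon, ew = match.groups()
    lat = float(lat) * (1 if ns == "N" else -1)  # southern latitudes are negative
    lon = float(lon) * (1 if ew == "E" else -1)  # western longitudes are negative
    return lat, lon


print(strip_footnotes("New York[d]"))
print(parse_density("28,317/sq mi"))
print(parse_location("40°39′49″N 73°56′19″W / 40.6635°N 73.9387°W"))
```

With pandas these helpers could be applied column by column, for example `df["City"].map(strip_footnotes)`.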


### Getting the per capita income by county (with state) for the USA

In [5]:
link1 = 'https://en.wikipedia.org/wiki/List_of_United_States_counties_by_per_capita_income'
page1 = requests.get(link1)
soup1 = BeautifulSoup(page1.text, 'html.parser') # name the parser explicitly to avoid the "no parser specified" warning
table = soup1.find_all('table')[2]
table_rows = table.find_all('tr')
res = []
for tr in table_rows:
    cells = tr.find_all('td')
    # keep the stripped text of every non-empty cell; rows without <td> cells (headers) are skipped
    row = [cell.text.strip() for cell in cells if cell.text.strip()]
    if row:
        res.append(row)
df_state = pd.DataFrame(res, columns=["Rank", "Country-equivalent", "State", "Per capita income", "del2", "del3", "Population", "del5"])
df_state.head()

| | Rank | Country-equivalent | State | Per capita income | del2 | del3 | Population | del5 |
|---|---|---|---|---|---|---|---|---|
| 0 | 1 | New York County | New York | $62,498 | $69,659 | $84,627 | 1605272 | 736192 |
| 1 | 2 | Arlington | Virginia | $62,018 | $103,208 | $139,244 | 214861 | 94454 |
| 2 | 3 | Falls Church City | Virginia | $59,088 | $120,000 | $152,857 | 12731 | 5020 |
| 3 | 4 | Marin | California | $56,791 | $90,839 | $117,357 | 254643 | 102912 |
| 4 | 5 | Alexandria City | Virginia | $54,608 | $85,706 | $107,511 | 143684 | 65369 |
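The table above is per county, while the city selection relies on per capita income of the state. One possible way to bridge the two is a population-weighted average over each state's counties. The sketch below assumes that weighting; the notebook does not say how county figures are mapped to states, and the helper names `parse_dollars` and `state_income` are hypothetical.

```python
from collections import defaultdict


def parse_dollars(text):
    """Turn a string like '$62,498' into the integer 62498."""
    return int(text.replace("$", "").replace(",", ""))


def state_income(rows):
    """Population-weighted per capita income per state (assumed aggregation).

    rows: iterable of (state, per_capita_income_text, population_text).
    """
    total_income = defaultdict(float)
    total_pop = defaultdict(int)
    for state, income_text, pop_text in rows:
        pop = int(pop_text.replace(",", ""))
        total_income[state] += parse_dollars(income_text) * pop
        total_pop[state] += pop
    return {state: total_income[state] / total_pop[state] for state in total_pop}


# Three rows taken from the df_state.head() output above.
sample_rows = [
    ("New York", "$62,498", "1605272"),
    ("Virginia", "$62,018", "214861"),
    ("Virginia", "$59,088", "12731"),
]
income_by_state = state_income(sample_rows)
print(income_by_state)
```

With only one county in the sample, New York's value equals that county's figure; Virginia's value falls between its two county figures, closer to the more populous one.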


### Approach:

* Collect the city population density and coordinates from https://en.wikipedia.org/wiki/List_of_United_States_cities_by_population
* Using the Foursquare API, find all venues in each locality of the selected city.
* Filter the results based on values such as population density and per capita income of the state.
* Find the score per venue category using the Foursquare API data.
* Sort the data by the score of each city.
* Visualize the ranking of neighborhoods using the folium library (Python).
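The venue lookup in the steps above can be sketched as follows. This assumes the legacy Foursquare v2 `venues/explore` endpoint and its usual response layout (`response` → `groups` → `items` → `venue`); the helper names `build_explore_params` and `extract_venues`, the default radius, limit and version date, and the sample payload are hypothetical. No request is sent here; the call that would use real credentials is shown only in a comment.

```python
EXPLORE_URL = "https://api.foursquare.com/v2/venues/explore"


def build_explore_params(lat, lng, client_id, client_secret,
                         radius=500, limit=100, version="20180605"):
    """Build the query parameters for one explore request (assumed defaults)."""
    return {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,          # API version date, YYYYMMDD
        "ll": f"{lat},{lng}",  # latitude,longitude of the search centre
        "radius": radius,      # search radius in metres
        "limit": limit,        # maximum number of venues to return
    }


def extract_venues(payload):
    """Flatten an explore response into (name, category, lat, lng) tuples."""
    venues = []
    for group in payload.get("response", {}).get("groups", []):
        for item in group.get("items", []):
            venue = item["venue"]
            categories = venue.get("categories", [])
            category = categories[0]["name"] if categories else None
            location = venue["location"]
            venues.append((venue["name"], category, location["lat"], location["lng"]))
    return venues


params = build_explore_params(40.7178, -74.0431, "YOUR_ID", "YOUR_SECRET")
# With real credentials the request would look like this (not run here):
#   payload = requests.get(EXPLORE_URL, params=params).json()

# Made-up payload in the assumed response layout, for illustration only.
sample_payload = {
    "response": {"groups": [{"items": [{"venue": {
        "name": "Example Arcade",
        "categories": [{"name": "Arcade"}],
        "location": {"lat": 40.72, "lng": -74.04},
    }}]}]}
}
print(extract_venues(sample_payload))
```

The category names returned this way are specific ones (such as "Arcade"), so mapping them to the broader categories listed in the introduction would be an additional step.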

In [None]:
### Code

In [None]:
### Further code and maps
#Provided in the code section in week 5

#### Conclusion

Following the above approach, we found that the best location is in Jersey City.
It lies between Grove Street and Grand Street, so it should also have the best footfall and a large pool of potential customers.