# Problem Statement:

### “Where is a good place for a U.S. expansion of a French-style restaurant?” ###

# Initial Conversation/Background to the Problem Statement:

While I was living in France, a friend approached me about expanding their French restaurant into the U.S. market. Knowing that I am from the U.S., they asked for my help in finding a suitable location within the continental U.S. I gladly accepted.

After our initial interview, we decided to focus on the State of Washington because of its proximity to the Pacific Ocean (fresh seafood daily), its strong economy (Microsoft, Amazon, Google, etc.), and its large international community (according to the American Immigration Council, 1 in 7 Washington State residents is an immigrant, and 38% of those have a college degree or higher (1)). The question now is where within Washington my friend should open a new French restaurant. This is where the data science comes in.

To choose a location, we first decide which county to focus on, then narrow the focus to 2-3 cities within that county. After identifying the candidate cities, we call the Foursquare API to explore each location and one-hot encode the returned venue categories, which gives a better idea of where a new French-style restaurant may flourish.
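As a rough illustration of the Foursquare step, the sketch below builds the query parameters for one venue search and flattens a response into one row per venue. It assumes the legacy Foursquare v2 `venues/explore` endpoint and its `response -> groups -> items -> venue` layout, which may differ from the API version available today. The helper names (`build_explore_params`, `venues_to_rows`), the credentials, and the sample payload are all hypothetical.

```python
from typing import Any, Dict, List

# Assumption: legacy Foursquare v2 "explore" endpoint
EXPLORE_URL = "https://api.foursquare.com/v2/venues/explore"


def build_explore_params(lat: float, lng: float, client_id: str, client_secret: str,
                         version: str = "20180605", radius: int = 1000,
                         limit: int = 100) -> Dict[str, Any]:
    """Return the query parameters for one explore request around (lat, lng)."""
    return {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,            # API version date expected by the v2 endpoints
        "ll": f"{lat},{lng}",    # "latitude,longitude"
        "radius": radius,        # search radius in meters
        "limit": limit,          # maximum number of venues to return
    }


def venues_to_rows(city: str, payload: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten an explore response into one dictionary per venue."""
    rows = []
    for group in payload.get("response", {}).get("groups", []):
        for item in group.get("items", []):
            venue = item["venue"]
            categories = venue.get("categories", [])
            rows.append({
                "city": city,
                "venue": venue["name"],
                "category": categories[0]["name"] if categories else None,
                "lat": venue["location"]["lat"],
                "lng": venue["location"]["lng"],
            })
    return rows


# Hand-written payload that mimics the assumed response shape (not real data)
sample_payload = {
    "response": {"groups": [{"items": [
        {"venue": {"name": "Example Bistro",
                   "location": {"lat": 47.61, "lng": -122.33},
                   "categories": [{"name": "French Restaurant"}]}},
        {"venue": {"name": "Example Cafe",
                   "location": {"lat": 47.60, "lng": -122.34},
                   "categories": [{"name": "Café"}]}},
    ]}]}
}

rows = venues_to_rows("Seattle", sample_payload)
print(rows)
```

With real credentials, the payload would come from something like `requests.get(EXPLORE_URL, params=build_explore_params(...)).json()`, and the resulting rows can be loaded into a DataFrame with `pd.DataFrame(rows)`.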



# Data Included/Steps for Problem Resolution:

- Folium map of Washington State
- Analyze a table of criteria to decide which county in Washington State to focus on
- Create a Folium map of the chosen county within Washington State
- Analyze a table of criteria to identify 2-3 cities of interest to explore on Foursquare
- Create a graph or chart to help visually identify qualifying cities
- Generate Foursquare API calls for each identified city and examine local venues
- Rank the cities based on one-hot encoding and the Foursquare results
- Discuss the results with the friend/customer and examine next steps
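The one-hot encoding and ranking steps listed above could look roughly like the sketch below. The city names, venue categories, and the ranking criterion (fewest existing French restaurants first) are placeholders chosen for illustration, not results from this project.

```python
import pandas as pd

# Made-up venue rows; real rows would come from the Foursquare venue search
venues = pd.DataFrame({
    "city": ["City A", "City A", "City A", "City A",
             "City B", "City B", "City B", "City B"],
    "category": ["Café", "French Restaurant", "Bakery", "Café",
                 "Café", "Pizza Place", "Pizza Place", "Bar"],
})

# One-hot encode the category column: one 0/1 column per category
onehot = pd.get_dummies(venues["category"], dtype=int)
onehot.insert(0, "city", venues["city"])

# The mean of each 0/1 column per city is the share of that city's venues
# that fall into the category
category_share = onehot.groupby("city").mean()

# Hypothetical criterion: cities with the smallest share of existing
# French restaurants come first
ranking = category_share["French Restaurant"].sort_values()
print(ranking)
```

Whether a low or a high share of existing French restaurants counts in a city's favor is a business judgment; reversing the sort order with `ascending=False` covers the opposite view.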





# Data Section

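In [11]:
# Install the third-party packages used below (notebook shell commands)
!conda install -c conda-forge geopy --yes
!conda install -c conda-forge folium=0.5.0 --yes

import json  # parse JSON payloads

import requests  # HTTP requests to web APIs
import numpy as np
import pandas as pd
from pandas import json_normalize  # flatten nested JSON into a DataFrame (pandas >= 1.0)
from bs4 import BeautifulSoup  # HTML parsing

import matplotlib.cm as cm  # color maps for plots
import matplotlib.colors as colors

from sklearn.cluster import KMeans  # k-means clustering
from geopy.geocoders import Nominatim  # convert a place name into latitude/longitude
import folium  # interactive maps

# Show every row and column when displaying DataFrames
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

print('Libraries imported')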

Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... done

# All requested packages already installed.

Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... done

# All requested packages already installed.

Libraries imported


In [13]:
url2 = 'https://data.cityofnewyork.us/resource/833y-fsy8.json'

In [14]:
NY_Crime_df = pd.read_json(url2)
#results = requests.get(url2).json()
#results

In [15]:
NY_Crime_df.head()

Unnamed: 0,incident_key,occur_date,occur_time,boro,precinct,jurisdiction_code,statistical_murder_flag,vic_age_group,vic_sex,vic_race,x_coord_cd,y_coord_cd,latitude,longitude,geocoded_column,:@computed_region_efsh_h5xi,:@computed_region_f5dn_yrer,:@computed_region_yeji_bk3q,:@computed_region_92fq_4b7q,:@computed_region_sbqj_enih,perp_age_group,perp_sex,perp_race,location_desc
0,201575314,2019-08-23T00:00:00.000,2021-02-07 22:10:00,QUEENS,103,0.0,False,25-44,M,BLACK,1037451,193561,40.697805,-73.808141,"{'type': 'Point', 'coordinates': [-73.80814071...",24670.0,41,3,6,61,,,,
1,205748546,2019-11-27T00:00:00.000,2021-02-07 15:54:00,BRONX,40,0.0,False,25-44,F,BLACK,1006789,237559,40.8187,-73.918571,"{'type': 'Point', 'coordinates': [-73.91857061...",10929.0,49,5,43,23,<18,M,BLACK,
2,193118596,2019-02-02T00:00:00.000,2021-02-07 19:40:00,MANHATTAN,23,0.0,False,18-24,M,BLACK HISPANIC,999347,227795,40.791916,-73.94548,"{'type': 'Point', 'coordinates': [-73.94547965...",12426.0,7,4,35,14,18-24,M,WHITE HISPANIC,
3,204192600,2019-10-24T00:00:00.000,2021-02-07 00:52:00,STATEN ISLAND,121,0.0,True,25-44,F,BLACK,938149,171781,40.638064,-74.166108,"{'type': 'Point', 'coordinates': [-74.16610830...",10371.0,4,1,13,75,25-44,M,BLACK,PVT HOUSE
4,201483468,2019-08-22T00:00:00.000,2021-02-07 18:03:00,BRONX,46,0.0,False,18-24,M,BLACK,1008224,250621,40.854547,-73.913339,"{'type': 'Point', 'coordinates': [-73.91333944...",10931.0,6,5,29,29,25-44,M,BLACK HISPANIC,


In [16]:
# UN World Population Prospects 2019: total population by sex (CSV)
url = 'https://population.un.org/wpp/Download/Files/1_Indicators%20(Standard)/CSV_FILES/WPP2019_TotalPopulationBySex.csv'

world_pop_df = pd.read_csv(url, header=0)

In [17]:
world_pop_df.shape

(280932, 10)

In [19]:
# Seattle crime statistics by 1990 census tract, 1996-2007 (local download)
SeaCrime_df = pd.read_csv(r'C:\Users\keg\Downloads\seattle-crime-stats-by-1990-census-tract-1996-2007.csv')
SeaCrime_df.head()


Unnamed: 0,Report_Year,Census_Tract_1990,Crime_Type,Report_Year_Total
0,1996,1.0,Aggravated Assault,11
1,1996,1.0,Homicide,0
2,1996,1.0,NonResidential Burglary,41
3,1996,1.0,Property Crimes Total,430
4,1996,1.0,Rape,2
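The table above is in long format, with one row per tract, year, and crime type. To compare areas side by side, one option is to pivot it so that each crime type becomes a column. The sketch below rebuilds the five rows shown above and pivots them; applying the same call to the full `SeaCrime_df` is an assumption about how the data might be used, not a step taken in this notebook.

```python
import pandas as pd

# The five rows shown in the output above (census tract 1.0, year 1996)
sample = pd.DataFrame({
    "Report_Year": [1996] * 5,
    "Census_Tract_1990": [1.0] * 5,
    "Crime_Type": ["Aggravated Assault", "Homicide", "NonResidential Burglary",
                   "Property Crimes Total", "Rape"],
    "Report_Year_Total": [11, 0, 41, 430, 2],
})

# One row per (tract, year), one column per crime type
wide = sample.pivot_table(
    index=["Census_Tract_1990", "Report_Year"],
    columns="Crime_Type",
    values="Report_Year_Total",
    aggfunc="sum",
)
print(wide)
```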


In [17]:
# NYPD complaint data, current year to date (local download)
NYCrime_df = pd.read_csv(r'C:\Users\keg\Downloads\NYPD_Complaint_Data_Current__Year_To_Date_.csv')

NYCrime_df.head()

Unnamed: 0,CMPLNT_NUM,ADDR_PCT_CD,BORO_NM,CMPLNT_FR_DT,CMPLNT_FR_TM,CMPLNT_TO_DT,CMPLNT_TO_TM,CRM_ATPT_CPTD_CD,HADEVELOPT,HOUSING_PSA,JURISDICTION_CODE,JURIS_DESC,KY_CD,LAW_CAT_CD,LOC_OF_OCCUR_DESC,OFNS_DESC,PARKS_NM,PATROL_BORO,PD_CD,PD_DESC,PREM_TYP_DESC,RPT_DT,STATION_NAME,SUSP_AGE_GROUP,SUSP_RACE,SUSP_SEX,TRANSIT_DISTRICT,VIC_AGE_GROUP,VIC_RACE,VIC_SEX,X_COORD_CD,Y_COORD_CD,Latitude,Longitude,Lat_Lon,New Georeferenced Column
0,885776788,66,,12/23/2020,19:50:00,,,COMPLETED,,,,N.Y. POLICE DEPT,101,FELONY,OUTSIDE,MURDER & NON-NEGL. MANSLAUGHTER,,,,,,12/23/2020,,,,,,18-24,BLACK,M,986633,167258,40.625769,-73.991417,"(40.62576896100006, -73.99141682199996)",POINT (-73.99141682199996 40.62576896100006)
1,350637195,77,,12/21/2020,01:10:00,,,COMPLETED,,,,N.Y. POLICE DEPT,101,FELONY,INSIDE,MURDER & NON-NEGL. MANSLAUGHTER,,,,,,12/21/2020,,,,,,25-44,BLACK,M,1003606,185050,40.674583,-73.930222,"(40.67458330800008, -73.93022154099998)",POINT (-73.93022154099998 40.67458330800008)
2,347843168,43,BRONX,11/22/2020,22:00:00,,,COMPLETED,,,0.0,N.Y. POLICE DEPT,104,FELONY,,RAPE,,PATROL BORO BRONX,157.0,RAPE 1,STREET,11/23/2020,,UNKNOWN,UNKNOWN,U,,25-44,BLACK,F,1020316,239179,40.823101,-73.86969,"(40.82310129900002, -73.86969046099993)",POINT (-73.86969046099993 40.82310129900002)
3,197941396,47,,11/22/2020,09:50:00,,,COMPLETED,,,,N.Y. POLICE DEPT,101,FELONY,INSIDE,MURDER & NON-NEGL. MANSLAUGHTER,,,,,,11/22/2020,,25-44,BLACK,M,,25-44,BLACK,F,1026387,262634,40.887451,-73.847608,"(40.88745131300004, -73.84760778699997)",POINT (-73.84760778699997 40.88745131300004)
4,298404927,25,,11/21/2020,15:38:00,,,COMPLETED,,,,N.Y. HOUSING POLICE,101,FELONY,OUTSIDE,MURDER & NON-NEGL. MANSLAUGHTER,,,,,,11/21/2020,,,,,,18-24,BLACK HISPANIC,M,1003396,230824,40.800222,-73.930848,"(40.80022202900005, -73.93084834199995)",POINT (-73.93084834199995 40.80022202900005)


In [6]:
!conda install -c conda-forge folium=0.5.0 --yes
import folium

print('Folium installed and imported!')

Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... done

# All requested packages already installed.

Folium installed and imported!


In [20]:
# Coordinates used to center the map on the United States
US_latitude = 36.52
US_longitude = -96.63

# Create a map centered on the U.S. at a country-level zoom
US_map = folium.Map(location=[US_latitude, US_longitude], zoom_start=4)

# Display the map
US_map

In [10]:
# Coordinates used to center the map on Washington State
Wash_latitude = 47.7511
Wash_longitude = -120.7401

# Create a map centered on Washington State at a state-level zoom
Wash_map = folium.Map(location=[Wash_latitude, Wash_longitude], zoom_start=6)

# Display the map
Wash_map

In [18]:
NYCrime_df.columns

Index(['CMPLNT_NUM', 'ADDR_PCT_CD', 'BORO_NM', 'CMPLNT_FR_DT', 'CMPLNT_FR_TM',
       'CMPLNT_TO_DT', 'CMPLNT_TO_TM', 'CRM_ATPT_CPTD_CD', 'HADEVELOPT',
       'HOUSING_PSA', 'JURISDICTION_CODE', 'JURIS_DESC', 'KY_CD', 'LAW_CAT_CD',
       'LOC_OF_OCCUR_DESC', 'OFNS_DESC', 'PARKS_NM', 'PATROL_BORO', 'PD_CD',
       'PD_DESC', 'PREM_TYP_DESC', 'RPT_DT', 'STATION_NAME', 'SUSP_AGE_GROUP',
       'SUSP_RACE', 'SUSP_SEX', 'TRANSIT_DISTRICT', 'VIC_AGE_GROUP',
       'VIC_RACE', 'VIC_SEX', 'X_COORD_CD', 'Y_COORD_CD', 'Latitude',
       'Longitude', 'Lat_Lon', 'New Georeferenced Column'],
      dtype='object')

In [None]:
# Data source (local file): C:\Users\keg\Desktop\Data Visualization with Python Final\police data for SF.csv

In [None]:
# Data source (local file): C:\Users\keg\Downloads\NYPD_Complaint_Data_Current__Year_To_Date_.csv

In [None]:
# Data source (URL): https://population.un.org/wpp/Download/Files/1_Indicators%20(Standard)/CSV_FILES/WPP2019_TotalPopulationBySex.csv

In [32]:
weather = 'http://api.openweathermap.org/data/2.5/weather?q=Seattle,us&APPID=df2a929bc331ae09bbc8ba86b7300ff3'
weather

'http://api.openweathermap.org/data/2.5/weather?q=Seattle,us&APPID=df2a929bc331ae09bbc8ba86b7300ff3'

In [36]:
Chi_weather ='https://openweathermap.org/weathermap?basemap=map&cities=true&layer=temperature&lat=41.85&lon=-87.65&zoom=12' 

#'http://api.openweathermap.org/data/2.5/weather?q=Chicago,us&APPID=df2a929bc331ae09bbc8ba86b7300ff3'
Chi_weather

'https://openweathermap.org/weathermap?basemap=map&cities=true&layer=temperature&lat=41.85&lon=-87.65&zoom=12'

In [38]:
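# Chi_weather points at the interactive weather-map web page, which returns HTML
# rather than JSON, so keep the raw response instead of calling .json() on it
results = requests.get(Chi_weather)
results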

<Response [200]>

In [33]:
results = requests.get(weather).json()
results



{'coord': {'lon': -122.3321, 'lat': 47.6062},
 'weather': [{'id': 804,
   'main': 'Clouds',
   'description': 'overcast clouds',
   'icon': '04n'}],
 'base': 'stations',
 'main': {'temp': 280.55,
  'feels_like': 277.34,
  'temp_min': 279.82,
  'temp_max': 281.48,
  'pressure': 1011,
  'humidity': 87},
 'visibility': 10000,
 'wind': {'speed': 3.09, 'deg': 170},
 'clouds': {'all': 90},
 'dt': 1612102339,
 'sys': {'type': 1,
  'id': 3417,
  'country': 'US',
  'sunrise': 1612107404,
  'sunset': 1612141729},
 'timezone': -28800,
 'id': 5809844,
 'name': 'Seattle',
 'cod': 200}
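The response above reports temperatures in Kelvin (the API default) and times as Unix timestamps, with `timezone` giving the offset from UTC in seconds. A small sketch of turning those fields into readable values follows; the helper names are illustrative, and the dictionary is an excerpt of the response shown above.

```python
from datetime import datetime, timedelta, timezone

# Excerpt of the response shown above
sample = {
    "main": {"temp": 280.55, "feels_like": 277.34},
    "sys": {"sunrise": 1612107404, "sunset": 1612141729},
    "timezone": -28800,
    "name": "Seattle",
}


def kelvin_to_celsius(kelvin: float) -> float:
    """Convert a Kelvin temperature to degrees Celsius, rounded to 2 decimals."""
    return round(kelvin - 273.15, 2)


def local_time(unix_seconds: int, utc_offset_seconds: int) -> datetime:
    """Convert a Unix timestamp to a datetime in the location's own UTC offset."""
    offset = timezone(timedelta(seconds=utc_offset_seconds))
    return datetime.fromtimestamp(unix_seconds, tz=offset)


temp_c = kelvin_to_celsius(sample["main"]["temp"])
sunrise = local_time(sample["sys"]["sunrise"], sample["timezone"])
print(f"{sample['name']}: {temp_c} °C, sunrise at {sunrise:%H:%M}")
```

Alternatively, adding `units=metric` to the request's query string asks the API to return Celsius directly.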

In [32]:
# Load the Excel workbook of unemployment statistics by county (December 2020)
KChp_df = pd.read_excel(r'C:\Users\keg\AppData\Local\Temp\Unemployment Statistics by County -December 2020.xlsx')

KChp_df.head()

Unnamed: 0,Washington state,Unnamed: 1,Unnamed: 2,Benchmark: March 2020,Unnamed: 4
0,Employment Security Department,,,,
1,Labor Market and Economic Analysis,,,,
2,"Date: January 26, 2021",,,,
3,2020-12-01 00:00:00,,,,
4,Washington state resident civilian labor force...,,,,
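The output above shows that the sheet starts with several title rows before the actual table, so the default header detection picks up the wrong line. One possible approach is to read the sheet without a header and then promote the first row that matches a known label. In the sketch below, the `"County"` label, the column names, and the data row are placeholders, since the real header row is not visible in the output; `promote_header` is a hypothetical helper.

```python
import pandas as pd

# Title rows copied from the output above, followed by a made-up header and
# data row ("County" and the numbers are placeholders, not the real sheet)
raw = pd.DataFrame([
    ["Employment Security Department", None, None],
    ["Labor Market and Economic Analysis", None, None],
    ["Date: January 26, 2021", None, None],
    ["County", "Labor force", "Unemployment rate"],
    ["Example County", 1000, 5.0],
])


def promote_header(frame: pd.DataFrame, marker: str) -> pd.DataFrame:
    """Use the first row whose first cell equals `marker` as the column header.

    Assumes `frame` has the default 0..n-1 row index, as produced by reading
    a sheet with header=None.
    """
    matches = (frame.iloc[:, 0] == marker).to_numpy()
    header_idx = int(frame.index[matches][0])
    table = frame.iloc[header_idx + 1:].reset_index(drop=True)
    table.columns = frame.iloc[header_idx].tolist()
    return table


table = promote_header(raw, "County")
print(table)
```

For the real workbook, `raw` would come from `pd.read_excel(path, header=None)`, with the marker set to whatever label the sheet actually uses for its first column.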


In [None]:
# Weather map tile URL template: http://maps.openweathermap.org/maps/2.0/weather/PA0/10/{x}/{y}?date=1596198600&appid=a6041ed2fa4cecacc2b0e658aab92588

In [None]:
# Interactive weather map (web page): https://openweathermap.org/weathermap?basemap=map&cities=true&layer=temperature&lat=41.85&lon=-87.65&zoom=12