<h1 align='center'><span style='color:green ; font-size:35px ; font-weight:bold'>Restaurant Location
Recommendation System for DFW Neighborhoods</span></h1>

<a  href="https://www.google.com/maps/place/Dallas-Fort+Worth+Metropolitan+Area,+TX/@32.7634165,-97.1659589,11.25z"><img src = "https://images.squarespace-cdn.com/content/v1/52e2fba3e4b0b88a2745baef/1480618532927-MF605OK6IKH8P3QCYBJ0/ke17ZwdGBToddI8pDm48kOvFTVfKxpIXwE_q_BwqyfNZw-zPPgdn4jUwVcJE1ZvWEtT5uBSRWt4vQZAgTJucoTqqXjS3CfNDSuuf31e0tVGctfxHY7goLxYWk6cM_FjVqqhxGsDjAJeWa6MlEDD7tRur-lC0WofN0YB1wFg-ZW0/Dallas+Fort+Worth+Area+Landscape+North+Texas+Services+Map?format=500w" width = 300 align="middle"> </a>

## Introduction

This model helps identify a promising location for opening a restaurant in the Dallas-Fort Worth (DFW) Metroplex.
The current version looks for locations that have an established Asian community and little or no competition,
with the goal of opening an Indo-Chinese fusion restaurant that leans toward Indian flavors.

The model is aimed at people who plan to open an Indian or Asian restaurant.
With minor changes, the same model can be used to identify a location for a restaurant serving a different cuisine.

## Table of Contents

### [Section-1 : Creating Dataframe with DFW Area Cities / Locations with Zip Codes](#item1)

1. <a href="#item1">Importing the Libraries</a>
2. <a href="#item2">DFW Area Data set with Cities and Zip Codes</a> 
3. <a href="#item3">Getting Latitude, Longitude and Community Information</a> 
4. <a href="#item4">Saving the output to a file For Backup</a> 
<p></p>
<p></p>
<p></p>


### [Section-2 : Displaying the Location Data on the Maps](#item5)

5. <a href="#item5">Display DFW Map with Location Data</a>
6. <a href="#item6">Creating and Displaying Indian Restaurant Venues in DFW area</a> 
7. <a href="#item7">Creating and Displaying Asian Restaurant Venues in DFW area</a> 
8. <a href="#item8">Creating and Displaying Shopping Venues in DFW area</a> 
9. <a href="#item9">Creating and Displaying Residential Complex Venues in DFW area</a> 
10. <a href="#item10">Creating and Displaying Office / Hotel / University Venues in DFW area</a> 
<p></p>
<p></p>
<p></p>


### [Section-3 : Evaluating the Score, then Identifying and Displaying the Location](#item11)
11. <a href="#item11">Measure the Venue count for each location</a> 
12. <a href="#item12">Measure the weight score for each location</a> 
13. <a href="#item13">Display Location with highest weight score</a> 
14. <a href="#item14">Conclusion</a> 
<p></p>
<p></p>
<p></p>
<p></p>

## Section-1 : Creating Dataframe with DFW Area Cities / Locations with Zip Codes

<a id="item1"></a>

### 1. Importing the Libraries

In [1]:
import pandas as pd
import requests
from bs4 import BeautifulSoup

import numpy as np # library to handle data in a vectorized manner
import json

pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

#!conda install -c conda-forge geopy --yes
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if folium is not installed yet
import folium # map rendering library

print('Libraries imported.')

Libraries imported.


<a id="item2"></a>

### 2. DFW Area Data set with Cities and Zip Codes

In [2]:
# Web link variable initialization
web_link_tx = 'https://www.zipdatamaps.com/list-of-zip-codes-in-texas.php'
web_link_zip = 'https://www.zipdatamaps.com/'

In [3]:
data_tx_zipcodes = pd.read_html(web_link_tx)

# Create DataFrame with all Zip Codes available in State of Texas
df_tx_zip = data_tx_zipcodes[0]

In [4]:
df_tx_zip.head()

Unnamed: 0,Zip Code,Zip Code Type,Zip Code Name,County
0,73301,Unique,Austin,Travis
1,73344,Unique,Austin,Travis
2,73960,,Texhoma,Sherman
3,75001,Non-Unique,Addison,Dallas
4,75002,Non-Unique,Allen,Collin


In [5]:
df_tx_zip.columns = ['ZipCode','ZipCodeType','City','County']

In [7]:
df_tx_zip.head()

Unnamed: 0,ZipCode,ZipCodeType,City,County
0,73301,Unique,Austin,Travis
1,73344,Unique,Austin,Travis
2,73960,,Texhoma,Sherman
3,75001,Non-Unique,Addison,Dallas
4,75002,Non-Unique,Allen,Collin


In [8]:
#Create a list with all the counties available in DFW area.
dfw_counties = ['Collin', 'Dallas', 'Denton', 'Ellis', 'Hood', 'Hunt', 'Johnson', 'Kaufman', 'Rockwall', 'Somervell', 'Parker', 'Tarrant', 'Wise']

In [9]:
# Create a DataFrame with all the Cities and ZipCodes available in DFW Counties
df_dfw_zip_all = df_tx_zip[df_tx_zip['County'].isin(dfw_counties) ]

In [10]:
df_dfw_zip_all.count()

ZipCode        406
ZipCodeType    406
City           406
County         406
dtype: int64

In [11]:
df_dfw_zip_all['ZipCodeType'].unique()

array(['Non-Unique', 'PO Box', 'Unique'], dtype=object)

In [12]:
# Filter Data to exclude zip codes related to 'PO Box'
df_dfw_zip = df_dfw_zip_all[df_dfw_zip_all['ZipCodeType'].isin(['Non-Unique', 'Unique']) ]
df_dfw_zip.count()

ZipCode        309
ZipCodeType    309
City           309
County         309
dtype: int64

In [13]:
df_dfw_zip = df_dfw_zip[['ZipCode','City','County']]
df_dfw_zip.reset_index(drop=True,inplace=True)
df_dfw_zip.head()

Unnamed: 0,ZipCode,City,County
0,75001,Addison,Dallas
1,75002,Allen,Collin
2,75006,Carrollton,Dallas
3,75007,Carrollton,Denton
4,75009,Celina,Collin


<a id="item3"></a>

### 3. Getting Latitude, Longitude and Community Information

In [14]:
web_link_zip = 'https://www.zipdatamaps.com/'
def get_lat_lon_population(zip_code):
    # Fetch the zipdatamaps.com page for one ZIP code and return its coordinates and population figures.
    # str() allows the ZIP code to be passed as either an integer or a string.
    wl = web_link_zip + str(zip_code)
    wl_results = pd.read_html(wl)
    # Look up rows by their label in the first column of each table on the page
    z_TP = list(wl_results[0][wl_results[0].iloc[:,0]=='Current Population:'].values[0])
    z_ll = list(wl_results[0][wl_results[0].iloc[:,0]=='Coordinates(Y,X)'].values[0])
    z_asian = list(wl_results[1][wl_results[1].iloc[:,0]=='Asian'].values[0])
    z_asian_ps = list(wl_results[2][wl_results[2].iloc[:,0]=='Asian'].values[0])
    # Coordinates are stored as a single "latitude,longitude" string
    lat = float(z_ll[1].split(',')[0])
    lon = float(z_ll[1].split(',')[1])
    asian_pop = float(z_asian[1])
    asian_ps = z_asian_ps[1].strip('%')
    # Use the published percentage when it is numeric; otherwise fall back to a nominal 0.01%
    if asian_ps.replace('.','').replace(',','').isdigit():
        asian_pct_ps = float(asian_ps)
    else:
        asian_pct_ps = 0.01
    total_pop = float(z_TP[1])
    # If the Asian head count is missing or zero, estimate it from the percentage
    if np.isnan(asian_pop) or asian_pop==0:
        asian_pop = (total_pop*asian_pct_ps)//100
        asian_pct = asian_pct_ps
    else:
        asian_pct = (asian_pop/total_pop)*100
        
    r_data = {'ZipCode':zip_code,
              'Total_Population': total_pop,
              'Asian_Population': asian_pop,
              'AsianPop_Percent': asian_pct,
              'Latitude' : lat,
              'Longitude': lon   }
    return r_data
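
The fallback logic inside this function can be exercised without any web access. The sketch below is a simplified stand-in, not the notebook's function: `resolve_asian_population` is a hypothetical helper name, and the percentage strings passed to it are made-up inputs. The first call uses the Addison (75001) population figures reported in this notebook.

```python
import math

def resolve_asian_population(total_pop, asian_pop, asian_pct_text):
    """Return (asian_pop, asian_pct), estimating the count when it is missing.

    Hypothetical helper that mirrors the fallback rules of the scraping
    function: a non-numeric percentage becomes a nominal 0.01, and a missing
    or zero head count is estimated from the percentage.
    """
    pct_clean = asian_pct_text.strip('%').replace(',', '')
    if pct_clean.replace('.', '', 1).isdigit():
        pct_from_page = float(pct_clean)
    else:
        pct_from_page = 0.01  # nominal value when the page shows no number

    if math.isnan(asian_pop) or asian_pop == 0:
        # Head count unavailable: derive it from the published percentage
        asian_pop = (total_pop * pct_from_page) // 100
        asian_pct = pct_from_page
    else:
        # Head count available: recompute the percentage from the counts
        asian_pct = (asian_pop / total_pop) * 100
    return asian_pop, asian_pct

# Count present (Addison figures): the percentage is recomputed from the counts
addison_pop, addison_pct = resolve_asian_population(12414.0, 1284.0, '10.3%')

# Count missing: the count is estimated from the percentage string
est_pop, est_pct = resolve_asian_population(20000.0, float('nan'), '5.0%')

# Percentage not numeric: the nominal 0.01 is used
_, nominal_pct = resolve_asian_population(20000.0, 0.0, '-')
```

Keeping these rules in a small pure function makes them easy to check separately from the scraping code, which depends on the live page layout.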

In [15]:
col_names = ['ZipCode', 'Total_Population', 'Asian_Population', 'AsianPop_Percent', 'Latitude', 'Longitude']
# Collect one dict per ZIP code, then build the DataFrame in a single step
# (DataFrame.append was removed in pandas 2.0)
ll_rows = [get_lat_lon_population(zcode) for zcode in df_dfw_zip['ZipCode']]
df_dfw_ll_pop = pd.DataFrame(ll_rows, columns=col_names)
print('Getting Data Is Completed')

Getting Data Is Completed


In [16]:
df_dfw_ll_pop.head(3)

Unnamed: 0,ZipCode,Total_Population,Asian_Population,AsianPop_Percent,Latitude,Longitude
0,75001,12414.0,1284.0,10.343161,32.960049,-96.838417
1,75002,63140.0,5616.0,8.89452,33.091141,-96.606972
2,75006,46364.0,3315.0,7.149944,32.953411,-96.901871


In [17]:
df_dfw_all_data = pd.merge(df_dfw_zip, df_dfw_ll_pop , on='ZipCode' , how='left' )

In [18]:
# Create a DataFrame of locations where the Asian population is above 1,000 and above 3.5% of the total population.
df_dfw_asian = df_dfw_all_data[(df_dfw_all_data['Asian_Population']>1000) & (df_dfw_all_data['AsianPop_Percent']>3.5)]

In [19]:
df_dfw_asian.head()

Unnamed: 0,ZipCode,City,County,Total_Population,Asian_Population,AsianPop_Percent,Latitude,Longitude
0,75001,Addison,Dallas,12414.0,1284.0,10.343161,32.960049,-96.838417
1,75002,Allen,Collin,63140.0,5616.0,8.89452,33.091141,-96.606972
2,75006,Carrollton,Dallas,46364.0,3315.0,7.149944,32.953411,-96.901871
3,75007,Carrollton,Denton,51624.0,7822.0,15.151867,33.00996,-96.896088
5,75010,Carrollton,Denton,21607.0,6081.0,28.143657,33.054829,-96.871742


<a id="item4"></a>

### 4. Saving the output to a file For Backup

In [20]:
# Saving the Data to a File For BackUp.
df_dfw_asian.to_csv('dfw_asian_data.csv', index=False)

In [21]:
# Reading Data from Backup file
df_dfw_asian = pd.read_csv('dfw_asian_data.csv')

In [22]:
df_dfw_asian.count()

ZipCode             88
City                88
County              88
Total_Population    88
Asian_Population    88
AsianPop_Percent    88
Latitude            88
Longitude           88
dtype: int64
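
One side effect of the CSV round trip is worth keeping in mind: `pd.read_csv` infers column types, so ZIP codes come back as integers. That is consistent with `displayIRLoc(76262)` being called with an integer later in this notebook, and with `getNearbyVenues` converting `ZipCode` with `apply(str)` before comparing it to the venue's postal code. The sketch below illustrates the behavior with a two-row in-memory CSV (an assumption for illustration, not the real backup file).

```python
import io
import pandas as pd

# Two rows taken from the location table, held in memory instead of a file
csv_text = "ZipCode,City\n75001,Addison\n75002,Allen\n"

# Default behavior: the ZipCode column is inferred as an integer type
df_default = pd.read_csv(io.StringIO(csv_text))

# Optional alternative: force ZIP codes to stay as text
df_as_text = pd.read_csv(io.StringIO(csv_text), dtype={'ZipCode': str})

zip_is_integer = pd.api.types.is_integer_dtype(df_default['ZipCode'])
text_zip_codes = df_as_text['ZipCode'].tolist()
```

Either representation works as long as it is used consistently in every comparison and merge that involves `ZipCode`.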

## Section-2 : Displaying the Location Data on the Maps

<a id="item5"></a>

### 5. Display DFW Map with Location Data

In [23]:
# Initialize the center coordinates for the DFW area
address = 'dfw , TX, USA'

geolocator = Nominatim(user_agent="capstoneProject")
location = geolocator.geocode(address, timeout=60, exactly_one=True)
latitude = location.latitude
longitude = location.longitude
print('The decimal coordinates of DFW area are {}, {}.'.format(latitude, longitude))

The decimal coordinates of DFW area are 32.89651945, -97.0465220537124.


In [24]:
# Create a DFW map with a marker for every city / ZIP code location, labeled "City:Asian population"
map_dfw_cities = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, local, count in zip(df_dfw_asian['Latitude'], df_dfw_asian['Longitude'], df_dfw_asian['City'], df_dfw_asian['Asian_Population']):
    label = '{}:{}'.format(local, int(count))
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_dfw_cities)  
    
map_dfw_cities

In [25]:
# Function to search for venues near the latitude and longitude associated with each ZIP code.
def getNearbyVenues(names, zip_codes ,latitudes, longitudes, radius=10000,categoryType = '', categoryIds='', LIMIT=500,dLimit=1001):
    column_names = ['City',
                    'ZipCode',
                  'Latitude', 
                  'Longitude',
                  'Category',
                  'Venue',
                  'VenueId',
                  'VenueLatitude', 
                  'VenueLongitude',
                  'VenueZipCode',
                  'VenueDistance',
                  'VenueCategory',
                  'CategoryId']
    try:
        venues_list=[]
        for name,zcode, lat, lng in zip(names,zip_codes, latitudes, longitudes):
            #print(name, zcode, categoryIds)

            # create the API request URL
            url = 'https://api.foursquare.com/v2/venues/search?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, VERSION, lat, lng, radius, LIMIT)
            if (categoryIds != ''):
                url = url + '&categoryId={}'
                url = url.format(categoryIds)
            
            #print(url)
            cType = categoryType
            #print(cat)

            # make the GET request
            response = requests.get(url).json()
            results = response["response"]['venues']
            #print("result_count :", len(results))

            # return only relevant information for each nearby venue
            for v in results:
                success = False
                try:
                    category = v['categories'][0]['name']
                    success = True
                except (KeyError, IndexError):
                    pass

                if success:
                    venues_list.append([(
                        name,
                        zcode,
                        lat, 
                        lng,
                        cType,
                        v['name'], 
                        v['id'],
                        v['location']['lat'], 
                        v['location']['lng'],
                        v['location'].get('postalCode',0),
                        v['location'].get('distance',0),
                        v['categories'][0]['name'],
                        v['categories'][0]['id']
                    )])
        nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
        nearby_venues.columns = column_names
        final_venues = nearby_venues[(nearby_venues['VenueZipCode']==nearby_venues['ZipCode'].apply(str)) | (nearby_venues['VenueDistance']<dLimit)]
    
    except Exception:
        # On any failure (network error, unexpected response format), return an empty frame
        final_venues = pd.DataFrame( columns=column_names )

    return(final_venues)
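
The request URL above is assembled with string formatting. As an alternative sketch, the same query string can be built with `urllib.parse.urlencode` from the standard library, which takes care of escaping. The helper name `build_search_url` and the placeholder credentials are assumptions for illustration; the endpoint and parameter names are the ones used in `getNearbyVenues`.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

SEARCH_ENDPOINT = 'https://api.foursquare.com/v2/venues/search'

def build_search_url(client_id, client_secret, version, lat, lng,
                     radius, limit, category_id=''):
    """Build a venue-search URL (hypothetical helper, not from the notebook)."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'v': version,
        'll': '{},{}'.format(lat, lng),
        'radius': radius,
        'limit': limit,
    }
    # The category filter is optional, as in getNearbyVenues
    if category_id:
        params['categoryId'] = category_id
    return SEARCH_ENDPOINT + '?' + urlencode(params)

# Placeholder credentials; the coordinates are those listed for Addison (75001)
example_url = build_search_url(
    'YOUR_CLIENT_ID', 'YOUR_CLIENT_SECRET', '20200420',
    32.960049, -96.838417, radius=10000, limit=500,
    category_id='4bf58dd8d48988d10f941735')

# Parse the URL back to confirm the parameters survived the encoding
parsed = parse_qs(urlsplit(example_url).query)
```

`requests.get` also accepts a `params` dictionary directly, so the same `params` mapping could be passed to it instead of building the URL by hand.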

In [26]:
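# Initialize the Foursquare API settings
import os

limit = 500 # limit of number of venues returned by Foursquare API
radius = 10000 # search radius in meters
# Read the credentials from the environment instead of hard-coding them in the notebook.
# The environment variable names below are assumed; use whatever names your setup defines.
CLIENT_ID = os.environ['FOURSQUARE_CLIENT_ID']
CLIENT_SECRET = os.environ['FOURSQUARE_CLIENT_SECRET']
VERSION = '20200420'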

In [27]:
# Function to call getNearbyVenues() for each subcategory listed in a venue category.
def getVenueCategory(City, zip_code ,latitude, longitude, radius=2000,categoryType='', clist=[], dLimit=1001):
    #print(City,zip_code, latitude, longitude)
    if len(clist)>0:
        venue_data_list = []
        for cid in clist:
            get_venue_data = getNearbyVenues(names=City,zip_codes=zip_code, latitudes=latitude, longitudes=longitude,categoryType=categoryType, categoryIds=cid ,radius=radius, dLimit=dLimit)
            venue_data_list.append(get_venue_data)
        venue_data = pd.concat(venue_data_list)
        
    else:
        venue_data = getNearbyVenues(names=City,zip_codes=zip_code, latitudes=latitude, longitudes=longitude,categoryType=categoryType, radius=radius, dLimit=dLimit)
        
    venue_data.reset_index(drop=True, inplace=True)
    return(venue_data)

In [28]:
# Defining the Required Categories
# 1. Indian Restaurants : Indian / South Indian / Indian Chinese Restaurants
Indian_Restaurants = ['4bf58dd8d48988d10f941735', '54135bf5e4b08f3d2429dfde', '54135bf5e4b08f3d2429dfdf']

# 2. Asian Restaurants : Korean / Chinese / Thai / Asian / Japanese Restaurants
Asian_Restaurants = ['4bf58dd8d48988d113941735','4bf58dd8d48988d145941735','4bf58dd8d48988d149941735','4bf58dd8d48988d142941735','4bf58dd8d48988d111941735' ]

# 3. Shopping Places : Grocery / Market / Department Store / Supermarket / Warehouse / Shopping Mall
shopping = ['4bf58dd8d48988d118951735', '50be8ee891d4fa8dcc7199a7', '4bf58dd8d48988d1f6941735', '52f2ab2ebcbc57f1066b8b46', '52e816a6bcbc57f1066b7a54', '4bf58dd8d48988d1fd941735']

# 4. Residential Complex : Residential Building (Apartment / Condo)
Residential = ['4d954b06a243a5684965b473']

# 5. Office_Hotel Venues : Office / Hotel / University
office_hotel = ['4bf58dd8d48988d124941735', '4bf58dd8d48988d1fa931735', '4bf58dd8d48988d1ae941735']

In [29]:
# function to add markers for given venues to existing map
def displayOnMap(dataset, color, existingMap):
    for lat, lng, city, zcode, venue, venueCat in zip(dataset['VenueLatitude'], dataset['VenueLongitude'], dataset['City'], dataset['ZipCode'], dataset['Venue'], dataset['Category']):
        label = '{} ({}) - {}:{}'.format(venue, venueCat, city, zcode)
        label = folium.Popup(label, parse_html=True)
        folium.CircleMarker(
            [lat, lng],
            radius=5,
            popup=label,
            color=color,
            fill=True,
            fill_color=color,
            fill_opacity=0.7).add_to(existingMap)

<a id="item6"></a>

### 6. Creating and Displaying Indian Restaurant Venues in DFW area

In [30]:
# Create Data Frame with Indian Restaurants Locations
df_Indian_Restaurants = getVenueCategory(City=df_dfw_asian['City'], zip_code=df_dfw_asian['ZipCode']  ,latitude=df_dfw_asian['Latitude'], longitude=df_dfw_asian['Longitude'],categoryType='Indian_Restaurants',clist=Indian_Restaurants, radius=10000, dLimit=3001)
df_Indian_Restaurants.count()

City              291
ZipCode           291
Latitude          291
Longitude         291
Category          291
Venue             291
VenueId           291
VenueLatitude     291
VenueLongitude    291
VenueZipCode      291
VenueDistance     291
VenueCategory     291
CategoryId        291
dtype: int64

In [31]:
map_dfw_indian_restaurants = folium.Map(location=[latitude, longitude], zoom_start=10)
displayOnMap(df_Indian_Restaurants,'red', map_dfw_indian_restaurants)
map_dfw_indian_restaurants

<a id="item7"></a>

### 7. Creating and Displaying Asian Restaurant Venues in DFW area

In [32]:
df_Asian_Restaurants = getVenueCategory(City=df_dfw_asian['City'], zip_code=df_dfw_asian['ZipCode']  ,latitude=df_dfw_asian['Latitude'], longitude=df_dfw_asian['Longitude'],categoryType='Asian_Restaurants',clist=Asian_Restaurants, radius=10000, dLimit=3001)
df_Asian_Restaurants.count()

City              1637
ZipCode           1637
Latitude          1637
Longitude         1637
Category          1637
Venue             1637
VenueId           1637
VenueLatitude     1637
VenueLongitude    1637
VenueZipCode      1637
VenueDistance     1637
VenueCategory     1637
CategoryId        1637
dtype: int64

In [33]:
map_dfw_asian_restaurants = folium.Map(location=[latitude, longitude], zoom_start=10)
displayOnMap(df_Asian_Restaurants,'darkgreen', map_dfw_asian_restaurants)
map_dfw_asian_restaurants

<a id="item8"></a>

### 8. Creating and Displaying Shopping Venues in DFW area

In [34]:
df_shopping_places = getVenueCategory(City=df_dfw_asian['City'], zip_code=df_dfw_asian['ZipCode']  ,latitude=df_dfw_asian['Latitude'], longitude=df_dfw_asian['Longitude'],categoryType='Shopping_Places',clist=shopping, radius=10000, dLimit=1001)
df_shopping_places.count()

City              926
ZipCode           926
Latitude          926
Longitude         926
Category          926
Venue             926
VenueId           926
VenueLatitude     926
VenueLongitude    926
VenueZipCode      926
VenueDistance     926
VenueCategory     926
CategoryId        926
dtype: int64

In [35]:
map_dfw_shopping_places = folium.Map(location=[latitude, longitude], zoom_start=10)
displayOnMap(df_shopping_places,'fuchsia', map_dfw_shopping_places)
map_dfw_shopping_places

<a id="item9"></a>

### 9. Creating and Displaying Residential Complex Venues in DFW area

In [56]:
df_residential_complex = getVenueCategory(City=df_dfw_asian['City'], zip_code=df_dfw_asian['ZipCode']  ,latitude=df_dfw_asian['Latitude'], longitude=df_dfw_asian['Longitude'],categoryType='Residential_Complex',clist=Residential, radius=10000, dLimit=2001)
df_residential_complex.count()

City              388
ZipCode           388
Latitude          388
Longitude         388
Category          388
Venue             388
VenueId           388
VenueLatitude     388
VenueLongitude    388
VenueZipCode      388
VenueDistance     388
VenueCategory     388
CategoryId        388
dtype: int64

In [57]:
map_dfw_residential_complex = folium.Map(location=[latitude, longitude], zoom_start=10)
displayOnMap(df_residential_complex,'purple', map_dfw_residential_complex)
map_dfw_residential_complex

<a id="item10"></a>

### 10. Creating and Displaying Office / Hotel / University Venues in DFW area

In [58]:
df_office_hotel_places = getVenueCategory(City=df_dfw_asian['City'], zip_code=df_dfw_asian['ZipCode']  ,latitude=df_dfw_asian['Latitude'], longitude=df_dfw_asian['Longitude'],categoryType='Office_Hotel_Places',clist=office_hotel, radius=10000, dLimit=1001)
df_office_hotel_places.count()

City              678
ZipCode           678
Latitude          678
Longitude         678
Category          678
Venue             678
VenueId           678
VenueLatitude     678
VenueLongitude    678
VenueZipCode      678
VenueDistance     678
VenueCategory     678
CategoryId        678
dtype: int64

In [59]:
map_dfw_office_hotel_places = folium.Map(location=[latitude, longitude], zoom_start=10)
displayOnMap(df_office_hotel_places,'orange', map_dfw_office_hotel_places)
map_dfw_office_hotel_places

<a id="item11"></a>

## Section-3 : Evaluating the Score, then Identifying and Displaying the Location

### 11. Measure the Venue count for each location

In [60]:
# Function to count the venues per ZIP code and add the count as a new column for a given category
def addCategoryColumn(dataDf, columnTitle, venueDf):
    grouped = venueDf.groupby('ZipCode').count()
    
    for n in dataDf['ZipCode']:
        try:
            dataDf.loc[dataDf['ZipCode'] == n,columnTitle] = grouped.loc[n, 'Venue']
        except KeyError:
            # No venues of this category were found for this ZIP code
            dataDf.loc[dataDf['ZipCode'] == n,columnTitle] = 0
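
The same per-ZIP-code count can also be computed without an explicit loop. The sketch below is an alternative, not the notebook's implementation: `add_category_column` is a hypothetical name, and the venue names are made-up sample data. It counts rows per ZIP code with `value_counts`, which matches `groupby('ZipCode').count()['Venue']` as long as the `Venue` column has no missing values.

```python
import pandas as pd

def add_category_column(data_df, column_title, venue_df):
    """Add a venue-count column to data_df (hypothetical vectorized variant)."""
    # Number of venue rows found for each ZIP code
    counts = venue_df['ZipCode'].value_counts()
    # Look up each location's count; ZIP codes with no venues get 0
    data_df[column_title] = data_df['ZipCode'].map(counts).fillna(0).astype(float)
    return data_df

# Made-up sample: two venues in 75001, none in 75002, one in 75006
locations = pd.DataFrame({'ZipCode': [75001, 75002, 75006]})
venues = pd.DataFrame({'ZipCode': [75001, 75001, 75006],
                       'Venue': ['Venue A', 'Venue B', 'Venue C']})

result = add_category_column(locations, 'Indian_Restaurants', venues)
counts_list = result['Indian_Restaurants'].tolist()
```

The counts are cast to float only to match the `2.0`, `9.0` style of the tables in this notebook; integer counts would work equally well for the scoring step.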

In [61]:
df_dfw_venue_stats = df_dfw_asian.copy()
addCategoryColumn(df_dfw_venue_stats, 'Indian_Restaurants', df_Indian_Restaurants)
addCategoryColumn(df_dfw_venue_stats, 'Asian_Restaurants', df_Asian_Restaurants)
addCategoryColumn(df_dfw_venue_stats, 'Shopping_Places', df_shopping_places)
addCategoryColumn(df_dfw_venue_stats, 'Residential_Complex', df_residential_complex)
addCategoryColumn(df_dfw_venue_stats, 'Office_Hotels_Places', df_office_hotel_places)

In [62]:
df_dfw_venue_stats.head()

Unnamed: 0,ZipCode,City,County,Total_Population,Asian_Population,AsianPop_Percent,Latitude,Longitude,Indian_Restaurants,Asian_Restaurants,Shopping_Places,Residential_Complex,Office_Hotels_Places
0,75001,Addison,Dallas,12414.0,1284.0,10.343161,32.960049,-96.838417,2.0,9.0,6.0,5.0,9.0
1,75002,Allen,Collin,63140.0,5616.0,8.89452,33.091141,-96.606972,2.0,17.0,19.0,3.0,3.0
2,75006,Carrollton,Dallas,46364.0,3315.0,7.149944,32.953411,-96.901871,3.0,22.0,8.0,2.0,2.0
3,75007,Carrollton,Denton,51624.0,7822.0,15.151867,33.00996,-96.896088,3.0,43.0,14.0,1.0,4.0
4,75010,Carrollton,Denton,21607.0,6081.0,28.143657,33.054829,-96.871742,3.0,14.0,2.0,4.0,0.0


<a id="item12"></a>

### 12. Measure the weight score for each location

In [63]:
# Negative weight: Indian restaurants already located in the area are competitors and may hurt the business.
weight_IndianRestaurants = -1.5

# Negative weight: Asian restaurants already located in the area also compete, so they carry a smaller penalty.
weight_AsianRestaurants = -1

# Positive weight: most Indian restaurant customers visit before or after their shopping.
weight_ShoppingPlaces = 0.5

# Positive weight: more apartment complexes in the area have a positive impact on the business.
weight_ResidentialComplex=1.5

# Positive weight: office spaces, hotels, and universities have a positive impact on the business.
weight_OfficeHotel = 1.5

# Positive contribution from the Asian population. This value is used as a divisor, not a multiplier:
# every full 1,000 Asian residents adds 1 to the weight score.
weight_AsianPopulation = 1000

In [64]:
df_dfw_ir_weight_score = df_dfw_venue_stats[['City','ZipCode','County','Asian_Population','Indian_Restaurants','Asian_Restaurants', 'Shopping_Places','Residential_Complex','Office_Hotels_Places' ]]

In [65]:
df_dfw_ir_weight_score.head(10)

Unnamed: 0,City,ZipCode,County,Asian_Population,Indian_Restaurants,Asian_Restaurants,Shopping_Places,Residential_Complex,Office_Hotels_Places
0,Addison,75001,Dallas,1284.0,2.0,9.0,6.0,5.0,9.0
1,Allen,75002,Collin,5616.0,2.0,17.0,19.0,3.0,3.0
2,Carrollton,75006,Dallas,3315.0,3.0,22.0,8.0,2.0,2.0
3,Carrollton,75007,Denton,7822.0,3.0,43.0,14.0,1.0,4.0
4,Carrollton,75010,Denton,6081.0,3.0,14.0,2.0,4.0,0.0
5,Allen,75013,Collin,6752.0,2.0,20.0,11.0,4.0,7.0
6,Coppell,75019,Dallas,7747.0,3.0,6.0,11.0,3.0,7.0
7,Flower Mound,75022,Denton,2836.0,4.0,9.0,4.0,2.0,4.0
8,Plano,75023,Collin,5449.0,10.0,32.0,11.0,1.0,3.0
9,Plano,75024,Collin,12997.0,21.0,44.0,7.0,3.0,26.0


In [66]:
# Work on an explicit copy so that adding a column does not trigger pandas' SettingWithCopyWarning
df_dfw_ir_weight_score = df_dfw_ir_weight_score.copy()
df_dfw_ir_weight_score['WtScore'] = (
    df_dfw_ir_weight_score['Indian_Restaurants'] * weight_IndianRestaurants
    + df_dfw_ir_weight_score['Asian_Restaurants'] * weight_AsianRestaurants
    + df_dfw_ir_weight_score['Shopping_Places'] * weight_ShoppingPlaces
    + df_dfw_ir_weight_score['Residential_Complex'] * weight_ResidentialComplex
    + df_dfw_ir_weight_score['Office_Hotels_Places'] * weight_OfficeHotel
    + df_dfw_ir_weight_score['Asian_Population'] // weight_AsianPopulation
)


In [67]:
df_dfw_ir_weight_score = df_dfw_ir_weight_score[['City','ZipCode','County','Asian_Population','WtScore','Indian_Restaurants','Asian_Restaurants', 'Shopping_Places','Residential_Complex','Office_Hotels_Places']]

In [68]:
df_dfw_ir_weight_score.head(10)

Unnamed: 0,City,ZipCode,County,Asian_Population,WtScore,Indian_Restaurants,Asian_Restaurants,Shopping_Places,Residential_Complex,Office_Hotels_Places
0,Addison,75001,Dallas,1284.0,13.0,2.0,9.0,6.0,5.0,9.0
1,Allen,75002,Collin,5616.0,3.5,2.0,17.0,19.0,3.0,3.0
2,Carrollton,75006,Dallas,3315.0,-13.5,3.0,22.0,8.0,2.0,2.0
3,Carrollton,75007,Denton,7822.0,-26.0,3.0,43.0,14.0,1.0,4.0
4,Carrollton,75010,Denton,6081.0,-5.5,3.0,14.0,2.0,4.0,0.0
5,Allen,75013,Collin,6752.0,5.0,2.0,20.0,11.0,4.0,7.0
6,Coppell,75019,Dallas,7747.0,17.0,3.0,6.0,11.0,3.0,7.0
7,Flower Mound,75022,Denton,2836.0,-2.0,4.0,9.0,4.0,2.0,4.0
8,Plano,75023,Collin,5449.0,-30.5,10.0,32.0,11.0,1.0,3.0
9,Plano,75024,Collin,12997.0,-16.5,21.0,44.0,7.0,3.0,26.0
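
As a sanity check on the formula, the score for a single row can be reproduced with plain Python. The sketch below is an illustration, not part of the original pipeline: the `weight_score` helper and the dictionary layout are assumptions, while the weights and the Addison and Roanoke venue counts are copied from the tables in this notebook.

```python
# Multiplicative weights per venue column, as defined in the weights cell
WEIGHTS = {
    'Indian_Restaurants': -1.5,
    'Asian_Restaurants': -1,
    'Shopping_Places': 0.5,
    'Residential_Complex': 1.5,
    'Office_Hotels_Places': 1.5,
}
# The Asian population contributes 1 point per full 1,000 residents
ASIAN_POPULATION_DIVISOR = 1000

def weight_score(row):
    """Weighted venue counts plus the floored population term (hypothetical helper)."""
    venue_part = sum(row[column] * weight for column, weight in WEIGHTS.items())
    population_part = row['Asian_Population'] // ASIAN_POPULATION_DIVISOR
    return venue_part + population_part

addison = {'Asian_Population': 1284.0, 'Indian_Restaurants': 2.0,
           'Asian_Restaurants': 9.0, 'Shopping_Places': 6.0,
           'Residential_Complex': 5.0, 'Office_Hotels_Places': 9.0}

roanoke = {'Asian_Population': 1093.0, 'Indian_Restaurants': 0.0,
           'Asian_Restaurants': 19.0, 'Shopping_Places': 7.0,
           'Residential_Complex': 11.0, 'Office_Hotels_Places': 43.0}

# Addison: -3.0 - 9.0 + 3.0 + 7.5 + 13.5 + 1.0
addison_score = weight_score(addison)
# Roanoke: 0.0 - 19.0 + 3.5 + 16.5 + 64.5 + 1.0
roanoke_score = weight_score(roanoke)
```

Both values agree with the `WtScore` column for these ZIP codes (13.0 and 66.5). The Roanoke breakdown also shows that most of its score comes from the office / hotel / university count rather than from the population term.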


In [69]:
df_dfw_ir_weight_score = df_dfw_ir_weight_score.sort_values(by=['WtScore'], ascending=False)

In [70]:
df_dfw_ir_weight_score.head()

Unnamed: 0,City,ZipCode,County,Asian_Population,WtScore,Indian_Restaurants,Asian_Restaurants,Shopping_Places,Residential_Complex,Office_Hotels_Places
87,Roanoke,76262,Denton,1093.0,66.5,0.0,19.0,7.0,11.0,43.0
83,Denton,76201,Denton,2177.0,41.0,4.0,57.0,21.0,28.0,33.0
23,Grand Prairie,75052,Dallas,8610.0,39.5,0.0,10.0,29.0,8.0,10.0
36,Richardson,75082,Collin,5260.0,37.5,2.0,8.0,3.0,9.0,19.0
73,Mansfield,76063,Tarrant,2875.0,36.5,0.0,34.0,26.0,10.0,27.0


In [71]:
def displayIRLoc(zipCode):
    zCode = zipCode
    loc_data=df_dfw_venue_stats[df_dfw_venue_stats['ZipCode'] == zCode]
    loc_lat=loc_data['Latitude'].values[0]
    loc_lon=loc_data['Longitude'].values[0]
    map_dfw_ir_loc = folium.Map(location=[loc_lat, loc_lon], zoom_start=12)

    for lat, lng, local in zip(loc_data['Latitude'], loc_data['Longitude'], loc_data['City']):
        label = '{}'.format(local)
        label = folium.Popup(label, parse_html=True)
        folium.CircleMarker(
            [lat, lng],
            radius=5,
            popup=label,
            color='blue',
            fill=True,
            fill_color='blue',
            fill_opacity=0.7).add_to(map_dfw_ir_loc) 

    displayOnMap(df_Indian_Restaurants[df_Indian_Restaurants['ZipCode'] == zCode], 'red', map_dfw_ir_loc)
    displayOnMap(df_Asian_Restaurants[df_Asian_Restaurants['ZipCode'] == zCode], 'darkgreen', map_dfw_ir_loc)
    displayOnMap(df_shopping_places[df_shopping_places['ZipCode'] == zCode], 'fuchsia', map_dfw_ir_loc)
    displayOnMap(df_residential_complex[df_residential_complex['ZipCode'] == zCode], 'purple', map_dfw_ir_loc)
    displayOnMap(df_office_hotel_places[df_office_hotel_places['ZipCode'] == zCode], 'orange', map_dfw_ir_loc)

    return(map_dfw_ir_loc)

<a id="item13"></a>

### 13. Display Location with highest weight score

In [72]:
#Display Roanoke:76262 with all venues used for evaluation
displayIRLoc(76262)

In [73]:
#Display Denton:76201 with all venues used for evaluation
displayIRLoc(76201)

In [74]:
#Display Grand Prairie:75052 with all venues used for evaluation
displayIRLoc(75052)

In [75]:
df_dfw_ir_weight_score.head(3)

Unnamed: 0,City,ZipCode,County,Asian_Population,WtScore,Indian_Restaurants,Asian_Restaurants,Shopping_Places,Residential_Complex,Office_Hotels_Places
87,Roanoke,76262,Denton,1093.0,66.5,0.0,19.0,7.0,11.0,43.0
83,Denton,76201,Denton,2177.0,41.0,4.0,57.0,21.0,28.0,33.0
23,Grand Prairie,75052,Dallas,8610.0,39.5,0.0,10.0,29.0,8.0,10.0


<a id="item14"></a>

### 14. Conclusion

After reviewing the top three locations, Roanoke appears to be the best candidate for opening an Indian restaurant.

For Roanoke:76262

Pros:

1. No Indian / Indo-Chinese restaurants in the area.
2. Large office complexes (e.g. Charles Schwab, Fidelity, Sabre) and shopping places are nearby.
3. A considerable number of residential complexes are in the area.
4. Property taxes in this area are lower than in other parts of the DFW area, and property prices are higher elsewhere, so the Asian community here is now growing considerably.

Cons:

1. The Asian community is smaller than in other locations.