## 1. Business Problem

#### Orlando, a city in central Florida, is home to more than a dozen theme parks. Chief among its claims to fame is Walt Disney World, which comprises parks like the Magic Kingdom and Epcot, as well as water parks. Another major destination, Universal Orlando, offers Universal Studios and Islands of Adventure, with the Wizarding World of Harry Potter straddling both.

#### A client of mine wants to start a travel business in Orlando, FL that takes tourists around the attractions on demand. The goal of this project is to find the best neighborhoods close to places such as resorts, vacation rentals, and motels, so that the client can kick-start the business and, by concentrating on the top three neighborhoods, estimate the expected profits over the next two to three years.

#### This business problem mainly concentrates on finding the top three neighborhoods in which to start the travel business.

## 2. Data

### Source
##### List of the neighborhoods in Orlando, FL: https://data.cityoforlando.net/Government-General/Neighborhoods/dpx3-qjrc
##### Foursquare data to find the popular venues
##### How will the data be used to answer the business needs?
##### The data mentioned above will be used to explore and target locations across different venues present in the neighborhoods.
##### 1. Use Foursquare and geopy data to map the top venues in the neighborhoods of Orlando and cluster them into groups
##### 2. Use City-Data to get the neighborhood information
##### 3. If the data proves insufficient, add further data from open data sources where available
###### By extracting the venues in each neighborhood, we can identify the most visited venues, which indicate areas where the tourist count is high. Using Foursquare data and Orlando's neighborhood data, we can recommend the top three neighborhoods by applying machine learning techniques and visualize them on a graph or a map.
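
As a rough illustration of the venue-counting idea described above, the sketch below ranks venue categories by frequency within each neighborhood. It is a minimal sketch under stated assumptions: the `Category` column name, the helper `top_venue_categories`, and the sample rows are all hypothetical and made up for illustration; real rows would come from the Foursquare data.

```python
import pandas as pd


def top_venue_categories(venues: pd.DataFrame, top_n: int = 3) -> pd.DataFrame:
    """Return the top_n most frequent venue categories for each neighborhood."""
    # Count how often each category appears in each neighborhood
    counts = (
        venues.groupby(["Neighborhood", "Category"])
        .size()
        .reset_index(name="Count")
    )
    # Put the most frequent categories first within each neighborhood
    counts = counts.sort_values(
        ["Neighborhood", "Count", "Category"], ascending=[True, False, True]
    )
    return counts.groupby("Neighborhood").head(top_n).reset_index(drop=True)


# Made-up rows purely for illustration (hypothetical, not real venue data)
sample = pd.DataFrame({
    "Neighborhood": ["Park Central", "Park Central", "Park Central",
                     "Lake Holden", "Lake Holden"],
    "Category": ["Hotel", "Hotel", "Motel", "Resort", "Hotel"],
})

result = top_venue_categories(sample, top_n=1)
print(result)
```

Neighborhoods whose top categories are dominated by resorts, vacation rentals, and motels would then be natural candidates for the shortlist.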


### Importing Required Packages to Load the Data

In [13]:
import pandas as pd
import numpy as np
from geopy.geocoders import Nominatim

In [14]:
data = pd.read_csv('C:/Users/jahna/Downloads/mco.csv')
data.head()

Unnamed: 0,the_geom,NBHDID,NBHDNAME,COLOR
0,MULTIPOLYGON (((-81.40938965865213 28.49422746...,110,Park Central,10551295
1,MULTIPOLYGON (((-81.38217375194158 28.51274701...,112,Lake Holden,16752895
2,MULTIPOLYGON (((-81.29578617870794 28.49571779...,113,Pershing,16752895
3,MULTIPOLYGON (((-81.31051469166141 28.54720993...,4,Azalea Park,10551295
4,MULTIPOLYGON (((-81.31456291107887 28.51490542...,7,Bryn Mawr,16752895


#### Renaming the NBHDNAME column to Neighborhood

In [15]:
# Rename by column name rather than by position, so the result does not depend on column order
df_mco = data.rename(columns={'NBHDNAME': 'Neighborhood'})
df_mco.head()

Unnamed: 0,the_geom,NBHDID,Neighborhood,COLOR
0,MULTIPOLYGON (((-81.40938965865213 28.49422746...,110,Park Central,10551295
1,MULTIPOLYGON (((-81.38217375194158 28.51274701...,112,Lake Holden,16752895
2,MULTIPOLYGON (((-81.29578617870794 28.49571779...,113,Pershing,16752895
3,MULTIPOLYGON (((-81.31051469166141 28.54720993...,4,Azalea Park,10551295
4,MULTIPOLYGON (((-81.31456291107887 28.51490542...,7,Bryn Mawr,16752895


#### Getting Longitude and Latitude values of Orlando using geolocator

In [16]:
address = 'Orlando, FL'

geolocator = Nominatim(user_agent="foursquare_agent")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print("Geographical co-ordinates of Orlando, FL are (lat):{} and (long): {}".format(latitude, longitude))

Geographical co-ordinates of Orlando, FL are (lat):28.5421109 and (long): -81.3790304


#### Create lists for lat and long

In [17]:
lat = []
lng = []

# Loop through all neighborhoods in Orlando
for adr in df_mco['Neighborhood']:
    # Use the geolocator to get the coordinates of each neighborhood within Orlando
    loc = geolocator.geocode(adr + ',' + 'Orlando' + ',' + 'FL')
    if loc is None:
        # Record a real missing value so that isnull() can detect failed lookups
        lat.append(np.nan)
        lng.append(np.nan)
    else:
        # Append coordinates to lists
        lat.append(loc.latitude)
        lng.append(loc.longitude)

# Map coordinate lists to the data frame

df_mco['lat'] = lat
df_mco['lng'] = lng

In [18]:
df_mco.head()

Unnamed: 0,the_geom,NBHDID,Neighborhood,COLOR,lat,lng
0,MULTIPOLYGON (((-81.40938965865213 28.49422746...,110,Park Central,10551295,28.4904,-81.4135
1,MULTIPOLYGON (((-81.38217375194158 28.51274701...,112,Lake Holden,16752895,28.5049,-81.3853
2,MULTIPOLYGON (((-81.29578617870794 28.49571779...,113,Pershing,16752895,28.4973,-81.2963
3,MULTIPOLYGON (((-81.31051469166141 28.54720993...,4,Azalea Park,10551295,28.5474,-81.301
4,MULTIPOLYGON (((-81.31456291107887 28.51490542...,7,Bryn Mawr,16752895,28.5102,-81.3237
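
The lookup loop can also be wrapped in a small helper, which makes the handling of failed lookups explicit and lets the step be tried without network access. This is only a sketch: `geocode_neighborhoods`, `fake_geocode`, and `_FakeLocation` are hypothetical names introduced here, and the stand-in geocoder knows a single hard-coded place, using the Park Central coordinates shown in the output above.

```python
import numpy as np
import pandas as pd


def geocode_neighborhoods(names, geocode, suffix=",Orlando,FL"):
    """Look up each name with `geocode` and return a DataFrame of coordinates.

    `geocode` is any callable that takes an address string and returns an
    object with .latitude and .longitude, or None when nothing is found.
    """
    rows = []
    for name in names:
        loc = geocode(name + suffix)
        rows.append({
            "Neighborhood": name,
            # Store NaN for failed lookups so isnull() can find them later
            "Latitude": loc.latitude if loc is not None else np.nan,
            "Longitude": loc.longitude if loc is not None else np.nan,
        })
    return pd.DataFrame(rows)


class _FakeLocation:
    """Stand-in for a geocoder result (illustration only)."""

    def __init__(self, latitude, longitude):
        self.latitude = latitude
        self.longitude = longitude


def fake_geocode(address):
    # Hypothetical offline geocoder that knows one hard-coded address
    known = {"Park Central,Orlando,FL": _FakeLocation(28.4904, -81.4135)}
    return known.get(address)


coords = geocode_neighborhoods(["Park Central", "No Such Place"], fake_geocode)
print(coords)
```

In the notebook, `geolocator.geocode` could be passed in place of `fake_geocode`. Because the public Nominatim service asks for at most one request per second, wrapping it in geopy's `RateLimiter` (from `geopy.extra.rate_limiter`) with `min_delay_seconds=1` is one way to pace the calls.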


#### Checking for null values

In [19]:
df_mco.isnull().sum()

the_geom        0
NBHDID          0
Neighborhood    0
COLOR           0
lat             0
lng             0
dtype: int64
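
One caveat about this check: `isnull()` only counts real missing values. If a failed lookup is stored as a placeholder string such as `'NAN'`, it is not counted, and the sums above can read zero even when some coordinates are missing. The sketch below, on a made-up two-row frame (hypothetical, not taken from the dataset), shows one way to coerce such placeholders to real `NaN` values before counting.

```python
import pandas as pd

# Toy frame: the second row mimics a failed lookup stored as the string 'NAN'
toy = pd.DataFrame({
    "Neighborhood": ["Park Central", "Unknown Place"],
    "lat": [28.4904, "NAN"],
    "lng": [-81.4135, "NAN"],
})

# The placeholder strings are ordinary values, so they are not flagged here
print(toy.isnull().sum())

# Coerce anything that is not a usable number to a real NaN
for col in ["lat", "lng"]:
    toy[col] = pd.to_numeric(toy[col], errors="coerce")

missing = toy[["lat", "lng"]].isnull().sum()
print(missing)
```

After the coercion, the coordinate columns are numeric and the missing rows can be dropped or looked up again.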

#### Renamed lat to Latitude and lng to Longitude

In [20]:
df_mco = df_mco.rename(columns = {"lat": "Latitude", "lng": "Longitude"})
df_mco.head()

Unnamed: 0,the_geom,NBHDID,Neighborhood,COLOR,Latitude,Longitude
0,MULTIPOLYGON (((-81.40938965865213 28.49422746...,110,Park Central,10551295,28.4904,-81.4135
1,MULTIPOLYGON (((-81.38217375194158 28.51274701...,112,Lake Holden,16752895,28.5049,-81.3853
2,MULTIPOLYGON (((-81.29578617870794 28.49571779...,113,Pershing,16752895,28.4973,-81.2963
3,MULTIPOLYGON (((-81.31051469166141 28.54720993...,4,Azalea Park,10551295,28.5474,-81.301
4,MULTIPOLYGON (((-81.31456291107887 28.51490542...,7,Bryn Mawr,16752895,28.5102,-81.3237


#### Created a df that consists of the required columns

In [21]:
df_neigh = df_mco[["Neighborhood", "Longitude", "Latitude"]]
df_neigh.head()

Unnamed: 0,Neighborhood,Longitude,Latitude
0,Park Central,-81.4135,28.4904
1,Lake Holden,-81.3853,28.5049
2,Pershing,-81.2963,28.4973
3,Azalea Park,-81.301,28.5474
4,Bryn Mawr,-81.3237,28.5102
