# Capstone Project - The Battle of Neighborhoods

## Section 1: Introduction/Business Problem

Introduction where you discuss the business problem and who would be interested in this project.

#### A friend of mine wants to open a cinema in Paris. 
He believes that the most important factor when opening a cinema is its location and the surrounding neighborhood. He asked me to find some possible locations in Paris and to concentrate on the choice of location rather than on cinema facilities or rental prices. 

He explained that, from the customer's point of view, watching a movie is one part of a whole afternoon or evening out. A cinema should therefore have **many restaurants and shopping places nearby**. Transportation is also an important factor: customers should be able to walk to the cinema within **5 minutes** from **public transport facilities** such as a bus stop or metro station.  
  
I first need to select **5 possible locations** for the cinema. Then I will find out which location should be suggested to my friend.

## Section 2: Data

Data where you describe the data that will be used to solve the problem and the source of the data.

To answer this question, I need the following data.

### 1. Geographic coordinate of Paris cinemas

I need to **compare 5 possible locations with the existing cinemas** in Paris. Therefore, I need a list of Paris cinemas and their geographic coordinates. I will scrape the listing at http://www.allocine.fr/salle/cinemas-pres-de-115755/ to build this list.

In [1]:
#Import necessary libraries
import requests
from bs4 import BeautifulSoup as bs
import pandas as pd
import folium

In [1]:
#define a function to extract the name and address of the cinemas on the website
#it takes the parsed HTML of one listing page and returns the name and address of every cinema on that page
def get_cinemas(soup):
    cinemas = soup.find_all('div', class_='theaterblock j_entity_container')
    list_cinemas = []
    for cinema in cinemas:
        name = cinema.h2.text.strip()
        address = cinema.p.text.strip()
        list_cinemas.append({
            'name':name,
            'address':address
        })
    return list_cinemas

In [74]:
#run the function to scrape the data from the website
endpoint = 'http://www.allocine.fr/salle/cinemas-pres-de-115755/?page={}'
list_cinemas = []
for page in range(1,33): #the Paris cinema listing spans 32 pages on the website
    url = endpoint.format(str(page))
    print(url)
    source = requests.get(url)
    soup = bs(source.text, 'lxml')
    list_cinemas += get_cinemas(soup)
    print('Page', page)

http://www.allocine.fr/salle/cinemas-pres-de-115755/?page=1
Page 1
http://www.allocine.fr/salle/cinemas-pres-de-115755/?page=2
Page 2
http://www.allocine.fr/salle/cinemas-pres-de-115755/?page=3
Page 3
http://www.allocine.fr/salle/cinemas-pres-de-115755/?page=4
Page 4
http://www.allocine.fr/salle/cinemas-pres-de-115755/?page=5
Page 5
http://www.allocine.fr/salle/cinemas-pres-de-115755/?page=6
Page 6
http://www.allocine.fr/salle/cinemas-pres-de-115755/?page=7
Page 7
http://www.allocine.fr/salle/cinemas-pres-de-115755/?page=8
Page 8
http://www.allocine.fr/salle/cinemas-pres-de-115755/?page=9
Page 9
http://www.allocine.fr/salle/cinemas-pres-de-115755/?page=10
Page 10
http://www.allocine.fr/salle/cinemas-pres-de-115755/?page=11
Page 11
http://www.allocine.fr/salle/cinemas-pres-de-115755/?page=12
Page 12
http://www.allocine.fr/salle/cinemas-pres-de-115755/?page=13
Page 13
http://www.allocine.fr/salle/cinemas-pres-de-115755/?page=14
Page 14
http://www.allocine.fr/salle/cinemas-pres-de-115755/

In [75]:
# take a look at the result
list_cinemas

[{'name': 'Luminor Hôtel de Ville\nRéservation sur Allociné',
  'address': '20, rue du Temple 75004 Paris 4e arrondissement'},
 {'name': 'Espace Saint-Michel',
  'address': '7, place Saint-Michel 75005 Paris 5e arrondissement'},
 {'name': 'Studio Galande',
  'address': '42, rue Galande 75005 Paris 5e arrondissement'},
 {'name': 'Centre Georges-Pompidou',
  'address': 'pl. Georges-Pompidou 75004 Paris 4e arrondissement\nMétro Hotel-de-Ville, Rambuteau.'},
 {'name': 'Saint-André des Arts',
  'address': '30 rue Saint-André-des-Arts: caisse, salles 1 & 2 - 12 rue Gît-le-Cœur, Salle 3 75006 Paris 6e arrondissement'},
 {'name': 'MK2 Odéon (Côté St Michel)',
  'address': '7, rue Hautefeuille 75006 Paris 6e arrondissement'},
 {'name': 'MK2 Beaubourg',
  'address': '50, rue Rambuteau 75003 Paris 3e arrondissement'},
 {'name': 'Christine 21',
  'address': '4, rue Christine 75006 Paris 6e arrondissement'},
 {'name': 'Le Champo - Espace Jacques Tati',
  'address': '51, rue des Ecoles 75005 Paris 5

It seems that our data needs some cleaning to remove parts like "Réservation sur Allociné" or "\n".

In [4]:
#transform our data into a pandas DataFrame
df = pd.DataFrame(list_cinemas, columns=['name','address'])
df.shape

(251, 2)

In [6]:
#drop duplicates
df = df.drop_duplicates('address', keep='first')
df.shape

(245, 2)

In [7]:
#define a function to clean the name of the cinema
#keep only the first line, dropping trailers such as "Réservation sur Allociné"
def clean_name(string):
    return string.split('\r')[0].split('\n')[0]

#define a function to clean the address of the cinema
#join the lines of a multi-line address into one comma-separated line
def clean_address(string):
    return ', '.join(string.splitlines())
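
As a quick sanity check of the cleaning step, the sketch below applies the same first-line and join-lines ideas to two raw strings copied from the scraped output above. The helper names `first_line` and `join_lines` are illustrative only and are not part of the notebook.

```python
# Raw values copied from the scraped output shown earlier.
raw_name = 'Luminor Hôtel de Ville\nRéservation sur Allociné'
raw_address = ('pl. Georges-Pompidou 75004 Paris 4e arrondissement\n'
               'Métro Hotel-de-Ville, Rambuteau.')


def first_line(text):
    """Keep only the first non-empty line of a multi-line string."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    return lines[0] if lines else ''


def join_lines(text, sep=', '):
    """Collapse a multi-line string into one line, dropping blank lines."""
    return sep.join(line.strip() for line in text.splitlines() if line.strip())


print(first_line(raw_name))
print(join_lines(raw_address))
```

Using `splitlines()` handles `\n`, `\r\n` and a lone `\r` in the same way, which avoids special-casing each line ending.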

In [2]:
#apply our functions to clean the data, reset the index and take a look at our data
df.name = df.name.apply(clean_name)
df.address = df.address.apply(clean_address)
df = df.reset_index(drop=True)
df

Unnamed: 0,name,address
0,Luminor Hôtel de Ville,"20, rue du Temple 75004 Paris 4e arrondissement"
1,Espace Saint-Michel,"7, place Saint-Michel 75005 Paris 5e arrondiss..."
2,Studio Galande,"42, rue Galande 75005 Paris 5e arrondissement"
3,Centre Georges-Pompidou,pl. Georges-Pompidou 75004 Paris 4e arrondisse...
4,Saint-André des Arts,"30 rue Saint-André-des-Arts: caisse, salles 1 ..."
5,MK2 Odéon (Côté St Michel),"7, rue Hautefeuille 75006 Paris 6e arrondissement"
6,MK2 Beaubourg,"50, rue Rambuteau 75003 Paris 3e arrondissement"
7,Christine 21,"4, rue Christine 75006 Paris 6e arrondissement"
8,Le Champo - Espace Jacques Tati,"51, rue des Ecoles 75005 Paris 5e arrondissement"
9,Nouvel Odéon,"6, rue de l'Ecole-de-Medecine 75006 Paris 6e a..."


#### Next step: find the geographical coordinates using the geopy library and the Bing API

In [3]:
#set up dependencies to get geolocation
import geopy
API_KEY = '<your API key>'
geolocator = geopy.geocoders.Bing(API_KEY)

In [4]:
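#define a function to get the latitude and longitude of an address
def get_lat_lon(address, max_tries=3):
    #geocode() returns None when nothing is found, so retry a few times
    for _ in range(max_tries):
        g = geolocator.geocode(address)
        if g is not None:
            return g.latitude, g.longitude
    #give up instead of looping forever on an address that cannot be geocoded
    raise ValueError('No geocoding result for {}'.format(address))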

In [5]:
#get geolocation of the cinemas
#if the geolocation of an address cannot be found, keep the row with None coordinates
cinemas_full = []
for i in range(len(df)):
    row = df.loc[i,:]
    lat = None
    lon = None
    print("ROW", i, end=' ')
    try:
        lat,lon = get_lat_lon(row['address'])
        print('Success', end=' ')
    except Exception:
        #geocoding failed: lat and lon stay None
        print('Failed')
    #store the row whether or not geocoding succeeded
    cinemas_full.append({
        'name':row['name'],
        'address':row['address'],
        'latitude':lat,
        'longitude': lon
    })

ROW 0 Success ROW 1 Success ROW 2 Success ROW 3 Success ROW 4 Success ROW 5 Success ROW 6 Success ROW 7 Success ROW 8 Success ROW 9 Success ROW 10 Success ROW 11 Success ROW 12 Success ROW 13 Success ROW 14 Success ROW 15 Success ROW 16 Success ROW 17 Success ROW 18 Success ROW 19 Success ROW 20 Success ROW 21 Success ROW 22 Success ROW 23 Success ROW 24 Success ROW 25 Success ROW 26 Success ROW 27 Success ROW 28 Success ROW 29 Success ROW 30 Success ROW 31 Success ROW 32 Success ROW 33 Success ROW 34 Success ROW 35 Success ROW 36 Success ROW 37 Success ROW 38 Success ROW 39 Success ROW 40 Success ROW 41 Success ROW 42 Success ROW 43 Success ROW 44 Success ROW 45 Success ROW 46 Success ROW 47 Success ROW 48 Success ROW 49 Success ROW 50 Success ROW 51 Failed
ROW 52 Success ROW 53 Success ROW 54 Success ROW 55 Success ROW 56 Success ROW 57 Success ROW 58 Success ROW 59 Success ROW 60 Success ROW 61 Success ROW 62 Success ROW 63 Success ROW 64 Success ROW 65 Success ROW 66 Success ROW 67

In [4]:
#take a look at our cinema's location
cinemas_final = pd.DataFrame(cinemas_full, columns =['name','address','latitude','longitude'])
#save our data
cinemas_final.to_csv('cinemas_final.csv')

cinemas_final

Unnamed: 0,name,address,latitude,longitude
0,Luminor Hôtel de Ville,"20, rue du Temple 75004 Paris 4e arrondissement",48.858690,2.353590
1,Espace Saint-Michel,"7, place Saint-Michel 75005 Paris 5e arrondiss...",48.853090,2.344230
2,Studio Galande,"42, rue Galande 75005 Paris 5e arrondissement",48.851620,2.347080
3,Centre Georges-Pompidou,pl. Georges-Pompidou 75004 Paris 4e arrondisse...,48.860680,2.351980
4,Saint-André des Arts,"30 rue Saint-André-des-Arts: caisse, salles 1 ...",48.853300,2.342270
5,MK2 Odéon (Côté St Michel),"7, rue Hautefeuille 75006 Paris 6e arrondissement",48.852190,2.342790
6,MK2 Beaubourg,"50, rue Rambuteau 75003 Paris 3e arrondissement",48.861580,2.352290
7,Christine 21,"4, rue Christine 75006 Paris 6e arrondissement",48.854460,2.340130
8,Le Champo - Espace Jacques Tati,"51, rue des Ecoles 75005 Paris 5e arrondissement",48.849980,2.343210
9,Nouvel Odéon,"6, rue de l'Ecole-de-Medecine 75006 Paris 6e a...",48.850760,2.341750


After taking a look at our data, we have found 245 cinemas. This doesn't seem right because, after some research, there are only about 80 cinemas within Paris.
If we pay attention to the zip codes included in the addresses, many of them are not Paris zip codes.

#### We want only the cinemas within Paris, but this list also contains cinemas near Paris. The next step is to extract only the cinemas that are within Paris

In [5]:
#take a look at the null data
nan = cinemas_final[cinemas_final.latitude.isnull()]
nan

Unnamed: 0,name,address,latitude,longitude
51,MK2 Grand Palais,A l'intérieur du Grand Palais: à l'angle du Co...,,
105,Les 3 Cinés - Robespierre,"19, avenue Maximilien Robespierre 94400 Vitry-...",,
147,Le Colombier,Place Charles de Gaulle 92410 Ville-d'Avray,,
173,UGC Ciné Cité O'Parinor,Centre commercial O’Parinor Le Haut de Galy 93...,,


In [6]:
# drop the rows with null coordinates since their addresses do not contain a Paris zip code
cinemas_final = cinemas_final.dropna()

#### Next step
To check whether a cinema is in Paris, we will check whether the postal code in its address is one of the 20 postal codes of the districts (arrondissements) of Paris. <br> A quick search shows that Paris postal codes have the form "750xx", where xx runs from 01 to 20, one for each of the 20 districts.

In [7]:
# Create a list of postal code in Paris
district_list = []
for i in range(1,21):
    district = '750{}'.format(str(i).zfill(2))
    district_list.append(district)
print(district_list)

['75001', '75002', '75003', '75004', '75005', '75006', '75007', '75008', '75009', '75010', '75011', '75012', '75013', '75014', '75015', '75016', '75017', '75018', '75019', '75020']


In [8]:
# Define a function to check if the address is within Paris
def in_paris(address):
    district_list = ['75001', '75002', '75003', '75004', '75005', '75006', '75007', '75008',
                     '75009', '75010', '75011', '75012', '75013', '75014', '75015', '75016',
                     '75017', '75018', '75019', '75020']
    for district in district_list:
        if district in address:
            return True
    return False
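
The substring check above could alternatively be sketched with a regular expression that matches the "750xx" pattern as a standalone five-digit token. The name `in_paris_regex` is hypothetical and is not used elsewhere in the notebook. The two sample addresses are taken from the tables above.

```python
import re

# Matches 75001 .. 75020 as a standalone token (not as part of a longer number).
PARIS_POSTCODE = re.compile(r'\b750(?:0[1-9]|1[0-9]|20)\b')


def in_paris_regex(address):
    """Return True if the address contains a Paris arrondissement postal code."""
    return PARIS_POSTCODE.search(address) is not None


print(in_paris_regex('20, rue du Temple 75004 Paris 4e arrondissement'))
print(in_paris_regex("Place Charles de Gaulle 92410 Ville-d'Avray"))
```

The word boundaries mean that a longer digit string that happens to contain "75004" is not treated as a match, which the plain substring test would accept.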

In [9]:
# Apply our function and take a look at our data
cinemas_final = cinemas_final[cinemas_final.address.apply(in_paris)]
cinemas_final = cinemas_final.reset_index(drop= True)
cinemas_final

Unnamed: 0,name,address,latitude,longitude
0,Luminor Hôtel de Ville,"20, rue du Temple 75004 Paris 4e arrondissement",48.858690,2.353590
1,Espace Saint-Michel,"7, place Saint-Michel 75005 Paris 5e arrondiss...",48.853090,2.344230
2,Studio Galande,"42, rue Galande 75005 Paris 5e arrondissement",48.851620,2.347080
3,Centre Georges-Pompidou,pl. Georges-Pompidou 75004 Paris 4e arrondisse...,48.860680,2.351980
4,Saint-André des Arts,"30 rue Saint-André-des-Arts: caisse, salles 1 ...",48.853300,2.342270
5,MK2 Odéon (Côté St Michel),"7, rue Hautefeuille 75006 Paris 6e arrondissement",48.852190,2.342790
6,MK2 Beaubourg,"50, rue Rambuteau 75003 Paris 3e arrondissement",48.861580,2.352290
7,Christine 21,"4, rue Christine 75006 Paris 6e arrondissement",48.854460,2.340130
8,Le Champo - Espace Jacques Tati,"51, rue des Ecoles 75005 Paris 5e arrondissement",48.849980,2.343210
9,Nouvel Odéon,"6, rue de l'Ecole-de-Medecine 75006 Paris 6e a...",48.850760,2.341750


Now our data contains only 80 cinemas within Paris, which is about right. The next step is to draw a map to check that our cinemas are in fact all in Paris.
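
Besides the visual check on a map, a rough numeric check is possible. The sketch below tests coordinates against an approximate bounding box of Paris. The box limits are assumed round values chosen for illustration, not official boundaries, and `inside_bbox` is a hypothetical helper. The sample coordinates are copied from the table above.

```python
# Approximate bounding box of the city of Paris (assumed values, rough check only).
LAT_MIN, LAT_MAX = 48.81, 48.91
LON_MIN, LON_MAX = 2.22, 2.47


def inside_bbox(lat, lon):
    """Return True if the point lies inside the approximate Paris bounding box."""
    return LAT_MIN <= lat <= LAT_MAX and LON_MIN <= lon <= LON_MAX


# (name, latitude, longitude) rows copied from the cinema table.
sample = [
    ('Luminor Hôtel de Ville', 48.858690, 2.353590),
    ('Studio Galande', 48.851620, 2.347080),
    ('MK2 Beaubourg', 48.861580, 2.352290),
]

outliers = [name for name, lat, lon in sample if not inside_bbox(lat, lon)]
print(outliers)
```

A bounding box only catches gross geocoding errors. A point can be inside the box and still be at the wrong address, so the map remains useful.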

In [10]:
# This is the geolocation of Paris, obtained from a search I did previously.
# I didn't include the code because it is similar to what we did in week 2 and week 3.
latitude = 48.8572082519531
longitude = 2.34143996238708

In [12]:
#create map of Paris using latitude and longitude
map_paris = folium.Map(location = [latitude, longitude], zoom_start = 12)

#add markers to the map
for name, add, lat, lng in zip(cinemas_final['name'],cinemas_final['address'], 
                               cinemas_final['latitude'],cinemas_final['longitude']):
    label = 'Cinema {} - {}'.format(name,add)
    label = folium.Popup(label, parse_html = True)
    folium.CircleMarker(
        [lat,lng],
        radius = 5,
        popup = label,
        color = 'blue',
        fill = True,
        fill_color = '#3186cc',
        fill_opacity = 0.7
    ).add_to(map_paris)
    
map_paris

### 2.  Location of top 5 popular cinemas in Paris

We can find the list of popular cinemas on this website: http://www.allocine.fr/salle/
#### Next step
We will scrape the name and address of these popular cinemas 

In [94]:
# get the part of the website that contains data of the 10 most popular cinemas in France
url = 'http://www.allocine.fr/salle/'
source = requests.get(url)
soup = bs(source.text, 'lxml')
table = soup.find_all('table', class_="noborder")[0]
rows = table.find_all('tr')

In [116]:
# Extract from the data name, address
top_cinemas = []
for row in rows:
    row = row.text.strip().split('\n')
    name = row[-2]
    address = row[-1]
    
    # Only get the geolocation from popular cinemas in Paris
    if in_paris(address):
        lat,lon = get_lat_lon(address)
        popular_cine = {
            'name':name,
            'address':address,
            'latitude':lat,
            'longitude':lon
        }
        top_cinemas.append(popular_cine)
    # print out cinemas that are not in Paris to double check
    else: 
        print(address, "NOT IN PARIS")
        continue


Le Dôme - Centre commercial Les 4 Temps 92092 Paris - La Défense NOT IN PARIS
chemin des Pennes aux Pins 13170 Les Pennes-Mirabeau NOT IN PARIS
1 avenue de la Source de la Bievre 78180 Montigny-le-Bretonneux NOT IN PARIS
Centre commercial Belle-Epine 94320 Thiais NOT IN PARIS
3 allée du préambule, Centre Commercial Carré Sénart 77127 Lieusaint NOT IN PARIS


In [14]:
#take a look at our data
df_top_cinemas = pd.DataFrame(top_cinemas, columns=['name','address','latitude','longitude'])
df_top_cinemas.to_csv('top_cinemas_in_paris.csv')
df_top_cinemas

Unnamed: 0,name,address,latitude,longitude
0,UGC Ciné Cité Les Halles,"7, place de la Rotonde 75001 Paris 1er arrondi...",48.89616,2.38566
1,UGC Ciné Cité Bercy,"2, cour Saint-Emilion 75012 Paris 12e arrondis...",48.831835,2.385233
2,MK2 Bibliothèque,128-162 avenue de France 75013 Paris 13e arron...,48.83167,2.376
3,Pathé Wepler,"140, bd de Clichy et 8, av de Clichy 75018 Par...",48.88394,2.32803
4,Pathé Beaugrenelle,7 rue Linois 75015 Paris 15e arrondissement,48.8489,2.28258


#### Mapping the popular cinemas together with all other cinemas

In [15]:
#create map of Paris using latitude and longitude
map_paris2 = folium.Map(location = [latitude, longitude], zoom_start = 12)

#add markers of all cinemas in Paris to the map
for name, add, lat, lng in zip(cinemas_final['name'],cinemas_final['address'], 
                               cinemas_final['latitude'],cinemas_final['longitude']):
    label = 'Cinema {} - {}'.format(name,add)
    label = folium.Popup(label, parse_html = True)
    folium.CircleMarker(
        [lat,lng],
        radius = 5,
        popup = label,
        color = 'blue',
        fill = True,
        fill_color = '#3186cc',
        fill_opacity = 0.7
    ).add_to(map_paris2)
#add markers of 5 top cinemas in Paris to the map
for name, add, lat, lng in zip(df_top_cinemas['name'],df_top_cinemas['address'], 
                               df_top_cinemas['latitude'],df_top_cinemas['longitude']):
    label = 'Cinema {} - {}'.format(name,add)
    label = folium.Popup(label, parse_html = True)
    folium.CircleMarker(
        [lat,lng],
        radius = 5,
        popup = label,
        color = 'red',
        fill = True,
        fill_color = '#3186cc',
        fill_opacity = 0.7
    ).add_to(map_paris2)
    
    
map_paris2

### 3. Geographic coordinates of 5 possible cinema addresses
I will randomly choose 5 addresses from all cinemas in Paris (excluding the top 5 popular ones) and pretend that those 5 addresses don't have a cinema.
Those 5 randomly chosen addresses will be the possible locations to evaluate.
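
A minimal sketch of this selection step, assuming the cinemas are identified by name. The short list below reuses a few names from the tables above purely for illustration, whereas a real run would draw from the full list of Paris cinemas. The variable names and the seed value are assumptions.

```python
import random

# A few cinema names from the tables above (illustrative subset, not the full list).
all_names = ['Luminor Hôtel de Ville', 'Espace Saint-Michel', 'Studio Galande',
             'Centre Georges-Pompidou', 'Saint-André des Arts', 'MK2 Beaubourg',
             'Christine 21', 'UGC Ciné Cité Les Halles', 'Pathé Wepler']

# The 5 popular cinemas must never be picked as a candidate location.
top_names = {'UGC Ciné Cité Les Halles', 'UGC Ciné Cité Bercy', 'MK2 Bibliothèque',
             'Pathé Wepler', 'Pathé Beaugrenelle'}

pool = [name for name in all_names if name not in top_names]

# A fixed seed makes the random choice reproducible between runs.
rng = random.Random(42)
possible_locations = rng.sample(pool, 5)
print(possible_locations)
```

`sample` draws without replacement, so the same address cannot be selected twice.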

### 4. Eating, Shopping and Public transportation facility around cinema
The recommended cinema location needs to have many eating and shopping venues nearby. Convenient public transport is also required.  
I can use the FourSquare API to find these venues around each location. 

A 5-minute walk covers about 500 m, so I think 500 m is a suitable radius for searching nearby venues.
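
To make the distance reasoning concrete, here is a small sketch that computes the great-circle (haversine) distance between two points and converts it to walking time. The walking speed of 100 m per minute is simply the assumption implied by "500 m in 5 minutes". The coordinates are those of the first cinema in the list and of one restaurant that appears in the FourSquare response later in this section. `haversine_m` is a hypothetical helper, not part of the notebook.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6371000  # mean Earth radius in metres
WALKING_SPEED_M_PER_MIN = 100  # assumption: 500 m in 5 minutes (6 km/h)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


cinema_xy = (48.858690, 2.353590)                  # Luminor Hôtel de Ville
venue_xy = (48.85783014865521, 2.354572839448655)  # Les Marronniers

distance = haversine_m(*cinema_xy, *venue_xy)
minutes = distance / WALKING_SPEED_M_PER_MIN
print(round(distance), 'm, about', round(minutes, 1), 'min on foot')
```

This is a straight-line distance. The actual walking route along streets is usually somewhat longer, so the 500 m radius is an optimistic bound on a 5-minute walk.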

The following categories will be used to find the target venues. Full list of categories: https://developer.foursquare.com/docs/resources/categories

In [30]:
print('Use the first cinema "{}" in the list as example to explore venues nearby'.format(cinemas_final['name'][0]))

Use the first cinema "Luminor Hôtel de Ville" in the list as example to explore venues nearby


In [31]:
fs_categories = {
    'Food': '4d4b7105d754a06374d81259',
    'Shop & Service': '4d4b7105d754a06378d81259',
    'Bus Stop': '52f2ab2ebcbc57f1066b8b4f',
    'Metro Station': '4bf58dd8d48988d1fd931735',
    'Nightlife Spot': '4d4b7105d754a06376d81259',
    'Arts & Entertainment': '4d4b7104d754a06370d81259'
}

In [32]:
', '.join([ cat for cat in fs_categories])

'Food, Shop & Service, Bus Stop, Metro Station, Nightlife Spot, Arts & Entertainment'

In [33]:
cinema = cinemas_final.loc[0]

In [34]:
from six.moves.urllib import parse
# pandas >= 1.0 exposes json_normalize at the top level (pandas.io.json.json_normalize is deprecated)
from pandas import json_normalize

In [36]:
CLIENT_ID = '<Your_foursquare_client_id>'
CLIENT_SECRET = '<Your_foursquare_secret_key>'
VERSION = '20181104'
LATITUDE,LONGITUDE = cinema['latitude'],cinema['longitude']
RADIUS = 500

In [37]:
http = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&categoryId={}'

In [None]:
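# category to search for: 'Food' (the response below lists restaurants)
CATID = fs_categories['Food']
url = http.format(
    CLIENT_ID,
    CLIENT_SECRET,
    VERSION,
    LATITUDE,
    LONGITUDE,
    RADIUS,
    CATID
)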
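
As an alternative to filling a format string by position, the query string can be built from a dictionary with the standard library, which also escapes special characters. This is only a sketch: `build_search_url` is a hypothetical helper, the credentials are placeholders, and the parameter names are the ones already used in the URL template above.

```python
from urllib.parse import urlencode


def build_search_url(client_id, client_secret, version, lat, lon, radius, category_id):
    """Build a venues/search URL with URL-encoded query parameters."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'v': version,
        'll': '{},{}'.format(lat, lon),
        'radius': radius,
        'categoryId': category_id,
    }
    return 'https://api.foursquare.com/v2/venues/search?' + urlencode(params)


# Placeholder credentials; the category id is the 'Food' entry of fs_categories.
url_example = build_search_url('<id>', '<secret>', '20181104',
                               48.858690, 2.353590, 500,
                               '4d4b7105d754a06374d81259')
print(url_example)
```

Named parameters make it harder to swap two values by accident than a seven-slot positional format string.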

In [41]:
resp = requests.get(url).json()
resp['response']['venues']

[{'id': '4b2e4a89f964a520fcdd24e3',
  'name': 'Les Marronniers',
  'location': {'address': '18 rue des Archives',
   'lat': 48.85783014865521,
   'lng': 2.354572839448655,
   'labeledLatLngs': [{'label': 'display',
     'lat': 48.85783014865521,
     'lng': 2.354572839448655}],
   'distance': 119,
   'postalCode': '75004',
   'cc': 'FR',
   'city': 'Paris',
   'state': 'Île-de-France',
   'country': 'France',
   'formattedAddress': ['18 rue des Archives', '75004 Paris', 'France']},
  'categories': [{'id': '4bf58dd8d48988d10c941735',
    'name': 'French Restaurant',
    'pluralName': 'French Restaurants',
    'shortName': 'French',
    'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/food/french_',
     'suffix': '.png'},
    'primary': True}],
  'referralId': 'v-1542542513',
  'hasPerk': False},
 {'id': '5293ae7d11d2fba382d9f652',
  'name': 'Benedict',
  'location': {'address': '19 rue Sainte-Croix-de-la-Bretonnerie',
   'lat': 48.85820815365001,
   'lng': 2.3560811411196494,

In [42]:
df__ = json_normalize(resp['response']['venues'])

In [43]:
df__.columns

Index(['categories', 'hasPerk', 'id', 'location.address', 'location.cc',
       'location.city', 'location.country', 'location.crossStreet',
       'location.distance', 'location.formattedAddress',
       'location.labeledLatLngs', 'location.lat', 'location.lng',
       'location.neighborhood', 'location.postalCode', 'location.state',
       'name', 'referralId', 'venuePage.id'],
      dtype='object')

In [44]:
# Define a function to search nearby information and convert the result as dataframe
def venues_nearby(latitude, longitude, category, verbose=True):    
    url = http.format(
        CLIENT_ID,
        CLIENT_SECRET,
        VERSION,
        latitude,
        longitude,
        RADIUS,
        fs_categories[category]
    )
    results = requests.get(url).json()['response']
    df = json_normalize(results['venues'])
    cols = ['name','latitude','longitude']    
    if( len(df) == 0 ):        
        df = pd.DataFrame(columns=cols)
    else:        
        df = df[['name','location.lat','location.lng']]
        df.columns = cols
    if( verbose ):
        print('{} "{}" venues are found within {}m of location'.format(len(df), category, RADIUS))
    return df
    

#### Find Metro Station around the cinema

In [45]:
venues_nearby(cinema['latitude'], cinema['longitude'], 'Metro Station').head()

4 "Metro Station" venues are found within 500m of location


Unnamed: 0,name,latitude,longitude
0,"Métro Hôtel de Ville [1,11]",48.857353,2.351761
1,"Métro Châtelet [1,4,7,11,14]",48.859099,2.346813
2,Métro Pont Marie [7],48.853872,2.356778
3,Métro Rambuteau [11],48.861298,2.353106


#### Find Bus Stop around the cinema

In [46]:
venues_nearby(cinema['latitude'], cinema['longitude'], 'Bus Stop').head()

15 "Bus Stop" venues are found within 500m of location


Unnamed: 0,name,latitude,longitude
0,"Arrêt Hôtel de Ville [72,74]",48.85688,2.350751
1,Arrêt Archives-Haudriettes (29),48.86118,2.358411
2,Arrêt La Verrerie [75],48.857859,2.354724
3,"Arrêt Hôtel de Ville [67,69,76,96]",48.857162,2.35292
4,"Arrêt Centre Georges Pompidou [38,47,75,N12,N1...",48.860359,2.352556


#### Find eating places around the cinema

In [47]:
venues_nearby(cinema['latitude'], cinema['longitude'], 'Food').head()

30 "Food" venues are found within 500m of location


Unnamed: 0,name,latitude,longitude
0,Les Marronniers,48.85783,2.354573
1,Benedict,48.858208,2.356081
2,La Caféothèque de Paris,48.854197,2.355714
3,KFC,48.86074,2.349063
4,McDonald's,48.857928,2.351477


#### Find arts and entertainment places around the cinema

In [48]:
venues_nearby(cinema['latitude'], cinema['longitude'], 'Arts & Entertainment').head()

30 "Arts & Entertainment" venues are found within 500m of location


Unnamed: 0,name,latitude,longitude
0,Luminor Hôtel de Ville,48.858654,2.353498
1,Centre Pompidou – Musée National d'Art Moderne,48.860445,2.35249
2,Tour Saint-Jacques,48.858031,2.348875
3,Le Renard,48.858204,2.351839
4,Lafayette Anticipations,48.859167,2.354866


With the above data, I can build a **content-based recommender system** to solve the problem.  

Combining FourSquare API counts of the different venue types (Food, Transport, Night Life) with the Paris cinema list, a **cinema nearby-venues matrix** can be built. The 5 popular cinemas in Paris serve as the **profile**, which is combined with the cinema nearby-venues matrix to become a **weighted matrix of favorite cinemas**.

The weighted matrix can be applied to the **5 possible locations with venue information** to generate a ranking. **The top one** on the ranking list can be recommended to my friend.
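
The ranking idea can be sketched in a few lines. All venue counts below are HYPOTHETICAL numbers made up for illustration and are not FourSquare results. The choice of averaging the popular cinemas and scaling the weights to sum to 1 is one possible way to build the profile, assumed here, and the final methodology may differ.

```python
# Feature order: number of nearby venues per category within the search radius.
features = ['Food', 'Shop & Service', 'Bus Stop', 'Metro Station']

# HYPOTHETICAL counts, for illustration only (not real FourSquare data).
popular = {
    'Popular A': [30, 25, 10, 3],
    'Popular B': [28, 30, 12, 4],
}
candidates = {
    'Location 1': [30, 10, 15, 4],
    'Location 2': [12, 5, 3, 1],
    'Location 3': [25, 28, 9, 2],
}

# Profile: mean count per category over the popular cinemas, scaled to sum to 1.
n_popular = len(popular)
mean_counts = [sum(row[j] for row in popular.values()) / n_popular
               for j in range(len(features))]
total = sum(mean_counts)
profile = [m / total for m in mean_counts]

# Score: weighted sum of each candidate's counts, using the profile as weights.
scores = {name: sum(w * c for w, c in zip(profile, row))
          for name, row in candidates.items()}

# Highest score first.
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

With these made-up numbers the profile puts most weight on food and shopping, so a candidate with many shops outranks one with better transport but fewer shops.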


## Section 3. Methodology 

Methodology section which represents the main component of the report where you discuss and describe any exploratory data analysis that you did, any inferential statistical testing that you performed, and what machine learnings were used and why.

TBD

## Section 4. Results 

Results section where you discuss the results.

TBD

## Section 5. Discussion 

Discussion section where you discuss any observations you noted and any recommendations you can make based on the results.

TBD

## Section 6. Conclusion 

Conclusion section where you conclude the report.

TBD