# Capstone Project - The Battle of the Neighborhoods (Week 2)
### Applied Data Science Capstone by IBM/Coursera

## Table of contents
* [Introduction: Business Problem](#introduction)
* [Data](#data)
* [Methodology](#methodology)
* [Analysis](#analysis)
* [Results and Discussion](#results)
* [Conclusion](#conclusion)



## Introduction: Business Problem <a name="introduction"></a>

This report compares the cinema infrastructure of two cities in the Belgorod region, Belgorod and Stary Oskol:
- ***Belgorod*** is the administrative center of the region, with a population of 400,000 and an area of 151.3 sq. km
- ***Stary Oskol*** is an industrial city with a population of 224,000 and an area of 200.8 sq. km

A preliminary assessment requires a general picture of how many cinemas each city has and where they are located, and should also identify any entertainment venues that have not previously been accounted for.
The resulting data show how developed this type of entertainment service is in each city, how well current demand is met, and where there is room for growth. The study is intended to help large movie theater chains decide on a strategy for their presence in the region.


## Data <a name="data"></a>

Based on the definition of our problem, the factor that will influence our decision is:
* the number of existing movie theaters in each city (any type of cinema)

The following data sources will be needed to extract or generate the required information:
* the centers of candidate areas will be generated algorithmically, and the approximate addresses of those centers will be obtained using the **Foursquare API**


### Neighborhood Candidates

Let's create latitude and longitude coordinates for the centroids of our candidate neighborhoods. We will create a grid of cells covering our area of interest, which extends approximately 10 kilometers around the city centers of ***Belgorod*** and ***Stary Oskol***.
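The grid of candidate centers described above could be generated along the following lines. This is only a sketch: the function name `grid_centers`, the 1 km spacing and the 5 km half-width are assumptions chosen for illustration, and the conversion from metres to degrees uses a flat-Earth approximation that is reasonable only over a few kilometres.

```python
from math import cos, radians

def grid_centers(center_lat, center_lng, half_width_m=5000, step_m=1000):
    """Return (lat, lng) centers of a square grid around a point.

    Assumes about 111,320 m per degree of latitude and scales longitude
    by cos(latitude); adequate for a span of a few kilometres.
    """
    m_per_deg_lat = 111320.0
    m_per_deg_lng = m_per_deg_lat * cos(radians(center_lat))
    steps = range(-half_width_m, half_width_m + 1, step_m)
    return [
        (center_lat + dy / m_per_deg_lat, center_lng + dx / m_per_deg_lng)
        for dy in steps
        for dx in steps
    ]

# Example center: an approximate location for Belgorod (illustrative values)
cells = grid_centers(50.5955595, 36.5873394)
```

Each returned pair could then be used as the `ll` parameter of a separate venue search.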

Let's first find the latitude and longitude of the ***Belgorod*** and ***Stary Oskol*** city centers, using each city name as the address and the **Nominatim** geocoder from geopy.

In [1]:
import requests # library to handle requests
import pandas as pd # library for data analysis
import numpy as np # library to handle data in a vectorized manner
import random # library for random number generation


#!pip install geopy
from geopy.geocoders import Nominatim # module to convert an address into latitude and longitude values

# libraries for displaying images
from IPython.display import Image 
from IPython.core.display import HTML 
    
# transform JSON into a flat pandas dataframe
from pandas import json_normalize # pandas.io.json.json_normalize is deprecated since pandas 1.0


#! pip install folium==0.5.0
import folium # plotting library


### Define Foursquare Credentials and Version
Make sure that you have created a Foursquare developer account and have your credentials handy.
To obtain an access token, follow these steps:

1. Go to your "App Settings" page on the developer console of Foursquare.com.
2. Set the "Redirect URL" under "Web Addresses" to https://www.google.com
3. Paste the following URL into your web browser (replace YOUR_CLIENT_ID with your actual client ID): https://foursquare.com/oauth2/authenticate?client_id=YOUR_CLIENT_ID&response_type=code&redirect_uri=https://www.google.com

   This should redirect you to a Google page requesting permission to make the connection.
4. Accept, then look at the URL in your web browser and note the CODE part, which is used in step 5. It should look like https://www.google.com/?code=CODE
5. Copy the code value from the previous step and paste the following URL into your web browser (replace the placeholders with actual values): https://foursquare.com/oauth2/access_token?client_id=YOUR_CLIENT_ID&client_secret=YOUR_CLIENT_SECRET&grant_type=authorization_code&redirect_uri=https://www.google.com&code=CODE

   This should lead you to a page that gives you your access token.
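Credentials written directly into a notebook cell are exposed to everyone the notebook is shared with. One possible alternative, sketched here, is to read them from environment variables. The variable names `FOURSQUARE_CLIENT_ID`, `FOURSQUARE_CLIENT_SECRET` and `FOURSQUARE_ACCESS_TOKEN`, as well as the helper name, are assumptions made for this example and are not required by Foursquare.

```python
import os

def load_foursquare_credentials(env=os.environ):
    """Return (client_id, client_secret, access_token) read from a mapping.

    By default the mapping is the process environment; the variable
    names are hypothetical and chosen only for this sketch.
    """
    client_id = env.get("FOURSQUARE_CLIENT_ID")
    client_secret = env.get("FOURSQUARE_CLIENT_SECRET")
    if not client_id or not client_secret:
        raise RuntimeError(
            "Set FOURSQUARE_CLIENT_ID and FOURSQUARE_CLIENT_SECRET first"
        )
    # The access token is optional, mirroring the empty ACCESS_TOKEN used in this notebook
    access_token = env.get("FOURSQUARE_ACCESS_TOKEN", "")
    return client_id, client_secret, access_token
```

With the variables set in the shell that launches Jupyter, `CLIENT_ID, CLIENT_SECRET, ACCESS_TOKEN = load_foursquare_credentials()` would fill in the same names that the cell below defines.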

In [2]:
CLIENT_ID = 'YOUR_CLIENT_ID' # your Foursquare ID (redacted; do not publish real credentials)
CLIENT_SECRET = 'YOUR_CLIENT_SECRET' # your Foursquare Secret (redacted)
ACCESS_TOKEN = '' # your FourSquare Access Token
VERSION = '20180604'
LIMIT = 30


We start with the center of Belgorod, converting the city name to latitude and longitude coordinates.
To create an instance of the geocoder, we need to define a user_agent. We will name our agent foursquare_agent, as shown below.

In [6]:
address_B = 'Belgorod'

geolocator = Nominatim(user_agent="foursquare_agent")
location = geolocator.geocode(address_B)
latitude_B = location.latitude
longitude_B = location.longitude
print(latitude_B, longitude_B)

50.5955595 36.5873394


In [8]:
address_SO = 'Stary Oskol'

geolocator = Nominatim(user_agent="foursquare_agent")
location = geolocator.geocode(address_SO)
latitude_SO = location.latitude
longitude_SO = location.longitude
print(latitude_SO, longitude_SO)

51.298038 37.833202
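The two geocoding cells above repeat the same steps and would fail with an `AttributeError` if the geocoder found nothing, since geopy's `geocode` returns `None` in that case. A small helper, sketched below, makes that failure explicit. The name `geocode_city` is hypothetical, and the geocoder is passed in as a callable so the helper does not depend on a network connection.

```python
def geocode_city(address, geocode):
    """Return (latitude, longitude) for an address, or raise if nothing is found.

    `geocode` is any callable that behaves like geopy's
    Nominatim(...).geocode: it returns an object with `.latitude`
    and `.longitude` attributes, or None when there is no match.
    """
    location = geocode(address)
    if location is None:
        raise ValueError(f"No geocoding result for {address!r}")
    return location.latitude, location.longitude
```

In this notebook it could be called as `geocode_city('Belgorod', geolocator.geocode)`.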


Now let's define a query to search for movie theaters within 10,000 metres of each city center.

In [9]:
search_query = 'кинотеатр' # Russian for "movie theater"
radius = 10000
print(search_query + ' .... OK!')

кинотеатр .... OK!


In [12]:
url_B = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&oauth_token={}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude_B, longitude_B,ACCESS_TOKEN, VERSION, search_query, radius, LIMIT)
url_B

'https://api.foursquare.com/v2/venues/search?client_id=YOUR_CLIENT_ID&client_secret=YOUR_CLIENT_SECRET&ll=50.5955595,36.5873394&oauth_token=&v=20180604&query=кинотеатр&radius=10000&limit=30'

In [13]:
url_SO = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&oauth_token={}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude_SO, longitude_SO,ACCESS_TOKEN, VERSION, search_query, radius, LIMIT)
url_SO

'https://api.foursquare.com/v2/venues/search?client_id=YOUR_CLIENT_ID&client_secret=YOUR_CLIENT_SECRET&ll=51.298038,37.833202&oauth_token=&v=20180604&query=кинотеатр&radius=10000&limit=30'
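The URLs above place the Cyrillic search term directly in the query string and rely on `requests` to encode it. A possible variant, sketched here, builds the same query with `urllib.parse.urlencode` so that every parameter is percent-encoded explicitly. The function name `build_search_url` is an assumption made for this example. The endpoint and parameter names are the ones already used in this notebook.

```python
from urllib.parse import urlencode

def build_search_url(client_id, client_secret, lat, lng, version,
                     query, radius, limit, oauth_token=""):
    """Build a venues/search URL with percent-encoded query parameters."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "ll": f"{lat},{lng}",
        "oauth_token": oauth_token,
        "v": version,
        "query": query,    # non-ASCII text is encoded as UTF-8 percent escapes
        "radius": radius,
        "limit": limit,
    }
    return "https://api.foursquare.com/v2/venues/search?" + urlencode(params)

# Placeholder credentials; the remaining values match the cells above
example_url = build_search_url("YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET",
                               50.5955595, 36.5873394, "20180604",
                               "кинотеатр", 10000, 30)
```

A single helper like this would also remove the duplicated format string for the two cities.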

In [14]:
results_B = requests.get(url_B).json()
results_B

{'meta': {'code': 200, 'requestId': '60eee068f180a40a3d92ace2'},
 'response': {'venues': [{'id': '51e38132498ed57c5bd225a3',
    'name': 'ост. Кинотеатр Победа',
    'location': {'address': 'ул. Преображенская',
     'lat': 50.59922680114017,
     'lng': 36.58287512651475,
     'labeledLatLngs': [{'label': 'display',
       'lat': 50.59922680114017,
       'lng': 36.58287512651475}],
     'distance': 515,
     'cc': 'RU',
     'city': 'Белгород',
     'state': 'Белгородскя обл.',
     'country': 'Россия',
     'formattedAddress': ['ул. Преображенская', 'Белгород', 'Россия']},
    'categories': [{'id': '52f2ab2ebcbc57f1066b8b4f',
      'name': 'Bus Stop',
      'pluralName': 'Bus Stops',
      'shortName': 'Bus Stop',
      'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/travel/busstation_',
       'suffix': '.png'},
      'primary': True}],
    'referralId': 'v-1626267752',
    'hasPerk': False},
   {'id': '4fa7e362e4b0e4baa4665267',
    'name': '5D кинотеатр',
    'location

In [15]:
results_SO = requests.get(url_SO).json()
results_SO

{'meta': {'code': 200, 'requestId': '60eee0908f64026453af2c82'},
 'response': {'venues': [{'id': '4ccd86c37c2ff04d87b69f7e',
    'name': 'Быль',
    'location': {'address': 'мк-н Жукова, 38',
     'lat': 51.30884952097854,
     'lng': 37.890751017389924,
     'labeledLatLngs': [{'label': 'display',
       'lat': 51.30884952097854,
       'lng': 37.890751017389924}],
     'distance': 4182,
     'cc': 'RU',
     'city': 'Старый Оскол',
     'state': 'Белгородская обл.',
     'country': 'Россия',
     'formattedAddress': ['мк-н Жукова, 38', 'Старый Оскол', 'Россия']},
    'categories': [{'id': '4bf58dd8d48988d17f941735',
      'name': 'Movie Theater',
      'pluralName': 'Movie Theaters',
      'shortName': 'Movie Theater',
      'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/arts_entertainment/movietheater_',
       'suffix': '.png'},
      'primary': True}],
    'referralId': 'v-1626267792',
    'hasPerk': False}]}}
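Before the venues are converted to dataframes, it can help to confirm that the request succeeded and to look at only the name and primary category of each venue. The sketch below does that for a response shaped like the two outputs above. The function name `summarize_venues` is hypothetical, and treating any `meta.code` other than 200 as a failure is an assumption of this example.

```python
def summarize_venues(results):
    """Return [(venue name, primary category name or None), ...].

    `results` is the parsed JSON of a venues/search response,
    with the same structure as results_B and results_SO.
    """
    meta = results.get("meta", {})
    if meta.get("code") != 200:
        raise RuntimeError(f"Foursquare request failed: {meta}")
    summary = []
    for venue in results["response"]["venues"]:
        categories = venue.get("categories", [])
        # Pick the category flagged as primary, if there is one
        primary = next((c["name"] for c in categories if c.get("primary")), None)
        summary.append((venue["name"], primary))
    return summary
```

Calling `summarize_venues(results_B)` would show at a glance that the first Belgorod result is a bus stop rather than a cinema.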

In [17]:
# assign relevant part of JSON to venues
venues_B = results_B['response']['venues']

# transform venues into a dataframe
dataframe_B = json_normalize(venues_B)
dataframe_B.head(100)


Unnamed: 0,id,name,categories,referralId,hasPerk,location.address,location.lat,location.lng,location.labeledLatLngs,location.distance,location.cc,location.city,location.state,location.country,location.formattedAddress,location.crossStreet,location.postalCode
0,51e38132498ed57c5bd225a3,ост. Кинотеатр Победа,"[{'id': '52f2ab2ebcbc57f1066b8b4f', 'name': 'B...",v-1626267752,False,ул. Преображенская,50.599227,36.582875,"[{'label': 'display', 'lat': 50.59922680114017...",515,RU,Белгород,Белгородскя обл.,Россия,"[ул. Преображенская, Белгород, Россия]",,
1,4fa7e362e4b0e4baa4665267,5D кинотеатр,"[{'id': '4bf58dd8d48988d17e941735', 'name': 'I...",v-1626267752,False,"ТРК ""Рио"", 2 этаж",50.641786,36.572006,"[{'label': 'display', 'lat': 50.64178634280017...",5258,RU,Белгород,Белгородская обл.,Россия,"[ТРК ""Рио"", 2 этаж (просп. Богдана Хмельницког...","просп. Богдана Хмельницкого, 164",
2,5dfba4312f4b3400084b4b73,Кинотеатр Спутник,"[{'id': '4bf58dd8d48988d17f941735', 'name': 'M...",v-1626267752,False,"Магистральная, 2 в",50.57297,36.538273,"[{'label': 'display', 'lat': 50.57297, 'lng': ...",4283,RU,Белгород,Белгородская обл.,Россия,"[Магистральная, 2 в, 308019, Белгород, Россия]",,308019.0
3,4e42c933fa76a4284c1c8671,Победа,"[{'id': '4bf58dd8d48988d17f941735', 'name': 'M...",v-1626267752,False,"ул. 50 лет Белгородской области, 8Б",50.598902,36.585835,"[{'label': 'display', 'lat': 50.59890173997732...",386,RU,Белгород,Белгородская обл.,Россия,"[ул. 50 лет Белгородской области, 8Б (ул. Прео...",ул. Преображенская,
4,4bfd60fdbf6576b01830adb8,Сити Молл Белгородский,"[{'id': '4bf58dd8d48988d1fd941735', 'name': 'S...",v-1626267752,False,мкрн. Пригородный,50.55229,36.571185,"[{'label': 'display', 'lat': 50.55228950457049...",4950,RU,п. Дубовое,Белгородская обл.,Россия,"[мкрн. Пригородный (ул. Щорса, 64), 308501, п....","ул. Щорса, 64",308501.0


In [19]:
dataframe_B = dataframe_B.drop(index=[0]) # drop row 0: a bus stop named after the cinema, not a cinema itself

In [20]:
dataframe_B.head(100)

Unnamed: 0,id,name,categories,referralId,hasPerk,location.address,location.lat,location.lng,location.labeledLatLngs,location.distance,location.cc,location.city,location.state,location.country,location.formattedAddress,location.crossStreet,location.postalCode
1,4fa7e362e4b0e4baa4665267,5D кинотеатр,"[{'id': '4bf58dd8d48988d17e941735', 'name': 'I...",v-1626267752,False,"ТРК ""Рио"", 2 этаж",50.641786,36.572006,"[{'label': 'display', 'lat': 50.64178634280017...",5258,RU,Белгород,Белгородская обл.,Россия,"[ТРК ""Рио"", 2 этаж (просп. Богдана Хмельницког...","просп. Богдана Хмельницкого, 164",
2,5dfba4312f4b3400084b4b73,Кинотеатр Спутник,"[{'id': '4bf58dd8d48988d17f941735', 'name': 'M...",v-1626267752,False,"Магистральная, 2 в",50.57297,36.538273,"[{'label': 'display', 'lat': 50.57297, 'lng': ...",4283,RU,Белгород,Белгородская обл.,Россия,"[Магистральная, 2 в, 308019, Белгород, Россия]",,308019.0
3,4e42c933fa76a4284c1c8671,Победа,"[{'id': '4bf58dd8d48988d17f941735', 'name': 'M...",v-1626267752,False,"ул. 50 лет Белгородской области, 8Б",50.598902,36.585835,"[{'label': 'display', 'lat': 50.59890173997732...",386,RU,Белгород,Белгородская обл.,Россия,"[ул. 50 лет Белгородской области, 8Б (ул. Прео...",ул. Преображенская,
4,4bfd60fdbf6576b01830adb8,Сити Молл Белгородский,"[{'id': '4bf58dd8d48988d1fd941735', 'name': 'S...",v-1626267752,False,мкрн. Пригородный,50.55229,36.571185,"[{'label': 'display', 'lat': 50.55228950457049...",4950,RU,п. Дубовое,Белгородская обл.,Россия,"[мкрн. Пригородный (ул. Щорса, 64), 308501, п....","ул. Щорса, 64",308501.0
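Dropping the bus stop by its row index works for this particular response, but it would remove the wrong row if Foursquare returned the venues in a different order. A sketch of an alternative is to filter the raw venue list by category name before it is passed to `json_normalize`. The function name `drop_by_category` and its default exclusion list are assumptions for illustration. With the default list it keeps the same four Belgorod venues as the index-based drop.

```python
def drop_by_category(venues, excluded=("Bus Stop",)):
    """Keep venues whose first listed category is not in `excluded`."""
    kept = []
    for venue in venues:
        categories = venue.get("categories", [])
        first = categories[0]["name"] if categories else None
        if first not in excluded:
            kept.append(venue)
    return kept
```

For example, `venues_B = drop_by_category(results_B['response']['venues'])` could replace the later `drop(index=[0])` step.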


In [18]:
# assign relevant part of JSON to venues
venues_SO = results_SO['response']['venues']

# transform venues into a dataframe
dataframe_SO = json_normalize(venues_SO)
dataframe_SO.head(100)


Unnamed: 0,id,name,categories,referralId,hasPerk,location.address,location.lat,location.lng,location.labeledLatLngs,location.distance,location.cc,location.city,location.state,location.country,location.formattedAddress
0,4ccd86c37c2ff04d87b69f7e,Быль,"[{'id': '4bf58dd8d48988d17f941735', 'name': 'M...",v-1626267792,False,"мк-н Жукова, 38",51.30885,37.890751,"[{'label': 'display', 'lat': 51.30884952097854...",4182,RU,Старый Оскол,Белгородская обл.,Россия,"[мк-н Жукова, 38, Старый Оскол, Россия]"


In [25]:
# keep only the venue name, category, location fields and id
filtered_columns_B = ['name', 'categories'] + [col for col in dataframe_B.columns if col.startswith('location.')] + ['id']
dataframe_filtered_B = dataframe_B.loc[:, filtered_columns_B].copy() # copy to avoid SettingWithCopyWarning below

filtered_columns_SO = ['name', 'categories'] + [col for col in dataframe_SO.columns if col.startswith('location.')] + ['id']
dataframe_filtered_SO = dataframe_SO.loc[:, filtered_columns_SO].copy()

# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']

    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

# replace the list of category dicts in each row with the name of the first category
dataframe_filtered_B['categories'] = dataframe_filtered_B.apply(get_category_type, axis=1)
dataframe_filtered_SO['categories'] = dataframe_filtered_SO.apply(get_category_type, axis=1)

# clean column names by keeping only the last term
dataframe_filtered_B.columns = [column.split('.')[-1] for column in dataframe_filtered_B.columns]
dataframe_filtered_SO.columns = [column.split('.')[-1] for column in dataframe_filtered_SO.columns]

dataframe_filtered_B


Unnamed: 0,name,categories,address,lat,lng,labeledLatLngs,distance,cc,city,state,country,formattedAddress,crossStreet,postalCode,id
1,5D кинотеатр,Indie Movie Theater,"ТРК ""Рио"", 2 этаж",50.641786,36.572006,"[{'label': 'display', 'lat': 50.64178634280017...",5258,RU,Белгород,Белгородская обл.,Россия,"[ТРК ""Рио"", 2 этаж (просп. Богдана Хмельницког...","просп. Богдана Хмельницкого, 164",,4fa7e362e4b0e4baa4665267
2,Кинотеатр Спутник,Movie Theater,"Магистральная, 2 в",50.57297,36.538273,"[{'label': 'display', 'lat': 50.57297, 'lng': ...",4283,RU,Белгород,Белгородская обл.,Россия,"[Магистральная, 2 в, 308019, Белгород, Россия]",,308019.0,5dfba4312f4b3400084b4b73
3,Победа,Movie Theater,"ул. 50 лет Белгородской области, 8Б",50.598902,36.585835,"[{'label': 'display', 'lat': 50.59890173997732...",386,RU,Белгород,Белгородская обл.,Россия,"[ул. 50 лет Белгородской области, 8Б (ул. Прео...",ул. Преображенская,,4e42c933fa76a4284c1c8671
4,Сити Молл Белгородский,Shopping Mall,мкрн. Пригородный,50.55229,36.571185,"[{'label': 'display', 'lat': 50.55228950457049...",4950,RU,п. Дубовое,Белгородская обл.,Россия,"[мкрн. Пригородный (ул. Щорса, 64), 308501, п....","ул. Щорса, 64",308501.0,4bfd60fdbf6576b01830adb8


In [26]:
dataframe_filtered_SO

Unnamed: 0,name,categories,address,lat,lng,labeledLatLngs,distance,cc,city,state,country,formattedAddress,id
0,Быль,Movie Theater,"мк-н Жукова, 38",51.30885,37.890751,"[{'label': 'display', 'lat': 51.30884952097854...",4182,RU,Старый Оскол,Белгородская обл.,Россия,"[мк-н Жукова, 38, Старый Оскол, Россия]",4ccd86c37c2ff04d87b69f7e


In [27]:
dataframe_filtered_B.name

1              5D кинотеатр
2         Кинотеатр Спутник
3                    Победа
4    Сити Молл Белгородский
Name: name, dtype: object

In [28]:
dataframe_filtered_SO.name

0    Быль
Name: name, dtype: object

In [33]:
venues_map_B = folium.Map(location=[latitude_B, longitude_B], zoom_start=13) # generate map centred on the city center

# add a red circle marker to represent the center of the city
folium.CircleMarker(
    [latitude_B, longitude_B],
    radius=10,
    color='red',
    popup='Belgorod',
    fill = True,
    fill_color = 'red',
    fill_opacity = 0.6
).add_to(venues_map_B)

# add the movie theaters as blue circle markers
for lat, lng, label in zip(dataframe_filtered_B.lat, dataframe_filtered_B.lng, dataframe_filtered_B.categories):
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        color='blue',
        popup=label,
        fill = True,
        fill_color='blue',
        fill_opacity=0.6
    ).add_to(venues_map_B)

venues_map_SO = folium.Map(location=[latitude_SO, longitude_SO], zoom_start=13) # generate map centred on the city center

folium.CircleMarker(
    [latitude_SO, longitude_SO],
    radius=10,
    color='red',
    popup='Stary Oskol',
    fill = True,
    fill_color = 'red',
    fill_opacity = 0.6
).add_to(venues_map_SO)

# add the movie theaters as blue circle markers
for lat, lng, label in zip(dataframe_filtered_SO.lat, dataframe_filtered_SO.lng, dataframe_filtered_SO.categories):
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        color='blue',
        popup=label,
        fill = True,
        fill_color='blue',
        fill_opacity=0.6
    ).add_to(venues_map_SO)
# display map
venues_map_B

In [34]:
venues_map_SO

## Methodology <a name="methodology"></a>

This project collected data on cinemas in the two largest cities of the Belgorod region.
Venues were searched within a radius of 10 km from each nominal city center. The results obtained from Foursquare were then displayed on maps for further analysis.
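The 10 km radius is applied by Foursquare on its side, and each venue in the response carries a `distance` field. A local re-check is possible with the haversine formula, sketched below under the assumption of a spherical Earth with a mean radius of 6,371 km. The function names are hypothetical. For the Belgorod center and the first venue in the Belgorod response, the result should be close to the 515 m that Foursquare reports.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6371000  # mean Earth radius in metres (spherical approximation)

def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lng2 - lng1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))

def within_radius(center, point, radius_m=10000):
    """True if `point` (lat, lng) lies within radius_m of `center` (lat, lng)."""
    return haversine_m(*center, *point) <= radius_m

belgorod_center = (50.5955595, 36.5873394)
first_venue = (50.59922680114017, 36.58287512651475)
distance_m = haversine_m(*belgorod_center, *first_venue)
```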

## Analysis <a name="analysis"></a>

The data show the level of cinema presence in Belgorod and Stary Oskol. Belgorod has roughly one cinema per 100,000 residents, while Stary Oskol has one per roughly 200,000. In Belgorod, the cinemas are also clearly unevenly distributed across the city.

## Results and Discussion <a name="results"></a>

Based on this analysis, the cinema network in Belgorod and Stary Oskol is underdeveloped and, in Belgorod, unevenly distributed. Assuming an average statistical requirement of 1 cinema per 20,000 people, both cities need a significant expansion of their cinema networks. At that rate, current supply meets about 20% of demand in Belgorod (one cinema per 100,000 people) and about 10% in Stary Oskol (one per roughly 200,000). Stary Oskol is the more promising market of the two, since its cinema industry is much less developed.
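The ratios in this section can be rechecked from the populations given in the introduction and the number of venues in the filtered tables. The sketch below assumes 4 venues for Belgorod (the filtered table, including the shopping mall row) and 1 for Stary Oskol, so its output depends on those counts. The function name `coverage` is hypothetical.

```python
def coverage(population, cinemas, people_per_cinema_needed=20000):
    """Return (people per existing cinema, share of the required cinemas that exist)."""
    required = population / people_per_cinema_needed
    return population / cinemas, cinemas / required

# (population, number of venues found) per city; counts are assumptions taken from the tables above
cities = {"Belgorod": (400000, 4), "Stary Oskol": (224000, 1)}

for city, (population, cinemas) in cities.items():
    per_cinema, share = coverage(population, cinemas)
    print(f"{city}: one cinema per {per_cinema:,.0f} people, {share:.0%} of the requirement")
```

Changing the venue counts or the requirement of 20,000 people per cinema shows how sensitive the conclusion is to those inputs.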

## Conclusion <a name="conclusion"></a>

The purpose of this project was to assess how developed the cinema network is, in order to give stakeholders a visual picture of the prospects for opening new cinemas. Using Foursquare data, we identified the existing cinemas in the two cities under study (Belgorod and Stary Oskol) and mapped their locations.

The final decision on the optimal location of a new cinema will be made by stakeholders based on the specific characteristics of each candidate area, taking into account additional factors such as the attractiveness of each location (proximity to a park or water), noise level and proximity to main roads, property availability, prices, and the social and economic dynamics of each area.