
# The Capstone Project - The Battle of Neighborhoods

## Week 1: Report beginning

### 1. Description of the problem 
Choosing the right location for a new property, especially a hotel, is always a big challenge. As D. Trump, a luxury property magnate, once said: "When considering building a new property, there are three major factors that you always have to consider: Location, Location and of course Location!"

This data science project aims to find the best location for a new hotel in the $$$ price category in Dublin, Ireland. To do so, we will apply machine learning algorithms to Foursquare location data.

As the measure of location quality we will use the venue rating on Foursquare. Factors known to influence location quality include proximity to restaurants, museums, parks, monuments, shops and other points of interest (POI).

---

### 2. Description of the data and how it will be used to solve the problem

Foursquare will be the provider of location data. Below is the high-level process for preparing the data.

1. The Foursquare API call "search/ + keyword=hotel" will be used to find all existing hotels in Dublin. Then only hotels in the $$$ price category are kept, and the result is saved to a pandas DataFrame.

2. For every hotel in the DataFrame, the rating will be retrieved from Foursquare with the "venue/ + id" API call.

3. For all the hotels found, we add (via the "explore/" API call) nearby POI in the selected categories:

    *   Restaurants
    *   Museums
    *   Parks
    *   Monuments
    *   Shops

4. Then we filter out venues that have less impact on the rating of hotels in the high price category, keeping only the following:
    *   Restaurants (only with rating>4 and price_category>$$)
    *   Shops (only with price_category>$$)

5. Then we will merge all the data into one DataFrame (one hotel = one row) and, for every Dublin postcode, predict the rating of a hypothetical hotel built at that location.
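Steps 4 and 5 could be sketched roughly as follows. This is a minimal illustration rather than the notebook's actual implementation: the column names (`hotel_id`, `category`, `rating`, `price_tier`), the helper names and the sample rows are all hypothetical, `price_tier` is assumed to encode $ to $$$$ as 1 to 4, and categories other than restaurants and shops are assumed to pass through unfiltered.

```python
import pandas as pd

# Hypothetical POI table: one row per (hotel, nearby venue) pair.
poi = pd.DataFrame({
    "hotel_id":   ["h1", "h1", "h1", "h2", "h2", "h2"],
    "category":   ["Restaurant", "Restaurant", "Shop", "Museum", "Shop", "Restaurant"],
    "rating":     [4.5, 3.9, 4.0, 4.2, 3.0, 4.8],
    "price_tier": [3, 3, 2, None, 3, 2],  # assumed encoding: 1 = $, ..., 4 = $$$$
})

def filter_poi(df):
    """Step 4: keep restaurants with rating > 4 and price tier > 2,
    shops with price tier > 2, and all other categories unchanged."""
    is_restaurant = df["category"] == "Restaurant"
    is_shop = df["category"] == "Shop"
    keep_restaurant = is_restaurant & (df["rating"] > 4) & (df["price_tier"] > 2)
    keep_shop = is_shop & (df["price_tier"] > 2)
    keep_other = ~(is_restaurant | is_shop)
    return df[keep_restaurant | keep_shop | keep_other]

def count_poi_per_hotel(df):
    """Step 5: one row per hotel, one column per POI category (venue counts)."""
    return pd.crosstab(df["hotel_id"], df["category"])

filtered = filter_poi(poi)
features = count_poi_per_hotel(filtered)
print(features)
```

A hotel whose POI are all filtered out would be missing from `features`; reindexing with the full list of hotel ids and `fill_value=0` is one way to keep such rows.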

---

## Week 2:

### 1. Report
* Introduction with the business problem and interested parties

* Data and its source

* Methodology section, which represents the main component of the report, where you discuss and describe exploratory data analysis, inferential statistical testing (if any), and which machine learning methods were used and why.

* Results description

* Discussion section where you discuss any observations you noted and any recommendations you can make based on the results

* Conclusion section where you conclude the report

### 2. A link to your Notebook

### 3. Presentation or blogpost


### Example

* Report: https://cocl.us/coursera_capstone_report
* Notebook: https://cocl.us/coursera_capstone_notebook
* Presentation: https://cocl.us/coursera_capstone_presentation
* Blogpost: https://cocl.us/coursera_capstone_blogpost

# Preparing Data

1. Getting the list of hotels in Dublin (NAME, LATITUDE, LONGITUDE, RATING)
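As a rough sketch of the table this step aims for, the snippet below flattens a `venues/search` response into those four columns. The field names (`name`, `location.lat`, `location.lng`) follow the response printed further down in this notebook; the helper name `venues_to_frame` and the trimmed `sample` dictionary are hypothetical. The rating is left empty here, on the assumption that it is filled in afterwards by the per-venue call described in the plan.

```python
import numpy as np
import pandas as pd

COLUMNS = ["NAME", "LATITUDE", "LONGITUDE", "RATING"]

def venues_to_frame(results):
    """Flatten a Foursquare v2 venues/search response into a hotels table."""
    venues = results.get("response", {}).get("venues", [])
    rows = [
        {
            "NAME": v.get("name"),
            "LATITUDE": v.get("location", {}).get("lat"),
            "LONGITUDE": v.get("location", {}).get("lng"),
            "RATING": np.nan,  # assumption: filled later from the venue details call
        }
        for v in venues
    ]
    return pd.DataFrame(rows, columns=COLUMNS)

# Trimmed-down response with the same shape as the output shown in this notebook.
sample = {
    "meta": {"code": 200},
    "response": {
        "venues": [
            {
                "id": "4e524214a809cef5beb46355",
                "name": "The River Spa",
                "location": {"lat": 53.548641769933575, "lng": -6.765167616877216},
            }
        ]
    },
}

hotels_df = venues_to_frame(sample)
print(hotels_df)
```

Using `.get` with defaults means a response without a `venues` key yields an empty table with the expected columns instead of raising a `KeyError`.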

In [0]:
### Importing Libraries

import requests # library to handle requests
import pandas as pd # library for data analysis
import numpy as np # library to handle data in a vectorized manner
import random # library for random number generation
from geopy.geocoders import Nominatim # module to convert an address into latitude and longitude values

# libraries for displaying images
# from IPython.display import Image 
# from IPython.core.display import HTML 
    
# function for transforming JSON into a pandas DataFrame
# (pandas.io.json.json_normalize is deprecated since pandas 1.0)
from pandas import json_normalize
import folium # plotting library

In [13]:
# Mounting Gdrive in Notebook
from google.colab import drive
drive.mount('/content/drive/')


Mounted at /content/drive/


In [0]:
### Write file 
with open('/content/drive/My Drive/Colab_disk/foo.json', 'w') as f:
  f.write('Ho!')

### Read file
# !cat /content/drive/My\ Drive/Colab_disk/foo2.txt


In [31]:
### Foursquare Auth data
CLIENT_ID = '1Z1J1FGK4R5WD2RSEUC3P011PLPLUP02NJGM2W1OF3UYXJ2M' # your Foursquare ID
CLIENT_SECRET = 'R2VWWYKS1HOSJMD0X2NVS3XTJBIKCJIZ2DEL2VWANBHBX3PN' # your Foursquare Secret
VERSION = '20200531'


### Dublin city center location
address = 'Dublin, Ireland'
geolocator = Nominatim(user_agent="foursquare_agent")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print(latitude, longitude)

### Getting venues by category ID (here 'Resort'; set category = hotel or inn for hotels and inns)
# https://developer.foursquare.com/docs/build-with-foursquare/categories/
inn = '5bae9231bedf3950379f89cb'
hotel = '4bf58dd8d48988d1fa931735'
resort = '4bf58dd8d48988d12f951735'
category = resort
radius = 50*1000 # in meters
LIMIT = 30
intent = 'browse'
search_query = ''
url = 'https://api.foursquare.com/v2/venues/search'
# Named parameters: requests URL-encodes them, and intent is sent as intent=browse
params = {
    'client_id': CLIENT_ID,
    'client_secret': CLIENT_SECRET,
    'll': '{},{}'.format(latitude, longitude),
    'intent': intent,
    'v': VERSION,
    'query': search_query,
    'radius': radius,
    'limit': LIMIT,
    'categoryId': category,
}
results = requests.get(url, params=params).json()

results

# search_query = 'Italian'
# print(search_query + ' .... OK!')
# url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, search_query, radius, LIMIT)
# results = requests.get(url).json()

53.3497645 -6.2602732


{'meta': {'code': 200, 'requestId': '5edb76c29da7ee001bdefe9b'},
 'response': {'confident': False,
  'venues': [{'categories': [{'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/shops/spa_',
       'suffix': '.png'},
      'id': '4bf58dd8d48988d1ed941735',
      'name': 'Spa',
      'pluralName': 'Spas',
      'primary': True,
      'shortName': 'Spa'}],
    'hasPerk': False,
    'id': '4e524214a809cef5beb46355',
    'location': {'address': 'Knightsbrook Hotel',
     'cc': 'IE',
     'city': 'Trim',
     'country': 'Ireland',
     'crossStreet': 'Dublin Rd',
     'distance': 40130,
     'formattedAddress': ['Knightsbrook Hotel (Dublin Rd)',
      'Trim',
      'Co Meath',
      'Ireland'],
     'labeledLatLngs': [{'label': 'display',
       'lat': 53.548641769933575,
       'lng': -6.765167616877216}],
     'lat': 53.548641769933575,
     'lng': -6.765167616877216,
     'state': 'Co Meath'},
    'name': 'The River Spa',
    'referralId': 'v-1591441111',
    'venuePage': {'id'