# Applied data science capstone: Data-based decision support to inform relocations

## A description of the problem and a discussion of the background



Relocating, that is, moving to a new place and making a home there (for example because of a change of job), is a period of great change in which several important decisions must be made. Where to live is probably the most important of them. One's home location determines not only how much time is spent commuting to work or study and how large a home a given budget can buy, but also which services (grocery shops, restaurants, schools, cinemas, etc.) are within easy reach.

In many cases, the decision of where to relocate is made either quickly or on the basis of limited information, especially when relocating far away, e.g. to another country.

This capstone aims to develop a data-based decision-support tool to help people who are in the process of relocating.

To simplify the decision of where to relocate, it is assumed to depend on the following parameters:

* **Composition of the neighbourhood**: a subjective criterion that depends on the individual preferences of the person relocating.
* **Size of the new apartment**: a function of the available budget and the chosen location (neighbourhood).
* **Commuting time**: modeled as a function of the distance between the chosen location and the commute destination (work/study).

For the purpose of this capstone, the user (i.e. the person relocating) will define their preferences and constraints in terms of:
* **Location (target neighbourhood) that the new apartment's location should be similar to**: this can be the current apartment's location if the user finds it a comfortable neighbourhood.
* **Available budget**: this will be used to estimate the size of the apartment in a recommended location.
* **Location of work/study**: this will be used to estimate commuting time by computing the distance between the work/study location and the new apartment's location.
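
The distance-based commuting estimate can be sketched as follows. This is only an illustration under stated assumptions: it uses the straight-line (haversine) distance rather than a real route, and the average speed `speed_kmh`, the function names and the sample coordinates are hypothetical choices, not values from the project data.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def daily_commute_minutes(home, work, speed_kmh=20.0):
    """Round-trip commuting time in minutes at a constant average speed.

    speed_kmh is an assumed door-to-door average, not a measured value.
    """
    distance_km = haversine_km(home[0], home[1], work[0], work[1])
    return 2 * distance_km / speed_kmh * 60


# Illustrative coordinates only (assumed, not taken from the project data).
home = (41.4036, 2.1744)
work = (41.3790, 2.1400)
print(round(daily_commute_minutes(home, work), 1))
```

A straight-line distance underestimates real travel distance, so the result is best read as a relative indicator for comparing neighbourhoods rather than as an actual travel time.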

The main idea is that users provide i) a neighbourhood they like, ii) the city they are relocating to, iii) an available budget and iv) the location of their work/study.

Given these parameters, the user will be presented with suggested neighbourhoods to relocate to. For each suggested neighbourhood, an estimated apartment size and daily commuting time will be calculated. This gives the user decision support: they can then focus their apartment search on the recommended neighbourhoods.

This project could be extended so that not only neighbourhoods but also specific apartments are proposed to the user.

To suggest neighbourhoods for relocation, Foursquare location data will be used to characterize the 'target neighbourhood' as well as the neighbourhoods of the destination city. A clustering algorithm will then be applied to the set formed by the neighbourhoods of the new city plus the target neighbourhood. Once the neighbourhoods similar to the target one are identified, commuting times and apartment sizes will be estimated from the user-provided information.
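
The clustering step might look like the sketch below. The text does not name a specific algorithm, so a small pure-Python k-means is used here as a stand-in. The feature vectors, the neighbourhood names and the choice of `k=2` are hypothetical and serve only to show how neighbourhoods sharing a cluster with the target would be selected.

```python
import math
import random


def kmeans(points, k, n_iter=100, seed=0):
    """Minimal k-means (Lloyd's algorithm): returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [None] * len(points)
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


# Hypothetical feature vectors: share of venues that are
# (restaurants, parks, bars) in each neighbourhood.
TARGET = "Target neighbourhood"
features = {
    TARGET: [0.60, 0.10, 0.30],
    "Neighbourhood A": [0.58, 0.12, 0.30],
    "Neighbourhood B": [0.10, 0.70, 0.20],
    "Neighbourhood C": [0.15, 0.65, 0.20],
    "Neighbourhood D": [0.55, 0.15, 0.30],
}
names = list(features)
labels, _ = kmeans([features[n] for n in names], k=2)

# Neighbourhoods that share a cluster with the target are the suggestions.
target_label = labels[names.index(TARGET)]
similar = [n for n, lab in zip(names, labels) if lab == target_label and n != TARGET]
print(similar)
```

Including the target neighbourhood in the clustered set, as described above, means the suggestions can be read directly off the cluster labels without a separate similarity computation.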


## A description of the data and how it will be used to solve the problem

### Neighbourhoods in Barcelona
We will assume that the user wants to relocate to Barcelona.

An overview of Barcelona's districts (each district contains several neighbourhoods) can be seen below: 
<img src="450px-Barcelona_districtes.svg.png" />

The coordinates of the different neighbourhoods in Barcelona will be extracted from <a href="https://en.wikipedia.org/wiki/Districts_of_Barcelona">this Wikipedia page</a>.

These coordinates will be used to explore the different neighbourhoods (plus the target one) with Foursquare.
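
One possible way to turn the explored venues into a comparable description of each neighbourhood is to compute the relative frequency of venue categories. The sketch below assumes the venues have already been retrieved and reduced to `(neighbourhood, category)` pairs. The records and the function name `category_profiles` are made up for illustration, and no Foursquare API call is shown.

```python
from collections import Counter, defaultdict


def category_profiles(venues):
    """Map each neighbourhood to the relative frequency of its venue categories.

    venues is an iterable of (neighbourhood, category) pairs. Returns the sorted
    list of categories and, per neighbourhood, a vector aligned with that list.
    """
    venues = list(venues)
    categories = sorted({category for _, category in venues})
    counts = defaultdict(Counter)
    for neighbourhood, category in venues:
        counts[neighbourhood][category] += 1
    profiles = {}
    for neighbourhood, counter in counts.items():
        total = sum(counter.values())
        profiles[neighbourhood] = [counter[c] / total for c in categories]
    return categories, profiles


# Made-up venue records, standing in for the results of the exploration.
venues = [
    ("Neighbourhood A", "Restaurant"),
    ("Neighbourhood A", "Restaurant"),
    ("Neighbourhood A", "Park"),
    ("Neighbourhood A", "Bar"),
    ("Neighbourhood B", "Bar"),
    ("Neighbourhood B", "Restaurant"),
]
categories, profiles = category_profiles(venues)
print(categories)
for name, vector in profiles.items():
    print(name, vector)
```

Using relative frequencies rather than raw counts keeps neighbourhoods with many venues comparable to those with few.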



### Sqm price of Barcelona neighbourhoods

The price per square meter of an apartment in Barcelona will be extracted from <a href="https://www.bcn.cat/estadistica/castella/dades/timm/ipreus/hab2mave/evo/t2mab.htm">https://www.bcn.cat/estadistica/castella/dades/timm/ipreus/hab2mave/evo/t2mab.htm</a>, which is provided by the local council of Barcelona.

## Additional user information 
For the purpose of illustrating this capstone project, the following parameters will be assumed:
* **Location (target neighbourhood) that the new apartment's location should be similar to**: a location in Madrid (a similar city) will be chosen.
* **New work location**: the user will be working close to 'Sants Station', the main train station in Barcelona. 
* **User's available budget**: the user has a budget of 300,000 EUR to buy the apartment to relocate to.
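
With these parameters, the apartment size for a given neighbourhood can be estimated by dividing the budget by the price per square meter. The sketch below uses the budget stated above. The prices, the neighbourhood names and the function name `estimated_size_sqm` are hypothetical and are not taken from the council's statistics.

```python
def estimated_size_sqm(budget_eur, price_per_sqm_eur):
    """Apartment size (in m²) affordable at a given price per square meter."""
    if price_per_sqm_eur <= 0:
        raise ValueError("price per square meter must be positive")
    return budget_eur / price_per_sqm_eur


BUDGET_EUR = 300_000  # budget assumed in the illustration above

# Hypothetical prices in EUR/m², NOT taken from the council's statistics.
prices = {
    "Neighbourhood A": 3000.0,
    "Neighbourhood B": 4000.0,
    "Neighbourhood C": 5000.0,
}

sizes = {name: estimated_size_sqm(BUDGET_EUR, price) for name, price in prices.items()}
for name, size in sizes.items():
    print(f"{name}: about {size:.0f} m²")
```

This treats the whole budget as the purchase price; taxes and other purchase costs would reduce the attainable size and are left out of this sketch.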

# Sections to be filled in week 05

## Introduction
Where you discuss the business problem and who would be interested in this project.


## Data 
Where you describe the data that will be used to solve the problem and the source of the data.


## Methodology section 
This is the main component of the report, where you discuss and describe any exploratory data analysis that you did, any inferential statistical testing that you performed, and which machine learning methods were used and why.


## Results 
Section where you discuss the results.


## Discussion 
Section where you discuss any observations you noted and any recommendations you can make based on the results.


## Conclusion 
Section where you conclude the report.

import pandas as pd
import numpy as np