# Relocation Analysis
In this project, I aim to develop a data analysis package that assesses how similar my current neighborhood is to neighborhoods in Bangkok, Thailand.

## Aims of the Project
The idea arose because I am in the process of relocating from the United Kingdom to Thailand as an expat. I would like to evaluate neighborhoods within Bangkok and find those most similar to my current neighborhood: Edgbaston, Birmingham, UK.

## The Problems
I have not been to Thailand for a long time and have very limited knowledge of Bangkok and which areas are good places to live. This is a challenging issue for someone who will soon be living in a foreign city, so any additional knowledge that helps me pick a neighborhood would be valuable. Nobody wants to relocate halfway around the world only to end up in an unsuitable area far away from everything.

## The Solutions
In this project, I will obtain venue data within a set radius of my current neighborhood (Edgbaston, Birmingham, UK) and compare it against the equivalent venue data for all neighborhoods in Bangkok, Thailand. The venue data will be obtained from Foursquare, and the neighborhood coordinates for Bangkok will be scraped from Wikipedia at the following URL:
https://en.wikipedia.org/wiki/List_of_districts_of_Bangkok

-----------------------------------------------------------------

# Section 1: Obtaining Data
In this section, I scrape the coordinates of Bangkok's neighborhoods from Wikipedia (https://en.wikipedia.org/wiki/List_of_districts_of_Bangkok) and retrieve points of interest (venues) from Foursquare. The same process is also carried out for my current neighborhood.
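As a rough illustration of the Foursquare step described above, the sketch below only builds a request URL for venues around a coordinate; it does not call the API. It assumes the legacy Foursquare v2 `venues/explore` endpoint. The credential placeholders, the version date, the default `radius` and `limit` values, and the helper name `build_explore_url` are all hypothetical and not taken from this project.

```python
from urllib.parse import urlencode

# Hypothetical placeholders: replace with your own Foursquare credentials.
CLIENT_ID = "YOUR_CLIENT_ID"
CLIENT_SECRET = "YOUR_CLIENT_SECRET"
API_VERSION = "20180605"  # assumed API version date (YYYYMMDD)


def build_explore_url(lat, lng, radius=500, limit=100):
    """Build a venue-search URL around (lat, lng).

    Assumes the legacy v2 'venues/explore' endpoint; radius is in metres.
    """
    params = {
        "client_id": CLIENT_ID,
        "client_secret": CLIENT_SECRET,
        "v": API_VERSION,
        "ll": "{},{}".format(lat, lng),
        "radius": radius,
        "limit": limit,
    }
    return "https://api.foursquare.com/v2/venues/explore?" + urlencode(params)


# Example: the Edgbaston coordinates used later in this notebook.
edgbaston_url = build_explore_url(52.454198, -1.905032)
```

The same helper could be applied to each Bangkok row in turn, so that both cities are queried with an identical radius and limit.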

In [1]:
#import relevant packages
import pandas as pd

In [2]:
#scrape coordinates of Bangkok's neighborhoods from Wikipedia
bangkok_url = 'https://en.wikipedia.org/wiki/List_of_districts_of_Bangkok'

#pd.read_html returns a list of DataFrames, one per table containing the text 'District'
bangkok_data = pd.read_html(bangkok_url, match='District')

In [10]:
#clean up the data: take the first matching table, drop unneeded columns and rename the rest
bangkok_data = bangkok_data[0].drop(columns=['MapNr','Thai','Popu-lation','No. ofSubdis-trictsKhwaeng'])
bangkok_data.columns = ['District','Post Code','Latitude','Longitude']

print("Shape of Table: {} rows and {} columns.".format(bangkok_data.shape[0] ,bangkok_data.shape[1]))
bangkok_data.head()

Shape of Table: 50 rows and 4 columns.


Unnamed: 0,District,Post Code,Latitude,Longitude
0,Bang Bon,10150,13.6592,100.3991
1,Bang Kapi,10240,13.765833,100.647778
2,Bang Khae,10160,13.696111,100.409444
3,Bang Khen,10220,13.873889,100.596389
4,Bang Kho Laem,10120,13.693333,100.5025


In [12]:
#create a DataFrame holding the same fields for my current neighborhood
#coordinates are stored as floats so they match the numeric Bangkok columns
bham_data = pd.DataFrame({'District':['Edgbaston'],'Post Code':['B5 7SU'],'Latitude': [52.454198],'Longitude': [-1.905032]})
bham_data

Unnamed: 0,District,Post Code,Latitude,Longitude
0,Edgbaston,B5 7SU,52.454198,-1.905032
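Since both tables share the same four columns, one possible next step is to stack them into a single frame so the venue lookup can loop over every area in the same way. This is only a sketch: `bangkok_sample` and `home` are stand-ins built from rows shown above (the real frames are `bangkok_data` and `bham_data`), and `all_areas` is a hypothetical name.

```python
import pandas as pd

# Stand-in for bangkok_data: two rows copied from the table printed above.
bangkok_sample = pd.DataFrame({
    'District': ['Bang Bon', 'Bang Kapi'],
    'Post Code': [10150, 10240],
    'Latitude': [13.6592, 13.765833],
    'Longitude': [100.3991, 100.647778],
})

# Stand-in for bham_data.
home = pd.DataFrame({
    'District': ['Edgbaston'],
    'Post Code': ['B5 7SU'],
    'Latitude': [52.454198],
    'Longitude': [-1.905032],
})

# Stack the home neighborhood on top of the Bangkok rows with a fresh index.
all_areas = pd.concat([home, bangkok_sample], ignore_index=True)

# Post codes mix text (UK) and numbers (Thailand), so keep them all as strings.
all_areas['Post Code'] = all_areas['Post Code'].astype(str)

# Make sure the coordinates are numeric before any later calculation.
all_areas[['Latitude', 'Longitude']] = all_areas[['Latitude', 'Longitude']].apply(pd.to_numeric)
```

Keeping the home neighborhood in row 0 makes it easy to pick out as the reference point when comparing against the Bangkok rows.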
