# Real Estate Investments in Birmingham

*(Week 5 Project Report)*

## 1. Introduction / Business Problem

A financial company is interested in investing in types of property that are widespread throughout Birmingham as part of a new low-risk fund.
The mission statement of this new fund is to invest in properties in Birmingham that are the most common in their areas, to ensure a low-risk investment strategy.
This preliminary study aims to identify which types of business are most common in each area of Birmingham.

Birmingham is rapidly becoming a more ethnically diverse city, according to census data (see https://en.wikipedia.org/wiki/Demography_of_Birmingham). Investments in ethnic restaurants in the right areas of the city may hold potential as safer investments, so identifying the areas where these restaurants are most prevalent would be an excellent start for this new fund.

While this report is aimed at this company, grouping areas of Birmingham by their venues may also assist other lines of inquiry, such as identifying more desirable places to live based on surrounding shops and amenities. This information may also be useful to companies looking to build new homes, and to the investors who finance them, as the current political climate in the UK calls for many homes to be built in the coming years.

### Target Audience:

Investors who want to know more about the culture and diversity of Birmingham, and how this information can inform an investment in real estate.

### Stakeholders:

* Investors
* Home buyers/sellers
* Real estate owners

## 2. Data

To achieve the above, geographical data in and around the city of Birmingham needs to be created. To do this, the latitude and longitude for postcodes beginning with 'B' are taken from: https://www.freemaptools.com/download-uk-postcode-lat-lng.htm. In this data set, the outcode (i.e. the first half) of postcodes along with latitude and longitude are available in .csv format.
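The filtering step described above could be sketched as follows. This is a minimal illustration, not the code used in the study: the column names `postcode`, `latitude` and `longitude` are assumptions about the CSV layout, and the sample coordinates are placeholders. Note that a plain "begins with 'B'" test would also match other postcode areas such as BA or BS, so the sketch assumes Birmingham outcodes are "B" followed by a digit; the report does not state how this was handled.

```python
import csv
import io
import re

# Assumption: Birmingham outcodes are "B" followed by a digit (B1, B15, ...).
# A plain startswith("B") test would also match BA, BB, BS and similar areas.
BIRMINGHAM_OUTCODE = re.compile(r"^B\d")


def load_birmingham_outcodes(csv_file):
    """Return (outcode, latitude, longitude) tuples for Birmingham outcodes.

    Assumes a header row with 'postcode', 'latitude' and 'longitude' columns.
    """
    rows = []
    for row in csv.DictReader(csv_file):
        outcode = row["postcode"].strip().upper()
        if BIRMINGHAM_OUTCODE.match(outcode):
            rows.append((outcode, float(row["latitude"]), float(row["longitude"])))
    return rows


# Illustrative sample in the assumed layout (coordinates are placeholders).
sample = io.StringIO(
    "id,postcode,latitude,longitude\n"
    "1,B1,52.48,-1.91\n"
    "2,BA1,51.39,-2.36\n"
    "3,B15,52.46,-1.93\n"
)
outcodes = load_birmingham_outcodes(sample)
```

With the real download, the `io.StringIO` sample would be replaced by an opened file handle.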

For each postcode outcode, the latitude and longitude are used in a call to the Foursquare API to record which venues and services are found around the centre of that region.

For each region, the venues are ranked from the most common to the 10th most common, to give a sense of which venues are most prevalent.

## 3. Methodology

To assess whether separating Birmingham by postcode outcode was appropriate, the locations were plotted on Folium maps to judge the distance between each region. The locations were found to be suitably spaced to be treated as separate areas of Birmingham, and so exploratory analysis of the venues in Birmingham was undertaken.
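The spacing was judged visually on the map. As a numeric complement, a sketch along the following lines could report the distance from each outcode to its nearest neighbour. The haversine formula is standard, but the function names and the sample coordinates are illustrative assumptions and not part of the original analysis.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    earth_radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def nearest_neighbour_km(points):
    """Map each outcode to the distance (km) to its closest other outcode."""
    result = {}
    for name, lat, lon in points:
        result[name] = min(
            haversine_km(lat, lon, other_lat, other_lon)
            for other, other_lat, other_lon in points
            if other != name
        )
    return result


# Placeholder coordinates for illustration only.
points = [("B1", 52.48, -1.91), ("B15", 52.46, -1.93), ("B90", 52.40, -1.80)]
spacing = nearest_neighbour_km(points)
```

If the nearest-neighbour distance is larger than twice the search radius, the search circles of neighbouring outcodes do not overlap.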

An initial call to Foursquare was made with a limit of 100 venues per location and a radius of 500 m. This initial search did not return enough results, so the radius was increased to 1000 m to return more venues for each location and provide more data for distinguishing the areas.
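A request with these settings might be built as in the sketch below. The report does not name the endpoint, so the legacy Foursquare v2 `venues/explore` endpoint, the response layout (`response.groups[].items[].venue`), the version date and the credentials are all assumptions made for illustration. No network call is made here; the sample payload is invented.

```python
from urllib.parse import urlencode

# Assumed endpoint: the legacy Foursquare v2 "explore" endpoint.
EXPLORE_URL = "https://api.foursquare.com/v2/venues/explore"


def build_explore_url(lat, lng, client_id, client_secret,
                      radius=1000, limit=100, version="20180605"):
    """Build a request URL for venues within `radius` metres of a point."""
    params = {
        "client_id": client_id,          # placeholder credentials
        "client_secret": client_secret,
        "v": version,                    # API version date (placeholder value)
        "ll": f"{lat},{lng}",
        "radius": radius,                # search radius in metres
        "limit": limit,                  # maximum venues returned per request
    }
    return f"{EXPLORE_URL}?{urlencode(params)}"


def parse_venues(payload):
    """Extract (venue name, category name) pairs from an explore-style payload."""
    venues = []
    for group in payload.get("response", {}).get("groups", []):
        for item in group.get("items", []):
            venue = item["venue"]
            categories = venue.get("categories") or [{}]
            venues.append((venue["name"], categories[0].get("name", "Unknown")))
    return venues


url = build_explore_url(52.48, -1.91, "MY_ID", "MY_SECRET")

# Invented payload in the assumed response layout.
sample_payload = {"response": {"groups": [{"items": [
    {"venue": {"name": "Example Pub", "categories": [{"name": "Pub"}]}},
    {"venue": {"name": "Example Cafe", "categories": []}},
]}]}}
venues = parse_venues(sample_payload)
```

Increasing `radius` from 500 to 1000 widens the search area, while `limit` caps how many venues a single request can return.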

After collecting the venue categories, a table was made showing the most common venues in each area, from the 1st most common to the 10th.
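One way to build such a ranking is sketched below, assuming the venue data has been reduced to `(outcode, category)` pairs. The function name and the sample records are hypothetical, and the original notebook may have used a different approach.

```python
from collections import Counter, defaultdict


def most_common_venues(records, top_n=10):
    """Rank venue categories by frequency within each outcode.

    records: iterable of (outcode, category) pairs.
    Returns {outcode: [most common category, 2nd most common, ...]}.
    """
    counts = defaultdict(Counter)
    for outcode, category in records:
        counts[outcode][category] += 1
    return {
        outcode: [category for category, _ in counter.most_common(top_n)]
        for outcode, counter in counts.items()
    }


# Invented records for illustration.
records = [
    ("B1", "Pub"), ("B1", "Coffee Shop"), ("B1", "Pub"),
    ("B1", "Hotel"), ("B1", "Coffee Shop"), ("B1", "Pub"),
    ("B15", "Grocery Store"), ("B15", "Indian Restaurant"),
    ("B15", "Grocery Store"),
]
ranking = most_common_venues(records)
```

Outcodes with fewer than `top_n` distinct categories simply produce a shorter list.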

After this, a clustering algorithm was used to group similar areas of Birmingham, with the number of clusters set in advance. After some experimentation, seven clusters were used to split Birmingham, with three large clusters dominating the results. Clustering was used as it is a quick and easy way to tackle the business problem, as we are interested in finding out which areas of Birmingham resemble one another in their mix of venues.
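The report does not name the clustering algorithm. The sketch below assumes k-means (Lloyd's algorithm) applied to per-outcode venue frequency vectors, written without dependencies for illustration; in practice a library implementation would normally be used. The function names and the toy vectors are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=50, seed=0):
    """Plain Lloyd's k-means. Returns one cluster label per input vector."""
    rng = random.Random(seed)
    # Initialise centroids from k randomly chosen input vectors.
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy frequency vectors: [share of pubs, share of grocery stores].
vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = kmeans(vectors, k=2)
```

For the study itself, `k` would be 7 and each vector would hold the relative frequency of every venue category in one outcode.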

Because some locations returned only a small number of venues (~10 in the rural areas), the three most common venues were used to examine why the clusters were separated in this way. A frequency table of the three most common venues was created for each cluster to better understand the cluster assignments.
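A frequency table of this kind could be produced as in the following sketch, which counts how often each category appears as the 1st, 2nd and 3rd most common venue within one cluster. The input structures, names and sample values are assumptions for illustration.

```python
from collections import Counter


def rank_frequencies(top_venues, cluster_labels, cluster, ranks=3):
    """Count categories at each rank position for outcodes in one cluster.

    top_venues: {outcode: [1st most common, 2nd, 3rd, ...]}
    cluster_labels: {outcode: cluster id}
    Returns a list of Counters, one per rank position.
    """
    tables = [Counter() for _ in range(ranks)]
    for outcode, venues in top_venues.items():
        if cluster_labels.get(outcode) != cluster:
            continue
        for position, category in enumerate(venues[:ranks]):
            tables[position][category] += 1
    return tables


# Invented example data.
top_venues = {
    "B1": ["Pub", "Coffee Shop", "Hotel"],
    "B2": ["Coffee Shop", "Pub", "Hotel"],
    "B3": ["Pub", "Hotel", "Bar"],
    "B20": ["Grocery Store", "Indian Restaurant", "Pub"],
}
cluster_labels = {"B1": 1, "B2": 1, "B3": 1, "B20": 0}
first, second, third = rank_frequencies(top_venues, cluster_labels, cluster=1)
```

Each `Counter` can then be plotted as a bar chart to give one frequency plot per rank position.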

## 4. Results

The clusters of Birmingham areas, grouped by venue, are shown below: **(note that these images can be seen in the PowerPoint presentation if they are not rendering in GitHub)**

![Birmingham%207%20clusters.PNG](attachment:Birmingham%207%20clusters.PNG)

From the above, there are clearly three main clusters: purple, red and yellow. The four other clusters (blue, light blue, orange and cyan) are too small to consider in this study and are set aside as outliers.

The three most common venues of the _purple segment_ are shown as a frequency plot below:

![cluster%201%20%28purple%29.PNG](attachment:cluster%201%20%28purple%29.PNG)

From this data, there are some clear trends in the most common venues in these areas. Coffee shops and pubs are highly prevalent in built-up urban areas, with hotels another very common venue. There is great variation in venue types across the data set, especially in the table for the third most common venue, where no frequency is higher than 3.

The three most common venues of the _red segment_ are shown as a frequency plot below:

![cluster%200%20%28red%29.PNG](attachment:cluster%200%20%28red%29.PNG)

There are some significant differences between this and the purple cluster: pubs are no longer the dominant venue, and grocery stores are now the most common, followed by Indian restaurants.

The three most common venues of the _yellow segment_ are shown as a frequency plot below:

![cluster%205%20%28yellow%29.PNG](attachment:cluster%205%20%28yellow%29.PNG)

Compared to the previous segments, this yellow cluster seems to be a mix of the previous two. Pubs are prevalent, alongside Indian restaurants and grocery stores.

Overall, these three clusters represent three very different kinds of area in Birmingham: built-up urban areas, suburban areas and more rural areas.

## 5. Discussion

Overall, the three clusters show great variation between themselves, with the following general trends:

1. Purple clusters are mostly concentrated in urban areas, with a high number of pubs, coffee shops and hotels.
2. Red clusters are mostly concentrated in suburban areas, with a high prevalence of grocery stores, Indian restaurants and pubs.
3. Yellow clusters are mostly concentrated in rural areas, with a prevalence of pubs and grocery stores.

## 6. Conclusion

To identify low-risk real estate investments (i.e. the venue types most prevalent in the city), the city has been segmented by postcode outcode and separated into three major clusters: urban, suburban and rural. These clusters suggest that investing in pubs, coffee shops, hotels, grocery stores or Indian restaurants may be low risk, as they are the most prevalent venues throughout Birmingham.
A secondary conclusion is that the yellow cluster may be more suitable for future housing developments, as there are many places of interest (lakes, parks, plazas) for people to visit.
The demographics and the resulting venues appear to correlate well, with Indian and Pakistani restaurants being more common in the urban areas, which may be of interest to investors.