# Family-friendly Neighbourhoods in Melbourne, AU

## Introduction

The local government of Melbourne in Australia wants to compile information about the venues and facilities suitable for families with children under 15 years old. The final objective is to improve local communities' public areas for families with children. Thus the goal of this study is to find which suburbs need more resources for this purpose.

The analysis will use data from the Australian Bureau of Statistics (<a href="https://www.abs.gov.au/">ABS</a>) to compile data about families per suburb and the API of <a href="https://foursquare.com/">Foursquare</a> to gather the venues of each locality.

### Data Description

The first dataset, downloaded from the ABS, contains aggregated family composition and other statistics for each suburb of the State of Victoria, Australia. It has many columns, but we only need three of them to extract information about families. These columns give us the number of couple families with children under 15 years (`CF_ChU15_a_Total_F`), the number of one-parent families with children under 15 years (`OPF_ChU15_a_Total_F`) and the total number of families (`Total_F`).

Each row of this dataset holds the aggregated data for one suburb, identified by a five-digit State Suburb Code (SSC) assigned by the ABS. In this dataset the code carries an `SSC` prefix, as in `SSC20002`.

A sample of the data looks like this (the columns headed `...` are ones we do not use):

| SSC_CODE_2016 | ...  | CF_ChU15_a_Total_F | ...  | OPF_ChU15_a_Total_F | ... | Total_F | ...  |
|---------------|------|--------------------|------|---------------------|-----|---------|------|
| ...           | ...  | ...                | ...  | ...                 | ... | ...     | ...  |
| SSC20002      | 1164 | 353                | 1274 | 58                  | 135 | 1886    | 4539 |
| SSC20003      | 287  | 356                | 1496 | 39                  | 114 | 1034    | 3282 |
| ...           | ...  | ...                | ...  | ...                 | ... | ...     | ...  |

We can get a ratio of families with children per suburb with the equation:

> (`CF_ChU15_a_Total_F` + `OPF_ChU15_a_Total_F`) / `Total_F`
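
As a minimal sketch of this calculation, the snippet below applies the equation to the two sample rows shown in the table above. It uses plain Python dictionaries so that it runs without dependencies. The helper name `family_ratio` and the choice to return `0.0` for a suburb with no families are assumptions made for illustration, not part of the original analysis.

```python
def family_ratio(row):
    """Share of families with children under 15 for one suburb row."""
    total = int(row["Total_F"])
    if total == 0:
        # Assumption: a suburb with no families gets a ratio of 0
        return 0.0
    with_children = int(row["CF_ChU15_a_Total_F"]) + int(row["OPF_ChU15_a_Total_F"])
    return with_children / total


# Values copied from the sample rows of the ABS table above
sample_rows = [
    {"SSC_CODE_2016": "SSC20002", "CF_ChU15_a_Total_F": 353,
     "OPF_ChU15_a_Total_F": 58, "Total_F": 1886},
    {"SSC_CODE_2016": "SSC20003", "CF_ChU15_a_Total_F": 356,
     "OPF_ChU15_a_Total_F": 39, "Total_F": 1034},
]

# Map each suburb code to its ratio of families with children
ratios = {row["SSC_CODE_2016"]: family_ratio(row) for row in sample_rows}

for code, ratio in ratios.items():
    print(code, round(ratio, 3))
```

For the first sample row this gives (353 + 58) / 1886, or roughly 0.218.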

The Victorian government also provided a dataset, `CG_SSC_2016_SA4_2016.csv`, that maps each SSC to an SA4, one of the large statistical areas defined by the ABS. The SA4s whose name (`SA4_NAME_2016`) contains 'Melbourne' cover the city and its suburbs. With this dataset, we can filter out of the previous one any suburb outside the Melbourne area. This dataset looks like this:

| SSC_CODE_2016 | SSC_NAME_2016       | SA4_CODE_2016 | SA4_NAME_2016          | RATIO | PERCENTAGE |
|---------------|---------------------|---------------|------------------------|-------|------------|
| 20172         | Bayswater (Vic.)    | 211           | Melbourne - Outer East | 1     | 100        |
| 20173         | Bayswater North     | 211           | Melbourne - Outer East | 1     | 100        |
| 20174         | Beaconsfield (Vic.) | 212           | Melbourne - South East | 1     | 100        |
| 20175         | Beaconsfield Upper  | 212           | Melbourne - South East | 1     | 100        |
| 20176         | Bealiba             | 201           | Ballarat               | 1     | 100        |
| 20177         | Bearii              | 216           | Shepparton             | 1     | 100        |
| 20178         | Bears Lagoon        | 202           | Bendigo                | 1     | 100        |
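
The filtering step could be sketched as follows. The snippet reads a few rows copied from the table above from an in-memory string instead of the real file, and it assumes that the CSV headers match the column names shown in the table. The helpers `melbourne_suburbs` and `normalise_ssc` are hypothetical names. The second one assumes that the only difference between the two datasets' codes is the `SSC` prefix seen in the family data sample.

```python
import csv
import io

# A few rows copied from the table above, standing in for CG_SSC_2016_SA4_2016.csv
SAMPLE_CSV = """SSC_CODE_2016,SSC_NAME_2016,SA4_CODE_2016,SA4_NAME_2016,RATIO,PERCENTAGE
20172,Bayswater (Vic.),211,Melbourne - Outer East,1,100
20174,Beaconsfield (Vic.),212,Melbourne - South East,1,100
20176,Bealiba,201,Ballarat,1,100
20177,Bearii,216,Shepparton,1,100
"""


def melbourne_suburbs(csv_file):
    """Map SSC code to suburb name for rows whose SA4 name contains 'Melbourne'."""
    reader = csv.DictReader(csv_file)
    return {
        row["SSC_CODE_2016"]: row["SSC_NAME_2016"]
        for row in reader
        if "Melbourne" in row["SA4_NAME_2016"]
    }


def normalise_ssc(code):
    """Strip the 'SSC' prefix (assumed from the sample 'SSC20002') so codes match."""
    return code[3:] if code.startswith("SSC") else code


suburbs = melbourne_suburbs(io.StringIO(SAMPLE_CSV))

# Check whether a code from the family dataset belongs to the Melbourne area
in_melbourne = normalise_ssc("SSC20172") in suburbs
```

With the real file, the same function would take `open("CG_SSC_2016_SA4_2016.csv", newline="")` in place of the `io.StringIO` object.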

We can now go through each suburb and find the venues that are suitable for families with children. To retrieve only the venues we are interested in, we will pass 50 categories chosen from the <a href="https://developer.foursquare.com/docs/resources/categories">full list of Foursquare categories</a>.
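
A venue query for one suburb might be built as in the sketch below. It assumes the Foursquare v2 `venues/search` endpoint that was available when this notebook was written, with its `ll`, `radius`, `limit`, `categoryId` and `v` parameters. The function name, the radius, the version date, the credentials and the category IDs are all placeholders, and the text above does not say how each suburb's coordinates are obtained. The snippet only builds the URL and sends no request.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumption: the legacy Foursquare v2 search endpoint
SEARCH_ENDPOINT = "https://api.foursquare.com/v2/venues/search"


def build_search_url(lat, lng, category_ids, client_id, client_secret,
                     radius_m=1000, limit=50, version="20200101"):
    """Build a venue search URL restricted to the given category IDs."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,                 # API version date (placeholder value)
        "ll": f"{lat},{lng}",         # centre point of the search
        "radius": radius_m,           # search radius in metres (hypothetical)
        "limit": limit,
        "categoryId": ",".join(category_ids),  # comma-separated category list
    }
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


# Placeholder category IDs; the real ones come from the Foursquare category list
url = build_search_url(-37.8136, 144.9631, ["CATEGORY_ID_A", "CATEGORY_ID_B"],
                       client_id="YOUR_CLIENT_ID", client_secret="YOUR_CLIENT_SECRET")
query = parse_qs(urlsplit(url).query)
```

The coordinates in the example are those of central Melbourne and only serve as an illustration.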

At this point, we will know how many venues per category each suburb of Melbourne has. Combining this information with the ratio of families, we can run k-means clustering with 10 clusters to group similar suburbs together. The government will then have 10 groups of suburbs to analyse and support depending on the outcome.
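
To illustrate the clustering step, here is a minimal pure-Python sketch of k-means (Lloyd's algorithm). It is not the implementation used in the analysis, where a library such as scikit-learn's `KMeans` would be the usual choice. As a simplification, the initial centroids are the first `k` points. The four feature rows (family ratio, venue count) are made-up illustrative values, and the example uses `k=2` rather than 10 to stay small.

```python
from math import dist


def kmeans(points, k, max_iter=100):
    """Cluster equal-length numeric tuples into k groups with Lloyd's algorithm."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    # Assumption: naive initialisation from the first k points
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid
        new_labels = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Illustrative rows only: (ratio of families with children, number of venues)
features = [(0.1, 0), (0.9, 50), (0.2, 1), (0.8, 48)]
labels, centroids = kmeans(features, k=2)
```

Because the ratio and the venue counts live on very different scales, the features would normally be scaled before clustering so that the counts do not dominate the distance.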

#### Recommendations on how to interpret the final result

- Suburbs with a low ratio of families with children should be of low priority.
- Suburbs with a high ratio of families with children and a high number of suitable venues should be of low priority to create new facilities. The focus here should be on maintenance of the current infrastructure and businesses.
- Suburbs with a high ratio of families with children and a low number of suitable venues should be the focus of the government plan and receive top priority.
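
The three rules above can be expressed as a small decision function. This is only a sketch: the thresholds that separate "low" from "high" are hypothetical values chosen for illustration, since the text does not define them, and the function name and labels are likewise assumptions.

```python
def priority(family_ratio, venue_count, ratio_threshold=0.25, venue_threshold=10):
    """Classify a suburb following the three interpretation rules.

    Both thresholds are hypothetical; in practice they would be derived
    from the clustering results.
    """
    if family_ratio < ratio_threshold:
        # Few families with children: low priority
        return "low"
    if venue_count >= venue_threshold:
        # Many families and many venues: maintain what already exists
        return "low (maintain existing facilities)"
    # Many families but few venues: the focus of the plan
    return "top"


examples = {
    "few families": priority(0.10, 3),
    "well served": priority(0.40, 25),
    "under served": priority(0.40, 2),
}
```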

### Thank you for reviewing this work!

This notebook was created by [Israel Tiomno](https://www.linkedin.com/in/tiomno/).

<hr>

Copyright &copy; 2020 [Israel Tiomno](https://github.com/tiomno). This notebook and its source code are released under the terms of the MIT License.