
20-minute neighbourhoods: understanding housing density in Bradford

Disclaimer

This analysis formed part of a larger piece of work on 20-minute neighbourhoods for the GEOG5403 Creative Coding for Urban Problems module on the MSc Urban Data Science & Analytics programme at the University of Leeds.

The content of this notebook is intended for educational and general use purposes only.

Any use of the materials should adhere to the guidelines and policies of your educational institution.

The author does not take any responsibility for how the materials in this repository are utilised.

Summary

In a 20-minute neighbourhood, residents can walk, wheel or cycle to key services and amenities within a return 20-minute journey. The concept is not new, but interest grew as COVID-19 lockdowns highlighted the liveability of local areas. One way to identify a 20-minute neighbourhood is to draw a suitable walk-time boundary around an area, analyse what services exist within it, and then “fill in the gaps”. However, this approach does not consider the wants and needs of local communities, which can differ from place to place. The Town and Country Planning Association has produced a comprehensive guide arguing that most 20-minute neighbourhoods will include many similar features, such as schools, greenspaces, and health facilities.

This analysis looks at potential 20-minute neighbourhoods in Bradford, with the aim of understanding housing density to determine where people live, and therefore which areas have the most households that might benefit from investment in the 20-minute neighbourhood concept.

Using population-weighted centroids (PWCs) for each output area (OA), an 800m Euclidean buffer is drawn around each centroid to represent a 20-minute neighbourhood, as shown in Figure 1. OAs are the smallest census geography, and PWCs reflect where people actually live better than the geometric centre of the polygon. However, OA polygons are used to visualise the results for clarity, avoiding overlap between buffers.
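
The 800m radius rests on a common planning rule of thumb (an assumption here, not taken from the source data): at a typical walking speed of around 4.8 km/h, or 80 m per minute, 800 m ÷ 80 m/min = 10 minutes each way, i.e. a 20-minute return journey.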

Figure 1: Defining neighbourhoods

For housing, OA populations could have been used, but this would include everyone in any OA whose PWC falls inside the neighbourhood buffer, rather than only the households within the neighbourhood itself. Instead, domestic EPC ratings lodged since 2008 are used to find residential properties, which are then spatially matched to the buffers. Over 157,000 homes were identified, roughly 75% coverage of Bradford's approximately 210,000 households.

Clear spatial patterns emerge showing where housing is most dense, and these areas are potentially good candidates for further investigation into how well they are set up for 20-minute living. There are limitations to this methodology, including:

  • 800m Euclidean buffers not reflecting actual path networks - a network analysis may better reflect true 20-minute neighbourhoods (see the sketch after this list)
  • Population is not included, so the actual number of people living in these neighbourhoods is not calculated
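
To illustrate the first limitation, a street-network isochrone could be used instead of a straight-line buffer. The sketch below is illustrative only and not part of the original analysis; it assumes the osmnx and networkx packages are available and uses a hypothetical example point in Bradford city centre rather than a real PWC.

# Illustrative sketch only (not part of the original analysis): a network-based
# alternative to the Euclidean buffer, assuming osmnx and networkx are installed
import osmnx as ox
import networkx as nx

# Hypothetical example point in Bradford city centre (latitude, longitude)
centre = (53.7960, -1.7594)

# Download the surrounding walking network and find the node nearest the point
G = ox.graph_from_point(centre, dist=1200, network_type='walk')
origin = ox.distance.nearest_nodes(G, X=centre[1], Y=centre[0])

# Keep only nodes reachable within 800m along the network (edge lengths are in metres)
reachable = nx.ego_graph(G, origin, radius=800, distance='length')
print(f'{len(reachable)} network nodes within an 800m walk of the example point')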

Figure 2: Housing density choropleth

Import and set up geographies

Links to the raw data can be found in each code cell. The OA data needs to be reduced to just Bradford, and the dataframes then tidied ready for use.

# Import required libraries
import pandas as pd
import geopandas as gpd
# OA polygons (BFC: Full resolution - clipped to the coastline (Mean High Water mark))
# https://geoportal.statistics.gov.uk/datasets/ons::output-areas-2021-boundaries-ew-bfc/about
OA_polygons = gpd.read_file('Data/Output_Areas_2021_Polygon/OA_2021_EW_BFC_V8.shp')

# OA Population-weighted centroids
# https://geoportal.statistics.gov.uk/datasets/ons::output-areas-december-2021-pwc-v3/about
OA_PWC = gpd.read_file('Data/Output_Areas_2021_Centroid/PopCentroids_EW_2021_V3.shp')
# OA to LSOA to MSOA to LAD lookup
# https://geoportal.statistics.gov.uk/datasets/output-area-to-lower-layer-super-output-area-to-middle-layer-super-output-area-to-local-authority-district-december-2021-lookup-in-england-and-wales-v2-1/about
OA_lookup = pd.read_csv('Data/Output_Area_Lookup_in_England_and_Wales_v3.csv')
# Remove unnecessary columns from OA lookup, keeping only required columns in new dataframe
OA_lookup_trim = OA_lookup[['OA21CD','LAD22CD','LAD22NM']]
# Filter OA lookup to just Bradford
Bradford_OA_lookup = OA_lookup_trim[OA_lookup_trim['LAD22NM'] == 'Bradford']
# Check how many OAs are left - this number is an important reference point for further data wrangling tasks
len(Bradford_OA_lookup)

1575

# Merge Bradford_OA_lookup with OA polygons to reduce the geodataframe containing the polygons to only Bradford OAs
Bradford_OA_polygons = OA_polygons.merge(Bradford_OA_lookup, how='right', on='OA21CD')
# Check number of OAs is same as in the reduced Bradford OA lookup dataframe
print(len(Bradford_OA_polygons))
print(len(Bradford_OA_polygons) == len(Bradford_OA_lookup))

1575

True

# Merge Bradford_OA_lookup with OA PWCs to reduce the geodataframe containing the PWCs to only Bradford OAs
Bradford_OA_PWC = OA_PWC.merge(Bradford_OA_lookup, how='right', on='OA21CD')
# Check number of OAs is same as in the reduced Bradford OA lookup dataframe
print(len(Bradford_OA_PWC))
print(len(Bradford_OA_PWC) == len(Bradford_OA_lookup))

1575

True

# Drop unrequired columns
Bradford_OA_polygons_tidy = Bradford_OA_polygons.drop([
    'LSOA21CD','LSOA21NM','LSOA21NMW','BNG_E','BNG_N','LAT','LONG','GlobalID','LAD22CD','LAD22NM'], axis=1)
Bradford_OA_PWC_tidy = Bradford_OA_PWC.drop(['GlobalID','LAD22CD','LAD22NM'], axis=1)

# Rename geometry columns and set geometry
Bradford_OA_polygons = Bradford_OA_polygons_tidy.set_geometry('geometry').rename_geometry('Polygon')
Bradford_OA_PWC = Bradford_OA_PWC_tidy.set_geometry('geometry').rename_geometry('PWC')

Build 20-minute neighbourhoods using an 800m Euclidean buffer

As highlighted in the summary, an 800m Euclidean buffer around each PWC is used to represent the 20-minute neighbourhoods.

# Import required libraries
from shapely.geometry import Point

# Define the buffer radius (in metres)
radius_meters = 800

# Create a new column containing an 800m buffer polygon around each PWC
# (GeoSeries.buffer is vectorised, so no per-row function is needed, and the CRS is preserved)
Bradford_OA_PWC['buffer'] = Bradford_OA_PWC['PWC'].buffer(radius_meters)

# Set the 'buffer' column as the active geometry for the spatial join later on
Bradford_OA_PWC.set_geometry('buffer', inplace=True)
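
As an optional sanity check (not in the original notebook), the CRS and buffer areas can be inspected to confirm the layer is in a metre-based projection, so that the 800m radius is meaningful:

# Optional sanity check: each buffer should cover roughly pi * 800^2 ≈ 2.0 km²
# if the layer is in a metre-based CRS such as EPSG:27700 (an assumption here)
print(Bradford_OA_PWC.crs)
print((Bradford_OA_PWC['buffer'].area / 1e6).head())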

Get locations of residential properties

To obtain the locations of the houses, EPC ratings lodged since 2008 are combined with UPRN point data. Properties may have multiple EPC ratings, so the data must be ordered newest to oldest and duplicates removed, keeping only the latest rating for each property. The EPC ratings can then be merged with the UPRN data to obtain the point location of each property, which is set as the geometry for further analysis.

EPC ratings

# EPC Ratings
# https://epc.opendatacommunities.org/
EPC_Ratings = pd.read_csv('Data/Housing/domestic-E08000032-Bradford/certificates.csv')
# Step 1: Sort by date column in descending order
EPC_sorted = EPC_Ratings.sort_values(by='LODGEMENT_DATETIME', ascending=False)

# Step 2: Remove duplicates keeping the first instance based on 'UPRN' column
EPC_unique = EPC_sorted.drop_duplicates(subset='UPRN', keep='first')
# Check how many properties are in the dataframe
len(EPC_unique)

157255

UPRNs

# UPRNs
# https://osdatahub.os.uk/downloads/open/OpenUPRN
UPRNs = pd.read_csv('Data/Housing/osopenuprn_202404_csv/osopenuprn_202404.csv')
UPRNs.head()
UPRN X_COORDINATE Y_COORDINATE LATITUDE LONGITUDE
0 1 358260.66 172796.5 51.452601 -2.602075
1 26 352967.00 181077.0 51.526633 -2.679361
2 27 352967.00 181077.0 51.526633 -2.679361
3 30 354800.00 180469.0 51.521317 -2.652862
4 31 354796.00 180460.0 51.521236 -2.652918

Merge the two dataframes

# Merge EPC Ratings and UPRNs
Matched = EPC_unique.merge(UPRNs, how='inner', on='UPRN')
# Keep only columns required for analysis
Matched_trim = Matched[['UPRN','LATITUDE','LONGITUDE']].copy()
# Convert the Latitude and Longitude columns into Point objects
geometry = [Point(xy) for xy in zip(Matched_trim['LONGITUDE'], Matched_trim['LATITUDE'])]

# Convert to a GeoDataFrame
Matched_final = gpd.GeoDataFrame(Matched_trim, geometry=geometry).set_crs('epsg:4326')
# Change crs to match other data
Matched_final = Matched_final.to_crs(27700)
Matched_final.head()
UPRN LATITUDE LONGITUDE geometry
0 1.000512e+11 53.763624 -1.768966 POINT (415327.798 429721.658)
1 1.000512e+11 53.801380 -1.826545 POINT (411521.815 433911.593)
2 1.000512e+11 53.802175 -1.829245 POINT (411343.819 433999.592)
3 1.000512e+11 53.781098 -1.816461 POINT (412191.822 431656.624)
4 1.009167e+10 53.823379 -1.860954 POINT (409250.808 436354.194)
len(Matched_final)

157252

# Check length of the final dataframe is the same as the unique properties that were found
len(Matched_final) == len(EPC_unique)

False

# Check the difference to see how many properties in the EPC ratings did NOT match a UPRN (this could be because the UPRN was missing or entered incorrectly)
len(Matched_final) - len(EPC_unique)

-3
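
If needed, those few unmatched records could be inspected directly; a minimal sketch, assuming the UPRN column in both dataframes is the join key used above:

# Optional: inspect the EPC records whose UPRN did not match an OS Open UPRN point
# (sketch only, not part of the original analysis; records with missing or invalid UPRNs will appear here)
unmatched = EPC_unique[~EPC_unique['UPRN'].isin(UPRNs['UPRN'])]
print(len(unmatched))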

Mapping properties

A quick map illustrates where the properties are located.

# Import required libraries
import contextily as cx
import matplotlib.pyplot as plt
# Plotting the GeoDataFrame with basemap
ax = Matched_final.plot(figsize=(10, 6), markersize=10, color='red')

# Add basemap using contextily
cx.add_basemap(ax, zoom=12, crs=Matched_final.crs.to_string())

# Set title
plt.title('Properties identified by EPC data')

# Turn off the axes
ax.axis('off')

plt.show()

Spatially match properties to neighbourhood buffers

Next, the properties can be spatially matched to the neighbourhood buffers to determine how many properties fall within each neighbourhood. It is important to reiterate that this EPC rating methodology covers roughly 75% of all households, and does not include actual population counts, which could vary depending on the size of each property.

# Make a copy of the dataframe containing the buffers with only the required columns
Bradford_PWC_buffers = Bradford_OA_PWC[['OA21CD','buffer']].copy()
# Spatially match the properties to the buffers
joined = Bradford_PWC_buffers.sjoin(Matched_final, how="left", predicate="intersects")
# Visual check of the dataframe - properties appear more than once where neighbourhood buffers overlap
joined
OA21CD buffer index_right UPRN LATITUDE LONGITUDE
0 E00053353 POLYGON ((414438.052 439495.615, 414434.200 43... 35797 1.000519e+11 53.847941 -1.784022
0 E00053353 POLYGON ((414438.052 439495.615, 414434.200 43... 63910 2.000047e+11 53.848048 -1.783626
0 E00053353 POLYGON ((414438.052 439495.615, 414434.200 43... 58772 1.000513e+11 53.845011 -1.789341
0 E00053353 POLYGON ((414438.052 439495.615, 414434.200 43... 11651 1.000513e+11 53.845155 -1.789355
0 E00053353 POLYGON ((414438.052 439495.615, 414434.200 43... 157213 1.000513e+11 53.845362 -1.789309
... ... ... ... ... ... ...
1574 E00177810 POLYGON ((410759.264 435503.188, 410755.412 43... 119131 1.009098e+10 53.821992 -1.855611
1574 E00177810 POLYGON ((410759.264 435503.188, 410755.412 43... 50374 1.000519e+11 53.816365 -1.848081
1574 E00177810 POLYGON ((410759.264 435503.188, 410755.412 43... 31264 1.000519e+11 53.816429 -1.848704
1574 E00177810 POLYGON ((410759.264 435503.188, 410755.412 43... 100903 1.000519e+11 53.816465 -1.848628
1574 E00177810 POLYGON ((410759.264 435503.188, 410755.412 43... 154452 1.000233e+10 53.816501 -1.848430

3703801 rows × 6 columns

# Group by OA21CD and count the unique UPRNs
agg_df = joined.groupby('OA21CD')['UPRN'].nunique().reset_index()

# Rename the column to indicate the count of UPRNs
agg_df.rename(columns={'UPRN': 'UPRN_Count'}, inplace=True)
# Visual check
agg_df
OA21CD UPRN_Count
0 E00053353 214
1 E00053354 1148
2 E00053355 1405
3 E00053356 965
4 E00053357 1441
... ... ...
1570 E00177867 3780
1571 E00177868 2910
1572 E00177869 5265
1573 E00177870 2096
1574 E00177871 5169

1575 rows × 2 columns

Add OA polygons back in and map choropleth

The results are easier to visualise using the OA polygons, which removes any buffer overlap. Results show clear areas of higher-density neighbourhoods, particularly around the large towns in the district and the city centre.

# Merge dataframes to add polygons back in
for_choropleth = pd.merge(Bradford_OA_polygons, agg_df, on='OA21CD', how='inner')
# Visual check
for_choropleth
OA21CD Polygon UPRN_Count
0 E00053353 POLYGON ((415817.093 440872.597, 415821.094 44... 214
1 E00053354 POLYGON ((415078.000 439967.001, 415058.323 43... 1148
2 E00053355 POLYGON ((416252.367 439816.041, 416253.270 43... 1405
3 E00053356 POLYGON ((416668.000 439392.028, 416667.653 43... 965
4 E00053357 POLYGON ((415143.909 439176.235, 415143.000 43... 1441
... ... ... ...
1570 E00177806 POLYGON ((404057.041 446035.505, 404056.753 44... 2069
1571 E00177807 POLYGON ((418884.032 438045.945, 418866.019 43... 1790
1572 E00177808 POLYGON ((414782.332 437464.729, 414782.519 43... 3114
1573 E00177809 POLYGON ((402252.845 444586.747, 402253.000 44... 434
1574 E00177810 POLYGON ((410031.688 435427.501, 410026.053 43... 489

1575 rows × 3 columns

fig, ax = plt.subplots(figsize=(10, 6))

# Plot the choropleth map
for_choropleth.plot(column='UPRN_Count', cmap='inferno', legend=True, ax=ax, alpha=0.8)

# Add basemap using contextily
cx.add_basemap(ax, crs=for_choropleth.crs.to_string(), zoom=10)  # Set the coordinate reference system (CRS)

# Set title
plt.title('Housing density choropleth')

# Turn off the axes
ax.axis('off')

# Show the plot
plt.show()
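
A continuous colour scale can be hard to read when a few OAs have very high counts; one optional alternative (not in the original notebook, and assuming the mapclassify package is installed) is a quantile classification:

fig, ax = plt.subplots(figsize=(10, 6))

# Optional alternative: classify UPRN_Count into quantiles for a more readable map
# (assumes mapclassify is installed; sketch only, not part of the original analysis)
for_choropleth.plot(column='UPRN_Count', cmap='inferno', scheme='quantiles', k=5,
                    legend=True, ax=ax, alpha=0.8)
cx.add_basemap(ax, crs=for_choropleth.crs.to_string(), zoom=10)
plt.title('Housing density choropleth (quantile classes)')
ax.axis('off')
plt.show()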

plt.figure(figsize=(8, 6))

# Create histogram
plt.hist(for_choropleth['UPRN_Count'], bins=15, color='skyblue', edgecolor='black')  # Adjust the number of bins as needed

# Add labels and title
plt.xlabel('UPRN Count')
plt.ylabel('Frequency')
plt.title('Histogram of Housing Density')

# Show the plot
plt.show()
