# Equal Access to Cash in the context of the 20 Minute Neighbourhood

### Chuang Wang    Jianan Wei    Iain Paton

## Urban Analytics Group Project 2022-2023

## Introduction

Equal access to cash is an issue identified within the context of the 20 minute neighbourhood, alongside other factors (Olsen, Thornton, Tregonning, Mitchell 2022). Analysis indicates that 4.6% of the UK population do not have access to a free-to-use cash machine (ATM) and are required to pay a transaction fee (for example, £1.99) to withdraw cash (Financial Conduct Authority 2021). The assessment of spatial equity is well-established in Scotland, notably in Glasgow (Beairsto, Tian, Zheng, Zhao, Hong 2022), including application of the two-step floating catchment analysis.

Access to cash has also been considered in Austria (Stir, 2020), Australia (Delaney et al, 2019; Caddy and Zhang, 2021), Spain (Gonzalo and Sala, 2018; Restrepo, 2021) and Switzerland (Trütsch, 2022). These studies also look at bank closures, although often in the context of access by car. They generally conclude that there are more ATMs and bank branches in wealthier areas than in poorer areas.

This issue may have a disproportionate impact upon poorer people, who are less likely to have access to motor vehicles and may rely more on cash transactions. This analysis therefore investigates the relationship between deprivation and walkable access to cash. The issue is acknowledged by the banking sector, which operates industry subsidy schemes for ATM access (the LINK network) and further interventions to support community access to cash, which may mitigate such impacts.

## Data sources

There are two data sources: 

**1 the x/y coordinates of ATMs by type** (free or surcharging) as the "supply" or "service" element 

<div style="display:inline-block;">
    
| ATMs | Free | Surcharging | Total |
|-----------------|-----------------|-----------------|-----------------|
| Scotland | 4049 | 1026 |5075|
| Glasgow | 576 | 187 |763|
    
</div>

**2 the Scottish Index of Multiple Deprivation** as the "demand" element, as a polygon dataset

<div style="display:inline-block;">
    
| Datazones| Total|
|-----------------|-----------------|
| Scotland | 6976 | 
| Glasgow | 824 (726 attributed to Glasgow City) | 
    
</div>


## Method

The two-step Floating Catchment Analysis (2SFCA) is used to compare relative accessibility between populations and to investigate whether there is a disparity in access to free-to-use ATMs. Demand is represented by datazones and their SIMD deciles; supply is represented by cash machine (ATM) locations of the types 'free' and 'surcharging'. Catchments assume an 800m Manhattan (grid/block) walking distance, represented as a 566m radius. The analysis is conducted on an exploratory basis for Glasgow and on a fuller basis for Scotland. The 566m radius is a convenient calculation and a reasonable representation of the block distance of an 800m 20 minute neighbourhood, or of the 600m of a 15 minute neighbourhood in a denser city environment.

2SFCA is a well-established method originating in the measurement of health facility accessibility (Luo and Wang, 2003), applied in England to measure the accessibility of GP practices (Bauer et al 2018) and employed recently in Glasgow to identify new locations for bike sharing stations (Beairsto et al, 2022). The first catchment analysis is centred on supply points: each supply catchment sums the population of the demand (residential) points it contains, giving a supply to demand ratio. The second catchment analysis is centred on demand points: it sums the supply to demand ratios of the supply points within each demand catchment, and the result is then rescaled to a spatial accessibility index from 0 to 1.

Ordinary least squares (OLS) regression analysis can then be applied, with the access index as the dependent variable and the income deprivation index as the independent variable, to establish whether there is a statistical relationship.

The analysis is applied separately to "free" and "surcharging" ATM locations as supply catchments versus the SIMD-based demand catchment, for Glasgow in some detail and then for Scotland.

A more detailed look is taken with regard to the clusters of Free ATMs in Glasgow and also the extent of catchments for Surcharging ATMs for Scotland that have zero accessibility.

In addition, nearest-distance tables between demand locations and supply locations, for each type - free and surcharging - are calculated and mapped. 
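The nearest-distance calculation described above can be sketched as follows. This is a minimal illustration using straight-line distances and plain NumPy, not the code used in this notebook; the function name `nearest_supply_distance` and the coordinates are hypothetical.

```python
import numpy as np

def nearest_supply_distance(demand_xy, supply_xy):
    """For each demand point, return the index of and distance to the nearest supply point."""
    demand_xy = np.asarray(demand_xy, dtype=float)
    supply_xy = np.asarray(supply_xy, dtype=float)
    # Pairwise straight-line distances, shape (n_demand, n_supply)
    diff = demand_xy[:, None, :] - supply_xy[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    nearest_idx = dist.argmin(axis=1)
    nearest_dist = dist[np.arange(len(demand_xy)), nearest_idx]
    return nearest_idx, nearest_dist

# Hypothetical coordinates in metres (not taken from the datasets)
demand_points = [(0, 0), (1000, 0), (0, 2000)]
free_atms = [(300, 400), (1000, 500)]

idx, dist = nearest_supply_distance(demand_points, free_atms)
print(idx, dist.round(1))
```

Running the same function once against free ATM locations and once against surcharging ATM locations would give one nearest-distance table per type. For national-scale data a spatial index would probably be preferable to the full pairwise matrix used here.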



In [None]:
import numpy as np
import pandas as pd
import geopandas as gpd
from IPython.display import Markdown, display
import shapefile
import matplotlib as mpl
import matplotlib.pyplot as plt
import requests
import urllib3
import seaborn as sns
import contextily as ctx
from pandas import Series, DataFrame
from shapely.geometry import Point
from shapely.geometry import shape  
from zipfile import ZipFile
from io import StringIO
import scipy
from scipy import stats
from scipy.stats import norm
from sklearn import preprocessing
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import train_test_split
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LinearRegression
from sklearn import metrics
from IPython.display import HTML
from owslib.wfs import WebFeatureService
from owslib.util import Authentication
import math

# Left-align rendered tables in the notebook
table_css = 'table {align:left;display:block} '
display(HTML('<style>{}</style>'.format(table_css)))

%matplotlib inline

## Data: Cash Machines/ATMs

This dataset is published by the LINK network and categorises ATMs as "free" or "surcharging".

The columns and types are outlined below. There are **4049** free and **1026** surcharging ATMs in Scotland. 

In [None]:
atm_data = pd.read_csv('cashpoint_xy.csv')
atm_data.columns



The data are plotted spatially below in **Figure 1**, which provides an indication of the spatial distribution prior to joining with socio-economic data.

In [None]:
# Build point geometries from the x/y columns (British National Grid, EPSG:27700)
geometry = [Point(xy) for xy in zip(atm_data['x'], atm_data['y'])]
atm_geodata = gpd.GeoDataFrame(atm_data, crs='EPSG:27700', geometry=geometry)

fig, ax = plt.subplots(figsize=(32,16))
plt.suptitle('Figure 1. ATMs by Type', fontsize=18)
plt.ylabel('northing',fontsize=14)
plt.xlabel('easting',fontsize=14)
atm_geodata.plot(ax=ax, column = 'Charge Type',cmap = 'RdYlGn_r',label = 'Charge Type', legend=True)
ctx.add_basemap(ax, source=ctx.providers.CartoDB.Voyager,crs='EPSG:27700')
plt.show()

## Data: SIMD

The Scottish Index of Multiple Deprivation (SIMD) is a relative, area-based measure of deprivation across 6,976 small-area data zones. SIMD ranks data zones from most deprived (ranked 1) to least deprived (ranked 6,976) in each of seven categories (income, employment, education, health, access to services, crime and housing) and also provides an aggregate ranking, which is commonly arranged in deciles or quintiles.

In [None]:
url = "https://maps.gov.scot/ATOM/shapefiles/SG_SIMD_2020.zip"
simdmap_df = gpd.read_file(url)
simdmap_df.columns

In [None]:
simdmap_df.describe()

SIMD is visualised spatially below in **Figure 2**.

In [None]:
fig, ax = plt.subplots(figsize=(32,16))
plt.suptitle('Figure 2. Datazones by Multiple Deprivation Decile', fontsize=18)
plt.ylabel('northing',fontsize=14)
plt.xlabel('easting',fontsize=14)
simdmap_df.plot(ax=ax, column='Decilev2', linewidth = 0.1, legend=True)
ctx.add_basemap(ax, source=ctx.providers.CartoDB.Voyager,crs='EPSG:27700')


# Initial Catchment and Exploratory Data Analysis - Glasgow

The 2SFCA process requires two catchments: service suppliers and service users. The former can be derived from ATM locations (**Figure 1**) and the latter from datazones (**Figure 2**), which already have population estimates (Small Area Population Estimates or SAPE) and deprivation indices when joined with the Scottish Index of Multiple Deprivation as above.

Catchments can be created from the concept of a walkable "20 minute neighbourhood". The generally assumed distance is 800 metres, as recommended by Sustrans. As this is in an urban context, Manhattan block rather than Euclidean straight-line distance seems more appropriate and this is a straightforward calculation. 

## $d_E(p, q) = \sqrt{(p_x - q_x)^2 + (p_y - q_y)^2}$

## $d_M(p, q) = |p_x - q_x| + |p_y - q_y|$

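Treating the 800 metre walk as two equal 400 metre legs at right angles, the equivalent straight-line distance is $\sqrt{400^2 + 400^2} \approx 566$m, which also approximates the 600 metres that can be assumed for a 15 minute city.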
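As a worked check of the two distance formulas above, the short sketch below compares them for a point reached by walking 400m east and 400m north. The helper names are illustrative only, and the equal-legs assumption is a simplification of a real street grid.

```python
import math

def euclidean_distance(p, q):
    """Straight-line distance between two (x, y) points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def manhattan_distance(p, q):
    """Grid/block distance between two (x, y) points."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def manhattan_to_radius(walk_distance):
    """Straight-line equivalent of a grid walk made of two equal legs."""
    leg = walk_distance / 2
    return math.hypot(leg, leg)

origin, corner = (0, 0), (400, 400)
print(manhattan_distance(origin, corner))         # 800
print(round(euclidean_distance(origin, corner)))  # 566
print(round(manhattan_to_radius(800)))            # 566
```

Equal legs give the shortest straight-line equivalent for a given walking distance, so the 566m radius is a conservative catchment.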


For datazones, which are polygons of various shapes and sizes depending upon population density (500-1000 persons), a notional catchment can be derived from the area attribute, as an approximate measure of accessibility.

## $r = \sqrt{\frac{A}{\pi}}$

This can then be subject to a similar Euclidean vs Manhattan radial distance calculation as above.

In **Figure 3** below, the majority of the datazones have a catchment radius of less than **300m**. The mean distance is **566m** which provides an interesting complementarity with the Manhattan distance derived from the 20 minute neighbourhood distance of **800m**.  


In [None]:
#defining original demand and supply catchments - Scotland-wide, all attributes, polygons and points
demand_catchment_original = simdmap_df
supply_catchment_original = atm_geodata

In [None]:
# Equivalent circular radius of each datazone from its area: r = sqrt(A / pi)
demand_catchment_original["equiv_euclidean_radius"] = np.sqrt(demand_catchment_original["Shape_Area"]/np.pi)
# Treat that radius as a walk of two equal legs and take the straight-line equivalent
demand_catchment_original["euclid_radius_to_manhattan"] = 2*(demand_catchment_original["equiv_euclidean_radius"]/2)**2
demand_catchment_original["equiv_manhattan_radius"] = np.sqrt(demand_catchment_original["euclid_radius_to_manhattan"])

fig, ax = plt.subplots(figsize=(15,8))
plt.suptitle('Figure 3 Derived Catchment Distance - Datazones', fontsize=18)
demand_catchment_original['equiv_manhattan_radius'].plot(kind = 'hist', bins=20, ax=ax)
ax.set_ylabel('Number of datazones',fontsize=14)
ax.set_xlabel('Distance',fontsize=14)




In [None]:
demand_catchment_original["equiv_manhattan_radius"].describe()

A more detailed investigation of the problem focusses upon the local authority area of Glasgow City (**Figure 4**). This allows exploratory data analysis at a more granular level.

ATMs are selected for the Glasgow City area (**Figure 5**), with totals of **576 free** and **187 surcharging** ATMs.

A 566m buffer, the straight-line equivalent of an 800m Manhattan walking distance, represents the "20 minute neighbourhood" catchment and is applied to each ATM, free and surcharging (**Figure 6**).

This completes the creation of the first step catchment for Glasgow.

In [None]:
url = "https://geo.spatialhub.scot/geoserver/sh_las/wfs?service=WFS&request=GetFeature&typeName=pub_las&format_options=filename:Local%20Authority%20Boundaries%20-%20Scotland&outputFormat=application/json&authkey=b85aa063-d598-4582-8e45-e7e6048718fc"

response = requests.get(url, verify=False)
la_boundaries_df = gpd.read_file(response.url)
la_boundaries_df.columns
la_boundaries_glas = la_boundaries_df[la_boundaries_df['local_authority'] == 'Glasgow City']


In [None]:
fig, ax = plt.subplots(figsize=(32,16))
plt.suptitle('Figure 4. Glasgow City', fontsize=18)
plt.ylabel('northing',fontsize=14)
plt.xlabel('easting',fontsize=14)
la_boundaries_glas.plot(ax=ax, linewidth = 1, facecolor="none", edgecolor = "blue", legend=True)
ctx.add_basemap(ax, source=ctx.providers.CartoDB.Voyager,crs='EPSG:27700')


In [None]:
# Copy each subset so that later column assignments do not act on a view of the original
supply_catchment_free = supply_catchment_original[supply_catchment_original['Charge Type'] == 'Free'].copy()
supply_catchment_charge = supply_catchment_original[supply_catchment_original['Charge Type'] == 'Surcharging'].copy()

In [None]:
# Aggregate ATMs that share identical coordinates, so each location carries a count ('number')
# Create a new column with the X and Y coordinates as a tuple
supply_catchment_free['xy'] = supply_catchment_free['geometry'].apply(lambda geom: (geom.x, geom.y))
supply_catchment_charge['xy'] = supply_catchment_charge['geometry'].apply(lambda geom: (geom.x, geom.y))
# Group by the xy column and sum up the count of points in each group
grouped_free = supply_catchment_free.groupby('xy').size().reset_index(name='number')
grouped_charge = supply_catchment_charge.groupby('xy').size().reset_index(name='number')
# Create a new geometry column with Point objects for each X and Y coordinate
grouped_free['geometry'] = grouped_free['xy'].apply(lambda xy: Point(xy))
grouped_charge['geometry'] = grouped_charge['xy'].apply(lambda xy: Point(xy))
# Convert back to a GeoDataFrame
grouped_free = gpd.GeoDataFrame(grouped_free, geometry='geometry', crs='EPSG:27700')
grouped_charge = gpd.GeoDataFrame(grouped_charge, geometry='geometry', crs='EPSG:27700')
supply_catchment_free = grouped_free
supply_catchment_free['Charge Type'] = 'Free'
supply_catchment_charge = grouped_charge
supply_catchment_charge['Charge Type'] = 'Surcharging'
supply_catchment_all = pd.concat([supply_catchment_free, supply_catchment_charge])

In [None]:
supply_catchment_charge#['number'].mean()

In [None]:
supply_catchment_free#['number'].mean()

## Clustering of Free ATMs

One feature observed during this analysis is the clustering of free ATMs in Glasgow. This is summarised below and shown in **Figure 5**.

**Scotland**
- Total Free: 4049
- Total Charge: 1026
- Clusters Free: 2896
- Clusters Charge: 983
- Mean Free: 1.40
- Mean Charge: 1.04

**Glasgow**
- Total Free: 576
- Total Charge: 187
- Clusters Free: 384
- Clusters Charge: 181
- Mean Free: 1.50
- Mean Charge: 1.03

Free ATMs are noticeably clustered: on average there are more free ATMs than surcharging ATMs per distinct location, which should be considered alongside their greater overall numbers and prevalence.
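The summary figures above can be read as total ATMs divided by the number of distinct coordinate locations, assuming "Clusters" counts distinct locations as in the grouping step. The sketch below illustrates that calculation on a few hypothetical coordinates and then re-derives the reported means from the totals; `cluster_summary` is an illustrative helper, not part of the notebook.

```python
from collections import Counter

def cluster_summary(coords):
    """Return (total ATMs, distinct locations, mean ATMs per location)."""
    counts = Counter(coords)          # ATMs per identical (x, y) location
    total = sum(counts.values())
    locations = len(counts)
    return total, locations, total / locations

# Hypothetical coordinates: three ATMs share one location
atm_coords = [(259000, 665000), (259000, 665000), (259000, 665000),
              (260500, 664200), (258100, 666300)]
total, locations, mean_per_location = cluster_summary(atm_coords)
print(total, locations, round(mean_per_location, 2))

# Re-deriving the reported means from the totals and location counts above
print(round(4049 / 2896, 2), round(1026 / 983, 2))  # Scotland: free, surcharging
print(round(576 / 384, 2), round(187 / 181, 2))     # Glasgow: free, surcharging
```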

In [None]:
#clip the supply points to the Glasgow City boundary
supply_catchment_glasgow_free = gpd.overlay(supply_catchment_free, la_boundaries_glas, how='intersection')
supply_catchment_glasgow_charge = gpd.overlay(supply_catchment_charge, la_boundaries_glas, how='intersection')
supply_catchment_glasgow_all = gpd.overlay(supply_catchment_all, la_boundaries_glas, how='intersection')
#plot, with marker size scaled by the number of ATMs at each location
fig, ax = plt.subplots(figsize=(32,16))
supply_catchment_glasgow_all.plot(ax=ax,column = 'Charge Type',cmap = 'RdYlGn_r', label = 'Charge Type', markersize = supply_catchment_glasgow_all['number'].astype(float)*50, legend=True)
la_boundaries_glas.plot(ax=ax, linewidth = 1, facecolor="none", edgecolor = "blue")
plt.suptitle('Figure 5. Glasgow City ATMs by type', fontsize=18)
ctx.add_basemap(ax,source=ctx.providers.CartoDB.Voyager,crs='EPSG:27700')


In [None]:
supply_catchment_glasgow_charge['number'].sum()


In [None]:
supply_catchment_glasgow_free['number'].sum()

In [None]:
supply_catchment_glasgow_free['centroid'] = supply_catchment_glasgow_free['geometry']
supply_catchment_glasgow_charge['centroid'] = supply_catchment_glasgow_charge['geometry']
supply_catchment_glasgow_free['geometry'] = supply_catchment_glasgow_free.geometry.buffer(566)
supply_catchment_glasgow_charge['geometry'] = supply_catchment_glasgow_charge.geometry.buffer(566)


In [None]:
fig, ax = plt.subplots(figsize=(32,16))
supply_catchment_glasgow_free.plot(ax=ax,facecolor='none', edgecolor = "green")
supply_catchment_glasgow_charge.plot(ax=ax,facecolor = "none", edgecolor='red')
plt.suptitle('Figure 6. Glasgow City ATMs - Free and Surcharging - 566m buffer', fontsize=18)
ctx.add_basemap(ax,source=ctx.providers.CartoDB.Voyager,crs='EPSG:27700')

## Datazones - A Walkable Subset for Glasgow

The demand catchments apply an assumed walkable distance of 566m (an 800m Manhattan grid distance expressed as a straight-line radius). Not all datazones are walkable, so a subset is selected whose derived catchment radius does not exceed this distance, which eliminates large park areas. The difference is shown in **Figures 7 and 8** below, clipped for Glasgow. The clipped set includes some datazones not attributed to Glasgow City Council.

In [None]:
#intersection
demand_catchment_glasgow = gpd.overlay(demand_catchment_original, la_boundaries_glas, how='intersection')
#plot
fig, ax = plt.subplots(figsize=(32,16))
plt.suptitle('Figure 7. Glasgow SIMD', fontsize=18)
plt.ylabel('northing',fontsize=14)
plt.xlabel('easting',fontsize=14)
demand_catchment_glasgow.plot(ax=ax, column='Decilev2', linewidth = 0.1, legend=True)
ctx.add_basemap(ax, source=ctx.providers.CartoDB.Voyager,crs='EPSG:27700')


In [None]:
#keeps only datazones whose derived Manhattan-equivalent radius is walkable (copied to allow later column assignment)
demand_catchment_manhattan_glasgow = demand_catchment_glasgow[demand_catchment_glasgow["equiv_manhattan_radius"]<=567 ].copy()

In [None]:
fig, ax = plt.subplots(figsize=(32,16))
plt.suptitle('Figure 8. Glasgow SIMD Manhattan Walkable Datazones', fontsize=18)
plt.ylabel('northing',fontsize=14)
plt.xlabel('easting',fontsize=14)
demand_catchment_manhattan_glasgow.plot(ax=ax, column='Decilev2', linewidth = 0.1, legend=True)
ctx.add_basemap(ax, source=ctx.providers.CartoDB.Voyager,crs='EPSG:27700')


In [None]:
demand_catchment_manhattan_glasgow.count()

## Glasgow - Two Step Floating Catchment Analysis

**Step 1** is to apply the 566m catchment to the *service points*, i.e. the Free and Surcharging cash machines, as per **Figures 5** and **6** above, selecting the intersecting *demand locations*.

*i* indexes the *n* service point supply locations, i.e. the Free and Surcharging cash machines by each type at each discrete location.

*j* indexes the *m* demand locations, i.e. the centroids of the datazones.

## $s_{ij} = \frac{a_i}{\sum_{j \in B_i} d_j}$

Here $a_i$ is the number of ATMs at supply location *i*, $d_j$ is the population of demand location *j* and $B_i$ is the set of demand locations within the catchment of *i*.

**Step 2** is to apply the 566m catchment to the *demand locations*, i.e. the datazones, as per **Figure 8** above. For each *demand catchment*, the earlier ratios of the supply locations it contains (the set $S_j$) are summed to calculate an index of accessibility.

## $A_j = \sum_{i \in S_j} s_{ij}$
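The two steps can be illustrated on a toy example. The sketch below is a simplified NumPy version using straight-line distances between points, not the GeoPandas workflow that follows; the function names, coordinates and populations are hypothetical, and the final min-max rescaling to a 0 to 1 index is included for completeness.

```python
import numpy as np

def two_step_fca(supply_xy, supply_count, demand_xy, demand_pop, radius):
    """Return (supply-to-demand ratio per supply point, accessibility per demand point)."""
    supply_xy = np.asarray(supply_xy, dtype=float)
    demand_xy = np.asarray(demand_xy, dtype=float)
    supply_count = np.asarray(supply_count, dtype=float)
    demand_pop = np.asarray(demand_pop, dtype=float)
    # within[i, j] is 1 when demand point j lies inside the catchment of supply point i
    diff = supply_xy[:, None, :] - demand_xy[None, :, :]
    within = (np.sqrt((diff ** 2).sum(axis=2)) <= radius).astype(float)
    # Step 1: ATMs at each supply location divided by the population in its catchment
    pop_in_catchment = within @ demand_pop
    ratio = np.divide(supply_count, pop_in_catchment,
                      out=np.zeros_like(supply_count), where=pop_in_catchment > 0)
    # Step 2: sum the ratios of all supply locations inside each demand catchment
    access = within.T @ ratio
    return ratio, access

def min_max_index(values):
    """Rescale values to a 0 to 1 index."""
    values = np.asarray(values, dtype=float)
    span = values.max() - values.min()
    return (values - values.min()) / span if span > 0 else np.zeros_like(values)

# Hypothetical locations along a line (metres), ATM counts and populations
supply_xy = [(0, 0), (1000, 0)]
supply_count = [2, 1]
demand_xy = [(100, 0), (400, 0), (900, 0), (3000, 0)]
demand_pop = [500, 500, 1000, 800]

ratio, access = two_step_fca(supply_xy, supply_count, demand_xy, demand_pop, radius=566)
index = min_max_index(access)
print(ratio, access, index)
```

In this toy case the last demand point has no ATM within 566m and so receives an accessibility of zero, which mirrors the zero-accessibility catchments discussed for Scotland.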



In [None]:
#step 1 - obtain centroids, then buffer them by 566m to form the demand catchments
demand_catchment_manhattan_glasgow["centroid"] = demand_catchment_manhattan_glasgow["geometry"].centroid
demand_catchment_manhattan_glasgow["geometry"] = demand_catchment_manhattan_glasgow["centroid"].buffer(566)
#note: the names below are aliases of the same table, not copies
demand_catchment_manhattan_glasgow_centroid = demand_catchment_manhattan_glasgow
step1_demand_catchment = demand_catchment_manhattan_glasgow

#plot
fig, ax = plt.subplots(figsize=(32,16))
plt.suptitle('Figure 9. Glasgow SIMD Manhattan Catchments', fontsize=18)
plt.ylabel('northing',fontsize=14)
plt.xlabel('easting',fontsize=14)
step1_demand_catchment.plot(ax=ax, facecolor="none", linewidth = 1, edgecolor = "blue")
ctx.add_basemap(ax, source=ctx.providers.CartoDB.Voyager,crs='EPSG:27700')

In [None]:
#supply catchments - create index
supply_catchment_glasgow_charge['index_column'] = supply_catchment_glasgow_charge.index
supply_catchment_glasgow_free['index_column'] = supply_catchment_glasgow_free.index

In [None]:
#flipping geometry to centroid for demand catchment - can flip back later
step1_demand_catchment['polygon'] = step1_demand_catchment['geometry']
step1_demand_catchment['geometry'] = step1_demand_catchment['centroid']

## Glasgow - *Step 1*

Figure 10 below illustrates **Step 1** - the supply catchments applied to demand centroids, for both Free and Surcharging ATMs, separately. As part of this process, the ratio of service supply to population demand is calculated.

In [None]:
#adding in population representative points for Demand
demand_catchment_manhattan_glasgow_centroidsonly = demand_catchment_manhattan_glasgow
demand_catchment_manhattan_glasgow_centroidsonly["geometry"] = demand_catchment_manhattan_glasgow_centroidsonly["centroid"] 
demand_catchment_manhattan_glasgow_centroidsonly['pop_index'] = (demand_catchment_manhattan_glasgow_centroidsonly['SAPE2017'] - demand_catchment_manhattan_glasgow_centroidsonly['SAPE2017'].min()) / (demand_catchment_manhattan_glasgow_centroidsonly['SAPE2017'].max() - demand_catchment_manhattan_glasgow_centroidsonly['SAPE2017'].min())
demand_catchment_manhattan_glasgow_centroidsonly['pop_index'] = demand_catchment_manhattan_glasgow_centroidsonly['pop_index']*20
demand_catchment_manhattan_glasgow_centroidsonly['pop_index'] = round(demand_catchment_manhattan_glasgow_centroidsonly['pop_index'],0)

#plot
fig, ax = plt.subplots(figsize=(32,16))
plt.suptitle('Figure 10. Step 1: ATM catchment vs demand catchment centroids (size from population)', fontsize=18)
plt.ylabel('northing',fontsize=14)
plt.xlabel('easting',fontsize=14)
demand_catchment_manhattan_glasgow_centroidsonly.plot(ax=ax, color = "blue", markersize=demand_catchment_manhattan_glasgow_centroidsonly['pop_index']*20)
supply_catchment_glasgow_charge.plot(ax=ax,edgecolor='red',facecolor="none")
supply_catchment_glasgow_free.plot(ax=ax,edgecolor='green',facecolor="none")
ctx.add_basemap(ax, source=ctx.providers.CartoDB.Voyager,crs='EPSG:27700')

In [None]:
#first step part 1 joining demand layer to supply layers
step1_charge_supply = gpd.sjoin(supply_catchment_glasgow_charge, step1_demand_catchment, how='left')
step1_free_supply = gpd.sjoin(supply_catchment_glasgow_free, step1_demand_catchment, how='left')

In [None]:
#first step part 2 dissolve each supply catchment and sum the joined datazone attributes
step1_charge_supply = step1_charge_supply.dissolve(by = 'index_column', aggfunc='sum')
step1_free_supply = step1_free_supply.dissolve(by = 'index_column', aggfunc='sum')
step1_charge_supply['total_pop_s1'] = step1_charge_supply['SAPE2017']
step1_free_supply['total_pop_s1'] = step1_free_supply['SAPE2017']
#supply-to-demand ratio: 'number' divided by the population within the catchment
#note: aggfunc='sum' also sums 'number' across every joined datazone row, so it is no longer the per-location ATM count
step1_charge_supply['s_d_ratio_charge'] = step1_charge_supply['number']/step1_charge_supply['total_pop_s1']
step1_free_supply['s_d_ratio_free'] = step1_free_supply['number']/step1_free_supply['total_pop_s1']

In [None]:
#removing surcharging supply catchments with no population, for which the ratio is undefined
step1_charge_supply = step1_charge_supply[step1_charge_supply["total_pop_s1"]>0].copy()

In [None]:
#second step part 1 revert supply catchments to centroids
#note: these are aliases, not copies, so the step 1 tables are modified as well
step2_charge_supply = step1_charge_supply
step2_free_supply = step1_free_supply

step2_charge_supply['centroid'] = step2_charge_supply.centroid
step2_charge_supply['geometry'] = step2_charge_supply['centroid'] 

step2_free_supply['centroid'] = step2_free_supply.centroid
step2_free_supply['geometry'] = step2_free_supply['centroid'] 
    

In [None]:
#second step part 2 revert the demand catchment geometry from centroids back to the 566m buffer polygons
step1_demand_catchment['geometry'] = step1_demand_catchment['polygon']

## Glasgow - *Step 2*

Figure 11 below illustrates **Step 2** - the demand catchments applied to supply centroids, for both Free and Surcharging ATMs, separately, aggregating the supply vs population ratios from Step 1.

In [None]:
fig, ax = plt.subplots(figsize=(32,16))
plt.suptitle('Figure 11. Step 2: Demand catchments vs ATM centroids', fontsize=18)
plt.ylabel('northing',fontsize=14)
plt.xlabel('easting',fontsize=14)
step1_demand_catchment.plot(ax=ax, facecolor="none", edgecolor="blue", linewidth = 0.5)
supply_catchment_glasgow_all.plot(ax=ax,column = 'Charge Type',cmap = 'RdYlGn_r',label = 'Charge Type', markersize = supply_catchment_glasgow_all['number'].astype(float)*50, legend=True,figsize=(32,16))

ctx.add_basemap(ax, source=ctx.providers.CartoDB.Voyager,crs='EPSG:27700')

In [None]:
del step2_charge_supply['index_right']
del step2_free_supply['index_right'] 

In [None]:
#second step part 1 charge, join simd to joined atms
step2_charge_demand_join = gpd.sjoin(step1_demand_catchment, step2_charge_supply, how='left')
step2_free_demand_join = gpd.sjoin(step1_demand_catchment, step2_free_supply, how='left')

In [None]:
step2_charge_demand_dissolve = step2_charge_demand_join.dissolve(by = 'DataZone', aggfunc='sum')
step2_free_demand_dissolve = step2_free_demand_join.dissolve(by = 'DataZone', aggfunc='sum')

In [None]:
step2_charge_demand_sd_ratio = step2_charge_demand_dissolve[['s_d_ratio_charge']].copy()
step2_free_demand_sd_ratio = step2_free_demand_dissolve[['s_d_ratio_free']].copy()

In [None]:
step2_charge_demand_sd_ratio['DataZone'] = step2_charge_demand_sd_ratio.index
step2_free_demand_sd_ratio['DataZone'] = step2_free_demand_sd_ratio.index

In [None]:
#create a common ratio column and an ATM type column so both types can share one index scale
step2_free_demand_sd_ratio['s_d_ratio_for indexing'] = step2_free_demand_sd_ratio['s_d_ratio_free']
step2_free_demand_sd_ratio['atm_type'] = 'free'
step2_charge_demand_sd_ratio['s_d_ratio_for indexing'] = step2_charge_demand_sd_ratio['s_d_ratio_charge']
step2_charge_demand_sd_ratio['atm_type'] = 'surcharging'

In [None]:
#concatenate the two tables into one
step2_all_demand_sd_index = pd.concat([step2_free_demand_sd_ratio,step2_charge_demand_sd_ratio])

In [None]:
#min-max rescale the summed ratios to a 0 to 1 accessibility index across both ATM types
step2_all_demand_sd_index['access_index_all'] = (step2_all_demand_sd_index['s_d_ratio_for indexing'] - step2_all_demand_sd_index['s_d_ratio_for indexing'].min()) / (step2_all_demand_sd_index['s_d_ratio_for indexing'].max() - step2_all_demand_sd_index['s_d_ratio_for indexing'].min()) 

In [None]:
#split into separate tables by ATM type again (copied to allow in-place changes)
step2_free_demand_sd_ratio_indexed = step2_all_demand_sd_index[step2_all_demand_sd_index['atm_type'] == 'free'].copy()
step2_charge_demand_sd_ratio_indexed = step2_all_demand_sd_index[step2_all_demand_sd_index['atm_type'] == 'surcharging'].copy()


In [None]:
#rejoin to original dz dataset
step2_free_demand_sd_ratio_indexed.reset_index(drop = True, inplace = True) 
step2_free_demand_sd_ratio_indexed_joined = demand_catchment_glasgow.merge(step2_free_demand_sd_ratio_indexed, on='DataZone')
step2_charge_demand_sd_ratio_indexed.reset_index(drop = True, inplace = True) 
step2_charge_demand_sd_ratio_indexed_joined= demand_catchment_glasgow.merge(step2_charge_demand_sd_ratio_indexed, on='DataZone')

In [None]:
fig, ax = plt.subplots(figsize=(32,16))
plt.suptitle('Figure 12. Glasgow SIMD 2SFCA - Accessibility of Surcharging ATMs by datazone', fontsize=18)
plt.ylabel('northing',fontsize=14)
plt.xlabel('easting',fontsize=14)
step2_charge_demand_sd_ratio_indexed_joined.plot(ax=ax, column='access_index_all', linewidth = 0.1, vmax=1, legend=True)
ctx.add_basemap(ax, source=ctx.providers.CartoDB.Voyager,crs='EPSG:27700')
#looks fine as most access in city centre

In [None]:
fig, ax = plt.subplots(figsize=(32,16))
plt.suptitle('Figure 13. Glasgow SIMD 2SFCA - Accessibility of Free ATMs by datazone', fontsize=18)
plt.ylabel('northing',fontsize=14)
plt.xlabel('easting',fontsize=14)
step2_free_demand_sd_ratio_indexed_joined.plot(ax=ax, column='access_index_all', linewidth = 0.1, legend=True)
ctx.add_basemap(ax, source=ctx.providers.CartoDB.Voyager,crs='EPSG:27700')
#looks fine as most access in city centre

The preceding **Figures 12 and 13** illustrate the relative accessibility of Surcharging and Free ATMs by datazone, following the conclusion of the 2SFCA process. It is clear that (a) there is greater overall accessibility for Free ATMs versus Surcharging ATMs and (b) there are some notable hot spots, particularly the city centre.

**Figures 14 and 15** below provide an overview of the distribution of accessibility: most datazones are not particularly accessible, with predominantly lower indices of accessibility. For Free ATMs, this could be influenced by the clustering of facilities observed earlier.

In [None]:
fig, ax = plt.subplots(figsize=(15,8))
plt.suptitle('Figure 14. Distribution of Accessibility Index - Surcharging ATMs', fontsize=18)
step2_charge_demand_sd_ratio_indexed_joined['access_index_all'].plot(kind = 'hist', bins=20, ax=ax)
ax.set_ylabel('Number of datazones',fontsize=14)
ax.set_xlabel('Index',fontsize=14)

In [None]:
fig, ax = plt.subplots(figsize=(15,8))
plt.suptitle('Figure 15. Distribution of Accessibility Index - Free ATMs', fontsize=18)
step2_free_demand_sd_ratio_indexed_joined['access_index_all'].plot(kind = 'hist', bins=20, ax=ax)
ax.set_ylabel('Number of datazones',fontsize=14)
ax.set_xlabel('Index',fontsize=14)

A bivariate regression plot (**Figure 16** below) illustrates the relationship between datazone accessibility index and the deprivation ranking (income deprivation) for Free and Surcharging ATMs. Relationships can be observed, but it is also clear that there are some significant outliers for Free ATMs in particular, possibly influenced by the clustering observed earlier.

In [None]:
sns.set_style('ticks')
plt.figure(figsize=(15,8))
ax = sns.regplot(x='IncRankv2', y='access_index_all',data=step2_free_demand_sd_ratio_indexed_joined, color="green")
ax = sns.regplot(x='IncRankv2', y='access_index_all',data=step2_charge_demand_sd_ratio_indexed_joined, color="red")
plt.tick_params(axis='both', which='major', labelsize=14)
ax.set_title('Figure 16. Accessibility Index by SIMD Rank - All Surcharging and Free ATMs', fontsize = 24)
ax.set_ylabel('access index',fontsize=14)
ax.set_xlabel('rank',fontsize=14)
plt.legend(labels=["access index free","model fit","95% confidence","access index charge","model fit","95% confidence"])



A closer look at the most accessible Free ATM datazones (**Figure 17**) indicates a discrete number of clusters, including the city centre. Most of these are in less income-deprived areas, but with one cluster of deprivation. 

In [None]:
step2_free_demand_sd_ratio_indexed_joined_highaccessibility = step2_free_demand_sd_ratio_indexed_joined[step2_free_demand_sd_ratio_indexed_joined['access_index_all']>0.2]
supply_catchment_glasgow_free_clustercheck = supply_catchment_glasgow_all[supply_catchment_glasgow_all['Charge Type'] == 'Free']
highaccess_free_atms = gpd.overlay(supply_catchment_glasgow_free_clustercheck, step2_free_demand_sd_ratio_indexed_joined_highaccessibility , how='intersection')
fig, ax = plt.subplots(figsize=(32,16))
plt.suptitle('Figure 17. Glasgow - Most Accessible Datazones - Free ATMs', fontsize=24)
plt.ylabel('northing',fontsize=14)
plt.xlabel('easting',fontsize=14)
step2_free_demand_sd_ratio_indexed_joined_highaccessibility.plot(ax=ax, column='Decilev2', linewidth = 0.1, legend=True)
highaccess_free_atms.plot(ax=ax,markersize = highaccess_free_atms['number'].astype(float)*50, color='green')
ctx.add_basemap(ax, source=ctx.providers.CartoDB.Voyager,crs='EPSG:27700')

Removal of outliers, in this case accessibility indices of 0.4 and above for Free ATMs and of 0.05 and above for Surcharging ATMs (see **Figures 14 and 15** above), provides an alternative case for modelling and statistical analysis. This is shown in **Figure 18** below as a variation of **Figure 16**.

In [None]:
#remove outlier
step2_free_demand_sd_ratio_indexed_joined_filter = step2_free_demand_sd_ratio_indexed_joined[(step2_free_demand_sd_ratio_indexed_joined.access_index_all < 0.4)]
step2_charge_demand_sd_ratio_indexed_joined_filter = step2_charge_demand_sd_ratio_indexed_joined[(step2_charge_demand_sd_ratio_indexed_joined.access_index_all <0.05)]

In [None]:
sns.set_style('ticks')
plt.figure(figsize=(15,8))
ax = sns.regplot(x='IncRankv2', y='access_index_all',data=step2_free_demand_sd_ratio_indexed_joined_filter, color="green")
ax = sns.regplot(x='IncRankv2', y='access_index_all',data=step2_charge_demand_sd_ratio_indexed_joined_filter, color="red")
plt.tick_params(axis='both', which='major', labelsize=14)
ax.set_title('Figure 18. Accessibility Index by SIMD Rank - Filtered Surcharging and Free ATMs', fontsize = 24)
ax.set_ylabel('access index',fontsize=14)
ax.set_xlabel('rank',fontsize=14)
plt.legend(labels=["access index free","model fit","95% confidence","access index charge","model fit","95% confidence"])

##  Glasgow - Statistical Analysis 

For the following sections, an ordinary least squares regression model is applied to datazones for Free and Surcharging ATMs, with the SIMD income rank as the independent variable and the accessibility index as the dependent variable.

The hypotheses are that (a) free ATMs are more accessible in more affluent areas and (b) surcharging ATMs are more accessible in more deprived areas.
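
As a minimal sketch of what each regression cell below estimates, the slope, intercept and an approximate p value for a single-predictor model can be computed by hand. The function name `simple_ols` and the toy numbers are hypothetical, and the p value uses a normal approximation to the t distribution, so it will differ slightly from the `statsmodels` output on small samples.

```python
import math

def simple_ols(x, y):
    """Fit y = intercept + slope * x and return (intercept, slope, approx_p_value)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares, used for the standard error of the slope
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    se_slope = math.sqrt(rss / (n - 2) / sxx)
    if se_slope == 0:
        return intercept, slope, 0.0  # perfect fit
    t_stat = slope / se_slope
    # Two-sided p value via a normal approximation (an assumption; adequate for large n)
    p_value = math.erfc(abs(t_stat) / math.sqrt(2))
    return intercept, slope, p_value

# Hypothetical toy data: income rank against accessibility index
toy_rank = [100, 200, 300, 400, 500, 600]
toy_access = [0.010, 0.012, 0.011, 0.015, 0.014, 0.017]
toy_intercept, toy_slope, toy_p = simple_ols(toy_rank, toy_access)
print(f"intercept={toy_intercept:.4f} slope={toy_slope:.6f} p~{toy_p:.3f}")
```

A positive slope here would correspond to hypothesis (a), and a negative slope to hypothesis (b); the p value indicates how readily a slope of that size could arise by chance.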

In [None]:
import statsmodels.api as sm

#not filtered then filtered models from above graphs

independent_free_notfiltered = step2_free_demand_sd_ratio_indexed_joined['IncRankv2']
dependent_free_notfiltered = step2_free_demand_sd_ratio_indexed_joined['access_index_all']

independent_charge_notfiltered = step2_charge_demand_sd_ratio_indexed_joined['IncRankv2']
dependent_charge_notfiltered = step2_charge_demand_sd_ratio_indexed_joined['access_index_all']

independent_free_filtered = step2_free_demand_sd_ratio_indexed_joined_filter['IncRankv2']
dependent_free_filtered = step2_free_demand_sd_ratio_indexed_joined_filter['access_index_all']

independent_charge_filtered = step2_charge_demand_sd_ratio_indexed_joined_filter['IncRankv2']
dependent_charge_filtered = step2_charge_demand_sd_ratio_indexed_joined_filter['access_index_all']



In [None]:
#free ATMs not filtered
X_train,X_test, Y_train, Y_test = train_test_split(independent_free_notfiltered, dependent_free_notfiltered, test_size = .2, random_state = 10)

x_incl_cons = sm.add_constant(X_train)
model = sm.OLS(Y_train, x_incl_cons)  
results = model.fit()  


pd.DataFrame({'coef': results.params , 'pvalue': round(results.pvalues,3)})

In [None]:
#charge ATMs not filtered
X_train,X_test, Y_train, Y_test = train_test_split(independent_charge_notfiltered, dependent_charge_notfiltered, test_size = .2, random_state = 10)

x_incl_cons = sm.add_constant(X_train)
model = sm.OLS(Y_train, x_incl_cons)  
results = model.fit()  

pd.DataFrame({'coef': results.params , 'pvalue': round(results.pvalues,3)})

In [None]:
#charge atms filtered

X_train,X_test, Y_train, Y_test = train_test_split(independent_charge_filtered, dependent_charge_filtered, test_size = .2, random_state = 10)

x_incl_cons = sm.add_constant(X_train)
model = sm.OLS(Y_train, x_incl_cons)  
results = model.fit()  

pd.DataFrame({'coef': results.params , 'pvalue': round(results.pvalues,3)})

In [None]:
#free atms filtered
X_train,X_test, Y_train, Y_test = train_test_split(independent_free_filtered, dependent_free_filtered, test_size = .2, random_state = 10)

x_incl_cons = sm.add_constant(X_train)
model = sm.OLS(Y_train, x_incl_cons)  
results = model.fit()  

pd.DataFrame({'coef': results.params , 'pvalue': round(results.pvalues,3)})

The null hypotheses can be rejected, with some caveats and observations:

1. There is a statistically-significant relationship for the entire dataset for Free ATMs, with a relationship between income ranking and accessibility (p value of 0.001). However, this does not hold true when the outliers are removed (p value of 0.184). The influence of clustered ATMs (see Figure 17) may be significant.

2. There is not a statistically significant relationship between income ranking and accessibility for Surcharging ATMs (p value of 0.926) until outliers are removed (p value of 0.042). As observed earlier, there are fewer Surcharging than Free ATMs, even with the clustering phenomenon for the latter category. 

The greater accessibility of clustered free ATMs may be a more significant observation than the mixed observations regarding Surcharging ATMs. 

With this in mind, a Nearest Points assessment can be carried out to identify the nearest ATM and type and any possible relationship between distance by type and income deprivation.


## Glasgow - Nearest Points

The nearest point will either be a Free or Surcharging ATM for each datazone, measured from the centroid of each datazone.
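
The nearest-point lookup can be sketched with a brute-force search, which returns the same index and distance pairs that a k-d tree query returns, only more slowly. The function name `nearest_supply` and the coordinates are hypothetical and chosen only for illustration.

```python
import math

def nearest_supply(demand_points, supply_points):
    """For each demand (x, y), return (index of nearest supply point, distance)."""
    results = []
    for dx, dy in demand_points:
        best_idx, best_dist = None, math.inf
        for idx, (sx, sy) in enumerate(supply_points):
            d = math.hypot(sx - dx, sy - dy)  # straight-line distance in map units
            if d < best_dist:
                best_idx, best_dist = idx, d
        results.append((best_idx, best_dist))
    return results

# Hypothetical datazone centroids and ATM locations (metres, projected coordinates)
toy_centroids = [(0.0, 0.0), (1000.0, 0.0)]
toy_atms = [(300.0, 400.0), (1000.0, 250.0)]
toy_atm_types = ["Free", "Surcharging"]

toy_nearest = nearest_supply(toy_centroids, toy_atms)
for centroid, (idx, dist) in zip(toy_centroids, toy_nearest):
    print(centroid, toy_atm_types[idx], round(dist, 1))
```

Each datazone is assigned exactly one ATM type by this approach, so a datazone whose nearest ATM is Free says nothing about how far away the nearest Surcharging ATM is.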

In [None]:
from scipy.spatial import distance_matrix
from scipy.stats import rdist
from scipy.spatial.distance import pdist, squareform
from shapely.ops import nearest_points

In [None]:
step1_demand_catchment['geometry'] = step1_demand_catchment['centroid']


In [None]:
#this will check the nearest supply ATMs by type, to residential demand centroids
from scipy.spatial import cKDTree
from shapely.geometry import Point

demand_dist = step1_demand_catchment
supply_dist = supply_catchment_glasgow_all

def ckdnearest(gdA, gdB):

    nA = np.array(list(gdA.geometry.apply(lambda x: (x.x, x.y))))
    nB = np.array(list(gdB.geometry.apply(lambda x: (x.x, x.y))))
    btree = cKDTree(nB)
    dist, idx = btree.query(nA, k=1)
    gdB_nearest = gdB.iloc[idx].drop(columns="geometry").reset_index(drop=True)
    gdf = pd.concat(
        [
            gdA.reset_index(drop=True),
            gdB_nearest,
            pd.Series(dist, name='dist')
        ], 
        axis=1)

    return gdf

nearest_dist = ckdnearest(demand_dist, supply_dist)

In [None]:
nearest_dist_charge = nearest_dist[nearest_dist['Charge Type'] == 'Surcharging']
nearest_dist_free = nearest_dist[nearest_dist['Charge Type'] == 'Free']
nearest_dist_charge = nearest_dist_charge[['DataZone','Charge Type','dist']]
nearest_dist_free = nearest_dist_free[['DataZone','Charge Type','dist']]


In [None]:
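#numeric_only avoids an error on the text columns in recent pandas versions
nearest_dist_charge.mean(numeric_only=True)

In [None]:
nearest_dist_free.mean(numeric_only=True)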

In [None]:
demand_catchment_glasgow_nearest_charge = demand_catchment_glasgow.merge(nearest_dist_charge, on='DataZone')
demand_catchment_glasgow_nearest_free = demand_catchment_glasgow.merge(nearest_dist_free, on='DataZone')

The nearest ATMs by datazone are mapped in **Figure 19** below. In addition, the mean distances from demand to service facility are 332 metres for Surcharging ATMs and 320 metres for Free ATMs, a marginal difference. This is also plotted at **Figure 20** below.

In [None]:
fig, ax = plt.subplots(figsize=(32,16))
plt.suptitle('Figure 19. Nearest ATMs - Charge vs Free - Glasgow', fontsize=18)
plt.ylabel('northing',fontsize=14)
plt.xlabel('easting',fontsize=14)
demand_catchment_glasgow_nearest_charge.plot(ax=ax,edgecolor='red',facecolor='red',legend=True,)
demand_catchment_glasgow_nearest_free.plot(ax=ax,edgecolor='green',facecolor='green',legend=True,)
ctx.add_basemap(ax, source=ctx.providers.CartoDB.Voyager,crs='EPSG:27700')

**Figure 20** indicates that there is a relationship between income deprivation rank and distance to nearest ATM of both types, with distances increasing as wealth increases. This is more noticeable for Surcharging ATMs, possibly a feature of suburban car-using affluence versus urban deprivation.

In [None]:
sns.set_style('ticks')
plt.figure(figsize=(15,8))
ax = sns.regplot(x='IncRankv2', y='dist',data=demand_catchment_glasgow_nearest_free, color="green")
ax = sns.regplot(x='IncRankv2', y='dist',data=demand_catchment_glasgow_nearest_charge, color="red")
plt.tick_params(axis='both', which='major', labelsize=14)
ax.set_title('Figure 20. Nearest ATM distance by type vs income deprivation- Glasgow', fontsize = 24)
ax.set_ylabel('distance',fontsize=14)
ax.set_xlabel('rank',fontsize=14)
plt.legend(labels=["distance metres free","model fit","95% confidence","distance metres charge","model fit","95% confidence"])

As with accessibility index versus SIMD income ranking, the relationship can be modelled, with the hypothesis that the distance to the nearest ATM increases with income rank. In the modelling below, this holds for both Surcharging ATMs and Free ATMs, with p values close to zero.

In [None]:
independent_free_closest = demand_catchment_glasgow_nearest_free['IncRankv2']
dependent_free_closest = demand_catchment_glasgow_nearest_free['dist']

independent_charge_closest = demand_catchment_glasgow_nearest_charge['IncRankv2']
dependent_charge_closest= demand_catchment_glasgow_nearest_charge['dist']

In [None]:
X_train,X_test, Y_train, Y_test = train_test_split(independent_free_closest, dependent_free_closest, test_size = .2, random_state = 10)

x_incl_cons = sm.add_constant(X_train)
model = sm.OLS(Y_train, x_incl_cons)  
results = model.fit()  


pd.DataFrame({'coef': results.params , 'pvalue': round(results.pvalues,3)})

In [None]:
X_train,X_test, Y_train, Y_test = train_test_split(independent_charge_closest, dependent_charge_closest, test_size = .2, random_state = 10)

x_incl_cons = sm.add_constant(X_train)
model = sm.OLS(Y_train, x_incl_cons)  
results = model.fit()  


pd.DataFrame({'coef': results.params , 'pvalue': round(results.pvalues,3)})

## Glasgow Data: Conclusions

Spatial analysis can be applied to Glasgow at a level of detail that is not practical for the rest of Scotland. This is not without challenges even at the Glasgow level, but it is possible to identify relationships between deprivation and ATM accessibility. These include the clustering of Free ATMs in discrete locations, the generally greater accessibility of Surcharging ATMs in more deprived areas, and the relationship between income ranking and distance to the nearest ATM (of either type). The last of these is possibly a consequence of suburban affluence, car use and the comparative lack of a need to access cash on a regular basis.

## Scotland Wide Analysis

The above analysis can be applied on a Scotland wide basis. The steps are generally similar but with less scope for spatial visualisation due to the scale of the analysis, other than **Figure 21** below, which shows the walkable catchments selected and the extent of the removal of rural datazones, reducing the total datazones from 6976 to 5930 datazones, a reduction of only 15% but comprising a substantial area of land. 
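
The arithmetic behind the reduction in datazones can be checked directly. The same sketch also shows one way a radius of about 566 metres could arise: the 800 metre walking budget is an assumption introduced here for illustration and is not stated in this section.

```python
import math

# Figures quoted in the text above
total_datazones = 6976
walkable_datazones = 5930
removed_share = (total_datazones - walkable_datazones) / total_datazones
print(f"share of datazones removed: {removed_share:.1%}")

# Assumption: a walker with an 800 m budget moving along a street grid.
# In the worst case (travelling diagonally across the grid) the straight-line
# distance reached is the budget divided by the square root of two.
walk_budget_m = 800  # hypothetical value, not taken from the notebook
manhattan_equiv_radius = walk_budget_m / math.sqrt(2)
print(f"straight-line radius: {manhattan_equiv_radius:.1f} m")
```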

In [None]:
#copy() so that later column assignments do not act on a view of the original table
demand_catchment_scotland_manhattan = demand_catchment_original[demand_catchment_original["equiv_manhattan_radius"]<=567 ].copy()

In [None]:
#this is basically walkable demand centres for scotland and eliminates many larger sparse datazones
fig, ax = plt.subplots(figsize=(32,16))
plt.suptitle('Figure 21. Scotland SIMD Manhattan Walkable Datazones', fontsize=18)
plt.ylabel('northing',fontsize=14)
plt.xlabel('easting',fontsize=14)
demand_catchment_scotland_manhattan.plot(ax=ax, column='Decilev2', linewidth = 0.1, legend=True)
ctx.add_basemap(ax, source=ctx.providers.CartoDB.Voyager,crs='EPSG:27700')


In [None]:
#rerunning demand catchment buffering for Scotland
demand_catchment_scotland_manhattan["centroid"] = demand_catchment_scotland_manhattan["geometry"].centroid
demand_catchment_scotland_manhattan_centroid = demand_catchment_scotland_manhattan
demand_catchment_scotland_manhattan_centroid["geometry"] = demand_catchment_scotland_manhattan["centroid"]
demand_catchment_scotland_manhattan = demand_catchment_scotland_manhattan_centroid
demand_catchment_scotland_manhattan['geometry'] = demand_catchment_scotland_manhattan.geometry.buffer(566)
step1_demand_catchment_scotland = demand_catchment_scotland_manhattan
step1_demand_catchment_scotland['polygon'] = step1_demand_catchment_scotland['geometry']
step1_demand_catchment_scotland['geometry'] = step1_demand_catchment_scotland['centroid']

In [None]:
#rerunning supply catchment selection and buffering for Scotland
supply_catchment_scotland_free = supply_catchment_free
supply_catchment_scotland_charge = supply_catchment_charge
supply_catchment_scotland_free['centroid'] = supply_catchment_scotland_free['geometry']
supply_catchment_scotland_charge['centroid'] = supply_catchment_scotland_charge['geometry']
supply_catchment_scotland_free['geometry'] = supply_catchment_scotland_free.geometry.buffer(566)
supply_catchment_scotland_charge['geometry'] = supply_catchment_scotland_charge.geometry.buffer(566)

In [None]:
#indexing supply catchment for Scotland
supply_catchment_scotland_charge['index_column'] = supply_catchment_scotland_charge.index
supply_catchment_scotland_free['index_column'] = supply_catchment_scotland_free.index



In [None]:
#first supply and demand join scotland
step1_scotland_charge_supply = gpd.sjoin(supply_catchment_scotland_charge, step1_demand_catchment_scotland, how='left')
step1_scotland_free_supply = gpd.sjoin(supply_catchment_scotland_free, step1_demand_catchment_scotland, how='left')

In [None]:
step1_scotland_charge_supply = step1_scotland_charge_supply.dissolve(by = 'index_column', aggfunc='sum')
step1_scotland_free_supply = step1_scotland_free_supply.dissolve(by = 'index_column', aggfunc='sum')
step1_scotland_charge_supply['total_pop_s1'] = step1_scotland_charge_supply['SAPE2017']
step1_scotland_free_supply['total_pop_s1'] = step1_scotland_free_supply['SAPE2017']
#3 April 2023 change applied here to use number instead of arbitrary 1 as per previous analysis for Glasgow
#the column arithmetic is vectorised, so no loop is needed
step1_scotland_charge_supply['s_d_ratio_charge'] = step1_scotland_charge_supply['number']/step1_scotland_charge_supply['total_pop_s1']
step1_scotland_free_supply['s_d_ratio_free'] = step1_scotland_free_supply['number']/step1_scotland_free_supply['total_pop_s1']

In [None]:
step1_scotland_charge_supply = step1_scotland_charge_supply[step1_scotland_charge_supply["total_pop_s1"]>0]
step1_scotland_free_supply = step1_scotland_free_supply[step1_scotland_free_supply["total_pop_s1"]>0]

In [None]:
del step1_scotland_charge_supply['index_right']
del step1_scotland_free_supply['index_right'] 

In [None]:
step2_scotland_charge_supply = step1_scotland_charge_supply
step2_scotland_free_supply = step1_scotland_free_supply

#convert the dissolved supply buffers back to points (centroids) for the second join
step2_scotland_charge_supply['centroid'] = step2_scotland_charge_supply.centroid
step2_scotland_charge_supply['geometry'] = step2_scotland_charge_supply['centroid']

#the same conversion for free ATMs (the original repeated the surcharging table here)
step2_scotland_free_supply['centroid'] = step2_scotland_free_supply.centroid
step2_scotland_free_supply['geometry'] = step2_scotland_free_supply['centroid']
    

In [None]:
step1_demand_catchment_scotland['geometry'] = step1_demand_catchment_scotland['polygon']

In [None]:
step2_demand_catchment_scotland = step1_demand_catchment_scotland

In [None]:
step2_scotland_charge_demand = gpd.sjoin(step2_demand_catchment_scotland,step2_scotland_charge_supply,  how='left')
step2_scotland_free_demand = gpd.sjoin(step2_demand_catchment_scotland, step2_scotland_free_supply, how='left')

In [None]:
#dissolving and aggregating supply demand ratios
step2_scotland_charge_demand_dissolve = step2_scotland_charge_demand.dissolve(by = 'DataZone', aggfunc='sum')
step2_scotland_free_demand_dissolve = step2_scotland_free_demand.dissolve(by = 'DataZone', aggfunc='sum')
step2_scotland_charge_demand_sd_ratio = step2_scotland_charge_demand_dissolve[['s_d_ratio_charge']]
step2_scotland_free_demand_sd_ratio = step2_scotland_free_demand_dissolve[['s_d_ratio_free']]
step2_scotland_charge_demand_sd_ratio['DataZone'] = step2_scotland_charge_demand_sd_ratio.index
step2_scotland_free_demand_sd_ratio ['DataZone'] = step2_scotland_free_demand_sd_ratio.index
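
The two joins and dissolves above follow the pattern of a two-step floating catchment calculation. A toy version in plain Python, using distance checks in place of buffers and spatial joins, may make the two steps easier to follow. The function name `two_step_fca`, the zone names and all values are hypothetical.

```python
import math

def two_step_fca(demand, supply, radius):
    """demand: {zone: (x, y, population)}; supply: {site: (x, y, n_atms)}.
    Returns {zone: summed supply-to-demand ratio of the sites within radius}."""
    def within(ax, ay, bx, by):
        return math.hypot(ax - bx, ay - by) <= radius

    # Step 1: for each supply site, ATMs divided by the population it can serve
    ratios = {}
    for site, (sx, sy, n_atms) in supply.items():
        pop = sum(p for (zx, zy, p) in demand.values() if within(sx, sy, zx, zy))
        if pop > 0:  # sites with no reachable population are dropped
            ratios[site] = n_atms / pop

    # Step 2: for each demand zone, sum the ratios of the sites it can reach
    access = {}
    for zone, (zx, zy, _pop) in demand.items():
        access[zone] = sum(
            ratio for site, ratio in ratios.items()
            if within(zx, zy, supply[site][0], supply[site][1])
        )
    return access

# Hypothetical zones and ATM sites on a line (metres)
toy_demand = {"A": (0, 0, 1000), "B": (500, 0, 500), "C": (2000, 0, 800)}
toy_supply = {"s1": (100, 0, 2), "s2": (2100, 0, 1), "s3": (5000, 0, 1)}
toy_access = two_step_fca(toy_demand, toy_supply, radius=566)
print(toy_access)
```

In this toy case zones A and B share one site with two machines, so both receive the same ratio, while the isolated site `s3` serves nobody and contributes nothing.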

In [None]:
#Rectifying earlier error from presentation session on 20 March 2023 - create common index column and atm type column - for Scotland
step2_scotland_free_demand_sd_ratio['s_d_ratio_for indexing'] = step2_scotland_free_demand_sd_ratio['s_d_ratio_free']
step2_scotland_free_demand_sd_ratio['atm_type'] = 'free'
step2_scotland_charge_demand_sd_ratio['s_d_ratio_for indexing'] = step2_scotland_charge_demand_sd_ratio['s_d_ratio_charge']
step2_scotland_charge_demand_sd_ratio['atm_type'] = 'surcharging'

In [None]:
#Rectifying earlier error from presentation session on 20 March 2023 - concatenating 2 tables into 1
step2_scotland_all_demand_sd_index = pd.concat([step2_scotland_free_demand_sd_ratio,step2_scotland_charge_demand_sd_ratio])

In [None]:
#Rectifying earlier error from presentation session on 20 March 2023  indexing stage - arithmetic
step2_scotland_all_demand_sd_index['access_index_all'] = (step2_scotland_all_demand_sd_index['s_d_ratio_for indexing'] - step2_scotland_all_demand_sd_index['s_d_ratio_for indexing'].min()) / (step2_scotland_all_demand_sd_index['s_d_ratio_for indexing'].max() - step2_scotland_all_demand_sd_index['s_d_ratio_for indexing'].min()) 
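
The indexing step above is a min-max rescaling applied to the pooled ratios of both ATM types. A small sketch with made-up ratios shows why pooling matters: both types end up on one common 0 to 1 scale, so their index values can be compared directly. The function name `min_max_index` and the ratios are hypothetical.

```python
def min_max_index(values):
    """Rescale values to the 0-1 range; assumes at least two distinct values."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical supply-to-demand ratios per datazone
toy_free_ratios = [0.0, 0.002, 0.010]
toy_charge_ratios = [0.0, 0.001, 0.0005]

# Pool first, rescale once, then split back by ATM type
pooled_index = min_max_index(toy_free_ratios + toy_charge_ratios)
free_index = pooled_index[:len(toy_free_ratios)]
charge_index = pooled_index[len(toy_free_ratios):]
print(free_index, charge_index)
```

Rescaling each type separately would instead give both a maximum of 1, which would hide the difference in overall provision between the two types.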

In [None]:
#Rectifying earlier error from presentation session on 20 March 2023  - split into tables again 
step2_scotland_free_demand_sd_ratio_indexed = step2_scotland_all_demand_sd_index[step2_scotland_all_demand_sd_index['atm_type'] == 'free']
step2_scotland_charge_demand_sd_ratio_indexed = step2_scotland_all_demand_sd_index[step2_scotland_all_demand_sd_index['atm_type'] == 'surcharging']


In [None]:
step2_scotland_free_demand_sd_ratio_indexed.reset_index(drop = True, inplace = True) 
step2_scotland_free_demand_sd_ratio_indexed_joined = step2_demand_catchment_scotland.merge(step2_scotland_free_demand_sd_ratio_indexed, on='DataZone')
step2_scotland_charge_demand_sd_ratio_indexed.reset_index(drop = True, inplace = True) 
step2_scotland_charge_demand_sd_ratio_indexed_joined= step2_demand_catchment_scotland.merge(step2_scotland_charge_demand_sd_ratio_indexed, on='DataZone')

In [None]:
#no need to plot scotland-wide as buffers too small to see
#step2_scotland_charge_demand_sd_ratio_indexed_joined.plot()

In [None]:
fig= plt.subplots(figsize=(12,12))
plt.suptitle('Figure 22. Distribution of Accessibility Index - Surcharging', fontsize=18)
plt.ylabel('y',fontsize=14)
plt.xlabel('Index',fontsize=14)
step2_scotland_charge_demand_sd_ratio_indexed_joined['access_index_all'].plot(kind = 'hist', bins=20)

In [None]:
fig= plt.subplots(figsize=(12,12))
plt.suptitle('Figure 23. Distribution of Accessibility Index - Free', fontsize=18)
plt.ylabel('y',fontsize=14)
plt.xlabel('Index',fontsize=14)
step2_scotland_free_demand_sd_ratio_indexed_joined['access_index_all'].plot(kind = 'hist', bins=20)

As for Glasgow (the earlier **Figures 14 and 15**), accessibility is generally low for both Surcharging and Free ATMs, and more so for Surcharging ATMs, as shown in **Figures 22 and 23** above.

**Figures 24 and 25** below show access index mapped against SIMD income ranking, both for the entire dataset and also for a filtered subset (<0.4 for free and <0.05 for surcharging) for each ATM type. As for Glasgow, there are more Free ATMs than Surcharging ATMs, although in this case there is no analysis of localised clusters due to scale.

Nevertheless, the clustering phenomenon is visible, with an average of 1.4 Free ATMs for each single location compared to 1.04 Surcharging ATMs.

- Total Free: 4049
- Total Charge: 1026
- Clusters Free: 2896
- Clusters Charge: 983
- Mean Free: 1.40
- Mean Charge: 1.04
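
The means in the list above follow from dividing the total number of machines by the number of distinct locations, as this short check shows. The dictionary names are introduced here for illustration only.

```python
# Totals and distinct locations as listed above
atm_totals = {"Free": 4049, "Surcharging": 1026}
atm_locations = {"Free": 2896, "Surcharging": 983}

mean_per_location = {
    atm_type: atm_totals[atm_type] / atm_locations[atm_type]
    for atm_type in atm_totals
}
for atm_type, mean_value in mean_per_location.items():
    print(atm_type, round(mean_value, 2))
```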

In [None]:
#count - free
step2_scotland_free_demand_sd_ratio_indexed_joined.count()

In [None]:
#checking zeroes - free
step2_scotland_free_demand_sd_ratio_indexed_joined_zero = step2_scotland_free_demand_sd_ratio_indexed_joined[(step2_scotland_free_demand_sd_ratio_indexed_joined.access_index_all == 0)]
step2_scotland_free_demand_sd_ratio_indexed_joined_zero.count()

In [None]:
#count - surcharging
step2_scotland_charge_demand_sd_ratio_indexed_joined.count()

In [None]:
#checking zeroes - surcharging
step2_scotland_charge_demand_sd_ratio_indexed_joined_zero = step2_scotland_charge_demand_sd_ratio_indexed_joined[(step2_scotland_charge_demand_sd_ratio_indexed_joined.access_index_all == 0)]
step2_scotland_charge_demand_sd_ratio_indexed_joined_zero.count()

In [None]:
sns.set_style('ticks')
plt.figure(figsize=(15,8))
ax = sns.regplot(x='IncRankv2', y='access_index_all',data=step2_scotland_free_demand_sd_ratio_indexed_joined, color="green")
ax = sns.regplot(x='IncRankv2', y='access_index_all',data=step2_scotland_charge_demand_sd_ratio_indexed_joined, color="red")
plt.tick_params(axis='both', which='major', labelsize=14)
ax.set_title('Figure 24. Accessibility Index by SIMD Rank - Surcharging and Free ATMs - Scotland', fontsize = 24)
ax.set_ylabel('access index',fontsize=14)
ax.set_xlabel('rank',fontsize=14)
plt.legend(labels=["access index free","model fit","95% confidence","access index charge","model fit","95% confidence"])


In [None]:
step2_scotland_charge_demand_sd_ratio_indexed_joined.describe()

In [None]:
#removal of outliers
step2_scotland_free_demand_sd_ratio_indexed_joined_filter = step2_scotland_free_demand_sd_ratio_indexed_joined[(step2_scotland_free_demand_sd_ratio_indexed_joined.access_index_all <0.4)]
#the mask must come from the surcharging table itself, not the free table
step2_scotland_charge_demand_sd_ratio_indexed_joined_filter = step2_scotland_charge_demand_sd_ratio_indexed_joined[(step2_scotland_charge_demand_sd_ratio_indexed_joined.access_index_all <0.05)]

In [None]:
step2_scotland_charge_demand_sd_ratio_indexed_joined_filter.describe()

In [None]:
step2_scotland_free_demand_sd_ratio_indexed_joined_filter.describe()

In [None]:
#assessing extent of zeroes
step2_scotland_free_demand_sd_ratio_indexed_joined_zero = step2_scotland_free_demand_sd_ratio_indexed_joined[(step2_scotland_free_demand_sd_ratio_indexed_joined.access_index_all == 0)]
step2_scotland_free_demand_sd_ratio_indexed_joined_zero.count()

In [None]:
#assessing extent of zeroes - surcharging
step2_scotland_charge_demand_sd_ratio_indexed_joined_zero = step2_scotland_charge_demand_sd_ratio_indexed_joined[(step2_scotland_charge_demand_sd_ratio_indexed_joined.access_index_all == 0)]
step2_scotland_charge_demand_sd_ratio_indexed_joined_zero.count()

In [None]:
#probably same plot as earlier if outliers not removed
sns.set_style('ticks')
plt.figure(figsize=(15,8))
ax = sns.regplot(x='IncRankv2', y='access_index_all',data=step2_scotland_free_demand_sd_ratio_indexed_joined_filter, color="green")
ax = sns.regplot(x='IncRankv2', y='access_index_all',data=step2_scotland_charge_demand_sd_ratio_indexed_joined_filter, color="red")
plt.tick_params(axis='both', which='major', labelsize=14)
ax.set_title('Figure 25. Accessibility Index by SIMD Rank - Surcharging and Free ATMs - Filtered - Scotland', fontsize = 24)
ax.set_ylabel('access index',fontsize=14)
ax.set_xlabel('rank',fontsize=14)
plt.legend(labels=["access index free","model fit","95% confidence","access index charge","model fit","95% confidence"])

An OLS regression model can be run once more. It returns p values close to zero for the entire dataset for both Surcharging and Free ATMs, and also for the filtered subsets (<0.4 for free and <0.05 for surcharging). It is difficult to read more into this than a general decline in accessibility as income rank increases, except in the filtered subset for Free ATMs, where the higher accessibility values have been removed.

In [None]:
#Scotland model variables (the _scot suffix leaves the Glasgow variables intact)
independent_free_notfiltered_scot = step2_scotland_free_demand_sd_ratio_indexed_joined['IncRankv2']
dependent_free_notfiltered_scot = step2_scotland_free_demand_sd_ratio_indexed_joined['access_index_all']

independent_charge_notfiltered_scot = step2_scotland_charge_demand_sd_ratio_indexed_joined['IncRankv2']
dependent_charge_notfiltered_scot = step2_scotland_charge_demand_sd_ratio_indexed_joined['access_index_all']

independent_free_filtered_scot = step2_scotland_free_demand_sd_ratio_indexed_joined_filter['IncRankv2']
dependent_free_filtered_scot = step2_scotland_free_demand_sd_ratio_indexed_joined_filter['access_index_all']

independent_charge_filtered_scot = step2_scotland_charge_demand_sd_ratio_indexed_joined_filter['IncRankv2']
dependent_charge_filtered_scot = step2_scotland_charge_demand_sd_ratio_indexed_joined_filter['access_index_all']

In [None]:
dependent_charge_notfiltered_scot.describe()

In [None]:
#free ATMs not filtered
X_train,X_test, Y_train, Y_test = train_test_split(independent_free_notfiltered_scot, dependent_free_notfiltered_scot, test_size = .2, random_state = 10)

x_incl_cons = sm.add_constant(X_train)
model = sm.OLS(Y_train, x_incl_cons)  
results = model.fit()  


pd.DataFrame({'coef': results.params , 'pvalue': round(results.pvalues,3)})

In [None]:
#charge ATMs not filtered
X_train,X_test, Y_train, Y_test = train_test_split(independent_charge_notfiltered_scot, dependent_charge_notfiltered_scot, test_size = .2, random_state = 10)

x_incl_cons = sm.add_constant(X_train)
model = sm.OLS(Y_train, x_incl_cons)  
results = model.fit()  

pd.DataFrame({'coef': results.params , 'pvalue': round(results.pvalues,3)})

In [None]:
#free ATMs filtered
X_train,X_test, Y_train, Y_test = train_test_split(independent_free_filtered_scot, dependent_free_filtered_scot, test_size = .2, random_state = 10)

x_incl_cons = sm.add_constant(X_train)
model = sm.OLS(Y_train, x_incl_cons)  
results = model.fit()  


pd.DataFrame({'coef': results.params , 'pvalue': round(results.pvalues,3)})

In [None]:
X_test

In [None]:
#charge ATMs filtered
X_train,X_test, Y_train, Y_test = train_test_split(independent_charge_filtered_scot, dependent_charge_filtered_scot, test_size = .2, random_state = 10)

x_incl_cons = sm.add_constant(X_train)
model = sm.OLS(Y_train, x_incl_cons)  
results = model.fit()  

pd.DataFrame({'coef': results.params , 'pvalue': round(results.pvalues,3)})

## Scotland - Nearest Points

As for Glasgow, an analysis based upon nearest points can be conducted. In this case, the mean distance is lower where the nearest ATM is Surcharging (352 metres) than where it is Free (420 metres), although far more datazones have a Free ATM as their nearest (4301) than a Surcharging ATM (1629).

In [None]:
#nearest table next
step1_demand_catchment_scotland['geometry'] = step1_demand_catchment_scotland['centroid']

In [None]:
demand_dist = step1_demand_catchment_scotland
supply_dist = supply_catchment_original

def ckdnearest(gdA, gdB):

    nA = np.array(list(gdA.geometry.apply(lambda x: (x.x, x.y))))
    nB = np.array(list(gdB.geometry.apply(lambda x: (x.x, x.y))))
    btree = cKDTree(nB)
    dist, idx = btree.query(nA, k=1)
    gdB_nearest = gdB.iloc[idx].drop(columns="geometry").reset_index(drop=True)
    gdf = pd.concat(
        [
            gdA.reset_index(drop=True),
            gdB_nearest,
            pd.Series(dist, name='dist')
        ], 
        axis=1)

    return gdf

nearest_dist_scot = ckdnearest(demand_dist, supply_dist)

In [None]:
nearest_dist_scot_charge = nearest_dist_scot[nearest_dist_scot['Charge Type'] == 'Surcharging']
nearest_dist_scot_free = nearest_dist_scot[nearest_dist_scot['Charge Type'] == 'Free']
nearest_dist_scot_charge = nearest_dist_scot_charge[['DataZone','Charge Type','dist','IncRankv2']]
nearest_dist_scot_free = nearest_dist_scot_free[['DataZone','Charge Type','dist','IncRankv2']]

In [None]:
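#numeric_only avoids an error on the text columns in recent pandas versions
nearest_dist_scot_charge.mean(numeric_only=True)

In [None]:
nearest_dist_scot_free.mean(numeric_only=True)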

**Figures 26 to 28** show the relationship between ATM distance and income deprivation, by ATM type. It is difficult to discern much at the national level, where some distances are likely to be considerable driving distances and act as influential outliers. Selecting an arbitrary threshold of 1000 metres provides a more interpretable visualisation and dataset (**Figure 27**), with a Manhattan equivalent of 576 metres for **Figure 28**. There is little difference between the mean distances to the nearest machine by type: 341 metres for Free and 315 metres for Surcharging within a kilometre, and 281 metres versus 266 metres for Free and Surcharging respectively within 577 metres.

In [None]:
sns.set_style('ticks')
plt.figure(figsize=(15,8))
ax = sns.regplot(x='IncRankv2', y='dist',data=nearest_dist_scot_free, color="green")
ax = sns.regplot(x='IncRankv2', y='dist',data=nearest_dist_scot_charge, color="red")
plt.tick_params(axis='both', which='major', labelsize=14)
ax.set_title('Figure 26. Nearest ATM distance by type vs income deprivation- Scotland', fontsize = 24)
ax.set_ylabel('distance',fontsize=14)
ax.set_xlabel('rank',fontsize=14)
plt.legend(labels=["distance metres free","model fit","95% confidence","distance metres charge","model fit","95% confidence"])

In [None]:
#filter for 1000m and 577m
nearest_dist_scot_charge_576 = nearest_dist_scot_charge[nearest_dist_scot_charge['dist'] <= 577]
nearest_dist_scot_free_576 = nearest_dist_scot_free[nearest_dist_scot_free['dist'] <= 577]
nearest_dist_scot_charge_1000 = nearest_dist_scot_charge[nearest_dist_scot_charge['dist'] <= 1000]
nearest_dist_scot_free_1000 = nearest_dist_scot_free[nearest_dist_scot_free['dist'] <= 1000]

In [None]:
nearest_dist_scot_charge_1000.mean(numeric_only=True)

In [None]:
nearest_dist_scot_free_1000.mean(numeric_only=True)

In [None]:
nearest_dist_scot_charge_576.mean(numeric_only=True)

In [None]:
nearest_dist_scot_free_576.mean(numeric_only=True)

In [None]:
sns.set_style('ticks')
plt.figure(figsize=(15,8))
ax = sns.regplot(x='IncRankv2', y='dist',data=nearest_dist_scot_free_1000, color="green")
ax = sns.regplot(x='IncRankv2', y='dist',data=nearest_dist_scot_charge_1000, color="red")
plt.tick_params(axis='both', which='major', labelsize=14)
ax.set_title('Figure 27. Nearest ATM distance by type vs income deprivation- Scotland (within 1000m)', fontsize = 24)
ax.set_ylabel('distance',fontsize=14)
ax.set_xlabel('rank',fontsize=14)
plt.legend(labels=["distance metres free","model fit","95% confidence","distance metres charge","model fit","95% confidence"])

In [None]:
sns.set_style('ticks')
plt.figure(figsize=(15,8))
ax = sns.regplot(x='IncRankv2', y='dist',data=nearest_dist_scot_free_576, color="green")
ax = sns.regplot(x='IncRankv2', y='dist',data=nearest_dist_scot_charge_576, color="red")
plt.tick_params(axis='both', which='major', labelsize=14)
ax.set_title('Figure 28. Nearest ATM distance by type vs income deprivation- Scotland (576m)', fontsize = 24)
ax.set_ylabel('distance',fontsize=14)
ax.set_xlabel('rank',fontsize=14)
plt.legend(labels=["distance metres free","model fit","95% confidence","distance metres charge","model fit","95% confidence"])

As for the Glasgow nearest analysis, the relationships can be investigated via OLS regression modelling. However, although there are statistically significant relationships between income ranking and distance to an ATM (a slight increase in the latter as the former increases), it is difficult to interpret this at a national scale.

In [None]:
independent_free_dist_scot_notfiltered = nearest_dist_scot_free['IncRankv2']
dependent_free_dist_scot_notfiltered = nearest_dist_scot_free['dist']

independent_charge_dist_scot_notfiltered = nearest_dist_scot_charge['IncRankv2']
dependent_charge_dist_scot_notfiltered = nearest_dist_scot_charge['dist']

independent_free_dist_scot_filtered = nearest_dist_scot_free_576['IncRankv2']
dependent_free_dist_scot_filtered = nearest_dist_scot_free_576['dist']

independent_charge_dist_scot_filtered = nearest_dist_scot_charge_576['IncRankv2']
dependent_charge_dist_scot_filtered = nearest_dist_scot_charge_576['dist']

In [None]:
#free ATMs not filtered
X_train,X_test, Y_train, Y_test = train_test_split(independent_free_dist_scot_notfiltered, dependent_free_dist_scot_notfiltered, test_size = .2, random_state = 10)

x_incl_cons = sm.add_constant(X_train)
model = sm.OLS(Y_train, x_incl_cons)  
results = model.fit()  


pd.DataFrame({'coef': results.params , 'pvalue': round(results.pvalues,3)})


In [None]:
#surcharging ATMs not filtered
X_train,X_test, Y_train, Y_test = train_test_split(independent_charge_dist_scot_notfiltered, dependent_charge_dist_scot_notfiltered, test_size = .2, random_state = 10)

x_incl_cons = sm.add_constant(X_train)
model = sm.OLS(Y_train, x_incl_cons)  
results = model.fit()  


pd.DataFrame({'coef': results.params , 'pvalue': round(results.pvalues,3)})

In [None]:
#free ATMs filtered
X_train,X_test, Y_train, Y_test = train_test_split(independent_free_dist_scot_filtered, dependent_free_dist_scot_filtered, test_size = .2, random_state = 10)

x_incl_cons = sm.add_constant(X_train)
model = sm.OLS(Y_train, x_incl_cons)  
results = model.fit()  


pd.DataFrame({'coef': results.params , 'pvalue': round(results.pvalues,3)})

In [None]:
#surcharging ATMs filtered
X_train,X_test, Y_train, Y_test = train_test_split(independent_charge_dist_scot_filtered, dependent_charge_dist_scot_filtered, test_size = .2, random_state = 10)

x_incl_cons = sm.add_constant(X_train)
model = sm.OLS(Y_train, x_incl_cons)  
results = model.fit()  


pd.DataFrame({'coef': results.params , 'pvalue': round(results.pvalues,3)})

## Conclusions: Scotland

A Scotland-wide analysis does not appear to be particularly useful, even for a subset of datazones assumed to be accessible. It does identify a generally weak relationship between income ranking and both accessibility and nearest distance to ATMs, for Surcharging and Free types alike: accessibility decreases as income rank increases, and the distance to the nearest ATM increases as income rank increases. A focus on settlements or clusters on a comparative basis might be more useful, following the earlier Glasgow model. The extent of catchments for Surcharging ATMs with zero accessibility was a notable finding and indicates a failure in the approach to defining catchments.

## Conclusion

The Glasgow model indicated relationships between deprivation and ATM accessibility, including the clustering of Free ATMs in discrete locations, the generally greater accessibility of Free ATMs, the greater accessibility of Surcharging ATMs in more deprived areas, and the relationship between income ranking and distance to the nearest ATM. It was possible to focus on a number of areas where high accessibility of Free ATMs coincided with clusters of these machines, in an interesting inversion of the original research question. This raises the question of how far these Free ATM clusters represent localised rather than general provision, and it undermines a possible assumption that the greater number of Free ATMs means they are more accessible in terms of social and spatial equality.

A Scotland-wide analysis did not yield results that were as useful. Nevertheless, applying a model similar to the one used for Glasgow to discrete urban areas of Scotland might prove more productive, possibly within the context of 20 minute neighbourhoods. This may also prove useful in directing the subsidy of free-to-use ATMs towards more deprived areas.


## End of Notebook, Run All Above
