# LQ analysis

Since absolute crime counts did not prove very insightful for analysing correlations between crime types, I looked at the counts through Location Quotients (LQs) instead:

A measure of local concentration relative to a region, LQs are calculated for a certain crime category in an area. This measure is the ratio between the local share of crime in this category and the citywide share of crime in this category.

An LQ equal to 1 indicates that the local proportion of crime in a category is equal to the citywide share of crime in that category. An LQ more than 1 indicates that crime is locally overrepresented relative to its share of citywide crime, while an LQ less than 1 shows that crime is locally underrepresented.


Source: caranvr.github.io/gentrification-crime-ldn/
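
To make the definition concrete, here is a minimal sketch with made-up counts for two hypothetical areas and two crime categories. The area names and numbers are illustrative assumptions, not values from the data set. Each LQ is computed as the local share of a category divided by its citywide share.

```python
import pandas as pd

# Hypothetical counts: rows are areas, columns are crime categories.
counts = pd.DataFrame(
    {"Burglary": [10, 30], "Shoplifting": [40, 20]},
    index=["Area A", "Area B"],
)

# Local share: each category's fraction of all crime within an area.
local_share = counts.div(counts.sum(axis=1), axis=0)

# Citywide share: each category's fraction of all crime across all areas.
city_share = counts.sum(axis=0) / counts.to_numpy().sum()

# LQ = local share / citywide share (the Series aligns on the columns).
lq = local_share / city_share
print(lq.round(2))
```

In this toy example, Area B has a Burglary LQ of 1.5, so burglary makes up a larger share of its crime than it does citywide, while Area A's Burglary LQ of 0.5 means burglary is locally underrepresented.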

### Preparation

In [None]:
import pandas as pd
import geopandas as gpd
import seaborn as sns
import matplotlib.pyplot as plt
import plotly.express as px
import json

In [None]:
crimes = pd.read_csv('data/all_crimes_2022-2025.csv')

In [None]:
crimes['Month'] = pd.to_datetime(crimes['Month'])

gdf_crimes = gpd.GeoDataFrame(
    crimes,
    geometry=gpd.points_from_xy(crimes['Longitude'], crimes['Latitude']),
    crs="EPSG:4326"
)

LSOAs = gpd.read_file('data/LSOAs.geojson').to_crs(epsg=4326)

gdf_joined_lsoa = gpd.sjoin(
    gdf_crimes,
    LSOAs[['geometry', 'LSOA11NM']],
    how='left',
    predicate='within'
)

# Count crimes per LSOA and crime type
lsoa_counts = (
    gdf_joined_lsoa
    .dropna(subset=['LSOA11NM'])
    .groupby(['LSOA11NM', 'Crime type'])
    .size()
    .reset_index(name='count')
)

# Create grid
lsoa_crime_grid = lsoa_counts.pivot(index='LSOA11NM', columns='Crime type', values='count').fillna(0).astype(int)

lsoa_crime_grid.head()

### Calculate LQs

In [None]:
# Calculate the total crime counts per LSOA
total_crimes_per_lsoa = lsoa_crime_grid.sum(axis=1)

# Calculate the total crimes per crime type (all LSOAs)
total_crimes_per_type = lsoa_crime_grid.sum(axis=0)

# Calculate the total crimes across all LSOAs and all crime types
total_crimes_overall = total_crimes_per_type.sum()

# Calculate LQs
lq_matrix = (lsoa_crime_grid.div(total_crimes_per_lsoa, axis=0)) / (total_crimes_per_type / total_crimes_overall)

# Fill any missing values (e.g. 0/0 for an LSOA with no recorded crimes) with 0
lq_matrix = lq_matrix.fillna(0)

lq_matrix.head()

### Calculate correlation & show heatmap

In [None]:
lq_corr = lq_matrix.corr()

lq_corr.head()


In [None]:
plt.figure(figsize=(12, 8))
sns.heatmap(lq_corr, cmap='RdBu_r', vmin=-1, vmax=1, annot=True)
plt.title('Correlation of Crime Types Based on LQ (LSOA)')
plt.tight_layout()
plt.show()

### Map Burglary LQ

In [None]:
# lq_burglary = lq_matrix['Burglary']

# lq_burglary_df = lq_burglary.reset_index(name='LQ_Burglary')

# lq_burglary_gdf = LSOAs.merge(lq_burglary_df, left_on='LSOA11NM', right_on='LSOA11NM', how='left')

# lq_burglary_gdf[['LSOA11NM', 'LQ_Burglary']].head()


In [None]:
# fig = px.choropleth_map(
#     lq_burglary_gdf,
#     geojson=json.loads(LSOAs.to_json()),
#     locations='LSOA11NM',
#     featureidkey="properties.LSOA11NM",
#     color='LQ_Burglary',
#     color_continuous_scale="RdYlGn_r",
#     range_color=(0, 2),
#     map_style="open-street-map",
#     zoom=9,
#     center={"lat": 51.5072, "lon": -0.1276},
#     opacity=0.6,
#     height=600
# )

# fig.update_layout(title='Burglary LQ Across London (LSOA)')
# fig.show()

In [None]:
# Build the GeoJSON once; it is identical for every crime type
lsoa_geojson = json.loads(LSOAs.to_json())

# Plot for each crime type
for crime_type in lq_matrix.columns:
    lq_column = lq_matrix[crime_type].rename(f'LQ_{crime_type}')
    lq_df = lq_column.reset_index()
    lq_gdf = LSOAs.merge(lq_df, on='LSOA11NM', how='left')

    fig = px.choropleth_map(
        lq_gdf,
        geojson=lsoa_geojson,
        locations='LSOA11NM',
        featureidkey="properties.LSOA11NM",
        color=f'LQ_{crime_type}',
        color_continuous_scale="RdYlGn_r",
        range_color=(0, 2),
        map_style="open-street-map",
        zoom=9,
        center={"lat": 51.5072, "lon": -0.1276},
        opacity=0.6,
        height=600
    )

    fig.update_layout(title=f'{crime_type} LQ Across London (LSOA)')
    fig.show()

In [None]:
# # Plot Burglary & Vehicle crime
# for crime_type in lq_matrix.columns:
#     if crime_type == "Burglary" or crime_type == "Vehicle crime":

#         lq_column = lq_matrix[crime_type].rename(f'LQ_{crime_type}')
#         lq_df = lq_column.reset_index()
#         lq_gdf = LSOAs.merge(lq_df, on='LSOA11NM', how='left')

#         fig = px.choropleth_map(
#             lq_gdf,
#             geojson=json.loads(LSOAs.to_json()),
#             locations='LSOA11NM',
#             featureidkey="properties.LSOA11NM",
#             color=f'LQ_{crime_type}',
#             color_continuous_scale="RdYlGn_r",
#             range_color=(0, 2),
#             map_style="open-street-map",
#             zoom=9,
#             center={"lat": 51.5072, "lon": -0.1276},
#             opacity=0.6,
#             height=600
#         )

#         fig.update_layout(title=f'{crime_type} LQ Across London (LSOA)')
#         fig.show()

In [None]:
lq_burglary = lq_matrix['Burglary']
lq_vehicle = lq_matrix['Vehicle crime']

# Calculate average LQ
lq_avg = ((lq_burglary + lq_vehicle) / 2).rename('LQ_Avg_Burg_Vehicle')
lq_avg_df = lq_avg.reset_index()

lq_avg_gdf = LSOAs.merge(lq_avg_df, on='LSOA11NM', how='left')

fig = px.choropleth_map(
    lq_avg_gdf,
    geojson=json.loads(LSOAs.to_json()),
    locations='LSOA11NM',
    featureidkey="properties.LSOA11NM",
    color='LQ_Avg_Burg_Vehicle',
    color_continuous_scale="RdYlGn_r",
    range_color=(0, 2),
    map_style="open-street-map",
    zoom=9,
    center={"lat": 51.5072, "lon": -0.1276},
    opacity=0.6,
    height=600
)

fig.update_layout(title='Average LQ of Burglary & Vehicle Crime Across London (LSOA)')
fig.show()


### Conclusions

Compared to the absolute crime counts, the LQ approach seems far more reliable, and I can quickly see obvious patterns for several crime types (for example Bicycle theft, Shoplifting and Theft from the person). It therefore provides considerably more insight. Interestingly, the correlation between Burglary LQ and Vehicle crime LQ is the third highest of all pairs. Comparing the two maps, both LQs appear to peak in similar broader regions, mostly in the suburbs. The map of the average of the Burglary and Vehicle crime LQs per LSOA still represents both relatively well. This correlation could have economic (or social) causes, which we will need to investigate further.
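
The ranking of correlation pairs can also be checked programmatically rather than read off the heatmap. The following is a minimal sketch: `rank_correlation_pairs` is a hypothetical helper that is not part of the notebook, and the toy matrix uses made-up values. In the notebook it would be applied to `lq_corr`.

```python
import numpy as np
import pandas as pd


def rank_correlation_pairs(corr: pd.DataFrame) -> pd.Series:
    """Return each unordered pair's correlation, sorted from highest to lowest."""
    # Keep only the strict upper triangle so that each pair appears once
    # and the diagonal (self-correlation of 1) is excluded.
    upper = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = corr.where(upper).stack().dropna()
    return pairs.sort_values(ascending=False)


# Toy correlation matrix with made-up values, for illustration only.
labels = ["Burglary", "Vehicle crime", "Shoplifting"]
toy_corr = pd.DataFrame(
    [[1.0, 0.4, -0.2], [0.4, 1.0, -0.1], [-0.2, -0.1, 1.0]],
    index=labels,
    columns=labels,
)

ranked = rank_correlation_pairs(toy_corr)
print(ranked)
```

Calling `rank_correlation_pairs(lq_corr).head(5)` would list the five most strongly correlated crime type pairs, which makes it easy to confirm where Burglary and Vehicle crime rank.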