# UK Cost of Living Analysis – Research Analyst Portfolio

**Dataset:** Kaggle “Global Cost of Living”  
**Focus:** Customer vulnerability in the UK housing market  
**Tools:** Pandas (for data loading), SQLite (for SQL queries)  

## Project Overview
This project analyses cost of living and local purchasing power across UK cities to highlight the areas where customers are most likely to struggle with housing and daily expenses.

## Methodology
1. Filtered dataset for UK cities only.  
2. Renamed columns and replaced missing Local Purchasing Power values with 0 (NaN → 0).  
3. Loaded UK data into an in-memory SQLite database.  
4. Ran SQL queries to identify:
   - Top cost of living cities  
   - Top rent cities  
   - Top grocery cost cities  
   - Most vulnerable cities (Cost of Living Index divided by Local Purchasing Power Index)  
   - Combined cost pressure score (weighted 50% cost of living, 30% rent, 20% groceries)
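
The two derived scores in step 4 can be sketched in plain Python. This is a minimal illustration that mirrors the SQL expressions used in the notebook; the function names and the example values are hypothetical and are not taken from the dataset.

```python
def vulnerability_score(cost_index, purchasing_power_index, epsilon=0.01):
    """Cost of living relative to purchasing power.

    epsilon is added to the denominator to avoid division by zero,
    as in the later SQL queries.
    """
    return cost_index / (purchasing_power_index + epsilon)


def combined_score(cost_index, rent_index, groceries_index):
    """Weighted cost pressure: 50% cost of living, 30% rent, 20% groceries."""
    return cost_index * 0.5 + rent_index * 0.3 + groceries_index * 0.2


# Hypothetical index values for illustration only
example = {"cost": 20.0, "rent": 10.0, "groceries": 5.0, "power": 3.99}

print(vulnerability_score(example["cost"], example["power"]))
print(combined_score(example["cost"], example["rent"], example["groceries"]))
```

Note that a purchasing power of 0 makes the denominator equal to `epsilon`, which multiplies the cost index by 100 and pushes that city to the top of the ranking.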

## Key Insights
- Most UK cities have a low Local Purchasing Power Index (0–7.21 in this data, where 0 includes missing values replaced during cleaning), so even moderate costs weigh heavily.  
- Vulnerability score highlights cities where cost of living is disproportionately high relative to income potential.  
- This analysis could help SNG target support to areas where customers may be at risk.

## Tools & Skills Demonstrated
- SQL queries on real data  
- Pandas for data cleaning and NaN handling  
- SQLite in-memory database  
- Quantitative research & business insight interpretation


import pandas as pd
import sqlite3
# Adjust this path to wherever the dataset is mounted (e.g. as found with os.walk)
df = pd.read_csv("/kaggle/input/global-cost-of-living/cost-of-living.csv")
df.head()
# Filter UK rows
df_uk = df[df['country'] == 'United Kingdom']

# Keep relevant columns only
df_uk = df_uk[['city', 'country', 'x1', 'x2', 'x3', 'x5']]

# Rename columns for clarity
df_uk.columns = ['City', 'Country', 'Cost_of_Living_Index', 'Rent_Index', 'Groceries_Index', 'Local_Purchasing_Power_Index']

df_uk.head()

# Create in-memory SQLite database
conn = sqlite3.connect(":memory:")

# Load UK data into SQL
df_uk.to_sql("cost_of_living_uk", conn, index=False, if_exists="replace")
pd.read_sql("""
SELECT *
FROM cost_of_living_uk
LIMIT 10;
""", conn)
pd.read_sql("""
SELECT City, Cost_of_Living_Index
FROM cost_of_living_uk
ORDER BY Cost_of_Living_Index DESC
LIMIT 10;
""", conn)
pd.read_sql("""
SELECT City, Rent_Index
FROM cost_of_living_uk
ORDER BY Rent_Index DESC
LIMIT 10;
""", conn)

# Replace missing purchasing power values with 0 in the SQLite table
conn.execute("""
UPDATE cost_of_living_uk
SET Local_Purchasing_Power_Index = 0
WHERE Local_Purchasing_Power_Index IS NULL;
""")
conn.commit()
pd.read_sql("""
SELECT City, Local_Purchasing_Power_Index
FROM cost_of_living_uk
ORDER BY Local_Purchasing_Power_Index ASC
LIMIT 10;
""", conn)
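# Note: SQLite returns NULL for division by zero, so cities with a
# purchasing power of 0 get a NULL score here and sort last under DESC
pd.read_sql("""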
SELECT City, Cost_of_Living_Index, Local_Purchasing_Power_Index,
       Cost_of_Living_Index / Local_Purchasing_Power_Index AS vulnerability_score
FROM cost_of_living_uk
ORDER BY vulnerability_score DESC
LIMIT 10;
""", conn)
pd.read_sql("""
SELECT City, Cost_of_Living_Index, Rent_Index, Groceries_Index,
       (Cost_of_Living_Index*0.5 + Rent_Index*0.3 + Groceries_Index*0.2) AS combined_score
FROM cost_of_living_uk
ORDER BY combined_score DESC
LIMIT 10;
""", conn)
pd.read_sql("""
SELECT City, 
       Cost_of_Living_Index, 
       Local_Purchasing_Power_Index,
       (Cost_of_Living_Index / (Local_Purchasing_Power_Index + 0.01)) AS Vulnerability_Score
FROM cost_of_living_uk
ORDER BY Vulnerability_Score DESC
LIMIT 10;
""", conn)
pd.read_sql("""
SELECT City, Cost_of_Living_Index / (Local_Purchasing_Power_Index + 0.01) AS vulnerability_score
FROM cost_of_living_uk
ORDER BY vulnerability_score DESC
LIMIT 10;
""", conn)

| City | vulnerability_score |
|---|---|
| Seasalter | 1802.0 |
| Rutherglen | 9.648199 |
| Eastbourne | 9.217391 |
| Hartlepool | 7.819945 |
| Burton upon Trent | 7.002915 |
| Godalming | 6.243243 |
| Stevenage | 5.822715 |
| Nuneaton | 5.619543 |
| Hereford | 5.547344 |
| East Barnet | 5.420428 |
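
Seasalter's score of 1802.0 is far above every other city, which is what the formula produces when the purchasing power is 0 and the denominator collapses to 0.01. Assuming that value is a zero-filled gap rather than a real measurement, one alternative is to rank only cities with a usable value and list the rest separately. The sketch below uses the same table and column names as the notebook, but the rows are hypothetical and the filtering approach is a suggestion, not part of the original analysis.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cost_of_living_uk ("
    "City TEXT, Cost_of_Living_Index REAL, Local_Purchasing_Power_Index REAL)"
)

# Hypothetical rows for illustration; City_B has no purchasing power value
rows = [("City_A", 20.0, 4.0), ("City_B", 18.0, None), ("City_C", 15.0, 5.0)]
conn.executemany("INSERT INTO cost_of_living_uk VALUES (?, ?, ?)", rows)

# Rank only cities with a usable purchasing power value;
# missing or zero values are excluded rather than scored.
ranked = conn.execute(
    """
    SELECT City,
           Cost_of_Living_Index / Local_Purchasing_Power_Index AS vulnerability_score
    FROM cost_of_living_uk
    WHERE Local_Purchasing_Power_Index IS NOT NULL
      AND Local_Purchasing_Power_Index > 0
    ORDER BY vulnerability_score DESC;
    """
).fetchall()

# Cities left out of the ranking, listed separately for follow-up
unscored = conn.execute(
    """
    SELECT City
    FROM cost_of_living_uk
    WHERE Local_Purchasing_Power_Index IS NULL
       OR Local_Purchasing_Power_Index = 0;
    """
).fetchall()

print(ranked)
print(unscored)
```

Keeping the unscored cities in their own list makes the data gap visible without letting it distort the top of the ranking.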
