# Geocoding Greggs Locations Using the National Statistics Postcode Lookup (NSPL) Dataset

In this project, we will load a dataset of Greggs locations and geocode them to coordinates using the National Statistics Postcode Lookup (NSPL) dataset. We will follow these steps:
1. Load and preprocess the Greggs dataset.
2. Standardise and format postcodes.
3. Load and preprocess the NSPL dataset.
4. Merge the Greggs dataset with the NSPL to obtain coordinates.
5. Visualise the result on an interactive map.

---

### Step 1: Import Libraries
We will begin by importing the libraries needed for data processing and visualisation.


In [2]:
# Import necessary libraries
import pandas as pd
import geopandas as gpd
import folium

---

### Step 2: Load the Greggs Dataset
Let's load the `greggs_uk.csv` file, which contains the list of Greggs locations with postcodes. We'll inspect the first few rows to understand its structure.


In [3]:
# Load the Greggs dataset
greggs_df = pd.read_csv('greggs_uk.csv')

# Display the first few rows of the dataset
greggs_df.head()

| | FHRSID | BusinessName | AddressLine1 | AddressLine2 | AddressLine3 | AddressLine4 | PostCode | BusinessType | RatingValue |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 597898 | Greggs | 6 Watford Road | Birmingham | | | B30 1JA | Retailers - other | 5 |
| 1 | 854057 | Greggs | 39 One Stop Shopping Centre | Walsall Road | Perry Barr | Birmingham | B42 1AA | Takeaway/sandwich shop | 5 |
| 2 | 373168 | Greggs | 16 Western Road | Romford | | | RM1 3LD | Retailers - other | 5 |
| 3 | 1057556 | Greggs | | | 52-54 Botanic Avenue | Belfast | BT7 1JR | Restaurant/Cafe/Canteen | 5 |
| 4 | 1291323 | Greggs | | 49 High Street | Stone | Staffordshire | ST15 8AD | Manufacturers/packers | 5 |


---

### Step 3: Standardise and Format the Postcode Column
The NSPL postcode column we will join on (`pcds`) stores each postcode in upper case with a single space before the final three characters. To make our postcodes match that format, we will:
- Make all postcodes uppercase.
- Remove any existing spaces.
- Insert a space before the last three characters.


In [5]:
# Standardize the postcode column
greggs_df['PostCode'] = greggs_df['PostCode'].str.upper().str.replace(" ", "")
greggs_df['PostCode'] = greggs_df['PostCode'].str[:-3] + \
    " " + greggs_df['PostCode'].str[-3:]

# Display the first few rows to check the formatting
greggs_df.head()

| | FHRSID | BusinessName | AddressLine1 | AddressLine2 | AddressLine3 | AddressLine4 | PostCode | BusinessType | RatingValue |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 597898 | Greggs | 6 Watford Road | Birmingham | | | B30 1JA | Retailers - other | 5 |
| 1 | 854057 | Greggs | 39 One Stop Shopping Centre | Walsall Road | Perry Barr | Birmingham | B42 1AA | Takeaway/sandwich shop | 5 |
| 2 | 373168 | Greggs | 16 Western Road | Romford | | | RM1 3LD | Retailers - other | 5 |
| 3 | 1057556 | Greggs | | | 52-54 Botanic Avenue | Belfast | BT7 1JR | Restaurant/Cafe/Canteen | 5 |
| 4 | 1291323 | Greggs | | 49 High Street | Stone | Staffordshire | ST15 8AD | Manufacturers/packers | 5 |
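
The slicing above assumes every value is already a well-formed postcode string. As a minimal sketch of a stricter variant, the hypothetical helper `normalise_postcode` below (not part of the original notebook) applies the same three steps and returns `None` for anything that does not look like a postcode. The regular expression is only a loose approximation of the UK postcode shape, not an official validation rule.

```python
import re
from typing import Any, Optional

# Loose approximation of the UK postcode shape (an assumption, not an official rule):
# one or two letters, a digit, an optional letter or digit, a space, a digit, two letters.
_POSTCODE_SHAPE = re.compile(r"^[A-Z]{1,2}[0-9][A-Z0-9]? [0-9][A-Z]{2}$")


def normalise_postcode(raw: Any) -> Optional[str]:
    """Uppercase, strip whitespace, and insert one space before the last three characters."""
    if not isinstance(raw, str):
        # Missing values (None or NaN) are not strings, so they are passed back as None.
        return None
    compact = "".join(raw.split()).upper()
    if len(compact) < 5:
        return None
    candidate = f"{compact[:-3]} {compact[-3:]}"
    return candidate if _POSTCODE_SHAPE.match(candidate) else None


print(normalise_postcode(" b30 1ja "))
print(normalise_postcode("N/A"))
```

It could be applied with `greggs_df['PostCode'].map(normalise_postcode)`, so that unparseable postcodes show up as missing values rather than as malformed strings.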


---

### Step 4: Load the NSPL Dataset
Now, we'll load the `NSPL21_AUG_2024_UK.csv` file, which contains the NSPL data. We will retain only the necessary columns: `pcds` (postcode), `lat` (latitude), and `long` (longitude).

You will first need to download this file from the [ONS Open Geography Portal](https://geoportal.statistics.gov.uk/search?q=PRD_NSPL%20AUG_2024&sort=Date%20Created%7Ccreated%7Cdesc).


In [6]:
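# Load only the columns we need from the NSPL dataset
# (usecols avoids parsing the many other columns in this large file)
nspl_df = pd.read_csv('NSPL21_AUG_2024_UK.csv',
                      usecols=['pcds', 'lat', 'long'])

# Keep the columns in a fixed order
nspl_df = nspl_df[['pcds', 'lat', 'long']]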

# Display the first few rows of the NSPL dataset
nspl_df.head()


| | pcds | lat | long |
| --- | --- | --- | --- |
| 0 | AB1 0AA | 57.101474 | -2.242851 |
| 1 | AB1 0AB | 57.102554 | -2.246308 |
| 2 | AB1 0AD | 57.100556 | -2.248342 |
| 3 | AB1 0AE | 57.084444 | -2.255708 |
| 4 | AB1 0AF | 57.096656 | -2.258102 |
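
One thing worth checking before the join: to my understanding, the NSPL gives postcodes that have no grid reference a placeholder latitude of 99.999999 and longitude of 0.000000 rather than leaving them empty. This is an assumption to verify against the user guide shipped with your download. The sketch below uses a hypothetical helper, `drop_placeholder_coords`, and a tiny made-up frame to show one way of filtering such rows out.

```python
import pandas as pd


def drop_placeholder_coords(nspl: pd.DataFrame) -> pd.DataFrame:
    """Keep only rows whose latitude is present and a plausible real value."""
    # Assumption: placeholder rows carry lat = 99.999999; no real latitude exceeds 90.
    has_coords = nspl["lat"].notna() & (nspl["lat"] < 90)
    return nspl.loc[has_coords].reset_index(drop=True)


# Tiny illustrative frame; the second row is made up to mimic a placeholder entry.
sample = pd.DataFrame({
    "pcds": ["B30 1JA", "ZZ99 9ZZ"],
    "lat": [52.416523, 99.999999],
    "long": [-1.93014, 0.0],
})
clean = drop_placeholder_coords(sample)
print(clean)
```

Without a filter like this, any shop matched to a placeholder row would pass the later `dropna` step and be plotted far from the UK.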


---

### Step 5: Merge the Greggs and NSPL Datasets
We will now join the `greggs_df` and `nspl_df` datasets on the postcode columns to obtain latitude and longitude for each Greggs location.


In [8]:
# Merge the Greggs dataset with the NSPL dataset on postcode
merged_df = pd.merge(greggs_df, nspl_df, left_on='PostCode',
                     right_on='pcds', how='left')

# Remove rows with missing coordinates
merged_df = merged_df.dropna(subset=['lat', 'long'])

# Display the first few rows of the merged dataset
merged_df.head()

| | FHRSID | BusinessName | AddressLine1 | AddressLine2 | AddressLine3 | AddressLine4 | PostCode | BusinessType | RatingValue | pcds | lat | long |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 597898 | Greggs | 6 Watford Road | Birmingham | | | B30 1JA | Retailers - other | 5 | B30 1JA | 52.416523 | -1.93014 |
| 1 | 854057 | Greggs | 39 One Stop Shopping Centre | Walsall Road | Perry Barr | Birmingham | B42 1AA | Takeaway/sandwich shop | 5 | B42 1AA | 52.517615 | -1.90294 |
| 2 | 373168 | Greggs | 16 Western Road | Romford | | | RM1 3LD | Retailers - other | 5 | RM1 3LD | 51.576851 | 0.182899 |
| 3 | 1057556 | Greggs | | | 52-54 Botanic Avenue | Belfast | BT7 1JR | Restaurant/Cafe/Canteen | 5 | BT7 1JR | 54.587208 | -5.932365 |
| 4 | 1291323 | Greggs | | 49 High Street | Stone | Staffordshire | ST15 8AD | Manufacturers/packers | 5 | ST15 8AD | 52.902867 | -2.147497 |
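
Because unmatched rows are dropped straight after the merge, it is easy to lose locations without noticing. As a rough sketch, the `indicator=True` option of `merge` can be used to count how many rows found a match before anything is removed. The frames `shops` and `lookup` below are small stand-ins for `greggs_df` and `nspl_df`, and the second shop row is made up for illustration.

```python
import pandas as pd

# Illustrative stand-ins for greggs_df and nspl_df; "XX1 1XX" is a made-up postcode.
shops = pd.DataFrame({
    "BusinessName": ["Greggs", "Greggs"],
    "PostCode": ["B30 1JA", "XX1 1XX"],
})
lookup = pd.DataFrame({
    "pcds": ["B30 1JA"],
    "lat": [52.416523],
    "long": [-1.93014],
})

# indicator=True adds a `_merge` column: 'both' for matches, 'left_only' otherwise.
checked = shops.merge(lookup, left_on="PostCode", right_on="pcds",
                      how="left", indicator=True)

unmatched = checked.loc[checked["_merge"] == "left_only", "PostCode"].tolist()
match_rate = (checked["_merge"] == "both").mean()

print(f"Matched {match_rate:.0%} of rows; unmatched postcodes: {unmatched}")
```

Inspecting the unmatched postcodes first helps distinguish typos in the source data from postcodes that are simply absent from the lookup.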


---

### Step 6: Convert to GeoDataFrame for Mapping
To visualise our data on a map, we will convert the `merged_df` DataFrame to a GeoDataFrame using `geopandas`. Note that `points_from_xy` expects x (longitude) first and y (latitude) second.


In [9]:
# Convert to GeoDataFrame
gdf = gpd.GeoDataFrame(merged_df, geometry=gpd.points_from_xy(
    merged_df['long'], merged_df['lat']))

---

### Step 7: Visualise Greggs Locations on a Map
Using Folium, we will create an interactive map of Greggs locations across the UK. The map opens centred on Manchester, and each location is marked with a popup showing the business name, first address line, business type and rating.


In [21]:
# Initialize a Folium map centered around Manchester
uk_map = folium.Map(location=[53.483959, -2.244644], zoom_start=15)

# Add each Greggs location to the map with name, address, and rating in the popup
for idx, row in gdf.iterrows():
    popup_text = f"""
    <b>Name:</b> {row['BusinessName']}<br>
    <b>Address:</b> {row['AddressLine1']}<br>
    <b>Business Type:</b> {row['BusinessType']}<br>
    <b>Rating:</b> {row['RatingValue']}
    """
    folium.Marker(
        location=[row['lat'], row['long']],
        popup=popup_text
    ).add_to(uk_map)

# Display the map
uk_map
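
Several rows in the dataset have an empty `AddressLine1`, so the popup above would show `nan` for those locations. One possible refinement, sketched here with a hypothetical helper `format_address` that is not part of the original notebook, is to join whichever address fields are present and escape them for use in HTML.

```python
import html
from typing import Any, Mapping

ADDRESS_FIELDS = ("AddressLine1", "AddressLine2", "AddressLine3",
                  "AddressLine4", "PostCode")


def format_address(row: Mapping[str, Any]) -> str:
    """Join the non-empty address parts of a row into one HTML-safe string."""
    parts = []
    for field in ADDRESS_FIELDS:
        value = row.get(field)
        # Missing values arrive as NaN (a float) or None, so keep non-blank strings only.
        if isinstance(value, str) and value.strip():
            parts.append(html.escape(value.strip()))
    return ", ".join(parts)


# Example row mirroring the Stone, Staffordshire entry shown earlier.
example = {
    "AddressLine1": float("nan"),
    "AddressLine2": "49 High Street",
    "AddressLine3": "Stone",
    "AddressLine4": "Staffordshire",
    "PostCode": "ST15 8AD",
}
print(format_address(example))
```

Inside the loop, `{format_address(row)}` could then take the place of `{row['AddressLine1']}` in the popup text.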

---

### Step 8: Save and View the Map
Finally, we'll save the map as an HTML file, so you can open it in a web browser and explore Greggs locations interactively.


In [12]:
# Save the map to an HTML file
uk_map.save("greggs_locations_map.html")
print("Map has been saved as 'greggs_locations_map.html'")

Map has been saved as 'greggs_locations_map.html'


---

### Conclusion
In this project, we geocoded Greggs locations using the NSPL dataset and visualised them on an interactive map. The same technique can be applied to any other dataset that contains UK postcodes.
