In [5]:
pip install requests

Note: you may need to restart the kernel to use updated packages.





Before we start the project, we need to import the necessary libraries:
* pandas is a data manipulation and analysis library for Python.
* requests is a library for making HTTP requests in Python. It is useful for calling web APIs and downloading data; here it is used to query the OSRM routing service.
* folium is a Python library that generates interactive maps with markers, polylines, and more. It builds on Leaflet.js, a popular JavaScript mapping library.
* random is a standard-library module for generating random numbers or picking random items from sequences. Algorithms such as Simulated Annealing use it to introduce random behavior.
* math and numpy provide numeric helpers, and warnings controls how warning messages are shown.

### Algorithms we're going to use

- Simulated Annealing (v)
- A Star
- BFS
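
As a rough illustration of the first option, here is a minimal simulated annealing sketch for ordering stops. It assumes a precomputed distance matrix (the toy values are made up), and the function names, the swap move, and the cooling schedule are all assumptions for illustration, not this notebook's final implementation.

```python
import math
import random


def tour_length(order, dist):
    """Sum the leg distances of an open path that visits stops in `order`."""
    return sum(dist[a][b] for a, b in zip(order, order[1:]))


def simulated_annealing(dist, start_temp=10.0, cooling=0.995, steps=5000, seed=0):
    """Search for a short visiting order by swapping two stops at a time."""
    rng = random.Random(seed)
    current = list(range(len(dist)))
    current_len = tour_length(current, dist)
    best, best_len = current[:], current_len
    temp = start_temp

    for _ in range(steps):
        # Propose a neighbour: swap two randomly chosen positions.
        i, j = rng.sample(range(len(dist)), 2)
        candidate = current[:]
        candidate[i], candidate[j] = candidate[j], candidate[i]
        candidate_len = tour_length(candidate, dist)
        delta = candidate_len - current_len

        # Always accept improvements; accept worse moves with probability exp(-delta / temp).
        if delta < 0 or rng.random() < math.exp(-delta / temp):
            current, current_len = candidate, candidate_len
            if current_len < best_len:
                best, best_len = current[:], current_len

        # Geometric cooling, with a floor to avoid dividing by zero.
        temp = max(temp * cooling, 1e-9)

    return best, best_len


# Toy 4-stop matrix in km (made-up values, not taken from the pub data).
toy_dist = [
    [0.0, 1.0, 4.0, 3.0],
    [1.0, 0.0, 1.5, 4.0],
    [4.0, 1.5, 0.0, 1.0],
    [3.0, 4.0, 1.0, 0.0],
]
best_order, best_km = simulated_annealing(toy_dist)
print(best_order, round(best_km, 2))
```

Accepting some worse moves early on is what lets the search escape a poor local ordering; as the temperature falls, it behaves more and more like plain hill climbing.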

In [30]:
import pandas as pd
import requests
import folium
import random
import math
import numpy as np
import warnings


df = pd.read_csv('D:/Code/Coding Stuff/2nd Year/AI Concpet/open_pubs.csv')

df.head()

Unnamed: 0,fas_id,name,address,postcode,easting,northing,latitude,longitude,local_authority
0,24,Anchor Inn,"Upper Street, Stratford St Mary, COLCHESTER, E...",CO7 6LW,604748,234405.0,51.97039,0.979328,Babergh
1,30,Angel Inn,"Egremont Street, Glemsford, SUDBURY, Suffolk",CO10 7SA,582888,247368.0,52.094427,0.668408,Babergh
2,63,Black Boy Hotel,"7 Market Hill, SUDBURY, Suffolk",CO10 2EA,587356,241327.0,52.038683,0.730226,Babergh
3,64,Black Horse,"Lower Street, Stratford St Mary, COLCHESTER, E...",CO7 6JS,604270,233920.0,51.966211,0.972091,Babergh
4,65,Black Lion,"Lion Road, Glemsford, SUDBURY, Suffolk",CO10 7RF,582750,248298.0,52.102815,0.666893,Babergh


In [11]:
null_rows = df[df.isnull().any(axis=1)]
print(null_rows)

       fas_id                                               name   address  \
768      7499  J D Wetherspoon \"The Star\","105 High Street,...  EN11 8TN   
43212  412676  \"Rory's Bar\","57 Market Place, Malton, North...  YO17 7LX   

      postcode  easting   northing   latitude   longitude local_authority  
768     537293   208856  51.761556  -0.012036  Broxbourne             NaN  
43212   478582   471715  54.135281  -0.798783     Ryedale             NaN  


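Before running any algorithm, we check whether the data contains rows with null values. In both rows found above, the name contains unescaped quotes, so the name and address end up merged into one field. Every later field shifts one column to the left, which leaves `local_authority` empty.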
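
A per-column count can be easier to scan than printing whole rows. The sketch below uses a small made-up frame (not the real `open_pubs.csv`), so the column values and the `sample` name are illustrative assumptions only.

```python
import pandas as pd

# Made-up sample standing in for the pub data; values are illustrative only.
sample = pd.DataFrame(
    {
        "name": ["Pub A", "Pub B", "Pub C"],
        "postcode": ["AA1 1AA", None, "BB2 2BB"],
        "local_authority": ["Area 1", "Area 2", None],
    }
)

# Number of missing cells in each column.
null_counts = sample.isnull().sum()
print(null_counts)

# Rows that have at least one missing cell (same idea as the cell above).
rows_with_nulls = sample[sample.isnull().any(axis=1)]
print(rows_with_nulls)
```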

In [None]:
df['latitude'].describe()

count     51566
unique    46655
top          \N
freq         70
Name: latitude, dtype: object

In [14]:
df = df.dropna()

Here we clean the provided DataFrame. CSV files often have missing entries, which pandas loads as null values in the affected columns, so we drop those rows.
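
One caveat worth illustrating: `dropna()` only removes cells that pandas already treats as missing. The `describe()` output above shows `\N` as the most frequent latitude value, and that is an ordinary string. The sketch below runs on made-up data and assumes `\N` is the dataset's missing-value marker; the frame and variable names are hypothetical.

```python
import pandas as pd

# Made-up rows: one valid latitude, one "\N" placeholder, one real null.
toy = pd.DataFrame(
    {
        "name": ["Pub A", "Pub B", "Pub C"],
        "latitude": ["51.97039", "\\N", None],
    }
)

# dropna() removes the real null but keeps the "\N" row, because it is just text.
after_dropna = toy.dropna()
print(len(after_dropna))

# Coercing to numbers turns "\N" into NaN, so a second dropna() can remove it.
converted = toy.assign(latitude=pd.to_numeric(toy["latitude"], errors="coerce"))
cleaned = converted.dropna(subset=["latitude"])
print(cleaned)
```

This is why the later cell that calls `pd.to_numeric(..., errors='coerce')` is still needed after `df.dropna()`.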

In [15]:
unique_cities = df['local_authority'].unique()
sorted_cities = sorted(unique_cities)

for city in sorted_cities:
    print(city)

Aberdeen City
Aberdeenshire
Adur
Allerdale
Amber Valley
Angus
Argyll and Bute
Arun
Ashfield
Ashford
Aylesbury Vale
Babergh
Barking and Dagenham
Barnet
Barnsley
Barrow-in-Furness
Basildon
Basingstoke and Deane
Bassetlaw
Bath and North East Somerset
Bedford
Bexley
Birmingham
Blaby
Blackburn with Darwen
Blackpool
Bolsover
Bolton
Boston
Bournemouth
Bracknell Forest
Bradford
Braintree
Breckland
Brent
Brentwood
Bridgend
Brighton and Hove
Bristol, City of
Broadland
Bromley
Bromsgrove
Broxbourne
Broxtowe
Burnley
Bury
Caerphilly
Calderdale
Cambridge
Camden
Cannock Chase
Canterbury
Cardiff
Carlisle
Carmarthenshire
Castle Point
Central Bedfordshire
Ceredigion
Charnwood
Chelmsford
Cheltenham
Cherwell
Cheshire East
Cheshire West and Chester
Chesterfield
Chichester
Chiltern
Chorley
Christchurch
City of Edinburgh
City of London
Clackmannanshire
Colchester
Conwy
Copeland
Cornwall
Cotswold
County Durham
Coventry
Craven
Crawley
Croydon
Dacorum
Darlington
Dartford
Daventry
Denbighshire
Derby
Derbyshire D

Because the dataset has roughly 50K rows, we narrow it down to a single city for now.
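
To judge how large each candidate subset would be before picking one, a count per local authority can help. This is a minimal sketch on made-up rows; the `toy` frame and its contents are assumptions, and only the column name `local_authority` comes from the dataset.

```python
import pandas as pd

# Made-up rows; only the column names mirror the real dataset.
toy = pd.DataFrame(
    {
        "name": ["Pub A", "Pub B", "Pub C", "Pub D"],
        "local_authority": ["City of London", "Babergh", "City of London", "Camden"],
    }
)

# How many pubs fall in each local authority, largest first.
pubs_per_area = toy["local_authority"].value_counts()
print(pubs_per_area)

# Keep only the rows for the chosen area.
subset = toy[toy["local_authority"] == "City of London"]
print(len(subset))
```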

In [16]:
chosen_cities = "City of London"

df_chosen_city = df[df['local_authority'] == chosen_cities]

In [17]:
# Convert coordinates to numbers; non-numeric entries such as "\N" become NaN and are dropped
df['latitude'] = pd.to_numeric(df['latitude'], errors='coerce')
df['longitude'] = pd.to_numeric(df['longitude'], errors='coerce')
df = df.dropna(subset=['latitude', 'longitude'])

# Pick one city; routing over the full dataset would take far too long
chosen_city = "City of London"
df_chosen_city = df[df['local_authority'] == chosen_city].head(10)  # limit to 10 pubs for now

In [18]:
# Written with AI assistance, since I was not yet familiar with the OSRM API
def get_osrm_route(start_coords, end_coords):
    """Return (route_coords, distance_meters), or (None, None) if no route is found."""
    # Inputs are (latitude, longitude) tuples; OSRM expects longitude,latitude order
    url = f"http://router.project-osrm.org/route/v1/driving/{start_coords[1]},{start_coords[0]};{end_coords[1]},{end_coords[0]}?overview=full&geometries=geojson"
    try:
        response = requests.get(url, timeout=10)  # do not hang forever on a slow server
    except requests.RequestException:
        return None, None

    if response.status_code == 200:
        routes = response.json().get('routes')
        if routes:
            distance_meters = routes[0]['distance']  # Distance in meters
            route_coords = routes[0]['geometry']['coordinates']  # [longitude, latitude] pairs
            return route_coords, distance_meters
    return None, None
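
Since `get_osrm_route` can return `(None, None)` when the request fails, a straight-line estimate is one possible fallback or sanity check. The sketch below uses the haversine formula; the function name `haversine_km` and the mean Earth radius constant are assumptions for illustration. Road distances will usually be longer than this figure.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; an approximation


def haversine_km(start_coords, end_coords):
    """Great-circle distance in km between two (latitude, longitude) pairs in degrees."""
    lat1, lon1 = map(math.radians, start_coords)
    lat2, lon2 = map(math.radians, end_coords)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Coordinates of two pubs from the df.head() output above (Anchor Inn and Black Horse).
anchor_inn = (51.97039, 0.979328)
black_horse = (51.966211, 0.972091)
print(f"{haversine_km(anchor_inn, black_horse):.2f} km")
```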

In [19]:
m = folium.Map(location=[df_chosen_city['latitude'].mean(), df_chosen_city['longitude'].mean()], zoom_start=13)

# Collect the pub locations and names
pub_locations = list(zip(df_chosen_city['latitude'], df_chosen_city['longitude']))  # condense the coordinates into (lat, lon) tuples
pub_names = df_chosen_city['name'].tolist()  # names of the selected pubs in the city


# Add one marker per pub (I am still learning how folium works)
for index, row in df_chosen_city.iterrows():  # iterrows iterates over rows
    folium.Marker(location=[row['latitude'], row['longitude']],
                  popup=f"{row['name']}<br>{row['address']}<br>{row['postcode']}").add_to(m)

In [20]:
total_distance = 0

for i in range(len(pub_locations) - 1):
    start_coords = pub_locations[i]
    end_coords = pub_locations[i + 1]

    # get route coordinates and distance from OSRM
    route, distance_meters = get_osrm_route(start_coords, end_coords)

    if route:
        # OSRM returns [lon, lat]; folium expects (lat, lon)
        route_latlong = [(coord[1], coord[0]) for coord in route]

        # draw the road segment on the map
        folium.PolyLine(locations=route_latlong, color='blue', weight=2.5, opacity=0.8).add_to(m)

        distance_km = distance_meters / 1000
        total_distance += distance_km

        print(f"Distance between '{pub_names[i]}' and '{pub_names[i + 1]}': {distance_km:.2f} km")

print(f"Total distance covered: {total_distance:.2f} km")

m

Distance between 'Abbey' and 'Agenda': 0.75 km
Distance between 'Agenda' and 'All Bar One': 1.84 km
Distance between 'All Bar One' and 'All Bar One': 2.42 km
Distance between 'All Bar One' and 'All Bar One': 1.84 km
Distance between 'All Bar One' and 'All Bar One': 1.65 km
Distance between 'All Bar One' and 'All Bar One': 1.40 km
Distance between 'All Bar One' and 'Amber Bar': 1.26 km
Distance between 'Amber Bar' and 'Astronomer': 1.40 km
Distance between 'Astronomer' and 'Babble City': 2.01 km
Total distance covered: 14.55 km
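
The loop above only measures consecutive legs in the order the pubs appear in the file. An optimiser such as Simulated Annealing would need the distance between every pair of pubs, so a pairwise matrix is a natural next step. The sketch below is one possible shape for that: `build_distance_matrix` and `flat_distance` are hypothetical names, and `flat_distance` is a stand-in so the example runs without network access. In the notebook, the distance function would wrap `get_osrm_route` instead.

```python
import math


def build_distance_matrix(locations, distance_fn):
    """Return an n x n list of lists with distance_fn applied to every ordered pair."""
    n = len(locations)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                # Road distances can differ by direction, so both (i, j) and (j, i) are computed.
                matrix[i][j] = distance_fn(locations[i], locations[j])
    return matrix


def flat_distance(a, b):
    """Stand-in distance on a flat plane; not a real road or geographic distance."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


# Made-up points chosen so the distances are easy to check by hand.
points = [(0, 0), (3, 4), (6, 8)]
matrix = build_distance_matrix(points, flat_distance)
for row in matrix:
    print(row)
```

With a real routing backend this makes n × (n − 1) requests, which is one more reason to keep the subset small.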
