# Location-based Feature Engineering for Listings 

This notebook uses each listing's coordinates to call several APIs and build new features that describe the amenities, services and points of interest near the property.

In [99]:
import pandas as pd
import os
import sys
from pathlib import Path
from shapely.geometry import Point
import numpy as np


# Add project root to Python path
current_dir = Path().resolve()
if current_dir.name == 'notebooks':
    project_root = current_dir.parent
elif current_dir.name == 'project2':
    project_root = current_dir
else:
    # If we're in the parent directory, look for project2
    project_root = current_dir / 'project2'

sys.path.insert(0, str(project_root))
print(f"Project root: {project_root}")

from utils.preprocess import PreprocessUtils

# Initialize the preprocessor
preprocessor = PreprocessUtils()

pd.set_option("display.max_rows", None)  # Show all rows, default is 10
pd.set_option("display.max_columns", None)  # Show all columns, default is 20

Project root: /Users/jackshee/University/MAST30034 Applied Data Science/project2


## 1. Set up API

Note: in the interest of time, we originally ran the pipeline on the sampled subset rather than the full dataset.

In [100]:
# Read in cleaned_listings and cleaned_listings_sampled
df = pd.read_csv('../data/processed/domain/cleaned_listings.csv')

# Uncomment below and run only for sampled subset if time constrained
# df_sampled = pd.read_csv('../data/processed/domain/cleaned_listings_sampled.csv')

In [101]:
batch_size = 500

# Create the batch directories that the API scripts read from
for output_dir in ["../data/raw/missing_isochrones/driving", "../data/raw/missing_isochrones/walking", "../data/raw/missing_poi", "../data/raw/missing_routes"]:
    preprocessor.split_into_batches(df[['property_id', 'coordinates']], batch_size, output_dir)

# for output_dir in ["../data/raw/missing_isochrones_sampled/driving", "../data/raw/missing_isochrones_sampled/walking", "../data/raw/missing_poi_sampled", "../data/raw/missing_routes_sampled"]:
#     preprocessor.split_into_batches(df_sampled[['property_id', 'coordinates']], batch_size, output_dir)


Saved batch_0001.csv: 500 rows
Saved batch_0002.csv: 500 rows
Saved batch_0003.csv: 500 rows
Saved batch_0004.csv: 500 rows
Saved batch_0005.csv: 500 rows
Saved batch_0006.csv: 500 rows
Saved batch_0007.csv: 500 rows
Saved batch_0008.csv: 500 rows
Saved batch_0009.csv: 500 rows
Saved batch_0010.csv: 500 rows
Saved batch_0011.csv: 500 rows
Saved batch_0012.csv: 500 rows
Saved batch_0013.csv: 500 rows
Saved batch_0014.csv: 500 rows
Saved batch_0015.csv: 500 rows
Saved batch_0016.csv: 500 rows
Saved batch_0017.csv: 500 rows
Saved batch_0018.csv: 500 rows
Saved batch_0019.csv: 500 rows
Saved batch_0020.csv: 500 rows
Saved batch_0021.csv: 500 rows
Saved batch_0022.csv: 500 rows
Saved batch_0023.csv: 500 rows
Saved batch_0024.csv: 500 rows
Saved batch_0025.csv: 500 rows
Saved batch_0026.csv: 500 rows
Saved batch_0027.csv: 500 rows
Saved batch_0028.csv: 500 rows
Saved batch_0029.csv: 500 rows
Saved batch_0030.csv: 500 rows
Saved batch_0031.csv: 500 rows
Saved batch_0032.csv: 500 rows
Saved ba
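
The implementation of `PreprocessUtils.split_into_batches` is not shown in this notebook. As a rough sketch of what such a helper might do, the function below assumes it writes consecutive slices to `batch_0001.csv`, `batch_0002.csv`, and so on, and returns the written paths. The name `split_into_batches_sketch` is hypothetical and the real method may differ.

```python
import math
from pathlib import Path

import pandas as pd


def split_into_batches_sketch(df, batch_size, output_dir):
    """Hypothetical stand-in for PreprocessUtils.split_into_batches."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    n_batches = math.ceil(len(df) / batch_size)
    paths = []
    for i in range(n_batches):
        # Consecutive, non-overlapping slices; the last one may be shorter.
        batch = df.iloc[i * batch_size:(i + 1) * batch_size]
        path = out / f"batch_{i + 1:04d}.csv"
        batch.to_csv(path, index=False)
        print(f"Saved {path.name}: {len(batch)} rows")
        paths.append(str(path))
    return paths
```

Keeping each batch at or below the per-key request limit means one batch file can be handed to one API key without further bookkeeping.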

### Optional Section 

We had already made many API calls for the sampled dataset and stored the results. Here we merge those results into `df` and identify only the remaining properties that still need API fetches, so that features can be created for the full dataset. Normally the pipeline would be run directly on the full dataset, in which case this section can be skipped.
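
The idea in this section, reusing stored results and fetching only what is still missing, can be illustrated on toy data. This is a minimal sketch: the helper `rows_missing_features` and the toy values are hypothetical and are not part of `PreprocessUtils`.

```python
import pandas as pd


def rows_missing_features(df, features, how="any"):
    """Return property_id and coordinates for rows lacking feature values.

    how="any": a row counts as missing if any listed feature is null.
    how="all": a row counts as missing only if every listed feature is null.
    """
    nulls = df[features].isnull()
    mask = nulls.any(axis=1) if how == "any" else nulls.all(axis=1)
    return df.loc[mask, ["property_id", "coordinates"]].copy()


# Toy full dataset and toy cached results (values are made up).
full = pd.DataFrame({
    "property_id": [1, 2, 3],
    "coordinates": ["POINT (-37.80 144.90)", "POINT (-37.81 144.96)", "POINT (-37.70 145.00)"],
})
cached = pd.DataFrame({"property_id": [1, 3], "min_route_dist_m": [850.0, 1200.0]})

# A left merge keeps every listing; listings without cached results get NaN.
merged = full.merge(cached, on="property_id", how="left")
todo = rows_missing_features(merged, ["min_route_dist_m"])
print(todo["property_id"].tolist())
```

The `how` switch mirrors the cells in this section: `any` suits feature groups that are always returned together, while `all` suits sparse POI columns where individual nulls are expected.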

In [102]:
df_cleaned = pd.read_csv("../data/curated/rent_features/cleaned_listings_sampled.csv")

df_cleaned.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 14497 entries, 0 to 14496
Columns: 225 entries, bathrooms to walking_15min_imputed
dtypes: float64(202), int64(6), object(17)
memory usage: 24.9+ MB


In [103]:
driving_isochrone_features = ['driving_5min', 'driving_10min', 'driving_15min']
current_driving = df_cleaned[['property_id'] + driving_isochrone_features]
walking_isochrone_features = ['walking_5min', 'walking_10min', 'walking_15min']
current_walking = df_cleaned[['property_id'] + walking_isochrone_features]

In [104]:
ptv_routes_features = ['closest_ptv_station_id', 'min_route_dist_m', 'min_route_dur_s']
current_routes = df_cleaned[['property_id'] + ptv_routes_features]

In [105]:
current_poi = preprocessor.merge_batches("../data/processed/poi_features")
current_poi = pd.concat([current_poi, preprocessor.merge_batches("../data/processed/poi_features_wayback/")])
current_poi.info()

Starting merge process...
Input directory: ../data/processed/poi_features
File pattern: *.csv
------------------------------------------------------------
Found 13 files to merge:
  - poi_features_0.csv
  - poi_features_0001.csv
  - poi_features_1000.csv
  - poi_features_1500.csv
  - poi_features_2000.csv
  - poi_features_2500.csv
  - poi_features_3000.csv
  - poi_features_3500.csv
  - poi_features_4000.csv
  - poi_features_4500.csv
  - poi_features_500.csv
  - poi_features_combined.csv
  - poi_features_combined_before_imputation.csv

  Loaded: poi_features_0.csv (500 rows)
  Loaded: poi_features_0001.csv (1 rows)
  Loaded: poi_features_1000.csv (500 rows)
  Loaded: poi_features_1500.csv (500 rows)
  Loaded: poi_features_2000.csv (500 rows)
  Loaded: poi_features_2500.csv (500 rows)
  Loaded: poi_features_3000.csv (500 rows)
  Loaded: poi_features_3500.csv (500 rows)
  Loaded: poi_features_4000.csv (500 rows)
  Loaded: poi_features_4500.csv (500 rows)
  Loaded: poi_features_500.csv (50

In [106]:
current_poi['PropertyID'] = current_poi['PropertyID'].astype('Int64')
# rename 'PropertyID' to "property_id"
current_poi = current_poi.rename(columns={'PropertyID': 'property_id'})

current_poi.head()


Unnamed: 0,property_id,count_atm,count_bank,count_childcare,count_clinic,count_community_centre,count_doctors,count_fast_food,count_fuel,count_kindergarten,count_parking_space,count_restaurant,min_dist_atm,min_dist_bank,min_dist_childcare,min_dist_clinic,min_dist_community_centre,min_dist_doctors,min_dist_fast_food,min_dist_fuel,min_dist_kindergarten,min_dist_parking_space,min_dist_restaurant,count_bar,count_bus_station,count_cafe,count_charging_station,count_cinema,count_college,count_fire_station,count_food_court,count_library,count_nightclub,count_nursing_home,count_parcel_locker,count_pharmacy,count_police,count_social_facility,count_taxi,count_theatre,count_university,count_veterinary,min_dist_bar,min_dist_bus_station,min_dist_cafe,min_dist_charging_station,min_dist_cinema,min_dist_college,min_dist_fire_station,min_dist_food_court,min_dist_library,min_dist_nightclub,min_dist_nursing_home,min_dist_parcel_locker,min_dist_pharmacy,min_dist_police,min_dist_social_facility,min_dist_taxi,min_dist_theatre,min_dist_university,min_dist_veterinary,count_car_rental,count_coworking_space,count_events_venue,count_hospital,count_social_centre,min_dist_car_rental,min_dist_coworking_space,min_dist_events_venue,min_dist_hospital,min_dist_social_centre,count_motorcycle_parking,count_toy_library,count_waste_disposal,min_dist_motorcycle_parking,min_dist_toy_library,min_dist_waste_disposal,count_brothel,count_casino,count_conference_centre,count_exhibition_centre,count_internet_cafe,count_juice_bar,count_prison,min_dist_brothel,min_dist_casino,min_dist_conference_centre,min_dist_exhibition_centre,min_dist_internet_cafe,min_dist_juice_bar,min_dist_prison,count_student_accommodation,min_dist_student_accommodation,count_events_centre,count_healthcare,min_dist_events_centre,min_dist_healthcare,count_bus_station;shelter,min_dist_bus_station;shelter,count_courier,min_dist_courier,count_waste_transfer_station,min_dist_waste_transfer_station,count_retail,min_dist_retail,count_restaurant; 
cafe,min_dist_restaurant; cafe,count_restaurant;cafe,min_dist_restaurant;cafe,count_biergarten,count_meeting_point,count_music_venue,count_tool_library,min_dist_biergarten,min_dist_meeting_point,min_dist_music_venue,min_dist_tool_library,count_private parking_space,min_dist_private parking_space,count_community_hall;kindergarten,min_dist_community_hall;kindergarten,count_car_wash;cafe,min_dist_car_wash;cafe,count_cafe;deli,min_dist_cafe;deli,count_cafe;bar,min_dist_cafe;bar,count_diused:fuel,min_dist_diused:fuel,count_former_hospital,min_dist_former_hospital
0,17758724,3.0,1.0,3.0,1.0,1.0,1.0,6.0,4.0,8.0,120.0,7.0,1407.923674,1401.316787,1331.091227,1852.033388,810.631041,1054.765606,927.24533,984.819075,1332.137686,1238.165754,949.216907,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
1,17758700,5.0,9.0,2.0,,1.0,3.0,13.0,5.0,,19.0,32.0,457.054722,1169.642098,1117.678213,,1305.69601,1369.166666,346.785805,882.789671,,350.626347,541.321899,2.0,2.0,32.0,3.0,2.0,1.0,1.0,1.0,2.0,1.0,3.0,1.0,7.0,1.0,2.0,1.0,2.0,2.0,1.0,849.397853,1046.963051,843.257102,970.860033,1352.238195,1274.811981,589.01674,1615.877806,1529.143225,1259.554002,704.405844,1062.945278,1017.18905,1303.820591,1366.645361,1351.254851,1388.164912,1290.353417,428.532188,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
2,17758698,4.0,7.0,,6.0,8.0,20.0,36.0,10.0,21.0,3.0,65.0,282.42213,528.878947,,706.62811,381.854012,307.974297,283.401256,244.01109,293.981551,1508.748316,255.940519,11.0,,77.0,3.0,1.0,,,,5.0,1.0,2.0,,8.0,1.0,2.0,,1.0,,3.0,292.652131,,277.167877,697.94511,1313.410913,,,,518.473975,1198.82495,645.830617,,357.031894,1708.611116,946.387572,,1603.364131,,277.085211,1.0,1.0,6.0,3.0,1.0,252.389046,610.898499,333.98411,798.028442,1003.49241,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
3,17758685,1.0,,2.0,,1.0,,2.0,1.0,,,,1516.54185,,1796.44002,,634.797077,,1517.592061,1510.935281,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
4,17758681,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,


In [107]:
poi_features = current_poi.columns[1:]
poi_features

Index(['count_atm', 'count_bank', 'count_childcare', 'count_clinic',
       'count_community_centre', 'count_doctors', 'count_fast_food',
       'count_fuel', 'count_kindergarten', 'count_parking_space',
       ...
       'count_car_wash;cafe', 'min_dist_car_wash;cafe', 'count_cafe;deli',
       'min_dist_cafe;deli', 'count_cafe;bar', 'min_dist_cafe;bar',
       'count_diused:fuel', 'min_dist_diused:fuel', 'count_former_hospital',
       'min_dist_former_hospital'],
      dtype='object', length=130)

In [108]:
# merge current_driving, current_walking, current_routes, current_poi to df on property_id
df = df.merge(current_driving, on='property_id', how='left')
df = df.merge(current_walking, on='property_id', how='left')
df = df.merge(current_routes, on='property_id', how='left')
df = df.merge(current_poi, on='property_id', how='left')

print(f"DataFrame shape after merging: {df.shape}")
print(f"Columns added: {len(current_driving.columns) + len(current_walking.columns) + len(current_routes.columns) + len(current_poi.columns) - 4}")  # -4 for property_id duplicates


DataFrame shape after merging: (89618, 152)
Columns added: 139


In [109]:
# Sort by year and quarter, then keep the earliest record for each property_id
n_before = len(df)
df = df.sort_values(['year', 'quarter'])
df = df.drop_duplicates(subset=['property_id'], keep='first')

print(f"DataFrame shape after deduplication: {df.shape}")
print(f"Records removed: {n_before - len(df)}")


DataFrame shape after deduplication: (24027, 152)
Records removed: 65591


In [None]:
# Check which rows have null values in any of the PTV routes features
null_routes_mask = df[ptv_routes_features].isnull().any(axis=1)

# Get property_id and coordinates for rows with null PTV routes features
missing_routes_data = df[null_routes_mask][['property_id', 'coordinates']].copy()

print(f"Total rows in df: {len(df)}")
print(f"Rows with missing PTV routes data: {len(missing_routes_data)}")
print(f"Missing routes data shape: {missing_routes_data.shape}")

# Show first few rows
print("\nFirst 5 rows with missing PTV routes data:")
print(missing_routes_data.head())

preprocessor.split_into_batches(missing_routes_data, 500, output_dir='../data/raw/missing_routes_remain')


Total rows in df: 24027
Rows with missing PTV routes data: 12575
Missing routes data shape: (12575, 2)

First 5 rows with missing PTV routes data:
       property_id                      coordinates
89617      8991170  POINT (-36.7671057 144.2591911)
89616     10621148    POINT (-37.756201 144.811242)
89615     10715072  POINT (-37.8141406 144.9874786)
89606     11423279  POINT (-36.7447064 144.2815164)
89614     11447599   POINT (-37.8265171 144.960065)


In [116]:
# Check which rows have null values in all of the POI features
null_poi_mask = df[poi_features].isnull().all(axis=1)

# Get property_id and coordinates for rows with null POI features
missing_poi_data = df[null_poi_mask][['property_id', 'coordinates']].copy()

print(f"Total rows in df: {len(df)}")
print(f"Rows with missing POI data: {len(missing_poi_data)}")
print(f"Missing POI data shape: {missing_poi_data.shape}")

# Show first few rows
print("\nFirst 5 rows with missing POI data:")
print(missing_poi_data.head())

# Split into batches
preprocessor.split_into_batches(missing_poi_data, 500, output_dir='../data/raw/missing_poi_remain')

Total rows in df: 24027
Rows with missing POI data: 13583
Missing POI data shape: (13583, 2)

First 5 rows with missing POI data:
       property_id                      coordinates
89616     10621148    POINT (-37.756201 144.811242)
89615     10715072  POINT (-37.8141406 144.9874786)
89606     11423279  POINT (-36.7447064 144.2815164)
89614     11447599   POINT (-37.8265171 144.960065)
89605     12544827    POINT (-37.857627 144.678496)
Saved batch_0001.csv: 500 rows
Saved batch_0002.csv: 500 rows
Saved batch_0003.csv: 500 rows
Saved batch_0004.csv: 500 rows
Saved batch_0005.csv: 500 rows
Saved batch_0006.csv: 500 rows
Saved batch_0007.csv: 500 rows
Saved batch_0008.csv: 500 rows
Saved batch_0009.csv: 500 rows
Saved batch_0010.csv: 500 rows
Saved batch_0011.csv: 500 rows
Saved batch_0012.csv: 500 rows
Saved batch_0013.csv: 500 rows
Saved batch_0014.csv: 500 rows
Saved batch_0015.csv: 500 rows
Saved batch_0016.csv: 500 rows
Saved batch_0017.csv: 500 rows
Saved batch_0018.csv: 500 rows


['../data/raw/missing_poi_remain/batch_0001.csv',
 '../data/raw/missing_poi_remain/batch_0002.csv',
 '../data/raw/missing_poi_remain/batch_0003.csv',
 '../data/raw/missing_poi_remain/batch_0004.csv',
 '../data/raw/missing_poi_remain/batch_0005.csv',
 '../data/raw/missing_poi_remain/batch_0006.csv',
 '../data/raw/missing_poi_remain/batch_0007.csv',
 '../data/raw/missing_poi_remain/batch_0008.csv',
 '../data/raw/missing_poi_remain/batch_0009.csv',
 '../data/raw/missing_poi_remain/batch_0010.csv',
 '../data/raw/missing_poi_remain/batch_0011.csv',
 '../data/raw/missing_poi_remain/batch_0012.csv',
 '../data/raw/missing_poi_remain/batch_0013.csv',
 '../data/raw/missing_poi_remain/batch_0014.csv',
 '../data/raw/missing_poi_remain/batch_0015.csv',
 '../data/raw/missing_poi_remain/batch_0016.csv',
 '../data/raw/missing_poi_remain/batch_0017.csv',
 '../data/raw/missing_poi_remain/batch_0018.csv',
 '../data/raw/missing_poi_remain/batch_0019.csv',
 '../data/raw/missing_poi_remain/batch_0020.csv',


In [118]:
# Check which rows have null values in any of the driving isochrone features
null_driving_mask = df[driving_isochrone_features].isnull().any(axis=1)

# Get property_id and coordinates for rows with null driving isochrone features
missing_driving_data = df[null_driving_mask][['property_id', 'coordinates']].copy()

print(f"Total rows in df: {len(df)}")
print(f"Rows with missing driving isochrone data: {len(missing_driving_data)}")
print(f"Missing driving isochrone data shape: {missing_driving_data.shape}")

# Show first few rows
print("\nFirst 5 rows with missing driving isochrone data:")
print(missing_driving_data.head())

# Split into batches
preprocessor.split_into_batches(missing_driving_data, 500, output_dir='../data/raw/missing_isochrones_remain/driving')

Total rows in df: 24027
Rows with missing driving isochrone data: 12802
Missing driving isochrone data shape: (12802, 2)

First 5 rows with missing driving isochrone data:
       property_id                      coordinates
89616     10621148    POINT (-37.756201 144.811242)
89615     10715072  POINT (-37.8141406 144.9874786)
89614     11447599   POINT (-37.8265171 144.960065)
89605     12544827    POINT (-37.857627 144.678496)
89602     14282968   POINT (-37.8418634 144.992411)
Saved batch_0001.csv: 500 rows
Saved batch_0002.csv: 500 rows
Saved batch_0003.csv: 500 rows
Saved batch_0004.csv: 500 rows
Saved batch_0005.csv: 500 rows
Saved batch_0006.csv: 500 rows
Saved batch_0007.csv: 500 rows
Saved batch_0008.csv: 500 rows
Saved batch_0009.csv: 500 rows
Saved batch_0010.csv: 500 rows
Saved batch_0011.csv: 500 rows
Saved batch_0012.csv: 500 rows
Saved batch_0013.csv: 500 rows
Saved batch_0014.csv: 500 rows
Saved batch_0015.csv: 500 rows
Saved batch_0016.csv: 500 rows
Saved batch_0017.csv

['../data/raw/missing_isochrones_remain/driving/batch_0001.csv',
 '../data/raw/missing_isochrones_remain/driving/batch_0002.csv',
 '../data/raw/missing_isochrones_remain/driving/batch_0003.csv',
 '../data/raw/missing_isochrones_remain/driving/batch_0004.csv',
 '../data/raw/missing_isochrones_remain/driving/batch_0005.csv',
 '../data/raw/missing_isochrones_remain/driving/batch_0006.csv',
 '../data/raw/missing_isochrones_remain/driving/batch_0007.csv',
 '../data/raw/missing_isochrones_remain/driving/batch_0008.csv',
 '../data/raw/missing_isochrones_remain/driving/batch_0009.csv',
 '../data/raw/missing_isochrones_remain/driving/batch_0010.csv',
 '../data/raw/missing_isochrones_remain/driving/batch_0011.csv',
 '../data/raw/missing_isochrones_remain/driving/batch_0012.csv',
 '../data/raw/missing_isochrones_remain/driving/batch_0013.csv',
 '../data/raw/missing_isochrones_remain/driving/batch_0014.csv',
 '../data/raw/missing_isochrones_remain/driving/batch_0015.csv',
 '../data/raw/missing_iso

In [119]:
# Check which rows have null values in any of the walking isochrone features
null_walking_mask = df[walking_isochrone_features].isnull().any(axis=1)

# Get property_id and coordinates for rows with null walking isochrone features
missing_walking_data = df[null_walking_mask][['property_id', 'coordinates']].copy()

print(f"Total rows in df: {len(df)}")
print(f"Rows with missing walking isochrone data: {len(missing_walking_data)}")
print(f"Missing walking isochrone data shape: {missing_walking_data.shape}")

# Show first few rows
print("\nFirst 5 rows with missing walking isochrone data:")
print(missing_walking_data.head())

# Split into batches
preprocessor.split_into_batches(missing_walking_data, 500, output_dir='../data/raw/missing_isochrones_remain/walking')

Total rows in df: 24027
Rows with missing walking isochrone data: 14247
Missing walking isochrone data shape: (14247, 2)

First 5 rows with missing walking isochrone data:
       property_id                      coordinates
89617      8991170  POINT (-36.7671057 144.2591911)
89616     10621148    POINT (-37.756201 144.811242)
89615     10715072  POINT (-37.8141406 144.9874786)
89606     11423279  POINT (-36.7447064 144.2815164)
89614     11447599   POINT (-37.8265171 144.960065)
Saved batch_0001.csv: 500 rows
Saved batch_0002.csv: 500 rows
Saved batch_0003.csv: 500 rows
Saved batch_0004.csv: 500 rows
Saved batch_0005.csv: 500 rows
Saved batch_0006.csv: 500 rows
Saved batch_0007.csv: 500 rows
Saved batch_0008.csv: 500 rows
Saved batch_0009.csv: 500 rows
Saved batch_0010.csv: 500 rows
Saved batch_0011.csv: 500 rows
Saved batch_0012.csv: 500 rows
Saved batch_0013.csv: 500 rows
Saved batch_0014.csv: 500 rows
Saved batch_0015.csv: 500 rows
Saved batch_0016.csv: 500 rows
Saved batch_0017.csv

['../data/raw/missing_isochrones_remain/walking/batch_0001.csv',
 '../data/raw/missing_isochrones_remain/walking/batch_0002.csv',
 '../data/raw/missing_isochrones_remain/walking/batch_0003.csv',
 '../data/raw/missing_isochrones_remain/walking/batch_0004.csv',
 '../data/raw/missing_isochrones_remain/walking/batch_0005.csv',
 '../data/raw/missing_isochrones_remain/walking/batch_0006.csv',
 '../data/raw/missing_isochrones_remain/walking/batch_0007.csv',
 '../data/raw/missing_isochrones_remain/walking/batch_0008.csv',
 '../data/raw/missing_isochrones_remain/walking/batch_0009.csv',
 '../data/raw/missing_isochrones_remain/walking/batch_0010.csv',
 '../data/raw/missing_isochrones_remain/walking/batch_0011.csv',
 '../data/raw/missing_isochrones_remain/walking/batch_0012.csv',
 '../data/raw/missing_isochrones_remain/walking/batch_0013.csv',
 '../data/raw/missing_isochrones_remain/walking/batch_0014.csv',
 '../data/raw/missing_isochrones_remain/walking/batch_0015.csv',
 '../data/raw/missing_iso

## 2. Running the API scripts

To fetch the features, first create API keys at https://account.heigit.org/login. Each key is limited to 500 requests, which is why the batches contain 500 properties each. The full dataset has 49 batches, so fetching all the data within 24 hours requires 49 keys. Alternatively, 25 keys are enough to fetch all the features over two days, provided the scripts are run concurrently in separate terminals.
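
The key-count arithmetic above can be checked with a few lines. This sketch assumes the 500-request limit applies per key per day and that each property costs one request per feature type; the function names are illustrative, and 24,027 is the number of unique properties reported after deduplication earlier in this notebook.

```python
import math


def batches_needed(n_properties, batch_size=500):
    """Number of batch files when each holds at most batch_size rows."""
    return math.ceil(n_properties / batch_size)


def days_needed(n_batches, n_keys):
    """Days required if each key processes one batch per day (assumed quota)."""
    return math.ceil(n_batches / n_keys)


n_batches = batches_needed(24027)
print(n_batches, days_needed(n_batches, 49), days_needed(n_batches, 25))
```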

To get the minimum distance and duration to the closest PTV station via the road network (driving), run the `./run_routes.sh` shell script from a terminal in the project root directory. You may first need to make the script executable with `chmod +x run_routes.sh`.

Likewise, to get the counts and minimum distance to various kinds of amenities (points of interest), run `./run_all_poi_batches.sh`. 

For isochrones (both driving and walking profiles), run `./run_isochrones_batch_driving.sh` and `./run_isochrones_batch_walking.sh`. Each returns a POLYGON object describing the area reachable within 5, 10 or 15 minutes of driving or walking from an individual property.
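
As an illustration of what one isochrone request might look like, the sketch below parses a `coordinates` string and builds a request body in the shape used by the openrouteservice isochrones endpoint. It assumes the `POINT` strings store latitude first (as the Victorian values in this notebook suggest), that the API expects `[longitude, latitude]` pairs, and that ranges are given in seconds. The helper names are hypothetical and the shell scripts may build their requests differently. No request is sent.

```python
import re

_POINT_RE = re.compile(r"POINT\s*\(\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*\)")


def parse_lat_lon(point_str):
    """Parse 'POINT (lat lon)' as stored in the listings (assumed order)."""
    match = _POINT_RE.fullmatch(point_str.strip())
    if match is None:
        raise ValueError(f"Unrecognised coordinates: {point_str!r}")
    lat, lon = map(float, match.groups())
    return lat, lon


def isochrone_payload(point_str, minutes=(5, 10, 15)):
    """Build a JSON-serialisable body for one isochrone request (assumed shape)."""
    lat, lon = parse_lat_lon(point_str)
    return {
        "locations": [[lon, lat]],            # longitude first
        "range": [m * 60 for m in minutes],   # minutes converted to seconds
        "range_type": "time",
    }


print(isochrone_payload("POINT (-37.8141406 144.9874786)"))
```

Swapping the coordinate order explicitly in one place makes it harder to send latitude where longitude is expected, which would otherwise yield empty or misplaced polygons.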

All fetched data are saved to `data/processed` as `.csv` files containing `property_id` and the feature columns, and are then merged into `cleaned_listings.csv` on `property_id`.