# Illegal STL Detection and Reporting

This notebook shows the process used to find illegal short-term lets (STLs) in Galway City. Additional information is available for Galway County, Conamara, and the rest of Ireland.

Note: Airbnb blocks web scrapers from searching its site, so we rely on [Inside Airbnb](https://insideairbnb.com/get-the-data)'s data for Ireland. We can run Scrapy on specific listing URLs, but most of the information we want is already provided in Inside Airbnb's listings.csv (which just needs to be unzipped from listings.csv.gz).

Questions
- We have been focusing on listings for entire homes, but what about private/shared rooms in guest houses, etc., where the owner is letting all the individual rooms in an entire property?
- I haven't been able to find the 81 approved STL planning permissions. Where did we get that information, and can we get the list of permission reference IDs?
- Any proposals to get around "Exact location provided after booking"? (I'm wondering whether, by looking at planning permissions for guest houses, etc., we can figure out if planning permission was obtained for development but not for STL.)

In [18]:
import os
import pandas as pd
import geopandas as gp
cwd = os.getcwd()
input_dir = cwd+"/inputs"
output_dir = cwd+"/outputs"

## 1. Getting Data

### Airbnb
We can download and unzip listings.csv.gz from [Inside Airbnb](https://insideairbnb.com/get-the-data), which includes over 80 fields of information. We won't be interested in all of the fields right now, but there is a lot to explore.

In [6]:
# we load the data into a pandas data frame
list_df = pd.read_csv(input_dir+"/listings.csv")
#list_df.columns #print the list of columns if you want to see your options
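As an alternative to unzipping by hand, pandas can read a gzip-compressed CSV directly. The sketch below builds a tiny made-up stand-in for listings.csv.gz (the rows are invented, only the column names come from this notebook) and reads it back without an unzip step.

```python
import gzip
import os
import tempfile

import pandas as pd

# Build a tiny stand-in for listings.csv.gz so the example is self-contained.
# The rows are made up; only the column names mirror this notebook.
tmp_dir = tempfile.mkdtemp()
gz_path = os.path.join(tmp_dir, "listings.csv.gz")
with gzip.open(gz_path, "wt", encoding="utf-8") as f:
    f.write("id,region_name,room_type\n")
    f.write("1,Galway City,Entire home/apt\n")
    f.write("2,Dublin City,Private room\n")

# pandas infers gzip compression from the ".gz" extension,
# so no separate unzip step is needed.
demo_df = pd.read_csv(gz_path)
print(demo_df.shape)
```

With the real download, the same call would simply point at the .gz file in the inputs folder.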

Now we filter our data frame (which covers all of Ireland) to focus on Galway City listings.

In [7]:
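# na=False treats rows with a missing region_name as non-matches instead of raising an error
galway_city_df = list_df[list_df['region_name'].str.contains("Galway", na=False)]  # 1119 listings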

In [9]:
set(galway_city_df["room_type"])

{'Entire home/apt', 'Hotel room', 'Private room', 'Shared room'}

Here we select only listings where the room type is 'Entire home/apt' (for now at least).

In [14]:
gal_df = galway_city_df[galway_city_df["room_type"].str.contains("Entire")] #731 listings

Next we drop the columns we aren't interested in this time around, to keep the table easier to read.

In [15]:
desired_columns = ['id', 'listing_url', 'scrape_id', 'last_searched', 'last_scraped',
       'source', 'name', 'description', 'host_id', 'host_url', 'host_name', 'host_since', 'host_location',
       'host_neighbourhood', 'host_listings_count','host_total_listings_count', 'neighbourhood',
       'latitude', 'longitude', 'property_type', 'room_type', 'accommodates',
       'bathrooms', 'bedrooms', 'beds', 'price', 'estimated_occupancy_l365d',
       'estimated_revenue_l365d','calculated_host_listings_count',
       'calculated_host_listings_count_entire_homes',
       'calculated_host_listings_count_private_rooms',
       'calculated_host_listings_count_shared_rooms', 'region_id',
       'region_name', 'region_parent_id', 'region_parent_name',
       'region_parent_parent_id', 'region_parent_parent_name'
       ]
gal_df = gal_df.filter(desired_columns, axis=1)

Now we have our final Airbnb dataset, filtered to entire home/apartment listings in Galway City. We can save it as a CSV or Excel file for easier viewing.

In [17]:
#gal_df.to_csv(output_dir+"/galway_city_270925.csv")
gal_df.to_excel(output_dir+'/galway_city_270925.xlsx')

### Booking.com
On the one hand, it's nice that we can use Scrapy to crawl booking.com for listings, but on the other hand, it means we need to do a bit more work to get the information.

In [None]:
# will include information about the structure of the Scrapy spider and how to run it

## 2. Geospatial Data
Now we want to evaluate whether a listing has corresponding planning permission. To do this, we first download the PACE_Planning_Sites_With_Info shapefile from the [City Council Planning Map on ArcGIS](https://experience.arcgis.com/experience/4878ca4a845945db8b3c1af302acbebf) and put it in our "inputs" folder. Then we convert our table of listings into a point shapefile. We may need to filter the planning permissions a bit, but after that we can work on determining whether a planning permission corresponds to a listing coordinate.
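The matching step described above is not implemented yet in this notebook. One possible approach, sketched here on toy data, is a spatial join between buffered listing points and planning-site polygons. The `ref` column, the coordinates, and the 150 m radius are all hypothetical placeholders, not values taken from the real planning layer.

```python
import geopandas as gp
from shapely.geometry import Point, Polygon

# Toy stand-ins: two listing points (lon/lat) and one planning-site polygon.
listings = gp.GeoDataFrame(
    {"id": [1, 2]},
    geometry=[Point(-9.0500, 53.2720), Point(-9.1000, 53.3000)],
    crs="EPSG:4326",
)
sites = gp.GeoDataFrame(
    {"ref": ["demo-ref"]},  # hypothetical permission reference column
    geometry=[Polygon([(-9.051, 53.2715), (-9.049, 53.2715),
                       (-9.049, 53.2725), (-9.051, 53.2725)])],
    crs="EPSG:4326",
)

# Work in a projected CRS so the buffer distance is in metres
# (EPSG:2157 is Irish Transverse Mercator).
listings_m = listings.to_crs("EPSG:2157")
sites_m = sites.to_crs("EPSG:2157")

# Hypothetical search radius to allow for imprecise listing coordinates.
radius_m = 150
buffered = listings_m.copy()
buffered["geometry"] = listings_m.geometry.buffer(radius_m)

# The left join keeps every listing; "ref" is missing where no site is within the radius.
matched = gp.sjoin(buffered, sites_m, how="left", predicate="intersects")
no_match = matched[matched["ref"].isna()]
print(no_match["id"].tolist())
```

A listing with no nearby permission is only a candidate for closer inspection, since the radius is a guess and a listing can intersect several sites.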

It may also be helpful to view the files on free GIS software like [QGIS](https://qgis.org/download/).

### Expedia?

### Listing Table to Shapefile

In [19]:
gal_gdf = gp.GeoDataFrame(
    gal_df, geometry=gp.points_from_xy(gal_df.longitude, gal_df.latitude, crs="EPSG:4326"))

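Shapefile field names are limited to 10 characters, so we shorten the longer column names ourselves rather than leaving them to be truncated on export.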

In [20]:
mapper = {
    'listing_url':'list_url', 
    'last_searched':"srch_date", 
    'last_scraped':"scrpe_date",
    'description':"descrpt", 
    'host_location':"host_loc",
    'host_neighbourhood':"host_nbhd", 
    'host_listings_count':"hst_lcount",
    'host_total_listings_count':"hst_t_lcnt", 
    'neighbourhood':"nbhd",
    'property_type':"prop_type", 
    'accommodates':"max_guests",
    'estimated_occupancy_l365d':"est_occ_yr",
    'estimated_revenue_l365d': "est_rev_yr",
    'calculated_host_listings_count':"htlc",
    'calculated_host_listings_count_entire_homes':"htlc_eh",
    'calculated_host_listings_count_private_rooms':"htlc_pr",
    'calculated_host_listings_count_shared_rooms':"htlc_sr",
    'region_name':"reg_name", 
    'region_parent_id':"reg_pid", 
    'region_parent_name':"reg_pname",
    'region_parent_parent_id':"reg_ppid", 
    'region_parent_parent_name':"reg_ppname"
}
gal_gdf.rename(mapper, axis=1, inplace=True)
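As a quick sanity check on the renaming, the helper below flags names that are longer than 10 characters or that would collide once cut to 10 characters. The function name is hypothetical, and plain truncation is only an assumption about how a writer might shorten long names; the exact behaviour varies between tools.

```python
from collections import Counter


def check_shapefile_field_names(names, max_len=10):
    """Return (too_long, clashing_prefixes) for a list of candidate field names."""
    too_long = [n for n in names if len(n) > max_len]
    # Compare names after simple truncation (an assumption, see the text above).
    prefixes = Counter(n[:max_len] for n in names)
    clashing_prefixes = [p for p, count in prefixes.items() if count > 1]
    return too_long, clashing_prefixes


# A few original and shortened names taken from the mapper above.
original = [
    "host_listings_count",
    "host_total_listings_count",
    "calculated_host_listings_count",
    "calculated_host_listings_count_entire_homes",
]
shortened = ["hst_lcount", "hst_t_lcnt", "htlc", "htlc_eh"]

print(check_shapefile_field_names(original))
print(check_shapefile_field_names(shortened))
```

Running the check on the full list of renamed columns before calling `to_file` would catch a name that was missed in the mapper.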

In [22]:
os.makedirs(output_dir+"/shapefiles", exist_ok=True)  # make sure the target folder exists
gal_gdf.to_file(output_dir+"/shapefiles/galway_city_270925.shp")


### Galway City Council Planning Permission Map

We gather our data from the [ArcGIS Experience Map Site](https://experience.arcgis.com/experience/4878ca4a845945db8b3c1af302acbebf), downloading the Shapefile of Planning Applications (Last 10 Years).

In [23]:
gcc_planmap_orig_addr = input_dir+"/PACE_Planning_Sites_With_Info_-8595002616335958008"
gp_og = gp.read_file(gcc_planmap_orig_addr)

contains polygon(s) with rings with invalid winding order
shapefile should be corrected using ogr2ogr

In [25]:
len(gp_og)

21841

In [26]:
# Case-sensitive search of the development descriptions; na=False skips missing values
pp_stl = gp_og[gp_og["Developm00"].str.contains("short term let", na=False)]
# Other spellings tried, each returning 0 rows: "short-term", "Short Term", "Short-term", "STL"
pp_cou = gp_og[gp_og["Developm00"].str.contains("change of use", na=False)]  # many are irrelevant
len(pp_stl), len(pp_cou)

(20, 1670)

So, this is far fewer than the 81 mentioned [in this article](https://catuireland.org/airbnb/2025/04/30/how-to-report-illegal-short-term-lets/). Was that number about Galway County? Where did we get it? *Can we have the permission reference numbers?* I may need to update the way I search for these planning permissions.

When searching the [Galway City Planning site](https://www.eplanning.ie/GalwayCity/searchresults) for "short term let", there are only 22 applications, some refused or invalid.
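One way to fold the spelling variants into a single search is a case-insensitive regular expression. The sketch below runs on made-up descriptions standing in for the "Developm00" column. Since the individual variant searches each returned no rows, this may not change the count for the real layer, but it removes the need to try each spelling by hand.

```python
import pandas as pd

# Made-up development descriptions standing in for the "Developm00" column.
descriptions = pd.Series([
    "Change of use to short term let for a period not exceeding 90 days",
    "Permission for Short-Term Letting of existing apartment",
    "Retention of use as short term lettings",
    "Construction of a two storey extension",
    None,
])

# Optional space or hyphen between "short" and "term", then a word starting with "let".
stl_pattern = r"short[\s-]?term\s+let"
is_stl = descriptions.str.contains(stl_pattern, case=False, regex=True, na=False)
print(descriptions[is_stl].tolist())
```

The same mask could be applied to `gp_og["Developm00"]` in place of the literal "short term let" search.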

In [None]:
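from collections import Counter

# Collect the text from the last " to " onward in each change-of-use description,
# i.e. the use the property is being changed to
after_cou = []
for description in pp_cou["Developm00"]:
    idx = description.rfind(" to ")
    if idx != -1:  # skip descriptions that contain no " to "
        after_cou.append(description[idx:].strip())
options_c = Counter(after_cou)
options_c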

In [None]:
### VALID we tolerate for now
#'to short term let for a period not exceeding 90 days per calendar year'
### RED FLAG if nearby listings are "Exact location provided after booking"
#'bedsit'
#'granny flat'
#'apartment'
#'guesthouse'
#'guest house'
#'self-contained apartment'
### YES we're looking at 
#'residential apartment'
#'student accommodation'
#'living accommodation'
#'guest bedroom'
#'guest room'
#'bedroom'
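The three groups above could be turned into a rough classifier for change-of-use descriptions. This is a minimal sketch: the keywords are copied from the notes, but the bucket labels, the substring matching, and the order in which buckets are checked are assumptions, and the example descriptions are made up.

```python
from collections import Counter

# Checked in order, so "residential apartment" is caught before the broader "apartment".
BUCKETS = [
    ("valid", ["short term let"]),
    ("looking_at", ["residential apartment", "student accommodation",
                    "living accommodation", "guest bedroom", "guest room", "bedroom"]),
    ("red_flag", ["bedsit", "granny flat", "self-contained apartment",
                  "apartment", "guesthouse", "guest house"]),
]


def target_use(description):
    """Return the lower-cased text after the last ' to ', or '' if there is none."""
    lowered = description.lower()
    idx = lowered.rfind(" to ")
    return lowered[idx + 4:].strip() if idx != -1 else ""


def classify(description):
    """Assign a description to the first bucket whose keyword appears in its target use."""
    use = target_use(description)
    for label, keywords in BUCKETS:
        if any(keyword in use for keyword in keywords):
            return label
    return "unclassified"


# Made-up examples in the style of the planning descriptions.
examples = [
    "Change of use from office to short term let for a period not exceeding 90 days per calendar year",
    "Change of use from retail unit to residential apartment",
    "Change of use of garage to granny flat",
    "Change of use from shop to cafe",
]
print(Counter(classify(d) for d in examples))
```

Substring matching is crude (a phrase like "two bedroom apartment" would land in "looking_at" because of "bedroom"), so the output is better treated as a first pass to review by hand.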

Next, examine both this layer and the county council layers in QGIS.