## 1. Import Libraries

In [26]:
from pathlib import Path
import numpy as np
import pandas as pd
import geopandas as gpd

## 2. Reading Cleaned Data 

The data is loaded with the read_csv_url() function written by Chris.

read_csv_url() does the following:

- Loads a CSV file from a remote URL into a pandas DataFrame
- Cleans column names using janitor
- If loading fails, prints a failure message
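
The real read_csv_url() lives in file_reader and is not shown in this notebook. As a rough illustration only, a function with the behaviour described above might look like the sketch below. The name read_csv_sketch, the preview_rows parameter, and the use of plain pandas string methods in place of janitor are assumptions, not the actual implementation.

```python
import pandas as pd


def read_csv_sketch(url, preview_rows=5):
    """Hypothetical stand-in for fr.read_csv_url(); not the real implementation."""
    try:
        # pd.read_csv accepts remote URLs as well as local paths
        df = pd.read_csv(url)
    except Exception as exc:
        # On failure, report the problem and return None instead of raising
        print(f"Failed to load from URL: {url} ({exc})")
        return None

    # Assumption: approximate janitor-style cleaning with plain pandas
    # (trimmed, lower-cased, non-alphanumeric runs replaced by underscores)
    df.columns = (
        df.columns.astype(str)
        .str.strip()
        .str.lower()
        .str.replace(r"[^0-9a-z]+", "_", regex=True)
        .str.strip("_")
    )

    print(f"Successfully loaded from URL: {url}")
    print(f"Preview of first {preview_rows} rows:")
    print(df.head(preview_rows))
    return df
```

Returning None on failure keeps the notebook running, at the cost of the caller having to check the result before using it.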

In [2]:
import file_reader as fr

In [4]:
url = "../01_Data/Cleaned/listings.csv"
df = fr.read_csv_url(url)

Successfully loaded from URL: ../01_Data/Cleaned/listings.csv
Preview of first 5 rows:
        scrape_id  host_id  host_listings_count  latitude  longitude  \
0  20250610032232  1389063                 11.0  51.44306   -0.01948   
1  20250610032232  1389063                 11.0  51.44284   -0.01997   
2  20250610032232  1389063                 11.0  51.44359   -0.02275   
3  20250610032232  1389063                 11.0  51.44355   -0.02309   
4  20250610032232  1389063                 11.0  51.44333   -0.02307   

        property_type        room_type  accommodates  bedrooms  price  \
0  Entire rental unit  Entire home/apt            10       4.0  297.0   
1  Entire rental unit  Entire home/apt             2       1.0   98.0   
2  Entire rental unit  Entire home/apt             4       2.0  148.0   
3  Entire rental unit  Entire home/apt             5       2.0  144.0   
4  Entire rental unit  Entire home/apt             4       2.0  157.0   

   estimated_occupancy_l365d  
0         

In [5]:
df.head(5)

Unnamed: 0,scrape_id,host_id,host_listings_count,latitude,longitude,property_type,room_type,accommodates,bedrooms,price,estimated_occupancy_l365d
0,20250610032232,1389063,11.0,51.44306,-0.01948,Entire rental unit,Entire home/apt,10,4.0,297.0,110
1,20250610032232,1389063,11.0,51.44284,-0.01997,Entire rental unit,Entire home/apt,2,1.0,98.0,37
2,20250610032232,1389063,11.0,51.44359,-0.02275,Entire rental unit,Entire home/apt,4,2.0,148.0,55
3,20250610032232,1389063,11.0,51.44355,-0.02309,Entire rental unit,Entire home/apt,5,2.0,144.0,64
4,20250610032232,1389063,11.0,51.44333,-0.02307,Entire rental unit,Entire home/apt,4,2.0,157.0,37


In [6]:
print(f"The data is {df.shape[0]} rows * {df.shape[1]} columns")

The data is 96651 rows * 11 columns


## 3. QUESTION 3

### How many properties would be affected by the opposition’s proposal?

Assumptions: 
- For the code analysis, we treat the **properties** affected by the opposition's proposal as those belonging to professional landlords.

- However, we will complement this analysis (perhaps in another question that covers the affected **public** and not only **properties**) with two possible paths:

    + if average Airbnb prices are already similar to average hotel prices in the same location (*choose unit: borough/MSOA*):
 
        - Airbnb prices may stay the same, with only the landlords' profit reduced.
            - only landlords will be affected.
 
        - another possible scenario: landlords who were on Airbnb might switch to long-term rentals to avoid the profit reduction
            - more long-term rentals + fixed demand = lower prices >>> local residents will be affected   
          
    + if average Airbnb prices are lower than average hotel prices in the same location (*choose unit: borough/MSOA*):
 
        - a tax increase will probably result in higher Airbnb prices.
            - travellers will be affected    

### 3.1. Counting professional hosts in airbnb listings

A professional landlord is a host who: 
- has two or more listings, or
- has rented out their property for more than X days per year.

In [11]:
# Create a copy of the DataFrame (optional, for safety)
df = df.copy() 

In [12]:
# PARAMETERS
# In the UK, your tax residency status will depend on a statutory residence test. 
# You'll usually be regarded as a UK resident if: 
# 1) You spend more than 183 days in the UK within a tax year. 
# 2) Your only home was in the UK for 91 days or more, and you stayed in this home for more than 30 days.
# The 183-day figure is reused here as the threshold X (days rented per year) for a professional landlord.
pro_rented_days = 183 

In [13]:
# Flag each listing whose host qualifies as a professional landlord (either criterion is enough)
df['is_professional'] = (
    (df['host_listings_count'] >= 2) |
    (df['estimated_occupancy_l365d'] > pro_rented_days)
)

In [22]:
# Count unique professional landlords
n_professional_hosts = df.loc[df['is_professional'], 'host_id'].nunique()

# Count total unique hosts
total_hosts = df['host_id'].nunique()

# Calculate percentage
pro_host_percentage = (n_professional_hosts / total_hosts) * 100

print(f'There are {n_professional_hosts:,.0f} professional hosts in London, they represent {pro_host_percentage:.2f}% of all hosts in the city')

There are 16,779 professional hosts in London, they represent 30.07% of all hosts in the city
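
The count above relies on host_listings_count as supplied in the data, which may not equal the number of rows each host has in this file (for example, if some of a host's listings were dropped during cleaning). As a hedged cross-check, the sketch below counts listings per host_id directly. The toy DataFrame and the column name listings_in_file are made up for illustration.

```python
import pandas as pd

# Toy data, not real listings: host 1 has two rows, hosts 2 and 3 one each
toy = pd.DataFrame({
    "host_id": [1, 1, 2, 3],
    "host_listings_count": [2.0, 2.0, 5.0, 1.0],
})

# Number of rows per host actually present in the file
toy["listings_in_file"] = toy.groupby("host_id")["host_id"].transform("size")

# Hosts whose reported count differs from what the file contains
mismatch = toy.loc[
    toy["host_listings_count"] != toy["listings_in_file"], "host_id"
].unique()
print(mismatch)
```

If many hosts show a mismatch on the real data, it would be worth stating which of the two counts the "two or more listings" criterion is meant to use.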


### 3.2. Counting the properties connected to professional hosts in airbnb listings

In [23]:
df.head(5)

Unnamed: 0,scrape_id,host_id,host_listings_count,latitude,longitude,property_type,room_type,accommodates,bedrooms,price,estimated_occupancy_l365d,is_professional
0,20250610032232,1389063,11.0,51.44306,-0.01948,Entire rental unit,Entire home/apt,10,4.0,297.0,110,True
1,20250610032232,1389063,11.0,51.44284,-0.01997,Entire rental unit,Entire home/apt,2,1.0,98.0,37,True
2,20250610032232,1389063,11.0,51.44359,-0.02275,Entire rental unit,Entire home/apt,4,2.0,148.0,55,True
3,20250610032232,1389063,11.0,51.44355,-0.02309,Entire rental unit,Entire home/apt,5,2.0,144.0,64,True
4,20250610032232,1389063,11.0,51.44333,-0.02307,Entire rental unit,Entire home/apt,4,2.0,157.0,37,True


In [25]:
# Count properties where the host is professional
n_properties_pro_hosts = df.loc[df['is_professional'], :].shape[0]

# Count total properties (each row is one listing)
total_properties = df.shape[0]

# Calculate percentage
percentage_pro_properties = (n_properties_pro_hosts / total_properties) * 100

print(f'There are {n_properties_pro_hosts:,.0f} properties owned by professional hosts in London, they represent {percentage_pro_properties:.2f}% of all properties in the city.')

There are 57,619 properties owned by professional hosts in London, they represent 59.62% of all properties in the city.


### 3.3. Assumptions and conclusions 

To analyse the impact of the opposition’s proposal, we first need a working definition of a professional landlord. Two parameters were considered: the number of properties a host owns and the number of days a property was available for short-term rent.

First, all landlords with two or more properties were counted. The analysis shows that almost 29% of all hosts in London list more than one property on the Airbnb platform. This contrasts with one of the arguments used to justify Airbnb’s lower tax burden, namely that short-term rentals let local residents earn extra income by renting out their homes while they are travelling. Yet just under one third of Airbnb hosts in London list properties that are unlikely to be their primary residence.

To validate the first parameter, a second criterion was established: a property available on Airbnb for 180 days or more in a given year was considered highly likely to be managed by a professional host. The analysis found that nearly 37% of all hosts in London have properties available for 180 days or more per year.

A professional host was therefore defined as one who meets both criteria, and the properties affected by the opposition’s proposal are those associated with such hosts. Slightly more than 15% of all Airbnb hosts in London fall into this category. Under this scenario, the opposition’s proposal would affect 25,512 properties, representing 26.40% of all properties listed on Airbnb in London.
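
The code in section 3.1 combines the two criteria with OR and a 183-day threshold, whereas the text above describes hosts who meet both criteria, with a 180-day threshold. The sketch below shows one way to compute the two definitions side by side so that each reported figure can be traced to a single definition. The function name, its parameters, and the toy data are assumptions for illustration; the values it prints are not results from the London data.

```python
import pandas as pd


def professional_share(df, min_listings=2, min_days=183, require_both=False):
    """Return (n_properties, percentage) flagged as professional."""
    many_listings = df["host_listings_count"] >= min_listings
    # Strictly greater than, matching the comparison used in section 3.1
    many_days = df["estimated_occupancy_l365d"] > min_days
    # AND when both criteria must hold, OR when either is enough
    flag = (many_listings & many_days) if require_both else (many_listings | many_days)
    n = int(flag.sum())
    return n, 100 * n / len(df)


# Toy data, not real listings
toy = pd.DataFrame({
    "host_id": [1, 1, 2, 3],
    "host_listings_count": [2.0, 2.0, 1.0, 1.0],
    "estimated_occupancy_l365d": [200, 50, 250, 30],
})

print(professional_share(toy))                     # either criterion
print(professional_share(toy, require_both=True))  # both criteria
```

Calling the same function on the real df with require_both set to each value, and with min_days set to 180 or 183, would show which combination reproduces the figures quoted in the text.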

## 4. EXTRA

1. Read the London borough shapefile
2. Read hotel data from OpenStreetMap (check whether the data includes prices)
3. Prepare Airbnb data (set CRS, create spatial points) to plot professional properties on a London map
4. Spatial join hotels to boroughs
5. Spatial join Airbnb listings to boroughs
6. Calculate average prices per borough
7. Create a column that indicates the difference between prices
8. Plot this column on the London borough map
9. Write conclusions
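
Steps 6 and 7 can be sketched with plain pandas once both datasets carry a borough label. This assumes the spatial joins in steps 4 and 5 have already been done and that the hotel data includes a price column, which step 2 still has to confirm. The borough assignments, prices, and column names below are invented for illustration.

```python
import pandas as pd

# Toy inputs standing in for the spatially joined tables (invented values)
airbnb = pd.DataFrame({
    "borough": ["Camden", "Camden", "Lewisham"],
    "price": [150.0, 250.0, 100.0],
})
hotels = pd.DataFrame({
    "borough": ["Camden", "Lewisham", "Lewisham"],
    "price": [220.0, 90.0, 110.0],
})

# Step 6: average price per borough for each source
avg = pd.concat(
    {
        "airbnb_mean": airbnb.groupby("borough")["price"].mean(),
        "hotel_mean": hotels.groupby("borough")["price"].mean(),
    },
    axis=1,
)

# Step 7: positive values mean Airbnb is cheaper than hotels on average
avg["hotel_minus_airbnb"] = avg["hotel_mean"] - avg["airbnb_mean"]
print(avg)
```

The resulting table is indexed by borough, so it could then be merged onto the borough geometries for the map in step 8.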