# Spatial Analysis of Airbnb and Housing Market Dynamics in the UK

The focus here is a GIS task: demonstrating the spatial relationship between Airbnb listings and other factors such as housing prices and the short-let market. Using the datasets shared here, supplemented by your own research, answer a series of research questions using spatial data science principles.




This codebook outlines the variables, data sources, and spatial data science methods used to analyse Airbnb listings and their impact on housing prices and the short-let market in the UK. The analysis is required in Python.

This dataset can be used for:
- Analyzing Airbnb pricing trends and availability.
- Examining property sales trends and rental markets.
- Exploring socioeconomic factors in relation to real estate.
- Mapping and analyzing geographical boundaries.

## Variables

### Airbnb Data
- **listing_id**: Unique identifier for Airbnb listing.
- **latitude**: Latitude coordinate of listing/property/area.
- **longitude**: Longitude coordinate of listing/property/area.
- **price_per_night**: Price per night for Airbnb listing.
- **availability_365**: Number of days listing is available in a year.
- **number_of_reviews**: Total number of reviews for Airbnb listing.
- **neighbourhood**: Name of the neighborhood of Airbnb listing.

### Property Data
- **postcode**: Postal code of property/area (Source: Housing Prices, Short-let, ONS Data).
- **price**: Sale price of property (Source: UK House Price Index - UKHPI).
- **date_of_sale**: Date when the property was sold (Source: UK House Price Index - UKHPI).
- **property_type**: Type of property (detached, semi-detached, etc.) (Source: UK House Price Index - UKHPI).
- **rental_price**: Monthly rental price of short-let property (Source: Short-let Market Data).
- **availability_status**: Current availability status of short-let property (Source: Short-let Market Data).

### Socioeconomic Data
- **median_income**: Median household income for area (Source: ONS Data).
- **population_density**: Population density of area (Source: ONS Data).
- **unemployment_rate**: Unemployment rate for area (Source: ONS Data).

### Geographical Data
- **boundary_polygon**: Polygon representing geographical boundaries (Source: Ordnance Survey Data).

## Sources

- **Airbnb Data**: Data related to Airbnb listings, including pricing and reviews.
- **UK House Price Index (UKHPI)**: Provides information on property sales prices and property types.
- **Short-let Market Data**: Contains data on short-let rental prices and availability.
- **ONS Data**: Includes socioeconomic indicators such as median income, population density, and unemployment rate.
- **Ordnance Survey Data**: Provides geographical boundaries for the areas in question.
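
The variables and sources above suggest a common first step: turning listing coordinates into point geometries and attaching each listing to a boundary polygon. The following is only a minimal sketch of that idea on made-up data. The `area_name` column, the two rectangular polygons, and the choice of EPSG:4326 are assumptions for illustration, not properties of the actual Ordnance Survey files.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import Polygon

# Toy listings using the codebook's column names (all values are made up).
listings = pd.DataFrame({
    "listing_id": [1, 2, 3],
    "latitude": [51.50, 51.52, 51.70],
    "longitude": [-0.12, -0.08, 0.30],
    "price_per_night": [120.0, 95.0, 60.0],
})

# Two hypothetical rectangles standing in for boundary_polygon.
areas = gpd.GeoDataFrame(
    {"area_name": ["west", "east"]},
    geometry=[
        Polygon([(-0.20, 51.45), (-0.10, 51.45), (-0.10, 51.55), (-0.20, 51.55)]),
        Polygon([(-0.10, 51.45), (0.00, 51.45), (0.00, 51.55), (-0.10, 51.55)]),
    ],
    crs="EPSG:4326",
)

# Build point geometries from longitude (x) and latitude (y).
points = gpd.GeoDataFrame(
    listings,
    geometry=gpd.points_from_xy(listings["longitude"], listings["latitude"]),
    crs="EPSG:4326",
)

# Left join keeps listings that fall outside every polygon (area_name is NaN).
joined = gpd.sjoin(points, areas, how="left", predicate="within")

# Average nightly price per area; unmatched listings are dropped by groupby.
mean_price_by_area = joined.groupby("area_name")["price_per_night"].mean()
```

With real data, both layers would need to share a coordinate reference system before the join, for example by calling `to_crs` on one of them.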


In [2]:


# data analysis
import pandas as pd
# import scipy
# import statsmodels


# spatial analysis
import geopandas as gpd
# import osmnx
# import geopy
# import pysal

# data visualization
import seaborn as sns
import matplotlib.pyplot as plt
# import contextily
# import folium


In [5]:
# paths to all input data
airbnb_path = "data/Airbnb listings.csv"
boundary_path = "data/house_pricing"
scraped_market = "data/shortlet_market_data.csv"
demo_data = ""  # placeholder: path not yet set


## Exploratory Data Analysis (EDA)

### Airbnb Data

In [7]:
airbnb_list = pd.read_csv(airbnb_path)
airbnb_list.head()

Unnamed: 0,id,listing_url,scrape_id,last_scraped,source,name,description,neighborhood_overview,picture_url,host_id,...,review_scores_communication,review_scores_location,review_scores_value,license,instant_bookable,calculated_host_listings_count,calculated_host_listings_count_entire_homes,calculated_host_listings_count_private_rooms,calculated_host_listings_count_shared_rooms,reviews_per_month
0,312761,https://www.airbnb.com/rooms/312761,20240319050633,2024-03-21,city scrape,Spacious Central London Apartment by Hoxton Sq...,"Very central location, in the middle of Shored...",Everything is so convenient and the area is al...,https://a0.muscache.com/pictures/miso/Hosting-...,1608226,...,5.0,4.89,4.93,,f,1,1,0,0,0.86
1,13913,https://www.airbnb.com/rooms/13913,20240319050633,2024-03-20,city scrape,Holiday London DB Room Let-on going,My bright double bedroom with a large window h...,Finsbury Park is a friendly melting pot commun...,https://a0.muscache.com/pictures/miso/Hosting-...,54730,...,4.83,4.7,4.7,,f,3,2,1,0,0.25
2,15400,https://www.airbnb.com/rooms/15400,20240319050633,2024-03-20,city scrape,Bright Chelsea Apartment. Chelsea!,Lots of windows and light. St Luke's Gardens ...,It is Chelsea.,https://a0.muscache.com/pictures/428392/462d26...,60302,...,4.83,4.93,4.74,,f,1,1,0,0,0.54
3,159736,https://www.airbnb.com/rooms/159736,20240319050633,2024-03-20,city scrape,A double Room 5mins from King's College Hospital,Calm sunny double room with a queen size bed a...,We love that in Loughborough Junction we live ...,https://a0.muscache.com/pictures/1067303/d2300...,766056,...,4.89,4.34,4.66,,f,4,0,4,0,0.62
4,165336,https://www.airbnb.com/rooms/165336,20240319050633,2024-03-21,city scrape,Charming Flat in Notting Hill,A stylish apartment close to Portobello market...,"Notting Hill has many cafes, bars and restaura...",https://a0.muscache.com/pictures/60757460/47f8...,761400,...,4.96,4.87,4.71,,f,1,1,0,0,1.57


In [8]:
airbnb_list.columns

Index(['id', 'listing_url', 'scrape_id', 'last_scraped', 'source', 'name',
       'description', 'neighborhood_overview', 'picture_url', 'host_id',
       'host_url', 'host_name', 'host_since', 'host_location', 'host_about',
       'host_response_time', 'host_response_rate', 'host_acceptance_rate',
       'host_is_superhost', 'host_thumbnail_url', 'host_picture_url',
       'host_neighbourhood', 'host_listings_count',
       'host_total_listings_count', 'host_verifications',
       'host_has_profile_pic', 'host_identity_verified', 'neighbourhood',
       'neighbourhood_cleansed', 'neighbourhood_group_cleansed', 'latitude',
       'longitude', 'property_type', 'room_type', 'accommodates', 'bathrooms',
       'bathrooms_text', 'bedrooms', 'beds', 'amenities', 'price',
       'minimum_nights', 'maximum_nights', 'minimum_minimum_nights',
       'maximum_minimum_nights', 'minimum_maximum_nights',
       'maximum_maximum_nights', 'minimum_nights_avg_ntm',
       'maximum_nights_avg_ntm', 'ca

In [9]:
airbnb_list.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 90852 entries, 0 to 90851
Data columns (total 75 columns):
 #   Column                                        Non-Null Count  Dtype  
---  ------                                        --------------  -----  
 0   id                                            90852 non-null  int64  
 1   listing_url                                   90852 non-null  object 
 2   scrape_id                                     90852 non-null  int64  
 3   last_scraped                                  90852 non-null  object 
 4   source                                        90852 non-null  object 
 5   name                                          90852 non-null  object 
 6   description                                   87851 non-null  object 
 7   neighborhood_overview                         47521 non-null  object 
 8   picture_url                                   90842 non-null  object 
 9   host_id                                       90852 non-null 
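
The `info()` output above already shows incomplete columns: `description` has 87851 non-null values out of 90852 rows, and `neighborhood_overview` only 47521. A small helper along these lines could rank all columns by missingness. It is a sketch on a toy frame, and `missing_summary` is a hypothetical name rather than part of the original notebook.

```python
import pandas as pd


def missing_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Return per-column missing counts and shares, most incomplete first."""
    n_missing = df.isna().sum()
    summary = pd.DataFrame({
        "n_missing": n_missing,
        "share_missing": n_missing / len(df),
    })
    return summary.sort_values("share_missing", ascending=False)


# Toy frame with made-up values, reusing three of the real column names.
toy = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "description": ["a", None, "c", "d"],
    "neighborhood_overview": [None, None, "x", None],
})

summary = missing_summary(toy)
```

On the real data the same call would be `missing_summary(airbnb_list)`.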

Explore the data
Answer the following questions using the Airbnb data:

1. **Spatial Distribution:**
   - How does the spatial distribution of listings with high review scores compare to those with lower scores?
   - Are there specific areas where higher-priced properties tend to cluster?
   - How does the average price of listings vary by neighborhood or neighborhood group?
   - Are there significant differences in average review scores between different neighborhoods or neighborhood groups?

2. **Property Type Insights:**
   - What is the distribution of property types (e.g., entire home, private room) across different neighborhoods?
   - How does the average price differ between various property types?

3. **Room Type Comparisons:**
   - How does the average rating of listings vary between different room types (e.g., shared room, private room)?
   - Is there a correlation between room type and the number of amenities offered?

4. **Capacity and Pricing Analysis:**
   - How does the price of a listing correlate with the number of guests it accommodates (e.g., 2 vs. 6)?
   - What is the relationship between the number of bedrooms or beds and the price of listings?

5. **Amenities and Pricing:**
   - Which amenities are most commonly found in higher-priced listings?
   - How does the number of amenities offered impact the review scores or pricing?

6. **Bathroom and Bedroom Analysis:**
   - Is there a correlation between the number of bathrooms and the price of the listing?
   - How does the average number of bedrooms relate to the listing price?

7. **Review Scores Examination:**
   - How do review scores (accuracy and overall rating) vary by property type or room type?
   - Is there a relationship between review scores and the number of amenities offered?

8. **Trend Analysis:**
   - Are there any observable trends in pricing or review scores over time within different neighborhoods or property types?
   - How do the review scores change with seasonal variations in pricing?

9. **Outlier Detection:**
   - Are there any outliers in pricing or review scores based on the number of bedrooms, bathrooms, or amenities?
   - What factors contribute to high or low review scores in specific neighborhoods?
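
Several of the questions above (average price by neighbourhood, outliers in pricing) need a numeric price column first. The sketch below assumes that `price` is stored as text such as `"$1,250.00"`; this has not been confirmed from the `info()` output shown here, so treat it as an assumption. The helper name `parse_price`, the toy values, and the 1.5 x IQR rule are illustrative choices, not the notebook's established method.

```python
import pandas as pd


def parse_price(price: pd.Series) -> pd.Series:
    """Convert strings such as '$1,250.00' to floats; unparseable values become NaN."""
    # Keep only digits and the decimal point, then let pandas coerce the rest.
    cleaned = price.astype(str).str.replace(r"[^0-9.]", "", regex=True)
    return pd.to_numeric(cleaned, errors="coerce")


# Toy data with made-up prices, reusing real column names.
toy = pd.DataFrame({
    "neighbourhood_cleansed": ["Hackney", "Hackney", "Camden", "Camden", "Camden"],
    "price": ["$100.00", "$200.00", "$1,250.00", "$150.00", None],
})
toy["price_num"] = parse_price(toy["price"])

# Price summary per neighbourhood, highest median first.
by_area = (
    toy.groupby("neighbourhood_cleansed")["price_num"]
    .agg(["count", "mean", "median"])
    .sort_values("median", ascending=False)
)

# Flag prices outside 1.5 * IQR of the overall distribution (NaN is never flagged).
q1, q3 = toy["price_num"].quantile([0.25, 0.75])
iqr = q3 - q1
toy["price_outlier"] = (
    (toy["price_num"] < q1 - 1.5 * iqr) | (toy["price_num"] > q3 + 1.5 * iqr)
)
```

The median is reported alongside the mean because a few very expensive listings can pull the mean far from the typical price.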

In [None]:
# columns relevant to the questions above (eda_columns is a working name)
eda_columns = [
    'neighbourhood', 'neighbourhood_cleansed', 'neighbourhood_group_cleansed', 'latitude',
    'longitude', 'property_type', 'room_type', 'accommodates', 'bathrooms',
    'bathrooms_text', 'bedrooms', 'beds', 'amenities', 'price',
    'review_scores_rating', 'review_scores_accuracy', 'reviews_per_month',
]
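
The amenities questions could be approached by counting amenities per listing and relating that count to room type and review scores. The sketch below assumes `amenities` holds a JSON-encoded list such as `'["Wifi", "Kitchen"]'`, which should be checked against the real file. `count_amenities` is a hypothetical helper and all values are made up.

```python
import json

import pandas as pd


def count_amenities(raw) -> int:
    """Count entries in a JSON-encoded list; return 0 when the value cannot be parsed."""
    try:
        parsed = json.loads(raw)
    except (TypeError, ValueError):
        return 0
    return len(parsed) if isinstance(parsed, list) else 0


# Toy data with made-up values, reusing real column names.
toy = pd.DataFrame({
    "room_type": ["Entire home/apt", "Private room", "Private room",
                  "Shared room", "Entire home/apt"],
    "amenities": ['["Wifi", "Kitchen"]', '["Wifi"]', "[]", None,
                  '["Wifi", "Kitchen", "Heating", "Washer"]'],
    "review_scores_rating": [4.8, 4.5, 4.0, 4.2, 4.9],
})
toy["n_amenities"] = toy["amenities"].apply(count_amenities)

# Average number of amenities per room type.
by_room = toy.groupby("room_type")["n_amenities"].mean()

# Spearman rank correlation, computed as the Pearson correlation of the ranks.
corr = toy["n_amenities"].rank().corr(toy["review_scores_rating"].rank())
```

A rank correlation is used here because review scores are bounded and often skewed. Note that the helper treats a missing value as zero amenities, which may not suit every analysis.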



### Housing Prices Data

### Short-let Market Data