Airbnb JerseyCity case study! Optimize the revenue by its active listings.
In this case study, I will estimate the revenue potential of Airbnb in jersey city and check if I can increase its revenue potential. In order to answer the key business questions, I will follow the steps of the data analysis process: Ask, Prepare, Process, Analyze, Share, and Act.
Data Source: Inside Airbnb [accessed on 10/05/23]
SQL Queries:
Data Visualizations: Tableau
Turn the data I have into actionable items in order for the company to optimize its revenue.
-
I needed to define the KPI(s) to use to measure revenue as I didn’t have access to Airbnb’s actual revenue information.
-
I decided to estimate the revenue potential by the total earnings of all active listings in the next month(30 days).
-
Namely, Revenue Potential = Price x (30 — availability_30).
-
I then needed to define what “active listings” mean in this analysis — it would mean that the listing’s last review was made no earlier than November 2022.
-
With that in mind, I performed the analysis looking for the main revenue contributors (by different variables), and whether Airbnb has exhausted their revenue potential.
I will use Airbnb Inside data to analyze and identify trends from Dec 2022 to Sept 2023 which can be downloaded from Inside Airbnb.
This is public data that can be used to explore detailed list of Listings data, neighbourhood data, review data and calender data. But note that data-privacy issues prohibit from using members personally identifiable information. This means that we won’t be able to connect purchases to credit card numbers.
There are 6 files and each file includes information for one month, such as the member id, host id, name, neighbourhood, latitude,longitude, room types, price, availability 30,60,90 and number of reviews. The corresponding column names are id, name, host_id, host_name, neighbourhood_group, latitude, longitude, room_type, price, minimum_nights, num_of_reviews, last_review, review_per_month,availability_365, license.
BigQuery is used to explore the data set.
Reason:
A worksheet can only have 1,048,576 rows in Microsoft Excel because of its inability to manage large amounts of data. Because the Airbnb JerseyCity dataset has more than million rows, it is essential to use a platform like BigQuery that supports huge volumes of data.
SQL Query: 1.Data Exploration
Before cleaning the data, I am familiarizing myself with the data to find the inconsistencies.
- The table below shows the all column names and their data types. The id column is our primary key.
2.The following table shows number of null values in each column.
Note that some columns have same number of missing values. This may be due to missing information in the same row.
- The following table shows number of duplicate ids in the table.
- There are 4 unique room types (room_type) in our data.
- There are 6 unique neighbourhood group types (neighbourhood_group) in our data.
- MAX and MIN price of the room types
SQL Query: Data Analysis
The data is stored appropriately and is now prepared for analysis. I queried multiple relevant tables for the analysis and visualized them in Tableau.
The analysis question is:
1.) How/where are tourists/Airbnb guests spending their money in JerseyCity?
2.) What makes a successful host in JerseyCity?
3.) Strategy suggestions for Airbnb in JerseyCity?
-
There are 579 unique Airbnb hosts in Jersey City.
-
There are 1,324 unique listings in Jersey City.
-
About (72%) of Airbnb hosts only have one listing, other hosts could have multiple listings (up to as many as 400)
- Number of listings and average price by neighbourhood : Some of the more rural/remote neighbourhoods have higher average prices, my guess is that the properties tend to be larger for these areas (room type/beds)
- Number of listing and average price by room type :
- Average price per bed by neighbourhood : This looks more like it, guests pay more $ for places downtown on average.
- Neighbourhoods with the biggest revenue potential : Measure "potential" - with availability 30, I am making the assumption here that more than 30 days (before 2022-12-01) is too far away for people to book Airbnb for a trip. -- Limitaion - availability being 0 could also mean that the host simply blocked the calendar (I am seleting only the ones who has been reviewed in the past 12 months assuming the rest are inactive)
- what about potential by neighbourhood and room type?
The top two are entire homes in Ward E and Ward F by far
- Also, it looks like no one really wants shared room, regardless of location
- Entire homes as a room type is the most in demand .
- let's analyse the reviews and ratings
- Use number_of_reviews as a metric for popularity
- From Airbnb - A Superhost is someone who goes above and beyond in their hosting duties and is a shining example of how a Host should be. You can easily identify one from the badge that appears on their listing and profile.
- I'd like to explore the "superhost" status a bit and see if it is associated with earning potential
- As well as the elements associated with being a superhost (ratings)?, do they earn more (price)? Where are they in the city?
- Number of superhosts/non-superhosts in JC
384 superhosts and 940 regular hosts
- SuperHost and RegularHost By neighbourhood
Ward E and Ward F are top 2 by number of superhosts.
- Looking into correlation between superhost status and price
155 vs 140, superhosts charge more.
- what about neighbourhoods?
In Ward E, regular hosts charge more than superhosts on average, something is not adding up
- Again, superhosts charge more but I suspect that the rankings of neighbourhoods has been affected by room type
- Pulling the same query, but price per bed this time
We can see that in some neighbourhoods superhosts aren't even charging as much as the regular ones (especially in the Ward E where properties are in demand and superhosts have high revenue potential
- Next, let's look at ratings for superhosts vs regular hosts - what are superhosts doing right?
20 records with col host_is_superhost blank, does affect our analysis so update it to regular host.
- relationship between instant book and revenue potential
this does not match Airbnb's claim that enabling instant book increases a host's earnings
- let's look at price and availability_30 separately.
average price is even higher without instant book, according to Airbnb hosts get double the reservations because of the convenience factor.
- is it true that with instant book enabled, hosts get more reservations?
- again we will use the availibility for the next 30 days to measure it
hosts with instant book enabled seems to be more available, which means they get less reservations.
Data Visualization: Tableau
-
About (72%) of Airbnb hosts only have one listing, other hosts could have multiple listings (up to as many as 400)
-
Some of the more rural/remote neighbourhoods have higher average prices, my guess is that the properties tend to be larger for these areas (room type/beds)
-
guests pay more $ for places downtown on average.
-
The top two are entire homes in Ward E and Ward F by far
- Also, it looks like no one really wants shared room, regardless of location
- Entire homes as a room type is the most in demand .
-
Ward E and Ward F are top 2 by number of superhosts and superhosts charge more.
-
Superhosts charge more but I suspect that the rankings of neighbourhoods has been affected by room type.
-
We can see that in some neighbourhoods superhosts aren't even charging as much as the regular ones (especially in the Ward E where properties are in demand and superhosts have high revenue potential.
-
Average price is even higher without instant book, according to Airbnb hosts get double the reservations because of the convenience factor.
-
hosts with instant book enabled seems to be more available, which means they get less reservations. Hosts are not charging premium for the convenience that comes with instant book.
-
Airbnb should notify the super hosts about the price in Ward E to help them price their properties right for more revenue.
-
Airbnb should notify hosts with instant book charging premium for the convenience for more reservations and more revenue.
Recommendations:
The top revenue contributors to Airbnb in Jersey City are entire homes from Ward E, Ward F, Ward C, Ward D by far, to maximize revenue, it is essential that Airbnb makes the most out of the potential of these listings. Potential areas to be optimized and explored are:
-
The prices of entire homes in top four neighbourhoods should reflect the large demand;
-
Help superhosts price their properties in a way that reflects their status;
- Help instantbookable listings price properly taking into consideration the convenience they are providing;
- Help highly rated listings price properly to reflect their high ratings and superior quality of services.
Limitations and assumptions:
-
This analysis assumes that hosts that have not been reviewed since November 2022 are inactive;
-
Revenue potential is only measured by the potential revenue coming in the next 30 days, with the assumption that most visitors book their stays less than 30 days in advance.