# Analysis of the impact of the Covid Pandemic on the Airbnb Market in London

In this project, we'll assess how the Covid-19 pandemic has affected the Airbnb market in London. We'll aim to find out the impact on:
- Occupancy Rates
- Types of Accommodation
- Prices
- Which neighbourhoods are most affected

# Summary of Results

- The total number of active listings in November 2020 was about half that of November 2019, and the total occupancy rate fell by 75%.
- Entire Homes remained the majority of listings, but their share did not increase significantly.
- Prices for Hotel Rooms and Entire Homes (the majority of listings in popular tourist destinations) were the most affected. Prices for Private and Shared Rooms (the most popular options in Outer London) did not change significantly.
- Areas like the touristy West End and the trendy East End had the largest drops in occupancy rate.

# 1. Import Libraries

In [None]:
#Data Processing
import numpy as np
import pandas as pd
pd.set_option('display.max_columns', None)

#Visualization
import matplotlib as mpl
import matplotlib.pyplot as plt 
import matplotlib.dates as mdates
import seaborn as sns

#Set-up Visualization settings
%matplotlib inline
plt.style.use('fivethirtyeight')
mpl.rcParams['figure.figsize'] = (11, 4)
blue="#30a2da"
red = "#fc4f30"
yellow = "#e5ae38"
green = "#6d904f"
grey = "#8b8b8b"
five_thirty_eight = [blue, red, yellow, green, grey]
sns.set_palette(five_thirty_eight)


#Maps Visualization
import geopandas as gpd
import folium
from folium import Choropleth

# 2. Read and Explore the Data

The data is publicly available on the website [Inside Airbnb](http://insideairbnb.com/). It contains data from listings in many cities across Europe and America. The data is sourced from publicly available information from the Airbnb site, so there are no privacy issues.

We'll compare the data scraped in November 2019 (pre-pandemic) with the data scraped in November 2020.

In [None]:
#Read in the Data
listings_19 = pd.read_csv('../input/airbnb/listings_summary.csv', parse_dates=['last_review'])
listings_20 = pd.read_csv('../input/airbnb-2020-post-covid/listings_20.csv', parse_dates=['last_review'])
reviews = pd.read_csv('../input/airbnb-2020-post-covid/reviews_20.csv',parse_dates=['date'])

In [None]:
#Explore the 2019 data
listings_19

In [None]:
#Explore columns
listings_19.info()

In [None]:
#Explore the 2020 data
listings_20

In [None]:
#Explore columns
listings_20.info()

Observations:
- These 2 datasets are snapshots of the Airbnb website on a given day: the first on 05-11-19 (pre-Covid) and the second on 06-11-20 (post-Covid).
- No variable directly gives the occupancy rate of the listings. We'll use the reviews_per_month column as a proxy variable, on the reasonable assumption that the review rate is approximately proportional to the occupancy rate.
- There are missing values on the last_review and reviews_per_month columns in both datasets. There are also listings where the last review was more than a year before the data was taken. We'll get rid of the listings that have not received any reviews at all.

In [None]:
#Drop listings with no reviews
listings_19.dropna(subset=['last_review'], inplace=True)
listings_20.dropna(subset=['last_review'], inplace=True)

In [None]:
#Explore reviews
reviews

In [None]:
reviews.info()

Observations:
- The reviews dataset is quite large, containing more than 1 million reviews. This tells us how big the Airbnb market used to be in London.
- We've got data ranging from 2010 to 2020, so it will allow us to view the change in the review rate across a large time period. 
- We can join the reviews DataFrame with the listings DataFrames on the listing_id column to get interesting information about the occupancy rates for different types of accommodations.
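The join mentioned in the last observation can be sketched as follows. This is a minimal illustration on toy DataFrames with made-up values, assuming (as in the real data) that reviews carry a `listing_id` column and listings an `id` column; the names `toy_listings`, `toy_reviews`, `joined` and `reviews_by_type` are hypothetical.

```python
import pandas as pd

# Toy stand-ins for the real DataFrames (values are made up for illustration)
toy_listings = pd.DataFrame({
    'id': [1, 2, 3],
    'room_type': ['Entire home/apt', 'Private room', 'Private room'],
})
toy_reviews = pd.DataFrame({
    'listing_id': [1, 1, 2, 3, 3, 3],
    'date': pd.to_datetime(['2019-05-01', '2020-02-10', '2019-07-15',
                            '2019-08-01', '2019-09-03', '2020-01-20']),
})

# Attach each review to the attributes of the listing it belongs to
joined = toy_reviews.merge(toy_listings, left_on='listing_id', right_on='id', how='inner')

# Count reviews per accommodation type and calendar year
reviews_by_type = (joined
                   .groupby(['room_type', joined['date'].dt.year])
                   .size()
                   .unstack(fill_value=0))
print(reviews_by_type)
```

An inner join drops reviews whose listing is no longer present in the listings snapshot, which may or may not be desirable depending on the question being asked.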

# 3. Total Occupancy Rate Timeline

As previously mentioned, we'll use the review rate as a proxy variable for the occupancy rate. We'll assume that the total number of reviews is proportional to the total number of bookings.

In [None]:
#Group Total Reviews by Month and Year
reviews = reviews.set_index('date')
year_reviews = reviews.resample('Y').size()
month_reviews = reviews.resample('M').size()
year_mean = month_reviews.resample('Y').mean()
year_mean = year_mean.shift(6, freq='M') #centre it in the middle of each year

In [None]:
#Format percentages for plotting
def convert_to_pct(x):
    if x<0:
        return str(x)+'%'
    if x>0:
        return '+'+str(x)+'%'
    if x==0:
        return '0.0%'

#Calculate the total reviews year percentage change
per_change = round(year_reviews.pct_change().loc['2012':]*100,1).apply(convert_to_pct)

Below we plot the number of reviews per year and the percentage change from the previous year:

In [None]:
#Plot Total Reviews per Year Evolution

fig, ax = plt.subplots()

year_reviews.plot.bar(ax=ax)

ax.set_xticklabels(list(range(2009,2021)))
ax.get_yaxis().set_major_formatter(
    mpl.ticker.FuncFormatter(lambda x, p: format(int(x), ',')))
ax.set_ylim(-10000,410000)

for i, p in enumerate(ax.patches[3:]):
    width = p.get_width()
    height = p.get_height()
    x, y = p.get_xy() 
    ax.annotate(f'{per_change.iloc[i]}', (x + width/2, y + height+5000), ha='center') #.iloc: positional lookup on a date-indexed Series
    
ax.set_xlabel('')
ax.set_ylabel('Year Reviews')
ax.set_title('Total Reviews per Year Evolution');

We can see that before 2020, the occupancy rate of Airbnbs in London grew every year. It was very low in the early years, then grew very quickly from 2012 to 2015, more than doubling each year. Growth slowed from 2016 but remained strong.
The 70.5% decrease in 2020 completely breaks this trend.
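As a small worked example of how such a year-on-year figure is obtained, the sketch below applies `pct_change` to a toy series. The counts are hypothetical and are not the real figures from the reviews dataset.

```python
import pandas as pd

# Hypothetical yearly review counts, NOT the real figures from the dataset
toy_year_reviews = pd.Series([100, 250, 400, 120], index=[2017, 2018, 2019, 2020])

# pct_change() computes (current - previous) / previous for each year;
# the first year has no predecessor, so its change is NaN
toy_change = (toy_year_reviews.pct_change() * 100).round(1)
print(toy_change)
```

In this toy series the count more than doubles from 2017 to 2018 (+150%) and falls by 70% from 2019 to 2020.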

Let's take a look at the monthly variation:

In [None]:
#Plot Total Reviews per Month Evolution
fig, ax = plt.subplots()
month_reviews.plot(figsize=(15,10), marker='o', markersize=8, ax=ax, label='Monthly Reviews')
year_mean.plot(ax=ax, color='green', marker='o', markersize=12, label='Monthly Reviews (Year Mean)')
ax.axvline('20-March-2020', color='red', linestyle='--', label = 'Start of First England Lockdown', linewidth=3)
ax.set_xlim('Dec-2009','Feb-2021')
ax.set_xticks(pd.to_datetime(list(range(2010,2022)), format='%Y'))
ax.set_xticklabels(list(range(2010,2022)))
ax.set_xlabel('')
ax.get_yaxis().set_major_formatter(
    mpl.ticker.FuncFormatter(lambda x, p: format(int(x), ',')))
ax.set_ylabel('Monthly Reviews')
ax.set_title('Total Reviews per Month Evolution')
plt.legend();

Zooming in:

In [None]:
#Prepare x-axis dates
dates = pd.date_range(start='31-Jan-2016', end='31-Jan-2021', freq='3M')
dates = dates[dates.month!=1]
years = pd.to_datetime(list(range(2016,2022)), format='%Y') #Major ticks: start of each year

#Plot Total Reviews per Month Evolution
fig, ax = plt.subplots()
month_reviews_16_20 = month_reviews[month_reviews.index.year>=2016].copy()
month_reviews_16_20.plot(figsize=(15,10), marker='o', markersize=8, ax=ax, label='Monthly Reviews')
year_mean.plot(ax=ax, color='green', marker='o', markersize=12, label='Monthly Reviews (Year Mean)')
ax.axvline('20-March-2020', color='red', linestyle='--', label = 'Start of 1st England Lockdown', linewidth=3)
ax.axvline('1-June-2020', color='purple', linestyle='--', label = 'End of 1st England Lockdown', linewidth=3)
ax.axvline('12-Oct-2020', color='brown', linestyle='--', label = 'Tier System Announced', linewidth=3)

ax.set_xlim('Dec-2015','Feb-2021')
ax.set_xticks(years, minor=False)
ax.set_xticklabels(years.strftime('%Y'), rotation=90, minor=False, fontweight='bold')
ax.set_xticks(dates, minor=True)
ax.set_xticklabels(dates.strftime('%b'), rotation=90, minor=True)
ax.grid(which='minor')

for year in years:
    ax.axvline(year, linewidth=1, color='grey')

ax.get_yaxis().set_major_formatter(
    mpl.ticker.FuncFormatter(lambda x, p: format(int(x), ',')))    

ax.set_xlabel('')
ax.set_ylabel('Monthly Reviews')
ax.set_title('Total Reviews per Month Evolution')
plt.legend();

Before 2020, there was a clear upward trend from year to year, with a similar seasonal variation around each year's mean. The winter months had the lowest occupancy rate and the summer months the highest.

In happier pre-pandemic times August was always the busiest month for holidays, but we can see that September typically has a higher review rate than August. This is very likely due to guests taking a few weeks to leave a review. We can assume that there is a time offset between the review rate and the occupancy rate.
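One way to picture this offset is to relabel each month's review count with the month in which the stay is assumed to have happened. The sketch below does this on made-up numbers; the one-month lag (`LAG_MONTHS`) is purely an assumption for illustration, not something measured from the data.

```python
import pandas as pd

# Hypothetical monthly review counts (made-up numbers)
toy_month_reviews = pd.Series(
    [900, 1000, 1300, 1400],
    index=pd.period_range('2019-06', periods=4, freq='M'),
)

# Assumed lag between a stay and its review (an illustration, not an estimate)
LAG_MONTHS = 1

# Relabel each count with the month the stay is assumed to have taken place
toy_stays = toy_month_reviews.copy()
toy_stays.index = toy_stays.index - LAG_MONTHS

print(toy_month_reviews.idxmax())  # month with the most reviews
print(toy_stays.idxmax())          # month with the most estimated stays
```

With a one-month lag, a September peak in reviews corresponds to an August peak in stays, which matches the intuition described above.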

There is a huge drop in the review rate from March 2020 onwards. March typically has a higher review rate than February and, because of the review lag, includes reviews from stays in February. It is therefore very likely that the occupancy rate was already falling rapidly before the first lockdown was announced, as people cancelled holidays amid the uncertainty. The summer months are the only ones that keep 2020 from having an occupancy rate close to zero: once lockdown restrictions were partially eased, occupancy recovered, but not to levels comparable to recent years. When restrictions returned, the review rate dropped to almost zero again.

Let's compare 2020 with 2019:

In [None]:
#Prepare data for plotting
month_reviews_19 = month_reviews['2019'].values[:-1]
month_reviews_20 = month_reviews['2020'].values
index = pd.date_range(start='2020-01', freq='M', periods=11).month_name()
reviews_compare = pd.DataFrame(data={'2019':month_reviews_19, '2020':month_reviews_20}, index=index)
per_change = round(reviews_compare.apply(lambda row: (row.iloc[1]-row.iloc[0])/row.iloc[0]*100, axis=1),1)

#Plot Month Reviews Comparison 2019vs2020
fig, axes = plt.subplots(2, 1, figsize=(11, 10))

reviews_compare.plot.bar(ax=axes[0])
axes[0].set_ylabel('Monthly Reviews')
axes[0].get_yaxis().set_major_formatter(
    mpl.ticker.FuncFormatter(lambda x, p: format(int(x), ',')))
axes[0].set_title('Total Monthly Reviews\n (2019 vs 2020)')
axes[0].legend(loc=(1,0.8))
plt.tight_layout()

per_change.plot(ax=axes[1], color='black', linewidth=3, marker='o', markersize=8)
axes[1].set_xticks(np.arange(11))
axes[1].set_xticklabels(index, rotation=90)
axes[1].set_ylim(-105,105)
axes[1].axhline(0, linewidth=1, color='black')
axes[1].yaxis.set_major_formatter(mpl.ticker.PercentFormatter())
axes[1].set_title('Percentage Change in Total Monthly Reviews (2019 to 2020)')
plt.tight_layout()

The occupancy rate in 2020 was still growing in January and February with respect to 2019. Every month after that experienced a huge drop. April, May, June and November were the worst months. July, August, September and October have the highest occupancy rates in 2020 but still suffered very large drops in occupancy.

# 4. Impact on Individual Listing Occupancy Rates

So far, we've explored the impact on the total occupancy rate, looking at the timeline before and after the pandemic. Now we'll take a look at how the reviews were distributed across different types of accommodation.

The listings DataFrames have a column called last_review that contains the date each listing was last reviewed. This is a good approximation for when the listing was last occupied. The data for the listings_19 DataFrame was gathered on 5th-Nov-19 and the data for listings_20 on 6th-Nov-2020 (Before and after the pandemic).

Let's take a look at when listings were last reviewed: 

In [None]:
lry_19 = listings_19['last_review'].dt.year.value_counts() #Last-reviewed Year Distribution
lry_19.index = lry_19.index.astype(int) #Prepare for plotting
lry_19 = lry_19.sort_index()

lry_19_per = round(listings_19['last_review'].dt.year.value_counts(normalize=True),2) #Calculate percentage for each year
lry_19_per = lry_19_per.sort_index(ascending=True)

#Plot year of last review distribution on 05-11-19
fig, ax = plt.subplots()
lry_19.plot.bar(ax=ax) #Draw on the axes created above
ax.set_xlabel('\nYear of Last Review')
ax.set_ylabel('Number of Listings\n')
ax.set_ylim(-1000,61000)

ax.get_yaxis().set_major_formatter(
    mpl.ticker.FuncFormatter(lambda x, p: format(int(x), ',')))

ax.set_title('05-Nov-2019')

for i,p in enumerate(ax.patches):
    width = p.get_width()
    height = p.get_height()
    x, y = p.get_xy() 
    ax.annotate(f'{lry_19_per.iloc[i]:.1%}', (x + width/2, y + height+700), ha='center')

By early November 2019, when the data was scraped, 25% of listings had not received a review in that year. We'll consider them inactive and exclude them. Let's take a look at 2020:

In [None]:
lry_20 = listings_20['last_review'].dt.year.value_counts() #Last-reviewed Year Distribution
lry_20.index = lry_20.index.astype(int) #Prepare for plotting
lry_20 = lry_20.sort_index()

lry_20_per = round(listings_20['last_review'].dt.year.value_counts(normalize=True),2) #Calculate percentage for each year
lry_20_per = lry_20_per.sort_index(ascending=True)

#Plot year of last review distribution on 06-11-20
fig, ax = plt.subplots()
blue='#008FD5'
red = '#FF2700'
lry_20.plot.bar(ax=ax, color=red) #Draw on the axes created above
ax.set_xlabel('\nYear of Last Review')
ax.set_ylabel('Number of Listings\n')
ax.set_ylim(-1000,61000)

ax.get_yaxis().set_major_formatter(
    mpl.ticker.FuncFormatter(lambda x, p: format(int(x), ',')))

ax.set_title('06-Nov-2020')

for i,p in enumerate(ax.patches):
    width = p.get_width()
    height = p.get_height()
    x, y = p.get_xy() 
    ax.annotate(f'{lry_20_per.iloc[i]:.1%}', (x + width/2, y + height+700), ha='center')

We can see that by early November 2019, 75% of the listings on the website had been reviewed (and so, by our proxy, occupied) in that year. By the same point in 2020, this had dropped to just 47%.

Below we compare the number of active listings in the past year for both years:

In [None]:
list_act_19 = listings_19[listings_19['last_review'].dt.year==2019]
list_act_20 = listings_20[listings_20['last_review'].dt.year==2020]

num_act_19 = len(list_act_19)
num_act_20 = len(list_act_20)

df_act = pd.Series({'2019':num_act_19, '2020':num_act_20})
ax = df_act.plot.bar(color=[blue, red])
ax.get_yaxis().set_major_formatter(
    mpl.ticker.FuncFormatter(lambda x, p: format(int(x), ',')))
ax.set_ylim(-1000,55000)
ax.set_xlabel('')
ax.set_ylabel('Total Active Listings')
ax.set_title('Change in Number of Active Listings');

The number of active listings has nearly halved in the past year, which is consistent with the drop in the total occupancy rate.

Let's look at the change in the review rate: 

In [None]:
fig, axes = plt.subplots(2,1, figsize=(14,5), sharex=True)

list_act_19['reviews_per_month'].plot.box(vert=False, ax=axes[0])
axes[0].set_yticklabels([''])
axes[0].set_title('Distribution of Average Reviews per Month in 2019')


list_act_19['reviews_per_month'].plot.hist(ax=axes[1])
axes[1].set_xlabel('Average Reviews per Month')
axes[1].set_ylabel('Number of Listings')
axes[1].get_yaxis().set_major_formatter(
    mpl.ticker.FuncFormatter(lambda x, p: format(int(x), ',')));

We can see that the vast majority of listings didn't get more than 5 reviews a month in 2019. There are a few listings with very high review rates that distort the distribution. We won't plot these outliers when comparing the distribution of 2019 vs 2020.
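The comparison that follows uses a fixed cut-off to hide the most extreme values. As an alternative illustration only (this is not the method used in this notebook), a data-driven cut-off can be derived with the common 1.5×IQR rule. The sketch below uses made-up review rates, and `upper_fence` is a hypothetical name.

```python
import pandas as pd

# Hypothetical review rates; the last value plays the role of an extreme outlier
toy_rates = pd.Series([0.2, 0.5, 0.8, 1.0, 1.5, 2.0, 3.0, 25.0])

# Quartiles and the conventional upper fence: Q3 + 1.5 * IQR
q1, q3 = toy_rates.quantile([0.25, 0.75])
upper_fence = q3 + 1.5 * (q3 - q1)

# Keep only values at or below the upper fence for plotting
toy_rates_plt = toy_rates[toy_rates <= upper_fence]
print(round(upper_fence, 2), len(toy_rates_plt))
```

A fixed threshold is simpler to explain and keeps both years on the same axis range, while a fence like this adapts to the data. Either choice only affects what is plotted, not the underlying listings.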

In [None]:
#Combine both datasets to make it easier to plot

data_19 = list_act_19[['id','reviews_per_month', 'price', 'neighbourhood', 'room_type']].copy()
data_19['year']=2019
data_20 = list_act_20[['id','reviews_per_month', 'price', 'neighbourhood', 'room_type']].copy()
data_20['year']=2020

combined = pd.concat([data_19, data_20], ignore_index=True)

In [None]:
combined_plt = combined[combined.reviews_per_month<=8.5] #Don't show the most extreme outliers

#Plot Listing Average Reviews Distribution (2019 vs 2020)
fig, axes = plt.subplots(2,1, figsize=(14,5), sharex=True)

sns.boxplot(x='reviews_per_month', y='year', data=combined_plt, orient='h', ax=axes[0]).set(ylabel='Year\n')
axes[0].set_title('Listings Average Monthly Reviews\nFrequency Distribution\n(2019 vs 2020)')
plt.tight_layout()
axes[0].set_xlabel('') #Hide the label seaborn adds to the top panel; the bottom panel is labelled below

list_act_19['reviews_per_month'].plot.hist(
    ax=axes[1], bins=np.arange(0,9,0.5),  #Don't show the most extreme outliers
    histtype='step', linewidth=3, label=2019
    )
list_act_20['reviews_per_month'].plot.hist(
    ax=axes[1], bins=np.arange(0,9,0.5), #Don't show the most extreme outliers
    histtype='step', linewidth=3, label=2020)

axes[1].set_xticks(np.arange(0,9,0.5))
axes[1].set_xlabel('Reviews per Month')
axes[1].set_ylabel('Number of Listings')
axes[1].get_yaxis().set_major_formatter(
    mpl.ticker.FuncFormatter(lambda x, p: format(int(x), ',')))
axes[1].set_ylim(-1000,16000)
axes[1].legend();

We can see that the distribution of average reviews per month is lower and more concentrated near zero in 2020 than in 2019. This means that fewer listings were occupied at all in 2020, and those that were had lower occupancy rates on average.

# 5. Impact on Types of Listings

Now we'll assess the impact of the Covid pandemic in more detail. We'll first look at which types of accommodation have been the most affected. Note that we define Active Listings as those that received at least 1 review in the year of the snapshot:

In [None]:
#Calculate the number and percentage of listings for each type of accommodation in 2019 and 2020
listing_types = pd.crosstab(index=combined['room_type'], columns=combined['year'])
listing_types_per = pd.crosstab(index=combined['room_type'], columns=combined['year'], normalize='columns')
listing_types = listing_types.loc[['Entire home/apt', 'Private room','Hotel room', 'Shared room']]
listing_types_per = listing_types_per.loc[['Entire home/apt', 'Private room','Hotel room', 'Shared room']]

#Plot Types of Active Listings 2019 vs 2020
fig, ax = plt.subplots()

listing_types.T.plot.bar(ax=ax)

ax.get_yaxis().set_major_formatter(
    mpl.ticker.FuncFormatter(lambda x, p: format(int(x), ',')))

ax.set_ylim(-1000,31000)

ax.set_xlabel('Year')
ax.set_ylabel('Total Active Listings')
ax.set_title('Types of Active Listings\n(2019 vs 2020)')

for i,p in enumerate(ax.patches):
    width = p.get_width()
    height = p.get_height()
    x, y = p.get_xy() 
    value = listing_types_per.T.unstack().iloc[i]
    ax.annotate(f'{value:.1%}', (x + width/2, y + height+700), ha='center')

Observations:
- In 2019, Entire Homes represented the majority of the listings, with 57.6% of the total, followed by Private Rooms in shared houses, with 40.3%. Hotel Rooms and Shared Rooms account for much smaller numbers.
- There has been a big decrease in the number of listings for every type of accommodation except Shared Rooms.
- This last point is surprising: we would expect Shared Rooms to have been wiped out and Entire Homes to now represent the vast majority of listings. However, the share of Entire Homes has only increased by 2%.
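The observations above mix two different measures: the change in the number of listings and the change in each type's share of the total. The sketch below separates them on made-up counts (not the real data), to show how listings can roughly halve while shares barely move.

```python
import pandas as pd

# Hypothetical listing counts per room type and year (made-up numbers)
toy_counts = pd.DataFrame(
    {2019: [600, 400], 2020: [310, 190]},
    index=['Entire home/apt', 'Private room'],
)

# Share of each room type within its year, in percent
toy_share = toy_counts / toy_counts.sum() * 100

# Change in share, in percentage points
share_change_pp = (toy_share[2020] - toy_share[2019]).round(1)

# Relative change in the raw counts, in percent
count_change_pct = ((toy_counts[2020] - toy_counts[2019]) / toy_counts[2019] * 100).round(1)

print(share_change_pp)
print(count_change_pct)
```

In this toy example both room types lose about half of their listings, yet the share of Entire Homes moves by only 2 percentage points, because the shares depend on how the types change relative to each other.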

Let's look at the change in occupancy rate for each accommodation type:

In [None]:
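#Restrict the aggregation to reviews_per_month: averaging every column fails on text columns such as neighbourhood in recent pandas versions
review_rates_rooms = combined.pivot_table(index='room_type', columns='year', values='reviews_per_month', aggfunc='mean')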
review_rates_rooms = review_rates_rooms.loc[['Entire home/apt', 'Private room','Hotel room', 'Shared room']]
per_change = round(review_rates_rooms.apply(lambda row: (row.iloc[1]-row.iloc[0])/row.iloc[0]*100, axis=1),1).astype(str)+'%'

#Plot Change in Review Rate per Accommodation Type (Active Listings Only)
fig, ax = plt.subplots(figsize=(6,6))

review_rates_rooms.T.plot(marker='o', ax=ax)
ax.set_xticks([2019,2020])
ax.set_xticklabels([2019,2020])
ax.set_ylim(0.75,2.1)
ax.set_xlabel('Year')
ax.set_ylabel('Mean Monthly Listings Reviews')
ax.set_title('Change in Review Rate\nper Accommodation Type\n(Active Listings Only)')
ax.text(x = 2019.2, y = 1.8, s = per_change['Private room'], color = red, weight = 'bold')
ax.text(x = 2019.2, y = 1.55, s = per_change['Shared room'], color = green, weight = 'bold')
ax.text(x = 2019.2, y = 1.1, s = per_change['Entire home/apt'], color = blue, weight = 'bold')
ax.text(x = 2019.2, y = 1.3, s = per_change['Hotel room'], color = yellow, weight = 'bold')

ax.legend(loc=1);

In 2019, Private and Shared Rooms had the highest review rates, and Entire Homes and Hotel Rooms the lowest. We treated the total occupancy rate for all listings as proportional to the total number of reviews. However, we need to be careful with this assumption when comparing review rates across types of accommodation: guests who stay in someone's home and meet the host in person are likely to be more inclined to leave a review than guests who stay in a hotel or a whole flat and never meet the host.

The review rate fell more for Entire Homes and Private Rooms than for Hotel Rooms and Shared Rooms. Although Shared Rooms and Hotel Rooms make up a much smaller portion of listings than Entire Homes and Private Rooms, they seem to have been less affected by the pandemic.

Let's look at the change in prices:

In [None]:
# Plot price Distribution by Accommodation Type
fig, ax = plt.subplots(figsize=(11, 10))

combined_price_cleaned = combined[combined.price<400]
sns.boxplot(x='price', y='room_type', hue='year', 
            data=combined_price_cleaned, orient='h', ax=ax).set(xlabel='Price ($)\n', ylabel='Accommodation Type\n')
ax.set_title('Price Distribution by Accommodation Type\n(2019 vs 2020)')
plt.show()

As expected, Hotel Rooms and Entire Homes are the most expensive on average, followed by Private Rooms and Shared Rooms. The biggest drop in price from 2019 to 2020 was for Hotel Rooms, followed by Entire Homes, whereas the price of Private Rooms decreased slightly and the price of Shared Rooms increased slightly.
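To put numbers on a comparison like this rather than reading it off the boxplot, one option is to tabulate the median price per accommodation type and year. The sketch below uses made-up prices and a hypothetical `toy` DataFrame with the same column names as `combined`; it is an illustration, not a result from the real data.

```python
import pandas as pd

# Hypothetical prices (made-up numbers), same column names as the combined DataFrame
toy = pd.DataFrame({
    'room_type': ['Hotel room'] * 4 + ['Shared room'] * 4,
    'year': [2019, 2019, 2020, 2020] * 2,
    'price': [120, 160, 90, 110, 20, 30, 22, 32],
})

# Median price per accommodation type (rows) and year (columns)
median_price = toy.pivot_table(index='room_type', columns='year',
                               values='price', aggfunc='median')

# Relative change in the median from 2019 to 2020, in percent
median_price['change_pct'] = (
    (median_price[2020] - median_price[2019]) / median_price[2019] * 100
).round(1)
print(median_price)
```

The median is used here because it is less sensitive than the mean to the very expensive listings that the boxplot filters out.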

# 6. Impact by Neighbourhood

Let's analyse the change in occupancy rate and prices for each neighbourhood:

In [None]:
#Calculate the number of listings per neighbourhood
listings_nbd_year = pd.crosstab(index=combined['neighbourhood'], columns=combined['year'])
per_change = listings_nbd_year.apply(lambda row: (row.iloc[1]-row.iloc[0])/row.iloc[0], axis=1)
order = combined.loc[combined.year==2019, 'neighbourhood'].value_counts().sort_values(ascending=True).index
listings_nbd_year = listings_nbd_year.loc[order]
per_change = per_change.loc[order]

#Plot Change in Number of Listings per Neighbourhood
fig, ax = plt.subplots(figsize=(11,15))

listings_nbd_year[[2020,2019]].plot.barh(ax=ax, color=[red,blue])
ax.set_xlabel('Number of Active Listings')
ax.set_ylabel('Neighbourhood')
ax.set_xlim(-100,7100)
ax.set_title('Change in Number of Listings per Neighbourhood\n(2019 vs 2020)')
for i,p in enumerate(ax.patches[33:66]):
    width = p.get_width()
    height = p.get_height()
    x, y = p.get_xy()
    value = per_change.iloc[i]
    ax.annotate(f'{value:.1%}', (x+width+400, y-height/2), ha='center')
    
ax.legend(loc=(0.8,0.8))
    
plt.show()

We can see that popular tourist destinations like the West End and trendy areas like the East End have the highest number of listings, whereas Outer London boroughs have far fewer. The drop in the number of active listings is around 50% in boroughs popular with tourists, and smaller in less touristy areas, falling to as little as 14.8% in Bexley, in Outer London.

As we can see in the graph below, boroughs with tourist destinations tend to have a higher proportion of Entire Homes, whereas in Outer London the most popular option is Private Rooms. This may help explain why Private Rooms saw a smaller decrease in price than Entire Homes.

In [None]:
#Calculate the proportion of listing types per borough and year
room_types_nb_19 = pd.crosstab(index=list_act_19['neighbourhood'], columns=list_act_19['room_type'], normalize='index')*100
room_types_nb_19 = room_types_nb_19.loc[order, listing_types.index]

#Plot the proportion of listing types per borough and year
fig, ax = plt.subplots(figsize=(11, 10))
room_types_nb_19.sort_values(by='Entire home/apt').plot.barh(ax=ax, stacked=True)
ax.set_xlabel('Percentage (%)')
ax.set_ylabel('Borough')
ax.set_title('Types of Active Listings per Borough (2019)')
ax.legend(loc=(1.05,0.5))
plt.show()

room_types_nb_20 = pd.crosstab(index=list_act_20['neighbourhood'], columns=list_act_20['room_type'], normalize='index')*100
room_types_nb_20 = room_types_nb_20.loc[order, listing_types.index]

fig, ax = plt.subplots(figsize=(11, 10))
room_types_nb_20.sort_values(by='Entire home/apt').plot.barh(ax=ax, stacked=True)
ax.set_xlabel('Percentage (%)')
ax.set_ylabel('Borough')
ax.set_title('Types of Active Listings per Borough (2020)')
ax.legend(loc=(1.05,0.5))
plt.show()

The proportion of listing types has not changed significantly for any neighbourhood.
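A statement like this can also be checked numerically instead of by eye. The sketch below, on made-up listings, computes the largest shift in room-type share for each borough between two years. The helper `room_type_shares` and the toy DataFrames are hypothetical, and what counts as a "significant" shift is left to the reader.

```python
import pandas as pd

def room_type_shares(df):
    """Percentage of each room type within each neighbourhood."""
    return pd.crosstab(df['neighbourhood'], df['room_type'], normalize='index') * 100

# Hypothetical active listings for two years (made-up rows)
toy_19 = pd.DataFrame({
    'neighbourhood': ['Camden'] * 4 + ['Bexley'] * 4,
    'room_type': ['Entire home/apt'] * 3 + ['Private room']
                 + ['Entire home/apt'] + ['Private room'] * 3,
})
toy_20 = pd.DataFrame({
    'neighbourhood': ['Camden'] * 2 + ['Bexley'] * 4,
    'room_type': ['Entire home/apt', 'Private room']
                 + ['Entire home/apt'] + ['Private room'] * 3,
})

# Difference in shares (percentage points); fill_value=0 treats a room type
# that is absent from one of the two years as a 0% share
share_diff = room_type_shares(toy_20).sub(room_type_shares(toy_19), fill_value=0)

# Largest absolute shift per borough, across all room types
max_shift = share_diff.abs().max(axis=1)
print(max_shift.sort_values(ascending=False))
```

Applied to `list_act_19` and `list_act_20`, the same steps would rank boroughs by how much their mix of listing types moved.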

The map below helps visualize the difference in popularity among boroughs in 2019:

In [None]:
nb_listings = list_act_19['neighbourhood'].value_counts().sort_values()

boroughs = gpd.read_file('../input/airbnb/neighbourhoods.geojson')
boroughs = boroughs[['neighbourhood', 'geometry']].set_index('neighbourhood')

f = folium.Figure(width=1000, height=700)
m_1 = folium.Map(tiles='openstreetmap').add_to(f)
m_1.fit_bounds([[51.697855, -0.519505], [51.276832, 0.353619]])

Choropleth(geo_data=boroughs.__geo_interface__, 
           data=nb_listings, 
           key_on="feature.id", 
           fill_color='YlGnBu', 
           legend_name='Number of Listings'
          ).add_to(m_1)
m_1

# 7. Conclusion

In this project, we assessed the impact of the Covid-19 Pandemic on the London Airbnb Market. We found out that:
- The total number of reviews, our proxy for bookings, was about 75% lower in 2020 than in 2019, breaking a trend of rapid growth. The number of listings that received at least 1 review roughly halved.
- Boroughs popular with tourists like Westminster and Kensington, and trendy areas like Tower Hamlets and Hackney, were the most affected.
- Entire Homes, which are the most popular option in central touristy areas, and Hotel Rooms had the largest decreases in price. Prices for Private Rooms, which are the majority in the rest of London, fell less. Prices for Shared Rooms increased slightly, though this option remains the least popular.

Ideas to expand this project include:
- Analyse the change in price per accommodation type per borough.
- Compare the impact on London with other cities. Data is publicly available on [Inside Airbnb](http://insideairbnb.com/).
- Compare the impact on the Airbnb Market with the impact on the housing market.