# Airbnb Property Listing - Data Preparation

## Library Imports

In [5]:
import pandas as pd
import numpy as np
import os
from zipfile import ZipFile

## Download the data

In [4]:
if not os.path.exists('./airbnb-property-listings.zip'):
    !wget "https://aicore-project-files.s3.eu-west-1.amazonaws.com/airbnb-property-listings.zip"

--2022-10-28 14:54:36--  https://aicore-project-files.s3.eu-west-1.amazonaws.com/airbnb-property-listings.zip
Resolving aicore-project-files.s3.eu-west-1.amazonaws.com (aicore-project-files.s3.eu-west-1.amazonaws.com)... 52.92.18.106, 52.218.98.56, 52.218.116.90, ...
Connecting to aicore-project-files.s3.eu-west-1.amazonaws.com (aicore-project-files.s3.eu-west-1.amazonaws.com)|52.92.18.106|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 356510420 (340M) [application/zip]
Saving to: ‘airbnb-property-listings.zip’


2022-10-28 14:55:00 (14.7 MB/s) - ‘airbnb-property-listings.zip’ saved [356510420/356510420]
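
The `!wget` line above only works inside a notebook with `wget` installed. A minimal sketch of the same "download only if missing" check in plain Python follows; the helper name `download_if_missing` and the constant `DATA_URL` are made up for illustration and are not part of the original notebook.

```python
import os
import urllib.request

# Same URL as the wget call above
DATA_URL = "https://aicore-project-files.s3.eu-west-1.amazonaws.com/airbnb-property-listings.zip"


def download_if_missing(url, dest):
    """Download `url` to `dest` unless `dest` already exists; return `dest`.

    Hypothetical helper: mirrors the os.path.exists check used in the cell above.
    """
    if not os.path.exists(dest):
        # urlretrieve streams the response body straight to the destination file
        urllib.request.urlretrieve(url, dest)
    return dest


# Example call (left commented out because the archive is roughly 340 MB):
# download_if_missing(DATA_URL, "./airbnb-property-listings.zip")
```

Because the existence check comes first, re-running the cell does not fetch the archive again.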



## Data Exploration

In [6]:
with ZipFile('./airbnb-property-listings.zip') as myzip:
    # Read the CSV while the archive is still open
    with myzip.open("AirbnbDataSci/tabular_data/AirBnbData.csv") as data:
        data_df = pd.read_csv(data)
data_df.head()

Unnamed: 0,ID,Category,Title,Description,Amenities,Location,guests,beds,bathrooms,Price_Night,Cleanliness_rate,Accuracy_rate,Communication_rate,Location_rate,Check-in_rate,Value_rate,amenities_count,url,bedrooms
0,f9dcbd09-32ac-41d9-a0b1-fdb2793378cf,Treehouses,Red Kite Tree Tent - Ynys Affalon,"['About this space', ""Escape to one of these t...","['What this place offers', 'Bathroom', 'Shampo...",Llandrindod Wells United Kingdom,2.0,1.0,1.0,105.0,4.6,4.7,4.3,5.0,4.3,4.3,13.0,https://www.airbnb.co.uk/rooms/26620994?adults...,
1,1b4736a7-e73e-45bc-a9b5-d3e7fcf652fd,Treehouses,Az Alom Cabin - Treehouse Tree to Nature Cabin,"['About this space', ""Come and spend a romanti...","['What this place offers', 'Bedroom and laundr...",Guyonvelle Grand Est France,3.0,3.0,0.0,92.0,4.3,4.7,4.6,4.9,4.7,4.5,8.0,https://www.airbnb.co.uk/rooms/27055498?adults...,1.0
2,d577bc30-2222-4bef-a35e-a9825642aec4,Treehouses,Cabane Entre Les Pins\n🌲🏕️🌲,"['About this space', 'Rustic cabin between the...","['What this place offers', 'Scenic views', 'Ga...",Duclair Normandie France,4.0,2.0,1.5,52.0,4.2,4.6,4.8,4.8,4.8,4.7,51.0,https://www.airbnb.co.uk/rooms/51427108?adults...,1.0
3,ca9cbfd4-7798-4e8d-8c17-d5a64fba0abc,Treehouses,Tree Top Cabin with log burner & private hot tub,"['About this space', 'The Tree top cabin is si...","['What this place offers', 'Bathroom', 'Hot wa...",Barmouth Wales United Kingdom,2.0,,1.0,132.0,4.8,4.9,4.9,4.9,5.0,4.6,23.0,https://www.airbnb.co.uk/rooms/49543851?adults...,
4,8b2d0f78-16d8-4559-8692-62ebce2a1302,Treehouses,Hanging cabin,"['About this space', 'Feel refreshed at this u...","['What this place offers', 'Heating and coolin...",Wargnies-le-Petit Hauts-de-France France,2.0,1.0,,111.0,,,,,,,5.0,https://www.airbnb.co.uk/rooms/50166553?adults...,1.0


In [8]:
data_df.shape

(988, 19)

In [15]:
def tweak_data(df):
    """Lower-case the column names and store `category` as a categorical dtype."""
    return (df.rename(columns=str.lower)
              .assign(category=lambda df_: df_.category.astype('category'))
    )
tweaked_df = tweak_data(data_df)
tweaked_df.head()

Unnamed: 0,id,category,title,description,amenities,location,guests,beds,bathrooms,price_night,cleanliness_rate,accuracy_rate,communication_rate,location_rate,check-in_rate,value_rate,amenities_count,url,bedrooms
0,f9dcbd09-32ac-41d9-a0b1-fdb2793378cf,Treehouses,Red Kite Tree Tent - Ynys Affalon,"['About this space', ""Escape to one of these t...","['What this place offers', 'Bathroom', 'Shampo...",Llandrindod Wells United Kingdom,2.0,1.0,1.0,105.0,4.6,4.7,4.3,5.0,4.3,4.3,13.0,https://www.airbnb.co.uk/rooms/26620994?adults...,
1,1b4736a7-e73e-45bc-a9b5-d3e7fcf652fd,Treehouses,Az Alom Cabin - Treehouse Tree to Nature Cabin,"['About this space', ""Come and spend a romanti...","['What this place offers', 'Bedroom and laundr...",Guyonvelle Grand Est France,3.0,3.0,0.0,92.0,4.3,4.7,4.6,4.9,4.7,4.5,8.0,https://www.airbnb.co.uk/rooms/27055498?adults...,1.0
2,d577bc30-2222-4bef-a35e-a9825642aec4,Treehouses,Cabane Entre Les Pins\n🌲🏕️🌲,"['About this space', 'Rustic cabin between the...","['What this place offers', 'Scenic views', 'Ga...",Duclair Normandie France,4.0,2.0,1.5,52.0,4.2,4.6,4.8,4.8,4.8,4.7,51.0,https://www.airbnb.co.uk/rooms/51427108?adults...,1.0
3,ca9cbfd4-7798-4e8d-8c17-d5a64fba0abc,Treehouses,Tree Top Cabin with log burner & private hot tub,"['About this space', 'The Tree top cabin is si...","['What this place offers', 'Bathroom', 'Hot wa...",Barmouth Wales United Kingdom,2.0,,1.0,132.0,4.8,4.9,4.9,4.9,5.0,4.6,23.0,https://www.airbnb.co.uk/rooms/49543851?adults...,
4,8b2d0f78-16d8-4559-8692-62ebce2a1302,Treehouses,Hanging cabin,"['About this space', 'Feel refreshed at this u...","['What this place offers', 'Heating and coolin...",Wargnies-le-Petit Hauts-de-France France,2.0,1.0,,111.0,,,,,,,5.0,https://www.airbnb.co.uk/rooms/50166553?adults...,1.0


In [16]:
tweaked_df.category.value_counts()

Treehouses       243
Offbeat          204
Amazing pools    197
Chalets          196
Beachfront       148
Name: category, dtype: int64

In [19]:
# Some titles are repeated, which is a bit strange
tweaked_df.title.value_counts().loc[lambda counts: counts > 1]

The Pool House                                                 5
Treehouse                                                      3
The Beach House                                                2
The best in tiny living!\nTreehouseTopia                       2
Cliff Dweller:  Spend a night Suspended from the Ridgeline!    2
Secluded Oak Barn Retreat with Hot Tub & Pool!                 2
Countryside retreat, swimming pool, stunning views             2
Name: title, dtype: int64
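
To judge whether the repeated titles are true duplicate listings or just different properties sharing a name, it may help to look at the full rows side by side. The sketch below is one way to do that; the helper `rows_with_repeated_titles` and the small `demo` frame are illustrative assumptions, not part of the original notebook or dataset.

```python
import pandas as pd


def rows_with_repeated_titles(df, column="title"):
    """Return every row whose `column` value occurs more than once.

    Hypothetical helper: keep=False marks all occurrences of a repeated
    value, not only the second and later ones.
    """
    mask = df[column].duplicated(keep=False)
    # A stable sort keeps rows with the same title in their original order
    return df.loc[mask].sort_values(column, kind="mergesort")


# Tiny made-up frame standing in for the real data
demo = pd.DataFrame({
    "title": ["The Pool House", "Treehouse", "The Pool House", "Hanging cabin"],
    "price_night": [100.0, 80.0, 120.0, 111.0],
})
repeated = rows_with_repeated_titles(demo)
print(repeated)
```

In the notebook this would be called as `rows_with_repeated_titles(tweaked_df)`. Comparing columns such as `id`, `location` and `url` across the returned rows should show whether the listings are actually distinct.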

In [22]:
# The description looks like a stringified Python list rather than plain text
tweaked_df.description[0]

'[\'About this space\', "Escape to one of these two fabulous Tree Tents. Suspended high above the canopy, it’s time to appreciate life from a new perspective. Featured on George Clarke’s Amazing Spaces, these Tree Tents are a feat of aviation technology. Tree Tent comes complete with fire pit, outdoor kitchen and shower with hot water. You’ll discover a comfortable bed and cosy wood burning stove. Part of the Red Kite Estate, along with our barn and its sister tree tent, the first ever built in the UK, Dragon\'s Egg.", \'The space\', \'The space\\nThe true joy of this place is how wonderfully simple it is (aviation technology aside). Days are filled with fireside discussions, wildlife watching and stunningly beautiful walks. With the nearest mobile signal a ten minute walk away, it’s a great place to ditch the digital and truly escape. Head over the bridge to your own private deck that happily houses a clever outdoor-kitchen and shower (complete with hot water). It’s the perfect spot t