# **Airbnb Price Prediction and Neighborhood Insights using Python**

# An end-to-end data science and analytics project to explore, visualize, and model Airbnb listings with a focus on pricing trends, host behavior, and predictive modeling.


airbnb-price-prediction-insights

**IMPORTING LIBRARIES:**

In [2]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import warnings
warnings.filterwarnings("ignore")



# Set display options
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', 100)  # Adjust as needed
pd.set_option('display.width', 1000)    # Stretch display width



In [3]:
data = pd.read_csv("/content/listings.csv")

In [4]:
data.shape

(2730, 79)

The dataset has 2,730 rows and 79 columns.

In [5]:
data.columns

Index(['id', 'listing_url', 'scrape_id', 'last_scraped', 'source', 'name', 'description', 'neighborhood_overview', 'picture_url', 'host_id', 'host_url', 'host_name', 'host_since', 'host_location', 'host_about', 'host_response_time', 'host_response_rate', 'host_acceptance_rate', 'host_is_superhost', 'host_thumbnail_url', 'host_picture_url', 'host_neighbourhood', 'host_listings_count', 'host_total_listings_count', 'host_verifications', 'host_has_profile_pic', 'host_identity_verified', 'neighbourhood', 'neighbourhood_cleansed', 'neighbourhood_group_cleansed', 'latitude', 'longitude', 'property_type', 'room_type', 'accommodates', 'bathrooms', 'bathrooms_text', 'bedrooms', 'beds', 'amenities', 'price', 'minimum_nights', 'maximum_nights', 'minimum_minimum_nights', 'maximum_minimum_nights', 'minimum_maximum_nights', 'maximum_maximum_nights', 'minimum_nights_avg_ntm', 'maximum_nights_avg_ntm', 'calendar_updated', 'has_availability', 'availability_30', 'availability_60', 'availability_90',
    

Here, price is the target column. Other important columns to explore are neighbourhood, room_type, latitude and longitude (for mapping), host_is_superhost, amenities, and number_of_reviews.

In [6]:
data.head(10)

Unnamed: 0,id,listing_url,scrape_id,last_scraped,source,name,description,neighborhood_overview,picture_url,host_id,host_url,host_name,host_since,host_location,host_about,host_response_time,host_response_rate,host_acceptance_rate,host_is_superhost,host_thumbnail_url,host_picture_url,host_neighbourhood,host_listings_count,host_total_listings_count,host_verifications,host_has_profile_pic,host_identity_verified,neighbourhood,neighbourhood_cleansed,neighbourhood_group_cleansed,latitude,longitude,property_type,room_type,accommodates,bathrooms,bathrooms_text,bedrooms,beds,amenities,price,minimum_nights,maximum_nights,minimum_minimum_nights,maximum_minimum_nights,minimum_maximum_nights,maximum_maximum_nights,minimum_nights_avg_ntm,maximum_nights_avg_ntm,calendar_updated,has_availability,availability_30,availability_60,availability_90,availability_365,calendar_last_scraped,number_of_reviews,number_of_reviews_ltm,number_of_reviews_l30d,availability_eoy,number_of_reviews_ly,estimated_occupancy_l365d,estimated_revenue_l365d,first_review,last_review,review_scores_rating,review_scores_accuracy,review_scores_cleanliness,review_scores_checkin,review_scores_communication,review_scores_location,review_scores_value,license,instant_bookable,calculated_host_listings_count,calculated_host_listings_count_entire_homes,calculated_host_listings_count_private_rooms,calculated_host_listings_count_shared_rooms,reviews_per_month
0,42515,https://www.airbnb.com/rooms/42515,20250323033821,2025-03-23,city scrape,Geneva center - Hosting females,,,https://a0.muscache.com/pictures/10640277/ff1d...,185647,https://www.airbnb.com/users/show/185647,Noelle,2010-07-30,"Geneva, Switzerland","Hi, I am a Management consultant professional,...",within a few hours,100%,88%,f,https://a0.muscache.com/im/users/185647/profil...,https://a0.muscache.com/im/users/185647/profil...,,1,1,"['email', 'phone']",t,t,,Commune de Genève,,46.20198,6.15672,Private room in rental unit,Private room,1,1.5,1.5 shared baths,1.0,1.0,"[""Paid washer \u2013 In building"", ""Elevator"",...",$89.00,3,1125,3,3,1125,1125,3.0,1125.0,,t,23,53,83,358,2025-03-23,79,6,0,277,6,36,3204.0,2011-09-24,2024-12-14,4.74,4.73,4.78,4.86,4.84,4.85,4.55,,f,1,0,1,0,0.48
1,203997,https://www.airbnb.com/rooms/203997,20250323033821,2025-03-23,city scrape,Top location studio 1 block from lake!,"Spacious studio with washer/dryer, Wi-fi, supe...",This is the most sought after area in Geneva: ...,https://a0.muscache.com/pictures/14163740/9392...,1001280,https://www.airbnb.com/users/show/1001280,Mike,2011-08-21,"Geneva, Switzerland","I love airbnb, both for hosting and traveling....",within an hour,100%,100%,t,https://a0.muscache.com/im/pictures/user/3bbea...,https://a0.muscache.com/im/pictures/user/3bbea...,,6,12,"['email', 'phone']",t,t,"Geneva, Switzerland",Commune de Genève,,46.19964,6.1558,Entire rental unit,Entire home/apt,2,1.0,1 bath,1.0,0.0,"[""Elevator"", ""Shampoo"", ""Luggage dropoff allow...",$128.00,5,730,5,5,1125,1125,5.0,1125.0,,t,20,40,70,78,2025-03-23,92,8,0,78,7,80,10240.0,2012-11-11,2025-01-18,4.91,4.98,4.89,4.97,4.99,4.96,4.81,,t,2,2,0,0,0.61
2,276025,https://www.airbnb.com/rooms/276025,20250323033821,2025-03-23,city scrape,Lovely flat by Geneva lake,This cozy flat is located in a very pleasant v...,"Versoix is a pretty fishing village, on the ed...",https://a0.muscache.com/pictures/4156456/d3aed...,1442438,https://www.airbnb.com/users/show/1442438,Gaby,2011-11-25,"Geneva, Switzerland","Hi, my name is Gaby! I'm a globetrotter. I spe...",within an hour,100%,100%,t,https://a0.muscache.com/im/users/1442438/profi...,https://a0.muscache.com/im/users/1442438/profi...,,2,2,"['email', 'phone']",t,t,"Versoix, Geneva, Switzerland",Versoix,,46.28031,6.16833,Entire rental unit,Entire home/apt,3,1.0,1 bath,1.0,4.0,"[""Toaster"", ""Free dryer \u2013 In building"", ""...",$70.00,7,360,7,7,1125,1125,7.0,1125.0,,t,0,0,0,238,2025-03-23,122,4,0,157,4,56,3920.0,2012-03-15,2024-12-01,4.64,4.7,4.74,4.88,4.89,4.77,4.68,,f,1,1,0,0,0.77
3,338682,https://www.airbnb.com/rooms/338682,20250323033821,2025-03-23,previous scrape,Top location 1 block from the lake!,One bedroom apartment located in a pedestrian ...,This is the favourite area for expats in Genev...,https://a0.muscache.com/pictures/14164129/db92...,1001280,https://www.airbnb.com/users/show/1001280,Mike,2011-08-21,"Geneva, Switzerland","I love airbnb, both for hosting and traveling....",within an hour,100%,100%,t,https://a0.muscache.com/im/pictures/user/3bbea...,https://a0.muscache.com/im/pictures/user/3bbea...,,6,12,"['email', 'phone']",t,t,"Geneva, Switzerland",Commune de Genève,,46.20708,6.15545,Entire rental unit,Entire home/apt,2,,1.5 baths,1.0,,"[""Elevator"", ""Dishwasher"", ""Shampoo"", ""Cooking...",,4,365,4,4,1125,1125,4.0,1125.0,,t,2,4,4,4,2025-03-23,123,7,0,4,6,56,,2012-03-02,2025-01-11,4.82,4.86,4.77,4.9,4.89,4.88,4.71,,t,2,2,0,0,0.77
4,399388,https://www.airbnb.com/rooms/399388,20250323033821,2025-03-23,city scrape,Nice modern bedroom with balcony,Unwind in this modern private bedroom with a c...,,https://a0.muscache.com/pictures/hosting/Hosti...,1994454,https://www.airbnb.com/users/show/1994454,Oscar,2012-03-23,"Geneva, Switzerland","Hi,\n\nI’m Oscar, an experienced host. I'm off...",within an hour,100%,90%,t,https://a0.muscache.com/im/pictures/user/ebd3b...,https://a0.muscache.com/im/pictures/user/ebd3b...,,2,3,"['email', 'phone', 'work_email']",t,t,,Commune de Genève,,46.18904,6.13908,Private room in rental unit,Private room,2,1.0,1 shared bath,1.0,1.0,"[""Elevator"", ""Bed linens"", ""Blender"", ""Outdoor...",$55.00,2,1125,2,2,1125,1125,2.0,1125.0,,t,4,5,22,223,2025-03-23,58,8,0,142,7,48,2640.0,2012-07-15,2025-01-01,4.85,4.81,4.88,4.95,4.9,4.71,4.71,,f,2,0,2,0,0.38
5,419631,https://www.airbnb.com/rooms/419631,20250323033821,2025-03-23,city scrape,Beautiful Apm't in central Geneva (NB Guest Room),Fabulous apartment with design interior in cha...,Quiet yet very central<br /><br />N.B. If you ...,https://a0.muscache.com/pictures/e591abb4-fc83...,2086993,https://www.airbnb.com/users/show/2086993,Mark,2012-04-07,"Geneva, Switzerland","I speak English, French, German and after 2 gl...",within an hour,100%,92%,f,https://a0.muscache.com/im/users/2086993/profi...,https://a0.muscache.com/im/users/2086993/profi...,,2,2,"['email', 'phone', 'work_email']",t,t,"Geneva, Canton of Geneva, Switzerland",Commune de Genève,,46.1995,6.17308,Private room in rental unit,Private room,2,1.5,1.5 baths,1.0,1.0,"[""Elevator"", ""Dishwasher"", ""Shampoo"", ""Electro...",$100.00,2,14,2,2,1125,1125,2.0,1125.0,,t,21,37,67,67,2025-03-23,81,3,0,67,4,18,1800.0,2013-05-25,2024-06-17,4.94,4.98,5.0,4.99,4.99,4.86,4.91,,t,2,1,1,0,0.56
6,626512,https://www.airbnb.com/rooms/626512,20250323033821,2025-03-23,city scrape,Design apartment 2 blocks from lake,"Very well located, all commodities at walking ...",Lots of restaurants and shops in the area. Lak...,https://a0.muscache.com/pictures/10170754/3bfa...,3110118,https://www.airbnb.com/users/show/3110118,Enrica,2012-07-31,"Geneva, Switzerland","I'm Italian, live in Geneva and love meeting i...",within an hour,100%,80%,t,https://a0.muscache.com/im/pictures/user/7da2d...,https://a0.muscache.com/im/pictures/user/7da2d...,,2,3,"['email', 'phone']",t,t,"Geneva, Switzerland",Commune de Genève,,46.2114,6.15107,Entire rental unit,Entire home/apt,2,1.0,1 bath,1.0,2.0,"[""Carbon monoxide alarm"", ""Elevator"", ""Stove"",...",$112.00,13,700,13,13,700,700,13.0,700.0,,t,2,21,21,21,2025-03-23,56,14,1,21,14,255,28560.0,2012-09-17,2025-02-28,4.98,4.95,4.96,5.0,4.98,4.84,4.75,,f,1,1,0,0,0.37
7,649023,https://www.airbnb.com/rooms/649023,20250323033821,2025-03-23,city scrape,Artistic studio in Geneva Old Town,A cosy studio featuring a comfortable double b...,Old Town is like a museum but people live here...,https://a0.muscache.com/pictures/fd80075d-ff34...,2375758,https://www.airbnb.com/users/show/2375758,Eliora Elena,2012-05-15,"Geneva, Switzerland",Hello and thank you for dropping by my profile...,within an hour,100%,100%,t,https://a0.muscache.com/im/users/2375758/profi...,https://a0.muscache.com/im/users/2375758/profi...,,1,2,"['email', 'phone', 'work_email']",t,t,"Geneva, Switzerland",Commune de Genève,,46.20175,6.14852,Entire rental unit,Entire home/apt,2,1.0,1 bath,1.0,1.0,"[""Lockbox"", ""Coffee maker"", ""Kitchen"", ""Hot wa...",$87.00,1,21,1,1,1125,1125,1.0,1125.0,,t,11,28,48,123,2025-03-23,167,35,1,123,37,210,18270.0,2012-09-30,2025-03-04,4.92,4.95,4.92,4.97,4.96,4.99,4.85,,f,1,1,0,0,1.1
8,664917,https://www.airbnb.com/rooms/664917,20250323033821,2025-03-23,previous scrape,"Centre, elegant and quiet (long stay possible)",Enjoy Geneva in an elegant and fully renovated...,"The building is located in a very safe, quite ...",https://a0.muscache.com/pictures/10024367/683d...,3359333,https://www.airbnb.com/users/show/3359333,Jean-Luc,2012-08-24,"Geneva, Switzerland",Je me réjouis de vous accueillir à Genève dans...,,,100%,f,https://a0.muscache.com/im/users/3359333/profi...,https://a0.muscache.com/im/users/3359333/profi...,,4,4,"['email', 'phone']",t,t,"Geneva, Switzerland",Commune de Genève,,46.20542,6.12802,Entire rental unit,Entire home/apt,2,,1 bath,1.0,,"[""Bio Balance body soap"", ""44 inch HDTV with A...",,300,730,300,300,1125,1125,300.0,1125.0,,t,0,0,0,0,2025-03-23,47,2,0,0,2,255,,2012-10-18,2024-08-30,4.89,4.89,4.93,4.96,4.98,4.8,4.76,,f,4,4,0,0,0.31
9,693976,https://www.airbnb.com/rooms/693976,20250323033821,2025-03-23,previous scrape,Suite overlooking Geneva & Mt Blanc,Because of its location and in order to help w...,Vermont area,https://a0.muscache.com/pictures/9744862/b3044...,3550575,https://www.airbnb.com/users/show/3550575,Ramon,2012-09-12,Switzerland,Hello to everyone ! \r\n\r\nThe best and short...,,,,f,https://a0.muscache.com/im/users/3550575/profi...,https://a0.muscache.com/im/users/3550575/profi...,,1,3,"['email', 'phone']",t,f,"Geneva, Switzerland",Commune de Genève,,46.21716,6.13978,Entire rental unit,Entire home/apt,1,,1 bath,2.0,,"[""Bosch electric stove"", ""Toaster"", ""Carbon mo...",,2,5,2,2,5,5,2.0,5.0,,,0,0,0,0,2025-03-23,51,0,0,0,0,0,,2012-09-28,2017-02-09,4.9,4.76,4.94,4.88,4.98,4.78,4.8,,f,1,1,0,0,0.34


In [7]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2730 entries, 0 to 2729
Data columns (total 79 columns):
 #   Column                                        Non-Null Count  Dtype  
---  ------                                        --------------  -----  
 0   id                                            2730 non-null   int64  
 1   listing_url                                   2730 non-null   object 
 2   scrape_id                                     2730 non-null   int64  
 3   last_scraped                                  2730 non-null   object 
 4   source                                        2730 non-null   object 
 5   name                                          2730 non-null   object 
 6   description                                   2636 non-null   object 
 7   neighborhood_overview                         1022 non-null   object 
 8   picture_url                                   2730 non-null   object 
 9   host_id                                       2730 non-null   i

The data contains many null values and a mix of numeric and object dtypes. The target column, price, also has some null values, and its dtype is object.
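Because `data.info()` is long for 79 columns, it can help to condense dtype and null counts into one sortable table. The following is a minimal sketch on a toy frame, not the real listings file; the helper name `summarize_columns` and the toy values are assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def summarize_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per column with its dtype, null count, and null share."""
    summary = pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "n_null": df.isnull().sum(),
        "pct_null": (df.isnull().mean() * 100).round(1),
    })
    # Columns with the most missing values come first.
    return summary.sort_values("n_null", ascending=False)


# Toy frame standing in for the listings data (values are illustrative only).
toy = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "price": ["$89.00", None, "$70.00", None],
    "beds": [1.0, np.nan, 4.0, 2.0],
})

summary = summarize_columns(toy)
print(summary)
```

On the real data the same call would be `summarize_columns(data)`.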

In [8]:
data.describe()

Unnamed: 0,id,scrape_id,host_id,host_listings_count,host_total_listings_count,neighbourhood_group_cleansed,latitude,longitude,accommodates,bathrooms,bedrooms,beds,minimum_nights,maximum_nights,minimum_minimum_nights,maximum_minimum_nights,minimum_maximum_nights,maximum_maximum_nights,minimum_nights_avg_ntm,maximum_nights_avg_ntm,calendar_updated,availability_30,availability_60,availability_90,availability_365,number_of_reviews,number_of_reviews_ltm,number_of_reviews_l30d,availability_eoy,number_of_reviews_ly,estimated_occupancy_l365d,estimated_revenue_l365d,review_scores_rating,review_scores_accuracy,review_scores_cleanliness,review_scores_checkin,review_scores_communication,review_scores_location,review_scores_value,calculated_host_listings_count,calculated_host_listings_count_entire_homes,calculated_host_listings_count_private_rooms,calculated_host_listings_count_shared_rooms,reviews_per_month
count,2730.0,2730.0,2730.0,2730.0,2730.0,0.0,2730.0,2730.0,2730.0,2011.0,2519.0,2007.0,2730.0,2730.0,2730.0,2730.0,2730.0,2730.0,2730.0,2730.0,0.0,2730.0,2730.0,2730.0,2730.0,2730.0,2730.0,2730.0,2730.0,2730.0,2730.0,2005.0,2057.0,2057.0,2057.0,2057.0,2057.0,2057.0,2057.0,2730.0,2730.0,2730.0,2730.0,2057.0
mean,5.964151e+17,20250320000000.0,184872400.0,34.445788,46.741758,,46.207117,6.144779,2.641758,1.178767,1.25526,1.562033,9.124176,426.071795,8.441758,9.348718,547.061538,575.331868,8.997253,566.341392,,9.478022,21.692674,34.686081,134.232234,26.465568,6.36337,0.355678,108.92674,6.250183,40.411722,6160.074813,4.732188,4.765843,4.718935,4.811006,4.802504,4.792825,4.619208,18.287912,16.86337,1.351282,0.003663,0.970632
std,5.442481e+17,0.5547891,197269900.0,94.648896,120.071377,,0.019881,0.025079,1.525361,0.492977,0.999775,1.110995,46.023211,403.141721,44.882499,45.292656,451.374369,447.990797,45.044895,444.71576,,10.381883,22.21065,34.365622,134.615069,58.96595,14.020604,1.126911,106.102655,14.25029,67.31229,11085.03578,0.396831,0.394143,0.407506,0.368086,0.37976,0.337144,0.436437,45.814872,45.499041,3.392927,0.071531,1.389309
min,42515.0,20250320000000.0,67837.0,1.0,1.0,,46.14243,5.97391,1.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,0.01
25%,35152950.0,20250320000000.0,19769300.0,1.0,1.0,,46.1968,6.13387,2.0,1.0,1.0,1.0,1.0,60.0,1.0,2.0,86.0,105.0,1.0,100.0,,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,4.64,4.7,4.63,4.78,4.76,4.73,4.5,1.0,1.0,0.0,0.0,0.17
50%,7.327249e+17,20250320000000.0,98599300.0,2.0,3.0,,46.205835,6.145291,2.0,1.0,1.0,1.0,3.0,365.0,2.0,3.0,365.0,365.0,3.0,365.0,,6.0,15.0,24.0,84.0,6.0,1.0,0.0,80.0,1.0,6.0,1560.0,4.85,4.88,4.84,4.92,4.94,4.88,4.71,1.0,1.0,0.0,0.0,0.46
75%,1.103114e+18,20250320000000.0,273734300.0,6.0,8.0,,46.21263,6.15545,4.0,1.0,1.0,2.0,5.0,370.0,4.0,5.0,1125.0,1125.0,5.0,1125.0,,18.0,44.0,71.0,265.0,25.0,6.0,0.0,214.0,6.0,54.0,7680.0,5.0,5.0,5.0,5.0,5.0,5.0,4.9,5.0,3.0,1.0,0.0,1.19
max,1.381934e+18,20250320000000.0,685062600.0,375.0,447.0,,46.35168,6.27308,15.0,6.5,28.0,12.0,1100.0,1200.0,1100.0,1100.0,1200.0,1200.0,1100.0,1200.0,,30.0,60.0,90.0,365.0,896.0,165.0,17.0,284.0,168.0,255.0,108870.0,5.0,5.0,5.0,5.0,5.0,5.0,5.0,178.0,178.0,25.0,2.0,12.93


In [9]:
data["price"]

Unnamed: 0,price
0,$89.00
1,$128.00
2,$70.00
3,
4,$55.00
...,...
2725,$85.00
2726,$80.00
2727,$354.00
2728,$262.00


In [10]:
data["price"] = data["price"].replace(r'[^\d.]', '', regex=True).astype(float)
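The one-liner above works for the values shown here. A slightly more defensive variant, sketched below, parses with `pd.to_numeric(..., errors="coerce")` so that unexpected values such as an empty string become NaN instead of raising. The helper name `clean_price` and the sample values (including the thousands separator) are assumptions for illustration, not taken from the dataset.

```python
import pandas as pd


def clean_price(prices: pd.Series) -> pd.Series:
    """Strip currency symbols and thousands separators, then parse as float."""
    stripped = prices.str.replace(r"[^\d.]", "", regex=True)
    # errors="coerce" turns anything unparseable (e.g. an empty string) into NaN.
    return pd.to_numeric(stripped, errors="coerce")


# Hypothetical sample: a normal price, one with a comma, a missing value, an empty string.
sample = pd.Series(["$89.00", "$1,128.00", None, ""])
cleaned = clean_price(sample)
print(cleaned)
```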


In [11]:
data["price"]

Unnamed: 0,price
0,89.0
1,128.0
2,70.0
3,
4,55.0
...,...
2725,85.0
2726,80.0
2727,354.0
2728,262.0


In [12]:
data["price"].describe()

Unnamed: 0,price
count,2005.0
mean,153.721696
std,239.834562
min,21.0
25%,82.0
50%,115.0
75%,166.0
max,8214.0


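The average price is 153.72 CHF across 2,005 listings with a valid (non-null) price. The standard deviation (239.83) exceeds the mean, so prices are widely spread. The minimum (21) likely corresponds to cheap listings such as private or shared rooms, while the maximum (8,214) is likely an outlier. The mean (153.72) is well above the median (115), which indicates a right-skewed distribution.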
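The skew and outlier reading can be checked numerically. The sketch below uses only the handful of prices visible in the outputs above plus the reported maximum, so the numbers it prints describe that small sample, not the full dataset. The 1.5 × IQR rule is one common convention for flagging outliers and is an assumption here, not something the notebook applies.

```python
import pandas as pd

# Prices copied from the rows displayed above, plus the reported maximum (8214).
prices = pd.Series(
    [89, 128, 70, 55, 100, 112, 87, 85, 80, 354, 262, 8214], dtype=float
)

mean_price = prices.mean()
median_price = prices.median()
skewness = prices.skew()  # positive values indicate a long right tail

# Tukey's rule: flag values above Q3 + 1.5 * IQR as potential high outliers.
q1, q3 = prices.quantile([0.25, 0.75])
upper_fence = q3 + 1.5 * (q3 - q1)
outliers = prices[prices > upper_fence]

print(f"mean={mean_price:.2f}, median={median_price:.2f}, skew={skewness:.2f}")
print(f"upper fence={upper_fence:.2f}, flagged={outliers.tolist()}")
```

Running the same steps on `data["price"].dropna()` would give the figures for the full set of listings.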

In [13]:
data.isnull().sum().sort_values(ascending=False)


Unnamed: 0,0
neighbourhood_group_cleansed,2730
calendar_updated,2730
license,2729
host_neighbourhood,2700
neighborhood_overview,1708
neighbourhood,1708
host_about,1405
estimated_revenue_l365d,725
price,725
beds,723


We are dropping the following columns. host_neighbourhood is almost entirely missing and not critical. host_about holds personal free text that is not useful here. neighborhood_overview and neighbourhood are mostly missing, so we use neighbourhood_cleansed instead. estimated_revenue_l365d is not needed for our model. first_review and last_review are dates that are not needed for price modeling. scrape_id and host_name are identifiers that are irrelevant for the analysis.
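The columns above were picked by judgment. As a cross-check, a threshold on the share of missing values can list drop candidates automatically. This is a minimal sketch on a toy frame; the helper name `high_missing_columns` and the 0.5 threshold are assumptions, not part of the original workflow.

```python
import numpy as np
import pandas as pd


def high_missing_columns(df: pd.DataFrame, threshold: float = 0.5) -> list:
    """Return column names whose share of missing values exceeds `threshold`."""
    missing_share = df.isnull().mean()
    return missing_share[missing_share > threshold].index.tolist()


# Toy frame (illustrative values only).
toy = pd.DataFrame({
    "price": [89.0, np.nan, 70.0, 55.0],
    "host_neighbourhood": [np.nan, np.nan, np.nan, "Centre"],
    "license": [np.nan, np.nan, np.nan, np.nan],
})

candidates = high_missing_columns(toy, threshold=0.5)
print(candidates)
```

A threshold only finds sparse columns; columns dropped for relevance reasons (such as identifiers) still need a manual decision.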


In [14]:
data['neighbourhood_cleansed'].isnull().sum()


np.int64(0)

In [16]:
for col in data.columns:
    print(f"\n🔹 {col} ({data[col].nunique()} unique values):")
    print(data[col].unique()[:10])  # shows first 10 unique values



🔹 id (2730 unique values):
[ 42515 203997 276025 338682 399388 419631 626512 649023 664917 693976]

🔹 listing_url (2730 unique values):
['https://www.airbnb.com/rooms/42515'
 'https://www.airbnb.com/rooms/203997'
 'https://www.airbnb.com/rooms/276025'
 'https://www.airbnb.com/rooms/338682'
 'https://www.airbnb.com/rooms/399388'
 'https://www.airbnb.com/rooms/419631'
 'https://www.airbnb.com/rooms/626512'
 'https://www.airbnb.com/rooms/649023'
 'https://www.airbnb.com/rooms/664917'
 'https://www.airbnb.com/rooms/693976']

🔹 scrape_id (1 unique values):
[20250323033821]

🔹 last_scraped (1 unique values):
['2025-03-23']

🔹 source (2 unique values):
['city scrape' 'previous scrape']

🔹 name (2667 unique values):
['Geneva center - Hosting females'
 'Top location studio 1 block from lake!' 'Lovely flat by Geneva lake'
 'Top location 1 block from the lake!' 'Nice modern bedroom with balcony'
 "Beautiful Apm't in central Geneva (NB Guest Room)"
 'Design apartment 2 blocks from lake'
 'Artisti

In [17]:
data.drop(columns=[
    'host_neighbourhood', 'host_about', 'neighborhood_overview', 'neighbourhood',
    'estimated_revenue_l365d', 'first_review', 'last_review',
    'scrape_id', 'host_name'
], inplace=True)


Dropping all the non-critical columns.

In [18]:
data.dropna(subset=['price', 'beds', 'bathrooms', 'bedrooms', 'host_is_superhost'], inplace=True)


Dropping rows with null values in the most critical columns.
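`dropna(subset=...)` removes a row if any of the listed columns is null, so the number of rows lost is not the sum of the per-column null counts when nulls overlap. The sketch below shows this on a toy frame (illustrative values, not the listings data) by counting nulls per column before dropping and comparing with the total rows removed.

```python
import numpy as np
import pandas as pd

# Toy frame: row 1 is missing price; row 2 is missing both beds and bathrooms.
toy = pd.DataFrame({
    "price": [89.0, np.nan, 70.0, 55.0, 100.0],
    "beds": [1.0, 2.0, np.nan, 1.0, 1.0],
    "bathrooms": [1.5, 1.0, np.nan, 1.0, 1.5],
})
critical = ["price", "beds", "bathrooms"]

# Nulls per critical column, counted before anything is dropped.
nulls_per_column = toy[critical].isnull().sum()

# A row is removed if ANY critical column is null, so the overlapping
# nulls in row 2 cost one row, not two.
rows_before = len(toy)
trimmed = toy.dropna(subset=critical)
rows_dropped = rows_before - len(trimmed)

print(nulls_per_column)
print(f"dropped {rows_dropped} of {rows_before} rows")
```

Non-in-place `dropna` is used here so the before and after frames can be compared side by side.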

In [20]:
data.shape

(1969, 70)

In [21]:
data.isnull().sum().sort_values(ascending=False)

Unnamed: 0,0
calendar_updated,1969
neighbourhood_group_cleansed,1969
license,1968
review_scores_accuracy,482
review_scores_location,482
review_scores_value,482
review_scores_communication,482
review_scores_cleanliness,482
reviews_per_month,482
review_scores_rating,482


In [22]:
data.drop(columns=['calendar_updated', 'neighbourhood_group_cleansed', 'license'], inplace=True)


Dropping the columns that are almost entirely null.

**IMPUTATION WITH MEDIAN:**

In [25]:
review_cols=[ 'review_scores_accuracy', 'review_scores_location', 'review_scores_value',
    'review_scores_communication', 'review_scores_cleanliness',
    'review_scores_checkin', 'review_scores_rating']

In [26]:
data[review_cols] = data[review_cols].fillna(data[review_cols].median())
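The line above relies on `fillna` aligning a Series of medians with the column names, so each review column is filled with its own median rather than one global value. A minimal sketch on a toy frame (illustrative values only) makes that behavior visible:

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    "review_scores_rating": [4.7, np.nan, 4.9, 5.0],
    "review_scores_value": [4.5, 4.8, np.nan, np.nan],
})

# DataFrame.median() returns one value per column (NaN is skipped by default);
# fillna aligns that Series on column names, so each column gets its own median.
medians = toy.median()
imputed = toy.fillna(medians)

print(medians)
print(imputed)
```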

For the reviews_per_month column, NaN indicates a listing that has no reviews yet. Filling NaN with 0 is therefore accurate and honest.

In [27]:
# Assign the result back instead of using inplace=True on a column selection
data['reviews_per_month'] = data['reviews_per_month'].fillna(0)


The host_response_rate and host_acceptance_rate columns store percentages as strings (object dtype). We convert them to float and fill the null values with the median.

In [28]:
# Strip % and convert to float
data['host_response_rate'] = data['host_response_rate'].str.rstrip('%').astype(float)
data['host_acceptance_rate'] = data['host_acceptance_rate'].str.rstrip('%').astype(float)

# Fill remaining nulls with median
data['host_response_rate'] = data['host_response_rate'].fillna(data['host_response_rate'].median())
data['host_acceptance_rate'] = data['host_acceptance_rate'].fillna(data['host_acceptance_rate'].median())


For text columns such as host_response_time, host_location, and description, null values are filled with 'Unknown' or 'No description'.

In [29]:
data['host_location'] = data['host_location'].fillna('Unknown')
data['host_response_time'] = data['host_response_time'].fillna('Unknown')
data['description'] = data['description'].fillna('No description')


In [30]:
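# Map 't'/'f' flags to booleans; missing values are treated as not available
data['has_availability'] = data['has_availability'].map({'t': True, 'f': False})
data['has_availability'] = data['has_availability'].fillna(False).astype(bool)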


In [31]:
data.isnull().sum().sort_values(ascending=False)

Unnamed: 0,0
id,0
listing_url,0
last_scraped,0
source,0
name,0
description,0
picture_url,0
host_id,0
host_url,0
host_since,0


In [32]:
data.shape

(1969, 67)

In [34]:
data.to_csv("geneva_airbnb_cleaned.csv", index=False)
