# Maui Airbnb 🌴🏡🌴

## What questions are we asking?
- What can we learn about different hosts and areas?
- What can we learn from predictions (e.g. locations, prices, reviews)?
- Which hosts are the busiest, and why?
- Is there a noticeable difference in traffic among areas, and what could explain it?

### Important KPIs (Key Performance Indicators) for Airbnb
- **Occupancy Rate**: Divide the total number of occupied nights by the total number of available nights and multiply by 100.
- **Average Daily Rate (ADR)**: Divide the total revenue generated by the total number of occupied nights.
- **Revenue**: Multiply the ADR by the number of occupied nights.
- **Guest Satisfaction**: Average the review scores for aspects like cleanliness, communication, check-in, location, and value to get an overall guest satisfaction rating.
- **Booking Lead Time**: Calculate the difference between the booking date and the check-in date for each booking, then average across bookings.
- **Return on Investment (ROI)**: Subtract the total expenses (such as maintenance, cleaning fees, and utilities) from the total revenue, divide the result by the total expenses, and multiply by 100 to get the ROI percentage.
- **Average Length of Stay**: Divide the total number of occupied nights by the total number of bookings.
- **Cancellation Rate**: Divide the number of canceled bookings by the total number of bookings and multiply by 100.
- **Net Promoter Score (NPS)**: Subtract the percentage of detractors (those unlikely to recommend) from the percentage of promoters (those likely to recommend).
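
The KPI definitions above can be sketched as small Python helpers. This is only an illustration: the function names and the sample numbers are made up for the example and do not come from the data set.

```python
def occupancy_rate(occupied_nights, available_nights):
    """Occupied nights as a percentage of available nights."""
    return occupied_nights / available_nights * 100


def average_daily_rate(total_revenue, occupied_nights):
    """Revenue earned per occupied night."""
    return total_revenue / occupied_nights


def roi_percent(total_revenue, total_expenses):
    """Profit relative to expenses, as a percentage."""
    return (total_revenue - total_expenses) / total_expenses * 100


def average_length_of_stay(occupied_nights, total_bookings):
    """Occupied nights per booking."""
    return occupied_nights / total_bookings


def cancellation_rate(canceled_bookings, total_bookings):
    """Canceled bookings as a percentage of all bookings."""
    return canceled_bookings / total_bookings * 100


def net_promoter_score(promoters, detractors, respondents):
    """Percentage of promoters minus percentage of detractors."""
    return (promoters - detractors) / respondents * 100


# Hypothetical month for a single listing (invented numbers).
occ = occupancy_rate(occupied_nights=18, available_nights=30)
adr = average_daily_rate(total_revenue=4500.0, occupied_nights=18)
roi = roi_percent(total_revenue=4500.0, total_expenses=1500.0)
alos = average_length_of_stay(occupied_nights=18, total_bookings=6)
cancel = cancellation_rate(canceled_bookings=2, total_bookings=8)
nps = net_promoter_score(promoters=7, detractors=1, respondents=10)

print(occ, adr, roi, alos, cancel, nps)
```

Revenue is then just `adr * occupied_nights`, which recovers the 4500.0 used as input here.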

#### We will try to compute as many of these KPIs as we can.

# Data Exploration🧪

## Call in the Libraries 📝

In [53]:
# 1 - Data Manipulation
import pandas as pd
import numpy as np
import re

# 2 - Data Visualisation
import matplotlib.pyplot as plt
import seaborn as sns

## Call in the Datasets 📃

In [54]:
# calling in the datasets and exploring what I need.
listings_data = pd.read_csv('../raw_data/h_listings.csv')
calendar_data = pd.read_csv('../raw_data/h_calendar.csv')
review_data = pd.read_csv('../raw_data/h_reviews.csv')

In [55]:
listings_data.head(1)

Unnamed: 0,id,listing_url,scrape_id,last_scraped,source,name,description,neighborhood_overview,picture_url,host_id,...,review_scores_communication,review_scores_location,review_scores_value,license,instant_bookable,calculated_host_listings_count,calculated_host_listings_count_entire_homes,calculated_host_listings_count_private_rooms,calculated_host_listings_count_shared_rooms,reviews_per_month
0,81566,https://www.airbnb.com/rooms/81566,20230610213045,2023-06-11,city scrape,Rental unit in Haleiwa · ★4.67 · 2 bedrooms · ...,2 bedroom/1 bath ocean view & beach access<br ...,There are so many awesome things to do on the ...,https://a0.muscache.com/pictures/555752/35365a...,442490,...,4.75,4.86,4.62,ta-020-657-9712-01,f,1,1,0,0,1.76


In [56]:
calendar_data.head(1)

Unnamed: 0,listing_id,date,available,price,adjusted_price,minimum_nights,maximum_nights
0,300379,2023-06-11,f,$80.00,$80.00,3.0,28.0


In [57]:
review_data.head(1)

Unnamed: 0,listing_id,id,date,reviewer_id,reviewer_name,comments
0,81566,252069,2011-05-05,408328,Amanda,the pictures of the beach dont do this propert...


In [58]:
# looking through the column names to find what is needed.
column_names = listings_data.columns.tolist()
column_names

['id',
 'listing_url',
 'scrape_id',
 'last_scraped',
 'source',
 'name',
 'description',
 'neighborhood_overview',
 'picture_url',
 'host_id',
 'host_url',
 'host_name',
 'host_since',
 'host_location',
 'host_about',
 'host_response_time',
 'host_response_rate',
 'host_acceptance_rate',
 'host_is_superhost',
 'host_thumbnail_url',
 'host_picture_url',
 'host_neighbourhood',
 'host_listings_count',
 'host_total_listings_count',
 'host_verifications',
 'host_has_profile_pic',
 'host_identity_verified',
 'neighbourhood',
 'neighbourhood_cleansed',
 'neighbourhood_group_cleansed',
 'latitude',
 'longitude',
 'property_type',
 'room_type',
 'accommodates',
 'bathrooms',
 'bathrooms_text',
 'bedrooms',
 'beds',
 'amenities',
 'price',
 'minimum_nights',
 'maximum_nights',
 'minimum_minimum_nights',
 'maximum_minimum_nights',
 'minimum_maximum_nights',
 'maximum_maximum_nights',
 'minimum_nights_avg_ntm',
 'maximum_nights_avg_ntm',
 'calendar_updated',
 'has_availability',
 'availability_30

In [59]:
# Build a smaller data set with just the essential columns.
# These columns provide the listing-level information for the suggested KPIs,
# such as occupancy rate, ADR, revenue, guest satisfaction, ROI,
# average length of stay, cancellation rate, and NPS.


listings_essential = listings_data[[
    'id',
    'name',
    'host_id',
    'host_name',
    'availability_365',
    'price',
    'latitude',
    'longitude',
    'number_of_reviews',
    'review_scores_rating',
    'minimum_nights',
    'maximum_nights',
    'review_scores_accuracy',
    'review_scores_cleanliness',
    'review_scores_checkin',
    'review_scores_communication',
    'review_scores_location',
    'review_scores_value',
    'instant_bookable',
    'accommodates',
    'bathrooms',
    'bedrooms',
    'beds',
    'neighbourhood_cleansed',
    'neighbourhood_group_cleansed',
    'room_type',  
    'calculated_host_listings_count', 
    ]]

In [60]:
#seeing what the data set looks like
listings_essential.head(3)

Unnamed: 0,id,name,host_id,host_name,availability_365,price,latitude,longitude,number_of_reviews,review_scores_rating,...,review_scores_value,instant_bookable,accommodates,bathrooms,bedrooms,beds,neighbourhood_cleansed,neighbourhood_group_cleansed,room_type,calculated_host_listings_count
0,81566,Rental unit in Haleiwa · ★4.67 · 2 bedrooms · ...,442490,Susan,249,$250.00,21.589247,-158.111008,260,4.67,...,4.62,f,4,,2.0,2.0,North Shore Oahu,Honolulu,Entire home/apt,1
1,81582,Home in Pāhoa · ★4.94 · 2 bedrooms · 3 beds · ...,442698,Elizabeth,69,$119.00,19.43428,-155.21609,181,4.94,...,4.83,f,4,,2.0,3.0,Puna,Hawaii,Entire home/apt,2
2,83221,Cabin in Pāhoa · ★4.93 · 1 bedroom · 1 bed · 1...,451536,Connie And Andrew,323,$117.00,19.53979,-155.00207,311,4.93,...,4.94,f,3,,1.0,1.0,Puna,Hawaii,Entire home/apt,1


In [61]:
# looking at what data types we need to change.
listings_essential.dtypes

id                                  int64
name                               object
host_id                             int64
host_name                          object
availability_365                    int64
price                              object
latitude                          float64
longitude                         float64
number_of_reviews                   int64
review_scores_rating              float64
minimum_nights                      int64
maximum_nights                      int64
review_scores_accuracy            float64
review_scores_cleanliness         float64
review_scores_checkin             float64
review_scores_communication       float64
review_scores_location            float64
review_scores_value               float64
instant_bookable                   object
accommodates                        int64
bathrooms                         float64
bedrooms                          float64
beds                              float64
neighbourhood_cleansed            

### Insights on types:
- **Price** is an object (string) and needs to be turned into a number: integer or float.
- Everything else seems to be fairly in place after looking at the data set above.
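
As a minimal sketch of that conversion, the snippet below cleans a few price strings. `"$250.00"` and `"$80.00"` appear in the outputs above; `"$1,250.00"` is a made-up value that shows why the thousands separator has to go too. Passing `regex=False` makes pandas treat `$` as a literal character rather than the regex end-of-string anchor.

```python
import pandas as pd

# Two prices seen in the data plus one hypothetical value with a comma.
raw_price = pd.Series(["$250.00", "$80.00", "$1,250.00"])

clean_price = (
    raw_price
    .str.replace("$", "", regex=False)   # drop the currency symbol
    .str.replace(",", "", regex=False)   # drop the thousands separator
    .astype(float)
)

print(clean_price.tolist())  # [250.0, 80.0, 1250.0]
```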

In [62]:
# How many null values are there?
listings_essential.isna().sum()

id                                    0
name                                  0
host_id                               0
host_name                             2
availability_365                      0
price                                 0
latitude                              0
longitude                             0
number_of_reviews                     0
review_scores_rating               7678
minimum_nights                        0
maximum_nights                        0
review_scores_accuracy             7730
review_scores_cleanliness          7730
review_scores_checkin              7732
review_scores_communication        7730
review_scores_location             7732
review_scores_value                7732
instant_bookable                      0
accommodates                          0
bathrooms                         32597
bedrooms                           5611
beds                                449
neighbourhood_cleansed                0
neighbourhood_group_cleansed          0


In [63]:
listings_essential[["review_scores_accuracy",
"review_scores_cleanliness",
"review_scores_checkin",
"review_scores_communication",
"review_scores_location",
"review_scores_value"]]

Unnamed: 0,review_scores_accuracy,review_scores_cleanliness,review_scores_checkin,review_scores_communication,review_scores_location,review_scores_value
0,4.70,4.76,4.85,4.75,4.86,4.62
1,4.97,4.91,4.94,4.94,4.94,4.83
2,4.98,4.97,4.97,4.96,4.86,4.94
3,4.85,4.38,5.00,4.88,5.00,4.85
4,4.57,4.21,4.64,4.50,4.89,4.36
...,...,...,...,...,...,...
32592,,,,,,
32593,,,,,,
32594,,,,,,
32595,,,,,,


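### Insights on the missing values:
- **host_name** has 2 missing values, which we can treat as unknown hosts.
  - Solution: Replace the value with 'Unknown', since the column is already an object data type.
- **Ratings**: the review-score columns all have a similar number of missing values (7,678 to 7,732), so these listings are probably new and have no ratings yet.
  - Solution: Setting them to 0 is not ideal because a 0 rating reads as a bad score, but NaN doesn't really help either. We fill with 0 below and treat it as a "no reviews yet" placeholder.
- **bathrooms, bedrooms, beds**: `bathrooms` is missing for all 32,597 rows, so the column is empty in this scrape (the `bathrooms_text` column may hold the counts instead). For `bedrooms` and `beds`, the properties probably don't list these amenities.
  - Solution: The values can probably be set to 0.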
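
To make the trade-off for the rating columns concrete, here is a small sketch on toy data: two ratings taken from the rows shown above plus one unreviewed listing. The `has_reviews` flag column is an assumption added for illustration and is not part of the original data set.

```python
import numpy as np
import pandas as pd

# Two reviewed listings and one new listing with no reviews yet.
toy = pd.DataFrame({
    "number_of_reviews": [260, 181, 0],
    "review_scores_rating": [4.67, 4.94, np.nan],
})

# pandas skips NaN by default, so only the reviewed listings are averaged.
mean_ignoring_missing = toy["review_scores_rating"].mean()

# After a zero fill, the unreviewed listing counts as a real score of 0.
mean_with_zero_fill = toy["review_scores_rating"].fillna(0).mean()

# One alternative: keep an explicit flag and filter on it before averaging.
toy["has_reviews"] = toy["number_of_reviews"] > 0
mean_reviewed_only = toy.loc[toy["has_reviews"], "review_scores_rating"].mean()

print(mean_ignoring_missing, mean_with_zero_fill, mean_reviewed_only)
```

If the columns are zero-filled, any later average of review scores should filter on `number_of_reviews > 0` (or a flag like the one above) so that new listings do not drag the result down.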

## Manipulation ⛏

In [64]:
# listings_essential was selected from listings_data, so take an explicit copy first;
# assigning into the selection directly triggers pandas' SettingWithCopyWarning.
listings_essential = listings_essential.copy()

# Filling unknown values in bathrooms, bedrooms, and beds
room_cols = ['bathrooms', 'bedrooms', 'beds']
listings_essential[room_cols] = listings_essential[room_cols].fillna(0)

# Filling unknown values in host name
listings_essential['host_name'] = listings_essential['host_name'].fillna('Unknown')

# Filling unknown values in reviews with 0 as there is no score due to new properties.
# Note: 0 is a placeholder for "no reviews yet", not a real score.
review_score_cols = [
    "review_scores_accuracy",
    "review_scores_cleanliness",
    "review_scores_checkin",
    "review_scores_communication",
    "review_scores_location",
    "review_scores_value",
]
listings_essential[review_score_cols] = listings_essential[review_score_cols].fillna(0)


In [65]:
# Same placeholder fill for the overall rating; fillna returns a new DataFrame, so nothing is set on a slice.
listings_essential = listings_essential.fillna({'review_scores_rating': 0})


In [66]:
#checking to see if there are still any null values
listings_essential.isnull().sum()

id                                0
name                              0
host_id                           0
host_name                         0
availability_365                  0
price                             0
latitude                          0
longitude                         0
number_of_reviews                 0
review_scores_rating              0
minimum_nights                    0
maximum_nights                    0
review_scores_accuracy            0
review_scores_cleanliness         0
review_scores_checkin             0
review_scores_communication       0
review_scores_location            0
review_scores_value               0
instant_bookable                  0
accommodates                      0
bathrooms                         0
bedrooms                          0
beds                              0
neighbourhood_cleansed            0
neighbourhood_group_cleansed      0
room_type                         0
calculated_host_listings_count    0
dtype: int64

In [67]:
listings_essential.shape

(32597, 27)

In [68]:
listings_essential.columns

Index(['id', 'name', 'host_id', 'host_name', 'availability_365', 'price',
       'latitude', 'longitude', 'number_of_reviews', 'review_scores_rating',
       'minimum_nights', 'maximum_nights', 'review_scores_accuracy',
       'review_scores_cleanliness', 'review_scores_checkin',
       'review_scores_communication', 'review_scores_location',
       'review_scores_value', 'instant_bookable', 'accommodates', 'bathrooms',
       'bedrooms', 'beds', 'neighbourhood_cleansed',
       'neighbourhood_group_cleansed', 'room_type',
       'calculated_host_listings_count'],
      dtype='object')

In [69]:
# checking the list to make sure all data is useable
listings_essential.head()

Unnamed: 0,id,name,host_id,host_name,availability_365,price,latitude,longitude,number_of_reviews,review_scores_rating,...,review_scores_value,instant_bookable,accommodates,bathrooms,bedrooms,beds,neighbourhood_cleansed,neighbourhood_group_cleansed,room_type,calculated_host_listings_count
0,81566,Rental unit in Haleiwa · ★4.67 · 2 bedrooms · ...,442490,Susan,249,$250.00,21.589247,-158.111008,260,4.67,...,4.62,f,4,0.0,2.0,2.0,North Shore Oahu,Honolulu,Entire home/apt,1
1,81582,Home in Pāhoa · ★4.94 · 2 bedrooms · 3 beds · ...,442698,Elizabeth,69,$119.00,19.43428,-155.21609,181,4.94,...,4.83,f,4,0.0,2.0,3.0,Puna,Hawaii,Entire home/apt,2
2,83221,Cabin in Pāhoa · ★4.93 · 1 bedroom · 1 bed · 1...,451536,Connie And Andrew,323,$117.00,19.53979,-155.00207,311,4.93,...,4.94,f,3,0.0,1.0,1.0,Puna,Hawaii,Entire home/apt,1
3,5269,Rental unit in Kamuela · ★4.65 · 1 bedroom · 1...,7620,Lea & Pat,180,$144.00,20.0274,-155.702,27,4.65,...,4.85,f,2,0.0,1.0,1.0,South Kohala,Hawaii,Entire home/apt,4
4,84405,Townhouse in Lahaina · ★4.50 · 3 bedrooms · 4 ...,461037,Crystal,298,$687.00,20.99596,-156.66574,28,4.5,...,4.36,f,8,0.0,3.0,4.0,Lahaina,Maui,Entire home/apt,22


### Further insights on the data
- Still need to change the price column from an object to a number
- Can probably drop the 'name' column, as it is redundant info
- Can change 'instant_bookable' from an object to numbers
- Should rename the data set to hawaiian_island_df
- Can split the data up by island

In [70]:
# Changing the data in the price column: strip the literal '$' and ',' before casting to float
price_as_float = listings_essential['price'].str.replace('$', '', regex=False).str.replace(',', '', regex=False).astype(float)
listings_essential = listings_essential.assign(price=price_as_float)


In [71]:
# price has been changed
listings_essential.head()

Unnamed: 0,id,name,host_id,host_name,availability_365,price,latitude,longitude,number_of_reviews,review_scores_rating,...,review_scores_value,instant_bookable,accommodates,bathrooms,bedrooms,beds,neighbourhood_cleansed,neighbourhood_group_cleansed,room_type,calculated_host_listings_count
0,81566,Rental unit in Haleiwa · ★4.67 · 2 bedrooms · ...,442490,Susan,249,250.0,21.589247,-158.111008,260,4.67,...,4.62,f,4,0.0,2.0,2.0,North Shore Oahu,Honolulu,Entire home/apt,1
1,81582,Home in Pāhoa · ★4.94 · 2 bedrooms · 3 beds · ...,442698,Elizabeth,69,119.0,19.43428,-155.21609,181,4.94,...,4.83,f,4,0.0,2.0,3.0,Puna,Hawaii,Entire home/apt,2
2,83221,Cabin in Pāhoa · ★4.93 · 1 bedroom · 1 bed · 1...,451536,Connie And Andrew,323,117.0,19.53979,-155.00207,311,4.93,...,4.94,f,3,0.0,1.0,1.0,Puna,Hawaii,Entire home/apt,1
3,5269,Rental unit in Kamuela · ★4.65 · 1 bedroom · 1...,7620,Lea & Pat,180,144.0,20.0274,-155.702,27,4.65,...,4.85,f,2,0.0,1.0,1.0,South Kohala,Hawaii,Entire home/apt,4
4,84405,Townhouse in Lahaina · ★4.50 · 3 bedrooms · 4 ...,461037,Crystal,298,687.0,20.99596,-156.66574,28,4.5,...,4.36,f,8,0.0,3.0,4.0,Lahaina,Maui,Entire home/apt,22


In [72]:
# dropping the name column (drop returns a new DataFrame instead of modifying a slice in place)
listings_essential = listings_essential.drop(columns='name')
listings_essential


Unnamed: 0,id,host_id,host_name,availability_365,price,latitude,longitude,number_of_reviews,review_scores_rating,minimum_nights,...,review_scores_value,instant_bookable,accommodates,bathrooms,bedrooms,beds,neighbourhood_cleansed,neighbourhood_group_cleansed,room_type,calculated_host_listings_count
0,81566,442490,Susan,249,250.0,21.589247,-158.111008,260,4.67,4,...,4.62,f,4,0.0,2.0,2.0,North Shore Oahu,Honolulu,Entire home/apt,1
1,81582,442698,Elizabeth,69,119.0,19.434280,-155.216090,181,4.94,2,...,4.83,f,4,0.0,2.0,3.0,Puna,Hawaii,Entire home/apt,2
2,83221,451536,Connie And Andrew,323,117.0,19.539790,-155.002070,311,4.93,3,...,4.94,f,3,0.0,1.0,1.0,Puna,Hawaii,Entire home/apt,1
3,5269,7620,Lea & Pat,180,144.0,20.027400,-155.702000,27,4.65,5,...,4.85,f,2,0.0,1.0,1.0,South Kohala,Hawaii,Entire home/apt,4
4,84405,461037,Crystal,298,687.0,20.995960,-156.665740,28,4.50,3,...,4.36,f,8,0.0,3.0,4.0,Lahaina,Maui,Entire home/apt,22
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
32592,908813827623739095,240078487,Steven,90,228.0,19.556160,-155.961220,0,0.00,2,...,0.00,f,4,0.0,1.0,2.0,North Kona,Hawaii,Entire home/apt,110
32593,908814385299008320,428576478,Ryan,89,275.0,19.924658,-155.793491,0,0.00,2,...,0.00,f,6,0.0,2.0,3.0,South Kohala,Hawaii,Entire home/apt,105
32594,908814612038475768,43793840,Mark,4,175.0,21.282890,-157.831660,0,0.00,2,...,0.00,t,4,0.0,0.0,2.0,Primary Urban Center,Honolulu,Entire home/apt,182
32595,908821427715937106,240078487,Steven,90,228.0,22.219156,-159.485522,0,0.00,2,...,0.00,f,4,0.0,1.0,2.0,North Shore Kauai,Kauai,Entire home/apt,110


In [73]:
# changing the 'f' and 't' to numbers in 'instant_bookable'
instant_bookable_flag = listings_essential['instant_bookable'].map({'f': 0, 't': 1})
listings_essential = listings_essential.assign(instant_bookable=instant_bookable_flag)


In [74]:
# check if it worked 
listings_essential

Unnamed: 0,id,host_id,host_name,availability_365,price,latitude,longitude,number_of_reviews,review_scores_rating,minimum_nights,...,review_scores_value,instant_bookable,accommodates,bathrooms,bedrooms,beds,neighbourhood_cleansed,neighbourhood_group_cleansed,room_type,calculated_host_listings_count
0,81566,442490,Susan,249,250.0,21.589247,-158.111008,260,4.67,4,...,4.62,0,4,0.0,2.0,2.0,North Shore Oahu,Honolulu,Entire home/apt,1
1,81582,442698,Elizabeth,69,119.0,19.434280,-155.216090,181,4.94,2,...,4.83,0,4,0.0,2.0,3.0,Puna,Hawaii,Entire home/apt,2
2,83221,451536,Connie And Andrew,323,117.0,19.539790,-155.002070,311,4.93,3,...,4.94,0,3,0.0,1.0,1.0,Puna,Hawaii,Entire home/apt,1
3,5269,7620,Lea & Pat,180,144.0,20.027400,-155.702000,27,4.65,5,...,4.85,0,2,0.0,1.0,1.0,South Kohala,Hawaii,Entire home/apt,4
4,84405,461037,Crystal,298,687.0,20.995960,-156.665740,28,4.50,3,...,4.36,0,8,0.0,3.0,4.0,Lahaina,Maui,Entire home/apt,22
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
32592,908813827623739095,240078487,Steven,90,228.0,19.556160,-155.961220,0,0.00,2,...,0.00,0,4,0.0,1.0,2.0,North Kona,Hawaii,Entire home/apt,110
32593,908814385299008320,428576478,Ryan,89,275.0,19.924658,-155.793491,0,0.00,2,...,0.00,0,6,0.0,2.0,3.0,South Kohala,Hawaii,Entire home/apt,105
32594,908814612038475768,43793840,Mark,4,175.0,21.282890,-157.831660,0,0.00,2,...,0.00,1,4,0.0,0.0,2.0,Primary Urban Center,Honolulu,Entire home/apt,182
32595,908821427715937106,240078487,Steven,90,228.0,22.219156,-159.485522,0,0.00,2,...,0.00,0,4,0.0,1.0,2.0,North Shore Kauai,Kauai,Entire home/apt,110


In [75]:
# changing the name of the data set (this binds a new name to the same DataFrame; it does not copy it)
hawaiian_island_df = listings_essential

In [76]:
hawaiian_island_df

Unnamed: 0,id,host_id,host_name,availability_365,price,latitude,longitude,number_of_reviews,review_scores_rating,minimum_nights,...,review_scores_value,instant_bookable,accommodates,bathrooms,bedrooms,beds,neighbourhood_cleansed,neighbourhood_group_cleansed,room_type,calculated_host_listings_count
0,81566,442490,Susan,249,250.0,21.589247,-158.111008,260,4.67,4,...,4.62,0,4,0.0,2.0,2.0,North Shore Oahu,Honolulu,Entire home/apt,1
1,81582,442698,Elizabeth,69,119.0,19.434280,-155.216090,181,4.94,2,...,4.83,0,4,0.0,2.0,3.0,Puna,Hawaii,Entire home/apt,2
2,83221,451536,Connie And Andrew,323,117.0,19.539790,-155.002070,311,4.93,3,...,4.94,0,3,0.0,1.0,1.0,Puna,Hawaii,Entire home/apt,1
3,5269,7620,Lea & Pat,180,144.0,20.027400,-155.702000,27,4.65,5,...,4.85,0,2,0.0,1.0,1.0,South Kohala,Hawaii,Entire home/apt,4
4,84405,461037,Crystal,298,687.0,20.995960,-156.665740,28,4.50,3,...,4.36,0,8,0.0,3.0,4.0,Lahaina,Maui,Entire home/apt,22
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
32592,908813827623739095,240078487,Steven,90,228.0,19.556160,-155.961220,0,0.00,2,...,0.00,0,4,0.0,1.0,2.0,North Kona,Hawaii,Entire home/apt,110
32593,908814385299008320,428576478,Ryan,89,275.0,19.924658,-155.793491,0,0.00,2,...,0.00,0,6,0.0,2.0,3.0,South Kohala,Hawaii,Entire home/apt,105
32594,908814612038475768,43793840,Mark,4,175.0,21.282890,-157.831660,0,0.00,2,...,0.00,1,4,0.0,0.0,2.0,Primary Urban Center,Honolulu,Entire home/apt,182
32595,908821427715937106,240078487,Steven,90,228.0,22.219156,-159.485522,0,0.00,2,...,0.00,0,4,0.0,1.0,2.0,North Shore Kauai,Kauai,Entire home/apt,110


## Split data by island for different use cases 🪓

In [77]:
# so now we want the Maui-only neighbourhoods,
# but we can separate out the other islands too for later use.
maui_df = hawaiian_island_df[hawaiian_island_df['neighbourhood_group_cleansed'] == 'Maui']
honolulu_df = hawaiian_island_df[hawaiian_island_df['neighbourhood_group_cleansed'] == 'Honolulu']
hawaii_df = hawaiian_island_df[hawaiian_island_df['neighbourhood_group_cleansed'] == 'Hawaii']
kauai_df = hawaiian_island_df[hawaiian_island_df['neighbourhood_group_cleansed'] == 'Kauai']
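
The four filters above could also be written as a single `groupby`. The sketch below uses a toy frame built from the ids, groups, and prices of four rows shown earlier; `island_frames` is a hypothetical name, not one used elsewhere in this notebook.

```python
import pandas as pd

# Toy stand-in for hawaiian_island_df with one listing per group.
toy = pd.DataFrame({
    "id": [81566, 81582, 84405, 91091],
    "neighbourhood_group_cleansed": ["Honolulu", "Hawaii", "Maui", "Kauai"],
    "price": [250.0, 119.0, 687.0, 657.0],
})

# One DataFrame per group, keyed by the group name.
# .copy() keeps each piece independent of the parent frame.
island_frames = {
    island: frame.copy()
    for island, frame in toy.groupby("neighbourhood_group_cleansed")
}

maui_toy = island_frames["Maui"]
print(sorted(island_frames), maui_toy["id"].tolist())
```

A dictionary like this picks up any group present in the column automatically, at the cost of less explicit variable names than `maui_df`, `honolulu_df`, and so on.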

In [78]:
maui_df.head(1)

Unnamed: 0,id,host_id,host_name,availability_365,price,latitude,longitude,number_of_reviews,review_scores_rating,minimum_nights,...,review_scores_value,instant_bookable,accommodates,bathrooms,bedrooms,beds,neighbourhood_cleansed,neighbourhood_group_cleansed,room_type,calculated_host_listings_count
4,84405,461037,Crystal,298,687.0,20.99596,-156.66574,28,4.5,3,...,4.36,0,8,0.0,3.0,4.0,Lahaina,Maui,Entire home/apt,22


In [79]:
honolulu_df.head(1)

Unnamed: 0,id,host_id,host_name,availability_365,price,latitude,longitude,number_of_reviews,review_scores_rating,minimum_nights,...,review_scores_value,instant_bookable,accommodates,bathrooms,bedrooms,beds,neighbourhood_cleansed,neighbourhood_group_cleansed,room_type,calculated_host_listings_count
0,81566,442490,Susan,249,250.0,21.589247,-158.111008,260,4.67,4,...,4.62,0,4,0.0,2.0,2.0,North Shore Oahu,Honolulu,Entire home/apt,1


In [80]:
hawaii_df.head(1)

Unnamed: 0,id,host_id,host_name,availability_365,price,latitude,longitude,number_of_reviews,review_scores_rating,minimum_nights,...,review_scores_value,instant_bookable,accommodates,bathrooms,bedrooms,beds,neighbourhood_cleansed,neighbourhood_group_cleansed,room_type,calculated_host_listings_count
1,81582,442698,Elizabeth,69,119.0,19.43428,-155.21609,181,4.94,2,...,4.83,0,4,0.0,2.0,3.0,Puna,Hawaii,Entire home/apt,2


In [81]:
kauai_df.head(1)

Unnamed: 0,id,host_id,host_name,availability_365,price,latitude,longitude,number_of_reviews,review_scores_rating,minimum_nights,...,review_scores_value,instant_bookable,accommodates,bathrooms,bedrooms,beds,neighbourhood_cleansed,neighbourhood_group_cleansed,room_type,calculated_host_listings_count
11,91091,492687,Tony,365,657.0,22.2226,-159.4689,8,5.0,4,...,5.0,0,4,0.0,1.0,2.0,North Shore Kauai,Kauai,Private room,2


# Export the clean data to CSV files for use in the visualization app 👀

In [82]:
# make it into CSV
hawaiian_island_df.to_csv('hawaiian_islands_df.csv', index=False)


In [83]:
# CSV Data by island
maui_df.to_csv('maui_df.csv', index=False)
honolulu_df.to_csv('honolulu_df.csv', index=False)
hawaii_df.to_csv('hawaii_df.csv', index=False)
kauai_df.to_csv('kauai_df.csv', index=False)
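
As a rough sketch of the same export done in a loop, the snippet below writes one CSV per group into a temporary directory and reads one back. The toy frame and the temporary directory are assumptions for the example; the notebook itself writes to the working directory.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Toy stand-in for hawaiian_island_df.
toy = pd.DataFrame({
    "id": [81566, 81582, 84405, 91091],
    "neighbourhood_group_cleansed": ["Honolulu", "Hawaii", "Maui", "Kauai"],
    "price": [250.0, 119.0, 687.0, 657.0],
})

with tempfile.TemporaryDirectory() as tmp:
    out_dir = Path(tmp)

    # Write one file per group, named like the files above (e.g. maui_df.csv).
    for island, frame in toy.groupby("neighbourhood_group_cleansed"):
        frame.to_csv(out_dir / f"{island.lower()}_df.csv", index=False)

    written_files = sorted(p.name for p in out_dir.glob("*.csv"))

    # Read one file back to check that it round-trips.
    reloaded_maui = pd.read_csv(out_dir / "maui_df.csv")

print(written_files)
print(reloaded_maui.columns.tolist())
```

Because of `index=False`, the reloaded file has no extra `Unnamed: 0` index column, which keeps the CSVs clean for the visualization app.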