# AirBnB NY Locations Data Case Study

In this final project, your task is to use the data provided to find evidence that answers the following questions.

1. Which hosts are the busiest and why?
2. How many neighborhood groups are available and which shows up the most?
3. Are private rooms the most popular in Manhattan?
4. Which hosts are the busiest based on their reviews?
5. Which neighborhood group has the highest average price?
6. Which neighborhood group has the highest total price?
7. Which top 5 hosts have the highest total price?
8. Who currently has no (zero) availability with a review count of 100 or more?
9. What host has the highest total of prices and where are they located?
10. When did Danielle from Queens last receive a review?

You will be given **4 hours** to complete this assignment.
**Be advised:** I will go dark for the entire assignment period. Any questions you would like to ask about the data or the project **MUST** be asked before the time starts. Once the time has started, I can no longer give information.

This is to simulate what you will face when you are out in the wild.

Happy Coding!

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

air_bnb = pd.read_csv('AB_NYC_2019.csv')
air_bnb.head(20)

Unnamed: 0,id,name,host_id,host_name,neighbourhood_group,neighbourhood,latitude,longitude,room_type,price,minimum_nights,number_of_reviews,last_review,reviews_per_month,calculated_host_listings_count,availability_365
0,2539,Clean & quiet apt home by the park,2787,John,Brooklyn,Kensington,40.64749,-73.97237,Private room,149,1,9,2018-10-19,0.21,6,365
1,2595,Skylit Midtown Castle,2845,Jennifer,Manhattan,Midtown,40.75362,-73.98377,Entire home/apt,225,1,45,2019-05-21,0.38,2,355
2,3647,THE VILLAGE OF HARLEM....NEW YORK !,4632,Elisabeth,Manhattan,Harlem,40.80902,-73.9419,Private room,150,3,0,,,1,365
3,3831,Cozy Entire Floor of Brownstone,4869,LisaRoxanne,Brooklyn,Clinton Hill,40.68514,-73.95976,Entire home/apt,89,1,270,2019-07-05,4.64,1,194
4,5022,Entire Apt: Spacious Studio/Loft by central park,7192,Laura,Manhattan,East Harlem,40.79851,-73.94399,Entire home/apt,80,10,9,2018-11-19,0.1,1,0
5,5099,Large Cozy 1 BR Apartment In Midtown East,7322,Chris,Manhattan,Murray Hill,40.74767,-73.975,Entire home/apt,200,3,74,2019-06-22,0.59,1,129
6,5121,BlissArtsSpace!,7356,Garon,Brooklyn,Bedford-Stuyvesant,40.68688,-73.95596,Private room,60,45,49,2017-10-05,0.4,1,0
7,5178,Large Furnished Room Near B'way,8967,Shunichi,Manhattan,Hell's Kitchen,40.76489,-73.98493,Private room,79,2,430,2019-06-24,3.47,1,220
8,5203,Cozy Clean Guest Room - Family Apt,7490,MaryEllen,Manhattan,Upper West Side,40.80178,-73.96723,Private room,79,2,118,2017-07-21,0.99,1,0
9,5238,Cute & Cozy Lower East Side 1 bdrm,7549,Ben,Manhattan,Chinatown,40.71344,-73.99037,Entire home/apt,150,1,160,2019-06-09,1.33,4,188


In [2]:
# Which hosts are the busiest and why?

host_reviews = air_bnb.groupby(['host_id', 'host_name'])['number_of_reviews'].sum().reset_index()

busiest_hosts = host_reviews.sort_values(by='number_of_reviews', ascending=False)

print("Top 5 busiest hosts:")
busiest_hosts.head()

Top 5 busiest hosts:


Unnamed: 0,host_id,host_name,number_of_reviews
21304,37312959,Maya,2273
1052,344035,Brooklyn& Breakfast -Len-,2205
18626,26432133,Danielle,2017
20872,35524316,Yasu & Akiko,1971
21921,40176101,Brady,1818
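
The review totals above show who is busiest but not why. One way to probe the "why", sketched here on a small hypothetical frame that only borrows the dataset's column names (all values are made up for illustration), is to set each host's review total beside their listing count and average minimum stay:

```python
import pandas as pd

# Hypothetical sample rows; only the column names match AB_NYC_2019.csv.
sample = pd.DataFrame({
    "host_id": [1, 1, 2, 3, 3, 3],
    "host_name": ["Ann", "Ann", "Bo", "Cy", "Cy", "Cy"],
    "number_of_reviews": [120, 80, 150, 10, 20, 30],
    "minimum_nights": [1, 2, 1, 30, 30, 30],
})

# One row per host: total reviews, number of listings, average minimum stay.
host_profile = (
    sample.groupby(["host_id", "host_name"])
    .agg(
        total_reviews=("number_of_reviews", "sum"),
        listings=("number_of_reviews", "size"),
        avg_min_nights=("minimum_nights", "mean"),
    )
    .reset_index()
    .sort_values("total_reviews", ascending=False)
)

# Reviews per listing separates "many listings" from "very active listings".
host_profile["reviews_per_listing"] = (
    host_profile["total_reviews"] / host_profile["listings"]
)
print(host_profile)
```

Applied to `air_bnb`, a profile like this may help distinguish hosts who are busy because they run many listings from hosts whose few listings turn over quickly, for example because of short minimum stays.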


In [4]:
# How many neighborhood groups are available and which shows up the most?
unique_neighborhood_groups = air_bnb['neighbourhood_group'].nunique()
print(f'Number of unique neighborhood groups: {unique_neighborhood_groups}')

group_counts = air_bnb['neighbourhood_group'].value_counts()
most_common_group = group_counts.idxmax()
most_common_group_count = group_counts.max()
print(f'The most common neighborhood group is: {most_common_group} with {most_common_group_count} listings.')

Number of unique neighborhood groups: 5
The most common neighborhood group is: Manhattan with 21661 listings.


In [5]:
# Are private rooms the most popular in Manhattan?
manhattan_listings = air_bnb[air_bnb['neighbourhood_group']=='Manhattan']
room_type_counts = manhattan_listings['room_type'].value_counts()
print(room_type_counts)

most_popular_room_type = room_type_counts.idxmax()
most_popular_room_type_count = room_type_counts.max()
print(f'The most popular room type in Manhattan is: {most_popular_room_type} with {most_popular_room_type_count} listings.' )

room_type
Entire home/apt    13199
Private room        7982
Shared room          480
Name: count, dtype: int64
The most popular room type in Manhattan is: Entire home/apt with 13199 listings.


In [6]:
# Which hosts are the busiest based on their reviews?

host_reviews = air_bnb.groupby('host_id')['number_of_reviews'].sum()
busiest_host_id = host_reviews.idxmax()
busiest_host_reviews = host_reviews.max()

busiest_host_info = air_bnb.loc[air_bnb['host_id'] == busiest_host_id, ['host_id', 'host_name']]

print(f"The busiest host is {busiest_host_info['host_name'].iloc[0]} with {busiest_host_reviews} reviews. ")

The busiest host is Maya with 2273 reviews. 


In [7]:
# Which neighborhood group has the highest average price?
neighbourhood_group_prices = air_bnb.groupby('neighbourhood_group')['price'].mean()
highest_avg_price_neighbourhood_group=neighbourhood_group_prices.idxmax()
highest_avg_price = neighbourhood_group_prices.max()

print(f"The neighbourhood group with the highest average price is {highest_avg_price_neighbourhood_group} with an average price of ${highest_avg_price:.2f}.")

The neighbourhood group with the highest average price is Manhattan with an average price of $196.88.


In [8]:
# Which neighborhood group has the highest total price?
neighbourhood_group_total_price = air_bnb.groupby('neighbourhood_group')['price'].sum()
highest_total_price_neighbourhood_group=neighbourhood_group_total_price.idxmax()
highest_total_price = neighbourhood_group_total_price.max()

print(f"The neighbourhood group with the highest total price is {highest_total_price_neighbourhood_group} with a total price of ${highest_total_price:.2f}.")

The neighbourhood group with the highest total price is Manhattan with a total price of $4264527.00.


In [9]:
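# Which top 5 hosts have the highest total price?
# Here "total price" is taken as price * minimum_nights, i.e. the cost of the shortest allowed stay.
air_bnb['total_price'] = air_bnb['price'] * air_bnb['minimum_nights']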


host_total_price = air_bnb.groupby(['host_id', 'host_name'])['total_price'].sum().reset_index()


top_hosts_by_price = host_total_price.sort_values(by='total_price', ascending=False)

print("Top 5 hosts with the highest total price:")
top_hosts_by_price.head()

Top 5 hosts with the highest total price:


Unnamed: 0,host_id,host_name,total_price
29393,107434423,Blueground,2258580
37288,271248669,Jenny,1170000
19564,30283594,Kara,1164243
16306,20582832,Kathrine,1000000
6312,3906464,Amy,989901
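
The cell above treats total price as `price * minimum_nights`, while question 9 sums the nightly `price` directly, so the two rankings are not directly comparable. A minimal sketch on a hypothetical three-listing frame (illustrative values, not taken from the dataset) shows how the two definitions can disagree about the top host:

```python
import pandas as pd

# Hypothetical listings; values are illustrative only.
toy = pd.DataFrame({
    "host_id": [10, 10, 20],
    "price": [100, 150, 200],
    "minimum_nights": [1, 2, 30],
})

# Definition A: sum of nightly prices across a host's listings.
nightly_total = toy.groupby("host_id")["price"].sum()

# Definition B: sum of price * minimum_nights (cheapest possible booking per listing).
min_booking_total = (
    (toy["price"] * toy["minimum_nights"]).groupby(toy["host_id"]).sum()
)

comparison = pd.DataFrame({
    "nightly_total": nightly_total,
    "min_booking_total": min_booking_total,
})
print(comparison)
```

In this toy data host 10 leads on nightly totals while host 20 leads once minimum stays are factored in, which is why it helps to state the chosen definition alongside each answer.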


In [10]:
# Who currently has no (zero) availability with a review count of 100 or more?
zero_availability_high_reviews = air_bnb[(air_bnb['availability_365'] == 0) & (air_bnb['number_of_reviews']>= 100)]

result = zero_availability_high_reviews[['host_id', 'host_name', 'number_of_reviews', 'availability_365']]

print("Hosts with zero availability and 100 or more reviews:")
result

Hosts with zero availability and 100 or more reviews:


Unnamed: 0,host_id,host_name,number_of_reviews,availability_365
8,7490,MaryEllen,118,0
94,79402,Christiana,168,0
132,129352,Sol,193,0
174,193722,Coral,114,0
180,67778,Doug,206,0
...,...,...,...,...
29581,127740507,Kathleen,103,0
30461,176185168,Janet,119,0
31250,21074914,Albert,102,0
32670,40119874,Stephany,131,0
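
The table above has one row per listing, so a host with several qualifying listings would appear more than once. If the question is read as asking for distinct hosts, one possible approach is `drop_duplicates`, sketched here on hypothetical rows that reuse the dataset's column names:

```python
import pandas as pd

# Hypothetical rows; only the column names match the dataset.
listings = pd.DataFrame({
    "host_id": [1, 1, 2, 3],
    "host_name": ["Ann", "Ann", "Bo", "Cy"],
    "number_of_reviews": [150, 120, 99, 300],
    "availability_365": [0, 0, 0, 10],
})

# Same filter as the cell above: zero availability and at least 100 reviews.
mask = (listings["availability_365"] == 0) & (listings["number_of_reviews"] >= 100)

# Keep one row per host instead of one row per listing.
unique_hosts = (
    listings.loc[mask, ["host_id", "host_name"]]
    .drop_duplicates()
    .reset_index(drop=True)
)
print(unique_hosts)
```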


In [11]:
# What host has the highest total of prices and where are they located?
host_total_price = air_bnb.groupby('host_id')['price'].sum()

highest_total_price_host_id = host_total_price.idxmax()
highest_total_price = host_total_price.max()

# Note: .iloc[0] reports the location of this host's first listing only.
highest_total_price_host_info = air_bnb[air_bnb['host_id']==highest_total_price_host_id].iloc[0]

host_name = highest_total_price_host_info['host_name']
neighbourhood_group = highest_total_price_host_info['neighbourhood_group']
neighbourhood = highest_total_price_host_info['neighbourhood']

print(f"The host with the highest total price is {host_name} with a total price of ${highest_total_price:.2f}. ")
print(f"They are located in {neighbourhood}, {neighbourhood_group}. ")

The host with the highest total price is Sonder (NYC) with a total price of $82795.00. 
They are located in Financial District, Manhattan. 
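
Because the location above comes from the host's first listing only, a host with many listings may in fact be spread over several areas. A hedged alternative, shown on a hypothetical frame with made-up values, is to count the top host's listings per neighbourhood group:

```python
import pandas as pd

# Hypothetical rows; only the column names match the dataset.
rows = pd.DataFrame({
    "host_id": [7, 7, 7, 8],
    "neighbourhood_group": ["Manhattan", "Manhattan", "Brooklyn", "Queens"],
    "price": [200, 300, 100, 500],
})

# Host with the highest summed nightly price.
top_host_id = rows.groupby("host_id")["price"].sum().idxmax()

# Number of that host's listings in each neighbourhood group.
locations = rows.loc[
    rows["host_id"] == top_host_id, "neighbourhood_group"
].value_counts()
print(locations)
```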


In [12]:
# When did Danielle from Queens last receive a review?

air_bnb['last_review'] = pd.to_datetime(air_bnb['last_review'], errors='coerce')

danielle_queens_listings = air_bnb[(air_bnb['host_name']== 'Danielle') & (air_bnb['neighbourhood_group']=='Queens')]

last_review_date = danielle_queens_listings['last_review'].max()

print(f"Danielle from Queens last received a review on: {last_review_date.date()} ")

Danielle from Queens last received a review on: 2019-07-08 
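
Host names are not unique, so the filter above could match more than one Danielle in Queens. As a sketch under that assumption (the rows below are hypothetical), grouping by `host_id` reports the latest review for each matching host separately; `max()` skips missing dates:

```python
import pandas as pd

# Hypothetical rows; only the column names match the dataset.
reviews = pd.DataFrame({
    "host_id": [1, 1, 2],
    "host_name": ["Danielle", "Danielle", "Danielle"],
    "neighbourhood_group": ["Queens", "Queens", "Queens"],
    "last_review": ["2019-07-08", None, "2018-01-15"],
})
reviews["last_review"] = pd.to_datetime(reviews["last_review"], errors="coerce")

mask = (reviews["host_name"] == "Danielle") & (
    reviews["neighbourhood_group"] == "Queens"
)

# Latest review date per distinct host; NaT values are ignored by max().
last_by_host = reviews[mask].groupby("host_id")["last_review"].max()
print(last_by_host)
```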


## Further Questions

1. Which host has the most listings?

In [13]:
host_listings_count = air_bnb.groupby('host_id').size()

most_listings_host_id = host_listings_count.idxmax()
most_listings_count = host_listings_count.max()

most_listings_host_info = air_bnb[air_bnb['host_id'] == most_listings_host_id].iloc[0]

host_name = most_listings_host_info['host_name']

print(f"The host with the most listings is {host_name} with {most_listings_count} listings.")


The host with the most listings is Sonder (NYC) with 327 listings.


2. How many listings have completely open availability?

In [14]:
completely_open_listings = air_bnb[air_bnb['availability_365'] == 365]

num_completely_open_listings = completely_open_listings.shape[0]

print(f"The number of listings with completely open availability is: {num_completely_open_listings}")

The number of listings with completely open availability is: 1295


3. What room_types have the highest review numbers?

In [15]:
reviews_by_room_type = air_bnb.groupby('room_type')['number_of_reviews'].sum()

room_type_with_most_reviews = reviews_by_room_type.idxmax()
max_reviews = reviews_by_room_type.max()

print(f"The room type with the highest number of reviews is: {room_type_with_most_reviews} with {max_reviews} reviews.")

The room type with the highest number of reviews is: Entire home/apt with 580403 reviews.


# Final Conclusion

In this cell, write your final conclusion for each of the questions asked.

Also, if you uncovered any further details that were not asked about above, please describe them here.

-- Add your conclusion --

In [None]:
# 1.Which hosts are the busiest and why?
#
#
# 2.How many neighborhood groups are available and which shows up the most?
#
#
# 3.Are private rooms the most popular in Manhattan?
#
#
# 4.Which hosts are the busiest based on their reviews?
#
#
# 5.Which neighborhood group has the highest average price?
#
#
# 6.Which neighborhood group has the highest total price?
#
#
# 7.Which top 5 hosts have the highest total price?
#
#
# 8.Who currently has no (zero) availability with a review count of 100 or more?
#
#
# 9.What host has the highest total of prices and where are they located?
#
#
# 10.When did Danielle from Queens last receive a review?
#
#

