# AirBnB NY Locations Data Case Study

In this project, your task is to use the data provided to find evidence that answers the following questions.

1. Which hosts are the busiest and why?
2. How many neighborhood groups are available and which shows up the most?
3. Are private rooms the most popular in Manhattan?
4. Which hosts are the busiest based on their reviews?
5. Which neighborhood group has the highest average price?
6. Which neighborhood group has the highest total price?
7. Which top 5 hosts have the highest total price?
8. Who currently has no (zero) availability with a review count of 100 or more?
9. What host has the highest total of prices and where are they located?
10. When did Danielle from Queens last receive a review?



In [5]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

In [87]:
air_bnb = pd.read_csv('AB_NYC_2019.csv')
air_bnb.head()

Unnamed: 0,id,name,host_id,host_name,neighbourhood_group,neighbourhood,latitude,longitude,room_type,price,minimum_nights,number_of_reviews,last_review,reviews_per_month,calculated_host_listings_count,availability_365
0,2539,Clean & quiet apt home by the park,2787,John,Brooklyn,Kensington,40.64749,-73.97237,Private room,149,1,9,2018-10-19,0.21,6,365
1,2595,Skylit Midtown Castle,2845,Jennifer,Manhattan,Midtown,40.75362,-73.98377,Entire home/apt,225,1,45,2019-05-21,0.38,2,355
2,3647,THE VILLAGE OF HARLEM....NEW YORK !,4632,Elisabeth,Manhattan,Harlem,40.80902,-73.9419,Private room,150,3,0,,,1,365
3,3831,Cozy Entire Floor of Brownstone,4869,LisaRoxanne,Brooklyn,Clinton Hill,40.68514,-73.95976,Entire home/apt,89,1,270,2019-07-05,4.64,1,194
4,5022,Entire Apt: Spacious Studio/Loft by central park,7192,Laura,Manhattan,East Harlem,40.79851,-73.94399,Entire home/apt,80,10,9,2018-11-19,0.1,1,0


In [100]:
# How many neighborhood groups are available and which shows up the most?
groupby_neighbourhood=air_bnb.groupby('neighbourhood_group', as_index=False)
groupby_neighbourhood.count()[['neighbourhood_group', 'neighbourhood']].sort_values(['neighbourhood'], ascending=False)

Unnamed: 0,neighbourhood_group,neighbourhood
2,Manhattan,21661
1,Brooklyn,20104
3,Queens,5666
0,Bronx,1091
4,Staten Island,373


There are 5 neighbourhood groups. Manhattan shows up the most, with 21,661 listings.
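The same two numbers can also be read off without a `groupby`. The sketch below uses `nunique()` and `value_counts()` on a tiny made-up sample (the `sample` frame is hypothetical and stands in for `air_bnb`), so treat it as an illustration of the idea rather than the notebook's actual result.

```python
import pandas as pd

# Hypothetical stand-in for air_bnb; only the column name matches the real data.
sample = pd.DataFrame({
    "neighbourhood_group": ["Manhattan", "Manhattan", "Brooklyn", "Queens"],
})

# Number of distinct neighbourhood groups.
n_groups = sample["neighbourhood_group"].nunique()

# Listings per group, already sorted from most to least frequent.
counts = sample["neighbourhood_group"].value_counts()

print(n_groups)
print(counts)
```

On the real data, replacing `sample` with `air_bnb` should give the same counts as the table above.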

In [98]:
# Are private rooms the most popular in manhattan?
groupby_neighbourhood=air_bnb.groupby(['neighbourhood_group', 'room_type'], as_index=True)
groupby_neighbourhood[['id']].count()

neighbourhood_group,room_type,id
Bronx,Entire home/apt,379
Bronx,Private room,652
Bronx,Shared room,60
Brooklyn,Entire home/apt,9559
Brooklyn,Private room,10132
Brooklyn,Shared room,413
Manhattan,Entire home/apt,13199
Manhattan,Private room,7982
Manhattan,Shared room,480
Queens,Entire home/apt,2096


Private rooms are not the most popular in Manhattan. Entire homes/apartments are, with 13,199 listings against 7,982 private rooms.

In [90]:
# Which hosts are the busiest based on their reviews?
groupby_host_id=air_bnb.groupby('host_id', as_index=True)
groupby_host_id[['reviews_per_month']].count().sort_values(['reviews_per_month'], ascending=False).head()

host_id,reviews_per_month
219517861,207
61391963,79
16098958,61
137358866,51
7503643,49


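Host 219517861 is the busiest by this measure: 207 of its listings have a `reviews_per_month` value. Note that `count()` tallies non-null entries, so 207 is a number of reviewed listings, not a reviews-per-month rate.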
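If "busiest" is meant as review volume rather than the number of reviewed listings, summing per host may be closer to the intent. This is a minimal sketch under that assumption; the helper name `busiest_hosts` and the `sample` frame are hypothetical, and only the column names come from the dataset.

```python
import pandas as pd

def busiest_hosts(df: pd.DataFrame, n: int = 5) -> pd.DataFrame:
    """Rank hosts by review activity summed across all of their listings."""
    totals = (
        df.groupby("host_id")[["number_of_reviews", "reviews_per_month"]]
        .sum()  # missing reviews_per_month values are skipped by sum()
        .sort_values("number_of_reviews", ascending=False)
    )
    return totals.head(n)

# Tiny made-up sample, only to show the shape of the result.
sample = pd.DataFrame({
    "host_id": [1, 1, 2, 3],
    "number_of_reviews": [10, 5, 40, 0],
    "reviews_per_month": [0.5, 0.25, 2.0, None],
})

top = busiest_hosts(sample)
print(top)
```

Calling `busiest_hosts(air_bnb)` would apply the same ranking to the real data.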

In [103]:
# Which neighbourhood group has the highest average price?
groupby_neighbourhood=air_bnb.groupby(['neighbourhood_group'], as_index=True)
groupby_neighbourhood[['price']].mean().round(decimals=2).sort_values(['price'], ascending=False)

neighbourhood_group,price
Manhattan,196.88
Brooklyn,124.38
Staten Island,114.81
Queens,99.52
Bronx,87.5


Manhattan has the highest average price, at $196.88.

In [172]:
# Which neighbourhood group has the highest total price?
air_bnb.sort_values(['price'], ascending=False)[['neighbourhood_group','price']].head()

Unnamed: 0,neighbourhood_group,price
9151,Queens,10000
17692,Brooklyn,10000
29238,Manhattan,10000
40433,Manhattan,9999
12342,Manhattan,9999


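Queens, Brooklyn, and Manhattan each contain a listing at the maximum price of $10,000. This ranks individual listings, so it shows the highest single price rather than a total per neighbourhood group.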
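A total per group needs the prices summed within each `neighbourhood_group` before ranking. The sketch below shows that on made-up numbers (the `sample` frame and the helper name `total_price_by_group` are hypothetical), chosen so that the group with the single most expensive listing is not the group with the highest total.

```python
import pandas as pd

def total_price_by_group(df: pd.DataFrame) -> pd.Series:
    """Sum listing prices within each neighbourhood group, largest first."""
    return (
        df.groupby("neighbourhood_group")["price"]
        .sum()
        .sort_values(ascending=False)
    )

# Made-up prices: Queens has the priciest single listing,
# but Manhattan has the larger sum.
sample = pd.DataFrame({
    "neighbourhood_group": ["Queens", "Manhattan", "Manhattan", "Brooklyn"],
    "price": [10000, 6000, 5000, 9000],
})

totals = total_price_by_group(sample)
print(totals)
```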

In [187]:
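# Which top 5 hosts have the highest total price?
# head(7) keeps the hosts tied at $9,999 visible; 'neighbourhood' is included to match the output below.
air_bnb.sort_values(['price'], ascending=False)[['host_id','price','neighbourhood']].head(7)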

Unnamed: 0,host_id,price,neighbourhood
9151,20582832,10000,Astoria
17692,5143901,10000,Greenpoint
29238,72390391,10000,Upper West Side
40433,4382127,9999,Lower East Side
12342,3906464,9999,Lower East Side
6530,1235070,9999,East Harlem
30268,18128455,8500,Tribeca


Ranked by their most expensive listing, the top hosts are 20582832, 5143901, 72390391, 4382127, and 3906464 (host 1235070 is tied at $9,999). These are individual listing prices, not totals summed per host.
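To rank hosts by the sum of all their listing prices instead, one option is `groupby` followed by `nlargest`. This is a sketch on made-up data; `top_hosts_by_total_price` and `sample` are hypothetical names introduced for illustration.

```python
import pandas as pd

def top_hosts_by_total_price(df: pd.DataFrame, n: int = 5) -> pd.Series:
    """Return the n hosts with the largest summed listing price."""
    return df.groupby("host_id")["price"].sum().nlargest(n)

# Made-up listings: host 1 owns two of them.
sample = pd.DataFrame({
    "host_id": [1, 1, 2, 3],
    "price": [300, 400, 500, 100],
})

top_two = top_hosts_by_total_price(sample, n=2)
print(top_two)
```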

In [183]:
# Who currently has no (zero) availability with a review count of 100 or more?
l1 = air_bnb[air_bnb['availability_365'] == 0][['id', 'availability_365' ,'number_of_reviews']]
l2 = air_bnb[air_bnb['number_of_reviews'] > 100][['id', 'availability_365' ,'number_of_reviews']]
joined = l1.merge(l2, on = 'id', how='inner')
joined

Unnamed: 0,id,availability_365_x,number_of_reviews_x,availability_365_y,number_of_reviews_y
0,5203,0,118,0,118
1,20913,0,168,0,168
2,30031,0,193,0,193
3,44221,0,114,0,114
4,45556,0,206,0,206
...,...,...,...,...,...
153,22705516,0,103,0,103
154,23574142,0,119,0,119
155,24267706,0,102,0,102
156,25719044,0,131,0,131
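The self-merge above works, but it duplicates every column (`_x` and `_y`), and `> 100` leaves out listings with exactly 100 reviews even though the question says "100 or more". A single boolean mask is a simpler alternative. The sketch below assumes the same column names; the helper name `booked_out_popular` and the `sample` frame are hypothetical.

```python
import pandas as pd

def booked_out_popular(df: pd.DataFrame, min_reviews: int = 100) -> pd.DataFrame:
    """Listings with zero availability and at least min_reviews reviews."""
    mask = (df["availability_365"] == 0) & (df["number_of_reviews"] >= min_reviews)
    columns = ["id", "host_id", "host_name", "availability_365", "number_of_reviews"]
    return df.loc[mask, columns]

# Made-up listings covering the boundary case of exactly 100 reviews.
sample = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "host_id": [11, 12, 13, 14],
    "host_name": ["A", "B", "C", "D"],
    "availability_365": [0, 0, 10, 0],
    "number_of_reviews": [100, 99, 250, 180],
})

result = booked_out_popular(sample)
print(result)
```

Including `host_id` and `host_name` in the output answers the "who" part of the question directly.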


In [195]:
# What host has the highest total of prices and where are they located?
hi=air_bnb.sort_values(['price'], ascending=False)[['host_id','price', 'neighbourhood']]
groupby_host_id=hi.groupby('host_id', as_index=True)
groupby_host_id.head(5)

Unnamed: 0,host_id,price,neighbourhood
9151,20582832,10000,Astoria
17692,5143901,10000,Greenpoint
29238,72390391,10000,Upper West Side
40433,4382127,9999,Lower East Side
12342,3906464,9999,Lower East Side
...,...,...,...
23161,8993084,0,Bedford-Stuyvesant
25794,86327101,0,Bedford-Stuyvesant
25778,10132166,0,Williamsburg
25796,86327101,0,Bedford-Stuyvesant
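Note that `groupby(...).head(5)` returns up to five rows per host rather than a per-host total, so the table above is still ordered by individual listing price. One way to get the host with the highest summed price, together with where their listings are, is sketched below. The helper name `top_host_and_locations` and the `sample` frame are hypothetical, and the numbers are made up.

```python
import pandas as pd

def top_host_and_locations(df: pd.DataFrame):
    """Find the host with the largest summed price and list their locations."""
    totals = df.groupby("host_id")["price"].sum()
    top_host = totals.idxmax()  # host_id with the highest total
    listings = df[df["host_id"] == top_host]
    locations = listings[["neighbourhood_group", "neighbourhood"]].drop_duplicates()
    return top_host, totals.loc[top_host], locations

# Made-up listings: host 2 has the highest total across two neighbourhoods.
sample = pd.DataFrame({
    "host_id": [1, 2, 2],
    "price": [10000, 6000, 5000],
    "neighbourhood_group": ["Queens", "Manhattan", "Manhattan"],
    "neighbourhood": ["Astoria", "Midtown", "Harlem"],
})

host, total, where = top_host_and_locations(sample)
print(host, total)
print(where)
```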


In [210]:
# When did Danielle from Queens last receive a review?
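# Each condition indexes the DataFrame and is wrapped in parentheses.
# Queens is a value of 'neighbourhood_group', not 'neighbourhood'.
danielle = air_bnb[(air_bnb['host_name'] == 'Danielle') & (air_bnb['neighbourhood_group'] == 'Queens')]
danielle[['host_id', 'host_name', 'neighbourhood', 'last_review']]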
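More than one listing (or more than one host) may match the name Danielle in Queens, and `last_review` is stored as text with missing values, so it can help to parse the dates and take the latest one. The sketch below assumes Queens is matched against `neighbourhood_group`; the helper name `last_review_for` and the dates in `sample` are made up for illustration.

```python
import pandas as pd

def last_review_for(df: pd.DataFrame, host_name: str, group: str) -> pd.Timestamp:
    """Latest last_review date among a host's listings in a neighbourhood group."""
    mask = (df["host_name"] == host_name) & (df["neighbourhood_group"] == group)
    # Unparseable or missing dates become NaT and are ignored by max().
    dates = pd.to_datetime(df.loc[mask, "last_review"], errors="coerce")
    return dates.max()

# Made-up rows; the dates are not taken from the real dataset.
sample = pd.DataFrame({
    "host_name": ["Danielle", "Danielle", "Danielle", "John"],
    "neighbourhood_group": ["Queens", "Queens", "Brooklyn", "Queens"],
    "last_review": ["2019-06-01", None, "2019-07-01", "2019-07-05"],
})

latest = last_review_for(sample, "Danielle", "Queens")
print(latest)
```

If several hosts share the name, grouping the filtered rows by `host_id` before taking the maximum would keep them apart.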

## Further Questions

1. Which host has the most listings?

In [156]:
groupby_host_id=air_bnb.groupby('host_id', as_index=True)
groupby_host_id[['id']].count().sort_values(['id'], ascending=False).head()

host_id,id
219517861,327
107434423,232
30283594,121
137358866,103
16098958,96


Host 219517861 has the most listings, with 327.

2. How many listings have completely open availability?

In [145]:
full_availability = air_bnb[air_bnb['availability_365'] == 365]['id'].count()
print("there are", full_availability, "listings with full availability")

there are 1295 listings with full availability


3. What room_types have the highest review numbers?

In [159]:
groupby_room_type=air_bnb.groupby('room_type', as_index=True)
groupby_room_type[['number_of_reviews']].sum().sort_values(['number_of_reviews'], ascending=False).head()

room_type,number_of_reviews
Entire home/apt,580403
Private room,538346
Shared room,19256


Entire homes/apartments have the highest total number of reviews: 580,403.

# Final Conclusion

In this cell, write your final conclusion for each of the questions asked.

Also, if you uncovered any further details that were not asked about above, please describe them here.

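The conclusion is that renting out an entire home/apartment or a private room is the best way to rent out your home. These two room types collect by far the most reviews (580,403 and 538,346, against 19,256 for shared rooms), which suggests that more people are likely to stay there.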