# AirBnB NY Locations Data Case Study

In this final project, your task is to take the data provided and find evidence to answer the following questions.

1. Which hosts are the busiest and why?
2. How many neighborhood groups are available and which shows up the most?
3. Are private rooms the most popular in Manhattan?
4. Which hosts are the busiest based on their reviews?
5. Which neighborhood group has the highest average price?
6. Which neighborhood group has the highest total price?
7. Which top 5 hosts have the highest total price?
8. Who currently has no (zero) availability with a review count of 100 or more?
9. What host has the highest total of prices and where are they located?
10. When did Danielle from Queens last receive a review?

You will be given **4 hours** to complete this assignment.
**Be advised:** I will go dark for the entire assignment period. Any questions you would like to ask about the data or the project **MUST** be asked before the time starts. Once the time has started, I can no longer give information.

This is to simulate what you will face when you are out in the wild.

Happy Coding!

In [2]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

In [267]:
air_bnb = pd.read_csv('files/airbnbdata.csv')


In [162]:
# How many neighborhood groups are available and which shows up the most?

# Count listings per neighbourhood group, most frequent first (NaN values are ignored)
group_counts = air_bnb['neighbourhood_group'].value_counts()
groups = group_counts.size           # number of distinct groups
most_common = group_counts.idxmax()  # group with the most listings

print(f'There are {groups} neighborhood groups total and {most_common} shows up the most.')

There are 8 neighborhood groups total and Brooklyn shows up the most.


In [163]:
# Are private rooms the most popular in Manhattan?

manhattan = air_bnb[air_bnb['neighbourhood_group'] == 'Manhattan']
manhattan[['id','name','room_type']].groupby('room_type', as_index = False).count().sort_values('id', ascending = False)



Unnamed: 0,room_type,id,name
0,Entire home/apt,291,291
1,Private room,181,181
2,Shared room,6,6
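
The counts above can also be expressed as shares, which answers the question more directly. The following is a minimal sketch, assuming the same column names as the case-study data. The small `sample` frame is made up for illustration and stands in for `air_bnb`.

```python
import pandas as pd

# Made-up stand-in for air_bnb; only the two columns used here
sample = pd.DataFrame({
    'neighbourhood_group': ['Manhattan', 'Manhattan', 'Manhattan', 'Brooklyn'],
    'room_type': ['Entire home/apt', 'Entire home/apt', 'Private room', 'Private room'],
})

manhattan = sample[sample['neighbourhood_group'] == 'Manhattan']

# normalize=True returns each room type's fraction of the Manhattan listings
shares = manhattan['room_type'].value_counts(normalize=True)
top_room_type = shares.idxmax()

print(shares)
print(f'Most common room type in Manhattan: {top_room_type}')
```

With the counts shown above (291 entire homes, 181 private rooms, 6 shared rooms), private rooms come second in Manhattan. Note that listing counts are only a proxy for popularity.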


In [200]:
# Which hosts are the busiest based on their reviews?


air_bnb.groupby(['host_id','number_of_reviews'], as_index=False).sum(numeric_only=True).sort_values('number_of_reviews', ascending = False)



Unnamed: 0,host_id,number_of_reviews,latitude,price,minimum_nights,reviews_per_month,calculated_host_listings_count,availability_365
461,21475,99,40.72019,120.0,2.0,1.04,1.0,345.0
1005,921500,99,40.82748,55.0,4.0,1.04,1.0,260.0
728,496164,98,40.71143,75.0,2.0,1.07,2.0,264.0
217,142147,98,40.70215,52.0,2.0,1.09,1.0,326.0
756,526805,98,40.66918,130.0,7.0,0.99,1.0,35.0
...,...,...,...,...,...,...,...,...
1010,933378,0,40.76739,90.0,1.0,0.00,1.0,0.0
285,1562045,0,40.78012,295.0,1.0,0.00,1.0,146.0
411,2006712,0,40.80192,125.0,1.0,0.00,1.0,365.0
858,703156,0,40.74249,200.0,4.0,0.00,1.0,0.0
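
The table above keeps one row per (`host_id`, `number_of_reviews`) pair rather than one row per host. It also tops out at 99, even though the zero-availability table further down shows a listing with 480 reviews. That would be consistent with the column being compared as text when this cell ran, though the notebook does not confirm it. The sketch below shows one possible way to total reviews per host after coercing the column to numbers. The `sample` frame and its values are made up.

```python
import pandas as pd

# Made-up rows; review counts are stored as strings on purpose
sample = pd.DataFrame({
    'host_id': [1, 1, 2, 3],
    'host_name': ['Ann', 'Ann', 'Bo', 'Cy'],
    'number_of_reviews': ['99', '120', '480', 'n/a'],
})

# Coerce to numbers; unparseable entries become NaN and are skipped by sum()
sample['number_of_reviews'] = pd.to_numeric(sample['number_of_reviews'], errors='coerce')

# One row per host, with reviews summed across all of that host's listings
reviews_per_host = (
    sample.groupby(['host_id', 'host_name'])['number_of_reviews']
    .sum()
    .sort_values(ascending=False)
)
print(reviews_per_host.head())
```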


In [216]:
# Which neighborhood group has the highest average price?
air_bnb.groupby('neighbourhood_group', as_index = False).mean(numeric_only=True).sort_values('price', ascending = False)[['neighbourhood_group','price']].head(1)

Unnamed: 0,neighbourhood_group,price
3,Manhattan,183.012552


In [123]:
# Which neighborhood group has the highest total price?
air_bnb.groupby('neighbourhood_group', as_index = False).sum(numeric_only=True).sort_values('price', ascending = False)

Unnamed: 0,neighbourhood_group,latitude,price,minimum_nights,reviews_per_month,calculated_host_listings_count,availability_365
3,Manhattan,19484.88651,87480.0,5029.0,405.05,707.0,81310.0
1,Brooklyn,20061.01286,72142.0,3880.0,448.99,1542.0,78509.0
4,Queens,1996.00837,4622.0,715.0,41.52,112.0,10973.0
6,Staten Island,528.05312,985.0,35.0,12.72,42.0,3584.0
0,Bronx,531.15045,921.0,57.0,18.82,40.0,4128.0
7,Williamsburg,-73.9529,30.0,30.0,1.0,91.0,0.0
2,East Harlem,-73.94466,3.0,25.0,2.0,127.0,0.0
5,SoHo,-73.99976,3.0,134.0,1.0,231.0,0.0
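
The last three rows (Williamsburg, East Harlem, SoHo) are neighbourhoods rather than boroughs, and their latitude values look like longitudes. This suggests that a few rows of the CSV have shifted columns. The sketch below shows one way to exclude such rows before aggregating, assuming the five NYC borough names are the only valid groups. The `sample` frame is made up.

```python
import pandas as pd

# The five NYC boroughs; assumed here to be the only valid neighbourhood groups
BOROUGHS = ['Bronx', 'Brooklyn', 'Manhattan', 'Queens', 'Staten Island']

# Made-up rows, including one misaligned row of the kind seen above
sample = pd.DataFrame({
    'neighbourhood_group': ['Manhattan', 'Brooklyn', 'Manhattan', 'SoHo'],
    'price': [200.0, 90.0, 150.0, 3.0],
})

# Keep only rows whose group is a real borough, then total the prices
clean = sample[sample['neighbourhood_group'].isin(BOROUGHS)]
total_price = (
    clean.groupby('neighbourhood_group')['price']
    .sum()
    .sort_values(ascending=False)
)
print(total_price)
```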


In [124]:
# Single listing with the highest price, in case the previous question was meant per listing
air_bnb.sort_values('price',ascending = False).head(1)

Unnamed: 0,id,name,host_id,host_name,neighbourhood_group,neighbourhood,latitude,longitude,room_type,price,minimum_nights,number_of_reviews,last_review,reviews_per_month,calculated_host_listings_count,availability_365
948,363673,Beautiful 3 bedroom in Manhattan,256239,Tracey,Manhattan,Upper West Side,40.80142,-73.96931,Private room,3000.0,7.0,0,,,1.0,365.0


In [285]:
# Which top 5 hosts have the highest total price?
highest_price = air_bnb.groupby('host_name',as_index = False).sum(numeric_only=True).sort_values('price', ascending = False)
highest_price.head()

Unnamed: 0,host_name,id,host_id,latitude,longitude,price,minimum_nights,number_of_reviews,reviews_per_month,calculated_host_listings_count,availability_365
631,The Box House Hotel,5821805,10855104.0,1059.1771,-1922.79083,6564,78,769,8.88,728,1730
643,Tracey,363673,256239.0,40.80142,-73.96931,3000,7,0,0.0,1,365
677,West Village,784644,1816389.0,122.19679,-222.00975,2300,12,312,3.37,12,839
247,Henry,215005,1008019.0,81.47557,-147.97268,2250,33,44,0.46,12,189
147,Daniel,1743405,8039115.0,407.30356,-739.73953,1966,44,838,8.77,13,1461
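
Grouping by `host_name` pools every host who shares a first name, so totals for common names such as Daniel or Henry may combine unrelated hosts. The sketch below groups by `host_id` as well, which is one way to keep hosts separate. The `sample` frame is made up for illustration.

```python
import pandas as pd

# Made-up rows: two different hosts share the name 'Daniel'
sample = pd.DataFrame({
    'host_id': [10, 10, 20, 30],
    'host_name': ['Daniel', 'Daniel', 'Daniel', 'Tracey'],
    'price': [100, 150, 400, 300],
})

# Grouping by name alone merges both Daniels into one total
by_name = sample.groupby('host_name')['price'].sum()

# Grouping by id (keeping the name for readability) separates them
by_host = (
    sample.groupby(['host_id', 'host_name'])['price']
    .sum()
    .sort_values(ascending=False)
)
print(by_name)
print(by_host.head(5))
```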


In [284]:
# Who currently has no (zero) availability with a review count of 100 or more?

no_avail = air_bnb[air_bnb['availability_365'] == 0]
high_reviews = no_avail.groupby(['name','host_name','number_of_reviews','availability_365'], as_index = False).sum(numeric_only=True).sort_values('number_of_reviews',ascending=False)
high_reviews[high_reviews['number_of_reviews']>=100][['host_name','name','number_of_reviews','availability_365']]


Unnamed: 0,host_name,name,number_of_reviews,availability_365
103,Wanda,LG Private Room/Family Friendly,480,0
124,James,"Luxury Williamsburg, Brooklyn LOFT",320,0
81,S,"Fresh, Clean Brooklyn Garden Apt.",233,0
80,Doug,"Fort Greene, Brooklyn: Center Bedroom",206,0
35,Jsun,Blue Room in Awesome Artist's Apartment!,205,0
130,Ori,"Modern, Large East Village Loft",205,0
135,Sol,NYC artists’ loft with roof deck,193,0
40,Christiana,Charming 1 bed GR8 WBurg LOCATION!,168,0
68,Elle,Decorators 5-Star Flat West Village,157,0
18,Lissette,B NYC Staten Alternative...,147,0
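
The same question can be answered with a single boolean mask and no `groupby`, which may be easier to read. This is a sketch on a made-up `sample` frame that uses the same column names as the data.

```python
import pandas as pd

# Made-up rows standing in for air_bnb
sample = pd.DataFrame({
    'host_name': ['A', 'B', 'C'],
    'name': ['Listing 1', 'Listing 2', 'Listing 3'],
    'number_of_reviews': [480, 50, 150],
    'availability_365': [0, 0, 30],
})

# Both conditions must hold: fully booked AND at least 100 reviews
mask = (sample['availability_365'] == 0) & (sample['number_of_reviews'] >= 100)
booked_out = (
    sample.loc[mask, ['host_name', 'name', 'number_of_reviews', 'availability_365']]
    .sort_values('number_of_reviews', ascending=False)
)
print(booked_out)
```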


In [189]:
# What host has the highest total of prices and where are they located?
highest_price = air_bnb.groupby(['host_name','neighbourhood'],as_index = False).sum(numeric_only=True).sort_values('price', ascending = False).head()
highest_price.head(1)[['host_name','neighbourhood']]

Unnamed: 0,host_name,neighbourhood
811,The Box House Hotel,Greenpoint


In [265]:
# When did Danielle from Queens last receive a review?
air_bnb[air_bnb['host_name']=='Danielle'][['host_name','last_review']]


Unnamed: 0,host_name,last_review
555,Danielle,4/30/2019
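
The filter above matches on the host name only. If the data held a Danielle outside Queens, or several listings for the same host, a stricter version would filter on both columns and take the latest date. The sketch below assumes dates in month/day/year format, as in the output above. The `sample` frame is made up.

```python
import pandas as pd

# Made-up rows standing in for air_bnb
sample = pd.DataFrame({
    'host_name': ['Danielle', 'Danielle', 'Danielle', 'Sam'],
    'neighbourhood_group': ['Queens', 'Queens', 'Brooklyn', 'Queens'],
    'last_review': ['4/30/2019', '12/1/2018', '6/15/2019', '5/2/2019'],
})

# Parse the strings into real dates so max() compares chronologically
sample['last_review'] = pd.to_datetime(sample['last_review'], format='%m/%d/%Y')

# Restrict to hosts named Danielle whose listing is in Queens
mask = (sample['host_name'] == 'Danielle') & (sample['neighbourhood_group'] == 'Queens')
latest = sample.loc[mask, 'last_review'].max()
print(latest.date())
```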


## Further Questions

1. Which host has the most listings?

In [254]:
air_bnb.groupby('host_name').count().sort_values('name', ascending = False).head(1)

host_name,id,name,host_id,neighbourhood_group,neighbourhood,latitude,longitude,room_type,price,minimum_nights,number_of_reviews,last_review,reviews_per_month,calculated_host_listings_count,availability_365
The Box House Hotel,26,26,26,26,26,26,26,26,26,26,26,26,26,26,26


2. How many listings have completely open availability?

In [219]:
air_bnb[air_bnb['availability_365']==365].count()['availability_365']

45

3. What room_types have the highest review numbers?

In [277]:
air_bnb.groupby(['room_type']).mean(numeric_only=True).sort_values('number_of_reviews',ascending=False)[['number_of_reviews','reviews_per_month']]

room_type,number_of_reviews,reviews_per_month
Private room,87.515294,1.037698
Shared room,77.571429,1.343333
Entire home/apt,73.356564,0.842714


# Final Conclusion

In this cell, write your final conclusion for each of the questions asked.

Also, if you uncovered more details that were not asked about above, please describe them here.

-- Add your conclusion --