# AirBnB NY Locations Data Case Study

In this final project, your task is to take the data provided and find evidence to answer the following questions.

1. Which hosts are the busiest and why?
2. How many neighborhood groups are available and which shows up the most?
3. Are private rooms the most popular in Manhattan?
4. Which hosts are the busiest based on their reviews?
5. Which neighborhood group has the highest average price?
6. Which neighborhood group has the highest total price?
7. Which top 5 hosts have the highest total price?
8. Who currently has no (zero) availability with a review count of 100 or more?
9. What host has the highest total of prices and where are they located?
10. When did Danielle from Queens last receive a review?

You will be given **4 hours** to complete this assignment.
**Be advised:** I will go dark for the entire assignment period. Any questions about the data or the project **MUST** be asked before the time starts. Once the time has started, I can no longer give information.

This is to simulate what you will face when you are out in the wild.

Happy Coding!

In [2]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

In [26]:
air_bnb = pd.read_csv('AB_NYC_2019.csv')
air_bnb.keys()


Index(['id', 'name', 'host_id', 'host_name', 'neighbourhood_group',
       'neighbourhood', 'latitude', 'longitude', 'room_type', 'price',
       'minimum_nights', 'number_of_reviews', 'last_review',
       'reviews_per_month', 'calculated_host_listings_count',
       'availability_365'],
      dtype='object')

In [27]:
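# How many neighborhood groups are available and which shows up the most?
# Select the column before aggregating so that text columns are not summed as well.
air_bnb.groupby('neighbourhood_group', as_index=False)['calculated_host_listings_count'].sum().sort_values('calculated_host_listings_count', ascending=False)

# There are 5 neighborhood groups. Manhattan has the largest summed calculated_host_listings_count.
# Caveat: this sum adds each host's listing total once per listing row, so it is not a count of listings per group.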

|   | neighbourhood_group | calculated_host_listings_count |
|---|---|---|
| 2 | Manhattan | 277073 |
| 1 | Brooklyn | 45925 |
| 3 | Queens | 23005 |
| 0 | Bronx | 2437 |
| 4 | Staten Island | 865 |
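
Summing `calculated_host_listings_count` adds a host's listing total once for every row that host owns, so hosts with many listings are counted many times over. A minimal sketch of counting rows per group instead, assuming one row per listing. The frame below is made up for illustration and is not taken from `AB_NYC_2019.csv`:

```python
import pandas as pd

# Toy data (invented): host 1 owns three listings, hosts 2 and 3 own one each.
toy = pd.DataFrame({
    "host_id": [1, 1, 1, 2, 3],
    "neighbourhood_group": ["Manhattan", "Manhattan", "Manhattan", "Brooklyn", "Brooklyn"],
    "calculated_host_listings_count": [3, 3, 3, 1, 1],
})

# One row per listing, so counting rows gives listings per group.
listings_per_group = toy["neighbourhood_group"].value_counts()

# Summing the per-host total counts host 1's three listings three times each.
summed = toy.groupby("neighbourhood_group")["calculated_host_listings_count"].sum()

# Number of distinct neighbourhood groups.
n_groups = toy["neighbourhood_group"].nunique()

print(listings_per_group)
print(summed)
print(n_groups)
```

On the real data, `air_bnb['neighbourhood_group'].value_counts()` would answer both halves of the question at once: its length is the number of groups and its first row is the most frequent one.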


In [180]:
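# Are private rooms the most popular in Manhattan?

air_bnb.groupby(['neighbourhood_group','room_type'])[['number_of_reviews','reviews_per_month']].count().loc['Manhattan']

# count() tallies non-null rows, so these figures are listings per room type, not review totals.
# Entire homes/apartments are the most common room type in Manhattan, ahead of private rooms.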

| room_type | number_of_reviews | reviews_per_month |
|---|---|---|
| Entire home/apt | 13199 | 9967 |
| Private room | 7982 | 6309 |
| Shared room | 480 | 356 |
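
To compare room types as shares rather than raw counts, one option is `value_counts(normalize=True)` on a filtered frame. This is a sketch on invented rows, not the notebook's data, and it measures popularity by number of listings, which is an assumption about what "popular" means here:

```python
import pandas as pd

# Toy data (invented) with the same column names as the notebook's frame.
toy = pd.DataFrame({
    "neighbourhood_group": ["Manhattan", "Manhattan", "Manhattan", "Manhattan", "Queens"],
    "room_type": ["Entire home/apt", "Entire home/apt", "Private room", "Shared room", "Private room"],
})

# Keep only Manhattan rows, then compute each room type's share of listings.
manhattan = toy[toy["neighbourhood_group"] == "Manhattan"]
shares = manhattan["room_type"].value_counts(normalize=True)

# Label of the room type with the largest share.
most_common = shares.idxmax()

print(shares)
print(most_common)
```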


In [78]:
# Which hosts are the busiest based on their reviews?

# count() tallies non-null rows per host, so this ranks hosts by number of listings, not by summed reviews
tot_reviews = air_bnb.groupby(['host_id','host_name'])[['number_of_reviews']].count().sort_values('number_of_reviews', ascending=False).head(20)
print(tot_reviews)

# likewise, this counts each host's listings that have a reviews_per_month value, not the monthly review rate
per_month = air_bnb.groupby(['host_id','host_name'])[['reviews_per_month']].count().sort_values('reviews_per_month', ascending=False).head(20)
print(per_month)

                             number_of_reviews
host_id   host_name                           
219517861 Sonder (NYC)                     327
107434423 Blueground                       232
30283594  Kara                             121
137358866 Kazuya                           103
16098958  Jeremy & Laura                    96
12243051  Sonder                            96
61391963  Corporate Housing                 91
22541573  Ken                               87
200380610 Pranjal                           65
1475015   Mike                              52
7503643   Vida                              52
120762452 Stanley                           50
205031545 Red Awning                        49
2856748   Ruchi                             49
190921808 John                              47
26377263  Stat                              43
2119276   Host                              39
19303369  Hiroki                            37
119669058 Melissa                           34
25237492  Jul
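
`count()` and `sum()` can rank hosts very differently: the first counts rows (listings), the second adds up the reviews themselves. A small sketch on invented data that computes both per host with named aggregation; the host names and numbers are hypothetical:

```python
import pandas as pd

# Toy data (invented): Ann has three rarely reviewed listings, Bo has one heavily reviewed listing.
toy = pd.DataFrame({
    "host_id": [10, 10, 10, 20],
    "host_name": ["Ann", "Ann", "Ann", "Bo"],
    "number_of_reviews": [0, 1, 2, 50],
})

# Compute listings (row count) and total reviews (sum) side by side for each host.
per_host = toy.groupby(["host_id", "host_name"])["number_of_reviews"].agg(
    listings="count",
    total_reviews="sum",
)

by_listings = per_host.sort_values("listings", ascending=False)
by_reviews = per_host.sort_values("total_reviews", ascending=False)

print(by_listings)
print(by_reviews)
```

Which ranking counts as "busiest" is a judgment call; showing both makes the choice explicit.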

In [164]:
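# Which neighborhood group has the highest average price?
# Select 'price' before mean(): recent pandas versions raise a TypeError when averaging text columns.
air_bnb.groupby('neighbourhood_group')[['price']].mean().sort_values('price', ascending=False).head(1)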

| neighbourhood_group | price |
|---|---|
| Manhattan | 196.875814 |


In [165]:
# Which neighborhood group has the highest total price?
air_bnb.groupby('neighbourhood_group')[['price']].sum().sort_values('price', ascending=False).head(1)

| neighbourhood_group | price |
|---|---|
| Manhattan | 4264527 |
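
The average and total price per group can also be computed in one pass by passing a list of aggregations to `agg`. A sketch on invented rows (the prices are made up):

```python
import pandas as pd

# Toy data (invented); 'name' is a text column that should stay out of the aggregation.
toy = pd.DataFrame({
    "neighbourhood_group": ["Manhattan", "Manhattan", "Bronx"],
    "price": [200, 100, 80],
    "name": ["a", "b", "c"],
})

# Selecting 'price' first keeps text columns out of mean() and sum().
price_stats = toy.groupby("neighbourhood_group")["price"].agg(["mean", "sum"])

# Group labels with the highest average and the highest total.
top_mean = price_stats["mean"].idxmax()
top_sum = price_stats["sum"].idxmax()

print(price_stats)
print(top_mean, top_sum)
```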


In [88]:
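# Which top 5 hosts have the highest total price?
# Note: host_name is not unique, so common names such as Michael, David and Alex pool several different hosts.
air_bnb.groupby('host_name')[['price']].sum().sort_values('price', ascending=False).head(5)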

| host_name | price |
|---|---|
| Sonder (NYC) | 82795 |
| Blueground | 70331 |
| Michael | 66895 |
| David | 65844 |
| Alex | 52563 |


In [154]:
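# Who currently has no (zero) availability with a review count of 100 or more?
# Caution: this filter tests calculated_host_listings_count == 0 rather than availability_365 == 0,
# and uses > 100, which excludes listings with exactly 100 reviews.
air_bnb.loc[air_bnb['calculated_host_listings_count']== 0].loc[air_bnb['number_of_reviews'] > 100].sort_values('number_of_reviews',ascending=False)[['host_name','number_of_reviews','calculated_host_listings_count']]

# The result is empty because every listing's host has at least one listing.
# It does not show that no host has zero availability.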

|   | host_name | number_of_reviews | calculated_host_listings_count |
|---|---|---|---|
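
The question is about availability, which the dataset stores in `availability_365`, and "100 or more" calls for `>=`. A minimal sketch of that filter on invented rows; the host names and values are hypothetical, and only the column names come from the dataset:

```python
import pandas as pd

# Toy data (invented) with the same column names as the notebook's frame.
toy = pd.DataFrame({
    "host_name": ["Ann", "Bo", "Cy", "Di"],
    "availability_365": [0, 0, 10, 0],
    "number_of_reviews": [100, 99, 300, 250],
})

# Combine both conditions into one boolean mask; >= keeps listings with exactly 100 reviews.
mask = (toy["availability_365"] == 0) & (toy["number_of_reviews"] >= 100)

booked_out = (
    toy.loc[mask, ["host_name", "number_of_reviews", "availability_365"]]
    .sort_values("number_of_reviews", ascending=False)
)

print(booked_out)
```

Building a single mask with `&` also avoids chaining two `.loc` calls, where the second mask is built from the unfiltered frame.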


In [127]:
# What host has the highest total of prices and where are they located?

air_bnb.groupby(['host_name','neighbourhood_group'])[['price']].sum().sort_values('price', ascending=False).head(1)

| host_name | neighbourhood_group | price |
|---|---|---|
| Sonder (NYC) | Manhattan | 82795 |


In [150]:
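# When did Danielle from Queens last receive a review?
# Combine both conditions into one mask; last_review is an ISO date string, so sorting it as text is chronological.
danielle_queens = (air_bnb['host_name'] == 'Danielle') & (air_bnb['neighbourhood_group'] == 'Queens')
air_bnb.loc[danielle_queens, ['host_id','host_name','neighbourhood_group','last_review']].sort_values('last_review', ascending=False).head(1)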


|   | host_id | host_name | neighbourhood_group | last_review |
|---|---|---|---|---|
| 22469 | 26432133 | Danielle | Queens | 2019-07-08 |
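
Sorting `last_review` as text works only while every value is a well-formed `YYYY-MM-DD` string. A more robust sketch parses the column with `pd.to_datetime` and takes the maximum, which skips missing dates. The rows and dates below are invented for illustration:

```python
import pandas as pd

# Toy data (invented): two Queens listings, one with no review yet, plus a Brooklyn listing.
toy = pd.DataFrame({
    "host_id": [1, 1, 2],
    "host_name": ["Danielle", "Danielle", "Danielle"],
    "neighbourhood_group": ["Queens", "Queens", "Brooklyn"],
    "last_review": ["2018-03-02", None, "2019-01-05"],
})

# Parse the strings into datetimes; missing values become NaT.
toy["last_review"] = pd.to_datetime(toy["last_review"])

mask = (toy["host_name"] == "Danielle") & (toy["neighbourhood_group"] == "Queens")

# max() ignores NaT, so listings that were never reviewed do not affect the result.
latest = toy.loc[mask, "last_review"].max()

print(latest)
```

If several hosts named Danielle list in Queens, grouping the filtered rows by `host_id` before taking the maximum would report each of them separately.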


## Further Questions

1. Which host has the most listings?

In [153]:
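# Caveat: grouping by host_name pools every host who shares that name, so 'Michael' below covers several host_ids.
air_bnb.groupby('host_name')[['host_id','calculated_host_listings_count']].count().sort_values('calculated_host_listings_count', ascending=False).head(1)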

| host_name | host_id | calculated_host_listings_count |
|---|---|---|
| Michael | 417 | 417 |
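
Because several hosts can share a first name, grouping by `host_id` is a safer way to identify a single host. A sketch on invented rows showing how the two groupings can disagree; "Acme Stays" is a hypothetical company name:

```python
import pandas as pd

# Toy data (invented): three different people named Michael with one listing each,
# and one company account with two listings.
toy = pd.DataFrame({
    "host_id": [1, 2, 3, 9, 9],
    "host_name": ["Michael", "Michael", "Michael", "Acme Stays", "Acme Stays"],
})

# Grouping by name pools the three Michaels into one apparent host.
by_name = toy.groupby("host_name").size().sort_values(ascending=False)

# Grouping by id (keeping the name for readability) counts each account separately.
by_id = toy.groupby(["host_id", "host_name"]).size().sort_values(ascending=False)

print(by_name)
print(by_id)
```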


2. How many listings have completely open availability?

In [161]:
air_bnb.loc[air_bnb['availability_365']==365].count()[['availability_365']]

#1295 listings have availability 365 days a year

availability_365    1295
dtype: int64

3. What room_types have the highest review numbers?

In [177]:
# count() tallies non-null rows, so these figures are listings per room type, not summed reviews.
air_bnb.groupby('room_type')[['number_of_reviews']].count()

| room_type | number_of_reviews |
|---|---|
| Entire home/apt | 25409 |
| Private room | 22326 |
| Shared room | 1160 |


# Final Conclusion

In this cell, write your final conclusion for each of the questions asked.

Also, if you uncovered more details that were not asked about above, please describe them here.

-- Add your conclusion --

In [None]:
"""
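I am not surprised that Manhattan has the highest average room price and the
highest number of listings. I was surprised that Michael has the most listings
and not a 'business/company', although that count groups by host_name and so
pools every host named Michael. Grouped by host_id, Sonder (NYC) leads with 327.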
"""