# AirBnB NY Locations Data Case Study

In this final project, your task is to take the data provided and find evidence to answer the following questions.

1. Which hosts are the busiest and why?
2. How many neighborhood groups are available and which shows up the most?
3. Are private rooms the most popular in Manhattan?
4. Which hosts are the busiest based on their reviews?
5. Which neighborhood group has the highest average price?
6. Which neighborhood group has the highest total price?
7. Which top 5 hosts have the highest total price?
8. Who currently has no (zero) availability with a review count of 100 or more?
9. What host has the highest total of prices and where are they located?
10. When did Danielle from Queens last receive a review?

You will be given **4 hours** to complete this assignment. 
**Be advised:** I will go dark for the entire assignment period. Any questions you have about the data or the project **MUST** be asked before the time starts. Once the time has started, I can no longer give information.

This is to simulate what you will face when you are out in the wild.

Happy Coding!

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

In [2]:
air_bnb = pd.read_csv('AB_NYC_2019.csv')
air_bnb.head()

Unnamed: 0,id,name,host_id,host_name,neighbourhood_group,neighbourhood,latitude,longitude,room_type,price,minimum_nights,number_of_reviews,last_review,reviews_per_month,calculated_host_listings_count,availability_365
0,2539,Clean & quiet apt home by the park,2787,John,Brooklyn,Kensington,40.64749,-73.97237,Private room,149,1,9,2018-10-19,0.21,6,365
1,2595,Skylit Midtown Castle,2845,Jennifer,Manhattan,Midtown,40.75362,-73.98377,Entire home/apt,225,1,45,2019-05-21,0.38,2,355
2,3647,THE VILLAGE OF HARLEM....NEW YORK !,4632,Elisabeth,Manhattan,Harlem,40.80902,-73.9419,Private room,150,3,0,,,1,365
3,3831,Cozy Entire Floor of Brownstone,4869,LisaRoxanne,Brooklyn,Clinton Hill,40.68514,-73.95976,Entire home/apt,89,1,270,2019-07-05,4.64,1,194
4,5022,Entire Apt: Spacious Studio/Loft by central park,7192,Laura,Manhattan,East Harlem,40.79851,-73.94399,Entire home/apt,80,10,9,2018-11-19,0.1,1,0


In [3]:
# How many neighborhood groups are available and which shows up the most?

air_bnb.groupby('neighbourhood_group').count()[['neighbourhood']].sort_values(['neighbourhood'])

# There are 5 neighbourhood groups. Manhattan shows up the most, with 21,661 listings.

neighbourhood_group,neighbourhood
Staten Island,373
Bronx,1091
Queens,5666
Brooklyn,20104
Manhattan,21661
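The two parts of this question (how many groups, and which is most frequent) can also be read off directly with `nunique()` and `value_counts()`. The sketch below runs on a tiny made-up frame named `sample`, which is an assumption standing in for `air_bnb`; its values are illustrative and are not taken from the dataset.

```python
import pandas as pd

# Hypothetical stand-in for air_bnb; the rows are invented for illustration.
sample = pd.DataFrame({
    "neighbourhood_group": [
        "Manhattan", "Brooklyn", "Manhattan", "Queens", "Manhattan", "Brooklyn",
    ],
})

# Number of distinct neighbourhood groups.
n_groups = sample["neighbourhood_group"].nunique()

# Listings per group, sorted with the most frequent group first.
listings_per_group = sample["neighbourhood_group"].value_counts()

print(n_groups)
print(listings_per_group)
```

On the real data, the same two calls on `air_bnb['neighbourhood_group']` should give the group count and the per-group listing counts in one step each.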


In [20]:
# Are private rooms the most popular in manhattan?

air_bnb.groupby(['neighbourhood_group','room_type'])[['room_type']].count()

# No. Entire home/apt is the most popular room type in Manhattan (13,199 listings), ahead of private rooms (7,982).

neighbourhood_group,room_type,room_type
Bronx,Entire home/apt,379
Bronx,Private room,652
Bronx,Shared room,60
Brooklyn,Entire home/apt,9559
Brooklyn,Private room,10132
Brooklyn,Shared room,413
Manhattan,Entire home/apt,13199
Manhattan,Private room,7982
Manhattan,Shared room,480
Queens,Entire home/apt,2096
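To answer the Manhattan question without scanning the full grouped table, one option is to filter to Manhattan first and look at the share of each room type. This is a minimal sketch on a made-up frame named `sample` (an assumption standing in for `air_bnb`); the rows are invented for illustration.

```python
import pandas as pd

# Hypothetical stand-in for air_bnb; the rows are invented for illustration.
sample = pd.DataFrame({
    "neighbourhood_group": ["Manhattan"] * 4 + ["Brooklyn"] * 2,
    "room_type": [
        "Entire home/apt", "Entire home/apt", "Private room", "Shared room",
        "Private room", "Private room",
    ],
})

# Keep only Manhattan listings.
manhattan = sample[sample["neighbourhood_group"] == "Manhattan"]

# Fraction of Manhattan listings in each room type.
room_share = manhattan["room_type"].value_counts(normalize=True)

# Room type with the largest share.
top_room_type = room_share.idxmax()

print(room_share)
print(top_room_type)
```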


In [31]:
# Which hosts are the busiest and based on their reviews?

air_bnb.groupby('host_name')[['number_of_reviews']].count().sort_values('number_of_reviews', ascending=False)

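# Hosts named Michael have the most listings (417).
# Note: .count() counts rows (listings) per host_name, not reviews; ranking by reviews would need .sum() on number_of_reviews.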

host_name,number_of_reviews
Michael,417
David,403
Sonder (NYC),327
John,294
Alex,279
...,...
Jerbean,1
Jerald,1
Jeonghoon,1
Jeny,1
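If "busiest" is taken to mean the most reviews received, the reviews need to be summed rather than counted, and grouping by `host_id` avoids merging different hosts who share a first name. The sketch below shows the idea on a made-up frame named `sample` (an assumption standing in for `air_bnb`); the numbers are invented.

```python
import pandas as pd

# Hypothetical stand-in for air_bnb; two different hosts share the name "Michael".
sample = pd.DataFrame({
    "host_id": [1, 1, 2, 3, 3, 3],
    "host_name": ["Michael", "Michael", "Michael", "Dana", "Dana", "Dana"],
    "number_of_reviews": [10, 5, 40, 1, 2, 3],
})

# Total reviews per host, keeping host_name only as a readable label.
reviews_per_host = (
    sample.groupby(["host_id", "host_name"])["number_of_reviews"]
    .sum()
    .sort_values(ascending=False)
)

# The first index entry is the host with the most reviews.
busiest_id, busiest_name = reviews_per_host.index[0]

print(reviews_per_host)
```

In this toy data, Dana has the most listings, but host 2 has the most reviews, which is why counting rows and summing reviews can rank hosts differently.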


In [34]:
# Which neighborhood group has the highest average price?

air_bnb.groupby('neighbourhood_group')[['price']].mean().sort_values('price', ascending=False)

# Manhattan has the highest average price: $196.88.

neighbourhood_group,price
Manhattan,196.875814
Brooklyn,124.383207
Staten Island,114.812332
Queens,99.517649
Bronx,87.496792


In [38]:
# Which neighborhood group has the highest total price?

air_bnb.groupby('neighbourhood_group')[['price']].sum().sort_values('price', ascending=False)

# Manhattan has the highest total price (4,264,527).


neighbourhood_group,price
Manhattan,4264527
Brooklyn,2500600
Queens,563867
Bronx,95459
Staten Island,42825


In [41]:
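# Which top 5 hosts have the highest total price?

air_bnb[['host_name', 'price']].sort_values('price', ascending=False).head()

# This lists the hosts of the five most expensive individual listings (Kathrine, Erin, Jelena, Matt, Amy),
# not per-host totals. Summed by host_name (see question 9), the top five are
# Sonder (NYC), Blueground, Michael, David, and Alex.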

Unnamed: 0,host_name,price
9151,Kathrine,10000
17692,Erin,10000
29238,Jelena,10000
40433,Matt,9999
12342,Amy,9999
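A total per host requires grouping before ranking. One way to sketch this is to sum `price` per `host_id` and take the five largest totals with `nlargest`. The frame `sample` below is a made-up stand-in for `air_bnb`, and the prices are invented.

```python
import pandas as pd

# Hypothetical stand-in for air_bnb; host A has two mid-priced listings.
sample = pd.DataFrame({
    "host_id": [1, 1, 2, 3, 4, 5, 6],
    "host_name": ["A", "A", "B", "C", "D", "E", "F"],
    "price": [300, 300, 500, 100, 50, 400, 20],
})

# Sum of listing prices per host.
total_price = sample.groupby(["host_id", "host_name"])["price"].sum()

# Five hosts with the largest totals, highest first.
top5 = total_price.nlargest(5)

print(top5)
```

Here host B owns the single most expensive listing, yet host A ranks first on total price, which illustrates why sorting individual listings does not answer a question about totals.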


In [46]:
# Who currently has no (zero) availability with a review count of 100 or more?

air_bnb[(air_bnb['availability_365']==0) & (air_bnb['number_of_reviews'] >= 100)][['host_name', 'availability_365']].reset_index()

# There are 161 listings (rows 0-160) with 100 or more reviews and zero availability; a host may appear more than once.

Unnamed: 0,index,host_name,availability_365
0,8,MaryEllen,0
1,94,Christiana,0
2,132,Sol,0
3,174,Coral,0
4,180,Doug,0
...,...,...,...
157,29581,Kathleen,0
158,30461,Janet,0
159,31250,Albert,0
160,32670,Stephany,0
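Because the filter returns one row per listing, the row count and the host count can differ. A minimal sketch of separating the two, using a made-up frame named `sample` (an assumption standing in for `air_bnb`):

```python
import pandas as pd

# Hypothetical stand-in for air_bnb; host 1 has two fully booked listings.
sample = pd.DataFrame({
    "host_id": [1, 1, 2, 3],
    "host_name": ["Sol", "Sol", "Doug", "Coral"],
    "availability_365": [0, 0, 0, 120],
    "number_of_reviews": [150, 101, 100, 300],
})

# Listings with no availability and at least 100 reviews.
mask = (sample["availability_365"] == 0) & (sample["number_of_reviews"] >= 100)
booked = sample[mask]

n_listings = len(booked)               # rows that match the filter
n_hosts = booked["host_id"].nunique()  # distinct hosts behind those rows

print(n_listings, n_hosts)
```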


In [57]:
# What host has the highest total of prices and where are they located?

air_bnb.groupby('host_name')[['price']].sum().sort_values('price', ascending=False)

# Sonder (NYC) has the highest cumulative price (82,795). This output does not show where their listings are located.


host_name,price
Sonder (NYC),82795
Blueground,70331
Michael,66895
David,65844
Alex,52563
...,...
Carolann,12
Vishanti & Jeremy,10
Salim,10
Qiuchi,0
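The "where are they located" half of the question needs a second step: find the top host, then summarize that host's listings by location. The sketch below is one possible approach on a made-up frame named `sample` (an assumption standing in for `air_bnb`); the hosts, places, and prices are invented.

```python
import pandas as pd

# Hypothetical stand-in for air_bnb; the rows are invented for illustration.
sample = pd.DataFrame({
    "host_id": [1, 1, 1, 2],
    "host_name": ["H1", "H1", "H1", "H2"],
    "neighbourhood_group": ["Manhattan", "Manhattan", "Brooklyn", "Queens"],
    "neighbourhood": ["Midtown", "Midtown", "Williamsburg", "Astoria"],
    "price": [200, 150, 100, 400],
})

# Step 1: host with the largest summed price.
top_host_id = sample.groupby("host_id")["price"].sum().idxmax()

# Step 2: that host's listings, counted per location.
top_listings = sample[sample["host_id"] == top_host_id]
locations = (
    top_listings.groupby(["neighbourhood_group", "neighbourhood"])
    .size()
    .sort_values(ascending=False)
)

print(top_host_id)
print(locations)
```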


In [58]:
# When did Danielle from Queens last receive a review?

air_bnb[(air_bnb['host_name'] == 'Danielle') & (air_bnb['neighbourhood_group'] == 'Queens')][['host_name','last_review']].sort_values('last_review', ascending=False)

# Danielle from Queens last received a review on 2019-07-08.

Unnamed: 0,host_name,last_review
22469,Danielle,2019-07-08
21517,Danielle,2019-07-07
20403,Danielle,2019-07-06
22068,Danielle,2019-07-06
7086,Danielle,2019-07-03
33861,Danielle,2019-06-20
27021,Danielle,2018-01-02
16349,Danielle,
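Sorting `last_review` as text works here only because the dates are in ISO format. A slightly more robust sketch converts the column with `pd.to_datetime` and takes the maximum, which also skips missing values. The frame `sample` is a made-up stand-in for `air_bnb`, and its rows are invented.

```python
import pandas as pd

# Hypothetical stand-in for air_bnb; one Danielle listing has no review date.
sample = pd.DataFrame({
    "host_name": ["Danielle", "Danielle", "Danielle", "Alex"],
    "neighbourhood_group": ["Queens", "Queens", "Brooklyn", "Queens"],
    "last_review": ["2019-07-08", None, "2019-07-10", "2019-07-09"],
})

# Parse the strings into datetimes; missing values become NaT.
sample["last_review"] = pd.to_datetime(sample["last_review"])

# Restrict to hosts named Danielle with listings in Queens.
mask = (sample["host_name"] == "Danielle") & (sample["neighbourhood_group"] == "Queens")

# Most recent review date; max() ignores NaT.
latest = sample.loc[mask, "last_review"].max()

print(latest.date())
```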


## Further Questions

1. Which host has the most listings?

In [68]:
air_bnb.groupby('host_name')[['name']].count().sort_values(['name'], ascending = False)

# Hosts named Michael have the most listings (417). host_name is not unique, so this may combine several different hosts.

host_name,name
Michael,417
David,403
Sonder (NYC),327
John,294
Alex,279
...,...
Jerrel,1
Jerrell,1
현선,1
Huei-Yin,0


2. How many listings have completely open availability?

In [70]:
air_bnb[air_bnb['availability_365'] == 365].count()[['name']]

# There are 1,294 listings available 365 days a year.

name    1294
dtype: int64

3. What room_types have the highest review numbers?

In [76]:
air_bnb.groupby('room_type')[['number_of_reviews']].sum()

# Entire home/apt has the most reviews (580,403).

room_type,number_of_reviews
Entire home/apt,580403
Private room,538346
Shared room,19256


# Final Conclusion

In this cell, write your final conclusion for each of the questions asked.

Also, if you uncovered more details that were not asked about above, please describe them here.

-- Add your conclusion --