# AirBnB NY Locations Data Case Study

In this final project, your task will be to take the data provided and find evidence to answer the following questions.

1. Which hosts are the busiest and why?
2. How many neighborhood groups are available and which shows up the most?
3. Are private rooms the most popular in Manhattan?
4. Which hosts are the busiest based on their reviews?
5. Which neighborhood group has the highest average price?
6. Which neighborhood group has the highest total price?
7. Which top 5 hosts have the highest total price?
8. Who currently has no (zero) availability with a review count of 100 or more?
9. What host has the highest total of prices and where are they located?
10. When did Danielle from Queens last receive a review?

You will be given **4 hours** to complete this assignment. 
**Be Advised** I will go dark for this entire assignment time period. That said, any questions that you would like to ask about the data or the project **MUST** be asked before the time starts. Once the time has started, I can no longer give information.

This is to simulate what you will face when you are out in the wild.

Happy Coding!

In [2]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

In [3]:
air_bnb = pd.read_csv('AB_NYC_2019.csv')
air_bnb.head(4)

Unnamed: 0,id,name,host_id,host_name,neighbourhood_group,neighbourhood,latitude,longitude,room_type,price,minimum_nights,number_of_reviews,last_review,reviews_per_month,calculated_host_listings_count,availability_365
0,2539,Clean & quiet apt home by the park,2787,John,Brooklyn,Kensington,40.64749,-73.97237,Private room,149,1,9,2018-10-19,0.21,6,365
1,2595,Skylit Midtown Castle,2845,Jennifer,Manhattan,Midtown,40.75362,-73.98377,Entire home/apt,225,1,45,2019-05-21,0.38,2,355
2,3647,THE VILLAGE OF HARLEM....NEW YORK !,4632,Elisabeth,Manhattan,Harlem,40.80902,-73.9419,Private room,150,3,0,,,1,365
3,3831,Cozy Entire Floor of Brownstone,4869,LisaRoxanne,Brooklyn,Clinton Hill,40.68514,-73.95976,Entire home/apt,89,1,270,2019-07-05,4.64,1,194


In [None]:
# How many neighborhood groups are available and which shows up the most?
neighbourhood_group_frequency = air_bnb.groupby('neighbourhood_group')['neighbourhood_group'].count().sort_values(ascending = False)
neighbourhood_group_frequency


neighbourhood_group
Manhattan        21661
Brooklyn         20104
Queens            5666
Bronx             1091
Staten Island      373
Name: neighbourhood_group, dtype: int64
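
The question has two parts (how many groups, and which is most common), and both can be read off directly with `nunique()` and `value_counts()`. The sketch below is a minimal illustration on a hypothetical toy frame, not the real `AB_NYC_2019.csv`; only the column name is taken from the dataset.

```python
import pandas as pd

# Hypothetical toy rows standing in for the real CSV (values are made up).
toy = pd.DataFrame({
    "neighbourhood_group": [
        "Manhattan", "Brooklyn", "Manhattan", "Queens", "Manhattan", "Brooklyn",
    ],
})

# Part 1: how many distinct neighbourhood groups exist.
group_count = toy["neighbourhood_group"].nunique()

# Part 2: listings per group; value_counts() sorts from most to least frequent.
group_frequency = toy["neighbourhood_group"].value_counts()

# The index label holding the largest count is the most common group.
most_common_group = group_frequency.idxmax()

print(group_count, most_common_group)
```

On the real data the same three lines would run against `air_bnb` instead of `toy`.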

In [None]:
# Are private rooms the most popular in manhattan?
manhattan_filter = air_bnb[air_bnb.neighbourhood_group == 'Manhattan']
manhattan_most_popular_type = manhattan_filter.groupby('room_type')['room_type'].count()
manhattan_most_popular_type


room_type
Entire home/apt    13199
Private room        7982
Shared room          480
Name: room_type, dtype: int64

In [None]:
# Which hosts are the busiest based on their reviews?

#Analyze two different ways:
# 1) See which host maintains the most property listings 
#most_listings_host = air_bnb.drop_duplicates('calculated_host_listings_count').sort_values('calculated_host_listings_count', ascending = False, kind = 'mergesort')
#most_listings_host.head(10)[['host_id', 'host_name', 'calculated_host_listings_count']].reset_index(drop = True)

# OR 2) Since property count doesn't necessarily translate to usage and turnover, see
# which hosts are getting the most reviews, which gives an indication of how popular a listing is

most_reviewed_host = air_bnb.groupby(['host_id', 'host_name'])['number_of_reviews'].sum().sort_values(ascending = False)
most_reviewed_host.head(10)


host_id    host_name                     
37312959   Maya                              2273
344035     Brooklyn&   Breakfast    -Len-    2205
26432133   Danielle                          2017
35524316   Yasu & Akiko                      1971
40176101   Brady                             1818
4734398    Jj                                1798
16677326   Alex And Zeena                    1355
6885157    Randy                             1346
219517861  Sonder (NYC)                      1281
23591164   Angela                            1269
Name: number_of_reviews, dtype: int64

In [None]:
#Which neighborhood group has the highest average price?
avg_prices_neighbourhood = air_bnb.groupby('neighbourhood_group')['price'].mean().sort_values(ascending = False)
avg_prices_neighbourhood.round(2)

neighbourhood_group
Manhattan        196.88
Brooklyn         124.38
Staten Island    114.81
Queens            99.52
Bronx             87.50
Name: price, dtype: float64

In [None]:
# Which neighborhood group has the highest total price?


# Remove the subgroup 'neighbourhood' from the groupby() call to see totals for the 5 neighbourhood groups.
# Selecting [['price']] before sum() keeps the aggregation to the numeric price column only.
price_max_neighbourhood = air_bnb.groupby(['neighbourhood_group', 'neighbourhood'])[['price']].sum().sort_values('price', ascending = False, kind ='mergesort')
price_max_neighbourhood.head(10)[['price']]

neighbourhood_group,neighbourhood,price
Brooklyn,Williamsburg,563707
Manhattan,Midtown,436801
Manhattan,Upper West Side,415720
Manhattan,Hell's Kitchen,400987
Brooklyn,Bedford-Stuyvesant,399917
Manhattan,East Village,344812
Manhattan,Upper East Side,339729
Manhattan,Harlem,316233
Manhattan,Chelsea,277959
Brooklyn,Bushwick,209033
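
The table above ranks individual neighbourhoods, while the question asks about neighbourhood groups. A minimal sketch of the group-level total, assuming the same column names, is shown here on a hypothetical toy frame (the rows and prices are invented for illustration):

```python
import pandas as pd

# Hypothetical toy rows; only the column names mirror the dataset.
toy = pd.DataFrame({
    "neighbourhood_group": ["Manhattan", "Manhattan", "Brooklyn", "Brooklyn", "Queens"],
    "neighbourhood": ["Midtown", "Harlem", "Williamsburg", "Bushwick", "Astoria"],
    "price": [225, 150, 149, 89, 99],
})

# Group only by neighbourhood_group, then sum the price column.
group_totals = (
    toy.groupby("neighbourhood_group")["price"]
    .sum()
    .sort_values(ascending=False)
)

print(group_totals)
```

Dropping `'neighbourhood'` from the `groupby()` keys is the only change relative to the cell above.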


In [None]:
#Which top 5 hosts have the highest total price?

# SUMMATION OF ALL HOST'S PROPERTY PRICES CURRENTLY LISTED
#lavish_lifestylers = air_bnb.groupby(['host_id', 'host_name'])['price'].sum().sort_values(ascending = False)
#lavish_lifestylers.head(5)

# OR TOP 5 PRICES FOR INDIVIDUAL LISTINGS WITH HOST NAME ATTACHED

top_properties = air_bnb.sort_values('price', ascending = False, kind = 'mergesort')
top_properties.head(10)[['name', 'host_name', 'neighbourhood', 'neighbourhood_group', 'minimum_nights', 'price']].reset_index(drop = True)

Unnamed: 0,name,host_name,neighbourhood,neighbourhood_group,minimum_nights,price
0,Furnished room in Astoria apartment,Kathrine,Astoria,Queens,100,10000
1,Luxury 1 bedroom apt. -stunning Manhattan views,Erin,Greenpoint,Brooklyn,5,10000
2,1-BR Lincoln Center,Jelena,Upper West Side,Manhattan,30,10000
3,Spanish Harlem Apt,Olson,East Harlem,Manhattan,5,9999
4,"Quiet, Clean, Lit @ LES & Chinatown",Amy,Lower East Side,Manhattan,99,9999
5,2br - The Heart of NYC: Manhattans Lower East ...,Matt,Lower East Side,Manhattan,30,9999
6,Beautiful/Spacious 1 bed luxury flat-TriBeCa/Soho,Rum,Tribeca,Manhattan,30,8500
7,Film Location,Jessica,Clinton Hill,Brooklyn,1,8000
8,East 72nd Townhouse by (Hidden by Airbnb),Sally,Upper East Side,Manhattan,1,7703
9,70' Luxury MotorYacht on the Hudson,Jack,Battery Park City,Manhattan,1,7500
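
The output above ranks single listings. The alternative reading of the question, summing every listing price per host, matches the commented-out `lavish_lifestylers` lines. A rough sketch on hypothetical toy data (host names and prices are invented) could look like this:

```python
import pandas as pd

# Hypothetical toy rows; host ids, names, and prices are made up.
toy = pd.DataFrame({
    "host_id": [1, 1, 2, 3, 3, 3],
    "host_name": ["Ana", "Ana", "Ben", "Cy", "Cy", "Cy"],
    "price": [100, 150, 400, 50, 60, 70],
})

# Sum listing prices per host, then keep the N largest totals (N = 2 here, 5 in the question).
host_totals = (
    toy.groupby(["host_id", "host_name"])["price"]
    .sum()
    .nlargest(2)
)

print(host_totals)
```

Grouping on `host_id` as well as `host_name` keeps different hosts who share a first name apart.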


In [10]:
# Who currently has no (zero) availability with a review count of 100 or more?
popular_listings = air_bnb[(air_bnb.availability_365 == 0) & (air_bnb.number_of_reviews >= 100)]
popular_listings.drop_duplicates('host_id')[['host_name', 'name', 'neighbourhood_group', 'neighbourhood', 'price', 'number_of_reviews']].reset_index(drop = True)


Unnamed: 0,host_name,name,neighbourhood_group,neighbourhood,price,number_of_reviews
0,MaryEllen,Cozy Clean Guest Room - Family Apt,Manhattan,Upper West Side,79,118
1,Christiana,Charming 1 bed GR8 WBurg LOCATION!,Brooklyn,Williamsburg,100,168
2,Sol,NYC artists’ loft with roof deck,Brooklyn,Greenpoint,50,193
3,Coral,Financial District Luxury Loft,Manhattan,Financial District,196,114
4,Doug,"Fort Greene, Brooklyn: Center Bedroom",Brooklyn,Fort Greene,65,206
...,...,...,...,...,...,...
143,Kathleen,The Quietest Block in Manhattan :),Manhattan,Harlem,65,103
144,Janet,queens get away!!,Queens,Laurelton,65,119
145,Albert,entire sunshine of the spotless mind room,Brooklyn,Bedford-Stuyvesant,49,102
146,Stephany,COZY Room for Female Guests,Brooklyn,Prospect-Lefferts Gardens,48,131
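
Because `drop_duplicates('host_id')` shows one row per host, the number of rows displayed is a host count, not a listing count. A small sketch of how both numbers can be obtained from the same filter, using hypothetical toy rows:

```python
import pandas as pd

# Hypothetical toy rows; values are made up for illustration.
toy = pd.DataFrame({
    "host_id": [1, 1, 2, 3],
    "availability_365": [0, 0, 0, 120],
    "number_of_reviews": [150, 101, 100, 300],
})

# Same filter as the cell above: fully booked and at least 100 reviews.
booked_out = toy[(toy["availability_365"] == 0) & (toy["number_of_reviews"] >= 100)]

listing_count = len(booked_out)                # matching listings
host_count = booked_out["host_id"].nunique()   # distinct hosts behind those listings

print(listing_count, host_count)
```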


In [None]:
# What host has the highest total of prices and where are they located?

corpo_durpo = air_bnb[air_bnb.host_id == 219517861]
corpo_durpo.sort_values('price', ascending = False, kind = 'mergesort').head(15)[['name', 'id', 'neighbourhood_group', 'neighbourhood', 'price']].reset_index(drop = True)


Unnamed: 0,name,id,neighbourhood_group,neighbourhood,price
0,Sonder | The Biltmore | 1BR,34177117,Manhattan,Theater District,699
1,Sonder | The Biltmore | Spacious 1BR + Kitchen,34183825,Manhattan,Theater District,699
2,Sonder | The Biltmore | Spacious 1BR + Kitchen,34203594,Manhattan,Theater District,699
3,Sonder | The Biltmore | Spacious 1BR + Kitchen,34311549,Manhattan,Theater District,699
4,Sonder | The Biltmore | Stunning 1BR + Sofa Bed,34312320,Manhattan,Theater District,699
5,Superior 1BR in FiDi by Sonder,34313143,Manhattan,Financial District,699
6,Lovely Studio in FiDi by Sonder,34313960,Manhattan,Financial District,699
7,Sonder | 116 John | Polished Studio + Gym,35936418,Manhattan,Financial District,699
8,Sonder | 116 John | Simple Studio + Gym,35937891,Manhattan,Financial District,699
9,Sonder | Wall Street | Superior 3BR + Rooftop,33998142,Manhattan,Financial District,616
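
The cell above hardcodes `host_id == 219517861`. One way to avoid the hardcoded value, sketched here under the assumption that "highest total of prices" means the sum of a host's listing prices, is to find the top host with `idxmax()` and then count that host's listings by location. The toy rows are hypothetical:

```python
import pandas as pd

# Hypothetical toy rows; host ids and prices are made up.
toy = pd.DataFrame({
    "host_id": [10, 10, 10, 20, 30],
    "neighbourhood_group": ["Manhattan", "Manhattan", "Manhattan", "Brooklyn", "Queens"],
    "neighbourhood": ["Financial District", "Theater District", "Financial District",
                      "Bushwick", "Astoria"],
    "price": [300, 250, 200, 500, 90],
})

# Total listing price per host; idxmax() returns the host_id with the largest total.
total_by_host = toy.groupby("host_id")["price"].sum()
top_host_id = total_by_host.idxmax()

# Where that host's listings are: number of listings per (group, neighbourhood).
top_host_locations = (
    toy[toy["host_id"] == top_host_id]
    .groupby(["neighbourhood_group", "neighbourhood"])
    .size()
    .sort_values(ascending=False)
)

print(top_host_id)
print(top_host_locations)
```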


In [14]:
# When did Danielle from Queens last receive a review?
danielle_queens_filter = air_bnb[(air_bnb.host_name == 'Danielle') & (air_bnb.neighbourhood_group == 'Queens')]
danielle_review_snapshot = danielle_queens_filter.sort_values('last_review', ascending = False)
danielle_review_snapshot[['host_id', 'host_name', 'last_review']].reset_index(drop = True)

# It would be better to know the 'host_id' of the Danielle you are referring to;
# there are currently 4 Danielles with listings in Queens


Unnamed: 0,host_id,host_name,last_review
0,26432133,Danielle,2019-07-08
1,26432133,Danielle,2019-07-07
2,26432133,Danielle,2019-07-06
3,26432133,Danielle,2019-07-06
4,26432133,Danielle,2019-07-03
5,201647469,Danielle,2019-06-20
6,154256662,Danielle,2018-01-02
7,18051286,Danielle,
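
Since several hosts share the name, the latest review per `host_id` can be computed in one step instead of being read off the sorted rows. This is a minimal sketch on hypothetical toy rows; parsing `last_review` with `pd.to_datetime` is an added step, not something the cell above does:

```python
import pandas as pd

# Hypothetical toy rows; host ids and dates are made up.
toy = pd.DataFrame({
    "host_id": [1, 1, 2, 3],
    "host_name": ["Danielle"] * 4,
    "neighbourhood_group": ["Queens"] * 4,
    "last_review": ["2019-07-08", "2019-07-03", "2018-01-02", None],
})

# Parse the dates so max() compares timestamps; missing values become NaT.
toy["last_review"] = pd.to_datetime(toy["last_review"])

mask = (toy["host_name"] == "Danielle") & (toy["neighbourhood_group"] == "Queens")

# Most recent review per host; a host with no reviews at all stays NaT.
latest_by_host = toy[mask].groupby("host_id")["last_review"].max()

print(latest_by_host)
```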


## Further Questions

1. Which host has the most listings?

In [None]:
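# Caveat: drop_duplicates('calculated_host_listings_count') keeps one row per distinct count,
# so hosts tied on the same count collapse into a single row; drop_duplicates('host_id') would keep them all.
most_listings_host = air_bnb.drop_duplicates('calculated_host_listings_count').sort_values('calculated_host_listings_count', ascending = False, kind = 'mergesort')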
most_listings_host.head(10)[['host_id', 'host_name', 'calculated_host_listings_count']].reset_index(drop = True)

Unnamed: 0,host_id,host_name,calculated_host_listings_count
0,219517861,Sonder (NYC),327
1,107434423,Blueground,232
2,30283594,Kara,121
3,137358866,Kazuya,103
4,16098958,Jeremy & Laura,96
5,61391963,Corporate Housing,91
6,22541573,Ken,87
7,200380610,Pranjal,65
8,1475015,Mike,52
9,120762452,Stanley,50
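
Deduplicating on the count column keeps only one host per distinct listing count, so hosts that tie are hidden. A sketch of the per-host variant, on hypothetical toy rows where two hosts tie, shows the difference:

```python
import pandas as pd

# Hypothetical toy rows; Ana and Ben are tied at 2 listings each.
toy = pd.DataFrame({
    "host_id": [1, 1, 2, 2, 3],
    "host_name": ["Ana", "Ana", "Ben", "Ben", "Cy"],
    "calculated_host_listings_count": [2, 2, 2, 2, 1],
})

# One row per host, then rank by listing count (mergesort keeps ties in original order).
per_host = (
    toy.drop_duplicates("host_id")
    .sort_values("calculated_host_listings_count", ascending=False, kind="mergesort")
    [["host_id", "host_name", "calculated_host_listings_count"]]
    .reset_index(drop=True)
)

# For comparison: deduplicating on the count column drops one of the tied hosts.
per_count = toy.drop_duplicates("calculated_host_listings_count")

print(per_host)
print(len(per_count))
```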


2. How many listings have completely open availability?

In [None]:
# Broken down by how many are in each neighbourhood

come_on_in_were_open = air_bnb[air_bnb.availability_365 == 365]
come_on_in_were_open.groupby('neighbourhood_group')['neighbourhood_group'].count()

neighbourhood_group
Bronx             54
Brooklyn         453
Manhattan        572
Queens           204
Staten Island     12
Name: neighbourhood_group, dtype: int64

3. What room_types have the highest review numbers?

In [None]:
type_most_reviewed = air_bnb.groupby('room_type')[['number_of_reviews']].sum().sort_values('number_of_reviews', ascending = False)
type_most_reviewed[['number_of_reviews']]

room_type,number_of_reviews
Entire home/apt,580403
Private room,538346
Shared room,19256


# Final Conclusion

In this cell, write your final conclusion for each of the questions asked.

Also, if you uncovered some more details that were not asked above, please describe them here.

-- Add your conclusion --

In [None]:
# Question 1: Which hosts are the busiest and why?
#
#             There are numerous ways to approach this question since it's a tad subjective, depending on what someone deems as busy.
#             A first approach is to find which hosts have the most properties, since
#             the more properties a host has, the higher the probability of getting a booking.
#             More listings also mean more expenses and upkeep for the host.
#
#            -Hosts with Highest Total # of Listings:
#                     * Sonder(NYC): 327
#                     * Blueground: 232
#                     * Kara : 121
#                     * Kazuya: 103
#                     * Jeremy & Laura : 96
#
#             Another way to approach the question is closer to what's asked in question 4,
#             as that gives more insight into rental turnover and how popular certain listings are.
#
         

# Question 2: How many neighbourhood groups are available and which shows up the most?
#
#            -There are 5 neighbourhood groupings and Manhattan has the most:
#                              * Manhattan: 21,661
#                              * Brooklyn: 20,104
#                              * Queens: 5,666
#                              * Bronx: 1,091
#                              * Staten Island: 373


# Question 3: Are private rooms the most popular in Manhattan?
#
#            -No. Entire homes/apts were the most popular listing in Manhattan
#             with 13,199 listings compared to 7,982 private room listings 


# Question 4: Which hosts are the busiest based on their reviews? (host_ids can be found above for clarification)
#
#            I thought it was interesting that all of these hosts have numerous listings
#            and they're all centered around the tourism industry in NYC. Two of these hosts exclusively
#            have properties by LaGuardia Airport (Maya and Danielle). Yasu & Akiko have 11 listings, all a five-minute walk from Times Square,
#            and Brooklyn Bed & Breakfast is self-explanatory.
#           
#           -These are the four busiest hosts, all hovering around 2,000 cumulative reviews:
#                             * Maya: 2,273
#                             * Brooklyn Bed & Breakfast: 2,205
#                             * Danielle: 2,017
#                             * Yasu & Akiko: 1,971


# Question 5: Which neighbourhood group has the highest average price?
#
#             -Manhattan had the highest average price per listing:
#                         * Manhattan : $196.88
#                         * Brooklyn: $124.38
#                         * Staten Island: $114.81
#                         * Queens: $99.52
#                         * Bronx: $87.50


# Question 6: Which neighbourhood group has the highest total price?
#
#                     (rental capitalization = sum of all listing prices in given neighbourhood)
#
#            - Manhattan has a higher total rental cap than any other neighbourhood group, with three neighbourhoods
#              having a rental cap over $400,000 (Midtown, Upper West Side, and Hell's Kitchen). Interestingly though,
#              Brooklyn had the neighbourhood with the highest rental cap in all of NYC, with Williamsburg having a rental cap
#              of $563,707. It was the only neighbourhood with a cap over $500,000, roughly $127,000 more than the next closest
#              neighbourhood, Midtown in Manhattan. More data below:
#
#                          -Neighbourhood Groups Ranked (rental cap):
#                                    * Manhattan: $4,264,527
#                                    * Brooklyn: $2,500,600
#                                    * Queens: $563,867
#                                    * Bronx: $95,459
#                                    * Staten Island: $42,825
#
#     -Top 3 Manhattan Neighbourhoods:                           -Top 3 Brooklyn Neighbourhoods:
#         * Midtown: $436,801                                       * Williamsburg: $563,707
#         * Upper West Side: $415,720                               * Bedford-Stuy: $399,917
#         * Hell's Kitchen: $400,987                                * Bushwick: $209,033


# Question 7: Which top 5 hosts have the highest total price?
#       
#           - The 5 hosts with the highest total price are as follows:
#                 * Kathrine : $10,000, 100 night min stay (Astoria, Queens)
#                 * Erin: $10,000, 5 night min stay (Greenpoint, Brooklyn) 
#                 * Jelena: $10,000, 30 night min stay (Upper West Side, Manhattan)
#                 * Olson: $9,999, 5 night min stay (East Harlem, Manhattan)
#                 * Amy: $9,999, 99 night min stay (Lower East Side, Manhattan)
#
#          - There was one more listing at $9,999 in Lower East Side with a 30 night min stay, then there is a noticeable
#            drop until the next listing at $8,500. All 6 listings near the $10,000 price point are the only property listed
#            by the host on AirBnB. 4 out of the 6 listings are also meant for long term stays, with one even being a 3 month rental


# Question 8: Who currently has no (zero) availability with a review count of 100 or more?
#
#            - Out of the whole .csv file, only 162 listings, spread across 147 hosts in all of NYC, had zero availability and
#              100 or more reviews. The output above shows one listing per host if you'd like to look at different listings;
#              the insight gained from the query is which properties are in high demand and popular with customers.


#Question 9: What host has the highest total of prices and where are they located?
#
#           - Sonder (NYC) has a higher total list price than any other host. From earlier analysis that shouldn't
#             be surprising since Sonder (NYC) had the most listings at 327. Many of the top hosts in this category are companies,
#             not individuals.
#                                 Top 5 Hosts Ranked by Rental Cap:
#                                     * Sonder (NYC): $82,795  
#                                     * Blueground: $70,331
#                                     * Sally: $37,097
#                                     * Red Awning: $35,581
#                                     * Kara: $33,581
#
#           - All of Sonder's top properties are located in Manhattan, in either the Financial District or the Theater District.


# Question 10: When did Danielle from Queens last receive a review?
#
#            - To better answer this question, I would clarify the host id number with the stakeholder to make sure
#              I'm searching for the correct Danielle, since there are four Danielles that have listings in Queens.
#              I was under the impression that we were talking about the Danielle who's raking it
#              in from having a couple of listings by LaGuardia, but I will provide the last review date of all Danielles
#              from Queens
#                   
#                                     Last Reviews for Danielle (host_id):
#                                     * Danielle (26432133)  : 2019-07-08
#                                     * Danielle (201647469) : 2019-06-20 
#                                     * Danielle (154256662) : 2018-01-02
#                                     * Danielle (18051286)  :    N/A
#


