# Understanding Airbnb in New York City

Research Questions:

1. Part I: Which neighborhoods in New York City have the costliest Airbnb listings?
2. Part II: Which factors have the greatest influence on the price of an Airbnb listing?
3. Part III: Which factors have the greatest influence on the star rating of an Airbnb listing?

## Getting the Data

[ Aatish, put your script/explanation here ]

In [23]:
import pandas as pd

In [24]:
airbnbDF = pd.read_csv("airbnb-all-features.csv")
airbnbDF.head()

| Unnamed: 0 | lat | lng | price | star_rating | reviews_count | bathrooms | bedrooms | neighborhood | beds | picture_count | description |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 40.683045 | -73.964919 | 25 | 5.0 | 3 | 1.0 | 1 | Clinton Hill | 1.0 | 6 | My place is in Clinton Hill near historic Bed-... |
| 1 | 40.778616 | -73.949092 | 85 | NaN | 1 | 1.0 | 1 | Upper East Side | 1.0 | 5 | My place is close to Shake Shack The Metropoli... |
| 2 | 40.745823 | -73.99678 | 108 | NaN | 0 | 1.0 | 0 | Chelsea | 1.0 | 9 | Walking distance to the High Line MACY*S Times... |
| 3 | 40.731186 | -73.988558 | 95 | 4.5 | 13 | 1.0 | 1 | East Village | 1.0 | 7 | Lovely big room & apartment with quiet private... |
| 4 | 40.823202 | -73.955805 | 35 | 5.0 | 9 | 1.0 | 1 | Hamilton Heights | 1.0 | 7 | A beautiful space tucked in between Broadway a... |


## Part I: Costliest Neighborhoods in NYC

In this section, we'll create map visualizations to understand which areas of New York City have the most expensive Airbnb listings. To create these maps, we'll use gmaps, a Jupyter notebook plugin that lets you embed Google Maps into your notebook.

To install gmaps:

    pip install gmaps

You'll also need a Google Maps JavaScript API key to use gmaps. Get your API key [here](https://developers.google.com/maps/documentation/javascript/get-api-key#key).

In [25]:
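import os
import gmaps
# Read the key from the environment instead of hard-coding it; the variable name GOOGLE_MAPS_API_KEY is an assumption.
gmaps.configure(api_key=os.environ["GOOGLE_MAPS_API_KEY"])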

### Plotting Airbnb Listings in NYC
First, let's use gmaps to visualize the distribution of Airbnb listings across New York City. Here, we simply plot the latitude and longitude of our listing datapoints onto the map.

In [29]:
# Collect a (latitude, longitude) pair for every listing.
airbnbLocations = list(zip(airbnbDF['lat'].tolist(), airbnbDF['lng'].tolist()))
# Draw each listing as a small white marker with a red outline.
airbnbLocationsLayer = gmaps.symbol_layer(airbnbLocations, fill_color="white",
                                          stroke_color="red", scale=4)

m = gmaps.Map()
m.add_layer(airbnbLocationsLayer)
m

![Airbnb Map](locations_map.png)

Most listings appear to be in Manhattan, with fewer in Brooklyn, the Bronx, and the other boroughs.

### Heatmap of Prices

Now that we know how the Airbnb properties are spread geographically, we can visualize how prices are distributed across the city. Each listing is weighted by its price.

In [28]:
# Build (latitude, longitude, weight) triples, using each listing's price as the heatmap weight.
airbnbLocationPrices = list(zip(airbnbDF['lat'].tolist(), airbnbDF['lng'].tolist(),
                                airbnbDF['price'].tolist()))

m = gmaps.Map()
priceHeatmapLayer = gmaps.WeightedHeatmap(data=airbnbLocationPrices)
m.add_layer(priceHeatmapLayer)
m

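![Prices Heatmap](prices_heatmap.png)

As expected, Manhattan has the priciest listings, although certain pockets of Brooklyn appear expensive as well. Because a weighted heatmap adds up the weights of nearby points, the intensity reflects how densely listings are packed as well as how much they cost.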
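
The heatmap gives a visual impression, and a per-neighborhood summary can put numbers behind it. What follows is a minimal sketch rather than part of the original analysis: the helper name `neighborhood_price_summary` and its `min_listings` threshold are assumptions, and the sample frame holds only the five rows shown in the `head()` output above, so the resulting ranking is purely illustrative.

```python
import pandas as pd


def neighborhood_price_summary(df, min_listings=1):
    """Rank neighborhoods by median listing price (hypothetical helper).

    Neighborhoods with fewer than `min_listings` listings are dropped so
    that a single expensive listing cannot dominate the ranking.
    """
    summary = df.groupby("neighborhood")["price"].agg(
        listings="count", median_price="median"
    )
    summary = summary[summary["listings"] >= min_listings]
    return summary.sort_values("median_price", ascending=False)


# Sample data: only the five rows visible in the head() output above.
sample = pd.DataFrame({
    "neighborhood": ["Clinton Hill", "Upper East Side", "Chelsea",
                     "East Village", "Hamilton Heights"],
    "price": [25, 85, 108, 95, 35],
})

summary = neighborhood_price_summary(sample)
print(summary)
```

On the full dataset you would pass `airbnbDF` instead of `sample` and probably raise `min_listings`. The median is used here because a few luxury listings can pull a mean upward.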

## Part II: What Factors Influence Price?

From the previous section, it's clear that the location of a property has an impact on its price. Now, we'll perform linear regression and hypothesis testing to determine what other attributes influence price.
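
As a rough illustration of the regression step, the sketch below fits an ordinary least squares model with NumPy and reports a standard error and t-statistic for each coefficient. It is not the notebook's actual analysis: the helper name `fit_ols` is hypothetical, and the data is synthetic, generated from made-up coefficients (50 + 40 per bedroom + 20 per bathroom) so that the fit can be checked against a known answer.

```python
import numpy as np
import pandas as pd


def fit_ols(df, features, target="price"):
    """Fit target ~ intercept + features by ordinary least squares.

    Hypothetical helper. Returns one row per coefficient with its
    estimate, standard error, and t-statistic.
    """
    data = df[features + [target]].dropna()
    X = np.column_stack([np.ones(len(data)),
                         data[features].to_numpy(dtype=float)])
    y = data[target].to_numpy(dtype=float)

    # Least-squares estimate of the coefficients.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)

    # Residual variance and the coefficient covariance matrix.
    residuals = y - X @ beta
    dof = len(y) - X.shape[1]
    sigma2 = residuals @ residuals / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    std_err = np.sqrt(np.diag(cov))

    return pd.DataFrame(
        {"coef": beta, "std_err": std_err, "t_stat": beta / std_err},
        index=["intercept"] + features,
    )


# Synthetic data (NOT the Airbnb dataset): the true coefficients are invented.
rng = np.random.default_rng(0)
n = 200
synthetic = pd.DataFrame({
    "bedrooms": rng.integers(0, 4, n),
    "bathrooms": rng.integers(1, 3, n),
})
synthetic["price"] = (50 + 40 * synthetic["bedrooms"]
                      + 20 * synthetic["bathrooms"]
                      + rng.normal(0, 5, n))

result = fit_ols(synthetic, ["bedrooms", "bathrooms"])
print(result)
```

A large absolute t-statistic suggests the coefficient differs from zero. Turning it into a p-value requires the t distribution, which a statistics library such as SciPy or statsmodels provides. On the real data you would call `fit_ols(airbnbDF, [...])` with numeric columns such as `bedrooms`, `bathrooms`, and `beds`.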

## Part III: What Factors Influence Star Rating?

To be done later...