Welcome to New York City, one of the most-visited cities in the world. Many Airbnb listings in New York City meet the high demand for temporary lodging, with stays ranging from a few nights to many months. In this project, we will take a closer look at the New York Airbnb market by combining data from multiple file types: `.csv`, `.tsv`, and `.xlsx`.

Recall that **CSV**, **TSV**, and **Excel** files are three common formats for storing data. 
Three files containing data on 2019 Airbnb listings are available to you:

**data/airbnb_price.csv**
This is a CSV file containing data on Airbnb listing prices and locations.
- **`listing_id`**: unique identifier of listing
- **`price`**: nightly listing price in USD
- **`nbhood_full`**: name of borough and neighborhood where listing is located

**data/airbnb_room_type.xlsx**
This is an Excel file containing data on Airbnb listing descriptions and room types.
- **`listing_id`**: unique identifier of listing
- **`description`**: listing description
- **`room_type`**: Airbnb has three types of rooms: shared rooms, private rooms, and entire homes/apartments

**data/airbnb_last_review.tsv**
This is a TSV file containing data on Airbnb host names and review dates.
- **`listing_id`**: unique identifier of listing
- **`host_name`**: name of listing host
- **`last_review`**: date when the listing was last reviewed

In [1]:
# import libraries
import numpy as np
import pandas as pd

In [3]:
# Load the three data files: CSV, Excel, and TSV (tab-separated, hence sep='\t')
prices = pd.read_csv('airbnb_price.csv')


rooms = pd.read_excel('airbnb_room_type.xlsx')


reviews = pd.read_csv('airbnb_last_review.tsv', sep = '\t')
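
All three tables share the `listing_id` column, so they could be combined into a single table if a joint analysis were needed. The sketch below is only an illustration of that idea: the miniature DataFrames and their values are hypothetical stand-ins for the loaded files, and the notebook itself analyzes each table separately.

```python
import pandas as pd

# Hypothetical miniature stand-ins for the three loaded tables (values are made up)
prices = pd.DataFrame({
    "listing_id": [1, 2, 3],
    "price": ["100 dollars", "85 dollars", "240 dollars"],
})
rooms = pd.DataFrame({
    "listing_id": [1, 2, 3],
    "room_type": ["Private room", "entire home/apt", "PRIVATE ROOM"],
})
reviews = pd.DataFrame({
    "listing_id": [1, 2, 3],
    "last_review": ["May 21 2019", "July 05 2019", "June 22 2019"],
})

# Join on the shared key; validate="one_to_one" raises an error
# if listing_id is duplicated in either table being merged
listings = prices.merge(rooms, on="listing_id", how="inner", validate="one_to_one")
listings = listings.merge(reviews, on="listing_id", how="inner", validate="one_to_one")

print(listings.columns.tolist())
```

An inner join keeps only listings present in every table; `how="left"` would instead keep every row of `prices`.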

In [4]:
# Find the earliest and most recent reviews 
reviews['last_review_date'] = pd.to_datetime(reviews['last_review'], format='%B %d %Y')


# Find earliest entry
earliest_date = reviews['last_review_date'].min()

# Find latest entry
latest_date = reviews['last_review_date'].max()


print(f'The earliest review for an Airbnb listing was registered on {earliest_date.date()}, whereas the latest was registered on {latest_date.date()}.')

The earliest review for an Airbnb listing was registered on 2019-01-01, whereas the latest was registered on 2019-07-09.
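
As a small illustration of the parsing step, the sketch below applies the same `'%B %d %Y'` format (full month name, day, four-digit year) to a few hypothetical sample strings. It assumes an English locale, since `%B` matches month names in the current locale.

```python
import pandas as pd

# Hypothetical sample values in the "Month day year" layout
sample = pd.Series(["May 21 2019", "January 01 2019", "July 09 2019"])

# An explicit format avoids ambiguous guessing and fails loudly on malformed rows
parsed = pd.to_datetime(sample, format="%B %d %Y")

# .date() drops the 00:00:00 time component for cleaner display
earliest = parsed.min().date()
latest = parsed.max().date()
print(f"Earliest: {earliest}, latest: {latest}")
```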


In [5]:
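# Number of private room listings
# Lowercase first so capitalization variants (e.g. 'Private room', 'PRIVATE ROOM') are counted together
rooms['room_type'] = rooms['room_type'].str.lower()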
private_rooms = len(rooms[rooms['room_type'] == 'private room'])

print('There are', private_rooms, 'private room listings.')

There are 11356 private room listings.
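
To see why lowercasing matters before counting, here is a minimal sketch on a hypothetical Series with mixed capitalization. It also shows `value_counts()` as one way to inspect all categories at once, and a boolean-mask sum as an alternative to `len()` on a filtered frame.

```python
import pandas as pd

# Hypothetical room_type values with inconsistent capitalization
room_type = pd.Series([
    "Private room", "PRIVATE ROOM", "private room",
    "Entire home/apt", "shared room",
])

distinct_before = room_type.nunique()   # every spelling counts as its own category
normalized = room_type.str.lower()
distinct_after = normalized.nunique()   # spellings collapse into real categories

# Summing a boolean mask counts the True values
private_rooms = int((normalized == "private room").sum())

print(normalized.value_counts())
print(distinct_before, distinct_after, private_rooms)
```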


In [6]:
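# Find the average listing price

# Remove the ' dollars' suffix, then convert to float.
# str.replace removes the exact substring; str.strip(' dollars') would instead
# strip any of the characters ' ', d, o, l, a, r, s from both ends.
prices['price'] = prices['price'].str.replace(' dollars', '', regex=False)
prices['price'] = prices['price'].astype(float)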

mean_price = round(prices['price'].mean(), 2)

print('The average nightly price of an Airbnb listing in New York is', mean_price, 'dollars.')

The average nightly price of an Airbnb listing in New York is 141.78 dollars.
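
One subtlety when cleaning strings such as `'225 dollars'`: `str.strip` treats its argument as a set of characters to remove from both ends, not as a literal suffix. It happens to work for purely numeric prices because digits are not in that set, but it can silently eat other text. The sketch below uses hypothetical sample values to contrast it with an exact substring removal.

```python
import pandas as pd

# Hypothetical price strings in the "<number> dollars" layout
raw = pd.Series(["225 dollars", "89 dollars"])

# Exact substring removal: only the literal text ' dollars' is dropped
cleaned = raw.str.replace(" dollars", "", regex=False)
mean_price = round(cleaned.astype(float).mean(), 2)
print(cleaned.tolist(), mean_price)

# Character-set stripping: every character of 'solar' is also in ' dollars',
# so the whole string is stripped away
pitfall = "solar dollars".strip(" dollars")
print(repr(pitfall))
```

On pandas 1.4 or later, `Series.str.removesuffix(' dollars')` is another option that removes the text only when it appears at the end.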


In [8]:
# Combine all previous results in a single DataFrame

data = {
    'first_reviewed' : [earliest_date],
    'last_reviewed' : [latest_date],
    'nb_private_rooms' : [private_rooms],
    'avg_price' : [mean_price]
}

review_dates = pd.DataFrame(data)

review_dates.head()

  first_reviewed last_reviewed  nb_private_rooms  avg_price
0     2019-01-01    2019-07-09             11356     141.78
