![NYC Skyline](nyc.jpg)

Welcome to New York City, one of the most-visited cities in the world. The city has many Airbnb listings to meet the high demand for temporary lodging, with stays ranging from a few nights to many months. In this project, we will take a closer look at the New York Airbnb market by combining data from multiple file types: `.csv`, `.tsv`, and `.xlsx`.

Recall that **CSV**, **TSV**, and **Excel** files are three common formats for storing data. 
Three files containing data on 2019 Airbnb listings are available to you:

**data/airbnb_price.csv**
This is a CSV file containing data on Airbnb listing prices and locations.
- **`listing_id`**: unique identifier of listing
- **`price`**: nightly listing price in USD
- **`nbhood_full`**: name of borough and neighborhood where listing is located

**data/airbnb_room_type.xlsx**
This is an Excel file containing data on Airbnb listing descriptions and room types.
- **`listing_id`**: unique identifier of listing
- **`description`**: listing description
- **`room_type`**: type of room; Airbnb has three: shared rooms, private rooms, and entire homes/apartments

**data/airbnb_last_review.tsv**
This is a TSV file containing data on Airbnb host names and review dates.
- **`listing_id`**: unique identifier of listing
- **`host_name`**: name of listing host
- **`last_review`**: date when the listing was last reviewed
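
As a minimal sketch of how the delimited formats differ, the snippet below reads a comma-separated and a tab-separated sample from in-memory strings and joins them on `listing_id`. The sample rows, host names, and borough labels are made up for illustration and are not taken from the project files. The Excel file is left out because `pd.read_excel` needs a file on disk and an installed engine such as `openpyxl`.

```python
import io

import pandas as pd

# Hypothetical sample rows that only mimic the column layout described above.
csv_text = (
    "listing_id,price,nbhood_full\n"
    "1,100 dollars,Some Borough\n"
    "2,80 dollars,Other Borough\n"
)
tsv_text = (
    "listing_id\thost_name\tlast_review\n"
    "1\tAlex\t2019-05-21\n"
    "2\tSam\t2019-07-05\n"
)

# CSV and TSV differ only in the delimiter, so read_csv handles both.
prices = pd.read_csv(io.StringIO(csv_text))
reviews = pd.read_csv(io.StringIO(tsv_text), sep="\t")

# validate="one_to_one" raises an error if listing_id is duplicated on either side.
merged = prices.merge(reviews, on="listing_id", how="inner", validate="one_to_one")
print(merged.shape)
```

Passing `validate` is optional; it is included here as a cheap guard against a key that is not actually unique.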

In [1]:
# We've loaded your first package for you! You can add as many cells as you need.
import numpy as np
import pandas as pd

# Load the three source files; the TSV file is tab-delimited, so pass sep="\t".
price = pd.read_csv("data/airbnb_price.csv")
room_type = pd.read_excel("data/airbnb_room_type.xlsx")
review = pd.read_csv("data/airbnb_last_review.tsv", sep="\t")

# Combine the three tables on their shared key (merge defaults to an inner join).
df = price.merge(room_type, on="listing_id").merge(review, on="listing_id")
# df.info() prints its summary and returns None, so print() also emits "None".
print(df.info())

# Parse the review dates, then find the earliest and most recent review.
df['last_review'] = pd.to_datetime(df['last_review'])
earliest_review = df['last_review'].dt.date.min()
recent_review = df['last_review'].dt.date.max()

# Normalize capitalization before counting private rooms.
df['room_type'] = df['room_type'].str.lower()
nb_private_rooms = (df['room_type'] == 'private room').sum()
print(df['room_type'].head(10))
print(nb_private_rooms)

# Strip the " dollars" suffix so the price column can be converted to float.
df['price'] = df['price'].str.replace(" dollars", "", regex=False).astype(float)
avg_price = df['price'].mean()

# Collect the results in a one-row summary DataFrame.
review_dates = pd.DataFrame({
    'first_reviewed': [earliest_review],
    'last_reviewed': [recent_review],
    'nb_private_rooms': [nb_private_rooms],
    'avg_price': [round(avg_price, 2)],
})

print(review_dates)

<class 'pandas.core.frame.DataFrame'>
Int64Index: 25209 entries, 0 to 25208
Data columns (total 7 columns):
 #   Column       Non-Null Count  Dtype 
---  ------       --------------  ----- 
 0   listing_id   25209 non-null  int64 
 1   price        25209 non-null  object
 2   nbhood_full  25209 non-null  object
 3   description  25199 non-null  object
 4   room_type    25209 non-null  object
 5   host_name    25201 non-null  object
 6   last_review  25209 non-null  object
dtypes: int64(1), object(6)
memory usage: 1.5+ MB
None
0    entire home/apt
1    entire home/apt
2    entire home/apt
3       private room
4    entire home/apt
5    entire home/apt
6       private room
7       private room
8       private room
9    entire home/apt
Name: room_type, dtype: object
11356
  first_reviewed last_reviewed  nb_private_rooms  avg_price
0     2019-01-01    2019-07-09             11356     141.78
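
The cleaning steps in the cell above can be checked on a small scale. The following is a sketch on a three-row toy frame; the values are invented for illustration and only the column names and the `" dollars"` suffix come from the notebook. It repeats the same operations: stripping the suffix, lowercasing room types, parsing dates, and building the one-row summary.

```python
import pandas as pd

# Hypothetical rows; only the column names mirror the merged DataFrame.
toy = pd.DataFrame({
    "price": ["225 dollars", "89 dollars", "200 dollars"],
    "room_type": ["Entire home/apt", "PRIVATE ROOM", "private room"],
    "last_review": ["2019-05-21", "2019-01-05", "2019-07-09"],
})

# Remove the literal suffix (regex=False) and convert to a numeric dtype.
toy["price"] = toy["price"].str.replace(" dollars", "", regex=False).astype(float)

# Lowercase so differently capitalized labels count as the same category.
toy["room_type"] = toy["room_type"].str.lower()

# Convert strings to datetime64 so min() and max() compare dates, not text.
toy["last_review"] = pd.to_datetime(toy["last_review"])

summary = pd.DataFrame({
    "first_reviewed": [toy["last_review"].min().date()],
    "last_reviewed": [toy["last_review"].max().date()],
    "nb_private_rooms": [int((toy["room_type"] == "private room").sum())],
    "avg_price": [round(toy["price"].mean(), 2)],
})
print(summary)
```

Without the lowercasing step, only one of the two private rooms in this toy frame would be counted, which is why the notebook normalizes `room_type` before filtering.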
