
 
Files containing data on 2019 Airbnb listings:

**data/airbnb_price.csv**
This is a CSV file containing data on Airbnb listing prices and locations.
- **`listing_id`**: unique identifier of listing
- **`price`**: nightly listing price in USD
- **`nbhood_full`**: name of borough and neighborhood where listing is located

**data/airbnb_room_type.xlsx**
This is an Excel file containing data on Airbnb listing descriptions and room types.
- **`listing_id`**: unique identifier of listing
- **`description`**: listing description
- **`room_type`**: type of room; Airbnb has three: shared rooms, private rooms, and entire homes/apartments

**data/airbnb_last_review.tsv**
This is a TSV file containing data on Airbnb host names and review dates.
- **`listing_id`**: unique identifier of listing
- **`host_name`**: name of listing host
- **`last_review`**: date when the listing was last reviewed
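
Since all three files share `listing_id`, they can be joined on that key. The following is a minimal sketch rather than the notebook's actual data: every row is made up for illustration, and the value formats (such as `"225 dollars"`) are assumptions. The `validate="one_to_one"` argument is an optional safeguard that makes pandas raise an error if a `listing_id` is duplicated in either table.

```python
import pandas as pd

# Hypothetical rows that mimic the columns described above; not real data.
price = pd.DataFrame({
    "listing_id": [1, 2, 3],
    "price": ["225 dollars", "89 dollars", "200 dollars"],  # assumed format
    "nbhood_full": ["Borough A, Area 1", "Borough B, Area 2", "Borough A, Area 3"],
})
room_type = pd.DataFrame({
    "listing_id": [1, 2, 3],
    "description": ["Listing one", "Listing two", "Listing three"],
    "room_type": ["Entire home/apt", "private room", "PRIVATE ROOM"],
})
last_review = pd.DataFrame({
    "listing_id": [1, 2, 3],
    "host_name": ["Host A", "Host B", "Host C"],
    "last_review": ["2019-05-21", "2019-07-05", "2019-06-22"],
})

# Join the three tables on the shared key; validate guards against
# duplicated listing_id values that would silently multiply rows.
merged = (
    price
    .merge(room_type, on="listing_id", validate="one_to_one")
    .merge(last_review, on="listing_id", validate="one_to_one")
)
print(merged.columns.tolist())
```

The default inner join keeps only listings present in all three tables, so comparing `len(merged)` with the length of each input is a quick way to spot unmatched IDs.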

In [19]:
import numpy as np
import pandas as pd


#1. Loading the data

airbnb_price = pd.read_csv("data/airbnb_price.csv")
airbnb_room_type = pd.read_excel("data/airbnb_room_type.xlsx")
airbnb_last_review = pd.read_csv("data/airbnb_last_review.tsv", sep='\t')
airbnb_price.head()
airbnb_room_type.head()
airbnb_last_review.head()


#2. Merging the three DataFrames
airbnb_m = pd.merge(airbnb_price, airbnb_room_type, on="listing_id")
airbnb = pd.merge(airbnb_m, airbnb_last_review, on="listing_id")
airbnb.head()


#3. Determining the earliest and most recent review dates
airbnb_last_review["last_review"] = pd.to_datetime(airbnb_last_review["last_review"])
first_reviewed = airbnb_last_review["last_review"].min()
last_reviewed = airbnb_last_review["last_review"].max()
print(first_reviewed)
print(last_reviewed)


#4. Finding how many listings are private rooms
airbnb_room_type["room_type"] = airbnb_room_type["room_type"].str.lower()
nb_private_rooms = airbnb_room_type[airbnb_room_type["room_type"] == "private room"].shape[0]
print(nb_private_rooms)


#5. Finding the average price of listings
# Strip the "dollars" suffix so the column can be converted to a number
airbnb_price["price"] = airbnb_price["price"].str.replace("dollars", "").astype(float)
avg_price = airbnb_price["price"].mean()
print(avg_price)



#6. Creating a DataFrame with the four solution values
avg_price = round(avg_price, 2)
review_dates = pd.DataFrame({
    "first_reviewed": [first_reviewed],
    "last_reviewed": [last_reviewed],
    "nb_private_rooms": [nb_private_rooms],
    "avg_price": [avg_price],
})








2019-01-01 00:00:00
2019-07-09 00:00:00
11356
141.7779364512674
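
The cleaning in steps 4 and 5 can be checked in isolation on a few made-up values. This is only a sketch: the sample strings are hypothetical, and the `"<number> dollars"` price format and the mixed capitalization of room types are assumptions inferred from the `str.replace` and `str.lower` calls above.

```python
import pandas as pd

# Hypothetical raw values; the real files may differ.
prices = pd.Series(["225 dollars", "89 dollars", "200 dollars"])
room_types = pd.Series(["Private room", "private room", "PRIVATE ROOM", "Entire home/apt"])

# Remove the literal suffix (regex=False treats the pattern as plain text),
# then convert the remaining digits to floats.
clean_prices = prices.str.replace(" dollars", "", regex=False).astype(float)

# Lowercase first so differently capitalized spellings count as one category.
n_private = int((room_types.str.lower() == "private room").sum())

avg_price = round(clean_prices.mean(), 2)
print(clean_prices.tolist(), n_private, avg_price)
```

Counting with a boolean mask and `.sum()` gives the same number as filtering the DataFrame and reading `.shape[0]`, without building the intermediate subset.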
