![NYC Skyline](img/nyc.jpg)

Welcome to New York City, one of the most-visited cities in the world. Many [Airbnb](https://www.airbnb.com/) listings in New York City meet the high demand for temporary lodging, with stays ranging from a few nights to many months. In this project, you will take a closer look at the New York Airbnb market by combining data from multiple file types: `.csv`, `.tsv`, and `.xlsx` (Excel files).

Recall that **CSV**, **TSV**, and **Excel** files are three common formats for storing data. 
Three files containing data on 2019 Airbnb listings are available to you:

**data/airbnb_price.csv**
This is a CSV file containing data on Airbnb listing prices and locations.
- **`listing_id`**: unique identifier of listing
- **`price`**: nightly listing price in USD
- **`nbhood_full`**: name of borough and neighborhood where listing is located

**data/airbnb_room_type.xlsx**
This is an Excel file containing data on Airbnb listing descriptions and room types.
- **`listing_id`**: unique identifier of listing
- **`description`**: listing description
- **`room_type`**: Airbnb has three types of rooms: shared rooms, private rooms, and entire homes/apartments

**data/airbnb_last_review.tsv**
This is a TSV file containing data on Airbnb host names and review dates.
- **`listing_id`**: unique identifier of listing
- **`host_name`**: name of listing host
- **`last_review`**: date when the listing was last reviewed


In [152]:
# We've loaded the necessary packages for you in the first cell. Please feel free to add as many cells as you like!
suppressMessages(library(dplyr)) # This line is required to check your answer correctly
options(readr.show_types = FALSE) # This line is required to check your answer correctly
library(readr)
library(readxl)
library(stringr)

In [153]:
# Load data
airbnb_price <- read_csv('data/airbnb_price.csv')
airbnb_price
airbnb_room_type <- read_excel('data/airbnb_room_type.xlsx')
airbnb_room_type
airbnb_room_review <- read_tsv('data/airbnb_last_review.tsv')
airbnb_room_review

Rows: 25209 Columns: 3
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (2): price, nbhood_full
dbl (1): listing_id

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.


listing_id,price,nbhood_full
<dbl>,<chr>,<chr>
2595,225 dollars,"Manhattan, Midtown"
3831,89 dollars,"Brooklyn, Clinton Hill"
5099,200 dollars,"Manhattan, Murray Hill"
5178,79 dollars,"Manhattan, Hell's Kitchen"
5238,150 dollars,"Manhattan, Chinatown"
5295,135 dollars,"Manhattan, Upper West Side"
5441,85 dollars,"Manhattan, Hell's Kitchen"
5803,89 dollars,"Brooklyn, South Slope"
6021,85 dollars,"Manhattan, Upper West Side"
6848,140 dollars,"Brooklyn, Williamsburg"


listing_id,description,room_type
<dbl>,<chr>,<chr>
2595,Skylit Midtown Castle,Entire home/apt
3831,Cozy Entire Floor of Brownstone,Entire home/apt
5099,Large Cozy 1 BR Apartment In Midtown East,Entire home/apt
5178,Large Furnished Room Near B'way,private room
5238,Cute & Cozy Lower East Side 1 bdrm,Entire home/apt
5295,Beautiful 1br on Upper West Side,Entire home/apt
5441,Central Manhattan/near Broadway,Private room
5803,"Lovely Room 1, Garden, Best Area, Legal rental",Private room
6021,Wonderful Guest Bedroom in Manhattan for SINGLES,Private room
6848,Only 2 stops to Manhattan studio,entire home/apt


Rows: 25209 Columns: 3
── Column specification ────────────────────────────────────────────────────────
Delimiter: "\t"
chr (2): host_name, last_review
dbl (1): listing_id

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.


listing_id,host_name,last_review
<dbl>,<chr>,<chr>
2595,Jennifer,May 21 2019
3831,LisaRoxanne,July 05 2019
5099,Chris,June 22 2019
5178,Shunichi,June 24 2019
5238,Ben,June 09 2019
5295,Lena,June 22 2019
5441,Kate,June 23 2019
5803,Laurie,June 24 2019
6021,Claudio,July 05 2019
6848,Allen & Irina,June 29 2019


In [154]:
# Merge data together using %>% operator
merged_airbnb_data <- airbnb_price %>% 
  inner_join(airbnb_room_type, by = 'listing_id') %>% 
  inner_join(airbnb_room_review, by = 'listing_id')

merged_airbnb_data

listing_id,price,nbhood_full,description,room_type,host_name,last_review
<dbl>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>
2595,225 dollars,"Manhattan, Midtown",Skylit Midtown Castle,Entire home/apt,Jennifer,May 21 2019
3831,89 dollars,"Brooklyn, Clinton Hill",Cozy Entire Floor of Brownstone,Entire home/apt,LisaRoxanne,July 05 2019
5099,200 dollars,"Manhattan, Murray Hill",Large Cozy 1 BR Apartment In Midtown East,Entire home/apt,Chris,June 22 2019
5178,79 dollars,"Manhattan, Hell's Kitchen",Large Furnished Room Near B'way,private room,Shunichi,June 24 2019
5238,150 dollars,"Manhattan, Chinatown",Cute & Cozy Lower East Side 1 bdrm,Entire home/apt,Ben,June 09 2019
5295,135 dollars,"Manhattan, Upper West Side",Beautiful 1br on Upper West Side,Entire home/apt,Lena,June 22 2019
5441,85 dollars,"Manhattan, Hell's Kitchen",Central Manhattan/near Broadway,Private room,Kate,June 23 2019
5803,89 dollars,"Brooklyn, South Slope","Lovely Room 1, Garden, Best Area, Legal rental",Private room,Laurie,June 24 2019
6021,85 dollars,"Manhattan, Upper West Side",Wonderful Guest Bedroom in Manhattan for SINGLES,Private room,Claudio,July 05 2019
6848,140 dollars,"Brooklyn, Williamsburg",Only 2 stops to Manhattan studio,entire home/apt,Allen & Irina,June 29 2019
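
Because `inner_join()` keeps only rows whose `listing_id` appears in every table, a listing missing from one file would be dropped without warning. The sketch below shows one way to check for that. It uses small hypothetical tables (`toy_price`, `toy_room_type`) built from values in the previews above, plus one invented listing (`9999`) that has no match.

```r
library(dplyr)

# Hypothetical toy tables: IDs and values come from the previews above,
# except listing 9999, which is invented to show an unmatched row
toy_price <- tibble(
  listing_id = c(2595, 3831, 9999),
  price = c("225 dollars", "89 dollars", "100 dollars")
)
toy_room_type <- tibble(
  listing_id = c(2595, 3831),
  room_type = c("Entire home/apt", "Entire home/apt")
)

# inner_join() keeps only the IDs present in both tables
toy_merged <- inner_join(toy_price, toy_room_type, by = "listing_id")

# anti_join() returns rows of the first table with no match in the second
toy_unmatched <- anti_join(toy_price, toy_room_type, by = "listing_id")

nrow(toy_merged)          # 2
toy_unmatched$listing_id  # 9999
```

Running the same `anti_join()` on the real tables and getting zero rows would suggest that the inner join did not lose any listings.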


In [155]:
# Determining the earliest and most recent review dates
# Step 1: Converting reviews to a date format
# Step 2: Comparing dates of reviews
# Step 3: Pulling the obtained data to create a tibble at a later stage as per instructions
review_range <- merged_airbnb_data %>%
  mutate(last_review = as.Date(last_review, format = '%B %d %Y')) %>%
  summarize(first_reviewed = min(last_review),
            last_reviewed = max(last_review))

number_one <- pull(review_range, first_reviewed)
number_one

number_two <- pull(review_range, last_reviewed)
number_two
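
`as.Date()` returns `NA` for any string that does not match the given format, and `min()` or `max()` then return `NA` as well. A minimal sketch of how to spot such values, assuming an English locale for month names (`%B`) and using a hypothetical vector `sample_reviews` with three dates from the preview and one invented malformed entry:

```r
# Hypothetical sample: three last_review strings from the preview above,
# plus one invented value in a different format
sample_reviews <- c("May 21 2019", "July 05 2019", "June 22 2019", "2019-06-24")

parsed_reviews <- as.Date(sample_reviews, format = "%B %d %Y")

# Strings that did not match the format were turned into NA
unparsed <- sample_reviews[is.na(parsed_reviews)]
unparsed

# na.rm = TRUE ignores the unparsed values when taking the earliest and latest date
review_span <- range(parsed_reviews, na.rm = TRUE)
review_span
```

If `unparsed` is empty for the real column, the single format string covers every row and `na.rm` is not needed.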

In [156]:
# Finding how many listings are private rooms

# Step 1: Cleaning up variety in capitalisation
# Step 2: Counting the number of private rooms
# Step 3: Pulling the obtained data to create a tibble at a later stage as per instructions
number_three <- merged_airbnb_data %>%
  mutate(room_type = str_to_title(room_type)) %>%
  filter(room_type == "Private Room") %>%
  count() %>%
  pull(n)

number_three
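
To see why the capitalization step matters, it can help to count every room type before and after normalizing. The sketch below uses a hypothetical tibble `sample_rooms` holding the ten `room_type` values from the preview above, and `str_to_lower()` rather than `str_to_title()`; either works as long as the comparison string uses the same case.

```r
library(dplyr)
library(stringr)

# Hypothetical sample: the ten room_type values shown in the preview above
sample_rooms <- tibble(
  room_type = c("Entire home/apt", "Entire home/apt", "Entire home/apt",
                "private room", "Entire home/apt", "Entire home/apt",
                "Private room", "Private room", "Private room",
                "entire home/apt")
)

# Before cleaning, spellings that differ only in case count as separate types
n_raw_types <- n_distinct(sample_rooms$room_type)
n_raw_types  # 4

# After lowercasing, the variants collapse into one row per room type
room_type_counts <- sample_rooms %>%
  mutate(room_type = str_to_lower(room_type)) %>%
  count(room_type)
room_type_counts
```

These counts describe only the ten sample rows, not the full dataset.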

In [157]:
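# Finding the average price of listings
# Step 1: Removing the word 'dollars' from the price column
# Step 2: Converting the remaining text to a number
# Step 3: Averaging the prices and pulling the value for the final tibble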
number_four <- merged_airbnb_data %>%
  mutate(price = str_remove(price, 'dollars')) %>%
  mutate(price = as.numeric(price)) %>%
  summarize(avg_price = mean(price)) %>%
  pull(avg_price)

number_four
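
As an alternative to removing the suffix by hand, `readr::parse_number()` extracts the first number from each string and ignores the surrounding text. This is a sketch of that swapped-in approach, not the method used in the cell above; `sample_prices` is a hypothetical vector holding three values from the preview.

```r
library(readr)

# Hypothetical sample: three price strings from the preview above
sample_prices <- c("225 dollars", "89 dollars", "200 dollars")

# parse_number() drops the non-numeric text and returns a numeric vector
parsed_prices <- parse_number(sample_prices)
parsed_prices  # 225 89 200

mean(parsed_prices)
```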

In [158]:
# Creating a tibble with the four solution values
review_dates <- tibble(
  first_reviewed = number_one,
  last_reviewed = number_two,
  nb_private_rooms = as.numeric(number_three),
  avg_price = round(number_four, 2)
)
review_dates

first_reviewed,last_reviewed,nb_private_rooms,avg_price
<date>,<date>,<dbl>,<dbl>
2019-01-01,2019-07-09,11356,141.78
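
The four values could also be computed in one pass, cleaning all three columns first and then summarizing once. The following is a minimal sketch of that design under stated assumptions: `sample_listings` is a hypothetical three-row tibble built from rows in the preview, and month names are parsed in an English locale.

```r
library(dplyr)
library(stringr)

# Hypothetical sample: three rows taken from the merged preview above
sample_listings <- tibble(
  price = c("225 dollars", "79 dollars", "85 dollars"),
  room_type = c("Entire home/apt", "private room", "Private room"),
  last_review = c("May 21 2019", "June 24 2019", "June 23 2019")
)

summary_sketch <- sample_listings %>%
  # Clean all three columns in one mutate()
  mutate(
    price = as.numeric(str_remove(price, " dollars")),
    room_type = str_to_lower(room_type),
    last_review = as.Date(last_review, format = "%B %d %Y")
  ) %>%
  # Produce the four answers in a single one-row tibble
  summarize(
    first_reviewed = min(last_review),
    last_reviewed = max(last_review),
    nb_private_rooms = sum(room_type == "private room"),
    avg_price = round(mean(price), 2)
  )
summary_sketch
```

This avoids repeating the date conversion and keeps the cleaning rules in one place, at the cost of not exposing the intermediate `number_one` to `number_four` values.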
