<p align="center">
<img src="https://github.com/datacamp/r-live-training-template/blob/master/assets/datacamp.svg?raw=True" alt = "DataCamp icon" width="50%">
</p>
<br><br>

## Cleaning Data in R Live Training
Welcome to this hands-on training where you'll identify issues in a dataset and clean it from start to finish using R. It's often said that data scientists spend 80% of their time cleaning and manipulating data and only about 20% of their time analyzing it, so cleaning data is an important skill to master!

In this session, you will:

- Examine a dataset, identify its problem areas, and work out what needs to be done to fix them.
- Convert between data types to make analysis easier.
- Correct inconsistencies in categorical data.
- Deal with missing data.
- Perform data validation to ensure every value makes sense.
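
As a rough preview of these steps, here is a minimal base R sketch on a small made-up data frame. The `toy` data, its column names, and the 0 to 5 rating range are assumptions for illustration only; they are not taken from the training dataset.

```r
# Hypothetical toy data, invented for illustration only
toy <- data.frame(
  room_type = c("Private room", "private room", "PRIVATE ROOM ", "Entire home/apt"),
  price     = c("$45", "$135", "$150", "$86"),
  rating    = c(4.1, NA, 5.6, 4.8),
  stringsAsFactors = FALSE
)

# 1. Examine the data: column types and summary statistics
str(toy)
summary(toy)

# 2. Convert types: strip the dollar sign, then parse as numeric
toy$price <- as.numeric(sub("$", "", toy$price, fixed = TRUE))

# 3. Correct categorical inconsistencies: trim whitespace, unify case
toy$room_type <- tolower(trimws(toy$room_type))

# 4. Deal with missing data: count missing ratings, then drop those rows
n_missing <- sum(is.na(toy$rating))
toy_complete <- toy[!is.na(toy$rating), ]

# 5. Validate: flag ratings outside the assumed 0 to 5 range
out_of_range <- toy_complete[toy_complete$rating < 0 | toy_complete$rating > 5, ]
```

Dropping rows is only one way to handle missing values; whether it is appropriate depends on why the values are missing.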

## The Dataset

The dataset you'll use is a CSV file named `nyc_airbnb.csv`, which contains data on Airbnb listings in New York City. It contains the following columns:

- `listing_id`: The unique identifier for a listing
- `name`: The name (short description) used on the listing
- `host_id`: Unique identifier for a host
- `host_name`: Name of host
- `nbhood_full`: Name of borough and neighborhood
- `coordinates`: Coordinates of listing _(latitude, longitude)_
- `room_type`: Type of room 
- `price`: Price per night for listing
- `nb_reviews`: Number of reviews received 
- `last_review`: Date of last review
- `reviews_per_month`: Average number of reviews per month
- `availability_365`: Number of days available per year
- `avg_rating`: Average rating (from 0 to 5)
- `nb_stays`: Total number of stays thus far
- `pct_5_stars`: Proportion of reviews that were 5 stars (stored as a fraction, e.g. 0.61)
- `listing_added`: Date when listing was added


```r
# Install R packages
install.packages("dplyr")
install.packages("stringr")
install.packages("ggplot2")
```
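
Calling `install.packages()` unconditionally reinstalls each package every time the cell runs. One possible alternative, shown here as a sketch, installs only what is missing. The helper name `install_if_missing` and its `dry_run` argument are illustrative and not part of the original notebook.

```r
# Hypothetical helper: install only the packages that are not yet available
install_if_missing <- function(pkgs, dry_run = FALSE) {
  # requireNamespace() returns FALSE (without an error) for missing packages
  is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  missing_pkgs <- pkgs[!is_installed]

  # With dry_run = TRUE, nothing is installed; the names are only reported
  if (length(missing_pkgs) > 0 && !dry_run) {
    install.packages(missing_pkgs)
  }

  # Return the names of the packages that were missing
  invisible(missing_pkgs)
}
```

For this session, the call would be `install_if_missing(c("dplyr", "stringr", "ggplot2"))`.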

```r
# Load packages
library(dplyr)
library(stringr)
library(ggplot2)
```

```r
# Get dataset
airbnb <- read.csv("https://raw.githubusercontent.com/datacamp/cleaning-data-in-r-live-training/master/assets/nyc_airbnb.csv")
```

```r
# Preview the first six rows
head(airbnb)
```

| | X | listing_id | name | host_id | host_name | nbhood_full | coordinates | room_type | price | nb_reviews | last_review | reviews_per_month | availability_365 | avg_rating | nb_stays | pct_5_stars | listing_added |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | `<int>` | `<int>` | `<fct>` | `<int>` | `<fct>` | `<fct>` | `<fct>` | `<fct>` | `<fct>` | `<int>` | `<fct>` | `<dbl>` | `<int>` | `<dbl>` | `<dbl>` | `<dbl>` | `<fct>` |
| 1 | 1 | 13740704 | Cozy,budget friendly, cable inc, private entrance! | 20583125 | Michel | Brooklyn, Flatlands | (40.63222, -73.93398) | Private room | $45 | 10 | 2018-12-12 | 0.7 | 85 | 4.100954 | 12.0 | 0.6094315 | 2018-06-08 |
| 2 | 2 | 22005115 | Two floor apartment near Central Park | 82746113 | Cecilia | Manhattan, Upper West Side | (40.78761, -73.96862) | Entire home/apt | $135 | 1 | 2019-06-30 | 1.0 | 145 | 3.3676 | 1.2 | 0.7461346 | 2018-12-25 |
| 3 | 3 | 21667615 | Beautiful 1BR in Brooklyn Heights | 78251 | Leslie | Brooklyn, Brooklyn Heights | (40.7007, -73.99517) | Entire home/apt | $150 | 0 | | | 65 | | | | 2018-08-15 |
| 4 | 4 | 6425850 | Spacious, charming studio | 32715865 | Yelena | Manhattan, Upper West Side | (40.79169, -73.97498) | Entire home/apt | $86 | 5 | 2017-09-23 | 0.13 | 0 | 4.763203 | 6.0 | 0.7699471 | 2017-03-20 |
| 5 | 5 | 22986519 | Bedroom on the lively Lower East Side | 154262349 | Brooke | Manhattan, Lower East Side | (40.71884, -73.98354) | Private room | $160 | 23 | 2019-06-12 | 2.29 | 102 | 3.822591 | 27.6 | 0.6493831 | 2020-10-23 |
| 6 | 6 | 271954 | Beautiful brownstone apartment | 1423798 | Aj | Manhattan, Greenwich Village | (40.73388, -73.99452) | Entire home/apt | $150 | 203 | 2019-06-20 | 2.22 | 300 | 4.478396 | 243.6 | 0.7434997 | 2018-12-15 |
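
The type row in this preview hints at the cleaning work ahead: `price` is a factor because of its `$` prefix, `last_review` is a factor rather than a date, and `coordinates` packs latitude and longitude into one value. Below is a minimal sketch of those conversions on three rows copied from the preview. The data frame name `preview` and the new `latitude` and `longitude` columns are illustrative, and the blank `last_review` is assumed to be an empty string. The sketch uses base R so that it runs without extra packages, although the session itself loads `dplyr` and `stringr`.

```r
# Three rows copied from the preview (only the columns used here)
preview <- data.frame(
  listing_id  = c(13740704, 22005115, 21667615),
  price       = factor(c("$45", "$135", "$150")),
  last_review = factor(c("2018-12-12", "2019-06-30", "")),
  coordinates = factor(c("(40.63222, -73.93398)",
                         "(40.78761, -73.96862)",
                         "(40.7007, -73.99517)"))
)

# Check the class of every column
sapply(preview, class)

# Factor -> character -> numeric, dropping the "$" prefix.
# Calling as.numeric() directly on a factor would return level codes instead.
preview$price <- as.numeric(sub("$", "", as.character(preview$price), fixed = TRUE))

# Turn empty strings into NA, then parse the YYYY-MM-DD dates
review_chr <- as.character(preview$last_review)
review_chr[review_chr == ""] <- NA
preview$last_review <- as.Date(review_chr, format = "%Y-%m-%d")

# Split "(lat, long)" into two numeric columns
coords <- gsub("[()]", "", as.character(preview$coordinates))
parts <- strsplit(coords, ", ", fixed = TRUE)
preview$latitude  <- as.numeric(vapply(parts, `[`, character(1), 1))
preview$longitude <- as.numeric(vapply(parts, `[`, character(1), 2))
```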
