# Filters and Plots
## Overview
In this activity, you'll review a scenario and practice creating data visualizations with `ggplot2`. You will learn how to filter your data and use the facet features of `ggplot2` to create custom visualizations based on different criteria.

Throughout this activity, you will also have the opportunity to practice writing your own code by making changes to the code chunks yourself.

## The Scenario
As a junior data analyst for a hotel booking company, you have been asked to clean hotel booking data, create visualizations with `ggplot2` to gain insight into the data, and present different facets of the data through visualization. Now, you are going to build on the work you performed previously to apply filters to your data visualizations in `ggplot2`.

### Import Data

In [None]:
hotel_bookings <- read.csv("hotel_bookings.csv")

### Get to Know the Data

In [None]:
head(hotel_bookings)
colnames(hotel_bookings)

### Install Packages

In [None]:
install.packages('ggplot2')
library(ggplot2)

### Making Different Charts

You decide to create a bar chart showing each hotel type and market segment. Use different colors to represent each market segment.

In [None]:
ggplot(data = hotel_bookings) + geom_bar(mapping = aes(x = hotel, fill = market_segment))

![image.png](attachment:caa31d99-6cbe-4d6d-8eff-fdcd34dbbe76.png)
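By default, `geom_bar()` counts the rows for each value on the x-axis, and the `fill` aesthetic splits each bar by a second variable. As a rough sketch of the numbers behind a chart like this one, the same counts can be tabulated in base R. The `toy_bookings` data frame below is a made-up stand-in for `hotel_bookings`, not the real data:

```r
# Hypothetical toy data standing in for hotel_bookings (values are made up)
toy_bookings <- data.frame(
  hotel = c("City Hotel", "City Hotel", "City Hotel",
            "Resort Hotel", "Resort Hotel"),
  market_segment = c("Online TA", "Online TA", "Direct",
                     "Online TA", "Groups")
)

# Cross-tabulate bookings by hotel type (rows) and market segment (columns);
# each cell corresponds to one colored piece of a stacked bar
segment_counts <- table(toy_bookings$hotel, toy_bookings$market_segment)
print(segment_counts)
```

Running the same `table()` call on `hotel_bookings` should give the counts that the stacked bars display.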

You decide to use the `facet_wrap()` function to create a separate plot for each market segment:

In [None]:
ggplot(data = hotel_bookings) + 
    geom_bar(mapping = aes(x = hotel)) +
    facet_wrap(~market_segment)

![image.png](attachment:5d663c31-9560-4f8a-89e0-266d7ad5ad35.png)

### Filtering
For the next step, you will need to have the `tidyverse` package installed and loaded. You may see a pop-up asking if you want to install; if that's the case, click 'Install.' This may take a few minutes!

In [None]:
install.packages('tidyverse')
library(tidyverse)

After considering all the data, your stakeholder decides to send the promotion to families that make online bookings for city hotels. The online segment is the fastest growing segment, and families tend to spend more at city hotels than other types of guests. 

Your stakeholder asks if you can create a plot that shows the relationship between lead time and guests traveling with children for online bookings at city hotels. This will give her a better idea of the specific timing for the promotion. 

You think about it, and realize you have all the tools you need to fulfill the request. You break it down into the following two steps: 1) filtering your data; 2) plotting your filtered data. 

For the first step, you can use the `filter()` function (from `dplyr`, which is loaded with the tidyverse) to create a data set that only includes the data you want. Specify your criteria with 'City Hotel' in the first set of quotation marks and 'Online TA' in the second set:

In [None]:
onlineta_city_hotels <- filter(hotel_bookings,
                               (hotel == "City Hotel" &
                                  market_segment == "Online TA"))

You can use the `View()` function to check out your new data frame:

In [None]:
View(onlineta_city_hotels)
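
To see how a combined condition in `filter()` behaves, it can help to try it on a few rows where the expected result is easy to check by hand. The sketch below uses a made-up `toy_bookings` data frame (an assumption for illustration, not the real data). It also shows that separating conditions with commas gives the same result as joining them with `&`:

```r
library(dplyr)

# Hypothetical toy data standing in for hotel_bookings (values are made up)
toy_bookings <- data.frame(
  hotel = c("City Hotel", "City Hotel", "Resort Hotel", "Resort Hotel"),
  market_segment = c("Online TA", "Direct", "Online TA", "Groups"),
  lead_time = c(30, 5, 120, 60)
)

# Keep only rows where BOTH conditions are TRUE
with_and <- filter(toy_bookings,
                   hotel == "City Hotel" & market_segment == "Online TA")

# Conditions separated by commas are also combined with AND
with_comma <- filter(toy_bookings,
                     hotel == "City Hotel",
                     market_segment == "Online TA")

print(with_and)
print(identical(with_and, with_comma))
```

Only the first toy row satisfies both conditions, so that is the only row kept.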

There is also another way to do this: you can use the pipe operator (`%>%`) to apply the filters in steps.

You name this data frame `onlineta_city_hotels_v2`:

In [None]:
onlineta_city_hotels_v2 <- hotel_bookings %>%
  filter(hotel=="City Hotel") %>%
  filter(market_segment=="Online TA")
View(onlineta_city_hotels_v2)
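
The pipe passes the result on its left as the first argument of the function on its right, so a chain of piped `filter()` calls is equivalent to nesting those calls. A minimal sketch of that equivalence, again on a made-up `toy_bookings` data frame rather than the real data:

```r
library(dplyr)

# Hypothetical toy data standing in for hotel_bookings (values are made up)
toy_bookings <- data.frame(
  hotel = c("City Hotel", "City Hotel", "Resort Hotel", "Resort Hotel"),
  market_segment = c("Online TA", "Direct", "Online TA", "Groups"),
  lead_time = c(30, 5, 120, 60)
)

# Piped version: filter by hotel first, then by market segment
piped <- toy_bookings %>%
  filter(hotel == "City Hotel") %>%
  filter(market_segment == "Online TA")

# Nested version: the inner call runs first, the outer call second
nested <- filter(filter(toy_bookings, hotel == "City Hotel"),
                 market_segment == "Online TA")

print(identical(piped, nested))
```

The piped form reads top to bottom in the order the steps run, which is why it is often preferred when there are several steps.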

### Using the New Data Frame

Make a scatterplot using either `onlineta_city_hotels` or `onlineta_city_hotels_v2` to plot the data your stakeholder requested:

In [None]:
ggplot(data = onlineta_city_hotels) +
  geom_point(mapping = aes(x = lead_time, y = children))

![image.png](attachment:2434820f-d0eb-4554-bf04-1999bed3e053.png)
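
If you do not need to keep the filtered data frame, the two steps (filter, then plot) can also be written as a single chain. This is only a sketch of one possible style, shown on a made-up `toy_bookings` data frame rather than the real data. Note that the pipe (`%>%`) connects the data steps, while `+` adds layers to the plot:

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data standing in for hotel_bookings (values are made up)
toy_bookings <- data.frame(
  hotel = c("City Hotel", "City Hotel", "City Hotel", "Resort Hotel"),
  market_segment = c("Online TA", "Online TA", "Direct", "Online TA"),
  lead_time = c(30, 90, 5, 120),
  children = c(0, 2, 1, 0)
)

# Step 1 (filter) is piped straight into step 2 (plot)
toy_plot <- toy_bookings %>%
  filter(hotel == "City Hotel", market_segment == "Online TA") %>%
  ggplot() +
  geom_point(mapping = aes(x = lead_time, y = children))

# Printing the object draws the scatterplot
print(toy_plot)
```

Replacing `toy_bookings` with `hotel_bookings` should produce the same kind of scatterplot as the two-step version.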