Start by loading the `tidymodels` package...
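
As a minimal sketch, assuming the package is already installed, attaching it might look like the cell below. The metapackage also attaches dplyr, recipes, and parsnip, which supply `mutate()`, `recipe()`, and `decision_tree()` in the later cells.

```r
# Attach the tidymodels metapackage (assumes it has been installed beforehand,
# for example with install.packages("tidymodels")).
library(tidymodels)
```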

Let's work with [hotel bookings data](https://github.com/rfordatascience/tidytuesday/blob/master/data/2020/2020-02-11/readme.md) from TidyTuesday. 

In [None]:
hotels = 
  readr::read_csv("https://tidymodels.org/start/case-study/hotels.csv", col_types = readr::cols()) |>
  mutate(across(where(is.character), as.factor)) # convert every character column to a factor

hotels |> head()

Each row is one hotel stay. The goal is to predict whether a stay included children (the `children` column) from the other features.

The `arrival_date` column holds more information than a raw date exposes (day of the week, month, holidays), so let's build a `recipe` to extract it.

In [None]:
rec = recipe(children ~ ., data = hotels) |>
    step_date(arrival_date) |>
    step_holiday(arrival_date) |>
    step_rm(arrival_date) |>
    prep()

What does this recipe do? Look at the transformed data with `juice()`. (In current versions of recipes, `bake(rec, new_data = NULL)` is the recommended equivalent.)
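
To see what these steps produce without giving away the exercise, here is a sketch of the same recipe on a tiny stand-in data set. The `toy` tibble and the `toy_*` names are hypothetical and exist only for illustration.

```r
library(tidymodels)

# Hypothetical stand-in data: three stays with an arrival date and an outcome.
toy = tibble(
  arrival_date = as.Date(c("2016-12-25", "2017-01-01", "2017-07-04")),
  children     = factor(c("children", "none", "none"))
)

toy_rec = recipe(children ~ ., data = toy) |>
  step_date(arrival_date) |>     # adds day-of-week, month, and year columns
  step_holiday(arrival_date) |>  # adds 0/1 indicators for the default holidays
  step_rm(arrival_date) |>       # drops the raw date column
  prep()

# juice() returns the processed training data stored in the prepped recipe.
toy_juiced = juice(toy_rec)

# bake() with new_data = NULL is the newer way to ask for the same thing.
toy_baked = bake(toy_rec, new_data = NULL)

names(toy_juiced)
```

Comparing `names(toy)` with `names(toy_juiced)` shows which columns each step added or removed.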

Let's start with a decision tree. 

In [None]:
mod = decision_tree() |>
    set_engine('rpart') |>
    set_mode('classification')

Fit `mod` using the data from your prepped recipe and store the result as `mod_fit`, the name the plotting snippet below expects. (Hint: remember `juice()`.)
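
The general pattern is sketched below on the built-in `iris` data rather than on `hotels`. The recipe, the `step_normalize()` step, and the `iris_rec`, `tree_spec`, and `tree_fit` names are stand-ins chosen for illustration, not part of the exercise.

```r
library(tidymodels)

# Stand-in prepped recipe on the built-in iris data.
iris_rec = recipe(Species ~ ., data = iris) |>
  step_normalize(all_numeric_predictors()) |>
  prep()

# Same kind of model specification as `mod` above.
tree_spec = decision_tree() |>
  set_engine("rpart") |>
  set_mode("classification")

# fit() takes the model spec, a formula, and the already-processed data.
tree_fit = tree_spec |>
  fit(Species ~ ., data = juice(iris_rec))

tree_fit
```

The underlying `rpart` object lives in `tree_fit$fit`, which is the piece a plotting function such as `rpart.plot()` works with.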

Use the snippet below to visualize the tree. What is the first split?

```r
rpart.plot::rpart.plot(mod_fit$fit)
```

Now create a random forest model with `rand_forest()` and fit it to the transformed `hotels` data.

1. Set the `mode` to "classification".
2. Add `importance = "impurity"` to `set_engine()` so the fitted model stores feature importances. This argument belongs to the `ranger` engine, which is parsnip's default for `rand_forest()`.
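
A minimal sketch of such a specification, again fitted to the built-in `iris` data as a stand-in, might look like this. It assumes the `ranger` package is installed; `rf_spec` and `rf_fit` are illustrative names.

```r
library(tidymodels)

# Extra arguments to set_engine() are passed on to the engine, here ranger.
rf_spec = rand_forest() |>
  set_engine("ranger", importance = "impurity") |>
  set_mode("classification")

# Stand-in fit on iris; the exercise uses the transformed hotels data instead.
rf_fit = rf_spec |>
  fit(Species ~ ., data = iris)

# With importance = "impurity", ranger stores one score per predictor.
rf_fit$fit$variable.importance
```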

Finally, plot the feature importances. Pipe your fitted model into `extract_fit_parsnip()` and then into `vip(num_features = 20)`, which comes from the `vip` package.
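
As a rough sketch of the plotting step, the cell below uses a stand-in forest on `iris` and assumes the `vip` and `ranger` packages are installed. Here the model is fitted directly with parsnip, so the fitted object is passed straight to `vip()`; `extract_fit_parsnip()` is the step that pulls that same kind of object out of a fitted workflow. The `iris_rf` and `importance_plot` names are illustrative.

```r
library(tidymodels)
library(vip)

# Stand-in random forest that records impurity-based importances.
iris_rf = rand_forest() |>
  set_engine("ranger", importance = "impurity") |>
  set_mode("classification") |>
  fit(Species ~ ., data = iris)

# vip() draws a bar chart of the top `num_features` importance scores.
# iris has only four predictors, so four is the most it can show here.
importance_plot = vip(iris_rf, num_features = 4)

importance_plot
```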