## [12.3.4 Exercises](https://r4ds.hadley.nz/logicals#exercises-1)

### 1. Find all flights where `arr_delay` is missing but `dep_delay` is not. Find all flights where neither `arr_time` nor `sched_arr_time` is missing, but `arr_delay` is.

### 2. How many flights have a missing `dep_time`? What other variables are missing in these rows? What might these rows represent?

### 3. Assuming that a missing `dep_time` implies that a flight is cancelled, look at the number of cancelled flights per day. Is there a pattern? Is there a connection between the proportion of cancelled flights and the average delay of non-cancelled flights?

## [12.4.4 Exercises](https://r4ds.hadley.nz/logicals#exercises-2)

### 1. What will `sum(is.na(x))` tell you? How about `mean(is.na(x))`?
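A minimal sketch on a made-up vector (the values in `x` are an assumption for illustration). Because `TRUE` counts as 1 and `FALSE` as 0, summing `is.na(x)` gives the number of missing values and averaging it gives the proportion of missing values.

```r
# Toy vector; the values are illustrative only
x <- c(1, NA, 3, NA, 5)

n_missing <- sum(is.na(x))      # number of missing values
prop_missing <- mean(is.na(x))  # proportion of missing values

n_missing     # 2
prop_missing  # 0.4
```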

### 2. What does `prod()` return when applied to a logical vector? What logical summary function is it equivalent to? What does `min()` return when applied to a logical vector? What logical summary function is it equivalent to? Read the documentation and perform a few experiments.
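One possible experiment, using made-up vectors `x` and `y`. It suggests that both `prod()` and `min()` behave like `all()` once the logical values are coerced to 0 and 1, although they return numbers rather than `TRUE`/`FALSE` and treat missing values differently.

```r
# Toy logical vector; values are illustrative only
x <- c(TRUE, TRUE, FALSE)

prod_x <- prod(x)  # 1 only if every element is TRUE, otherwise 0
min_x <- min(x)    # 1 only if every element is TRUE, otherwise 0
all_x <- all(x)    # the logical summary both resemble

# With missing values the three can disagree
y <- c(FALSE, NA)
all_y <- all(y)    # FALSE: a single FALSE already decides the answer
prod_y <- prod(y)  # NA: 0 * NA is NA
min_y <- min(y)    # NA unless na.rm = TRUE is supplied
```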

## [12.5.4 Exercises](https://r4ds.hadley.nz/logicals#exercises-3)

### 1. A number is even if it’s divisible by two, which in R you can find out with `x %% 2 == 0`. Use this fact and `if_else()` to determine whether each number between 0 and 20 is even or odd.
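A short sketch of one way to approach this, assuming dplyr is installed. The name `parity` is chosen here for illustration.

```r
library(dplyr)

x <- 0:20

# Label each number by testing divisibility by two
parity <- if_else(x %% 2 == 0, "even", "odd")

# Show each number next to its label
setNames(parity, x)
```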

### 2. Given a vector of days like `x <- c("Monday", "Saturday", "Wednesday")`, use an `if_else()` statement to label them as weekends or weekdays.
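A possible sketch, assuming the day names are spelled out in English exactly as in the prompt. The name `day_type` is hypothetical.

```r
library(dplyr)

x <- c("Monday", "Saturday", "Wednesday")

# Anything that is not Saturday or Sunday is treated as a weekday
day_type <- if_else(x %in% c("Saturday", "Sunday"), "weekend", "weekday")

day_type  # "weekday" "weekend" "weekday"
```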

### 3. Use `if_else()` to compute the absolute value of a numeric vector called `x`.
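One way this might look, checked against base R's `abs()` on a made-up vector. The values in `x` are an assumption for illustration.

```r
library(dplyr)

# Toy vector including a negative, zero, a positive, and a missing value
x <- c(-3, 0, 2.5, NA)

# Flip the sign of negative values and keep everything else as is;
# a missing condition yields a missing result
abs_x <- if_else(x < 0, -x, x)

abs_x  # 3.0 0.0 2.5 NA
```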

### 4. Write a `case_when()` statement that uses the `month` and `day` columns from `flights` to label a selection of important US holidays (e.g., New Year's Day, 4th of July, Thanksgiving, and Christmas). First create a logical column that is either `TRUE` or `FALSE`, and then create a character column that either gives the name of the holiday or is `NA`.
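A sketch of the general shape, run on a small hypothetical tibble `dates` that stands in for the `month` and `day` columns of `flights`. It assumes the data cover 2013, when Thanksgiving fell on November 28. With the real data, the same `mutate()` call could be applied to `flights` directly.

```r
library(dplyr)

# Hypothetical stand-in for the month and day columns of flights
dates <- tibble(
  month = c(1, 7, 11, 12, 3),
  day   = c(1, 4, 28, 25, 15)
)

holidays <- dates |>
  mutate(
    # Logical column: TRUE on any of the selected holidays
    is_holiday =
      (month == 1 & day == 1) |
      (month == 7 & day == 4) |
      (month == 11 & day == 28) |  # Thanksgiving, assuming the year is 2013
      (month == 12 & day == 25),
    # Character column: rows matching no condition become NA
    holiday = case_when(
      month == 1 & day == 1 ~ "New Year's Day",
      month == 7 & day == 4 ~ "4th of July",
      month == 11 & day == 28 ~ "Thanksgiving",
      month == 12 & day == 25 ~ "Christmas"
    )
  )

holidays
```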

## [13.3.1 Exercises](https://r4ds.hadley.nz/numbers#exercises)

### 1. How can you use `count()` to count the number of rows with a missing value for a given variable?
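One possible approach is to count on a logical expression instead of a column. The sketch below uses a small hypothetical tibble `df`; with the real data the same call could be made on `flights`.

```r
library(dplyr)

# Hypothetical stand-in for a column with missing values
df <- tibble(dep_time = c(517, NA, 533, NA, 542))

# count() accepts expressions, so group rows by whether the value is missing
missing_counts <- df |>
  count(missing = is.na(dep_time))

missing_counts
```

The row where `missing` is `TRUE` holds the number of rows with a missing value.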

### 2. Expand the following calls to `count()` to instead use `group_by()`, `summarize()`, and `arrange()`:

- `flights |> count(dest, sort = TRUE)`

- `flights |> count(tailnum, wt = distance)`
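A sketch of how the two calls above might expand, run on a small hypothetical tibble `trips` that mimics the relevant `flights` columns. The results are compared against `count()` to check the equivalence on this toy data.

```r
library(dplyr)

# Hypothetical stand-in for the dest, tailnum, and distance columns
trips <- tibble(
  dest     = c("A", "B", "A", "C", "A", "B"),
  tailnum  = c("N1", "N2", "N1", "N3", "N2", "N2"),
  distance = c(100, 200, 100, 300, 100, 200)
)

# Equivalent of count(dest, sort = TRUE)
by_dest <- trips |>
  group_by(dest) |>
  summarize(n = n()) |>
  arrange(desc(n))

# Equivalent of count(tailnum, wt = distance): the weight is summed per group
by_tail <- trips |>
  group_by(tailnum) |>
  summarize(n = sum(distance))

by_dest
by_tail
```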


## [13.4.8 Exercises](https://r4ds.hadley.nz/numbers#exercises-1) 

### 1. Explain in words what each line of the code used to generate Figure 13.1 does.

<figure>
    <img width=60% alt="A line plot showing how proportion of cancelled flights changes over the course of the day. The proportion starts low at around 0.5% at 5am, then steadily increases over the course of the day until peaking at 4% at 7pm. The proportion of cancelled flights then drops rapidly getting down to around 1% by midnight." src="https://r4ds.hadley.nz/numbers_files/figure-html/fig-prop-cancelled-1.png"></img>  
    <figcaption>Figure 13.1: A line plot with scheduled departure hour on the x-axis, and proportion of cancelled flights on the y-axis. Cancellations seem to accumulate over the course of the day until 8pm; very late flights are much less likely to be cancelled.</figcaption>
</figure>
    


        

### 2. What trigonometric functions does R provide? Guess some names and look up the documentation. Do they use degrees or radians?
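A few quick experiments in base R. The helper `deg_to_rad()` is hypothetical and defined here only for illustration. The results are consistent with the functions taking their arguments in radians.

```r
# sin(), cos(), and tan() take radians
sin_half_pi <- sin(pi / 2)  # close to 1
cos_pi <- cos(pi)           # close to -1

# Hypothetical helper to convert degrees to radians
deg_to_rad <- function(deg) deg * pi / 180
sin_90_deg <- sin(deg_to_rad(90))

# sinpi(x) computes sin(pi * x), which avoids multiplying by pi yourself
sinpi_half <- sinpi(0.5)

# Inverse functions return radians as well
asin_one <- asin(1)  # close to pi / 2
```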

### 3. Currently `dep_time` and `sched_dep_time` are convenient to look at, but hard to compute with because they’re not really continuous numbers. You can see the basic problem by running the code below: there’s a gap between each hour.
```r
flights |> 
  filter(month == 1, day == 1) |> 
  ggplot(aes(x = sched_dep_time, y = dep_delay)) +
  geom_point()
```

Convert them to a more truthful representation of time (either fractional hours or minutes since midnight).
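A minimal sketch of the arithmetic, using integer division and the remainder to split an `HHMM` value into hours and minutes. The helper `to_minutes()` and the sample values are hypothetical. The final `%% 1440` assumes midnight may be recorded as 2400 and maps it back to 0.

```r
# Hypothetical helper: convert HHMM clock values to minutes since midnight
to_minutes <- function(hhmm) {
  hours <- hhmm %/% 100   # e.g. 1545 -> 15
  minutes <- hhmm %% 100  # e.g. 1545 -> 45
  (hours * 60 + minutes) %% 1440
}

# Toy values; missing values stay missing
times <- c(517, 1545, 2400, NA)

mins_since_midnight <- to_minutes(times)    # 317 945 0 NA
fractional_hours <- mins_since_midnight / 60
```

With the real data, the same helper could be applied inside `mutate()` to `dep_time` and `sched_dep_time`.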


### 4. Round `dep_time` and `arr_time` to the nearest five minutes.
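One caveat worth illustrating: rounding the `HHMM` value directly can produce an invalid time such as 1160. The sketch below therefore converts to minutes since midnight first, rounds, and converts back. The helper `round_to_five()` and the sample values are hypothetical.

```r
# Hypothetical helper: round an HHMM clock value to the nearest five minutes
round_to_five <- function(hhmm) {
  minutes <- (hhmm %/% 100) * 60 + hhmm %% 100  # minutes since midnight
  rounded <- round(minutes / 5) * 5             # nearest multiple of five
  (rounded %/% 60) * 100 + rounded %% 60        # back to HHMM
}

# Toy values; 1158 is the interesting case
times <- c(517, 1158, 1323)

rounded_times <- round_to_five(times)  # 515 1200 1325

# Rounding the raw HHMM value instead gives an invalid time for 1158
naive <- round(times / 5) * 5          # 515 1160 1325
```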

## [13.5.4 Exercises](https://r4ds.hadley.nz/numbers#exercises-2)

### 1. Find the 10 most delayed flights using a ranking function. How do you want to handle ties? Carefully read the documentation for `min_rank()`.
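A small sketch of how `min_rank()` treats ties, on a hypothetical tibble `delays`. Tied values share the smallest rank they would occupy, so a filter on the rank can return more rows than the cutoff suggests.

```r
library(dplyr)

# Hypothetical delays with a tie for the largest value
delays <- tibble(
  flight    = c("a", "b", "c", "d"),
  dep_delay = c(10, 20, 20, 5)
)

ranked <- delays |>
  mutate(rank = min_rank(desc(dep_delay)))  # ranks: 3 1 1 4

# Asking for the single most delayed flight returns both tied rows
most_delayed <- ranked |>
  filter(rank <= 1)

most_delayed
```

With the real data, the same pattern with `rank <= 10` would keep every flight tied at the tenth position.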

### 2. Which plane (`tailnum`) has the worst on-time record?

### 3. What time of day should you fly if you want to avoid delays as much as possible?

### 4. What does `flights |> group_by(dest) |> filter(row_number() < 4)` do? What does `flights |> group_by(dest) |> filter(row_number(dep_delay) < 4)` do?

### 5. For each destination, compute the total minutes of delay. For each flight, compute the proportion of the total delay for its destination.
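A sketch of the grouped-mutate pattern on a hypothetical tibble `delays`. It assumes all delays are positive; with the real data you may want to decide first how to treat early arrivals (negative delays) and missing values.

```r
library(dplyr)

# Hypothetical stand-in for the dest and arr_delay columns
delays <- tibble(
  dest      = c("A", "A", "B", "B", "B"),
  arr_delay = c(10, 30, 5, 5, 10)
)

delay_share <- delays |>
  group_by(dest) |>
  mutate(
    total_delay = sum(arr_delay, na.rm = TRUE),  # total per destination
    prop_delay  = arr_delay / total_delay        # each flight's share
  ) |>
  ungroup()

delay_share
```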

### 6. Delays are typically temporally correlated: even once the problem that caused the initial delay has been resolved, later flights are delayed to allow earlier flights to leave. Using `lag()`, explore how the average flight delay for an hour is related to the average delay for the previous hour.
```r
flights |> 
  mutate(hour = dep_time %/% 100) |> 
  group_by(year, month, day, hour) |> 
  summarize(
    dep_delay = mean(dep_delay, na.rm = TRUE),
    n = n(),
    .groups = "drop"
  ) |> 
  filter(n > 5)
```
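A sketch of the `lag()` step on a hypothetical hourly summary for a single day. The tibble `hourly` and its values are made up. With the real summary, grouping by `year`, `month`, and `day` before calling `lag()` would keep the first hour of one day from being paired with the last hour of the previous day.

```r
library(dplyr)

# Hypothetical hourly averages for one day
hourly <- tibble(
  hour      = 5:9,
  dep_delay = c(2, 4, 8, 6, 10)
)

lagged <- hourly |>
  mutate(prev_delay = dplyr::lag(dep_delay))  # delay in the previous hour

# One simple summary of the relationship, ignoring the first hour
delay_cor <- cor(lagged$dep_delay, lagged$prev_delay, use = "complete.obs")

lagged
```

A scatterplot of `dep_delay` against `prev_delay` would be another way to explore the relationship.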

### 7. Look at each destination. Can you find flights that are suspiciously fast (i.e. flights that represent a potential data entry error)? Compute the air time of a flight relative to the shortest flight to that destination. Which flights were most delayed in the air?

### 8. Find all destinations that are flown by at least two carriers. Use those destinations to come up with a relative ranking of the carriers based on their performance for the same destination.

## [13.6.7 Exercises](https://r4ds.hadley.nz/numbers#exercises-3)

### 1. Brainstorm at least 5 different ways to assess the typical delay characteristics of a group of flights. When is `mean()` useful? When is `median()` useful? When might you want to use something else? Should you use arrival delay or departure delay? Why might you want to use data from `planes`?

### 2. Which destinations show the greatest variation in air speed?

### 3. Create a plot to further explore the adventures of EGE. Can you find any evidence that the airport moved locations? Can you find another variable that might explain the difference?
