## Logical Values

We've now learned how to manipulate the following types of entities in R:

1. Numbers
2. Strings
3. Vectors
4. Data Frames

In this lesson, we're going to learn to manipulate yet another type of entity: *logical values*. The phrase *logical values* is just a fancy way of saying "values that are either `TRUE` or `FALSE`". We can store logical values in variables just like all of the other values that we have worked with so far:

In [1]:
this.is.a.false.logical.value <- FALSE
this.is.a.true.logical.value <- TRUE
print(this.is.a.false.logical.value)
print(this.is.a.true.logical.value)

[1] FALSE
[1] TRUE


Note: we do not write quotes around `FALSE` or `TRUE` in the example above; if we did, they would be of type `character`, not of type `logical`. The `typeof` function allows us to verify that these variables do, in fact, contain values of the `logical` type:

In [2]:
typeof(this.is.a.false.logical.value)
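To make the difference between a logical value and a quoted string concrete, here is a small sketch. The variable names `unquoted` and `quoted` are introduced here purely for illustration:

```r
# An unquoted TRUE is a logical value; a quoted "TRUE" is a string.
unquoted <- TRUE
quoted <- "TRUE"

print(typeof(unquoted))  # "logical"
print(typeof(quoted))    # "character"

# is.logical() is another way to test for the logical type.
print(is.logical(unquoted))  # TRUE
print(is.logical(quoted))    # FALSE
```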

## Comparison Operators

Logical values come up often when we compare other objects in R. For example, we can use the `<` operator to see if one number is less than another. This expression returns the appropriate logical value (`TRUE` or `FALSE`), which we can then store as we wish:

In [3]:
my.comparison.result <- 3 < 5
print(my.comparison.result)
print(typeof(my.comparison.result))

[1] TRUE
[1] "logical"


R implements all of the standard comparison operators, which are shown below for completeness:

In [4]:
# equal-to
5 == 5 # TRUE
5 == 6 # FALSE
# not-equal-to
5 != 5 # FALSE
5 != 6 # TRUE
# greater-than
5 > 3 # TRUE
5 > 6 # FALSE
# greater-than-or-equal-to
5 >= 4 # TRUE
5 >= 5 # TRUE
5 >= 6 # FALSE
# less-than
5 < 6 # TRUE
6 < 5 # FALSE
# less-than-or-equal-to
5 <= 6 # TRUE
6 <= 6 # TRUE
7 <= 6 # FALSE

A distinctive feature of the R programming language is that comparison operators can be applied directly to *vectors* containing multiple values. For example, we can create a `logical` vector that indicates whether each corresponding element in the vector `my.numbers` is greater than `7`:

In [5]:
my.numbers <- c(1, 5, 2, 8, 2, 9)
is.gt.7 <- my.numbers > 7
is.gt.7
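For reference, here is a self-contained sketch of what that comparison produces. The calls to `sum` and `which` are additions for illustration, not part of the lesson's own code: `sum` counts the `TRUE` values (each `TRUE` counts as 1) and `which` reports their positions.

```r
my.numbers <- c(1, 5, 2, 8, 2, 9)
is.gt.7 <- my.numbers > 7

print(is.gt.7)         # FALSE FALSE FALSE TRUE FALSE TRUE
print(sum(is.gt.7))    # 2 values are greater than 7
print(which(is.gt.7))  # they sit at positions 4 and 6
```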

What makes this capability especially powerful is that we can then *filter* the original vector using only the indexing operator `[...]`:

In [6]:
all.values.gt.7 <- my.numbers[is.gt.7]
all.values.gt.7

As you can see, only the elements at positions corresponding to `TRUE` values in `is.gt.7` are kept.
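The intermediate logical vector is optional. As a minimal sketch, the comparison can be written directly inside the indexing operator, which gives the same result. The names `two.step` and `one.step` are made up for this example:

```r
my.numbers <- c(1, 5, 2, 8, 2, 9)

# Two steps: build the logical vector, then index with it.
is.gt.7 <- my.numbers > 7
two.step <- my.numbers[is.gt.7]

# One step: put the comparison inside the brackets.
one.step <- my.numbers[my.numbers > 7]

print(one.step)                       # 8 9
print(identical(two.step, one.step))  # TRUE
```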

<span style="color:blue;font-weight:bold">Exercise</span>: Use the `<=` operator to create a logical vector called `is.lte.5` indicating which values in `my.numbers` are less than or equal to 5. Use this vector to filter for those elements and store the resulting filtered vector in the variable `all.values.lte.5`:

In [8]:
# delete this entire line and replace it with your code

is.lte.5 <- my.numbers <= 5
all.values.lte.5 <- my.numbers[is.lte.5]

In [9]:
check.variable.value("is.lte.5", my.numbers <= 5)
check.variable.value("all.values.lte.5", my.numbers[my.numbers <= 5])
success()

## Logical Operators

R also provides us with additional *logical operators* that we can use to combine logical values. For example, the `&` operator will return `TRUE` only if both of its operands are true:

In [9]:
FALSE & FALSE # FALSE
TRUE & FALSE  # FALSE
FALSE & TRUE  # FALSE
TRUE & TRUE   # TRUE

Similarly, the `|` operator will return `TRUE` if either (or both) of its operands are true:

In [10]:
FALSE | FALSE  # FALSE
TRUE  | FALSE  # TRUE
FALSE | TRUE   # TRUE
TRUE | TRUE    # TRUE

Finally, the negation operator `!` will invert any logical value:

In [11]:
!TRUE  # FALSE
!FALSE # TRUE 

Again, a special aspect of R is that all of these operators also work on vectors: they are applied element by element, with `&` and `|` pairing up the values at the same position in each vector:

In [12]:
first.vec <- c(FALSE, TRUE)
second.vec <- c(TRUE, TRUE)
first.vec & second.vec      # FALSE TRUE
first.vec | second.vec      # TRUE TRUE
!first.vec                  # TRUE FALSE
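Combining these operators with the comparison operators from earlier, the following sketch keeps only the values that fall strictly between two bounds. The bounds 1 and 8 and the name `is.between` are chosen here just for illustration:

```r
my.numbers <- c(1, 5, 2, 8, 2, 9)

# TRUE only where BOTH comparisons are TRUE at the same position.
is.between <- (my.numbers > 1) & (my.numbers < 8)

print(is.between)               # FALSE TRUE TRUE FALSE TRUE FALSE
print(my.numbers[is.between])   # 5 2 2
print(my.numbers[!is.between])  # 1 8 9
```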

## Filtering Data Frames using Logical Values

Let's load the cement data that we used in the previous fundamentals course:

In [13]:
cement.df <- read.csv("data/cement.csv")
head(cement.df)

| gdp.trillions | cement.imports.billions |
|---|---|
| `<dbl>` | `<dbl>` |
| 2.0 | 0.4019249 |
| 2.05132 | 0.3957969 |
| 2.117807 | 0.4203297 |
| 2.21658 | 0.459539 |
| 2.233002 | 0.4397102 |
| 2.388326 | 0.4980864 |


Very frequently in our courses, you will see that we filter data frame rows using logical vectors. For example, we can obtain a logical vector indicating which rows have `gdp.trillions > 2.3`:

In [14]:
big.rows <- cement.df$gdp.trillions > 2.3 
big.rows

We can then filter to select only those rows, while keeping all columns:

In [15]:
cement.df[big.rows,]

| | gdp.trillions | cement.imports.billions |
|---|---|---|
| | `<dbl>` | `<dbl>` |
| 6 | 2.388326 | 0.4980864 |
| 7 | 2.427764 | 0.4949906 |
| 8 | 2.566075 | 0.5340344 |
| 9 | 2.648952 | 0.5489615 |
| 10 | 2.5813 | 0.5121134 |
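If the CSV file is not at hand, the same filtering steps can be tried on a small data frame built by hand. The sketch below re-creates only the six rows shown by `head(cement.df)` above (the name `small.df` is made up for this example), so just one row passes the filter. The call to `subset` is an alternative from base R that is not used elsewhere in this lesson.

```r
# Hand-built copy of the first six rows of the cement data (illustration only).
small.df <- data.frame(
  gdp.trillions = c(2.0, 2.05132, 2.117807, 2.21658, 2.233002, 2.388326),
  cement.imports.billions = c(0.4019249, 0.3957969, 0.4203297,
                              0.459539, 0.4397102, 0.4980864)
)

big.rows <- small.df$gdp.trillions > 2.3
print(big.rows)               # only the sixth element is TRUE
print(small.df[big.rows, ])   # keeps row 6 and both columns

# subset() evaluates the condition using the column names directly.
print(subset(small.df, gdp.trillions > 2.3))
```

Leaving the position after the comma empty in `small.df[big.rows, ]` is what keeps every column.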


Logical operators are very useful for filtering rows based on more complex conditions. For example, let's experiment with the sales data that we used in the previous fundamentals course:

In [16]:
sales.df <- read.csv("data/sales.csv")
head(sales.df)

| sales.person.name | sales.q2.thousands | sales.q3.thousands |
|---|---|---|
| `<fct>` | `<int>` | `<int>` |
| joe | 62 | 45 |
| john | 51 | 21 |
| jeff | 47 | 56 |
| james | 21 | 24 |
| johnathan | 44 | 54 |
| jeffrey | 53 | 42 |


Let's hunt for salespeople who consistently outperform the average. First, we'll calculate the average for each quarter:

In [17]:
mean.q2 <- mean(sales.df$sales.q2.thousands)
mean.q3 <- mean(sales.df$sales.q3.thousands)

Next, we'll filter for the rows of salespeople who exceeded the average in both quarters:

In [18]:
sales.df[(sales.df$sales.q2.thousands > mean.q2) & (sales.df$sales.q3.thousands > mean.q3),]

| | sales.person.name | sales.q2.thousands | sales.q3.thousands |
|---|---|---|---|
| | `<fct>` | `<int>` | `<int>` |
| 1 | joe | 62 | 45 |
| 3 | jeff | 47 | 56 |
| 6 | jeffrey | 53 | 42 |
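To contrast `&` with `|`, here is a sketch that rebuilds the six rows shown by `head(sales.df)` above by hand and filters for salespeople who beat the average in at least one quarter. The names `six.df` and `either.quarter` are made up for this example, and the averages are computed from these six rows only, so they may differ from those of the full file.

```r
# Hand-built copy of the first six rows of the sales data (illustration only).
six.df <- data.frame(
  sales.person.name = c("joe", "john", "jeff", "james", "johnathan", "jeffrey"),
  sales.q2.thousands = c(62L, 51L, 47L, 21L, 44L, 53L),
  sales.q3.thousands = c(45L, 21L, 56L, 24L, 54L, 42L)
)

mean.q2 <- mean(six.df$sales.q2.thousands)
mean.q3 <- mean(six.df$sales.q3.thousands)

# TRUE when a salesperson beat the average in Q2, in Q3, or in both.
either.quarter <- (six.df$sales.q2.thousands > mean.q2) |
  (six.df$sales.q3.thousands > mean.q3)

print(six.df[either.quarter, ])
```

With these six rows, every salesperson except `james` is kept, whereas the stricter `&` condition keeps only three of them.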


<span style="color:blue;font-weight:bold">Exercise</span>: Adapt the code above to filter for salespeople who scored *less than* the quarterly average in both quarters, and store the resulting filtered data frame in the variable `filtered.df`:

In [19]:
# delete this entire line and replace it with your code

filtered.df <- sales.df[(sales.df$sales.q2.thousands < mean.q2) & (sales.df$sales.q3.thousands < mean.q3),]

In [20]:
correct.filtered.df <- sales.df[(sales.df$sales.q2.thousands < mean.q2) & (sales.df$sales.q3.thousands < mean.q3),]
check.variable.value("filtered.df", correct.filtered.df)
success()