
In [25]:
# This loads all of the dplyr functions.
# You must run this every time you start a new R session.
library(tidyverse)

# Branching in R

In this notebook, we will look at functions for branching in `R`.

1. Using `ifelse` to perform `IF ... THEN ... ELSE` operations.
2. Using `case_when` to perform `IF ... THEN ... ELSEIF ...` and `IF ... THEN ... ELSEIF ... ELSE` operations.
3. Using `case_match` to match specific values.

## Example - Loading some survey data.

In [7]:
surveys <- read.csv('https://github.com/WSU-DataScience/DSCI_210_R_notebooks/raw/main/data/portal_data_joined.csv')
head(surveys)

| | record_id | month | day | year | plot_id | species_id | sex | hindfoot_length | weight | genus | species | taxa | plot_type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | `<int>` | `<int>` | `<int>` | `<int>` | `<int>` | `<chr>` | `<chr>` | `<int>` | `<int>` | `<chr>` | `<chr>` | `<chr>` | `<chr>` |
| 1 | 1 | 7 | 16 | 1977 | 2 | NL | M | 32.0 | | Neotoma | albigula | Rodent | Control |
| 2 | 72 | 8 | 19 | 1977 | 2 | NL | M | 31.0 | | Neotoma | albigula | Rodent | Control |
| 3 | 224 | 9 | 13 | 1977 | 2 | NL | | | | Neotoma | albigula | Rodent | Control |
| 4 | 266 | 10 | 16 | 1977 | 2 | NL | | | | Neotoma | albigula | Rodent | Control |
| 5 | 349 | 11 | 12 | 1977 | 2 | NL | | | | Neotoma | albigula | Rodent | Control |
| 6 | 363 | 11 | 12 | 1977 | 2 | NL | | | | Neotoma | albigula | Rodent | Control |


In [17]:
surveys$year %>% unique

## Topic 1 - Using `ifelse` to perform `IF ... THEN ... ELSE` operations.

**Syntax.**

```{R}
...
%>% mutate(new_col = ifelse(_cond_, _then_, _else_))
...
```
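As a minimal sketch of the syntax above, the following applies `ifelse` inside `mutate` to a small made-up data frame. The `toy` data frame and the `is_heavy` column are hypothetical and are not part of the survey data. `ifelse` checks the condition once per row, and a row where the condition is `NA` gets `NA` in the new column.

```{R}
library(dplyr)

# Hypothetical toy data (an assumption for illustration only)
toy <- tibble(weight = c(12, 40, NA, 25))

toy_flagged <- (toy
  # cond: weight > 20, then: "heavy", else: "light"
  %>% mutate(is_heavy = ifelse(weight > 20, "heavy", "light"))
)

# The row with a missing weight has a missing is_heavy value
toy_flagged
```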


#### Example - Recoding the `sex` column

Suppose that we want to make an indicator column for males.

**Step 1.** First, we use `unique` to see the unique labels.

In [8]:
surveys$sex %>% unique

**Step 2.** Now we transform this column using `mutate` and `ifelse`.

In [12]:
(surveys
 %>% select(sex) # Temp.  Remember to delete before saving
 %>% mutate(is_male = ifelse(sex == 'M', 1, 0))
 %>% head # Temp.  Remember to delete before saving
)

| | sex | is_male |
|---|---|---|
| | `<chr>` | `<dbl>` |
| 1 | M | 1 |
| 2 | M | 1 |
| 3 | | 0 |
| 4 | | 0 |
| 5 | | 0 |
| 6 | | 0 |


<font color="red"> <b> Question </b></font> Should the missing values be coded as `0` or `NA`? Explain.

<font color = "orange">
Your answer here.
</font>

## <font color="red"> <b> Exercise 7.2.4 </b></font>

**Task.** Create an `after_1990` column using `mutate` and `ifelse` that is `1` for years after `1990` and `0` otherwise.

In [18]:
# Your code here

## Topic 2 - Using `case_when` to perform longer branches.

**Syntax - No Else.**
```{R}
...
%>% mutate(new_col = case_when( _cond1_ ~ _val1_,
                                _cond2_ ~ _val2_,
                                ...
                                _condk_ ~ _valk_,
                              )
           )
...
```
**Note.** Any values that don't match a condition become `NA`.
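To illustrate the note about unmatched values, here is a small sketch on made-up data. The `toy` data frame and the `size_num` column are hypothetical. The value `"XL"` matches neither condition, so it ends up as `NA`.

```{R}
library(dplyr)

# Hypothetical toy data (an assumption for illustration only)
toy <- tibble(size = c("S", "M", "XL"))

toy_coded <- (toy
  %>% mutate(size_num = case_when(size == "S" ~ 1,
                                  size == "M" ~ 2
                                 )
            )
)

# "XL" matched no condition, so size_num is NA in the last row
toy_coded
```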

#### Example - Recoding `sex`

Let's redo the last problem, but this time using `case_when` with no `ELSE`.

In [15]:
(surveys
 %>% select(sex) # Temp.  Remember to delete before saving
 %>% mutate(is_male = case_when(sex == 'M' ~ 1,
                                sex == 'F' ~ 0,
                               )
           )
 %>% head # Temp.  Remember to delete before saving
)

| | sex | is_male |
|---|---|---|
| | `<chr>` | `<dbl>` |
| 1 | M | 1.0 |
| 2 | M | 1.0 |
| 3 | | |
| 4 | | |
| 5 | | |
| 6 | | |


### Adding an `ELSE` using `.default =`


**Syntax - With Else.**
```{R}
...
%>% mutate(new_col = case_when( _cond1_ ~ _val1_,
                                _cond2_ ~ _val2_,
                                ...
                                _condk_ ~ _valk_,
                                .default = _elseVal_
                              )
           )
...
```
**Note.** Any values that don't match a condition get assigned the `.default` value.
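The sketch below shows the `.default` argument on made-up data. The `toy` data frame and both new column names are hypothetical. The `.default` argument needs `dplyr` 1.1.0 or later. Older code often wrote the same `ELSE` branch as a final `TRUE ~ value` line, which is included here only for comparison.

```{R}
library(dplyr)

# Hypothetical toy data (an assumption for illustration only)
toy <- tibble(size = c("S", "M", "XL"))

toy_labeled <- (toy
  %>% mutate(label = case_when(size == "S" ~ "Small",
                               size == "M" ~ "Medium",
                               .default = "Other"
                              ),
             # Older style: a final TRUE condition acts as the ELSE branch
             label_old = case_when(size == "S" ~ "Small",
                                   size == "M" ~ "Medium",
                                   TRUE ~ "Other"
                                  )
            )
)

toy_labeled
```

Both columns hold the same values, so either spelling works. This notebook uses `.default`.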

#### Example - Recoding `sex`

Finally, suppose we want to recode `sex` as follows:
1. Recode `M` and `F` to `Male` and `Female`, respectively.
2. Recode missing values (`''`) as `Unknown`.

In [16]:
(surveys
 %>% select(sex) # Temp.  Remember to delete before saving
 %>% mutate(sex = case_when(sex == 'M' ~ 'Male',
                            sex == 'F' ~ 'Female',
                            .default = 'Unknown'
                            )
           )
 %>% head # Temp.  Remember to delete before saving
)

| | sex |
|---|---|
| | `<chr>` |
| 1 | Male |
| 2 | Male |
| 3 | Unknown |
| 4 | Unknown |
| 5 | Unknown |
| 6 | Unknown |


## <font color="red"> <b> Exercise 7.2.5 </b></font>

**Task.** Create a `decade` column that contains the decade written out in text.  

**Hints.**
1. You should use `unique` to inspect the `year` column to determine the possible decades.
2. Use `case_when` with inequalities.
3. Use `.default` to catch the last case (less/simpler code).
4. `R` doesn't allow chained compound inequalities such as $2 < x \le 5$ (you would have to write `2 < x & x <= 5`). Because `case_when` checks its conditions in order, each condition can assume that the earlier ones failed, so simple inequalities are enough. Ask for help if needed.
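The idea of ruling out earlier cases can be sketched on made-up data unrelated to the exercise. The `toy` data frame, the `grade` column, and the `in_b_range` column are hypothetical. `case_when` uses the first condition that is true for each row, so a score that reaches `score >= 80` is already known to be below `90`.

```{R}
library(dplyr)

# Hypothetical toy data (an assumption for illustration only)
toy <- tibble(score = c(95, 83, 71, 40))

toy_graded <- (toy
  %>% mutate(grade = case_when(score >= 90 ~ "A",
                               score >= 80 ~ "B", # only reached when score < 90
                               score >= 70 ~ "C", # only reached when score < 80
                               .default = "F"
                              ),
             # A compound inequality has to be written as two comparisons joined by &
             in_b_range = 80 <= score & score < 90
            )
)

toy_graded
```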

In [19]:
# Your code here

## Topic 3 - Using `case_match` to match specific values

In Tableau Prep, we could use the `CASE ... WHEN ... THEN ... END` expression to
1. Match specific values, and
2. Remove the need for Boolean expressions.

In `R`, this is accomplished using `case_match`.


**Syntax.**
```{R}
...
%>% mutate(new_col = case_match( col,
                                _original_val1_ ~ _new_val1_,
                                _original_val2_ ~ _new_val2_,
                                ...
                                _original_valk_ ~ _new_valk_,
                                .default = _elseVal_ # Optional
                              )
           )
...
```
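As a small sketch of this syntax on made-up data, the example below also shows that the left-hand side of a `case_match` rule can be a vector of several values. The `toy` data frame and the `day_type` column are hypothetical. `case_match` needs `dplyr` 1.1.0 or later.

```{R}
library(dplyr)

# Hypothetical toy data (an assumption for illustration only)
toy <- tibble(day = c("Mon", "Sat", "Sun", "Wed"))

toy_days <- (toy
  %>% mutate(day_type = case_match(day,
                                   c("Sat", "Sun") ~ "Weekend", # several values, one result
                                   .default = "Weekday"
                                  )
            )
)

toy_days
```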

In [22]:
# Look at the help
?case_match

#### Example - Recoding `sex` one more time

Let's redo the last two examples of recoding the `sex` column, but this time using `case_match`.

In [23]:
# Making an indicator column
(surveys
 %>% select(sex) # Temp.  Remember to delete before saving
 %>% mutate(sex = case_match(sex,
                            'M' ~ 1,
                            'F' ~ 0,
                            )
           )
 %>% head # Temp.  Remember to delete before saving
)

| | sex |
|---|---|
| | `<dbl>` |
| 1 | 1.0 |
| 2 | 1.0 |
| 3 | |
| 4 | |
| 5 | |
| 6 | |


In [24]:
# Recoding in place with Unknown
(surveys
 %>% select(sex) # Temp.  Remember to delete before saving
 %>% mutate(sex = case_match(sex,
                            'M' ~ 'Male',
                            'F' ~ 'Female',
                            .default = 'Unknown'
                            )
           )
 %>% head # Temp.  Remember to delete before saving
)

| | sex |
|---|---|
| | `<chr>` |
| 1 | Male |
| 2 | Male |
| 3 | Unknown |
| 4 | Unknown |
| 5 | Unknown |
| 6 | Unknown |


## <font color="red"> <b> Exercise 7.2.6 </b></font>

**Tasks/Questions.**

1. Use `case_match` to recode the month numbers (e.g., `1`) to the month names (e.g., `"January"`).
2. Explain why we don't want to use `case_match` to define either `after_1990` or `decade`.

In [None]:
# Your code here.