In [2]:
library(tidyverse)

# Create, modify, and delete columns

`mutate()` adds new variables and preserves existing ones; `transmute()` adds new variables and drops existing ones. New variables overwrite existing variables of the same name. Variables can be removed by setting their value to `NULL`.

```R
mutate(.data, ...)

# S3 method for data.frame
mutate(
  .data,
  ...,
  .keep = c("all", "used", "unused", "none"),
  .before = NULL,
  .after = NULL
)

transmute(.data, ...)
```
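The `.keep`, `.before`, and `.after` arguments in the signature above control which columns are retained and where new columns are placed. A minimal sketch, assuming a small made-up data frame `df` that is not part of the original examples:

```R
library(dplyr)

# Hypothetical toy data, for illustration only
df <- data.frame(x = 1:3, y = c(10, 20, 30), z = c("a", "b", "c"))

# Keep only the columns used to compute the new one, plus the new one
used_only <- df %>% mutate(total = x + y, .keep = "used")
names(used_only)  # "x" "y" "total"

# Place the new column directly after x instead of at the end
placed <- df %>% mutate(total = x + y, .after = x)
names(placed)     # "x" "total" "y" "z"
```

By default (`.keep = "all"`, no `.before` or `.after`), every existing column is kept and new columns are appended on the right.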

# Useful mutate functions
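
Helpers that are often combined with `mutate()` include `coalesce()`, `lag()`, `cumsum()`, `min_rank()`, and `case_when()`. The sketch below illustrates them on a small hypothetical `scores` data frame (an assumption for this example); it is not an exhaustive list.

```R
library(dplyr)

# Hypothetical toy data, for illustration only
scores <- data.frame(
  player = c("a", "b", "c", "d"),
  power  = c(95, NA, 100, 50)
)

helpers <- scores %>%
  mutate(
    power_filled = coalesce(power, 0),               # replace NA with 0
    prev_power   = lag(power_filled),                # value from the previous row
    running_sum  = cumsum(power_filled),             # cumulative sum
    position     = min_rank(desc(power_filled)),     # 1 = highest power
    tier         = case_when(                        # first matching rule wins
      power_filled >= 90 ~ "high",
      power_filled >= 50 ~ "mid",
      TRUE               ~ "low"
    )
  )

helpers
```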

# Examples

Newly created variables are available immediately, so later expressions in the same `mutate()` call can refer to them.

In [4]:
# Create a new column and preserve the existing ones
iris %>% mutate(Sepal.Length.Square = Sepal.Length ^ 2) %>% head()

Sepal.Length,Sepal.Width,Petal.Length,Petal.Width,Species,Sepal.Length.Square
5.1,3.5,1.4,0.2,setosa,26.01
4.9,3.0,1.4,0.2,setosa,24.01
4.7,3.2,1.3,0.2,setosa,22.09
4.6,3.1,1.5,0.2,setosa,21.16
5.0,3.6,1.4,0.2,setosa,25.0
5.4,3.9,1.7,0.4,setosa,29.16
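
To illustrate that a new variable can be used right away, here is a minimal sketch in which the second expression depends on the column created by the first. The column names `Sepal.Area` and `Sepal.Area.Half` are made up for this example.

```R
library(dplyr)

areas <- iris %>%
  mutate(
    Sepal.Area      = Sepal.Length * Sepal.Width,
    Sepal.Area.Half = Sepal.Area / 2  # uses the column created one line above
  )

head(areas)
```

Expressions are evaluated in order, so swapping the two lines would fail because `Sepal.Area` would not exist yet.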


`transmute()` keeps only the newly created variables and drops the existing ones:

In [15]:
iris %>% transmute(Sepal.Length.Square = Sepal.Length ^ 2) %>% head()

Sepal.Length.Square
26.01
24.01
22.09
21.16
25.0
29.16


As well as adding new variables, you can use mutate() to
remove variables and modify existing variables.

In [5]:
# Remove the columns Sepal.Length and Sepal.Width
iris %>% mutate(Sepal.Length = NULL, Sepal.Width = NULL) %>% head()

Petal.Length,Petal.Width,Species
1.4,0.2,setosa
1.4,0.2,setosa
1.3,0.2,setosa
1.5,0.2,setosa
1.4,0.2,setosa
1.7,0.4,setosa


In [7]:
# Modify: replace Petal.Width with its square root
iris %>% mutate(Petal.Width = sqrt(Petal.Width)) %>% head()

Sepal.Length,Sepal.Width,Petal.Length,Petal.Width,Species
5.1,3.5,1.4,0.4472136,setosa
4.9,3.0,1.4,0.4472136,setosa
4.7,3.2,1.3,0.4472136,setosa
4.6,3.1,1.5,0.4472136,setosa
5.0,3.6,1.4,0.4472136,setosa
5.4,3.9,1.7,0.6324555,setosa


In [8]:
# Use across() with mutate() to apply a transformation
# to multiple columns in a tibble.
starwars %>%
 select(name, homeworld, species) %>%
 mutate(across(!name, as.factor))

name,homeworld,species
Luke Skywalker,Tatooine,Human
C-3PO,Tatooine,Droid
R2-D2,Naboo,Droid
Darth Vader,Tatooine,Human
Leia Organa,Alderaan,Human
Owen Lars,Tatooine,Human
Beru Whitesun lars,Tatooine,Human
R5-D4,Tatooine,Droid
Biggs Darklighter,Tatooine,Human
Obi-Wan Kenobi,Stewjon,Human


Window functions are useful for grouped mutates:

In [11]:
TF <- data.frame(
    clan = c('VNC', 'VNC', 'VNC', 'King Allool', 'King Allool'),
    player = c('Meomeo888', 'Tank Cao', 'VN Pikachu', 'xXx-Hadi-xXx', 'GHOST'),
    power = c(95, 88, 100, 50, 65)
)

TF

clan,player,power
VNC,Meomeo888,95
VNC,Tank Cao,88
VNC,VN Pikachu,100
King Allool,xXx-Hadi-xXx,50
King Allool,GHOST,65


In [12]:
# Within each clan, rank players by power (highest power gets rank 1)
TF %>%
  group_by(clan) %>%
  mutate(rank = rank(desc(power)))

clan,player,power,rank
VNC,Meomeo888,95,2
VNC,Tank Cao,88,3
VNC,VN Pikachu,100,1
King Allool,xXx-Hadi-xXx,50,2
King Allool,GHOST,65,1
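
The example above has no tied values. When ties occur, base R's `rank()` assigns the average rank by default, while dplyr's `min_rank()` and `dense_rank()` handle ties differently. A small sketch with a made-up `tied` data frame (an assumption for illustration):

```R
library(dplyr)

# Hypothetical data with a tie between players b and c
tied <- data.frame(
  player = c("a", "b", "c", "d"),
  power  = c(100, 88, 88, 50)
)

ranked <- tied %>%
  mutate(
    rank_avg   = rank(desc(power)),        # 1, 2.5, 2.5, 4 (ties share the average)
    rank_min   = min_rank(desc(power)),    # 1, 2, 2, 4 (gap after the tie)
    rank_dense = dense_rank(desc(power))   # 1, 2, 2, 3 (no gap)
  )

ranked
```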


In [14]:
# Standardize Sepal.Width (z-score) within each Species
iris %>%
  group_by(Species) %>%
  mutate(Sepal.Width = (Sepal.Width - mean(Sepal.Width)) / sd(Sepal.Width)) %>%
  head()

Sepal.Length,Sepal.Width,Petal.Length,Petal.Width,Species
5.1,0.1899414,1.4,0.2,setosa
4.9,-1.1290958,1.4,0.2,setosa
4.7,-0.601481,1.3,0.2,setosa
4.6,-0.8652884,1.5,0.2,setosa
5.0,0.4537488,1.4,0.2,setosa
5.4,1.2451711,1.7,0.4,setosa


# Arguments

### `.data`	

A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr). See Methods, below, for more details.

### `...`

`<data-masking>` Name-value pairs. The name gives the name of the column in the output.

The value can be:

* A vector of length 1, which will be recycled to the correct length.

* A vector the same length as the current group (or the whole data frame if ungrouped).

* NULL, to remove the column.

* A data frame or tibble, to create multiple columns in the output.

In [4]:
clan <- data.frame(
    name = c('VN Pikachu', 'Tank Cao', 'quachtinh'),
    level = c(31, 34, 33)
)

clan

name,level
VN Pikachu,31
Tank Cao,34
quachtinh,33


A vector of length 1, which is recycled to the correct length:

In [9]:
clan %>% mutate(sex = 'Male')

name,level,sex
VN Pikachu,31,Male
Tank Cao,34,Male
quachtinh,33,Male


A vector the same length as the current group (or the whole data frame if ungrouped):

In [7]:
clan %>% mutate(power = c(100, 85, 53))

name,level,power
VN Pikachu,31,100
Tank Cao,34,85
quachtinh,33,53


In [13]:
# level %/% 10 returns a vector of length 3
# Think of it as value = clan$level %/% 10
clan %>% mutate(value = level %/% 10) 

name,level,value
VN Pikachu,31,3
Tank Cao,34,3
quachtinh,33,3


In [8]:
# NULL, to remove a column: level = NULL drops the level column
clan %>% mutate(level = NULL)

name
VN Pikachu
Tank Cao
quachtinh


In [None]:
# A data frame or tibble, to create multiple columns in the output.
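
The last case in the list, a data frame value, can be sketched as follows. When the data frame is passed unnamed, `mutate()` unpacks it into one output column per column of that data frame. The column names `tens` and `units` are made up for this example, and `clan` is redefined so that the block runs on its own.

```R
library(dplyr)

clan <- data.frame(
  name  = c("VN Pikachu", "Tank Cao", "quachtinh"),
  level = c(31, 34, 33)
)

# An unnamed data frame value creates several columns at once
split_level <- clan %>%
  mutate(data.frame(
    tens  = level %/% 10,  # integer division
    units = level %% 10    # remainder
  ))

split_level
```

Giving the data frame a name instead (for example `parts = data.frame(...)`) would store it as a single nested data frame column rather than spreading it out.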