# Adding, Changing, and Renaming Columns

### Load required Tidyverse packages

In [1]:
library(readr)
library(dplyr)
library(tidyr)
library(ggplot2)


Attaching package: ‘dplyr’

The following objects are masked from ‘package:stats’:

    filter, lag

The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union



### Use mutate() to add new columns or change existing columns in a data frame
- Uses the "worldcup" dataset from the "faraway" package: 2010 World Cup statistics
- Here the dataset is read from a locally saved CSV file
- The dataset contains one observation per player, including the player's team, position, number of shots ...
- As shipped in the package, the dataset is not tidy: the player's name is stored as row names rather than as a column
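
As a minimal sketch of the two uses of mutate(), the cell below builds a three-row tibble from values shown in this notebook's output. The column name `passes_per_min` is hypothetical, and the conversion assumes that `Time` is recorded in minutes.

```r
library(dplyr)

# First three players from the worldcup data shown in this notebook
players <- tibble::tibble(
  player_name = c("Abdoun", "Abe", "Abidal"),
  Time        = c(16L, 351L, 180L),
  Passes      = c(6L, 101L, 91L)
)

players <- players %>%
  mutate(
    passes_per_min = Passes / Time,  # new name: a column is added (hypothetical column)
    Time = Time / 60                 # existing name: the column is overwritten (assumes minutes, converts to hours)
  )

players
```

Expressions inside one mutate() call are evaluated in order, so `passes_per_min` is computed from the original `Time` before `Time` is overwritten.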

In [2]:
#install.packages("faraway")
#library(faraway)  
#data(worldcup)

#### Tibbles don't use row names. Convert the row names into a column (only needed when loading the data from the package):

In [3]:
# worldcup <- worldcup %>%
#   mutate(player_name = rownames(worldcup))
# head(worldcup)
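
An alternative to the mutate() approach above is `rownames_to_column()` from the tibble package. The sketch below uses a small stand-in data frame (`wc_df`, a hypothetical name) rather than the real package data, so it runs without "faraway" installed.

```r
library(tibble)

# Stand-in for a package data frame that stores player names as row names
# (values are the first three players shown in this notebook)
wc_df <- data.frame(
  Team      = c("Algeria", "Japan", "France"),
  Shots     = c(0L, 0L, 0L),
  row.names = c("Abdoun", "Abe", "Abidal")
)

# Move the row names into a regular first column, then convert to a tibble
wc_tbl <- as_tibble(rownames_to_column(wc_df, var = "player_name"))

wc_tbl
```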

In [4]:
worldcup <- read_csv("Data/worldcup.csv")

Parsed with column specification:
cols(
  Team = col_character(),
  Position = col_character(),
  Time = col_integer(),
  Shots = col_integer(),
  Passes = col_integer(),
  Tackles = col_integer(),
  Saves = col_integer(),
  player_name = col_character()
)
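
The message above is readr reporting the column types it guessed. If the types should be fixed rather than guessed, the same specification can be passed through `col_types`. The sketch below writes a two-row sample to a temporary file (a hypothetical stand-in for "Data/worldcup.csv") so that it is self-contained.

```r
library(readr)

# Two-row sample in the same layout as the worldcup CSV
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "Team,Position,Time,Shots,Passes,Tackles,Saves,player_name",
  "Algeria,Midfielder,16,0,6,0,0,Abdoun",
  "Japan,Midfielder,351,0,101,14,0,Abe"
), tmp)

# Declare the column types explicitly instead of letting readr guess them
wc_sample <- read_csv(
  tmp,
  col_types = cols(
    Team        = col_character(),
    Position    = col_character(),
    Time        = col_integer(),
    Shots       = col_integer(),
    Passes      = col_integer(),
    Tackles     = col_integer(),
    Saves       = col_integer(),
    player_name = col_character()
  )
)

wc_sample
```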


In [5]:
class(worldcup)

In [6]:
head(worldcup)

Team,Position,Time,Shots,Passes,Tackles,Saves,player_name
Algeria,Midfielder,16,0,6,0,0,Abdoun
Japan,Midfielder,351,0,101,14,0,Abe
France,Defender,180,0,91,6,0,Abidal
France,Midfielder,270,1,111,5,0,Abou Diaby
Cameroon,Forward,46,2,16,0,0,Aboubakar
Uruguay,Forward,72,0,15,0,0,Abreu


### Reorder the columns so that player_name comes first

In [7]:
worldcup <- worldcup %>%
  select(player_name, everything())

head(worldcup)

player_name,Team,Position,Time,Shots,Passes,Tackles,Saves
Abdoun,Algeria,Midfielder,16,0,6,0,0
Abe,Japan,Midfielder,351,0,101,14,0
Abidal,France,Defender,180,0,91,6,0
Abou Diaby,France,Midfielder,270,1,111,5,0
Aboubakar,Cameroon,Forward,46,2,16,0,0
Abreu,Uruguay,Forward,72,0,15,0,0
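
For comparison, newer dplyr releases (1.0.0 and later, an assumption about the installed version) also provide `relocate()`, which moves the named columns to the front by default. A small sketch on a two-row sample:

```r
library(dplyr)

# Two-row sample with player_name deliberately placed last
wc_sample <- tibble::tibble(
  Team        = c("Algeria", "Japan"),
  Position    = c("Midfielder", "Midfielder"),
  player_name = c("Abdoun", "Abe")
)

# Both calls put player_name first and keep the remaining columns in order
by_select   <- wc_sample %>% select(player_name, everything())
by_relocate <- wc_sample %>% relocate(player_name)

identical(names(by_select), names(by_relocate))
```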


### Use summarize() to get the average number of shots for each position

In [8]:
worldcup %>%
  group_by(Position) %>%
  summarize(ave_shot = mean(Shots)) %>%
  head()

Position,ave_shot
Defender,1.16489362
Forward,4.23076923
Goalkeeper,0.02777778
Midfielder,2.39473684


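### Add a column with the average number of shots for the player's position using mutate()
- Use mutate() instead of summarize() to keep every row and add the group mean as a new column
- After group_by(Position), mutate() computes the mean within each position and repeats it on every row of that group
- Use ungroup() afterwards so that later operations are not applied per group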

In [9]:
worldcup <- worldcup %>%
  group_by(Position) %>%
  mutate(ave_shots = mean(Shots)) %>%
  ungroup()

head(worldcup)

player_name,Team,Position,Time,Shots,Passes,Tackles,Saves,ave_shots
Abdoun,Algeria,Midfielder,16,0,6,0,0,2.394737
Abe,Japan,Midfielder,351,0,101,14,0,2.394737
Abidal,France,Defender,180,0,91,6,0,1.164894
Abou Diaby,France,Midfielder,270,1,111,5,0,2.394737
Aboubakar,Cameroon,Forward,46,2,16,0,0,4.230769
Abreu,Uruguay,Forward,72,0,15,0,0,4.230769
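
The difference between summarize() and mutate() on grouped data can be checked on a small sample. The sketch below uses only the six players shown above, so its group means differ from the full-data values; `wc_head`, `per_position`, and `per_player` are hypothetical names.

```r
library(dplyr)

# The six players shown in the head() output
wc_head <- tibble::tibble(
  player_name = c("Abdoun", "Abe", "Abidal", "Abou Diaby", "Aboubakar", "Abreu"),
  Position    = c("Midfielder", "Midfielder", "Defender",
                  "Midfielder", "Forward", "Forward"),
  Shots       = c(0L, 0L, 0L, 1L, 2L, 0L)
)

# summarize() collapses each group to a single row
per_position <- wc_head %>%
  group_by(Position) %>%
  summarize(ave_shots = mean(Shots))

# mutate() keeps every row and repeats the group mean within each group
per_player <- wc_head %>%
  group_by(Position) %>%
  mutate(ave_shots = mean(Shots)) %>%
  ungroup()

nrow(per_position)         # 3, one row per position
nrow(per_player)           # 6, one row per player
is_grouped_df(per_player)  # FALSE, because ungroup() removed the grouping
```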


# Using the rename() function

### To rename a column, use the rename() function
- The new name goes on the left of the equals sign: rename(new_name = old_name)

In [10]:
worldcup <- worldcup %>%
  rename(Name = player_name, Ave_Shots = ave_shots)

head(worldcup)

Name,Team,Position,Time,Shots,Passes,Tackles,Saves,Ave_Shots
Abdoun,Algeria,Midfielder,16,0,6,0,0,2.394737
Abe,Japan,Midfielder,351,0,101,14,0,2.394737
Abidal,France,Defender,180,0,91,6,0,1.164894
Abou Diaby,France,Midfielder,270,1,111,5,0,2.394737
Aboubakar,Cameroon,Forward,46,2,16,0,0,4.230769
Abreu,Uruguay,Forward,72,0,15,0,0,4.230769
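
As a small illustration of the `new_name = old_name` order, the sketch below renames columns on a two-row sample and also shows that select() accepts the same syntax when only some columns are kept. The object names are hypothetical.

```r
library(dplyr)

# Two-row sample with the original column names
wc_sample <- tibble::tibble(
  player_name = c("Abdoun", "Abe"),
  Team        = c("Algeria", "Japan"),
  ave_shots   = c(2.394737, 2.394737)
)

# rename() keeps all columns and changes only the listed names
renamed <- wc_sample %>%
  rename(Name = player_name, Ave_Shots = ave_shots)
names(renamed)

# select() can rename too, but drops every column that is not listed
picked <- wc_sample %>%
  select(Name = player_name, Team)
names(picked)
```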
