## while loop

In [1]:
# Initialize the speed variable
speed <- 88

while (speed > 30) {
  print(paste("Your speed is", speed))
  
  # Break the while loop when speed exceeds 80
  if (speed > 80) {
    break
  }
  
  if (speed > 48) {
    print("Slow down big time!")
    speed <- speed - 11
  } else {
    print("Slow down!")
    speed <- speed - 6
  }
}

[1] "Your speed is 88"


## for loop

The break statement abandons the active loop: the remaining code in the loop is skipped and the loop is not iterated over anymore.

The next statement skips the remainder of the code in the loop, but continues the iteration.

In [2]:
# The linkedin vector has already been defined for you
linkedin <- c(16, 9, 13, 5, 2, 17, 14)

# Extend the for loop
for (li in linkedin) {
  if (li > 10) {
    print("You're popular!")
  } else {
    print("Be more visible!")
  }
  
  # Add if statement with break
  if (li > 16) {
    print("This is ridiculous, I'm outta here!")
    break
  } 
  if (li < 5) {
    print("This is too embarrassing!")
    next
  }
  print(li)
}

[1] "You're popular!"
[1] 16
[1] "Be more visible!"
[1] 9
[1] "You're popular!"
[1] 13
[1] "Be more visible!"
[1] 5
[1] "Be more visible!"
[1] "This is too embarrassing!"
[1] "You're popular!"
[1] "This is ridiculous, I'm outta here!"


## lapply

lapply() always returns a list, with one element for each element of the input.

In [3]:
# Definition of split_low
pioneers <- c("GAUSS:1777", "BAYES:1702", "PASCAL:1623", "PEARSON:1857")
split <- strsplit(pioneers, split = ":")
split_low <- lapply(split, tolower)

# Generic select function
select_el <- function(x, index) {
  x[index]
}

# Use lapply() twice on split_low: names and years
names <- lapply(split_low, select_el, index = 1)
years <- lapply(split_low, select_el, index = 2)
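
As a quick check that lapply() hands back a list even when every result is a single string, here is a minimal standalone sketch. The names `parts` and `first_names` and the shortened input are illustrative assumptions, not part of the exercise.

```r
# Illustrative input with the same shape as the pioneers vector
pioneers <- c("GAUSS:1777", "BAYES:1702")
parts <- lapply(strsplit(pioneers, split = ":"), tolower)

# Arguments given after the function are passed on to it
first_names <- lapply(parts, function(x, index) x[index], index = 1)

class(first_names)    # "list"
unlist(first_names)   # "gauss" "bayes"
```

unlist() is one way to flatten the result into a character vector when a list is not needed.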

## sapply

sapply() calls lapply() and then simplifies the result to a vector or matrix where possible.

In [4]:
# temp is already defined in the workspace

# Finish function definition of extremes_avg
extremes_avg <- function(x) {
  ( min(x) + max(x) ) / 2
}

# Apply extremes_avg() over temp using sapply()
sapply(temp, extremes_avg)

# Apply extremes_avg() over temp using lapply()
lapply(temp, extremes_avg)

ERROR: Error in lapply(X = X, FUN = FUN, ...): object 'temp' not found
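
The error above arises because temp was never defined in this session. A minimal sketch of what the two calls return, assuming temp is a list of numeric vectors (the values below are a made-up stand-in, not the course data):

```r
# Hypothetical stand-in for temp: one numeric vector of readings per day
temp <- list(c(3, 7, 9, 6, -1), c(6, 9, 12, 13, 5), c(4, 8, 3, -1, -3))

extremes_avg <- function(x) {
  (min(x) + max(x)) / 2
}

avg_vec <- sapply(temp, extremes_avg)   # numeric vector: 4 9 2.5
avg_list <- lapply(temp, extremes_avg)  # list of three single numbers

avg_vec
avg_list
```

Because extremes_avg() returns one number per element, sapply() can collapse the result into a plain numeric vector, while lapply() keeps the list.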


sapply() returns a matrix when the function returns a vector of length greater than 1, provided every result has the same length.

In [5]:
# temp is already available in the workspace

# Create a function that returns min and max of a vector: extremes
extremes <- function(x) {
  c(min = min(x), max = max(x))
}

# Apply extremes() over temp with sapply()
sapply(temp, extremes)

# Apply extremes() over temp with lapply()
lapply(temp, extremes)

ERROR: Error in lapply(X = X, FUN = FUN, ...): object 'temp' not found
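
Since temp is missing here as well, the following sketch uses a made-up list in its place (an assumption, not the course data) to show the matrix that sapply() builds when each call returns a named vector of length 2:

```r
# Hypothetical stand-in for temp
temp <- list(c(3, 7, 9, 6, -1), c(6, 9, 12, 13, 5), c(4, 8, 3, -1, -3))

extremes <- function(x) {
  c(min = min(x), max = max(x))
}

result <- sapply(temp, extremes)
result             # one column per list element
dim(result)        # 2 3
rownames(result)   # "min" "max"
```

Each list element becomes a column, and the names of the returned vector become the row names.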


## useful functions


    seq(): Generate sequences, by specifying the from, to, and by arguments.
    rep(): Replicate elements of vectors and lists.
    sort(): Sort a vector in ascending order. Works on numerics, but also on character strings and logicals.
    rev(): Reverse the elements of data structures for which reversal is defined.
    str(): Display the structure of any R object.
    append(): Merge vectors or lists.
    is.*(): Check for the class of an R object.
    as.*(): Convert an R object from one class to another.
    unlist(): Flatten (possibly embedded) lists to produce a vector.
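
A short sketch of the functions listed above on small made-up inputs; the variable names are illustrative only.

```r
s <- seq(1, 10, by = 3)              # 1 4 7 10
r <- rep(c(1, 2), times = 2)         # 1 2 1 2
asc <- sort(c(3, 1, 2))              # 1 2 3
desc <- sort(c(3, 1, 2), decreasing = TRUE)  # 3 2 1
back <- rev(c("a", "b", "c"))        # "c" "b" "a"
merged <- append(c(1, 2), c(3, 4))   # 1 2 3 4
is.numeric("5")                      # FALSE: "5" is a character string
five <- as.numeric("5")              # 5
flat <- unlist(list(1, list(2, 3)))  # 1 2 3

# str() prints a compact description of the object's structure
str(list(a = 1, b = "x"))
```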


## Regular expression

In their most basic form, regular expressions can be used to see whether a pattern exists inside a character string or a vector of character strings. For this purpose, you can use:

    grepl(), which returns TRUE when a pattern is found in the corresponding character string.
    grep(), which returns a vector of indices of the character strings that contain the pattern.
    
You can use the caret, ^, and the dollar sign, $, to match content located at the start and end of a string, respectively.

        ".*", which matches any character (.) zero or more times (*). Both the dot and the asterisk are metacharacters. You can use them to match any character between the at-sign and the ".edu" portion of an email address.

        "\\.edu$", to match the ".edu" part of the email at the end of the string. The \\ part escapes the dot: it tells R that you want to use the . as an actual character.


In [6]:
# The emails vector has already been defined for you
emails <- c("john.doe@ivyleague.edu", "education@world.gov", "dalai.lama@peace.org",
            "invalid.edu", "quant@bigdatacollege.edu", "cookie.monster@sesame.tv")

# Use grepl() to match for .edu addresses more robustly
grepl(pattern = "@.*\\.edu$", emails)

# Use grep() to match for .edu addresses more robustly, save result to hits
hits <- grep(pattern = "@.*\\.edu$", emails)

# Subset emails using hits
emails[hits]

In [7]:
# The emails vector has already been defined for you
emails <- c("john.doe@ivyleague.edu", "education@world.gov", "global@peace.org",
            "invalid.edu", "quant@bigdatacollege.edu", "cookie.monster@sesame.tv")

# Use sub() to convert the email domains to datacamp.edu
sub(pattern = "@.*\\.edu$", replacement = "@datacamp.edu", x = emails)


    .*: A usual suspect! It can be read as "any character that is matched zero or more times".
    \\s: Match a whitespace character such as a space. The "s" is normally an ordinary character; escaping it (\\) makes it a metacharacter.
    [0-9]+: Match the digits 0 to 9, at least once (+).
    ([0-9]+): The parentheses are used to make parts of the matching string available to define the replacement. The \\1 in the replacement argument of sub() gets set to the string that is captured by the regular expression [0-9]+.
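
The pieces above can be combined into one pattern. This is a minimal sketch on a made-up `awards` vector (an assumption for illustration) that keeps only the number surrounded by whitespace:

```r
# Hypothetical input strings
awards <- c("Won 1 Oscar.", "Won 12 Grammys.")

# .* consumes the surrounding text, ([0-9]+) captures the digits,
# and \\1 in the replacement inserts whatever was captured
counts <- sub(pattern = ".*\\s([0-9]+)\\s.*", replacement = "\\1", x = awards)
counts   # "1" "12"
```

The result is still a character vector; as.numeric() converts it if numbers are needed.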


## dates and times

### Create and format dates

To create a Date object from a simple character string in R, you can use the as.Date() function. The character string has to obey a format that can be defined using a set of symbols (the examples correspond to 13 January, 1982):

    %Y: 4-digit year (1982)
    %y: 2-digit year (82)
    %m: 2-digit month (01)
    %d: 2-digit day of the month (13)
    %A: weekday (Wednesday)
    %a: abbreviated weekday (Wed)
    %B: month (January)
    %b: abbreviated month (Jan)


In [8]:
# Definition of character strings representing dates
str1 <- "May 23, '96"
str2 <- "2012-03-15"
str3 <- "30/January/2006"

# Convert the strings to dates: date1, date2, date3
date1 <- as.Date(str1, format = "%b %d, '%y")
date2 <- as.Date(str2)
date3 <- as.Date(str3, format = "%d/%B/%Y")


# Convert dates to formatted strings
format(date1, "%A")
format(date2, "%d")
format(date3, "%b %Y")


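By default, as.Date() tries the format "%Y-%m-%d" and then "%Y/%m/%d", which is why str2 above needs no format argument. The weekday and month names matched by %A, %a, %B, and %b follow the session locale, so the English names used here parse only in an English locale.

In addition to creating dates, you can convert dates to character strings that use a different date notation. For this, you use the format() function, as in the last three lines of the code above.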
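
A minimal sketch of both points, using the 13 January 1982 example date from the symbol list; the variable name `d` is illustrative.

```r
# Strings in either default format need no format argument
d <- as.Date("1982-01-13")
as.Date("1982/01/13") == d   # TRUE

# Convert the date back to strings in other notations
format(d, "%d/%m/%Y")   # "13/01/1982"
format(d, "%y")         # "82"

# Weekday and month names depend on the session locale
format(d, "%A, %d %B")
```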

### Create and format times

Similar to working with dates, you can use as.POSIXct() to convert from a character string to a POSIXct object, and format() to convert from a POSIXct object to a character string. Again, you have a wide variety of symbols:

    %H: hours as a decimal number (00-23)
    %I: hours as a decimal number (01-12)
    %M: minutes as a decimal number
    %S: seconds as a decimal number
    %T: shorthand notation for the typical format %H:%M:%S
    %p: AM/PM indicator
    
For a full list of conversion symbols, consult the ?strptime documentation in the console.

Again, as.POSIXct() uses a default format to match character strings. In this case, it's %Y-%m-%d %H:%M:%S. Time zones are ignored in this exercise.
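
A minimal sketch of converting in both directions. The timestamps are made up, and tz = "UTC" is an assumption added here so the output does not depend on the local time zone.

```r
# Default format: no format argument needed
t1 <- as.POSIXct("2016-05-14 14:23:08", tz = "UTC")

# Any other notation needs an explicit format
t2 <- as.POSIXct("14/05/2016 09:05", format = "%d/%m/%Y %H:%M", tz = "UTC")

format(t1, "%H:%M")              # "14:23"
format(t1, "%T")                 # "14:23:08"
format(t1, "%I")                 # "02"
format(t2, "%Y-%m-%d %H:%M:%S")  # "2016-05-14 09:05:00"
```

Seconds left out of the input string default to zero, as the last line shows.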
