[from here https://www.datacamp.com/community/tutorials/r-tutorial-apply-family#codelapplycode]

The apply family of functions lets you traverse data in a number of ways without writing explicit loops. Each function takes an input list, matrix or array and applies a named function to it, with one or more optional arguments.

#### apply()

In [1]:
X <- matrix(rnorm(30), nrow=5, ncol=6)

In [2]:
X

0,1,2,3,4,5
0.2734289,0.8387221,-0.7864828,0.3732844,-0.2974639,-0.7458732
0.7800701,0.3131216,-0.3510538,2.2020555,0.0478064,0.6605179
-0.2644721,0.1483697,-2.2204709,0.6579581,-0.4106993,-0.54488
-0.8687482,0.8705984,0.7703488,-0.4113856,0.7319545,0.6594421
0.1025178,0.3612828,-1.0280697,-0.5458293,0.2098225,-0.4154092


In [3]:
apply(X, 2, sum)
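
The second argument of apply() is the margin: 1 applies the function over rows, 2 over columns. Because X above is random, here is a minimal sketch on a small fixed matrix (the name M and its values are made up for illustration) so the results can be checked by hand:

```r
# Hypothetical 2 x 3 matrix, filled column by column:
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6
M <- matrix(1:6, nrow = 2, ncol = 3)

row_sums <- apply(M, 1, sum)  # MARGIN = 1: one sum per row
col_sums <- apply(M, 2, sum)  # MARGIN = 2: one sum per column

print(row_sums)
print(col_sums)
```

With MARGIN = 2 the result has one element per column, which is why apply(X, 2, sum) returns six values for the 5 x 6 matrix X.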

#### lapply()

lapply() differs from apply() in two ways:

- It can be used on other objects such as data frames, lists or vectors; and
- The output returned is a list (which explains the “l” in the function name), which has the same number of elements as the object passed to it.
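
As a rough illustration of both points, the sketch below calls lapply() on a data frame and on a plain vector. The object names (df, col_means, squares) are made up for this example.

```r
# Hypothetical data frame with two numeric columns
df <- data.frame(a = 1:3, b = c(2.5, 3.5, 4.5))

# A data frame is a list of columns, so lapply() visits each column in turn
col_means <- lapply(df, mean)

# A plain vector works too: one list element per vector element
squares <- lapply(1:3, function(i) i^2)

print(is.list(col_means))   # the result is a list ...
print(length(col_means))    # ... with one element per column of df
print(length(squares))      # ... or one per element of the input vector
```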

In [4]:
A <- matrix(data = 1:9, nrow = 3, ncol = 3)
B <- matrix(data = 4:15, nrow = 4, ncol = 3)
C <- matrix(data = 8:10, nrow = 3, ncol = 2)

MyList <- list(A, B, C)

In [5]:
MyList

0,1,2
1,4,7
2,5,8
3,6,9

0,1,2
4,8,12
5,9,13
6,10,14
7,11,15

0,1
8,8
9,9
10,10


The empty argument between the commas is where the row index would go. Leaving it blank keeps all rows, so here we extract only the second column of each matrix.

In [6]:
lapply(MyList,"[", ,2)

In this case we have selected the first row of each matrix and left the column index blank, which keeps all columns.

In [7]:
lapply(MyList, "[", 1, )
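
Passing "[" works because the subsetting operator is itself a function in R, and lapply() forwards the extra arguments to it. A sketch of the equivalent, more explicit form with an anonymous function (the argument name mat is arbitrary):

```r
A <- matrix(data = 1:9, nrow = 3, ncol = 3)
B <- matrix(data = 4:15, nrow = 4, ncol = 3)
MyList <- list(A, B)

# Operator form: the blank argument is the (omitted) row index
via_operator <- lapply(MyList, "[", , 2)

# Explicit form: same selection written as ordinary subsetting
via_function <- lapply(MyList, function(mat) mat[, 2])

print(identical(via_operator, via_function))
```

The operator form is shorter. The anonymous function is arguably easier to read when the selection gets more involved.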

#### sapply()

The sapply() function works like lapply(), but it tries to simplify the output to the simplest data structure possible, such as a vector or matrix instead of a list. It is in fact a wrapper around lapply().

In [8]:
# Compare lapply() and sapply()

lapply(MyList,"[", 2, 1 )

In [9]:
typeof(lapply(MyList,"[", 2, 1 ))

In [10]:
sapply(MyList, "[", 2, 1)

In [11]:
typeof(sapply(MyList, "[", 2, 1))

In [12]:
# When simplify is FALSE, the behaviour is the same as with lapply()

sapply(MyList, "[", 2, 1, simplify = FALSE)
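
What sapply() returns depends on the shape of the individual results. The sketch below, which rebuilds the same matrices as above so it runs on its own, shows the three cases I would expect: a vector, a matrix, and a fallback to a list when the results have different lengths.

```r
A <- matrix(data = 1:9, nrow = 3, ncol = 3)
B <- matrix(data = 4:15, nrow = 4, ncol = 3)
C <- matrix(data = 8:10, nrow = 3, ncol = 2)
MyList <- list(A, B, C)

# Each result is a single value, so the output simplifies to a vector
as_vector <- sapply(MyList, "[", 2, 1)

# Each result has length 3 (first row of A and of B), so we get a 3 x 2 matrix
as_matrix <- sapply(list(A, B), "[", 1, )

# C has only 2 columns, so the first rows have lengths 3, 3 and 2:
# nothing to simplify to, and sapply() returns a list
ragged <- sapply(MyList, "[", 1, )

print(as_vector)
print(dim(as_matrix))
print(is.list(ragged))
```

Because the return type can change with the data, it is worth checking the result with is.list() or dim() before relying on it.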

Good article: https://www.r-bloggers.com/using-apply-sapply-lapply-in-r/. Implemented below:

In [30]:
m <- matrix(data=cbind(rnorm(30, 0), rnorm(30, 2), rnorm(30, 5)), nrow=30, ncol=3)

In [31]:
m

0,1,2
-0.8757037,0.23236472,4.273751
-1.25728942,1.29681482,4.90058
-1.31861507,1.00354265,3.831502
0.08881604,0.35773992,3.041967
-0.73372043,1.47315741,5.203949
-0.63420327,3.60191845,4.650125
1.79409248,1.19269789,3.929694
-0.01757364,1.20184233,5.716597
-1.2428026,0.80799914,5.968494
-0.09169076,2.32714697,5.73485


In [32]:
apply(m, 2, mean)

In [33]:
apply(m, 2, function(x) length(x[x<0]))

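An anonymous function is not always required. When a suitable function already exists and can be called with the column as its only argument, we can pass it by name. For example, is.vector can be passed directly, and gives the same result as wrapping it in a function ourselves. Let’s check that the columns arrive as vectors, as we might expect.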

In [40]:
apply(m, 2, is.vector) == apply(m, 2, function(x) is.vector(x))

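Why then did we need to wrap up our length function? When we want to define our own handling function for apply, we must at a minimum give a name to the incoming data, so we can use it in our function. Without the wrapper, R tries to evaluate length(x[x<0]) as an ordinary expression before apply() ever runs, and no x exists at that point: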

In [41]:
apply(m, 2, length(x[x<0]))

ERROR: Error in match.fun(FUN): object 'x' not found


In [42]:
apply(m, 2, function(x) mean(x[x>0]))
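
Since m is random, the outputs above differ on every run. As a sanity check, here is a sketch of the same three calls on a small fixed matrix (m_small and its values are invented for this example), including the failing call wrapped in tryCatch() so the script keeps running:

```r
# Hypothetical 3 x 2 matrix: column 1 is (-1, 2, -3), column 2 is (4, 5, -6)
m_small <- matrix(c(-1, 2, -3,
                     4, 5, -6), nrow = 3, ncol = 2)

# Count the negative values in each column
neg_counts <- apply(m_small, 2, function(x) length(x[x < 0]))

# Same count, written as the sum of a logical vector
neg_counts_alt <- apply(m_small, 2, function(x) sum(x < 0))

# Mean of the positive values in each column
pos_means <- apply(m_small, 2, function(x) mean(x[x > 0]))

# Passing an expression instead of a function raises an error
failed <- tryCatch(
  {
    apply(m_small, 2, length(x[x < 0]))
    FALSE
  },
  error = function(e) TRUE
)

print(neg_counts)
print(pos_means)
print(failed)
```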