In [None]:
from notebook.services.config import ConfigManager
cm = ConfigManager()
#![Rlogo]("E:\\Repos\\Notebooks\\applyFamily\\Rlogo.png")
cm.update('livereveal', {
    'width':1000,
    'height':600,
    'scroll':True,
    })


## Demystifying R's `*apply` Family of Functions
<br>
<img src="./Rlogo.png" align="right" width="200"/>
<br>
<p align="right">By: James D. Triveri</p>
<p align="right">Date: 2017-10-12</p>    
<br>


### Base R comes pre-installed with a number of `*apply` functions   
           
<br>
*  `apply(X, MARGIN, FUN, ...)`                  
<br>
*  `lapply(X, FUN, ...)`    
<br>
*  `sapply(X, FUN, ..., simplify=TRUE, USE.NAMES=TRUE)`        
<br>   
*  `vapply(X, FUN, FUN.VALUE, ..., USE.NAMES=TRUE)`        
<br>   
*  `mapply(FUN, ..., MoreArgs=NULL, SIMPLIFY=TRUE, USE.NAMES=TRUE)`        
<br>    
*  `tapply(X, INDEX, FUN = NULL, ..., default = NA, simplify = TRUE)`     
<br>      
*  `mclapply(X, FUN, ...)` (from the `parallel` package, which ships with R)      
<br>


### Functional Primitives
<br>
In addition to the `*apply` family, there are three functional primitives that           
serve a similar purpose:     
<br>       
*  `Reduce(f, x, init, right = FALSE, accumulate = FALSE)`            
<br>   
*  `Filter(f, x)`             
<br>      
*  `Map(f, ...)`             
<br>
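
As a rough illustration of what each primitive does, here is a minimal sketch on small integer vectors. The variable names (`total`, `running`, `evens`, `sums`) are made up for this example.

```R
# Reduce: fold a binary function over a vector, left to right
total   <- Reduce(`+`, 1:5)                        # ((((1+2)+3)+4)+5)
running <- Reduce(`+`, 1:5, accumulate = TRUE)     # keep every partial sum

# Filter: keep only the elements for which the predicate is TRUE
evens <- Filter(function(x) x %% 2 == 0, 1:10)

# Map: apply a function element-wise across several vectors, returning a list
sums <- Map(function(x, y) x + y, 1:3, 4:6)

total    # 15
running  # 1 3 6 10 15
evens    # 2 4 6 8 10
sums     # list of 5, 7, 9
```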

With so many tools to accomplish a similar set of tasks, where        
does one even begin?    
<br>


### The following slides will cover: 
<br>

*  The origins of the `*apply` family of functions       
<br>   
*  Use cases of `*apply` functions and related functional primitives   
<br>    
*  Relationship between functional primitives and `*apply` family functions    
<br>    
*  The core tenets of the functional programming paradigm    
<br>    
*  How to incorporate functional programming best practices into your development     
<br>



## Origins of R
<br>
Where does R come from?
<br>
```

        LISP (John McCarthy, 1958)                     
                \
                 \                                  S Language (John Chambers, 1976)
                  \                                      /     
                Scheme (Steele & Sussman, 1975)         /
                                           \           / 
                                            \         /    
                                             \       /
                                              \     /
                                               \   /
                                       R (Ihaka & Gentleman, 1993)      
         

```
<br>


### LISP background
<br>
*  LISP is the second-oldest high-level programming language still in use today.          
   Only Fortran is older, created one year earlier (in 1957).       
<br>    
*  LISP pioneered many ideas in computer science, including tree data structures,       
   dynamic typing, conditionals, higher-order functions and recursion.       
<br>
*  LISP became the favored language of early Artificial Intelligence research, and     
   was long the principal programming language of MIT's AI Lab.     
<br>
*  It was originally created as a practical mathematical notation for computer programs,     
   influenced by the notation of Alonzo Church's lambda calculus.     
<br>
*  The syntax, known as "fully parenthesized prefix notation", is notoriously     
   unwieldy and difficult to read.         
<br>   


### A sample LISP function declaration
<br>     
```
(defparameter *source* 
  '(lambda (x) (let ((y (+ x 0.1))) 
      (format t "foo! ~a~%" (+ x y)) x)))
          (LAMBDA (X)
      (LET ((Y (+ X 0.1)))
   (FORMAT T "foo! ~a~%" (+ X Y))
X))
; No value
```
<br>

### From xkcd
<br>
<div style="text-align:center" markdown="1">
![lisp_cycles](./lisp_cycles.png)
</div>
<br>

### Functional Programming Paradigm        
<br>
LISP and Scheme are examples of *Functional Programming Languages*.      
<br>
**What makes a programming language "functional"?**      
<br>
Hint: It's more than support for defining your own functions!     
<br>    


### Functional Programming Paradigm
<br>
*  Functions are pure (i.e. no side-effects)            
<br>    
*  Functions are *First-Class*        
<br>           
*  Variables are immutable     
<br>


### Functions are pure (i.e. no side-effects)...     
<br>
*  A pure function is a function where the return value is only determined by      
   its input values, without observable side effects.            
<br>   
*  This is how functions in mathematics work: `ln(x)` will, for the same value of `x`,             
   always return the same result.             
<br>     
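
A minimal R sketch of the distinction, assuming two hypothetical helpers (`pure_log` and `impure_log`) that are not part of any package. The impure version reads and modifies a variable outside its own body, so two identical calls disagree.

```R
# Pure: the result depends only on the argument
pure_log <- function(x) log(x)

# Impure: the result also depends on (and changes) state outside the function
call_count <- 0
impure_log <- function(x) {
    call_count <<- call_count + 1   # side effect: modifies an outer variable
    log(x) + call_count             # result now depends on the call history
}

pure_same   <- identical(pure_log(2), pure_log(2))       # TRUE
impure_same <- identical(impure_log(2), impure_log(2))   # FALSE
```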
           



### Example of a non-pure function declaration:
<br>   
```python
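import math

def non_pure_natural_log(x):
    # NOTE: `conn1` (a database connection) and `f` (an open file) are
    # assumed to exist at module level; touching them is what makes this impure.
    result = math.log(x)

    # print message to console =>  SIDE-EFFECT #1!
    print("Oh and by the way, which one's Pink?")

    # close database connection => SIDE-EFFECT #2!
    conn1.close(purge=True)

    # close a file descriptor   => SIDE-EFFECT #3!
    f.close()

    # finally, return `result`
    return result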
    
```
<br>    
**Even when programming in a non-functional language, avoiding this type     
of design pattern (or "anti-pattern") can greatly reduce the number of bugs       
introduced into your codebase.**         
<br>     



### Functions are First-Class...
<br>
Support for first-class functions means the language supports passing functions           
as arguments to other functions, returning them as the values from other functions,       
and assigning them to variables or storing them in data structures.   
<br>



### Example of a first-class function declaration in R
<br>    
```R       
# function `f` takes argument func, which is another function =>
f = function(a, b, func) {
    return(func(a*b))
}

# Then, passing functions as arguments to f =>
f(.5, .6, cos)
f(.5, .6, tan)
f(.5, .6, exp)

```    
<br>
     
Because R inherits its semantics from Scheme, first-class functions
are supported.    
<br>    
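
The other two properties in the definition, returning functions from functions and storing them in data structures, can be sketched the same way. `make_power` and `transforms` are hypothetical names chosen for this illustration.

```R
# A function that builds and returns another function (a closure over `p`)
make_power <- function(p) {
    function(x) x^p
}

square <- make_power(2)   # functions assigned to variables
cube   <- make_power(3)

# Functions stored in a list alongside a built-in
transforms <- list(square = square, cube = cube, root = sqrt)

# Apply every stored function to the same input
results <- lapply(transforms, function(g) g(4))
results   # square = 16, cube = 64, root = 2
```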


####  Variables are immutable...
<br>
*Immutability* means that once a variable has been assigned to an       
object, it cannot be reassigned to another object.      
<br>
An immutability violation frequently found in non-functional          
implementations is the mutating loop tracker:          
<br>

```R
i = 1
l = vector()

while (i<10) {

    l[i] = runif(1)
    
    i = i+1 # re-binding `i` to `i+1` not allowed by the functional paradigm!  
}    
```
<br>


#### How do functional languages perform operations on sequences without mutable state?      

**Recursion!**        
<br>
From the Wikipedia entry for "Recursion (computer science)":      

*Recursion in computer science is a method where the solution to a problem depends on          
solutions to smaller instances of the same problem (as opposed to iteration). The         
approach can be applied to many types of problems. Recursion is one of the central           
ideas of computer science.*           
<br>   



#### The "Hello, World!" of recursive functional design:
<br>

```R
# ==========================================
# Recursive factorial implementation in R =>
# ==========================================
factorial = function(n) {
    if (n <= 1) {  # base case; also covers factorial(0) == 1
        return(1)
    } else {
        return(n*factorial(n-1))
    }
}

# calling factorial function =>
factorial(5)  # returns 120
factorial(10) # returns 3628800

```
<br>   
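
The same idea can replace the mutating loop tracker shown on the immutability slide. This is only a sketch, and `random_draws` is a hypothetical helper: it builds the vector of uniform draws recursively, so no counter is ever re-bound.

```R
# Recursively build a vector of `n` uniform draws without a loop counter
random_draws <- function(n) {
    if (n <= 0) {
        return(numeric(0))            # base case: nothing left to draw
    }
    c(runif(1), random_draws(n - 1))  # one draw, then recurse on the rest
}

draws <- random_draws(9)   # same number of values as the `while (i<10)` loop
length(draws)              # 9
```

In practice R has no tail-call optimization, so deep recursion like this is best kept to illustration. The `*apply` functions give the same loop-free style without the stack cost.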


In many programming languages, map is the name of a higher-order function that applies a given function to each element of a list, returning a list of results in the same order. It is often called apply-to-all when considered in functional form.


The map function originated in functional programming languages.


The language Lisp introduced a map function called `maplist` in 1959, with slightly different versions already appearing in 1958. The original `maplist` mapped a function over successive rest lists.
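
To make the apply-to-all idea concrete, here is a hand-rolled recursive map in R. `my_map` is a hypothetical name used only for illustration; it maps over elements rather than over rest lists as `maplist` did, and in real code `lapply` does this job.

```R
# Apply `f` to every element of the list `xs`, preserving order
my_map <- function(f, xs) {
    if (length(xs) == 0) {
        return(list())                        # base case: empty input
    }
    c(list(f(xs[[1]])), my_map(f, xs[-1]))    # head result, then map the tail
}

inputs   <- list(1, 4, 9)
by_hand  <- my_map(sqrt, inputs)
built_in <- lapply(inputs, sqrt)

identical(by_hand, built_in)   # TRUE
```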





`lapply` returns a list of the same length as `X`, each element of which is the result of applying `FUN` to the corresponding element of `X`.
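
A short sketch of that behaviour on a small named list (the data is invented for the example). The result has one element per input element and keeps the input names.

```R
x <- list(a = 1:3, b = 4:6)

# One result per element of `x`, always returned as a list
totals <- lapply(x, sum)

length(totals) == length(x)   # TRUE
totals$a                      # 6
totals$b                      # 15
```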

`sapply` is a user-friendly version and wrapper of `lapply`, by default returning a vector, matrix or, if `simplify = "array"`, an array if appropriate, by applying `simplify2array()`. `sapply(x, f, simplify = FALSE, USE.NAMES = FALSE)` is the same as `lapply(x, f)`.
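
The simplification rules are easiest to see side by side. This is a sketch on invented data: length-one results collapse to a vector, equal-length results collapse to a matrix, and turning simplification off gives back the `lapply` result.

```R
x <- list(a = 1:3, b = 4:6)

# Each call returns a single value, so the result simplifies to a named vector
totals <- sapply(x, sum)

# Each call returns two values, so the result simplifies to a 2 x 3 matrix
squares <- sapply(1:3, function(i) c(i, i^2))

# With simplification switched off, sapply matches lapply
same_as_lapply <- identical(
    sapply(x, sum, simplify = FALSE, USE.NAMES = FALSE),
    lapply(x, sum)
)

totals           # a = 6, b = 15
dim(squares)     # 2 3
same_as_lapply   # TRUE
```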


`vapply` is similar to `sapply`, but has a pre-specified type of return value, so it can be safer (and sometimes faster) to use.
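
A sketch of what "safer" means in practice, on invented data. `FUN.VALUE` acts as a template: results that do not match it raise an error, and empty input still yields a vector of the declared type rather than an empty list.

```R
x <- list(a = 1:3, b = 4:6)

# Every result must be a single integer, as declared by FUN.VALUE
totals <- vapply(x, sum, FUN.VALUE = integer(1))

# range() returns two values, which violates the numeric(1) template
mismatch <- tryCatch(
    vapply(x, range, FUN.VALUE = numeric(1)),
    error = function(e) "rejected"
)

# Empty input: sapply falls back to a list, vapply keeps the declared type
empty_s <- sapply(list(), length)
empty_v <- vapply(list(), length, FUN.VALUE = integer(1))

totals             # a = 6, b = 15
mismatch           # "rejected"
is.list(empty_s)   # TRUE
is.integer(empty_v)  # TRUE
```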


`mapply` is a multivariate version of `sapply`. `mapply` applies `FUN` to the first elements of each `...` argument, the second elements, the third elements, and so on. Arguments are recycled if necessary.


`apply` returns a vector or array or list of values obtained by applying a function to margins of an array or matrix.
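
A small sketch of the `MARGIN` argument on an invented 2 x 3 matrix: `1` applies the function to each row, `2` to each column.

```R
m <- matrix(1:6, nrow = 2)   # filled column-wise: rows are (1,3,5) and (2,4,6)

row_sums <- apply(m, 1, sum)   # MARGIN = 1: one result per row
col_sums <- apply(m, 2, sum)   # MARGIN = 2: one result per column

row_sums   # 9 12
col_sums   # 3 7 11
```

For plain sums, `rowSums()` and `colSums()` are the faster built-ins; `apply` earns its keep when the function is arbitrary.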


`tapply` applies a function to each cell of a ragged array, that is, to each (non-empty) group of values given by a unique combination of the levels of certain factors.


`mclapply` is a parallelized version of `lapply`; it returns a list of the same length as `X`, each element of which is the result of applying `FUN` to the corresponding element of `X`. It relies on forking and hence is not available on Windows unless `mc.cores = 1`.
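
A minimal sketch of a portable call, assuming two worker processes are acceptable on non-Windows machines (the choice of `2L` is arbitrary). On Windows it falls back to `mc.cores = 1`, which runs serially.

```R
library(parallel)

# Forking is unavailable on Windows, so fall back to a single core there
cores <- if (.Platform$OS.type == "windows") 1L else 2L

# Same interface as lapply, plus the mc.cores argument
par_result <- mclapply(1:4, function(i) i^2, mc.cores = cores)
seq_result <- lapply(1:4, function(i) i^2)

all.equal(par_result, seq_result)   # TRUE
```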










In [None]:
from IPython.display import Image, display
display(Image(filename='output1.png'))

### Equivalences

`sapply(*, simplify = FALSE, USE.NAMES = FALSE)` is equivalent to `lapply(*)`.

```R
mapply(rep, 1:4, 4:1)

mapply(rep, times = 1:4, x = 4:1)

mapply(rep, times = 1:4, MoreArgs = list(x = 42))

mapply(function(x, y) seq_len(x) + y,
       c(a =  1, b = 2, c = 3),  # names from first
       c(A = 10, B = 0, C = -10))

word <- function(C, k) paste(rep.int(C, k), collapse = "")
utils::str(mapply(word, LETTERS[1:6], 6:1, SIMPLIFY = FALSE))
```








```R
n <- 17; fac <- factor(rep_len(1:3, n), levels = 1:5)
table(fac)
tapply(1:n, fac, sum)
tapply(1:n, fac, sum, default = 0) # maybe more desirable
tapply(1:n, fac, sum, simplify = FALSE)
tapply(1:n, fac, range)
tapply(1:n, fac, quantile)
tapply(1:n, fac, length) ## NA's
tapply(1:n, fac, length, default = 0) # == table(fac)
```


Now suppose we want the mean sepal length broken down by species in the built-in `iris` dataset. This is a job for `tapply()`:

```R
tapply(iris$Sepal.Length, iris$Species, mean)
```

