## Lexical scoping ##

### Recommended reading ###
- [Forms of the Assignment operator](https://github.com/lgreski/datasciencectacontent/blob/master/markdown/rprog-assignmentOperators.md)
- [R Objects, Lexical Scoping](https://github.com/lgreski/datasciencectacontent/blob/master/markdown/rprog-lexicalScoping.md)
- [Advanced R by Hadley Wickham - Functions](http://adv-r.had.co.nz/Functions.html)
- [Advanced R by Hadley Wickham - Non-standard evaluation](http://adv-r.had.co.nz/Computing-on-the-language.html#scoping-issues)
- [Lexical Scoping and Statistical Computing](https://www.stat.auckland.ac.nz/~ihaka/downloads/lexical.pdf)

In R every object is tied to an environment. Specifically for functions, **each function includes a pointer to its parent environment**. This allows the function to have **access to the objects that are defined in the parent environment**, in addition to any objects that are created within the function. The combination of a function and the variables referenced in its environment is also known in computer science as a **closure**.

This feature allows a developer to write functions within a function that can access objects defined in all of the parent environment(s) in the hierarchy between the child function and the R Global Environment.

```R
y <- 10

f <- function(x) {
        y <- 2
        y^2 + g(x)
}

g <- function(x) { x * y }

f(3)
```
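To make explicit which y each function uses in the example above, and how a function defined inside another function resolves names, here is a minimal sketch. The names outer, inner, z, result, and nested are hypothetical and are not part of the original example.

```R
y <- 10

g <- function(x) { x * y }          # y is looked up where g was defined: the global environment

f <- function(x) {
        y <- 2                      # local to f; invisible to g
        y^2 + g(x)
}

result <- f(3)                      # 2^2 + 3 * 10 = 34

# A function defined inside another function sees its parent's objects.
outer <- function() {
        z <- 5                      # hypothetical local object
        inner <- function() z + y   # z comes from outer(), y from the global environment
        inner()
}

nested <- outer()                   # 5 + 10 = 15
```

The local y inside f() never reaches g(), because g() looks for y in the environment where it was defined rather than in the environment it was called from.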

## What is a closure? ##

A closure is a functional programming concept that is central to lexical scoping. A closure represents the association between a function and the environment in which it was defined, including the local variables in scope at that point and the values or references bound to their names at definition time. Since anonymous functions are unnamed, they are associated with their environments by reference.

A closure enables the function to access these variables, through copies or references, even when the function is called outside their scope. This distinguishes a closure from a function that carries no environment; in R, every function except a primitive function is a closure.
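As an illustration of a function keeping access to variables after the scope that created them has ended, consider this small sketch. makeCounter(), counterA, and counterB are hypothetical names chosen for the example.

```R
# makeCounter() is a hypothetical example, not part of the original text.
makeCounter <- function() {
        count <- 0                       # lives in the environment created by this call
        function() {
                count <<- count + 1      # updates count in the enclosing environment
                count
        }
}

counterA <- makeCounter()
counterB <- makeCounter()                # a separate environment with its own count

counterA()
counterA()
a <- counterA()                          # third call on counterA: 3
b <- counterB()                          # first call on counterB: 1

# The captured variable is reachable through the function's environment.
captured <- get("count", envir = environment(counterA))
```

Each call to makeCounter() creates a fresh environment, so the two counters do not interfere with each other.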

## Caching ##

A cache is a way to **store objects in memory to accelerate subsequent access to the same object**. In statistics, some matrix algebra computations are notoriously expensive, such as calculating the inverse of a matrix. Therefore, if one needs to use the same inverted matrix for subsequent computations, it is advantageous to cache it in memory instead of repeatedly calculating the inverse.

```R
makeVector <- function(x = numeric()) {
        m <- NULL
        set <- function(y) {
                x <<- y
                m <<- NULL
        }
        get <- function() x
        setmean <- function(mean) m <<- mean
        getmean <- function() m
        list(set = set, get = get,
             setmean = setmean,
             getmean = getmean)
}

cachemean <- function(x, ...) {
        m <- x$getmean()
        if(!is.null(m)) {
                message("getting cached data")
                return(m)
        }
        data <- x$get()
        m <- mean(data, ...)
        x$setmean(m)
        m
}

myVector <- makeVector(1:15)
```

When an R function returns an object that contains functions defined in its own environment (as is the case with a call like myVector <- makeVector(1:15)), the returned object gives access not only to the specific functions in its list but also to the entire environment created by the call to makeVector(), including the original argument passed to the function.

Why is this the case? myVector holds pointers to functions that live in the makeVector() environment, so after the function returns, these pointers prevent the memory consumed by makeVector() from being released by the garbage collector. Therefore, the entire makeVector() environment stays in memory, and myVector can access its functions as well as any data in that environment that is referenced in its functions.

This feature explains why x (the argument initialized on the original function call) is accessible by subsequent calls to functions on myVector such as myVector$get(), and it also explains why the code works without having to explicitly issue myVector$set() to set the value of x.
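A short session makes this visible. The sketch below repeats the definitions of makeVector() and cachemean() from above so that it runs on its own; the object names first, second, env, and kept are hypothetical.

```R
makeVector <- function(x = numeric()) {
        m <- NULL
        set <- function(y) {
                x <<- y
                m <<- NULL
        }
        get <- function() x
        setmean <- function(mean) m <<- mean
        getmean <- function() m
        list(set = set, get = get,
             setmean = setmean,
             getmean = getmean)
}

cachemean <- function(x, ...) {
        m <- x$getmean()
        if(!is.null(m)) {
                message("getting cached data")
                return(m)
        }
        data <- x$get()
        m <- mean(data, ...)
        x$setmean(m)
        m
}

myVector <- makeVector(1:15)

myVector$get()                     # x is available without calling myVector$set()

first  <- cachemean(myVector)      # computes the mean and stores it in m
second <- cachemean(myVector)      # prints "getting cached data" and returns m

# All four functions share the environment created by the makeVector() call.
env  <- environment(myVector$get)
kept <- ls(env)                    # names of the objects still held in that environment
```

Here kept lists x and m alongside the four functions, which shows that the data and the cache live on in the environment that the returned functions point to.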

## makeVector() step by step ##

After initializing the objects that store its state, makeVector() provides four basic behaviors that are typical for data elements within an object-oriented program. They're called "getters and setters," and are more formally known as accessor and mutator methods. As one might expect, "getters" are program modules that retrieve (access) data within an object, and "setters" are program modules that set (mutate) the data values within an object.

```R
        set <- function(y) {
                x <<- y
                m <<- NULL
        }
        get <- function() x
```

<<- is one of several forms of the assignment operator in R, alongside <- and =. Where <- assigns within the current environment, <<- searches the enclosing (parent) environments for an existing object with the given name and assigns to it there; if no such object is found, the assignment is made in the global environment.

Within set() we use <<- so that the value on the right side of the operator is assigned to the object in the parent environment, the environment of makeVector(), that is named on the left side of the operator.

When set() is executed, it does two things:

1. Assign the input argument to the x object in the parent environment, and
2. Assign the value of NULL to the m object in the parent environment. This line of code clears any value of m that had been cached by a prior execution of cachemean().

Therefore, if there is already a valid mean cached in m, whenever x is reset, the value of m cached in the memory of the object is cleared, forcing subsequent calls to cachemean() to recalculate the mean rather than retrieving the wrong value from cache.
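The difference between the two assignment forms, and the cache reset described above, can be sketched with a stripped-down example. makeBox() and everything inside it are hypothetical; it is a simplified stand-in for makeVector(), not the original code.

```R
# makeBox() is a hypothetical, stripped-down cousin of makeVector().
makeBox <- function(x) {
        m <- "cached"                    # stands in for a previously cached result
        setLocal <- function(y) {
                x <- y                   # creates a new x inside setLocal(); discarded on return
        }
        setParent <- function(y) {
                x <<- y                  # modifies the x that belongs to makeBox()
                m <<- NULL               # clears the stale cached value
        }
        list(setLocal = setLocal, setParent = setParent,
             get = function() x, getCache = function() m)
}

box <- makeBox(1)

box$setLocal(99)
afterLocal <- box$get()                  # still 1: the parent's x was never touched

box$setParent(99)
afterParent <- box$get()                 # now 99
cacheCleared <- is.null(box$getCache())  # TRUE: the cache was reset together with x
```

Using <- inside a setter would silently leave the stored data unchanged, which is why set() relies on <<- for both x and m.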

Step 3: Create a new object by returning a list()

```R
    list(set = set,          # gives the name 'set' to the set() function defined above
         get = get,          # gives the name 'get' to the get() function defined above
         setmean = setmean,  # gives the name 'setmean' to the setmean() function defined above
         getmean = getmean)  # gives the name 'getmean' to the getmean() function defined above
```

The last section of code assigns each of these functions as an element within a list() and returns the list to the calling environment. Each element in the list is named. This allows us to use the $ form of the extract operator to access the functions by name, as in myVector$get(), rather than using the [[ form of the extract operator, as in myVector[[2]](), to get the contents of the vector.
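The two extraction styles can be compared on any named list of functions. The list ops and its elements below are hypothetical and only mirror the shape of what makeVector() returns.

```R
# A hypothetical named list of functions, mirroring what makeVector() returns.
ops <- list(double = function(v) v * 2,
            square = function(v) v^2)

byName     <- ops$square(4)          # extract by name with $
byString   <- ops[["square"]](4)     # extract by name with [[
byPosition <- ops[[2]](4)            # extract by position; breaks if the order changes
```

All three calls return 16, but only the name-based forms keep working if the elements of the list are later reordered.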

It's important to note that the cachemean() function REQUIRES an input argument that was created by makeVector(). If one passes a regular vector to the function, as in

```R
aResult <- cachemean(1:15)
```

the function call will fail with an error explaining that cachemean() was unable to access $getmean() on the input argument because $ does not work with atomic vectors.
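The failure can be reproduced safely by catching the error, as in the sketch below. isCacheVector() is a hypothetical guard added for illustration; it is an assumption about how one might check the input and is not part of the original code.

```R
plainVector <- 1:15

# Capture the error message instead of stopping the session.
msg <- tryCatch(
        plainVector$getmean(),
        error = function(e) conditionMessage(e)
)

# isCacheVector() is a hypothetical guard, not part of the original code.
isCacheVector <- function(x) {
        is.list(x) && all(c("get", "getmean", "setmean") %in% names(x))
}

isCacheVector(plainVector)    # FALSE: an atomic vector has none of the required functions
```

In an English-language R session, msg typically reads "$ operator is invalid for atomic vectors".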

## Explaining cachemean() ##

cachemean() is required to populate or retrieve the mean from an object created by makeVector().

```R
cachemean <- function(x, ...) {
     ...
```

cachemean() starts with a single argument, x, and an ellipsis that allows the caller to pass additional arguments into the function, which cachemean() forwards to mean().
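How the ellipsis carries extra arguments through to mean() can be shown with a small wrapper. meanOf() and values are hypothetical names; the wrapper only imitates the mean(data, ...) call inside cachemean().

```R
# meanOf() is a hypothetical wrapper that forwards ... to mean(), as cachemean() does.
meanOf <- function(data, ...) {
        mean(data, ...)
}

values <- c(1, 2, NA, 4)

plain   <- meanOf(values)                 # NA: mean() keeps missing values by default
cleaned <- meanOf(values, na.rm = TRUE)   # na.rm travels through ... to mean()
```

One consequence worth keeping in mind: cachemean() returns the cached value whenever one exists, so arguments passed through the ellipsis only take effect on the call that actually computes the mean.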

Note that cachemean() is the only place where the mean() function is executed, which is why makeVector() is incomplete without cachemean().