# Environments and scoping in R

Scoping in R isn't really that weird if you're coming to it from a programming background, but it does seem to stress out mathematicians.

Generally, you'll be working in the Global Environment, unless you're working inside a function.

(Attribution note: a _lot_ of the code for this particular notebook came from Jordan Benis's lecture notes, which he kindly shared with me.)

In [1]:
environment()

<environment: R_GlobalEnv>

In [2]:
# the global environment does have a parent environment
parent.env(environment())

<environment: 0x00000000427d9fd0>
attr(,"name")
[1] "jupyter:irkernel"

In [3]:
# ... which has a parent environment
parent.env(parent.env(environment()))
# we could keep going up, but you get the idea

<environment: package:stats>
attr(,"name")
[1] "package:stats"
attr(,"path")
[1] "C:/Users/csheldon-hess/Anaconda3/envs/r-with-jupyter/Lib/R/library/stats"
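If you do want to keep going up, here is a minimal sketch that walks the whole chain of parents until it hits the empty environment. The variable names `env` and `chain` are just for illustration, and the middle of the chain will differ depending on which packages your session has attached.

```r
# Walk from the global environment up to the empty environment,
# collecting the name of each environment along the way.
env <- globalenv()
chain <- character(0)

while (!identical(env, emptyenv())) {
  chain <- c(chain, environmentName(env))
  env <- parent.env(env)
}

# The first entry is "R_GlobalEnv" and the last is "base";
# everything in between depends on the attached packages.
print(chain)
```

The empty environment is the only one without a parent, which is why the loop stops there.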

In [5]:
print(ls())
print(ls(globalenv()))
#ls(baseenv())

character(0)
character(0)


In [8]:
checkEnv <- function(){
  print(environment())
  print(parent.env(environment()))
  print(ls())
}

checkEnv()
ls()

<environment: 0x0000000043573030>
<environment: R_GlobalEnv>
character(0)


In [15]:
checkEnvWarg <- function(arg = 3){
  print(environment())
  print(parent.env(environment()))
  print(ls()) # oooh
}

checkEnvWarg()


<environment: 0x00000000411eab90>
<environment: R_GlobalEnv>
[1] "arg"
<environment: 0x00000000412aa3c8>
<environment: R_GlobalEnv>
character(0)


In [16]:
ls()

To some extent, scoping in R works just like we expect from other languages. Create a variable inside a function? It goes away when the function returns. It's local to that function. Makes sense. Seems good. We're happy.

In [14]:
a <- 7

changea0 <- function(){
    b <- 20
    a <- a + b
}

changea0()
a
b

ERROR: Error in eval(expr, envir, enclos): object 'b' not found


Within a function, you can access global variables. Also unsurprising. But R does a nice thing, here, where it doesn't just let you go changing global variables without _really_ intending to.

When you assign with `<-` inside a function, R creates a new variable in the function's environment (here, initialized from the global value it just looked up). It's separate from the global variable, so changes to it don't affect the global variable.

In [17]:
a <- 7 #in global env

# use a in a function
changea1 <- function(){
  a <- a + 1 # now we are in function's environment
  print(a)
}
changea1()
a

[1] 8
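To see the local copy appear, we can ask whether `a` exists in the function's own environment before and after the assignment. This is only a sketch; `peek`, `before`, and `after` are made-up names for illustration.

```r
a <- 7 # defined outside the function

peek <- function(){
  # inherits = FALSE means "look only in this environment, not its parents"
  before <- exists("a", envir = environment(), inherits = FALSE)
  a <- a + 1 # reads the outer a, then creates a local a
  after <- exists("a", envir = environment(), inherits = FALSE)
  c(before = before, after = after)
}

peek() # before is FALSE, after is TRUE
a      # still 7
```

Reading `a` searches up through the parent environments, but `<-` always writes to the current one.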


In [18]:
a <- 7 #in global env

# ok, what if we make a a parameter?
# (don't do this, why would you do this?)
changea2 <- function(a = 9){
  a <- a + 1 # here a is the parameter; the call below passes in the global value
  print(a)   # can change it in function scope
}
changea2(a)
a # but it remains the same, globally

[1] 8


### Updating global variables inside functions

Occasionally, you might need to update a global variable inside a function. It's a thing, I _guess._ (I claim it's not a thing you will want to do often, although the book's example with "deal a card from the deck and then pull that card out of the deck" is a compelling exception.) 

In [19]:
# updating a value
# how do we think this works?

globalVar <- 17

updateGV <- function(numForUpdate){
  globalVar <- globalVar + numForUpdate
  globalVar
}

updateGV(3)
globalVar

Yeah, no. The function returns 20, but that's its own local `globalVar`; the global one is still 17.

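For this we need the, I kid you not, **superassignment operator** `<<-`, which skips the local environment and assigns to the variable of that name in an enclosing environment (here, the global one).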

In [20]:
globalVar <- 17

updateGV <- function(numForUpdate){
  globalVar <<- globalVar + numForUpdate
  globalVar
}
updateGV(3)

globalVar
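One thing worth knowing: `<<-` doesn't jump straight to the global environment. It searches up through the enclosing environments and updates the first matching variable it finds. A small sketch of that, using a hypothetical `makeCounter` function (the names `makeCounter`, `counter`, and `count` are mine, not from the lecture notes):

```r
makeCounter <- function(){
  count <- 0 # lives in makeCounter's environment, not the global one
  function(){
    count <<- count + 1 # updates the enclosing count
    count
  }
}

counter <- makeCounter()
counter() # 1
counter() # 2

# FALSE, as long as nothing else has created a global variable named count
exists("count", envir = globalenv(), inherits = FALSE)
```

If `<<-` finds no existing variable anywhere up the chain, it creates one in the global environment.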

Alternatively, you can use the `assign()` function.

In [21]:
globalVar <- 17

updateGV <- function(numForUpdate){
  assign("globalVar", globalVar + numForUpdate, envir = parent.env(environment()))
  globalVar
}

updateGV(3) # we can try this with other numbers if you want

globalVar
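Using `parent.env(environment())` works above only because `updateGV` was defined at the top level, so its parent happens to be the global environment. If that's not guaranteed, naming the target explicitly with `globalenv()` is probably safer. A sketch, where `makeUpdater` and `updateGV2` are hypothetical names:

```r
globalVar <- 17

makeUpdater <- function(){
  # The function returned here has makeUpdater's environment as its parent,
  # so parent.env(environment()) would NOT be the global environment.
  function(numForUpdate){
    assign("globalVar", globalVar + numForUpdate, envir = globalenv())
    globalVar
  }
}

updateGV2 <- makeUpdater()
updateGV2(3)

globalVar
```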

Questions? Things you want to try?

# NAs in R

As data analysts, we often have to deal with missing values in our data. R was built with this kind of thing in mind and represents missing values with the symbol `NA`. Most math and logic operations on an `NA` just give you back `NA`.

In [22]:
5 + NA #it doesn't return 5

NA == "Dog" #it doesn't return FALSE, either; it returns NA
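Because comparing against `NA` just produces more `NA`, the way to test for missing values is `is.na()`. A quick sketch with a made-up vector `x`:

```r
x <- c(4, NA, 9)

x == NA       # NA NA NA, not useful
is.na(x)      # FALSE TRUE FALSE
sum(is.na(x)) # 1, the number of missing values
```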

In [23]:
vecWNA <- c(NA,1:10)

vecWNA

mean(vecWNA) #will return NA!

All is not lost! We can tell R, "yes, this data has some gaps, but we want to know what you can tell us using only the valid parts of the data." 
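For many summary functions, including `mean()` and `sum()`, the way to say that is the `na.rm = TRUE` argument. A minimal sketch, reusing the same vector as above:

```r
vecWNA <- c(NA, 1:10)

mean(vecWNA, na.rm = TRUE) # 5.5, the mean of 1 through 10
sum(vecWNA, na.rm = TRUE)  # 55
```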