# R Language Basics

In this module, we will explore the basics of R data structures and functions.

If you are unfamiliar with notebooks, please review some basics [here](https://github.com/michhar/useR2016-tutorial-jupyter). 

## Essential Tips for Jupyter notebooks

Here is a very brief summary of the critical components and commands within Jupyter:

1. Critically, press `Ctrl+Enter` to run (or render) the current cell.
2. Output will print to the notebook. You may have to scroll up to see it all.
3. Get help for any function by typing a question mark and then its name into
   the console: `?rxLinMod`. It will split the window, and will bring up the documentation for 
   that function below.
4. Files will appear in the specified directory. You can find them by selecting File in the menu bar and selecting "Open...". This will open a new browser window with a file navigator.
5. R objects can be viewed by typing `ls()` in an R cell.
6. Run all the example code!

There are a number of hands-on exercises in the document, so while you can run the notebook from beginning to end, you will get a lot more out of it by actually walking through cell-by-cell, and filling out the corresponding exercises.

These notebooks are based on a tutorial presented at a Microsoft conference in June of 2016. The original files are available [here](https://github.com/joseph-rickert/MLADS_JUNE_2016).

## Basic Data Structures

R has 5 basic data structures:     

1. atomic vectors
2. matrices
3. arrays
4. lists
5. data frames
  
Note that atomic vectors are as simple as it gets - the way to represent a scalar is to create a vector with only 1 element in it. Let's start with these.

There are a variety of ways to create vectors, but the simplest is to use the `<-` assignment operator:


In [None]:
v <- 3.14

We can read the prior cell as "v gets the value 3.14".

When we are investigating atomic vectors, it is useful to note that atomic vectors have 3 basic properties:

1. type: the type of object that is represented
2. length: the number of elements in the vector
3. attributes: arbitrary other metadata (e.g. element names)

We can access each of these properties with that property's accessor function.

### Example 1

In [None]:
v             ## returns the value of the object
typeof(v)     ## returns the type of object - its first property
length(v)     ## returns the length of the object - its second property
attributes(v) ## returns the attributes of the object - its third property

### Example 2

We can create arbitrary vectors with the `c()` function. You can think of `c()` as standing for "combine", in that the output of `c()` will combine all of its arguments into a single vector or list.

In [None]:
b <- c('john', 'jane')    ## combine john and jane into a single vector!
b                         ## return the value of b
typeof(b)                 ## return the type
length(b)                 ## return the length (its number of elements)
attributes(b)             ## return the attributes

### Attributes

We have seen vectors with different types (numeric and character), and vectors with different numbers of elements. What are attributes used for?

One of the most common uses of attributes with atomic vectors is to name the elements within a vector. We can name them when constructing them with the `c()` function:

In [None]:
b2 <- c(first = "jane", last = "doe")
attributes(b2)

Because names are so common, they have their own accessor function:

In [None]:
names(b2)

### Types of atomic Vectors

The four common types of atomic vectors are:   

1. integer
2. double
3. character
4. logical

The type of an atomic vector is important, because atomic vectors are homogeneous. This means that *every element* within a single atomic vector has to be of the same type. If one element of object `myvec` is character, then they all have to be.

#### Integer vectors   

Whole numbers typed without a suffix, such as `12` or `55`, are stored as `numeric` (i.e. double) type, even though they look like integers. How do we force R to use an integer representation?

We can enforce integers by using the "L" suffix, or by using a specific coercion (casting) function. In R, the conversion functions are almost always named `as.XXX()` where `XXX` is the target object type. To coerce a vector to integer type, we can use `as.integer()`:

In [None]:
u1 <- c(12L,55L,43L,9L)    # specify integer type - adding an `L` ensures that it will be integer
typeof(u1)

## the same values without the `L` suffix are stored as double
u  <- c(12,55,43,9)        # `u` is defined here so that this cell runs on its own
typeof(u)
u  <- as.integer(u)        # coerce u to be an integer vector
typeof(u)   
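As a small illustration of what `as.integer()` does to values that are not already whole numbers, the sketch below uses made-up vectors (`d` and `bad` are not part of the tutorial data). It shows that the fractional part is dropped rather than rounded, and that text which cannot be read as a number becomes a missing value.

```r
# Coercing doubles to integer drops the fractional part (truncation toward zero)
d <- c(3.9, -3.9, 2)
i <- as.integer(d)
i                              # 3 -3 2
typeof(i)                      # "integer"

# The L suffix and as.integer() give the same result
identical(as.integer(2), 2L)   # TRUE

# Text that cannot be parsed as a number becomes NA (R also issues a warning,
# which is suppressed here to keep the output clean)
bad <- suppressWarnings(as.integer(c("7", "seven")))
bad                            # 7 NA
```

If rounding is what you want, apply `round()` before coercing.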

#### Character vectors

Character vectors are created in the same kinds of ways that other vectors are. In the most general case, we can use `c()`:

In [None]:
c1 <- c("abc","def","any old string")       # form a character vector
typeof(c1)
length(c1)

c2 <- c("2","3"); c2
u2 <- as.integer(c2)        # coerce c2 to be an integer - as.integer() works on character vectors too!
u2; typeof(u2)

#### Logical vectors    

Alongside integer, double, and character vectors, the other common type of vector is logical. We can create logical vectors with `c()` as well!

In [None]:
u3 <- c(TRUE,T,FALSE,F)     # create a logical vector
typeof(u3)

#### Exercise

While the symbols `T` and `F` can be used as abbreviations for the values TRUE and FALSE, you must be very careful with them, because they are not protected. For example, what is the type of the vector that is created with the following cell?

In [None]:
T <- "I am false!"
test_T <- c(TRUE, FALSE, T)
rm(T)

In [None]:
## Place your exercise code here

#### Logical vectors

Most logical vectors are not created by assigning TRUE and FALSE arbitrarily. Usually, logical vectors are created by evaluating logical expressions. For example, we can create a vector of 10 random numbers between 0 and 1, and then we ask whether each of those 10 values is greater than 0.5. The result of the comparison is assigned to a logical vector `u4`:

In [None]:
r10 <- runif(10)
u4 <- r10 > .5       # create a random logical vector
u4; typeof(u4)  

#### Exercise

Get help on `Logic` and `Comparison` to identify which logical comparisons are available, and what each of their symbols is.

Advanced: Can you describe the difference between `&&` and `&`? 

In [None]:
# Place your exercise code here.

#### Exercise

As we mentioned above, atomic vectors are *homogeneous*, which means that every element is of the same type. Additionally, we mentioned that `c()` can be used to combine arbitrary vectors - our earlier examples were just combining vectors of length 1, but that was to be very clear.

Use `c()` to combine `v` and `u1` into a new vector `vu1`. What is the type of the resulting vector? What is its length?


In [None]:
# Place exercise code here

#### Exercise

What happens if you use `c()` to combine `u4`, `u1`, `v`, and `c1`? How long is the result? What is the type?

In [None]:
# Place exercise code here

#### Exercise

Finally - execute the following code. Can you determine why `conversion1` and `conversion2` are different?

In [None]:
conversion1 <- c(c1, c(u4, u1))
conversion2 <- c(c1, u4, u1)
all(conversion1 == conversion2)    ## Why is this false?

# Place exercise code here


### Vector Operations

The prior example on creating a random logical vector illustrates a very powerful feature of R. Many of the arithmetic and logical operations are vectorized, meaning that an author can use one statement to evaluate a function for *every element in an object*. When we look at the statement:

```
r10 > 0.5
```

That is evaluating whether each element of `r10` is greater than 0.5. While this is often very useful, we sometimes also want to extract subsets of that data. We can index from a vector using square brackets. 

In order to demonstrate this, let's generate a couple of new sequences. We can generate systematic numeric and integer sequences using `seq()` and the `:` operator.


In [None]:
v1 <- seq(from = 1,to = 50, by=5)          # seq() generates sequences - see ?seq for more help
v2 <- 1:10                                 # `:` operates like seq(..., by = 1), but returns an integer vector

v1
v2

#### Subsetting

We can subset by using the square bracket `[`. One feature of the square bracket is that it is pretty flexible.

You can extract a single value - if the indexing value is numeric, it will return the element at that position:

In [None]:
v1                                         # return the full vector
v1[5]                                      # extracts the 5th value

Or, you can extract multiple values. If the indexing value has a length > 1, then it can extract as many values as you pass it. You can even extract the same value multiple times, so it's possible that the subset vector could be *longer* than your original vector!

In [None]:
v2[3:4]                                # extracts the third and fourth values - indices can be vectors too!
v3 <- v1[c(v2, v2)]                    # extracts all values of v1 twice!
length(v3)
v3

Or, you can use logical vectors. If a logical vector is used, then it will extract all values for which the logical expression is true.

In [None]:
v2 <= 5
v1[v2 <= 5]     ## extract elements 1 - 5 of v1, because the first 5 elements of v2 are <= 5

If elements are named, then you can also subset with names:

In [None]:
b2["first"]
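The three indexing styles described above (position, logical condition, and name) can be compared side by side on one small vector. The named vector `scores` below is a hypothetical example introduced only for this sketch.

```r
# A hypothetical named numeric vector
scores <- c(ann = 90, bob = 72, cy = 85)

scores[2]                 # by position: the second element (bob)
scores[c("ann", "cy")]    # by name: a character vector of names
scores[scores > 80]       # by logical condition: keeps ann and cy
```

In each case the result is still a vector, and the names travel with the values.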

#### Vector arithmetic

As we saw above, we compared all values in the r10 vector to 0.5 with one statement. It turns out that we can do this element-wise type of operation for many arithmetic statements as well. For example, if we add two vectors to one another, it will operate element-by-element.

In [None]:

s <- v1 + v2             # add vectors element by element
s
p <- v1 * v2             # multiply element by element
p


#### Vectors of different length

One thing to note is that R behaves very reasonably when the vectors in an operation or comparison are the same length - it proceeds and compares element k of vector 1 to element k of vector 2, all the way through the entire length of vectors 1 and 2.

What happens when the two vectors are not the same length? R "recycles" the shorter vector. A really clear example of this is when we compared `r10` to the value 0.5: 0.5 is, in and of itself, a vector with length 1. R effectively expanded 0.5 to length 10 (where all elements equal 0.5), and then went element by element and compared `r10` to that new, longer comparison vector. While this is a trivial example, this type of recycling is ubiquitous. For example, consider the following statement:


In [None]:
v2/(1:2)

Each odd number is divided by 1, and each even number is divided by 2. The statement is actually equivalent to:

In [None]:
v3 <- rep(1:2, length.out = 10)   # spell out the full argument name rather than relying on partial matching
v3
v2 / v3
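The same recycling rule applies to every element-wise operator, not only division. As a minimal sketch with made-up vectors (`long` and `short` are illustrative names), adding a length-2 vector to a length-4 vector reuses the shorter one twice:

```r
long  <- c(1, 2, 3, 4)
short <- c(10, 20)

long + short                          # 11 22 13 24
long + rep(short, length.out = 4)     # the explicit, equivalent form
```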

#### Exercise

What happens when the longer vector's length is not a multiple of the shorter vector's length (try a vector of length 3, and a vector of length 10)?


In [None]:
## Place exercise code here

### Matrices   

Matrices are simply two-dimensional vectors. They have the same constraint that they are *homogeneous*, but they can be indexed and accessed by row and by column. Most generically, we use the `matrix()` function to create a matrix object.

In [None]:
m1 <- matrix(1:100,nrow=10,ncol=10)  
m1  
typeof(m1); class(m1); length(m1); dim(m1)  
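One detail worth seeing on a small case is the order in which `matrix()` places the values: by default it fills down the columns, and the `byrow = TRUE` argument switches to filling across the rows. The 2 x 3 matrices below are illustrative only.

```r
by_col <- matrix(1:6, nrow = 2)                 # default: fill column by column
by_row <- matrix(1:6, nrow = 2, byrow = TRUE)   # fill row by row instead

by_col[1, ]   # 1 3 5
by_row[1, ]   # 1 2 3
dim(by_col)   # 2 3
```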

#### Indexing a Matrix

A matrix still uses square brackets to access subsets. However, it is frequently indexed with both a row index and a column index, separated with a comma:

In [None]:

m1[5,5]        # index a particular element of a matrix

If we want to index an entire row, then we leave the column index empty, and if we want to index an entire column, we leave the row index empty:

In [None]:
m1[1,]
m1[,1]

Similar to vectors, each row and column index can be a vector, so if we want the first 3 rows and the first three columns, then we can do that easily:

In [None]:
m1[1:3, 1:3]

And we can also use vectors as indices when leaving the other dimension blank, to indicate that we want all of the given rows (or columns):

In [None]:
m1[1:3,]      ## first three rows
m1[,9:10]     ## last two columns

#### Exercise

Index m1 to extract the bottom right square (last 5 rows and last 5 columns).

Next, index m1 to extract all the even rows.


In [None]:
## Place exercise code here.

#### Some Elementary Matrix Functions

Similar to vectors, we can do element-wise arithmetic:

In [None]:
m2 <- matrix(1:100,nrow=10,ncol=10, byrow = TRUE)

m1^2         # square each element
m1 + m2      # add 2 matrices elementwise
m1 * m2      # multiply 2 matrices elementwise
m3 <- m1 > m2      # logical element-wise comparison
m3

#### There are also matrix operations

A number of useful operations are defined for matrices.

In [None]:
t(m1)              # matrix transpose
m1 %*% m2          # matrix multiplication
diag(m1)           # extract the diagonal
diag(4)            # create an identity matrix
lower.tri(m1)      # which values are in the lower triangle?
m1[lower.tri(m1)]  # extract those values - note conversion to a vector
outer(1:10,1:10)   # outer product
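Because `*` and `%*%` are easy to confuse, a 2 x 2 sketch may help. Multiplying by an identity matrix makes the difference visible: the element-wise product keeps only the diagonal, while the matrix product returns the original values. The names `a` and `i2` are made up for this example.

```r
a  <- matrix(1:4, nrow = 2)   # columns are (1, 2) and (3, 4)
i2 <- diag(2)                 # 2 x 2 identity matrix

a * i2      # element-wise: off-diagonal entries are multiplied by 0
a %*% i2    # matrix product: multiplying by the identity gives back a's values
```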

#### And you can create matrices of different types

Just like atomic vectors, matrices can be of different types.

In [None]:
c3 <- subset(letters,letters!="z")
c3
m3
typeof(m3); class(m3)
m4 <- matrix(c3,nrow=5)		# matrix of characters
m4; typeof(m4); class(m4)

### Arrays   

Multidimensional arrays are simply the N-dimensional extension to matrices, and they are also basic data types. In the most generic case, we create an array data structure with the `array()` function. When defining a matrix, we specified the number of columns with `ncol` and the number of rows with `nrow`. With arrays, we do not know the dimensionality of the resultant array without it being specified, so we use a single argument `dim` to specify the length of each dimension. The resulting dimensionality of the object is the same as the length of the `dim` argument used in its creation.

In [None]:
A <- array(1:24, dim = c(3,4,2))
A
print(A)
dim(A)
B <- array(A, dim = c(2,3,4))   # reshape A
B
print(B)
dim(B)
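Reshaping works because an array is stored as one long vector, filled with the first index varying fastest. The sketch below (using copies named `A_demo` and `B_demo`, which are illustrative) computes the position of one element by hand and checks that reshaping leaves the underlying order untouched.

```r
A_demo <- array(1:24, dim = c(3, 4, 2))

# Position of element [i, j, k] in the underlying vector of a 3 x 4 x 2 array
i <- 2; j <- 3; k <- 2
pos <- i + (j - 1) * 3 + (k - 1) * 3 * 4
pos                                   # 20
A_demo[i, j, k] == A_demo[pos]        # TRUE

# Reshaping changes dim(), not the order of the stored values
B_demo <- array(A_demo, dim = c(2, 3, 4))
identical(as.vector(A_demo), as.vector(B_demo))   # TRUE
```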

#### Indexing Arrays

Indexing arrays works very similarly to indexing matrices. The difference is that you need to maintain a position for every dimension available in the array. For example, to extract a single element from a three-dimensional array, you must specify three positions:

In [None]:
A[1,1,1] ## first item
A[3,4,2] ## last item

If we want to extract all rows and columns for the first slice, then we leave the first two position arguments empty, and provide an index to extract the particular "slice".

In [None]:
A[,,1]
A[,,2]

One common use of arrays is to represent two-dimensional space across time (the third dimension). If we wanted to extract the time series for a particular location, we would access that particular position, but leave the third dimension empty:

In [None]:
A2 <- array(rnorm(40), dim = c(2,2,10))
A2[1,1,]
plot(A2[1,1,], type = 'b')

### Lists

Compared to the data structures that we have seen thus far, lists are much more flexible. In fact, they are the most flexible data structure available in R. The primary difference between lists and other data structures is that an element within a list can be anything, and two elements within a list can be of different types!

As a simple example, we can create a list with two elements - one element is a numeric vector, and the second element is a character vector:

In [None]:
lst0 <- list(v, c1)
lst0

Lists are not limited to this, though - an element within a list can be **anything** that R can represent. For example, an element can be a matrix:

In [None]:
lst <- list(v, c1, m1)
lst

It can be an array:

In [None]:
lst <- list(v, c1, m1, A)
lst

It can even be another list!

In [None]:
lst <- list(v,v1,v2,c1,list(m1,m2),c3,m3, A,B)
lst

With this flexibility, we can still use the same types of functions to describe the list:

In [None]:
typeof(lst)
length(lst)

Note here that `length()` describes the number of elements within the list, not the total number of values. There are a few ways to count the "leaves" within the list, one of which is to flatten the list with `unlist()`, and use length on that:

In [None]:
length(unlist(lst))

#### Accessing an element within a list

Accessing an element within a list can be a little confusing. The way to extract a single element from a list is to use the **double-square** bracket:

In [None]:
lst[[6]]     ## extract the sixth element of a list

Because that element is a vector, if we want to subset and identify the first 5 values (rather than the entire vector), we can continue indexing that as though `lst[[6]]` were the name of the object:

In [None]:
lst[[6]][1:5]
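The source of the confusion is usually the difference between single and double square brackets. As a sketch on a small made-up list (`demo` is hypothetical), the single bracket returns a shorter list, while the double bracket returns the element itself:

```r
demo <- list(3.14, c("a", "b", "c"))   # hypothetical two-element list

one  <- demo[2]      # single bracket: a list of length 1
elem <- demo[[2]]    # double bracket: the character vector itself

typeof(one)          # "list"
typeof(elem)         # "character"
elem[1:2]            # further indexing works on the extracted vector
```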

#### Exercise

We just demonstrated how to extract some values from a vector element in a list.


Element 7 is a matrix - can you extract the first 3 rows of that matrix?

Element 8 is an array - can you extract the first "slice" of rows and columns (i.e. all rows and columns where the third dimension index is 1).

Element 5 in `lst` is a list - can you extract the second element of *that* list? 

To build on the last question - Element 5 is actually a list of matrices - can you extract the first two columns of the first matrix in that sub-list?

In [None]:
## Place your exercise code here


### Data Frames

Data frames are the internal representation of datasets within R. They mirror the ways that statisticians think about data: they are a table in which each row is an observation and each column a variable. Specifically, data frames are tabular lists, where each element within the list is a variable, but all elements are constrained to have the same length (effectively the number of rows).

There are a number of data frames that are available within R that we can see with the `data()` function:

In [None]:
data()

Each one of these represents a data frame that is registered with R. We can get help on them by typing `?dataset_name`, just like we can get help on other objects in R. And we can see the dataset by typing `dataset_name`. The typical help entry is a data dictionary that defines the variables and units in that dataset.

For example, examine the `mtcars` dataset

In [None]:
?mtcars
mtcars

We can treat it as a table, and subset rows or columns (or both):

In [None]:
mtcars[1:3,]      ## the first three rows, all columns
mtcars[,'mpg']    ## the columns are named, so we can use a character vector to index!

Or, we can treat it as a list and extract the elements by accessing particular variables:

In [None]:
mtcars[[1]]       ## extract the first element!

Because the columns in a data frame are named, we can also use the `$` operator to extract variables

In [None]:
mtcars$mpg

We can also determine whether an object is a data frame by extracting the class of the object, or do more explicit testing with the `is.data.frame()` function. While conversion functions are typically called `as.XXX`, testing functions are typically named `is.XXX`.

In [None]:
class(mtcars)
typeof(mtcars)
is.data.frame(mtcars)

As mentioned above, note that the "type" of mtcars is actually a list. It is a special type of list that has the constraint that each element has to have the same length as the others, but it has the benefit that there are a number of operations defined for it. The fact that a data frame is a list is what enables different variables to be different types.
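The list nature of a data frame can be checked directly. The short sketch below uses only base R functions on `mtcars`; `sapply()` applies a function to every element of a list and is used here simply to collect the column lengths.

```r
is.list(mtcars)                     # TRUE
length(mtcars) == ncol(mtcars)      # TRUE: one list element per column

lens <- sapply(mtcars, length)      # length of each column
all(lens == nrow(mtcars))           # TRUE: every column has one value per row
```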

While the datasets surfaced through `data()` are useful, typically we want to bring (or make) our own. Typically, we would read in datasets from an external file using a function such as `read.csv()` (see more details in [this notebook](2-Reading_csv_files.ipynb)). However, if we want to construct a data frame by hand, then we can do so using the `data.frame()` function. 

Do you notice a pattern here? We create a matrix using the `matrix()` function, an array using the `array()` function, a list with the `list()` function, and a data frame with the `data.frame()` function. A common practice is to give a constructor function the same name as the class of object it creates.

Here we build a data frame "by hand" using the `data.frame()` function. In this example, we will build a toy data frame to simulate heights of men and women based on the aggregate statistics published by the CDC [here](http://www.cdc.gov/nchs/data/series/sr_11/sr11_252.pdf), specifically tables 9 and 11 (pgs 13 and 15).


In [None]:
# simulate 10 of each gender (20 rows in total)
number_each <- 10
# assume a normal distribution and use aggregate statistics for all females and males who have age >= 20
# mean = mean height in cm; sd = std. error of mean * sqrt(N)
female_height <- rnorm(number_each, mean = 162.1, sd = 0.14*sqrt(5971)) ## mean, s.e. and N from table 9
male_height <- rnorm(number_each, mean = 175.9, sd = 0.20*sqrt(5647))   ## mean, s.e. and N from table 11
height <- c(female_height, male_height)
gender <- rep(c("f", "m"), each = number_each)                          ## rep is useful to "repeat" vectors in systematic ways
#
dF <- data.frame(gender, height)
dF

#### Operating on a data frame

We can operate on these data frames in similar ways as before. We can get information about the data frame in a variety of ways:

In [None]:
head(dF) 
dim(dF)
typeof(dF)
class(dF)

What happens when we want to process each variable? Well, the easiest way is to use a "higher-level" function that can implicitly loop across the variables. The `apply` family of functions is good for this, and in particular, `lapply()` will allow us to apply a particular function to every element within a list. In this case, I can use lapply to get the `type` of each variable:

In [None]:
lapply(dF, typeof)

Notice that `gender` is now of type "integer"! How and why did the `data.frame()` function turn a vector of "f" and "m" values into integers? Further, if I just look at dF, it still looks like a set of "f" and "m" values...

What happened is that, in versions of R before 4.0.0, the `data.frame()` function by default turns character vectors into objects of class "factor". (From R 4.0.0 onward the default is `stringsAsFactors = FALSE`, so on a recent installation `gender` will already be of type "character" here.) Factors are the R representation of categorical (i.e. nominal) variables. The older default exists because in many situations (particularly model development and visualization), factors are treated specially. However, in many cases, we do not want all of our character vectors turned into factors.

If we do not want this to happen, we can either set `stringsAsFactors = FALSE` in the call to `data.frame()` (see `?data.frame`), or set it in the global options via `options(stringsAsFactors = FALSE)`.


In [None]:
## confirm that gender is in fact a factor:
sapply(dF,class)

# set it false in this particular case:
dF2 <- data.frame(gender, height, stringsAsFactors = FALSE)
sapply(dF2,class)

# set it as a global option:
options(stringsAsFactors = FALSE)
dF3 <- data.frame(gender, height)  
## I don't set stringsAsFactors = FALSE here, but it reads from the global options, so it is still a character variable!
sapply(dF3,class)
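To see why a factor reports the type "integer", it can help to build a small one directly with `factor()`. The vector `g` below is a made-up example, not part of the simulated data: the labels are stored once as levels, and each value is an integer code pointing at a level.

```r
g <- factor(c("f", "m", "m", "f"))

levels(g)         # "f" "m"   : the distinct labels, sorted
as.integer(g)     # 1 2 2 1   : the underlying integer codes
typeof(g)         # "integer"
class(g)          # "factor"
as.character(g)   # back to the original strings
```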

## Functions

"A function is a group of instructions that takes inputs, uses them to compute other values, and returns a result." - [Norm Matloff, The Art of R Programming](https://www.safaribooksonline.com/library/view/the-art-of/9781593273842/ch01s03.html)

When we are going to do a set of steps multiple times, it can be very useful to write the specific set of steps in a function, so that we can call that single function repeatedly, rather than re-writing the same steps over and over.

### Writing Functions  

Let's start with a simple example. Let's say that we wanted to estimate the mean of a vector. The mean of a vector is just:

$$ \frac{\sum_{i=1}^{N} X_i}{N} $$

In other words, it is simply the sum of the elements in `X` divided by the number of elements in `X`.

If we want to create a function in R, we use the constructor, which is called `function()`.

In [None]:
mymean <- function(x){
    m <- sum(x)/length(x)
    return(m)
}

This creates an object `mymean` that takes one argument (`x`) as input, and then returns the mean as the output.

Once that code is run, then we can see that object in our workspace:

In [None]:
ls(pattern = "mean")

And we can then use that object just like any other function that we have seen before:

In [None]:
mymean(female_height)
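A quick sanity check for a hand-written function is to compare it with the built-in equivalent on a small input whose answer is obvious. The sketch below repeats the definition of `mymean` so that it runs on its own; the vector `x` is made up.

```r
mymean <- function(x){
    m <- sum(x)/length(x)
    return(m)
}

x <- c(2, 4, 6, 8)
mymean(x)                                # 5
isTRUE(all.equal(mymean(x), mean(x)))    # TRUE: agrees with the built-in mean()
```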

If we wanted to go further, we may not actually want all of the extra precision, so we might want to round the mean before reporting it.

Let's try again and improve the formatting

In [None]:
mymean2 <- function(x){
  m <- round(sum(x)/length(x),2)
  return(m)
}

In [None]:
ls(pattern = "mean")

In [None]:
mymean2(female_height)

Rounding is good for viewing, but may not be good for computation. It may be a good idea to allow the user to control the degree of rounding.

R has a special argument that is designed for argument pass-through. This special argument is "...". In this context, "..." stands for an unspecified number of arguments that get passed through to another function that is called in the body of the function you are defining.


In [None]:
jmean3 <- function(x,...){
  m <- round(sum(x)/length(x),...)
  return(m)
}


In this case, any arguments that are not in the named arguments of the function (in this case, any argument other than `x`), will be passed through as additional arguments to `round()`.

In [None]:
jmean3(female_height,3) # 3 digits
jmean3(female_height)   # nothing is passed through, so round() uses its default of 0 digits
jmean3(female_height,1) # 1 digit
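To look at what "..." actually carries, one option is to capture it with `list(...)` inside the function. The two helpers below (`count_extras` and `round_mean`) are hypothetical names written for this sketch; they are not part of the tutorial.

```r
# Count how many extra arguments arrived through "..."
count_extras <- function(x, ...){
  extras <- list(...)      # capture the pass-through arguments as a list
  length(extras)
}
count_extras(1)                  # 0
count_extras(1, 2, digits = 3)   # 2

# Pass-through arguments can be supplied by name as well as by position
round_mean <- function(x, ...){
  round(mean(x), ...)
}
round_mean(c(1, 2, 4))               # 2
round_mean(c(1, 2, 4), digits = 1)   # 2.3
```

Passing `digits = 1` by name is usually clearer than relying on position.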

## Functions calling Functions  

How about giving the user a choice about which rounding function to use?  We have already seen functions that can call other functions (e.g. `sapply()` and `lapply()`). What if we wanted to define one?

Look at the difference between round() and signif().

In [None]:
pi
round(pi,4)
signif(pi,4)

If we wanted to write a function that takes a function as an argument, we can simply specify an argument, and then use that argument exactly as though it were a function:

In [None]:
jmean4 <- function(x,FUN,...){
  m <- FUN(sum(x)/length(x),...)
  return(m)
}

jmean4(female_height,round,4)
jmean4(female_height,signif,4)
jmean4(female_height, `(`)        ## a way of not actually rounding at all!
       

In the last example, note the backticks that allow R to see the parenthesis as a function handle, rather than an opening of a set of parentheses.

## Functions with Defaults

One last try:

- make round the default method of rounding
- make the default number of decimal digits 2

In [None]:
jmean5 <- function(x,FUN = round,digits=2){   # defaults: round to 2 decimal digits
 m <- FUN(sum(x)/length(x),digits)
 return(m)
}
jmean5(female_height)
jmean5(female_height,FUN=signif)
jmean5(female_height,FUN=signif,digits=4)
jmean5(female_height, digits = 0)      ## naming the argument lets us change digits while FUN keeps its default

#### Missing Values

`NA` is the way to designate a missing value.

In [None]:
z <- c(1:3,NA)
z

However, some care must be shown when dealing with missing values.

We might normally think that we could use an equality comparison "==" to check for missing-ness. However, the logic functions in R are three-valued, and a missing value compared to anything is also a missing value:

In [None]:
z == NA 
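The three-valued logic can be seen on single values. A comparison or logical operation returns `NA` whenever the answer genuinely depends on the unknown value, and a definite `TRUE` or `FALSE` when it does not:

```r
NA == 1       # NA    : we cannot know whether the missing value equals 1
NA & TRUE     # NA    : the result depends on the missing value
NA & FALSE    # FALSE : FALSE and anything is FALSE
NA | TRUE     # TRUE  : TRUE or anything is TRUE
```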


The proper way to test for a missing value is to use an "is" testing function. Specifically, we can use `is.na()` to test for a missing value:

In [None]:
zm <- is.na(z)
zm
z[!zm]


In addition to representing a missing value, there is a different symbol for representing "not a number". That symbol is `NaN`. 

For testing, `is.na()` will return TRUE for both `NA` and `NaN` symbols. If we want to find only `NaN` symbols, we can use `is.nan()`.

In [None]:
z1 <- c(z,0/0)				# NaN: not a number
z1
is.na(z1)					    # Finds NAs and NaNs
#
is.nan(z1)            # Only finds NaNs

### Removing NAs From A Data Frame

As we saw a few cells above, we can index a vector with the result of `!is.na()` to return only values that are not missing.

For data frames, this can be a bit more complicated, because different variables may have different rows missing. In order to simplify the process of comparing across all variables, there is a handy utility function that facilitates this: `na.omit()`

`na.omit()` will remove `NA`s from a data frame. However, keep in mind that applying `na.omit()` to a data frame will remove the **entire row** if it finds an NA in any of the columns.

In [None]:
## insert some missing values:
dF[c(3, 12),1] <- NA
dF[c(7, 18),2] <- NA
dF
dF_no_NA <- na.omit(dF)	
dF_no_NA	

Notice now that four rows are missing (the rows we inserted `NA` into).
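If you want to see which rows would be dropped before removing them, base R also provides `complete.cases()`, which returns one logical value per row. The small data frame `d` below is a made-up example used only for this sketch.

```r
# A hypothetical data frame with one NA in each column
d <- data.frame(g = c("f", NA, "m", "m"),
                h = c(160, 170, NA, 180))

keep <- complete.cases(d)   # TRUE for rows that contain no NA
keep                        # TRUE FALSE FALSE TRUE
d[keep, ]                   # the same rows that na.omit(d) keeps
nrow(na.omit(d))            # 2
```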

### How to make a function return multiple values

Earlier, we created a function that returned a length 1 vector. There is nothing particularly special about returning length 1 vectors, so if we would like to return multiple values, the easiest way is to return a data structure with multiple values. For example, we could create a function that returns a vector or a list. Specifically, in this case, we may want to create our own summary function that will provide the mean, variance, and standard deviation of all numeric variables in a data frame.

A simple function that returns multiple values:

In [None]:
mvsd <- function(x){
  numericvars <- sapply(x, is.numeric) # returns TRUE if a variable is numeric
  m <- sapply(x[numericvars],mean)   ## get the means of each item in a list
  v <- sapply(x[numericvars],var)
  s <- sapply(x[numericvars],sd)
  res <- list(mean = m, variance = v,sd = s)
  return(res)
}

In [None]:
mvsd(dF_no_NA)
mvsd(mtcars)
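Because the result is a named list of named vectors, individual pieces can be pulled out with the list and vector indexing shown earlier. The sketch below repeats the definition of `mvsd` so that it runs on its own, and stores the result in a variable called `res` (a name chosen for this example).

```r
mvsd <- function(x){
  numericvars <- sapply(x, is.numeric)   # TRUE for each numeric variable
  m <- sapply(x[numericvars], mean)
  v <- sapply(x[numericvars], var)
  s <- sapply(x[numericvars], sd)
  res <- list(mean = m, variance = v, sd = s)
  return(res)
}

res <- mvsd(mtcars)
names(res)                      # "mean" "variance" "sd"
res$mean["mpg"]                 # the mean of one variable
res[["sd"]][c("mpg", "wt")]     # standard deviations of two variables
```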