# Matrices and related operations in R
---
Today we will talk about matrices and their related operations in R. Matrices are essential in statistics. This tutorial assumes you have some linear algebra background, so we will not spend much time on the theory, but we will cover some basic ideas.

So, what is a matrix?

> A matrix is a two-dimensional array where each element has the same mode (numeric, character, or logical). Matrices are created with the matrix function. -*R in Action*

A simple matrix will help us get a better understanding.
$$
\begin{bmatrix}1 & 2 & 3 \\ 4 & 5 & 6\end{bmatrix}
$$
This matrix has two rows and three columns, which means its dimensions (some people call this its shape) are $2\times3$, also written $(2, 3)$.

Each element in a matrix has a unique location. Take `2` as an example: it is in the first row and second column, so its location is $(1, 2)$. For `6`, the location is $(2, 3)$. The row always comes first.

Now we can start building matrices in R. Here are two examples.

In [1]:
# Build matrix directly
mymatrix_1 <- matrix(data = 1:20, nrow = 4, ncol = 5, byrow = FALSE, dimnames = NULL)

cell <- c(1:20)
rnames <- c('r1', 'r2', 'r3', 'r4', 'r5')
cnames <- c('c1', 'c2', 'c3', 'c4')

mymatrix_2 <- matrix(data = cell, nrow = 5, byrow = TRUE, dimnames = list(rnames, cnames))

***Tips***: use `help(matrix)` to get more details about the `matrix` function.

I built two matrices. For `mymatrix_1`, I set both `nrow` and `ncol`. For `mymatrix_2`, I only set `nrow`; if we define just one of `nrow` or `ncol`, R calculates the other from the length of the data.

```R
> mymatrix_1
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    5    9   13   17
[2,]    2    6   10   14   18
[3,]    3    7   11   15   19
[4,]    4    8   12   16   20

> mymatrix_2
   c1 c2 c3 c4
r1  1  2  3  4
r2  5  6  7  8
r3  9 10 11 12
r4 13 14 15 16
r5 17 18 19 20
```

I did not assign row and column names to `mymatrix_1`, but you can see that R prints the row and column indices for us, so we can quickly tell the location of any element in the matrix. For `mymatrix_2`, the row and column names replace the indices.
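
As a quick check of how R fills in a missing dimension, here is a minimal sketch. The variable names (`a`, `b`, `by_col`, `by_row`) are only for illustration and are not used elsewhere in this tutorial. It also shows how `byrow` changes the order in which the data is placed.

```R
# Only nrow is given: R infers ncol = 20 / 5 = 4
a <- matrix(1:20, nrow = 5)
dim(a)        # 5 4

# Only ncol is given: R infers nrow = 20 / 5 = 4
b <- matrix(1:20, ncol = 5)
dim(b)        # 4 5

# byrow controls the fill order
by_col <- matrix(1:6, nrow = 2)                # filled down the columns (default)
by_row <- matrix(1:6, nrow = 2, byrow = TRUE)  # filled across the rows
by_col[1, ]   # 1 3 5
by_row[1, ]   # 1 2 3
```

The default is column-wise filling, which is why `mymatrix_1` counts down each column while `mymatrix_2` (built with `byrow = TRUE`) counts across each row.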

Sometimes we are only interested in some of the elements. How can we select them? We can identify any row, column, or element using subscripts and brackets. `x[i, ]` refers to the $i$th row of matrix `x`, and `x[, j]` refers to the $j$th column. To select multiple rows or columns, use `x[a:b, ]` or `x[, c:d]`. You can also combine them: `x[a:b, c:d]` selects the elements in rows `a` to `b` and columns `c` to `d`. Let's see some examples.

```R
> mymatrix_1[1, ]
[1]  1  5  9 13 17

> mymatrix_1[1: 3, ]
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    5    9   13   17
[2,]    2    6   10   14   18
[3,]    3    7   11   15   19

> mymatrix_1[, 2]
[1] 5 6 7 8

> mymatrix_1[, 2: 3]
     [,1] [,2]
[1,]    5    9
[2,]    6   10
[3,]    7   11
[4,]    8   12

> mymatrix_1[1:2, 2:3]
     [,1] [,2]
[1,]    5    9
[2,]    6   10
```
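
The examples above select whole rows, columns, or blocks. A single element is selected the same way, by giving both subscripts. The sketch below rebuilds the small $2\times3$ matrix from the start of this tutorial (the name `x` is just a placeholder) and looks up the two locations discussed there.

```R
# The 2 x 3 matrix from the start of this tutorial, filled row by row
x <- matrix(1:6, nrow = 2, byrow = TRUE)

x[1, 2]          # 2: row 1, column 2
x[2, 3]          # 6: row 2, column 3
x[2, c(1, 3)]    # 4 6: row 2, columns 1 and 3 (the columns need not be adjacent)
```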

For `mymatrix_2`, we can also use row and column names to select elements.

In [2]:
mymatrix_2[c('r2'), c('c1')]
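
Name-based selection works for whole rows and for several names at once, too. The following sketch rebuilds `mymatrix_2` so that it runs on its own; the particular rows and columns picked are arbitrary.

```R
# Rebuild mymatrix_2 so this block is self-contained
cell   <- 1:20
rnames <- c('r1', 'r2', 'r3', 'r4', 'r5')
cnames <- c('c1', 'c2', 'c3', 'c4')
mymatrix_2 <- matrix(cell, nrow = 5, byrow = TRUE, dimnames = list(rnames, cnames))

mymatrix_2['r2', 'c1']                    # 5
mymatrix_2['r5', ]                        # the whole row r5: 17 18 19 20
mymatrix_2[c('r1', 'r3'), c('c2', 'c4')]  # a 2 x 2 sub-matrix that keeps its names
mymatrix_2[2, 'c1']                       # positions and names can be mixed: 5
```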

Any of the data types we talked about last time can be stored in a matrix, as long as every element of the matrix has the same type.
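
To illustrate, here is a small sketch with a character matrix and a logical matrix (the names `chr_m`, `lgl_m`, and `mixed` are made up for this example). It also shows what happens if we try to mix types: `c()` converts everything to one common type before the matrix is built.

```R
chr_m <- matrix(c('a', 'b', 'c', 'd'), nrow = 2)
lgl_m <- matrix(c(TRUE, FALSE, FALSE, TRUE), nrow = 2)
mode(chr_m)   # "character"
mode(lgl_m)   # "logical"

# Mixing types: every value is converted to character
mixed <- matrix(c(1, 'a', TRUE, 2), nrow = 2)
mode(mixed)   # "character"
mixed[1, 1]   # "1" (a string, no longer a number)
```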

So far, we have covered some fundamentals of matrices. Next, we discuss mathematical operations on matrices. This part only touches on linear algebra a little. To make things concrete, we define two matrices first.

In [3]:
m1 <- matrix(1:4, 2, 2, byrow = FALSE)
m2 <- matrix(5:8, 2, 2, byrow = FALSE)

```R
# Print m1 and m2
> m1
     [,1] [,2]
[1,]    1    3
[2,]    2    4

> m2
     [,1] [,2]
[1,]    5    7
[2,]    6    8
# Basic element-wise arithmetic on matrices
> m1 - m2
     [,1] [,2]
[1,]   -4   -4
[2,]   -4   -4
> m1 + m2
     [,1] [,2]
[1,]    6   10
[2,]    8   12
> m1 / m2
          [,1]      [,2]
[1,] 0.2000000 0.4285714
[2,] 0.3333333 0.5000000
# Element-wise multiplication
> m1 * m2
     [,1] [,2]
[1,]    5   21
[2,]   12   32
# Matrix multiplication
> m1 %*% m2
     [,1] [,2]
[1,]   23   31
[2,]   34   46
# Transpose
> t(m1)
     [,1] [,2]
[1,]    1    2
[2,]    3    4
# Inverse of a matrix (the matrix must be square and invertible)
> solve(m1)
     [,1] [,2]
[1,]   -2  1.5
[2,]    1 -0.5
```
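
One way to convince yourself that `solve()` really returns the inverse is to multiply the result back with the original matrix. The sketch below does that for `m1`. The second half uses a made-up singular matrix `s` (its second column is twice the first) to show that `solve()` fails when no inverse exists; the `tryCatch()` wrapper and the message string are my own additions for illustration.

```R
m1 <- matrix(1:4, 2, 2)
m1_inv <- solve(m1)

# A matrix times its inverse should give the identity matrix
m1 %*% m1_inv

# all.equal() compares with a small tolerance, which suits floating-point results
all.equal(m1 %*% m1_inv, diag(2))   # TRUE

# A singular matrix (determinant 0) has no inverse, so solve() raises an error
s <- matrix(c(1, 2, 2, 4), 2, 2)
det(s)                              # 0
result <- tryCatch(solve(s), error = function(e) "not invertible")
result                              # "not invertible"
```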

Here is a link with more details about [matrix operations](https://www.statmethods.net/advstats/matrix.html). If you do not understand this math, here is a tutorial from [Stanford CS229](http://cs229.stanford.edu/section/cs229-linalg.pdf); for now, you only need to read pages 2 to 8. Math may be a little boring, but you will need it when you want to explore more in the data science area.

One point I would like to stress: `m1 * m2` (element-wise multiplication) and `m1 %*% m2` (matrix multiplication, $m1\cdot m2$) are fundamentally different. Let's work through both by hand.
```R
> m1 * m2
     [,1] [,2]
[1,]    5   21
[2,]   12   32
```

Here is the calculation, entry by entry (the element-wise product is written $\odot$ here):

$$
\begin{bmatrix} 1 & 3 \\ 2 & 4 \end{bmatrix}\odot \begin{bmatrix} 5 & 7 \\ 6 & 8 \end{bmatrix}=\begin{bmatrix} 1\times 5 & 3\times 7 \\ 2\times 6 & 4\times 8 \end{bmatrix}=\begin{bmatrix} 5 & 21 \\ 12 & 32 \end{bmatrix}
$$


```R
> m1 %*% m2
     [,1] [,2]
[1,]   23   31
[2,]   34   46
```

Here is the calculation:
$$
\begin{bmatrix} 1 & 3 \\ 2 & 4 \end{bmatrix}\cdot \begin{bmatrix} 5 & 7 \\ 6 & 8 \end{bmatrix}=\begin{bmatrix} 1\times 5+3\times 6 & 1\times 7+3\times 8 \\ 2\times 5+4\times 6 & 2\times 7+4\times 8 \end{bmatrix}=\begin{bmatrix} 23 & 31 \\ 34 & 46 \end{bmatrix}
$$
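
The same hand calculation can be reproduced in R for a single entry. This is only a sketch to connect the formulas to code: each entry of the matrix product is a row of `m1` multiplied by a column of `m2` and then summed, while each entry of the element-wise product involves just the two matching entries.

```R
m1 <- matrix(1:4, 2, 2)
m2 <- matrix(5:8, 2, 2)

# Entry (1, 1) of the matrix product: row 1 of m1 times column 1 of m2, summed
sum(m1[1, ] * m2[, 1])     # 1*5 + 3*6 = 23
(m1 %*% m2)[1, 1]          # 23

# Entry (1, 1) of the element-wise product: only the two matching entries
m1[1, 1] * m2[1, 1]        # 1*5 = 5
(m1 * m2)[1, 1]            # 5

# Element-wise multiplication commutes; matrix multiplication generally does not
identical(m1 * m2, m2 * m1)         # TRUE
identical(m1 %*% m2, m2 %*% m1)     # FALSE
```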

**Attributes**:

We end today's tutorial with attributes. We start with some examples:

In [4]:
attributes(mymatrix_1)

In [5]:
attributes(mymatrix_2)

We can easily see that both have a `$dim` attribute, but only `mymatrix_2` has a `$dimnames` attribute. This is because we set row and column names for `mymatrix_2`.

Attributes can be accessed individually with `attr()` or all at once with `attributes()`. We can also use `attr()` to create a new attribute.

In [6]:
# Add a new attribute
attr(mymatrix_1, 'Author') <- 'Shawn'
attr(mymatrix_1, 'Author')

In [7]:
# Remove one attribute
attr(mymatrix_1, 'Author') <- NULL
attr(mymatrix_1, 'Author')

NULL

In [8]:
attributes(mymatrix_1)
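
The built-in attributes can be read with `attr()` in the same way as the custom `'Author'` attribute. Here is a minimal sketch on a small throwaway matrix `m` (not one of the matrices defined earlier), which also shows that assigning row and column names afterwards is what creates the `dimnames` attribute.

```R
m <- matrix(1:6, nrow = 2)
attr(m, 'dim')        # 2 3, the same as dim(m)
attr(m, 'dimnames')   # NULL: no names were given

# Adding names afterwards creates the dimnames attribute
dimnames(m) <- list(c('r1', 'r2'), c('c1', 'c2', 'c3'))
names(attributes(m))  # "dim" "dimnames"
m['r2', 'c3']         # 6
```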

We now know how to add and remove attributes. Three attributes are the most important:

* Names
* Dimensions
* Classes

We can use the `names()` function to set a name for each element in a matrix or array. The same function also lets us check the names attribute.


In [9]:
# Use names() set name for each element
names(mymatrix_1) <- letters[1:20]

In [10]:
# Use names() to check it
names(mymatrix_1)

In [11]:
# Use attributes() to call all attributes
attributes(mymatrix_1)

Once each element has a name, we can use the names to select elements.

In [12]:
# R will return the element you select and its name
mymatrix_1['a']

In [13]:
mymatrix_1[c('b', 'd')]

`dim()` tells us the dimensions of the matrix, and `class()` returns its class.

In [14]:
dim(mymatrix_1)

In [15]:
class(mymatrix_1)
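
The `dim` attribute is what makes a vector behave as a matrix. As a sketch of that idea (the vector `v` is just an example), assigning `dim` to a plain vector turns it into a matrix without calling `matrix()` at all.

```R
v <- 1:6
is.matrix(v)        # FALSE: a plain vector has no dim attribute
dim(v)              # NULL

# Assigning a dim attribute turns the vector into a 2 x 3 matrix
dim(v) <- c(2, 3)
is.matrix(v)        # TRUE
v[2, 3]             # 6 (the values are laid out column by column)
```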

OK, that is all for today's lecture. We will talk about arrays next time.

See you guys next time.