# A Rapid and Informal Introduction to Python for Data Science

# - Numpy Fundamentals

#### Developed by:  Brian Vegetabile, PhD Candidate, University of California, Irvine

This notebook is a supplement to the workshop "A Rapid and Informal Introduction to Python for Data Science"

# Getting Started

Import `numpy` as `np` to get started
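
The import cell itself does not appear in this text. The conventional form, which the rest of this notebook assumes, would be:

```python
# Bring numpy in under the conventional short alias `np`
import numpy as np
```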

As mentioned earlier, any package should be imported under a short identifier, such as `np` for `numpy`.

# Comparing Python Lists with Numpy Arrays

Let's start to dissect why we actually want to use numpy arrays in the first place.  Consider the following list of numbers:
```python
vect_a = [ 0.40596906,  1.03987797, -0.74112064, -1.81293637,  0.12438781,
           0.97333303, -1.56900792, -0.41787639, -0.15112056, -0.46346588]
```

We can use the `dir()` command to see that we can append to this list, count occurrences of a value in it, extend it, insert into it, pop values from it, remove values from it, and sort or reverse it.  Additionally, if we want to know how many items are in the list, we can use the `len` function.
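
As a rough sketch of that inspection, the snippet below filters the output of `dir()` down to the public method names. The variable name `list_methods` is only illustrative, and the exact set of names can differ slightly between Python versions.

```python
vect_a = [ 0.40596906,  1.03987797, -0.74112064, -1.81293637,  0.12438781,
           0.97333303, -1.56900792, -0.41787639, -0.15112056, -0.46346588]

# dir() lists every attribute of the object; drop the underscore-prefixed ones
list_methods = [name for name in dir(vect_a) if not name.startswith("_")]
print(list_methods)

# len() reports how many items the list holds
print(len(vect_a))  # 10
```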

These don't seem like useful commands for performing, say, matrix algebra, though.  Additionally, if we embed a list inside another list, we get the same operations...

```python
mat_a = [[0.40596906,  1.03987797, -0.74112064, -1.81293637,  0.12438781],
         [0.97333303, -1.56900792, -0.41787639, -0.15112056, -0.46346588]]
```

```
mat_a.append   mat_a.extend   mat_a.insert   mat_a.remove   mat_a.sort     
mat_a.count    mat_a.index    mat_a.pop      mat_a.reverse  
```

Calling `len` on this nested list returns 2 (the number of inner lists), which is clearly not what we would hope for from a matrix holding two rows of five values.
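
To make this concrete, here is a small sketch of what `len` sees on the nested list. The names `n_rows` and `n_cols` are just illustrative.

```python
mat_a = [[0.40596906,  1.03987797, -0.74112064, -1.81293637,  0.12438781],
         [0.97333303, -1.56900792, -0.41787639, -0.15112056, -0.46346588]]

n_rows = len(mat_a)     # counts the inner lists, not the individual numbers
n_cols = len(mat_a[0])  # we have to index into a row to count its items
print(n_rows, n_cols)   # 2 5
```

A plain list has no notion of being two-dimensional, so each dimension has to be queried by hand.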

#### Converting to Numpy Arrays

Let's convert these to numpy arrays and see the list of things that are available to us. 

```python
vect_b = np.array(vect_a)  # 1-D array built from the flat list
mat_b = np.array(mat_a)    # 2-D array built from the nested list
```

Typing `mat_b.<tab>` shows us a much larger list of things that we can do with a `numpy` array.

```
mat_b.T             mat_b.copy          mat_b.imag          mat_b.ravel         mat_b.sum
mat_b.all           mat_b.ctypes        mat_b.item          mat_b.real          mat_b.swapaxes
mat_b.any           mat_b.cumprod       mat_b.itemset       mat_b.repeat        mat_b.take
mat_b.argmax        mat_b.cumsum        mat_b.itemsize      mat_b.reshape       mat_b.tobytes
mat_b.argmin        mat_b.data          mat_b.max           mat_b.resize        mat_b.tofile
mat_b.argpartition  mat_b.diagonal      mat_b.mean          mat_b.round         mat_b.tolist
mat_b.argsort       mat_b.dot           mat_b.min           mat_b.searchsorted  mat_b.tostring
mat_b.astype        mat_b.dtype         mat_b.nbytes        mat_b.setfield      mat_b.trace
mat_b.base          mat_b.dump          mat_b.ndim          mat_b.setflags      mat_b.transpose
mat_b.byteswap      mat_b.dumps         mat_b.newbyteorder  mat_b.shape         mat_b.var
mat_b.choose        mat_b.fill          mat_b.nonzero       mat_b.size          mat_b.view
mat_b.clip          mat_b.flags         mat_b.partition     mat_b.sort          
mat_b.compress      mat_b.flat          mat_b.prod          mat_b.squeeze       
mat_b.conj          mat_b.flatten       mat_b.ptp           mat_b.std           
mat_b.conjugate     mat_b.getfield      mat_b.put           mat_b.strides       
```

Operations like `cumsum` and `size` are more in line with what we will want to do with matrix operations for data science.
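
A few of these attributes can be tried out as in the sketch below, which rebuilds `mat_b` from the nested list so that it runs on its own. The name `running_total` is only illustrative.

```python
import numpy as np

mat_a = [[0.40596906,  1.03987797, -0.74112064, -1.81293637,  0.12438781],
         [0.97333303, -1.56900792, -0.41787639, -0.15112056, -0.46346588]]
mat_b = np.array(mat_a)

print(mat_b.shape)  # (2, 5): rows and columns
print(mat_b.size)   # 10: total number of elements
print(mat_b.ndim)   # 2: number of dimensions

# With no axis given, cumsum runs over the flattened array
running_total = mat_b.cumsum()
print(running_total)
```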

For the object `mat_b`, form the row sums and column sums.  Now attempt to write a small loop that does the same for `mat_a`, and compare the amount of code that you used.
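
If you get stuck, one possible approach is sketched below. It is only one of several ways to do this, and the variable names are illustrative.

```python
import numpy as np

mat_a = [[0.40596906,  1.03987797, -0.74112064, -1.81293637,  0.12438781],
         [0.97333303, -1.56900792, -0.41787639, -0.15112056, -0.46346588]]
mat_b = np.array(mat_a)

# numpy: one call each; axis=1 sums across each row, axis=0 down each column
row_sums_np = mat_b.sum(axis=1)
col_sums_np = mat_b.sum(axis=0)

# plain lists: row sums are manageable, column sums need an explicit loop
row_sums_list = [sum(row) for row in mat_a]
col_sums_list = [0.0] * len(mat_a[0])
for row in mat_a:
    for j, value in enumerate(row):
        col_sums_list[j] += value

print(row_sums_np, row_sums_list)
print(col_sums_np, col_sums_list)
```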

# Linear Algebra with Numpy

Another great feature of `numpy` is that it is incredibly useful for performing matrix operations.  

Consider the following two matrices and vector.

```python
mat_A = np.array([[1, 4],
                  [3, 5]])
mat_B = np.array([[12, 14, 34],
                  [1, 3, 5]])
vect_x = np.array([1,2])
```

Copy and paste the code above into a cell below to run it.

## Matrix Multiplication

There is a slight oddity to matrix multiplication in Python, but once you get used to the syntax it becomes very easy.  The `numpy` function for matrix multiplication is called `dot`.

Therefore, to multiply two matrices, use

```python
np.dot(A, B)
```

where `A` and `B` are of appropriate dimensions.  There is also an equivalent notation that is often easier to use.

```python
A.dot(B)
```

This works because the array `A` has a method `dot`, which can be applied to the matrix `B` if it is of appropriate dimension.  Chaining commands together in this way can be very useful.

Note that attempting matrix multiplication with the `*` operator, as in the following,

```python
A * B
```
will instead attempt to perform element-wise multiplication.
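
The difference can be seen on the matrices `mat_A` and `mat_B` defined earlier, which are redefined here so the sketch runs on its own. The other variable names are illustrative.

```python
import numpy as np

mat_A = np.array([[1, 4],
                  [3, 5]])
mat_B = np.array([[12, 14, 34],
                  [1, 3, 5]])

# Matrix product: (2 x 2) times (2 x 3) gives a (2 x 3) result
product_func = np.dot(mat_A, mat_B)
product_method = mat_A.dot(mat_B)   # same result, method notation
print(product_func)                 # values: [[16, 26, 54], [41, 57, 127]]

# Element-wise product: each entry is multiplied by its counterpart
elementwise = mat_A * mat_A
print(elementwise)                  # values: [[1, 16], [9, 25]]

# The shapes (2, 2) and (2, 3) do not line up element by element,
# so the * operator raises a ValueError here
try:
    mat_A * mat_B
    shapes_compatible = True
except ValueError as err:
    shapes_compatible = False
    print("element-wise product failed:", err)
```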

##### Small exercise.  

Play with the matrix multiplication commands.  The multiplication $AB$ should work, while $BA$ should give an error.  Attempt to multiply the vector $x$ by the matrix $A$ with both matrix and element-wise multiplication.


## Numpy.linalg Functions

Numpy additionally has a bunch of linear algebra functions that are accessible by typing `np.linalg.<tab>`

```
np.linalg.LinAlgError      np.linalg.eig              np.linalg.lstsq            np.linalg.slogdet
np.linalg.Tester           np.linalg.eigh             np.linalg.matrix_power     np.linalg.solve
np.linalg.absolute_import  np.linalg.eigvals          np.linalg.matrix_rank      np.linalg.svd
np.linalg.bench            np.linalg.eigvalsh         np.linalg.multi_dot        np.linalg.tensorinv
np.linalg.cholesky         np.linalg.info             np.linalg.norm             np.linalg.tensorsolve
np.linalg.cond             np.linalg.inv              np.linalg.pinv             np.linalg.test
np.linalg.det              np.linalg.lapack_lite      np.linalg.print_function   
np.linalg.division         np.linalg.linalg           np.linalg.qr                         
```

Often you will find yourself pounding your fist on the desk after forgetting to type `linalg` after `np` when trying to invert a matrix.
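
As a minimal sketch of that pitfall, the snippet below inverts the matrix `mat_A` from earlier (redefined here so the code runs on its own) and checks the result. The other variable names are illustrative.

```python
import numpy as np

mat_A = np.array([[1, 4],
                  [3, 5]])

# The inverse lives in the linalg submodule, not at the top level of numpy
mat_A_inv = np.linalg.inv(mat_A)

# Multiplying a matrix by its inverse should give the identity (up to rounding)
identity_check = mat_A.dot(mat_A_inv)
print(np.allclose(identity_check, np.eye(2)))  # True

# The determinant is 1*5 - 4*3 = -7, so the matrix is invertible
det_A = np.linalg.det(mat_A)
print(det_A)

# There is no top-level np.inv, which is the source of the desk pounding
has_top_level_inv = hasattr(np, "inv")
print(has_top_level_inv)  # False
```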


# Random Numbers in Numpy