# NumPy

NumPy is a package for scientific computing with Python. It is best known for fast linear algebra calculations, but it can do a whole lot more. It contains:

- a powerful N-dimensional array object
- sophisticated (broadcasting) functions
- tools for integrating C/C++ and Fortran code
- useful linear algebra, Fourier transform, and random number capabilities


For our purposes, however, we'll be using it for its ability to work with multidimensional data. To learn more about what NumPy is capable of, check out the [NumPy website](https://numpy.org).

## Installation

There are a number of ways to install NumPy. I'd highly recommend downloading Anaconda, a scientific Python distribution that comes with the `conda` package manager. When downloading the Anaconda suite, you'll receive a number of packages including numpy, scipy, pandas, seaborn, and matplotlib, to name a few. [Follow this link to download Anaconda](https://www.anaconda.com). If you choose to go that route, you can view all of your packages by typing:

`conda list`

Otherwise, you need to use `pip`, the Python package installer. You can see all of its documentation at the [pip website](https://pip.pypa.io/en/stable/).

To see if you have pip already installed, in your command line, type: 

```pip --version```

If you already have pip installed, you ought to see something like: 

```pip 19.2.3 from /Users/matthewlane/anaconda3/lib/python3.6/site-packages/pip (python 3.6)```

If you do not have pip installed, follow the documentation at the [pip website](https://pip.pypa.io/en/stable/installing/). You can also quickly install pip by running these commands in the command line:

```
curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
python get-pip.py
```

(Note: It is also possible to install pip via package managers such as Homebrew or apt-get).


Once pip is installed, install NumPy into your environment via:

```pip install numpy```

To check if you've successfully installed NumPy, in your command line type: 

```pip list | grep numpy```

If you have successfully installed it, you ought to see:

```numpy      1.17.2```

(and possibly a few other packages containing numpy)
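
Because `pip` and `python` can point at different environments, it can also be worth confirming the installation from Python itself. The following is a minimal sketch of such a check; the version it prints depends on your installation.

```python
import numpy

# Print the version of NumPy that this interpreter actually imports.
print(numpy.__version__)
```

If the import fails with `ModuleNotFoundError`, NumPy was most likely installed into a different environment than the one running this interpreter.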


To use NumPy, you need to import it into your project. Typing `import numpy as np` aliases the numpy library as `np`, the conventional shorthand, so that you can access the library with less typing:


In [1]:
import numpy as np

In Python, we're able to create a list or a matrix (a list of lists) by typing:

In [2]:
someList = [1,2,3,4,5]
someMatrix = [[0, 1, 2, 3, 4, 5],
             [ 1, 0, 1, 2, 3, 4 ], 
             [ 2, 1, 0, 1, 2, 3 ], 
             [ 3, 2, 1, 0, 1, 2 ], 
             [ 4, 3, 2, 1, 0, 1 ]]

print(someList)
print(someMatrix)

[1, 2, 3, 4, 5]
[[0, 1, 2, 3, 4, 5], [1, 0, 1, 2, 3, 4], [2, 1, 0, 1, 2, 3], [3, 2, 1, 0, 1, 2], [4, 3, 2, 1, 0, 1]]


NumPy's `np.array` takes lists and nested lists as arguments and converts them into arrays that we can easily work with:

In [3]:
myNpArray = np.array(someList)
myNpMatrix = np.array(someMatrix)

In [4]:
print(myNpArray)
print(myNpMatrix)

[1 2 3 4 5]
[[0 1 2 3 4 5]
 [1 0 1 2 3 4]
 [2 1 0 1 2 3]
 [3 2 1 0 1 2]
 [4 3 2 1 0 1]]
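
Once a list has been converted, the resulting array carries some metadata that plain lists lack. The following is a minimal sketch of how to inspect it; the names `small_array` and `small_matrix` are illustrative and not part of the notebook above.

```python
import numpy as np

small_array = np.array([1, 2, 3, 4, 5])
small_matrix = np.array([[0, 1, 2],
                         [1, 0, 1]])

print(small_array.shape)   # (5,)   one axis with five elements
print(small_matrix.shape)  # (2, 3) two rows, three columns
print(small_matrix.ndim)   # 2      number of axes
print(small_matrix.size)   # 6      total number of elements
print(small_matrix.dtype)  # an integer type; the exact width depends on the platform
```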


Additionally, if you don't wish to write out your entire matrix, there are a number of functions that may be useful:

### arange

By using arange, you can quickly create an array: 

In [5]:
startingPoint = 0
endingPoint = 10  ## Exclusive
np.arange(startingPoint, endingPoint)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

It's also entirely possible to determine the increment of your array in the arange function: 

In [6]:
startingPoint = 0
endingPoint = 10  ## Exclusive
increment = 2
np.arange(startingPoint, endingPoint, increment)

array([0, 2, 4, 6, 8])
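
Two other forms of `arange` come up often: the single-argument form and a negative increment. A short sketch, with illustrative variable names:

```python
import numpy as np

# With a single argument, arange starts at 0.
first_five = np.arange(5)
print(first_five.tolist())  # [0, 1, 2, 3, 4]

# A negative increment counts down; the ending point is still excluded.
countdown = np.arange(10, 0, -2)
print(countdown.tolist())  # [10, 8, 6, 4, 2]
```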

### zeros

Suppose you wanted to create an array or matrix quickly. You can create a zeroed-out array or matrix by using `np.zeros()`.

For an array, you need only the array length: 

In [7]:
arrayLength = 4
np.zeros(arrayLength)

array([0., 0., 0., 0.])

Matrices, however, require both the number of rows and the number of columns (or, in computer science terms, the number of arrays in the multidimensional array, and the length of each array).

Take notice, though, that you'll need to pass the shape in as a tuple (that is, instead of passing in two parameters, you pass in a single tuple):

In [8]:
rows = 4
columns = 4

rowColumnTuple = (rows, columns)

np.zeros(rowColumnTuple)

array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])
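
The shape tuple is not limited to two dimensions, and an optional `dtype` argument sets the element type. The sketch below also shows what happens if the tuple is forgotten; the variable names are illustrative.

```python
import numpy as np

# The shape tuple can have any number of dimensions.
cube = np.zeros((2, 3, 4))
print(cube.shape)  # (2, 3, 4)

# dtype controls the element type; the default is float.
int_zeros = np.zeros((2, 3), dtype=int)
print(int_zeros.tolist())  # [[0, 0, 0], [0, 0, 0]]

# Passing rows and columns as two separate arguments fails, because the
# second positional argument is interpreted as the dtype.
try:
    np.zeros(2, 3)
    forgot_tuple_failed = False
except TypeError as error:
    forgot_tuple_failed = True
    print("TypeError:", error)
```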

### ones

The same approach that worked for zeros also works for ones:

In [9]:
arrayLength = 4
np.ones(arrayLength)

array([1., 1., 1., 1.])

In [10]:
rows = 4
columns = 4

rowColumnTuple = (rows, columns)

np.ones(rowColumnTuple)

array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

### linspace

To create an array of evenly spaced values that are not integer intervals, use linspace. With linspace, you don't enter the number to increment by; you enter how many numbers you would like in your array. Unlike arange, linspace includes the ending point by default:

In [11]:
startingPoint = 0
endingPoint = 10  ## Inclusive
amountOfNumbers = 2
np.linspace(startingPoint, endingPoint, amountOfNumbers)

array([ 0., 10.])

In [12]:
startingPoint = 0
endingPoint = 10  ## Inclusive
amountOfNumbers = 20
np.linspace(startingPoint, endingPoint, amountOfNumbers)

array([ 0.        ,  0.52631579,  1.05263158,  1.57894737,  2.10526316,
        2.63157895,  3.15789474,  3.68421053,  4.21052632,  4.73684211,
        5.26315789,  5.78947368,  6.31578947,  6.84210526,  7.36842105,
        7.89473684,  8.42105263,  8.94736842,  9.47368421, 10.        ])
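
If you want `arange`-style behavior where the ending point is left out, or you want to know the spacing that was chosen, `linspace` accepts the `endpoint` and `retstep` keyword arguments. A minimal sketch with illustrative names:

```python
import numpy as np

# Default: both the starting and ending point are included.
closed = np.linspace(0, 10, 5)
print(closed.tolist())  # [0.0, 2.5, 5.0, 7.5, 10.0]

# endpoint=False drops the ending point and adjusts the spacing.
half_open = np.linspace(0, 10, 5, endpoint=False)
print(half_open.tolist())  # [0.0, 2.0, 4.0, 6.0, 8.0]

# retstep=True also returns the spacing between consecutive values.
values, step = np.linspace(0, 10, 5, retstep=True)
print(step)  # 2.5
```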

### eye

The eye function produces an eye-dentity matrix! In its simplest form, this function takes only one parameter, which is the length of a column/row (given that identity matrices are `n x n` matrices):

In [13]:
np.eye(3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])
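
Beyond the single-argument form, `np.eye` also accepts a column count and a diagonal offset `k`. The sketch below shows both; the variable names are illustrative.

```python
import numpy as np

# A second argument gives the number of columns, producing a non-square matrix.
wide = np.eye(2, 3)
print(wide.tolist())  # [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

# k shifts the diagonal: positive values move it above the main diagonal.
shifted = np.eye(3, k=1)
print(shifted.tolist())  # [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0]]
```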

### rand

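It is also possible to fill an array or matrix with random numbers drawn uniformly from `[0, 1)`. Note that NumPy does not use a fixed seed by default, so you will receive different numbers every time you run the cell unless you set a seed yourself (for example, with `np.random.seed`):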

In [14]:
np.random.rand(10)

array([0.12413475, 0.31539724, 0.06184091, 0.12287762, 0.72563279,
       0.32798195, 0.51989487, 0.2412437 , 0.63748307, 0.08186862])

Just as you can create a random array, you can create a randomized matrix:

In [15]:
np.random.rand(4, 4)

array([[0.64083356, 0.17440491, 0.36433419, 0.66712133],
       [0.90153414, 0.26000524, 0.20281031, 0.32124052],
       [0.76620735, 0.49665398, 0.04711449, 0.13530554],
       [0.96163426, 0.52021155, 0.67482792, 0.30361416]])
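
When you need the same "random" numbers on every run (for tests or reproducible examples), you can set a seed. The sketch below assumes an arbitrary seed value of 42 and, for the second half, NumPy 1.17 or newer, where `np.random.default_rng` is available.

```python
import numpy as np

# Seeding the legacy global generator makes rand() repeatable.
np.random.seed(42)
first = np.random.rand(3)
np.random.seed(42)
second = np.random.rand(3)
print(np.array_equal(first, second))  # True

# NumPy 1.17+ also offers an explicit generator object with its own seed.
rng = np.random.default_rng(42)
sample = rng.random((2, 2))
print(sample.shape)  # (2, 2)
```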

## Methods

While generating arrays and matrices is fun, it would be rather pointless without the ability to manipulate them. 

### Reshaping: 
You may wish to reshape an array at some point. To do so, simply use the reshape method (the new shape must hold exactly as many elements as the original):

In [16]:
singleDimensionalArray = np.random.rand(16)

print("Single Dimensional array: \n", singleDimensionalArray)

multidimensionalArray = singleDimensionalArray.reshape(4,4)
print("\n\nMultidimensional array: \n", multidimensionalArray)

Single Dimensional array: 
 [0.8922966  0.3544306  0.01533947 0.2084559  0.64336946 0.96953758
 0.01709565 0.95826237 0.80446295 0.00196007 0.47618238 0.00294245
 0.07238956 0.5326511  0.54651923 0.20690257]


Multidimensional array: 
 [[0.8922966  0.3544306  0.01533947 0.2084559 ]
 [0.64336946 0.96953758 0.01709565 0.95826237]
 [0.80446295 0.00196007 0.47618238 0.00294245]
 [0.07238956 0.5326511  0.54651923 0.20690257]]
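
Two details of `reshape` are worth seeing in action: a dimension given as `-1` is inferred from the array's size, and a shape that does not match the number of elements raises an error. A minimal sketch with illustrative names:

```python
import numpy as np

values = np.arange(12)

# -1 asks NumPy to work out that dimension: 12 elements / 3 rows = 4 columns.
three_rows = values.reshape(3, -1)
print(three_rows.shape)  # (3, 4)

# reshape(-1) flattens back to one dimension.
flat = three_rows.reshape(-1)
print(flat.shape)  # (12,)

# A shape with the wrong number of elements is rejected.
try:
    values.reshape(5, 5)
    bad_shape_failed = False
except ValueError as error:
    bad_shape_failed = True
    print("ValueError:", error)
```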


### Getting Values from the arrays and matrices: 

There are many times you'll want to get the max or min values from an array (and possibly also their locations):


In [17]:
singleDimensionMax = singleDimensionalArray.max()
singleDimensionMaxLocation = singleDimensionalArray.argmax()
print("Single-dimensional array max {} is at position {} ".format(singleDimensionMax, singleDimensionMaxLocation))

Single-dimensional array max 0.9695375796054362 is at position 5 


In [18]:
multiDimensionMax = multidimensionalArray.max()
multiDimensionMaxLocation = multidimensionalArray.argmax()
print("Multidimensional array max {} is at position {} ".format(multiDimensionMax, multiDimensionMaxLocation))

Multidimensional array max 0.9695375796054362 is at position 5 


This works with both single and multidimensional arrays. However, calling argmax on a 2D array returns only the offset from the beginning of the flattened array, not a (row, column) pair. Here, position 5 in the 4x4 matrix corresponds to row 1, column 1.
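
To turn that flat offset back into a (row, column) position, one option is `np.unravel_index`. The sketch below uses a small hand-written matrix (`grid`, an illustrative name) so the positions are easy to verify by eye, and also shows the `axis` argument of `argmax`.

```python
import numpy as np

grid = np.array([[3, 8, 1],
                 [9, 2, 7]])

# argmax counts positions row by row: 3, 8, 1, 9, ... so the 9 is at offset 3.
flat_index = grid.argmax()
print(flat_index)  # 3

# unravel_index converts the flat offset into (row, column) for a given shape.
row, col = np.unravel_index(flat_index, grid.shape)
print(row, col)  # 1 0

# With an axis, argmax reports one position per column (axis=0) or per row (axis=1).
print(grid.argmax(axis=0).tolist())  # [1, 0, 1]
print(grid.argmax(axis=1).tolist())  # [1, 0]
```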

## What's the point?

So far, we've just been creating arrays and matrices, reshaping them, and looking at a few methods. What makes NumPy so spectacular isn't the fact that it's relatively painless to generate arrays and matrices. It's the fact that we can manipulate them with incredible ease.

NumPy allows us to apply a single operation to every element of a vector or matrix at once (such as scalar addition / multiplication).

In [19]:
startingPoint = 0
endingPoint = 5  ## Exclusive


vector = np.arange(startingPoint,endingPoint)
matrix = np.arange(startingPoint, endingPoint*5).reshape(5,5)

print("Vector: \n", vector)
print("Matrix: \n", matrix)

Vector: 
 [0 1 2 3 4]
Matrix: 
 [[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]
 [20 21 22 23 24]]


#### Scalar Addition: 

In [20]:
print("Vector + 15: \n", vector + 15)
print("Matrix + 15: \n", matrix + 15)

Vector + 15: 
 [15 16 17 18 19]
Matrix + 15: 
 [[15 16 17 18 19]
 [20 21 22 23 24]
 [25 26 27 28 29]
 [30 31 32 33 34]
 [35 36 37 38 39]]


#### Scalar Multiplication

In [21]:
print("Vector * 3: \n", vector * 3)
print("Matrix * 3: \n", matrix * 3)

Vector * 3: 
 [ 0  3  6  9 12]
Matrix * 3: 
 [[ 0  3  6  9 12]
 [15 18 21 24 27]
 [30 33 36 39 42]
 [45 48 51 54 57]
 [60 63 66 69 72]]


#### Vector / Matrix Addition 

In [22]:
print("Vector + [0,1,2,3,4]: \n", vector + vector)
print("Matrix + Matrix \n", matrix + matrix)

Vector + [0,1,2,3,4]: 
 [0 2 4 6 8]
Matrix + Matrix 
 [[ 0  2  4  6  8]
 [10 12 14 16 18]
 [20 22 24 26 28]
 [30 32 34 36 38]
 [40 42 44 46 48]]


#### Vector / Matrix Multiplication (Not linear algebra!)

In [23]:
print("Vector * [0,1,2,3,4]: \n", vector * vector)
print("Matrix * Vector \n", matrix * vector)
print("Matrix * Matrix \n", matrix * matrix)

Vector * [0,1,2,3,4]: 
 [ 0  1  4  9 16]
Matrix * Vector 
 [[ 0  1  4  9 16]
 [ 0  6 14 24 36]
 [ 0 11 24 39 56]
 [ 0 16 34 54 76]
 [ 0 21 44 69 96]]
Matrix * Matrix 
 [[  0   1   4   9  16]
 [ 25  36  49  64  81]
 [100 121 144 169 196]
 [225 256 289 324 361]
 [400 441 484 529 576]]


Take note that although these look like they could be linear algebra operations, they're not: `*` multiplies element by element (and `matrix * vector` multiplies each row of the matrix by the vector). For linear algebra operations, you'll need to consult the [documentation](https://docs.scipy.org/doc/numpy/reference/routines.linalg.html). The matrix multiplication of our matrix by itself would yield:

In [24]:
np.dot(matrix, matrix)

array([[ 150,  160,  170,  180,  190],
       [ 400,  435,  470,  505,  540],
       [ 650,  710,  770,  830,  890],
       [ 900,  985, 1070, 1155, 1240],
       [1150, 1260, 1370, 1480, 1590]])
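
The difference between element-wise and matrix multiplication is easier to check by hand on a 2x2 example. The sketch below uses small illustrative matrices `a` and `b`, and the `@` operator, which for 2D arrays gives the same result as `np.dot`.

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
b = np.array([[5, 6],
              [7, 8]])

# Element-wise: each entry is multiplied by the entry in the same position.
print((a * b).tolist())  # [[5, 12], [21, 32]]

# Matrix product: rows of a combined with columns of b.
print((a @ b).tolist())  # [[19, 22], [43, 50]]
print(np.array_equal(a @ b, np.dot(a, b)))  # True

# Matrix-vector product: each row of a dotted with the vector.
print((a @ np.array([1, 1])).tolist())  # [3, 7]
```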

We can also manipulate only specific parts of our arrays and matrices: 

In [25]:
vector[:2] = 13
vector

array([13, 13,  2,  3,  4])
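
One thing to keep in mind with slices is that a basic slice is a view onto the original data rather than a copy, so writing through the slice changes the original array. A minimal sketch, with illustrative names:

```python
import numpy as np

original = np.arange(5)

# A basic slice shares memory with the array it came from.
view = original[:2]
view[0] = 99
print(original.tolist())  # [99, 1, 2, 3, 4]

# Call .copy() when you want an independent array.
independent = original[:2].copy()
independent[0] = -1
print(original.tolist())  # [99, 1, 2, 3, 4]
```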

To access specific portions of a matrix, you can index it by row, or by row and column to reach an individual element:

In [26]:
matrix[2]

array([10, 11, 12, 13, 14])

In [27]:
print('You can access matrix data with [][] notation' , matrix[2][4])
print('You can access matrix data with [ , ] notation' ,matrix[2,4])

You can access matrix data with [][] notation 14
You can access matrix data with [ , ] notation 14
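
The `[ , ]` notation also accepts slices, which is how you would pull out a column or a rectangular block. The sketch below rebuilds the same 5x5 matrix so that it runs on its own.

```python
import numpy as np

matrix = np.arange(25).reshape(5, 5)

# A colon selects everything along that axis: here, all rows of column 1.
print(matrix[:, 1].tolist())  # [1, 6, 11, 16, 21]

# Slices on both axes select a block: rows 1-2, columns 2-3.
print(matrix[1:3, 2:4].tolist())  # [[7, 8], [12, 13]]

# Negative indices count from the end: the last row.
print(matrix[-1].tolist())  # [20, 21, 22, 23, 24]
```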


### Accessing data by boolean vectors and matrices: 

It's rarely the case that you know exactly which positions you'll want out of a matrix. With NumPy, you can select data based on boolean matrices. Let's try to see only the even numbers in our vector and matrix:

In [28]:
startingPoint = 0
endingPoint = 5  ## Exclusive


vector = np.arange(startingPoint,endingPoint)
matrix = np.arange(startingPoint, endingPoint*5).reshape(5,5)

print("Boolean Vector divisible by 2\n", vector % 2 == 0)
print("\nBoolean Matrix divisible by 2\n", matrix % 2 == 0)

Boolean Vector divisible by 2
 [ True False  True False  True]

Boolean Matrix divisible by 2
 [[ True False  True False  True]
 [False  True False  True False]
 [ True False  True False  True]
 [False  True False  True False]
 [ True False  True False  True]]


You can take these boolean vectors and matrices and use them as indices into your data (this notation takes a bit of getting used to):

In [29]:
evenNumberBooleanVector = vector % 2 == 0
vector[evenNumberBooleanVector]

array([0, 2, 4])

In [30]:
evenNumberBooleanMatrix = matrix % 2 == 0
matrix[evenNumberBooleanMatrix]

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18, 20, 22, 24])

Now, take note that while `matrix` has a `5x5` shape, the result has only a single dimension. Boolean indexing returns one-dimensional arrays (it would be very curious to see a 5x5 matrix with only 13 values).
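
If you would rather keep the original shape, two common options are `np.where`, which substitutes a fallback value, and assignment through the mask. The sketch below uses a 3x3 matrix and illustrative names to keep the output small.

```python
import numpy as np

matrix = np.arange(9).reshape(3, 3)
is_even = matrix % 2 == 0

# np.where keeps the shape: take the value where the mask is True, else 0.
evens_or_zero = np.where(is_even, matrix, 0)
print(evens_or_zero.tolist())  # [[0, 0, 2], [0, 4, 0], [6, 0, 8]]

# A mask can also be used on the left-hand side to overwrite selected entries.
marked = matrix.copy()
marked[~is_even] = -1
print(marked.tolist())  # [[0, -1, 2], [-1, 4, -1], [6, -1, 8]]

# Summing a boolean mask counts the True entries.
print(int(is_even.sum()))  # 5
```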

### Caveats

It's entirely possible that in your work you'll wind up doing some division. If you do end up dividing by zero, you'll find that the kernel doesn't actually break. Instead, NumPy emits a `RuntimeWarning` and fills in `nan` (for `0 / 0`) or `inf` (for a nonzero number divided by zero), values you may not know what to do with.

In [32]:
print(vector)

[0 1 2 3 4]


In [33]:
vector / vector

array([nan,  1.,  1.,  1.,  1.])

In [34]:
1 / vector

array([       inf, 1.        , 0.5       , 0.33333333, 0.25      ])
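
There are a few ways to deal with this. The sketch below shows two possibilities: silencing the warnings locally with `np.errstate` and then locating the `nan` values, and skipping the zero denominators altogether with the `where` argument of `np.divide`. Filling the skipped positions with `0.0` is an assumption made for illustration; pick whatever fallback makes sense for your data.

```python
import numpy as np

vector = np.arange(5)

# Silence the divide-by-zero and invalid-value warnings inside this block only.
with np.errstate(divide="ignore", invalid="ignore"):
    ratio = vector / vector

# np.isnan marks the positions where 0 / 0 produced nan.
print(np.isnan(ratio).tolist())  # [True, False, False, False, False]

# Divide only where the denominator is nonzero; other positions keep the
# value already stored in `out` (0.0 here).
reciprocal = np.divide(1.0, vector, out=np.zeros(len(vector)), where=vector != 0)
print(reciprocal.tolist())  # [0.0, 1.0, 0.5, 0.3333333333333333, 0.25]
```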