# Introducing NumPy


Python is convenient, but it can also be slow. However, it does allow us to access libraries that execute faster code written in languages like C. NumPy is one such library: it provides **fast alternatives to math operations in Python** and is designed to work efficiently with groups of numbers - like matrices.
NumPy is a large library and we are only going to scratch the surface of it here.

This hands-on tutorial includes the following: <br>
- #### Data Types and Shapes
    -  Scalars <br>
    -  Vectors <br>
    -  Matrices <br>
    - Tensors <br>
    - Changing shapes <br>
    - Commonly used initializations <br>
- #### Element-wise operations
    -  The Python way <br>
    -  The NumPy way <br>
    -  Element-wise Matrix Operations <br>
- #### NumPy Matrix Multiplication
    -  Element-wise Multiplication <br>
    -  Matrix Product <br>
    -  NumPy's ```dot``` function <br>
    -  Transpose <br>
    -  A real use case <br>
- #### Sorting, Subsetting, Slicing, Splitting and Indexing

### Importing NumPy

When importing the NumPy library, the convention we'll see used most often – including here – is to name it ```np```, like so:

In [0]:
import numpy as np

Now we can use the library by prefixing the names of functions and types with ```np.```, which we'll see in the following examples.

### Data Types and Shapes

The most common way to work with numbers in NumPy is through ```ndarray``` objects. They are similar to Python lists, but can have any number of dimensions. Also, ```ndarray``` supports fast math operations, which is just what we want. <br>
Since it can have any number of dimensions, we can use an ```ndarray``` to represent any of these data types: **scalars**, **vectors**, **matrices**, or **tensors**.

### Scalars

Scalars in NumPy are a bit more involved than in Python. Instead of Python’s basic types like ```int```, ```float```, etc., NumPy lets us specify signed and unsigned types, as well as different sizes. So instead of Python’s ```int```, we have access to types like ```uint8```, ```int8```, ```uint16```, ```int16```, and so on. <br>
These types are important because every object we make (vectors, matrices, tensors) eventually stores scalars. And when we create a NumPy array, we can specify the type - but **every item in the array must have the same type**. In this regard, NumPy arrays are more like C arrays than Python **lists**.
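
As a small sketch of these typed scalars (the variable names ```small``` and ```mixed``` are illustrative, not part of the original notebook), the ```dtype``` argument of ```np.array``` selects the element type, and mixed inputs are converted to one common type:

```python
import numpy as np

# Explicitly request 8-bit unsigned integers for every element.
small = np.array([1, 2, 3], dtype=np.uint8)
print(small.dtype)

# Mixing ints and floats still yields a single element type:
# NumPy upcasts the whole array to a floating-point type.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)
```

Checking ```dtype``` like this is a quick way to confirm which scalar type an array actually stores.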

If we want to create a NumPy array that holds a scalar, we can do so by passing the value to NumPy's array function, like so:

In [0]:
s = np.array(5)
print (s)

We can see the shape of our arrays by checking their ```shape``` attribute:

In [0]:
print(s.shape)

Here, the empty pair of parentheses, ```()```, indicates that the array has zero dimensions.

Even though scalars are inside arrays, we still use them like a normal scalar. So we could type:

In [0]:
x = s + 3
print(x)

If we were to check the type of ```x```, we'd find it is probably ```numpy.int64``` (the exact integer type depends on the platform), because it's working with NumPy types, not Python types.

In [0]:
print(type(x))

Even scalar types support most of the array functions, so we can call ```x.shape``` and it returns ```()``` because it has zero dimensions, even though it is not an array. If we tried that with a normal Python scalar, we'd get an error.

In [0]:
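print(x.shape)  # x is a NumPy scalar here, so this prints ()
x = 8           # now x is a plain Python int
print(x.shape)  # raises AttributeError: Python ints have no shape attribute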

### Vectors

To create a vector, we'd pass a Python list to the array function, like this:

In [0]:
v = np.array([1,2,3])
print (v)

If we check a vector's shape attribute, it will return a single number representing the vector's one-dimensional length.

In [0]:
print(v.shape)

Now that there is a number, we can see that the shape is a tuple with the sizes of each of the ```ndarray```'s dimensions. For scalars it was just an empty tuple, but vectors have one dimension, so the tuple includes a number and a comma. <br>
Python doesn’t understand ```(3)``` as a tuple with one item, so it requires the comma.
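
The comma rule can be checked directly. This is a minimal sketch; the names ```not_a_tuple``` and ```one_tuple``` are made up for illustration:

```python
import numpy as np

v = np.array([1, 2, 3])

not_a_tuple = (3)    # just the integer 3 wrapped in parentheses
one_tuple = (3,)     # a tuple containing exactly one item

print(type(not_a_tuple))
print(type(one_tuple))
print(v.shape == one_tuple)   # the vector's shape is the one-item tuple
```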

We can access an element within the vector using indices, like this:

In [0]:
x = v[1]
print(x)

NumPy also supports slicing. For example, to access the items from the second element onward, we would say:

In [0]:
print(v[1:])

### Matrices

We can create matrices using NumPy's array function, similar to that of vectors. However, instead of just passing in a list, we need to supply a list of lists, where each list represents a row. So to create a 3x3 matrix containing the numbers one through nine, we could do this:


In [0]:
m = np.array([[1,2,3], [4,5,6], [7,8,9]])
print("m = \n",m)
print("m.shape: ",m.shape)

```m.shape``` is thus the tuple ```(3, 3)```. This indicates that ```m``` has two dimensions, each of size 3.
We can access elements of matrices just like vectors, but using additional index values. So to find the number 6 in the above matrix, we'd write:

In [0]:
print(m[1][2])

Doing the same with a plain Python list gives an error, because lists have no ```shape``` attribute:

In [0]:
m = [[1,2,3], [4,5,6], [7,8,9]]
print("m = \n",m)
print("m.shape: ",m.shape)

But each element can still be selected in the same way:

In [0]:
print(m[1][2])

### Tensors

Tensors are just like vectors and matrices, **but they can have more dimensions**. For example, to create a 3x3x2x1 tensor, we could do the following:

In [0]:
t = np.array([[[[1],[2]],[[3],[4]],[[5],[6]]],\
              [[[7],[8]],[[9],[10]],[[11],[12]]],\
              [[[13],[14]],[[15],[16]],[[17],[18]]]])

And ```t.shape``` would return 

In [0]:
print(t.shape)

We can access items just like with matrices, but with more indices. So ```t[2][1][1][0]``` will return

In [0]:
print(t[2][1][1][0])

There is also an attribute called ```size```, which holds the total number of elements in an array. So for ```t```:

In [0]:
print(t.size)

And we can confirm from the definition of ```t``` above that the total number of elements is indeed 18 (3 x 3 x 2 x 1).
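
The relationship between ```shape``` and ```size``` can be sketched as follows. To keep the example short it uses ```np.zeros``` to build an array with the same shape as ```t``` rather than retyping the nested lists:

```python
import numpy as np

# An array with the same 3x3x2x1 shape as t, filled with zeros.
t = np.zeros((3, 3, 2, 1))

print(t.ndim)                  # number of dimensions
print(t.size)                  # total number of elements
print(int(np.prod(t.shape)))   # product of the dimension sizes
```

The last two values agree because ```size``` is always the product of the entries in ```shape```.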

### Changing Shapes
Sometimes we'll need to change the shape of our data without actually changing its contents. For example, we may have a vector, which is one-dimensional, but need a matrix, which is two-dimensional. There are two ways we can do that.

Let's say we have the following vector:

In [0]:
v = np.array([1,2,3,4])

Calling v.shape would return 

In [0]:
print(v.shape)

But what if we want a 1x4 matrix? We can accomplish that with the reshape function, like so:

In [0]:
x = v.reshape(1,4)

Calling x.shape would return 

In [0]:
print(x.shape)

In [0]:
print(x)

Now, if we wanted a 4x1 matrix, we could do this:

In [0]:
x = v.reshape(4,1)
print(x.shape)
print(x)

For a 2x2 matrix:

In [0]:
x = v.reshape(2,2)
print(x.shape)
print(x)

One more thing about reshaping NumPy arrays: if we see code from experienced NumPy users, we will often see them use a special slicing syntax instead of calling reshape. Using this syntax, the 1x4 and 4x1 examples above would look like this:

In [0]:
x = v[None, :]
print(x.shape)

     or

In [0]:
x = v[:, None]
print(x.shape)

These lines create a slice that looks at all of the items of ```v``` but asks NumPy to add a new dimension of size 1 for the associated axis. It may look strange at first, but it's a common technique so it's good to be aware of it.
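
As a quick check that the slicing syntax and ```reshape``` agree, here is a minimal sketch. It also uses ```np.newaxis```, which is NumPy's more descriptive alias for ```None``` (the names ```row``` and ```col``` are illustrative):

```python
import numpy as np

v = np.array([1, 2, 3, 4])

row = v[None, :]          # same result as v.reshape(1, 4)
col = v[:, np.newaxis]    # np.newaxis is an alias for None

print(row.shape, col.shape)
print(np.array_equal(row, v.reshape(1, 4)))
print(np.array_equal(col, v.reshape(4, 1)))
```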

Now, let's suppose we have a matrix ```A```

In [0]:
A = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12]])
print(A)

and we need to reshape it to a 2x6 matrix. We could do this:

In [0]:
x = A.reshape(2,6)
print(x)

And if we want to take the tensor ```t``` and reshape it to a 6x3 matrix, we can also do that

In [0]:
t = np.array([[[[1],[2]],[[3],[4]],[[5],[6]]],[[[7],[8]],\
    [[9],[10]],[[11],[12]]],[[[13],[14]],[[15],[16]],[[17],[18]]]])
print("t.shape is ",t.shape,"\n")
x = t.reshape(6,3)
print("x = \n",x)
print("x.shape is ",x.shape,"\n")

The thing to note here is that an ```ndarray``` can be reshaped to any shape as long as the total number of elements remains the same. For example, the following fails:

In [0]:
m = np.array([[1,2,3], [4,5,6], [7,8,9]])
print(m, "\n")
x = m.reshape(2,4)

Here, the matrix ```m``` has shape ```(3,3)```, i.e. 9 elements, whereas ```m.reshape(2,4)``` asks for shape ```(2,4)``` with 8 elements, which is not possible.
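
The failure can be caught and inspected, as in the sketch below. It also shows ```-1``` as a dimension, which asks NumPy to infer that size from the element count; this is an addition to the tutorial's examples, and the names ```flat``` and ```col``` are illustrative:

```python
import numpy as np

m = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

try:
    m.reshape(2, 4)           # 9 elements cannot fill 2 * 4 = 8 slots
except ValueError as err:
    print("reshape failed:", err)

# Passing -1 lets NumPy work out the one dimension we leave open.
flat = m.reshape(-1)          # shape (9,)
col = m.reshape(-1, 1)        # shape (9, 1)
print(flat.shape, col.shape)
```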

### Commonly used initializations

Typing out a large ```np.array``` by hand quickly becomes tedious. To help with this, NumPy provides simple functions for initializing bigger arrays.

In [0]:
a = np.zeros((3,4))
print(a)

Thus ```np.zeros((3,4))``` initializes a NumPy array of shape 3x4 with all elements equal to ```0```.

Similarly, the ```np.ones()``` function creates an array of the desired shape with all elements equal to ```1```:

In [0]:
a = np.ones((3,2,2))
print(a)

NumPy also provides us with the option of randomly initializing the element values in its arrays. The ```np.random.rand()``` function creates an array of the given shape and populates it with random samples from a uniform distribution over ```[0, 1)```.

In [0]:
a = np.random.rand(3,5)
print(a)

And the ```np.random.randn()``` function does the same but with samples taken from a normal distribution, whose density is $ \frac{1}{\sigma \sqrt{2\pi}} e^{-(x-\mu)^2 / (2\sigma^2)} $, with mean ($\mu$) 0 and variance ($\sigma^2$) 1.

In [0]:
a = np.random.randn(2000,2000)
print(a.mean()) # For a large enough array, this value is close to 0
print(np.std(a)**2) # For a large enough array, this value is close to 1
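
Because these values are random, they change on every run. When repeatable numbers are needed, the generator can be seeded first. This is a minimal sketch and the seed ```42``` is an arbitrary choice:

```python
import numpy as np

np.random.seed(42)            # 42 is an arbitrary seed value
first = np.random.rand(2, 3)

np.random.seed(42)            # reseeding restarts the same sequence
second = np.random.rand(2, 3)

print(first)
print(np.array_equal(first, second))   # the two draws are identical
```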

## Element-wise operations

### The Python way

Suppose we had a list of numbers, and we wanted to add 5 to every item in the list. Without NumPy, we might do something like this:

In [0]:
values = [1,2,3,4,5]
for i in range(len(values)):
    values[i] += 5
print(values)

That makes sense, but it's a lot of code to write and it runs slowly because it's pure Python. <br> <br>
**Note:** Just in case we aren't used to using operators like +=, that just means "*add these two items and then store the result in the left item.*" It is a more succinct way of writing ```values[i] = values[i] + 5```.

### The NumPy way

In NumPy, we could do the following:

In [0]:
values = [1,2,3,4,5]
values = np.array(values) +5
print(values)

Creating that array may seem odd, but normally we'll be storing our data in ndarrays anyway. So if we already had an ```ndarray``` named ```values```, we would have just done:

In [0]:
values -= 5     #back to the original values
print(values)
values += 5
print(values)
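
To get a rough feel for the speed difference mentioned earlier, the two approaches can be timed with the standard ```timeit``` module. This is only a sketch: the array size and repeat count are arbitrary, the helper names are made up, and the measured times will vary from machine to machine.

```python
import timeit
import numpy as np

n = 100_000
py_values = list(range(n))
np_values = np.arange(n)

def python_way():
    # Pure Python: visit every item one at a time.
    return [v + 5 for v in py_values]

def numpy_way():
    # NumPy: one vectorized operation over the whole array.
    return np_values + 5

print("Python loop:", timeit.timeit(python_way, number=20))
print("NumPy:      ", timeit.timeit(numpy_way, number=20))
```

Both functions compute the same values; only the time taken differs.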

We should point out, NumPy actually has functions for things like adding, multiplying, etc. But it also supports using the standard math operators. So the following two lines are equivalent:

In [0]:
x = np.multiply(values, 5)
print(x)
x = values * 5
print(x)

We will usually use the operators instead of the functions because they are more convenient to type and easier to read, but it's really just personal preference.

Here is one more example of operating with scalars and ndarrays. Let's say we have a matrix ```m``` and we want to reuse it, but first we need to set all its values to zero. We just multiply by zero and assign the result back to the matrix, like this:

In [0]:
m = np.array([[1,2,3],[4,5,6],[7,8,9]])
print(m)
m *= 0
print(m)

### Element-wise Matrix Operations
The same functions and operators that work with scalars and matrices also work with arrays of other dimensions. We just need to make sure that the items we perform the operation on have compatible shapes.
Let's say we want to get the squared values of a matrix. That's simply ```x = m * m``` (or if we want to assign the value back to ```m```, it's just ```m *= m```). <br>
This works because it's an element-wise multiplication between two identically-shaped matrices. (In this case, they are shaped the same because they are actually the same object.) <br> <br>
Let's look at the following example:

In [0]:
a = np.array([[1,3],[5,7]])
a_list = [[1,3],[5,7]]
print("a = \n",a,"\n")
print("a_list = \n",a_list,"\n")

b = np.array([[2,4],[6,8]])
b_list = [[2,4],[6,8]]
print("b = \n",b,"\n")
print("b_list = \n",b_list,"\n")

print("a + b = \n",a + b,"\n")
print("a_list + b_list = \n",a_list + b_list,"\n")

We can see that simple lists yield quite a different result: for lists, ```+``` concatenates instead of adding element by element. <br>
And, if we try working with incompatible shapes we'd get an error:

In [0]:
a = np.array([[1,3],[5,7]])
print("a = \n",a,"\n") 

c = np.array([[2,3,6],[4,5,9],[1,8,7]])
print("c = \n",c,"\n")

print("a.shape: ",a.shape)
print("c.shape: ",c.shape,"\n")

print("a + c = \n",a + c)
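
Compatible shapes do not have to be identical. NumPy's broadcasting rules allow a dimension of size 1, or a missing leading dimension, to be stretched to match the other operand. The following sketch contrasts a shape pair that broadcasts with one that does not (the array ```row``` is introduced only for illustration):

```python
import numpy as np

a = np.array([[1, 3], [5, 7]])      # shape (2, 2)
row = np.array([10, 20])            # shape (2,)
c = np.ones((3, 3))                 # shape (3, 3)

# (2, 2) + (2,) works: row is added to every row of a.
print(a + row)

# (2, 2) + (3, 3) cannot be broadcast, so NumPy raises ValueError.
try:
    a + c
except ValueError as err:
    print("incompatible shapes:", err)
```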


## NumPy Matrix Multiplication
Now we'll talk about how to do matrix multiplication with NumPy. It's important to know that NumPy supports several types of matrix multiplication.

### Element-wise Multiplication
We have seen some element-wise multiplication already. We can accomplish it with the ```multiply``` function or the ```*``` operator. Just to revisit, it would look like this:


In [0]:
m = np.array([[1,2,3],[4,5,6]])
print("m = \n",m,"\n")

n = m * 0.25
print("n = \n",n,"\n")

print("m * n = \n",m * n,"\n")

print("np.multiply(m, n)= \n",np.multiply(m, n))

### Matrix Product
To find the matrix product, we use NumPy's ```matmul``` function. Plain Python lists have no built-in equivalent of this operation. <br>
If we have compatible shapes, then it's as simple as this:

In [0]:
a = np.array([[1,2,3,4],[5,6,7,8]])
print("a = \n",a)
print("a.shape: ",a.shape, "\n")

b = np.array([[1,2,3],[4,5,6],[7,8,9],[10,11,12]])
print("b = \n",b)
print("b.shape: ", b.shape, "\n")

c = np.matmul(a, b) # (2,4)*(4,3)
print("c = \n",c)
print("c.shape: ",c.shape,"\n")

If our matrices have incompatible shapes, we'll get an error, like the following:

In [0]:
np.matmul(b, a)

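This is because matrix multiplication is not commutative: in general $a \cdot b \neq b \cdot a$. Here ```b``` has 3 columns but ```a``` has only 2 rows, so the product $b \cdot a$ is not even defined.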
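
The shape rule can be summarized as: an ```(n, k)``` matrix times a ```(k, p)``` matrix gives an ```(n, p)``` matrix, and the two inner sizes must match. The sketch below uses arrays of ones with the same shapes as ```a``` and ```b```, together with Python's ```@``` operator, which calls the same matrix product as ```np.matmul``` for NumPy arrays:

```python
import numpy as np

a = np.ones((2, 4))
b = np.ones((4, 3))

c = a @ b                      # same as np.matmul(a, b)
print(c.shape)                 # rows of a by columns of b
print(np.array_equal(c, np.matmul(a, b)))

# b @ a would pair b's 3 columns with a's 2 rows, so it fails.
try:
    b @ a
except ValueError as err:
    print("matmul failed:", err)
```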

### NumPy's ```dot``` function
We may sometimes see NumPy's ```dot``` function in places where we would expect a ```matmul```. It turns out that the results of ```dot``` and ```matmul``` are the same if the matrices are two dimensional. <br>
So these two results are equivalent:

In [0]:
a = np.array([[1,2],[3,4]])
print("a = \n",a,"\n")

print("np.dot(a,a) = \n",np.dot(a,a),"\n")

print("a.dot(a) = \n",a.dot(a),"\n")  # you can call `dot` directly on the `ndarray`

print("np.matmul(a,a) = \n",np.matmul(a,a),"\n")

While these functions return the same results for two dimensional data, we should be careful about which we choose when working with other data shapes.
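
To illustrate how the two functions diverge beyond two dimensions, here is a small sketch with three-dimensional arrays of ones (the shapes are chosen only for illustration):

```python
import numpy as np

a = np.ones((2, 3, 4))
b = np.ones((2, 4, 5))

# matmul treats the leading axis as a batch: two (3,4) x (4,5) products.
print(np.matmul(a, b).shape)

# dot sums over the last axis of a and the second-to-last axis of b
# for every combination of the remaining indices.
print(np.dot(a, b).shape)
```

The first result has shape ```(2, 3, 5)``` while the second has shape ```(2, 3, 2, 5)```, so the two calls are not interchangeable here.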

### Transpose
Getting the transpose of a matrix is really easy in NumPy. Simply access its ```T``` attribute. There is also a ```transpose()``` function which returns the same thing, but we’ll rarely see that used anywhere because typing ```T``` is much easier.
For example:

In [0]:
m = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
print("m =\n",m,"\n")

print("m.T = \n",m.T)

NumPy does this without actually moving any data in memory - it simply changes the way it indexes the original matrix - so it’s quite efficient.

However, that also means we need to be careful with how we modify objects, because **they are sharing the same data**. For example, with the same matrix ```m``` from above, let's make a new variable ```m_t``` that stores ```m```'s transpose. Then look what happens if we modify a value in ```m_t```:

In [0]:
m_t = m.T
m_t[3][1] = 200
print("m_t = \n",m_t, "\n")

print("m = \n",m)

Notice how it modified both the transpose and the original matrix, too! That's because they are sharing the same copy of data. So remember to consider the transpose just as a different view of our matrix, rather than a different matrix entirely.
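
When an independent transpose is needed, one option is to copy it explicitly. The sketch below uses ```np.shares_memory``` to show the difference; the names ```view``` and ```independent``` are illustrative:

```python
import numpy as np

m = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

view = m.T                  # shares its data with m
independent = m.T.copy()    # owns a separate copy of the data

print(np.shares_memory(m, view))
print(np.shares_memory(m, independent))

independent[3][1] = 200     # changes only the copy
print(m[1][3])              # m still holds its original value here
```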

### A real use case
We won't get into too many details about neural networks because we haven't covered them yet, but there is one place where we will almost certainly end up using a transpose, or at least thinking about it.

Let's say we have the following two matrices, called ```inputs``` and ```weights```,

In [0]:
inputs = np.array([[-0.27,  0.45,  0.64, 0.31]])
print("inputs = \n",inputs)

print("inputs.shape: ",inputs.shape, "\n")

weights = np.array([[0.02, 0.001, -0.03, 0.036], \
    [0.04, -0.003, 0.025, 0.009], [0.012, -0.045, 0.28, -0.067]])

print("weights = \n",weights)

print("weights.shape: ",weights.shape, "\n")

If we try to multiply them as they are now, we get an error:

In [0]:
print(np.matmul(inputs, weights))

This error of incompatible shapes occurs because the number of columns in the left matrix, ```4```, is not equal to the number of rows in the right matrix, ```3```. <br>
So that doesn't work, but notice if we take the transpose of the ```weights``` matrix, it will:

In [0]:
print("np.matmul(inputs, weights.T) = \n",np.matmul(inputs, weights.T))

It also works if we take the transpose of ```inputs``` instead and swap their order:

In [0]:
print(np.matmul(weights, inputs.T))

The two answers are transposes of each other, so which multiplication we use really just depends on the shape we want for the output.
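
This relationship can be verified numerically. The sketch below repeats the ```inputs``` and ```weights``` definitions so that it runs on its own; ```row_result``` and ```col_result``` are illustrative names:

```python
import numpy as np

inputs = np.array([[-0.27, 0.45, 0.64, 0.31]])
weights = np.array([[0.02, 0.001, -0.03, 0.036],
                    [0.04, -0.003, 0.025, 0.009],
                    [0.012, -0.045, 0.28, -0.067]])

row_result = np.matmul(inputs, weights.T)   # shape (1, 3)
col_result = np.matmul(weights, inputs.T)   # shape (3, 1)

print(row_result.shape, col_result.shape)
print(np.allclose(row_result, col_result.T))
```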

## Sorting, Subsetting, Slicing, Splitting and Indexing

Data handling is one of the most important skills to have as a data scientist. NumPy provides a suite of data handling functions. We are going to look at a few of them here.

### Sorting

To sort any NumPy array we use the ```np.sort``` function:

In [0]:
a = np.array([[7,32,1],[55,0,-6],[3,99,20]])
print(a, "\n")
print(np.sort(a),"\n")
print(np.sort(a, axis =1))

By default, the elements are sorted in ascending order along the last axis, so here each row is sorted independently (for a 2D array, ```axis=1``` is the same as the default). We can also sort along the first axis, which sorts each column:

In [0]:
print(a, "\n")
print(np.sort(a, axis = 0))

We can also flatten the array and sort all of its elements into a one-dimensional result by setting ```axis``` to ```None```:

In [0]:
print(np.sort(a, axis=None))
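
One detail worth knowing is that ```np.sort``` returns a sorted copy and leaves its input untouched, while the array's own ```sort``` method sorts in place. A minimal sketch (```by_row``` is an illustrative name):

```python
import numpy as np

a = np.array([[7, 32, 1], [55, 0, -6], [3, 99, 20]])

by_row = np.sort(a)     # new array with each row sorted
print(a[0])             # the original first row is unchanged
print(by_row[0])        # the sorted first row

a.sort(axis=0)          # in place: each column of a is now sorted
print(a)
```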

### Subsetting

Just like sets in mathematics, ndarrays also have subsets. We will look at some possible subsets of a 2D array (matrix):

In [0]:
a = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12]])
print(a[1][3])
print(a[1,3])
print(a[0])
print(a[:,2])

Here, ```a[1][3]``` and ```a[1,3]``` both give the same output: the 4th element of the 2nd row. ```a[0]``` gives the entire 1st row, whereas ```a[:,2]``` gives the entire 3rd column, i.e. the 3rd element of every row. <br>
It should be noted that all of these are considered as subsets of the array ```a```.
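
Row and column ranges can also be combined to pick out a rectangular block. This sketch reuses the same array ```a```; the name ```block``` is illustrative:

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

block = a[0:2, 1:3]   # rows 0 and 1, columns 1 and 2
print(block)

print(a[-1])          # negative indices count from the end: the last row
```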

### Slicing

Similar to subsetting, slicing allows us to extract certain portions of an array. For example

In [0]:
a = np.arange(10) # a 1D array from 0 to 9
print(a)
s = slice(2,9,2)
print(a[s])

Here, ```slice(2,9,2)``` means that the slice starts at index ```2```, stops before index ```9```, and moves in steps of ```2```, so it selects the elements at indices 2, 4, 6 and 8. Or we could simply write:

In [0]:
s = a[2:9:2]
print(s)

In both of these examples, we have picked a certain slice of the original array ```a```.

We can also do something like this:

In [0]:
print(a[::-1])

Here the step of ```-1``` walks through the array backwards, giving its elements in reverse order.
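
The start, stop and step behaviour can be checked with a short sketch. Note that the stop index is exclusive:

```python
import numpy as np

a = np.arange(10)

s = slice(2, 9, 2)
print(a[s])                             # elements at indices 2, 4, 6, 8
print(np.array_equal(a[s], a[2:9:2]))   # both spellings select the same items
print(a[2:8:2])                         # stopping at 8 leaves index 8 out
```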

### Indexing

Although we have already seen indexing at work, there are a few things worth mentioning to develop a better intuition about how NumPy arrays and, by extension, tensors in TensorFlow can be handled.

In [0]:
print(a[a>5])

In the above example, known as boolean indexing, we keep only the elements of ```a``` that are greater than 5. This kind of conditional filtering is often useful.
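
Under the hood, the condition produces a boolean array that acts as a mask. The sketch below makes the mask explicit, combines two conditions, and uses a mask for assignment (the names ```mask``` and ```b``` are illustrative):

```python
import numpy as np

a = np.arange(10)

mask = a > 5                 # boolean array with the same shape as a
print(mask)

# Conditions combine with & (and) and | (or); the parentheses are required.
print(a[(a > 2) & (a % 2 == 0)])

# A mask can also select which elements to overwrite.
b = a.copy()
b[b > 5] = 0
print(b)
```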