-----
# Numerical Python (NumPy) - Part 2
-----

In [None]:
import numpy as np

## Indexing and Slicing 

Indexing, slicing, and iterating are extremely important for data manipulation and analysis because these techniques allow us to select data based on conditions and to copy or update data.

### Indexing

First, let's look at integer indexing. A one-dimensional array works much like a list. To get an element of a one-dimensional array, we simply use its offset index.

In [None]:
a = np.array([1,3,5,7])
print(a[2])
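As a quick check of the list analogy, the sketch below indexes a list and an array built from it with the same offsets. The names `values` and `arr` are illustrative only. Negative offsets, which count back from the end, behave the same way in both containers.

```python
import numpy as np

values = [1, 3, 5, 7]
arr = np.array(values)

# The same offset selects the same element in both containers
print(values[2], arr[2])

# Negative offsets count back from the end, just as with lists
print(values[-1], arr[-1])
```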

For a multidimensional array, we supply one integer index per dimension. Let's create a new multidimensional array.


In [None]:
a = np.array([[1,2], [3, 4], [5, 6]])
a

If we want to select a single element, we can do so by entering its index, which consists of two integers: the first is the row and the second is the column.

In [None]:
a[2, 1]  # remember, indexing in Python starts at 0!

If we want to get multiple elements (for example 1, 4, and 6) and put them into a one-dimensional array, we can index each element individually and pass the results to the `np.array` function.

In [None]:
np.array([a[0, 0], a[1, 1], a[2, 1]])
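NumPy can also do this in a single indexing operation by passing one list of row indices and one list of column indices, which NumPy's documentation calls integer array indexing. The sketch below assumes the same 3 by 2 array as above; the names `rows`, `cols`, and `picked` are illustrative.

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])

rows = [0, 1, 2]
cols = [0, 1, 1]

# The two lists are paired up element by element: (0, 0), (1, 1), (2, 1)
picked = a[rows, cols]
print(picked)  # the elements 1, 4 and 6 as a one-dimensional array
```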

Note that we can also easily change a single value in an array using integer indexing.

In [None]:
#let's look at a again
a

We can change individual values as follows.

In [None]:
a[0,0] = 100
a[1,0] = 200
a

### Boolean Indexing

Boolean indexing allows us to select arbitrary elements based on conditions. For example, say we want to find all elements that are greater than 5 in the matrix we just modified. To do so, we set up the following condition: `a > 5`

In [None]:
print(a>5)

This returns a boolean array showing whether the value at the corresponding index is (or isn't) greater than 5.

We can then place this array of booleans like a mask over the original array to return a one-dimensional array containing only the elements where the mask is `True`.

In [None]:
print(a[a>5])
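A mask can be stored in a variable and reused, both to select values and to update them in place. The following is a minimal sketch on a fresh array; the names `m`, `mask`, and `selected` are illustrative and not part of the notebook above.

```python
import numpy as np

m = np.array([[1, 2], [3, 4], [5, 6]])

# Build the mask once and reuse it
mask = m > 3

# Selecting with the mask gives a one-dimensional array of matching values
selected = m[mask]
print(selected)

# Assigning through the mask updates only the matching positions
m[mask] = 0
print(m)
```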

As you will see, this functionality is sometimes known as ***boolean masking*** and is essential in the Pandas toolkit. We will talk about it in further detail when we discuss Pandas dataframes later in the course.

#### Using `np.where()`




Having learned about boolean indexing (or masking), it is now time to introduce one of NumPy's useful functions: `numpy.where()`. In its simplest form, this function returns the **indices** of elements in an input array where the given condition is satisfied. If you think about it, it is very similar to *boolean indexing*. 

`numpy.where` is basically defined as follows:

`numpy.where(<condition> [, x, y])`

What this says is that wherever `condition` holds true for an element in our array, the new array takes the corresponding element from `x`; wherever it is false, the element from `y` is taken. `x` and `y` are optional, but they must be given together or not at all. If `x` and `y` are not given, the function returns a tuple of index arrays, one per dimension of the input, which together locate the elements where the condition holds true.

Let's look at the following five scenarios to make it clearer how this function returns the indices:


1. Using 1-dimensional array with no `x` or `y`

In [None]:
# Initializing a 1-D array
a = np.array([10, 20, 2, 40, 1, -2, 3])

# Getting indices where element is greater than 2
np.where(a > 2)

This returns a tuple with one array element, since `a` is a one dimensional array. The array includes the indices of the values that satisfy the condition.
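To see the link with boolean indexing, the returned tuple can itself be used as an index. This sketch reuses the array from the cell above; the name `idx` is illustrative.

```python
import numpy as np

a = np.array([10, 20, 2, 40, 1, -2, 3])

# Tuple containing one array of positions: 0, 1, 3 and 6
idx = np.where(a > 2)
print(idx[0])

# Indexing with the tuple gives the same values as the boolean mask
print(a[idx])
print(a[a > 2])
```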



2. Using 1-dimensional array with `x` and `y`

In [None]:
np.where(a > 2, a, 0)

Here, `where` builds a new array with the same shape as `a`. Wherever the condition `a > 2` is true, it takes the element from `a`; wherever the condition is false, it inserts `0` at that index instead.

OK, but what if both `x` and `y` are arrays? What do you think the output would be?

In [None]:
b = np.array([-1,-2,-3,-4,-5,-6,-7])
np.where(a>2, a, b)

This returns an array whose values are from array `a` if the condition is satisfied, and from `b` if not.
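One way to read the three-argument form is as an element-by-element choice. The sketch below compares `np.where` with a plain Python list comprehension that makes the same choice; `manual` and `result` are illustrative names, and the comprehension is shown only for comparison, not as a recommended approach.

```python
import numpy as np

a = np.array([10, 20, 2, 40, 1, -2, 3])
b = np.array([-1, -2, -3, -4, -5, -6, -7])

# Pure-Python equivalent: take from a where the condition holds, else from b
manual = [int(x) if x > 2 else int(y) for x, y in zip(a, b)]

# Vectorized version
result = np.where(a > 2, a, b)

print(manual)
print(result)
```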

3. Using 2-dimensional array with no `x` or `y`

In [None]:
# Initializing a 2-D array
a = np.array([[1, 2, 3, 4, 5, 6, 3],
              [-2, 1, 2, 3, 4, 5, 2]])


print(np.where(a > 2))

Since `x` and `y` were not passed, this returns a tuple with two elements, one per dimension. Because `a` is a 2-D array, the first element of the tuple holds the row indices (first dimension) of the matching elements, and the second element holds their column indices (second dimension).
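Pairing the two index arrays position by position gives the (row, column) coordinates of each match. The following sketch uses the same 2-D array; `rows`, `cols`, and `coords` are illustrative names.

```python
import numpy as np

a = np.array([[1, 2, 3, 4, 5, 6, 3],
              [-2, 1, 2, 3, 4, 5, 2]])

rows, cols = np.where(a > 2)

# Pair row and column indices into plain (row, column) tuples
coords = [(int(r), int(c)) for r, c in zip(rows, cols)]

for r, c in coords:
    print((r, c), '->', a[r, c])
```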

4. Using 3-dimensional array with no `x` or `y`

Let's make it a little bit more clear. 




In [None]:
# Initializing a 3-D array
a = np.array([[[1, 2, 3, 4, 5, 6, 3],
              [-2, 1, 2, 3, 4, 5, 2]]])

print(a.shape)
print(np.where(a>2))

What did we change here? We simply added another dimension by adding extra square brackets (`[]`), even though that dimension has only one element.
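The pattern across scenarios 1, 3, and 4 is that the tuple has as many index arrays as the input has dimensions. A small sketch that checks this on the same row of values wrapped in one, two, and three levels of brackets (the names `row` and `lengths` are illustrative):

```python
import numpy as np

row = [1, 2, 3, 4, 5, 6, 3]
lengths = []

for arr in (np.array(row), np.array([row]), np.array([[row]])):
    result = np.where(arr > 2)
    lengths.append(len(result))
    # The tuple length matches the number of dimensions
    print(arr.shape, 'ndim =', arr.ndim, 'tuple length =', len(result))
```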

5. Using boolean array as condition

Finally, we can also specify the condition as a boolean array as follows:

In [None]:
condition = [[True, False], [True, True]]
x =  [[1, 2], [3, 4]]
y = [[9, 8], [7, 6]]

print(x)
print(y)
np.where(condition,x,y)

### Slicing
Slicing is a way to create a sub-array based on the original array. For one-dimensional arrays, slicing works much like it does for a list. To slice, we use the `:` sign to indicate a range: `array[start:stop]`.

Leaving `start` or `stop` empty will default to the beginning/end of the array.

For instance, if we put `:3` in the indexing brackets, we get elements from index 0 to index 3 (excluding index 3)

In [None]:
a = np.array([0,1,2,3,4,5])
print(a[:3])

By putting `2:4` in the bracket, we get elements from index 2 to index 4 (excluding index 4)

In [None]:
print(a[2:4])

For multi-dimensional arrays, it works similarly. Let's see an example.

In [None]:
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
a

First, if we put one argument in the brackets, for example `a[:2]`, then we get all the elements from the first row (index 0) and the second row (index 1).

In [None]:
a[:2]

If we add another argument, for example `a[:2, 1:3]`, we get the first two rows, but only the values in the second and third columns.


In [None]:
a[:2, 1:3]

So, in two-dimensional arrays, the **first argument is for selecting rows**, and the **second argument is for selecting columns**.
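A bare `:` in either position selects everything along that dimension, which gives a simple way to pull out a whole column or row. The sketch below assumes the same 3 by 4 array; `second_col` and `last_row` are illustrative names.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

# All rows, column index 1
second_col = a[:, 1]
print(second_col)

# Row index 2, all columns
last_row = a[2, :]
print(last_row)

# Mixing an integer with a slice drops that dimension
print(second_col.shape)  # (3,)
```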

## Copying Data

It is important to realize that a slice of an array is *a view into the same data*: the slice and the original array share the same underlying memory, and no data is copied. NumPy calls such an object a **view**. So modifying the sub-array will also modify the original array! Be careful when copying and modifying arrays in NumPy!

Example: `a2` is a slice of `a`

In [None]:
print(a)
a2 = a[:3,:3]
a2

Set this slice's values to zero (remember that `[:]` selects the entire array)

In [None]:
a2[:] = 0
a2

`a` has also been changed!

In [None]:
a

To avoid this, use `a.copy()` to create an independent copy; changes to the copy will not affect the original array.

In [None]:
a_copy = a.copy()
a_copy

Now when `a_copy` is modified, `a` will not be changed.

In [None]:
a_copy[:] = 10
print(a_copy)
print("\n") #prints a new line
print(a)
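If it is unclear whether two arrays share data, `np.shares_memory` can check. The sketch below is a minimal illustration on a fresh array; the names `view`, `copied`, and `masked` are illustrative. It also shows that boolean indexing, unlike slicing, returns a copy.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

view = a[:2, :2]             # slice: a view into a
copied = a[:2, :2].copy()    # explicit copy: independent data
masked = a[a > 5]            # boolean indexing: also a copy

print(np.shares_memory(a, view))    # True
print(np.shares_memory(a, copied))  # False
print(np.shares_memory(a, masked))  # False
```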

## Iterating Over Arrays

Lastly, let's learn how to iterate over arrays. 

First, let's create a new 4 by 3 array of random integers from 0 to 9.

In [None]:
test = np.random.randint(0, 10, size=(4,3))
test

We can iterate **by row** as follows:

In [None]:
for row in test:
    print(row)

We can iterate **by row index** by calling `len()` on `test`, which returns the number of rows.

In [None]:
for i in range(len(test)):
    print(test[i])

We can **combine these two ways** of iterating by using `enumerate`, which gives us the index of the row, and the row itself. 

In [None]:
for i, row in enumerate(test):
    print('row', i, 'is', row)

Finally, you can iterate **through the values of multiple arrays** using `zip()`.

In [None]:
#let's create another array
test2 = test**2
test2

In [None]:
for i, j in zip(test, test2):
    print(i,'+',j,'=',i+j)

Note that `zip()` is used to iterate over multiple iterables, so it also works with tuples and lists.

NumPy has a lot to offer, so be sure to look at the [NumPy documentation](https://numpy.org/doc/) to find out about more great features. 
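For simple arithmetic like the row sums above, the loop is mainly illustrative: NumPy can apply the same operation to whole arrays at once. The sketch below compares the two approaches. It uses `np.random.default_rng` with an arbitrary seed (an assumption made here for reproducibility, not something the notebook above uses), and `looped` and `vectorized` are illustrative names.

```python
import numpy as np

# Seeded generator so the example is reproducible (seed value is arbitrary)
rng = np.random.default_rng(0)
test = rng.integers(0, 10, size=(4, 3))
test2 = test ** 2

# Row-by-row sums collected from the zip() loop
looped = np.array([i + j for i, j in zip(test, test2)])

# The same result computed on the whole arrays in one step
vectorized = test + test2

print(np.array_equal(looped, vectorized))  # True
```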

