![logo](../../img/license_header_logo.png)
> **Copyright &copy; 2021 CertifAI Sdn. Bhd.**<br>
 <br>
This program is part of OSRFramework. You can redistribute it and/or modify
<br>it under the terms of the GNU Affero General Public License as published by
<br>the Free Software Foundation, either version 3 of the License, or
<br>(at your option) any later version.
<br>
<br>This program is distributed in the hope that it will be useful,
<br>but WITHOUT ANY WARRANTY; without even the implied warranty of
<br>MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
<br>GNU Affero General Public License for more details.
<br>
<br>You should have received a copy of the GNU Affero General Public License
<br>along with this program.  If not, see <http://www.gnu.org/licenses/>.

# 04 - Iterating, Indexing and Slicing
Authored by: [Kian Yang Lee](https://github.com/KianYang-Lee) - kianyang.lee@certifai.ai

## <a name="description">Notebook Description</a>

It is important for users to be able to index, slice and iterate through an `ndarray` to extract the relevant information. This tutorial discusses some of the common ways to do that.

By the end of this tutorial, you will be able to:

1. Perform iteration over `ndarray`
2. Perform basic indexing over `ndarray`
3. Perform basic slicing over `ndarray`
4. Explain the difference between `view` and `copy`

## Notebook Outline
Below is the outline for this tutorial:
1. [Notebook Description](#description)
2. [Notebook Configurations](#configuration)
3. [Iterating](#iterate)
4. [Basic Indexing](#indexing)
5. [Basic Slicing](#slicing)
6. [View and Copy](#view)
7. [Summary](#summary)
8. [Reference](#reference)

## <a name="configuration">Notebook Configurations</a>
This notebook relies only on the `numpy` module, a popular `Python` library for numerical computation. It is commonly imported using the alias `np`.

In [1]:
### BEGIN SOLUTION
import numpy as np
### END SOLUTION

## <a name="iterate">Iterating</a>
A `numpy` `ndarray` can be iterated over, much like a `list` object. One just needs to write a `for` loop to do so.

In [2]:
### BEGIN SOLUTION
arr = np.arange(5)
arr
### END SOLUTION

array([0, 1, 2, 3, 4])

In [3]:
# for loop to iterate ndarray
### BEGIN SOLUTION
for element in arr:
    print(element)
### END SOLUTION

0
1
2
3
4


One can also perform mathematical operations, or any other operation that is needed, while iterating.

In [4]:
### BEGIN SOLUTION
for element in arr:
    print(element + 100)
### END SOLUTION

100
101
102
103
104
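As a side note, the same arithmetic can usually be applied to the whole `ndarray` at once, without a loop. The sketch below compares the two approaches; the names `numbers`, `looped` and `vectorized` are hypothetical and used only for illustration.

```python
import numpy as np

numbers = np.arange(5)  # hypothetical name, same values as arr above

# Loop version: add 100 to each element while iterating
looped = [int(element) + 100 for element in numbers]

# Whole-array version: the addition is applied to every element at once
vectorized = numbers + 100

print(looped)      # [100, 101, 102, 103, 104]
print(vectorized)  # [100 101 102 103 104]
```

Both approaches produce the same values. The loop is still useful when each element needs custom handling.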


It is slightly different when one iterates over an n-dimensional `ndarray`. Iteration proceeds along the first axis, so each step yields one sub-array (a row, for a 2-dimensional `ndarray`), as shown below.

In [5]:
### BEGIN SOLUTION
arr_2d = np.arange(50).reshape(5, 10)
arr_2d
### END SOLUTION

array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
       [40, 41, 42, 43, 44, 45, 46, 47, 48, 49]])

In [6]:
### BEGIN SOLUTION
for row in arr_2d:
    print(row)
### END SOLUTION

[0 1 2 3 4 5 6 7 8 9]
[10 11 12 13 14 15 16 17 18 19]
[20 21 22 23 24 25 26 27 28 29]
[30 31 32 33 34 35 36 37 38 39]
[40 41 42 43 44 45 46 47 48 49]
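If every individual element is needed rather than each row, one option is a nested loop, and another is the `flat` attribute of the `ndarray`. The following is a minimal sketch on a smaller, hypothetical array named `small_2d`, chosen only to keep the output short.

```python
import numpy as np

# Hypothetical small array, used only for illustration
small_2d = np.arange(6).reshape(2, 3)

# Nested loops: the outer loop walks the first axis (rows),
# the inner loop walks the elements of each row
nested = []
for row in small_2d:
    for element in row:
        nested.append(int(element))

# ndarray.flat iterates over every element in row-major order
flat = [int(element) for element in small_2d.flat]

print(nested)  # [0, 1, 2, 3, 4, 5]
print(flat)    # [0, 1, 2, 3, 4, 5]
```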


## <a name="indexing">Basic Indexing</a>
Indexing is the process of accessing one or more elements in an iterable object. Indexing an `ndarray` is similar to indexing a native `Python` `list`. The examples below show how:

In [7]:
### BEGIN SOLUTION
arr_1 = np.arange(11)

print("The ndarray created is: \n\n")
print(arr_1)
print("\n\n The first element in the ndarray is: ")
print(arr_1[0])
print("\n\n The fifth element in the ndarray is: ")
print(arr_1[4])
### END SOLUTION

The ndarray created is: 


[ 0  1  2  3  4  5  6  7  8  9 10]


 The first element in the ndarray is: 
0


 The fifth element in the ndarray is: 
4


A few things to observe here. First, as `Python` is a zero-indexed programming language, counting starts from `0`. This means that the first element is at index `[0]`.

This brings us to the second point: square bracket notation with an integer inside accesses the element at that integer position. Refer to the example below to see how each element in the `ndarray` can be accessed by indexing the right position.

In [8]:
# create an ndarray with size 10
### BEGIN SOLUTION
arr_2 = np.arange(start=0, stop=20, step=2)
arr_2
### END SOLUTION

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])

In [9]:
### BEGIN SOLUTION
arr_2[2]
### END SOLUTION

4

In [10]:
# create a for loop to print element at each index position using indexing
### BEGIN SOLUTION
print("The ndarray that is to be indexed is: ")
print(arr_2)
print("\n\n")
for index in range(10):
    print(f"The index position now is: {index}")
    print(f"The corresponding element to the index position is: {arr_2[index]}")
### END SOLUTION

The ndarray that is to be indexed is: 
[ 0  2  4  6  8 10 12 14 16 18]



The index position now is: 0
The corresponding element to the index position is: 0
The index position now is: 1
The corresponding element to the index position is: 2
The index position now is: 2
The corresponding element to the index position is: 4
The index position now is: 3
The corresponding element to the index position is: 6
The index position now is: 4
The corresponding element to the index position is: 8
The index position now is: 5
The corresponding element to the index position is: 10
The index position now is: 6
The corresponding element to the index position is: 12
The index position now is: 7
The corresponding element to the index position is: 14
The index position now is: 8
The corresponding element to the index position is: 16
The index position now is: 9
The corresponding element to the index position is: 18
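The loop above hard-codes `range(10)`. A slightly more robust sketch, assuming the same values as `arr_2`, derives the indices from the array itself using `len` or `enumerate`. The names `evens`, `indices` and `pairs` are hypothetical.

```python
import numpy as np

evens = np.arange(start=0, stop=20, step=2)  # same values as arr_2

# len() returns the size of the first axis, so the range adapts to the array
indices = list(range(len(evens)))

# enumerate() yields (index, element) pairs directly
pairs = [(index, int(element)) for index, element in enumerate(evens)]

print(indices)    # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(pairs[:3])  # [(0, 0), (1, 2), (2, 4)]
```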


What was demonstrated above is basic indexing on a 1-dimensional `ndarray`. Indexing an n-dimensional `ndarray` works more or less the same way. Let's try to index a 2-dimensional `ndarray`. For ease of understanding, let's create an `ndarray` with values in the half-open interval `[0, 6)` and of shape `(2, 3)`.

In [11]:
### BEGIN SOLUTION
arr_3 = np.arange(6).reshape(2,3)
arr_3
### END SOLUTION

array([[0, 1, 2],
       [3, 4, 5]])

In [12]:
### BEGIN SOLUTION
arr_3.shape
### END SOLUTION

(2, 3)

Note that the first value in the tuple returned by the `shape` attribute refers to the first axis (rows) and the second value refers to the second axis (columns). This is how axes in `numpy` are ordered. With that knowledge, we can index elements from an `ndarray` of any dimension.

In [13]:
# index element 4
# second row, second column
### BEGIN SOLUTION
arr_3[1, 1] 
### END SOLUTION

4

In [14]:
# index element 2
# first row, third column
### BEGIN SOLUTION
arr_3[0, 2] 
### END SOLUTION

2

Users can also pass a single tuple instead of multiple comma-separated indices.

In [15]:
### BEGIN SOLUTION
arr_3[(0, 2)]
### END SOLUTION

2
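A few related indexing forms are worth seeing side by side. The sketch below uses a hypothetical array named `grid` with the same values as `arr_3`. It shows that a single integer index on a 2-dimensional `ndarray` selects a whole row, and that negative indices count from the end of each axis.

```python
import numpy as np

grid = np.arange(6).reshape(2, 3)  # hypothetical name, same values as arr_3

second_row = grid[1]         # one index on a 2-D array selects a whole row
last_element = grid[-1, -1]  # negative indices count from the end of each axis
chained = grid[1][1]         # indexes the row first, then the element in it

print(second_row)    # [3 4 5]
print(last_element)  # 5
print(chained)       # 4
```

`grid[1][1]` and `grid[1, 1]` return the same element here. The comma form does the lookup in a single indexing step.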

## <a name="slicing">Basic Slicing</a>
Slicing refers to the process of extracting a chosen portion of the `ndarray`. It uses bracket notation with a colon to specify the range of elements to extract. For example:

In [16]:
### BEGIN SOLUTION
arr_4 = np.arange(11)
arr_4
### END SOLUTION

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [17]:
# return a slice from beginning to end (entire ndarray)
### BEGIN SOLUTION
arr_4[:] 
### END SOLUTION

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [18]:
# return a slice from second element to last element 
### BEGIN SOLUTION
arr_4[1:] 
### END SOLUTION

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [19]:
# return a slice from beginning to second last element 
### BEGIN SOLUTION
arr_4[:-1] 
### END SOLUTION

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

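Note that a negative index counts from the end, so `-1` refers to the last element. Since the stop position of a slice is exclusive, `arr_4[:-1]` returns everything except the last element. For this 11-element `ndarray`, it is equivalent to the following: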

In [20]:
# return a slice from beginning to second last element 
### BEGIN SOLUTION
arr_4[:10] 
### END SOLUTION

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [21]:
# assert ndarray equality
### BEGIN SOLUTION
np.array_equal(arr_4[:-1], arr_4[:10]) 
### END SOLUTION

True

One thing to note is that a slice excludes the element at the position given to the right of the colon. The number to the left of the colon indicates where the slice starts (inclusive), and the number to the right indicates where the slice ends (exclusive).

Can you guess what `arr_4[3:6]` will return?

In [22]:
### BEGIN SOLUTION
arr_4[3:6]
### END SOLUTION

array([3, 4, 5])
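Slices also accept an optional third value, the step, written as `start:stop:step`. It is not used elsewhere in this notebook, so the following is only a brief sketch with a hypothetical array named `values`.

```python
import numpy as np

values = np.arange(11)  # hypothetical name, same values as arr_4

every_second = values[::2]      # every second element, from the beginning
middle_step = values[1:8:3]     # positions 1, 4 and 7
reversed_values = values[::-1]  # a negative step walks the array backwards

print(every_second)     # [ 0  2  4  6  8 10]
print(middle_step)      # [1 4 7]
print(reversed_values)  # [10  9  8  7  6  5  4  3  2  1  0]
```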

Slicing an n-dimensional `ndarray` works more or less the same way. We will use a new 2-dimensional `ndarray` to demonstrate this.

In [23]:
### BEGIN SOLUTION
arr_5 = np.arange(20).reshape(4, 5)
### END SOLUTION

In [24]:
# return the entire ndarray
### BEGIN SOLUTION
arr_5[:] 
### END SOLUTION

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])

In [25]:
# return a slice containing all the rows and first column
### BEGIN SOLUTION
arr_5[:, 0] 
### END SOLUTION

array([ 0,  5, 10, 15])

In [26]:
# return a slice containing only the first row and all the columns
### BEGIN SOLUTION
arr_5[0, :] 
### END SOLUTION

array([0, 1, 2, 3, 4])

When dealing with an n-dimensional `ndarray`, we use commas to separate the slices for each axis. Recall that for a 2-dimensional `ndarray`, the first axis refers to the rows and the second axis refers to the columns.

Can you guess what `arr_5[1:2, 3:4]` will return?

In [27]:
### BEGIN SOLUTION
arr_5[1:2, 3:4]
### END SOLUTION

array([[8]])
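The result above is a 2-dimensional `ndarray` holding a single element, not a plain number. The sketch below compares the shapes produced by integer indices and by slices. It uses a hypothetical array named `table` with the same values as `arr_5`.

```python
import numpy as np

table = np.arange(20).reshape(4, 5)  # hypothetical name, same values as arr_5

scalar = table[1, 3]       # integer indices drop both axes: a single value
block = table[1:2, 3:4]    # slices keep both axes: shape (1, 1)
column_1d = table[:, 0]    # integer on the second axis drops it: shape (4,)
column_2d = table[:, 0:1]  # slice on the second axis keeps it: shape (4, 1)

print(scalar)           # 8
print(block.shape)      # (1, 1)
print(column_1d.shape)  # (4,)
print(column_2d.shape)  # (4, 1)
```

In short, each integer index removes one axis from the result, while each slice keeps its axis even when it selects only one element.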

## <a name="view">View and Copy</a>
It is time to talk about the difference between a `view` and a `copy`. Slicing an `ndarray` returns a view of the original `ndarray`, which means the data is shared rather than copied. Any modification made to the view will affect the source `ndarray`. The example below illustrates this.

In [28]:
# create a new ndarray named arr_ori
### BEGIN SOLUTION
arr_ori = np.arange(8)
arr_ori
### END SOLUTION

array([0, 1, 2, 3, 4, 5, 6, 7])

In [29]:
# slice arr_ori and return a view of it, storing it in variable arr_slice
### BEGIN SOLUTION
arr_slice = arr_ori[2:]
arr_slice
### END SOLUTION

array([2, 3, 4, 5, 6, 7])

In [30]:
# assigning value of 1000 to arr_slice ndarray
### BEGIN SOLUTION
arr_slice[:] = 1000
### END SOLUTION

In [31]:
# display arr_slice ndarray
### BEGIN SOLUTION
arr_slice
### END SOLUTION

array([1000, 1000, 1000, 1000, 1000, 1000])

In [32]:
# display arr_ori ndarray
### BEGIN SOLUTION
arr_ori
### END SOLUTION

array([   0,    1, 1000, 1000, 1000, 1000, 1000, 1000])

We see that elements in `arr_ori` have changed to `1000` even though we never assigned to `arr_ori` directly. This is because `arr_slice` is just a `view` of the original array, so modifications made to `arr_slice` are reflected in `arr_ori`.

To avoid this complication, one can use the `copy` method, which generates a new `copy` instead of a `view` of the original `ndarray`. This is demonstrated below:

In [33]:
# create a new ndarray named arr_ori
### BEGIN SOLUTION
arr_ori = np.arange(8)
arr_ori
### END SOLUTION

array([0, 1, 2, 3, 4, 5, 6, 7])

In [34]:
# slice arr_ori and return a copy of it, storing it in variable arr_slice
### BEGIN SOLUTION
arr_slice = arr_ori[2:].copy()
arr_slice
### END SOLUTION

array([2, 3, 4, 5, 6, 7])

In [35]:
# assigning value of 1000 to arr_slice ndarray
### BEGIN SOLUTION
arr_slice[:] = 1000
### END SOLUTION

In [36]:
# display arr_slice ndarray
### BEGIN SOLUTION
arr_slice
### END SOLUTION

array([1000, 1000, 1000, 1000, 1000, 1000])

In [37]:
# display arr_ori ndarray
### BEGIN SOLUTION
arr_ori
### END SOLUTION

array([0, 1, 2, 3, 4, 5, 6, 7])

We see that `arr_ori` is not affected by changes made to `arr_slice`.
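It is also possible to check whether two arrays share data without modifying either of them. The sketch below is one way to do so, using `np.shares_memory` and the `base` attribute. The names `source`, `view_slice` and `copied_slice` are hypothetical.

```python
import numpy as np

source = np.arange(8)             # hypothetical name, same values as arr_ori
view_slice = source[2:]           # basic slicing returns a view
copied_slice = source[2:].copy()  # copy() allocates new, independent data

# np.shares_memory reports whether two arrays overlap in memory
print(np.shares_memory(source, view_slice))    # True
print(np.shares_memory(source, copied_slice))  # False

# base refers to the array a view was derived from; it is None for a copy
print(view_slice.base is source)  # True
print(copied_slice.base is None)  # True
```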

##  <a name="summary">Summary</a>
To conclude, you should now be able to:
1. Perform iteration over `ndarray`
2. Perform basic indexing over `ndarray`
3. Perform basic slicing over `ndarray`
4. Explain the difference between `view` and `copy`<br><br>
Congratulations, that concludes this lesson.    

## <a name="reference">Reference</a>
* [Chapter 4. NumPy Basics: Arrays and Vectorized Computation](https://www.oreilly.com/library/view/python-for-data/9781449323592/ch04.html)
* [NumPy Quickstart](https://numpy.org/doc/stable/user/quickstart.html)