<div style="text-align:left;font-size:2em"><span style="font-weight:bolder;font-size:1.25em">SP2273 | Learning Portfolio</span><br><br><span style="font-weight:bold;color:darkred">Storing Data (Good)</span></div>

# What to expect in this chapter

# Subsetting: Indexing and Slicing

**Subsetting** is the act of selecting elements from a list or an array. There are two types of subsetting: **indexing** (selecting one element, demonstrated using `[]` in previous parts) and **slicing** (selecting a range of elements).

## Lists & Arrays in 1D | Subsetting & Indexing

The syntax for slicing is similar to indexing.

For indexing a list `list1`, the syntax is `list1[x]`, where `x` is the index number of the element (this applies to arrays as well).

For slicing, the syntax for the same list is `list1[x:y:z]`, where `x` is the index of the element to start with, `y` is the index of the element just after the last element to be included, and `z` is the step (e.g. if `z = 2`, then the sliced list will include every other element within the range from `x` to `y`). See below for more clarification (this also applies to arrays).

For `list1` with the elements `[a1, a2, a3, a4, a5, a6]`,

|**Syntax**|**Output**|**Explanation**|**Note**|
|:--|:--|:--|:--|
|`list1[2:5]`|`[a3, a4, a5]`|Sliced array starts with element with index 2, ends with element with index 4|Gives $5-2=3$ elements|
|`list1[0:4]`|`[a1, a2, a3, a4]`|Sliced list starts with element with index 0, ends with element with index 3|Gives $4-0=4$ elements|
|`list1[0:4:2]`|`[a1, a3]`|Sliced list covers index 0 up to index 3, with step of 2|Gives every other element of the $4-0=4$ elements in range|
|`list1[3:]`|`[a4, a5, a6]`|Sliced array starts with element with index 3, and includes all elements after it|Gives $6-3=3$ elements|
|`list1[:3]`|`[a1, a2, a3]`|Sliced array starts with element with index 0, and includes all elements after it up till the element with index 2|Gives $3-0=3$ elements|
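|`list1[4:1:-1]`|`[a5, a4, a3]`|Same elements as the first example, but reversed|With a negative step, the start must come after the stop; `list1[2:5:-1]` gives `[]`|
|`list1[2::-2]`|`[a3, a1]`|Same elements as the third example, but reversed|Starts at index 2 and steps backwards by 2 until the start of the list; `list1[0:4:-2]` gives `[]`|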
|`list1[::-1]`|`[a6, a5, a4, ..., a1]`|Reverses the list|Gives $6-0=6$ elements|

Notice that, with the default step of 1 and `0 <= i <= j <= len(list1)`, the number of elements given by `list1[i:j]` is always `j - i`; the slice starts with the element with index `i` and ends with the element with index `j - 1`.
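The slicing rules above can be checked with a short sketch. The strings `"a1"` to `"a6"` are stand-ins assumed here for the symbolic elements of `list1`; they are not part of the original notes.

```python
# Stand-in for list1: strings replace the symbolic elements a1..a6 (assumption).
list1 = ["a1", "a2", "a3", "a4", "a5", "a6"]

forward = list1[2:5]         # indices 2, 3, 4
every_other = list1[0:4:2]   # indices 0, 2
backward = list1[4:1:-1]     # indices 4, 3, 2 (start must come after stop)
empty = list1[2:5:-1]        # start is before stop with a negative step: nothing selected
reversed_all = list1[::-1]   # the whole list, reversed

print(forward)        # ['a3', 'a4', 'a5']
print(every_other)    # ['a1', 'a3']
print(backward)       # ['a5', 'a4', 'a3']
print(empty)          # []
print(len(list1[1:4]) == 4 - 1)   # True: j - i elements when the step is 1
```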

<p></p>

## Arrays only | Subsetting by masking

With arrays **only**, we can pass in a boolean array to control which elements are included in the resulting array; this is called **masking**. See below for an example.

In [4]:
import numpy as np

array_mask = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
for_masking = np.array([True, False, True, True, False, False, True, True, True, False])  

masked_array = array_mask[for_masking]           
print(masked_array)

[1 3 4 7 8 9]


In the example above, the only elements of `array_mask` included in `masked_array` are those at positions where `for_masking` holds the value `True`.

So, for example, in `array_mask` the index number of the element `1` is 0. In `for_masking`, the index 0 corresponds to the element `True`. This means that in `masked_array`, there will be the element `1`. However, for element `10`, the index number 9 corresponds to the element `False` in `for_masking`. Hence, `10` is not included in `masked_array`.

<p>
</p>

Below are more (and more complex) examples of masking.

**Example 1**

In [6]:
mask1 = array_mask > 3

print(array_mask[mask1])     #Prints all elements more than 3

[ 4  5  6  7  8  9 10]


<p></p>

**Example 2**

We can use the bitwise operator `~` to switch all the `True` elements to `False` and all the `False` elements to `True`; essentially, `~` acts as an element-wise `not` on a boolean array.

In [7]:
print(array_mask[~mask1])    #Prints all elements NOT more than 3

[1 2 3]


<p></p>

**Example 3**

We can combine two masking arrays using `&` (i.e. the `and` operator).

In [8]:
mask2 = array_mask < 8

print(array_mask[mask1 & mask2])   #Prints all elements more than 3 and less than 8

[4 5 6 7]


<p></p>

**Example 4**

We can also use the `or` operator using the symbol `|`.

In [9]:
mask3 = array_mask < 2

print(array_mask[mask1 | mask3])   #Prints all elements that are either less than 2 or more than 3

[ 1  4  5  6  7  8  9 10]
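The masks in Examples 3 and 4 do not have to be stored in variables first. A minimal sketch of writing the conditions inline, assuming the same `array_mask` values as above; the names `between` and `outside` are introduced here for illustration only:

```python
import numpy as np

array_mask = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Parentheses are required: & and | bind more tightly than > and <.
between = array_mask[(array_mask > 3) & (array_mask < 8)]
outside = array_mask[(array_mask < 2) | (array_mask > 3)]

print(between)   # [4 5 6 7]
print(outside)   # [ 1  4  5  6  7  8  9 10]
```

Leaving out the parentheses changes how Python groups the expression, so keeping each comparison in its own brackets is the safer habit.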


<p></p>

## Lists & Arrays in 2D | Indexing & Slicing

Higher-dimensional lists and arrays are indexed and sliced differently. Consider the list and array below.

In [10]:
py_list_2d = [[1, "A"], [2, "B"], [3, "C"], [4, "D"],
              [5, "E"], [6, "F"], [7, "G"], [8, "H"],
              [9, "I"], [10, "J"]]

np_array_2d = np.array(py_list_2d)

<p></p>

**Example 1**

Indexing a whole row (a 1D list or 1D array) works the same way for both a 2D list and a 2D array. However, the syntax for extracting a single element differs between the two.

For example, if we want to extract `'G'` from the list and array:

In [12]:
print(py_list_2d[6][1])        #Extracts the element with index 1 in the 1D list with index 6 in the 2D list
print(np_array_2d[6, 1])       #Does the same thing for the array, but notice that only one [] is used

G
G


<p></p>

**Example 2**

Consider the slicing below.

In [13]:
py_list_2d[:3][0]

[1, 'A']

One might expect this to extract the element with index 0 from each of the 1D lists in `py_list_2d[:3]`. However, this is not the case. `py_list_2d[:3]` is itself a list (of three 1D lists), so `[0]` simply picks the first of those 1D lists.

With an array, the output is as expected. See below.

In [16]:
np_array_2d[:3, 0]

array(['1', '2', '3'], dtype='<U11')
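For a nested list, one common way to get the same "column" is a list comprehension. This is a sketch of that alternative rather than something from the original notes; the shortened list and the name `first_column` are assumptions made for illustration.

```python
# Shortened version of py_list_2d, enough to show the idea.
py_list_2d = [[1, "A"], [2, "B"], [3, "C"], [4, "D"]]

# Take the element with index 0 from each of the first three 1D lists.
first_column = [row[0] for row in py_list_2d[:3]]

print(first_column)   # [1, 2, 3]
```

Unlike the array version, the numbers stay as integers here because a list does not convert its elements to a common type.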

<p></p>

**Example 3**

This example is not about a difference in syntax between lists and arrays. To extract the elements at a certain index from all of the 1D arrays (rows) in a 2D array, we do it like this:

In [17]:
np_array_2d[:, 0]      #Just using : means that we include all of the rows (1D arrays)

array(['1', '2', '3', '4', '5', '6', '7', '8', '9', '10'], dtype='<U11')
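Notice that the numbers in the output are strings (`'1'`, `'2'`, ...). An array holds a single data type, so mixing integers and letters turns everything into text. A minimal sketch of converting the column back to integers with `.astype(int)`, using a shortened array as an assumption:

```python
import numpy as np

# Shortened version of np_array_2d; mixing ints and strings gives a string array.
np_array_2d = np.array([[1, "A"], [2, "B"], [3, "C"]])

numbers_as_text = np_array_2d[:, 0]
numbers = numbers_as_text.astype(int)   # convert the text back to integers

print(numbers_as_text.dtype.kind)   # U  (unicode text)
print(numbers)                      # [1 2 3]
print(numbers.sum())                # 6
```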

<p></p>

## Growing lists

Lists can be more useful than arrays when the size of the list changes (changing the size of arrays is complicated).

The examples below show ways to change the size of a list.

<p></p>

**Example 1**

In [18]:
list_grow1 = [1, 2]
grown_list = list_grow1 * 5

print(grown_list)

[1, 2, 1, 2, 1, 2, 1, 2, 1, 2]


<p></p>

**Example 2**

We can grow the lists one element at a time.

In [23]:
x = [1]                 #x = [1]
x = x + [2]             #x = [1, 2]
x = x + [3]             #x = [1, 2, 3]
x = x + [4]             #x = [1, 2, 3, 4]

print(x)                #You can use the += shortcut too to make the code look cleaner

[1, 2, 3, 4]


<p></p>

We can use the `.append` method to add elements to a list too.

In [25]:
y = [1]                 #y = [1]
y.append(2)             #y = [1, 2]
y.append(3)             #y = [1, 2, 3]
y.append(4)             #y = [1, 2, 3, 4]

print(y)                #Appending is generally faster than building a new list with +

[1, 2, 3, 4]
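One likely reason for the speed difference is that `x = x + [2]` builds a brand-new list each time, while `.append` changes the existing list in place. The sketch below illustrates this; the names `alias_x` and `alias_y` are introduced here purely to keep a second reference to the starting lists.

```python
x = [1]
alias_x = x            # second name for the same list
x = x + [2]            # builds a new list and rebinds x to it

print(alias_x)         # [1]      (the starting list is untouched)
print(x is alias_x)    # False    (x now refers to a different list)

y = [1]
alias_y = y
y.append(2)            # changes the existing list in place

print(alias_y)         # [1, 2]   (same list, so the change is visible)
print(y is alias_y)    # True
```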


<p></p>

**Example 3**

We can grow the lists with multiple elements.

In [27]:
z = [1, 2, 3]
z += [4, 5, 6]

print(z)

[1, 2, 3, 4, 5, 6]


<p></p>

We can use the `.extend` method too.

In [30]:
xy = [1, 2, 3]
xy.extend([4, 5, 6])

print(xy)

[1, 2, 3, 4, 5, 6]


<p></p>

We cannot use the `.append` method here because it adds the whole list as a single element inside the original list (making a nested list). `.extend` works because it adds each element of the new list to the original list individually.

In [31]:
yz = [1, 2, 3]
yz.append([4, 5, 6])      #Adds the list [4, 5, 6] as a single element, creating a nested list

print(yz)

[1, 2, 3, [4, 5, 6]]


# Some loose ends

## Tuples

Tuples are like lists, but they are created using `()` and are immutable (meaning they cannot be changed after being created). Tuples can be preferable to lists and arrays when the elements are not supposed to change (for instance, when an accidental change would cause errors that are hard to trace).

Tuples can still be indexed and sliced, and the syntax is the same as lists. However, we cannot add, delete or substitute any elements into the tuple once it is created.

In [32]:
tuple1 = (1, 2, 3)

print(tuple1[0])      #Same syntax to index
print(tuple1[:2])     #Same syntax to slice

1
(1, 2)


In [34]:
tuple1[0] = 3     #This yields an error because a tuple is immutable

TypeError: 'tuple' object does not support item assignment

In [36]:
tuple1 += (4)     #This yields an error because (4) is just the int 4, not a tuple; a one-element tuple is written (4,)

TypeError: can only concatenate tuple (not "int") to tuple
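The error message above is about the type of `(4)`, which is an `int`. A one-element tuple needs a trailing comma, and `+=` with a tuple then works by creating a new tuple rather than changing the old one. A short sketch, where the name `original` is introduced only to keep hold of the first tuple:

```python
tuple1 = (1, 2, 3)
original = tuple1       # keep a second name for the first tuple

print(type((4)))        # <class 'int'>
print(type((4,)))       # <class 'tuple'>

tuple1 += (4,)          # creates a NEW tuple and rebinds the name tuple1

print(tuple1)           # (1, 2, 3, 4)
print(original)         # (1, 2, 3)   (the first tuple itself never changed)
```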

<p></p>

## Be VERY careful when copying

Always use the `.copy()` method to copy a list or an array; do not use `=`.

This is because `=` does not copy anything; it only gives the same list or array a second name. Both names refer to the same object, so any change made through the "copy" **will** change the original list or array.


In [37]:
copying_list = [1, 2, 3]
copy1 = copying_list

copy1[0] = 3               #Change the element 1 to 3

print(copying_list)        #Original list changes as well
print(copy1)

[3, 2, 3]
[3, 2, 3]


<p></p>

If we use `.copy()`, we create a new list or array that is independent of the original list or array. So, any changes made to the elements of the copy **will not** change the original list or array.

In [39]:
copying_list_again = [1, 2, 3]
copy2 = copying_list_again.copy()

copy2[0] = 3

print(copying_list_again)    #Original list stays unchanged
print(copy2)

[1, 2, 3]
[3, 2, 3]
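One caveat worth noting: for a list, `.copy()` makes a shallow copy, so the inner lists of a nested (2D) list are still shared with the original. The sketch below uses `copy.deepcopy` from the standard library as one way around this; the variable names are assumptions made for illustration.

```python
import copy

nested = [[1, 2], [3, 4]]

shallow = nested.copy()          # new outer list, but the same inner lists
deep = copy.deepcopy(nested)     # new outer list AND new inner lists

shallow[0][0] = 99               # change an inner list through the shallow copy

print(nested)    # [[99, 2], [3, 4]]   (original changes as well)
print(deep)      # [[1, 2], [3, 4]]    (deep copy stays unchanged)
```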
