<div style="text-align:left;font-size:2em"><span style="font-weight:bolder;font-size:1.25em">SP2273 | Learning Portfolio</span><br><br><span style="font-weight:bold;color:darkred">Storing Data (Good)</span></div>

# What to expect in this chapter

# 1 Subsetting: Indexing and Slicing

- We can select a portion (a subset) of the data either by selecting a single element (**indexing**, since we access the element via its index) or by selecting a range of elements (**slicing**).

Note:
- **Subsetting** means to ‘select’.
- **Indexing** refers to selecting one element.
- **Slicing** refers to selecting a range of elements.

## 1.1 Lists & Arrays in 1D | Subsetting & Indexing

**Indexing** works the same way for both `list` and `array`: we can access an element via its index regardless of whether the data is of type `list` or `array`.

Remember:
- If you slice with `[i:j]`, the slice will start at index `i` and end at index `j-1`, giving you a total of `j-i` elements.

![](https://www.askpython.com/wp-content/uploads/2020/03/String-Slicing-in-Python-1024x768.png.webp)
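As a minimal sketch of the rule above, the following compares indexing and slicing on a 1D list and a 1D array. The variable names and values are made up for illustration and are not part of the original notes.

```python
import numpy as np

# Illustrative data (hypothetical values)
py_list = [10, 20, 30, 40, 50]
np_array = np.array(py_list)

# Indexing: select a single element by its index
first_list = py_list[0]
first_array = np_array[0]

# Slicing with [1:4]: starts at index 1, ends at index 3, so 4 - 1 = 3 elements
slice_list = py_list[1:4]
slice_array = np_array[1:4]

print(first_list, first_array)
print(slice_list, slice_array)
```

The same bracket syntax works for both types in 1D; only the type of the returned slice differs (a `list` versus an `array`).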

## 1.2 Arrays only | Subsetting by masking

Masking uses a condition to build a Boolean array (the mask) from an array. Each entry of the mask is `True` if the corresponding element satisfies the condition and `False` otherwise.

In [7]:
import numpy as np
np_array = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
my_mask = np_array > 3
my_mask

array([False, False, False,  True,  True,  True,  True,  True,  True,
        True])

In [8]:
np_array = np_array[my_mask]
# Only the elements where my_mask is True are kept (note that this overwrites np_array)

## `~` Symbol 

We can invert our mask by using `~`, the bitwise NOT operator. Applied to a Boolean mask, it turns every `True` into `False` and vice versa.

In [10]:
np_array = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
np_array[~(np_array > 3)]                 # '~' means 'NOT'
#only returns elements that are not greater than 3

array([1, 2, 3])

## `&` Symbol

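We can combine multiple conditions in a mask by using the operator `&`. Each condition must be wrapped in parentheses, because `&` binds more tightly than comparison operators such as `>` and `<`.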

In [11]:
np_array[(np_array > 3) & (np_array < 8)] # '&' means 'AND'
#only return elements that are greater than 3 and smaller than 8

array([4, 5, 6, 7])

## `|` Symbol

We can also select the elements that satisfy at least one of several conditions by using the operator `|`

In [12]:
#only return elements that are smaller than 3 or greater than 8
np_array[(np_array < 3) | (np_array > 8)] # '|' means 'OR'

array([ 1,  2,  9, 10])

**However, masking only works for arrays, not for lists**
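To illustrate why masking is limited to arrays, here is a small sketch showing that comparing a list with a number raises a `TypeError`. A list comprehension is one common alternative for lists; it is offered here as a suggestion, not as part of the original notes, and the variable names are made up.

```python
py_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# A list cannot be compared element-wise with a number, so no mask can be built
try:
    py_list > 3
    list_supports_mask = True
except TypeError:
    list_supports_mask = False

# One possible alternative for lists: filter with a list comprehension
filtered = [x for x in py_list if x > 3]

print(list_supports_mask)
print(filtered)
```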

## 1.3 Lists & Arrays in 2D | Indexing & Slicing

### Not nested element

Accessing elements that are not nested works the same way for both lists and arrays: use square brackets containing an integer, `[<int>]`

In [13]:
py_list_2d = [[1, "A"], [2, "B"], [3, "C"], [4, "D"],
              [5, "E"], [6, "F"], [7, "G"], [8, "H"],
              [9, "I"], [10, "J"]]

np_array_2d = np.array(py_list_2d)

In [14]:
py_list_2d[3]

[4, 'D']

In [15]:
np_array_2d[3]

array(['4', 'D'], dtype='<U21')

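Both return the same row. Note, however, that the array has converted every element to a string (`'4'` instead of `4`), because a NumPy array holds a single data type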

### Nested element (2D/3D and above)

#### List

To access a nested element in a nested list (2D/3D), we can have multiple square brackets `[]`. The order of the brackets will determine the subsequent layers of elements we want to access

In [16]:
py_list_2d[3][0]
# there are 2 brackets [3] and [0]. The first bracket [3] means 'going to the 1st layer and take element of index 3';
# the second bracket [0] means 'from there, take the element of index 0'

4

#### Array

To access a nested element in an array, we use a single pair of square brackets containing a comma-separated series of integers. The position of each integer indicates the layer, and its value indicates the index within that layer

In [17]:
np_array_2d[3, 0]

'4'

### Other miscellaneous

In [23]:
py_list_2d[:3][0]
# this is equivalent to:
#1. get the first 3 elements of py_list_2d. Let A be a list containing those 3 elements
#2. then, take the first element of A
A = py_list_2d[:3]
print('First 3 elements: A = ', A)
print('Now, take first element of A: ', A[0])
print('py_list_2d[:3][0] is equivalent to: ', py_list_2d[:3][0])

First 3 elements: A =  [[1, 'A'], [2, 'B'], [3, 'C']]
Now, take first element of A:  [1, 'A']
py_list_2d[:3][0] is equivalent to:  [1, 'A']


### Documentation for Slicing in Numpy array 

Reference: [Numpy Array slicing](https://www.w3schools.com/python/numpy/numpy_array_slicing.asp)

![](https://miro.medium.com/v2/resize:fit:1400/1*W9DYV8pLr8AKVXJefwJBng.png)

In [None]:
np_array_2d[:3, 0]
# Here, the first part ':3' selects the first 3 rows.
# The second part '0' then takes the first element (column 0) of each of those 3 rows.
# Result: array(['1', '2', '3'], dtype='<U21')
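For comparison, a short sketch of the same selection on an array and on a nested list. The list version uses a list comprehension, which is an assumption about how one might do it and is not taken from the original notes. The data is a shortened, illustrative version of `py_list_2d`.

```python
import numpy as np

# Shortened illustrative data (hypothetical, modelled on py_list_2d)
rows = [[1, "A"], [2, "B"], [3, "C"], [4, "D"]]
arr = np.array(rows)

# Array: first 3 rows, then column 0 of each of those rows
first_col_array = arr[:3, 0]

# List: no [rows, columns] syntax, so loop over the first 3 rows instead
first_col_list = [row[0] for row in rows[:3]]

print(first_col_array)
print(first_col_list)
```

The array returns strings because mixing numbers and letters forces a single string data type, while the list keeps the original integers.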

## Some common slicing syntax

![](https://prepinstadotcom.s3.ap-south-1.amazonaws.com/wp-content/uploads/2020/07/Slicing-in-python.webp)

## 1.4 Growing lists

Lists are easy and efficient to grow by appending, whereas arrays are not (arrays are instead better suited to numerical operations)
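A minimal sketch of this difference, using made-up values: a list grows in place, while `np.append` returns a new array and leaves the original untouched.

```python
import numpy as np

# A list grows in place
py_list = [1, 2, 3]
py_list.append(4)

# np.append does not modify the array; it builds and returns a new one
np_array = np.array([1, 2, 3])
new_array = np.append(np_array, 4)

print(py_list)
print(np_array)
print(new_array)
```

Because every `np.append` call copies the data into a new array, growing an array element by element tends to be slower than growing a list.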

We can use `append()` or `extend()` to grow a list

### Adding list: `.append()`

`.append()` adds its argument as a single element to the end of a list. The length of the list increases by one.

In [None]:
list1 = [1, 2, 3]

# Appending a single element
list1.append(4)
print(list1)  # Output: [1, 2, 3, 4]

# Appending another list as a single element
list1.append([5, 6])
print(list1)  # Output: [1, 2, 3, 4, [5, 6]]


### Adding list: `.extend()`

`.extend()` adds each element of its argument (any iterable) to the end of the list. The length of the list increases by the number of elements in that iterable.

In [27]:
list2 = [1, 2, 3]

# Extending with another list
list2.extend([4, 5])
print(list2)  # Output: [1, 2, 3, 4, 5]

# Extending with a tuple
list2.extend((6, 7))
print(list2)  # Output: [1, 2, 3, 4, 5, 6, 7]

[1, 2, 3, 4, 5]
[1, 2, 3, 4, 5, 6, 7]


In this example, `extend([4, 5])` adds each element of `[4, 5]` to `list2`, and `extend((6, 7))` does the same with the tuple `(6, 7)`.

# Some loose ends

## 1.5 Tuples

Tuples are denoted by `()` and are immutable. Hence, we cannot change their elements once they are created.

In [37]:
a = (1, 2, 3)
# a[0] = 10
# a
# Uncommenting the assignment above raises a TypeError, because a tuple cannot be modified in place
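To see the error without stopping the notebook, the assignment can be wrapped in `try`/`except`. This is a small sketch with illustrative names, not part of the original notes.

```python
a = (1, 2, 3)

# Attempt to change an element; tuples are immutable, so this raises TypeError
try:
    a[0] = 10
    error_message = None
except TypeError as err:
    error_message = str(err)

print(error_message)
print(a)  # the tuple is unchanged
```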

Since a tuple cannot be modified in place, adding to it means building a new tuple by concatenation and reassigning the name:

In [38]:
a = a + (4,)
a

(1, 2, 3, 4)

To apply an operation to every element of a tuple, we iterate through it and build a new tuple from the results (here reassigned to the same name)

In [39]:
a = tuple(i**2 for i in a)
print(a)

(1, 4, 9, 16)


## 1.6 Be VERY careful when copying

In [40]:
x=[1, 2, 3]
y=x           # DON'T do this!
z=x           # DON'T do this!

Assigning like this means `x`, `y`, and `z` all refer to the same address, which holds the list `[1,2,3]`. Hence, when you modify this list through any one of `x`, `y`, or `z`, the change shows up through the other two as well

In [43]:
x[0] = 10
print(f'x: {x}, y: {y}, z: {z}')
#we are only changing x but y and z are changed too since they are the same thing

x: [10, 2, 3], y: [10, 2, 3], z: [10, 2, 3]


Hence, we must use `.copy()` to create a new list at a different address that holds the same values `[1,2,3]`. `.copy()` makes a shallow copy only, so for a nested list we can use `deepcopy()` from the `copy` module

In [44]:
x=[1, 2, 3]
y=x.copy()
z=x.copy()
x[0]='changed'
print(f'x: {x}, y: {y}, z: {z}')
#here, y and z are not changed

x: ['changed', 2, 3], y: [1, 2, 3], z: [1, 2, 3]
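For nested lists, the sketch below shows why `.copy()` alone is not enough and how `copy.deepcopy()` differs. The nested list and variable names are made up for illustration.

```python
import copy

x = [[1, 2], [3, 4]]

shallow = x.copy()       # new outer list, but the inner lists are still shared
deep = copy.deepcopy(x)  # new outer list and new inner lists

x[0][0] = 'changed'

print(f'x: {x}')
print(f'shallow: {shallow}')  # inner list is shared, so this shows the change
print(f'deep: {deep}')        # fully independent, so this is unchanged
```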


# Exercises & Self-Assessment

## Footnotes