<div style="text-align:left;font-size:2em"><span style="font-weight:bolder;font-size:1.25em">SP2273 | Learning Portfolio</span><br><br><span style="font-weight:bold;color:darkred">Storing Data (Good)</span></div>

# What to expect in this chapter

Now let's look more closely at how to access and modify these structures. This matters because much of programming comes down to accessing and manipulating data. Along the way, you will also get a better feel for how lists, NumPy arrays, and dictionaries differ and where they are alike.

# 1 Subsetting: Indexing and Slicing

Selecting a subset of the data in a list (or array) is called subsetting. Picking a single element is called indexing (you already know how to do this from the previous chapter). Selecting a range of elements is called slicing.

- Subsetting means to ‘select’.
- Indexing refers to selecting one element.
- Slicing refers to selecting a range of elements.

## 1.1 Lists & Arrays in 1D | Subsetting & Indexing

Since slicing gives us a range of elements, we must specify two indices to indicate where to start and end. The various syntaxes for these are shown in the table below.

The following applies to both lists and arrays.

In [1]:
import numpy as np

In [2]:
py_list=["a1", "b2", "c3", "d4", "e5",
         "f6", "g7", "h8", "i9", "j10"]
np_array=np.array(py_list)

# Pick one
x = py_list  # OR
x = np_array

|Syntax|Result|Note|
|:---|:---|:---|
|x[0]|First element: 'a1'||
|x[-1]|Last element: 'j10'||
|x[0:3]|Index 0 to 2: ['a1','b2','c3']|Gives $3-0=3$ elements|
|x[1:6]|Index 1 to 5: ['b2','c3','d4','e5','f6']|Gives $6-1=5$ elements|
|x[1:6:2]|Index 1 to 5 in steps of 2: ['b2','d4','f6']|Gives every other one of the $6-1=5$ elements|
|x[5:]|Index 5 to the end: ['f6','g7','h8','i9','j10']|Gives len(x)$-5=5$ elements|
|x[:5]|Index 0 to 4: ['a1','b2','c3','d4','e5']|Gives $5-0=5$ elements|
|x[5:2:-1]|Index 5 down to 3 (i.e., in reverse): ['f6','e5','d4']|Gives $5-2=3$ elements|
|x[::-1]|Reverses the list: ['j10','i9','h8',...,'b2','a1']||

Remember if you slice with [i:j], the slice will start at i and end at j-1, giving you a total of j-i elements.
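The j-i rule can be checked directly. This is a minimal sketch that reuses the list and array from above; the names i, j, list_slice and array_slice are my own and only there for illustration.

```python
import numpy as np

py_list = ["a1", "b2", "c3", "d4", "e5",
           "f6", "g7", "h8", "i9", "j10"]
np_array = np.array(py_list)

i, j = 1, 6
list_slice = py_list[i:j]          # elements at index 1, 2, 3, 4, 5
array_slice = np_array[i:j]        # same indices, but the result is an array

print(list_slice)                  # ['b2', 'c3', 'd4', 'e5', 'f6']
print(len(list_slice) == j - i)    # True
print(len(array_slice) == j - i)   # True
```

The element at index j (here 'g7') is never part of the slice.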

## 1.2 Arrays only | Subsetting by masking

One of the most powerful things you can do with NumPy arrays is subsetting by masking. To make sense of this, consider the following.

In [3]:
np_array = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
my_mask = np_array > 3
my_mask

array([False, False, False,  True,  True,  True,  True,  True,  True,
        True])

Here, np_array > 3 asks every element of the array the question 'Are you bigger than 3?'. The answer comes back in a ‘Yes’/‘No’ or True/False format, one for each element. I can use this True/False array to ask NumPy to show me only the elements that are True:

In [4]:
np_array[my_mask]

array([ 4,  5,  6,  7,  8,  9, 10])

The True/False answer acts like a mask allowing only the True subset to be seen.

Instead of creating another variable, I can also do all of this succinctly as:

In [5]:
np_array[np_array > 3]

array([ 4,  5,  6,  7,  8,  9, 10])

More examples: 

In [6]:
np_array[~(np_array > 3)]                 # '~' means 'NOT'
                                          # this inverts the mask and gives what is NOT > 3

array([1, 2, 3])

In [8]:
np_array[(np_array > 3) & (np_array < 8)] # '&' means 'AND'
                                          # this combines 2 masks

array([4, 5, 6, 7])

In [9]:
np_array[(np_array < 3) | (np_array > 8)] # '|' means 'OR'

array([ 1,  2,  9, 10])

- Always use the bitwise NOT (~), bitwise OR (|) and bitwise AND (&) when combining masks with NumPy.
- Always use brackets to clarify what you are asking the mask to do.
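To see why both rules matter, here is a small sketch of what happens when they are ignored. The flag names and_worked and no_brackets_worked are my own, and the try/except is only there so the cell runs to the end instead of stopping at the error.

```python
import numpy as np

np_array = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# '&' compares the two masks element by element.
both = np_array[(np_array > 3) & (np_array < 8)]
print(both)                                  # [4 5 6 7]

# 'and' needs a single True/False from each mask, which NumPy
# refuses to give for an array with more than one element.
try:
    np_array[(np_array > 3) and (np_array < 8)]
    and_worked = True
except ValueError:
    and_worked = False
print(and_worked)                            # False

# Without brackets, '&' is applied before '>' and '<', so this is
# read as np_array > (3 & np_array) < 8 and fails in the same way.
try:
    np_array[np_array > 3 & np_array < 8]
    no_brackets_worked = True
except ValueError:
    no_brackets_worked = False
print(no_brackets_worked)                    # False
```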

## 1.3 Lists & Arrays in 2D | Indexing & Slicing

The differences between lists and arrays become even more apparent in higher dimensions, especially when you try indexing and slicing.

In [10]:
py_list_2d = [[1, "A"], [2, "B"], [3, "C"], [4, "D"],
              [5, "E"], [6, "F"], [7, "G"], [8, "H"],
              [9, "I"], [10, "J"]]

np_array_2d = np.array(py_list_2d)

What is at position 4 (index 3)?

In [11]:
py_list_2d[3]

[4, 'D']

In [12]:
np_array_2d[3]

array(['4', 'D'], dtype='<U11')

What is the FIRST element at position 4 (index 3)?

In [13]:
py_list_2d[3][0]

4

In [14]:
np_array_2d[3, 0]  # arrays take both indices inside a single pair of brackets

'4'

What are the first three elements?

In [15]:
py_list_2d[:3]

[[1, 'A'], [2, 'B'], [3, 'C']]

In [16]:
np_array_2d[:3]

array([['1', 'A'],
       ['2', 'B'],
       ['3', 'C']], dtype='<U11')

In [17]:
py_list_2d[:3][0]

[1, 'A']

You might think that this will yield the first elements (i.e., [1, 2, 3]) of all the sub-lists up to index 2.
It does not! py_list_2d[:3] is evaluated first and gives a list of three sub-lists; [0] then picks the first of those sub-lists.

In [18]:
np_array_2d[:3, 0]

array(['1', '2', '3'], dtype='<U11')
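To get the same 'first column' out of the list, one option is a list comprehension. This is just a sketch of one way to do it; the names first_column and row are my own.

```python
py_list_2d = [[1, "A"], [2, "B"], [3, "C"], [4, "D"],
              [5, "E"], [6, "F"], [7, "G"], [8, "H"],
              [9, "I"], [10, "J"]]

# Take the first three sub-lists, then pick element 0 of each one.
first_column = [row[0] for row in py_list_2d[:3]]
print(first_column)    # [1, 2, 3]
```

Unlike the array version, the list keeps the numbers as integers instead of converting them to strings.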

In [19]:
py_list_2d[3:6][0]

[4, 'D']

In [20]:
np_array_2d[3:6, 0]

array(['4', '5', '6'], dtype='<U11')

In [21]:
np_array_2d[:, 0]   # ':' on its own selects every row

array(['1', '2', '3', '4', '5', '6', '7', '8', '9', '10'], dtype='<U11')

## 1.4 Growing lists

One advantage of lists is how easily and efficiently they grow. NumPy arrays are fantastic for fast math operations, provided you do not change their size: an array's size is fixed when it is created, so adding or removing values makes NumPy build a whole new array and copy the data into it.
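As a small illustration of what 'growing' an array involves, np.append() does not change the array it is given; it returns a new one. The names original and grown are my own.

```python
import numpy as np

original = np.array([1, 2, 3])
grown = np.append(original, 4)   # returns a NEW array; 'original' is untouched

print(original)                  # [1 2 3]
print(grown)                     # [1 2 3 4]
print(grown is original)         # False
```

Doing this repeatedly means copying the data every time, which is why lists are the better choice while the data is still growing.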

Creating a larger list from a smaller one.

In [23]:
x=[1, 2]*5
x

[1, 2, 1, 2, 1, 2, 1, 2, 1, 2]

Three ways to grow a list by appending one element at a time.

In [24]:
x=[1]
x= x + [2]
x= x + [3]
x= x + [4]
x

[1, 2, 3, 4]

In [25]:
x=[1]
x+= [2]
x+= [3]
x+= [4]
x

[1, 2, 3, 4]

In [26]:
x=[1]
x.append(2)
x.append(3)
x.append(4)
x

[1, 2, 3, 4]

Their execution speeds are different; the version with append() runs about 1.5 times faster than the rest, although the exact ratio will depend on your machine and Python version.
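One way to check the speed difference yourself is the timeit module from the standard library. This is only a rough sketch: the three function names are my own, the number of repeats (100,000) is an arbitrary choice, and the timings will differ from machine to machine.

```python
import timeit

def grow_with_plus():
    x = [1]
    x = x + [2]
    x = x + [3]
    x = x + [4]
    return x

def grow_with_plus_equals():
    x = [1]
    x += [2]
    x += [3]
    x += [4]
    return x

def grow_with_append():
    x = [1]
    x.append(2)
    x.append(3)
    x.append(4)
    return x

timings = {}
for grow in (grow_with_plus, grow_with_plus_equals, grow_with_append):
    # Total time in seconds for 100,000 calls of each function.
    timings[grow.__name__] = timeit.timeit(grow, number=100_000)

for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s")
```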

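Three ways of incorporating multiple elements. Note that the last one, append(), adds the whole list as a single (nested) element rather than merging its contents.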

In [27]:
x = [1, 2, 3]
x += [4, 5, 6]
x

[1, 2, 3, 4, 5, 6]

In [28]:
x=[1, 2, 3]
x.extend([4, 5, 6])
x

[1, 2, 3, 4, 5, 6]

In [29]:
x=[1, 2, 3]
x.append([4, 5, 6])
x

[1, 2, 3, [4, 5, 6]]

# Some loose ends

## 1.5 Tuples

Tuples are another way to store data. They are similar to lists, except they use ( ) and cannot be changed after creation (i.e., they are immutable).

In [30]:
a=(1, 2, 3)     # Step 1: Define tuple

In [31]:
print(a[0])    # Step 2: Access data

1


In [32]:
# The following will NOT work
# as we cannot change the data
a[0]=-1
# a[0]+=10 fails for the same reason

TypeError: 'tuple' object does not support item assignment

## 1.6 Be VERY careful when copying

In [33]:
x=[1, 2, 3]
y=x           # DON'T do this!
z=x           # DON'T do this!

In [34]:
# DO this instead
x=[1, 2, 3]
y=x.copy()
z=x.copy()
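The reason for the warning is that y=x does not create a second list; it only gives the same list a second name. The sketch below shows the difference between the two approaches in one cell.

```python
x = [1, 2, 3]
y = x            # y is a second name for the SAME list
z = x.copy()     # z is a new, independent list

y.append(4)      # changing y ...
print(x)         # [1, 2, 3, 4]  ... also changes x
print(z)         # [1, 2, 3]     the copy is unaffected
print(y is x)    # True
print(z is x)    # False
```

Note that copy() makes a shallow copy. For a nested list such as py_list_2d, the sub-lists would still be shared, and copy.deepcopy() from the standard library is one way around that.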