<div style="text-align:left;font-size:2em"><span style="font-weight:bolder;font-size:1.25em">SP2273 | Learning Portfolio</span><br><br><span style="font-weight:bold;color:darkred">Storing Data (Good)</span></div>

# What to expect in this chapter

# 1 Subsetting: Indexing and Slicing

You will often need to select a subset of the data in a list (or array); this is called subsetting. One form is picking a single element, called indexing (you already know how to do this from the previous chapter). Another is selecting a range of elements, called slicing.

Subsetting: selecting part of the data

Indexing: selecting one element

Slicing: selecting a range of elements

## 1.1 Lists & Arrays in 1D | Subsetting & Indexing

In [2]:
import numpy as np
py_list=["a1", "b2", "c3", "d4", "e5",
         "f6", "g7", "h8", "i9", "j10"]

x = py_list 

np_array=np.array(py_list) #Convert the list to a NumPy array


In [3]:
x[0] #first element 

'a1'

In [4]:
x[-1] #last element 

'j10'

In [5]:
x[0:3] #Select from the 0th element to the 2nd element
#The second number (the stop index) is not included

['a1', 'b2', 'c3']

In [7]:
x[1:6:2] #The last number is the step
# After selecting the ith element,
# the next element selected is the (i+step)th element

['b2', 'd4', 'f6']

In [9]:
x[:6] #If there is no first number, the start
#is assumed to be 0.

['a1', 'b2', 'c3', 'd4', 'e5', 'f6']

In [10]:
x[5:2:-1] #Go backwards from 5th to 3rd. 

['f6', 'e5', 'd4']

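In general, [i:j] gives you j-i elements (as long as i and j are non-negative, i is not larger than j, and j does not exceed the length of the list).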
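
To check the j-i rule, here is a minimal sketch. The values of `i` and `j` and the variable names are arbitrary choices for illustration. It also shows two cases where the rule does not apply directly: a step larger than 1, and a stop index beyond the end of the list.

```python
py_list = ["a1", "b2", "c3", "d4", "e5",
           "f6", "g7", "h8", "i9", "j10"]

i, j = 2, 7                      # arbitrary start and stop indices
subset = py_list[i:j]            # elements at indices 2, 3, 4, 5, 6
slice_length = len(subset)       # equals j - i

stepped = py_list[i:j:2]         # a step of 2 selects fewer elements
clipped = py_list[8:20]          # a stop past the end is clipped, no error

print(subset, slice_length)
print(stepped)
print(clipped)
```

So j-i is best read as the count for a plain slice that stays within the list.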

## 1.2 Arrays only | Subsetting by masking

Only works for NumPy arrays. 

In [12]:
np_array = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
my_mask = np_array > 3 
my_mask #Checks whether the condition is met for each element
#and outputs an array of booleans.

array([False, False, False,  True,  True,  True,  True,  True,  True,
        True])

In [13]:
np_array[my_mask]

#Shows only the elements where the mask is True,
#i.e., where the condition is met.
#In this case, elements > 3

array([ 4,  5,  6,  7,  8,  9, 10])

In [15]:
np_array[np_array > 3]
#More succinct way of applying a mask

array([ 4,  5,  6,  7,  8,  9, 10])

We can also use logical operations such as NOT, AND, and OR in our mask.

In [17]:
np_array[~(np_array > 3)]      # '~' means 'NOT'
# Output those that do not meet (np_array > 3) condition
# ~ is called the Bitwise Not operator. 

array([1, 2, 3])

In [18]:
np_array[(np_array > 3) & (np_array < 8)] 
# '&' means 'AND'

array([4, 5, 6, 7])

In [None]:
np_array[(np_array < 3) | (np_array > 8)] 
# '|' means 'OR'
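
Since masking only works for arrays, a plain list needs a different approach. The sketch below (variable names are my own) compares the OR mask with a list comprehension that gives the same selection. Note that the parentheses around each condition in the mask are required, because `|` and `&` bind more tightly than `<` and `>`.

```python
import numpy as np

py_numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
np_numbers = np.array(py_numbers)

# Array: combine two conditions with '|' (OR) inside one mask
masked = np_numbers[(np_numbers < 3) | (np_numbers > 8)]

# List: a list comprehension with the ordinary 'or' keyword
filtered = [n for n in py_numbers if n < 3 or n > 8]

print(masked)
print(filtered)
```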

## 1.3 Lists & Arrays in 2D | Indexing & Slicing

In [None]:
py_superhero_info = [
    ["Natasha", "Black Widow"], #0
    ["Tony", "Iron Man"], #1
    ["Stephen Strange", "Doctor Strange"] #2
]

In [20]:
py_superhero_info[1][0]
#Select the element at index 1 of the outer list, which is another list.
#In that inner list, select the element at index 0.

'Tony'

In [22]:
import numpy as np
py_list_2d = [[1, "A"], [2, "B"], [3, "C"], [4, "D"],
              [5, "E"], [6, "F"], [7, "G"], [8, "H"],
              [9, "I"], [10, "J"]]

np_array_2d = np.array(py_list_2d)

In [26]:
py_list_2d 

[[1, 'A'],
 [2, 'B'],
 [3, 'C'],
 [4, 'D'],
 [5, 'E'],
 [6, 'F'],
 [7, 'G'],
 [8, 'H'],
 [9, 'I'],
 [10, 'J']]

Indexing and slicing work differently for lists and arrays. Recall that all elements of an array must have the same data type, which is why the numbers are stored as strings in np_array_2d.

In [24]:
py_list_2d[3]

[4, 'D']

In [25]:
np_array_2d[3]

array(['4', 'D'], dtype='<U11')
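
The quotes around '4' come from the single-data-type rule. As a minimal sketch (the small array `mixed` is made up for illustration), we can check the array's data type and convert the number column back to integers with `astype`:

```python
import numpy as np

# Mixing numbers and strings forces NumPy to store everything as strings
mixed = np.array([[1, "A"], [2, "B"], [3, "C"]])
print(mixed.dtype)                   # a Unicode string type; the width can vary

# Take the first column and convert it back to integers
numbers = mixed[:, 0].astype(int)
total = numbers.sum()

print(numbers, total)
```

The exact width shown in the dtype (such as the 11 in '<U11') may differ between systems and NumPy versions.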

Arrays use a single pair of square brackets with the indices separated by commas, but lists need one pair of square brackets per level.

In [27]:
py_list_2d[3][0]

4

In [28]:
np_array_2d[3, 0]

'4'

Both work similarly when it comes to slicing in the following case.

In [29]:
py_list_2d[:3]

[[1, 'A'], [2, 'B'], [3, 'C']]

In [30]:
np_array_2d[:3]

array([['1', 'A'],
       ['2', 'B'],
       ['3', 'C']], dtype='<U11')

But if we do this:

In [31]:
py_list_2d[:3][0]

[1, 'A']

You might think that this will yield the first elements (i.e., [1, 2, 3]) of all the sub-lists up to index 2.

But instead it gives you the first element of the list you get from py_list_2d[:3]. Just to illustrate with another example:

In [35]:
py_list_2d[1:3][0]
#It is like taking the 0th element of a new list,
#the new list being py_list_2d[1:3], i.e.,
#the elements at index 1 and 2 of the original list.

[2, 'B']

Notice what happens when we do the same thing with arrays. This yields the first elements (i.e., '1', '2', '3') of all the sub-arrays up to index 2.

In [32]:
np_array_2d[:3, 0]

array(['1', '2', '3'], dtype='<U11')

Here are some more examples:

In [37]:
np_array_2d[3:6, 0]

array(['4', '5', '6'], dtype='<U11')

In [38]:
np_array_2d[3:6, 1]

array(['D', 'E', 'F'], dtype='<U11')

In [39]:
np_array_2d[:, 0] 
#Use this if you want the first element of every nested array (row)

array(['1', '2', '3', '4', '5', '6', '7', '8', '9', '10'], dtype='<U11')
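
Lists have no `[:, 0]` syntax, but a list comprehension can pull out a "column". This is a minimal sketch using a shortened version of the 2D list; the names `first_column` and `second_column` are my own.

```python
py_list_2d = [[1, "A"], [2, "B"], [3, "C"], [4, "D"]]

# First element of each of the first three sub-lists
first_column = [row[0] for row in py_list_2d[:3]]

# Second element of every sub-list
second_column = [row[1] for row in py_list_2d]

print(first_column)
print(second_column)
```

Unlike the array version, the numbers keep their original integer type here.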

## 1.4 Growing lists

Why do we even bother with lists if the slicing syntax for NumPy arrays is more intuitive?

One advantage of lists is their ease and efficiency in growing. NumPy arrays are fantastic for fast math operations, provided you do not change their size.

Let's learn how to grow a list. This is useful, for example, when solving differential equations numerically.

In [41]:
x=[1, 2]*5
x

#Recall that multiplication doesn't work element-wise for lists

[1, 2, 1, 2, 1, 2, 1, 2, 1, 2]

What are the 3 ways to grow a list by appending one element at a time?

In [42]:
#method 1
x=[1]
x= x + [2]
x= x + [3]
x= x + [4]
x

[1, 2, 3, 4]

In [43]:
#Method 2
x=[1]
x+= [2]
x+= [3]
x+= [4]
x

[1, 2, 3, 4]

In [44]:
#Method 3
x=[1]
x.append(2)
x.append(3)
x.append(4)
x

[1, 2, 3, 4]

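**What's the difference between these 3 versions?**

They differ in execution speed. The version with append() reportedly runs about 1.5 times faster than the rest, though the exact ratio depends on your machine and Python version. Method 1 also builds a brand-new list each time, whereas += and append() modify the existing list in place.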
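
One way to check the speed difference yourself is the standard library's `timeit` module. This is only a rough sketch: the function names, the list size `n`, and the number of repetitions are arbitrary choices of mine, and the timings will vary from run to run.

```python
import timeit

def grow_with_concat(n):
    x = []
    for i in range(n):
        x = x + [i]       # Method 1: builds a new list every time
    return x

def grow_with_iadd(n):
    x = []
    for i in range(n):
        x += [i]          # Method 2: extends the list in place
    return x

def grow_with_append(n):
    x = []
    for i in range(n):
        x.append(i)       # Method 3: appends in place
    return x

n = 1_000
for grow in (grow_with_concat, grow_with_iadd, grow_with_append):
    seconds = timeit.timeit(lambda: grow(n), number=100)
    print(f"{grow.__name__}: {seconds:.4f} s")
```

All three functions return the same list; only the time taken differs.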

**What are the 3 ways of incorporating multiple elements?**

In [45]:
x = [1, 2, 3]
x += [4, 5, 6]
x

[1, 2, 3, 4, 5, 6]

In [46]:
x=[1, 2, 3]
x.extend([4, 5, 6])
x

[1, 2, 3, 4, 5, 6]

In [47]:
x=[1, 2, 3]
x.append([4, 5, 6])
x

[1, 2, 3, [4, 5, 6]]

extend() adds each element of the new list to x, giving a longer, flat list.

append() adds the new list itself to x as a single element, so x ends up containing a nested list.

# Some loose ends

## 1.5 Tuples

Tuples are similar to lists, except they use ( ) and cannot be changed after creation (i.e., they are immutable).

In [48]:
a=(1, 2, 3)     # Define tuple

We can access data using:

In [49]:
print(a[0])    # Access data

1


But we cannot change data. So the following code will produce an error:

In [50]:
a[0]=-1

TypeError: 'tuple' object does not support item assignment
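
If a "changed" tuple is needed, one option is to build a new tuple from pieces of the old one. The sketch below (the names `b` and `message` are my own) catches the error with `try`/`except` so the code keeps running, then creates a new tuple and leaves the original untouched.

```python
a = (1, 2, 3)

try:
    a[0] = -1                      # not allowed: tuples are immutable
except TypeError as error:
    message = str(error)
    print("Could not modify the tuple:", message)

# Build a new tuple instead; (-1,) is a tuple with one element
b = (-1,) + a[1:]

print(a)
print(b)
```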

## 1.6 Be VERY careful when copying

For example, if you want to copy a list, you might be tempted to do the following; PLEASE DON’T!

In [51]:
x=[1, 2, 3]
y=x           # DON'T do this!
z=x           # DON'T do this!

What happens if I modify y? As we can see, modifying one list affects the others.

In [1]:
x = [1, 2, 3]
y = x  
z = x  

y[0] = 99  

print(x) 
print(y)  
print(z)

[99, 2, 3]
[99, 2, 3]
[99, 2, 3]


The correct way to do this is as follows:

In [52]:
x=[1, 2, 3]
y=x.copy()
z=x.copy()

In [2]:
x = [1, 2, 3]
y = x.copy()
z = x.copy()

y[0] = 99  

print(x) 
print(y)  
print(z)

[1, 2, 3]
[99, 2, 3]
[1, 2, 3]


According to ChatGPT: This is done to ensure that y and z are independent copies of the original list x, rather than references to the same list object. 

If you were to do y = x instead of y = x.copy(), both y and x would refer to the same list object. This means that if you modify the contents of the list through one variable, it will also affect the other. In contrast, using copy() creates a new list with the same elements as the original, but it is a separate object in memory.
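
The `is` operator can confirm whether two names refer to the same object. The sketch below also shows a caveat that goes beyond the examples above: `copy()` is a shallow copy, so for nested lists the inner lists are still shared. Using `copy.deepcopy` from the standard library is one way around this. The variable names are my own.

```python
import copy

x = [1, 2, 3]
y = x                  # another name for the same list
z = x.copy()           # a separate list with the same elements

same_object = y is x   # True: one object, two names
separate = z is x      # False: two different objects

# Caveat: copy() only copies the outer list
nested = [[1, 2], [3, 4]]
shallow = nested.copy()
deep = copy.deepcopy(nested)

nested[0][0] = 99
print(shallow[0][0])   # the inner list is shared with nested
print(deep[0][0])      # the deep copy is fully independent
```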

# Exercises & Self-Assessment

In [None]:



# Your solution here




## Footnotes