<div style="text-align:left;font-size:2em"><span style="font-weight:bolder;font-size:1.25em">SP2273 | Learning Portfolio</span><br><br><span style="font-weight:bold;color:darkred">Storing Data (Good)</span></div>

# What to expect in this chapter

# 1 Subsetting: Indexing and Slicing

- Subsetting means selecting part of a list or array
- Indexing refers to selecting one element
- Slicing refers to selecting a range of elements

## 1.1 Lists & Arrays in 1D | Subsetting & Indexing

In [3]:
import numpy as np
py_list = ["a1", "b2", "c3", "d4", "e5", "f6", "g7", "h8", "i9", "j10"]
np_array = np.array(py_list)

x = np_array

In [7]:
x[0] # First element

np.str_('a1')

In [9]:
x[-1] # Last element

np.str_('j10')

In [11]:
x[0:3] # Index 0 to 2 (Gives 3 - 0 = 3 elements)

array(['a1', 'b2', 'c3'], dtype='<U3')

In [13]:
x[1:6] # Index 1 to 5 (Gives 6 - 1 = 5 elements)

array(['b2', 'c3', 'd4', 'e5', 'f6'], dtype='<U3')

In [15]:
x[1:6:2] # Index 1 to 5 in steps of 2 (takes every 2nd item, skipping one in between)

array(['b2', 'd4', 'f6'], dtype='<U3')

In [17]:
x[5:] # Index 5 to the end 

array(['f6', 'g7', 'h8', 'i9', 'j10'], dtype='<U3')

In [19]:
x[:5] # Index 0 to 4 (Gives 5 - 0 = 5 elements)

array(['a1', 'b2', 'c3', 'd4', 'e5'], dtype='<U3')

In [None]:
x[5:2:-1]
# Begins at index 5, slice goes down but does not include index 2, "-1" means going backwards
# Gives 5 - 2 = 3 elements

array(['f6', 'e5', 'd4'], dtype='<U3')

In [21]:
x[::-1]

array(['j10', 'i9', 'h8', 'g7', 'f6', 'e5', 'd4', 'c3', 'b2', 'a1'],
      dtype='<U3')

In [22]:
x[-3:] # Returns the last 3 elements of the array

array(['h8', 'i9', 'j10'], dtype='<U3')
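The heading covers both lists and arrays, but the cells above only slice the NumPy array. As a quick check, the sketch below reuses `py_list` and `np_array` from the first cell and applies the same slices to both; the `slices` list is just a helper introduced here for illustration. In 1D the two should agree.

```python
import numpy as np

py_list = ["a1", "b2", "c3", "d4", "e5", "f6", "g7", "h8", "i9", "j10"]
np_array = np.array(py_list)

# slice(start, stop, step) is the object form of the start:stop:step notation
slices = [slice(0, 3), slice(1, 6, 2), slice(5, 2, -1), slice(-3, None)]

for s in slices:
    from_list = py_list[s]
    from_array = np_array[s].tolist()  # convert back to a list for comparison
    print(s, from_list, from_list == from_array)
```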

## 1.2 Arrays only | Subsetting by masking

In [None]:
np_array = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
my_mask = np_array > 3 # Creating the mask / the condition
my_mask

array([False, False, False,  True,  True,  True,  True,  True,  True,
        True])

In [25]:
np_array[my_mask] # Apply the mask and keep only the elements where the mask is True

array([ 4,  5,  6,  7,  8,  9, 10])

In [27]:
np_array[np_array>3] # More concise form

array([ 4,  5,  6,  7,  8,  9, 10])

In [29]:
np_array[~(np_array > 3)] # "~" means NOT

array([1, 2, 3])

In [32]:
np_array[(np_array > 3) & (np_array < 8)] # the "&" means AND

array([4, 5, 6, 7])

In [35]:
np_array[(np_array > 3) and (np_array < 8)] # "and" does not work here: it needs a single True/False, whereas "&" compares element by element

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
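To see the difference without stopping the notebook, the error can be caught. This is only a sketch; the names `raised` and `combined` are introduced here for illustration.

```python
import numpy as np

np_array = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

try:
    # "and" asks each side for one True/False, which a 10-element array cannot give
    np_array[(np_array > 3) and (np_array < 8)]
    raised = False
except ValueError:
    raised = True

# "&" combines the two masks position by position
combined = (np_array > 3) & (np_array < 8)

print(raised)
print(np_array[combined])
```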

In [36]:
np_array[(np_array < 3) | (np_array > 8)] # "|" means OR

array([ 1,  2,  9, 10])
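Masking works on arrays only, because a Python list cannot be compared with a number element by element. A rough equivalent for lists, assuming a list comprehension is acceptable, might look like this:

```python
import numpy as np

py_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
np_array = np.array(py_list)

# py_list > 3 would raise a TypeError, so each element is tested individually
filtered_list = [value for value in py_list if 3 < value < 8]

# The array version with a mask, for comparison
filtered_array = np_array[(np_array > 3) & (np_array < 8)]

print(filtered_list)
print(filtered_array)
```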

## 1.3 Lists & Arrays in 2D | Indexing & Slicing

In [37]:
py_list_2d = [[1, "A"], [2, "B"], [3, "C"], [4, "D"], [5, "E"], [6, "F"], [7, "G"], [8, "H"], [9, "I"], [10, "J"]]
np_array_2d = np.array(py_list_2d)

In [38]:
py_list_2d[3]

[4, 'D']

In [39]:
np_array_2d[3]

array(['4', 'D'], dtype='<U21')
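Notice that the array shows `'4'` in quotes while the list keeps `4` as a number. A NumPy array holds a single data type, so a list that mixes numbers and text is converted to text. The sketch below checks this on two small arrays made up for illustration (`mixed` and `numbers_only`).

```python
import numpy as np

mixed = np.array([[1, "A"], [2, "B"]])
numbers_only = np.array([[1, 10], [2, 20]])

print(mixed.dtype.kind)         # 'U' is a Unicode string type
print(numbers_only.dtype.kind)  # 'i' is a signed integer type

# The element from the mixed array must be converted before doing arithmetic
print(int(mixed[0, 0]) + 1)
```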

In [None]:
py_list_2d[3][0] # Go to row 3 then from it, get item 0

4

In [44]:
py_list_2d[3][1] # Go to row 3 then from it, get item 1

'D'

In [42]:
np_array_2d[3,0]

np.str_('4')

In [47]:
np_array_2d[2,1] # Go to row 2, then get item 1 (i.e. the 3rd row and 2nd element)

np.str_('C')

In [None]:
py_list_2d[:3] # Returns the first 3 rows of the data

[[1, 'A'], [2, 'B'], [3, 'C']]

In [49]:
np_array_2d[:3]

array([['1', 'A'],
       ['2', 'B'],
       ['3', 'C']], dtype='<U21')

In [50]:
np_array_2d[:3, :2] # Slice first 3 rows and first 2 columns

array([['1', 'A'],
       ['2', 'B'],
       ['3', 'C']], dtype='<U21')

In [54]:
np_array_2d[:4, :2] # Slice first 4 rows and first 2 columns

array([['1', 'A'],
       ['2', 'B'],
       ['3', 'C'],
       ['4', 'D']], dtype='<U21')

In [None]:
py_list_2d[:3][0]
# py_list_2d[:3] creates a new list containing only the first 3 rows (Rows 0, 1, and 2)
# [0] then takes that new list and grabs the very first item inside it (the entire first row)
# result: just the first row
# misconception: many expect the result to be "1, 2, 3"

[1, 'A']

In [None]:
np_array_2d[:3, 0]
# the comma separates the rows from the columns
# [:3] tells NumPy to look at the first 3 rows
# ,0 tells NumPy to look only at the first column

array(['1', '2', '3'], dtype='<U21')

In [62]:
py_list_2d[3:6]
# [3:6] Python creates a new sublist containing rows 3, 4, and 5

[[4, 'D'], [5, 'E'], [6, 'F']]

In [None]:
py_list_2d[3:6][0]
# [3:6] Python creates a new sublist containing rows 3, 4, and 5 (the 4th, 5th, and 6th rows)
# [0] selects the first item of that new sublist

[4, 'D']

In [64]:
np_array_2d[3:6, 0]
# NumPy takes rows 3 to 5 (the 4th to 6th rows) and, from those rows, only column 0

array(['4', '5', '6'], dtype='<U21')

In [66]:
np_array_2d[:,0] # NumPy takes every row and extracts column 0

array(['1', '2', '3', '4', '5', '6', '7', '8', '9', '10'], dtype='<U21')
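A nested list has no equivalent of `[:, 0]`. One possible way to pull a column out of a list of lists is a list comprehension, sketched here on a shortened version of `py_list_2d`:

```python
# Shortened version of the list used above, for illustration
py_list_2d = [[1, "A"], [2, "B"], [3, "C"], [4, "D"], [5, "E"]]

# Visit each row and keep only its item 0
first_column = [row[0] for row in py_list_2d]

# Same idea, restricted to rows 0 to 2
first_column_top3 = [row[0] for row in py_list_2d[:3]]

print(first_column)
print(first_column_top3)
```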

## 1.4 Growing lists

In [68]:
x = [1, 2]*5
x

[1, 2, 1, 2, 1, 2, 1, 2, 1, 2]

In [69]:
x=[1]
x = x + [2]
x = x + [3]
x = x + [4]
x


[1, 2, 3, 4]

In [70]:
x=[1]
x += [2]
x += [3]
x += [4]
x

[1, 2, 3, 4]

In [73]:
x=[1]
x.append(2)
x.append(3)
x.append(4)
x

[1, 2, 3, 4]

In [74]:
x = [1, 2, 3]
x += [4, 5, 6]
x

[1, 2, 3, 4, 5, 6]

In [None]:
x = [1, 2, 3]
x.extend([4, 5, 6]) # extend adds each element individually
x

[1, 2, 3, 4, 5, 6]

In [None]:
x = [1, 2, 3]
x.append([4, 5, 6]) # append adds the whole list as one item
x

[1, 2, 3, [4, 5, 6]]
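The methods above give the same values, but they differ in whether a new list is created. The sketch below uses `id()` to check; the variable names are only for illustration.

```python
x = [1]
original_id = id(x)

x = x + [2]          # builds a new list and rebinds x to it
new_list_created = id(x) != original_id

y = [1]
y_id = id(y)
y += [2]             # modifies the existing list in place
y.append(3)          # also in place
same_list_kept = id(y) == y_id

print(new_list_created, same_list_kept)
```

This matters when another variable points to the same list, which is the situation the copying section deals with.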

# Some loose ends

## 1.5 Tuples

In [78]:
a = (1, 2, 3) # Define tuple
print(a[0])

1


In [None]:
# The following will not work
a[0] = -1
# Tuples in Python are immutable, meaning once created, their elements cannot be changed

TypeError: 'tuple' object does not support item assignment

In [None]:
# The following will not work
a[0] += [10]
# Python first tries to compute a[0] + [10] (an int plus a list), which fails before the tuple assignment is even attempted

TypeError: unsupported operand type(s) for +=: 'int' and 'list'
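Since a tuple cannot be edited in place, one common workaround is to build a new tuple from pieces of the old one. A minimal sketch, with `b` and `c` as illustrative names:

```python
a = (1, 2, 3)

# Build a new tuple with the first element replaced
b = (-1,) + a[1:]

# Adding an element also creates a new tuple; note the trailing comma in (10,)
c = a + (10,)

print(a, b, c)
```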

## 1.6 Be VERY careful when copying

In [86]:
x = [1, 2, 3]
y = x
z = x
z
# Don't do this! y and z are not copies; they are just other names for the same list as x

[1, 2, 3]
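The output above looks harmless, so here is a small sketch of what goes wrong when a list is "copied" with `=`:

```python
x = [1, 2, 3]
y = x            # y is another name for the same list, not a copy

y.append(4)      # change the list through y

print(x)         # x shows the change too
print(x is y)    # both names refer to one object
```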

In [92]:
# Correct way to copy a list is:
x = [1, 2, 3]
y = x.copy()
z = x.copy()
# Create independent copies of x

In [90]:
y

[1, 2, 3]

In [91]:
z

[1, 2, 3]
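A quick check that `.copy()` gives an independent list, followed by one caveat: for nested lists, `.copy()` only copies the outer list. The sketch assumes `copy.deepcopy` from the standard library is acceptable for the nested case; `nested`, `shallow`, and `deep` are illustrative names.

```python
import copy

x = [1, 2, 3]
y = x.copy()
y.append(4)          # x is unaffected

nested = [[1, 2], [3, 4]]
shallow = nested.copy()        # new outer list, but the inner lists are shared
deep = copy.deepcopy(nested)   # the inner lists are copied as well

nested[0].append(99)

print(x, y)
print(shallow[0], deep[0])
```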

## Footnotes

## Additional Practice

__Tuples__ are a way to store data similar to lists, but one major difference is immutability.
- Tuples are defined using parentheses () instead of square brackets [ ]
- You can access elements in a tuple using indexes
- Immutability: once a tuple is created, its elements cannot be changed

Difference between Lists & Tuples:
- Lists are mutable, which means you can add, change, or remove elements
- Tuples are immutable; attempting to change an element raises a TypeError

In [1]:
a = (1, 2, 3)
print(a[0])

1


In [2]:
# Example of what you CANNOT do
a = (1, 2, 3)

# The following code will fail:
a[0] = -1

TypeError: 'tuple' object does not support item assignment

In [3]:
# Example of what you CANNOT do
a = (1, 2, 3)

# The following code will fail:
a[0] += [10]

TypeError: unsupported operand type(s) for +=: 'int' and 'list'

In [6]:
# Example of what a LIST CAN do
a = [1, 2, 3]

# The following code works because lists are mutable:
a[0] += 10

print(a)

[11, 2, 3]


In [7]:
# Example of list .append()
# Adds a single item to the end of the list

x = [1]
x.append(2)
x.append(3)
print(x)


[1, 2, 3]


In [None]:
# Using .append() to add a whole list as one item, creating a nested list
x = [1, 2, 3]
x.append([4, 5, 6])
print(x)

[1, 2, 3, [4, 5, 6]]


In [11]:
# Using .extend() to unpack the list and add each element individually

x = [1, 2, 3]
x.extend([4, 5, 6])
print(x)

[1, 2, 3, 4, 5, 6]


In [13]:
# For lists, the += operator behaves like .extend()
x = [1, 2, 3]
x += [4, 5, 6]
print(x)

[1, 2, 3, 4, 5, 6]
