<div style="text-align:left;font-size:2em"><span style="font-weight:bolder;font-size:1.25em">SP2273 | Learning Portfolio</span><br><br><span style="font-weight:bold;color:darkred">Storing Data (Good)</span></div>

In [2]:
import numpy as np

# What to expect in this chapter

# Subsetting: Indexing and Slicing

- Subsetting means selecting part of a list or array
- Indexing refers to selecting one element
- Slicing refers to selecting a range of elements

##  Lists & Arrays in 1D | Subsetting & Indexing

In [4]:
py_list=["a1", "b2", "c3", "d4", "e5",
         "f6", "g7", "h8", "i9", "j10"]
np_array=np.array(py_list)

# Pick one
x = py_list  # OR
x = np_array

In [8]:
py_list[0]

'a1'

In [9]:
py_list[-1]

'j10'

In [11]:
py_list[1:6]  # get 6-1=5 elements

['b2', 'c3', 'd4', 'e5', 'f6']

In [12]:
py_list[1:6:2] #same as previous cell but in steps of 2

['b2', 'd4', 'f6']

- This applies to both lists and arrays
- Refer to the SP2273 website for the full slicing syntax
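As a minimal sketch of the `[start:stop:step]` pattern used in the cells above, the same slice picks the same elements from a list and from an array. The names `letters` and `letters_arr` are made up for this illustration.

```python
import numpy as np

# Hypothetical sample data, just for this sketch
letters = ["a", "b", "c", "d", "e", "f"]
letters_arr = np.array(letters)

# [start:stop:step] -> start is included, stop is excluded
list_slice = letters[1:5:2]        # indexes 1 and 3
array_slice = letters_arr[1:5:2]   # same indexes, but the result is an array

print(list_slice)
print(array_slice)

# Leaving out start or stop means "from the beginning" or "to the end"
first_three = letters[:3]
from_index_3 = letters[3:]
print(first_three)
print(from_index_3)
```

The only difference is the type of what comes back: slicing a list gives a list, slicing an array gives an array.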

##  Arrays only | Subsetting by masking

In [5]:
np_array = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
my_mask = np_array > 3
my_mask

array([False, False, False,  True,  True,  True,  True,  True,  True,
        True])

The mask answers the question "Is this element greater than 3?" for every element, in a ‘Yes’/‘No’ or True/False format. We can use this True/False format to ask NumPy to show us only the elements that are True:

In [13]:
np_array[my_mask]  #the mask sort of "covers" the options that are false and just shows us those that are true

array([ 4,  5,  6,  7,  8,  9, 10])

**Subsetting by masking only works with NumPy arrays**

Instead of creating another variable, we can also do all of this succinctly as:

In [14]:
np_array[np_array > 3]  #instead of creating a mask we can just do it like this

array([ 4,  5,  6,  7,  8,  9, 10])
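To check the note that masking only works with NumPy arrays, here is a small sketch comparing a plain list with an array. The names `numbers_list` and `numbers_arr` are made up for this illustration, and the list comprehension is just one possible workaround for lists.

```python
import numpy as np

numbers_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]   # hypothetical plain list
numbers_arr = np.array(numbers_list)

# Comparing a whole list with a number raises a TypeError,
# so a list cannot produce a mask the way an array does
try:
    numbers_list > 3
    list_comparison_ok = True
except TypeError:
    list_comparison_ok = False

# With a list, a list comprehension does a similar job
from_list = [n for n in numbers_list if n > 3]

# With an array, the mask does it in one step
from_array = numbers_arr[numbers_arr > 3]

print(list_comparison_ok)
print(from_list)
print(from_array)
```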

**Example 1**

We can invert our mask by using the ~.
~ is called the Bitwise Not operator.

In [16]:
np_array[~(np_array > 3)]                 # '~' means 'NOT'

array([1, 2, 3])

**Example 2**

We can combine one mask AND another mask.
(AND will show something only if both masks are true.)

In [18]:
np_array[(np_array > 3) & (np_array < 8)] # '&' means 'AND'

array([4, 5, 6, 7])

**Example 3**

We can combine one mask OR another mask.
(OR will show something if either mask is true.)

In [20]:
np_array[(np_array < 3) | (np_array > 8)] # '|' means 'OR'

array([ 1,  2,  9, 10])

- Always use the Bitwise NOT (~), Bitwise OR (|) and Bitwise AND (&) when combining masks with NumPy.
- Always use brackets to clarify what you are asking the mask to do.
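The two rules above can be seen in a short sketch showing what happens when they are ignored. The array `values` and the flag names are made up for this illustration.

```python
import numpy as np

values = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])  # hypothetical data

# Correct: bitwise '&' with each comparison in its own brackets
good = values[(values > 3) & (values < 8)]

# Python's 'and' cannot combine two masks element by element
try:
    values[(values > 3) and (values < 8)]
    and_worked = True
except ValueError:
    and_worked = False

# Without brackets, '&' is evaluated before '>' and '<',
# so Python reads this as  values > (3 & values) < 8
try:
    values[values > 3 & values < 8]
    no_brackets_worked = True
except ValueError:
    no_brackets_worked = False

print(good)
print(and_worked, no_brackets_worked)
```

Both failing versions stop with a `ValueError` because Python is asked to treat a whole array as a single True/False value.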



##  Lists & Arrays in 2D | Indexing & Slicing

In [23]:
py_list_2d = [[1, "A"], [2, "B"], [3, "C"], [4, "D"],
              [5, "E"], [6, "F"], [7, "G"], [8, "H"],
              [9, "I"], [10, "J"]]

np_array_2d = np.array(py_list_2d)

In [24]:
py_list_2d

[[1, 'A'],
 [2, 'B'],
 [3, 'C'],
 [4, 'D'],
 [5, 'E'],
 [6, 'F'],
 [7, 'G'],
 [8, 'H'],
 [9, 'I'],
 [10, 'J']]

In [25]:
np_array_2d

array([['1', 'A'],
       ['2', 'B'],
       ['3', 'C'],
       ['4', 'D'],
       ['5', 'E'],
       ['6', 'F'],
       ['7', 'G'],
       ['8', 'H'],
       ['9', 'I'],
       ['10', 'J']], dtype='<U11')

**Example 1**

What is at position 4 (index 3)?

In [27]:
py_list_2d[3]

[4, 'D']

In [28]:
np_array_2d[3]

array(['4', 'D'], dtype='<U11')

**Example 2**

What is the first element at position 4 (index 3)?

In [29]:
py_list_2d[3][0]

4

In [30]:
np_array_2d[3, 0]

'4'

**Example 3**

What are the first 3 elements?

In [31]:
py_list_2d[:3]

[[1, 'A'], [2, 'B'], [3, 'C']]

In [32]:
np_array_2d[:3]

array([['1', 'A'],
       ['2', 'B'],
       ['3', 'C']], dtype='<U11')

**Example 4**

In [40]:
py_list_2d[:3][0]  #Takes the first 3 sub-lists, then element 0 of that result

[1, 'A']

In [41]:
np_array_2d[:3, 0]  #Takes element 0 of each of the first 3 rows

array(['1', '2', '3'], dtype='<U11')

You might think that py_list_2d[:3][0] will yield the first elements (i.e., [1, 2, 3]) of all the sub-lists up to index 2.
No! Instead, it gives the first element of the list you get from py_list_2d[:3]. The NumPy array behaves differently: np_array_2d[:3, 0] does return the first element of each of the first three rows.
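If the first element of each sub-list is what you actually want from a list, one possible way is a list comprehension. This is a sketch with a shortened, made-up data set (`rows`, `rows_arr`), not the only way to do it.

```python
import numpy as np

rows = [[1, "A"], [2, "B"], [3, "C"], [4, "D"]]  # hypothetical, shorter data
rows_arr = np.array(rows)

# List: loop over the first 3 sub-lists and take element 0 of each
first_from_list = [row[0] for row in rows[:3]]

# Array: row slice and column index go together inside one pair of brackets
first_from_array = rows_arr[:3, 0]

print(first_from_list)
print(first_from_array)
```

The list version keeps the integers as integers, while the array version returns strings because the array stores every element as text.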

**Example 5** 

In [37]:
py_list_2d[3:6][0]

[4, 'D']

In [38]:
np_array_2d[3:6, 0]

array(['4', '5', '6'], dtype='<U11')

In [39]:
np_array_2d[:, 0]

array(['1', '2', '3', '4', '5', '6', '7', '8', '9', '10'], dtype='<U11')

If you want ‘everything’ along a dimension, you just use : on its own.
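A bare `:` works in either position of a 2D array. The sketch below uses a small made-up array called `grid`.

```python
import numpy as np

grid = np.array([[1, 2, 3],
                 [4, 5, 6]])   # hypothetical 2 x 3 array

first_column = grid[:, 0]    # every row, column 0
second_row = grid[1, :]      # row 1, every column
whole_thing = grid[:, :]     # every row, every column

print(first_column)
print(second_row)
print(whole_thing.shape)
```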

##  Growing lists

**Example 1**

Creating a larger list from a smaller one

In [48]:
x=[1, 2]*5   #repeats the contents of the list 5 times
x

[1, 2, 1, 2, 1, 2, 1, 2, 1, 2]

**Example 2**

Three ways to grow a list by appending one element at a time.

In [45]:
x=[1]
x= x + [2]
x= x + [3]
x= x + [4]
x

[1, 2, 3, 4]

In [46]:
x=[1]
x+= [2]
x+= [3]
x+= [4]
x

[1, 2, 3, 4]

In [49]:
x=[1]              #append() is usually the fastest of the three
x.append(2)
x.append(3)
x.append(4)
x

[1, 2, 3, 4]
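The three ways give the same result but do not work the same way underneath. One likely reason for the speed difference (my assumption, not something these notes measure) is that `x + [...]` builds a brand-new list every time, while `+=` and `append()` change the existing list. The names `original`, `alias` and `grown` are made up for this sketch.

```python
original = [1]
alias = original          # second name for the same list

# '+' builds a brand-new list and leaves the old one alone
grown = original + [2]
print(grown is original)  # False: two separate lists

# '+=' and append() change the existing list in place
original += [3]
original.append(4)
print(alias is original)  # True: still the same list
print(alias)
```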

**Example 3**

Here are three ways of incorporating multiple elements.
Notice the difference between the effects of extend() and append()

In [50]:
x = [1, 2, 3]
x += [4, 5, 6]
x

[1, 2, 3, 4, 5, 6]

In [51]:
x=[1, 2, 3]
x.extend([4, 5, 6])
x

[1, 2, 3, 4, 5, 6]

In [52]:
x=[1, 2, 3]
x.append([4, 5, 6])
x

[1, 2, 3, [4, 5, 6]]

# Some loose ends

##  Tuples

Tuples are similar to lists, except they use ( ) and cannot be changed after creation (i.e., they are immutable)

In [53]:
a=(1, 2, 3)     # Define tuple

In [54]:
print(a[0])    # Access data

1


In [55]:
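# The following will NOT work because tuples are immutable
a[0]=-1        # raises the TypeError shown below
a[0]+= [10]    # never reached; this line would also fail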

TypeError: 'tuple' object does not support item assignment
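Since a tuple cannot be changed, the usual workaround is to build a new one. Below is a sketch of two possible ways; the names `point`, `new_point` and `round_trip` are made up for this illustration.

```python
point = (1, 2, 3)   # hypothetical tuple

# Item assignment fails, as in the cell above
try:
    point[0] = -1
    changed = True
except TypeError:
    changed = False

# Option 1: build a new tuple from pieces of the old one
new_point = (-1,) + point[1:]

# Option 2: convert to a list, change it, convert back
as_list = list(point)
as_list[0] = -1
round_trip = tuple(as_list)

print(changed)
print(point)        # the original tuple is untouched
print(new_point)
print(round_trip)
```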

##  Be VERY careful when copying

Variables in Python have subtle features that might make your life miserable if you are not careful. You should be particularly mindful when making copies of lists and arrays.

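For example, if you want to copy a list, you might be tempted to do the following. PLEASE DON’T! Using = does not copy anything; it only gives the same list another name, so a change made through one name shows up under all of them.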

In [58]:
x=[1, 2, 3]
y=x           # DON'T do this!
z=x           # DON'T do this!

In [59]:
x=[1, 2, 3]
y=x.copy()    # Do this instead
z=x.copy()    # Do this instead
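To see why the first version is dangerous, here is a small sketch that changes `x` after "copying" it both ways. The array names (`arr`, `arr_alias`, `arr_copy`) are made up for this illustration.

```python
import numpy as np

x = [1, 2, 3]
y = x            # y is just another name for the same list
z = x.copy()     # z is a separate list with the same contents

x[0] = 99
print(y)   # y changed too
print(z)   # z is unaffected

# The same thing happens with NumPy arrays
arr = np.array([1, 2, 3])
arr_alias = arr
arr_copy = arr.copy()

arr[0] = 99
print(arr_alias)
print(arr_copy)
```

Only the versions made with `copy()` keep their original values.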