<div style="text-align:left;font-size:2em"><span style="font-weight:bolder;font-size:1.25em">SP2273 | Learning Portfolio</span><br><br><span style="font-weight:bold;color:darkred">Storing Data (Good)</span></div>

In [2]:
import numpy as np

# What to expect in this chapter

# 1 Subsetting: Indexing and Slicing

**subsetting**: selecting data in a list or array  
**indexing**: a type of subsetting that selects a single data element  
**slicing**: a type of subsetting that selects a range of data elements

**ESC+Z**: shortcut to undo cell operation  
`&` within `\begin{align}` and `\end{align}` can be used to align equations in Markdown (all `&` will align)

## 1.1 Lists & Arrays in 1D | Subsetting & Indexing

In [7]:
operator_list=["Eyjafjalla", "Saria", "Kal'tsit", "Shining", "Exusiai",
         "Qiubai", "Horn", "Skadi", "Surtr", "Texas"]
operator_array=np.array(operator_list)

operator_list[1]

'Saria'

In [3]:
operator_list[-1]

'Texas'

In [14]:
operator_array[0:4] # gives positions 1-4 (indexes 0-3)

array(['Eyjafjalla', 'Saria', "Kal'tsit", 'Shining'], dtype='<U10')

In [15]:
operator_array[1:5] # gives positions 2-5 (indexes 1-4)

array(['Saria', "Kal'tsit", 'Shining', 'Exusiai'], dtype='<U10')

In [29]:
operator_array[0:7:2] # gives positions 1-7 (indexes 0-6) in steps of 2 i.e. first and every other element

array(['Eyjafjalla', "Kal'tsit", 'Exusiai', 'Horn'], dtype='<U10')

In [30]:
operator_list[0:7:3] # gives positions 1-7 (indexes 0-6) in steps of 3 i.e. first and every third element

['Eyjafjalla', 'Shining', 'Horn']

In [17]:
operator_array[5:] # gives position 6 (index 5) to the end

array(['Qiubai', 'Horn', 'Skadi', 'Surtr', 'Texas'], dtype='<U10')

In [18]:
operator_list[:5] # gives position 1-5 (index 0-4)

['Eyjafjalla', 'Saria', "Kal'tsit", 'Shining', 'Exusiai']

In [21]:
operator_list[5:3:-1] # gives position 5-6 (index 4-5) in reverse

['Qiubai', 'Exusiai']

In [9]:
operator_list[5:0:-1]

['Qiubai', 'Exusiai', 'Shining', "Kal'tsit", 'Saria']

In [22]:
operator_list[3:5:1] # gives position 4-5 (index 3-4)

['Shining', 'Exusiai']

In [27]:
operator_list[::-1] # reverses the list

['Texas',
 'Surtr',
 'Skadi',
 'Horn',
 'Qiubai',
 'Exusiai',
 'Shining',
 "Kal'tsit",
 'Saria',
 'Eyjafjalla']

In [34]:
operator_list[::1] # returns the whole list (as a copy)

['Eyjafjalla',
 'Saria',
 "Kal'tsit",
 'Shining',
 'Exusiai',
 'Qiubai',
 'Horn',
 'Skadi',
 'Surtr',
 'Texas']

In [35]:
operator_list[::2]

['Eyjafjalla', "Kal'tsit", 'Exusiai', 'Horn', 'Surtr']

slicing operations on a list or array:  
`[x:y:z]` for positive `z`  
`x` is the first index of the range to be sliced (defaults to the start if blank)  
`y-1` is the last index of the range to be sliced (defaults to the end if blank)  
`z` is the step (index jump, default 1 if blank)  
- `z=1` : standard indexing
- `z=-1` : reverse indexing
- `z=2,3 etc.` : every `z`-th element (can be positive or negative)

when `z` is negative, the slice runs backwards: `x` becomes the upper index where the slice starts (defaults to the last element if blank),  
and `y+1` becomes the lower index where the slice ends (defaults to the first element if blank)
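
a quick check of the `[x:y:z]` rules above, as a sketch only: a slice picks the same indices that `range(x, y, z)` generates, so the stop `y` is never included. the list `letters` is a made-up example and not part of the notebook.

```python
letters = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"]

# positive step: start at index 2, stop before index 8, jump 3 at a time
forward = letters[2:8:3]
forward_indices = list(range(2, 8, 3))      # indices 2 and 5

# negative step: start at index 7, stop before index 2, jump back 2 at a time
backward = letters[7:2:-2]
backward_indices = list(range(7, 2, -2))    # indices 7, 5 and 3

print(forward, forward_indices)
print(backward, backward_indices)
```

thinking of a slice as `range(x, y, z)` over the indices makes it easier to predict where a negative step will stop.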

i tried `[5:0:-1]` and could only access indices -5 to -9, but when i tried `[5::-1]` i could access indices -5 to -10. why is this so?

subsetting: selecting a portion  
slicing: selecting a range of indices   
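when slicing with a step of 1, only $(y-x)$ elements are given (which explains why it ends 1 before `y`)  
blank endpoint = undefined endpoint, so the slice goes all the way to the end (or to the start when the step is negative)  
if `0` is put then it is a defined endpoint and Python stops 1 before it, so `[5:0:-1]` never reaches index 0 while `[5::-1]` does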
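
the `[5:0:-1]` versus `[5::-1]` question can be checked directly. this is a minimal sketch that reuses the operator names from section 1.1; the variable names `names`, `stop_at_zero` and `no_stop` are chosen here for illustration.

```python
names = ["Eyjafjalla", "Saria", "Kal'tsit", "Shining", "Exusiai",
         "Qiubai", "Horn", "Skadi", "Surtr", "Texas"]

stop_at_zero = names[5:0:-1]   # stop=0 is excluded, so index 0 is never reached
no_stop = names[5::-1]         # blank stop: keep going until the list runs out

print(len(stop_at_zero), stop_at_zero[-1])   # 5 Saria
print(len(no_stop), no_stop[-1])             # 6 Eyjafjalla

# writing None for the stop is the same as leaving it blank
print(names[5:None:-1] == no_stop)           # True
```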

## 1.2 Arrays only | Subsetting by masking

In [3]:
array1=np.array([1,2,3,4,5,6,7,8,9,10])
mask1 = array1 == 4
array1[mask1]

array([4])

In [4]:
array1[array1 > 2] # mask applied directly to the array without creating an additional mask variable

array([ 3,  4,  5,  6,  7,  8,  9, 10])

In [59]:
array1[~(array1>2)]

array([1, 2])

In [61]:
array1[(array1>2) & (array1<5)]

array([3, 4])

In [62]:
array1[(array1>2) | (array1<8)]

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

numpy arrays can use masking to perform subsetting by filtering an array for a given condition  
the mask will only show elements where the condition is **true**  
using `~(mask)` (Bitwise NOT) will instead show elements where the condition is **false**  
using `(mask)&(mask2)` (Bitwise AND, not `and`) allows for the use of 2 or more filters, and will return elements that are **true** for all masks  
using `(mask)|(mask2)` (Bitwise OR, not `or`) returns elements that are true for **1 or both** masks  
brackets are needed around each condition because the bitwise operators `&` and `|` are evaluated before comparisons such as `>` and `<`

a mask is itself an array of `True` and `False` values, one for each element  
masks also work on multidimensional arrays (the mask then has the same shape as the array)
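
a small sketch of what a mask actually contains and how it behaves on a 2D array. the arrays `values` and `grid` are made-up examples, not taken from the cells above.

```python
import numpy as np

values = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

mask = values > 2                # a boolean array, one True/False per element
print(mask.dtype, mask.sum())    # bool 8  (sum counts the True entries)

# parentheses are needed because & binds more tightly than > and <
between = values[(values > 2) & (values < 5)]

# same idea in 2D: the mask has the same shape as the array,
# and the selected elements come back as a flat 1D array
grid = np.array([[1, 2, 3],
                 [4, 5, 6]])
even = grid[grid % 2 == 0]

print(between)   # [3 4]
print(even)      # [2 4 6]
```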

## 1.3 Lists & Arrays in 2D | Indexing & Slicing

In [5]:
list2 = [[1, "A"], [2, "B"], [3, "C"], [4, "D"],
              [5, "E"], [6, "F"], [7, "G"], [8, "H"],
              [9, "I"], [10, "J"]]

array2 = np.array(list2)

In [67]:
list2[3] # index 3 from list2

[4, 'D']

In [66]:
list2[3][1] # index 1 from index 3 of list2

'D'

In [11]:
array2[6,0] # index 0 from index 6 of array2, note the different syntax and that ints are stored as strings (1 data type only)

'7'

In [74]:
list2[:3] # indexes 0-2 from list2

[[1, 'A'], [2, 'B'], [3, 'C']]

In [75]:
array2[0:5] # indexes 0-4 from array2

array([['1', 'A'],
       ['2', 'B'],
       ['3', 'C'],
       ['4', 'D'],
       ['5', 'E']], dtype='<U11')

In [76]:
list2[:3][0] # index 0 of the sliced list (indexes 0-2 from list2)

[1, 'A']

In [80]:
array2[5:3:-1, 1] # index 1 from each of indexes 5 and 4 (in that order) from array2

array(['F', 'E'], dtype='<U11')

In [79]:
array2[:, 1] # index 1 from all indexes in array2

array(['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J'], dtype='<U11')

syntax for lists: `list[x:y:z][w]` first slices indexes x to y-1 with index jump z, then pulls index w (one whole row) from the sliced list  
syntax for arrays: `array[x:y:z, w]` pulls index w from each of indexes x to y-1 with index jump z from array  
(the `x:y:z` syntax works the same way as for the sliced list/array above; for a list `[w]` indexes the sliced result, while for an array `w` indexes along the second dimension)
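
the difference between the two syntaxes is easiest to see side by side. this is a sketch with a shortened version of `list2`; the names `pairs` and `pairs_array` are made up for the example.

```python
import numpy as np

pairs = [[1, "A"], [2, "B"], [3, "C"], [4, "D"]]
pairs_array = np.array(pairs)

# list: slice first, then [1] picks the second ROW of the sliced list
row_from_list = pairs[0:3][1]

# array: [0:3, 1] picks COLUMN 1 from each of rows 0-2
column_from_array = pairs_array[0:3, 1]

# to get a column out of a nested list, loop over the sliced rows
column_from_list = [row[1] for row in pairs[0:3]]

print(row_from_list)       # [2, 'B']
print(column_from_array)   # ['A' 'B' 'C']
print(column_from_list)    # ['A', 'B', 'C']
```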

## 1.4 Growing lists

In [82]:
x=[1,2]*5
x

[1, 2, 1, 2, 1, 2, 1, 2, 1, 2]

In [91]:
x=[1,2,3]+[4,5]
x*= 2
x.append([2,2,2,2,2])
type(x[10])

list

In [94]:
x=[1,2,3]+[4,5]
x.extend([6,7,8])
#x.extend(9,10)  # TypeError: .extend() takes exactly one argument
x

[1, 2, 3, 4, 5, 6, 7, 8]

lists can be multiplied by an integer to expand them (lists cannot be divided)  
two lists can be added together with `+`, which joins the second list onto the end of the first (of the ways to grow a list, `.append()` is the **fastest**)  
`.append()` can only take one argument (1 element) at a time. `.append()` can also add lists and strings as a single element  
to append multiple elements use `.extend()`, it also takes a single argument but if a list is given it adds all list elements as individual elements

`.append()` can be used to add lists to lists (the appended list becomes a single nested element, which gives a higher dimension)  
if elements are to be added to the front of the list, addition can be done `[1,2,3]+x` instead (lists have no `.prepend()` method; `x.insert(0, value)` also works but is slow for long lists)
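
a minimal sketch comparing the ways of growing a list described above. the variable names are made up for the example, and each method starts from a fresh copy of `base` so the results do not interfere with each other.

```python
base = [1, 2, 3]

appended = base.copy()
appended.append([4, 5])      # the whole list becomes ONE new element

extended = base.copy()
extended.extend([4, 5])      # each element is added separately

joined = [0] + base          # + builds a new list; base itself is unchanged

in_front = base.copy()
in_front.insert(0, 0)        # inserting at index 0 changes the list in place

print(appended)   # [1, 2, 3, [4, 5]]
print(extended)   # [1, 2, 3, 4, 5]
print(joined)     # [0, 1, 2, 3]
print(in_front)   # [0, 1, 2, 3]
```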

In [17]:
x=[1,2,3]
w=np.array(x)
x.insert(1,2)
print(x)

[1, 2, 2, 3]


`.insert(x,y)` can be used to insert elements at any position in a list (NumPy arrays have no `.insert()` method; `np.insert(array, x, y)` returns a new array instead):  
`x` is the index in the new list where element `y` will be inserted
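
inserting works differently for lists and arrays. the sketch below uses the list method `.insert()` and the NumPy function `np.insert()`; the names `items`, `arr` and `new_arr` are made up for the example.

```python
import numpy as np

items = [1, 2, 3]
items.insert(1, 99)                 # list method: changes items in place

arr = np.array([1, 2, 3])
new_arr = np.insert(arr, 1, 99)     # NumPy function: returns a NEW array

print(items)     # [1, 99, 2, 3]
print(new_arr)   # [ 1 99  2  3]
print(arr)       # [1 2 3]  (the original array is unchanged)
```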

# Some loose ends

## 1.5 Tuples

In [102]:
a=(1,2,3)
print(a[1])

2


In [104]:
b=(1,2,3)
a+=b
a

(1, 2, 3, 1, 2, 3, 1, 2, 3)

In [105]:
a=(1,2,3)
a[0]=3

TypeError: 'tuple' object does not support item assignment

In [100]:
list3=[1,2,3]
list3[0]+=1
list3

[2, 2, 3]

In [101]:
array3=np.array(list3)
array3[0]=3
array3

array([3, 2, 3])

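adding 2 tuples together with `+=` joins them into a new, longer tuple (the 9 elements above are most likely from running the cell twice without resetting `a`)  
data in tuples cannot be modified (does not support item assignment)  
data in lists and arrays can be modified by assigning to a specific index

tuples are intended as failsafes against human error: data that should not change cannot be modified by accident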
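
a short sketch of what `+=` does to a tuple and what happens on item assignment. the names `point`, `original` and `extra` are made up for the example.

```python
point = (1, 2, 3)
original = point     # keep a second name for the first tuple
extra = (4, 5)

point += extra       # builds a NEW tuple and rebinds the name `point`

print(point)               # (1, 2, 3, 4, 5)
print(original)            # (1, 2, 3)  (the first tuple was never changed)
print(point is original)   # False

try:
    point[0] = 99    # item assignment is not allowed on a tuple
except TypeError as err:
    print("TypeError:", err)
```

running the `+=` line again would add `extra` a second time, which is one way to end up with more elements than expected in a notebook.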

## 1.6 Be VERY careful when copying

In [113]:
x=[1,2,3]
y=x
print(y)
x[0]+=5
print(y)

[1, 2, 3]
[6, 2, 3]


In [114]:
x=[1,2,3]
y=x.copy()
print(y)
x[0]+=5
print(y)

[1, 2, 3]
[1, 2, 3]


when copying lists and arrays, use `.copy()` instead of simply assigning them to new variables  
this is because lists and arrays are mutable, and `y=x` does not make a copy: both names refer to the same object, so a change made through one name shows up through the other

every object in Python has an ID number; after `y=x` both `y` and `x` have the same ID number  
after `y=x.copy()`, `y` and `x` have different IDs (can be checked with `id(x)` and `id(y)`)
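
one more thing worth checking, as a sketch: `.copy()` on a list is a shallow copy, so nested lists inside it are still shared. `copy.deepcopy()` from the standard library copies the nested lists too. the variable names are made up for the example.

```python
import copy

original = [[1, 2], [3, 4]]

alias = original                   # same object, no copy made
shallow = original.copy()          # new outer list, inner lists still shared
deep = copy.deepcopy(original)     # inner lists are copied too

original[0][0] = 99                # change an inner list
original.append([5, 6])            # change the outer list

print(alias is original)   # True
print(alias)               # [[99, 2], [3, 4], [5, 6]]
print(shallow)             # [[99, 2], [3, 4]]
print(deep)                # [[1, 2], [3, 4]]
```

for a flat list like `[1,2,3]`, `.copy()` is enough; the difference only matters when the list contains other mutable objects.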

# Exercises & Self-Assessment

In [None]:



# Your solution here




## Footnotes