<div style="text-align:left;font-size:2em"><span style="font-weight:bolder;font-size:1.25em">SP2273 | Learning Portfolio</span><br><br><span style="font-weight:bold;color:darkred">Storing Data (Need)</span></div>

## 1 Lists, Arrays & Dictionaries

### 1.1 Let’s compare

In [3]:
import numpy as np

# there are 3 basic ways to store data:

# 1. storing information in a Python list

py_super_names = ["Black Widow", "Iron Man", "Doctor Strange"]
py_real_names = ["Natasha Romanoff", "Tony Stark", "Stephen Strange"]

# 2. storing information in a NumPy array

np_super_names = np.array(["Black Widow", "Iron Man", "Doctor Strange"])
np_real_names = np.array(["Natasha Romanoff", "Tony Stark", "Stephen Strange"])

# 3. storing information in a dictionary

superhero_info = {"Natasha Romanoff": "Black Widow",
                  "Tony Stark": "Iron Man",
                  "Stephen Strange": "Doctor Strange"
                  }

# while lists and arrays retrieve values by index, a dictionary retrieves each value by the key it is bound to

### 1.2 Accessing data from a list (or array)

In [4]:
# to access items in a (Python) list or a (NumPy) array, use the item's index. Indexes start from 0.

print(py_real_names[0])
print(py_super_names[0])

# we can use reverse indexing to access items from the end of the list. The last index of the list is -1

print(py_super_names[2])    # Forward indexing 
                            # We need to know the size 
                            # beforehand for this to work.
print(py_super_names[-1])   # Reverse indexing

Natasha Romanoff
Black Widow
Doctor Strange
Doctor Strange


### 1.3 Accessing data from a dictionary

In [5]:
# access an item in a dictionary by its key. The output is the value that key is tied to

print(superhero_info["Natasha Romanoff"])

# you can also use superhero_info.keys() and superhero_info.values() to retrieve all the keys and all the values
# stored in the dictionary

print(superhero_info.keys())
print(superhero_info.values())

Black Widow
dict_keys(['Natasha Romanoff', 'Tony Stark', 'Stephen Strange'])
dict_values(['Black Widow', 'Iron Man', 'Doctor Strange'])
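Two other ways of reading a dictionary are sketched below: `.get()` for a key that may be missing, and `.items()` for looping over keys and values together. The key `"Peter Parker"` and the fallback `"unknown"` are made up for illustration and are not part of the original data.

```python
superhero_info = {"Natasha Romanoff": "Black Widow",
                  "Tony Stark": "Iron Man",
                  "Stephen Strange": "Doctor Strange"}

# .get() returns a fallback value instead of raising a KeyError when the key is missing
# ("Peter Parker" is a hypothetical key that is not in the dictionary)
missing = superhero_info.get("Peter Parker", "unknown")
print(missing)

# .items() yields (key, value) pairs, so both can be used in one loop
pairs = [f"{real} is {hero}" for real, hero in superhero_info.items()]
for line in pairs:
    print(line)
```

Using square brackets with a missing key would raise a `KeyError`, so `.get()` is mainly useful when you are not sure the key exists.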


### 1.4 Higher dimensional lists

In [6]:
py_superhero_info = [['Natasha Romanoff', 'Black Widow'],
                     ['Tony Stark', 'Iron Man'],
                     ['Stephen Strange', 'Doctor Strange']]

# this is a 2-dimensional list: lists nested within a list. It lets you keep related pieces of information (here, a
# real name and a superhero name) together in 1 list instead of spreading them across multiple lists
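
As a small sketch of how items in such a nested list can be read back, one index selects the inner list and a second index selects the item within it. The variable names `second_row` and `hero_name` are only for illustration.

```python
py_superhero_info = [['Natasha Romanoff', 'Black Widow'],
                     ['Tony Stark', 'Iron Man'],
                     ['Stephen Strange', 'Doctor Strange']]

second_row = py_superhero_info[1]       # the whole inner list at index 1
hero_name = py_superhero_info[1][1]     # inner list at index 1, then its item at index 1

print(second_row)
print(hero_name)
```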

## 2 Lists vs. Arrays

### 2.1 Size

In [7]:
py_list_2d = [[1, "A"], [2, "B"], [3, "C"], [4, "D"],
              [5, "E"], [6, "F"], [7, "G"], [8, "H"],
              [9, "I"], [10, "J"]]

np_array_2d = np.array(py_list_2d)      # Reusing the Python list 
                                        # to create a NEW
                                        # NumPy array

# you can find the number of items in a list using the len() function

print(len(py_list_2d))
print(len(np_array_2d))

# note that for 2D lists, len() only counts the items of the outermost list (here, the 10 inner lists) and ignores how
# many items each inner list holds

# you can check the size of every dimension of an array by appending .shape to the array name. Note that this only
# works for NumPy arrays and not Python lists.

'print(py_list_2d.shape)' # would raise an AttributeError: 'list' object has no attribute 'shape'
print(np_array_2d)
print(np_array_2d.shape) # the output will be (10, 2). 2 numbers shown means the array has 2 dimensions, the first
                         # dimension has 10 elements while the second dimension has 2 elements

# now we try to create a 3d array

np_array_3d = np.array([[[1, "A"], [2, "B"], [3, "C"], [4, "D"], [5, "E"]], 
                        [[6, "F"], [7, "G"], [8, "H"], [9, "I"], [10, "J"]],
                        [[11, "K"], [12, "L"], [13, "M"], [14, "N"], [15, "O"]]])

# printing the 3d array will produce separate blocks of rows and columns denoting the "height" of the array

print(np_array_3d)
print(np_array_3d.shape) # the output will be (3, 5, 2). 3 numbers shown means the array has 3 dimensions, the first
                         # has 3 elements, second has 5 elements, and the third has 2 elements

# Thus, .shape lists the sizes of the dimensions in order, from the outermost dimension to the innermost

# .shape does not need parentheses () since it is an attribute of the NumPy array, not a function or method

10
10
[['1' 'A']
 ['2' 'B']
 ['3' 'C']
 ['4' 'D']
 ['5' 'E']
 ['6' 'F']
 ['7' 'G']
 ['8' 'H']
 ['9' 'I']
 ['10' 'J']]
(10, 2)
[[['1' 'A']
  ['2' 'B']
  ['3' 'C']
  ['4' 'D']
  ['5' 'E']]

 [['6' 'F']
  ['7' 'G']
  ['8' 'H']
  ['9' 'I']
  ['10' 'J']]

 [['11' 'K']
  ['12' 'L']
  ['13' 'M']
  ['14' 'N']
  ['15' 'O']]]
(3, 5, 2)
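
Besides `.shape`, NumPy arrays have a few related attributes. The sketch below uses a shortened, made-up 3-row version of the 2D array to compare `len()`, `.shape`, `.ndim` and `.size`.

```python
import numpy as np

# a shortened stand-in for np_array_2d, used only for illustration
small_array_2d = np.array([[1, "A"], [2, "B"], [3, "C"]])

rows = len(small_array_2d)        # size of the first (outermost) dimension only
shape = small_array_2d.shape      # size of every dimension, as a tuple
n_dims = small_array_2d.ndim      # number of dimensions
n_items = small_array_2d.size     # total number of elements across all dimensions

print(rows, shape, n_dims, n_items)
```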


### 2.2 Arrays are fussy about type

In [8]:
# NumPy arrays like to have only one object type in each array whereas Python lists are more accommodating and can host
# various object types within a list

py_list = [1, 1.5, 'A']
np.array(py_list) # converts every item to a str, because 'A' cannot be converted to a number

#> array(['1', '1.5', 'A'], dtype='<U32') <-- '<U32' means little-endian ('<') Unicode strings ('U') of up to 32 characters

# Thus, when dealing with arrays, we must be mindful of the object types and use methods such as astype() to typecast
# according to our needs

array(['1', '1.5', 'A'], dtype='<U32')

Refer to this website for information about the Data Type Objects of NumPy

[dtype in NumPy](https://numpy.org/doc/stable/reference/arrays.dtypes.html)
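
A minimal sketch of typecasting with `astype()` is shown below. The array of numeric strings is made up for illustration; `astype()` returns a new array and leaves the original untouched.

```python
import numpy as np

np_strings = np.array(["1", "1.5", "2"])    # hypothetical array holding numbers stored as text
np_floats = np_strings.astype(float)         # new array with every item converted to a float

print(np_strings.dtype.kind)    # 'U' indicates a Unicode string array
print(np_floats)
print(np_floats.dtype)
```

If any item cannot be read as a number (for example `'A'`), `astype(float)` raises a `ValueError`.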

### 2.3 Adding a number

In [9]:
py_list = [1, 2, 3, 4, 5]
np_array = np.array(py_list)         # Reusing the Python list
                                     # to create a NEW
                                     # NumPy array

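# If we want to add a constant value to all the items in the list

'py_list + 10'        # Won't work! Raises a TypeError, since + on a list only accepts another list
np_array + 10         # returns a NEW array with 10 added to each item; np_array itself is unchanged

# you can also update the values in np_array in place using: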

np_array += 2

print(py_list)
print(np_array)

[1, 2, 3, 4, 5]
[3 4 5 6 7]
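
If the data has to stay in a Python list, one common workaround is a list comprehension, which applies the addition to each item in turn. This is a sketch of that alternative, not something the cell above does.

```python
py_list = [1, 2, 3, 4, 5]

# build a new list by adding 10 to each item; py_list itself is left unchanged
py_list_plus_10 = [x + 10 for x in py_list]

print(py_list_plus_10)
print(py_list)
```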


### 2.4 Adding another list

In [10]:
py_list_1 = [1, 2, 3, 4, 5]
py_list_2 = [10, 20, 30, 40, 50]

np_array_1 = np.array(py_list_1)
np_array_2 = np.array(py_list_2)

print(py_list_1 + py_list_2)
print(np_array_1 + np_array_2)
print(np_array_1 * np_array_2)
print(np_array_1 / np_array_2)

# take note that + on Python lists concatenates them into one longer list, while NumPy applies the arithmetic
# operation element by element (provided the arrays hold numbers) and produces a new array with the same number of
# items, each one the result of the calculation

[1, 2, 3, 4, 5, 10, 20, 30, 40, 50]
[11 22 33 44 55]
[ 10  40  90 160 250]
[0.1 0.1 0.1 0.1 0.1]


### 2.5 Multiplying by a Number

In [11]:
py_list = [1, 2, 3, 4, 5]
np_array = np.array(py_list)

print(py_list*2)
print(np_array*2)

# when a list is multiplied by an integer n, Python repeats the list n times, whereas NumPy multiplies each item in
# the array by that number

[1, 2, 3, 4, 5, 1, 2, 3, 4, 5]
[ 2  4  6  8 10]


### 2.6 Squaring

In [12]:
'print(py_list**2)'                      # Won't work!  
print(np_array**2)

# lists do not define the ** operator at all, so py_list**2 raises a TypeError. Arrays apply the operator to each
# element and produce a new array holding the squared values

[ 1  4  9 16 25]


### 2.7 Asking questions

In [15]:
py_list == 3     # Works, and evaluates to False, because Python asks whether the list as a whole is
                 # equal to the integer 3, which it is not

'py_list > 3'    # Won't work: raises a TypeError, because ordering comparisons (<, >, <=, >=) are not defined
                 # between a list and an int. == is the exception, since it is defined for any pair of objects

print(np_array == 3 )
print(np_array > 3  )

# both of these work because arrays apply the comparison to each element in the array, producing a True or
# False output for each item

[False False  True False False]
[False False False  True  True]
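
The True/False array produced by a comparison can also be used to pick out the matching items, and a list comprehension gives the element-wise comparison for a plain list. The names `py_flags`, `mask` and `np_selected` below are only for illustration.

```python
import numpy as np

py_list = [1, 2, 3, 4, 5]
np_array = np.array(py_list)

py_flags = [x > 3 for x in py_list]    # element-wise comparison on a Python list
mask = np_array > 3                     # Boolean array, one True/False per item
np_selected = np_array[mask]            # keeps only the items where mask is True

print(py_flags)
print(np_selected)
```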


In [14]:
def test(a):
    a += 2
    return a
def test2(a):
    a *= 2
    return a

print(test(2)==test2(2))
print(test==test2)

# two function objects compare equal only if they are the same object (same identity), not if their bodies look
# alike. Comparing their return values is a different matter: test(2) and test2(2) both return 4, so that
# comparison is True.

True
False


### 2.8 Mathematics

In [8]:
py_list = [1, 2, 3, 4, 5]
np_array = np.array(py_list)

# Base Python functions which will work on Python lists:

print(sum(py_list))     # outputs the sum of all the elements in the list
print(max(py_list))     # outputs the element with the maximum value in the list
print(min(py_list))     # outputs the element with the minimum value in the list

# NumPy arrays offer the same operations as methods, which are called by appending them to the end of the array
# name.

print(np_array.sum())
print(np_array.max())
print(np_array.min())
print(np_array.mean())

15
5
1
15
5
1
3.0
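
Base Python has no built-in `mean()` function for lists, so the mean is usually computed from `sum()` and `len()`. The sketch below compares that with the NumPy method; the variable names are only for illustration.

```python
import numpy as np

py_list = [1, 2, 3, 4, 5]
np_array = np.array(py_list)

py_mean = sum(py_list) / len(py_list)    # mean of a list using only built-in functions
np_mean = np_array.mean()                # the same value from the NumPy method

print(py_mean)
print(np_mean)
```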


## Exercise 1 :  Total recall?

1. Two Similarities between Python lists and NumPy arrays
    - They both store multiple items, and both can be nested to hold multi-dimensional data.
    - They both retrieve information from the list/array using the element's index.
    
<br>

2. Two Differences between Python lists and NumPy arrays
    - A Python list is treated as a single entity by operators, so an operation such as + or * acts on the list as a whole (concatenating or repeating it). In contrast, NumPy arrays apply the operation to each individual element.
    - Creating a Python list only requires you to enclose the elements in square brackets, while creating a NumPy array requires you to call the function np.array() and pass it the list enclosed in square brackets.

<br>

3. Definition of a dictionary
    - A dictionary is a way of storing information as key:value pairs, where each value is attached to a key. A specific value is then retrieved by using the key attached to it.

## Exercise 2 :  Index me

In [29]:
py_list = ["a1", "b2", "c3", "d4", "e5", "f6", "g7", "h8", "i9", "j10"]
print(py_list[0])   # Prints 'a1'
print(py_list[2])   # Prints 'c3'
print(py_list[4])   # Prints 'e5'
print(py_list[6])   # Prints 'g7'
print(py_list[8])   # Prints 'i9'

# attempting to write a function that prints the odd-numbered elements (the 1st, 3rd, 5th, ...) for me

def get_odd_num(list):
    index = 0
    for i in range(len(list)):
        if index >= len(list):      # >= rather than == so that lists of odd length also stop safely
            break
        else:
            print(list[index])
            index += 2

get_odd_num(py_list)

# the function picks items by position, so it assumes the elements are already in order

a1
c3
e5
g7
i9
a1
c3
e5
g7
i9


In [55]:
# Changes from Yuan Zhe's comments

def get_odd_num(given_list):                    # Variable names should appear in black (at least in Jupyter). If the
                                                # name turns green (the same colour as built-in functions), you are
                                                # reusing the name of a built-in such as list. The name then refers to
                                                # your variable instead, and the built-in is shadowed (unusable under
                                                # that name) within that scope.
    for i in range(0, len(given_list), 2):      # Remember that the range() function has a start, stop and step.
        print(given_list[i])                    # With a step of 2, i takes the values 0, 2, 4, ..., so we can just
                                                # use i as the index to call.

get_odd_num(py_list)

a1
c3
e5
g7
i9
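
Another option, sketched here as an alternative rather than as the expected answer, is list slicing. The slice `[start:stop:step]` follows the same start, stop and step idea as `range()`.

```python
py_list = ["a1", "b2", "c3", "d4", "e5", "f6", "g7", "h8", "i9", "j10"]

# [::2] means: from the start, to the end, taking every 2nd item
odd_items = py_list[::2]

for item in odd_items:
    print(item)
```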


## Exercise 3 :  Capitalise Heros

In [32]:
superhero_info = {"Natasha Romanoff": "Black Widow",
                  "Tony Stark": "Iron Man",
                  "Stephen Strange": "Doctor Strange"
                  }

superhero_info['Tony Stark'] = superhero_info['Tony Stark'].upper()
superhero_info['Natasha Romanoff'] = superhero_info['Natasha Romanoff'].upper()
superhero_info['Stephen Strange'] = superhero_info['Stephen Strange'].upper()

print(superhero_info)

# i don't know how to create a program for this so i just hardcoded it

{'Natasha Romanoff': 'BLACK WIDOW', 'Tony Stark': 'IRON MAN', 'Stephen Strange': 'DOCTOR STRANGE'}
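
The hardcoded updates above can be replaced by a loop over the dictionary's keys. This is one possible sketch; it changes the values only (never the keys), which is safe to do while looping.

```python
superhero_info = {"Natasha Romanoff": "Black Widow",
                  "Tony Stark": "Iron Man",
                  "Stephen Strange": "Doctor Strange"}

# looping over a dictionary yields its keys one at a time
for real_name in superhero_info:
    superhero_info[real_name] = superhero_info[real_name].upper()

print(superhero_info)
```

The same loop works unchanged if more superheroes are added to the dictionary.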


## Exercise 4 :  How many ones

In [51]:
numbers=[45, 60, 1, 30, 96, 1, 96, 57, 16, 1,
        99, 62, 86, 43, 42, 60, 59, 1, 1, 35,
        83, 47, 34, 28, 68, 23, 22, 92, 1, 79,
        1, 29, 94, 72, 46, 47, 1, 74, 32, 20,
        8, 37, 35, 1, 89, 29, 86, 19, 43, 61]

# function which iterates through the elements of the list, checks whether each one is equal to the number passed in
# as a parameter, and counts how many times that is true.

def total_count_iter(array, number): 
    count = 0 
    for i in range(len(array)):
        if array[i] == number:      # use the parameter, not the global np_numbers (which is defined later)
            count += 1
    return count

print(total_count_iter(numbers, 1))

# function which compares a NumPy array against the number passed in as a parameter, creating a new array of Boolean
# True or False values depending on whether each item matches. The function then returns the sum of that Boolean
# array. Since True = 1 and False = 0, the sum is the number of True values, i.e. the number of matches.

np_numbers = np.array(numbers)

def total_count(array, number):
    matches = array == number      # Boolean array: True where the item equals number
    return matches.sum()           # summing the matched values themselves would only be correct for number == 1

print(total_count(np_numbers, 1))

9
9
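
Both Python lists and NumPy already provide ready-made ways to count matches. The sketch below uses a short made-up list (`sample`) rather than the full `numbers` list, and checks a value other than 1 to show that the count does not depend on the value being counted.

```python
import numpy as np

sample = [1, 60, 1, 30, 60, 1]    # hypothetical short list, for illustration only

# list.count() returns how many times a value appears in a list
ones_in_list = sample.count(1)
sixties_in_list = sample.count(60)

# np.count_nonzero() counts the True values in a Boolean array
np_sample = np.array(sample)
ones_in_array = np.count_nonzero(np_sample == 1)
sixties_in_array = np.count_nonzero(np_sample == 60)

print(ones_in_list, ones_in_array)
print(sixties_in_list, sixties_in_array)
```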
