<div style="text-align:left;font-size:2em"><span style="font-weight:bolder;font-size:1.25em">SP2273 | Learning Portfolio</span><br><br><span style="font-weight:bold;color:darkred">Storing Data (Need)</span></div>

# What to expect in this chapter

# 1 Lists, Arrays & Dictionaries

Python has several ways to store and manipulate data. Some are shown below.
1. Lists
2. Numpy arrays
3. Dictionaries
4. Tuples
5. Dataframes
6. Classes

## 1.1 Let’s compare

Let's try to store this data:
- I took three courses, which are SP1111, SP2233 and SP4233
- The grades I got for these courses are C, A and A+ respectively

We want to store the data so that each course stays matched with its correct grade.

**Python list**

In [5]:
courses = ["SP1111", "SP2233", "SP4233"]
grades = ["C", "A", "A+"]

#Indexes of courses to their respective grades need to be the same (e.g. both the index for "SP1111" and "C" is 0)
#You need two lists for this

<p></p>

**Numpy array**

In [5]:
import numpy as np
courses = np.array(["SP1111", "SP2233", "SP4233"])
grades = np.array(["C", "A", "A+"])

#Indexes of courses to their respective grades need to be the same (e.g. both the index for "SP1111" and "C" is 0)
#You need two arrays for this

<p></p>

**Dictionary**

In [26]:
course_grades = {
    "SP1111" : "C",
    "SP2233" : "A",
    "SP4233" : "A+"
}

#Uses a key (courses) associated with a value (the respective grade), unlike Python list and Numpy array
#You only need one dictionary for this
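
To see the difference in practice, here is a minimal sketch of looking up the grade for one course with the two-list approach and with the dictionary. The names `position`, `grade_from_lists` and `grade_from_dict` are my own and are only for illustration.

```python
courses = ["SP1111", "SP2233", "SP4233"]
grades = ["C", "A", "A+"]
course_grades = {"SP1111": "C", "SP2233": "A", "SP4233": "A+"}

# Two lists: find the position of the course first, then use it on grades
position = courses.index("SP2233")
grade_from_lists = grades[position]

# Dictionary: the key leads straight to the value
grade_from_dict = course_grades["SP2233"]

print(grade_from_lists, grade_from_dict)
```

With two lists, the lookup only works as long as both lists stay in the same order. The dictionary keeps each course and its grade together.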

## 1.2 Accessing data from a list (or array)

We can return an element from a list or an array by using the square brackets `[]`, and specifying the index of the element to be displayed.

For example, for the `courses` and `grades` arrays:

In [13]:
print(courses[0], grades[0])        #Prints the first element in both the courses and grades array
print(courses[1], grades[1])        #Prints the second element in both the courses and grades array
print(courses[2], grades[2])        #Prints the third element in both the courses and grades array

SP1111 C
SP2233 A
SP4233 A+


Notice that the **first** element has an index of **0**, and **not 1**, while the **second** element has the index **1**, and so on.

We can also do reverse indexing, where the index of the **last** element in the list or array is **-1**, while the **second last** element has the index **-2**, and so on.

In [15]:
print(courses[-1], grades[-1])      #Prints the third element (which is the last element) in both the courses and grades array
print(courses[-3], grades[-3])      #Prints the third last element in both the courses and grades array

SP4233 A+
SP1111 C


## 1.3 Accessing data from a dictionary

We can return a value from a dictionary using `[]` too, but by specifying the **key** instead of an index.

For example, for the `course_grades` dictionary:

In [27]:
print(course_grades["SP1111"])      #Prints the value associated with the SP1111 key
print(course_grades["SP4233"])      #Prints the value associated with the SP4233 key

C
A+


You can get all the keys and all the values using the `.keys()` and `.values()` methods. Note that these return view objects (`dict_keys` and `dict_values`) rather than plain lists. For example:

In [20]:
print(course_grades.keys())         #Prints a view object containing all the dictionary keys
print(course_grades.values())       #Prints a view object containing all the dictionary values

dict_keys(['SP1111', 'SP2233', 'SP4233'])
dict_values(['C', 'A', 'A+'])
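
If an ordinary list is needed, one option is to wrap the result in `list()`. The sketch below also loops over `.items()`, which gives each key together with its value. The names `key_list` and `value_list` are just placeholders of mine.

```python
course_grades = {"SP1111": "C", "SP2233": "A", "SP4233": "A+"}

# list() turns the view objects into ordinary lists
key_list = list(course_grades.keys())
value_list = list(course_grades.values())
print(key_list)
print(value_list)

# .items() gives each key together with its value
for course, grade in course_grades.items():
    print(course, grade)
```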


## 1.4 Higher dimensional lists

We needed two lists for the previous example of `courses` and `grades`. A way around this is to make a 2D list (a list of lists).

In [2]:
new_course_grades = [
    ["SP1111", "C"],
    ["SP2233", "A"],
    ["SP4233", "A+"]
]                                         #new_course_grades instead of course_grades because course_grades was used for the earlier examples

print(new_course_grades)

[['SP1111', 'C'], ['SP2233', 'A'], ['SP4233', 'A+']]


<p></p>

**Accessing data in higher dimensional lists**

We can call out a 1D list from the 2D list using the index numbers of the 1D list, just like an element in a 1D list. For example:

In [30]:
print(new_course_grades[0])        #Prints the 1D list at the first position
print(new_course_grades[1])        #Prints the 1D list at the second position
print(new_course_grades[2])        #Prints the 1D list at the third position

['SP1111', 'C']
['SP2233', 'A']
['SP4233', 'A+']


<p>
</p>

We can also call out individual elements from the 1D list by using **two index numbers**. The general syntax for a list is `list[index of 1D list][index of element in the 1D list]`.

In [3]:
print(new_course_grades[0][1])     #Prints the element in the row at position 0 (first row) and the column at position 1 (second column)

C
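
As a small sketch, the cell below compares how two indices are written for a nested list and for a NumPy array built from it. The names `nested_list`, `nested_array` and `comma_on_list_failed` are my own choices for this illustration.

```python
import numpy as np

nested_list = [["SP1111", "C"], ["SP2233", "A"], ["SP4233", "A+"]]
nested_array = np.array(nested_list)

# Lists need one pair of square brackets per index
print(nested_list[0][1])

# NumPy arrays accept both forms
print(nested_array[0][1])
print(nested_array[0, 1])

# The comma form does not work on a list and raises a TypeError
try:
    nested_list[0, 1]
    comma_on_list_failed = False
except TypeError:
    comma_on_list_failed = True
print(comma_on_list_failed)
```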


# 2 Lists vs. Arrays

## 2.1 Size

We can use the `len` function to count how many elements are in a list or in an array. For a 2D list or array, `len` counts the rows (the inner lists), not the individual values. See the example below.

In [10]:
import numpy as np

list_size = [
    [1, "A"], [2, "B"], [3, "C"], [4, "D"],
    [5, "E"], [6, "F"], [7, "G"], [8, "H"],
    [9, "I"], [10, "J"]
]

array_size = np.array(list_size)

print(len(list_size))
print(len(array_size))

10
10


<p></p>

We can use `.shape` to get the number of rows and columns in a 2D array (note that it does **NOT** work with lists, because lists do not have the `shape` attribute).

In [12]:
array_size.shape

(10, 2)

In the above example, `10` is the number of rows and `2` is the number of columns.
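
For a plain 2D list, a rough way to get the same information is to call `len` twice, assuming every inner list has the same length. This is only a sketch with a shortened version of `list_size`. The names `n_rows` and `n_cols` are hypothetical.

```python
import numpy as np

list_size = [[1, "A"], [2, "B"], [3, "C"]]

n_rows = len(list_size)        # number of inner lists
n_cols = len(list_size[0])     # number of elements in the first inner list
print(n_rows, n_cols)

array_size = np.array(list_size)
print(array_size.shape)        # rows and columns as a tuple
print(array_size.size)         # total number of elements
```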

## 2.2 Arrays are fussy about type

One difference between arrays and lists is that arrays can only contain elements of the same data type (e.g. all elements are of the `str` data type) while lists can contain more than one data type. See the example below, where the `int` and `bool` values in list `list_type` are all converted to `str` in `array_type`.

In [16]:
import numpy as np

list_type = [1, False, "hello"]
array_type = np.array(list_type)

print(list_type)
print(array_type)

[1, False, 'hello']
['1' 'False' 'hello']
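
One way to check which type NumPy settled on is the `dtype` attribute of the array. The sketch below uses two small arrays of my own (`array_mixed` and `array_numbers`). The exact string dtype can differ between systems, so only its kind is printed.

```python
import numpy as np

array_mixed = np.array([1, False, "hello"])
array_numbers = np.array([1, 2.5])

# 'U' is the kind code for Unicode strings
print(array_mixed.dtype.kind)

# An int mixed with a float becomes float
print(array_numbers.dtype)
```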


<p></p>

We can change the type of the elements in an array using the `astype` method. However, as with any type conversion, the change must make sense (e.g. we cannot convert the `str` `"hello"` into an `int`). See below.

In [19]:
array_change_int = np.array([1, 2, 3])
array_change_str = array_change_int.astype(str)

print(array_change_int)
print(array_change_str)

[1 2 3]
['1' '2' '3']


In [22]:
array_change_str1 = np.array(["hello", "hi", "yellow"])
array_change_str1.astype(int)                                #This yields an error

ValueError: invalid literal for int() with base 10: 'hello'
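
In contrast, strings that look like whole numbers can be converted. Below is a minimal sketch that also catches the error with `try`/`except` so the cell keeps running. The names `digits_as_str`, `words` and `conversion_failed` are made up for this example.

```python
import numpy as np

digits_as_str = np.array(["1", "2", "3"])
words = np.array(["hello", "hi", "yellow"])

# These strings represent integers, so the conversion works
digits_as_int = digits_as_str.astype(int)
print(digits_as_int)

# These strings do not, so a ValueError is raised and caught
try:
    words.astype(int)
    conversion_failed = False
except ValueError as error:
    conversion_failed = True
    print("Could not convert:", error)
```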

## 2.3 Adding a number

We can add a number to an array, but **not** a list. See below.

In [25]:
import numpy as np

list_add = [1, 2, 3, 4]
array_add = np.array(list_add)

In [26]:
list_add + 10      #This yields an error

TypeError: can only concatenate list (not "int") to list

In [27]:
array_add + 10     #This adds 10 to all the elements in the array

array([11, 12, 13, 14])
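
If we have to stay with a list, one possible workaround is a list comprehension, which builds a new list by going through the elements one at a time. This is a sketch of my own and `list_plus_ten` is a hypothetical name.

```python
list_add = [1, 2, 3, 4]

# Add 10 to each element in turn and collect the results in a new list
list_plus_ten = [number + 10 for number in list_add]
print(list_plus_ten)
```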

## 2.4 Adding another list

We can add (with `+`) a list to another list, or an array to another array. However, `list + list` concatenates the two lists, while `array + array` adds the arrays element by element.

In [28]:
list1 = [1, 2, 3]
list2 = [4, 5, 6]

list1 + list2       #This concatenates list1 and list2

[1, 2, 3, 4, 5, 6]

In [29]:
import numpy as np

array1 = np.array(list1)
array2 = np.array(list2)

array1 + array2     #Element-wise addition

array([5, 7, 9])

<p></p>

You can also add a list to an array. NumPy treats the list as an array and adds element by element, so the output will be an array and not a list.

In [33]:
list1 + array1

array([2, 4, 6])
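
Because array addition is element-wise, the two arrays need compatible shapes. The sketch below, with two arrays I made up (`array_short` and `array_long`), shows what happens when the lengths differ, while the same lengths are no problem for list concatenation.

```python
import numpy as np

array_short = np.array([1, 2])
array_long = np.array([1, 2, 3])

# Lists of different lengths can still be concatenated
print([1, 2] + [1, 2, 3])

# Arrays with lengths 2 and 3 cannot be added element by element
try:
    array_short + array_long
    shapes_matched = True
except ValueError as error:
    shapes_matched = False
    print(error)
```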

## 2.5 Multiplying by a Number

We can multiply (with `*`) a list or an array by a number. However, `list * number` concatenates copies of the list, while `array * number` multiplies each element by the number.

In [30]:
list1 * 2      #This concatenates two lists of [1, 2, 3]

[1, 2, 3, 1, 2, 3]

In [31]:
array1 * 2     #Element-wise multiplication

array([2, 4, 6])

## 2.6 Squaring

We can square or cube (or any other higher powers) elements in an array, but **not** in a list.

In [34]:
list1 ** 2       #This yields an error

TypeError: unsupported operand type(s) for ** or pow(): 'list' and 'int'

In [35]:
array1 ** 2      #Element-wise squaring

array([1, 4, 9])

## 2.7 Asking questions

We can ask questions using comparison operators (e.g. `>` and `==`) or other operators (e.g. `in`) that have been introduced previously. However, there are some restrictions when we ask questions of lists.

<p></p>

**Example 1**

In [38]:
print(3 in list1)       #This asks whether the element 3 is in list1
print(3 in array1)      #This asks whether the element 3 is in array1

True
True


<p></p>

**Example 2**

For comparison operators such as `>`, only arrays work. This is because arrays do element-wise operations, while a list cannot be compared with a number in this way.

In [39]:
list1 > 2    #This yields an error

TypeError: '>' not supported between instances of 'list' and 'int'

In [42]:
array1 > 2   #Element-wise checking of whether each element is greater than 2

array([False, False,  True])
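
As a sketch, a list comprehension can ask the same question of each list element. For arrays, the `.any()` and `.all()` methods can summarise the element-wise answers. The names `list_answers` and `array_answers` are my own.

```python
import numpy as np

list1 = [1, 2, 3]
array1 = np.array(list1)

# Ask the question of each list element one at a time
list_answers = [number > 2 for number in list1]
print(list_answers)

# Summarise the element-wise answers from the array
array_answers = array1 > 2
print(array_answers.any())   # Is at least one element greater than 2?
print(array_answers.all())   # Are all elements greater than 2?
```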

<p></p>

**Example 3**

For `==`, using a list will not yield an error, but the question is ambiguous: Python compares the whole list with the number and returns a single `True` or `False`, rather than checking each element.

In [44]:
list1 == 2     #Compares the whole list with 2, so a single False is returned

False

In [45]:
array1 == 2    #Element-wise checking of whether each element is equal to 2

array([False,  True, False])
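
To illustrate what `==` does with a list, here is a short sketch comparing `list1` with other lists and with a number. The names `same_order`, `different_order` and `list_vs_number` are only for illustration.

```python
list1 = [1, 2, 3]

# Two lists are equal only if they have the same elements in the same order
same_order = list1 == [1, 2, 3]
different_order = list1 == [3, 2, 1]

# A list is never equal to a single number
list_vs_number = list1 == 2

print(same_order, different_order, list_vs_number)
```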

<p></p>

## 2.8 Mathematics

We can do simple mathematical operations, such as finding the sum or the maximum, on both lists and arrays, but with different syntax.

In [46]:
import numpy as np

example_list = [1, 2, 3, 4, 5]
example_array = np.array(example_list)

<p></p>

For lists, we use the built-in Python functions, such as `sum`, `max` and `min`.

In [48]:
print(max(example_list))
print(min(example_list))
print(sum(example_list))

5
1
15


<p></p>

On the other hand, for arrays these operations are available as methods of the array rather than as separate functions. So, we use `.sum()`, `.max()` and `.min()` for arrays.

In [51]:
print(example_array.max())
print(example_array.min())
print(example_array.sum())

5
1
15
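
NumPy also provides these operations as functions (`np.sum`, `np.max`, `np.min`), and they accept a list as well as an array. The sketch below shows this, along with the `.mean()` method, which has no direct built-in counterpart for lists. The names `list_total` and `array_mean` are hypothetical.

```python
import numpy as np

example_list = [1, 2, 3, 4, 5]
example_array = np.array(example_list)

# NumPy functions work on a plain list too
list_total = np.sum(example_list)
print(list_total, np.max(example_list), np.min(example_list))

# Arrays have extra methods such as .mean()
array_mean = example_array.mean()
print(array_mean)
```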
