<div style="text-align:left;font-size:2em"><span style="font-weight:bolder;font-size:1.25em">SP2273 | Learning Portfolio</span><br><br><span style="font-weight:bold;color:darkred">Storing Data (Need)</span></div>

# What to expect in this chapter

Ways to store data:
1. Lists.
2. Numpy arrays.
3. Dictionaries.
4. Tuples.
5. Dataframes.
6. Classes.

This chapter focuses on lists, arrays, dictionaries and tuples.\
The choice of storage structure influences how one thinks about data.

# 1 Lists, Arrays & Dictionaries

## 1.1 Let’s compare

Storing the same information using lists, arrays and a dictionary.
- 'Lists' refer to Python lists.
- 'Arrays' refer to NumPy arrays.

### Python Lists:

In [1]:
py_super_names = ["Black Widow", "Iron Man", "Doctor Strange"]
py_real_names = ["Natasha Romanoff", "Tony Stark", "Stephen Strange"]

### Numpy Arrays:

In [3]:
import numpy as np
np_super_names = np.array(["Black Widow", "Iron Man", "Doctor Strange"])
np_real_names = np.array(["Natasha Romanoff", "Tony Stark", "Stephen Strange"])

### Dictionary:

In [13]:
superhero_info = {
    "Natasha Romanoff": "Black Widow",
    "Tony Stark": "Iron Man",
    "Stephen Strange": "Doctor Strange"
}

Points to note:
- A dictionary stores each key with its associated value, separated by `:`.
- A dictionary holds keys and values in 1 structure; lists and arrays need 2 separate structures (one for each set of names).
- With 2 lists or arrays, the order of the elements must match so that items at the same position are associated with each other.
- `py` and `np` are added in front of variable names for clarity; any valid variable name can be used, except Python keywords.
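The points above can be sketched as follows, reusing the names from the cells in this section. The variables `position`, `from_lists`, `from_dict` and `built_dict` are illustrative only and not part of the original notes; `list.index()`, `zip()` and `dict()` are standard Python.

```python
# Two parallel lists: the association exists only through matching positions.
py_super_names = ["Black Widow", "Iron Man", "Doctor Strange"]
py_real_names = ["Natasha Romanoff", "Tony Stark", "Stephen Strange"]

# Find the position of a real name, then reuse that position in the other list.
position = py_real_names.index("Tony Stark")
from_lists = py_super_names[position]

# One dictionary: the key leads directly to its value.
superhero_info = {
    "Natasha Romanoff": "Black Widow",
    "Tony Stark": "Iron Man",
    "Stephen Strange": "Doctor Strange",
}
from_dict = superhero_info["Tony Stark"]

# Two lists in matching order can be paired up into a dictionary.
built_dict = dict(zip(py_real_names, py_super_names))

print(from_lists, from_dict)
```

If the order of one list changes, the list-based lookup silently returns the wrong name, while the dictionary lookup is unaffected.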

## 1.2 Accessing data from a list (or array)

Use the index that corresponds to the data's position.\
Python is a zero-indexed language; counting/ indexing starts at 0.\
To access a particular element in a list/ array, specify the relevant index.

E.g. 1:

In [14]:
py_real_names[0]

'Natasha Romanoff'

E.g. 2:

In [15]:
py_super_names[0]

'Black Widow'

E.g. 3:

In [17]:
py_super_names[2]

'Doctor Strange'

In [18]:
py_super_names[-1]

'Doctor Strange'

A negative index counts from the back of the list.\
Useful for accessing the last element without knowing the list size.\
Index `-1` gives the last element.
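A small sketch of how negative indices relate to positive ones, assuming the same `py_super_names` list as above. The variable names `last_by_negative`, `last_by_length` and `second_last` are illustrative only.

```python
py_super_names = ["Black Widow", "Iron Man", "Doctor Strange"]

# -1 is the last element, whatever the length of the list.
last_by_negative = py_super_names[-1]

# The equivalent positive index needs the list size.
last_by_length = py_super_names[len(py_super_names) - 1]

# -2 is the second-last element, and so on.
second_last = py_super_names[-2]

print(last_by_negative, second_last)
```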

## 1.3 Accessing data from a dictionary

Dictionaries have a key-value structure; use a key to access its value.

In [19]:
superhero_info["Natasha Romanoff"]

'Black Widow'

Accessing all keys:

In [20]:
superhero_info.keys()

dict_keys(['Natasha Romanoff', 'Tony Stark', 'Stephen Strange'])

Accessing all values:

In [21]:
superhero_info.values()

dict_values(['Black Widow', 'Iron Man', 'Doctor Strange'])
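The outputs above are `dict_keys` and `dict_values` view objects rather than lists. As a sketch, they can be converted with `list()`, and keys and values can be looped over together with `.items()` (a standard dictionary method that is not covered in the notes above). The names `real_names`, `super_names` and `pairs` are illustrative only.

```python
superhero_info = {
    "Natasha Romanoff": "Black Widow",
    "Tony Stark": "Iron Man",
    "Stephen Strange": "Doctor Strange",
}

# Convert the views into ordinary lists so they can be indexed.
real_names = list(superhero_info.keys())
super_names = list(superhero_info.values())

# Loop over each key together with its value.
pairs = []
for real_name, super_name in superhero_info.items():
    pairs.append(f"{real_name} is {super_name}")

print(pairs[0])
```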

## 1.4 Higher dimensional lists

Instead of having 2 separate lists, can use a single 2D list (or array).

In [22]:
py_superhero_info = [["Natasha Romanoff", "Black Widow"],
                     ["Tony Stark", "Iron Man"],
                     ["Stephen Strange", "Doctor Strange"]]
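A sketch of how elements of a 2D structure might be accessed, assuming the `py_superhero_info` list above. The array `np_superhero_info` and the other variable names are illustrative additions, not part of the original notes.

```python
import numpy as np

py_superhero_info = [["Natasha Romanoff", "Black Widow"],
                     ["Tony Stark", "Iron Man"],
                     ["Stephen Strange", "Doctor Strange"]]
np_superhero_info = np.array(py_superhero_info)

# A 2D list is a list of lists: the first index picks a row, the second an item in it.
first_row = py_superhero_info[0]
super_name_from_list = py_superhero_info[1][1]

# A 2D array accepts both indices inside one pair of brackets.
super_name_from_array = np_superhero_info[1, 1]

# An array can also return a whole column at once.
all_super_names = np_superhero_info[:, 1]

print(super_name_from_list, all_super_names)
```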

# 2 Lists vs. Arrays

Important to know the similarities and differences between lists and arrays.

## 2.1 Size

Use the length function to find the number of elements in a list or array.\
Format: `len()`.

In [24]:
py_list_2d = [[1,"A"],[2,"B"],[3,"C"],[4,"D"],[5,"E"],
              [6,"F"],[7,"G"],[8,"H"],[9,"I"],[10,"J"]]
np_array_2d = np.array(py_list_2d)  # Reusing Python list to create new NumPy array.

### Lists:

In [25]:
len(py_list_2d)

10

### Arrays:

In [27]:
len(np_array_2d)

10

In [28]:
np_array_2d.shape

(10, 2)

`.shape` does not have brackets `()`.\
`.shape` is a property/ attribute of a NumPy array, not a function.\
For arrays, `len()` gives only the size of the first dimension, while `.shape` gives the size of every dimension.
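A short sketch of how `len()` and `.shape` relate, using a smaller array than the one above. The `.size` attribute (total number of elements) is a standard NumPy attribute that is not mentioned in the notes; the variable names are illustrative only.

```python
import numpy as np

np_array_2d = np.array([[1, "A"], [2, "B"], [3, "C"]])

rows = len(np_array_2d)      # size of the first dimension only
shape = np_array_2d.shape    # size of every dimension, as a tuple
total = np_array_2d.size     # total number of elements

print(rows, shape, total)
```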

## 2.2 Arrays are fussy about type

Arrays only accept 1 data type.\
Lists can accept multiple data types.

In [29]:
py_list = [1, 1.5, 'A']
np_array = np.array(py_list)

### Lists:

In [30]:
py_list

[1, 1.5, 'A']

### Arrays:

In [31]:
np_array

array(['1', '1.5', 'A'], dtype='<U32')

Note: Numbers are converted to strings (as shown by the quotes `' '`) in the array.\
Can change the data type with the 'hidden' function (array method) `astype()`.
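A minimal sketch of changing an array's type, assuming an array whose strings all look like numbers. The arrays `np_text` and `np_numbers` are illustrative only. Text such as `'A'` cannot be turned into a number, so the conversion is expected to fail for the mixed array shown above.

```python
import numpy as np

# An array of strings that all represent numbers.
np_text = np.array(["1", "1.5", "2"])

# astype() returns a new array with the requested type.
np_numbers = np_text.astype(float)
total = np_numbers.sum()
print(np_numbers.dtype, total)

# A string that is not a number cannot be converted.
try:
    np.array(["1", "1.5", "A"]).astype(float)
    conversion_failed = False
except ValueError:
    conversion_failed = True

print(conversion_failed)
```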

## 2.3 Adding a number

In [32]:
py_list = [1,2,3,4,5]
np_array = np.array(py_list)

### Lists:

In [33]:
py_list + 10

TypeError: can only concatenate list (not "int") to list

### Arrays:

In [34]:
np_array + 10

array([11, 12, 13, 14, 15])

Note: Adding a number to a list raises a `TypeError`; `+` on a list only joins it to another list.

## 2.4 Adding another list

In [42]:
py_list_1 = [1,2,3,4,5]
py_list_2 = [10,20,30,40,50]

np_array_1 = np.array(py_list_1)
np_array_2 = np.array(py_list_2)

### Lists:

In [41]:
py_list_1 + py_list_2

[1, 2, 3, 4, 5, 10, 20, 30, 40, 50]

Grows the list by joining (concatenating) the 2 lists into 1 list.

### Arrays:

In [43]:
np_array_1 + np_array_2

array([11, 22, 33, 44, 55])

Adds the elements at the same position in the 2 arrays together; an element-wise operation.

## 2.5 Multiplying by a Number

In [44]:
py_list = [1,2,3,4,5]
np_array = np.array(py_list)

### Lists:

In [45]:
py_list*2

[1, 2, 3, 4, 5, 1, 2, 3, 4, 5]

Grows the list by repeating its elements the specified number of times.

### Arrays:

In [46]:
np_array*2

array([ 2,  4,  6,  8, 10])

Multiplies each element by the specified number.

## 2.6 Squaring

In [47]:
py_list = [1,2,3,4,5]
np_array = np.array(py_list)

### Lists:

In [48]:
py_list**2

TypeError: unsupported operand type(s) for ** or pow(): 'list' and 'int'

### Arrays:

In [49]:
np_array**2

array([ 1,  4,  9, 16, 25])

Note: Lists do not support `**`; squaring a list raises a `TypeError`.

## 2.7 Asking questions

In [51]:
py_list = [1,2,3,4,5]
np_array = np.array(py_list)

### Lists:

E.g. 1:

In [52]:
py_list == 3

False

Runs without error, but the question asked is whether the whole list equals the number 3 (it does not), not whether any element equals 3.

E.g. 2:

In [55]:
py_list > 3

TypeError: '>' not supported between instances of 'list' and 'int'

Does not work; a list cannot be compared with a number using `>`.

### Arrays:

E.g. 1:

In [56]:
np_array == 3

array([False, False,  True, False, False])

E.g. 2:

In [58]:
np_array > 3

array([False, False, False,  True,  True])
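The True/False arrays above can be put to use. The sketch below filters and counts with such an array (often called a mask), and shows one possible way to ask similar questions of a list. Names such as `mask` and `list_bigger` are illustrative and not from the original notes.

```python
import numpy as np

py_list = [1, 2, 3, 4, 5]
np_array = np.array(py_list)

# The comparison gives one True/False per element.
mask = np_array > 3

# Use the mask to keep only the elements where it is True.
bigger_than_3 = np_array[mask]

# True counts as 1, so summing the mask counts the matches.
count_bigger = mask.sum()

# Is at least one element equal to 3?
any_equal_3 = (np_array == 3).any()

# Possible list equivalents: the 'in' operator and a list comprehension.
list_has_3 = 3 in py_list
list_bigger = [x for x in py_list if x > 3]

print(bigger_than_3, count_bigger, any_equal_3, list_has_3, list_bigger)
```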

## 2.8 Mathematics

In [51]:
py_list = [1,2,3,4,5]
np_array = np.array(py_list)

### Lists:

E.g. 1:

In [59]:
sum(py_list)

15

E.g. 2:

In [60]:
max(py_list)

5

E.g. 3:

In [61]:
min(py_list)

1

`sum()`, `max()` and `min()` are base Python functions.

E.g. 4:

In [62]:
py_list.sum()

AttributeError: 'list' object has no attribute 'sum'

Does not work; lists have no `.sum()` method and are limited in carrying out mathematical functions.

### Arrays:

E.g. 1:

In [64]:
np_array.sum()

15

E.g. 2:

In [66]:
np_array.max()

5

E.g. 3:

In [67]:
np_array.min()

1

E.g. 4:

In [68]:
np_array.mean()

3.0

E.g. 5:

In [69]:
np_array.std()

1.4142135623730951

Arrays can carry out a range of mathematical functions directly, as methods.
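For comparison, the mean and standard deviation of a list can be obtained with Python's built-in `statistics` module, which is not part of the original notes. This sketch assumes the same numbers as above. `np_array.std()` uses the population formula by default, so it matches `statistics.pstdev()` rather than `statistics.stdev()`.

```python
import statistics

import numpy as np

py_list = [1, 2, 3, 4, 5]
np_array = np.array(py_list)

# Mean and population standard deviation of a plain list.
list_mean = statistics.mean(py_list)
list_std = statistics.pstdev(py_list)

# The array methods give matching values.
array_mean = np_array.mean()
array_std = np_array.std()

print(list_mean, list_std, array_mean, array_std)
```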

TLDR:
- Operation on a list works on the **whole** list.
- Operation on an array works on the **individual elements** in the array.
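To illustrate the summary, the sketch below shows one possible way to get element-wise results from lists using list comprehensions, next to the array versions. The variable names are illustrative only.

```python
import numpy as np

py_list = [1, 2, 3, 4, 5]
py_list_2 = [10, 20, 30, 40, 50]
np_array = np.array(py_list)

# Element-wise work on a list needs an explicit loop (here, list comprehensions).
list_plus_10 = [x + 10 for x in py_list]
list_squared = [x**2 for x in py_list]
list_pairwise_sum = [a + b for a, b in zip(py_list, py_list_2)]

# The same operations on an array need no loop.
array_plus_10 = np_array + 10
array_squared = np_array**2

print(list_plus_10, array_plus_10)
```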
