<div style="text-align:left;font-size:2em"><span style="font-weight:bolder;font-size:1.25em">SP2273 | Learning Portfolio</span><br><br><span style="font-weight:bold;color:darkred">Storing Data (Need)</span></div>

**Ways to store data**:
1. Lists
2. NumPy arrays
3. Dictionaries
4. Tuples
5. Dataframes
6. Classes

In [4]:
import numpy as np

# 1 Lists, Arrays & Dictionaries

## 1.1 Let’s compare

**Python Lists**

In [1]:
py_super_names = ["Black Widow", "Iron Man", "Doctor Strange"]
py_real_names = ["Natasha Romanoff", "Tony Stark", "Stephen Strange"]

**NumPy Arrays**

In [5]:
np_super_names = np.array(["Black Widow", "Iron Man", "Doctor Strange"])
np_real_names = np.array(["Natasha Romanoff", "Tony Stark", "Stephen Strange"])

**Dictionary**

In [3]:
superhero_info = {
    "Natasha Romanoff": "Black Widow",
    "Tony Stark": "Iron Man",
    "Stephen Strange": "Doctor Strange"
}

Notice:

- Dictionaries use a key and an associated value separated by a `:`
- The dictionary very elegantly holds the real and superhero names in one structure while we need two lists (or arrays) for the same data.
- For lists and arrays, the order matters: ‘Iron Man’ must be in the same position as ‘Tony Stark’ for things to work.
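
To make the point about order concrete, here is a minimal sketch (not part of the original notes) that looks up the same superhero name both ways. The variable names `position`, `from_lists` and `from_dict` are made up for illustration.

```python
py_super_names = ["Black Widow", "Iron Man", "Doctor Strange"]
py_real_names = ["Natasha Romanoff", "Tony Stark", "Stephen Strange"]

# With two lists, the pairing relies purely on matching positions.
position = py_real_names.index("Tony Stark")   # find where the real name sits
from_lists = py_super_names[position]          # read the same position in the other list

# zip() pairs the items by position; dict() turns each pair into key: value.
superhero_info = dict(zip(py_real_names, py_super_names))
from_dict = superhero_info["Tony Stark"]       # direct lookup by key

print(from_lists, "|", from_dict)
```

If the two lists were ever reordered independently, the list lookup would silently return the wrong name, whereas the dictionary keeps each pair together.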

## 1.2 Accessing data from a list (or array)

Python is a zero-indexed language:

![](https://phyweb.physics.nus.edu.sg/~chammika/sp2273/docs/python_basics/03_storing-data/python-zero-indexed-counting.png)

To access a particular element in a list (or array), you need to specify its index, counting from zero.

In [2]:
py_super_names = ["Black Widow", "Iron Man", "Doctor Strange"]
py_real_names = ["Natasha Romanoff", "Tony Stark", "Stephen Strange"]

**Example 1**:

In [7]:
py_real_names[0]

'Natasha Romanoff'

**Example 2**:

In [8]:
py_super_names[0]

'Black Widow'

In [9]:
i = 2
print(py_real_names[i], 'is', py_super_names[i])

Stephen Strange is Doctor Strange


**Example 3**:

In [3]:
py_super_names[2]    # Forward indexing 
                     # We need to know the size 
                     # beforehand for this to work.

'Doctor Strange'

In [4]:
py_super_names[-1]   # Reverse indexing

'Doctor Strange'
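
As a small sketch of why reverse indexing is convenient (an illustration, not from the original notes): forward indexing to the last element needs the length of the list, while `-1` does not. The name `last_index` is hypothetical.

```python
import numpy as np

py_super_names = ["Black Widow", "Iron Man", "Doctor Strange"]
np_super_names = np.array(py_super_names)

# Forward indexing: the last valid index is always len(...) - 1.
last_index = len(py_super_names) - 1
forward = py_super_names[last_index]

# Reverse indexing: -1 is the last element, whatever the length.
reverse = py_super_names[-1]

# NumPy arrays follow the same indexing rules.
np_last = np_super_names[-1]

print(forward, "|", reverse, "|", np_last)
```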

## 1.3 Accessing data from a dictionary

Dictionaries hold data (values) paired with a key, i.e. you can access the value (in this case, the superhero name) using the real name as the key.
<br>Dictionaries have a **key-value** structure.

In [5]:
superhero_info = {
    "Natasha Romanoff": "Black Widow",
    "Tony Stark": "Iron Man",
    "Stephen Strange": "Doctor Strange"
}                  

In [6]:
superhero_info["Natasha Romanoff"]

'Black Widow'
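
Beyond looking up a single key, a dictionary can also hand back all of its keys or values. The sketch below is an illustration rather than part of the original notes; `"Bruce Banner"` is a made-up key that is deliberately absent from the dictionary.

```python
superhero_info = {
    "Natasha Romanoff": "Black Widow",
    "Tony Stark": "Iron Man",
    "Stephen Strange": "Doctor Strange"
}

real_names = list(superhero_info.keys())      # all the keys
super_names = list(superhero_info.values())   # all the values

# superhero_info["Bruce Banner"] would raise a KeyError,
# so .get() is used here to supply a fallback value instead.
unknown = superhero_info.get("Bruce Banner", "not in dictionary")

print(real_names)
print(super_names)
print(unknown)
```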

## 1.4 Higher dimensional lists

Unlike a dictionary, lists needed two separate structures to store the corresponding real and superhero names. To avoid this, we can use a single 2D list (or array) as follows:

In [8]:
py_superhero_info = [['Natasha Romanoff', 'Black Widow'],
                     ['Tony Stark', 'Iron Man'],
                     ['Stephen Strange', 'Doctor Strange']]
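
A short sketch of how elements of such a 2D structure might be accessed (an illustration, not from the original notes). The list needs one index per level, while the NumPy version also accepts both indices in a single pair of brackets.

```python
import numpy as np

py_superhero_info = [['Natasha Romanoff', 'Black Widow'],
                     ['Tony Stark', 'Iron Man'],
                     ['Stephen Strange', 'Doctor Strange']]

row = py_superhero_info[1]            # the whole second row (a list)
real_name = py_superhero_info[1][0]   # second row, first column

np_superhero_info = np.array(py_superhero_info)
np_real_name = np_superhero_info[1, 0]      # same element, array-style indexing
np_super_names = np_superhero_info[:, 1]    # every row, second column

print(row)
print(real_name, "|", np_real_name)
print(np_super_names)
```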

# 2 Lists vs. Arrays

## 2.1 Size

In [12]:
import numpy as np

In [13]:
py_list_2d = [[1, "A"], [2, "B"], [3, "C"], [4, "D"],
              [5, "E"], [6, "F"], [7, "G"], [8, "H"],
              [9, "I"], [10, "J"]]

np_array_2d = np.array(py_list_2d)      # Reusing the Python list 
                                        # to create a NEW
                                        # NumPy array

In [14]:
# Lists
len(py_list_2d)

10

In [16]:
# Arrays
len(np_array_2d)

10

In [17]:
# shape is not a function. It is a property (attribute) of the NumPy array.
np_array_2d.shape

(10, 2)

## 2.2 Arrays are fussy about type

While a list can hold multiple data types, a NumPy array converts all of its elements to a single type.

In [18]:
py_list = [1, 1.5, 'A']
np_array = np.array(py_list)

In [20]:
# Lists
py_list

[1, 1.5, 'A']

In [21]:
# Arrays
np_array   # Notice how the numbers are converted into strings.

array(['1', '1.5', 'A'], dtype='<U32')
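
The `dtype` attribute shows which single type NumPy settled on. The sketch below is an illustration under the assumption of default NumPy behaviour; the names `np_numbers`, `np_mixed` and `recovered` are made up.

```python
import numpy as np

np_numbers = np.array([1, 1.5])        # int + float: everything becomes float
np_mixed = np.array([1, 1.5, 'A'])     # number + text: everything becomes text

print(np_numbers.dtype)
print(np_mixed.dtype)

# The first two entries of np_mixed are now the strings '1' and '1.5'.
# astype(float) converts them back to numbers; it would fail on 'A'.
recovered = np_mixed[:2].astype(float)
print(recovered)
```

NumPy picks a type that can represent every element, which is why a single string is enough to turn the whole array into text.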

## 2.3 Adding a number

In [22]:
py_list = [1, 2, 3, 4, 5]
np_array = np.array(py_list)         # Reusing the Python list
                                     # to create a NEW
                                     # NumPy array

In [23]:
# Lists
py_list + 10        # Won't work!

TypeError: can only concatenate list (not "int") to list

In [25]:
# Arrays
np_array + 10

array([11, 12, 13, 14, 15])
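
A list can still have 10 added to every element, but the element-by-element step has to be written out explicitly. A minimal sketch using a list comprehension (not part of the original notes; `py_list_plus_10` is a made-up name):

```python
py_list = [1, 2, 3, 4, 5]

# Visit each element in turn and add 10 to it, building a new list.
py_list_plus_10 = [x + 10 for x in py_list]

print(py_list_plus_10)   # [11, 12, 13, 14, 15]
```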

## 2.4 Adding another list

Adding two lists joins them into one longer list, while adding two arrays performs an element-wise operation.

In [26]:
py_list_1 = [1, 2, 3, 4, 5]
py_list_2 = [10, 20, 30, 40, 50]

np_array_1 = np.array(py_list_1)
np_array_2 = np.array(py_list_2)

In [27]:
# Lists
py_list_1 + py_list_2

[1, 2, 3, 4, 5, 10, 20, 30, 40, 50]

In [28]:
# Arrays
np_array_1 + np_array_2

array([11, 22, 33, 44, 55])
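
Each behaviour is still available for the other structure; it just needs to be asked for explicitly. The sketch below is an illustration, not from the original notes, and the names `list_elementwise` and `array_joined` are made up.

```python
import numpy as np

py_list_1 = [1, 2, 3, 4, 5]
py_list_2 = [10, 20, 30, 40, 50]
np_array_1 = np.array(py_list_1)
np_array_2 = np.array(py_list_2)

# Element-wise addition of lists: pair items up with zip() and add each pair.
list_elementwise = [a + b for a, b in zip(py_list_1, py_list_2)]

# Joining arrays end to end: np.concatenate() takes a sequence of arrays.
array_joined = np.concatenate([np_array_1, np_array_2])

print(list_elementwise)
print(array_joined)
```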

## 2.5 Multiplying by a Number

Again, multiplying by a number makes a list grow (its contents are repeated), whereas an array multiplies each of its elements by the number.

In [29]:
py_list = [1, 2, 3, 4, 5]
np_array = np.array(py_list)         

In [30]:
# Lists
py_list*2

[1, 2, 3, 4, 5, 1, 2, 3, 4, 5]

In [31]:
# Arrays
np_array*2

array([ 2,  4,  6,  8, 10])

## 2.6 Squaring

In [32]:
py_list = [1, 2, 3, 4, 5]
np_array = np.array(py_list)

In [33]:
# Lists
py_list**2                      # Won't work!  

TypeError: unsupported operand type(s) for ** or pow(): 'list' and 'int'

In [34]:
# Arrays
np_array**2

array([ 1,  4,  9, 16, 25])

## 2.7 Asking questions

In [35]:
py_list = [1, 2, 3, 4, 5]
np_array = np.array(py_list)         


In [36]:
# Lists_Example_1
py_list == 3     # Works, but what IS the question?

False

In [37]:
# Lists_Example_2
py_list > 3      # Won't work!

TypeError: '>' not supported between instances of 'list' and 'int'

In [38]:
# Arrays_Example_1
np_array == 3  

array([False, False,  True, False, False])

In [39]:
# Arrays_Example_2
np_array > 3  

array([False, False, False,  True,  True])
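
For the list, `py_list == 3` asks whether the whole list is equal to the number 3, which is why the answer is a single `False`. The sketch below (an illustration, not from the original notes) shows one way to ask the per-element question for a list, and what the array of `True`/`False` answers can be used for. The names `has_three`, `mask`, `above_three` and `how_many` are made up.

```python
import numpy as np

py_list = [1, 2, 3, 4, 5]
np_array = np.array(py_list)

# Lists: the 'in' operator asks "is 3 one of the elements?"
has_three = 3 in py_list

# Arrays: the comparison gives one True/False answer per element.
mask = np_array > 3

above_three = np_array[mask]   # keep only the elements where the answer is True
how_many = mask.sum()          # True counts as 1, False as 0

print(has_three)
print(above_three)
print(how_many)
```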

## 2.8 Mathematics

In [40]:
py_list = [1, 2, 3, 4, 5]
np_array = np.array(py_list)         

In [41]:
# Lists_Example_1
sum(py_list)     # sum() is a base Python function

15

In [43]:
# Lists_Example_2
max(py_list)     # max() is a base Python function

5

In [44]:
# Lists_Example_3
min(py_list)     # min() is a base Python function

1

In [45]:
# Lists_Example_4
py_list.sum()   # Won't work!

AttributeError: 'list' object has no attribute 'sum'

In [46]:
# Arrays_Example_1
np_array.sum()

15

In [47]:
# Arrays_Example_2
np_array.max()

5

In [48]:
# Arrays_Example_3
np_array.min()

1

In [50]:
# Arrays_Example_4
np_array.mean()

3.0

In [52]:
# Arrays_Example_5
np_array.std()

1.4142135623730951
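
Two small additions to the examples above, offered as a sketch rather than as part of the original notes: NumPy's functions also accept plain lists, and `std()` divides by the number of elements unless told otherwise through its `ddof` argument. The names `list_mean`, `population_std` and `sample_std` are made up.

```python
import numpy as np

py_list = [1, 2, 3, 4, 5]
np_array = np.array(py_list)

# np.mean() accepts a list directly, even though a list has no .mean() method.
list_mean = np.mean(py_list)

# Default (ddof=0): divide by N, the population standard deviation.
population_std = np_array.std()

# ddof=1: divide by N - 1, the sample standard deviation.
sample_std = np_array.std(ddof=1)

print(list_mean)
print(population_std)
print(sample_std)
```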

In general, an operation on a list works on the list as a whole, while an operation on an array works on the individual elements of the array.