<div style="text-align:left;font-size:2em"><span style="font-weight:bolder;font-size:1.25em">SP2273 | Learning Portfolio</span><br><br><span style="font-weight:bold;color:darkred">Storing Data (Need)</span></div>

# What to expect in this chapter

I cannot overemphasize how important it is for you to understand how to store, retrieve and modify data in programming. The structures you use to hold data will influence how you think about that data. This will ultimately aid (or hinder) your ability to come up with algorithms to solve problems.

# 1 Lists, Arrays & Dictionaries

There are three basic ways of storing data:

- lists,
- NumPy arrays and
- dictionaries.


Dictionaries pair each key with an associated value, separated by a colon (:).

The dictionary very elegantly holds the real and superhero names in one structure while we need two lists (or arrays) for the same data.

For lists and arrays, the order matters, i.e. ‘Iron Man’ must be in the same position as ‘Tony Stark’ for things to work.


## 1.1 Let’s compare

In [40]:
import numpy as np

py_super_names = ["Black Widow", "Iron Man", "Doctor Strange"]
py_real_names = ["Natasha Romanoff", "Tony Stark", "Stephen Strange"]

np_super_names = np.array(["Black Widow", "Iron Man", "Doctor Strange"])
np_real_names = np.array(["Natasha Romanoff", "Tony Stark", "Stephen Strange"])

superhero_info = {
    "Natasha Romanoff": "Black Widow",
    "Tony Stark": "Iron Man",
    "Stephen Strange": "Doctor Strange"
}
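
The point about order can be checked directly. The following is a minimal sketch that reuses the names from the cell above; the variable names `position`, `hero_from_lists` and `hero_from_dict` are my own and only serve the illustration.

```python
# Two parallel lists: position i in one list corresponds to position i in the other.
py_real_names = ["Natasha Romanoff", "Tony Stark", "Stephen Strange"]
py_super_names = ["Black Widow", "Iron Man", "Doctor Strange"]

# With lists, we first find the position, then use it in the other list.
position = py_real_names.index("Tony Stark")
hero_from_lists = py_super_names[position]

# With a dictionary, the key leads straight to the value.
superhero_info = dict(zip(py_real_names, py_super_names))
hero_from_dict = superhero_info["Tony Stark"]

print(position, hero_from_lists, hero_from_dict)
```

If the two lists were ever sorted or edited separately, the positions would no longer match, whereas the dictionary keeps each pair together.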


## 1.2 Accessing data from a list (or array)

In [19]:
py_super_names = ["Black Widow", "Iron Man", "Doctor Strange"]
py_real_names = ["Natasha Romanoff", "Tony Stark", "Stephen Strange"]

py_real_names[0]

'Natasha Romanoff'

In [18]:
py_super_names[0]

'Black Widow'

In [20]:
py_super_names[2]    # Forward indexing 
                     # We need to know the size 
                     # beforehand for this to work.

'Doctor Strange'

In [21]:
py_super_names[-1]   # Reverse indexing

'Doctor Strange'

Data in lists (and arrays) are accessed using a zero-based index, i.e. counting starts from 0, not 1.
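
As a small sketch of what zero-based indexing implies, the last valid forward index is always one less than the length of the list. The names `n`, `first`, `last_forward` and `last_reverse` below are just for illustration.

```python
py_super_names = ["Black Widow", "Iron Man", "Doctor Strange"]

n = len(py_super_names)                # 3 elements, so valid forward indices are 0, 1, 2

first = py_super_names[0]              # Forward index of the first element
last_forward = py_super_names[n - 1]   # Forward index of the last element
last_reverse = py_super_names[-1]      # Reverse index; no need to know n

print(first, last_forward, last_reverse)

# Index n itself is one past the end and raises an IndexError.
try:
    py_super_names[n]
except IndexError as error:
    print("IndexError:", error)
```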

## 1.3 Accessing data from a dictionary

In [22]:
superhero_info = {
    "Natasha Romanoff": "Black Widow",
    "Tony Stark": "Iron Man",
    "Stephen Strange": "Doctor Strange"
}                  
superhero_info["Natasha Romanoff"]

'Black Widow'

Remember that dictionaries have a key-value structure.

In [23]:
superhero_info.keys()

dict_keys(['Natasha Romanoff', 'Tony Stark', 'Stephen Strange'])

In [24]:
superhero_info.values()

dict_values(['Black Widow', 'Iron Man', 'Doctor Strange'])
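
Keys and values can also be visited together. The sketch below uses `.items()` and `.get()`, which are standard dictionary methods; the name "Peter Parker" is a made-up missing key used only to show what happens when a lookup fails.

```python
superhero_info = {
    "Natasha Romanoff": "Black Widow",
    "Tony Stark": "Iron Man",
    "Stephen Strange": "Doctor Strange"
}

# .items() yields (key, value) pairs, which is handy in a loop.
for real_name, super_name in superhero_info.items():
    print(f"{real_name} is {super_name}")

# Square brackets raise a KeyError for a missing key;
# .get() returns a fallback value instead.
unknown = superhero_info.get("Peter Parker", "Not in the dictionary")
print(unknown)
```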

## 1.4 Higher dimensional lists

In [35]:
py_superhero_info = [['Natasha Romanoff', 'Black Widow'],
                     ['Tony Stark', 'Iron Man'],
                     ['Stephen Strange', 'Doctor Strange']]

py_superhero_info[2]


['Stephen Strange', 'Doctor Strange']
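
To reach a single element of a 2D list, a second index can be chained onto the first. This is a minimal sketch; the array version `np_superhero_info` is my own addition for comparison and is not part of the cell above.

```python
import numpy as np

py_superhero_info = [['Natasha Romanoff', 'Black Widow'],
                     ['Tony Stark', 'Iron Man'],
                     ['Stephen Strange', 'Doctor Strange']]

row = py_superhero_info[2]               # The third inner list
super_name = py_superhero_info[2][1]     # Second element of the third inner list

# A 2D NumPy array accepts both indices inside one pair of square brackets.
np_superhero_info = np.array(py_superhero_info)
np_super_name = np_superhero_info[2, 1]

print(row)
print(super_name, np_super_name)
```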

# 2 Lists vs. Arrays

## 2.1 Size

In [47]:
import numpy as np

py_list_2d = [[1, "A"], [2, "B"], [3, "C"], [4, "D"],
              [5, "E"], [6, "F"], [7, "G"], [8, "H"],
              [9, "I"], [10, "J"]]

np_array_2d = np.array(py_list_2d)      # Reusing the Python list 
                                        # to create a NEW
                                        # NumPy array

py_list_2d

[[1, 'A'],
 [2, 'B'],
 [3, 'C'],
 [4, 'D'],
 [5, 'E'],
 [6, 'F'],
 [7, 'G'],
 [8, 'H'],
 [9, 'I'],
 [10, 'J']]

In [51]:
#lists

len(py_list_2d)

10

In [52]:
#arrays

len(np_array_2d)

10

In [54]:
np_array_2d.shape

#Notice the absence of parentheses ( ) after shape above. This is because shape is not a function. Instead, it is a property or attribute of the NumPy array.

(10, 2)
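
A list has no `shape`, so its dimensions have to be counted by hand. The sketch below uses a shorter list than the one above to keep the output small; `ndim` and `size` are two further NumPy attributes shown for comparison.

```python
import numpy as np

py_list_2d = [[1, "A"], [2, "B"], [3, "C"]]
np_array_2d = np.array(py_list_2d)

rows = len(py_list_2d)         # Number of inner lists
cols = len(py_list_2d[0])      # Length of the first inner list only

print(rows, cols)              # Counted by hand for the list
print(np_array_2d.shape)       # (rows, columns) as a tuple
print(np_array_2d.ndim)        # Number of dimensions
print(np_array_2d.size)        # Total number of elements
```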

## 2.2 Arrays are fussy about type

In [58]:
py_list = [1, 1.5, 'A']
np_array = np.array(py_list)

py_list

[1, 1.5, 'A']

In [60]:
np_array

#Remember that NumPy arrays tolerate only a single type, so here the numbers have been converted to strings.

array(['1', '1.5', 'A'], dtype='<U32')

In [3]:
#What happens when we have only int and float type data?
import numpy as np
exam_list = [1, 2.5, 3, 4.0, 5, 8, 9.67]

exam_array = np.array(exam_list)

exam_list

[1, 2.5, 3, 4.0, 5, 8, 9.67]

In [5]:
exam_array

#NumPy has converted every element to a float because the data contain both int and float types.

#It chooses the most general type needed to hold all the values; in this case, converting everything to float is sufficient.

array([1.  , 2.5 , 3.  , 4.  , 5.  , 8.  , 9.67])
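
The type that NumPy settled on can be inspected through the `dtype` attribute. This is a rough sketch with made-up array names; `astype()` is included only to show that a conversion can also be requested explicitly.

```python
import numpy as np

int_array = np.array([1, 2, 3])
mixed_array = np.array([1, 2.5, 3])
text_array = np.array([1, 1.5, "A"])

print(int_array.dtype.kind)    # 'i' means integer
print(mixed_array.dtype)       # float64
print(text_array.dtype.kind)   # 'U' means Unicode string

# astype() makes a NEW array of the requested type.
# Converting floats to int drops the fractional part.
as_int = mixed_array.astype(int)
print(as_int)
```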

## 2.3 Adding a number

In [63]:
py_list = [1, 2, 3, 4, 5]
np_array = np.array(py_list)         # Reusing the Python list
                                     # to create a NEW
                                     # NumPy array

#py_list + 10        # Won't work!
np_array + 10

array([11, 12, 13, 14, 15])
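
For completeness, here is a sketch of why the list version fails and one common way to get the same result from a list, using a list comprehension. The names `py_result` and `np_result` are only for illustration.

```python
import numpy as np

py_list = [1, 2, 3, 4, 5]

# A list and an int cannot be added, so Python raises a TypeError.
try:
    py_list + 10
except TypeError as error:
    print("TypeError:", error)

# With a list, each element has to be handled explicitly.
py_result = [x + 10 for x in py_list]

# With an array, the number is added to every element for us.
np_result = np.array(py_list) + 10

print(py_result)
print(np_result)
```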

## 2.4 Adding another list

In [70]:
py_list_1 = [1, 2, 3, 4, 5]
py_list_2 = [10, 20, 30, 40, 50]

np_array_1 = np.array(py_list_1)
np_array_2 = np.array(py_list_2)

#Lists
py_list_1 + py_list_2

[1, 2, 3, 4, 5, 10, 20, 30, 40, 50]

In [69]:
#Arrays
np_array_1 + np_array_2

#So, adding lists causes them to grow while adding arrays is an element-wise operation.

array([11, 22, 33, 44, 55])
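
Each structure can also do what the other does by default, with a little more work. The sketch below assumes `zip()` for element-wise addition of lists and `np.concatenate()` for joining arrays end to end; both exist, but they are my choice of tools rather than something covered in the cells above.

```python
import numpy as np

py_list_1 = [1, 2, 3, 4, 5]
py_list_2 = [10, 20, 30, 40, 50]

np_array_1 = np.array(py_list_1)
np_array_2 = np.array(py_list_2)

# Element-wise addition of two lists: pair the elements up with zip().
py_elementwise = [a + b for a, b in zip(py_list_1, py_list_2)]

# Joining two arrays end to end: use np.concatenate().
np_joined = np.concatenate([np_array_1, np_array_2])

print(py_elementwise)
print(np_joined)
```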

## 2.5 Multiplying by a Number

In [67]:
py_list = [1, 2, 3, 4, 5]
np_array = np.array(py_list)         

#Lists
py_list*2

[1, 2, 3, 4, 5, 1, 2, 3, 4, 5]

In [74]:
#Arrays

np_array*2

#So multiplying by a number makes a list grow, whereas an array multiplies its elements by the number!

array([ 2,  4,  6,  8, 10])

## 2.6 Squaring

In [75]:
py_list = [1, 2, 3, 4, 5]
np_array = np.array(py_list)

#py_list**2                      # Won't work!  
np_array**2

array([ 1,  4,  9, 16, 25])

## 2.7 Asking questions

In [8]:
py_list = [1, 2, 3, 4, 5]
np_array = np.array(py_list)         

py_list == 3     # Works, but what IS the question?

#This asks whether the whole list is equal to the integer 3. A list with 5 elements can never be equal to a single integer, so the answer is False.
#Asking the same question of an array makes NumPy compare each and every element with 3.
#The result is then False, False, True, False, False, since only one element is equal to 3.

False

In [81]:
#py_list > 3      # Won't work!
np_array == 3  

array([False, False,  True, False, False])

In [82]:
np_array > 3  

array([False, False, False,  True,  True])
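
The True/False answers become more useful once they are put to work. The following is a sketch, not part of the original cells: it uses the array of answers (named `mask` here) to pick out and count elements, and shows the nearest list equivalents.

```python
import numpy as np

py_list = [1, 2, 3, 4, 5]
np_array = np.array(py_list)

mask = np_array > 3            # One True/False answer per element
selected = np_array[mask]      # Keep only the elements where the answer is True
how_many = mask.sum()          # True counts as 1, False as 0

print(mask)
print(selected, how_many)

# For a list, 'in' asks whether 3 appears anywhere,
# and a list comprehension asks the question element by element.
has_three = 3 in py_list
py_answers = [x > 3 for x in py_list]

print(has_three, py_answers)
```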

## 2.8 Mathematics

In [83]:
py_list = [1, 2, 3, 4, 5]
np_array = np.array(py_list)     

sum(py_list)     # sum() is a base Python function

15

In [84]:
max(py_list)     # max() is a base Python function

5

In [87]:
min(py_list)     # min() is a base Python function

#py_list.sum()   # Won't work!

1

In [88]:
#Arrays

np_array.sum()

15

In [89]:
np_array.max()

5

In [90]:
np_array.min()

1

In [91]:
np_array.mean()

3.0

In [93]:
np_array.std()

#Roughly speaking, an operation on a list works on the whole list. In contrast, an operation on an array works on the individual elements of the array.

1.4142135623730951
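
Lists have no `.mean()` or `.std()`, but the `statistics` module in Python's standard library offers equivalents. The sketch below assumes that module and compares it with NumPy. By default `std()` computes the population standard deviation (`ddof=0`), which matches `statistics.pstdev()`; passing `ddof=1` gives the sample version, which matches `statistics.stdev()`.

```python
import statistics
import numpy as np

py_list = [1, 2, 3, 4, 5]
np_array = np.array(py_list)

list_mean = statistics.mean(py_list)
list_std = statistics.pstdev(py_list)        # Population standard deviation

array_std = np_array.std()                   # Population version (ddof=0) by default
array_sample_std = np_array.std(ddof=1)      # Sample version
list_sample_std = statistics.stdev(py_list)  # Sample standard deviation

print(list_mean, list_std, array_std)
print(list_sample_std, array_sample_std)
```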

# Exercises & Self-Assessment

In [94]:



#Done in Exercise



