Session 02: Data structures and functions

Objectives
- Understand the usefulness of data structures to organize information within our programs.
- Use lists and dictionaries and understand the main differences between them.
- Declare functions and use them successfully.

**LISTS**

Lists are one of the types of data structures that we can use in Python. A data structure helps us to organize data and to optimize the access, processing and use of data.

Lists are ordered sequences of data and look like this:

list_of_numbers = [1, 4, 7, 5, 3, 4, 6]

Each item in the list has an index, which is used to access that item. Since lists are ordered, the indices are assigned sequentially from the first element to the last. The first index is always 0, and therefore the last index is always n - 1, where n is the total number of elements in the list.

In [None]:
list_1 = [1, 4, 7, 5, 3, 4, 6]

So, to access the first and last elements of the list above, we do the following:

In [None]:
print(f'First element of the list: {list_1[0]}')
print(f'Last element of the list: {list_1[6]}')
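
Rather than counting the elements by hand, the last index can be computed from the length of the list. This is a minimal sketch that uses the built-in `len` function on the same `list_1`; the names `n` and `last_index` are chosen here only for illustration:

```python
list_1 = [1, 4, 7, 5, 3, 4, 6]

# len() returns the total number of elements, n
n = len(list_1)

# The first index is 0, so the last valid index is n - 1
last_index = n - 1

print(f'Number of elements: {n}')
print(f'Last index: {last_index}')
print(f'Last element of the list: {list_1[last_index]}')
```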

If we try to access an index that doesn't exist, because there aren't enough elements in the list, Python gives us an error:

In [None]:
list_1[10]
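
The error raised above is an `IndexError`. One way to avoid it, sketched here under the assumption that the index is not negative, is to compare the index with the length of the list before accessing the element. The variable `index` is introduced only for this example:

```python
list_1 = [1, 4, 7, 5, 3, 4, 6]
index = 10  # hypothetical index we want to read

# An index (zero or positive) is valid only if it is smaller than the length
if index < len(list_1):
    print(list_1[index])
else:
    print(f'Index {index} does not exist; the last valid index is {len(list_1) - 1}')
```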

Python also allows us to access elements counting backwards, from the last element to the first. This is done using negative indices: -1 accesses the last element, -2 accesses the penultimate element, and so on:


In [None]:
print(list_1[-1])
print(list_1[-2])

As you can imagine, lists are not limited to raw values (1, 2, 3, 4, etc.). We can also store data in variables and then store those variables in the list. The following two lists are equivalent:


In [None]:
list_with_raw_values = [1, 2, 3, 4, 5]

one = 1
two = 2
three = 3
four = 4
five = 5

list_with_variables = [one, two, three, four, five]

print(list_with_raw_values)
print(list_with_variables)


A list can contain any data type that we already know about (even other lists).

In [None]:
list_of_floats = [2.4, 5.67, 8.7, 9.34]
list_of_strings = ["John", "Pepe", "Pedro", "Jose"]
list_of_booleans = [True, False, False, True, False]
list_of_lists_of_ints = [[3, 4, 6], [7, 8, 9], [4, 6, 2]]
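
When a list contains other lists, indices can be chained: the first index selects an inner list and the second selects an element inside it. A short sketch using `list_of_lists_of_ints` from above (the names `second_list` and `last_of_second` are illustrative):

```python
list_of_lists_of_ints = [[3, 4, 6], [7, 8, 9], [4, 6, 2]]

# The first index selects one of the inner lists
second_list = list_of_lists_of_ints[1]

# A second index selects an element inside that inner list
last_of_second = list_of_lists_of_ints[1][2]

print(second_list)
print(last_of_second)
```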

*Modifying lists*


**append** adds an element to the end of the list:

In [None]:
lista_1 = [1, 2, 3, 4, 5, 6]

lista_1.append(7)

lista_1


**pop** removes the last element of the list if called with no arguments. If we pass it an index, it removes whatever element is stored at that index:

In [None]:
lista_2 = [1, 2, 3, 4, 5, 6]

lista_2.pop()

lista_2

In [None]:
lista_2.pop(1)

lista_2
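
Besides removing the element, `pop` also returns it, so the removed value can be stored in a variable. A minimal sketch that repeats the two calls above on a fresh copy of `lista_2` (the names `removed_last` and `removed_at_1` are illustrative):

```python
lista_2 = [1, 2, 3, 4, 5, 6]

# With no arguments, pop removes and returns the last element
removed_last = lista_2.pop()

# With an index, pop removes and returns the element stored at that index
removed_at_1 = lista_2.pop(1)

print(removed_last)
print(removed_at_1)
print(lista_2)
```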

**DICTIONARIES**

Dictionaries are our second basic data structure in Python. Unlike lists, dictionaries are not accessed by position: they contain key-value pairs instead of elements, and each value is reached through its key. (Since Python 3.7 dictionaries do preserve the order in which pairs were inserted, but we still look values up by key, not by index.) To access a value, simply pass the key we are looking for to the dictionary. Think of it like passing a URL to your browser to get a web page: somewhere there is a structure similar to a dictionary that relates the address we type with the IP of the server where the page is stored.


I find dictionaries especially valuable because of their similarity to the JSON format and because they are ideal for representing rows (samples) in a table, with the column name as the key and the cell value as the value. They also illustrate the idea of mapping: creating pairs of information that represent "the same thing" seen from different perspectives or at different levels of depth.
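
As a rough illustration of both points, the sketch below represents one table row as a dictionary and converts it to JSON text with the standard-library `json` module. The row contents (`name`, `age`, `city`) are invented for this example and do not come from any real dataset:

```python
import json

# Hypothetical table row: column names are the keys, cell values are the values
row = {"name": "Ana", "age": 31, "city": "Lima"}

# Access a cell by its column name
print(row["city"])

# Convert the dictionary to JSON text and back again
as_json = json.dumps(row)
restored = json.loads(as_json)

print(as_json)
print(restored == row)
```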


In [1]:
dictionary_1 = {
    "key_1": "value_1",
    "key_2": "value_2",
    "key_3": "value_3",
    "key_4": "value_4"
}

In [2]:
dictionary_1["key_1"]

'value_1'

In [None]:
dictionary_4 = {
    "int": 123,
    "float": 23.56,
    "string": "Hello",
    "boolean": True,
    "list": [1, 2, 3, 4],
    "dictionary": {
        1: "one",
        2: "two"
    }
}

In [None]:
print(dictionary_4["float"])
print(dictionary_4["boolean"])
print(dictionary_4["list"][3])
print(dictionary_4["dictionary"][1])

In [None]:
from pprint import pprint


In [None]:
contact_info = {
    "name": "Elizabeth",
    "tel": 5546352431,
    "dir": {
        "colony": "Del Valle Centro",
        "street": "Pillars",
        "num": 69,
        "zip": "03100"
    }
}

pprint(contact_info)

In [None]:
contact_info["email"] = "isabel.arriaga@gmail.com"

pprint(contact_info)

In [None]:
contact_info.pop("tel")

pprint(contact_info)
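
As with lists, `pop` on a dictionary returns the value it removes. The `in` operator can then confirm whether a key is still present. A small sketch using a reduced, hypothetical version of the contact dictionary (the names `contact` and `removed_tel` are illustrative):

```python
# Reduced copy of the contact dictionary, for illustration only
contact = {"name": "Elizabeth", "tel": 5546352431}

# pop removes the key-value pair and returns the value
removed_tel = contact.pop("tel")

print(removed_tel)
print("tel" in contact)   # the key is gone
print("name" in contact)  # this key is still there
```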

**FUNCTIONS**

Repeating code is one of the worst practices that we can have as programmers and analysts: it makes our code confusing and difficult to modify. That is what functions are for. They allow us to "encapsulate" a process so that it can be reused throughout our program.

Functions can help us take the complexity out of our program and make it easier to understand.

This is what a function looks like:


def i_am_a_function(i_am_a_parameter):

    # Any process that uses the parameter goes here
    new_variable = i_am_a_parameter

    return new_variable

In [None]:

def multiply_number_by_pi(number):
    result = number * 3.14
    
    return result
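
Defining a function does not run it; the code inside only executes when the function is called. The sketch below calls `multiply_number_by_pi` and, as an optional comparison that is not part of the original cell, computes the same product with the more precise `math.pi` constant from the standard library:

```python
import math


def multiply_number_by_pi(number):
    # 3.14 is a rough approximation of pi
    result = number * 3.14

    return result


# Calling the function with an argument runs its body
approx = multiply_number_by_pi(2)

# Same product using the standard library's value of pi
more_precise = 2 * math.pi

print(approx)
print(more_precise)
```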

In [None]:
def this_function_does_always_the_same():
    
    result = 2 * 10
    
    return result

print(this_function_does_always_the_same())
print(this_function_does_always_the_same())
print(this_function_does_always_the_same())

In [None]:
def add_number_to_list_if_number_is_even(numbers, number):

    # The parameter is called "numbers" to avoid hiding Python's built-in list
    if number % 2 == 0:
        numbers.append(number)

    return numbers

list_of_ints = [2, 34, 26, 88, 4]

list_of_ints = add_number_to_list_if_number_is_even(list_of_ints, 5)
list_of_ints = add_number_to_list_if_number_is_even(list_of_ints, 66)

list_of_ints
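
One detail worth noticing: `append` modifies the list that was passed in, so the function returns the very same list object rather than a copy. The sketch below checks this with the `is` operator; the parameter name `numbers` and the variable `returned` are chosen here for illustration:

```python
def add_number_to_list_if_number_is_even(numbers, number):
    # append changes the list that was passed in, in place
    if number % 2 == 0:
        numbers.append(number)

    return numbers


list_of_ints = [2, 34, 26, 88, 4]

returned = add_number_to_list_if_number_is_even(list_of_ints, 66)

# The original list already contains 66, even without reassigning it
print(list_of_ints)

# Both names refer to the same list object
print(returned is list_of_ints)
```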

In [None]:
def returns_true_if_the_value_is_between_50_and_60(value):

    # A chained comparison checks both limits at once (50 and 60 excluded)
    return 50 < value < 60

result_1 = returns_true_if_the_value_is_between_50_and_60(58)
result_2 = returns_true_if_the_value_is_between_50_and_60(89)

if result_1:
    print("The first value is greater than 50 and less than 60")

if result_2:
    print("The second value is greater than 50 and less than 60")