**Introductory and intermediate computing for Data Science [Barcelona School of Economics]**

`Instructor:` Maxim Fedotov  
`Program:` M.Sc. in Data Science Methodology

# Class 3

## Sequence types: lists, tuples, ranges
Let's dive into several very important built-in sequence types: `list`, `tuple` (and touch on `range` a bit). Their nature follows from the name: these data structures contain ordered collections of objects. Their *displays* work as follows:

In [None]:
typical_cup_volumes = [0.33, 0.5]
clients_addresses = ("Ramon Trias Fargas, 25-27", "Roc Boronat, 138", "Carrer de la Mercè, 12",
                     "Doctor Aiguader, 80", "Passeig Pujades, 1", "Balmes, 132-134")
clients_ids = range(len(clients_addresses))

The above cell shows expressions that create such objects. How do we make Python understand that an object is a tuple of one element? Try it out here:

In [None]:
# create and print (or just call) a variable which is supposed to be a tuple of one element


As you have seen already, the `+` and `*` operators can be used with lists and tuples: `+` concatenates two sequences and `*` repeats a sequence.

In [None]:
typical_cup_volumes = [0] + typical_cup_volumes
print(f"A healthy option would be to have {typical_cup_volumes[0]} liters of a sweet sparkling drink.")
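Concatenation and repetition both build a new sequence and leave the operands untouched. A minimal sketch with illustrative values (the names `sizes` and `pair` are made up for this example and are not part of the class material):

```python
# Illustrative values; these names are assumptions made for this sketch.
sizes = [0.33, 0.5]
pair = ("small", "large")

longer = sizes + [0.75]      # concatenation builds a new list
repeated = pair * 2          # repetition builds a new tuple

print(longer)
print(repeated)
print(sizes)                 # the original list is unchanged
```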

As you can see, we can access a value by specifying its index in square brackets: `typical_cup_volumes[0]` (by convention, there is no space between the name of a variable and the left square bracket). A negative integer also works as an index: its absolute value specifies the position of an element counting from the end of the list. We can select several elements at once by *slicing*. Slicing uses the same square brackets as selecting a single element, but with a specific form inside: `[start:end:step]`, where the element at position `end` is not included. Note that it is not necessary to specify all three parts.

In [None]:
print("But I still sometimes drink {:.3f}".format(typical_cup_volumes[-1]))

clients_addresses_reverted = clients_addresses[::-1]
print("These are addresses in reverse order:", "; ".join(clients_addresses_reverted))

print("IDs of the first two clients:", clients_ids[:2])  # note that you get the specific result here because 
                                                         # the variable is a range. 
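To see how the three parts of a slice interact, here is a small sketch on an illustrative list (`letters` is a name invented for this example):

```python
# An illustrative list; `letters` is a name made up for this sketch.
letters = ["a", "b", "c", "d", "e", "f"]

first_three = letters[:3]        # start omitted: from the beginning
from_third = letters[2:]         # end omitted: up to the end
every_second = letters[::2]      # step of 2
middle = letters[1:-1]           # negative indices count from the end
last = letters[-1]               # a single element, not a list

print(first_three, from_third, every_second, middle, last)
```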

We can change values of lists by using an assignment statement of the following form:  
`list_identifier[index | slice] = new_value`

Note that if you provide an index that is out of range, you get an `IndexError`. A slice that reaches beyond the end of the list does not raise an error.
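A minimal sketch of both forms of assignment, and of the error raised by an index outside the list. The list `readings` is an illustrative stand-in, not a variable from the class:

```python
# `readings` is an illustrative list, not a variable from the class.
readings = [10, 20, 30, 40]

readings[0] = 11              # replace one element
readings[1:3] = [21, 31]      # replace a slice with the items of another iterable
print(readings)

try:
    readings[10] = 0          # index outside the list
except IndexError as error:
    print("IndexError:", error)
```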

The main difference between a list and a tuple is that the former is *mutable* and the latter is *immutable*. It means that we can freely change values of elements in a list, but not in a tuple.

In [None]:
# try to change the first element of typical_cup_volumes to 0.180

# now try to change any of the entries of clients_addresses


Note that Python treats several expressions separated by commas as a tuple. This allows us to use an elegant expression when we work with several variables at the same time. For example, the basic computer science problem of swapping the values of two variables can be solved simply like this:

In [None]:
value_1 = 1
value_2 = 2

value_1, value_2 = value_2, value_1

print(value_1, value_2)

This is typically called tuple *destructuring* (or *unpacking*): the tuple on the right-hand side is split into the names on the left-hand side.
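The same mechanism works for any number of names, as long as the counts match. A short sketch with made-up data (the tuple `address` and the target names are assumptions for illustration):

```python
# Illustrative data; `address` and the target names are assumptions.
address = ("Roc Boronat", 138, "Barcelona")

street, number, city = address        # one name per element
print(street, number, city)

first, *rest = address                # a starred name collects the remainder in a list
print(first, rest)
```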

## For loops

To really make use of the sequence data structures, we have *loops* at our disposal. There are two types of loops in Python: `for` and `while`. We start with the former, as `for` loops are used in list comprehensions (a critically useful tool).

Below you can find an example of a simple for loop:

In [None]:
kilocalories_drink = 37

kilocalories_portions = []

for volume in typical_cup_volumes:
    kilocalories_portions.append(calories := volume * 10 * kilocalories_drink)
    print(f"There are {calories:.1f} kcal in {volume * 1000:n} ml of the drink")

The basic contents of a for loop are:
* A keyword `for`
* An arbitrary identifier for a single element at each iteration (here it's `volume`)
* A keyword `in` which indicates that at each iteration we take one element of the iterable object that we specify right after.
* An identifier of an iterable which we want to loop through (here it's `typical_cup_volumes`) which is followed by `:`.
* Then comes the body of the loop. Do not forget about correct indentation.
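The same structure applies to any iterable. The sketch below loops over a tuple with the built-in `enumerate`, which yields `(index, element)` pairs, and over a `range`. The tuple `streets` and the other names are illustrative assumptions:

```python
# Illustrative tuple; the names below are assumptions for this sketch.
streets = ("Balmes", "Roc Boronat", "Passeig Pujades")

# enumerate yields (index, element) pairs, which are unpacked in the loop header
labels = []
for position, street in enumerate(streets):
    labels.append(f"{position}: {street}")
print(labels)

# range(3) produces the integers 0, 1, 2 one at a time
total = 0
for number in range(3):
    total += number
print(total)
```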

## List comprehensions

We could do the same thing using a list comprehension.

In [None]:
kilocalories_drink = 37  # typical value per 100 ml of a sweet sparkling drink
kilocalories_portions = [volume * 10 * kilocalories_drink for volume in typical_cup_volumes]
print(*kilocalories_portions, sep=' | ')

This list comprehension implements *mapping*, i.e. we apply a specific action to each element of the list.
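To make the idea of mapping concrete, the sketch below computes the same list in three ways: an explicit loop, a comprehension, and the built-in `map`. The values and the name `kcal_per_100_ml` are assumptions chosen to mirror the cells above:

```python
volumes = [0, 0.33, 0.5]      # illustrative values, in liters
kcal_per_100_ml = 37          # assumed value, mirroring the cells above

# explicit loop
with_loop = []
for volume in volumes:
    with_loop.append(volume * 10 * kcal_per_100_ml)

# equivalent list comprehension
with_comprehension = [volume * 10 * kcal_per_100_ml for volume in volumes]

# built-in map with a small lambda; list() materialises the result
with_map = list(map(lambda volume: volume * 10 * kcal_per_100_ml, volumes))

print(with_loop == with_comprehension == with_map)
```

The comprehension is usually the most readable of the three when the action fits on one line.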

We could also define a function and call it inside the comprehension.

In [None]:
def kcal_portion(volume: int | float, kcal_drink: int | float) -> float:
    """
        Computes kilocalories per portion of drink.
        
        arguments: 
            - volume:     a volume of the drink in liters.
            - kcal_drink: kcal in 100 ml. of the drink
        
        returns:
            A value of kcal. per specified portion.
    """
    return volume * 10 * kcal_drink


kilocalories_portions = [kcal_portion(volume, kilocalories_drink) for volume in typical_cup_volumes]
print(*kilocalories_portions, sep=' | ')

In [None]:
help(kcal_portion)

There is also a concept of *filtering* which can be implemented with a list comprehension.

Suppose that we want to select only non-zero volumes from the list of volumes. Then we can use the following comprehension:

In [None]:
volumes_positive = [volume for volume in typical_cup_volumes if volume > 0]
print("Positive volumes are:", *volumes_positive)

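We can also combine filtering with mapping. It is also possible to use a conditional expression (`value_if_true if condition else value_if_false`) in the mapping part; in that case no element is dropped, and each one is mapped to one of the two values.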

In [None]:
kilocalories_portions = [kcal_portion(volume, kilocalories_drink) 
                         if volume > 0 else None for volume in typical_cup_volumes]
print(*kilocalories_portions, sep=' | ')
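The position of `if` matters. A sketch of the difference, using illustrative values: a trailing `if` filters elements out, while `if ... else` before `for` is a conditional expression that maps every element to some value.

```python
volumes = [0, 0.33, 0.5]      # illustrative values

# trailing `if` filters: zero is dropped, so the result is shorter
filtered = [volume * 1000 for volume in volumes if volume > 0]

# conditional expression maps: every element yields a value, so the length is kept
mapped = [volume * 1000 if volume > 0 else None for volume in volumes]

print(len(filtered), len(mapped))
print(mapped[0])
```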

Another helpful concept is *reducing*. That is, we can retrieve some useful information (e.g. some statistic) from a list, i.e. we reduce it to one particular number.

Let's see how we can take a maximum element from the list above.

In [None]:
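def get_max_safe(_list: list):
    # keep real numbers only: complex values cannot be ordered, so max() would fail on them;
    # bool is a subclass of int, hence the explicit `not isinstance(elem, bool)`
    _list_dropna = [elem for elem in _list 
                    if elem is not None and isinstance(elem, (int, float)) and not isinstance(elem, bool)]
    return max(_list_dropna)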

get_max_safe(kilocalories_portions)
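Other common reductions are available as built-ins (`sum`, `min`, `max`), and `functools.reduce` expresses the general pattern of folding a list into one value. A minimal sketch on illustrative numbers (the list `portions` is made up for this example):

```python
from functools import reduce

portions = [0, 122.1, 185.0]   # illustrative kcal values, made up for this sketch

total = sum(portions)
smallest = min(portions)

# reduce folds the list pairwise; here it keeps the larger of two values
largest = reduce(lambda left, right: left if left > right else right, portions)

print(total, smallest, largest)
```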