# Better Python programming

This notebook was written by Tim Hillel (tim.hillel@ucl.ac.uk) for the UCL Department of Civil, Environmental, and Geomatic Engineering (CEGE) Introduction to Python sessions.

Please contact the author before distributing or reusing the material below.

## Overview

This notebook builds on the Python fundamentals notebook and introduces some more advanced techniques, which will help improve the quality of your Python code.

These techniques include:
* Data structures & indexing
* Loops & iterables
* String formatting
* List comprehensions
* Function definition

## Data structures

In the last notebook we used literal values, but what happens when we need to add shape or structure to our data? 

Python allows us to combine values together in *data structures*. 

We will discuss three main data structures here, *list*, *tuple*, and *dictionary*. These are the structures you are likely to use most, though there are more!

Each one uses a different type of brackets:

1. `list` uses `[]` 
2. `tuple` uses `()`
3. `dictionary` uses `{}`

### Lists

Lists are 1D arrays of values, or *elements*, which are indexed by position. They are created using square brackets `[]`

We could, for instance, create a list of possible travel modes for your journey to EPFL.

In [None]:
travel_modes = ['Walk', 'Cycle', 'Bus', 'Train']
travel_modes

Lists are indexed by integer position. Indexing in python also uses square brackets (for all data structures!).

Note:
* Python uses zero-indexing (i.e. the first element in a list has the index 0)
* We can index from the right using negative numbers

In [None]:
travel_modes[-2]
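As a quick sketch of the two notes above, the cell below indexes from both ends of a list. It uses a separate illustrative list, `demo_modes` (a name assumed here so that `travel_modes` is left untouched).

```python
# Illustrative list (assumed name, separate from travel_modes)
demo_modes = ['Walk', 'Cycle', 'Bus', 'Train']

first = demo_modes[0]         # zero-indexing: index 0 is the first element
last = demo_modes[-1]         # -1 is the last element
second_last = demo_modes[-2]  # -2 is the second element from the right

print(first, last, second_last)
```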

We can also slice lists, using the colon `:`

In [None]:
# Slice from element 1 to the end of the list
travel_modes[1:]

In [None]:
# Slice from element 1 to element 3 (non inclusive)
travel_modes[1:3]
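Either side of the colon can be left empty, and negative indices also work in slices. The sketch below shows a few such variations on an illustrative list (`demo_modes` is an assumed name, not part of the exercises).

```python
# Illustrative list (assumed name, separate from travel_modes)
demo_modes = ['Walk', 'Cycle', 'Bus', 'Train']

first_two = demo_modes[:2]   # from the start up to (not including) element 2
last_two = demo_modes[-2:]   # from the second-to-last element to the end
everything = demo_modes[:]   # both ends omitted: a slice of the whole list

print(first_two)   # ['Walk', 'Cycle']
print(last_two)    # ['Bus', 'Train']
print(everything)  # ['Walk', 'Cycle', 'Bus', 'Train']
```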

Remember that strings were described as lists of characters? We can therefore index strings in the same way.

Try taking a slice of the string `cyc`

In [None]:
cyc = 'Cycle'
# Take a slice of the first three characters of cyc
cyc[:3]


You can also try two-layer indexing: first select an element of the list, then slice that element.

In [None]:
# Take a slice of the first three characters of element 1 of travel_modes
travel_modes[1][:3]


Lists have **variable length** and are **mutable**, in other words we can change values in a list after making it. 

Let's change the last travel mode to `'Metro'`

In [None]:
# Change the last travel mode to 'Metro'
travel_modes[-1] = 'Metro'
travel_modes


We can also add items to a list using the `append` method. Maybe you own a car?

In [None]:
# Add 'Drive' to travel_modes
travel_modes.append('Drive')
travel_modes

As with strings, we can concatenate lists. We can also mix types, and even store lists in lists (endlessly if needed)!

In [None]:
other_modes = ['Plane', ['Speed-Boat', 'Ferry'], None, 10]
# create a new_list combining travel_modes and other_modes
new_list=travel_modes + other_modes
new_list


There are many other methods that work with lists. Try using tab autocompletion to see which methods you can call on the list. Choose one and see if you can use the documentation to apply it to `new_list`.

In [None]:
# try using a few list methods on new_list
new_list.remove(10)
new_list
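For reference, here is a small sketch of a few other built-in list methods (`insert`, `pop`, `index`, and `sort`). The list `demo_list` is an assumed name used only for illustration.

```python
# Illustrative list (assumed name)
demo_list = ['Bus', 'Walk', 'Cycle']

demo_list.insert(1, 'Train')        # insert before position 1
removed = demo_list.pop()           # remove and return the last element
position = demo_list.index('Walk')  # position of the first match
demo_list.sort()                    # sort in place (alphabetically here)

print(removed)    # Cycle
print(position)   # 2
print(demo_list)  # ['Bus', 'Train', 'Walk']
```

Note that `insert`, `pop`, and `sort` all change the list in place, which is only possible because lists are mutable.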


### Tuples

Tuples are very similar to lists, except they have **fixed length** and are **not mutable**. They are created using round brackets `()`.

In [None]:
nums = (10, 20, 30)
nums[1]

Note that, as tuples are not mutable, we cannot append to them or change a value in a tuple. The cell below raises a `TypeError`.

In [None]:
nums[1]=40

Tuples are commonly used for simple data in a regular format (e.g. lat-long pairs) or for returning multiple function values. 

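Tuples support *unpacking*, that is, storing each of the values in a tuple in its own variable, all in one line. Be careful though, you need to unpack the right number of values! The second cell below raises a `ValueError`, because it tries to unpack three values into two variables.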

In [None]:
a, b, c = nums
print('a is ' + str(a))
print('b is ' + str(b))
print('c is ' + str(c))

In [None]:
d, e = nums  # raises ValueError: too many values to unpack (expected 2)
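To illustrate the lat-long use case mentioned above, the sketch below stores a coordinate pair as a tuple and unpacks it. The variable names and the coordinate values are assumptions chosen purely for illustration.

```python
# A latitude-longitude pair stored as a tuple (illustrative values only)
location = (51.5, -0.1)

# Unpack the pair into two separate variables in one line
lat, lon = location

print('Latitude is ' + str(lat))
print('Longitude is ' + str(lon))
```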

### Dictionaries

Dictionaries are collections of values which are indexed by a *keyword* (or *key*) instead of integer position. As with lists, they have **variable length** and are **mutable**.

Dictionaries are created using curly brackets `{}`, and take the general form

    example_dict = {'keyword_1': value_1, 'keyword_2': value_2}

In [None]:
fuel_cost = 1.8
mode_costs = {'Walking': 0, 'Bus': 2.4, 'Car': fuel_cost}
mode_costs

We can then access the values by keyword. Try getting the cost for car:

In [None]:
#try extracting the cost for travelling by car
mode_costs['Car']


Adding new elements to the dictionary is then trivial. Just create a new keyword-value pair!

In [None]:
mode_costs['Metro'] = 2.4
mode_costs
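A few other common dictionary operations are sketched below: checking whether a keyword exists, looking up a keyword that may be missing, updating a value, and deleting an entry. The dictionary `demo_costs` is an assumed name, kept separate from `mode_costs`.

```python
# Illustrative dictionary (assumed name, separate from mode_costs)
demo_costs = {'Walking': 0, 'Bus': 2.4}

has_bus = 'Bus' in demo_costs            # True: membership tests check the keywords
train_cost = demo_costs.get('Train')     # None: get does not raise a KeyError
fallback = demo_costs.get('Train', 0.0)  # 0.0: a default for missing keywords

demo_costs['Bus'] = 2.6    # assigning to an existing keyword overwrites its value
del demo_costs['Walking']  # remove a keyword-value pair

print(demo_costs)  # {'Bus': 2.6}
```

Indexing with square brackets (e.g. `demo_costs['Train']`) would raise a `KeyError` for a missing keyword, which is why `get` can be the safer choice when a keyword might not be present.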

### Copies

When storing data structures as variables, the variable points to the memory locations storing the data, and these are passed by *reference*. This is important, as multiple variables can point to the same memory locations.

In [None]:
# create a list
old = [1,2,3]
# store the list as a new variable, new
new = old
# append a new value to the original list
old.append(4)
# new value is also in the new list (as they point to the same memory locations)
print(new)

If we want to create a new version of the same list (i.e. store it again separately in memory) we need to use the `copy` method.

In [None]:
separate = new.copy()
new.append(5)
separate
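One caveat worth knowing: the `copy` method makes a *shallow* copy, so any lists nested inside the list are still shared. The sketch below contrasts this with `copy.deepcopy` from the standard library. The variable names are assumptions for illustration.

```python
import copy

# A list containing other lists (illustrative data)
nested = [[1, 2], [3, 4]]

shallow = nested.copy()       # new outer list, but the inner lists are shared
deep = copy.deepcopy(nested)  # inner lists are copied as well

# Modify one of the inner lists of the original
nested[0].append(99)

print(shallow[0])  # [1, 2, 99] (shared with nested)
print(deep[0])     # [1, 2]     (fully independent)
```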

## Loops

In python, loops operate over objects called *iterables*, and use the keyword `for`.

Their general structure is similar to `if` statements:

    for i in iterable:
        <code_block>
        
Lists and tuples can be used directly as iterables. 

Let's try iterating over the `travel_modes` list, and print the first three letters.

In [None]:
# print the first three letters of each mode in travel_modes
for mode in travel_modes:
    print(mode[:3])


We can also combine loops with if statements. Try printing only the modes longer than 4 characters. 

Hint, try using the function `len`!

In [None]:
# Only print modes in travel_modes longer than 4 characters
for mode in travel_modes:
    if len(mode)>4:
        print(mode)


There are other functions that are useful for executing loops.

`range` creates an iterable of integer values. It takes a stop value (which is not included in the output), and optionally a start and a step.

Try using range to print the even numbers between 5 and 15 in a `for` loop

In [None]:
# use range to print the even numbers between 5 and 15
for i in range(6,15,2):
    print(i)
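To see how the start, stop, and step arguments interact, it can help to convert a `range` to a list. The cell below is a small sketch of the three calling forms.

```python
# One argument: stop only (start defaults to 0, step to 1)
up_to_five = list(range(5))

# Two arguments: start and stop (stop is not included)
two_to_five = list(range(2, 6))

# Three arguments: start, stop, and a negative step to count down
countdown = list(range(10, 0, -3))

print(up_to_five)   # [0, 1, 2, 3, 4]
print(two_to_five)  # [2, 3, 4, 5]
print(countdown)    # [10, 7, 4, 1]
```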


What if we want both the value and the index simultaneously? `enumerate` allows us to access both the value and the index at the same time.

In [None]:
for i, v in enumerate(['Zero', 'One', 'Two']):
    print('Element ' + str(i) + ' is ' + v)
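`enumerate` also accepts an optional `start` argument, which is handy for human-friendly numbering. The sketch below collects the numbered lines in a list (`demo_modes` and `lines` are assumed names).

```python
# Illustrative list (assumed name)
demo_modes = ['Walk', 'Cycle', 'Bus']

lines = []
# start=1 makes the counter begin at 1 instead of 0
for position, mode in enumerate(demo_modes, start=1):
    lines.append(str(position) + '. ' + mode)

print(lines)  # ['1. Walk', '2. Cycle', '3. Bus']
```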

Finally, the `items` method of a dictionary allows us to iterate over the key-value pairs in a `dict`.

In [None]:
mode_costs = {'Walking': 0, 'Bus': 2.4, 'Car': 5}
for k, v in mode_costs.items():
    print('The cost of ' + k + ' is ' + str(v))
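If only the keywords or only the values are needed, a dictionary can also be iterated over directly or through its `values` method. This is a short sketch using an assumed dictionary name, `demo_costs`.

```python
# Illustrative dictionary (assumed name)
demo_costs = {'Walking': 0, 'Bus': 2.4, 'Car': 5}

# Iterating over a dictionary directly gives its keywords
modes = []
for mode in demo_costs:
    modes.append(mode)

# The values method gives just the values
cheapest = min(demo_costs.values())

print(modes)     # ['Walking', 'Bus', 'Car']
print(cheapest)  # 0
```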

## String formatting

As you have seen in the previous notebook, formatting complex strings using concatenation can be quite clumsy.

Fortunately, there is a much more streamlined method, using the `format` method for strings.

We can insert any number of replacement fields `'{}'` in a string, and then pass the arguments to the placeholders using the `format` method.

In [None]:
mode = 'driving'
cost = 10
print('The cost of {} is {}CHF'.format(mode, cost))

As you can see, the `format` method automatically handles casting numerical values into strings.

String formatting is very powerful, and it is possible to control exactly how each value is displayed in the formatted string.

See the following website for detailed documentation.

https://pyformat.info/
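As a small taste of that control, the sketch below fixes the number of decimal places and uses named replacement fields. The variable names are assumptions for illustration.

```python
demo_mode = 'bus'
demo_cost = 2.4

# :.2f displays a number with exactly two decimal places
two_decimals = 'The cost of {} is {:.2f} CHF'.format(demo_mode, demo_cost)

# Replacement fields can be named and filled with keyword arguments
named = 'The cost of {mode} is {cost} CHF'.format(mode=demo_mode, cost=demo_cost)

print(two_decimals)  # The cost of bus is 2.40 CHF
print(named)         # The cost of bus is 2.4 CHF
```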

### f-Strings (Python 3.6 or later)

Python 3.6 introduced f-Strings, which make string formatting even more streamlined!

Simply precede the string with `f`, and then you can fill the placeholders directly, so that the above string can be replicated with:

In [None]:
f'The cost of {mode} is {cost}CHF'
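The placeholders in an f-String can hold any expression, not just a variable name, and they accept the same format specifications as `format`. A minimal sketch, with assumed variable names:

```python
demo_mode = 'bus'
demo_cost = 2.4

# Method calls, arithmetic, and format specifications all work inside the braces
message = f'The cost of {demo_mode.upper()} is {demo_cost * 2:.2f} CHF for a return trip'

print(message)  # The cost of BUS is 4.80 CHF for a return trip
```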

## List comprehensions

Let's look back at the loop we created above, to print the modes longer than four characters:

In [None]:
for mode in travel_modes:
    if len(mode)>4:
        print(mode)

What if instead of printing the values, we wanted the first three characters of each as a new list?

We could do something like this:

In [None]:
new_list = []

for mode in travel_modes:
    if len(mode)>4:
        new_list.append(mode[:3])
        
new_list

Creating new lists from old lists is quite a common task to do in Python, but the above code is quite long and not that readable. 

Fortunately, there is a better method:

**List comprehensions**!

Take a look at the code below, and compare it to the code above. It contains all the same elements, but in a much more readable way (and all on one line!)


In [None]:
new_list = [mode[:3] for mode in travel_modes if len(mode)>4]
new_list

We can use a similar construct called a **dictionary comprehension** to create dictionaries as well. For instance, the following code makes a dictionary with the travel mode as the key, and its character length as the value.

In [None]:
{mode: len(mode) for mode in travel_modes}
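The `if` part of a comprehension is optional, and it works in dictionary comprehensions too. The sketch below shows both cases on an illustrative list (`demo_modes` is an assumed name).

```python
# Illustrative list (assumed name, separate from travel_modes)
demo_modes = ['Walk', 'Cycle', 'Bus', 'Train']

# No condition: transform every element
upper_modes = [mode.upper() for mode in demo_modes]

# Dictionary comprehension with a condition
long_lengths = {mode: len(mode) for mode in demo_modes if len(mode) > 4}

print(upper_modes)   # ['WALK', 'CYCLE', 'BUS', 'TRAIN']
print(long_lengths)  # {'Cycle': 5, 'Train': 5}
```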

## Function definitions

In the previous notebook we saw how to call functions and methods, but what if we want to write our own?

In this last section we will see how to write and use our own functions.

Let's start by defining the simplest function possible, one that takes no inputs and doesn't return anything, e.g. a 'Hello world!' function:

In [None]:
def hello_world():
    '''
    Prints the phrase 'Hello world!'
    '''
    print('Hello world!')

The function definition is started with the keyword `def`. 

We then give the function a name (`hello_world`) and specify the inputs it takes inside the brackets (none!). 

Next comes the *docstring*. This is simply a string enclosed in triple quote marks, which tells the user how to use the function. Let's try getting help for our new function:

In [None]:
hello_world?
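The `?` syntax only works in IPython and Jupyter. In plain Python, the docstring can be reached through the function's `__doc__` attribute or the built-in `help` function. The sketch below uses a hypothetical function, `greet`, defined only for this illustration.

```python
def greet():
    '''Hypothetical example function: prints a short greeting'''
    print('Hello!')

# The docstring is stored on the function itself
doc_text = greet.__doc__
print(doc_text)

# help(greet) would display the same text, along with the function signature
```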

Documenting your code is **crucial** if you want to work with others or for others to use and understand your code, and **extremely helpful** for your own code, even if others aren't using it. It can be helpful to write the docstring for a function before the function itself, and then you can implement it following the instructions.

The indented block of code after the `def` line is what will be executed when we call the function. Let's try calling our function!

In [None]:
hello_world()

### Function arguments

We can make our function more useful by taking input arguments. Try creating a function that prints an email address for a given first and last name.

In [None]:
def epfl_email(first_name, last_name):
    '''Email address generator, prints <first_name>.<last_name>@ucl.ac.uk'''
    print(f'{first_name.lower()}.{last_name.lower()}@ucl.ac.uk')

epfl_email('Tim', 'Hillel')

What happens if we try to call the function with no arguments?

In [None]:
epfl_email()  # raises TypeError: missing 2 required positional arguments

We can give our function arguments default values by assigning a value to each one in the function definition.

In [None]:
def epfl_email(first_name='first', last_name='last'):
    '''Email address generator, prints <first_name>.<last_name>@ucl.ac.uk'''
    print(f'{first_name.lower()}.{last_name.lower()}@ucl.ac.uk')

epfl_email()

Try creating a function that requires the user's first and last name, but takes the top-level domain (e.g. `ch`) and second-level domain (e.g. `epfl`) as optional arguments.

In [None]:
# modify the function to add sld and tld as optional arguments
def epfl_email(first_name, last_name, sld='epfl', tld='ch'):
    '''Email address generator, prints <first_name>.<last_name>@<sld>.<tld>'''
    print(f'{first_name.lower()}.{last_name.lower()}@{sld}.{tld}')

epfl_email('Tim', 'Hillel', tld='co.uk')


### Return

As well as taking arguments, our functions can *return* values, using the `return` keyword

In [None]:
def epfl_email(first_name, last_name, sld='epfl', tld='ch'):
    '''Email address generator, returns <first_name>.<last_name>@<sld>.<tld>'''
    email_address = f'{first_name.lower()}.{last_name.lower()}@{sld}.{tld}'
    return email_address

In [None]:
epfl_email('Tim', 'Hillel', sld='lhc')
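As mentioned in the section on tuples, a function can return several values at once by returning a tuple, which the caller can then unpack. The sketch below uses a hypothetical helper, `split_email`, which is not part of the exercises above.

```python
def split_email(email_address):
    '''Hypothetical helper, returns the (user, domain) parts of an email address'''
    user, domain = email_address.split('@')
    # Separating values with a comma returns them as a tuple
    return user, domain

# Unpack the returned tuple into two variables
user, domain = split_email('tim.hillel@epfl.ch')

print(user)    # tim.hillel
print(domain)  # epfl.ch
```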