# Python Prerequisites

This book isn't intended to teach basic Python or the use of Python Integrated Development Environments such as PyCharm or Visual Studio Code.  There are many excellent books you can purchase (or borrow from a good library), and free courses on YouTube, that you can use to get started.  If you are an absolute beginner you can also investigate some [online preparatory material](https://health-data-science-or.github.io/basic-python/content/front_page.html) I provide for the MSc in Health Data Science at Exeter, but this is not intended to be exhaustive, and you would be daft not to investigate the plethora of material available for free online.

[2, 4, 6, 8]


### Example 3 - Using if statements within a list comprehension

You can extend list comprehensions to include an `if` statement.  This limits which items are used in the loop.  I often use this as a simple approach to filter a small list.  A common bit of code I use is filtering file types.

This example filters a list of file names to the Python files only.  First, here is the standard `for` loop example. Note the inclusion of the `if` conditional to identify the Python files.

In [10]:
unfiltered_files = ['test.py', 'names.csv', 'fun_module.py', 'prog.config']

python_files = []

# filter the files using a standard for loop 
for file in unfiltered_files:
    if file[-2:] == 'py':
        python_files.append(file)
        
print('using standard for loop: {}'.format(python_files))

using standard for loop: ['test.py', 'fun_module.py']


The list comprehension code is far more compact.  We simply add the `if` statement onto the end of the syntax, i.e.

In [11]:
python_files = [file for file in unfiltered_files if file[-2:] == 'py']
print(f'using list comprehension {python_files}')

using list comprehension ['test.py', 'fun_module.py']
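Slicing the last two characters works for this list, but it would also match any name that merely ends in "py". A slightly stricter sketch is shown below, using `str.endswith` and, as an alternative, `PurePath.suffix` from the standard library `pathlib` module. The extra file name `notes_py` is made up for illustration and is not part of the original example.

```python
from pathlib import PurePath

# 'notes_py' is a made-up name that ends in 'py' but has no '.py' extension
unfiltered_files = ['test.py', 'names.csv', 'fun_module.py', 'prog.config',
                    'notes_py']

# check for the full '.py' extension rather than the last two characters
python_files = [file for file in unfiltered_files if file.endswith('.py')]
print(f'using endswith: {python_files}')

# the same filter using the suffix reported by pathlib
python_files_pathlib = [file for file in unfiltered_files
                        if PurePath(file).suffix == '.py']
print(f'using pathlib: {python_files_pathlib}')
```

Both filters keep `test.py` and `fun_module.py` and drop `notes_py`, which the two-character slice would have let through.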


### Example 5 - List comprehension to create a list of lists

It's likely at some point that you will need a list of lists.  List comprehension syntax is very flexible and will allow you to create these lists, but it's a bit harder to follow in my opinion.  This is equivalent to a nested `for` loop.  Here is some standard code to create one:

In [17]:
list_of_lists = []

# outer loop: controls how many items are in the outer list
for i in range(5):
    # on each iter we create a new sublist to nest
    sub_list = []
    # inner loop: controls how many items are in the sub list.
    for j in range(3):
        # arbitrary operation for the example
        sub_list.append(i * j)
    # we append the sublist to the list of lists.
    list_of_lists.append(sub_list)

print(list_of_lists)

[[0, 0, 0], [0, 1, 2], [0, 2, 4], [0, 3, 6], [0, 4, 8]]


A list comprehension reduces 6 lines of code to 1!

In [19]:
list_of_lists = [[i * j for j in range(3)] for i in range(5)]
print(list_of_lists)

[[0, 0, 0], [0, 1, 2], [0, 2, 4], [0, 3, 6], [0, 4, 8]]


Although the reduction in code is nice, it is less readable at first glance.  Let's break it down.

The first thing to note is that we have an inner and an outer list comprehension, just like the nested `for` loops. Let's simplify to see it:

```python
[<inner comprehension> for i in range(5)]
```

When we look at the code like this it doesn't seem so bad, does it?  It's just the same as examples 1 and 2, but we replace the mathematical operation or function call with another list comprehension.  Just like with function calls, we can use `i` (our current variable in the loop) in the inner list comprehension, i.e.

```python
[i * j for j in range(3)]
```

Here `j` varies, but `i` remains constant. When we put these two together we get the list of lists.
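If the nesting is still hard to read, one option is to move the inner comprehension into a small function so that the outer comprehension looks like a plain function call. This is only a sketch: the helper name `scaled_row` and its `n_items` parameter are invented here for illustration.

```python
def scaled_row(i, n_items=3):
    """Return the inner list for a fixed value of i (hypothetical helper)."""
    # j varies while i stays constant, as in the inner comprehension
    return [i * j for j in range(n_items)]


# the outer comprehension now just calls the helper once per value of i
list_of_lists = [scaled_row(i) for i in range(5)]
print(list_of_lists)
```

This produces the same list of lists as the one-line nested comprehension, at the cost of a few more lines.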

### Example 6 - Iterate over all items in a list of lists

This time we will start with a list of lists.  How do we iterate over all items in the list?

As an example, use this list:

```python
[[8, 2, 1], [9, 1, 2], [4, 5, 100]]
```


In [28]:
list_of_lists = [[8, 2, 1], [9, 1, 2], [4, 5, 100]]

flat_list = []
for sublist in list_of_lists:
    for item in sublist:
        flat_list.append(item)

print(flat_list)


[8, 2, 1, 9, 1, 2, 4, 5, 100]


The code above iterates through each item in turn, e.g. 8, 2, 1, 9 ... 100.  At the end of the snippet we have produced a flat list (1 dimension).  But we need not have created a list; we could have called a function or performed some processing inline.

To create a flat list with list comprehension syntax we can use the following code:

In [29]:
list_of_lists = [[8, 2, 1], [9, 1, 2], [4, 5, 100]]


flat_list = [item for sublist in list_of_lists for item in sublist]
print(flat_list)

[8, 2, 1, 9, 1, 2, 4, 5, 100]


I've used the same variable names as the `for` loop code. I.e. `sublist` and `item`. I think that makes the comparison of the code easier.

You can think of the nested comprehension in two parts. These follow the same order as the standard `for` loop, i.e. the outer loop and then the inner loop.
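One way to see the ordering is to lay the comprehension out over several lines, with one clause per line. The sketch below does this and adds an `if` filter at the end; keeping only the even numbers is an arbitrary choice for the example. It also shows `itertools.chain.from_iterable` from the standard library as an alternative way to flatten a list of lists.

```python
from itertools import chain

list_of_lists = [[8, 2, 1], [9, 1, 2], [4, 5, 100]]

# the clauses read top to bottom in the same order as the nested for loops
even_items = [item
              for sublist in list_of_lists   # outer loop
              for item in sublist            # inner loop
              if item % 2 == 0]              # optional filter on each item
print(even_items)

# alternative: chain the sublists together and convert the result to a list
flat_list = list(chain.from_iterable(list_of_lists))
print(flat_list)
```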

## Dictionaries

Most programming languages have a form of dictionary.  They consist of key and item pairs.  The key, typically a `str`, allows quick lookup of the item: for example, a `list` or `int`.  In Python a dictionary is referred to as a `dict`.  The basic syntax to declare one statically is as follows:

In [2]:
foo = {'bar':10, 'spam':'eggs', 10:'knights'}

A few things to notice: 

* A `dict` uses `{}` 
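* Each key is separated from its item by a colon, and the key and item pairs are separated by commas
* The data types of the keys and items do not need to be consistent.  Items can be any type you require; keys can be any hashable type (for example a `str` or an `int`).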

> My recommendation is that you keep the data type for the **key** consistent. This simplification avoids silly mistakes.  Flexibility isn't always your friend! 
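As a small illustration of the kind of silly mistake mixed key types can cause, the sketch below reuses `foo` from above. The `in` check it relies on is covered later in this section, and the attempt to use a `list` as a key is included only to show that keys must be hashable.

```python
foo = {'bar': 10, 'spam': 'eggs', 10: 'knights'}

# the int 10 and the str '10' are different keys
print(10 in foo)
print('10' in foo)

# keys must be hashable, so a list cannot be used as a key
try:
    foo[['a', 'list']] = 'value'
except TypeError as error:
    print(f'TypeError: {error}')
```

The first check prints `True` and the second prints `False`, which is easy to overlook if keys are read in from a text file as strings.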

We access the items in the `dict` by using their keys:


In [5]:
print(foo['spam'])

eggs


In [6]:
print(foo['bar'])

10


In [8]:
print(foo[10])

knights


### Creating an empty dict and adding key:items

This is an easy operation. First we simply create a `dict` with no key and item pairs and assign it to a variable.

In [18]:
foo = {}
print(foo)
print(type(foo))

{}
<class 'dict'>


To add an item we use a simple syntax where we specify the key between `[]` and assign the item value:

In [24]:
foo = {}
foo['bar'] = 'spam'
foo['eggs'] = 10.0
print(foo)

{'bar': 'spam', 'eggs': 10.0}
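The same syntax also updates an item: assigning to a key that already exists replaces the old item rather than adding a second pair. A minimal sketch (the replacement value `'more spam'` is arbitrary):

```python
foo = {}
foo['bar'] = 'spam'
foo['eggs'] = 10.0

# 'bar' already exists, so this overwrites its item
foo['bar'] = 'more spam'

print(foo)
print(len(foo))
```

The `dict` still contains two pairs after the update.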


### Checking for the existence of a key

A key must exist in a `dict` before it can be referenced.  Python raises a `KeyError` exception if you try to reference a key that is not present.  If you want to check if a key exists you can use the following syntax:


In [28]:
foo = {}
foo['spam'] = 'eggs'

if 'spam' in foo:
    print(foo['spam'])
else:
    print('oh dear there is no spam')

eggs
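Two other common patterns are worth knowing, sketched below. The `dict.get` method returns `None`, or a default you supply, when the key is missing. Alternatively, you can attempt the lookup and catch the `KeyError`. The key `'bar'` and the default text are arbitrary choices for the example.

```python
foo = {'spam': 'eggs'}

# .get() returns the item if the key exists ...
print(foo.get('spam'))

# ... and None, or a default you supply, if it does not
print(foo.get('bar'))
print(foo.get('bar', 'no bar here'))

# alternatively attempt the lookup and handle the exception
try:
    print(foo['bar'])
except KeyError:
    print('oh dear there is no bar')
```

Which you choose is largely a matter of taste: `in` is explicit, `get` is compact, and `try`/`except` is handy when a missing key is genuinely exceptional.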


### Dictionary comprehensions

If needed you can also create a `dict` using a dictionary comprehension e.g.

In [2]:
keys = ['bar', 'spam', 'nee']
items = [10, 'eggs', 'knights']

foo = {key:item for key, item in zip(keys, items)}
print(foo)
print(foo['bar'])

{'bar': 10, 'spam': 'eggs', 'nee': 'knights'}
10
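When the keys and items are already paired up like this, the built-in `dict` constructor can take the output of `zip` directly. The comprehension form becomes more useful once you want to filter or transform the pairs. The sketch below shows both; keeping only the `str` items is an arbitrary filter for illustration.

```python
keys = ['bar', 'spam', 'nee']
items = [10, 'eggs', 'knights']

# equivalent to the comprehension above when no filtering is needed
foo = dict(zip(keys, items))
print(foo)

# a comprehension lets you add an if statement, as with lists
str_items_only = {key: item for key, item in zip(keys, items)
                  if isinstance(item, str)}
print(str_items_only)
```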


### Nested dictionaries

Sometimes it can be useful to have a nested dictionary.  A good example of when this is useful is when creating a `pandas` `DataFrame`.  You can think of a nested dictionary as having a table-like structure.  Here's a simple example:

In [4]:
bands_dict = {'band': {0: 'pantera', 1: 'metallica', 2: 'megadeth', 3: 'anthrax'},
              'n_albums': {0: 9, 1: 10, 2: 15, 3: 11},
              'yr_formed': {0: 1981, 1: 1981, 2: 1983, 3: 1981},
              'active': {0: False, 1: True, 2: True, 3: True}}

The variable `bands_dict` is just a simple `dict`.  However, each key can be thought of as a column name in a table.  Each item contains the row data.  For example, to find the data in row 0 of the `band` column we use:

In [9]:
bands_dict['band'][0]

'pantera'

Or if you want all data in a column you simply use the key:

In [7]:
bands_dict['yr_formed']

{0: 1981, 1: 1981, 2: 1983, 3: 1981}
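Reading across a row takes slightly more work, because the row index lives inside each column. One possible approach is a dictionary comprehension over the columns, sketched below; the variable name `row_1` is made up for the example.

```python
bands_dict = {'band': {0: 'pantera', 1: 'metallica', 2: 'megadeth', 3: 'anthrax'},
              'n_albums': {0: 9, 1: 10, 2: 15, 3: 11},
              'yr_formed': {0: 1981, 1: 1981, 2: 1983, 3: 1981},
              'active': {0: False, 1: True, 2: True, 3: True}}

# for each column, pick out the item stored under row index 1
row_1 = {column: rows[1] for column, rows in bands_dict.items()}
print(row_1)
```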

Again, this data structure might be useful to create if you are collecting data during the run of a model or algorithm.
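As a rough sketch of what that collection might look like, the code below builds a nested `dict` one row at a time inside a loop. The halving step is an arbitrary stand-in for a real model or algorithm, and the column names `iteration` and `value` are invented for the example.

```python
# one inner dict per column, filled in as the run progresses
results = {'iteration': {}, 'value': {}}

value = 1.0
for i in range(4):
    # arbitrary stand-in for one step of a model or algorithm
    value = value / 2

    # store this step's data under row index i in each column
    results['iteration'][i] = i
    results['value'][i] = value

print(results['value'])
```

The result has the same column and row layout as `bands_dict`.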