# For Loops

So far we've looked at for loops. With these we can perform various procedures, such as adding 5 to every number from 0 -> 20 (exclusive).

In [1]:
# E.g
results = []
for i in range(20):
    results.append( i + 5 )
    
# We should expect 5 -> 25 (exclusive)
print(results)

[5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24]


1. We instantiate an empty list `results = []`
2. We iterate over range(20) `[0,1,2,3,....,19]`
3. Calculate `i + 5`, append it to results.

# List Comprehensions
List comprehensions are an alternative way of making lists.
This is considered a more elegant and **pythonic** way of writing them.

We can rewrite the above example as the following:

In [2]:
results2 = [ i + 5 for i in range(20) ]

# Sanity print, just to make sure it's equivalent.
print(results2)

[5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24]


In [3]:
# Let's go a step further and get Python to verify.
# assert is a statement (not a function) that raises an AssertionError if the condition doesn't hold.
# assert results == results2

# Could also use an if, if we didn't want a big red klaxon to go off.
# if results == results2:
#     print("They are equivalent!")

# assert "bob" == "cat" # Uncomment the beginning of this line if you want to check that.
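To see both outcomes without leaving a failing assert in the notebook, here is a minimal sketch. It rebuilds the two lists so it can run on its own, and catches the `AssertionError` so the failure is visible without stopping the cell. The name `assertion_failed` and the message text are only for illustration.

```python
# Rebuild both lists so this cell can run on its own.
results = []
for i in range(20):
    results.append(i + 5)
results2 = [i + 5 for i in range(20)]

# assert stays silent when the condition is true...
assert results == results2

# ...and raises AssertionError when it is false.
assertion_failed = False
try:
    assert "bob" == "cat", "these strings differ"
except AssertionError as error:
    assertion_failed = True
    print("AssertionError:", error)
```

The optional text after the comma becomes the error message, which helps when a notebook has many checks.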

## List Comprehensions are comprised of 3 main elements:

`new_list = [ expression for element in some_iterable ]`

1. **expression** - The element itself, a function call, or any other expression which can be evaluated. E.g `i + 5`
2. **element** - The element variable from the iterable. E.g `i` for our example.
3. **some_iterable** - An Iterable Data Structure. Hint: If it works with `for elem in the_structure` then it's a candidate for this. E.g `range(20)` in our case.

The main gotcha here is that the expression has to be 'singular'. We cannot have `<expression>` `<expression>` `<expression>`, one after another, as we can in the body of a more complicated for loop.
But we've seen that expressions are quite powerful and flexible. An expression can be a function call just as easily as it can be `i` or `i + 5`. E.g `something_new_list = [ get_grade( student ) for student in roster ]`
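As a rough sketch of that last example: `get_grade` and `roster` are not defined anywhere in this notebook, so the versions below are made up purely for illustration (including the pass mark of 50).

```python
# Hypothetical data: each student is a (name, score) pair.
roster = [("Ada", 98.0), ("Grace", 76.9), ("Jim", 45.0)]

def get_grade(student):
    """Return 'pass' or 'fail' for one (name, score) pair.

    The pass mark of 50 is an assumption made for this example.
    """
    name, score = student
    if score >= 50:
        return "pass"
    return "fail"

# The expression is a single function call, however much work happens inside it.
something_new_list = [get_grade(student) for student in roster]
print(something_new_list)  # ['pass', 'pass', 'fail']
```

The function body can hold as many statements as it needs, while the comprehension itself still contains only one expression.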

### Why Bother?
Compare the for loop variants to the List Comprehension variants. It can be argued that List Comprehensions improve readability. They are robust and can be utilised in many different situations. They also reduce cognitive load: we don't need to keep track of loop iterations or have any *side effects*.

List Comprehensions are inherently more **declarative** in nature: they read more like plain English than a set of computer instructions, and they are higher-level than for loops alone. There is no need to create an extra variable and then append to it. Instead you declare what you want the list to be, and Python takes care of the creation and appending.

## List Comprehensions: Level 2
Commonly, list comprehensions are utilised as filters on existing data structures. This is a very important and handy operation in Data Science.

However, with our definition above, there's no real way to filter the list unless you put all of that logic into the `get_grade` function.

Luckily, List Comprehensions have some additional syntax.

`new_list = [ expression for element in some_iterable (if conditional) ]`

### Example: Filtering a list of numbers for even numbers.

In [4]:
numbers = [20, 23, 10, 12, 7, 9, 11]

# Traditional approach
even_numbers = []
for n in numbers:
    if n % 2 == 0:
        even_numbers.append( n )
        
print(even_numbers)

# List Comprehension Way
even_numbers = [n for n in numbers if n % 2 == 0]
print(even_numbers)

[20, 10, 12]
[20, 10, 12]


As you can see, the list comprehension way is a single line, at the same indentation level! And it perfectly filtered our input list.

"But that's just a simple check!" I hear you interject. Using our knowledge of Boolean Expressions we can build increasingly complex logic that still works in this format. So long as the condition evaluates to True or False, it can be used.

In [5]:
a_mess = [20, 23, [1,2], 10, 12, 7, 9, "bob", 11, "cat", 89.001983276298, {}]

# How are we going to filter this mess!

# Make a function with a good name! As it returns a bool, we prefix the name with is_ for good readability and understanding!
def is_number(n):
    # a or b returns True if either is the case. 
    # As 'number' could be an integer, or a float, we have to consider both.
    return isinstance(n, int) or isinstance(n, float)

numbers_filtered = [ x for x in a_mess if is_number(x) ]
print(numbers_filtered)

# We can also do the following, if we didn't want to make it a function ( if we only ever use it once for example )
numbers_filtered = [ x for x in a_mess if (isinstance(x, int) or isinstance(x, float)) ]
print(numbers_filtered)

[20, 23, 10, 12, 7, 9, 11, 89.001983276298]
[20, 23, 10, 12, 7, 9, 11, 89.001983276298]
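One caveat worth knowing, offered as a side note rather than a correction: in Python `bool` is a subclass of `int`, so the check above also accepts `True` and `False`. The sketch below shows one possible stricter variant; `is_strict_number` and the `mixed` list are hypothetical names introduced only for this example.

```python
def is_number(n):
    return isinstance(n, int) or isinstance(n, float)

def is_strict_number(n):
    # isinstance accepts a tuple of types; bool is excluded explicitly
    # because isinstance(True, int) is True.
    return isinstance(n, (int, float)) and not isinstance(n, bool)

mixed = [20, True, 7.5, "bob", False, None]

loose = [x for x in mixed if is_number(x)]
strict = [x for x in mixed if is_strict_number(x)]

print(loose)   # [20, True, 7.5, False]
print(strict)  # [20, 7.5]
```

Whether booleans should count as numbers depends on the data, so this is a design choice rather than a rule.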


## List Comprehensions: Level 3

So we can choose whether to include or exclude a data element. What if we wanted to change what that element becomes in the output?

`new_list = [ value_if_true if cond else value_if_false for element in some_iterable ]`

We can use a conditional expression at the beginning (rather than the end) to change the value. Unlike the filter, this form needs an `else`, because every element must still produce some value for the output list.

In [6]:
sales = [-10, -32.9, 12, 11.34, 0]

#profit_only = [ i for i in sales ] # This would get each element.
#profit_only = [ i for i in sales if i > 0] # This would exclude all data members <= 0.
profit_only = [ i if i > 0 else 0 for i in sales ]

print(profit_only)

[0, 0, 12, 11.34, 0]


If we had used our condition at the end, as a filter, it would have returned a list of length 2. What we actually wanted was to find out how much profit we made per item. Putting the condition at the front allows us to modify the value that goes into our output list.

Try uncommenting the other lines to see the differences (remember to have only one line uncommented at any time!)
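For a side-by-side view of the three variants, here is a small sketch that gives each one its own name so they can all run together. The names `every_item`, `filtered` and `replaced` are introduced only for this comparison.

```python
sales = [-10, -32.9, 12, 11.34, 0]

every_item = [i for i in sales]                # no condition: all 5 values
filtered = [i for i in sales if i > 0]         # condition at the end: drops items
replaced = [i if i > 0 else 0 for i in sales]  # condition at the front: swaps values

print(len(every_item), every_item)  # 5 [-10, -32.9, 12, 11.34, 0]
print(len(filtered), filtered)      # 2 [12, 11.34]
print(len(replaced), replaced)      # 5 [0, 0, 12, 11.34, 0]
```

The filter changes the length of the output, while the conditional at the front keeps one output value per input value.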

That example is a little bit complex and unreadable. We can bring back our friend the function to help us here.

In [7]:
sales = [-10, -32.9, 12, 11.34, 0]

def get_profit(n):
    if n > 0:
        return n
    else:
        return 0
    
# More compact
# def get_profit(n):
#     return n if n > 0 else 0

profit_only = [ get_profit(i) for i in sales ]

print(profit_only)

[0, 0, 12, 11.34, 0]


Much more readable!

Always strive for more readable code; understanding is key. If code does the task in a more complicated way, it's more likely to be misunderstood, or to have errors that go unnoticed. Clarity of communication should be the core focus after functionality.

# Dict Comprehensions
Just as List Comprehensions can tidy up code that builds Lists, we can use the same idea to build Dictionaries (and even data structures like Sets too).

They behave the same way but have slightly different syntax, because we need to account for both the key and the value.

`new_dict = {key_expression: value_expression for element in some_iterable}`

These may seem harder to use, as you need an expression for the key as well as the value. They are used less often than List Comprehensions.

In [8]:
new_prices = {v: v + 5 for v in range(20)}
print(new_prices)

# 0 -> 0 + 5
# 1 -> 1 + 5
# ...
# 19 -> 19 + 5

{0: 5, 1: 6, 2: 7, 3: 8, 4: 9, 5: 10, 6: 11, 7: 12, 8: 13, 9: 14, 10: 15, 11: 16, 12: 17, 13: 18, 14: 19, 15: 20, 16: 21, 17: 22, 18: 23, 19: 24}
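Sets were mentioned at the start of this section as another structure that supports comprehensions. As a brief sketch, a set comprehension uses the same curly braces but has no `key: value` pair; the `words` list here is made up for the example.

```python
words = ["ada", "grace", "jim", "bob", "cat"]

# Curly braces without a key: value pair build a set, so duplicates collapse.
word_lengths = {len(word) for word in words}
print(word_lengths)

# A list comprehension over the same data keeps every value, duplicates included.
all_lengths = [len(word) for word in words]
print(all_lengths)
```

Five words go in, but only two distinct lengths come out of the set version.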


### More advanced
These examples use `zip` and `enumerate`, which both produce multiple values per iteration, so we can use one as the key and the other as the value.

In [9]:
# See https://docs.python.org/3.8/library/functions.html#zip

people = ["Ada", "Grace", "Jim"]
grades = [98.0, 76.9, 75.0]

people_grades_dict = {name: grade for name, grade in zip(people, grades)}
print(people_grades_dict)

{'Ada': 98.0, 'Grace': 76.9, 'Jim': 75.0}
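When the pairs need no transformation at all, the built-in `dict()` can consume the output of `zip` directly. The sketch below compares the two approaches; the upper-casing and rounding are arbitrary choices, used only to show a case where the comprehension adds something.

```python
people = ["Ada", "Grace", "Jim"]
grades = [98.0, 76.9, 75.0]

# No transformation needed: dict() accepts an iterable of (key, value) pairs.
direct = dict(zip(people, grades))

# A comprehension earns its keep once the key or value is transformed.
rounded = {name.upper(): round(grade) for name, grade in zip(people, grades)}

print(direct)   # {'Ada': 98.0, 'Grace': 76.9, 'Jim': 75.0}
print(rounded)  # {'ADA': 98, 'GRACE': 77, 'JIM': 75}
```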


In [10]:
# See https://docs.python.org/3.8/library/functions.html#enumerate

prices = [ 458, 122, 32.5, 89 ]

def vat_inclusive( price ):
    """ Add a 20% tax to price
    Parameters:
        price: float
    Outputs:
        Float value which has 20% tax added to price
    """
    
    return price * 1.2 # or price + price*0.2

# Store the index against the VAT-inclusive price.
# Let's see what enumerate does first
for i, v in enumerate( prices ):
    print(i, v) # Gives us a counter (from 0) and the value. Literally enumerates them.

print("\n\n") # Bunch of new lines.
id_prices_dict = {k: vat_inclusive(price) for k, price in enumerate(prices)}
print(id_prices_dict)


0 458
1 122
2 32.5
3 89



{0: 549.6, 1: 146.4, 2: 39.0, 3: 106.8}


# Why wouldn't I use List Comprehensions?

So far, we've been dealing with a single for loop.

### Nesting
A **nested** Comprehension is slightly more tricky. What if we wanted a List with a List at every index, or a List as the value for every key in a Dictionary?

Let's say we wanted to track the Daily Power Output of Wind Turbines at various sites for the first week of the month. A List in a Dictionary would be the ideal data structure for this.

In [11]:
turbine_sites = ["Project One", "Candidate Site B", "EOL Zeta"]
MW_output = {site_location: [0 for _ in range(7)] for site_location in turbine_sites }

# This is our starting data structure.
# print(MW_output)

# Prettier printing
for site, output in MW_output.items():
    print(site, output)

Project One [0, 0, 0, 0, 0, 0, 0]
Candidate Site B [0, 0, 0, 0, 0, 0, 0]
EOL Zeta [0, 0, 0, 0, 0, 0, 0]


In this example we have created a dictionary, which has a key for every major turbine site, and at this key stores the List of MW Output for the first 7 days of the month. Currently these are initialised as 0, but we can now populate this structure with real-world values by using indexing.

`MW_output["Project One"][2] = 0.78` This would be the 3rd day of 'Project One' site, which recorded 0.78 MW for the day.
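A short sketch of that update follows, rebuilding the structure so the cell runs on its own. Because the inner comprehension runs once per site, each key gets its own separate list, and changing one site leaves the others untouched.

```python
turbine_sites = ["Project One", "Candidate Site B", "EOL Zeta"]
MW_output = {site: [0 for _ in range(7)] for site in turbine_sites}

# Day 3 lives at index 2, because list indices start at 0.
MW_output["Project One"][2] = 0.78

print(MW_output["Project One"])  # [0, 0, 0.78, 0, 0, 0, 0]
print(MW_output["EOL Zeta"])     # [0, 0, 0, 0, 0, 0, 0]
```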

Nested Lists (Lists in Lists) are common. We might do the following:

`mat = [ [i for i in range(5)] for _ in range(5) ]`

These are often referred to as <u>matrices</u>, which are used extensively in mathematical computing.

In [12]:
mat = [ [i for i in range(5)] for _ in range(5) ]
print(mat)
print("\n")
for line in mat:
    print(line)

[[0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4]]


[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4]


### Flattening

As we progress, we can get into situations where readability quickly suffers. The example above is more complex, but still readable.

Suppose we took a 2D matrix such as the one above and attempted to **flatten** it, so that it became a 1D list instead of a 2D one.

In [13]:
simple_mat = [
    [ 0, 0, 0 ],
    [ 1, 1, 1 ],
    [ 2, 2, 2 ]
]

flattened_mat = [ n for row in simple_mat for n in row ]
# WHAT?!?!?!

print(flattened_mat)

[0, 0, 0, 1, 1, 1, 2, 2, 2]


In [14]:
# Let's break this down

results = []
for row in simple_mat:
    for number in row:
        results.append( number )
print(results)

[0, 0, 0, 1, 1, 1, 2, 2, 2]


Which is more readable to you now? I know I certainly prefer the second one. In the comprehension it is far too easy to get things in the wrong order.
This nightmare is compounded should we ever wish to put some conditionals in there, whether to filter which items are considered, to modify the values that get put in the output, or both at the same time!
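To make that concrete, here is a sketch that does both at once on the same matrix. The rule itself (skip zeros, multiply odd numbers by 10) is arbitrary and chosen only to exercise a filter and a modification together.

```python
simple_mat = [
    [0, 0, 0],
    [1, 1, 1],
    [2, 2, 2],
]

# Loop version: skip the zeros, then multiply the odd numbers by 10.
loop_result = []
for row in simple_mat:
    for n in row:
        if n > 0:
            if n % 2 == 1:
                loop_result.append(n * 10)
            else:
                loop_result.append(n)

# Comprehension version: the for clauses keep the same order as the loops,
# the filter goes last, and the modification goes first.
comp_result = [n * 10 if n % 2 == 1 else n for row in simple_mat for n in row if n > 0]

print(loop_result)  # [10, 10, 10, 2, 2, 2]
print(comp_result)  # [10, 10, 10, 2, 2, 2]
```

Both produce the same list, but the single line now holds two loops and two conditions, which is usually the point at which a plain for loop or a helper function reads better.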