<div style="color:red;background-color:black">
Diamond Light Source

<h1 style="color:red;background-color:antiquewhite"> Python Fundamentals: Comprehensions</h1>  

©2000-20 Chris Seddon 
</div>

Execute the following cell to activate styling for this tutorial

In [1]:
from IPython.display import HTML
HTML(f"<style>{open('my.css').read()}</style>")

## 1
As discussed elsewhere, everything is immutable in functional programming.  As a consequence, functional programmers prefer to avoid "for" loops because the loop counter is mutable.  As an alternative, Python offers the "list comprehension", which only uses immutable elements.

In the following example, the comprehension:
<pre>[x**3 for x in (1, 2, 3, 4, 5)]</pre>
is read as <pre>for every value x in the tuple, compute x\**3</pre>
Thus you read comprehensions back to front: the for clause first and the computation (x\**3) last.  Many people find this confusing at first, but you soon get used to it.  

A subtle point (lost on some people) is that despite appearances, "x" is immutable.  That's because "x" is bound in turn to 5 different values, each of which is immutable, rather than there being one "x" whose value is modified in place.  

And finally, note that the comprehension produces a list (which paradoxically is mutable!):

In [2]:
n = [x**3 for x in (1, 2, 3, 4, 5)]
print(n)

[1, 8, 27, 64, 125]
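
For comparison, here is a minimal sketch of the explicit loop that the comprehension above replaces.  The names cubes_loop and cubes_comp are chosen purely for illustration:

```python
# Explicit-loop equivalent of [x**3 for x in (1, 2, 3, 4, 5)]
cubes_loop = []
for x in (1, 2, 3, 4, 5):
    cubes_loop.append(x**3)   # the list is mutated on every pass

# The comprehension builds the same list in a single expression
cubes_comp = [x**3 for x in (1, 2, 3, 4, 5)]

print(cubes_loop == cubes_comp)   # True
```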


## 2
We often use the built-in "range" as the sequence in a list comprehension:

In [3]:
n = [x**3 for x in range(10)]
print(n)

[0, 1, 8, 27, 64, 125, 216, 343, 512, 729]
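
As a rough illustration of how range behaves, the sketch below shows that it produces its values on demand, yet still supports len and indexing and can be iterated more than once.  The variable name r is just for illustration:

```python
r = range(10)

print(len(r))    # 10: the number of values the range will produce
print(r[3])      # 3: indexing works without building a list
print(list(r))   # materialise all the values as a list
print(list(r))   # same again: a range can be iterated more than once
```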


## 3 
The sequence used in the for clause can be filtered.  The following example only computes x\**3 for even values of x because of the filter <pre>if x%2 == 0</pre>

In [4]:
n = [x**3 for x in range(10) if x%2 == 0]
print(n)

[0, 8, 64, 216, 512]
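
For readers coming from other functional languages, the same filter-then-transform pipeline can be sketched with the built-in filter and map functions.  This is only an equivalent formulation for comparison; the variable names are illustrative:

```python
# Comprehension: filter first, then transform
cubes_of_evens = [x**3 for x in range(10) if x % 2 == 0]

# The same pipeline with the built-in filter and map functions
evens = filter(lambda x: x % 2 == 0, range(10))
cubes_via_map = list(map(lambda x: x**3, evens))

print(cubes_via_map == cubes_of_evens)   # True
```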


## 4
The comprehension is really performing a transformation on a sequence, subject to optional filtering.  The following example confuses many, but is actually the identity transformation.  

The identity transformation is this part of the comprehension:<pre>[x for x in ...</pre>

So only the filter is effective in this example:

In [5]:
n = [x for x in range(20) if x%2 == 0]
print(n)

[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]


## 5
Comprehensions don't just work with numbers - the sequence can contain items of any data type.  Here is an example where we compute the length of each string in a sequence of strings:

In [6]:
n = [len(s) for s in ("sometimes", "this", "leads", "to", "difficult", "understand", "code")]
print(n)

[9, 4, 5, 2, 9, 10, 4]


## 6
In all of the examples so far, the compute clause has generated a number.  Whilst this is common, other possibilities exist.  In this example, the computation stage computes strings with all their characters reversed: 

In [7]:
n = [s[::-1] for s in ("sometimes", "this", "leads", "to", "difficult", "understand", "code")]
print(n)

['semitemos', 'siht', 'sdael', 'ot', 'tluciffid', 'dnatsrednu', 'edoc']


## 7
Another possibility is to compute a list for each item in the given sequence.  

In the example below we split each of the input strings into lists, based on the space as a delimiter:

In [8]:
n = [s.split() for s in ("one", "one two", "one two three", "one two three four")]
print(n)

[['one'], ['one', 'two'], ['one', 'two', 'three'], ['one', 'two', 'three', 'four']]


## 8
Normally a comprehension works on a single sequence, but sometimes it would be helpful if we could work with a combination of 2 or more sequences.  This can be achieved using "zip".  Here we form an f-string based on a pair of strings, for each element returned by zip.  So you can see what is happening, I've shown the output from zip as well as the final list generated by the comprehension.  

zip will return a matching pair of strings from the 2 lists below:

In [9]:
for s1,s2 in zip(['red', 'blue', 'green'], ['kite', 'tit', 'finch']):
    print(f"{s1} {s2}")
print()

n = [f"{s1} {s2}" for s1,s2 in zip(['red', 'blue', 'green'], ['kite', 'tit', 'finch'])]
print(n)

red kite
blue tit
green finch

['red kite', 'blue tit', 'green finch']
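
zip is not limited to two sequences.  The sketch below zips three lists; the counts list is hypothetical data invented for this illustration.  It also shows that zip stops as soon as the shortest sequence runs out:

```python
colours = ['red', 'blue', 'green']
birds = ['kite', 'tit', 'finch']
counts = [2, 5, 1]   # hypothetical data, not part of the original example

# Three sequences are combined element by element
sightings = [f"{n} x {c} {b}" for c, b, n in zip(colours, birds, counts)]
print(sightings)     # ['2 x red kite', '5 x blue tit', '1 x green finch']

# zip stops at the end of the shortest sequence
truncated = list(zip(colours, ['kite']))
print(truncated)     # [('red', 'kite')]
```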


## 9
You are not restricted to a single for clause; a comprehension can iterate over multiple sequences.  In the example below, the fragment <pre>[... for x in range(5) for y in range(7)]</pre>
means for every combination of x and y (the Cartesian product of the sequences); this gives 5 × 7 = 35 items.


In [10]:
n = [(x,y) for x in range(5) for y in range(7)]
print(n)

[(0, 0), (0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (0, 6), (1, 0), (1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (2, 0), (2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (2, 6), (3, 0), (3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (3, 6), (4, 0), (4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (4, 6)]
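
As a cross-check, the standard library's itertools.product generates the same combinations in the same order.  This sketch simply compares the two; the variable names are illustrative:

```python
from itertools import product

pairs_comp = [(x, y) for x in range(5) for y in range(7)]
pairs_prod = list(product(range(5), range(7)))

print(len(pairs_comp))            # 35
print(pairs_comp == pairs_prod)   # True
```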


## 10
Comprehensions can be nested, although this sometimes leads to code that is difficult to understand.  Hopefully, the following example is more understandable.  

We have a double list comprehension, resulting in a 2 dimensional list:

In [11]:
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

cubes = [[col**3 for col in row] for row in matrix]
print(cubes)

[[1, 8, 27], [64, 125, 216], [343, 512, 729]]
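
It is easy to confuse a nested comprehension with a single comprehension that has two for clauses.  The sketch below contrasts the two using the same matrix: the nested form keeps the 2 dimensional shape, whereas the two-clause form flattens it.  The names nested and flat are illustrative:

```python
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Nested comprehension: one inner list per row (2 dimensional result)
nested = [[col**3 for col in row] for row in matrix]

# One comprehension with two for clauses: a single flat list
flat = [col**3 for row in matrix for col in row]

print(nested)   # [[1, 8, 27], [64, 125, 216], [343, 512, 729]]
print(flat)     # [1, 8, 27, 64, 125, 216, 343, 512, 729]
```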


## 11
List comprehensions are not the only type of comprehension in Python.  The vast majority of comprehensions are list comprehensions, but other types exist:
* dict comprehension
* set comprehension
* generator comprehension

Let's start with a dict comprehension.  These comprehensions are indicated by { } brackets; they produce a key-value pair for each item in the given sequence.  

In this example the keys will be computed as x\*\*2 and the values will be computed as x\*\*3:

In [12]:
d = {x**2: x**3 for x in range(20) if x%2 == 0}
print(d)

{0: 0, 4: 8, 16: 64, 36: 216, 64: 512, 100: 1000, 144: 1728, 196: 2744, 256: 4096, 324: 5832}
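
Dict comprehensions work with any data type too.  The sketch below reuses the strings from the earlier examples and also illustrates that keys are unique: if the same key is computed twice, the later value replaces the earlier one.  The names lengths and by_length are illustrative:

```python
words = ("sometimes", "this", "leads", "to", "difficult", "understand", "code")

# Each word maps to its length
lengths = {s: len(s) for s in words}
print(lengths)

# Each length maps to a word; later words overwrite earlier ones of equal length
by_length = {len(s): s for s in words}
print(by_length)   # {9: 'difficult', 4: 'code', 5: 'leads', 2: 'to', 10: 'understand'}
```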


## 12
Next is the set comprehension.  These comprehensions also use { } brackets but have no colon in the compute clause.  For example:
* dict comprehension: { key:value for ...
* set comprehension: { value for ...

In this set comprehension our sequence is the Cartesian product of two sequences.  We form a set by computing x + y for x's in the range 0 through 9 and y's in the range 0 through 9.  Remember that duplicates are removed in a set, so we will end up with all the integers between 0 (0+0) and 18 (9+9).

In [13]:
n = { x + y for x in range(10) for y in range(10) }
print(n)

{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18}
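
Because sets and dicts share the same brackets, it can help to check the resulting types.  In this small sketch, note that empty { } brackets create a dict rather than a set; an empty set has to be written as set().  The variable names are illustrative:

```python
s = {x % 3 for x in range(10)}      # set comprehension: duplicates collapse
d = {x: x % 3 for x in range(10)}   # dict comprehension: note the colon
empty = {}                          # empty brackets give a dict, not a set

print(type(s).__name__, s)          # set {0, 1, 2}
print(type(d).__name__, len(d))     # dict 10
print(type(empty).__name__)         # dict
print(type(set()).__name__)         # set
```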


## 13
Finally we will look at generator comprehensions (officially called generator expressions).  

Generator comprehensions use ( ) brackets.  Furthermore, because we are working with generators, the results are lazily evaluated.

In this example the generator will eventually call sleep for 0, 3, 6 and 9 seconds.  However, each sleep is evaluated only when it is requested.

In the following example we only ask the generator to compute the first sleep; this won't take any time at all.

In [14]:
import time
g = (time.sleep(n) for n in range(0, 12, 3))
print( type(g) )
next(g)
print("sleep(0) done")

<class 'generator'>
sleep(0) done


## 14
If we call next again on our generator we will continue to evaluate lazily and sleep for just 3 seconds.  The 6 and 9 second sleep events will not yet be evaluated.

Please wait until the output appears (3 seconds).

In [15]:
next(g)
print("sleep(3) done")

sleep(3) done


## 15
It will take a further 6 + 9 seconds to exhaust the generator (please be prepared to wait 15 seconds before the output appears):

In [16]:
next(g)
next(g)
print("sleep(6) and sleep(9) done, generator exhausted")

sleep(6) and sleep(9) done, generator exhausted
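
Once a generator is exhausted it yields nothing further.  The sketch below uses a separate, fast generator (named g2 here so that it does not disturb g) to show this without any waiting.  Calling next on an exhausted generator raises StopIteration unless a default value is supplied:

```python
g2 = (x**3 for x in range(4))

first = next(g2)                # only the first value is computed
rest = list(g2)                 # consumes whatever is left
after = next(g2, "exhausted")   # default returned instead of raising StopIteration

print(first)      # 0
print(rest)       # [1, 8, 27]
print(after)      # exhausted
print(list(g2))   # []
```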


## 16
Finally, compare the generator comprehension with the corresponding tuple.  The same sleep events occur, but the tuple evaluates eagerly.  Thus the whole tuple is evaluated before the statement completes.  

So prepare to wait for 0 + 3 + 6 + 9 = 18 seconds!

In [17]:
import time
t = (time.sleep(0), time.sleep(3), time.sleep(6), time.sleep(9))
print("all sleeps done")

all sleeps done
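
The difference between lazy and eager evaluation can also be observed without waiting.  In the sketch below, record is a hypothetical stand-in for time.sleep that simply logs each call.  Passing the comprehension to tuple() forces every item to be evaluated immediately, whereas the bare ( ) form evaluates nothing until asked:

```python
calls = []

def record(n):
    """Hypothetical stand-in for time.sleep: log the call and return n."""
    calls.append(n)
    return n

lazy = (record(n) for n in range(0, 12, 3))         # generator: nothing runs yet
calls_after_generator = list(calls)

eager = tuple(record(n) for n in range(0, 12, 3))   # tuple(): everything runs now
calls_after_tuple = list(calls)

print(calls_after_generator)   # []
print(calls_after_tuple)       # [0, 3, 6, 9]
print(eager)                   # (0, 3, 6, 9)
```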
