# List comprehensions
List comprehensions are a compact way of building a list from an iterable, resembling mathematical set-builder notation such as $\{x \mid x\in \mathbb{N} \wedge x<5\}$. The basic syntax is
```python
[expression for item in iterable]
```

In [13]:
iters = [1,3,5,7,9]
[i+4 for i in iters]

[5, 7, 9, 11, 13]
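For comparison, the comprehension above does the same work as an explicit loop that appends to an initially empty list. A minimal sketch (the names `result_loop` and `result_comprehension` are introduced here for illustration only):

```python
iters = [1, 3, 5, 7, 9]

# Explicit loop: start with an empty list and append one element at a time.
result_loop = []
for i in iters:
    result_loop.append(i + 4)

# The comprehension builds the same list in a single expression.
result_comprehension = [i + 4 for i in iters]

print(result_loop == result_comprehension)  # True
```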

This can be extended by an optional condition:
```python
[expression for item in iterable if condition]
```


In [14]:
[i for i in iters if i%3==0]

[3, 9]
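Note that the trailing `if` only filters; it cannot supply an alternative value. If every item should be kept but transformed differently depending on a test, a conditional expression (`a if condition else b`) can be used in the expression part instead. A short sketch of the difference, with illustrative variable names:

```python
iters = [1, 3, 5, 7, 9]

# Trailing `if` filters: items that fail the condition are dropped.
filtered = [i for i in iters if i % 3 == 0]

# Conditional expression maps: every item is kept, only its value depends on the test.
mapped = [i if i % 3 == 0 else 0 for i in iters]

print(filtered)  # [3, 9]
print(mapped)    # [0, 3, 0, 0, 9]
```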

Both iteration and condition can be nested:
```python
[expression for item1 in iterable1 for item2 in iterable2 ... if condition1 if condition2 ...]
```

In [23]:
[str(a)+str(b) for a in range(5) for b in range(5) if a%2==0 if b%2==1] # the second loop (b) runs in full for each value of the first loop (a)

['01', '03', '21', '23', '41', '43']

Here, most programmers would rather write `if a%2==0 and b%2==1`.
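The order of the `for` clauses mirrors the order of ordinary nested loops, read from left to right. The following sketch, using the illustrative names `pairs_loop` and `pairs_comprehension`, spells out the equivalent loops and checks that both forms agree:

```python
pairs_loop = []
for a in range(5):        # first `for` clause: the outer loop
    for b in range(5):    # second `for` clause: the inner loop
        if a % 2 == 0 and b % 2 == 1:
            pairs_loop.append(str(a) + str(b))

pairs_comprehension = [
    str(a) + str(b)
    for a in range(5)
    for b in range(5)
    if a % 2 == 0 and b % 2 == 1
]

print(pairs_loop == pairs_comprehension)  # True
```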

We can even put a list comprehension inside another one. This creates a list of lists, i.e. a matrix.

In [18]:
[[a*b for a in range(5)] for b in range(3)]

[[0, 0, 0, 0, 0], [0, 1, 2, 3, 4], [0, 2, 4, 6, 8]]
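A comprehension nested inside the expression part produces a list of lists, whereas two `for` clauses in a single comprehension produce one flat list. A small sketch of the contrast (the name `flat` is chosen here for illustration):

```python
# Nested comprehension: one inner list per value of b.
matrix = [[a * b for a in range(5)] for b in range(3)]

# Two `for` clauses in one comprehension: the rows are flattened into a single list.
flat = [value for row in matrix for value in row]

print(len(matrix), len(matrix[0]))  # 3 5
print(flat)  # [0, 0, 0, 0, 0, 0, 1, 2, 3, 4, 0, 2, 4, 6, 8]
```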

<div class="alert alert-block alert-warning">
Python evaluates the condition for every item of the iterable and keeps only the items that satisfy it. This is called <b>filtering</b> and can get really inefficient if the iterable is long and only a few elements are kept.
</div>
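The cost of filtering can be made visible by counting how often the condition is evaluated. The sketch below uses a hypothetical helper `is_multiple_of_1000` that records every value it is asked about, and compares the filtered result with a `range` that generates only the wanted values in the first place:

```python
checked = []

def is_multiple_of_1000(n):
    """Hypothetical helper: remembers each value it tests."""
    checked.append(n)
    return n % 1000 == 0

# Filtering: the condition runs for all 10 000 candidates, although only 10 are kept.
filtered = [n for n in range(10_000) if is_multiple_of_1000(n)]

# Generating only the wanted values avoids the wasted checks.
direct = list(range(0, 10_000, 1000))

print(len(checked))        # 10000
print(len(filtered))       # 10
print(filtered == direct)  # True
```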

---
### Examples

In [None]:
nums = [int(n) for n in input().split()] # reads a line such as "3 4 5 6" and converts it to the list of integers [3, 4, 5, 6]
print(nums)

Remember the problematic example `matrix = [[1]*5]*3` creating 3 references to the same list of 5 ones. Changing one row of such a matrix resulted in changing all other rows. This can be easily avoided using list comprehensions:

In [None]:
matrix = [[1]*5 for _ in range(3)] # creates a matrix of 3 rows and 5 columns
matrix[0][1] = 55
print(matrix)
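To see the difference between the two constructions directly, the rows can be compared with the `is` operator, which tests whether two names refer to the same object. A minimal sketch with the illustrative names `shared` and `independent`:

```python
shared = [[1] * 5] * 3                     # three references to one inner list
independent = [[1] * 5 for _ in range(3)]  # a new inner list for every row

shared[0][1] = 55
independent[0][1] = 55

print(shared[0] is shared[1])            # True: same object, so every row changed
print(independent[0] is independent[1])  # False: separate objects
print(shared[1][1], independent[1][1])   # 55 1
```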

#### Matrix transposition

In [None]:
[[row[col] for row in matrix] for col in range(len(matrix[0]))] # for each column index, collect that entry from every row
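As an alternative sketch (not part of the original example), the built-in `zip` can produce the same transposition when all rows have equal length: `zip(*matrix)` passes the rows as separate arguments and yields one tuple per column. The small matrix below is made up for illustration:

```python
matrix = [[1, 2, 3],
          [4, 5, 6]]

# Index-based transposition, as in the cell above.
transposed = [[row[col] for row in matrix] for col in range(len(matrix[0]))]

# zip(*matrix) yields the columns as tuples; list() turns each one into a row.
transposed_zip = [list(column) for column in zip(*matrix)]

print(transposed)                    # [[1, 4], [2, 5], [3, 6]]
print(transposed == transposed_zip)  # True
```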

---
### Aggregation functions

In [26]:
print(all([True, True, True])) # returns True if all elements are True
print(any([False, False, True])) # returns True if at least one element is True

letters = ["d", "a", "b", "c", "q", "e"]
print(sorted(letters))
print(all([i < "z" for i in letters]))
print(any([i == "e" for i in letters]))

True
True
['a', 'b', 'c', 'd', 'e', 'q']
True
True
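`all()` and `any()` stop as soon as the result is known. Passing a list comprehension, as above, still evaluates every element first; passing a generator expression (the same syntax without the square brackets) lets them stop early. The sketch below uses a hypothetical helper `equals_d` that logs each letter it tests:

```python
letters = ["d", "a", "b", "c", "q", "e"]

def equals_d(letter, log):
    """Hypothetical helper: records each tested letter in `log`."""
    log.append(letter)
    return letter == "d"

log_list, log_gen = [], []

# List comprehension: all six letters are tested before any() even starts.
found_list = any([equals_d(letter, log_list) for letter in letters])

# Generator expression: any() stops at the first True.
found_gen = any(equals_d(letter, log_gen) for letter in letters)

print(found_list, len(log_list))  # True 6
print(found_gen, len(log_gen))    # True 1
```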


In [27]:
from random import gauss
from statistics import mean, median, mode

# 400 samples from a normal (Gaussian) distribution with mean 120 and standard deviation 30
data = [gauss(120,30) for _ in range(400)]
print(min(data))
print(max(data))
print(sum(data))

print(mean(data))
print(median(data))
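print(mode(data)) # the float values are almost surely all unique, so mode() just returns the first one (Python 3.8+)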

32.46269143881017
213.12892042754385
47630.35533460199
119.07588833650493
120.63842729567482
126.05633449262474
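The mode is of limited use for continuous data, because almost every float occurs exactly once. A hedged sketch, assuming Python 3.8 or newer (where `mode()` returns the first of several equally common values) and using a fixed seed chosen here only for reproducibility: rounding the data into bins before taking the mode gives a more meaningful answer.

```python
from random import gauss, seed
from statistics import mode

seed(42)  # arbitrary fixed seed, only so the sketch is reproducible
data = [gauss(120, 30) for _ in range(400)]

# With all values distinct, every value is "the most common" one.
all_unique = len(set(data)) == len(data)
print(all_unique)
print(mode(data) == data[0])

# Rounding to the nearest 10 groups the values into bins first.
data_rounded = [round(n, -1) for n in data]
print(mode(data_rounded))
```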


In [29]:
# round(n, k) rounds n to k decimal places; a negative k rounds to the nearest 10 (k=-1), the nearest 100 (k=-2), etc.
data_rounded = [None]*len(data)

for d in range(len(data)):
    data_rounded[d] = round(data[d],-1)

for i in range(0,230,10):
    print('#' * data_rounded.count(i))




#
#
#########
##
##############
##########################
################################
##########################################
##########################################
#######################################################
#################################################
##########################################################
################################
#################
########
######
###
#
##



This can be equivalently written as:

In [30]:
data_rounded = [round(n,-1) for n in data] 
hist = [print('#' * data_rounded.count(i)) for i in range(0,230,10)]  # works, but hist is only a list of None, because print() returns None; a plain for loop is clearer when only the side effect is wanted




#
#
#########
##
##############
##########################
################################
##########################################
##########################################
#######################################################
#################################################
##########################################################
################################
#################
########
######
###
#
##
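Calling `data_rounded.count(i)` once per bin scans the whole list each time. As a possible alternative (an assumption of this sketch, not something the original cells use), `collections.Counter` tallies all bins in a single pass; a fixed seed is used here only to make the run reproducible:

```python
from collections import Counter
from random import gauss, seed

seed(0)  # arbitrary fixed seed for reproducibility
data = [gauss(120, 30) for _ in range(400)]

# One pass over the data; a missing bin simply counts as 0.
counts = Counter(round(n, -1) for n in data)

for i in range(0, 230, 10):
    print('#' * counts[i])
```

The integer keys from `range` match the float keys produced by `round(n, -1)` because equal numbers hash equally in Python (`120 == 120.0`).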



---
## Controlling memory usage

A list comprehension generates the whole list at once. This can be a problem if the list is very long.

In [79]:
from random import randint

numbers = [randint(1,100) for _ in range(int(1e6))]

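List comprehension vs generator expression. Creating a generator expression does not compute anything yet, so the second timing covers only the construction of the generator object, not the squaring itself: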

In [132]:
%timeit [x*x for x in numbers]
%timeit (x*x for x in numbers)

39.4 ms ± 556 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
257 ns ± 6.06 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
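The large gap between the two timings comes from laziness: a generator expression computes each value only when it is requested. The sketch below makes this visible with a hypothetical helper `square` that records its calls:

```python
calls = []

def square(x):
    """Hypothetical helper: records each argument it is called with."""
    calls.append(x)
    return x * x

lazy = (square(x) for x in range(5))
print(calls)         # []  (nothing has been computed yet)

first = next(lazy)   # requesting one value triggers one call
print(first, calls)  # 0 [0]

rest = list(lazy)    # consuming the rest triggers the remaining calls
print(rest, calls)   # [1, 4, 9, 16] [0, 1, 2, 3, 4]
```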


In [133]:
squares = [x*x for x in numbers]
squares_gen = (x*x for x in numbers)


In [134]:
%timeit sum(squares)
%timeit sum(squares_gen)

4.33 ms ± 98.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
33.8 ns ± 0.0931 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
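The second timing should be read with care: a generator can be consumed only once. After the first run `squares_gen` is exhausted, so the remaining runs sum an empty sequence, which is why they appear so fast. A minimal sketch of this behaviour with a short, made-up list:

```python
numbers = [1, 2, 3, 4]
squares_gen = (x * x for x in numbers)

first_total = sum(squares_gen)   # consumes the generator
second_total = sum(squares_gen)  # the generator is now empty

print(first_total)   # 30
print(second_total)  # 0
```

For a fair comparison the generator would have to be rebuilt inside the timed statement, for example `%timeit sum(x*x for x in numbers)`.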


In [139]:
import tracemalloc
from typing import Generator

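def calculate_squares(generator: bool) -> list[int]|Generator[int, None, None]:
    """Return the squares of random numbers, either as a list or as a generator expression."""
    count = int(1e6)  # named `count` to avoid shadowing the built-in max()

    if generator:
        # Nothing is computed here; values are produced only when the result is iterated.
        numbers = (randint(1, 100) for _ in range(count))
        return (x*x for x in numbers)
    else:
        # Both lists are built in full before returning.
        numbers = [randint(1, 100) for _ in range(count)]
        return [x*x for x in numbers]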

def measure(generator: bool) -> None:
    """Print the current and peak memory traced while building the list or the generator."""
    tracemalloc.start()
    calculate_squares(generator)  # note: a returned generator is never consumed here

    current, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    print(f"Generator: {generator}")
    print(f"Current memory usage: {current / 10**6:.4f} MB")
    print(f"Peak memory usage: {peak / 10**6:.4f} MB")
    print("-" * 40)

In [140]:
measure(True)
measure(False)

Generator: True
Current memory usage: 0.0000 MB
Peak memory usage: 0.0011 MB
----------------------------------------
Generator: False
Current memory usage: 0.0022 MB
Peak memory usage: 43.7936 MB
----------------------------------------
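In the measurement above the generator is created but never iterated, so almost no work is done in that case. The following sketch, with the hypothetical function `peak_memory_mb` and a smaller sample size chosen to keep it quick, consumes the values with `sum()` in both variants so that the comparison covers the actual computation:

```python
import tracemalloc
from random import randint

def peak_memory_mb(use_generator: bool, count: int = 100_000) -> float:
    """Hypothetical helper: peak traced memory (MB) while summing `count` random squares."""
    tracemalloc.start()
    if use_generator:
        # Values are produced and discarded one at a time.
        total = sum(randint(1, 100) ** 2 for _ in range(count))
    else:
        # The whole list exists in memory before sum() starts.
        total = sum([randint(1, 100) ** 2 for _ in range(count)])
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak / 10**6

gen_peak = peak_memory_mb(True)
list_peak = peak_memory_mb(False)
print(f"Generator peak: {gen_peak:.4f} MB")
print(f"List peak:      {list_peak:.4f} MB")
```

The exact figures depend on the Python version and platform, but the generator variant should stay far below the list variant because it never holds all values at once.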
