In [1]:
import time

# Introduction to Python  

## [Comprehensions](https://python-3-patterns-idioms-test.readthedocs.io/en/latest/Comprehensions.html) and [Generators](https://www.programiz.com/python-programming/generator)

# [Comprehensions](https://towardsdatascience.com/list-comprehensions-in-python-28d54c9286ca):

Python offers some nice "syntactic sugar" constructions that allow collections to be created in a clear and concise way.  
They are:  

- List comprehension
- Set comprehension
- Dict comprehension

## 1) List Comprehensions

A list comprehension returns a list whose elements are derived from another sequence or iterable, either modified or unchanged. A common use is to build a new list where each element is the result of some expression applied to each element of the original, or to build a list containing only the elements that satisfy some condition.  

The patterns are:  

    [ <expression> for <name> in <iterable or sequence> ]
    [ <expression> for <name> in <iterable or sequence> if <condition> ]
    [ <expression> if <condition> else <expression> for <name> in <iterable or sequence> ]
    

See the examples below:  

In [2]:
cubes = []
for elements in range(10):
    cubes.append(elements**3)

In [3]:
print(cubes)

[0, 1, 8, 27, 64, 125, 216, 343, 512, 729]


#### We can rewrite it in a more concise way using list comprehension:  

In [4]:
cubes2 = [elements**3 for elements in range(10)]

In [5]:
print(cubes2)

[0, 1, 8, 27, 64, 125, 216, 343, 512, 729]


#### Another example of the use of list comprehensions  
Let's sum the squares of the multiples of 3 or 5 up to and including 1000:  

In [6]:
%%time

my_list = []
for x in range(1001):
    if x%5 == 0 or x%3 == 0:
        my_list.append(x**2)
print(sum(my_list))

156390386
CPU times: user 812 µs, sys: 138 µs, total: 950 µs
Wall time: 922 µs


In [7]:
%%time

my_new_list = [x**2 for x in range(1001) if x%5 == 0 or x%3 == 0]
print(sum(my_new_list))

156390386
CPU times: user 1.96 ms, sys: 0 ns, total: 1.96 ms
Wall time: 1.66 ms
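
A single `%%time` run is noisy, so the two measurements above do not show which version is faster. A minimal sketch, assuming the standard-library _timeit_ module is acceptable here, that repeats each version many times (the function names `with_loop` and `with_comprehension` are made up for this illustration):

```python
import timeit

def with_loop():
    # Explicit loop with append, as in the first cell
    result = []
    for x in range(1001):
        if x % 5 == 0 or x % 3 == 0:
            result.append(x**2)
    return sum(result)

def with_comprehension():
    # Same computation written as a list comprehension
    return sum([x**2 for x in range(1001) if x % 5 == 0 or x % 3 == 0])

# Both versions must agree before timing them
assert with_loop() == with_comprehension()

# Run each version 200 times and report the total elapsed time
loop_time = timeit.timeit(with_loop, number=200)
comp_time = timeit.timeit(with_comprehension, number=200)
print(f"loop: {loop_time:.4f} s, comprehension: {comp_time:.4f} s")
```

The actual numbers depend on the machine and the Python version, so no expected timings are shown.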


In [8]:
print(type(my_new_list))

<class 'list'>


More examples:

+ Modifying a sequence  

In [9]:
sequence = range(11)
sequence

range(0, 11)

In [10]:
list(sequence)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

In [11]:
[element**2 for element in sequence if element%2==0]

[0, 4, 16, 36, 64, 100]
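
The third pattern listed at the start of this section (a conditional expression before the _for_) does not filter anything: it chooses a value for every element. A minimal sketch with made-up names, contrasted with a trailing _if_:

```python
# if/else BEFORE the for: every element is kept, the value is chosen
labels = ["even" if n % 2 == 0 else "odd" for n in range(6)]
print(labels)      # ['even', 'odd', 'even', 'odd', 'even', 'odd']

# if AFTER the for: elements that fail the test are dropped
evens_only = [n for n in range(6) if n % 2 == 0]
print(evens_only)  # [0, 2, 4]
```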

+ More than one variable

In [12]:
[x + y for x,y in [(9,4),(8,6),(2,9)]]

[13, 14, 11]

+ Expression that does not use the loop variable

In [13]:
%%time

print([3+2 for numero in range(4)])

[5, 5, 5, 5]
CPU times: user 715 µs, sys: 142 µs, total: 857 µs
Wall time: 930 µs


In [14]:
%%time

print(4 * [5])

[5, 5, 5, 5]
CPU times: user 891 µs, sys: 0 ns, total: 891 µs
Wall time: 1.08 ms


+ Filtering by type

In [15]:
my_list = [1, "4", 9, "a", 0, 4]
test_type = [i for i in my_list if isinstance(i, int)]
print(test_type)

[1, 9, 0, 4]


+ Filtering by type, two conditions:  

In [16]:
my_list = [1, "4", 9, "a", 0, 4]
test_type = [int(i) for i in my_list if isinstance(i, int) or (isinstance(i, str) and (i.isnumeric()))]
print(test_type)

[1, 4, 9, 0, 4]


+ Multiple loops  

In [17]:
%%time

points = []
for x in [1,2,3]:
    for y in [3,4,5]:
        if x != y:
            points.append((x,y))
print(points)

[(1, 3), (1, 4), (1, 5), (2, 3), (2, 4), (2, 5), (3, 4), (3, 5)]
CPU times: user 922 µs, sys: 0 ns, total: 922 µs
Wall time: 994 µs


In [18]:
%%time

points = [(x,y) for x in [1,2,3] for y in [3,4,5] if x != y]
print(points)

[(1, 3), (1, 4), (1, 5), (2, 3), (2, 4), (2, 5), (3, 4), (3, 5)]
CPU times: user 639 µs, sys: 128 µs, total: 767 µs
Wall time: 789 µs
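
In a comprehension with several _for_ clauses, the clauses are read left to right in the same order as the equivalent nested loops (outer loop first). A minimal sketch, using a made-up `matrix`, that flattens a list of lists:

```python
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Outer loop first (row), inner loop second (value)
flat = [value for row in matrix for value in row]
print(flat)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]

# The equivalent nested loops, for comparison
flat_loop = []
for row in matrix:
    for value in row:
        flat_loop.append(value)
```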


+ Using comprehensions to filter strings  

In [19]:
import string

In [20]:
print(string.punctuation)
print(string.digits)
print(string.ascii_letters)
print(string.ascii_lowercase)
print(string.ascii_uppercase)
print(string.hexdigits)

!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
0123456789
abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ
abcdefghijklmnopqrstuvwxyz
ABCDEFGHIJKLMNOPQRSTUVWXYZ
0123456789abcdefABCDEF


Now that we know the _string_ module, let's use it to clean strings with list comprehensions:

In [21]:
my_string = 'This is a string. It is full of surprises! Would you believe it? I can clean it...'

In [22]:
tokens = [token for token in my_string.split()]
print(tokens)

['This', 'is', 'a', 'string.', 'It', 'is', 'full', 'of', 'surprises!', 'Would', 'you', 'believe', 'it?', 'I', 'can', 'clean', 'it...']


In [23]:
lowercase_tokens = [token.lower() for token in my_string.split()]
print(lowercase_tokens)

['this', 'is', 'a', 'string.', 'it', 'is', 'full', 'of', 'surprises!', 'would', 'you', 'believe', 'it?', 'i', 'can', 'clean', 'it...']


In [24]:
lower_strip_tokens = [token.strip(string.punctuation).lower() for token in my_string.split()]
print(lower_strip_tokens)

['this', 'is', 'a', 'string', 'it', 'is', 'full', 'of', 'surprises', 'would', 'you', 'believe', 'it', 'i', 'can', 'clean', 'it']
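
Note that _strip_ only removes punctuation from the two ends of each token. If punctuation inside a token should go as well, one option (a sketch, not the only way) is _str.translate_ with a table built by _str.maketrans_. The sample `text` and the name `remove_punct` are made up for this illustration:

```python
import string

text = "Well-known idioms don't always survive strip()!"

# Translation table that deletes every punctuation character
remove_punct = str.maketrans("", "", string.punctuation)

# strip() cleans only the ends of each token
stripped = [t.strip(string.punctuation).lower() for t in text.split()]
print(stripped)    # ['well-known', 'idioms', "don't", 'always', 'survive', 'strip']

# translate() also removes punctuation inside the token
translated = [t.translate(remove_punct).lower() for t in text.split()]
print(translated)  # ['wellknown', 'idioms', 'dont', 'always', 'survive', 'strip']
```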


## 2) Set Comprehensions:


A set comprehension returns a set whose elements are derived from a sequence or iterable, either modified or unchanged. The only difference from a list comprehension is that the result is a set instead of a list, that is, it has no repeated elements and no guaranteed order.  

The pattern is:  

    name_of_set = { <expression> for <element> in <sequence> if <condition> }

See the examples below:  

In [25]:
vowels = {letter for letter in "python lovers" if letter in "aeiou"}
print(vowels)

{'o', 'e'}


In [26]:
consonants = {letter for letter in 'summer camp' if letter not in "aeiou" and letter != ' '}
print(consonants)

{'m', 'r', 's', 'c', 'p'}
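
To see the difference from a list comprehension, the same expression can be written with both kinds of brackets. A minimal sketch with a made-up list of words:

```python
words = ["apple", "pear", "plum", "fig", "kiwi", "lime"]

# List comprehension: one length per word, duplicates kept, order kept
lengths_list = [len(w) for w in words]
print(lengths_list)         # [5, 4, 4, 3, 4, 4]

# Set comprehension: each distinct length appears once, order not guaranteed
lengths_set = {len(w) for w in words}
print(sorted(lengths_set))  # [3, 4, 5]
```

The set is sorted before printing only to make the output predictable.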


## 3) Dict Comprehensions

A dict comprehension returns a dictionary whose entries are derived from a sequence or iterable, either modified or unchanged. The pattern varies a bit from list and set comprehensions, because the expression must produce both a key and a value (written _key: value_) for each element.

See the examples below: 

In [27]:
my_dict = {x:y for x,y in [('first',1),('second',2)]}
my_dict

{'first': 1, 'second': 2}

In [28]:
squares = {x:x**2 for x in range(4)}
print(squares)

{0: 0, 1: 1, 2: 4, 3: 9}
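
As with lists and sets, a dict comprehension accepts a trailing condition, following the shape `{ <key>: <value> for <name> in <iterable> if <condition> }`. A minimal sketch with a made-up list of words:

```python
words = ["comprehension", "set", "dict", "generator"]

# Map each word to its length, keeping only words longer than 4 characters
long_word_lengths = {w: len(w) for w in words if len(w) > 4}
print(long_word_lengths)  # {'comprehension': 13, 'generator': 9}
```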


+ Using _zip_ to pair up the elements of two sequences as tuples:  

In [29]:
{x:y for x,y in zip(range(0,7,2), range(1,8,2))}

{0: 1, 2: 3, 4: 5, 6: 7}

+ Three ways to do the same thing

In [30]:
%%time

new_dict = dict([('one',1),('two',2),('three',3)])
print(new_dict)

{'one': 1, 'two': 2, 'three': 3}
CPU times: user 215 µs, sys: 46 µs, total: 261 µs
Wall time: 194 µs


In [31]:
%%time

new_dict = {x:y for x,y in [('one',1),('two',2),('three',3)]}
print(new_dict)

{'one': 1, 'two': 2, 'three': 3}
CPU times: user 179 µs, sys: 39 µs, total: 218 µs
Wall time: 147 µs


In [32]:
%%time

la = ['one', 'two', 'three']
lb = [1 ,2 ,3]
new_dict = {x:y for x,y in zip(la,lb)}
print(new_dict)

{'one': 1, 'two': 2, 'three': 3}
CPU times: user 432 µs, sys: 98 µs, total: 530 µs
Wall time: 411 µs


+ Inverting key and value

In [33]:
new_dict

{'one': 1, 'two': 2, 'three': 3}

In [34]:
dict_new = {value:key for key, value in new_dict.items()}
dict_new

{1: 'one', 2: 'two', 3: 'three'}
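
Inverting works cleanly here because every value is unique. When values repeat, later keys silently overwrite earlier ones. A minimal sketch with a made-up `grades` dictionary, plus one possible way to keep all keys by grouping them in lists:

```python
grades = {"ana": 10, "bob": 8, "cid": 10}

# Plain inversion: 'ana' is lost because 'cid' has the same grade
inverted = {grade: name for name, grade in grades.items()}
print(inverted)  # {10: 'cid', 8: 'bob'}

# Grouping keeps every name that shares a grade
grouped = {}
for name, grade in grades.items():
    grouped.setdefault(grade, []).append(name)
print(grouped)   # {10: ['ana', 'cid'], 8: ['bob']}
```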

+ Increasing a value

In [35]:
plus_one = {key:value + 1 for key, value in new_dict.items()}
plus_one

{'one': 2, 'two': 3, 'three': 4}

+ Two ways to convert a list of tuples into a dictionary:  

In [36]:
list_of_tuples = [(2,3),(4,5),(1,7)]

In [37]:
%%time

dict(list_of_tuples)

CPU times: user 3 µs, sys: 0 ns, total: 3 µs
Wall time: 5.72 µs


{2: 3, 4: 5, 1: 7}

In [38]:
%%time

{key:value for key, value in list_of_tuples}

CPU times: user 4 µs, sys: 0 ns, total: 4 µs
Wall time: 6.2 µs


{2: 3, 4: 5, 1: 7}

### [Generator Expressions](https://www.programiz.com/python-programming/generator) 

Generator expressions are similar to generator functions, but they are written like list comprehensions, using parentheses instead of square brackets. The elements are produced lazily, one at a time, only when requested.   

In [39]:
generator1 = (x**(0.5) for x in range(10) if x%5 == 0 or x%3 == 0)
type(generator1)

generator

In [40]:
print(generator1)

<generator object <genexpr> at 0x7f57c86ac890>


One way to materialise the generator is to convert it to a list:

In [41]:
list(generator1)

[0.0, 1.7320508075688772, 2.23606797749979, 2.449489742783178, 3.0]

To get the elements of a generator one at a time, we can use the built-in function _next_

In [42]:
generator2 = (x**x for x in range(2, 99999999999999))  # A huge range: as a list, the results would not fit in memory.

In [43]:
next(generator2)

4

In [44]:
next(generator2)

27
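
The memory argument can be checked on a smaller scale. A rough sketch using _sys.getsizeof_, which reports only the size of the container object itself (not of the numbers it refers to), so the figures are indicative and vary between Python versions:

```python
import sys

# The list stores references to all 100,000 results at once
squares_list = [x**2 for x in range(100_000)]

# The generator stores only its current state
squares_gen = (x**2 for x in range(100_000))

print(sys.getsizeof(squares_list), "bytes for the list object")
print(sys.getsizeof(squares_gen), "bytes for the generator object")

# A generator expression can be passed straight to sum(),
# so the intermediate list is never built
total = sum(x**2 for x in range(100_000))
```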

A generator can be consumed only once. After _list_ has materialised it, the generator is exhausted, and any further call to _next_ raises _StopIteration_.  
The same happens when _next_ is called again after the last element has already been returned.

In [45]:
generator3 = (x**2 for x in range(10))

In [46]:
list(generator3)

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

In [47]:
next(generator3) #error

StopIteration: 

In [48]:
generator4 = (letter for letter in "abcd")

In [49]:
next(generator4)

'a'

In [50]:
next(generator4)

'b'

In [51]:
next(generator4)

'c'

In [52]:
next(generator4)

'd'

In [53]:
next(generator4) #error

StopIteration: 

In [54]:
generator4 = (letter for letter in "abcd")

In [55]:
while True:
    print(next(generator4))  #error

a
b
c
d


StopIteration: 
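
The loop above ends with an uncaught _StopIteration_. In practice there are a few common ways to avoid it; the sketch below (variable names are made up) shows three of them:

```python
# 1) A for loop handles StopIteration internally and simply stops
letters = (letter for letter in "abcd")
collected = []
for letter in letters:
    collected.append(letter)
print(collected)  # ['a', 'b', 'c', 'd']

# 2) next() accepts a default that is returned when the generator is exhausted
leftover = next(letters, None)
print(leftover)   # None

# 3) Catching the exception explicitly inside a while loop
letters = (letter for letter in "ab")
seen = []
while True:
    try:
        seen.append(next(letters))
    except StopIteration:
        break
print(seen)       # ['a', 'b']
```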

### Generator-like built-ins: _range*_, _zip_, _filter_, _map_ 

Note (*):
_range_ is a class of immutable sequence objects, not a generator. Like a list, a range is iterable but is not itself an iterator: you can't call _next_ directly on it; you first have to get an iterator with _iter_.
+ Ranges are immutable and hashable, so they can be used as dictionary keys.
+ Ranges have the _start_, _stop_ and _step_ attributes (since Python 3.3) and the _count_ and _index_ methods, and they support the _in_, _len_ and `__getitem__` operations.
+ You can iterate over the same range multiple times.


In [61]:
r = range(10)
type(r)

range

In [62]:
z = zip([1,2,3],[4,5,6],[6,7,8])
type(z)

zip

In [63]:
f = filter(str.isupper,['a','A','E','r'])
type(f)

filter

In [64]:
m = map(int,['1', '2', '3'])
type(m)

map
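
The difference between these objects can be checked directly. A minimal sketch using _collections.abc.Iterator_ (variable names are made up): _zip_, _map_ and _filter_ objects are iterators, so _next_ works on them and they can be consumed only once, while a _range_ needs _iter_ first and can be traversed repeatedly.

```python
from collections.abc import Iterator

z = zip([1, 2, 3], [4, 5, 6])
m = map(int, ['1', '2', '3'])
f = filter(str.isupper, ['a', 'A', 'E', 'r'])
r = range(3)

# zip, map and filter objects are iterators; a range is not
print(isinstance(z, Iterator), isinstance(m, Iterator), isinstance(f, Iterator))  # True True True
print(isinstance(r, Iterator))  # False

# next() works directly on an iterator and consumes it
first_pair = next(z)          # (1, 4)
remaining_pairs = list(z)     # [(2, 5), (3, 6)]
nothing_left = list(z)        # []

# Calling next() on a range raises TypeError; iter() is needed first
try:
    next(r)
    range_is_iterator = True
except TypeError:
    range_is_iterator = False
r_iter = iter(r)
first_number = next(r_iter)   # 0

# The same range can be traversed as many times as needed
twice = (list(r), list(r))    # ([0, 1, 2], [0, 1, 2])
```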