## This notebook covers the following topics:
- Relevance of 'iter' and 'next' while defining iterators
- Creating an iterator using closure
- **Generators**
    - **yield**
    - How to enable **stopiteration** using **yield**
    - Example of generator using Fibonacci series
- Making an iterable from a generator (that won't be exhausted)
- Beware of using a generator with another generator. Example using enumerate
- Generator expressions ( )
- **yield from** - Use case using the car-brands text files
- **Aggregators**
    - list, sum, min, max, all, any
- Slicing iterables - **islice**
- Selecting & filtering iterators - **filter** and **filterfalse**
- **dropwhile and takewhile**
- **compress**
- Infinite iterators - **count**, **cycle**, **repeat**
    

In [120]:
# Imports required for this notebook
from timeit import timeit
import math
from itertools import islice
from itertools import filterfalse
from itertools import takewhile, dropwhile
from itertools import compress
from itertools import (count,
                       cycle,
                       repeat)

We can write our own iterator as a class, as shown below.
- What do `__iter__` and `__next__` do here?
    - `__iter__` returns self, so the object is its own iterator. `__next__` then uses the state stored on the object (`self.i` and `self.n`) to produce the next value each time it is called.

In [8]:
import math

class Factiter:
    def __init__(self, n):
        self.n = n
        self.i = 0
        
    def __iter__(self):
        return self
    
    def __next__(self):
        if self.i >= self.n:
            raise StopIteration
        else:
            result = math.factorial(self.i)
            self.i += 1
            return result

In [13]:
fact = Factiter(5)

In [14]:
for num in fact:
    print(num)

1
1
2
6
24
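To make the roles of `iter` and `next` concrete, here is a minimal sketch of roughly what a `for` loop does behind the scenes. The `Countdown` class and the `manual_loop` helper are hypothetical names used only for this illustration.

```python
class Countdown:
    """Hypothetical iterator used only for this illustration."""
    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self                    # an iterator returns itself

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1


def manual_loop(iterable):
    """Roughly what `for item in iterable` does behind the scenes."""
    collected = []
    iterator = iter(iterable)          # calls iterable.__iter__()
    while True:
        try:
            item = next(iterator)      # calls iterator.__next__()
        except StopIteration:
            break                      # a for loop handles this exception for us
        collected.append(item)
    return collected


countdown_values = manual_loop(Countdown(3))
print(countdown_values)
```

The same helper works on any iterable, for example a list, because `iter()` asks the object for an iterator first.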


Another method: here Python creates the iterator for us. We use a closure together with the two-argument form of iter(), which keeps calling the function until it returns the sentinel value.

In [20]:
def factiter():
    i = 0
    def inner():
        nonlocal i
        result = math.factorial(i)
        i += 1
        return result
    return inner

In [21]:
fact = iter(factiter(), math.factorial(5)) # Here math.factorial(5) is the sentinel value

In [22]:
for num in fact:
    print(num)

1
1
2
6
24


In [24]:
for num in fact:  # Iterator is exhausted, so nothing is printed this time
    print(num)

## Yield

The yield statement is used almost like a return statement in a function - but there is a huge difference - when the yield statement is encountered, Python returns whatever value yield specifies, but it "pauses" execution of the function. We can then "call" the same function again and it will "resume" from where the last yield was encountered.

I say "call" because we do not "resume" the function by calling it - instead we use the function... next() !!!

In [25]:
def my_func():
    print('First Name:')
    yield('Anil')
    print('Second Name:')
    yield('Bhatt')

In [26]:
my_func()

<generator object my_func at 0x00000162440071C8>

As you can see above, just calling the function doesn't execute its body. Instead it returns a **generator** object.

So to execute it we need to use **next()**. But passing the function itself to next() won't work, as the error below shows.

In [33]:
next(my_func)

TypeError: 'function' object is not an iterator

We need to call the function, my_func(), and pass the result to next(). But every call to my_func() creates a brand new generator, so we keep getting the first yield statement only. How do we solve this?

In [35]:
next(my_func())

First Name:


'Anil'

In [36]:
next(my_func())

First Name:


'Anil'

We can solve this by calling the function once, keeping the generator object it returns, and then using that object with **next()**. It will iterate properly now.

In [38]:
gen_func = my_func()
gen_func

<generator object my_func at 0x0000016244B6CF48>

In [39]:
next(gen_func)

First Name:


'Anil'

In [40]:
next(gen_func)

Second Name:


'Bhatt'

In [42]:
next(gen_func)  # Raises StopIteration once the generator is exhausted

StopIteration: 

So generators are iterators, which means they must have `__iter__` and `__next__` methods defined.

In [45]:
'__iter__' in dir(gen_func)

True

In [46]:
'__next__' in dir(gen_func)

True

**One thing to note is that StopIteration was raised automatically once the generator function ran out of statements.**

In the example above, it seemed clear - when the function finished running - there were no more statements after that last yield.

What actually happens if a function finishes running and we don't explicitly return something?

Remember that Python fills in the gap, and returns None.

**In general, the iteration will terminate when we return something from the function.** Let us see this with a combination of 'yield' and 'return'.

In [57]:
def squares(sentinel):
    i = 0 
    while True:
        if i < sentinel:
            yield i**2
            i += 1
        else:
            return 'All Done'    # Invoking 'return' automatically stops iteration

In [58]:
sq = squares(3)

In [59]:
print(next(sq))
print(next(sq))
print(next(sq))
print(next(sq))  # Here the StopIteration message carries the returned value

0
1
4


StopIteration: All Done
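The returned object is not lost: it is attached to the StopIteration exception as its `value` attribute. The sketch below assumes a hypothetical `squares_with_return` generator that mirrors `squares` above, and shows how the value can be read when calling next() by hand.

```python
def squares_with_return(limit):
    """Hypothetical generator mirroring `squares`: yields, then returns."""
    for i in range(limit):
        yield i ** 2
    return 'All Done'


gen = squares_with_return(3)
yielded = []
while True:
    try:
        yielded.append(next(gen))
    except StopIteration as exc:
        return_value = exc.value       # the object passed to `return`
        break

print(yielded, return_value)

# A for loop or list() handles StopIteration internally and discards the value
as_list = list(squares_with_return(3))
print(as_list)
```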

Now let us use this knowledge to simplify the factorial iterator we wrote above.

In [60]:
def factiter(n):
    for i in range(n):
        yield math.factorial(i)

In [61]:
fact = factiter(3)

In [62]:
print(next(fact))
print(next(fact))
print(next(fact))
print(next(fact)) # Raises StopIteration automatically

1
1
2


StopIteration: 

In [63]:
fact = factiter(3)

In [64]:
for num in fact:  # Works fine with 'for' loop as well 
    print(num)

1
1
2


### Generator example using Fibonacci series

Let us first write a non-recursive function to calculate a Fibonacci number.

In [68]:
def fib(n):
    fib_0 = 1
    fib_1 = 1
    for i in range(n-1):
        fib_1, fib_0 = fib_0 + fib_1, fib_1
    return fib_1

In [69]:
[fib(i) for i in range(7)]

[1, 1, 2, 3, 5, 8, 13]

In [71]:
# Just to check that this also runs quickly for large n
from timeit import timeit

timeit('fib(5000)', globals=globals(), number=10)

0.013956000000689528

Now let us take an iterator approach that enables lazy evaluation. The first approach is to create our own iterable and iterator classes.

In [78]:
class Fib:
    def __init__(self, n):
        self.n = n
        
    def __iter__(self):
        return self.Fibiterator(self.n)
    
    class Fibiterator():
        def __init__(self, n):
            self.i = 0
            self.n = n
            
        def __iter__(self):
            return self
        
        def __next__(self):
            if self.i >= self.n:
                raise StopIteration
            else:
                result = fib(self.i)
                self.i += 1
                return result            

In [89]:
fib_iterable = Fib(7)

In [90]:
next(fib_iterable)

TypeError: 'Fib' object is not an iterator

In [80]:
for num in fib_iterable:  # Can iterate
    print(num)

1
1
2
3
5
8
13


In [82]:
for num in fib_iterable:  # Can iterate repeatedly also
    print(num)

1
1
2
3
5
8
13


Now let us build an iterator using a closure.

In [85]:
def fib_closure():
    i = 0
    def inner():
        nonlocal i 
        result = fib(i)
        i+= 1
        return result
    return inner

In [87]:
fib_numbers = fib_closure()
fib_iter = iter(fib_numbers, fib(7))
for num in fib_iter:
    print(num)

1
1
2
3
5
8
13


In [88]:
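# The closure keeps its counter, so a new sentinel iterator continues where the previous one stopped
fib_iter = iter(fib_numbers, fib(7))
next(fib_iter)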

34

***Now let us use 'yield' here***

In [91]:
def fib(n):
    fib_0 = 1
    fib_1 = 1
    for i in range(n-1):
        fib_1, fib_0 = fib_0 + fib_1, fib_1
        yield fib_1                         # Notice that yield is inside 'for' loop

In [92]:
[num for num in fib(7)]

[2, 3, 5, 8, 13, 21]

In [99]:
# we are missing 1, 1...so addressing that also
def fib(n):
    fib_0 = 1
    yield fib_0
    fib_1 = 1
    yield fib_1
    for i in range(n-1):
        fib_1, fib_0 = fib_0 + fib_1, fib_1
        yield fib_1     

In [100]:
[num for num in fib(7)]

[1, 1, 2, 3, 5, 8, 13, 21]

In [101]:
fib_gen = fib(7)

In [102]:
next(fib_gen), next(fib_gen), next(fib_gen), next(fib_gen), next(fib_gen), next(fib_gen), next(fib_gen)

(1, 1, 2, 3, 5, 8, 13)

In [103]:
next(fib_gen) # Not exhausted yet: fib(7) yields 8 values, so this returns the 8th (21); the call after this one raises StopIteration

21

In [104]:
# The generator approach also performs well (single run here, so not directly comparable with the number=10 timing above)
timeit('[num for num in fib(5000)]', globals=globals(), number=1)

0.035056700000495766

### Making an iterable from generator

As we now know, generators are iterators.

This means that they become exhausted - so sometimes we want to create an iterable instead.

There's no magic here, we simply have to implement a class that implements the iterable protocol:

In [105]:
# Step 1 - Let us write a simple generator function

def square_gen(n):
    for i in range(n):
        yield i**2

In [106]:
sq = square_gen(5)

In [107]:
# As of now this is an iterator, so exhaustible as shown below
list(sq)

[0, 1, 4, 9, 16]

In [108]:
list(sq)

[]

In [109]:
# Step 2 - Wrap this in an iterable
class Squares:
    def __init__(self, n):
        self.n = n
        
    def __iter__(self):
        return square_gen(self.n)

In [110]:
sq = Squares(5)

In [112]:
type(sq)

__main__.Squares

In [113]:
list(sq)

[0, 1, 4, 9, 16]

In [115]:
list(sq) # Now this is not exhausted, because each call to __iter__ creates a fresh generator

[0, 1, 4, 9, 16]

In [119]:
# Let us put the function also inside iterable
class Squares:
    def __init__(self, n):
        self.n = n
        
    @staticmethod        # a staticmethod does not receive self; it is a plain function kept inside the class
    def squares_gen(n):
        for i in range(n):
            yield i**2
        
    def __iter__(self):
        return Squares.squares_gen(self.n)

In [120]:
sq = Squares(5)

In [121]:
list(sq)

[0, 1, 4, 9, 16]

In [122]:
list(sq)

[0, 1, 4, 9, 16]

### Beware of using Generator inside another Generator

In [125]:
def squares(n):
    for i in range(n):
        yield i ** 2

In [131]:
sq = squares(5)

In [132]:
# Let us create an enumerator for our 'sq'. Please note that enumerate is also a lazy iterator. Below is the expected pairing of index and value
enum_sq = enumerate(sq)
for idx, num in enum_sq:
    print(idx, num)

0 0
1 1
2 4
3 9
4 16


In [137]:
# Let us define sq again
sq = squares(5)

In [138]:
# let us call sq few times
next(sq)

0

In [139]:
next(sq)

1

In [141]:
# Now let us wrap the partially consumed sq in enumerate; the indexes no longer line up with the original positions
enum_sq = enumerate(sq)

In [143]:
next(enum_sq) # The first call gives (0, 4) rather than (2, 4); the output below is from a second call, (1, 9) rather than (3, 9)

(1, 9)
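If some items really have been consumed already, one possible way to keep the counter aligned is the `start` argument of enumerate. This is only a sketch: it assumes we know how many items were taken, tracked here in a hypothetical `consumed` list.

```python
def squares(n):
    for i in range(n):
        yield i ** 2


sq = squares(5)
consumed = [next(sq), next(sq)]        # two items already taken: 0 and 1

# enumerate's `start` argument lets the counter pick up where we left off
realigned = list(enumerate(sq, start=len(consumed)))
print(realigned)
```

The simpler rule of thumb remains: wrap the generator in enumerate before consuming anything from it.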

### Generator expressions

The expression inside the [] brackets is called a **comprehension expression**.

The [] brackets resulted in a list being created.

We can easily create a generator by using () parentheses instead of the [] brackets:

In [144]:
g = (i**2 for i in range(3))

In [145]:
type(g)

generator

In [146]:
for num in g:
    print(num)

0
1
4


In [148]:
list(g)  # a generator expression is an iterator, so it is exhausted as well

[]
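The practical difference between the two bracket styles is when the values are computed. The following sketch compares the container sizes reported by `sys.getsizeof` (exact numbers vary by Python version, so only the comparison is shown) and feeds a generator expression straight into an aggregator.

```python
import sys

list_version = [i ** 2 for i in range(10_000)]    # all values built up front
gen_version = (i ** 2 for i in range(10_000))     # nothing computed yet

# The generator object stays small no matter how many items it will produce
list_is_bigger = sys.getsizeof(list_version) > sys.getsizeof(gen_version)

# When a generator expression is the only argument, the extra () can be dropped
total = sum(i ** 2 for i in range(4))
print(list_is_bigger, total)
```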

### yield from

In [1]:
# Let us create a nested generator expression as below

def matrix(n):
    gen = ( (i*j for j in range(1, n+1))
             for i in range(1, n+1) )
    return gen

In [2]:
m = list(matrix(5))

In [3]:
m

[<generator object matrix.<locals>.<genexpr>.<genexpr> at 0x00000245EF291BC8>,
 <generator object matrix.<locals>.<genexpr>.<genexpr> at 0x00000245EF291C48>,
 <generator object matrix.<locals>.<genexpr>.<genexpr> at 0x00000245EF291CC8>,
 <generator object matrix.<locals>.<genexpr>.<genexpr> at 0x00000245EF291D48>,
 <generator object matrix.<locals>.<genexpr>.<genexpr> at 0x00000245EF291DC8>]

In [6]:
# so to get value we have to loop over 
def matrix_iterator(n):
    for row in matrix(n):
        for item in row:
            yield item

In [9]:
for i in matrix_iterator(2):
    print(i)

1
2
2
4


In [10]:
# But we can avoid nested for loop using 'yield from'

def matrix_iterator(n):
    for row in matrix(n):
        yield from row

In [11]:
for i in matrix_iterator(2):
    print(i)

1
2
2
4


For plain iteration, **yield from** is shorthand for a loop that yields each item:

    yield from iterator

is equivalent to

    for i in iterator:
        yield i
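There is one difference worth knowing beyond plain iteration: the `yield from` expression evaluates to the value returned by the inner generator, which an explicit loop discards. A small sketch with hypothetical `inner` and `outer_*` generators:

```python
def inner():
    yield 1
    yield 2
    return 'inner done'      # becomes the value of the `yield from` expression


def outer_with_loop():
    for item in inner():     # explicit loop: the return value is lost
        yield item


def outer_with_yield_from():
    result = yield from inner()
    yield result             # pass the captured return value along


loop_items = list(outer_with_loop())
delegated_items = list(outer_with_yield_from())
print(loop_items)
print(delegated_items)
```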
    

In [12]:
#### Let us check another example where this will be useful

In [26]:
car1_path = r'C:\Users\annbhatt\Desktop\DS_AI\EPAI_P1\S15\car-brands-1.txt'
car2_path = r'C:\Users\annbhatt\Desktop\DS_AI\EPAI_P1\S15\car-brands-2.txt'
car3_path = r'C:\Users\annbhatt\Desktop\DS_AI\EPAI_P1\S15\car-brands-3.txt'

In [27]:
brands = []

with open(car1_path) as f:
    for brand in f:
        brands.append(brand.strip('\n'))
        
with open(car2_path) as f:
    for brand in f:
        brands.append(brand.strip('\n'))
        
with open(car3_path) as f:
    for brand in f:
        brands.append(brand.strip('\n'))        

In [28]:
for brand in brands:
    print(brand, end=', ')

Alfa Romeo, Aston Martin, Audi, Bentley, Benz, BMW, Bugatti, Cadillac, Chevrolet, Chrysler, Citroën, Corvette, DAF, Dacia, Daewoo, Daihatsu, Datsun, De Lorean, Dino, Dodge, Farboud, Ferrari, Fiat, Ford, Honda, Hummer, Hyundai, Jaguar, Jeep, KIA, Koenigsegg, Lada, Lamborghini, Lancia, Land Rover, Lexus, Ligier, Lincoln, Lotus, Martini, Maserati, Maybach, Mazda, McLaren, Mercedes-Benz, Mini, Mitsubishi, Nissan, Noble, Opel, Peugeot, Pontiac, Porsche, Renault, Rolls-Royce, Saab, Seat, Å koda, Smart, Spyker, Subaru, Suzuki, Toyota, Vauxhall, Volkswagen, Volvo, 

In the above approach, we load the entire dataset into memory. What if we want an iterator approach instead? We can use generators.

In [23]:
car_paths = [r'C:\Users\annbhatt\Desktop\DS_AI\EPAI_P1\S15\car-brands-1.txt', 
             r'C:\Users\annbhatt\Desktop\DS_AI\EPAI_P1\S15\car-brands-2.txt',
             r'C:\Users\annbhatt\Desktop\DS_AI\EPAI_P1\S15\car-brands-3.txt']

In [30]:
def brands(*files):
    for file_path in files:
        with open(file_path) as f:
            for line in f:
                yield line.strip('\n')

In [31]:
for brand in brands(*car_paths):
    print(brand, end = ', ')

Alfa Romeo, Aston Martin, Audi, Bentley, Benz, BMW, Bugatti, Cadillac, Chevrolet, Chrysler, Citroën, Corvette, DAF, Dacia, Daewoo, Daihatsu, Datsun, De Lorean, Dino, Dodge, Farboud, Ferrari, Fiat, Ford, Honda, Hummer, Hyundai, Jaguar, Jeep, KIA, Koenigsegg, Lada, Lamborghini, Lancia, Land Rover, Lexus, Ligier, Lincoln, Lotus, Martini, Maserati, Maybach, Mazda, McLaren, Mercedes-Benz, Mini, Mitsubishi, Nissan, Noble, Opel, Peugeot, Pontiac, Porsche, Renault, Rolls-Royce, Saab, Seat, Å koda, Smart, Spyker, Subaru, Suzuki, Toyota, Vauxhall, Volkswagen, Volvo, 

In [34]:
# We can simplify above function using yield from
def brands(*files):
    for file_path in files:
        with open(file_path) as f:
            yield from f

In [35]:
for brand in brands(*car_paths):
    print(brand, end = ', ')

Alfa Romeo
, Aston Martin
, Audi
, Bentley
, Benz
, BMW
, Bugatti
, Cadillac
, Chevrolet
, Chrysler
, Citroën
, Corvette
, DAF
, Dacia
, Daewoo
, Daihatsu
, Datsun
, De Lorean
, Dino
, Dodge, Farboud
, Ferrari
, Fiat
, Ford
, Honda
, Hummer
, Hyundai
, Jaguar
, Jeep
, KIA
, Koenigsegg
, Lada
, Lamborghini
, Lancia
, Land Rover
, Lexus
, Ligier
, Lincoln
, Lotus
, Martini, Maserati
, Maybach
, Mazda
, McLaren
, Mercedes-Benz
, Mini
, Mitsubishi
, Nissan
, Noble
, Opel
, Peugeot
, Pontiac
, Porsche
, Renault
, Rolls-Royce
, Saab
, Seat
, Å koda
, Smart
, Spyker
, Subaru
, Suzuki
, Toyota
, Vauxhall
, Volkswagen
, Volvo, 

The problem is that we can't call 'strip' on each line here, so we get the raw lines, including their trailing newlines, as above. Let us address that too.

The generator function below cleans each line of a single file before yielding it:

In [38]:
def gen_clean_read(file):
    with open(file) as f:
        for line in f:
            yield line.strip('\n')

In [39]:
# Now let us integrate this to brands function

def brands(*files):
    for file_path in files:
        yield from gen_clean_read(file_path)

In [40]:
for brand in brands(*car_paths):
    print(brand, end = ', ')

Alfa Romeo, Aston Martin, Audi, Bentley, Benz, BMW, Bugatti, Cadillac, Chevrolet, Chrysler, Citroën, Corvette, DAF, Dacia, Daewoo, Daihatsu, Datsun, De Lorean, Dino, Dodge, Farboud, Ferrari, Fiat, Ford, Honda, Hummer, Hyundai, Jaguar, Jeep, KIA, Koenigsegg, Lada, Lamborghini, Lancia, Land Rover, Lexus, Ligier, Lincoln, Lotus, Martini, Maserati, Maybach, Mazda, McLaren, Mercedes-Benz, Mini, Mitsubishi, Nissan, Noble, Opel, Peugeot, Pontiac, Porsche, Renault, Rolls-Royce, Saab, Seat, Å koda, Smart, Spyker, Subaru, Suzuki, Toyota, Vauxhall, Volkswagen, Volvo, 
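The paths above are specific to one machine, so here is a self-contained sketch of the same pattern that writes two small temporary files first. The file names and contents are made up for illustration, and the explicit `encoding='utf-8'` is an assumption rather than something the original cells use.

```python
import os
import tempfile


def gen_clean_read(file):
    with open(file, encoding='utf-8') as f:
        for line in f:
            yield line.strip('\n')


def brands(*files):
    for file_path in files:
        yield from gen_clean_read(file_path)   # delegate to the per-file generator


with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical sample data; the second file has no trailing newline
    sample = {'a.txt': 'Audi\nBMW\n', 'b.txt': 'Fiat\nFord'}
    paths = []
    for name, text in sample.items():
        path = os.path.join(tmp_dir, name)
        with open(path, 'w', encoding='utf-8') as f:
            f.write(text)
        paths.append(path)
    all_brands = list(brands(*paths))          # consume while the files still exist

print(all_brands)
```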

### Aggregators

In [41]:
def squares(n):
    for i in range(n):
        yield i**2

In [42]:
list(squares(5))

[0, 1, 4, 9, 16]

In [43]:
max(squares(5))

16

In [44]:
min(squares(5))

0

In [45]:
with open(car1_path) as f:
    for row in f:
        print(len(row), row, end='')

11 Alfa Romeo
13 Aston Martin
5 Audi
8 Bentley
5 Benz
4 BMW
8 Bugatti
9 Cadillac
10 Chevrolet
9 Chrysler
8 Citroën
9 Corvette
4 DAF
6 Dacia
7 Daewoo
9 Daihatsu
7 Datsun
10 De Lorean
5 Dino
5 Dodge

In [46]:
# Let us see if all our car brands are at least 3 chars long. We can use the 'all' aggregator

with open(car1_path) as f:
    # strip the trailing newline so it is not counted as a character
    result = all(map(lambda row: len(row.strip('\n')) >= 3, f))
print(result)

True


In [53]:
# Let us see if any of our car brands are > 15 chars long. We can use the 'any' aggregator
with open(car1_path) as f:
    # strip the trailing newline so it is not counted as a character
    result = any(map(lambda row: len(row.strip('\n')) > 15, f))
print(result)

False
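The `sum` aggregator works the same way, and `any` and `all` stop early as soon as the answer is known. A short sketch reusing the `squares` generator from this section shows both points, including the fact that an early stop leaves the iterator partially consumed.

```python
def squares(n):
    for i in range(n):
        yield i ** 2


total = sum(squares(5))             # 0 + 1 + 4 + 9 + 16

gen = squares(5)
found = any(x > 3 for x in gen)     # stops at the first match, which is 4
leftover = list(gen)                # items that any() never looked at
print(total, found, leftover)
```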


### Slicing iterables - islice

In [55]:
# We know sequence types can be sliced

lst = [1, 2, 3,4, 5]
lst [2:4]

[3, 4]

In [56]:
# But this won't work for an iterable that is NOT a sequence type

import math

def factorials(n):
    for i in range(n):
        yield math.factorial(i)

In [61]:
facts = factorials(10)

In [62]:
facts[0:2]

TypeError: 'generator' object is not subscriptable

In [63]:
# We have to employ a for loop to print out
for i in facts:
    print(i)

1
1
2
6
24
120
720
5040
40320
362880


In [64]:
# We can use islice here

from itertools import islice

islice(factorials(10), 0, 3)

<itertools.islice at 0x245f1102408>

In [65]:
# islice returns a lazy iterator, so we have to consume it (e.g. with list) to see the results

list(islice(factorials(10), 0, 3))

[1, 1, 2]

In [68]:
# We can use a step value as well

list(islice(factorials(10), 0, 10, 2))

[1, 2, 24, 720, 40320]

In [69]:
# islice can be extremely useful with infinite iterators as below

def factorials():
    index = 0
    while True:
        yield math.factorial(index)
        index += 1

In [75]:
# Without islice, we would have to call next() a fixed number of times, as below

facts = factorials()
for _ in range(3):
    print(next(facts))

1
1
2


In [76]:
# But now we can use islice & even choose slices in between

list(islice(factorials(), 5, 9))

[120, 720, 5040, 40320]
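One caveat: islice does not copy anything, it advances the underlying iterator. Skipped items are still computed and thrown away, and anything taken through islice is gone from the source. A minimal sketch with the infinite `factorials` generator:

```python
import math
from itertools import islice


def factorials():
    index = 0
    while True:
        yield math.factorial(index)
        index += 1


facts = factorials()
first_three = list(islice(facts, 3))   # consumes 0!, 1! and 2!
next_value = next(facts)               # the source continues with 3!
print(first_three, next_value)
```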

### Selecting & filtering iterators - **filter** and **filterfalse**

You should already be aware of the Python built-in function filter.

Remember that the filter function can work with any iterable, including of course iterators and generators.

Let's see a quick example:

In [82]:
def gen_cubes(n):
    for i in range(n):
        print(f'yielding {i}')
        yield i**3

Now let's say we only want to use cubes that are odd.

We need a function that will return a True if the number is odd, False otherwise. (This is technically called a **predicate** by the way - any function that given an input returns True or False is called a **predicate**)

In [83]:
def is_odd(x):
    return x%2 == 1

Now we can use that function (or we could have just used a lambda as well) with the filter function.

Note that the filter function is also **lazy**.

In [84]:
filtered = filter(is_odd, gen_cubes(10))

In [85]:
filtered  # Doesn't return any results yet, since filter is lazy

<filter at 0x245f12e3988>

In [86]:
# We can however iterate through it:
list(filtered)

yielding 0
yielding 1
yielding 2
yielding 3
yielding 4
yielding 5
yielding 6
yielding 7
yielding 8
yielding 9


[1, 27, 125, 343, 729]

As we can see, filter drops any values for which the predicate is False.

We could easily reverse this to return not-odd (i.e. even) values. But we had to create a new function.

In [87]:
def is_even(x):
    return x%2 == 0

In [88]:
list(filter(is_even, gen_cubes(10)))

yielding 0
yielding 1
yielding 2
yielding 3
yielding 4
yielding 5
yielding 6
yielding 7
yielding 8
yielding 9


[0, 8, 64, 216, 512]

Instead of creating a new function, we could use the **filterfalse** function in the itertools module, which does the same work as filter but retains values where the predicate is False (instead of True as the filter function does).

The filterfalse function also uses lazy evaluation.

In [89]:
from itertools import filterfalse

In [91]:
evens = filterfalse(is_odd, gen_cubes(10))
evens

<itertools.filterfalse at 0x245f129b688>

In [92]:
list(evens)

yielding 0
yielding 1
yielding 2
yielding 3
yielding 4
yielding 5
yielding 6
yielding 7
yielding 8
yielding 9


[0, 8, 64, 216, 512]

### dropwhile and takewhile

The **takewhile** function in the itertools module will yield elements from an iterable, **as long as a specific criterion (the predicate) is True.**

As soon as the predicate is False, iteration stops, even if subsequent elements would have satisfied the predicate. This is not a filter: it simply iterates over an iterable for as long as the predicate remains True.

As we might expect, this function also uses **lazy evaluation.**

In [94]:
# Let us create a function to generate the sine of n evenly spaced values between 0 and 2*pi

from math import sin, pi

def sine_wave(n):
    start = 0
    max_  = 2*pi
    step  = (max_ - start)/(n-1)
    
    for _ in range(n):
        yield round(sin(start), 2)
        start += step

In [101]:
list(sine_wave(15))

[0.0,
 0.43,
 0.78,
 0.97,
 0.97,
 0.78,
 0.43,
 0.0,
 -0.43,
 -0.78,
 -0.97,
 -0.97,
 -0.78,
 -0.43,
 -0.0]

In [102]:
from itertools import takewhile

list(takewhile(lambda x: 0 <= x <=0.9, sine_wave(15)))

[0.0, 0.43, 0.78]

As you can see, iteration stopped at 0.78, even though later values would also have satisfied the predicate. This is different from the filter function:

In [103]:
list(filter(lambda x: 0 <= x <=0.9, sine_wave(15)))

[0.0, 0.43, 0.78, 0.78, 0.43, 0.0, -0.0]

The **dropwhile** function on the other hand **starts the iteration once the predicate becomes False:**

In [108]:
from itertools import dropwhile

list(dropwhile(lambda x:0 <= x <= 0.97, sine_wave(15)))

[-0.43, -0.78, -0.97, -0.97, -0.78, -0.43, -0.0]

In [110]:
l = [1, 3, 5, 2, 1]

list(dropwhile(lambda x : x <=3, l))

[5, 2, 1]

As you can see the iterable skipped 1 and 3 and started the iteration once the predicate was False. Once the iteration begins, it no longer checks the predicate, and so we ended up with 5 and 2 and 1 in the iteration.
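The behaviour of both functions can be approximated in a few lines of plain Python. The `my_takewhile` and `my_dropwhile` names below are hypothetical; this is a sketch of the idea, not the actual itertools implementation.

```python
from itertools import dropwhile, takewhile


def my_takewhile(predicate, iterable):
    for item in iterable:
        if not predicate(item):
            return                 # first failure ends the iteration for good
        yield item


def my_dropwhile(predicate, iterable):
    iterator = iter(iterable)
    for item in iterator:
        if not predicate(item):
            yield item             # first failure is the first item we keep
            break
    yield from iterator            # the predicate is never checked again


def small(x):
    return x <= 3


data = [1, 3, 5, 2, 1]
taken = list(my_takewhile(small, data))
dropped = list(my_dropwhile(small, data))
print(taken, dropped)
```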

### Compress

The compress function is essentially a filter that takes two iterables as parameters. The first argument is the iterable (data) that will be filtered, and the second iterable contains elements (selectors), possibly of different length than the iterable being filtered. As always in Python, any object has an associated truth value, and the selectors therefore each have a truth value as well.

The resulting iterator yields elements from the data iterable where the selector at the same "position" is truthy.

A simple analogous way to look at it uses the zip function:

In [112]:
data = ['a', 'b', 'c', 'd', 'e']
selectors = [True, False, 1, 0]

In [113]:
list(zip(data,selectors))

[('a', True), ('b', False), ('c', 1), ('d', 0)]

And only retain the elements where the second value in the tuple is truthy:

In [114]:
[item for item, truth_value in zip(data, selectors) if truth_value]

['a', 'c']

The compress function works the same way, except that it is evaluated lazily and returns an iterator:

In [116]:
from itertools import compress

list(compress(data, selectors))

['a', 'c']
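Selectors do not have to be booleans or even a list. As a small sketch with made-up data (the `names` and `in_stock` values are hypothetical), any truthy or falsy values work, and the selectors can themselves come from a lazy generator expression.

```python
from itertools import compress

names = ['Audi', 'BMW', 'Fiat', 'Ford']
in_stock = [3, 0, 5, 0]            # hypothetical stock counts; 0 is falsy

available = list(compress(names, in_stock))

# Selectors computed lazily from the data itself
long_names = list(compress(names, (len(n) > 3 for n in names)))
print(available, long_names)
```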

### Infinite Iterators - count, cycle, repeat

#### count

The count function is similar to range, except it does not have a stop value. **It has both a start and a step:**

In [119]:
from itertools import (count,
                       cycle,
                       repeat)

In [121]:
g = count(10)

In [122]:
list(islice(g, 5))

[10, 11, 12, 13, 14]

In [123]:
g = count(10, step=2)

In [124]:
list(islice(g, 5))

[10, 12, 14, 16, 18]

Unlike the range function, whose arguments must always be integers, count works with floats and complex numbers as well:

In [125]:
g = count(10.5, 0.5)

In [126]:
list(islice(g, 5))

[10.5, 11.0, 11.5, 12.0, 12.5]

In [127]:
g = count(1+1j, 1+2j)

In [128]:
list(islice(g, 5))

[(1+1j), (2+3j), (3+5j), (4+7j), (5+9j)]
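Because count never stops, it is usually paired with something that does. Two common pairings, sketched here with made-up data, are zip (which ends with the shorter input) and takewhile (which ends when a condition fails).

```python
from itertools import count, takewhile

labels = ['red', 'green', 'blue']
numbered = list(zip(count(1), labels))    # zip stops when labels runs out

evens_below_10 = list(takewhile(lambda x: x < 10, count(0, 2)))
print(numbered, evens_below_10)
```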

#### Cycle

cycle is used to repeatedly loop over an iterable:

In [129]:
g = cycle(('red', 'green', 'blue'))

In [130]:
list(islice(g, 8))

['red', 'green', 'blue', 'red', 'green', 'blue', 'red', 'green']

One thing to note is that this works even if the argument is an iterator (i.e. gets exhausted after the first complete iteration over it)!

In [131]:
def colors():
    yield 'red'
    yield 'green'
    yield 'blue'

In [132]:
cols = colors()

In [133]:
list(cols)

['red', 'green', 'blue']

In [134]:
list(cols)

[]

As expected, cols was exhausted after the first iteration.

Now let's see how cycle behaves:

In [135]:
cols = colors()
g = cycle(cols)

In [136]:
list(islice(g, 10))

['red', 'green', 'blue', 'red', 'green', 'blue', 'red', 'green', 'blue', 'red']

In [137]:
list(islice(g, 10))

['green',
 'blue',
 'red',
 'green',
 'blue',
 'red',
 'green',
 'blue',
 'red',
 'green']
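cycle can keep going after an iterator is exhausted because it saves a copy of each item during the first pass, so memory use grows with the length of the input. A typical use is round-robin assignment, sketched below with hypothetical task and worker names.

```python
from itertools import cycle

tasks = ['t1', 't2', 't3', 't4', 't5']
workers = cycle(['alice', 'bob'])         # hypothetical worker names

# zip ends with the finite list, so the infinite cycle is safe here
assignment = list(zip(tasks, workers))
print(assignment)
```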

#### Repeat
The repeat function is used to create an iterator that just returns the same value again and again. By default it is infinite, but a count can be specified optionally:

In [138]:
g = repeat('Python')
for _ in range(5):
    print(next(g))

Python
Python
Python
Python
Python


In [140]:
g = repeat('Python', 4)
g

repeat('Python', 4)

In [141]:
list(g)

['Python', 'Python', 'Python', 'Python']

In [142]:
l = [1, 2, 3]
result = list(repeat(l, 3))
result

[[1, 2, 3], [1, 2, 3], [1, 2, 3]]
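Note that repeat yields the same object every time rather than a copy. With a mutable value such as a list, a change to the original is visible through every element. The sketch below demonstrates this and shows one way to get independent copies instead.

```python
from itertools import repeat

l = [1, 2, 3]
result = list(repeat(l, 3))
same_object = all(item is l for item in result)   # every element is l itself

l.append(4)                                       # the change shows up three times
print(same_object, result)

# Independent copies, if that is what is needed
independent = [list(l) for _ in range(3)]
independent[0].append(99)
print(independent)
```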