### Zipping

We've already used the `zip` function quite a bit.

It zips up any number of iterables and yields tuples containing elements from all of them in "parallel". It is also lazy, and it stops as soon as the shortest iterable is exhausted.

Let's look at a simple example:

In [1]:
l1 = [1, 2, 3, 4, 5]
l2 = [1, 2, 3, 4]
l3 = [1, 2, 3]

In [2]:
list(zip(l1, l2, l3))

[(1, 1, 1), (2, 2, 2), (3, 3, 3)]

As you can see, the shortest iterable we provided to the `zip` function had a length of 3 (so it reached the end of iteration first), and our output therefore only had 3 tuples in it.
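The laziness mentioned earlier is easy to check for ourselves. The short sketch below (the variable names `zipped`, `first`, `rest` and `leftover` are just illustrative) shows that `zip` returns an iterator that produces tuples on demand and can only be consumed once:

```python
l1 = [1, 2, 3, 4, 5]
l2 = [1, 2, 3, 4]
l3 = [1, 2, 3]

# Creating the zip object does not build any tuples yet
zipped = zip(l1, l2, l3)

# A zip object is an iterator: iter() returns the object itself
is_iterator = iter(zipped) is zipped

# Tuples are produced one at a time, on demand
first = next(zipped)

# Consuming the rest stops when the shortest list (l3) runs out
rest = list(zipped)

# The iterator is now exhausted, so nothing is left
leftover = list(zipped)

print(is_iterator, first, rest, leftover)
```

Because the zip object is exhausted after a single pass, we need to call `zip` again if we want to iterate over the same data a second time.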

Of course, this works with iterators and generators too:

In [3]:
def integers(n):
    for i in range(n):
        yield i
        
def squares(n):
    for i in range(n):
        yield i**2
        
def cubes(n):
    for i in range(n):
        yield i**3

In [4]:
iter1 = integers(6)
iter2 = squares(5)
iter3 = cubes(4)

In [5]:
list(zip(iter1, iter2, iter3))

[(0, 0, 0), (1, 1, 1), (2, 4, 8), (3, 9, 27)]

Sometimes we want to zip up iterables but iterate all of them completely, rather than stop at the shortest. Of course, the problem is what to do with the iterables that are exhausted before the longest one.

Simple, we just need to provide a default "filler" value.

And that's how the `zip_longest` function from `itertools` works:

In [6]:
from itertools import zip_longest

In [7]:
help(zip_longest)

Help on class zip_longest in module itertools:

class zip_longest(builtins.object)
 |  zip_longest(iter1 [,iter2 [...]], [fillvalue=None]) --> zip_longest object
 |  
 |  Return a zip_longest object whose .__next__() method returns a tuple where
 |  the i-th element comes from the i-th iterable argument.  The .__next__()
 |  method continues until the longest iterable in the argument sequence
 |  is exhausted and then it raises StopIteration.  When the shorter iterables
 |  are exhausted, the fillvalue is substituted in their place.  The fillvalue
 |  defaults to None or can be specified by a keyword argument.
 |  
 |  Methods defined here:
 |  
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |  
 |  __iter__(self, /)
 |      Implement iter(self).
 |  
 |  __new__(*args, **kwargs) from builtins.type
 |      Create and return a new object.  See help(type) for accurate signature.
 |  
 |  __next__(self, /)
 |      Implement next(self).
 |  
 |  __reduce__(...)
 |

As you can see, we can only specify a single default value. This means that the same default will be used for every provided iterable once it has been fully iterated.

As expected, `zip_longest` is lazy: it yields its tuples one at a time.
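We can confirm this with a small sketch that pulls values out of a `zip_longest` object by hand, using `next`. The inputs and the `'-'` fill value here are arbitrary choices for illustration:

```python
from itertools import zip_longest

# Two iterables of different lengths: a list of 3 items and a string of 2
zl = zip_longest([1, 2, 3], 'ab', fillvalue='-')

# Nothing is computed until we ask for a value
first = next(zl)

# The remaining tuples; the shorter iterable is padded with the fill value
remaining = list(zl)

print(first)
print(remaining)
```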

Let's see an example:

In [8]:
l1 = [1, 2, 3, 4, 5]
l2 = [1, 2, 3, 4]
l3 = [1, 2, 3]

In [9]:
list(zip_longest(l1, l2, l3, fillvalue='N/A'))

[(1, 1, 1), (2, 2, 2), (3, 3, 3), (4, 4, 'N/A'), (5, 'N/A', 'N/A')]

Of course, since this zips over the longest iterable, beware of using an infinite iterable!

You don't have to worry about this with the normal `zip` function as long as at least one of the iterables is finite:

In [10]:
def squares():
    i = 0
    while True:
        yield i ** 2
        i += 1

def cubes():
    i = 0
    while True:
        yield i ** 3
        i += 1

Obviously `squares` and `cubes` both produce infinite iterators. But we can still zip them with a finite iterable:

In [11]:
iter1 = squares()
iter2 = cubes()
list(zip(range(10), iter1, iter2))

[(0, 0, 0),
 (1, 1, 1),
 (2, 4, 8),
 (3, 9, 27),
 (4, 16, 64),
 (5, 25, 125),
 (6, 36, 216),
 (7, 49, 343),
 (8, 64, 512),
 (9, 81, 729)]

Don't try the same thing with `zip_longest`!
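If we really do need `zip_longest` with an infinite iterable, one possible approach (a sketch, not the only option) is to cap the result with `islice` from `itertools`, so that we never try to consume the whole thing. The limit of 5 and the `'N/A'` fill value below are arbitrary choices for illustration:

```python
from itertools import islice, zip_longest

def squares():
    # Infinite generator of squares: 0, 1, 4, 9, ...
    i = 0
    while True:
        yield i ** 2
        i += 1

# zip_longest on its own would never stop here, because squares() is infinite.
# islice lazily takes only the first 5 tuples, so the iteration is bounded.
bounded = islice(zip_longest(range(3), squares(), fillvalue='N/A'), 5)

result = list(bounded)
print(result)
```

Once `range(3)` is exhausted, `zip_longest` keeps going and substitutes the fill value in its place, which is exactly why an upper bound is needed.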