# Zips

## for combining iterables

by Koenraad De Smedt at UiB

---
The `zip` function 'zips' together consecutive elements of two or more sequences or other iterable data. Zipping continues until the shortest of the sequences is exhausted. Zips are useful for many purposes, such as computing *n-grams*, as will be shown in a later notebook.

<center>
<img src="https://git.app.uib.no/desmedt/teaching/-/raw/main/zip.png" alt = "zipper" width = 250px>
</center>

N.B. This construction is not to be confused with *.zip* as a compressed file format.

---


Let's zip together some letters and numbers.

In [None]:
letters = 'abcd'
numbers = [1, 2, 3, 4, 5]
z = zip(letters, numbers)
z

Zips are iterators that produce tuples. We can take the next element with `next` until the zip is exhausted.

In [None]:
print(next(z))
print(next(z))
print(next(z))
print(next(z))
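
What happens once the zip is exhausted may be worth seeing. The following is a minimal sketch: a further call to `next` raises `StopIteration`, unless a default value is passed as a second argument. The names `consumed`, `exhausted` and `fallback` are only illustrative.

```python
letters = 'abcd'
numbers = [1, 2, 3, 4, 5]
z = zip(letters, numbers)

# Take the four tuples that the zip can produce.
consumed = [next(z) for _ in range(4)]

# A fifth call finds nothing left and raises StopIteration.
try:
    next(z)
    exhausted = False
except StopIteration:
    exhausted = True

# With a default as second argument, next returns it instead of raising.
fallback = next(z, None)

print(consumed)
print(exhausted)
print(fallback)
```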

Instead of taking elements one by one, we can put all remaining tuples from a zip into a list. After that, the iterator is exhausted.

In [None]:
z = zip(letters, numbers)
list(z)
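
Note that the number 5 does not appear in the result, because `letters` runs out first. If the leftover elements matter, one option is `zip_longest` from the standard `itertools` module, sketched below. The fill value `'?'` is an arbitrary choice for illustration.

```python
from itertools import zip_longest

letters = 'abcd'
numbers = [1, 2, 3, 4, 5]

# Plain zip stops at the shorter input, so 5 is silently dropped.
short = list(zip(letters, numbers))

# zip_longest continues to the longest input and pads with fillvalue.
padded = list(zip_longest(letters, numbers, fillvalue='?'))

print(short)
print(padded)
```

In Python 3.10 and later, `zip(letters, numbers, strict=True)` is another possibility: it raises a `ValueError` when the inputs differ in length instead of truncating.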

Here is an alternative way of *unpacking* a zip using a starred expression.

In [None]:
z = zip(letters, numbers)
[*z]
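
Because a zip is an iterator, it can be unpacked only once. The small sketch below unpacks the same zip twice; the second attempt finds nothing left. The names `first` and `second` are only illustrative.

```python
letters = 'abcd'
numbers = [1, 2, 3, 4, 5]
z = zip(letters, numbers)

first = [*z]   # consumes every tuple in the zip
second = [*z]  # the zip is now exhausted, so this is empty

print(first)
print(second)
```

This is why the cells above make a fresh zip each time before unpacking it.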

We can zip more than two sequences together. (Don't try that with the zipper of your jacket.)

In [None]:
adjs = ['fat', 'sharp', 'sweet']
origins = ['Greek', 'English', 'French']
nouns = ['wedding', 'cheddar', 'wine']

phrases = zip(adjs, origins, nouns)
[*phrases]

We can iterate over a zip by means of a comprehension and use the values as we like.

In [None]:
[' '.join(triplet) for triplet in zip(adjs, origins, nouns)]
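
A zip can also be used in an ordinary `for` loop. As a sketch, each tuple can be unpacked directly into separate variables, which is convenient when the parts need different treatment. The variable names `adj`, `origin` and `noun` and the list `titles` are only illustrative.

```python
adjs = ['fat', 'sharp', 'sweet']
origins = ['Greek', 'English', 'French']
nouns = ['wedding', 'cheddar', 'wine']

titles = []
# Each tuple from the zip is unpacked into three loop variables.
for adj, origin, noun in zip(adjs, origins, nouns):
    titles.append(f'{adj.capitalize()} {origin} {noun}')

print(titles)
```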

Any iterable data can be zipped, including ranges and sets. The following illustrates how we can include a range of numbers in a zip.

In [None]:
tokens = ['Once', 'upon', 'a', 'time', 'there', 'was', 'a', 'princess']
[*zip(range(len(tokens)), tokens)]

But let's be honest, the same result can be achieved more easily with the built-in function `enumerate`, which returns an iterator of pairs, each consisting of a count and an element. We can unpack these pairs into a list if desired.

In [None]:
[*enumerate(tokens)]
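
Counting starts at 0 by default. As a small sketch, the optional `start` argument of `enumerate` lets the count begin at another number, for instance 1. The name `numbered` is only illustrative.

```python
tokens = ['Once', 'upon', 'a', 'time', 'there', 'was', 'a', 'princess']

# start=1 makes the count begin at 1 instead of 0.
numbered = [*enumerate(tokens, start=1)]

print(numbered)
```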

From a zip or an enumerate object or any other iterable that gives pairs, we can easily construct a dict, so that the first element of each pair is a key and the second is its value. Remember that keys must be unique: if a key is repeated, only its last value is kept.

In [None]:
dict(zip(range(len(tokens)), tokens))
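
The uniqueness of keys becomes visible when we swap the roles and use the tokens as keys. In the sketch below, the token `'a'` occurs twice, so the dict ends up with fewer entries than there are tokens. The name `positions` is only illustrative.

```python
tokens = ['Once', 'upon', 'a', 'time', 'there', 'was', 'a', 'princess']

# Tokens are now the keys and their positions the values.
positions = dict(zip(tokens, range(len(tokens))))

print(positions)
print(len(tokens), len(positions))
```

The entry for `'a'` holds the position of its last occurrence, because the later pair overwrites the earlier one.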

Alternatively, we can make a dict by means of a dict comprehension. This offers flexibility to add modifications or conditions.

In [None]:
{'word ' + str(n+1): w.lower() for n, w in enumerate(tokens) if w != 'a'}

### Exercises

1.   Make a list of first names and a list of last names. Zip them together to produce tuples, each with first and last name, in that order.
2.   Then, from that zip, use a comprehension to produce a list of strings, each with last name, `', '` and first name.
3.   One can zip sets, but why doesn't it make much sense to do that?