# Description

In this exercise, you should write a function called `merge()` that takes N iterables, each potentially infinite and each yielding its elements in sorted order (you can rely on that assumption for this exercise).  Your function should produce a new iterator that yields all the elements in overall sorted order.  If duplicates occur between the iterators (or within one), each occurrence should be yielded, so a duplicated value appears multiple times.
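
To make the contract concrete, here is a small sketch of the expected behavior on toy inputs.  It uses the standard library's `heapq.merge()` as a stand-in reference (imported under the hypothetical alias `reference_merge`); it is not the solution this exercise asks you to write, and the sample data is invented for illustration.

```python
from heapq import merge as reference_merge
from itertools import count, islice

# Finite inputs: duplicates across and within inputs are all kept
a = [1, 3, 3, 7]
b = [2, 3, 8]
finite = list(reference_merge(a, b))
print(finite)  # [1, 2, 3, 3, 3, 7, 8]

# Infinite inputs: the result is lazy, so we can take just a prefix
squares = (n * n for n in count(1))   # 1, 4, 9, 16, ...
tens = count(10, 10)                  # 10, 20, 30, ...
first_six = list(islice(reference_merge(squares, tens), 6))
print(first_six)  # [1, 4, 9, 10, 16, 20]
```

Your own `merge()` should agree with this on any sorted inputs, including the infinite case.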

As motivation for this function, imagine that you have many log files on your system, in which each record begins with a timestamp.  You could use this function to order all events from all processes.  Of course, different processes might write different messages at an identical timestamp, and all of them need to be preserved.  Moreover, log files can continue to accumulate new records, so this iterator might run forever, monitoring all of them.  In this hypothetical, of course, "sorted order" would simply mean "sorted by the leading timestamp field."

For this exercise, I provide moderately sized alphabetical wordlists of French, Spanish, and Danish.  Some words occur across multiple languages.  Assuming you open each wordlist as an iterable, your function might produce this:

```python
>>> for word in islice(merge(fr, es, da), 15):
...     print(word)
a
aaronico
ab
ababillar
abaceria
abacero
abaco
abad
abadejo
abadengo
abadernar
abadesa
abadia
abadiato
abaisse
```

**Note**: As you practice, you may use up elements from one or more of the gzip iterators.  Opening them anew will reset the iterables.
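
A minimal sketch of what "using up" an iterator looks like.  The file name `sample.txt.gz` and its three words are made up for the demonstration and written to a temporary directory; the real wordlists behave the same way.

```python
import gzip
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical stand-in for one of the wordlist files
    path = os.path.join(tmp, "sample.txt.gz")
    with gzip.open(path, "wt") as out:
        out.write("a\nab\nabaco\n")

    words = gzip.open(path, "rt")
    first_pass = [w.rstrip() for w in words]
    second_pass = [w.rstrip() for w in words]  # nothing left to read
    words.close()

    # Opening the file anew starts again from the first word
    with gzip.open(path, "rt") as words:
        reopened = [w.rstrip() for w in words]

print(first_pass)   # ['a', 'ab', 'abaco']
print(second_pass)  # []
print(reopened)     # ['a', 'ab', 'abaco']
```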

**Note 2**: The sort order of words is dependent on your locale, so I have stripped all the words using letters outside the ASCII range in these word lists.  This will ensure the same sorting in every locale. Apologies to speakers of those languages who are fond of words removed from the samples.
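
As a small illustration of why non-ASCII words are troublesome: Python's built-in string comparison orders by Unicode code point, which typically disagrees with how a locale-aware tool would alphabetize accented letters.  The sample words below are invented for this sketch and are not taken from the wordlists.

```python
# Made-up sample; 'é' (U+00E9) has a higher code point than 'z'
words = ["zebre", "ecole", "école"]

by_code_point = sorted(words)
print(by_code_point)  # ['ecole', 'zebre', 'école']

# Keeping only pure-ASCII words avoids the question entirely
ascii_only = [w for w in words if w.isascii()]
print(ascii_only)  # ['zebre', 'ecole']
```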

# Setup

In [1]:
from itertools import *
import gzip

da = gzip.open('wordlist-da.txt.gz', 'rt')
es = gzip.open('wordlist-es.txt.gz', 'rt')
fr = gzip.open('wordlist-fr.txt.gz', 'rt')
langs = [da, es, fr]

# Placeholder stub; replace it with your own implementation
def merge(*langs):
    return ['a', 'aaronico', 'ab', 'ababillar', 'tanga']

# Solution

In [2]:
from functools import total_ordering

# Define a value greater than all others
@total_ordering
class Top:
    def __eq__(self, other):
        return False
    def __gt__(self, other):
        return True

# Define a value smaller than all others
@total_ordering
class Floor:
    def __eq__(self, other):
        return False
    def __lt__(self, other):
        return True

top = Top()
floor = Floor()
    
def merge2(a_iter, b_iter):
    topless = lambda x: x is not top and x is not floor
    todo = []
    a, b = floor, floor
    while a is not top or b is not top:
        a = next(a_iter, top)
        todo.append(a)
        for b0 in b_iter:
            todo.append(b0)
            if b0 > a: break

        # Sort and yield the (mostly sorted) temporary accumulation
        # Possible to overshoot by one, so keep last thing in todo
        todo.sort()
        yield from filter(topless, iter(todo[:-1]))
        todo = todo[-1:]

        # Equivalent for a_iter as above with b_iter
        b = next(b_iter, top)
        todo.append(b)
        for a0 in a_iter:
            todo.append(a0)
            if a0 > b: break
        todo.sort()
        yield from filter(topless, iter(todo[:-1]))
        todo = todo[-1:]
                
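def merge(*iters):
    # merge2() calls next() on its arguments, so turn each input
    # iterable (e.g. a list) into an iterator first
    iters = [iter(it) for it in iters]
    all_ = merge2(*iters[:2])
    for it in iters[2:]:
        all_ = merge2(all_, it)
    return all_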
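
For comparison, here is a sketch of a different technique: a heap-based k-way merge rather than the pairwise `merge2()` approach above.  The name `merge_heap` and the `_DONE` sentinel are my own for this illustration.  One reason you might prefer it: as far as I can tell, `merge2()` reads the rest of one input in full once the other is exhausted, whereas this version only ever pulls one element ahead from each input.

```python
import heapq
from itertools import count, islice

_DONE = object()  # sentinel marking an exhausted iterator

def merge_heap(*iterables):
    """Lazily merge sorted iterables using a heap of (value, index) pairs."""
    iterators = [iter(it) for it in iterables]
    heap = []
    for idx, it in enumerate(iterators):
        head = next(it, _DONE)
        if head is not _DONE:
            # idx breaks ties, so equal values never compare any further
            heap.append((head, idx))
    heapq.heapify(heap)
    while heap:
        value, idx = heap[0]
        yield value
        following = next(iterators[idx], _DONE)
        if following is _DONE:
            heapq.heappop(heap)          # this input is finished
        else:
            heapq.heapreplace(heap, (following, idx))

# Two infinite inputs plus a short finite one with a duplicate
sample = list(islice(merge_heap(count(0, 2), count(1, 2), [3, 3]), 8))
print(sample)  # [0, 1, 2, 3, 3, 3, 4, 5]
```

The heap holds at most one entry per input, so memory stays proportional to the number of iterables rather than to their length.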

# Test Cases

In [3]:
def test_iter():
    from collections.abc import Iterable, Iterator
    assert isinstance(merge(da, es, fr), Iterable)
    assert isinstance(iter(merge(da, es, fr)), Iterator)
    
test_iter()

In [4]:
def test_merge2():
    da = gzip.open('wordlist-da.txt.gz', 'rt')
    fr = gzip.open('wordlist-fr.txt.gz', 'rt')
    merged = merge(da, fr)
    premerged = gzip.open('wordlist-dafr.txt.gz', 'rt')
    # Extra LFs ignored for test
    for a, b in zip_longest(merged, premerged, fillvalue=''):
        a, b = a.rstrip(), b.rstrip()
        assert a == b, f"Merged {a} does not match {b}"
    
test_merge2()

In [5]:
def test_merge3():
    da = map(str.rstrip, gzip.open('wordlist-da.txt.gz', 'rt'))
    es = map(str.rstrip, gzip.open('wordlist-es.txt.gz', 'rt'))
    fr = map(str.rstrip, gzip.open('wordlist-fr.txt.gz', 'rt'))
    merged = merge(da, es, fr)
    premerged = map(str.rstrip, gzip.open('wordlist-fresda.txt.gz', 'rt'))
    for a, b in zip_longest(merged, premerged):
        assert a == b, f"Merged {a} does not match {b}"
    
test_merge3()
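
The tests above depend on the wordlist files.  As a complement, here is a sketch of a file-free check on small random inputs.  The helper name `check_merge` and its parameters are hypothetical; it takes the function under test as an argument, and `heapq.merge` is passed below only to show the helper running.

```python
import heapq
import random
from itertools import count, islice

def check_merge(candidate, trials=100, seed=0):
    """Compare candidate(*iterators) against sorted() on random inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        # Between 2 and 4 sorted lists of 0 to 5 small integers each
        inputs = [sorted(rng.choices(range(20), k=rng.randrange(6)))
                  for _ in range(rng.randrange(2, 5))]
        expected = sorted(x for seq in inputs for x in seq)
        got = list(candidate(*(iter(seq) for seq in inputs)))
        assert got == expected, f"{inputs} gave {got}, expected {expected}"
    # Laziness: a prefix of two infinite inputs must be obtainable
    lazy = candidate(count(0, 2), count(1, 2))
    assert list(islice(lazy, 5)) == [0, 1, 2, 3, 4]
    return True

result = check_merge(heapq.merge)
print(result)  # True
```

To try it on your own function, call `check_merge(merge)` instead.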