### Flattening Nested Iterables

Often we want to flatten nested sequences (or iterables in general) so we can iterate over all the individual elements.

For example, we may want to iterate over all the elements in a sequence that looks like this:

```
[1, 2, [3, 4, [5, 6, 7, [8, 9, 10]]]]
```

and get the elements

```
1, 2, 3, 4, 5, 6, 7, 8, 9, 10
```

One thing we need to be careful with is strings (and bytes), since they are themselves iterable.

For example:

```
["I'm", "a", ["lumberjack", ["and", "I'm", "OK"]]]
```

If we treat these strings as sequences, then our iteration would yield something like this:

```
I, ', m, a, l, u, m, b, e, r, j, a, c, k, etc.
```

This is generally not what we want, so instead we'll treat strings as non-sequence types, and the iteration should yield:

```
I'm, a, lumberjack, and, I'm, OK
```
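There is a second reason to special-case strings, beyond the unwanted output. Iterating over a string yields one-character strings, and each of those is again an iterable that yields itself, so a recursive flatten with no special case never reaches a non-iterable element. The sketch below illustrates this; `naive_flatten` is a hypothetical helper written only for this demonstration and is not part of the solution that follows.

```python
# Iterating over a string yields one-character strings...
chars = list("OK")
print(chars)  # ['O', 'K']

# ...and each of those is itself a string that yields itself when iterated.
print(list(chars[0]) == [chars[0]])  # True


def naive_flatten(iterable):
    # Hypothetical helper for illustration: treats anything iterable,
    # including str, as something to recurse into.
    result = []
    for element in iterable:
        try:
            iter(element)
        except TypeError:
            # Not iterable: keep the element as-is
            result.append(element)
        else:
            result.extend(naive_flatten(element))
    return result


try:
    naive_flatten(["OK"])
    hit_recursion_limit = False
except RecursionError:
    hit_recursion_limit = True

print(hit_recursion_limit)  # True
```

Without strings in the input, the same helper behaves as expected, which is why the problem is easy to miss.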

We'll use recursion for our solution.

We're also going to make use of the `Iterable` abstract base class in the `collections.abc` module to determine whether an object is iterable.
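As a quick illustration of what that check reports, here is a small sketch that runs `isinstance(obj, Iterable)` against a handful of built-in objects. The `samples` list and `checks` dictionary are names chosen just for this example.

```python
from collections.abc import Iterable

samples = [[1, 2], (1, 2), {1, 2}, "abc", b"abc", range(3), 10, 3.14, None]

# Map each object's repr to whether it passes the Iterable check
checks = {repr(obj): isinstance(obj, Iterable) for obj in samples}

for name, is_iterable in checks.items():
    print(f"{name:>12} -> {is_iterable}")
```

Lists, tuples, sets, ranges, strings and bytes all pass the check, while numbers and `None` do not. One caveat: the check recognizes objects that define `__iter__` (or are registered as `Iterable`), but not objects that only support iteration through `__getitem__`.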

In our first approach we are going to build up a list of all the elements we want to return flattened, and return that list.

In [3]:
from collections.abc import Iterable

In [37]:
def flatten_std(iterable):
    result = []
    for element in iterable:
        if isinstance(element, Iterable) and not isinstance(element, (str, bytes)):
            result.extend(flatten_std(element))
        else:
            result.append(element)
    return result

Let's try it out and see if it works:

In [38]:
l = [1, 2, [3, 4, [5, 6, 7, [8, 9, 10]]]]
flatten_std(l)

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

And with strings:

In [39]:
l = ["I'm", "a", ["lumberjack", "and", ["I'm", "OK"]]]
flatten_std(l)

["I'm", 'a', 'lumberjack', 'and', "I'm", 'OK']

This works fine, but we can actually simplify this code quite a bit by using a generator, and in particular using `yield` and `yield from`.

We are going to turn our function into a generator function, so we no longer have to build up that `result` list. Not only is the code easier to read, but the generator is also more memory efficient, since the flattened elements are produced one at a time instead of all being stored in a list.

In [25]:
def flatten(iterable):
    for element in iterable:
        if isinstance(element, Iterable) and not isinstance(element, (str, bytes)):
            yield from flatten(element)
        else:
            yield element

Our `flatten` function is now a generator function, so we'll have to iterate over it to get all the results:

In [27]:
l = [1, 2, [3, 4, [5, 6, 7, [8, 9, 10]]]]
list(flatten(l))

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

In [28]:
l = ["I'm", "a", ["lumberjack", "and", ["I'm", "OK"]]]
list(flatten(l))

["I'm", 'a', 'lumberjack', 'and', "I'm", 'OK']

The way we have coded our function, it will work with nested iterables in general, not just sequence types:

In [35]:
x = zip(range(1, 5), range(100, 105)) 
print(x)
print(list(x))

<zip object at 0x1092fb940>
[(1, 100), (2, 101), (3, 102), (4, 103)]


In [36]:
x = zip(range(1, 5), range(100, 105))
list(flatten(x))

[1, 100, 2, 101, 3, 102, 4, 103]

Let's do some timings:

In [40]:
from timeit import timeit

In [42]:
l = [1, 2, [3, 4, [5, 6, 7, [8, 9, 10]]]]
timeit("flatten_std(l)", globals=globals(), number=1_000_000)

2.3032115411479026

In [43]:
l = [1, 2, [3, 4, [5, 6, 7, [8, 9, 10]]]]
timeit("flatten(l)", globals=globals(), number=1_000_000)

0.10977479186840355

This looks way faster - but I kind of cheated! Can you spot it?

Since `flatten` is a generator function, it will actually not do any work until we iterate over the results - which I did not do in this case - so let's change it, and see what happens:

In [44]:
l = [1, 2, [3, 4, [5, 6, 7, [8, 9, 10]]]]
timeit("list(flatten(l))", globals=globals(), number=1_000_000)

2.3530387501232326

OK, so now that we are actually iterating over all the elements of the generator, we can see that the timings are about the same.

But the generator approach has the advantage that it will be more memory efficient in general, as well as more computationally efficient if we do not actually iterate over the entire result set (maybe we only need the first few elements, in which case we avoid having to compute everything up front, the way `flatten_std` does).
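To make the "first few elements" case concrete, here is a minimal sketch that uses `itertools.islice` to take three elements from the generator version. The `noisy` generator and the `visited` list are hypothetical helpers added only to record how many source values actually get produced.

```python
from collections.abc import Iterable
from itertools import islice


def flatten(iterable):
    # Same generator version as above, repeated so this cell runs on its own
    for element in iterable:
        if isinstance(element, Iterable) and not isinstance(element, (str, bytes)):
            yield from flatten(element)
        else:
            yield element


visited = []


def noisy(n):
    # Hypothetical helper: records each value just before yielding it
    for i in range(n):
        visited.append(i)
        yield i


nested = [noisy(1_000), [noisy(1_000)]]

# Only pull the first three flattened elements
first_three = list(islice(flatten(nested), 3))

print(first_three)   # [0, 1, 2]
print(len(visited))  # 3
```

Only three of the 2,000 available values were ever produced. A list-building version such as `flatten_std` would have had to consume both inner generators in full before returning anything.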

And that's it - a nice easy way to flatten nested iterables using recursion and generators!