# Breaking (is) Bad...

   **...or 5 good reasons why you should not break out of a loop**

 Presentation to San Diego Python User Group by Erik Colban
 
 Date: November 19, 2020
 
 (Available at https://github.com/ecolban/SDPUG/tree/master/breaking_bad)

### Chapter 1: It's a blatant lie!

Consider a loop of the following form:
```Python
while True:
    ...
    if something:
        break
    ...
 ```
 
This misleads the reader. First you say that you are going to loop forever, but later it turns out that you're not. The `break` statement is like the "Not" in a not-joke: "I'm gonna loop forever...Not!" Very funny!

You may think: This is not a lie. It's an *idiom*! It's become an idiom because people write code for the computer (or interpreter), not the human reader. The computer doesn't care if you lie to it, and that's why it's become acceptable. Stop writing for the computer and start writing for the human reader!

It is even more deceptive when you break out of a for-loop. 

```Python
for n in range(10):
    ...
    if something:
        break
    ...
```
In the `while True` case, the reader can sort of guess that a `break` statement is coming, but in a for-loop it's unexpected.

Be upfront and honest; tell the reader what's ahead. _Don't lie!_
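
As a small illustration of stating the exit condition up front, here is a sketch with two hypothetical functions (`halvings_with_break` and `halvings` are made up for this example, not taken from the talk). Both count how many times a positive integer can be halved before it reaches 1, but only the second one tells the reader in the loop header when the loop ends.

```Python
def halvings_with_break(n):
    """Count halvings of n down to 1; the exit is buried in the loop body."""
    steps = 0
    while True:  # claims to loop forever...
        if n <= 1:
            break  # ...but actually stops here
        n //= 2
        steps += 1
    return steps


def halvings(n):
    """Same result, with the exit condition stated in the loop header."""
    steps = 0
    while n > 1:
        n //= 2
        steps += 1
    return steps
```

On leaving the second loop the reader knows `n <= 1` without scanning the body for exit points.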

### Chapter 2: Help! How do I get out of here?

As a code reader, when a loop is not exited in the normal way, I have the extra burden of figuring out where the loop is exited.

+ What am I looking for? 
    - A break statement? 
    - Maybe two or more? 
    - A return statement? 
    - An exception? 
    - Or a call to a function that raises an exception?
    - Will an exit point always be reached?
    - Which of the exit points will be reached and in which cases?
 
Don't give your code reviewer extra work!

### Chapter 3: An unfamiliar place

Regular while- and for-loops are familiar to most programmers. When breaking out of a loop, a new structure is introduced.

- How does this new structure work?

Some languages have do-while loops:

```Java
do {
    do_some_stuff();
} while (condition);
```
Python doesn't have such a loop. One can easily write something in Python that has the same effect:

```Python
while True:
    do_some_stuff
    if not condition:  # a do-while repeats as long as the condition holds
        break
```
However, it turns out that in languages that do have do-while loops, they are rarely used. Programmers are simply not that used to them. If a do-while is unfamiliar, imagine a loop where you can break out almost anywhere and in almost any manner: break, return, exception, etc.  

Keep to familiar structures! Use plain old regular loops!
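
A do-while shaped loop can often be rewritten as a regular loop by handling the first pass in the initialization. The sketch below uses a hypothetical digit-counting example (the function names are illustrative and not part of the original talk); it assumes `n` is a non-negative integer.

```Python
def count_digits_do_while(n):
    """Do-while emulation: the body always runs at least once."""
    count = 0
    while True:
        count += 1
        n //= 10
        if n == 0:
            break
    return count


def count_digits(n):
    """Regular while-loop: the first digit is counted in the initialization."""
    count = 1
    while n >= 10:
        n //= 10
        count += 1
    return count
```

Both versions return 1 for `n == 0`, which is the case that usually motivates a do-while in the first place.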

### Chapter 4: Fragile! Loop invariant inside

If you break out of a loop, you break the *loop invariant*. 

Loop invariants are not just for academics and people who write 1000-page correctness proofs. Invariants can be useful in everyday programming. Consider the following short function that calculates the n-th Fibonacci number.

Just to be clear:

1. fib(0) = 1
2. fib(1) = 1
3. fib(n) = fib(n - 2) + fib(n - 1), for n >= 2

In [24]:
def fib(n):
    a, b = 1, 1
    for i in range(n - 1):
        a, b = b, a + b
        # a == fib(i + 1) and b == fib(i + 2)
    return b

This is a typical program that is prone to off-by-one errors.

- Should `a` and `b` have been initialized differently?
- Should the range have been `range(n)` instead?
- Should `a` have been returned instead?
- Do we need to handle the special cases where `n` is 0 or 1?

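The loop invariant is captured in the comment, and it's relatively easy to convince oneself that it holds at the end of every iteration. In the last iteration, `i == n - 2`. Put that together with the loop invariant and you have `b == fib(n)`. (When `n` is 0 or 1, the loop body never runs and `b == 1`, which is correct as well.)

Get used to such reasoning and your programs become less prone to off-by-one and other errors. With a little bit of practice, you can do it in your head. But throw in a break statement, and it messes up everything: the loop may stop at any `i`, so you can no longer conclude anything about `b` from the invariant alone!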

In [25]:
[fib(i) for i in range(10)]

[1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
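
The invariant reasoning above can be made executable. The following is a minimal sketch, not part of the original notebook: `fib_ref`, `fib_checked` and `fib_with_break` are hypothetical helper names. `fib_checked` asserts the invariant on every iteration against a reference definition, and `fib_with_break` shows how an early exit leaves `b` in a state the invariant no longer pins to `fib(n)`.

```Python
from functools import lru_cache


@lru_cache(maxsize=None)
def fib_ref(n):
    """Reference definition (fib(0) == fib(1) == 1), used only for checking."""
    return 1 if n < 2 else fib_ref(n - 2) + fib_ref(n - 1)


def fib_checked(n):
    a, b = 1, 1
    for i in range(n - 1):
        a, b = b, a + b
        # Loop invariant, checked at the end of every iteration
        assert a == fib_ref(i + 1) and b == fib_ref(i + 2)
    return b  # last i was n - 2, hence b == fib(n)


def fib_with_break(n, limit):
    """Same loop with an early exit; `limit` is a made-up parameter."""
    a, b = 1, 1
    for i in range(n - 1):
        if b > limit:
            break  # i can be anything here, so b == fib(n) no longer follows
        a, b = b, a + b
    return b
```

With `n = 9` and `limit = 10`, `fib_with_break` returns 13 although the ninth Fibonacci number in this numbering is 55.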

### Chapter 5: There must be a better way!

(Search "raymond hettinger there must be a better way")

Python provides some built-in functions that make it easy in many cases to avoid breaking out of a loop. Here are some examples:

#### Example 1: Searching for an element in an iterable

In [26]:
def find_element_in_iterable_that_matches(iterable, condition):
    for x in iterable:
        if condition(x):
            return x
    return None

find_element_in_iterable_that_matches([1, 2, 3, 15, 8, 9], lambda x: x % 5 == 0)

15

In [27]:
next((x for x in [1, 2, 3, 15, 8, 9] if x % 5 == 0), None)

15

#### Example 2: Checking that an iterable contains an element that satisfies a condition

In [28]:
def any_element_satisfies(iterable, condition):
    for x in iterable:
        if condition(x):
            return True
    return False

any_element_satisfies([1, 2, 3, 15, 8, 9], lambda x: x % 7 == 0)

False

In [29]:
any(x % 7 == 0 for x in [1, 2, 3, 15, 8, 9])

False

#### Example 3: Checking that all elements in an iterable satisfy a condition 

In [30]:
def all_elements_satisfy(iterable, condition):
    for x in iterable:
        if not condition(x):
            return False
    return True

all_elements_satisfy([1, 2, 3, 5, 8, 9], lambda x: x < 10)

True

In [31]:
all(x < 10 for x in [1, 2, 3, 5, 8, 9])

True
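
One might worry that the built-ins in Examples 1 to 3 scan the whole iterable. They do not: `next`, `any` and `all` stop consuming as soon as the answer is known, just like the hand-written loops. The sketch below makes this visible with a hypothetical `traced` generator (an assumption of this example) that records every value it hands out.

```Python
def traced(values, seen):
    """Yield each value, recording it in `seen` as it is consumed."""
    for value in values:
        seen.append(value)
        yield value


data = [1, 2, 3, 15, 8, 9]

seen_by_any = []
found = any(x % 5 == 0 for x in traced(data, seen_by_any))

seen_by_next = []
first = next((x for x in traced(data, seen_by_next) if x % 5 == 0), None)

seen_by_all = []
all_small = all(x < 3 for x in traced(data, seen_by_all))
```

After running this, `seen_by_any` and `seen_by_next` both stop at 15, and `seen_by_all` stops at 3, the first element that fails the test.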

#### Example 4: Searching in a tree

In [32]:
from dataclasses import dataclass
from src.breaking_bad import make_random_tree


@dataclass(frozen=True)
class TreeNode:
    root: str
    children: tuple

In [33]:
tree = make_random_tree(4)
print(tree)

length
├── above
│   ├── document
│   │   ├── should
│   │   ├── where
│   │   └── graphics
│   └── forum
│       ├── number
│       └── against
├── football
│   ├── shoes
│   │   ├── indian
│   │   ├── science
│   │   └── products
│   └── sales
│       └── progress
└── hospital
    ├── across
    │   └── texas
    ├── desktop
    │   └── purchase
    └── entire
        ├── radio
        └── become


In [34]:
from collections import deque

def starts_with_vowel(word):
    return word[0] in 'aeiou'

def breadth_first_search(tree, condition):
    to_do_queue = deque()
    to_do_queue.append(tree)
    while to_do_queue:
        subtree = to_do_queue.popleft()
        if condition(subtree.root):
            return subtree.root
        to_do_queue.extend(subtree.children)
    return None

breadth_first_search(tree, starts_with_vowel)

'above'

Separate the walking from the searching:

In [35]:
def breadth_first_iterator(tree):
    to_do_queue = deque()
    to_do_queue.append(tree)
    while to_do_queue:
        subtree = to_do_queue.popleft()
        yield subtree.root
        to_do_queue.extend(subtree.children)

def breadth_first_search_alt(tree, condition):
    # Use the caller's condition rather than hard-coding starts_with_vowel
    return next(filter(condition, breadth_first_iterator(tree)), None)

breadth_first_search_alt(tree, starts_with_vowel)

'above'

In [36]:
def contains_node(tree, condition):
    return any(condition(x) for x in breadth_first_iterator(tree))
    
contains_node(tree, lambda x: 'w' in x)

True

In [37]:
def num_nodes(tree):
    return sum(1 for _ in breadth_first_iterator(tree))

num_nodes(tree)

24
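
Once walking is separated from searching, swapping the traversal order only means swapping the iterator. The following sketch is an illustration under assumptions: `depth_first_iterator` and the small hand-built tree are not from the original notebook, and `TreeNode` is redefined here with a default for `children` so the block runs on its own.

```Python
from dataclasses import dataclass


@dataclass(frozen=True)
class TreeNode:
    root: str
    children: tuple = ()  # default added for this sketch only


def depth_first_iterator(tree):
    """Yield node labels in pre-order (a node before its children)."""
    yield tree.root
    for child in tree.children:
        yield from depth_first_iterator(child)


small_tree = TreeNode('length', (
    TreeNode('above', (TreeNode('document'), TreeNode('forum'))),
    TreeNode('football', (TreeNode('shoes'),)),
))

order = list(depth_first_iterator(small_tree))
first_f = next((w for w in depth_first_iterator(small_tree) if w.startswith('f')), None)
```

Here `first_f` is `'forum'`, whereas a breadth-first walk of the same tree would reach `'football'` first. The search code (`next`, `any`, `sum`) stays the same in both cases.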

#### Example 5: Processing a binary file one chunk at a time

In [38]:
def process_data(chunk):
    print(len(chunk))

In [39]:
CHUNKSIZE = 4096

with open('breaking_bad.ipynb', mode='rb') as f:
    while True:
        chunk = f.read(CHUNKSIZE)
        if chunk == b'':
            break
        process_data(chunk)

4096
4096
4096
3336


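A "hidden gem" in Python is the two-argument form of the built-in `iter` function, `iter(callable, sentinel)`. It returns an iterator that calls `callable` repeatedly and stops as soon as the returned value equals `sentinel`, so it can be used to iterate over all the chunks: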

In [40]:
CHUNKSIZE = 4096

with open('breaking_bad.ipynb', mode='rb') as f:
    for chunk in iter(lambda: f.read(CHUNKSIZE), b''):
        process_data(chunk)

4096
4096
4096
3336
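
The same pattern can be tried without a file on disk. The sketch below is an illustration, not part of the original notebook: it uses an in-memory `io.BytesIO` stream, replaces the lambda with `functools.partial`, and also shows an assignment-expression variant, which needs Python 3.8 or later. The function names are hypothetical.

```Python
import io
from functools import partial


def chunk_sizes_iter(stream, chunk_size):
    """Read until stream.read() returns the sentinel b''."""
    return [len(chunk) for chunk in iter(partial(stream.read, chunk_size), b'')]


def chunk_sizes_walrus(stream, chunk_size):
    """Alternative: the loop header itself states when the loop ends."""
    sizes = []
    while chunk := stream.read(chunk_size):
        sizes.append(len(chunk))
    return sizes
```

For a 10-byte stream and a chunk size of 4, both functions return `[4, 4, 2]`.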


### Conclusion

Breaking out of a loop should register as a code smell and get you to think: Is there a better way? Getting rid of that smell may result in code that is more:
1. readable
2. succinct
3. functional
4. declarative
5. in short, ...more Pythonic!