## Looping in Python: for and while

The for-loop and the while-loop have different use cases. The for-loop is, as in the previous example, used to iterate over something. This can be a simple counter (a sequence of numbers), a list of strings, or actually any object that is iterable. So for-loops are used to loop through a fixed collection of items. A while-loop has a different purpose: it does not iterate through a collection, but has a condition that is tested before each iteration. If the condition is True, the code block is executed, and when the condition is False, the while-loop is finished. A while-loop is used to keep iterating until, for example, a user presses a key or a certain precision has been reached in a minimization process.

In many other languages it is common to create a for-loop with a counter, which is then used to index into a data structure. In Python you can iterate directly over an iterable such as a list. Let us have a look at some examples:

In [None]:
fruit_basket = ['apples', 'bananas', 'pears']

# the standard for-loop
for fruit in fruit_basket:
    print('The basket contains:', fruit)

# There are a couple of useful list manipulations
for fruit in reversed(fruit_basket):
    print('The reversed basket contains:', fruit)

# create an iterable from 0 to 9 (10 is not included)
for number in range(10):
    print(f'number {number}')

# combining two (or more) lists using zip:
persons = ['Alfred', 'Rob', 'Jeroen', 'Hendri', 'Coen', 'Dennis']
cake_type = ['apple pie', 'tompouce', 'brownies', 'butterkuchen', 'carrot cake', 'bossche bollen']
for person, cake in zip(persons, cake_type):
    print(f'{person} likes {cake} a lot!')
    
# if a number is needed, add enumerate
for ix, fruit in enumerate(fruit_basket):
    print(f'The numbered basket contains: {fruit} ({ix + 1})')
    
# enumerate is the same as zipping a range with the length of the iterable
for ix, fruit in zip(range(len(fruit_basket)), fruit_basket):
    print(f'The numbered basket contains: {fruit} ({ix + 1})')

A for-loop starts with the for keyword followed by a variable name. This variable name is used to reference the current item of the iterable. As Python is dynamically typed, you do not have to bother with assigning a type; the Python interpreter sets it dynamically. After the variable name comes the keyword in, followed by the iterable and a colon. Similar to if-statements, the code in the loop is identified by indentation. The indentation, together with clear semantics, makes these loops extremely readable.

Python comes packed with quite some handy tools to manipulate lists. For example, to reverse a list you can use the built-in reversed() function. These functions take an iterable (such as a list) and return a new iterable with the manipulation applied. Some of them have a counterpart built into the list class itself (my_list.reverse()). The difference is, however, that those methods change the list itself (because lists are mutable) and do not return a new list.
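To make the difference between the built-in function and the list method concrete, here is a minimal sketch. The variable names `reversed_copy` and `return_value` are only chosen for this illustration.

```python
fruit_basket = ['apples', 'bananas', 'pears']

# reversed() returns a new iterator and leaves the original list untouched
reversed_copy = list(reversed(fruit_basket))
print(reversed_copy)   # ['pears', 'bananas', 'apples']
print(fruit_basket)    # ['apples', 'bananas', 'pears']

# list.reverse() reverses the list in place and returns None
return_value = fruit_basket.reverse()
print(return_value)    # None
print(fruit_basket)    # ['pears', 'bananas', 'apples']
```

A common slip is writing `fruit_basket = fruit_basket.reverse()`, which replaces the list with None.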

Sometimes, you need to have a counter or index when iterating. For that special case, Python comes with enumerate, which ‘zips’ your list with sequentially increasing numbers. Zip can combine multiple lists into a single iterable that can be unpacked in a loop. Note that the lists should be of the same length: zip does not raise an error but stops as soon as the shortest list is exhausted, so the remaining items of the longer lists are silently skipped.
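Two details of enumerate and zip are easy to check in a small sketch: enumerate takes an optional start value, and zip stops at the shortest input instead of raising an error. The shortened `persons` and `cakes` lists are made up for this example.

```python
fruit_basket = ['apples', 'bananas', 'pears']

# enumerate accepts a start value, so the "+ 1" is not needed
numbered = list(enumerate(fruit_basket, start=1))
print(numbered)  # [(1, 'apples'), (2, 'bananas'), (3, 'pears')]

# zip stops at the shortest input: 'Jeroen' is silently dropped
persons = ['Alfred', 'Rob', 'Jeroen']
cakes = ['apple pie', 'tompouce']
pairs = list(zip(persons, cakes))
print(pairs)  # [('Alfred', 'apple pie'), ('Rob', 'tompouce')]
```

If you want an error for unequal lengths, Python 3.10 and later accept `zip(persons, cakes, strict=True)`.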

The range() function is used to create a sequence of numbers. It accepts a stop value, or a start and a stop value, and optionally a step. range() returns a special type of iterable, which is a range object ([it is not a generator](https://treyhunner.com/2018/02/python-range-is-not-an-iterator/)). In our previous examples, we supplied a list to the for-loop. This list is completely defined somewhere in memory, waiting to be referenced by the for-loop. If we did the same with a large range of numbers, we would first need to create this list. This can take a large chunk of memory, especially when we are iterating over millions of values (range(10**12); note that range only accepts integers, so range(1e12) raises a TypeError). For such cases, Python has generators, which are a form of ‘lazy execution’. Only when a value is required, i.e. when the for-loop asks for it, is it generated. While range has some subtle differences from generators, the idea is somewhat similar: only get the value you need, when you need it, without creating the full list of numbers first.
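The lazy behaviour of range, and one of its differences from a generator, can be sketched as follows. The size comparison uses `sys.getsizeof`, and the exact byte counts depend on the Python build, so only the comparison is shown.

```python
import sys

# a huge range costs almost no memory, because no list is built
big_range = range(10**12)
small_list = list(range(1000))
print(sys.getsizeof(big_range) < sys.getsizeof(small_list))  # True

# a range still supports indexing and membership tests
print(big_range[-1])     # 999999999999
print(500 in big_range)  # True

# start, stop and step
evens = range(2, 11, 2)
print(list(evens))       # [2, 4, 6, 8, 10]
print(list(evens))       # [2, 4, 6, 8, 10] (a range can be reused)

# a generator is exhausted after one pass
squares = (n * n for n in evens)
first_pass = list(squares)
second_pass = list(squares)
print(first_pass)        # [4, 16, 36, 64, 100]
print(second_pass)       # []
```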

Similar to if-statements, for-loops can also be nested. Of course, you need to make sure that the loop variables of the nested for-loops have different names. Each additional for-loop starts at the next indentation level. While Python is fine with you adding large amounts of nested for-loops, three levels is generally a good maximum. If you end up using more, it is time to look into strategies for reducing the amount of looping. Looping in pure Python is slow, and if you need lots of looping over numeric data, vectorization is probably what you want.
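One possible way to reduce the nesting depth without leaving the standard library is `itertools.product`, which yields every combination of its inputs in a single loop. This is only a sketch of that one strategy. It is not vectorization, which would need a package such as NumPy and is not shown here.

```python
from itertools import product

# two nested for-loops
nested = []
for number in range(1, 4):
    for another_number in range(1, 4):
        nested.append((number, another_number, number * another_number))

# the same combinations with a single loop
flat = [(a, b, a * b) for a, b in product(range(1, 4), repeat=2)]

print(nested == flat)  # True
print(len(flat))       # 9
```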

In [None]:
# continue goes on with the next iteration
for number in range(1, 4):
    for another_number in range(1, 4):
        if number == another_number:
            continue
        print(f'{number} x {another_number} = {number * another_number}')

# break stops the iteration, not doing an else
for word in 'here are some words in a string'.split():
    if word == 'in':
        break
    print(f'the word "{word}" has {len(word)} characters')
else:
    print('Iteration done!')
    
# else can be added as a final code-block if the loop was successful
for word in 'more words here'.split():
    print(f'the word "{word}" has {len(word)} characters')
else:
    print('Iteration done!')

Both the for-loop and the while-loop have the break and continue keywords for additional control over the looping flow. When issuing a break, the current loop is stopped. If you are in a nested for-loop, only the innermost loop that contains the break is stopped; the outer loops keep running. Using continue, you jump to the next iteration, skipping any statements after the continue keyword. These keywords give additional control over the flow, but are meant for specific use cases.

I have never used the for-else or while-else construct, but it might have some use cases. It is, however, quite easy to understand. If the for-loop or while-loop finished without hitting a break (and without raising an error), the code block defined after the else is executed.
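One place where the else clause is sometimes used is a search loop: the break signals "found", and the else handles "not found". A minimal sketch, with a hypothetical helper `describe_search` made up for this example:

```python
def describe_search(numbers):
    """
    Report the first negative number in a list, if there is one
    """
    for number in numbers:
        if number < 0:
            message = f'first negative number: {number}'
            break
    else:
        # only reached when the loop finished without hitting break
        message = 'no negative numbers found'
    return message

print(describe_search([3, -1, 4]))  # first negative number: -1
print(describe_search([3, 1, 4]))   # no negative numbers found
print(describe_search([]))          # no negative numbers found
```

Without the else, you would need an extra flag variable to remember whether the break was hit.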

While-loops are the other construct for creating loops in Python. The while-loop tests a condition instead of looping through an iterable. A while-loop starts with the keyword while followed by the condition and a colon.

In [None]:
value = 1
while value >= 0.5:
    value -= 0.01  # short-hand notation of value = value - 0.01
else:
    print('final value is:', value)

# a typical "never-ending" while-loop
counter = 0
while True:
    counter += 1  # short-hand notation of counter = counter + 1
    if counter > 10:
        break
print('counter is:', counter)

- A while-loop is indefinite iteration
- A while-loop repeats code while the expression is True
- It exits the loop if the expression is False
- The break and continue statements change the loop behavior

A construct that you commonly see in Python-based RESTful API servers is a never-ending while-loop. These loops keep running forever (or until a user aborts the program or an error occurs), because the condition that is provided is the constant True. True never changes to False, as it is the truest of True, and the while-loop can only be stopped by the break keyword or divine intervention. The condition in the while-loop is similar to the one in the if-statement. Therefore, you can combine multiple conditions using the logical operators (and, not, or). Also, if you want to code the loop body later, you can use the pass placeholder.
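Combining conditions is useful to give a loop a safety net. The sketch below mimics the "until a certain precision has been achieved" case with a toy error that is halved each iteration. The names `tolerance` and `max_iterations` and the halving itself are assumptions made for this illustration, not a real minimization.

```python
error = 1.0
iteration = 0
tolerance = 1e-3       # stop when the error is small enough ...
max_iterations = 100   # ... or when we have tried often enough

while error > tolerance and iteration < max_iterations:
    error /= 2         # stand-in for one step of a real algorithm
    iteration += 1

print(iteration)  # 10
print(error)      # 0.0009765625
```

The second condition guarantees that the loop ends even if the error never drops below the tolerance.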

## Define functions and stop repeating yourself

A function is a block of code that is reusable and specifically designed to do a certain task. You have already seen quite some functions, namely the built-in print() and len(). To call a function, you type its name with parentheses after it. The name without the parentheses is the reference that points to the function (everything in Python is an object, remember?). A function can be called without parameters, meaning that there is nothing between the parentheses. Many functions do, however, take one or more parameters, and these can be supplied as variables or values, separated by commas. Defining a function is done using the def keyword followed by a function name, a set of parentheses, and a colon. All code in the function is indented, similar to a for-loop or if-else statement. As a convention, the naming of a function is similar to variable names: all lower case with words separated by an underscore. Try to be short but descriptive. It is good practice to create a docstring for a function you create. Of course, it is not required when it is very clear what the function does, so use common sense when deciding to create a docstring or comment, to prevent over-commenting.
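The difference between referencing and calling a function can be shown in a few lines. The function `greet` and the name `alias` are made up for this sketch.

```python
def greet():
    """
    Return a short greeting
    """
    return 'Hello!'

# without parentheses we only reference the function object
alias = greet
print(alias is greet)  # True
print(greet.__name__)  # greet

# with parentheses the function is actually called
print(alias())         # Hello!
```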

In [1]:
# a function without any parameters
def my_function():
    """
    My first toy function
    """
    print('This is my first function')

my_function()

# a function with parameters
def my_other_function(value1, value2):
    """
    Prints two values to the screen
    """
    print(f'value 1: {value1:<7}  --  value 2: {value2}')

my_other_function(10, 25)
my_other_function(3.14159, 'Wow, so pi!')  # Dynamic types: anything is okay for Python


# a function that returns values
def add_one(value):
    """
    Adding one to the provided value
    """
    new_value = value + 1
    return new_value

result = add_one(13)
print(f'Adding 1 to 25 gives us {add_one(25)}')    

# Actually a Python function always returns an object (None type if nothing is provided)
def procrastinate():
    """
    Too lazy to comment
    """
    pass

result = procrastinate()
type(result)  # the built-in type function returns the class of the object

This is my first function
value 1: 10       --  value 2: 25
value 1: 3.14159  --  value 2: Wow, so pi!
Adding 1 to 25 gives us 26


NoneType

The parameters that a function takes are dynamically typed and are not restricted by Python itself. Of course, if you provide a string and the function tries to square the string, it will raise an error. Type checking is the responsibility of the user and is a good practice for code that is shared between multiple users, especially if the code base is large. For smaller projects and code only for you, you can choose not to do type checking, at the risk of getting bugs. Developers like to go even one step further and add static type information to Python. Since Python 3.5 you can provide type hints for each parameter and the return value in the function definition. These hints are not enforced when the code runs; an external type checker such as mypy uses them to find mistakes before running. While I understand the benefit for debugging, I have never bothered. I guess with larger programs it might be more important, but for data science purposes I think data integrity is more important than static typing. On the other hand, it is not too much work to apply. Here is a [great guide on static typing](https://medium.com/@ageitgey/learn-how-to-use-static-type-checking-in-python-3-6-in-10-minutes-12c86d72677b) in Python.

Sometimes dynamic typing is used as a benefit. A function can test the type of its input and do different things for different objects. Many packages like NumPy or pandas use this to create an array or DataFrame from different datatypes. If we were to statically type this, we would have to create different functions for each datatype. Dynamic typing can be a blessing, but according to others it can also be a burden. I have never found it problematic for any of my use cases.
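For reference, this is roughly what type hints look like on a small function. It is a sketch rather than a recommendation. Note that Python itself does not enforce the hints, so the call with a float still works.

```python
def add_one(value: int) -> int:
    """
    Adding one to the provided value
    """
    return value + 1

print(add_one(13))   # 14
print(add_one(2.5))  # 3.5, the hint is not enforced at run time

# the hints are stored on the function and can be inspected
print(add_one.__annotations__)  # {'value': <class 'int'>, 'return': <class 'int'>}
```

A static checker such as mypy would flag the `add_one(2.5)` call, while the interpreter runs it without complaint.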

In [2]:
# Using dynamic typing to do different things for different inputs
# This is also usable to do a more custom type checking
def my_function(string_or_list):
    """
    Print a string or a list of strings
    """
    # isinstance is the idiomatic type test (it also accepts subclasses)
    if isinstance(string_or_list, str):
        print('String provided:', string_or_list)
    elif isinstance(string_or_list, list):
        print('List of strings:', ' '.join(string_or_list))
    else:
        print('Type not usable')

input_value1 = 'Hi there!'
input_value2 = ['Hi', 'there', 'I', 'am', 'a', 'list!']

my_function(input_value1)
my_function(input_value2)
my_function(3.14159)


# Named parameters
def another_function(left_value, right_value):
    """
    Print values in an amazing format
    """
    print(f'{left_value} <---> {right_value}')

# parameters can always be given in a named fashion (order does not matter)
another_function(right_value='beer', left_value='wine')
# however unnamed parameters are always from left to right
another_function('milk', 'chocolate')


# default values
def yet_another_function(number, text='%'):
    """
    Print value as a percentage
    """
    print(f'{number}{text}')

# values with defaults are not required but optional
yet_another_function(25)
yet_another_function(12.5, text='!')


# arbitrary amount of parameters
def my_function(*args):
    for ix, arg in enumerate(args):
        print(f'Arguments {ix+1}: {arg}')

my_function('hi', 'there')

String provided: Hi there!
List of strings: Hi there I am a list!
Type not usable
wine <---> beer
milk <---> chocolate
25%
12.5!
Arguments 1: hi
Arguments 2: there


Another way to provide parameters to a function is to use named parameters. These are also called keyword arguments and consist of the defined parameter name followed by an equals sign and the assigned value or variable. For short functions, i.e. functions with only one or two parameters, people do not really bother, but when it is a complex function with many parameters, these help tremendously with readability. As the parameters are assigned explicitly, the order in which they are provided does not matter. If you for some reason mix named parameters with standard positional parameters, the order does matter: the positional parameters have to come first, so you should be careful when doing this. In a similar manner, default values can be provided in the definition of the function. This makes the parameter optional, as it already has a value assigned. All parameters that do not have a default value are automatically required, and an error is raised if they are not provided. Another common practice is to assign None as a default value. This makes the parameter optional, and in the function itself you can test whether the parameter is None or something else and act accordingly. Of course, such idioms are quite specific in their use case.
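A typical use of None as a default is an optional list parameter. The function `add_fruit` below is a hypothetical example of the idiom. It tests with `is None` and only then creates a fresh list. Using `basket=[]` directly as the default would share one list between all calls, because default values are created once, when the function is defined.

```python
def add_fruit(fruit, basket=None):
    """
    Add a fruit to the basket, creating a new basket if none is given
    """
    if basket is None:
        basket = []
    basket.append(fruit)
    return basket

print(add_fruit('apples'))                    # ['apples']
print(add_fruit('pears'))                     # ['pears']
print(add_fruit('pears', basket=['apples']))  # ['apples', 'pears']
```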