Introductory Python code examples  
Lee Spector, lspector@amherst.edu  
August, 2020

# Python Looping and List Comprehensions

This mini-lesson provides a couple of simple examples of how to process lists of data with loops and with list comprehensions. 

One of the morals is that list comprehensions can make your code easier to write, easier to read, and less likely to contain bugs.

First, let's suppose you have some numerical data in a list:

In [1]:
data = [1, 2, 3, 4, 5]

Now let's suppose that what you **want** is a list in which all of these numbers have had 10 added to them. 

One common way to do this is to create a new, initially empty list, then loop through the original list and, for each item, append that item plus 10 to the new list:

In [2]:
i = 0
processed = []
while i < len(data):
    processed.append(data[i] + 10)
    i = i + 1

After doing this, the contents of our original variable are unchanged:

In [3]:
data

[1, 2, 3, 4, 5]

And our new list contains the processed values:

In [4]:
processed

[11, 12, 13, 14, 15]

We can make this a bit simpler by using a `for` loop instead of a `while` loop. The `for` loop automatically takes care of incrementing the index and comparing it against the length of the list, so our code is less cluttered by that bookkeeping. Since we don't have to write the bookkeeping ourselves, it's also less likely that we'll introduce a bug by getting it wrong.

Here's how we could do the same thing as above (except this time we add 100) with a `for` loop:

In [None]:
processed = []
for i in range(0, len(data)):
    processed.append(data[i] + 100)

And here is the result:

In [None]:
processed

[101, 102, 103, 104, 105]
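
As a small intermediate step (a sketch added here for illustration), a `for` loop can also iterate over the items themselves rather than over index positions, which removes the need for `range`, `len`, and indexing altogether. This assumes the same `data` list as above:

```python
data = [1, 2, 3, 4, 5]

processed = []
for item in data:  # item takes each value of data in turn; no index needed
    processed.append(item + 100)

print(processed)
```

This form is already close to how a list comprehension reads, since both talk about each `item` in `data` directly.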

A further step along the same path -- of writing less, and therefore having fewer opportunities to make mistakes -- is to do this task with a list comprehension.

When using a list comprehension, we express the list that we want in terms of the elements of the list that we start with, and the entire looping process is taken care of for us.

Here's how we could do the same thing as above (except this time we add 1000) with a list comprehension:

In [None]:
processed = [item + 1000 for item in data]

And here is the result:

In [None]:
processed

[1001, 1002, 1003, 1004, 1005]
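
To see that the three approaches really are interchangeable, here is a minimal side-by-side sketch that uses the same offset (10) in each. The variable names `from_while`, `from_for`, and `from_comprehension` are made up for this comparison:

```python
data = [1, 2, 3, 4, 5]

# 1. while loop with an explicit index
i = 0
from_while = []
while i < len(data):
    from_while.append(data[i] + 10)
    i = i + 1

# 2. for loop over index positions
from_for = []
for i in range(len(data)):
    from_for.append(data[i] + 10)

# 3. list comprehension
from_comprehension = [item + 10 for item in data]

# All three lists hold the same values, and data itself is untouched.
print(from_while == from_for == from_comprehension)
print(data)
```

The comprehension does in one line what the `while` version needs five lines for, with no index variable to get wrong.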

There are more options for list comprehensions, which you can find out about in the Python documentation and other tutorials. Here we'll just look at adding an `if` clause which allows you to skip over some of the items in the initial list, which can be handy for skipping over "bad" data.

For example, suppose your data is a list of strings, each of which **should** contain a floating-point number, but some of which are empty strings:

In [None]:
data = ["1.2", "3.4", "", "5.6"]

We can get a list of the floating-point numbers of all of the non-empty items with a list comprehension like the following, which says that we want a list of the results of calling `float` on each item, but only if the item is not equal to the empty string:

In [None]:
processed = [float(item) for item in data if item != ""]

And here is the result:

In [None]:
processed

[1.2, 3.4, 5.6]
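
For comparison, here is a sketch of what the same filtering might look like as an explicit loop. It is meant only to show what the `if` clause in the comprehension is doing for us:

```python
data = ["1.2", "3.4", "", "5.6"]

processed = []
for item in data:
    if item != "":  # skip the empty strings
        processed.append(float(item))

print(processed)
```

The trailing `if` in the comprehension plays the role of the `if` statement inside this loop: items that fail the test never reach `float`.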

But what if you don't actually want to **skip** the empty items, but you instead want to replace them with zeros?

There are a couple of good ways to do this, but one of the nicest is to use a conditional expression, of the form `A if condition else B`, as the expression at the front of the list comprehension. To use this you put the `if` part before the `for` part, and follow it with an `else` part (still before the `for`). Note the difference from the filtering `if` above, which goes after the `for` part and cannot take an `else`.

In our example here, that would look like this:

In [None]:
processed = [float(item) if item != "" else 0.0 for item in data]

And here is the result:

In [None]:
processed

[1.2, 3.4, 0.0, 5.6]
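
Another of the possible ways to do this, sketched here as an alternative, is to move the decision into a small helper function and call it from the comprehension. The function name `to_float` and its `default` parameter are hypothetical, chosen just for this example:

```python
def to_float(text, default=0.0):
    """Convert text to a float, or return default if text is empty."""
    if text == "":
        return default
    return float(text)

data = ["1.2", "3.4", "", "5.6"]
processed = [to_float(item) for item in data]

print(processed)
```

This can be worth the extra lines when the rule for handling "bad" items grows beyond what fits comfortably in a single conditional expression.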

Sometimes, if you're doing something complicated, you may have to use an explicit `while` or `for` loop. 

But in many cases you don't have to, and in those cases you can often save yourself a lot of effort, and avoid bugs, by using a list comprehension. 

In many cases you might also find that while it's hard to make one list comprehension do the whole job, a couple of list comprehensions in a row, each of which processes a list and puts the result in a variable, can still be easier to read, write, and debug than an explicit loop. 
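
Here is a minimal sketch of that idea. It assumes, purely for illustration, that the raw strings may also carry stray spaces; the variable names `raw`, `stripped`, `non_empty`, and `values` are made up for this example:

```python
raw = [" 1.2", "3.4 ", "", "   ", "5.6"]

# Step 1: remove surrounding whitespace from every string.
stripped = [text.strip() for text in raw]

# Step 2: keep only the strings that still have something in them.
non_empty = [text for text in stripped if text != ""]

# Step 3: convert what is left to floating-point numbers.
values = [float(text) for text in non_empty]

print(values)
```

Because each step leaves its result in its own variable, you can print any intermediate list to check it while debugging.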

And sometimes the best solution will combine these things, using an explicit loop where it really can't be avoided, but keeping that as small and simple as possible by using list comprehensions to process the data before and/or after the explicit loop.
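
As one possible illustration, consider computing running totals. Each total depends on the one before it, so an explicit loop is a natural fit for that step, while comprehensions handle the clean-up before it and the rounding after it. The task and the variable names are assumptions chosen for this sketch:

```python
data = ["1.2", "3.4", "", "5.6"]

# Before the loop: a comprehension converts and filters the raw strings.
values = [float(item) for item in data if item != ""]

# The explicit loop does only the part that needs to carry state along.
running_totals = []
total = 0.0
for value in values:
    total = total + value
    running_totals.append(total)

# After the loop: a comprehension tidies the results for display.
rounded = [round(t, 1) for t in running_totals]

print(rounded)
```

The loop body stays at two lines because everything that doesn't depend on the running state has been moved out of it.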

A good rule of thumb is that if you find yourself writing a big loop, with lots of variables and other code within it, then you could probably make your job simpler, and your code more likely to do what you want, by stepping back and thinking about how to break the process up into simpler steps.

And often some of the most significant simplifications can be achieved by using list comprehensions to take care of things that would otherwise require explicit loops.

Please feel free to send me code that you're working on, to see if I can help with simplification/debugging that may involve list comprehensions among other things.