# Iterations

Today we are going to walk through controlling the execution flow of a program with conditional statements and using loops to execute a task for the items in a container. We will start off with some details about strings, which we didn't cover earlier. We'll also discuss string formatting, which is a task we will want to do frequently when printing.

## 0. Strings

Strings are immutable sequences of characters, similar to tuples, which we learned about last week. We can use the built-in `len()` function on them, as well as slicing, which we used last week to get sub-lists.

In [1]:
my_string = "Hello World!"
print(my_string, len(my_string))

Hello World! 12


In [3]:
print(my_string[0:6])

Hello 


In [4]:
print(my_string[-1:])

!


In [5]:
print(my_string[::-1])

!dlroW olleH
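
Because strings are immutable, slicing always gives us a new string, and assigning to a single position fails. Here is a minimal sketch of that behaviour; the names `greeting`, `first_word` and `changed` are made up for this illustration, and the `try`/`except` construct is only used to catch the error so the cell keeps running.

```python
greeting = "Hello World!"

# Slicing returns a new string; the original is untouched
first_word = greeting[:5]

# Item assignment is not allowed on an immutable sequence
try:
    greeting[0] = "J"
    assignment_worked = True
except TypeError as err:
    assignment_worked = False
    print(f"TypeError: {err}")

# To "change" a string, build a new one instead
changed = "J" + greeting[1:]
print(first_word, changed, sep=" | ")
```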


As with tuples, adding strings will concatenate them.

In [6]:
my_string_1 = "Hello "
my_string_2 = "World!"
print(my_string_1 + my_string_2)

Hello World!


Let's try out some interesting string methods: `.lower()`, `.upper()`, `.rstrip()`, `.lstrip()`, `.strip()`, `.startswith()`, `.endswith()`, `.find()`, and `.replace()`.

In [8]:
my_string.upper()

'HELLO WORLD!'

In [9]:
my_string = "    Hello World!    "
print(my_string)

    Hello World!    


In [13]:
my_string.rstrip()

'    Hello World!'

In [18]:
my_string = "Hello World!"
my_string.endswith("!")

True

In [22]:
print(my_string.find("o "))
print(my_string.find("vjke35i345"))

4
-1


In [21]:
print(my_string.replace("World", "Back At You"))
my_string

Hello Back At You!


'Hello World!'
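
The cells above try only some of the listed methods. As a quick sketch of the remaining ones (`.strip()`, `.lstrip()`, `.lower()` and `.startswith()`), using a made-up variable `padded`:

```python
padded = "   Hello World!   "

stripped = padded.strip()         # remove whitespace on both sides
left_stripped = padded.lstrip()   # remove leading whitespace only
lowered = stripped.lower()        # convert to lowercase
starts_ok = stripped.startswith("Hello")

# repr() makes the remaining spaces visible
print(repr(stripped))
print(repr(left_stripped))
print(lowered, starts_ok)
```

None of these calls modifies `padded`; each one returns a new string.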

## 1. String formatting

In Python, there are several ways of building strings that incorporate variables of different types. If you are curious about the advantages and disadvantages of the different string formatting methods, you can read this article: https://realpython.com/python-string-formatting/#3-string-interpolation-f-strings-python-36.

### String interpolation (legacy)
The oldest style is *[string interpolation](https://docs.python.org/3/library/stdtypes.html#old-string-formatting)*:

In [27]:
a = 1.2345
b = 42
#print("a = %d, b = %d" % (a, b)) # d -> integer
#print("a = %02d" % a)
#print("a = %f" % a) # f -> float
print("a = %.2f" % a)

a = 1.23


The `%d` and similar codes (`%s`, `%x`) are called *format specifiers*. This style has many pitfalls and is generally discouraged in new code, so we recommend against using it. It is included here only in case you come across it in existing code.
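
To make "pitfalls" a bit more concrete, here is a sketch of one commonly cited problem: a tuple on the right-hand side of `%` is treated as a list of separate arguments. The variable `point` is made up for this illustration.

```python
point = (1, 2)

# A tuple after % is unpacked into two arguments, but there is only one %s
try:
    text = "point = %s" % point
except TypeError as err:
    text = None
    print(f"TypeError: {err}")

# Workaround in the legacy style: wrap the tuple in a one-element tuple
legacy_text = "point = %s" % (point,)

# The f-string version needs no workaround
fstring_text = f"point = {point}"
print(legacy_text)
print(fstring_text)
```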

### f-strings and format() method
*f-strings* are *formatted string literals* that make it easy to incorporate Python variables and expressions in strings. An alternative, less compact notation uses the `format()` method.

In [30]:
a, b = 1.2345, 42
#print(f"a = {a}, b = {b}") # this is simple
#print(f"{a=}, {b=}") # this is even more compact, although less flexible
print("a = {}, b = {}".format(a, b)) # this is an alternative standard, can be more or less readable depending on the circumstances

a = 1.2345, b = 42


You can control the spacing, number of zeros, number of decimals, etc. with format specifiers placed after a colon inside the braces.

In [31]:
a, b = 42, 1042
print(f"b = {b}")
print(f"b = {b:4d}")
print(f"b = {b:5d}")
print(f"a = {a:4d}") # this pads with spaces to a minimum width of 4 characters
print(f"a = {a:04d}") # this pads with zeros instead

b = 1042
b = 1042
b =  1042
a =   42
a = 0042


In [35]:
a = 123.456
#print(f"a = {a}") # default
#print(f"a = {a:.2f}") # only print two places past the decimal place
#print(f"a = {a:.6f}") # print 6 places past the decimal place
print(f"a = {a:.2e}") # exponential notation!

a = 1.23e+02
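
The same mini-language also covers alignment and digit grouping. A small sketch, with made-up variables `name` and `count`:

```python
name, count = "NGC 1068", 1234567

left = f"[{name:<12}]"     # left-aligned in a field of 12 characters
right = f"[{name:>12}]"    # right-aligned
centred = f"[{name:^12}]"  # centred
grouped = f"{count:,}"     # comma as thousands separator

print(left)
print(right)
print(centred)
print(grouped)
```

The square brackets are only there to make the padding visible in the output.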


You can even do inline arithmetic with f-strings.

In [36]:
a, b = 42, 1042
print(f"42 plus 1042 is {a + b}.")

42 plus 1042 is 1084.


### Multiline strings
You can build a multiline string using the newline (`\n`) escape sequence. What's an escape sequence? It's a sequence of characters that starts with a special character (`\`) and is treated specially. Do you remember where we came across escape sequences already this term?

In [37]:
print("Line 1\nLine 2\nLine 3")

Line 1
Line 2
Line 3
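
The newline is not the only escape sequence. As a sketch, here are a few other common ones, plus a raw string (prefix `r`), in which backslashes are kept exactly as typed. All variable names are made up for this illustration.

```python
tabbed = "col1\tcol2"        # \t is a tab
quoted = "She said \"hi\""   # \" is a literal double quote
backslash = "C:\\temp"       # \\ is a single backslash
raw = r"C:\temp"             # raw string: no escape processing

print(tabbed)
print(quoted)
print(backslash, raw, backslash == raw)
```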


You could get the same result with three `print()` calls; however, in some cases you may want to use a single one. For better readability, you could compose the string as follows:

In [38]:
s = "Line 1\n"
s += "Line 2\n"
s += "Line 3"
print(s)

Line 1
Line 2
Line 3


Code using this style can easily get very cluttered, so use this parsimoniously!
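
Two alternatives that these notes do not otherwise cover can sometimes be tidier: a triple-quoted string, which may span several lines, and the `.join()` string method. This is only a sketch; which form reads best depends on the situation.

```python
# Triple quotes let the string literal span several lines
s = """Line 1
Line 2
Line 3"""
print(s)

# join() glues a list of strings together with a separator
joined = "\n".join(["Line 1", "Line 2", "Line 3"])
print(s == joined)
```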

## 3. Conditional statements
We have talked about booleans in the last two lectures, but what are they useful for? Conditional statements are one of the building blocks of computer programming. A conditional controls whether a sequence of instructions is executed, based on a boolean value, which can be the result of a comparison operation. Let's introduce the `if-else` construct:

In [40]:
a = 2
ref_value = 1
if (a > ref_value):
    print(f"{a=} is greater than {ref_value=}")
else:
    print(f"{a=} is less than or equal to {ref_value=}")
# Change the value of a and run this cell again!

a=2 is greater than ref_value=1


We could have been tempted to write the condition directly as `a > 1` instead of using an auxiliary variable `ref_value`. However, this form allows us to avoid repetitions of `1` in our string and makes our code more easily reusable. When possible, make your code depend on *parameters* rather than literals. 

We can have cascaded selections using `elif`:

In [43]:
a = 0.5
ref_value = 1
if (a == ref_value):
    print(f"{a=} is equal to {ref_value=}")
elif (a > ref_value):
    print(f"{a=} is greater than {ref_value=}")
else:
    print(f"{a=} is less than {ref_value=}")
# Change the value of a and run this cell again!

a=0.5 is less than ref_value=1


### match-case (only since python 3.10!)
This is similar to the *switch-case* statement, which has been part of other programming languages for ages. Surprisingly, in Python it has only been available since version 3.10.

The `match` statement lets you select among different code blocks depending on the value of a variable:

In [46]:
a = 4
match a:
    case 1:
        print("one")
    case 2:
        print("two")
    case 3:
        print("three")
    case _:
        print("I don't know how to write this number!")
# change the value of a and see how the construct behaves...

I don't know how to write this number!


You can rewrite this using `if` and `elif`, but it will be much less nice to read!
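
For comparison, here is a sketch of the `if`/`elif` version. It stores the result in a made-up variable `word` rather than printing in every branch, and it also runs on Python versions older than 3.10.

```python
a = 4
if a == 1:
    word = "one"
elif a == 2:
    word = "two"
elif a == 3:
    word = "three"
else:
    word = "I don't know how to write this number!"
print(word)
```

Every branch has to repeat `a ==`, which is the repetition that `match` removes.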

This feature is actually more powerful than we have shown here, as the argument of `match` can be a more sophisticated expression. For the time being, let's just take note of its existence.

## 4. Loops/Iteration

Every algorithm can be built from a combination of three constructs:
- tasks executed in a sequence
- tasks executed according to conditionals
- tasks executed in cycles (or loops).

### Looping over a collection with `for`
A loop repeats a sequence of instructions under the control of a condition: as long as the condition is true, the instructions are repeated. In Python, a `for` loop is a bit more abstract: it means "repeat a sequence of instructions for every element of a collection", for example:

In [48]:
# loop over a list
l = [1, 2, 3, 4, 5]
print(l)
for n in l:
    print(n)

[1, 2, 3, 4, 5]
1
2
3
4
5


In [51]:
# loop over a dictionary
d = {'Germany':'Berlin', 'France':'Paris', 'Ireland':'Dublin'}
for key in d:
    value = d[key]
    print(key, value)

Germany Berlin
France Paris
Ireland Dublin
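
Instead of looking up `d[key]` inside the loop, we can ask the dictionary for its key-value pairs directly with the `.items()` method. A minimal sketch, using the same dictionary and a made-up list `pairs` to collect the results:

```python
d = {'Germany': 'Berlin', 'France': 'Paris', 'Ireland': 'Dublin'}

pairs = []
for country, capital in d.items():  # unpack each (key, value) pair
    print(country, capital)
    pairs.append((country, capital))
```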


In [52]:
# loop over a list with an index
i = 0
for n in l:
    print(f"l[{i}] is {n}")
    i += 1

l[0] is 1
l[1] is 2
l[2] is 3
l[3] is 4
l[4] is 5
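
The first sentence of this subsection describes repetition that continues as long as some statement is true. In Python that form is the `while` loop, which these notes have not introduced so far. A minimal sketch with made-up variables `n` and `halvings`:

```python
n = 100
halvings = 0
while n > 1:        # repeat as long as the condition is true
    n = n // 2      # integer division
    halvings += 1
print(halvings, n)
```

Unlike a `for` loop, a `while` loop does not know in advance how many times it will run, so make sure the condition eventually becomes false.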


### Iterating over a group of lists



In [68]:
# Lists we're going to need
galaxy_names = ["NGC 5128", "TXS 0506+056", "NGC 1068", "GB6 J1040+0617", "TXS 2226-184"]
distances_mpc = [3.7, 1.75e3, 14.4, 1.51e4, 107.1]  # Mpc
luminosities = [1e40, 3e46, 4.9e38, 6.2e45, 5.5e41] # erg/s

### Introducing `range`

In [54]:
for i in range(5):
    print(i)

0
1
2
3
4


In [55]:
for i in range(12,30,7): # start at 12, stop before 30, in steps of 7
    print(i)

12
19
26
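
A `range` can be turned into a list to inspect it, and the step may be negative. A short sketch (the variable names are made up for this illustration):

```python
evens = list(range(0, 10, 2))      # start, stop (excluded), step
countdown = list(range(5, 0, -1))  # a negative step counts down
empty = list(range(5, 0))          # default step is +1, so nothing is produced

print(evens)
print(countdown)
print(empty)
```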


### Print name and distance of each galaxy in our list

In [58]:
for i in range(len(galaxy_names)):
    print(f"Name: {galaxy_names[i]}; D = {distances_mpc[i]} Mpc")

Name: NGC 5128; D = 3.7 Mpc
Name: TXS 0506+056; D = 1750.0 Mpc
Name: NGC 1068; D = 14.4 Mpc
Name: GB6 J1040+0617; D = 15100.0 Mpc
Name: TXS 2226-184; D = 107.1 Mpc


### More pythonic method - iterate directly over the list elements!
Here we can use the `zip()` function, which takes one or more iterables (such as lists or dicts) and returns an iterator. The iterator yields tuples, each combining the elements found at the same position in every iterable. If the iterables have different lengths, we get as many tuples as the shortest one has elements. So if we have lists with 2, 3, and 4 elements, `zip()` will stop after 2 tuples.
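
The case with 2, 3 and 4 elements can be sketched as follows; the lists `two`, `three` and `four` are made up for this illustration.

```python
two = ["a", "b"]
three = [1, 2, 3]
four = [10.0, 20.0, 30.0, 40.0]

triples = list(zip(two, three, four))
print(triples)  # only two tuples: the shortest list has two elements
```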

In [59]:
# print out tuples
for pair in zip(distances_mpc, luminosities):
    print(pair)

(3.7, 1e+40)
(1750.0, 3e+46)
(14.4, 4.9e+38)
(15100.0, 6.2e+45)
(107.1, 5.5e+41)


In [60]:
pair = zip(distances_mpc, luminosities)
print(pair) # this just shows the iterator object
list(pair) # this consumes the iterator and returns the tuples

<zip object at 0x108245480>


[(3.7, 1e+40),
 (1750.0, 3e+46),
 (14.4, 4.9e+38),
 (15100.0, 6.2e+45),
 (107.1, 5.5e+41)]

In [64]:
#print out unpacked items in tuple
for dist, lum in zip(distances_mpc, luminosities):
    print(dist, lum)

3.7 1e+40
1750.0 3e+46
14.4 4.9e+38
15100.0 6.2e+45
107.1 5.5e+41


In [67]:
# zip() stops at the shortest iterable: 5 names, but only 4 luminosities here
luminosities_short = luminosities[:4] # erg/s

for name, lum in zip(galaxy_names, luminosities_short):
    print(name, lum)

NGC 5128 1e+40
TXS 0506+056 3e+46
NGC 1068 4.9e+38
GB6 J1040+0617 6.2e+45


### Sidenote: only use zip() with ordered iterables
Sets have no defined order, so `zip()` on sets can pair elements arbitrarily. The output of the next cell happens to line up, but that is not guaranteed.

In [70]:
galaxy_names_set = {"NGC 5128", "TXS 0506+056", "NGC 1068"}
distances_mpc_set = {3.7, 1.75e3, 14.4}
list(zip(galaxy_names_set, distances_mpc_set))

[('NGC 5128', 3.7), ('TXS 0506+056', 1750.0), ('NGC 1068', 14.4)]
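
If your data does arrive as a set, one option is to impose an explicit order before pairing, for example with `sorted()`. The sketch below also shows a second reason to be careful: sets drop duplicates, which changes the number of elements. The variable names are made up for this illustration.

```python
names_set = {"NGC 5128", "TXS 0506+056", "NGC 1068"}

# A set has no positional order, so impose one explicitly
names_sorted = sorted(names_set)
print(names_sorted)

# Sets silently drop duplicates, so lengths may no longer match
distances = [3.7, 14.4, 3.7]
unique_distances = set(distances)
print(len(distances), len(unique_distances))
```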

### Modify the printing code above to avoid using indices

In [71]:
#for i in range(len(galaxy_names)):
#    print(f"Name: {galaxy_names[i]}; D = {distances_mpc[i]} Mpc")
    
for name, dist, lum in zip(galaxy_names, distances_mpc, luminosities):
    print(f"Name: {name}; D = {dist} Mpc; L={lum} erg/s")

Name: NGC 5128; D = 3.7 Mpc; L=1e+40 erg/s
Name: TXS 0506+056; D = 1750.0 Mpc; L=3e+46 erg/s
Name: NGC 1068; D = 14.4 Mpc; L=4.9e+38 erg/s
Name: GB6 J1040+0617; D = 15100.0 Mpc; L=6.2e+45 erg/s
Name: TXS 2226-184; D = 107.1 Mpc; L=5.5e+41 erg/s


### And now a little cosmetic improvement using f-strings

In [72]:
for name, dist in zip(galaxy_names, distances_mpc):
    print(f"Name: {name:15}; D = {dist:10.1f} Mpc;")
    # print(f"Name: {name:15}; D = {dist:8} Mpc;")

Name: NGC 5128       ; D =        3.7 Mpc;
Name: TXS 0506+056   ; D =     1750.0 Mpc;
Name: NGC 1068       ; D =       14.4 Mpc;
Name: GB6 J1040+0617 ; D =    15100.0 Mpc;
Name: TXS 2226-184   ; D =      107.1 Mpc;


In [73]:
for name, dist in zip(galaxy_names, distances_mpc):
    print(f"Name: {name:15}; D = {dist:.1e} Mpc;")  # extra points for scientific notation

Name: NGC 5128       ; D = 3.7e+00 Mpc;
Name: TXS 0506+056   ; D = 1.8e+03 Mpc;
Name: NGC 1068       ; D = 1.4e+01 Mpc;
Name: GB6 J1040+0617 ; D = 1.5e+04 Mpc;
Name: TXS 2226-184   ; D = 1.1e+02 Mpc;


### Simplifying counting with `enumerate`

In [74]:
list(enumerate(galaxy_names))

[(0, 'NGC 5128'),
 (1, 'TXS 0506+056'),
 (2, 'NGC 1068'),
 (3, 'GB6 J1040+0617'),
 (4, 'TXS 2226-184')]

In [75]:
for i, name in enumerate(galaxy_names):
    print(f"Position: {i}; Name: {name}")

Position: 0; Name: NGC 5128
Position: 1; Name: TXS 0506+056
Position: 2; Name: NGC 1068
Position: 3; Name: GB6 J1040+0617
Position: 4; Name: TXS 2226-184
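
`enumerate` also accepts a `start` argument, which is handy when you want to count from 1. A sketch with a shortened, made-up list `names`:

```python
names = ["NGC 5128", "TXS 0506+056", "NGC 1068"]

lines = []
for rank, name in enumerate(names, start=1):
    lines.append(f"{rank}. {name}")
print("\n".join(lines))
```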


### Creating lists

#### Convert distance list from Mpc to cm

In [76]:
distances_cm = [] # create new list
for d in distances_mpc:
    distances_cm.append(d * 3e24) # 1 Mpc is roughly 3e24 cm; append() adds the item to the new list

print(distances_cm)

[1.11e+25, 5.25e+27, 4.32e+25, 4.53e+28, 3.213e+26]


#### Select distances < 100 Mpc and convert them to cm

In [77]:
# Exercise: select distances < 100 Mpc and convert them from Mpc to cm
short_distances_cm = []
for d in distances_mpc:
    if d < 100:
        print(f"Distance = {d} Mpc")
        short_distances_cm.append(d * 3e24)

print(short_distances_cm)

Distance = 3.7 Mpc
Distance = 14.4 Mpc
[1.11e+25, 4.32e+25]


### Introducing list comprehension!
List comprehension can be used to create lists. It allows us to rewrite a for loop in a single line, and can also be used for mapping and filtering. Below, we map a list of distances in Mpc to distances in cm (without having to use map()).

List comprehension involves an expression, a member and an iterable:

In [78]:
distances_cm = [d * 3e24 for d in distances_mpc]

print(distances_cm)

[1.11e+25, 5.25e+27, 4.32e+25, 4.53e+28, 3.213e+26]


Here the expression is 'd*3e24', member is 'd' and iterable is 'distances_mpc'.
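
For comparison, here is a sketch of the same mapping written with `map()`. The helper function `mpc_to_cm` and the shortened list `distances_mpc_demo` are made up for this illustration.

```python
distances_mpc_demo = [3.7, 14.4, 107.1]  # shortened copy for this sketch

def mpc_to_cm(d):
    """Convert Mpc to cm with the same approximate factor used above."""
    return d * 3e24

via_map = list(map(mpc_to_cm, distances_mpc_demo))
via_comprehension = [d * 3e24 for d in distances_mpc_demo]

print(via_map == via_comprehension)
```

Both give the same list; the comprehension simply keeps the expression, the member and the iterable together on one line.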

We can also use list comprehension for filtering. 

In [79]:
# We can also select elements based on some criterion on the same line:

short_distances_cm = [d * 3e24 for d in distances_mpc if d < 100.]
print(short_distances_cm)

[1.11e+25, 4.32e+25]


#### Get list of names based on distance criterion

In [80]:
closeby_galaxy_names = [name for name, dist in zip(galaxy_names, distances_mpc) if dist < 100 ]

print(closeby_galaxy_names)

['NGC 5128', 'NGC 1068']


You can write even more complicated list expressions:

In [81]:
distances_mpc_or_opinion = [d if d < 100. else "Too far to care about!" for d in distances_mpc]
print(distances_mpc_or_opinion)

[3.7, 'Too far to care about!', 14.4, 'Too far to care about!', 'Too far to care about!']
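
Note that the position of the `if` changes its meaning. A sketch contrasting the two forms, using a made-up list `values`:

```python
values = [3.7, 1750.0, 14.4]

# `if` at the end filters: the result can be shorter than the input
filtered = [v for v in values if v < 100]

# `x if condition else y` at the front maps: the result keeps the input length
labelled = [v if v < 100 else "far" for v in values]

print(filtered)
print(labelled)
```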


### Sidenote: you can also use set comprehension in the same way
With the caveat that if you are interested in the order of the elements, you don't want to use sets.
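
A minimal sketch of a set comprehension, collecting the catalogue prefix of each galaxy name. The list `names` and the variable `catalogues` are made up for this illustration.

```python
names = ["NGC 5128", "TXS 0506+056", "NGC 1068", "TXS 2226-184"]

# Curly braces instead of square brackets give a set
catalogues = {name.split()[0] for name in names}
print(catalogues)  # duplicates collapse; the printed order is not guaranteed
```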

### Counting number of members

In [82]:
# You can do this by building a list and checking its length:

print(len(closeby_galaxy_names))

2


In [83]:
# Or better - if you don't need the list you don't have to create it 

count = 0

for dist in distances_mpc:
    if dist < 100:
        count += 1
print(count)

2
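
The counting loop can also be written in one line with `sum()` and a generator expression, which looks like a list comprehension without the square brackets and does not store a list. A sketch, with the distances copied into a made-up variable `distances`:

```python
distances = [3.7, 1.75e3, 14.4, 1.51e4, 107.1]  # Mpc, copied from above

# Add 1 for every element that passes the test
count_close = sum(1 for d in distances if d < 100)
print(count_close)
```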


### Simultaneously iterating through multiple lists
zip() is very handy for iterating through multiple lists simultaneously.

In [84]:
from math import pi #Do you remember what this means?

fluxes = []
for lum, d_mpc in zip(luminosities, distances_mpc):
    d_cm = d_mpc * 3e24
    fluxes.append(lum / (4 * pi * d_cm ** 2))
        
print(fluxes)

[6.4586861087531586e-12, 8.661493501599748e-11, 2.089386202070171e-14, 2.404282090867728e-13, 4.2396633647669894e-13]


In [85]:
# Do the same using list comprehension!
fluxes_new = [lum / (4 * pi * (d_mpc * 3e24) ** 2) for lum, d_mpc in zip(luminosities, distances_mpc)]
print(fluxes_new)

[6.4586861087531586e-12, 8.661493501599748e-11, 2.089386202070171e-14, 2.404282090867728e-13, 4.2396633647669894e-13]


### Iterating through tables with nested loops

Here's a rather advanced example: calculate a 2D table of fluxes with one row per luminosity and one column per distance.

In [86]:
from math import pi

flux_table = [] # start with an empty table
for lum in luminosities:
    flux_table.append([]) # add an empty row for this luminosity
    for d_mpc in distances_mpc:
        d_cm = d_mpc * 3e24
        flux_table[-1].append(lum / (4 * pi * d_cm ** 2)) # fill the current row with fluxes
        
print(flux_table[3][3])

2.404282090867728e-13


Use list comprehension to rewrite the nested loop in only one line! The outer `for` (written last) builds the rows, and the inner `for` fills each row.

In [87]:
table = [[lum / (4 * pi * (d_mpc * 3e24) ** 2) for d_mpc in distances_mpc] for lum in luminosities]

print(table[3][3])

2.404282090867728e-13
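
Checking a diagonal element such as `[3][3]` cannot tell you whether rows and columns are the right way round, because swapping them leaves the diagonal unchanged. Here is a sketch with small made-up numbers showing how the index order works and how `zip(*rows)` transposes a table:

```python
rows = [[1, 2, 3],
        [4, 5, 6]]  # 2 rows, 3 columns (made-up numbers)

# rows[i][j]: i picks the row (outer list), j picks the column (inner list)
element = rows[1][2]

# zip(*rows) regroups the values by column, i.e. it transposes the table
columns = [list(col) for col in zip(*rows)]

print(element)
print(columns)
print(columns[2][1] == rows[1][2])
```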
