# Iterations

Today we are going to walk through controlling the execution flow of a program with conditional statements and using loops to execute a task for the items in a container. 

## 1. Conditional statements
We have talked about booleans in the last two lectures, but what are they useful for? Conditional statements built on comparisons are one of the building blocks of computer programming. A condition controls which sequence of instructions is executed, based on a boolean value that can be the result of a comparison operation. First, let's remind ourselves of all the comparison operators:

`>`, `<`, `>=`, `<=`, `!=`, `==`

We should also spell out the logical operations, which are the keywords `and`, `or` and `not`.

Order of operations is important with the comparison and logical operators! They are evaluated in the following order, from highest to lowest precedence:
1. Comparison operators
2. `not`
3. `and`
4. `or`

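Getting the order of operations wrong can lead to problems. In the first cell below, `==` is evaluated before `not`, so Python cannot parse `False == not True`; adding parentheses fixes it.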

In [65]:
False == not True

SyntaxError: invalid syntax (1386348353.py, line 1)

In [66]:
False == (not True)

True
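
As a small illustration of the precedence list above, the sketch below compares a few expressions with and without parentheses. The variable names are made up for this example.

```python
# `not` has lower precedence than comparisons,
# so `not a == b` is read as `not (a == b)`
a, b = 1, 2
negated_comparison = not a == b

# `and` has higher precedence than `or`,
# so `True or False and False` is read as `True or (False and False)`
without_parens = True or False and False
with_parens = (True or False) and False

print(negated_comparison, without_parens, with_parens)  # True True False
```

When in doubt, adding parentheses makes the intended order explicit and costs nothing.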

Now let's introduce the `if-else` construct. Here there is some syntax to remember - be careful with colons!

In [1]:
a = 2
ref_value = 1
if (a > ref_value):
    print(f"{a=} is greater than {ref_value=}")
else:
    print(f"{a=} is less than or equal to {ref_value=}")
# Change the value of a and run this cell again!

a=2 is greater than ref_value=1


We could have been tempted to write the condition directly as `a > 1` instead of using an auxiliary variable `ref_value`. However, this form allows us to avoid repetitions of `1` in our string and makes our code more easily reusable. When possible, make your code depend on *parameters* rather than literals. 

We can have cascaded selections using `elif`:

In [2]:
a = 0.5
ref_value = 1
if (a == ref_value):
    print(f"{a=} is equal to {ref_value=}")
elif (a > ref_value):
    print(f"{a=} is greater than {ref_value=}")
else:
    print(f"{a=} is less than {ref_value=}")
# Change the value of a and run this cell again!

a=0.5 is less than ref_value=1


### match-case (only since Python 3.10!)
This is similar to the *switch-case* statement, which has been part of other programming languages for ages. Surprisingly, in Python it has only been available since version 3.10.

The `match` statement allows selection among different code blocks depending on the value of a variable:

In [3]:
a = 4
match a:
    case 1:
        print("one")
    case 2:
        print("two")
    case 3:
        print("three")
    case _:
        print("I don't know how to write this number!")
# Change the value of a and see how the construct behaves...

I don't know how to write this number!


You can rewrite this using `if` and `elif`, but it will be much less nice to read!

This feature is actually more powerful than we have shown here, as the `case` patterns can be much more sophisticated than single values (for example, they can match the structure of a sequence and bind variables). For the time being, let's just take note of its existence.
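
For comparison, here is a sketch of the `match` example above rewritten with `if` and `elif`. The variable `message` is introduced only for this illustration.

```python
a = 4

# Each `case` becomes a separate equality test on `a`
if a == 1:
    message = "one"
elif a == 2:
    message = "two"
elif a == 3:
    message = "three"
else:
    # Plays the role of the catch-all `case _`
    message = "I don't know how to write this number!"

print(message)
```

This version also runs on Python versions older than 3.10, at the price of repeating `a ==` in every branch.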

## 2. Loops/Iteration

Every algorithm can be built from a combination of three constructs:
- tasks executed in a sequence
- tasks executed according to conditionals
- tasks executed in cycles (or loops).

### Looping over a collection with `for`
A loop repeats a sequence of instructions under the control of a condition: as long as the condition is true, the instructions are repeated. In Python, `for` loops are a bit more abstract and mean "repeat a sequence of instructions for each element of a collection", for example:

In [4]:
# Loop over a list
l = [1, 2, 3, 4, 5]
print(l)
for n in l:
    print(n)

[1, 2, 3, 4, 5]
1
2
3
4
5


In [5]:
# Loop over a dictionary
d = {'Germany':'Berlin', 'France':'Paris', 'Ireland':'Dublin'}
for key in d:
    value = d[key]
    print(key, value)

Germany Berlin
France Paris
Ireland Dublin


In [6]:
# Loop over a list with an index
i = 0
for n in l:
    print(f"l[{i}] is {n}")
    i += 1

l[0] is 1
l[1] is 2
l[2] is 3
l[3] is 4
l[4] is 5


### Introducing `range`

In [8]:
for i in range(5):
    print(i)

0
1
2
3
4


In [9]:
for i in range(12, 30, 7):  # Start at 12, stop before 30, in steps of 7
    print(i)

12
19
26
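
The cells above use `range(stop)` and `range(start, stop, step)`. The short sketch below collects the same ranges into lists and adds a negative step; the name `countdown` is made up for this example.

```python
# `stop` itself is never included in the range
first_five = list(range(5))
stepped = list(range(12, 30, 7))

# A negative step counts downwards; here the range stops before 0
countdown = list(range(5, 0, -1))

print(first_five)  # [0, 1, 2, 3, 4]
print(stepped)     # [12, 19, 26]
print(countdown)   # [5, 4, 3, 2, 1]
```

Wrapping a `range` in `list()` is a convenient way to check what it will produce before using it in a loop.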


### Iterating over a group of lists

In [10]:
# Lists we're going to need
galaxy_names = ["NGC 5128", "TXS 0506+056", "NGC 1068", "GB6 J1040+0617", "TXS 2226-184"]
distances_mpc = [3.7, 1.75e3, 14.4, 1.51e4, 107.1]  # Mpc
luminosities = [1e40, 3e46, 4.9e38, 6.2e45, 5.5e41] # erg/s

### Print name and distance of each galaxy in our list

In [11]:
for i in range(len(galaxy_names)):
    print(f"Name: {galaxy_names[i]}; D = {distances_mpc[i]} Mpc")

Name: NGC 5128; D = 3.7 Mpc
Name: TXS 0506+056; D = 1750.0 Mpc
Name: NGC 1068; D = 14.4 Mpc
Name: GB6 J1040+0617; D = 15100.0 Mpc
Name: TXS 2226-184; D = 107.1 Mpc


### More pythonic method - iterate directly over the list elements!
Here we can use the `zip()` function, which takes one or more iterables (such as lists or dicts) and returns an iterator. The iterator yields tuples, each combining the elements found at the same position in every iterable. If the iterables have different lengths, `zip()` stops at the end of the shortest one. So if we have lists with 2, 3, and 4 elements, `zip()` will terminate after 2 tuples.

In [12]:
# Print out tuples
for pair in zip(distances_mpc, luminosities):
    print(pair)

(3.7, 1e+40)
(1750.0, 3e+46)
(14.4, 4.9e+38)
(15100.0, 6.2e+45)
(107.1, 5.5e+41)


In [13]:
pair = zip(distances_mpc, luminosities)
print(pair)  # This just returns the iterator
list(pair)   # This returns the tuples

<zip object at 0x10396ca40>


[(3.7, 1e+40),
 (1750.0, 3e+46),
 (14.4, 4.9e+38),
 (15100.0, 6.2e+45),
 (107.1, 5.5e+41)]

In [14]:
# Print out unpacked items in tuple
for dist, lum in zip(distances_mpc, luminosities):
    print(dist, lum)

3.7 1e+40
1750.0 3e+46
14.4 4.9e+38
15100.0 6.2e+45
107.1 5.5e+41


Let's try this with lists of different lengths.

In [17]:
galaxy_names = ["NGC 5128", "TXS 0506+056", "NGC 1068", "GB6 J1040+0617", "TXS 2226-184"]
distances_mpc = [3.7, 1.75e3]  # Mpc
luminosities = [1e40, 3e46, 4.9e38] # erg/s

for name, dist, lum in zip(galaxy_names, distances_mpc, luminosities):
    print(name, dist, lum)

NGC 5128 3.7 1e+40
TXS 0506+056 1750.0 3e+46


In [18]:
# Redefine the full lists
galaxy_names = ["NGC 5128", "TXS 0506+056", "NGC 1068", "GB6 J1040+0617", "TXS 2226-184"]
distances_mpc = [3.7, 1.75e3, 14.4, 1.51e4, 107.1]  # Mpc
luminosities = [1e40, 3e46, 4.9e38, 6.2e45, 5.5e41] # erg/s

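### Sidenote: be careful to use zip() only with ordered iterables
Sets have no defined order, so `zip()` on sets can pair up elements that do not belong together. In the example below, the names and distances end up mismatched.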

In [21]:
galaxy_names_set = {"NGC 5128", "TXS 0506+056", "NGC 1068"}
distances_mpc_set = {3.7, 1.75e3, 14.4}
list(zip(galaxy_names_set, distances_mpc_set))

[('TXS 0506+056', 3.7), ('NGC 5128', 1750.0), ('NGC 1068', 14.4)]
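
If the pairing matters, one possible approach is to keep each name and its distance together from the start, for example in a dictionary. This is a sketch rather than the only option, and the names `galaxy_distances_mpc`, `pairs` and `lookup` are made up for this illustration.

```python
# Each name stays attached to its own distance
galaxy_distances_mpc = {"NGC 5128": 3.7, "TXS 0506+056": 1.75e3, "NGC 1068": 14.4}
for name, dist in galaxy_distances_mpc.items():
    print(name, dist)

# A set of (name, distance) tuples also keeps the pairs intact,
# even though the order in which the set is traversed is arbitrary
pairs = {("NGC 5128", 3.7), ("TXS 0506+056", 1.75e3), ("NGC 1068", 14.4)}
lookup = dict(pairs)
print(lookup["NGC 1068"])  # 14.4
```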

### Let's make the print-out more descriptive

In [22]:
for name, dist, lum in zip(galaxy_names, distances_mpc, luminosities):
    print(f"Name: {name}; D = {dist} Mpc; L={lum} erg/s")

Name: NGC 5128; D = 3.7 Mpc; L=1e+40 erg/s
Name: TXS 0506+056; D = 1750.0 Mpc; L=3e+46 erg/s
Name: NGC 1068; D = 14.4 Mpc; L=4.9e+38 erg/s
Name: GB6 J1040+0617; D = 15100.0 Mpc; L=6.2e+45 erg/s
Name: TXS 2226-184; D = 107.1 Mpc; L=5.5e+41 erg/s


### And now a little cosmetic improvement using f-string format specifiers

In [23]:
for name, dist in zip(galaxy_names, distances_mpc):
    print(f"Name: {name:15}; D = {dist:10.1f} Mpc;")
    # print(f"Name: {name:15}; D = {dist:8} Mpc;")

Name: NGC 5128       ; D =        3.7 Mpc;
Name: TXS 0506+056   ; D =     1750.0 Mpc;
Name: NGC 1068       ; D =       14.4 Mpc;
Name: GB6 J1040+0617 ; D =    15100.0 Mpc;
Name: TXS 2226-184   ; D =      107.1 Mpc;


Extra points for scientific notation!

In [25]:
for name, dist in zip(galaxy_names, distances_mpc):
    print(f"Name: {name:15}; D = {dist:.1e} Mpc;")  

Name: NGC 5128       ; D = 3.7e+00 Mpc;
Name: TXS 0506+056   ; D = 1.8e+03 Mpc;
Name: NGC 1068       ; D = 1.4e+01 Mpc;
Name: GB6 J1040+0617 ; D = 1.5e+04 Mpc;
Name: TXS 2226-184   ; D = 1.1e+02 Mpc;


### Simplifying counting with `enumerate`

In [26]:
list(enumerate(galaxy_names))

[(0, 'NGC 5128'),
 (1, 'TXS 0506+056'),
 (2, 'NGC 1068'),
 (3, 'GB6 J1040+0617'),
 (4, 'TXS 2226-184')]

In [27]:
for i, name in enumerate(galaxy_names):
    print(f"Position: {i}; Name: {name}")

Position: 0; Name: NGC 5128
Position: 1; Name: TXS 0506+056
Position: 2; Name: NGC 1068
Position: 3; Name: GB6 J1040+0617
Position: 4; Name: TXS 2226-184
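
`enumerate` also accepts a `start` argument, which is handy when the counter should begin at 1 instead of 0. A minimal sketch with a shortened list of names:

```python
galaxy_names = ["NGC 5128", "TXS 0506+056", "NGC 1068"]

# start=1 only changes the counter, not which elements are visited
numbered = list(enumerate(galaxy_names, start=1))
for rank, name in numbered:
    print(f"Galaxy #{rank}: {name}")
```

Note that with `start=1` the counter no longer matches the list index, so `galaxy_names[rank]` would point at the next element.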


### Creating lists

#### Convert distance list from Mpc to cm

In [28]:
distances_cm = []  # Create a new list
for d in distances_mpc:
    distances_cm.append(d * 3e24)  # Use list's append() method to add items to new list

print(distances_cm)

[1.11e+25, 5.25e+27, 4.32e+25, 4.53e+28, 3.213e+26]


#### Select distances < 100 Mpc and convert them to cm

In [29]:
# Keep only the distances below 100 Mpc and convert them to cm
short_distances_cm = []
for d in distances_mpc:
    if d < 100:
        print(f"Distance = {d} Mpc")
        short_distances_cm.append(d * 3e24)

print(short_distances_cm)

Distance = 3.7 Mpc
Distance = 14.4 Mpc
[1.11e+25, 4.32e+25]


### Introducing list comprehension!
List comprehension can be used to create lists. It allows us to rewrite a for loop in a single line, and can also be used for mapping and filtering. Below, we map a list of distances in Mpc to distances in cm (without having to use map()).

List comprehension involves an expression, a member and an iterable:

In [30]:
distances_cm = [d * 3e24 for d in distances_mpc]

print(distances_cm)

[1.11e+25, 5.25e+27, 4.32e+25, 4.53e+28, 3.213e+26]


Here the expression is 'd*3e24', member is 'd' and iterable is 'distances_mpc'.

We can also use list comprehension for filtering. 

In [31]:
# We can also select elements based on some criterion on the same line:

short_distances_cm = [d * 3e24 for d in distances_mpc if d < 100.]
print(short_distances_cm)

[1.11e+25, 4.32e+25]
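
For reference, the same mapping and filtering can be written with the built-in `map()` and `filter()` functions. The sketch below is only meant for comparison; the helper `mpc_to_cm` and the constant `MPC_TO_CM` are made-up names, and the factor is the approximate value used throughout this notebook.

```python
distances_mpc = [3.7, 1.75e3, 14.4, 1.51e4, 107.1]
MPC_TO_CM = 3e24  # approximate conversion factor used in this notebook

def mpc_to_cm(d):
    """Convert a distance from Mpc to cm."""
    return d * MPC_TO_CM

# Mapping: apply the conversion to every element
distances_cm_map = list(map(mpc_to_cm, distances_mpc))

# Filtering, then mapping: keep distances below 100 Mpc and convert them
short_distances_cm_map = list(map(mpc_to_cm, filter(lambda d: d < 100, distances_mpc)))

print(distances_cm_map)
print(short_distances_cm_map)
```

Both give the same results as the list comprehensions, which most people find easier to read.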


#### Get list of names based on distance criterion

In [32]:
closeby_galaxy_names = [name for name, dist in zip(galaxy_names, distances_mpc) if dist < 100 ]

print(closeby_galaxy_names)

['NGC 5128', 'NGC 1068']


You can write even more complicated list comprehensions, for example with a conditional expression that chooses the value of each element:

In [33]:
distances_mpc_or_opinion = [d if d < 100. else "Too far to care about!" for d in distances_mpc]
print(distances_mpc_or_opinion)

[3.7, 'Too far to care about!', 14.4, 'Too far to care about!', 'Too far to care about!']


### Sidenote: you can also use set comprehension in the same way
With the caveat that if you are interested in the order of the elements, you don't want to use sets.
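
A minimal sketch of a set comprehension: the syntax is the same as for a list comprehension, but with curly braces. The duplicate `3.7` is added here only to show that a set keeps each value once.

```python
distances_mpc = [3.7, 1.75e3, 14.4, 1.51e4, 107.1, 3.7]  # duplicate 3.7 added for illustration

# Curly braces instead of square brackets give a set
short_distances_set = {d for d in distances_mpc if d < 100}

print(short_distances_set)       # the printed order is not guaranteed
print(len(short_distances_set))  # 2, the duplicate is stored only once
```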

### Counting number of members

In [34]:
# You can do this by building a list and checking its length:

print(len(closeby_galaxy_names))

2


In [35]:
# Or better - if you don't need the list you don't have to create it 

count = 0

for dist in distances_mpc:
    if dist < 100:
        count += 1
print(count)

2
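
The counting loop can also be condensed into a single line by combining `sum()` with a generator expression, which looks like a list comprehension without the square brackets and does not build an intermediate list. A sketch:

```python
distances_mpc = [3.7, 1.75e3, 14.4, 1.51e4, 107.1]

# Add 1 for every distance that passes the selection
count = sum(1 for dist in distances_mpc if dist < 100)
print(count)  # 2
```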


### Simultaneously iterating through multiple lists, without and with list comprehension

In [36]:
from math import pi  # What does this mean?

fluxes = []
for lum, d_mpc in zip(luminosities, distances_mpc):
    d_cm = d_mpc * 3e24
    fluxes.append(lum / (4 * pi * d_cm ** 2))
        
print(fluxes)

[6.4586861087531586e-12, 8.661493501599748e-11, 2.089386202070171e-14, 2.404282090867728e-13, 4.2396633647669894e-13]


In [37]:
# Do the same using list comprehension!
fluxes_new = [lum / (4 * pi * (d_mpc * 3e24) ** 2) for lum, d_mpc in zip(luminosities, distances_mpc)]
print(fluxes_new)

[6.4586861087531586e-12, 8.661493501599748e-11, 2.089386202070171e-14, 2.404282090867728e-13, 4.2396633647669894e-13]


### Iterating through tables with nested loops

Here's a rather advanced example -  calculate a 2D table of fluxes based on the luminosities and distances. 

In [38]:
from math import pi

flux_table = []  # Start with an empty list
for lum in luminosities:
    flux_table.append([])  # Add an empty row for this luminosity
    for d_mpc in distances_mpc:
        d_cm = d_mpc * 3e24
        flux_table[-1].append(lum / (4 * pi * d_cm ** 2))  # Fill the current row with fluxes
        
print(flux_table[3][3])

2.404282090867728e-13


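Use list comprehension to rewrite the nested loops in only one line!

In [39]:
# Outer loop over luminosities (rows), inner loop over distances (columns), as in the nested loops above
table = [[lum / (4 * pi * (d_mpc * 3e24) ** 2) for d_mpc in distances_mpc] for lum in luminosities]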

print(table[3][3])

2.404282090867728e-13
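
In a nested list comprehension, the order of the two `for` clauses decides which index runs over rows and which over columns. Checking a single diagonal element such as `[3][3]` cannot reveal a swapped order, so it helps to test with small, non-square data. The sketch below uses made-up labels for that purpose.

```python
rows = ["a", "b"]
cols = [1, 2, 3]

# The right-most `for` builds the rows; the inner `for` fills each row
table = [[f"{r}{c}" for c in cols] for r in rows]

print(table)        # [['a1', 'a2', 'a3'], ['b1', 'b2', 'b3']]
print(table[1][2])  # b3: second row, third column
```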


### The `break` statement

In [40]:
my_list = ["Siya", "Tiya", "Guru", "Buru"]

for name in my_list:
    print(name)
    if name == 'Guru':
        print('Found the name Guru')
        break  # Stop the loop as soon as the name is found

Siya
Tiya
Guru
Found the name Guru


#### Breaks in nested loops

In [44]:
for i in range(4):
    for j in range(4):
        print(f"i={i} and j={j}")

print("Now let's add a break statement")

# What's the output of the following code?
for i in range(4):
    for j in range(4):
        if j == 2:
            break
        print(f"i={i} and j={j}")

i=0 and j=0
i=0 and j=1
i=0 and j=2
i=0 and j=3
i=1 and j=0
i=1 and j=1
i=1 and j=2
i=1 and j=3
i=2 and j=0
i=2 and j=1
i=2 and j=2
i=2 and j=3
i=3 and j=0
i=3 and j=1
i=3 and j=2
i=3 and j=3
Now let's add a break statement
i=0 and j=0
i=0 and j=1
i=1 and j=0
i=1 and j=1
i=2 and j=0
i=2 and j=1
i=3 and j=0
i=3 and j=1
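
As the output shows, `break` only leaves the innermost loop, and the outer loop carries on. If both loops should stop, one possible approach is a flag that the outer loop checks. In this sketch the flag `found`, the list `visited` and the stopping condition are made up for illustration.

```python
found = False
visited = []

for i in range(4):
    for j in range(4):
        visited.append((i, j))
        if i == 1 and j == 2:
            found = True
            break  # leaves the inner loop only
    if found:
        break      # leaves the outer loop as well

print(visited)
```

The last pair visited is `(1, 2)`; without the second `break`, the outer loop would continue with `i=2`.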


### The `continue` statement

In [45]:
for i in range(10): 
    if i % 2:
        continue
    print(f"{i} is even")

0 is even
2 is even
4 is even
6 is even
8 is even


In [46]:
for i in range(10):    
    if not i % 2: 
        continue
    print(f"{i} is odd")

1 is odd
3 is odd
5 is odd
7 is odd
9 is odd


#### Reprise: `continue` in nested loops

In [47]:
# What's the output of the following code?
for i in range(4):
    if i < 2:
        continue
    for j in range(4):
        print(f"{i} and {j}")

2 and 0
2 and 1
2 and 2
2 and 3
3 and 0
3 and 1
3 and 2
3 and 3


### The `while` loop
`while` loops are an alternative to `for` loops. The loop body is repeated as long as a condition is true. This can be useful in cases where user input is involved, or where a desired condition will be met after an unknown number of operations (for example, achieving a desired precision in a numerical calculation).

In [48]:
i = 0
while i < len(galaxy_names):
    print(f"Name: {galaxy_names[i]}; D = {distances_mpc[i]} Mpc")
    i += 1

Name: NGC 5128; D = 3.7 Mpc
Name: TXS 0506+056; D = 1750.0 Mpc
Name: NGC 1068; D = 14.4 Mpc
Name: GB6 J1040+0617; D = 15100.0 Mpc
Name: TXS 2226-184; D = 107.1 Mpc


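A `while` loop can also stop as soon as a condition on the data fails, with similar behavior to the `break` statement in a `for` loop. Note that this is not filtering: the loop ends at the first distance of 100 Mpc or more, so later galaxies that are closer than 100 Mpc (such as NGC 1068) are never printed.

In [49]:
i = 0
# The bounds check avoids an IndexError if every distance is below 100 Mpc
while i < len(distances_mpc) and distances_mpc[i] < 100:
    print(f"Name: {galaxy_names[i]}; D = {distances_mpc[i]} Mpc")
    i += 1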

Name: NGC 5128; D = 3.7 Mpc
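
The numerical-precision use case mentioned above can be sketched with the Babylonian method for square roots, where the number of iterations is not known in advance. All names and the tolerance are choices made for this example, and `max_iterations` is only a safety net against an endless loop.

```python
target = 2.0         # number whose square root we want
tolerance = 1e-12    # desired precision on estimate**2
max_iterations = 100

estimate = 1.0       # initial guess
n_iterations = 0

# Repeat while the estimate is not yet precise enough
while abs(estimate * estimate - target) > tolerance and n_iterations < max_iterations:
    estimate = 0.5 * (estimate + target / estimate)
    n_iterations += 1

print(f"sqrt({target}) is approximately {estimate:.10f} after {n_iterations} iterations")
```

A `for` loop would need a fixed number of repetitions here, whereas the `while` loop stops as soon as the precision is reached.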


### From lists to dictionaries
Create a dictionary mapping each galaxy name to its luminosity.

In [50]:
galaxy_luminosities = {}

for name, lum in zip(galaxy_names, luminosities):
    galaxy_luminosities[name] = lum

print(galaxy_luminosities)
print(galaxy_luminosities["TXS 0506+056"])

{'NGC 5128': 1e+40, 'TXS 0506+056': 3e+46, 'NGC 1068': 4.9e+38, 'GB6 J1040+0617': 6.2e+45, 'TXS 2226-184': 5.5e+41}
3e+46


#### A more pythonic way

In [51]:
galaxy_luminosities = {name:lum for name, lum in zip(galaxy_names, luminosities)}

print(galaxy_luminosities["TXS 0506+056"])

3e+46


#### An even more pythonic way

In [52]:
galaxy_luminosities = dict(zip(galaxy_names, luminosities))

print(galaxy_luminosities["TXS 0506+056"])

3e+46


### Iterate through dictionaries

In [53]:
for k in galaxy_luminosities:
    print(f"{k:15s} has {galaxy_luminosities[k]:.2e} erg/s ")

NGC 5128        has 1.00e+40 erg/s 
TXS 0506+056    has 3.00e+46 erg/s 
NGC 1068        has 4.90e+38 erg/s 
GB6 J1040+0617  has 6.20e+45 erg/s 
TXS 2226-184    has 5.50e+41 erg/s 


#### A more pythonic way using the .items() method:

In [55]:
for k, v in galaxy_luminosities.items():
    print(f"{k:15s} has {v:.2e} erg/s ")

NGC 5128        has 1.00e+40 erg/s 
TXS 0506+056    has 3.00e+46 erg/s 
NGC 1068        has 4.90e+38 erg/s 
GB6 J1040+0617  has 6.20e+45 erg/s 
TXS 2226-184    has 5.50e+41 erg/s 
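
Besides `.items()`, dictionaries also provide `.keys()` and `.values()`. The sketch below uses a shortened version of the dictionary to find the brightest galaxy; the names `brightest_luminosity` and `brightest_name` are made up for this example.

```python
galaxy_luminosities = {"NGC 5128": 1e40, "TXS 0506+056": 3e46, "NGC 1068": 4.9e38}

# .values() iterates over the luminosities only
brightest_luminosity = max(galaxy_luminosities.values())

# Iterating over the dict itself yields the keys; `key=` tells max()
# to compare the luminosity stored under each name
brightest_name = max(galaxy_luminosities, key=galaxy_luminosities.get)

names = list(galaxy_luminosities.keys())
print(names)
print(f"{brightest_name} is the brightest with {brightest_luminosity:.1e} erg/s")
```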


#### Create a dictionary mapping galaxy names to their observed flux
You can use dictionary comprehension, which is similar to list comprehension but produces `key: value` pairs.

In [56]:
from math import pi

obs_flux = {
    name: lum / (4 * pi * (d * 3e24) ** 2)
    for name, lum, d in zip(galaxy_names, luminosities, distances_mpc)
}
for name, flux in obs_flux.items():
    print(f"{name:15s} has an observed flux of {flux:.2e} erg/cm2/s")

NGC 5128        has an observed flux of 6.46e-12 erg/cm2/s
TXS 0506+056    has an observed flux of 8.66e-11 erg/cm2/s
NGC 1068        has an observed flux of 2.09e-14 erg/cm2/s
GB6 J1040+0617  has an observed flux of 2.40e-13 erg/cm2/s
TXS 2226-184    has an observed flux of 4.24e-13 erg/cm2/s


## Closing note: keeping performance in mind
List comprehension results in nice code, but if you are making a very large list, you can run into memory problems. You might be better off using a generator, which produces its items one at a time on demand rather than storing them all in a large list. This might be preferable when working with large datasets. See https://realpython.com/introduction-to-python-generators/ for more details.

In [63]:
# This is memory-intensive
sum([n * n for n in range(50000000)])

41666665416666675000000

In [64]:
# This is less memory-intensive
sum(n * n for n in range(50000000))

41666665416666675000000
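
To get a feeling for the difference, the sketch below compares the size of a list with that of the equivalent generator, using a smaller number of items so that it runs quickly. Note that `sys.getsizeof` reports only the size of the container object itself, not of the integers it refers to, so the real gap is even larger. All names are chosen for this example.

```python
import sys

n_items = 100_000

squares_list = [n * n for n in range(n_items)]  # all items stored at once
squares_gen = (n * n for n in range(n_items))   # items produced on demand

list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)
print(f"list: {list_size} bytes, generator: {gen_size} bytes")

# A generator can only be consumed once
total_from_gen = sum(squares_gen)
total_again = sum(squares_gen)  # the generator is exhausted, so this is 0
print(total_from_gen, total_again)
```

The trade-off is that a generator cannot be indexed or reused, so a list is still the better choice when the items are needed more than once.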