## Wrap-up iterations

Recall our lists from last time:

In [1]:
galaxy_names = ["NGC 5128", "TXS 0506+056", "NGC 1068", "GB6 J1040+0617", "TXS 2226-184"]
distances_mpc = [3.7, 1.75e3, 14.4, 1.51e4, 107.1]  # Mpc
luminosities = [1e40, 3e46, 4.9e38, 6.2e45, 5.5e41] # erg/s

### The `break` statement

In [2]:
my_list = ["Siya", "Tiya", "Guru", "Buru"]

for name in my_list:
    print(name)
    if name == 'Guru':
        print('Found the name Guru')
        break

Siya
Tiya
Guru
Found the name Guru


#### Breaks in nested loops

In [5]:
# What's the output of the following code?

for i in range(4):
    for j in range(4):
        if j == 2:
            break
        print(f"i={i} and j={j}")

i=0 and j=0
i=0 and j=1
i=1 and j=0
i=1 and j=1
i=2 and j=0
i=2 and j=1
i=3 and j=0
i=3 and j=1


### The `continue` statement

In [13]:
for i in range(10): 
    if i % 2:
        continue
    print(f"{i} is even")

0 is even
2 is even
4 is even
6 is even
8 is even


In [14]:
for i in range(10):    
    if not i % 2: 
        continue
    print(f"{i} is odd")

1 is odd
3 is odd
5 is odd
7 is odd
9 is odd


#### Reprise: `continue` in nested loops

In [15]:
# What's the output of the following code?

for i in range(4):
    if i < 2:
        continue
    for j in range(4):
        print(f"{i} and {j}")

2 and 0
2 and 1
2 and 2
2 and 3
3 and 0
3 and 1
3 and 2
3 and 3


### The `while` loop
`while` loops are an alternative to `for` loops: the body is repeated for as long as a condition remains true. This is useful when user input is involved, or when the stopping condition will be met after an unknown number of iterations (for example, reaching a desired precision in a numerical calculation).

In [16]:
i = 0
while ( i < len(galaxy_names) ):
    print(f"Name: {galaxy_names[i]}; D = {distances_mpc[i]} Mpc")
    i=i+1

Name: NGC 5128; D = 3.7 Mpc
Name: TXS 0506+056; D = 1750.0 Mpc
Name: NGC 1068; D = 14.4 Mpc
Name: GB6 J1040+0617; D = 15100.0 Mpc
Name: TXS 2226-184; D = 107.1 Mpc
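
As a sketch of the "desired precision" use case mentioned above, the loop below approximates a square root with Newton's method and stops once the estimate is accurate enough. The variable names and the tolerance are arbitrary choices made for this illustration.

```python
# Hypothetical example: approximate sqrt(2) with Newton's method,
# repeating until estimate**2 is within a tolerance of the target.
target = 2.0
tolerance = 1e-12   # desired precision (an arbitrary choice)

estimate = 1.0      # initial guess
n_steps = 0
while abs(estimate * estimate - target) > tolerance:
    estimate = 0.5 * (estimate + target / estimate)  # Newton update
    n_steps += 1

print(f"sqrt({target}) ~ {estimate:.10f} after {n_steps} steps")
```

The number of iterations is not known in advance, which is what makes a `while` loop a natural fit here.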


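A `while` condition can also test the data itself. The loop then ends at the first element that fails the condition, much like a `break` statement in a `for` loop, so later elements are never examined (NGC 1068, at 14.4 Mpc, is not printed below).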

In [17]:
i = 0
while ( distances_mpc[i] < 100 ):
    print(f"Name: {galaxy_names[i]}; D = {distances_mpc[i]} Mpc")
    i=i+1

Name: NGC 5128; D = 3.7 Mpc
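
The loop above has two subtleties: it stops at the first distant galaxy rather than filtering the whole list, and it would run past the end of the list if every distance passed the cut. A minimal sketch of both points, assuming the same lists as above (the names `leading_nearby` and `all_nearby` are made up for this illustration):

```python
galaxy_names = ["NGC 5128", "TXS 0506+056", "NGC 1068", "GB6 J1040+0617", "TXS 2226-184"]
distances_mpc = [3.7, 1.75e3, 14.4, 1.51e4, 107.1]  # Mpc

# while: stops at the first distance >= 100 Mpc; the length check
# prevents an IndexError if all distances are below the cut
i = 0
leading_nearby = []
while i < len(distances_mpc) and distances_mpc[i] < 100:
    leading_nearby.append(galaxy_names[i])
    i += 1

# comprehension: tests every element, so later matches are kept too
all_nearby = [name for name, d in zip(galaxy_names, distances_mpc) if d < 100]

print(leading_nearby)  # ['NGC 5128']
print(all_nearby)      # ['NGC 5128', 'NGC 1068']
```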


### From lists to dictionaries
Create a dictionary mapping each galaxy name to its luminosity:

In [18]:
galaxy_luminosities = {}

for name, lum in zip(galaxy_names, luminosities):
    galaxy_luminosities[name] = lum

print(galaxy_luminosities)
print(galaxy_luminosities["TXS 0506+056"])

{'NGC 5128': 1e+40, 'TXS 0506+056': 3e+46, 'NGC 1068': 4.9e+38, 'GB6 J1040+0617': 6.2e+45, 'TXS 2226-184': 5.5e+41}
3e+46


#### A more pythonic way

In [19]:
galaxy_luminosities = {name:lum for name, lum in zip(galaxy_names, luminosities)}

print(galaxy_luminosities["TXS 0506+056"])

3e+46


#### An even more pythonic way

In [20]:
galaxy_luminosities = dict(zip(galaxy_names, luminosities))

print(galaxy_luminosities["TXS 0506+056"])

3e+46


## Iterate through dictionaries

In [21]:
for k in galaxy_luminosities:
    print(f"{k:15s} has {galaxy_luminosities[k]:.2e} erg/s ")

NGC 5128        has 1.00e+40 erg/s 
TXS 0506+056    has 3.00e+46 erg/s 
NGC 1068        has 4.90e+38 erg/s 
GB6 J1040+0617  has 6.20e+45 erg/s 
TXS 2226-184    has 5.50e+41 erg/s 


#### A more pythonic way:

In [22]:
for k, v in galaxy_luminosities.items():
    print(f"{k:15s} has {v:.2e} erg/s ")

NGC 5128        has 1.00e+40 erg/s 
TXS 0506+056    has 3.00e+46 erg/s 
NGC 1068        has 4.90e+38 erg/s 
GB6 J1040+0617  has 6.20e+45 erg/s 
TXS 2226-184    has 5.50e+41 erg/s 


#### Create a dictionary mapping galaxy names to their observed flux
You can use a dictionary comprehension, which is similar to a list comprehension but produces `key: value` pairs.

In [23]:
from math import pi

obs_flux = {name : lum / (4 * pi * (d * 3e24) ** 2) for name, lum, d in zip(galaxy_names, 
                                                                      luminosities,
                                                                      distances_mpc) }
for name, flux in obs_flux.items():
    print(f"{name :15s} has an observed flux of {flux:.2e} erg/cm2/s")

NGC 5128        has an observed flux of 6.46e-12 erg/cm2/s
TXS 0506+056    has an observed flux of 8.66e-11 erg/cm2/s
NGC 1068        has an observed flux of 2.09e-14 erg/cm2/s
GB6 J1040+0617  has an observed flux of 2.40e-13 erg/cm2/s
TXS 2226-184    has an observed flux of 4.24e-13 erg/cm2/s


## Closing note: keeping performance in mind
List comprehensions result in nice code, but building a very large list can cause memory problems. In that case consider a generator expression, which produces its items one at a time instead of storing them all in a list. This can be preferable when working with large datasets. See https://realpython.com/introduction-to-python-generators/ for more details.

In [None]:
# This is memory-intensive
sum([n * n for n in range(50000)])

In [None]:
# This is less memory-intensive
sum(n * n for n in range(50000))
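
To make the difference concrete, the sketch below compares the size of the two objects with `sys.getsizeof`, which reports the size of the container itself in bytes. Exact numbers depend on the Python version, so none are quoted here.

```python
import sys

n_items = 50000
squares_list = [n * n for n in range(n_items)]  # all values stored at once
squares_gen = (n * n for n in range(n_items))   # values produced on demand

# getsizeof reports the size of the container object itself, in bytes
print(f"list:      {sys.getsizeof(squares_list)} bytes")
print(f"generator: {sys.getsizeof(squares_gen)} bytes")

# Both give the same sum
total_from_list = sum(squares_list)
total_from_gen = sum(squares_gen)
print(total_from_list == total_from_gen)  # True
```

One trade-off to keep in mind: a generator can only be consumed once, so after the `sum` call `squares_gen` is exhausted, whereas the list can be reused.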

# Functions

Functions make code more maintainable and more organized/modular. If there is a task that you repeat several times in your program, packaging it as a function is much better practice than copying the code snippet for the task repeatedly. If you update the task, you only need to update the function once. Functions allow you to clearly separate different sub-tasks in your program, rather than writing one long list of commands with comments to indicate the sub-tasks.

A function definition starts with the `def` keyword, followed by the ***function signature*** (the name of the function and its parameters in parentheses). In general, a function takes ***arguments*** as input, processes them (via ***statements*** in the function body), and ***returns*** a result as output. A generic function looks like this:

In [None]:
def <function_name>([<parameters>]):
    <statement(s)>

You have already seen built-in functions such as `len()` and `print()`.

In [24]:
type(len)

builtin_function_or_method

The function must be called with parentheses and the right arguments.

In [27]:
len("Hi class")

8

Defining a function does not run it: the body is executed only when the function is called. Here's an example of a user-defined function:

In [33]:
from math import pi

def calc_flux(luminosity, distance_mpc):
    distance_cm = distance_mpc * 3e24
    flux = luminosity / (4 * pi * distance_cm ** 2)
    return flux

lum, dist = 4e45, 100

flux = calc_flux(lum, dist)

print(f"{flux:.2e} erg/cm2/s")

3.54e-09 erg/cm2/s


Here, two arguments are passed to the function. Since no default values are defined for the parameters, the number of arguments passed must match the number of parameters the function expects; otherwise a `TypeError` is raised. Only the ordering determines which argument is mapped to which parameter; these are called positional arguments.

## Indentation
You'll notice above that indentation is important: it delimits the body of the function and separates the function definition from the main code.
- *indentation* means shifting a line of code by either a given number of spaces or a tab (`Tab` key);
- a tab is a *single* special character that is displayed as empty space;
- tab-style indentation may have been popular in the past, but today the standard is space-style indentation using 4 spaces;
- most editors will produce 4 spaces by default (or can be set up to do so!)

### In Python
- indentation in Python is ***part of the syntax!***
- indentation delimits the code of a function, an `if/elif/else` clause, a loop etc.
- any number of spaces is recognised, but it has to be consistent within a block

If a function has no `return` statement, it returns `None`. The function below returns its result explicitly:

In [35]:
def add_one(num):
    num += 1
    return num
    
print(add_one(10))

11
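
The `add_one` function above does include a `return`. As a minimal sketch of the opposite case, here is the same function with the `return` left out (the name `add_one_no_return` is made up for this illustration):

```python
def add_one_no_return(num):
    num += 1  # the result is computed but never returned

result = add_one_no_return(10)
print(result)  # None
```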


## Scope

Scope is an important concept that dictates how repeated variable names are interpreted. The behavior of the code snippet below is probably intuitive:

In [36]:
x = 2
print(x)

x = 3
print(x)

2
3


But here it may not be:

In [37]:
x = 2

def func():
    x = 3
    print(x)

func()
print(x)

3
2


The scope inside the function is different from the scope outside it. Python uses the ***LEGB*** rule, which gives the order in which scopes are searched when a name is looked up: **L**ocal, **E**nclosing, **G**lobal, **B**uilt-in.

In [41]:
count = 0

def bad_function():
    count += 1
    return count

bad_function() # Calling this function will return an error

UnboundLocalError: cannot access local variable 'count' where it is not associated with a value

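The error arises because assigning to `count` anywhere inside the function makes it a local variable for the whole function body, so `count += 1` tries to read a local value that has not been set yet. Variables defined inside a function are local to that function. A namespace is the collection of defined names together with the objects those names refer to.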

Variables created outside of any function (note that functions can be nested) are called global variables.
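
The four LEGB levels can be seen side by side in a small sketch with nested functions. All function names below are invented for this illustration.

```python
x = "global"            # G: defined at module level

def outer():
    x = "enclosing"     # E: local to outer, enclosing for the inner functions

    def inner():
        x = "local"     # L: local to inner
        return x

    def inner_no_local():
        return x        # no local x, so the enclosing one is found

    return inner(), inner_no_local()

def no_local_or_enclosing():
    return x            # falls through to the global x

print(outer())                   # ('local', 'enclosing')
print(no_local_or_enclosing())   # global
print(len("abc"))                # B: len is found in the built-in scope
```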

In [44]:
def add_one(count):
    count += 1 # count is a local variable because it's an argument
    return count


count = 1      # global 
print(count)

add_one(count) # this is returning 2 and we're doing nothing with the result
print(count)   # global variable is unaffected

count = add_one(count) # only now are we updating the variable count 
print(count)

1
1
2


Best practice: give different names to your arguments, local variables and global variables.

In [45]:
# Exercise: rewrite the code snippet above so that the local and global variables are clearly defined
def add_one(n):
    res = n + 1  # Both n and res are local variables
    return res

count = 1      # Calling our global and local variables differently
               # avoids confusion
print(count)
count = add_one(count) # this returns 2 and we're replacing our global variable with it
print(count)   # global variable is changed

1
2


Note that scope matters not just for functions, but also for classes, which we'll see next week, and comprehensions, which we saw last week. For example, this code snippet raises an error, because the loop variable of a comprehension is local to the comprehension:

In [46]:
[item for item in range(5)]
item

NameError: name 'item' is not defined

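A standard `for` loop, by contrast, does not create a new scope: the loop variable remains defined after the loop and overwrites any existing variable of the same name, which might not be the behavior you expect.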

In [47]:
item = 0 

for item in range(5):
    print(item)

item

0
1
2
3
4


4

### Function returns

A function can return any type, including lists, dictionaries, booleans, or even functions.

In [48]:
def is_detectable(flux):
    return flux > 1e-11

print(is_detectable(1e-12))

False


Recall our galaxy catalog from last week.

In [49]:
names = ["NGC 5128", "TXS 0506+056", "NGC 1068", "GB6 J1040+0617", "TXS 2226-184"]
distances = [3.7, 1.75e3, 14.4, 1.51e4, 107.1]  # Mpc
luminosities = [1e40, 3e46, 4.9e38, 6.2e45, 5.5e41] # erg/s

gal_cat = list(zip(names, distances, luminosities))

for name, dist, lum in gal_cat:
    print(f"{name:15s} D={dist:.2e} Mpc, L={lum:.2e} erg/s")

NGC 5128        D=3.70e+00 Mpc, L=1.00e+40 erg/s
TXS 0506+056    D=1.75e+03 Mpc, L=3.00e+46 erg/s
NGC 1068        D=1.44e+01 Mpc, L=4.90e+38 erg/s
GB6 J1040+0617  D=1.51e+04 Mpc, L=6.20e+45 erg/s
TXS 2226-184    D=1.07e+02 Mpc, L=5.50e+41 erg/s


In [53]:
# Exercise: use function is_detectable and galaxy catalog to print whether each galaxy in catalog is detectable or not
def is_detectable(luminosity, distance):
    flux = calc_flux(luminosity, distance)
    #print(flux)
    return flux > 1e-10

for name,dis,lum in gal_cat:
    if is_detectable(lum, dis):
        print(f"{name:15s} is detectable")
    else:
        print(f"{name:15s} is not detectable")

NGC 5128        is not detectable
TXS 0506+056    is not detectable
NGC 1068        is not detectable
GB6 J1040+0617  is not detectable
TXS 2226-184    is not detectable


A function terminates the first time a `return` statement is executed - beware of pitfalls!

In [54]:
def find_first_detectable(catalog):
    for name, dis, lum in catalog:
        if is_detectable(lum, dis):
            return name

firstname = find_first_detectable(gal_cat)
print(f"First detectable galaxy: {firstname:s}")

TypeError: unsupported format string passed to NoneType.__format__

The problem with the above function is that if no element satisfies our requirement, the `return` statement is never reached and the function returns `None` (of type `NoneType`), which does not support the `:s` format. Let's fix that:

In [None]:
# Exercise: rewrite the above function to always return a string

def find_first_detectable(catalog):
    for name, dis, lum in catalog:
        if is_detectable(lum, dis):
            return name  # stop at the first match
    return "None!"       # fallback when no galaxy is detectable

firstname = find_first_detectable(gal_cat)
print(f"First detectable galaxy: {firstname:s}") # Now I know that a string will always be returned

Python functions are extremely flexible and can even return multiple values of different types, packed together as a tuple:

In [None]:
def assess_flux(luminosity, distance):
    flux = calc_flux(luminosity, distance)
    isdetect = is_detectable(luminosity, distance)
    return flux, isdetect

results  = assess_flux(1e45, 100) # above detectability threshold
# results  = assess_flux(1e43, 100) # below detectability threshold
print(results)

if results[1]:
    print(f"A flux of {results[0]:.2e} erg/cm2/s is detectable!\n")

# A better syntax is to "unpack" the result into different variables:

flx, isdet = assess_flux(1e45, 100) # above detectability threshold
# flx, isdet = assess_flux(1e43, 100) # below detectability threshold

print(flx, isdet)

if isdet:
    print(f"A flux of {flx:.2e} erg/cm2/s is detectable!\n")

### Keyword arguments

In [57]:
from math import sqrt

def quadratic(a, b, c):
    x1 = -b / (2*a)
    x2 = sqrt(b**2 - 4*a*c) / (2*a)
    return (x1 + x2), (x1 - x2)

#a=31
#b=93
#c=62
#print(quadratic(a,b,c))
print(quadratic(a=31, b=93, c=62))
print(quadratic(c=62, a=31, b=93))

(-1.0, -2.0)
(-1.0, -2.0)


Keyword arguments can be passed in any order, as the two calls above show. If we use a mix of both, however, positional arguments must come first.

In [58]:
# This will work
a, b = 31, 93
print(quadratic(a, b, c=62))

(-1.0, -2.0)


In [59]:
# This will not
a, c = 31, 62
print(quadratic(a, b=93, c))

SyntaxError: positional argument follows keyword argument (3758098366.py, line 3)

### Default parameters
We can give some parameters default values.

In [61]:
def is_detectable(luminosity, distance, threshold=1e-11): # luminosity and distance have no default:
                                                          # they must always be passed when calling
                                                          # the function. threshold has a default, and
                                                          # will be set to 1e-11 if I don't pass it
                                                          # to the function
                                                        
    flux = calc_flux(luminosity, distance)
    return flux > threshold

print(is_detectable(1e45,100)) # I don't give any value of threshold,
                        # so Python assumes the default value 
                        # I defined in the function (in this case 1e-11) 
        
print(is_detectable(1e45,100, 1e-12)) # Now Python takes the value I passed to the function

print(is_detectable(1e45,100, 1e-9))

True
True
False
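
A parameter with a default can also be passed by keyword, which makes the call easier to read than a bare third number. A minimal, self-contained sketch reusing the definitions from above:

```python
from math import pi

def calc_flux(luminosity, distance_mpc):
    distance_cm = distance_mpc * 3e24  # same Mpc -> cm conversion as above
    return luminosity / (4 * pi * distance_cm ** 2)

def is_detectable(luminosity, distance, threshold=1e-11):
    return calc_flux(luminosity, distance) > threshold

# Naming the argument makes it clear what 1e-9 refers to
print(is_detectable(1e45, 100, threshold=1e-9))  # False

# Parameters without defaults can be passed by keyword as well
print(is_detectable(1e45, distance=100))         # True
```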


Parameters with default values must come ***after*** all the parameters without defaults.

In [62]:
# Trying to define a function like this will throw an error:

def is_detectable(luminosity, threshold=1e-11, distance):
    flux = calc_flux(luminosity, distance)
    return flux > threshold

SyntaxError: parameter without a default follows parameter with a default (3971272819.py, line 3)

When you add a parameter to a function, always remember to update all the functions that depend on it!

In [65]:
def find_first_detectable(catalog, threshold=1e-11):
    firstname = "None!"
    for name, dis, lum in catalog:
        if is_detectable(lum, dis, threshold): # I pass on the threshold
                                               # parameter to all functions
                                               # that depend on it
            firstname = name
            break                              # stop at the first match
    return firstname

firstname = find_first_detectable(gal_cat)
print(firstname)

TXS 0506+056


### Variable length argument lists
In the examples above, we call a function that takes one luminosity and one distance. What if we want to pass in e.g. a group of distances?

In [66]:
def calc_dist_cm(*args):
    for i in args:
        distance_cm = i * 3e24
        print(f"Distance: {distance_cm} cm")

calc_dist_cm(3.7, 1750.0, 14.4, 15100.0, 107.1)

Distance: 1.11e+25 cm
Distance: 5.25e+27 cm
Distance: 4.32e+25 cm
Distance: 4.53e+28 cm
Distance: 3.213e+26 cm


More useful is unpacking an existing list (or tuple) into separate arguments with `*`:

In [68]:
def calc_dist_cm(*args):
    for i in args:
        distance_cm = i * 3e24
        print(f"Distance: {distance_cm} cm")

calc_dist_cm(*distances)

Distance: 1.11e+25 cm
Distance: 5.25e+27 cm
Distance: 4.32e+25 cm
Distance: 4.53e+28 cm
Distance: 3.213e+26 cm


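A similar syntax exists for dictionaries: `**kwargs` collects keyword arguments into a dictionary, and `**` in a call unpacks a dictionary (with string keys) into keyword arguments.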

In [None]:
galaxy_luminosities = dict(zip(names, luminosities))

def print_galaxies(**kwargs):
    for k, v in kwargs.items():
        print(f"Name: {k}, Luminosity {v} erg/s ")

print_galaxies(**galaxy_luminosities)
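
Both forms can be combined in one function. The sketch below uses a made-up helper, `describe_call`, and made-up option names purely to show how the arguments arrive inside the function:

```python
def describe_call(*args, **kwargs):
    """Report how the arguments were received (illustrative helper)."""
    # args is a tuple of positional arguments, kwargs a dict of keyword ones
    return f"{len(args)} positional, keywords: {sorted(kwargs)}"

distances = [3.7, 14.4]
options = {"unit": "Mpc", "verbose": True}

print(describe_call(*distances, **options))
# 2 positional, keywords: ['unit', 'verbose']
```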

## Recursion
A function can call not only other functions, but also itself.

In [69]:
def fibonacci(n):
    if n <= 1:
        return n
    else:
        return fibonacci(n - 1) + fibonacci(n - 2)
    
for i in range(12):
    print(fibonacci(i))

0
1
1
2
3
5
8
13
21
34
55
89


In [70]:
# Exercise: write a function for calculating a factorial, and print 0! through 9!
def factorial(n):
    if n <= 1:
        return 1
    else:
        return n * factorial(n-1)

for i in range(10):
    print(i, factorial(i))

0 1
1 1
2 2
3 6
4 24
5 120
6 720
7 5040
8 40320
9 362880
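
Recursion is compact, but the recursive `fibonacci` above recomputes the same values many times as `n` grows. For comparison, here is a sketch of a loop-based version (the name `fibonacci_iter` is an invented one) that produces the same sequence:

```python
def fibonacci_iter(n):
    """Return the n-th Fibonacci number using a loop instead of recursion."""
    a, b = 0, 1          # F(0), F(1)
    for _ in range(n):
        a, b = b, a + b  # shift the pair one step along the sequence
    return a

print([fibonacci_iter(i) for i in range(12)])
# [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
```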


## Type hints
- Python is dynamically typed: variables and arguments can hold values of any type, and types are not checked before the code runs!
- since Python 3.5, *type hints* are supported: we can indicate what types a function is supposed to take as arguments and what type it returns!

In [None]:
def f(a : int, b : int) -> str:
    if a > b:
        return f"{a} is greater than {b}"
    else:
        return f"{a} is less than or equal to {b}"

print(f(1, 2))

print(f(1.5, 3.5))

- the Python interpreter does not complain if you don't respect type hints; after all, it is a *dynamically typed* language!
- type hints are useful **to you** to remember how a function is supposed to behave: they may seem (and probably are) unnecessary at this level but it is important to **pick up good habits** from the start!
- there are tools known as **static type checkers** (one is `mypy`) that can check whether your code respects all the type declarations.
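
As an illustration, here is how the `calc_flux` function from earlier might look with type hints added. The choice of `float` for every parameter is an assumption made for this sketch.

```python
from math import pi

def calc_flux(luminosity: float, distance_mpc: float) -> float:
    """Return the flux in erg/cm2/s for a luminosity in erg/s."""
    distance_cm = distance_mpc * 3e24
    return luminosity / (4 * pi * distance_cm ** 2)

# Hints are stored on the function but not enforced at run time
print(calc_flux.__annotations__)

# Passing integers still works, even though the hints say float
print(f"{calc_flux(4e45, 100):.2e}")  # 3.54e-09
```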

## Defining functions in one line with `lambda`
`lambda` functions are an example of Python's support for functional programming. They provide a compact alternative to `def` for simple functions. A `lambda` consists of a keyword (`lambda`), one or more parameters, and a body made of a single expression.

In [71]:
# The above factorial function can be rewritten as

factorial = lambda n: n * factorial(n-1) if n > 1 else 1

print(factorial(4.), factorial(10))

24.0 3628800


`lambda` functions can also take several arguments, but should be used only for simple tasks so as not to become unreadable. Notice that the arguments are separated by commas, but not enclosed in parentheses. 

In [72]:
hypotenuse = lambda x,y: (x ** 2 + y ** 2) ** 0.5

sa, sb = 3, 4
sc = hypotenuse(sa,sb)
print(f"A={sa}, B={sb} -> C = {sc}")

A=3, B=4 -> C = 5.0


You can also pass in values for the arguments in the same line.

In [73]:
hypotenuse = (lambda x,y: (x ** 2 + y ** 2) ** 0.5)(3,4)
print(hypotenuse)

5.0


`lambda` functions have some key differences from standard functions. They don't support statements within the body of the function, or type hints. For example, neither of the examples below will work.

In [74]:
(lambda x: return x**2)(2)

SyntaxError: invalid syntax (2525065835.py, line 1)

This is the correct syntax.

In [75]:
(lambda x: x**2)(2)

4

In [76]:
hypotenuse = (lambda x: int,y: int: (x ** 2 + y ** 2) ** 0.5)(3,4)

SyntaxError: invalid syntax (2102252273.py, line 1)

In contrast to functions defined with `def`, a `lambda` function can be defined and invoked in a single expression, as in the examples above, which can be particularly convenient within a Jupyter notebook.
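
A common place where `lambda` functions are handy is as a short, throwaway `key` function for the built-in `sorted`. The sketch below sorts a reduced version of the galaxy catalog by distance; the name `by_distance` is an invented one.

```python
names = ["NGC 5128", "TXS 0506+056", "NGC 1068", "GB6 J1040+0617", "TXS 2226-184"]
distances = [3.7, 1.75e3, 14.4, 1.51e4, 107.1]  # Mpc

gal_cat = list(zip(names, distances))

# key receives one (name, distance) tuple and returns the value to sort by
by_distance = sorted(gal_cat, key=lambda entry: entry[1])

for name, dist in by_distance:
    print(f"{name:15s} D={dist:.1f} Mpc")
```

Defining a named function with `def` would work equally well here; the `lambda` simply keeps the sorting rule next to the call that uses it.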

## Docstrings
It is important to add docstrings to your functions: they make it easier for you and other users to remember and understand what your code is doing. Even in the most obvious cases, your docstring should be at least one line:

In [1]:
def add_one(n):
    """Calculate n+1 and return the result."""
    res = n + 1  
    return res

def check_script():
    """Check if the script is running."""

If the function does something more complex, you should write a more complete docstring, in this general form:

In [3]:
def my_function(par1, par2):
    """
    One-line description of the purpose of the function.

    If necessary, you can add here a second paragraph explaining in detail
    the rationale and usage of the function, including an example if 
    necessary. By using three quotation marks, every line in between is 
    interpreted as part of the same string. So use line breaks like this 
    to keep your lines short.
    
    Args:
        par1: a number
        par2: a second number
    
    Returns:
        The result of some operation on our input
    """
    res = some_operation(par1, par2)
    return res

Strings written in this fashion will become the docstring of the function, which will help your future self or your collaborators understand your code:

In [4]:
help(my_function)

Help on function my_function in module __main__:

my_function(par1, par2)
    One-line description of the purpose of the function.

    If necessary, you can add here a second paragraph explaining in detail
    the rationale and usage of the function, including an example if
    necessary. By using three quotation marks, every line in between is
    interpreted as part of the same string. So use line breaks like this
    to keep your lines short.

    Args:
        par1: a number
        par2: a second number

    Returns:
        The result of some operation on our input



In [5]:
help(len)

Help on built-in function len in module builtins:

len(obj, /)
    Return the number of items in a container.



In the Jupyter environment, you can also get the docstring by pressing Shift+Tab on a function.
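
Outside of `help()` and the notebook shortcut, the docstring is also available programmatically. A small sketch using the `__doc__` attribute and the standard-library function `inspect.getdoc`:

```python
import inspect

def add_one(n):
    """Calculate n+1 and return the result."""
    return n + 1

print(add_one.__doc__)          # the docstring is stored on the function
print(inspect.getdoc(add_one))  # same text, with indentation normalised
```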

For more information about common docstring formats, check:

- https://stackoverflow.com/questions/3898572/what-are-the-most-common-python-docstring-formats 
- https://betterprogramming.pub/3-different-docstring-formats-for-python-d27be81e0d68

For more on how to write good docstrings, check out the PEP conventions:

https://peps.python.org/pep-0257/

## A last note on built-in functions

Be careful not to give a variable the same name as a built-in function! It is allowed, but the new name shadows the built-in, which can then no longer be called by that name.

In [6]:
len

<function len(obj, /)>

In [7]:
type(len)

builtin_function_or_method

In [8]:
len = 2

In [9]:
type(len)

int

In [10]:
len("Hello World")

TypeError: 'int' object is not callable
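
The built-in itself is not destroyed, only hidden by the new name. One way to get it back, sketched below, is to look it up in the standard `builtins` module. In a notebook, running `del len` to remove the shadowing variable has the same effect.

```python
import builtins

len = 2                      # shadows the built-in in this namespace
print(callable(len))         # False

len = builtins.len           # point the name back at the original function
print(len("Hello World"))    # 11
```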