# Scientific Programming: A Crash Course

## Class 2 – Going Abstract

In the previous class we learned all the basic ingredients of programming (numbers, strings, lists, etc.), as well as the fundamental "control structures" (`if`, `for`, and `while`). This small handful of components will allow you to do practically anything – the only limits are your time and imagination! However, there are still some more programming structures that are worth learning. Although these new structures are less fundamental, they will help you to organize your code into discrete, modular chunks that are easier to understand and conceptualize.

A key concept in programming is abstraction. This is the idea that we can wrap up a bunch of technical details in a single machine, and then we can forget about all the internal technical details – we just need a high-level understanding of what the machine does. In Python, there are four main levels of abstraction:

1. Functions: Lines of code can be organized into functions that carry out particular procedures.

2. Classes: A bunch of related functions can be organized together in a class.

3. Modules: A bunch of functions and classes can be organized together in a module.

4. Packages: A bunch of modules can be organized together in a package.

As we move up the hierarchy, towards the more abstract levels, we can forget about the nitty-gritty details at the lower levels and focus on more general concepts. Likewise, as we move down the hierarchy we can zoom in to more technical details. For example, the [PsychoPy](https://psychopy.org) package, which is used for running behavioral experiments, is comprised of several modules, including one module to deal with timing, another to deal with hardware, and another to deal with sound. Within each module, there are several classes; for example, within the hardware module, you might have classes for dealing with the mouse, keyboard, and monitor. Finally, within classes, there will be multiple functions (technically, they're called "methods"), which define particular procedures; for example, within the mouse class, there will be methods to access where the mouse is currently positioned or which button is currently being pressed.
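To make the hierarchy a little more concrete, here is a minimal sketch that walks down all four levels. It uses the `json` package from the Python Standard Library rather than PsychoPy (an assumption made purely so that the example runs without installing anything), and the `import` syntax it relies on is explained later in this notebook.

```python
# Package: json -> Module: json.decoder -> Class: JSONDecoder -> Method: decode()
from json.decoder import JSONDecoder

decoder = JSONDecoder()                 # create an object from the class
numbers = decoder.decode('[1, 2, 3]')   # call one of the object's methods
print(numbers)
```

You don't need to understand how `decode()` works internally; you only need the high-level idea that it turns text into a Python list.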

Today I mostly want to look at functions, but we'll also briefly look at some of the higher-level abstractions.

## Functions

You've already used several functions, perhaps without even realizing. In the last notebook, you used the `print()` function in order to put stuff on the screen, you used the `len()` function to find out how many items are in a list, and you used the `range()` function to generate a sequence of counting numbers.

These are all **"built-in" functions** – that is, functions that are a core part of Python. Compared with some other languages, Python has a fairly small set of built-in functions; however, there are many useful functions which can be **imported** if you need to use them. More importantly, you can define your own functions. This turns out to be extremely useful for organizing your code and minimizing unnecessary code duplication. It also makes it easier to test and understand your code because you can isolate each part of the machine and understand it independently of everything else. When you become a proficient programmer, 99% of your code will be organized into functions: You define all the functions you need and then you "conduct" them as if you were conducting an orchestra.

### Built-in functions

You can find a full list of the built-in functions here: https://docs.python.org/3/library/functions.html. Of these, I think the most commonly used ones are `abs`, `round`, `min`, `max`, `sum`, `sorted`, `reversed`, `enumerate`, and `zip`. Most of these are fairly obvious, but let's take a look at each in turn.

`abs()` gives you the absolute value of a number (it removes the sign):

In [None]:
print(abs(2))

In [None]:
print(abs(-2))

In [None]:
print(abs(-2.5))

`round()` is used to round a float to a specified number of decimal places. For example, here we round *π* to 3 decimal places.

In [None]:
print(round(3.14159, 3))

If you don't specify the number of digits, the `round()` function will default to 0 decimal places, that is, to the nearest whole number (this also causes the number to be cast as an int):

In [None]:
print(round(3.14159))

`min()`, `max()`, and `sum()` do exactly what you would expect: Given a *list* of numbers, these functions return the minimum, the maximum, and the summation.

In [None]:
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

print(min(numbers))
print(max(numbers))
print(sum(numbers))

The `sorted()` function also does exactly what you would expect – it returns a sorted version of a list:

In [None]:
unsorted_numbers = [8, 4, 9, 5, 1, 2, 3, 6, 7, 10]
sorted_numbers = sorted(unsorted_numbers)
print(sorted_numbers)

The `sorted()` function can also be used with strings, in which case, the strings are sorted alphabetically:

In [None]:
colors = ['blue', 'red', 'green', 'yellow', 'black']
sorted_colors = sorted(colors)
print(sorted_colors)

However, look what happens when we try to reverse a list using the `reversed()` function:

In [None]:
colors = ['blue', 'red', 'green', 'yellow', 'black']
reversed_colors = reversed(colors)
print(reversed_colors)

Here you should see something weird like `<list_reverseiterator object at 0x11089ff70>`. The reason for this is that the `reversed()` function doesn't actually create a new list with the elements in reverse order. Instead, it creates an "iterator object." You can combine this object with a for-loop in order to traverse the elements in reverse order:

In [None]:
colors = ['blue', 'red', 'green', 'yellow', 'black']
reversed_colors = reversed(colors)
for color in reversed_colors:
    print(color)

This behavior is a little inconvenient sometimes, but the function is designed this way to be more computationally efficient. If you had a list with a billion numbers and tried to reverse it, the computer would have to waste a lot of memory representing two versions of the list: the original list and the reversed version. By creating an iterator object, Python doesn't actually need to create a new list in memory; the iterator object just represents the *concept* of looking at the original list in reverse order. When you iterate over the iterator object, Python just reads the original list in reverse.
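If you do want a real list in reverse order, one option is to pass the iterator to `list()`. The sketch below shows this, and it also shows that an iterator can only be traversed once (the variable names are just for illustration):

```python
colors = ['blue', 'red', 'green', 'yellow', 'black']

# list() reads through the iterator and builds a new list in memory
reversed_colors = list(reversed(colors))
print(reversed_colors)

# An iterator is used up after one pass
iterator = reversed(colors)
first_pass = list(iterator)
second_pass = list(iterator)   # nothing left to read, so this is empty
print(first_pass)
print(second_pass)

# The original list is untouched
print(colors)
```

Bear in mind that calling `list()` gives up the memory saving described above, because a second full list is created; the lazy iterator is what keeps things efficient.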

This type of behavior is quite common in Python. For example, the `range()` function does the same thing. If you write something like `range(1, 1000000000001)` (the range of numbers from one to one trillion), the numbers are not actually created in memory. In fact, I'm not sure it's even possible to represent this many numbers in memory simultaneously. If each number requires 64 bits to represent, then you would need 64 trillion bits, which is equivalent to eight trillion bytes, or roughly eight thousand gigabytes. To put that in context, my computer has 32 gigabytes of RAM. Nevertheless, Python will quite happily represent the *concept* of a trillion consecutive numbers with no problem.
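You can check this for yourself. The sketch below assumes a standard 64-bit Python installation and uses `sys.getsizeof()` from the standard library (not covered elsewhere in this notebook) to ask how many bytes the range object itself occupies; the exact number may vary between Python versions, but it should be tiny.

```python
import sys

one_trillion = range(1, 1000000000001)

print(len(one_trillion))             # how many numbers the range represents
print(one_trillion[0])               # the first number
print(one_trillion[-1])              # the last number
print(sys.getsizeof(one_trillion))   # size of the range object in bytes
```

The range knows its start, stop, and step, and that is all it needs in order to work out any individual number on demand.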

#### `enumerate` and `zip`

These last two built-in functions, `enumerate()` and `zip()`, also create iterator objects. They offer some handy functionality that is mostly useful when used in combination with a for-loop. First, a very common thing we need to do when iterating over items is to keep track of the index of each item; this is what `enumerate()` allows you to do – it enumerates (i.e. numbers) the items in the list.

In [None]:
colors = ['blue', 'red', 'green', 'yellow', 'black']

for index, item in enumerate(colors):
    message = f'The element with index {index} is {item}.'
    print(message)

In particular, note the slightly different syntax. Instead of a single iterator variable, there are now two iterator variables `index` and `item`, separated by a comma. As always, you are free to pick the variable names. At the moment, it might not seem so obvious why the `enumerate()` function is useful, but personally I use it all the time. For example, maybe I have a bunch of items that I want to plot in different plots. By using `enumerate()` I can place the first item in plot 1, the second item in plot 2, ... and so on.
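As a small sketch of the plotting idea, `enumerate()` accepts an optional `start` argument, so the numbering can begin at 1 instead of 0. No real plotting happens here; the `plot_number` variable and the printed message are just stand-ins for whatever you would actually do with each item.

```python
colors = ['blue', 'red', 'green', 'yellow', 'black']

# start=1 makes the numbering begin at 1 rather than 0
for plot_number, color in enumerate(colors, start=1):
    message = f'Plot {plot_number} will show {color}.'
    print(message)

numbered_colors = list(enumerate(colors, start=1))
print(numbered_colors)
```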

The `zip()` function does something similar. It allows you to "zip together" two (or more) lists so that an item in one list is paired up with the corresponding item in another list, like this:

In [None]:
favorite_colors = ['blue', 'red', 'green', 'yellow', 'black']
friends = ['Francesco', 'Laura', 'Matteo', 'Gabriele', 'Andrea']

for color, friend in zip(favorite_colors, friends):
    message = f"{friend}'s favorite color is {color}"
    print(message)

Try experimenting with a third list. For example, maybe you also have another list containing your friends' ages and you want to zip all this information together.

### Importing functions

Aside from these core Python functions, which are always available to you, you can also import functions that are a little more specialist. A standard Python installation includes a bunch of built-in **modules**, collectively known as the **Python Standard Library**. For example, some commonly used modules are `math`, `random`, `re`, `itertools`, `time`, `pathlib`, `os`, and `collections`. We won't look at all of these in detail; I just want you to know that there are lots more tools available in the standard library, but they need to be explicitly imported if you need to use them. Going beyond the standard library, you can also install packages that provide even more functionality.

To import a module, you use `import` followed by the name of the module. By convention, you usually put your `import` statements at the top of a script. Let's import the `math` module, which provides various mathematical functions.

In [None]:
import math

Once the module has been imported, you can use the functions it contains by using **"dot notation."** You write the name of the module, then a dot (`.`) and then the name of the function. The dot means something like, "look inside." For example, to use the `sqrt()` function (square root), we can do this:

In [None]:
print(math.sqrt(16))

So, conceptually, the `sqrt()` function is located inside the `math` module. If we were planning to use the `sqrt()` function a lot, we might choose to import that function specifically, in which case we would use the following syntax:

In [None]:
from math import sqrt

Now we can use the `sqrt()` function directly, without having to specify the name of the module (`math`) every time we use it:

In [None]:
print(sqrt(64))

Another useful module, especially in science, is `random`. The `random` module contains various functions for generating random numbers. For example, the `randint()` function generates a random integer between *x* and *y*:

In [None]:
import random

print(random.randint(0, 10))

If you run the above cell a few times, you will see that you get different random numbers. A slightly weird thing about the `randint()` function is that the range is expressed in inclusive:inclusive format, which is inconsistent with pretty much all other functions in Python, where ranges are inclusive:exclusive. This annoys me every time I use this function! But, we're kind of stuck with this design decision now. Realistically, the Python developers can't change the behavior of this function because that would break backwards compatibility: Many people's scripts would no longer behave correctly under newer versions of the language.
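Here is a quick way to convince yourself of the difference. The sketch draws a large number of random integers and compares the largest one with the largest number in `range(0, 10)`. The seed value and the number of draws are arbitrary choices made so that the example is repeatable.

```python
import random

random.seed(1)   # arbitrary seed, only so the example is repeatable

randint_draws = []
for draw_number in range(10000):
    randint_draws.append(random.randint(0, 10))

# randint() includes both endpoints
print(min(randint_draws), max(randint_draws))

# range() includes the start but excludes the stop
print(min(range(0, 10)), max(range(0, 10)))
```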

I mention this because bugs relating to random numbers can be particularly hard to find and squash. If you weren't aware of this particular behavior, you would probably expect `random.randint(0, 10)` to yield numbers from 0 to 9 and never 10 (based on how things work everywhere else in the language). Because of the nature of random numbers, it might be difficult to even notice that there's a problem; you might have to run your code several times to even notice it. Furthermore, the bug would only manifest itself some of the time – seemingly at random – making it hard to locate the source of the problem. For example, imagine you wanted to run an experiment where you show participants random color words; you might have some code like this:

In [None]:
colors = ['blue', 'red', 'green', 'yellow', 'black']

for trial_number in range(10):
    random_color_index = random.randint(0, 5)
    random_color = colors[random_color_index]
    print(random_color)

When you run the code above, it might work perfectly first time (if so, try it again); however, most of the time it will fail with an `IndexError` and the message "list index out of range." Do you see why this is happening? How would you fix this bug? Can you think of some more elegant ways to write this code? (hint: try investigating the `choice()` function from the `random` module)

Another very common thing that we need to do when running experiments is to shuffle some stimuli, participants, or treatments into a random order. The `shuffle()` function is useful for this purpose:

In [None]:
colors = ['blue', 'red', 'green', 'yellow', 'black']

random.shuffle(colors)

for color in colors:
    print(color)

Here we see another slightly odd thing about the `random` module: The `shuffle()` function operates on the list **in place**. This means that the original list is modified directly; the function does not create and return a new shuffled version of the list. For example, you might have expected to write the code like this:

In [None]:
colors = ['blue', 'red', 'green', 'yellow', 'black']

shuffled_colors = random.shuffle(colors)

for color in shuffled_colors:
    print(color)

But, as you can see, this produces a `TypeError` with the rather obscure message "NoneType object is not iterable." What's going on here? Can you find the source of this error? Why is Python trying to iterate over "NoneType"? What does this even mean!?

Being able to read and interpret errors is an extremely valuable skill, but, unfortunately, it takes time and experience to develop. Python will try its best to explain why it can't interpret your code; however, sometimes you need to put yourself in Python's shoes – you need to look at the code from the point of view of the computer.

Often, to interpret an error, you need to work backwards to "trace" its source. This is why the error messages are described as a "traceback" – the error message shows the sequence of things that happened that ultimately led to the error. For example, in the code above, it seems like the error happens on line 5; after all, there's a big green arrow pointing to line 5. But, in fact, the real source of the error is in line 3. Putting myself in Python's shoes, I would work through it like this:

1. On line 5, Python got stuck and says "NoneType" is not iterable.

2. That means the for-loop is trying to iterate over `None` (incidentally, this is how Python represents null / nothing / none).

3. If the for-loop is trying to iterate over `None`, that implies that the variable `shuffled_colors` is equal to `None`.

4. If `shuffled_colors` is `None` that means the `random.shuffle()` function returned `None`.

5. Why does `random.shuffle()` return `None`?

At this point, I would then remember... oh yeah!... `random.shuffle()` operates on the list in place; it does not return a copy of the list. (If a function doesn't return anything, `None` is returned by default).
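The sketch below confirms this by printing what `random.shuffle()` returns. It also shows one possible way to get a shuffled copy while leaving the original list alone, using `random.sample()` from the same module; this is just one option, and the variable names are my own.

```python
import random

colors = ['blue', 'red', 'green', 'yellow', 'black']

# Option 1: shuffle in place, then keep using the same variable
result = random.shuffle(colors)
print(result)   # None, because shuffle() has no return value
print(colors)   # the same list, now in a new order

# Option 2: ask for a shuffled copy, leaving the original list alone
original_colors = ['blue', 'red', 'green', 'yellow', 'black']
shuffled_colors = random.sample(original_colors, k=len(original_colors))
print(original_colors)
print(shuffled_colors)
```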

Here's my advice: Don't get mad if the computer doesn't understand you; keep calm and work backwards through the code! If you need to, print some of the variables to verify that they are indeed equal to what you expect them to be equal to. Pretty soon, you should be able to track down the source of the bug. You might be tempted just to copy and paste *TypeError: 'NoneType' object is not iterable* into Google. This *might* help you, but it might also just confuse you even more; there are many ways this particular error could come about, and the people on Stack Overflow are probably getting this error for totally unrelated reasons. (Nevertheless, please feel free to Google stuff when you get stuck – I do it every day – just be aware of the potential pitfalls).

Learning how to interpret error messages is like learning how to read sheet music; it takes time to develop the skill but eventually it becomes second nature.

### Writing functions

Okay! Sorry! I got a bit side-tracked there. Let's turn to something much more exciting. Writing functions!

Here's a function definition:

In [None]:
def product(numbers):
    total = numbers[0]
    for number in numbers[1:]:
        total *= number
    return total

Before I explain the new syntax here, let's actually *use* the function first, so you can see the relationship with the built-in functions that you're already familiar with.

The goal of the function is to calculate the product of a bunch of numbers (that is, all the numbers multiplied together). To use the function, I just have to give it a list of numbers and it calculates the product:

In [None]:
some_numbers = [4, 9, 3, 5]
print(product(some_numbers))

So here, 4 × 9 × 3 × 5 = 540. Your list can contain any number of numbers and the product function will produce the correct answer. The nice thing about this is that I can code the function once, and then I can just forget all the details about how it actually works internally; I can just trust that when I give the `product()` function a list of numbers it will calculate the product correctly. As you can see, the `product()` function that we've created here is *used* just like the built-in `sum()` function that you briefly saw earlier – the only difference is that we created the `product()` function ourselves.

In [None]:
print(product([4, 9, 3, 5]))
print(sum([4, 9, 3, 5]))

Now let's study the function definition in more detail:

```python
def product(numbers):
    total = numbers[0]
    for number in numbers[1:]:
        total *= number
    return total
```

To define a function you use the `def` keyword (short for define). This is followed by the name of the function. Like variables, you can choose whatever name you want, but it's good to choose something descriptive. The name of the function is followed by a set of parentheses which contain the function's **arguments**, or in other words, variable names that will be used to represent the function's inputs. In this case, the function has a single argument `numbers` (note that, although the user will pass in multiple numbers, they only actually pass in one object, a list; thus, the function only takes one input – it has one argument). As you would expect, you are free to choose the argument names. Here I chose the argument name `numbers` because it communicates to the user what I expect them to pass in – a list of numbers. Finally, like `if`, `for`, and `while`, the first line of the function definition ends with a colon (`:`).

Now, everything indented inside the function describes the actual procedure that needs to be carried out – how exactly to calculate a product. I won't explain the internal workings here – instead, I leave it to you to study the code and understand how it works. However, the important thing is that the function definition ends with a `return` statement. The `return` statement specifies the output of the function – the object that will be returned when the function completes. In this case, our `product()` function needs to return... the product, which I represented by the variable `total`.
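To see why the `return` statement matters, compare the two versions below. They are identical except that the second one (a deliberately broken variant, written only for illustration) has no `return` statement.

```python
def product_with_return(numbers):
    total = numbers[0]
    for number in numbers[1:]:
        total *= number
    return total


def product_without_return(numbers):
    total = numbers[0]
    for number in numbers[1:]:
        total *= number
    # No return statement: the result is calculated and then thrown away


good_result = product_with_return([4, 9, 3, 5])
bad_result = product_without_return([4, 9, 3, 5])

print(good_result)   # 540
print(bad_result)    # None
```

The second function does all the work but never hands the answer back, so Python returns `None` by default.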

I hope all this is making some sense – but if you're feeling lost, please get my attention so we can work through it together. Functions are super useful, and I really want to make sure everyone understands them. Once you understand how to make your own functions, you are well on your way to becoming a proficient programmer!

Before we move on, I want to show you a bad way to code and a good way to code so that you can compare and see the difference. First, let's look at the bad way:

In [None]:
some_numbers = [3, 9, 2, 8, 10]
total = some_numbers[0]
for number in some_numbers[:]:
    total *= number
print(total)

some_more_numbers = [5, 2, 9, 11, 13, 3]
total = some_more_numbers[0]
for number in some_more_numbers[1:]:
    total += number
print(total)

even_more_numbers = [4, 9, 9, 3]
total = even_more_numbers[0]
for number in some_more_numbers[1:]:
    total *= number
print(total)

This is pretty terrible! The main problem is that there's a lot of code duplication – we are writing the same lines over and over again, which is very messy and fragile. It's also not really clear what the overall purpose is, and how can we be sure that the printed outputs are actually correct? In fact, I deliberately placed three bugs in this code, which you probably didn't notice. Can you find and fix them?

Now let's look at the good way to write this code:

In [None]:
def product(numbers):
    '''
    A function to return the product of a list of numbers.
    '''
    total = numbers[0]
    for number in numbers[1:]:
        total *= number
    return total


print(product([3, 9, 2, 8, 10]))
print(product([5, 2, 9, 11, 13, 3]))
print(product([4, 9, 9, 3]))

Bellissimo! The code is short, tidy, and organized, and most importantly it's quick and easy to grasp exactly what the function does. I even included a **"docstring"** (documentation string) to give an English description of what the function does. Even if you don't delve into the technical details of how the function works, you can at least understand conceptually what the function is doing overall.

If you ever find yourself repeating the same lines of code again and again, that's a good sign that you should probably be using a function. There are several important advantages to using functions:

1. If you repeat the same chunk of code multiple times and then decide you need to change how the chunk of code works, you will need to make the same change in multiple places, which is annoying and error-prone (e.g. you might forget to update some of the chunks). By placing the code in a function, you only have to change it in one place.

2. The code is easier to read and comprehend because you can focus on what the function does conceptually and forget about the technical details of how it works.

3. It's easier to test that the code works correctly because you've isolated the chunk of code. You can test each function independently, without interference from other bits of code.
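As a sketch of the third point, here is one simple way to test the `product()` function in isolation. It uses Python's `assert` statement, which we haven't covered: an `assert` does nothing if the condition is true and stops with an `AssertionError` if it is false. The particular test cases are just examples I picked.

```python
def product(numbers):
    '''
    A function to return the product of a list of numbers.
    '''
    total = numbers[0]
    for number in numbers[1:]:
        total *= number
    return total


# Each line checks one case where we already know the right answer
assert product([4, 9, 3, 5]) == 540
assert product([7]) == 7          # a single number is its own product
assert product([2, 0, 5]) == 0    # anything multiplied by zero is zero
assert product([-1, -1]) == 1     # two negatives make a positive

print('All checks passed.')
```

If you later change how `product()` works internally, you can rerun these checks to confirm that it still gives the right answers.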

### More on arguments

In the example above, we just used a single argument (`numbers`), but functions can take multiple arguments. In principle, I don't think there's any hard limit on the number of arguments, but, as a rule of thumb, fewer than six is probably good. If your function requires a lot of arguments, it's probably getting a bit out of control and it might be time to reorganize things by breaking the function down into smaller, more specialized functions. In any case, here's an example of a function with two arguments, so you get the general idea:

In [None]:
def raise_x_to_y(x, y):
    return x**y

print(raise_x_to_y(2, 10))

In addition, you can also make arguments optional; if the user doesn't provide a value for a given argument, it will default to a specified value. For example:

In [None]:
def raise_x_to_y(x, y=2):
    return x**y

print(raise_x_to_y(2, 10))
print(raise_x_to_y(2))

As you can see, if I don't specify a second argument (i.e. the `y` value), `y` will default to `2`. It's also worth noting that you can explicitly specify the arguments when calling the function. This allows you to pass in the arguments in any order, which is especially useful if you can't remember what order they're supposed to go in.

In [None]:
print(raise_x_to_y(x=2, y=10))
print(raise_x_to_y(y=10, x=2))

### Fun with functions

Let's do something a bit more fun than boring mathy stuff. Explore the code below and then try running it a few times to see what it does. Try to understand the code at a global level first, and then dig into the individual functions to see how they work.

In [None]:
from random import choice


positive_adjectives = ['funny', 'determined', 'happy', 'stable']
negative_adjectives = ['anxious', 'rambunctious', 'timid', 'chaotic']

prophecies = [
    ['Today is perfect for new endeavors. ', 'The tensions of this week will feel heavier today than yesterday. ', 'Today is the day to cherish and embrace others. ', 'Making yourself useful is a main component of a successful day. ', 'Today, exercise caution when crossing the street. ',],
    ['Remember that good things come to those who work hard. ', 'Don’t let the circumstances bring you down. ', 'Patience is key, but sometimes a little push can get the job done. ', 'A smile can get you a long way. '],
    ['Looking ahead may seem like a waste of time, but it pays off in the end. ', 'Luck favors those who mind the risks and take them. ', 'Today is the day for that thing you always wanted to do. ', 'Luck is on your side today, so seize it! ', 'Things are looking up for you! '],
]

zodiacs = {'Aries':'♈️', 'Taurus':'♉️', 'Gemini':'♊️', 'Cancer':'♋️', 'Leo':'♌️', 'Virgo':'♍️', 'Libra':'♎️', 'Scorpio':'♏️', 'Sagittarius':'♐️', 'Capricorn':'♑️', 'Aquarius':'♒️', 'Pisces':'♓️'}


def generate_opening_sentence(star_sign):
    '''
    This function writes the initial line of the
    horoscope by randomly choosing two adjectives,
    a good trait and a bad trait. It also inserts
    the relevant zodiac symbol.
    '''
    symbol = zodiacs[star_sign]
    positive_trait = choice(positive_adjectives)
    negative_trait = choice(negative_adjectives)
    horoscope = f'{symbol} As a {star_sign}, you are naturally {positive_trait}, but you also tend to be {negative_trait}. '
    return horoscope

def generate_prophecy(star_sign):
    '''
    This function takes the name of a star sign
    and generates a random prophecy by combining
    some random sentences together.
    '''
    horoscope = ''
    for sentences in prophecies:
        horoscope += choice(sentences)
    return horoscope

def generate_horoscope(star_sign):
    '''
    This function takes the name of a star sign
    and generates a horoscope. It first creates
    the opening sentence and then the prophecy.
    '''
    horoscope = generate_opening_sentence(star_sign)
    horoscope += generate_prophecy(star_sign)
    return horoscope


for sign in zodiacs.keys():
    print(generate_horoscope(sign))

Once you understand the code, here are some activities you can try:

1. Can you add in some more prophecies and adjectives? [EASY]

2. Aries and Aquarius begin with a vowel sound, so technically they should be preceded by "an" not "a". Can you add some code to the `generate_opening_sentence()` function to insert the correct article? [MEDIUM]

3. Make a new function that takes a person's date of birth and returns the appropriate horoscope. [HARD]

## Classes and Objects

As I mentioned earlier, a collection of related functions can be wrapped up into a class. Classes tend to be a little abstract and confusing for beginners, so I don't want to say too much about them right now. It suffices to say that classes define "objects" that can be used again and again, rather like how functions define procedures that can be used again and again. Personally, I like using classes to define my own objects, but some people hate this style of programming.

Python is technically described as an **object-oriented programming (OOP) language**, but unlike some other OOP languages (e.g. Java), Python does not strictly enforce the object-oriented style of programming, and you can actually do quite a lot of stuff without ever defining your own classes/objects. For now, all this matters mainly because Python has many built-in objects – indeed, we've been using them already... strings, ints, lists, etc. are all objects – and these built-in objects have useful **methods** associated with them. We've used one of these methods already: the `.append()` method from the `list` object. Recall from the last notebook that this allowed us to append an item to the end of a list:

In [None]:
colors = ['blue', 'red', 'green', 'yellow', 'black']
print(colors)
colors.append('orange')
print(colors)

Aside from `.append()`, the list object has several other handy methods, including:

1. `.count()` – counts how many times a value occurs in the list

2. `.extend()` – combines the list with a new list

3. `.insert()` – inserts a value into the list at a particular position

4. `.pop()` – removes the last item from the list and returns it

5. `.index()` – gets the index of the first occurrence of a value

Plus, `list` objects also have their own `.reverse()` and `.sort()` methods, which perform these operations in place (i.e. calling these methods actually modifies the list in memory). Try experimenting with these methods in the cell above to make sure you understand what they all do.
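If you'd like a starting point for your experiments, here is a short sketch that exercises each of these methods on a small example list (the list contents are arbitrary). It also contrasts the built-in `sorted()` function, which returns a new list, with the `.sort()` method, which works in place.

```python
colors = ['blue', 'red', 'green', 'red']

red_count = colors.count('red')      # 'red' occurs twice
colors.extend(['pink', 'gray'])      # add two more items to the end
colors.insert(1, 'white')            # put 'white' at index 1
last_color = colors.pop()            # remove and return the last item ('gray')
red_index = colors.index('red')      # index of the first 'red'

print(red_count, last_color, red_index)
print(colors)

# sorted() returns a new list and leaves the original alone...
alphabetical = sorted(colors)
print(alphabetical)
print(colors)

# ...whereas .sort() and .reverse() modify the list itself and return None
colors.sort()
colors.reverse()
print(colors)
```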

Likewise, strings also have several useful methods:

In [None]:
my_string = 'hello world'

print(my_string)
print(my_string.capitalize())
print(my_string.upper())
print(my_string.split())
print(my_string.replace('world', 'universe'))

If you type `my_string.` and then hit the TAB key, you should see an autocomplete popup that shows all the available string methods. Try exploring some of these methods to see what they do.

The key thing to know is that the basic data types in Python (strings, lists, ints, etc.) are all objects that have methods associated with them. These methods are basically like functions, but they are specific to a given object. For example, the string object has an `upper()` method which converts the string to uppercase, but, obviously, integers and floats do not have this method because that wouldn't make any sense!
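One way to check whether an object has a particular method is the built-in `hasattr()` function, which we haven't covered above. The sketch below asks whether a string, an integer, and a float each have an `upper` method.

```python
string_has_upper = hasattr('hello', 'upper')
int_has_upper = hasattr(42, 'upper')
float_has_upper = hasattr(3.14, 'upper')

print(string_has_upper)   # True
print(int_has_upper)      # False
print(float_has_upper)    # False
```

If you tried to call `.upper()` on an integer anyway, Python would stop with an `AttributeError`, which is its way of saying that the object has no such method.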

As you get more experienced with Python, you can start to define your own custom objects and methods. For example, a physicist might find it useful to create an `Atom` object to represent atoms in a simulation; or a neuroscientist might find it useful to create a `Neuron` object to study how neurons interact. If you're interested in learning more, stick around for the bonus class today.
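To give you a taste of what this looks like, here is a minimal sketch of a `Neuron` class. Everything about it is hypothetical and heavily simplified (the `threshold` and `potential` attributes and the `receive()` and `has_fired()` methods are names I made up for illustration), so don't worry if the syntax looks unfamiliar.

```python
class Neuron:
    '''
    A hypothetical, heavily simplified neuron that fires once
    its accumulated input reaches a threshold.
    '''

    def __init__(self, threshold):
        # This special method runs when a new Neuron object is created
        self.threshold = threshold
        self.potential = 0

    def receive(self, signal):
        # Add an incoming signal to the neuron's stored potential
        self.potential += signal

    def has_fired(self):
        # The neuron has fired if its potential has reached the threshold
        return self.potential >= self.threshold


neuron = Neuron(threshold=10)

neuron.receive(4)
fired_after_first_signal = neuron.has_fired()
print(fired_after_first_signal)    # False, because 4 is below the threshold

neuron.receive(7)
fired_after_second_signal = neuron.has_fired()
print(fired_after_second_signal)   # True, because 4 + 7 = 11
```

Notice that the methods are just functions defined inside the class, and that they are called using the same dot notation you used with `colors.append()`.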

## More Activities [optional]

If you get through all the material above, and you're still hungry for more, here are some additional activities you can try:

1. Write a `sum()` function that sums a list of numbers (you cannot use the built-in `sum()` function – you have to write your own). [EASY]

2. Now write a `sum()` function that doesn't use a for-loop. [MEDIUM]

3. Now write a `sum()` function that doesn't use any loops. [HARD]

## Even More Activities [even more optional]

Try to implement a `sort()` function from scratch. The function should take in a list of unordered numbers, and it should return the list in sorted order. See how far you can get on your own and then take a look at [this page](https://www.tutorialspoint.com/python_data_structure/python_sorting_algorithms.htm), which shows you various solutions using different sorting algorithms.