# Scientific Programming: A Crash Course

## Class 1 – Fundamentals

The goal of this first class is to get everyone familiar with the core building blocks of programming: numbers, strings, operators, conditional statements, and looping structures. You may have learned some of these concepts before; even so, try to work through the notebook step-by-step – there are probably at least a few things that you didn't already know.

By the end of this first class, you should be able to code some simple algorithms! However, you will also need to learn some meta-skills along the way. For example, it may take you some time to get Python set up correctly on your computer, and you will need to get familiar with Jupyter notebooks. It's also worth learning some of the keyboard shortcuts to make life easier.

Before running some code, **you should always try to predict the outcome first**. In other words, don't just run the code and look at the result; instead, read the code, try to predict what will happen, and *then* run the code. This will give you a much better sense of what stuff you understand and what stuff you don't understand. You should also try changing the code to see what happens when you run it again. Does it do what you expected?

If you have any problems or questions, please feel free to get my attention and I'll come help. But also: Talk to the people around you because they are probably having the same problems!

If you don't manage to finish this notebook by the end of the class, don't worry: there are a lot of new concepts to learn and not very much time. However, do try to continue working through the material at home.

## Math and Variables
At the most basic level, a programming language is like a fancy calculator:

In [None]:
1+1

In [None]:
2*2

In [None]:
1+2*2

Like in mathematics, **operators** (e.g., `+` `-` `*` `/`) are always performed in a particular order. For example, multiplication happens before addition. Use **parentheses** if you need to override the basic **order of operations**:

In [None]:
(1+2)*2

The result of a calculation can be stored in a **variable**:

In [None]:
x = (1 + 2) * 2

Note that, when you run the above cell, it doesn't produce any output. Instead, the result of the calculation **was assigned to** the variable `x`. It's important to understand that the equals sign (`=`) is not trying to imply that `x` has the same value as `(1 + 2) * 2`; rather, the calculation on the right is performed and then the result is assigned to (stored in) a variable called `x` on the left. We can check that it worked correctly by **printing** the value of `x`:

In [None]:
print(x)

You can name variables whatever you like, and it is a good idea to use **descriptive variable names** to make your code more **readable**:

In [None]:
seconds_in_minute = 60
seconds_in_hour = seconds_in_minute * 60
seconds_in_day = seconds_in_hour * 24
seconds_in_week = seconds_in_day * 7
print(seconds_in_week)

Although it may seem annoying to type long descriptive variable names, you will thank yourself in the long run because you will be able to understand your code more easily when you look back at it. Also, don't forget that you can use autocomplete by pressing the TAB key; not only is this faster, but it also helps you avoid typos.

In Python, like other languages, there are some restrictions on variable names. A variable name cannot start with a digit (e.g. `1variable`) and a variable name cannot contain spaces or punctuation (except for the underscore `_`). There are also certain keywords that have a special meaning in Python, so they cannot be used as variable names (e.g. `if`, `for`, `while`, `in`). Variable names can include most alphabetical characters, including non-Latin characters. For example, sometimes I like to use Greek letters as variable names if I'm doing something mathy:

In [None]:
α = 10
β = 100
print(α * β)
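If you are ever unsure whether a name is allowed, you can ask Python directly. The sketch below is only an illustration: it uses strings and the standard `keyword` module, which we haven't covered yet, and the candidate names are made up for the example.

```python
import keyword

# str.isidentifier() checks the spelling rules (no leading digit, no spaces, ...)
print('seconds_in_week'.isidentifier())  # True
print('1variable'.isidentifier())        # False: starts with a digit
print('my variable'.isidentifier())      # False: contains a space
print('α'.isidentifier())                # True: non-Latin letters are fine

# A keyword passes the spelling check but is still reserved by Python
print('while'.isidentifier())            # True
print(keyword.iskeyword('while'))        # True, so it cannot be a variable name
```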

## Data Types

So far, we've only used one particular data type, the **integer** (``int``). Integers are whole numbers, like ``1``, ``117``, or ``-99``. However, there are many other data types and it is important to understand how they work and when they should be used. In Python, the most important data types are **strings** (``str``), **floating-point numbers** (``float``), **Booleans** (``bool``), **lists** (``list``), and **dictionaries** (``dict``). In other languages, these data types sometimes have different names. For example, lists are sometimes called "arrays" or "vectors" and dictionaries are sometimes called "objects" or "named arrays".

### Floats

Floating-point numbers (floats) are numbers that can have a fractional part after the decimal point, for example `100.5`, `3.14159`, or `-0.0001`. In some programming languages, it is important to choose the right type of number (i.e. integers or floats) depending on what you want to do. In Python, it often doesn't matter – the choice between integers (e.g. `1`) and floats (e.g. `1.0`) is usually just an implementation detail that you don't need to worry about. For example, if I calculate `1` (an integer) divided by `3` (an integer), the result is automatically **cast as** (converted to) a float:

In [None]:
1 / 3

As you can see, we get the same result if we use floating point numbers instead:

In [None]:
1.0 / 3.0
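If you want to check which type a result has, the built-in `type()` function will tell you. Here is a small sketch (the variable names are just for illustration) showing that division with `/` gives a float even when the answer is a whole number, and that mixing an int with a float gives a float:

```python
one_third = 1 / 3
two = 4 / 2

print(type(one_third))  # <class 'float'>
print(two)              # 2.0
print(type(two))        # <class 'float'>

print(type(1 + 2))      # <class 'int'>
print(type(1 + 2.0))    # <class 'float'>
```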

However, if you do want to explicitly cast ints as floats or floats as ints, you can use the `int()` and `float()` functions:

In [None]:
float(4)

In [None]:
int(3.14159)

Notice above that if you cast a float as an int, you simply lose the decimal part of the number. The conversion does not round to the nearest whole number; instead, it truncates toward zero, so positive numbers are rounded down (and negative numbers are rounded up, e.g. `int(-3.999)` gives `-3`). For example:

In [None]:
int(3.999)
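To see how `int()` differs from actual rounding, here is a small sketch comparing it with the built-in `round()` function and with `math.floor()` from the standard library (neither is needed for this class; they are shown only for comparison):

```python
import math

positive = 3.999
negative = -3.999

print(int(positive))         # 3: the decimal part is dropped
print(int(negative))         # -3: also dropped, so the value moves toward zero
print(round(positive))       # 4: rounds to the nearest whole number
print(math.floor(negative))  # -4: always rounds down
```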

### Strings

But programming is not just a fancy calculator. We can do *much* more exciting stuff than just playing around with numbers! We can also work with pieces of text, which are called strings (i.e. a string of characters). To create a string, you place the text inside single or double quotation marks (`'` or `"`). In Python there is no distinction between single and double quotation marks, but, of course, you must start and end the string with the same type. For example:

In [None]:
my_name = 'Jon'
print(my_name)
my_surname = "Carr"
print(my_surname)

Unlike variable names, there are no restrictions on what characters can appear in a string. Any Unicode character, even emoji, is totally fine:

In [None]:
question = 'What did the 🦊 say?'
print(question)

As with numbers, there are various operations we can perform with strings. For example, we can add two strings together (i.e. **concatenate** the strings):

In [None]:
print(my_name + my_surname)

You can also use the multiplication operator (`*`) to repeat a string a certain number of times:

In [None]:
print(my_name * 10)

You need to be careful to use the right data type depending on what you want to do. For example, what do you think will happen here?

In [None]:
x = '10'
y = 20
print(x * y)

If you actually wanted to calculate 10 × 20, you would first need to explicitly cast `x` as the appropriate type (an integer):

In [None]:
print(int(x) * y)

Sometimes mixing certain data types will result in an error. For example, look at what happens if we try to do a string + an integer:

In [None]:
x = 'Catch-'
y = 22
print(x + y)

We get a `TypeError`. This is a very common error and you will see it very frequently. If you see a `TypeError`, Python is telling you that it doesn't know how to perform the operation because the data type or combination of data types is unexpected. In this case, Python does not know how to add a string to an integer; this operation is not defined in the language. Instead, you probably want to explicitly cast the integer as a string:

In [None]:
x = 'Catch-'
y = 22
print(x + str(y))

A very common thing we want to do is combine strings together to construct new strings. One way to do this is to use a special kind of string called an f-string (or formatted string). They are called f-strings because to create them you place the letter `f` before the opening quotation mark (see below). When you use an f-string, you can use braces (`{` and `}`) to insert other variables inside the string:

In [None]:
my_name = 'Jon'
message = f'Ciao, {my_name}! Come va?'
print(message)
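The braces in an f-string can hold more than a plain variable name: they can contain a whole expression, and you can add a format specification after a colon. The following is a minimal sketch with made-up values; `:.2f` asks for a float to be shown with two decimal places:

```python
# Hypothetical values, chosen only for illustration
item = 'espresso'
price = 1.2
quantity = 3

# The expression price * quantity is evaluated and then formatted to 2 decimals
message = f'{quantity} x {item} costs {price * quantity:.2f} euros'
print(message)  # 3 x espresso costs 3.60 euros
```

Notice that you don't need to cast the numbers with `str()` here; the f-string does the conversion for you.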

### Booleans

Despite the weird name, Booleans are the simplest type of data. A Boolean is a simple yes-no, on-off, 1-0, true-false variable. The main reason to use a Boolean is if you need to represent whether something is true or false. In Python, these two possible values are represented by `True` and `False` (note the capital letter at the start). For example, we might need a variable to represent whether a student answered a question correctly in a test:

In [None]:
expected_answer = 'A'
given_answer = 'B'
correct = expected_answer == given_answer
print(correct)

Here, `correct` is a Boolean variable; it is either `True` or `False`. Also, note the difference here between a single `=`, which means assignment, and double `==`, which checks equality. In this case, the `==` operator checks whether `expected_answer` and `given_answer` are the same; if they match, the expression `expected_answer == given_answer` will evaluate as `True`; if they mismatch, the expression will evaluate as `False`. Either way, the result will be assigned to the variable `correct`.
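One thing to watch out for is that `==` is strict when comparing strings: upper and lower case letters are not considered the same. Here is a small sketch with some made-up answers:

```python
expected_answer = 'A'

# Hypothetical answers from three students
first_answer = 'B'
second_answer = 'a'
third_answer = 'A'

first_correct = expected_answer == first_answer
second_correct = expected_answer == second_answer
third_correct = expected_answer == third_answer

print(first_correct)        # False
print(second_correct)       # False: 'a' and 'A' are different strings
print(third_correct)        # True
print(type(third_correct))  # <class 'bool'>
```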

### Lists

When working with data, we often don't have a single number or a single string or a single Boolean; we often have many numbers, strings, or Booleans. Rather than give each one its own variable name, it is often very useful to store such data in a list. For example, you might want to store a list of people's names, or a list of dates, or a list of scores. In Python, lists are created using square brackets (`[` and `]`) and the comma (`,`) is used to separate the items of the list. Here, for example, we will create a list of colors. The variable `colors` is a list, and that list contains five strings:

In [None]:
colors = ['blue', 'red', 'green', 'yellow', 'black']
print(colors)

If you need to find out how many items are in a list, you can use the `len()` function (which is short for length):

In [None]:
print(len(colors))

You can also extract a particular item from the list by specifying its **index**, again using brackets:

In [None]:
print( colors[4] )

Here we are extracting the 4th item from the list of colors. Do you notice anything weird here? Try experimenting with the index. How would you access the first item in the list (i.e. `blue`)?

In Python, like many other programming languages, counting starts from 0; counting does not start from 1. Thus, if you want to index the *n*th item from a list, you need to use the index *n*-1. This can often seem very annoying and inconvenient, but eventually you will get used to it. (To make things even more confusing, other languages, like R, start counting from 1).

A more advanced type of indexing is called **slicing**. Slices allow you to extract a range of items from a list and create a new list with just these extracted items:

In [None]:
print( colors[1:3] )

Do you notice anything weird here? Is this what you expected to happen?

In Python, like many (but not all) programming languages, a range of numbers is specified in the inclusive:exclusive format. This means that the number before the colon (`:`) is included in the range and the number after the colon is *not* included in the range. Thus, `colors[1:3]` means: Extract item 1 (i.e. the second item – remember, counting is from zero) up-to-but-not-including item 3. Double check that you understand this and try playing around with the slice indices. The two key things to remember are: counting is from zero and ranges are inclusive:exclusive.
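Here are a few more slices to check your understanding. This sketch also shows that you can leave out the number before or after the colon, in which case the slice starts at the beginning or runs to the end of the list:

```python
colors = ['blue', 'red', 'green', 'yellow', 'black']

print(colors[1:3])  # ['red', 'green']: items 1 and 2, but not item 3
print(colors[0:2])  # ['blue', 'red']
print(colors[:2])   # ['blue', 'red']: same thing, start omitted
print(colors[3:])   # ['yellow', 'black']: end omitted

# A handy consequence of inclusive:exclusive ranges:
# the length of a slice is the end index minus the start index
print(len(colors[1:4]))  # 3
```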

So far, we've just seen a list of strings. But lists can contain any kind of data, or even a mix of different data types. For example:

In [None]:
birth_years = [1963, 1964, 1987, 1989, 1991]
fractions_of_one = [0.0, 0.25, 0.5, 0.75, 1.0]
street_names = ['via Carducci', 'via Bonomea', 'viale XX Settembre']
mixed_up_list = [1, 'Ciao!', 3.14159, True]

In fact, a list can even contain other lists:

In [None]:
family_birth_years = [  ['father', 1963], ['mother', 1964]  ]
print( family_birth_years[0] )
print( family_birth_years[0][1] )

The concept of embedding lists inside lists is super important, and allows you to create complex hierarchical data structures. Make sure you understand how to index items from a list and how to index items from a list within a list. Try experimenting further to create lists within lists within lists.
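As a starting point for experimenting, here is a sketch with one more level of nesting: a list of families, where each family is a list of `[role, birth_year]` pairs. The data are invented for the example.

```python
# Hypothetical data: two families, each with two members
families = [
    [['father', 1963], ['mother', 1964]],
    [['father', 1958], ['mother', 1961]],
]

print(families[1])        # [['father', 1958], ['mother', 1961]]
print(families[1][0])     # ['father', 1958]
print(families[1][0][1])  # 1958
```

Each extra pair of brackets steps one level deeper into the structure.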

Note that list indices must be integers – they cannot be floats:

In [None]:
colors = ['blue', 'red', 'green', 'yellow', 'black']
print( colors[2] )

In [None]:
print( colors[2.0] )

Notice here we get another `TypeError`. This is because the index is the wrong type – an index must be of the type `int`, not `float`.

As you might expect, you can also index using an integer that is stored in a variable:

In [None]:
an_index = 2
print( colors[an_index] )

### Dictionaries

Dictionaries are similar to lists, except the indices are (usually) strings rather than numeric positions. More generally, a dictionary is a collection of **key-value pairs**. Each item in the dictionary has a key and a value. This can make it easier to find an item than in a list because you can use its key to access the corresponding value. For example, you could create a dictionary of your family members, and then access the values by specifying the relevant key:

In [None]:
my_family = {'father':'Davide', 'mother':'Maria', 'sister':'Valentina'}
sister_name = my_family["sister"]
message = f'My sister is called {sister_name}!'
print(message)

Dictionaries (and lists too) are **mutable**. This means that the items in the list or dictionary can be deleted or modified. For example, if my father changed his name, I could update the dictionary:

In [None]:
my_family['father'] = 'Marco'
print(my_family)

Or I could add a new family member:

In [None]:
my_family['brother'] = 'Riccardo'
print(my_family)

What happens if you have multiple sisters or multiple brothers? How would you modify the dictionary structure to store multiple siblings? (hint: a dictionary can contain lists).

## Control Structure

So far we have simply created bits of data and assigned them to variables, but the true power of programming lies in writing code to manipulate these pieces of data in an automated and consistent way.

In programming, **control structure** refers to the order and logic of how the lines of code are executed – the way the program "flows". Think of programming like cooking from a recipe. Variables (e.g. strings and numbers) are like ingredients, while the control structure is like the recipe's procedure.

The most obvious type of control structure, which we've already been using, is the **"sequential flow"**: it might seem very obvious but the lines of code are executed one by one from top to bottom. The lines of code are not, for example, executed randomly or in parallel – that would be chaos!

However, in order to implement more complex kinds of logic, we need to dynamically modify the sequential flow of the program. Sometimes we need to repeat a line of code multiple times (the **"repetitive flow"** of the program) and sometimes we need to run a line of code only if a certain condition is met (the **"conditional flow"** of the program). These two types of control structure are absolutely fundamental to computer science. 95% of programming is figuring out the right control structure to accomplish a particular task. In other words, programming is all about finding the right recipe to combine the ingredients together to produce a particular outcome.

### Conditional flow: `if` ... `else`

In Python, the main type of conditional control structure is the if-statement. The language has three **keywords** that you will use quite regularly: `if`, `elif` (else if), and `else`. `if` and `elif` are always followed by a condition that Python needs to evaluate; if the condition evaluates as `True`, the **block of code** that is indented inside the if-statement will be executed; otherwise the code will be ignored. An `elif` statement is only evaluated if the previous `if`s and `elif`s were `False`. The code inside the `else` statement is only executed if all the `if`s and `elif`s were `False`. Overall, you should find conditional control structure to be very intuitive because it's very similar to natural language.

In [None]:
birth_year = 1987

if birth_year >= 1901 and birth_year <= 1927:
    print('You belong to the greatest generation')
elif birth_year >= 1928 and birth_year <= 1945:
    print('You belong to the silent generation')
elif birth_year >= 1946 and birth_year <= 1964:
    print('You are a baby boomer')
elif birth_year >= 1965 and birth_year <= 1980:
    print('You are gen-x')
elif birth_year >= 1981 and birth_year <= 1996:
    print('You are a millennial')
elif birth_year >= 1997 and birth_year <= 2012:
    print('You are gen-z')
else:
    print('You are gen alpha')

What generation are you? Can you see any problems here? What happens if the birth year is 1900? What condition do you need to add to fix this bug?

The main mathematical comparison operators are:

- `==` equal to
- `!=` not equal to
- `<` less than
- `>` greater than
- `>=` greater than or equal to
- `<=` less than or equal to

As you can see in the code above, these can be combined with logical operators, like `and` and `or`, to create more complex conditions.
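Here is a small sketch of how these pieces combine. It also shows two things not used above: the `not` operator, which flips `True` and `False`, and Python's chained comparisons, which let you write a range check in a way that looks like mathematical notation.

```python
birth_year = 1987

# Two conditions joined with "and": both must be True
is_millennial = birth_year >= 1981 and birth_year <= 1996

# The same check written as a chained comparison
is_millennial_chained = 1981 <= birth_year <= 1996

# Two conditions joined with "or": at least one must be True
is_not_millennial = birth_year < 1981 or birth_year > 1996

print(is_millennial)          # True
print(is_millennial_chained)  # True
print(is_not_millennial)      # False
print(not is_millennial)      # False
```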

Another useful operator is `in`, which checks whether a value is present in a list, dictionary (where it checks the keys), or string:

In [None]:
colors = ['blue', 'red', 'green', 'yellow', 'black']

my_favorite_color = 'pink'

if my_favorite_color in colors:
    print('Good, this is a valid color!')
else:
    print('Your favorite color is invalid!')
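The `in` operator behaves slightly differently depending on the data type. The sketch below reuses some of the variables from earlier in the notebook; note in particular that, for a dictionary, `in` looks at the keys and not the values.

```python
colors = ['blue', 'red', 'green', 'yellow', 'black']
my_family = {'father': 'Davide', 'mother': 'Maria', 'sister': 'Valentina'}
question = 'What did the fox say?'

print('green' in colors)      # True: 'green' is an item in the list
print('sister' in my_family)  # True: 'sister' is a key
print('Maria' in my_family)   # False: 'Maria' is a value, not a key
print('fox' in question)      # True: 'fox' appears inside the string
print('pink' not in colors)   # True: "not in" is the opposite check
```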

If-statements can contain more than one line of code, and also if-statements can be embedded inside other if-statements. To demonstrate, let's play a game! Think of a color and type in the number of letters it contains and its first letter, then run the code.

In [None]:
# Think of a color...
# How many letters does it have?
n_letters = 5
# And what is the first letter?
first_letter = 'g'
# Now run the code...



guess = "I don't know!"

if n_letters == 3:
    if first_letter == 'r':
        guess = 'red'

elif n_letters == 4:
    if first_letter == 'b':
        guess = 'blue'
    elif first_letter == 'p':
        guess = 'pink'
    elif first_letter == 'g':
        guess = 'gray'

elif n_letters == 5:
    if first_letter == 'b':
        guess = 'black'
    elif first_letter == 'g':
        guess = 'green'
    elif first_letter == 'w':
        guess = 'white'

elif n_letters == 6:
    if first_letter == 'y':
        guess = 'yellow'
    elif first_letter == 'o':
        guess = 'orange'
    elif first_letter == 'p':
        guess = 'purple'
    
print(f'Your secret color is... {guess}')

Of course, if you choose an obscure color like emerald it won't work. What happens instead? Now try reimplementing the above code with Italian color names (note: it's not as simple as just changing the color words to Italian).

### Repetitive flow: `for` and `while`

The second type of control structure is the repetitive flow, also known as **iteration** or **looping**. Iteration allows you to repeat a block of code many times, which is extremely useful because it means you don't need to write the same bit of code over and over again. In Python, like many other languages, there are two types of loop: the **for-loop** and the **while-loop**.

#### For-loops

For-loops are extremely common – perhaps even more common than if-statements – and they are very useful. In other languages, for-loops can be a bit difficult to read; but, in Python, for-loops are pretty readable and intuitive. The basic syntax of a for-loop looks like this:

```python
for x in y:
    # do something with x
```

where `x` is called the **iterator variable** and `y` is called the **iterable**. The iterator variable can be named anything you like, but like other variables, it's a good idea to pick a name that is descriptive. The iterable will be some kind of "container" object – a data type that has a concept of many values, such as a list, dictionary, or string.

Iterable objects

- lists (each item in the list)
- dictionaries (each key in the dictionary)
- strings (each character in the string)

Non-iterable objects

- integers
- floats
- Booleans
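To make the two lists above concrete, here is a short sketch that loops over a string and over a dictionary. The last part is commented out because looping over an integer raises a `TypeError`; try removing the `#` characters to see it for yourself.

```python
# Looping over a string gives one character at a time
for character in 'blue':
    print(character)

# Looping over a dictionary gives one key at a time;
# the key can then be used to look up the corresponding value
my_family = {'father': 'Davide', 'mother': 'Maria', 'sister': 'Valentina'}
for relation in my_family:
    print(relation, my_family[relation])

# An integer is not iterable, so this would raise
# TypeError: 'int' object is not iterable
# for digit in 12345:
#     print(digit)
```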

Here's a simple for-loop:

In [None]:
list_of_colors = ['blue', 'red', 'green', 'yellow', 'black']

for each_color in list_of_colors:
    print(f'{each_color} is a color')

The key thing to understand is that the block of code indented inside the for-loop (in this case just a single print statement) is run once for every element of the iterable, and the iterator variable is used to temporarily represent each item in the iterable. Compare the following two pieces of code:

In [None]:
list_of_colors = ['blue', 'red', 'green', 'yellow', 'black']

for each_color in list_of_colors:
    print(f'{each_color} is a color')
    print('--------')

In [None]:
list_of_colors = ['blue', 'red', 'green', 'yellow', 'black']

for each_color in list_of_colors:
    print(f'{each_color} is a color')
print('--------')

Do you see the difference? The code that is indented inside the for-loop will be run for every item in the list. The code that is outside the for-loop only runs once. In Python, it is very important to make sure the indentation is correct because, as you see above, indentation can totally change the meaning of the code. (In some other languages, indentation doesn't matter and is only used to make the code easier to read for humans).

Let's try doing something a bit more complex:

In [None]:
list_of_colors = ['blue', 'red', 'green', 'yellow', 'black']

for each_color in list_of_colors:
    n_letters = len(each_color)
    message = f'The word {each_color} has {n_letters} letters'
    print(message)

Remember that you are free to choose the variable names. We could also rewrite the code as follows, but note that this is much harder to read and understand:

In [None]:
C = ['blue', 'red', 'green', 'yellow', 'black']

for c in C:
    n = len(c)
    m = f'The word {c} has {n} letters'
    print(m)

Another very common thing we want to do is iterate over the counting numbers. To do this we use the `range` function:

In [None]:
for each_number in range(1, 11):
    print(each_number * 2)

Essentially, the `range` function generates a sequence of numbers from x up-to-but-not-including y (remember that, in Python, ranges are inclusive:exclusive). Thus, `range(1, 11)` generates the numbers 1, 2, 3, ..., 10 (strictly speaking, it produces a `range` object rather than a list, but a for-loop treats the two in the same way), so the above code is equivalent to this:

In [None]:
for each_number in [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]:
    print(each_number * 2)
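If you want to see exactly which numbers a `range` contains, you can convert it to a list with `list()`. The sketch below also shows two variations: giving a single number (counting then starts from 0) and giving a third number (the step size).

```python
print(list(range(1, 11)))     # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(list(range(5)))         # [0, 1, 2, 3, 4]: starts from 0 by default
print(list(range(0, 20, 5)))  # [0, 5, 10, 15]: counts in steps of 5

# range(len(...)) gives every valid index of a list
colors = ['blue', 'red', 'green', 'yellow', 'black']
for index in range(len(colors)):
    print(index, colors[index])
```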

Here is a new version of the color guessing game, but this time using a for-loop and a single if-statement:

In [None]:
# Think of a color...
# How many letters does it have?
n_letters = 5
# And what is the first letter?
first_letter = 'w'
# Now run the code...


colors = ['red', 'blue', 'pink', 'gray', 'black', 'green', 'white', 'yellow', 'orange', 'purple']

guess = "I don't know!"

for color in colors:
    if len(color) == n_letters and color[0] == first_letter:
        guess = color

print(f'Your secret color is... {guess}')

First, notice how this code is MUCH shorter than our previous implementation of the color game above. Wow! So much simpler, right!? Notice how we can write the program in a much neater and more compact way by using the for-loop instead of loads of nested `if`s and `elif`s. Make sure you understand exactly how this code works, and then try rewriting the code with Italian color names. What happens if you include both *giallo* and *grigio* in the list? Can you think of some ways to fix this problem?

#### While-loops

The second type of loop is the while-loop. While-loops are relatively rare, so, if you are unsure about which loop you need, you probably want a for-loop. While-loops are typically useful if you don't have an iterable to iterate over; instead, you want to keep repeating a block of code until a particular condition is eventually met (in some ways, then, a while-loop is similar to an if-statement). One problem with this is that, if you don't think carefully about the logic of the code, it's easy to get stuck in an infinite loop because maybe the condition will never be met, so the loop will keep repeating infinitely.
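Before looking at a more realistic problem, here is a minimal sketch of a while-loop: it counts how many times a number can be halved before it drops below 1. The `max_iterations` variable is a hypothetical safety cap added for illustration; it guarantees the loop stops even if there is a mistake in the logic.

```python
value = 100.0
n_halvings = 0
max_iterations = 1000  # hypothetical safety cap to rule out an infinite loop

# Keep looping while BOTH conditions are True
while value >= 1 and n_halvings < max_iterations:
    value = value / 2
    n_halvings = n_halvings + 1

print(n_halvings)  # 7
print(value)       # 0.78125
```

Try removing the line that halves `value` and think about what would happen without the safety cap.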

Let's say we want to find the first 100 numbers that are divisible by both three and seven. For example, maybe you're trying to figure out how many stimuli you need and, for some reason, the number of stimuli needs to be divisible by three and seven. Before writing the code, we don't have anything to iterate over – we don't know how many numbers we will need to search to find 100 of these special numbers, so a while-loop seems to be appropriate.

We start by checking if 1 is divisible by three and seven; then we check 2; then 3; and so on... Every time we find a number that is divisible by both three and seven, we **append** that number to a list. On each iteration of the while-loop, we check to see if the list contains fewer than 100 items; if it does, we keep searching. Once the list contains 100 numbers, the condition of the while-loop is no longer met, so the while-loop exits.

In [None]:
numbers_divisible_by_3_and_7 = []
number_to_check = 1

while len(numbers_divisible_by_3_and_7) < 100:
    if number_to_check % 3 == 0 and number_to_check % 7 == 0:
        numbers_divisible_by_3_and_7.append(number_to_check)
    number_to_check += 1

print(numbers_divisible_by_3_and_7)

Aside from the while-loop, there are three new concepts introduced in this code.

1. `%` This is the modulo operator – the remainder after division. For example 10 % 3 is 1 because 3 fits into 10 three times with 1 left over. Here we are checking to see if the remainder is equal to 0 which implies that the numbers can be divided perfectly.

2. `.append()` We will see more of this kind of code in the next class, but for now this allows you to add (append) a number on to the end of a list. This allows us to gradually build up a list of numbers that meet our criteria.

3. `+=` This is an augmented assignment operator, used here to increment a variable. It takes the current value of a variable and increases it by a certain amount. In this case, on every iteration of the while-loop, we increment `number_to_check` by one so that we can evaluate the next number.
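It can help to try each of these three pieces in isolation before putting them together. The following is a small sketch; the variable names are made up for the example.

```python
# 1. The modulo operator gives the remainder after division
print(10 % 3)  # 1
print(21 % 7)  # 0: 21 is perfectly divisible by 7

# 2. .append() adds an item to the end of a list
even_numbers = []
even_numbers.append(2)
even_numbers.append(4)
print(even_numbers)  # [2, 4]

# 3. += increases a variable by the given amount
counter = 0
counter += 1
counter += 5
print(counter)  # 6
```

With these three pieces in mind, go back to the while-loop that searches for numbers divisible by both three and seven.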

Make sure you understand exactly how this code works. If you find it hard to understand, try working through the process step-by-step on a piece of paper. Make sure you can answer the following questions:

1. In the first line, why is there a set of empty brackets?

2. How many times does the while-loop run?

3. Why is `number_to_check += 1` not inside the if-statement? (warning: putting this line inside the if-statement will result in an infinite loop! Why?)

4. What happens if you change `and` to `or` in the if-statement? How does this change the resulting numbers?

5. Can you modify the code to find numbers that are also divisible by five?