# Language fundamentals

This chapter starts with a short tutorial to familiarize you with the Python language. You will quickly see the similarities with the programming language you learned in your undergraduate program. Remember, our goal here is to formalize and name the programming constructs (**semantics**). Using clear semantics is crucial for understanding software documentation and for "asking questions the right way" in search engines.

## An entry level tutorial

Let's start by following a simple tutorial together. 

:::{tip}
You can simply *read* through the examples and try to remember them. This might work out for those of you with programming experience. For the majority of you, we highly recommend opening an **ipython** interpreter (or a **jupyter notebook**) to test the commands yourself as the tutorial goes on. You can open the interpreter on MyBinder, your laptop, or through JupyterHub in OLAT.
:::

**Copyright notice**: many of these examples and explanations are copy-pasted from the [official Python tutorial](https://docs.python.org/3/tutorial/).

###  Python as a Calculator

The interpreter acts as a simple calculator: you can type an expression and it will write the value. Expression syntax is straightforward: the operators ``+``, ``-``, ``*`` and ``/`` work just like in most other languages:

In [1]:
2 + 2

4

In [2]:
50 - 5*6

20

In [3]:
8 / 5  # division always returns a floating point number

1.6

Comments in Python start with the hash character, `#`, and extend to the end of the physical line. A comment may appear at the start of a line or following whitespace or code:

In [4]:
# this is the first comment
spam = 1  # and this is the second comment
          # ... and now a third!

Parentheses `()` can be used for grouping:

In [5]:
(50 - 5 * 6) / 4

5.0

With Python, the `**` operator is used to calculate powers:

In [6]:
5 ** 2

25

The equal sign (`=`) is used to assign a value to a variable (**variable assignment**). Afterwards, no result is displayed until the next interactive prompt:

In [7]:
width = 20
height = 5 * 9
width * height

900

:::{tip}
I remember my first programming class very well: the professor wrote ``i = i + 1`` on the blackboard, and I was horrified: how can one write something so obviously wrong?

Many programming instructors recommend against reading out variable assignments as "name equals value" (i.e. from the example above: "i equals i + 1"), because it wrongly associates the `=` operator with "equals" in spoken language or mathematics.

**A much better translation in spoken language would be "i becomes i + 1" or "i is assigned i + 1".**
:::
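
To make the "becomes" reading concrete, here is a minimal sketch (the variable name `i` and its starting value are only an illustration): the right-hand side is evaluated first, and the name is then re-assigned to the result.

```python
i = 5
i = i + 1  # read: "i becomes i + 1"; the right-hand side (5 + 1) is evaluated first
print(i)   # prints 6
```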

If a variable is not “defined” (assigned a value), trying to use it will give you an error:

In [8]:
n  # trying to access an undefined variable raises an error

NameError: name 'n' is not defined

In interactive mode, the last printed expression is assigned to the variable `_`. This means that when you are using Python as a desk calculator, it is somewhat easier to continue calculations, for example:

In [9]:
tax = 12.5 / 100
price = 100.50
price * tax

12.5625

In [10]:
price + _

113.0625

`_` should be treated as a read-only variable, to be used in the interpreter only.

### Strings 

Besides numbers, Python can also manipulate strings, which can be expressed in several ways. They can be enclosed in single quotes (`'...'`) or double quotes (`"..."`) with the same result:

In [11]:
'spam eggs'

'spam eggs'

In [12]:
"spam eggs"

'spam eggs'

The double quotes are useful if you need to use a single quote in a string:

In [13]:
"doesn't"

"doesn't"

Alternatively, `\` can be used to escape quotes:

In [14]:
'doesn\'t'

"doesn't"

If you do not want characters prefaced by `\` to be interpreted as special characters, you can use raw strings by adding an `r` before the first quote. This is useful for Windows paths:

In [15]:
print('C:\some\name')  # here \n means newline!

C:\some
ame


In [16]:
print(r'C:\some\name')  # note the r before the quote

C:\some\name


```{admonition} For Windows users
:class: warning 

**Windows users**: remember this trick! Paths to files or folders are used constantly in programming.
```

Strings can be concatenated (glued together) with the `+` operator, and repeated with `*`:

In [17]:
("She's a " + 'witch! ') * 3

"She's a witch! She's a witch! She's a witch! "

Strings can be indexed (subscripted), with the first character having index 0:

In [18]:
word = 'Python'
word[0]  # character in position 0

'P'

In [19]:
word[5]  # character in position 5

'n'

Indices may also be negative numbers, to start counting from the right:

In [20]:
word[-1]  # last character

'n'

In [21]:
word[-2]  # second-last character

'o'

In addition to indexing, slicing is also supported. While indexing is used to obtain individual characters, slicing allows you to obtain a substring:

In [22]:
word[0:2]  # characters from position 0 (included) to 2 (excluded)

'Py'

In [23]:
word[2:5]  # characters from position 2 (included) to 5 (excluded)

'tho'

Note how the start is always included, and the end always excluded. This makes sure that `s[:i] + s[i:]` is always equal to `s`:

In [24]:
word[:2] + word[2:]

'Python'

Attempting to use an index that is too large will result in an error:

In [25]:
word[42]  # the word only has 6 characters: this will raise an error

IndexError: string index out of range

However, out of range slice indexes are handled gracefully when used for slicing:

In [26]:
word[4:42]

'on'

In [27]:
word[42:]

''

The **built-in** **function** `len()` returns the length of a string:

In [28]:
s = 'supercalifragilisticexpialidocious'
len(s)

34

## Basic data types

Now that you are more familiar with the basics, let's start to name things "the right way". For example: an informal way to describe a programming language is to say that it "does things with stuff". 

This "stuff" is formally called "objects" in Python. We will define objects more precisely towards the end of the course, but for now remember one important thing: **In Python, everything is an object**. Yes, everything.

Python objects have a **type** (synonym: [data type](https://en.wikipedia.org/wiki/Data_type)). In the previous tutorial, you used exclusively [built-in](https://docs.python.org/3/library/stdtypes.html) types. **Built-in data types** are directly available in the interpreter, as opposed to other data types which may be obtained either by importing them (e.g. ``from collections import OrderedDict``) or by creating new data types yourselves.

### Asking for the type of an object

In [29]:
type(1)

int

In [30]:
a = 'Hello'
type(a)

str

```{exercise}
Try `print(type(a))` instead to see the difference with IPython's simplified print. What is the type of ``type``, by the way?
```

### Numeric types 

There are three distinct numeric types: **integers** (``int``), **floating point numbers** (``float``), and **complex numbers** (``complex``). We will talk about these in more detail in the numerics chapter.
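
As a quick preview, the sketch below creates one object of each numeric type and asks for its type. The variable names are arbitrary and chosen for this illustration only.

```python
n = 42        # int
x = 3.14      # float
z = 1 + 2j    # complex (j denotes the imaginary unit)
print(type(n), type(x), type(z))

# Mixing an int and a float in an operation gives a float
print(type(n + x))
```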

### Booleans

There is a built-in boolean data type (``bool``) useful to test for truth value. Examples:

In [31]:
type(True), type(False)

(bool, bool)

In [32]:
type(a == 'Hello')

bool

In [33]:
3 < 5

True

Note that there are other rules about testing for truth in Python. This is quite convenient if you want to avoid doing operations on invalid or empty containers:

In [34]:
if '':
    print('This should not happen')

In Python, like in C, any non-zero integer value is true; zero is false:

In [35]:
if 1 and 2:
    print('This will happen')

This will happen


Refer to the [docs](https://docs.python.org/3/library/stdtypes.html#truth-value-testing) for an exhaustive list of boolean operations and comparison operators.
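
To get a feeling for these rules, you can apply the built-in function `bool()` to a few objects yourself. The selection below is only a sketch and far from exhaustive: zero, empty containers and `None` are false, most other objects are true.

```python
# bool() applies the truth-testing rules explicitly
candidates = [0, 1, -1, 0.0, '', 'a', [], [0], {}, None]
for obj in candidates:
    print(repr(obj), '->', bool(obj))
```

Note that `[0]` is true: the list is not empty, even though the only element it contains is false.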

### Text

In Python (and many other languages) text sequences are named **strings** (``str``), which can be of any length:

In [36]:
type('Français, 汉语')  # unicode characters are no problem in Python

str

Unlike some languages, there is no special type for characters:

In [37]:
for char in 'string':
    # "char" is also a string of length 1
    print(char, type(char))

s <class 'str'>
t <class 'str'>
r <class 'str'>
i <class 'str'>
n <class 'str'>
g <class 'str'>


Since strings behave like **lists** in many ways, they are often classified together with the **sequence** types, as we will see below.

Python strings cannot be changed - they are [immutable](https://en.wikipedia.org/wiki/Immutable_object). Therefore, assigning to an indexed position in the string results in an error:

In [38]:
word = 'Python'
word[0] = 'J'

TypeError: 'str' object does not support item assignment
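
If you need a different string, a common way out is to create a new one from pieces of the old one. A minimal sketch (the name `new_word` is chosen for this illustration):

```python
word = 'Python'
new_word = 'J' + word[1:]  # build a new string; the original is left untouched
print(new_word)  # prints Jython
print(word)      # prints Python
```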

Python objects have **methods** attached to them. We will learn more about methods later, but here is an example: 

In [39]:
word.upper()  # the method .upper() converts all letters in a string to upper case

'PYTHON'

In [40]:
"She's a witch!".split(' ')  # the .split() method divides strings using a separator

["She's", 'a', 'witch!']

### Sequence types - list, tuple, range

Python knows a number of sequence data types, used to group together other values. The most versatile is the list, which can be written as a list of comma-separated values (items) between square brackets. Lists might contain items of different types, but usually the items all have the same type.

In [41]:
squares = [1, 4, 9, 16, 25, 36, 49]
squares

[1, 4, 9, 16, 25, 36, 49]

Lists can be indexed and sliced:

In [42]:
squares[0]

1

In [43]:
squares[-3:]

[25, 36, 49]

In [44]:
squares[0:7:2]  # new slicing! From element 0 to 7 in steps of 2

[1, 9, 25, 49]

In [45]:
squares[::-1]  # new slicing! All elements in steps of -1, i.e. reverse

[49, 36, 25, 16, 9, 4, 1]

:::{warning}
Lists are not the equivalent of arrays in MATLAB. One major difference is that the addition operator *concatenates* lists together (like strings), instead of adding the numbers elementwise like in MATLAB. For example:
:::

In [46]:
squares + [64, 81, 100]

[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

Unlike strings, which are immutable, lists are a mutable type, i.e. it is possible to change their content:

In [47]:
cubes = [1, 8, 27, 65, 125]  # something's wrong here
cubes[3] = 64
cubes

[1, 8, 27, 64, 125]

Assignment to slices is also possible, and this can even change the size of the list:

In [48]:
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
letters[2:5] = ['C', 'D', 'E']  # replace some values
letters

['a', 'b', 'C', 'D', 'E', 'f', 'g']

In [49]:
letters[2:5] = []  # now remove them
letters

['a', 'b', 'f', 'g']

The built-in function `len()` also applies to lists:

In [50]:
len(letters)

4

It is possible to nest lists (create lists containing other lists), as it is possible to store different objects in lists. For example:

In [51]:
a = ['a', 'b', 'c']
n = [1, 2, 3]
x = [a, n, 3.14]
x

[['a', 'b', 'c'], [1, 2, 3], 3.14]

In [52]:
x[0][1]

'b'

Lists also have methods attached to them (see [5.1 More on lists](https://docs.python.org/3/tutorial/datastructures.html#more-on-lists) for the most commonly used). For example:

In [53]:
alphabet = ['c', 'b', 'd']
alphabet.append('a')  # add an element to the list
alphabet

['c', 'b', 'd', 'a']

In [54]:
alphabet.sort() # sort it
alphabet

['a', 'b', 'c', 'd']

Other sequence types include: **string, tuple, range**. Sequence types support a [common set of operations](https://docs.python.org/3/library/stdtypes.html#common-sequence-operations) and are therefore very similar:

In [55]:
l = [0, 1, 2]
t = (0, 1, 2)
r = range(3)
s = '123'

In [56]:
# Test if elements can be found in the sequence(s)
1 in l, 1 in t, 1 in r, '1' in s

(True, True, True, True)

In [57]:
# Ask for the length
len(l), len(t), len(r), len(s)

(3, 3, 3, 3)

In [58]:
# Addition
print(l + l)
print(t + t)
print(s + s)

[0, 1, 2, 0, 1, 2]
(0, 1, 2, 0, 1, 2)
123123


The addition operator does not work for the range type though. Ranges are a little different than lists or strings:

In [59]:
r = range(2, 13, 2)
r  # r is an object of type "range". It does not print all the values, just the interval and steps

range(2, 13, 2)

In [60]:
list(r)  # applying list() converts range objects to a list of values

[2, 4, 6, 8, 10, 12]

Ranges are usually used as loop counters or to generate other sequences. Ranges have a strong advantage over lists and tuples: their elements are generated *when they are needed*, not before. Ranges therefore have a very low memory consumption. See the following:

In [61]:
range(2**100)  # no problem

range(0, 1267650600228229401496703205376)

In [62]:
list(range(2**100))  # trying to make a list of values out of it results in an error

OverflowError: Python int too large to convert to C ssize_t

Here the ``OverflowError`` tells me that the list I am trying to create would have more elements than Python can index, so it could never fit into memory.
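
One way to see the difference in memory consumption is `sys.getsizeof()` from the standard library, which returns the size of an object in bytes. This is only a rough sketch: the exact numbers depend on your platform and Python version, and the names `big_range` and `big_list` are chosen for this illustration.

```python
import sys

big_range = range(10**6)
big_list = list(big_range)

# For the list, getsizeof() does not even count the integers it refers to
print(sys.getsizeof(big_range))
print(sys.getsizeof(big_list))

# A range still knows its length and supports indexing without storing the values
print(len(big_range), big_range[10], big_range[-1])
```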

The "**tuple**" data type is probably a new concept for you, as tuples are quite specific to Python. A tuple behaves *almost* like a list, but the major difference is that a tuple is **immutable**:

In [63]:
l[1] = 'ha!'  # I can change an element of a list
l

[0, 'ha!', 2]

In [64]:
t[1] = 'ha?'  # But I cannot change an element of a tuple

TypeError: 'tuple' object does not support item assignment

It is their immutability which makes tuples useful, but for beginners this is not really obvious at the first sight. We will get back to tuples later in the lecture.
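
As a small foretaste, here is one situation where immutability matters: a tuple of immutable items can be used as a dictionary key, while a list cannot. The sketch below uses made-up grid points as keys, and a `try` ... `except` block (not covered in this chapter) whose only purpose is to catch the expected error so that the example runs to the end.

```python
# Tuples as dictionary keys: (x, y) -> name of the point
points = {(0, 0): 'origin', (1, 0): 'east'}
print(points[(1, 0)])  # prints east

# A list is mutable and is therefore rejected as a key
list_key_rejected = False
try:
    points[[0, 1]] = 'north'
except TypeError:
    list_key_rejected = True
print(list_key_rejected)  # prints True
```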

### Sets 

Sets are an unordered collection of **distinct** objects: 

In [65]:
s1 = {'why', 1, 9}
s2 = {9, 'not'}
s1

{1, 9, 'why'}

In [66]:
# Let's compute the union of these two sets. We use the method ".union()" for this purpose:
s1.union(s2)  # 9 was already in the set, however it is not doubled in the union

{1, 9, 'not', 'why'}

Sets are useful for operations such as intersection, union, difference, and symmetric difference between sequences. You will not use them much this semester, but remember that they exist.
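
The other operations work like `.union()`. A short sketch reusing the two sets defined above (the order in which the elements of a set are printed may differ on your machine):

```python
s1 = {'why', 1, 9}
s2 = {9, 'not'}

print(s1.intersection(s2))          # elements in both sets
print(s1.difference(s2))            # elements in s1 but not in s2
print(s1.symmetric_difference(s2))  # elements in exactly one of the two sets

# Duplicates disappear when a list is converted to a set
print(set([1, 1, 2, 3, 3]))
```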

### Mapping types - dictionaries

A **mapping object** maps values (**keys**) to arbitrary objects (**values**): the most frequently used mapping object is called a **dictionary**. It is a collection of (key, value) pairs:  

In [67]:
tel = {'jack': 4098, 'sape': 4139}
tel

{'jack': 4098, 'sape': 4139}

In [68]:
tel['guido'] = 4127
tel

{'jack': 4098, 'sape': 4139, 'guido': 4127}

In [69]:
del tel['sape']
tel

{'jack': 4098, 'guido': 4127}

*Keys* can be of any immutable type: e.g. strings and numbers are often used as keys. The keys in a dictionary are all unique (they have to be):

In [70]:
d = {'a':1, 2:'b', 'c':1}  # a, 2, and c are keys
d

{'a': 1, 2: 'b', 'c': 1}

You can ask whether a key exists in a dict with the statement:

In [71]:
2 in d

True

However, the `in` operator only looks at the keys, not at the values. The following is therefore `False`, although `1` is stored as a value:

In [72]:
1 in d

False
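
If you do want to look at the values, dictionaries provide the methods `.values()` and `.items()`. A minimal sketch with the same dictionary:

```python
d = {'a': 1, 2: 'b', 'c': 1}

print(1 in d)           # False: 1 is not a key
print(1 in d.values())  # True: 1 appears among the values

# .items() gives (key, value) pairs, which is convenient in a for loop
for key, value in d.items():
    print(key, '->', value)
```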

:::{warning}
Since Python 3.7, a `dict` is guaranteed to remember the *order* in which the keys have been added to it. In Python 3.6 this was only an implementation detail of CPython, and in earlier versions the order was arbitrary: do not count on it if your code has to run on old Python versions.
:::

Dictionaries are (together with lists) the **container** type you will use the most often.

*Note: there are other container types in python, but they are used less often. See [Container datatypes](https://docs.python.org/3/library/collections.html) in the official documentation.*

### Semantics parenthesis: "literals"

**Literals** are the fixed values of a programming language ("notations"). Some of them are pretty universal, like numbers or strings (``9``, ``3.14``, ``"Hi!"`` are all literals); others are more language specific and belong to the language's syntax. Curly brackets ``{}``, for example, are the literal representation of a ``dict``. The literal syntax has been added for convenience only:

In [73]:
d1 = dict(bird='parrot', plant='crocus')  # one way to make a dict
d2 = {'bird':'parrot', 'plant':'crocus'}  # another way to make a dict
d1 == d2

True

Both `{}` and `dict()` are equivalent: using one or the other to construct your containers is a matter of taste, but in practice you will see the literal version more often.
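
The same holds for the other containers we have seen so far. The sketch below compares the literal syntax with the corresponding constructor call, and shows one pitfall worth remembering: empty curly brackets make a `dict`, not a `set`.

```python
# Literal syntax (left) and the equivalent constructor call (right)
print([1, 2, 3] == list((1, 2, 3)))
print((1, 2, 3) == tuple([1, 2, 3]))
print({1, 2, 3} == set([1, 2, 3]))

# {} is an empty dict; an empty set has to be written set()
print(type({}), type(set()))
```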

##  Control flow

### First steps towards programming

Of course, we can use Python for more complicated tasks than adding two and two together. For instance, we can write an initial sub-sequence of the [Fibonacci series](https://en.wikipedia.org/wiki/Fibonacci_number) as follows:

In [74]:
# Fibonacci series:
# the sum of two previous elements defines the next
a, b = 0, 1
while a < 10:
    print(a)
    a, b = b, a+b

0
1
1
2
3
5
8


This example introduces several new features.
- The first line contains a multiple assignment: the variables a and b simultaneously get the new values 0 and 1. On the last line this is used again, demonstrating that the expressions on the right-hand side are all evaluated first before any of the assignments take place. The right-hand side expressions are evaluated from the left to the right.
- The while loop executes as long as the condition (here: ``a < 10``) remains true. The standard comparison operators are written the same as in C: `<` (less than), `>` (greater than), `==` (equal to), `<=` (less than or equal to), `>=` (greater than or equal to) and `!=` (not equal to).
- The body of the loop is **indented**: indentation, rather than brackets or ``begin .. end`` statements, is Python’s way of grouping statements. Hate it or love it, this is how it is ;-). I learned to like this style a lot. **Note that each line within a basic block must be indented by the same amount.** Although the indentation could be anything (two spaces, three spaces, tabs...), the recommended way is to use **four spaces**.
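
The first point, multiple assignment, is easiest to appreciate when swapping two values. In the sketch below (the variable names are arbitrary), the one-line version works because both right-hand side values are read before anything is assigned. Doing the two assignments one after the other loses a value.

```python
left, right = 1, 2
left, right = right, left  # both right-hand side values are read first
print(left, right)         # prints 2 1

# Done one after the other, the first assignment overwrites a value we still need
p, q = 1, 2
p = q
q = p
print(p, q)                # prints 2 2
```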

The `print()` function accepts multiple arguments:

In [75]:
i = 256*256
print('The value of i is', i)

The value of i is 65536


The keyword argument (see definition below) ``end`` can be used to avoid the newline after the output, or end the output with a different string:

In [76]:
a, b = 0, 1
while a < 1000:
    print(a, end=',')
    a, b = b, a+b

0,1,1,2,3,5,8,13,21,34,55,89,144,233,377,610,987,

### The `if` statement 

Perhaps the most well-known statement type is the if statement:

In [77]:
x = 12
if x < 0:
    x = 0
    print('Negative changed to zero')
elif x == 0:
    print('Zero')
elif x == 1:
    print('Single')
else:
    print('More')

More


There can be zero or more `elif` parts, and the `else` part is optional. The keyword `elif` is short for "else if", and is useful to avoid excessive indentation.

###  The `for` statement

The `for` loops in Python can be quite different from those in other languages: **In Python, one iterates over sequences, not indexes**. This is a feature I very much like for its readability:

In [78]:
words = ['She', 'is', 'a', 'witch']
for w in words:
    print(w)

She
is
a
witch


The equivalent `for` loop with a counter is considered "unpythonic", i.e. not elegant.

**Unpythonic:**

In [79]:
seq = ['This', 'is', 'very', 'unpythonic']
# Do not do this at home!
n = len(seq)
for i in range(n):
    print(seq[i])

This
is
very
unpythonic


**Pythonic**:

In [80]:
seq[-1] = 'pythonic'
for s in seq:
    print(s)

This
is
very
pythonic


``for i in range(xx)`` is *almost never* what you want to do in Python. If you have several sequences you want to **iterate** over, then do:

In [81]:
squares = [1, 4, 9, 25]
for s, l in zip(seq, squares):
    print(l, s)

1 This
4 is
9 very
25 pythonic
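
If you really need the position of each element in addition to the element itself, the built-in function `enumerate()` is a common alternative to `range(len(seq))`. A minimal sketch with the same list:

```python
seq = ['This', 'is', 'very', 'pythonic']
for i, s in enumerate(seq):
    # i is the index (starting at 0), s is the element
    print(i, s)
```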


### The `break` and `continue` statements

The `break` statement breaks out of the innermost enclosing for or while loop:

In [82]:
for letter in 'Python':
    if letter == 'h':
        break
    print('Current letter:', letter)

Current letter: P
Current letter: y
Current letter: t


The `continue` statement continues with the next iteration of the loop:

In [83]:
for num in range(2, 10):
    if num % 2 == 0:
        print("Found an even number", num)
        continue
    print("Found a number", num)

Found an even number 2
Found a number 3
Found an even number 4
Found a number 5
Found an even number 6
Found a number 7
Found an even number 8
Found a number 9


## Defining  functions

### A first example 

In [84]:
def fib(n):
    """Print a Fibonacci series up to n."""
    a, b = 0, 1
    while a < n:
        print(a, end=' ')
        a, b = b, a+b

# Now call the function we just defined:
fib(2000)

0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 

The `def` statement introduces a function definition. It must be followed by the function name and the parenthesized list of formal parameters. The statements that form the body of the function start at the next line, and **must be indented**.

The first statement of the function body can optionally be a string literal; this string literal is the function's documentation string, or docstring (more about docstrings later: in the meantime, make a habit out of it).
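
One reason to make a habit out of it: the docstring is stored on the function object and can be read back later. The sketch below repeats the definition of `fib` so that it runs on its own.

```python
def fib(n):
    """Print a Fibonacci series up to n."""
    a, b = 0, 1
    while a < n:
        print(a, end=' ')
        a, b = b, a+b

print(fib.__doc__)  # prints the docstring of the function
```

In an interactive session, `help(fib)` displays the same text together with the function signature.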

A function definition introduces the function name in the current **scope** (we will learn about scopes soon). The value of the function name has a type that is recognized by the interpreter as a user-defined function. This value can be assigned to another name which can then also be used as a function. This serves as a general renaming mechanism:

In [85]:
fib

<function __main__.fib(n)>

In [86]:
f = fib
f(100)

0 1 1 2 3 5 8 13 21 34 55 89 

Coming from other languages, you might object that `fib` is not a function but a procedure since it does not return a value. In fact, even functions without a return statement do return a value, albeit a rather boring one. This value is called `None` (it is a built-in name). Writing the value `None` is normally suppressed by the interpreter if it would be the only value written. You can see it if you really want to by using `print()`:

In [87]:
fib(0)  # shows nothing

In [88]:
print(fib(0))  # prints None

None


It is simple to write a function that returns a list of the numbers of the Fibonacci series, instead of printing it:

In [89]:
def fib2(n):  # return Fibonacci series up to n
    """Return a list containing the Fibonacci series up to n."""
    result = []
    a, b = 0, 1
    while a < n:
        result.append(a) 
        a, b = b, a+b
    return result

r = fib2(100)  # call it
r  # print the result

[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]

### Positional and keyword arguments

Functions have two types of arguments: **positional arguments** and **keyword arguments**.

**Keyword arguments** are preceded by an identifier (e.g. ``name=``) and are given a default value in the function definition. They are therefore *optional*:

In [90]:
def f(arg1, arg2, kwarg1=None, kwarg2='Something'):
    """Some function with arguments."""
    print(arg1, arg2, kwarg1, kwarg2)

In [91]:
f(1, 2)  # no need to specify them - they are optional and have default values

1 2 None Something


In [92]:
f(1, 2, kwarg1=3.14, kwarg2='Yes')  # but you can set them to a new value
f(1, 2, kwarg2='Yes', kwarg1=3.14)  # and the order is not important!

1 2 3.14 Yes
1 2 3.14 Yes


Unfortunately, it is also possible to set keyword arguments without naming them, in which case the order matters:

In [93]:
f(1, 2, 'Yes', 'No')

1 2 Yes No


This feature reduces the clarity of the code, and we recommend always using the ``kwarg=`` syntax. Others agree with that, and therefore Python implemented a syntax to make calls like the above illegal:

In [94]:
# The * before the keyword arguments makes them keyword-ONLY arguments
def f(arg1, arg2, *, kwarg1=None, kwarg2='None'):
    print(arg1, arg2, kwarg1, kwarg2)

In [95]:
f(1, 2, 'Yes', 'No')  # This now raises an error

TypeError: f() takes 2 positional arguments but 4 were given

**Positional arguments** are named like this because their position matters, and unlike keyword arguments they do not have a default value and they are mandatory. Forgetting to set them results in an error:

In [96]:
f(1)

TypeError: f() missing 1 required positional argument: 'arg2'
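
To put both kinds of arguments together, here is a small sketch with a function that returns a value. The function `power` and its argument names are hypothetical and only serve as an illustration: `base` is a mandatory positional argument, `exponent` is an optional keyword-only argument.

```python
def power(base, *, exponent=2):
    """Return base raised to the power of exponent."""
    return base ** exponent

print(power(3))              # default exponent is used: prints 9
print(power(3, exponent=3))  # keyword argument set explicitly: prints 27
```

Because of the `*` in the definition, calling `power(3, 3)` would raise a `TypeError`, which forces callers to write the more readable `exponent=3`.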

## Importing modules and functions

Although Python ships with some built-in functions available in the interpreter (e.g. `len()`, `print()`), these are by far not enough for real-world programming. Thankfully, Python comes with a mechanism which allows us to access *much* more functionality:

In [97]:
import math
print(math)
print(math.pi)

<module 'math' from '/home/c707201/mambaforge/envs/scipro/lib/python3.13/lib-dynload/math.cpython-313-x86_64-linux-gnu.so'>
3.141592653589793


`math` is a **module**, and it has attributes (e.g. `pi`) and functions attached to it:

In [98]:
math.sin(math.pi / 4)  # compute a sine

0.7071067811865475

`math` is available in the [Python **standard library**](https://docs.python.org/3/library/): this means that it comes pre-installed together with Python itself. Other modules can be installed (like `numpy` or `matplotlib`), but we will not need them for now.

Modules often have a thematic grouping, e.g. `math`, `time`, `multiprocessing`. You will learn more about them in the next lecture.
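
The `import` statement comes in a few flavours that you will meet in other people's code. The sketch below shows three of them with the `math` module; the alias `m` is an arbitrary choice for this illustration.

```python
import math                # access with the module prefix: math.sqrt
import math as m           # the same module under a shorter alias
from math import pi, sqrt  # import selected names directly

print(math.sqrt(16), m.sqrt(16), sqrt(16))  # prints 4.0 4.0 4.0
print(pi == math.pi)                        # prints True
```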

## Take home points

- In Python, everything is an object - we will learn more about them later.
- For now you can remember that Python objects have methods ("services") attached to them, such as `.split()` for strings or `.append()` for lists.
- All objects have a data type: examples of data types include `float`, `str`, `dict`, and `list`.
- You can ask for the type of an object with the built-in function ``type()``.
- "Built-in" means that a function or data type is available at the command prompt without an import statement.
- The "standard library" is not the same as "built-in" (the standard library is the suite of modules which come pre-installed with Python).
- `list` and `dict` are the container data types you will use most often, `tuple` is often returned by Python itself or libraries.
- Certain objects are immutable (`str`, `tuple`), but others are mutable and can change their state (`dict`, `list`).
- In Python, indentation matters! This is how you define blocks of code. Keep your indentation consistent, with 4 spaces.
- In Python, one iterates over sequences, not indexes (`for i in range(...)` is very rare in Python and so is the variable `i`).
- Functions are defined with ``def``, and also rely on indentation to define blocks. They can have a `return` statement.
- There are two types of arguments in functions: positional (mandatory) and keyword (optional) arguments.
- The `import` statement opens a whole new world of possibilities: you can access other standard tools that are not available at the top-level prompt.

We learned the basic elements of the Python syntax: to become fluent with this new language you will have to get familiar with all of the elements presented above. With time, you might want to get back to this chapter (or to the Python reference documentation) to revisit what you have learned. I also highly recommend following the official Python tutorial, sections 3 to 5.