# Basic concepts

## Basic input and output

The traditional "Hello, world" program is very simple in Python. You can run the program by selecting the cell with the mouse and pressing Control+Enter on the keyboard. Try editing the string in the quotes and rerunning the program.

In [84]:
print("Hello world!")

Hello world!


Multiple strings can be printed. By default, they are concatenated with a space:

In [85]:
print("Hello,", "John!", "How are you?")

Hello, John! How are you?


In the print function, numerical expressions are first evaluated and then automatically converted to strings. The resulting strings are then joined with spaces:

In [86]:
print(1, "plus", 2, "equals", 1+2)

1 plus 2 equals 3


Reading textual input from the user can be achieved with the input function. The input function is given a string parameter, which is printed and prompts the user to give input. In the example below, the string entered by the user is stored in the variable `name`.

In [87]:
name=input("Give me your name: ")
print("Hello,", name)

Give me your name: Jarkko
Hello, Jarkko


## Indentation

Repetition is possible with the for loop. Note that the body of the for loop is indented with a tabulator or four spaces.
Unlike in some other languages, braces are not needed to denote the body of the loop. When the indentation stops, the body of the loop ends.

In [88]:
for i in range(3):
    print("Hello")
print("Bye!")

Hello
Hello
Hello
Bye!


Indentation applies to other compound statements as well, such as bodies of functions, different branches of an if statement, and while loops. We shall see examples of these later.

The range(3) expression above actually results in the sequence of integers 0, 1, and 2. So, the range is a half-open interval with the end point excluded from the range. In general, the expression range(n) gives the integers 0, 1, 2, ..., n-1. Modify the above program to make it also print the value of the variable i at each iteration. Rerun the code with Control+Enter.

#### <div class="alert alert-info"> Exercise 1 (hello world)</div>
Fill in the missing piece in the solution stub to make it print the following:

`Hello, world!`

Make sure you use correct indenting.

#### <div class="alert alert-info"> Exercise 2 (compliment)</div>
Fill in the stub solution to make the program work as follows. The program should ask the user for an input, and then print an answer as the examples below show. The string the user entered is shown below in red.

What country are you from? <font color='red'>Sweden</font>  
I have heard that Sweden is a beautiful country.

What country are you from? <font color='red'>Chile</font>  
I have heard that Chile is a beautiful country.

#### <div class="alert alert-info">Exercise 3 (multiplication)</div> 
Make a program that gives the following output. You must use a for loop in your solution.

```
4 multiplied by 0 is 0
4 multiplied by 1 is 4
4 multiplied by 2 is 8
4 multiplied by 3 is 12
4 multiplied by 4 is 16
4 multiplied by 5 is 20
4 multiplied by 6 is 24
4 multiplied by 7 is 28
4 multiplied by 8 is 32
4 multiplied by 9 is 36
4 multiplied by 10 is 40
```

## Variables and data types

We already saw that assigning a value to a variable is very simple:

In [89]:
a=1
print(a)

1


Note that we did not need to introduce the variable a in any way. No type was given for the variable. Python automatically detected that the type of a must be int. We can query the type of a variable with the builtin function type:

In [90]:
type(a)

int

Note also that the type of a variable is not fixed:

In [91]:
a="some text"
type(a)

str

In Python the type of a variable is not attached to the name of the variable, as it is in C for instance, but to the actual value. This is called dynamic typing.

![typing.svg](attachment:typing.svg)

We say that a variable is a name that *refers* to a value or an object, and the assignment operator *binds* a variable name to a value.
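
The following minimal sketch illustrates what binding means in practice. The names `x` and `y` are chosen here only for illustration, and the `is` operator is used to check whether two names refer to the same object.

```python
# Two names can refer to the same object.
x = [1, 2, 3]
y = x              # binds the name y to the object that x refers to
print(x is y)      # True: both names refer to one list object

# Rebinding one name does not affect the other name.
x = "some text"    # x now refers to a str object
print(type(x), type(y))
print(y)           # y still refers to the original list
```

Assignment never copies the value; it only makes the name on the left refer to the object on the right.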

The basic data types in Python are: int, float, complex, str (a string), bool (a boolean with values True and False), and bytes. Below are a few examples of their use.

In [92]:
i=5
f=1.5
b = i==4
print("Result of the comparison:", b)
c=0+2j
print("Complex multiplication:", c*c)
s="conca" + "tenation"
print(s)

Result of the comparison: False
Complex multiplication: (-4+0j)
concatenation


The names of the types act as conversion operators between types:

In [93]:
print(int(-2.8))
print(float(2))
print(int("123"))
print(bool(-2), bool(0))  # Zero is interpreted as False
print(str(234))

-2
2.0
123
True False
234


### Creating strings
A string is a sequence of characters commonly used to store input or output data in a program. The characters of a string are specified either between single (') or double (") quotes. This option is useful if a string needs to contain a quotation mark:
"I don't want to go!". You can also achieve this by *escaping* the quotation mark with the backslash: 'I don\'t want to go'.

The string can also contain other escape sequences like \n for newline and \t for a tabulator. See [literals](https://docs.python.org/3/reference/lexical_analysis.html#literals) for a list of all escape sequences.

In [94]:
print("One\tTwo\nThree\tFour")

One	Two
Three	Four


A string containing newlines can be easily given within triple double or triple single quotes:

In [95]:
s="""A string
spanning over
several lines"""

Although we can concatenate strings using the + operator, for efficiency reasons one should use the join method to concatenate a large number of strings:

In [96]:
a="first"
b="second"
print(a+b)
print(" ".join([a, b, b, a]))   # More about the join method later


firstsecond
first second second first


Sometimes printing by concatenation from pieces can be clumsy:

In [97]:
print(str(1) + " plus " + str(3) + " is equal to " + str(4))
# slightly better
print(1, "plus", 3, "is equal to", 4)

1 plus 3 is equal to 4
1 plus 3 is equal to 4


The repeated concatenation operators and quotation marks break the flow of thought. *String interpolation* offers a somewhat easier syntax:

In [98]:
print("%i plus %i is equal to %i" % (1, 3, 4))

1 plus 3 is equal to 4


Or alternatively using the newer format-method:

In [99]:
print("{} plus {} is equal to {}".format(1, 3, 4))

1 plus 3 is equal to 4


The %i format specifier corresponds to integers and the specifier %f corresponds to floats.
It is often useful to specify the number of decimals when printing the float:

In [100]:
print("%.1f %.2f %.3f" % (1.6, 1.7, 1.8))               # Old style
print("{:.1f} {:.2f} {:.3f}".format(1.6, 1.7, 1.8))     # new style

1.6 1.70 1.800
1.6 1.70 1.800


Look [here](https://pyformat.info/#number) for more details about format specifiers, and for comparison between the old and new style of string interpolation.
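
As a further option that is not used elsewhere in this section, Python 3.6 and later also support *formatted string literals* (f-strings), where the expressions are written directly inside the braces. The sketch below assumes such a Python version; the variable names are chosen only for illustration.

```python
a, b = 1, 3
# An f-string evaluates the expressions inside the braces.
line = f"{a} plus {b} is equal to {a + b}"
print(line)

x = 1.8
# Format specifiers follow a colon, as with the format method.
rounded = f"{x:.3f}"
print(rounded)
```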

## Expressions
An *expression* is a piece of Python code that results in a value. It consists of values combined together with *operators*. Values can be literals, such as `1`, `1.2`, `"text"`, or variables. Operators include arithmetic operators, comparison operators, function calls, indexing, and attribute references, among others. Below are a few examples of expressions:

```
1+2
7/(2+0.1)
a
cos(0)
mylist[1]
c > 0 and c != 1
(1,2,3)
a<5
obj.attr
(-1)**2 == 1
```

<div class="alert alert-warning">Note that in Python the operator `//` performs integer division and the operator `/` performs float division. The `**` operator denotes exponentiation. These operators might therefore behave differently than in many other common languages.</div>
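
A short sketch of how these operators behave. The variable names are chosen only for illustration. Note that `//` rounds down (towards negative infinity), which matters for negative operands.

```python
true_div = 7 / 2       # float division
floor_div = 7 // 2     # integer (floor) division
neg_floor = -7 // 2    # rounds towards negative infinity, not towards zero
power = 2 ** 10        # exponentiation
print(true_div, floor_div, neg_floor, power)
```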

As another example the following expression computes the kinetic energy of a non-rotating object:
`0.5 * mass * velocity**2`

## Statements
Statements are commands that have some effect. For example, a function call (that is not part of another expression) is a statement. Also, the variable assignment is a statement:

In [101]:
i = 5
i = i+1    # This is a common idiom to increment the value of i by one
i += 1     # This is a short-hand for the above

It turns out that the operators `+ - * / // % & | ^ >> << **` have the corresponding *augmented assignment operators* `+= -= *= /= //= %= &= |= ^= >>= <<= **=`
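
A small sketch of a few of these augmented assignment operators in use; each line has the same effect as the longer form shown in its comment.

```python
i = 5
i *= 3      # same as i = i * 3
i -= 1      # same as i = i - 1
i //= 4     # same as i = i // 4
i **= 2     # same as i = i ** 2
print(i)
```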

Another large set of statements is the flow-control statements such as if-else, for and while loops. We will look into these in the next sections.

### Loops for repetitive tasks
In Python we have two kinds of loops: while and for. We briefly saw the for loop earlier. Let's now look at the while loop. A while loop repeats a set of statements while a given condition holds. An example:

In [102]:
i=1
while i*i < 1000:
    print("Square of", i, "is", i*i)
    i = i + 1
print("Finished printing all the squares below 1000.")

Square of 1 is 1
Square of 2 is 4
Square of 3 is 9
Square of 4 is 16
Square of 5 is 25
Square of 6 is 36
Square of 7 is 49
Square of 8 is 64
Square of 9 is 81
Square of 10 is 100
Square of 11 is 121
Square of 12 is 144
Square of 13 is 169
Square of 14 is 196
Square of 15 is 225
Square of 16 is 256
Square of 17 is 289
Square of 18 is 324
Square of 19 is 361
Square of 20 is 400
Square of 21 is 441
Square of 22 is 484
Square of 23 is 529
Square of 24 is 576
Square of 25 is 625
Square of 26 is 676
Square of 27 is 729
Square of 28 is 784
Square of 29 is 841
Square of 30 is 900
Square of 31 is 961
Finished printing all the squares below 1000.


Note again that the body of the while statement was marked with the indentation.

Another way of repeating statements is with the for statement. An example:

In [103]:
s=0
for i in [0,1,2,3,4,5,6,7,8,9]:
    s = s + i
print("The sum is", s)

The sum is 45


The for loop executes the statements in the block as many times as there are elements in the given list. At each iteration the variable i refers to the next value from the list, in order. Instead of giving the list explicitly as above, we could have used `range(10)`, which produces the values 0,1,...,9 one at a time as the for loop asks for a new value. In the most general form the for loop goes through all the elements in an *iterable*.
Besides lists and ranges there are other iterables, such as generators. We will talk about iterables and generators later this week.

When one wants to iterate through all the elements in an iterable, then the for loop is a natural choice. But sometimes a while loop offers a cleaner solution. For instance, if we want to go through all Fibonacci numbers up to a given limit, then it is easier to do with a `while` loop.
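
The Fibonacci case could be sketched as follows. The limit of 100 and the variable names are assumptions made for illustration; the list method `append` and the simultaneous assignment `a, b = b, a + b` are used here ahead of their proper introduction.

```python
limit = 100          # hypothetical upper limit chosen for illustration
fibs = []
a, b = 0, 1          # the first two Fibonacci numbers
while a <= limit:
    fibs.append(a)
    a, b = b, a + b  # step to the next pair of numbers
print(fibs)
```

We do not know in advance how many iterations are needed, which is why the `while` loop with a condition fits better than a `for` loop over a fixed range.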

### Decision making with the if statement
The if-else statement works as can be expected.
Try running the cell below by pressing Control+Enter.

In [104]:
x=input("Give an integer: ")
x=int(x)
if x >= 0:
    a=x
else:
    a=-x
print("The absolute value of %i is %i" % (x, a))

Give an integer: -2
The absolute value of -2 is 2


The general form of an if-else statement is

```
if condition1:
    statement1_1
    statement1_2
    ...
elif condition2:
    statement2_1
    statement2_2
    ...
...
else:
    statementn_1
    statementn_2
    ...
```

Another example:

In [105]:
c=float(input("Give a number: "))
if c > 0:
    print("c is positive")
elif c<0:
    print("c is negative")
else:
    print("c is zero")

Give a number: -1
c is negative


### Breaking and continuing loops
The `break` statement exits the loop immediately, for example when the wanted element is found:

In [106]:
l=[1,3,65,3,-1,56,-10]
for x in l:
    if x < 0:
        break
print("The first negative list element was", x)

The first negative list element was -1


The `continue` statement stops the current iteration and continues with the next one:

In [107]:
from math import sqrt, log
l=[1,3,65,3,-1,56,-10]
for x in l:
    if x < 0:
        continue
    print("Square root of %i is %f" % (x, sqrt(x)))
    print("Natural logarithm of %i is %f" % (x, log(x)))

Square root of 1 is 1.000000
Natural logarithm of 1 is 0.000000
Square root of 3 is 1.732051
Natural logarithm of 3 is 1.098612
Square root of 65 is 8.062258
Natural logarithm of 65 is 4.174387
Square root of 3 is 1.732051
Natural logarithm of 3 is 1.098612
Square root of 56 is 7.483315
Natural logarithm of 56 is 4.025352


## Functions
A function is defined with the `def` statement. Let's do a doubling function.

In [108]:
def double(x):
    "This function multiplies its argument by two."
    return x*2
print(double(4), double(1.2), double("abc")) # It even happens to work for strings!

8 2.4 abcabc


The double function takes only one parameter. Notice the *docstring* on the second line. It documents the purpose and usage of the function. Let's try to access it.

In [109]:
print("The docstring is:", double.__doc__)
help(double)   # Another way to access the docstring

The docstring is: This function multiplies its argument by two.
Help on function double in module __main__:

double(x)
    This function multiplies its argument by two.



Most of Python's builtin functions, classes, and modules should contain a docstring.

In [110]:
help(print)

Help on built-in function print in module builtins:

print(...)
    print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)
    
    Prints the values to a stream, or to sys.stdout by default.
    Optional keyword arguments:
    file:  a file-like object (stream); defaults to the current sys.stdout.
    sep:   string inserted between values, default a space.
    end:   string appended after the last value, default a newline.
    flush: whether to forcibly flush the stream.



Here's another example function:

In [111]:
def sum_of_squares(a, b):
    "Computes the sum of arguments squared"
    return a**2 + b**2
print(sum_of_squares(3, 4))

25


<div class="alert alert-warning">Note the terminology: in the function definition the names a and b are called *parameters* of the function; in the function call, however, 3 and 4 are called *arguments* to the function.
</div>

It would be nice if the number of arguments could be arbitrary, not just two. We could pass a list to the function as a parameter.

In [112]:
def sum_of_squares(lst):
    "Computes the sum of squares of elements in the list given as parameter"
    s=0
    for x in lst:
        s += x**2
    return s
print(sum_of_squares([-2]))
print(sum_of_squares([-2,4,5]))

4
45


This works perfectly! There is however some extra typing with the brackets around the lists. Let's see if we can do better:

In [113]:
def sum_of_squares(*t):
    "Computes the sum of squares of arbitrary number of arguments"
    s=0
    for x in t:
        s += x**2
    return s
print(sum_of_squares(-2))
print(sum_of_squares(-2,4,5))

4
45


The strange-looking parameter notation is called *argument packing*. It packs all the given positional arguments into a tuple `t`. We will encounter tuples again later, but it suffices now to say that tuples are immutable lists. With the for loop we can iterate through all the elements in the tuple.

Conversely, there is also syntax for *argument unpacking*. Confusingly, it has exactly the same notation as argument packing, but the two are distinguished by where they are used. Packing happens in the parameter list of the function definition, and unpacking happens where the function is called:

In [115]:
lst=[1,5,8]
print("With list unpacked as arguments to the functions:", sum_of_squares(*lst))
# print(sum_of_squares(lst))    # Does not work correctly

With list unpacked as arguments to the functions: 90


The second, commented-out call does not work because the function would try to raise the whole list of numbers to the second power. Inside the function body we would have `t=([1,5,8],)`, that is, a tuple with one element, a list.
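
To see the failure without stopping the program, the call can be wrapped in a `try`/`except` block, as in the sketch below. Exception handling has not been introduced yet, so treat this only as an illustration; the name `failed` is hypothetical.

```python
def sum_of_squares(*t):
    "Computes the sum of squares of arbitrary number of arguments"
    s = 0
    for x in t:
        s += x**2
    return s

lst = [1, 5, 8]
try:
    sum_of_squares(lst)       # t == ([1, 5, 8],), so x is the whole list
    failed = False
except TypeError as err:
    failed = True
    print("Call without unpacking failed:", err)

print(sum_of_squares(*lst))   # unpacked: same as sum_of_squares(1, 5, 8)
```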

In addition to the positional parameters we have seen so far, a function can also have *named parameters*, which are passed in the function call using the name of the parameter.

One can also specify optional parameters by giving the parameters a default value. The parameters that have default values must come after those parameters that don't. We saw that the parameters of the print function were of form `print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)`. There were four parameters with default values. If some default values don't suit us, we give them in the function call using the name of the parameter:

In [116]:
print(1, 2, 3, end=' |', sep=' -*- ')
print("first", "second", "third", end=' |', sep=' -*- ')

1 -*- 2 -*- 3 |first -*- second -*- third |

Note that the named arguments didn't need to be in the same order as in the function definition. Nor did we need to specify all the parameters with default values, only those we wanted to change.

In [117]:
def length(*t, degree=2):
    """Computes the length of the vector given as parameter. By default, it computes
    the Euclidean distance (degree==2)"""
    s=0
    for x in t:
        s += abs(x)**degree
    return s**(1/degree)
print(length(-4,3))
print(length(-4,3, degree=3))

5.0
4.497941445275415


With the default value `degree=2` this is the Euclidean distance. In general, when `degree` is equal to $p$, the result is called the [$p$-norm](https://en.wikipedia.org/wiki/P-norm).

We saw that it was possible to use packing and unpacking of arguments with the `*` notation, when one wants to specify an arbitrary number of *positional arguments*. This is also possible for an arbitrary number of named arguments with the `**` notation. We will talk about this more in the data structures section.
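
As a preview, here is a minimal sketch of the `**` notation. The function name `describe` and the names `options` and `settings` are hypothetical, and dictionaries are only introduced later in this material.

```python
def describe(**options):
    "Packs arbitrary named arguments into the dictionary options"
    parts = []
    for name in sorted(options):
        parts.append("%s=%s" % (name, options[name]))
    return ", ".join(parts)

print(describe(sep="-", end="|"))

# Unpacking a dictionary into named arguments at the call site
settings = {"end": " |", "sep": " -*- "}
print(1, 2, 3, **settings)    # same as print(1, 2, 3, end=' |', sep=' -*- ')
print()
```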

### Visibility of variables
A function definition creates a new namespace (also called local scope). Variables created inside this scope are not available from outside the function definition. Also, the function parameters are only visible inside the function definition. Variables that are not defined inside any function are called *global variables*.

Global variables are also readable in local scopes, but an assignment creates a new local variable without rebinding the global variable. If we are inside a function, a local variable hides a global variable by the same name:

In [118]:
i=2
def f():
    i=3       # this creates a new variable, it does not rebind the global i
    print(i)  # This will print 3    
f()
print(i)      # This will print 2

3
2


If you really need to rebind a global variable from a function, use the `global` statement. Example:

In [119]:
i=2
def f():
    global i
    i=5       # rebind the global i variable
    print(i)  # This will print 5
f()
print(i)      # This will print 5

5
5


Unlike languages like C or C++, Python allows defining a function inside another function. This *nested* function will have nested scope:

In [120]:
def f():            # outer function
    b=2
    def g():        # inner function
        #nonlocal b # Without this nonlocal statement,
        b=3         # this will create a new local variable
        print(b)
    g()
    print(b)
f()

3
2


Try first running the above cell and see the result. Then uncomment the nonlocal statement and run the cell again. The `global` and `nonlocal` statements are similar. The first will force a variable to refer to a global variable, and the second will force a variable to refer to the variable in the nearest outer scope (but not the global scope).
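
One situation where `nonlocal` is useful is a nested function that needs to update a variable of its enclosing function. The sketch below uses hypothetical names (`make_counter`, `increment`) and also returns the inner function as a value, something that is discussed later in this material.

```python
def make_counter():
    count = 0                # local variable of make_counter
    def increment():
        nonlocal count       # rebind count of the enclosing function
        count += 1
        return count
    return increment

inc = make_counter()
first = inc()
second = inc()
print(first, second)
```

Without the `nonlocal` statement the line `count += 1` would try to create a new local variable inside `increment` and fail, because that local variable has no value yet.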

# Data structures
The main data structures in Python are strings, lists, tuples, dictionaries, and sets. We saw some examples of lists when we discussed for loops. And we briefly saw tuples when we introduced argument packing and unpacking. Let's get into more details now.

## Sequences
A *list* contains an arbitrary number of elements (even zero) that are stored in sequential order. The elements are separated by commas and written between brackets. The elements don't need to be of the same type. An example of a list with four values:

In [121]:
[2, 100, "hello", 1.0]

[2, 100, 'hello', 1.0]

A *tuple* is a fixed-length, immutable, and ordered container. Elements of a tuple are separated by commas and written between parentheses. Examples of tuples:

In [122]:
(3,)               # a singleton
(1,3)              # a pair
(1, "hello", 1.0); # a triple

<div class="alert alert-warning">Note the difference between `(3)` and `(3,)`. The first one defines an integer, and the second one defines a tuple with single element.</div>

As we can see, both lists and tuples can contain values of different types.

Lists, tuples, and strings are called *sequences* in Python, and they have several commonalities:
* Their length can be queried with the `len` function
* The `min` and `max` functions find the minimum and maximum element of a sequence, and `sum` adds all the elements of a sequence of numbers together
* Sequences can be concatenated with the `+` operator, and repeated with the `*` operator: `"hi"*3=="hihihi"`
* Since sequences are ordered, we can refer to the elements of a sequence by integers using the *indexing* notation: `"abcd"[2] == "c"`
* Note that the indexing begins from 0
* Negative integers start indexing from the end: -1 refers to the last element, -2 refers to the second last, and so on
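
The operations listed above can be tried out with a short sketch like the following. The variable names and values are chosen only for illustration.

```python
s = "abcd"
L = [3, 1, 2]
t = (10, 20)

print(len(s), len(L), len(t))      # lengths of a string, a list, and a tuple
print(min(L), max(L), sum(L))      # smallest element, largest element, and total
print(L + [4, 5])                  # concatenation creates a new list
print(t * 2)                       # repetition
print(s[0], s[2], s[-1], s[-2])    # indexing from the start and from the end
```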

Above we saw that we can access a single element of a sequence using indexing. If we want a subset of a sequence, we can use the *slicing* syntax. A slice consists of elements of the original sequence, and it is itself a sequence as well. A simple slice is a range of elements:

In [123]:
s="abcdefg"
s[1:4]

'bcd'

Note that Python slices, like ranges, exclude the end index. The generic form of a slice is
`sequence[first:last:step]`. If any of the three parameters is left out, it is set to its default value: first=0, last=len(sequence), step=1. So, for instance, `"abcde"[1:]=="bcde"`. The step parameter selects elements that are step distance apart from each other. For example:

In [124]:
print([0,1,2,3,4,5,6,7,8,9][::3])

[0, 3, 6, 9]


### Modifying lists
We can assign values to elements of a list by indexing or by slicing. An example:

In [125]:
L=[11,13,20,32]
L[1]=2          # Changes the second element
print(L)

[11, 2, 20, 32]


Or we can assign a list to a slice:

In [126]:
L[1:3]=[4]
print(L)

[11, 4, 32]


We can also modify a list by using *mutating methods* of the list class, namely the methods `append`, `extend`, `insert`, `remove`, `pop`, `reverse`, and `sort`. Try Python's help functionality to find more about these methods: e.g. `help(list.extend)` or `help(list)`.

<div class="alert alert-warning">Note that we cannot perform these modifications on tuples or strings since they are *immutable*</div>
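
A brief sketch of the mutating list methods mentioned above, with the state of the list after each step shown in the comments. The last few lines show that a similar modification of a tuple raises an error; the `try`/`except` construct used for that is introduced later.

```python
L = [3, 1]
L.append(4)          # add one element to the end: [3, 1, 4]
L.extend([1, 5])     # add all elements of another list: [3, 1, 4, 1, 5]
L.insert(0, 9)       # insert 9 at index 0: [9, 3, 1, 4, 1, 5]
L.remove(1)          # remove the first occurrence of the value 1: [9, 3, 4, 1, 5]
last = L.pop()       # remove and return the last element: 5
L.sort()             # sort in place: [1, 3, 4, 9]
L.reverse()          # reverse in place: [9, 4, 3, 1]
print(L, last)

t = (1, 2)
try:
    t[0] = 5         # tuples are immutable, so this raises TypeError
except TypeError as err:
    print("Cannot modify a tuple:", err)
```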

### Generating sequences
Trivial lists can be tedious to write: `[0,1,2,3,4,5,6]`. The function range creates numeric ranges automatically. The above sequence can be generated with the function call range(7). Note again that the end value is not included in the sequence. An example of using the range function:

In [127]:
L=range(3)
for i in L:
    print(i)
# Note that L is not a list!
print(L)

0
1
2
range(0, 3)


So `L` is not a list, but it is a sequence. We can for instance access its last element with `L[-1]`. If really needed, then it can be converted to a list with the `list` constructor:

In [128]:
L=range(10)
print(list(L))

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]


<div class="alert alert-warning">Note that using a range consumes less memory than a corresponding list. This is because in a list all the elements are stored in the memory, whereas the range generates the requested elements only when needed. For example, when the for loop asks for the next element from the range at each iteration, only a single element from the range exists in memory at the same time.</div>
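
The difference in memory use can be observed roughly with `sys.getsizeof`, as in the sketch below. The function reports only the size of the container object itself, and the exact numbers depend on the Python version and platform, so no particular output is assumed here.

```python
import sys

r = range(1000000)
L = list(r)
# Size in bytes of the range object and of the list object
print(sys.getsizeof(r), sys.getsizeof(L))
print(r[-1], len(r))     # a range still supports indexing and len
```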

The range function works in similar fashion as slices. So, for instance the step of the sequence can be given:

In [129]:
print(list(range(0, 7, 2)))

[0, 2, 4, 6]


## Dictionaries
A *dictionary* is a dynamic container that maps *keys* to *values*. Instead of using integers to access the elements of the container, the dictionary uses keys to access the stored values. (Since Python 3.7 a dictionary preserves the insertion order of its keys, but its elements are still not accessed by position.) The dictionary can be created by listing the comma separated key-value pairs in braces. Keys and values are separated by a colon. A tuple (key,value) is called an *item* of the dictionary.

Let's demonstrate the dictionary creation and usage:

In [130]:
d={"key1":"value1", "key2":"value2"}
print(d["key1"])
print(d["key2"])

value1
value2


Keys can have different types even in the same container. So the following code is legal:
`d={1:"a", "z":1}`. The only restriction is that the keys must be *hashable*. That is, there has to be a mapping from keys to integers. Lists are *not* hashable, but tuples are!
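
The hashability requirement can be checked with a small sketch. The dictionary contents are chosen only for illustration, and the `try`/`except` construct is used here ahead of its introduction.

```python
d = {}
d[(1, 2)] = "a tuple key works"      # a tuple of hashable elements is hashable
try:
    d[[1, 2]] = "a list key fails"   # lists are mutable and not hashable
except TypeError as err:
    print("Error:", err)
print(d)
```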

There are alternative syntaxes for dictionary creation:

In [131]:
dict([("key1", "value1"), ("key2", "value2"), ("key3", "value3")]) # list of items
dict(key1="value1", key2="value2", key3="value3");

If a key is not found in a dictionary, the indexing `d[key]` results in an error (*exception* `KeyError`). But an assignment with a non-existing key causes the key to be added to the dictionary, associated with the corresponding value:

In [132]:
d={}
d[2]="value"
print(d)

{2: 'value'}


In [135]:
# d[1]   # This would cause an error

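A dictionary object has several non-mutating methods (the Python 2 methods `has_key`, `iteritems`, `iterkeys`, and `itervalues` do not exist in Python 3; use `k in d`, `d.items()`, `d.keys()`, and `d.values()` instead):
```
d.copy()
d.items()
d.keys()
d.values()
d.get(k[,x])
```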

Some methods change the dictionary:
```
d.clear()
d.update(d1)
d.setdefault(k[,x])
d.pop(k[,x])
d.popitem()
```
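
A few of these methods in action, as a minimal sketch with illustrative keys and values:

```python
d = {"a": 1, "b": 2}
print(d.get("a"), d.get("z"), d.get("z", 0))   # get never raises KeyError
print(list(d.keys()), list(d.values()), list(d.items()))

d.setdefault("c", 3)      # adds the key only if it is missing
d.setdefault("a", 100)    # "a" exists already, so its value stays 1
d.update({"b": 20, "e": 5})
removed = d.pop("e")      # removes the key and returns its value
print(d, removed)
```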

Try out some of these in the below cell. You can find more info with `help(dict)` or `help(dict.keys)`.

In [136]:
d=dict(a=1, b=2, c=3, d=4, e=5)
d.values()

dict_values([1, 2, 3, 4, 5])

## Sets
A *set* is a dynamic, unordered container. It works a bit like a dictionary, but only the keys are stored. And each key can be stored only once. The set requires that the keys to be stored are hashable. Below are a few ways of creating a set:

In [137]:
s=set([1,2,2,'a'])
print(s)
s=set()  # empty set
print(s)
s.add(7) # add one element
print(s)

{'a', 1, 2}
set()
{7}


A more useful example:

In [138]:
s="mississippi"
print("There are %i distinct characters in %s" % (len(set(s)), s))

There are 4 distinct characters in mississippi


The `set` provides the following non-mutating methods:

In [139]:
s=set()
s1=set()
s.copy()
s.issubset(s1)
s.issuperset(s1)
s.union(s1)
s.intersection(s1)
s.difference(s1)
s.symmetric_difference(s1);

The last four operations can be tedious to write when creating a more complicated expression. The alternative is to use the corresponding operator forms: `|`, `&`, `-`, and `^`. An example of these:

In [140]:
s=set([1,2,7])
t=set([2,8,9])
print("Union:", s|t)
print("Intersection:", s&t)
print("Difference:", s-t)
print("Symmetric difference", s^t)

Union: {1, 2, 7, 8, 9}
Intersection: {2}
Difference: {1, 7}
Symmetric difference {1, 7, 8, 9}


There are also the following mutating methods:
```
s.add(x)
s.clear()
s.discard(x)
s.pop()
s.remove(x)
```

And the set operators `|`, `&`, `-`, and `^` have the corresponding mutating, augmented assignment forms: `|=`, `&=`, `-=`, and `^=`.
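
The sketch below goes through some of the mutating methods and the augmented assignment forms. The set literal notation `{1, 2, 7}` used here is equivalent to `set([1, 2, 7])`; the values are chosen only for illustration.

```python
s = {1, 2, 7}
s.add(8)
s.discard(100)       # discard does nothing if the element is missing
try:
    s.remove(100)    # remove raises KeyError if the element is missing
except KeyError:
    print("100 is not in the set")
s |= {2, 9}          # in-place union: {1, 2, 7, 8, 9}
s &= {1, 2, 8, 9}    # in-place intersection: {1, 2, 8, 9}
s -= {1}             # in-place difference: {2, 8, 9}
s ^= {2, 3}          # in-place symmetric difference: {3, 8, 9}
print(s)
```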

## Miscellaneous stuff

To find out whether a container includes an element, the `in` operator can be used. The operator returns a truth value. Some examples of the usage:

In [141]:
print(1 in [1,2])
d=dict(a=1, b=3)
print("b" in d)
s=set()
print(1 in s)
print("x" in "text")

True
True
False
True


As a special case, for strings the `in` operator can be used to check whether a string is part of another string:

In [142]:
print("issi" in "mississippi")
print("issp" in "mississippi")

True
False


Elements of a container can be unpacked into variables:

In [143]:
first, second = [4,5]
a,b,c = "bye"
print(c)
d=dict(a=1, b=3)
key1, key2 = d
print(key1, key2)

e
a b


In membership testing and unpacking only the keys of a dictionary are used, unless either values or items (like below) are explicitly asked.

In [144]:
for key, value in d.items():
    print("For key '%s' value %i was stored" % (key,value))

For key 'a' value 1 was stored
For key 'b' value 3 was stored


To remove the binding of a variable, use the `del` statement. For example:

In [147]:
s="hello"
del s
# print(s)    # This would cause an error

To delete an item from a container, the `del` statement can again be applied:

In [148]:
L=[13,23,40,100]
del L[1]
print(L)

[13, 40, 100]


In similar fashion `del` can be used to delete a slice. Later we will see that `del` can delete attributes from an object.
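
For instance, deleting a slice from a list and a key from a dictionary could look like this (the values are chosen only for illustration):

```python
L = [13, 23, 40, 100, 7]
del L[1:3]           # removes the elements at indices 1 and 2
print(L)

d = {"a": 1, "b": 3}
del d["a"]           # del also removes a key and its value from a dictionary
print(d)
```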

# Compact way of creating data structures
We can now easily create complicated data structures using for loops:

In [149]:
L=[]
for i in range(10):
    L.append(i**2)
print(L)

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]


Because this kind of pattern is often used, Python offers a short-hand for this. A *list comprehension* is an expression that allows creating complicated lists on one line. The notation is familiar from mathematics:

$\{a^3 : a \in \{1,2, \ldots, 10\}\}$

The same written in Python as a list comprehension:

In [150]:
L=[ a**3 for a in range(1,11)]
print(L)

[1, 8, 27, 64, 125, 216, 343, 512, 729, 1000]


The generic form of a list comprehension is:
`[ expression for element in iterable lc-clauses ]`.
Let's break this syntax into pieces. The iterable can be any sequence (or something more general). The lc-clauses consist of zero or more of the following clauses:
* for elem in iterable
* if expression

A more complicated example. How would you describe these numbers?

In [151]:
L=[ 100*a + 10*b +c for a in range(0,10)
                    for b in range(0,10)
                    for c in range(0,10) 
                    if a <= b <= c]
print(L)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17, 18, 19, 22, 23, 24, 25, 26, 27, 28, 29, 33, 34, 35, 36, 37, 38, 39, 44, 45, 46, 47, 48, 49, 55, 56, 57, 58, 59, 66, 67, 68, 69, 77, 78, 79, 88, 89, 99, 111, 112, 113, 114, 115, 116, 117, 118, 119, 122, 123, 124, 125, 126, 127, 128, 129, 133, 134, 135, 136, 137, 138, 139, 144, 145, 146, 147, 148, 149, 155, 156, 157, 158, 159, 166, 167, 168, 169, 177, 178, 179, 188, 189, 199, 222, 223, 224, 225, 226, 227, 228, 229, 233, 234, 235, 236, 237, 238, 239, 244, 245, 246, 247, 248, 249, 255, 256, 257, 258, 259, 266, 267, 268, 269, 277, 278, 279, 288, 289, 299, 333, 334, 335, 336, 337, 338, 339, 344, 345, 346, 347, 348, 349, 355, 356, 357, 358, 359, 366, 367, 368, 369, 377, 378, 379, 388, 389, 399, 444, 445, 446, 447, 448, 449, 455, 456, 457, 458, 459, 466, 467, 468, 469, 477, 478, 479, 488, 489, 499, 555, 556, 557, 558, 559, 566, 567, 568, 569, 577, 578, 579, 588, 589, 599, 666, 667, 668, 669, 677, 678, 679, 688, 689, 699, 777, 778, 779,

If one only needs to iterate through the list once, it is more memory efficient to use a *generator expression* instead. The only thing that changes syntactically is that the surrounding brackets are replaced by parentheses:

In [152]:
G = ( 100*a + 10*b + c for a in range(0,10)
                       for b in range(0,10)
                       for c in range(0,10) 
                       if a <= b <= c )
print(sum(G))   # This iterates through all the elements from the generator
print(sum(G))   # It doesn't start from the beginning, so all elements are already consumed

60885
0


<div class="alert alert-warning">Note above that one can only iterate through the generator once.</div>

Similarly a *dictionary comprehension* creates a dictionary:

In [153]:
d={ k : k**2 for k in range(10)}
print(d)

{0: 0, 1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 6: 36, 7: 49, 8: 64, 9: 81}


And a *set comprehension* creates a set:

In [154]:
s={ i*j for i in range(10) for j in range(10)}
print(s)

{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 14, 15, 16, 18, 20, 21, 24, 25, 27, 28, 30, 32, 35, 36, 40, 42, 45, 48, 49, 54, 56, 63, 64, 72, 81}


# Processing sequences
In this section we will go through some useful tools, which may be familiar to you from functional programming languages such as *Lisp* or *Haskell*. These tools rely on functions being first-class objects in Python, that is, you can
* pass a function as a parameter to another function
* return a function as a return value from some function
* store a function in a data structure or a variable
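
The three properties listed above can be illustrated with a minimal sketch. All the function and variable names here (`apply_twice`, `make_adder`, `operations`, and so on) are hypothetical.

```python
def double(x):
    return 2*x

def apply_twice(f, x):
    "Takes a function f as a parameter and applies it twice"
    return f(f(x))

def make_adder(n):
    "Returns a new function that adds n to its argument"
    def adder(x):
        return x + n
    return adder

add5 = make_adder(5)                            # a function stored in a variable
operations = {"double": double, "add5": add5}   # functions stored in a dictionary
print(apply_twice(double, 3))
print(operations["add5"](10))
```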

We will talk about `map`, `filter`, and `reduce` functions. We will also cover how to create functions with no name using the *lambda* expression.

## Map and lambda functions
The `map` function gets a function and a list as parameters, and it produces a new sequence whose
elements are the elements of the original list transformed by the parameter function. For this to work, the parameter function must take exactly one parameter and return a value. An example will clarify this concept:

In [155]:
def double(x):
    return 2*x
L=[12,4,-1]
print(map(double, L))

<map object at 0x7f06684055f8>


The map function returns a map object for efficiency reasons: the elements are computed only when they are asked for. However, since we only want to print the contents, we first convert it to a list and then print it:

In [156]:
print(list(map(double,L)))

[24, 8, -2]


When one reads numeric data from a file or from the internet, the numbers are usually in string form. Before they can be used in computations, they must first be converted to ints or floats.
A simple example will showcase this.

In [157]:
s="12 43 64 6"
L=s.split()        # The split method of the string class, breaks the string at whitespaces
                   # to a list of strings.
print(L)
print(sum(map(int, L)))  # The int function converts a string to an integer

['12', '43', '64', '6']
125


Sometimes it feels unnecessary to write a function if you are only going to use it in one `map` function call. For example the function

In [158]:
def add_double_and_square(x):
    return 2*x+x**2 

It is not likely that you will need it elsewhere in your program. The solution is to use an *expression* called *lambda* to define a function with no name. Because it is an expression, we can put it, for instance, in an argument list of a function call. The lambda expression has the form `lambda param1,param2, ... : expression`, where after the lambda keyword you list the parameters of the function, and after the colon is the expression that uses the parameters to compute the return value of the function. Let's replace the above `add_double_and_square` function with a lambda function and apply it to a list using the `map` function.

In [159]:
L=[2,3,5]
print(list(map(lambda x : 2*x+x**2, L)))

[8, 15, 35]
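
A lambda expression can also take several parameters. As a small sketch, `map` accepts more than one sequence, in which case the parameter function gets one element from each sequence at a time. The list names `A` and `B` are just placeholders.

```python
A = [1, 2, 3]
B = [10, 20, 30]
sums = list(map(lambda x, y: x + y, A, B))   # pairs up 1 and 10, 2 and 20, 3 and 30
print(sums)
```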


## Filter function


The `filter` function takes a function and a list as parameters. As with `map`, the parameter function must take exactly one parameter, but now it must return a truth value (True or False). The `filter` function then produces a new sequence with only those elements from the original list for which the parameter function returns True. The elements for which the parameter function returns False are filtered out. Like `map`, it returns its result as a separate object, so we convert it to a list before printing. An example will demonstrate the `filter` function:

In [160]:
def is_odd(x):
    """Returns True if x is odd and False if x is even"""
    return x % 2 == 1         # The % operator returns the remainder of integer division
L=[1, 4, 5, 9, 10]
print(list(filter(is_odd, L)))

[1, 5, 9]


The even elements of the list were filtered out.
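
The same result can be obtained with a lambda expression, or with a list comprehension that has an `if` part. The sketch below compares the two; the variable names are chosen only for this illustration.

```python
L = [1, 4, 5, 9, 10]
odd_with_filter = list(filter(lambda x: x % 2 == 1, L))
odd_with_comprehension = [x for x in L if x % 2 == 1]
print(odd_with_filter)
print(odd_with_comprehension)
```

Which form to use is mostly a matter of taste; the comprehension is often easier to read when the condition is short.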

## The reduce function
The `sum` function, which returns the sum of a numeric list, can be thought of as reducing a list to a single element. It does this reduction by repeatedly applying the `+` operator until all the list elements are consumed. For instance, the list [1,2,3,4] is reduced by the expression `(((0+1)+2)+3)+4` of repeated applications of the `+` operator. We could implement this with the following function:

In [161]:
def sumreduce(L):
    s=0
    for x in L:
        s = s+x
    return s

Because this is a common pattern, the designers of Python included a function called `reduce` to simplify the reduction of a sequence. You give the operator you want to use as a parameter to reduce (addition in the above example). And you also give a starting value of the computation (starting value 0 was used above). We can now get rid of the separate function sumreduce by using the reduce function:

In [162]:
L=[1,2,3,4]
from functools import reduce   # import the reduce function from the functools module
reduce(lambda x,y:x+y, L, 0)

10

If we wanted to get a product of all numbers in a sequence, we would use

In [163]:
reduce(lambda x,y:x*y, L, 1)

24

This corresponds to the sequence `(((1*1)*2)*3)*4` of applications of the operator `*`.

<div class="alert alert-warning">Note that the starting value is needed if we want to be able to reduce empty lists as well. If the starting value is omitted, the first element of the sequence is used in its place, and reducing an empty sequence raises a `TypeError`.</div>
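
The effect of the starting value can be checked with short lists. This is a small sketch; the variable names are made up for the illustration.

```python
from functools import reduce

empty_product = reduce(lambda x, y: x*y, [], 1)     # empty list: the starting value is returned
single_product = reduce(lambda x, y: x*y, [7], 1)   # one element: computes 1*7
no_start = reduce(lambda x, y: x*y, [2, 3, 4])      # no starting value: computes (2*3)*4
print(empty_product, single_product, no_start)

try:
    reduce(lambda x, y: x*y, [])                    # empty list and no starting value
    error_raised = False
except TypeError as e:
    error_raised = True
    print("TypeError:", e)
```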

# String handling
We have already seen how to index, slice, concatenate, and repeat strings. Let's now look into what methods the `str` class offers. In Python strings are immutable. This means that for instance the following assignment is not legal:

In [164]:
s="text"
# s[0] = "a"    # This is not legal in Python

Because of the immutability of the strings, the string methods work by returning a value; they don't have any side-effects. In the rest of this section we briefly describe several of these methods. The methods are here divided into five groups.

## Classification of strings
All the following methods will take no parameters and return a truth value. An empty string will always result in `False`.
* `s.isalnum()` True if all characters are letters or digits
* `s.isalpha()` True if all characters are letters
* `s.isdigit()` True if all characters are digits
* `s.islower()` True if contains letters, and all are lowercase
* `s.isupper()` True if contains letters, and all are uppercase
* `s.isspace()` True if all characters are whitespace
* `s.istitle()` True if uppercase in the beginning of word, elsewhere lowercase
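
A quick way to get a feeling for these methods is to try them on a few strings. The test strings below are arbitrary, and the dictionary `results` exists only to collect the answers.

```python
results = {}
for s in ["abc", "abc123", "123", "Hello World", "  ", ""]:
    results[s] = (s.isalpha(), s.isalnum(), s.isdigit(), s.isspace(), s.istitle())
    print(repr(s), results[s])
```

Each printed tuple holds the values of `isalpha`, `isalnum`, `isdigit`, `isspace`, and `istitle`, in that order.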

## String transformations
The following methods do conversions between lower and uppercase characters in the string. All these methods return a new string.
* `s.lower()`      Change all letters to lowercase
* `s.upper()`      Change all letters to uppercase
* `s.capitalize()` Change the first character to uppercase and the rest to lowercase
* `s.title()` Change to titlecase
* `s.swapcase()` Change all uppercase letters to lowercase, and vice versa
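
A short sketch of these methods applied to one example string (the string itself is arbitrary). Note that the original string `s` stays unchanged, since every method returns a new string.

```python
s = "hello WORLD of python"
lowered = s.lower()
uppered = s.upper()
capitalized = s.capitalize()
titled = s.title()
swapped = s.swapcase()
for result in [lowered, uppered, capitalized, titled, swapped, s]:
    print(result)
```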







## Searching for substrings
All the following methods get the wanted substring as the
parameter, except the replace method, which also gets the
replacing string as a parameter.
* `s.count(substr)` Counts the number of occurrences of a substring
* `s.find(substr)` Finds index of the first occurrence of a substring, or -1
* `s.rfind(substr)` Finds index of the last occurrence of a substring, or -1
* `s.index(substr)` Like find, except ValueError is raised if not found
* `s.rindex(substr)` Like rfind, except ValueError is raised if not found
* `s.startswith(substr)` Returns True if string starts with a given substring
* `s.endswith(substr)` Returns True if string ends with a given substring
* `s.replace(substr, replacement)` Returns a string where occurrences of one string are replaced by another

Keep also in mind that the expression `"issi" in "mississippi"` returns a truth value of whether the first string occurs in the second string.
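
The search methods can be tried on the same word. This is only an illustrative sketch; the variable names are made up.

```python
s = "mississippi"
count_ss = s.count("ss")
first_ss = s.find("ss")
last_ss = s.rfind("ss")
missing = s.find("xyz")                  # not found, so find returns -1
replaced = s.replace("ss", "SS")
print(count_ss, first_ss, last_ss, missing, replaced)
print(s.startswith("miss"), s.endswith("ppi"), "issi" in s)

try:
    s.index("xyz")                       # index raises an exception instead of returning -1
    index_failed = False
except ValueError:
    index_failed = True
    print("substring not found")
```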








## Trimming and adjusting
* `s.strip(x)` Removes leading and trailing whitespace by default, or characters found in string x
* `s.lstrip(x)` Same as strip but only leading characters are removed
* `s.rstrip(x)` Same as strip but only trailing characters are removed
* `s.ljust(n)` Left justifies string inside a field of length n
* `s.rjust(n)` Right justifies string inside a field of length n
* `s.center(n)` Centers string inside a field of length n

An example of using the `center` method and string repetition:

In [165]:
L=[1,3,5,7,9,1,1]
print("-"*11)
for i in L:
    s="*"*i 
    print("|%s|" % s.center(9))
print("-"*11)

-----------
|    *    |
|   ***   |
|  *****  |
| ******* |
|*********|
|    *    |
|    *    |
-----------
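
The other trimming and adjusting methods work in the same spirit. In the sketch below the square brackets are printed only to make the spaces visible; the example strings are arbitrary.

```python
s = "  hello  "
stripped = s.strip()             # whitespace removed from both ends
left = s.lstrip()                # only leading whitespace removed
right = s.rstrip()               # only trailing whitespace removed
no_x = "xxhixx".strip("x")       # characters of the parameter string are removed
for piece in [stripped, left, right, no_x]:
    print("[%s]" % piece)
for piece in ["ab".ljust(6), "ab".rjust(6), "ab".center(6)]:
    print("[%s]" % piece)
```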


## Joining and splitting
The `join(seq)` method joins the strings of the sequence `seq`. The string itself is used as a delimiter. An example:

In [166]:
"--".join(["abc", "def", "ghi"])

'abc--def--ghi'

In [167]:
L=list(map(lambda x : " %s" % x, range(100)))
s=""
for x in L:
    s = s + x   # Don't ever do this, it creates a new string at every iteration
print(s)
print("".join(L))  # This is the correct way of building a string out of smaller strings

 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99


<div class="alert alert-warning">If you want to build a string out of smaller strings, then
first put the small strings into a list, and then use the `join` method to concatenate the pieces together. It is much more efficient this way. Use the `+` concatenation operator only if you have very few short strings that you want to concatenate.</div>

The method `split(sep=None)` divides a string into pieces that are separated by the string `sep`. The pieces are returned in a list. If `sep` is not given, the string is split at runs of whitespace. For instance, the call `'abc--def--ghi'.split("--")` will result in

In [168]:
'abc--def--ghi'.split("--")

['abc', 'def', 'ghi']
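
The difference between splitting at whitespace and splitting at an explicit separator shows up with repeated separators. A small sketch with arbitrary example strings; the optional second parameter of `split` limits the number of splits.

```python
line = "  12   43\t64 6\n"
fields = line.split()                  # runs of whitespace are treated as one separator
cells = "a,b,,c".split(",")            # an explicit separator keeps empty pieces
head_tail = "a,b,,c".split(",", 1)     # split at most once
print(fields)
print(cells)
print(head_tail)
```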

# Regular expressions

# Basic file processing

A file can be opened with the `open` function. The call `open(filename, mode="r")` will return a *file object*; in text mode its type is `io.TextIOWrapper`. This file object can be used to refer to a file on disk. For example, when we want to read from or write to a file, we can use the methods `read` and `write` of the file object. After the file object is no longer needed, a call to the `close` method should be made.

We can control what kind of operations we can perform on a file with the *mode* parameter of the `open` function. Different options include opening a file for reading or writing,
whether the file should exist already or be created with the
call to open, etc. Here's a list of the most common opening modes:

| Mode | Description |
| ---- | ----------- |
| `r`  | read-only mode, file must exist |
| `w`  | write-only mode, creates the file or overwrites an existing file |
| `a`  | write-only mode, creates the file if needed, writes always append to the end |
| `r+` | read/write mode, file must already exist |
| `w+` | read/write mode, creates the file or overwrites an existing file |
| `a+` | read/write mode, writes always append to the end |

At the end of the mode string either the letter `t` or `b` can be appended. These stand for text mode and binary mode. If this letter is not given, the file is opened in text mode by default.

For binary mode the contents of the file are not interpreted in any way, and the read and write methods handle bytes. (A byte consists of 8 bits and can be used to represent a number in the range 0 to 255.)

In the text mode two interpretations happen:
* On the Windows operating system the end of line in files is encoded by two characters. When the file is read, these two characters are converted to a single `'\n'` character. During writes to a file this conversion happens in the opposite direction.
* One character is encoded in the file as one or more bytes. This conversion happens automatically during read and write operations. One common encoding between bytes and characters is utf-8. In this encoding, the Finnish character `'ä'`, for example, is encoded as the following sequence of bytes:

In [169]:
"ä".encode("utf-8")

b'\xc3\xa4'

Above the two bytes were expressed as hexadecimals. In decimal notation they would be 195 and 164. (Both in the range from 0 to 255.) What is the utf-8 encoding of the letter `'a'`?

During this course we will only consider files containing text, so the default text mode is fine for us.
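
The conversion between characters and bytes can also be tried without any files, using the `encode` method of strings and the `decode` method of bytes. A minimal sketch, with variable names chosen only for this illustration:

```python
text = "ä"
data = text.encode("utf-8")      # characters -> bytes
values = list(data)              # the byte values as decimal integers
back = data.decode("utf-8")      # bytes -> characters
print(len(text), len(data))      # number of characters, number of bytes
print(values)
print(back == text)
```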

## Some common file object methods
* `read(size)` will read size characters/bytes as a string; without the parameter the rest of the file is read
* `write(string)` will write string/bytes to a file
* `readline()` will read a string up to and including the next newline character
* `readlines()` will return a list of all lines of a file
* `writelines(lines)` will write a list of lines to a file; newline characters are not added automatically
* `flush()` will try to make sure that the changes made to a file are written to disk immediately

In [170]:
f = open("basics.ipynb", "r") # Let's open this notebook file, 
                              # which is essentially a text file.
                              # So you can open it in a texteditor as well.
        
for i in range(5):            # And read the first five lines
    line = f.readline()
    print("Line %i: %s" % (i, line), end="")
f.close()

Line 0: {
Line 1:  "cells": [
Line 2:   {
Line 3:    "cell_type": "markdown",
Line 4:    "metadata": {},


It is easy to forget to close the file. One can use a *context manager* to solve this problem. A context manager is used with the `with` statement. After the indented block of the with statement exits, the file will be automatically closed.

In [171]:
with open("basics.ipynb", "r") as f:          # the file will be automatically closed,
                                              # when the with block exits
    for i in range(5):
        line = f.readline()
        print("Line %i: %s" % (i, line), end="")

Line 0: {
Line 1:  "cells": [
Line 2:   {
Line 3:    "cell_type": "markdown",
Line 4:    "metadata": {},
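
The same pattern works for writing. The following sketch tries out the `w`, `a`, and `r` modes and the binary mode described earlier. It assumes a throwaway file called `example.txt` (a made-up name) inside a temporary directory created with the standard `tempfile` module, so no real files are touched. The encoding is given explicitly here so that the result does not depend on the platform default.

```python
import os
import tempfile

# Work in a temporary directory that is removed when the block exits.
with tempfile.TemporaryDirectory() as tmpdir:
    filename = os.path.join(tmpdir, "example.txt")    # hypothetical file name

    with open(filename, "w", encoding="utf-8") as f:  # creates the file
        f.write("first line\n")
    with open(filename, "a", encoding="utf-8") as f:  # appends to the end
        f.write("päivää\n")
    with open(filename, "r", encoding="utf-8") as f:  # text mode gives characters
        text = f.read()
    with open(filename, "rb") as f:                   # binary mode gives bytes
        raw = f.read()

print(repr(text))
print(raw)
print(len(text), len(raw))   # there are more bytes than characters
```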


A file object is iterable. This means that we can iterate through the lines in the file using a for loop, like in the example below:

In [172]:
max_len = 0
with open("basics.ipynb", "r") as f:
    for line in f:    # iterates through all the lines in the file
        if len(line) > max_len:
            max_len = len(line)
print("The longest line in this file has length %i" % max_len)

The longest line in this file has length 32497


## Standard file objects
Python automatically has three file objects open, available through the `sys` module:
* `sys.stdin` for *standard input*
* `sys.stdout` for *standard output*
* `sys.stderr` for *standard error*

To read a line from a user (keyboard), you can call `sys.stdin.readline()`. To write a line to a user (screen), call `sys.stdout.write(line)`. The standard error is meant for error messages only, even though its output often goes to the same destination as standard output.

The print function uses the file `sys.stdout` and input function uses the file `sys.stdin`. An example of usage:

In [173]:
import sys
i=int(input("Give a positive integer: "))
if i >= 0:
    sys.stdout.write("You gave a positive integer.\n")
else:
    sys.stderr.write("You gave a negative integer.\n")

Give a positive integer: -1


You gave a negative integer.


These standard file objects are meant to be a basic input/output mechanism in textual form. The destinations of the file objects can be changed to point
somewhere else than the usual keyboard and screen. Very often these are redirected to some files. For example, it is usual to point the stderr to a file where all
error messages are logged.
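
Redirection can also be done from inside a program. The sketch below uses `redirect_stdout` from the standard `contextlib` module to send the output of `print` temporarily to an in-memory file, and the `file` parameter of `print` to write to standard error. The names `buffer` and `captured` are made up for this illustration.

```python
import io
import sys
from contextlib import redirect_stdout

buffer = io.StringIO()               # an in-memory text file
with redirect_stdout(buffer):        # sys.stdout refers to buffer inside this block
    print("This goes to the buffer")
print("This goes to the screen")
print("This goes to standard error", file=sys.stderr)

captured = buffer.getvalue()
print(repr(captured))
```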

# Objects and classes

# Exceptions

When an error occurs, what can we do?

* Print an error message
* Stop the execution of a program
* Indicate the error by returning a special value, like -1 or None
* Ignore the error
* ...

These solutions tend to combine the indication of a problem
and the reaction to the problem indication.
The behaviour of the program in error situations cannot be
changed, because it is fixed in the implementation of the function.
When an erroneous situation is noticed, it may not be clear
how to handle the situation.
Usually the user or the code that called the function knows
what to do.

Most modern computer languages have a system called
exception handling. This system separates the recognition of errors and the
handling of these situations. We can signal an error or anomalous situation by raising an
exception. Exceptions can be raised in Python with the `raise` statement:

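* `raise` instance
* `raise` exception class

In the second form, Python creates the instance by calling the exception class
without arguments. To attach parameters, such as an error message, use the first form and
pass them when creating the instance, for example `raise ValueError("negative value")`.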

The functions of the Python standard library raise exceptions
in error situations. Sometimes exceptions aren’t really errors. For example, when
an iterator runs out of elements, it will signal this by raising
the `StopIteration` exception.
Another exception class that does not necessarily signal an error is `Warning`.

The general form of the exception catching statement is the following:

```
try:
    # here are the statements that can cause exceptions
except (Exceptionname1, Exceptionname2, ...):
    # here we handle the exceptions
else:
    # this gets executed if try-block caused no exceptions
finally:
    # this is always executed, clean-up code
```

Usually, just the try and except parts are needed.

In [174]:
L=[1,2,3]
try:
    print(L[3])
except IndexError:
    print("Index does not exist")

Index does not exist
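
The `raise` statement and the `else` and `finally` parts can be seen working together in the following sketch. The function `reciprocal` and the list `log` are invented for this illustration. Several `except` parts can follow one `try`, each handling its own exception type.

```python
def reciprocal(x):
    """Hypothetical helper: returns 1/x, and rejects values that are not numbers."""
    if not isinstance(x, (int, float)):
        raise TypeError("expected a number, got %s" % type(x).__name__)
    return 1 / x

log = []
for value in [4, 0, "abc"]:
    try:
        r = reciprocal(value)
    except ZeroDivisionError:
        log.append("division by zero")
    except TypeError as e:
        log.append(str(e))                       # the message given in the raise statement
    else:
        log.append("result %s" % r)              # executed only if no exception was raised
    finally:
        log.append("done with %r" % (value,))    # executed in every case
print("\n".join(log))
```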


In [175]:
def compute_average(L):
    n=len(L)
    s=sum(L)
    return float(s)/n # error is noticed here !!!
mylist=[]
while True:
    try:
        x=float(input("Give a number (non-number quits): "))
        mylist.append(x)
    except ValueError:
        break
try:
    average=compute_average(mylist)
    print("Average is", average)
except ZeroDivisionError:
    # and the error is handled here
    if len(mylist) == 0:
        print("Tried to compute the average of empty list of numbers")
    else:
        print("Something strange happened")

Give a number (non-number quits): 12
Give a number (non-number quits): 
Average is 12.0


### Exception hierarchy

In Python exceptions are objects, like all values in Python.
These objects are instantiated from exception classes.
Exception classes naturally form a hierarchy:
* New exception classes can be made by inheriting from existing exception classes and extending them
* The root of this hierarchy is the class `BaseException`; the exceptions that a program normally catches derive from its subclass `Exception`
* Python defines several base classes to derive from, and several ready-to-use exception classes

![exception_hierarchy.svg](attachment:exception_hierarchy.svg)
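
As a sketch of how the hierarchy can be extended, the following defines two new exception classes and catches the more specific one through its base class. The class names `InputError` and `NegativeValueError` and the function `check` are hypothetical, made up for this example.

```python
class InputError(Exception):
    """Hypothetical base class for the errors of this sketch."""

class NegativeValueError(InputError):
    """Hypothetical, more specific error."""

def check(x):
    if x < 0:
        raise NegativeValueError("negative value: %s" % x)
    return x

try:
    check(-3)
except InputError as e:          # also catches NegativeValueError, a subclass of InputError
    caught = type(e).__name__
    print("Caught", caught, "with message:", e)

# The built-in exceptions are organised in the same way.
print(issubclass(IndexError, LookupError), issubclass(LookupError, Exception))
```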

### Too general exception specifications

The exception hierarchy allows catching multiple similar
exceptions by catching their common base class.
This feature has to be used carefully. An over-general exception
specification, like `except Exception:`, can hide the real
reason for an error. An example of this:

In [176]:
import sys
s=input("Give a number: ")
s=s.strip() # remove surrounding whitespace; input() already drops the trailing \n
try:
    x=int(s)
    sys.stdout.wr1te("You entered %d\n" % x)
except Exception:
    print("You didn’t enter a number")

Give a number: 1
You didn’t enter a number


In the previous example, if the user doesn’t enter a string that
represents an integer, a `ValueError` is raised by the int
function. Instead of catching the `ValueError`, we catch `Exception`, the base class of
nearly all built-in exceptions. This results in catching almost every possible exception.
But this will cause one typing error in the program to go undetected.
Change the exception specification from `Exception` to `ValueError` to see what this error is.

### What is the error handling policy in Python

Python uses a different approach to error checking than many
other common languages.
Instead of checking beforehand that all the inputs are of the
correct type and that the contents of input variables are sensible
for some operations, Python first tries the operations and then
checks whether they caused any exceptions.
This is partly what duck typing is about: a function works for
a set of inputs if all the operations in the function body make
sense for those inputs.
So, that’s why the parameters of functions aren’t specified to
be of any certain type.
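
A small sketch of this idea: the function below, whose name `total_length` is made up for the illustration, never checks the types of its inputs. It works for any elements that support `len`, and an exception appears only when an operation does not make sense.

```python
def total_length(items):
    """Works for any iterable whose elements support the len function."""
    return sum(len(x) for x in items)

from_strings = total_length(["ab", "cde"])            # strings
from_mixed = total_length([[1, 2], (3,), {4: 5}])     # a list, a tuple, and a dictionary
print(from_strings, from_mixed)

try:
    total_length([1, 2])                              # integers have no length
    failed = False
except TypeError as e:
    failed = True
    print("TypeError:", e)
```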

# Modules
