# Brief introduction to Python

## General information

Python and Jupyter Notebook can be installed using [Anaconda](https://www.anaconda.com/), which is probably the easiest way to follow this lecture. The goal of this first chapter is to give a *very quick introduction*, but practice is mandatory to get comfortable with Python objects and syntax. Practicing is possible with only a web browser, using the [W3Schools Python tutorials](https://www.w3schools.com/python).

In Python, there is usually one instruction per line. Variable assignment is done with `=`, and indentation is used to group instructions together under a loop or a condition block: there are no curly brackets (as in C++) or equivalent. Comments (*i.e.* uninterpreted text) start with `#`. External modules or functions can be imported in three different ways: `import module`, `import module as m` or `from module import this_function`.
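
As a minimal sketch of the three import forms, the standard `math` module is used here as an arbitrary example (it is not otherwise needed in this chapter):

```python
# 1. Import the whole module: names are accessed as module.name
import math
root1 = math.sqrt(16)

# 2. Import the module under a shorter alias
import math as m
root2 = m.sqrt(16)

# 3. Import a single function directly into the current namespace
from math import sqrt
root3 = sqrt(16)

print(root1, root2, root3)
```

The three calls run the same function; only the way its name is written differs.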

In the following examples, the result of each command is printed so that the reader can check that the computer is doing what is expected. The instruction `print(x)` prints the content of `x`. If several variables are printed, it is convenient to use the `print('x={} and y={}'.format(x, y))` syntax, which inserts `x` and `y` in place of the `{}` fields with one command, even if they have different types.
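
A short sketch of this formatting syntax, with two illustrative variables of different types. The f-string form shown at the end is an equivalent alternative (Python 3.6 or later) that is not used in the rest of this chapter:

```python
x = 3          # an integer
y = 'banana'   # a string

# Each {} is replaced by the corresponding argument of format()
message = 'x={} and y={}'.format(x, y)
print(message)

# Equivalent f-string form: the variables are written inside the braces
message_f = f'x={x} and y={y}'
print(message_f)
```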

##  Object types in python

### Numbers

There are three types of numbers: `int`, `float` and `complex`. The usual operations (`+`, `-`, `*`, `/`) are available. In addition, there are `a**b` (which means $a^b$), `a // b` and `a % b` (respectively the quotient and the remainder of the integer division, see the example below).

In [1]:
# Basic numbers and operations
a = 2
b = 3.14
print(a+b)
print(a**b)

5.140000000000001
8.815240927012887


In [2]:
# Complex numbers and power
a = 1j
a**2

(-1+0j)

In [3]:
# Integer division example (// and % operators)
a , b = 10, 4
quotient, remainder = a//b, a%b
print('{} = {}x{} + {}'.format(a, quotient, b, remainder))

10 = 2x4 + 2
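
As a complement, the built-in `divmod` returns both results of the integer division at once. The sketch below also checks the relation between the two results and shows the behaviour with a negative number, which may be surprising:

```python
a, b = 10, 4

# divmod(a, b) returns the pair (a // b, a % b)
quotient, remainder = divmod(a, b)
print(quotient, remainder)

# The two results always satisfy a == quotient*b + remainder
identity_holds = (a == quotient*b + remainder)
print(identity_holds)

# With a negative number, // rounds towards minus infinity
neg_quotient, neg_remainder = -7 // 2, -7 % 2
print(neg_quotient, neg_remainder)
```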


### Strings

Strings are used to manipulate words, sentences or even whole texts with specific methods. Strings are also sequences of characters, so indexing, slicing and looping work as for Python lists (see below), although a string cannot be modified in place. Common and useful string manipulations include counting the number of characters with `len(word)` or getting the list of words of a sentence with `sentence.split(' ')`. Many other methods exist; they are listed by typing `help(str)` in a Python terminal or a Jupyter notebook.

In [4]:
w1 = 'hello'
print(w1, len(w1), w1[3])

hello 5 l


In [5]:
# Summing two strings is possible (subtraction and division are not defined)
blank, w2 = ' ', 'world'
sentence = w1 + blank + w2
print(sentence)

hello world


In [6]:
# Multiplying a string by an integer is also possible
repetition = sentence*3
print(repetition)

hello worldhello worldhello world


In [7]:
# Get a list of words from a sentence (cf. below for list objects)
s = 'It is rainy today'
list_words = s.split(' ')
print(list_words)

['It', 'is', 'rainy', 'today']


In [8]:
# Looping over the words and get the number of letters
for w in s.split(' '):
    print(w, len(w))

It 2
is 2
rainy 5
today 5
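
One difference with lists is worth a short illustration: a string can be indexed and sliced, but not modified in place. A minimal sketch, with an arbitrary example word:

```python
word = 'hello'

# Indexing and slicing work as for lists
first_letter = word[0]
reversed_word = word[::-1]
print(first_letter, reversed_word)

# ... but item assignment is not allowed on a string
try:
    word[0] = 'j'
except TypeError:
    print('Impossible to modify a string in place')

# A modified copy must be built instead
new_word = 'j' + word[1:]
print(new_word)
```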


## Object collections in python

There are four types of collections, which share several methods but differ in various aspects:
  + list
  + dictionary
  + tuple
  + set

The most commonly used are lists and dictionaries. The specificity of the set is that it is unordered, while the specificity of the tuple is that it cannot be modified. The common operations are
+ number of items: `len(x)`
+ loop over items with `for element in x:`
+ check if an item is in the collection: `element in x`
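
The three common operations can be sketched on the four collection types at once. The contents below are arbitrary; note that for a dictionary, the loop and the membership test act on the keys:

```python
examples = {
    'list': [1, 2, 3],
    'tuple': (1, 2, 3),
    'set': {1, 2, 3},
    'dictionary': {1: 'one', 2: 'two', 3: 'three'},
}

summary = {}
for name, x in examples.items():
    n_items = len(x)       # number of items
    has_two = 2 in x       # membership test (on keys for a dictionary)
    total = 0
    for element in x:      # loop over items (over keys for a dictionary)
        total += element
    summary[name] = (n_items, has_two, total)
    print(name, n_items, has_two, total)
```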


### Lists

This is one of the most used collection objects in Python because it is the next-to-simplest level, after individual variables. A Python list is a *list of objects* with possibly different types. One can search, loop and count with a list. One can also add two lists or multiply a list by an integer, which performs a *concatenation* or a *duplication* (these points will be important for numpy arrays). Indexing is also a convenient way to access the information of interest: indices start at 0, so `my_list[i]` is the element of index $i$ (the $(i+1)^{\text{th}}$ element), and `my_list[i:j]` is the sub-list of indices $i$ to $j-1$. One can also take only one element every `n` with `my_list[i:j:n]` (more precisely, this takes the elements of index $i+p\times n$ with $p=0, 1, 2, ...$, as long as the index is below $j$). With this syntax, reversing the order of the list is easy: `reversed_list = my_list[::-1]`, where omitted bounds take default values (`0` and `len(my_list)` for a positive step; for a negative step, the slice runs from the last element down to the first).
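
The indexing and slicing rules can be checked on a small list. This is only a sketch with an arbitrary list of letters:

```python
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g']

third = letters[2]                # index 2 is the third element
sub_list = letters[1:4]           # indices 1, 2, 3 (index 4 is excluded)
every_other = letters[0:7:2]      # indices 0, 2, 4, 6
reversed_letters = letters[::-1]  # reversed copy of the list
last = letters[-1]                # negative indices count from the end

print(third)
print(sub_list)
print(every_other)
print(reversed_letters)
print(last)
```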

In [9]:
# Defining a list and access basic information
my_list1 = [1, 3, 4, 'banana']
print('Second element is {}'.format(my_list1[1]))
print('Number of elements: {}'.format(len(my_list1)))
print('Is \'banana\' in the list? {}'.format('banana' in my_list1))

Second element is 3
Number of elements: 4
Is 'banana' in the list? True


In [10]:
# Sum of two lists
my_list2 = ['string', 1+3j, [100, 1000]]
my_list = my_list1 + my_list2
print(my_list)

[1, 3, 4, 'banana', 'string', (1+3j), [100, 1000]]


In [11]:
# List multiplied by an integer
my_list = my_list*2
print(my_list)

[1, 3, 4, 'banana', 'string', (1+3j), [100, 1000], 1, 3, 4, 'banana', 'string', (1+3j), [100, 1000]]


In [12]:
# Loop over the elements of index 6 down to 1 (reversed order, index 0 excluded) and print their type.
for element in my_list[6:0:-1]:
    print('{} is {}'.format(element, type(element)))

[100, 1000] is <class 'list'>
(1+3j) is <class 'complex'>
string is <class 'str'>
banana is <class 'str'>
4 is <class 'int'>
3 is <class 'int'>


### Sets and tuples

Tuples and sets are variations of the Python list. Tuples are ordered but cannot be modified (no assignment, no addition of elements), while sets are not ordered but can be modified, and store each element only once. In this context, order means indexing (so the `x[i:j:n]` syntax, among others). Searching or looping over elements works in the same way as for lists.

In [13]:
# Tuple
t = (1, 3, 7)
print(t)

# Access the third element
print(t[2])

# Try to modify the second element - using the 'try - except' syntax
try: 
    t[1] = 'hello'
except TypeError:
    print('Impossible to change the value of a tuple')

(1, 3, 7)
7
Impossible to change the value of a tuple


Sets can be modified with methods like `s.add(x)` or `s.update([x, y])`.

In [14]:
# Set
s = {'apple', 'banana', 'orange'}
print(s)

# Add one element
s.add('pineapple')
print(s)

# Add a list
s.update(['pear', 'prune'])
print(s)

{'apple', 'banana', 'orange'}
{'apple', 'pineapple', 'banana', 'orange'}
{'apple', 'pineapple', 'pear', 'prune', 'banana', 'orange'}


In [15]:
# Try to access the second element - using the 'try - except' syntax
try: 
    print(s[1])
except TypeError:
    print('Impossible to access element via indexing')

Impossible to access element via indexing


### Dictionaries

Other types are important to manipulate and organize data. The most common one is the dictionary, which works with (`key`, `value`) pairs. The `key` must be a non-modifiable (more precisely, hashable) object: in practice a string or an integer, possibly a tuple, but never a list. This is a very powerful concept to store different types of information in the same object. One can easily loop, search, modify the value of a given key, or add a new key.
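
The constraint on keys can be checked directly. In this sketch, the dictionary name and its contents are purely illustrative:

```python
# Strings, numbers and tuples are valid keys
measurements = {'run1': 1.2, 2: 'two', (0, 1): 'a tuple key'}
print(measurements[(0, 1)])

# A list is modifiable, so it is rejected as a key
try:
    measurements[[0, 1]] = 'a list key'
except TypeError:
    print('A list cannot be used as a dictionary key')
```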

In [16]:
# Dictionary
person = {'name': 'Charles', 'age': 78, 'size': 173, 'gender': 'M'}
print(person)

{'name': 'Charles', 'age': 78, 'size': 173, 'gender': 'M'}


In [17]:
# Accessing value using the key
template = '{} ({}) is {} years old and is {} cm'
print(template.format(person['name'], person['gender'], person['age'], person['size']))

Charles (M) is 78 years old and is 173 cm


In [18]:
# Adding a key and its value
person['eyes'] = 'blue'
print(person)

{'name': 'Charles', 'age': 78, 'size': 173, 'gender': 'M', 'eyes': 'blue'}


In [19]:
# Test if a key is present
print('name' in person)
print('brand' in person)

True
False


## Loops in python

Loops are at the core of programming, especially for data-analysis-oriented tasks. There are two ways of repeating an instruction several times: the `for` loop and the `while` loop. Several instructions are common to both loops, such as `continue` (skip the remaining instructions of the current iteration and move on to the next one) or `break` (stop the loop), but the use cases of the two loops are different.
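
A minimal sketch of `continue` and `break` inside a `for` loop (the conditions are arbitrary and only serve the illustration):

```python
kept = []
for i in range(0, 10):
    if i % 2 == 1:
        continue   # odd number: skip the rest of the body, go to the next i
    if i > 6:
        break      # stop the loop entirely
    kept.append(i)
print(kept)
```

Odd values are skipped by `continue`, and the loop stops for good at `i = 8`, so only `0, 2, 4, 6` are kept.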

### For loops

For data analysis, these are probably the most used ones. But as we will see in the introduction to numpy, `for` loops should be avoided for heavy computations in Python: they are relevant for small data samples and computations (~1000 elements). We'll come back to this point in the lecture. A few examples are given below.

In [20]:
# Compute sum(i^2) for i from 0 to 9
x = 0
for i in range(0, 10):
    x += i**2
print(x)

285


In [21]:
# Loop over fruit via a set and print only ones with a 'p'
for fruit in s:
    if 'p' in fruit:
        print(fruit)

apple
pineapple
pear
prune


There are several ways to loop over a dictionary, depending on how we want to access the information: by keys, by values, or by both. An example of each is given below.

In [22]:
# Loop over the keys of a dictionary and access the value of each
for properties in person:
    value = person[properties]
    print('{}: {}'.format(properties, value))

name: Charles
age: 78
size: 173
gender: M
eyes: blue


In [23]:
# Loop over dictionary values only
for v in person.values():
    print(v)

Charles
78
173
M
blue


In [24]:
# Loop over both keys and values directly
for key, value in person.items():
    print('{}: {}'.format(key, value))

name: Charles
age: 78
size: 173
gender: M
eyes: blue


### While loops

While loops are a bit less used in practice but they are quickly described for completeness. The idea is to repeat a given instruction as long as a condition holds.

In [25]:
# Cast (i.e. change the type of) the set s into a list
my_list = list(s)

# Remove item one by one until there are no items left.
while len(my_list)>0:
    my_list.pop()
    print(my_list)

['apple', 'pineapple', 'pear', 'prune', 'banana']
['apple', 'pineapple', 'pear', 'prune']
['apple', 'pineapple', 'pear']
['apple', 'pineapple']
['apple']
[]


## A few Python syntax tips

### Comprehension

In [26]:
# List
list_squares = [i**2 for i in range(1, 10)]
print(list_squares)

# Dictionary
dict_squares = {i:i**2 for i in list_squares[0:5]}
print(dict_squares)

[1, 4, 9, 16, 25, 36, 49, 64, 81]
{1: 1, 4: 16, 9: 81, 16: 256, 25: 625}


In [27]:
# List comprehension with a condition (e.g. keep only even numbers)
list_even = [i for i in range(0, 10) if i%2==0]
print(list_even)

[0, 2, 4, 6, 8]


In [28]:
# Comprehension with nested loops
sum_integers = [i*10+j for i in range(0,5) for j in range(0, 5)]
print(sum_integers)

[0, 1, 2, 3, 4, 10, 11, 12, 13, 14, 20, 21, 22, 23, 24, 30, 31, 32, 33, 34, 40, 41, 42, 43, 44]
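
For reference, a comprehension with nested loops reads from left to right like the equivalent explicit loops. The sketch below rebuilds the same list with two `for` loops (the name `sum_integers_loop` is introduced here for the comparison):

```python
# Same result as [i*10+j for i in range(0, 5) for j in range(0, 5)]
sum_integers_loop = []
for i in range(0, 5):          # first 'for' of the comprehension: outer loop
    for j in range(0, 5):      # second 'for' of the comprehension: inner loop
        sum_integers_loop.append(i*10 + j)

# Compare with the comprehension form
same = sum_integers_loop == [i*10+j for i in range(0, 5) for j in range(0, 5)]
print(same)
```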


### Looping with `enumerate` and `zip`

The built-in function `enumerate` returns a counter together with each element of the loop. This is useful if you need to know the index of the current iteration. This can be done without `enumerate`, but two extra lines are needed (initialisation of the counter, and incrementation).

In [29]:
# Position of each word in a sentence
sentence = 'I would like to analyse this sentence in term of word position'
words = sentence.split(' ')
for i, w in enumerate(words):
    print(w.ljust(10) + ':  ' + str(i))

I         :  0
would     :  1
like      :  2
to        :  3
analyse   :  4
this      :  5
sentence  :  6
in        :  7
term      :  8
of        :  9
word      :  10
position  :  11
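
For comparison, here is a sketch of the same kind of loop written with a manual counter, next to the `enumerate` version (a shorter sentence is used to keep the example compact):

```python
words = 'I would like to analyse this sentence'.split(' ')

# Manual counter: one line to initialise, one line to increment
positions_manual = []
i = 0
for w in words:
    positions_manual.append((i, w))
    i += 1

# Same result with enumerate, which yields (counter, element) pairs
positions_enumerate = list(enumerate(words))
print(positions_manual == positions_enumerate)
```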


The `zip(list1, list2)` syntax forms pairs using the elements of each list at *the same position*. This is quite convenient to associate objects which are stored in different collections in a very quick and readable way. If `list1` and `list2` don't have the same size, the minimum of the two lengths is used. Finally, `zip()` can take more than two collections; each iteration then returns a group of elements whose size is the number of collections.

In [30]:
# Associate fruits and colors
fruits = ['banana', 'orange', 'pineapple', 'pear', 'prune']
colors = ['yellow', 'orange', 'brown', 'green', 'purple']
for f, c in zip(fruits, colors):
    print('{} is {}'.format(f, c))

banana is yellow
orange is orange
pineapple is brown
pear is green
prune is purple


In [31]:
# Using zip() with three lists
l1, l2, l3 = range(0, 10), range(0, 100, 10), range(0, 1000, 100)
for i, j, k in zip(l1, l2, l3):
    print(i, j, k, i+j+k)

0 0 0 0
1 10 100 111
2 20 200 222
3 30 300 333
4 40 400 444
5 50 500 555
6 60 600 666
7 70 700 777
8 80 800 888
9 90 900 999
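
The behaviour of `zip()` with lists of different lengths can be sketched as follows (the shortened `colors` list is an assumption made for the illustration):

```python
fruits = ['banana', 'orange', 'pineapple']
colors = ['yellow', 'orange']   # one colour is missing on purpose

# zip() stops at the end of the shortest list
pairs = list(zip(fruits, colors))
print(pairs)

# zip() is also a quick way to build a dictionary from two lists
fruit_colors = dict(zip(fruits, colors))
print(fruit_colors)
```

Note that `'pineapple'` is dropped silently: no error is raised when the lengths differ.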


## Functions

Functions are defined as a set of instructions encapsulated into one object. This is particularly convenient when one has to run the same list of instructions several times. A good guideline to know when to write a function could be

> If you copy-paste the same piece of code more than two times, then make a function


### Definition

A function takes some arguments, performs some instructions and returns a result. The syntax to define and call a function is shown below. In Python, a function must be defined *before* being called (as opposed to C++, where a declaration is enough and the definition can come later). This makes the concept of *package* (or module) quite relevant: several functions can be wrapped up into a Python file, which can then be imported in the main code.

In [32]:
# Definition syntax
def function(argument):
    result = argument * 3
    return result

# Call syntax
function(2)

6
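
The "defined before being used" rule can be illustrated with a short sketch. The function name `triple_it` is hypothetical and chosen so that it does not exist before this cell is run:

```python
# Calling a function before its definition has been executed fails
try:
    triple_it(2)
except NameError:
    print('triple_it is not defined yet')

# Definition (same body as the function above)
def triple_it(argument):
    return argument * 3

# Once the definition has been executed, the call works
print(triple_it(2))
```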

The type of the argument is not fixed (this is a general feature of Python), so the same instruction will be interpreted differently depending on the type. The following example shows the different results of the function above for two argument types.

In [33]:
# Print the result for two types of arguments
print('function(10) = {}'.format(function(10)))
print('function(\'ouh\') = {}'.format(function('ouh')))

function(10) = 30
function('ouh') = ouhouhouh


In [34]:
def person_printer(p):
    template = '{} ({}) is {} years old and is {} cm'
    print(template.format(p['name'], p['gender'], p['age'], p['size']))
    return


def grow_old(p, n_years):
    
    # 1. copy the dictionary person (otherwise p *will* be modified)
    res = p.copy()
    
    # 2. Compute the new age and size
    new_age = res['age'] + n_years
    new_size = res['size'] - n_years*0.13
    
    # 3. Assign the new age/size to the result
    res['age'] = new_age
    res['size'] = new_size
    
    # 4. Return the result
    return res

In [35]:
# Print before growing old
person_printer(person)

# Growing old
old_guy = grow_old(person, 10)

# Print after growing old
person_printer(old_guy)

Charles (M) is 78 years old and is 173 cm
Charles (M) is 88 years old and is 171.7 cm
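
The reason for the copy in `grow_old` can be sketched independently of the function: assigning a dictionary to a new name does not duplicate it. The variable names below are illustrative:

```python
original = {'name': 'Charles', 'age': 78}

# Without a copy: both names point to the same dictionary
alias = original
alias['age'] = 88
age_after_alias = original['age']
print(age_after_alias)    # the original has changed too

# With a copy: the original is left untouched
original = {'name': 'Charles', 'age': 78}
duplicate = original.copy()
duplicate['age'] = 88
print(original['age'], duplicate['age'])
```

Note that `copy()` is a shallow copy: lists or dictionaries stored as values are still shared between the two objects. The standard `copy.deepcopy` function handles that case.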


### Docstring

A docstring offers the possibility to document your code in a proper way, which is quite useful for others (and for you, when you re-use the code after several years). Writing one is therefore good practice, even if it takes time. The documentation can be accessed using the command `help(function)` or by using the keyboard shortcut `Shift+Tab` in Jupyter Notebook (when the cursor is after the opening parenthesis of the function). The syntax to add a docstring is `'''My documentation'''` at the very beginning of the function.

In [36]:
def grow_old(p, n_years):
    '''
    Take a person dictionary and update the age and the size to make the person older.
    
    Parameters
    ----------
    p: dictionary 
        Person object as defined earlier in the code, with at least 'age' (year) 
        and 'size' (cm) keys, to get old.
    n_years: integer
        Number of years to be added to the age of the person.
    
    Return
    ------
    person: dictionary
       Person object as defined earlier in the code with age and size updated as
         age  -> age+n_years
         size -> size-n_years*0.13
    '''
    
    # 1. copy the dictionary person (otherwise p will be modified, which might be problematic)
    res = p.copy()
    
    # 2. Compute the new age and size
    new_age = res['age'] + n_years
    new_size = res['size'] - n_years*0.13
    
    # 3. Assign the new age/size to the result
    res['age'] = new_age
    res['size'] = new_size
    
    # 4. Return the result
    return res

In [37]:
help(grow_old)

Help on function grow_old in module __main__:

grow_old(p, n_years)
    Take a person dictionary and update the age and the size to make the person older.
    
    Parameters
    ----------
    p: dictionary 
        Person object as defined earlier in the code, with at least 'age' (year) 
        and 'size' (cm) keys, to get old.
    n_years: integer
        Number of years to be added to the age of the person.
    
    Return
    ------
    person: dictionary
       Person object as defined earlier in the code with age and size updated as
         age  -> age+n_years
         size -> size-n_years*0.13



There are several ways to organise the docstring and the example above is based on the numpy docstring style. Note that a docstring can also be added to a module (in practice, a Python file) to document the content, goal and usage of this module.
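
For short functions, a one-line docstring is often enough. The sketch below uses a hypothetical function `triple` and shows that the docstring is simply stored as an attribute of the function, which is what `help()` displays:

```python
def triple(argument):
    '''Return the argument multiplied by 3.'''
    return argument * 3

# The docstring is stored in the __doc__ attribute of the function
print(triple.__doc__)
```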

### Arbitrary number of arguments: `*args` and `**kwargs`

The examples above are relatively simple; in general, functions take several arguments. Sometimes it is even convenient to have a variable number of arguments, so that the function can easily evolve when the code grows. Python offers a nice way to define such a function thanks to the notions of *packing* and *unpacking*, which are described right below.

#### *Aside*: packing and unpacking

In short, this is the possibility to convert a collection into a series of objects (unpacking) or vice versa (packing). This way of writing collections makes code very concise and fast to develop, especially to call a function with several arguments in a clean way. This also allows one to define functions with an arbitrary number of arguments, as already mentioned. The following dummy function is used to illustrate the concept of packing/unpacking with both a list and a dictionary.

In [38]:
# Test function
def mean(a, b, c):
    return (a+b+c)/3.

It is possible to use a list of three numbers to specify the argument values of the `mean(a, b, c)` function, using the unpacking syntax for lists: `*list`. This is demonstrated below:

In [39]:
# Packing & unpacking with a list (or a tuple): *list
my_numbers = [10, 12, 15]
mean(*my_numbers)

12.333333333333334

It is also sometimes convenient to pass arguments by name (mostly to make the code more readable). Such arguments are called *keyword arguments* and can be packed/unpacked into a dictionary. Each argument name is a key of this dictionary and the associated value is the value passed to the function. The unpacking is done with `**dict`.

In [40]:
# Packing & unpacking with a dictionary: **dict
my_numbers = {'a': 10, 'b': 12, 'c': 15}
mean(**my_numbers)

12.333333333333334

Coming back to the initial motivation, *i.e.* having an arbitrary number of arguments, it is possible to define such a function as follows (in this case, it just prints the number and the list of arguments):

In [41]:
# Function definition with *args
def test_function(*args):
    print('There are {} arguments: '.format(len(args)))
    for a in args:
        print('  -> {}'.format(a))
    print('')
    return

In [42]:
# Test with different numbers/types of arguments
test_function()
test_function('hoho')
test_function('hoho', 3)
test_function('hoho', 3, [1,'banana'], {'mood': 'happy', 'state': 'holidays'})

There are 0 arguments: 

There are 1 arguments: 
  -> hoho

There are 2 arguments: 
  -> hoho
  -> 3

There are 4 arguments: 
  -> hoho
  -> 3
  -> [1, 'banana']
  -> {'mood': 'happy', 'state': 'holidays'}



In [43]:
# Function definition with **kwargs
def test_function(**kwargs):
    print('There are {} arguments: '.format(len(kwargs)))
    for k, v in kwargs.items():
        print(' {}={}'.format(k, v))
    print('')
    return

In [104]:
test_function()
test_function(x='hoho')
test_function(word='hoho', multiplicity=3)
test_function(a='hoho', N=3, shopping=[1,'banana'], feeling={'mood': 'happy', 'state': 'holidays'})

There are 0 arguments: 

There are 1 arguments: 
 x=hoho

There are 2 arguments: 
 word=hoho
 multiplicity=3

There are 4 arguments: 
 a=hoho
 N=3
 shopping=[1, 'banana']
 feeling={'mood': 'happy', 'state': 'holidays'}



This can be used to pass arguments in a very readable and concise way. This might be helpful for cosmetic arguments that are common to several plots (but not all). We'll see some concrete examples later in the lecture. In the meantime, here is the equivalent of the last call from the code above:

In [45]:
# Pack all keyword arguments in a dictionary first
my_args = {
           'a': 'hoho', 
           'N': 3, 
           'shopping': [1,'banana'], 
           'feeling': {'mood': 'happy', 'state': 'holidays'}
          }

# Then call the function
test_function(**my_args)

There are 4 arguments: 
 a=hoho
 N=3
 shopping=[1, 'banana']
 feeling={'mood': 'happy', 'state': 'holidays'}
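
Both mechanisms can be combined in a single function, and the packed arguments can be unpacked again to forward them to another function. This is only a sketch; the function names `describe_call` and `forward` are hypothetical:

```python
def describe_call(*args, **kwargs):
    # args is a tuple of positional arguments,
    # kwargs is a dictionary of keyword arguments
    return len(args), sorted(kwargs)

def forward(*args, **kwargs):
    # Unpack again to pass everything on, unchanged, to another function
    return describe_call(*args, **kwargs)

direct = describe_call('hoho', 3, mood='happy')
forwarded = forward('hoho', 3, mood='happy', state='holidays')
print(direct)
print(forwarded)
```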



## File manipulation


### Open, write and close a file

In [63]:
f = open('test.txt', 'w')
# write() does not add line breaks: each line must end with '\n'
f.write('# Writing a text into a file is quite easy in python\n')
f.write('# This can be used to store data\n')
f.write('time, height, N, price\n')
f.write('1.2, 2.0, 3, 8.7\n')
f.close()

### Open, read and close a file

In [65]:
f = open("test.txt", "r")
for line in f:
    # Each line read from the file ends with '\n': remove it before printing
    print(line.rstrip('\n'))
f.close()

# Writing a text into a file is quite easy in python
# This can be used to store data
time, height, N, price
1.2, 2.0, 3, 8.7
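
A common variant uses the `with` statement, which closes the file automatically at the end of the block, even if an error occurs. The sketch below assumes a hypothetical file `data.txt` written in a temporary directory; it also skips the comment lines and converts each field into a number:

```python
import os
import tempfile

# Write a small file in a temporary directory (hypothetical file name)
path = os.path.join(tempfile.mkdtemp(), 'data.txt')
with open(path, 'w') as f:
    f.write('# time, height\n')
    f.write('1.2, 2.0\n')
    f.write('2.4, 3.5\n')

# Read it back, skip empty and comment lines, convert each field into a float
rows = []
with open(path, 'r') as f:
    for line in f:
        line = line.strip()
        if not line or line.startswith('#'):
            continue
        rows.append([float(field) for field in line.split(',')])

print(rows)
```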
