# Python training UGA 2017

**A training to acquire a strong basis in Python and use it efficiently**

Pierre Augier (LEGI), Cyrille Bonamy (LEGI), Eric Maldonado (Irstea), Franck Thollard (ISTERRE), Christophe Picard (LJK), Loïc Huder (ISTerre)

# Object-oriented programming: encapsulation

See https://docs.python.org/3/tutorial/classes.html

Python is also an object-oriented language. For some problems, object-oriented programming (OOP) is a very efficient paradigm, and many libraries rely on it. It is therefore worth understanding how the basic object-oriented mechanisms work in Python and when they are useful.

# Concepts

## Object
An object is an entity that has a state and a behavior. Objects are the basic elements of an object-oriented system.

## Class
Classes are "families" of objects. A class is a pattern that describes how objects will be built.
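
As a small illustration with a built-in type (the variable name `temperatures` is only chosen for this example): a Python list is an object whose state is its content and whose behavior is provided by its methods, and `list` is its class.

```python
# `temperatures` is an object; its class is the built-in `list`.
temperatures = [1, 5, 1, -1]

# State: the data the object currently holds.
print(temperatures)  # [1, 5, 1, -1]

# Behavior: what the object can do, through its methods.
temperatures.append(3)        # changes the state
print(temperatures.count(1))  # 2

# The class describes how such objects are built.
print(type(temperatures) is list)      # True
print(isinstance(temperatures, list))  # True
```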

## OOP motivation: data encapsulation

**Example: the weather stations**

Let us suppose we have a set of weather stations that measure wind speed and temperature, and that we want to compute some statistics on these data. A basic representation of a station is a list of lists: wind values and temperature values.

In [1]:
paris = [[10, 0, 20, 30, 20, 0], [1, 5, 1, -1, -1, 3]]

# get wind when temperature is maximal
idx_max_temp = paris[1].index(max(paris[1]))
print(f"max temp is {paris[1][idx_max_temp]}°C at index {idx_max_temp}")
print(f"wind speed at max temp = {paris[0][idx_max_temp]} km/h")

max temp is 5°C at index 1
wind speed at max temp = 0 km/h


**Comments on this solution**

Many problems:

- if the number of measured quantities increases (e.g. rainfall, humidity, ...), the previous indexing will no longer be valid (what will `paris[5]` represent? wind, temperature, ...?)
- Code analysis is not (that) straightforward
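
To make the first problem concrete, here is a minimal sketch. It assumes, purely for illustration, that rainfall measurements (hypothetical values) are later stored at the front of the list:

```python
paris = [[10, 0, 20, 30, 20, 0], [1, 5, 1, -1, -1, 3]]

# Hypothetical rainfall values, inserted at the front of the station data.
rainfall = [0, 2, 0, 0, 1, 0]
paris.insert(0, rainfall)

# The code written before still runs, but index 1 is now the wind speed,
# so the "maximal temperature" below is silently wrong.
idx_max_temp = paris[1].index(max(paris[1]))
print(paris[1][idx_max_temp])  # 30 (a wind speed, not a temperature)
```

No error is raised: the result is simply wrong, which is what makes positional indexing fragile.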

**A possible solution: create a box**

We can use a dictionary:


In [2]:
paris = {"wind": [10, 0, 20, 30, 20, 0], "temperature": [1, 5, 1, -1, -1, 3]}

# get wind when temperature is maximal
paris_temp = paris["temperature"]
idx_max_temp = paris_temp.index(max(paris_temp))

print(f"max temp is {paris_temp[idx_max_temp]}°C at index {idx_max_temp}")
print(f"wind speed at max temp = {paris['wind'][idx_max_temp]} km/h")

max temp is 5°C at index 1
wind speed at max temp = 0 km/h


**Comments**
- Pro
  - More readable code (reading `paris["temperature"]` is clearer than `paris[1]`)
  - Less error prone code

- Con 
  - The code to compute the final result is not very readable

**Improvement**

Add functions

In [3]:
paris = {"wind": [10, 0, 20, 30, 20, 0], "temperature": [1, 5, 1, -1, -1, 3]}


def max_temp(station):
    """ returns the maximum temperature available in the station"""
    return max(station["temperature"])


def arg_max_temp(station):
    """ returns the index of maximum temperature available in the station"""
    max_temperature = max_temp(station)
    return station["temperature"].index(max_temperature)


idx_max_temp = arg_max_temp(paris)

print(f"max temp is {max_temp(paris)}°C at index {arg_max_temp(paris)}")
print(f"wind speed at max temp = {paris['wind'][idx_max_temp]} km/h")

max temp is 5°C at index 1
wind speed at max temp = 0 km/h


**Comments**

- Pro:
  - adding functions leads to a code that is easier to read (and therefore to debug!)
  - testing functions can be done separately from the rest of the code
- Con 
  - We rely on the fact that the dictionaries have been built correctly (for example, that the wind and temperature lists have the same length).
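
The drawback can be sketched as follows, with a station built by hand in which one wind value is missing (the name `lyon` and its values are hypothetical):

```python
# A station built by hand; one wind value is missing (hypothetical data).
lyon = {"wind": [10, 0, 20], "temperature": [1, 2, 3, 8]}

idx_max_temp = lyon["temperature"].index(max(lyon["temperature"]))
print(idx_max_temp)  # 3

try:
    wind_at_max_temp = lyon["wind"][idx_max_temp]
except IndexError as error:
    # The inconsistency is only detected when the data is used.
    wind_at_max_temp = None
    print(f"IndexError: {error}")
```

The error appears far from the place where the faulty dictionary was created, which makes it harder to track down.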

**Improvement**

Define a function that builds the station (delegate the generation of the station dictionary to a function)

In [5]:
def build_station(wind, temp):
    """ Build a station given wind and temp
    :param wind: (list of floats) wind speeds
    :param temp: (list of floats) temperatures
    """
    if len(wind) != len(temp):
        raise ValueError("wind and temperature should have the same size")
    return {"wind": list(wind), "temperature": list(temp)}


def max_temp(station):
    """ returns the maximum temperature available in the station"""
    return max(station["temperature"])


def arg_max_temp(station):
    """ returns the index of maximum temperature available in the station"""
    max_temperature = max_temp(station)
    return station["temperature"].index(max_temperature)


paris = build_station([10, 0, 20, 30, 20, 0], [1, 5, 1, -1, -1, 3])
idx_max_temp = arg_max_temp(paris)

print(f"max temp is {max_temp(paris)}°C at index {arg_max_temp(paris)}")
print(f"wind speed at max temp = {paris['wind'][idx_max_temp]} km/h")

max temp is 5°C at index 1
wind speed at max temp = 0 km/h


**Comments**

  - If the dedicated function `build_station` is used, the returned dictionary is well structured.
  - If the structure returned by `build_station` changes, only the functions that access it (`max_temp` and `arg_max_temp`) have to be adapted.
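
As a quick check of the first point, the sketch below repeats `build_station` (so that the example is self-contained) and calls it with inconsistent, made-up data:

```python
def build_station(wind, temp):
    """Same function as above, repeated so that the example is self-contained."""
    if len(wind) != len(temp):
        raise ValueError("wind and temperature should have the same size")
    return {"wind": list(wind), "temperature": list(temp)}


try:
    build_station([10, 0, 20], [1, 5])
except ValueError as error:
    # The problem is reported when the station is built, not when it is used.
    message = str(error)
    print(f"ValueError: {message}")
```

Note that nothing prevents a user from building the dictionary by hand and bypassing this check.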

## Object-oriented programming in a nutshell

A class defines a template used for building objects. 
In our example, the class (named `WeatherStation`) defines the specifications of what a weather station is (i.e., a weather station should contain a list of wind speeds, named "wind", and a list of temperatures, named "temp").
`paris` should now be an object that meets these specifications. It is called an **instance** of the class `WeatherStation`.

When defining the class, we need to define how to initialize an instance of the class (special "function" `__init__`). 


In [1]:
class WeatherStation:
    """ A weather station that holds wind and temperature
    
    :param wind: any ordered iterable
    :param temperature: any ordered iterable
    
    wind and temperature must have the same length.
    
    """
    def __init__(self, wind, temperature):
        self.wind = list(wind)
        self.temp = list(temperature)
        if len(self.wind) != len(self.temp):
            raise ValueError(
                "wind and temperature should have the same size"
            )

    def max_temp(self):
        """ returns the maximum temperature recorded in the station"""
        return max(self.temp)

    def arg_max_temp(self):
        """ returns the index of (one of the) maximum temperature recorded in the station"""
        return self.temp.index(self.max_temp())


paris = WeatherStation([10, 0, 20, 30, 20, 0], [1, 5, 1, -1, -1, 3])
idx_max_temp = paris.arg_max_temp()
print(f"max temp is {paris.max_temp()}°C at index {paris.arg_max_temp()}")
print(f"wind speed at max temp = {paris.wind[idx_max_temp]} km/h")

max temp is 5°C at index 1
wind speed at max temp = 0 km/h


**Attributes**

Names attached to instances are called **attributes** (here, `max_temp`, `temp`, `wind`, etc.).

**Methods**

The functions `max_temp` and `arg_max_temp` are now part of the class `WeatherStation`.
Functions attached to classes are named **methods**.

The first argument of methods is `self`. When the method is called with an instance, as

`idx_max_temp = paris.arg_max_temp()`

`self` is automatically set as a reference to the instance on which the method is called (here, `self = paris`). So it is equivalent to:

`idx_max_temp = WeatherStation.arg_max_temp(paris)`
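
The equivalence of the two forms can be checked with a reduced copy of the class (the length check is left out to keep the sketch short; the names `via_instance` and `via_class` are only used here):

```python
class WeatherStation:
    """Reduced copy of the class above, kept short for this example."""

    def __init__(self, wind, temperature):
        self.wind = list(wind)
        self.temp = list(temperature)

    def max_temp(self):
        return max(self.temp)

    def arg_max_temp(self):
        return self.temp.index(self.max_temp())


paris = WeatherStation([10, 0, 20, 30, 20, 0], [1, 5, 1, -1, -1, 3])

# Usual form: `self` is filled in automatically with `paris`.
via_instance = paris.arg_max_temp()

# Explicit form: the instance is passed by hand as the first argument.
via_class = WeatherStation.arg_max_temp(paris)

print(via_instance, via_class)  # 1 1
```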

**Special method `__init__`**

The method `__init__` is automatically called during the creation (instantiation) of the object (instance).
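
A minimal sketch of this behavior, using only the constructor of the class above: `__init__` is never called explicitly, yet the attributes are set and the length check runs as soon as an instance is created (the flag `creation_failed` is only introduced for this example).

```python
class WeatherStation:
    """Copy of the constructor above, without the other methods."""

    def __init__(self, wind, temperature):
        self.wind = list(wind)
        self.temp = list(temperature)
        if len(self.wind) != len(self.temp):
            raise ValueError("wind and temperature should have the same size")


# No explicit call to __init__: creating the instance runs it.
paris = WeatherStation([10, 0, 20], [1, 5, 1])
print(paris.wind, paris.temp)  # [10, 0, 20] [1, 5, 1]

# The check written in __init__ therefore runs at creation time.
try:
    WeatherStation([10, 0, 20], [1, 5])
    creation_failed = False
except ValueError:
    creation_failed = True
print(creation_failed)  # True
```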


**Data attributes**

`wind` and `temp` lists are attached to instances. Each instance of the class has its own version of `wind` and `temp`. 
They are **data attributes** or **instance variables**.

An object (here `paris`) thus contains both **data attributes** (holding data for example) and **methods** to access and/or process the data.
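
The following sketch illustrates that each instance owns its data, using a reduced copy of the class and a second station (`grenoble`, with made-up values):

```python
class WeatherStation:
    """Reduced copy of the class above, kept short for this example."""

    def __init__(self, wind, temperature):
        self.wind = list(wind)
        self.temp = list(temperature)

    def max_temp(self):
        return max(self.temp)


paris = WeatherStation([10, 0, 20], [1, 5, 1])
grenoble = WeatherStation([5, 5, 0], [-2, 0, 4])  # hypothetical values

# Changing the data of one instance does not affect the other.
paris.temp[0] = 12
print(paris.max_temp())     # 12
print(grenoble.max_temp())  # 4
```

The same method `max_temp` is shared by both instances, but it works on the data of the instance it is called on.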



## Coming next

What if we now have a weather station that also measures humidity?

Do we need to rewrite everything?

What if we rewrite everything and we find a bug?

**Here comes inheritance**