### Classes and Object Oriented Programming (OOP)

Imagine that you want to store some *fundamental constants* in a program so that they can be accessed at the global level, without the risk of shadowing or destroying them by redefinition in some other part of the program.

You can do that by building a specific *class* that stores those numbers: 

In [None]:
class constant_class():
    def __init__(self):
       self.h=6.62607e-34      # Planck
       self.k=1.38065e-23      # Boltzmann
       self.avo=6.02214e23     # Avogadro
       self.R=self.avo*self.k  # Gas constant

We defined the *class* *constant_class*. The *class* definition, at this level, is very much like a *function definition* (with an empty list of arguments, for now). 

Inside the class, we defined an ``` __init__ ``` function with a strange argument, *self*. Here *self* is a label that stands for the object the function is acting on: when the class is used, it is replaced by each single *instance* of the class. The ``` __init__ ``` function is called automatically every time an instance is created. 

The body of *init* consists of the assignment of our fundamental constants (in the standard SI units), each one preceded by the label *self*. Our constants will be the *attributes* of the class.

To use the class, we first have to create an *instance* of it. We call this instance *fc*:

In [None]:
fc=constant_class()

Now, to retrieve the Avogadro number, you just have to write:

In [None]:
fc.avo

It works fine in functions, too:

In [None]:
def func():
    print("The avogadro number is ", fc.avo)
    
func()
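
To make the role of *self* more concrete, here is a minimal sketch. The class name *ConstantSketch* and its method *gas* are hypothetical and are used only for illustration. It shows that calling a method on an instance is the same as passing that instance explicitly as *self*, and that each instance owns its attributes.

```python
class ConstantSketch:
    """Hypothetical two-attribute class, used only to illustrate `self`."""

    def __init__(self):
        self.k = 1.38065e-23   # Boltzmann constant (SI)
        self.avo = 6.02214e23  # Avogadro number

    def gas(self):
        # `self` is whichever instance the method was called on
        return self.avo * self.k


a = ConstantSketch()
b = ConstantSketch()

# Calling a method on an instance passes that instance as `self`:
# the two calls below are equivalent.
r_bound = a.gas()
r_explicit = ConstantSketch.gas(a)
print(r_bound == r_explicit)

# Each instance owns its attributes: changing b does not affect a.
b.avo = 1.0
print(a.avo, b.avo)
```

Note that the last two lines also show the risk the class was meant to reduce: an attribute can still be reassigned from outside.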

To get the value of some attribute of the class, it is a ***very good practice*** to define a function within the class that returns the value of the wanted attribute. A function within a class is called a *method*:

In [None]:
class constant_class():
    def __init__(self):
       self.h=6.62607e-34      # Planck
       self.k=1.38065e-23      # Boltzmann
       self.avo=6.02214e23     # Avogadro
       self.R=self.avo*self.k  # Gas constant
        
    def get_avogadro(self):
        return self.avo
    
    def get_boltzmann(self):
        return self.k
    
    def get_planck(self):
        return self.h
    
    def get_gas(self):
        return self.R
    
fc=constant_class()

Now, if you want for instance the value of the gas constant, you can use the method *get_gas*:

In [None]:
fc.get_gas()

You could also define a method *get_constant* that requires the name of the constant to be retrieved:

In [None]:
class constant_class():
    def __init__(self):
       self.h=6.62607e-34      # Planck
       self.k=1.38065e-23      # Boltzmann
       self.avo=6.02214e23     # Avogadro
       self.R=self.avo*self.k  # Gas constant
        
    def get_avogadro(self):
        return self.avo
    
    def get_boltzmann(self):
        return self.k
    
    def get_planck(self):
        return self.h
    
    def get_gas(self):
        return self.R
    
    def get_constant(self, const):      
        if const=='avogadro':
            return self.avo
        elif const=='boltzmann':
            return self.k
        elif const=='planck':
            return self.h
        elif const=='gas':
            return self.R
        else:
            print("Unknown constant ", const)
              
    
fc=constant_class()

In [None]:
fc.get_constant('gas')
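
As a possible alternative to the chain of *if/elif* tests, the same lookup can be sketched with a dictionary that maps each name to the attribute storing the value. The class name *ConstantLookup* and the table *_names* are hypothetical. Unlike the version above, this sketch raises an error for an unknown name instead of printing a message.

```python
class ConstantLookup:
    """Sketch of a lookup-table version of get_constant (hypothetical name)."""

    # Maps the public name of each constant to the attribute that stores it
    _names = {'avogadro': 'avo', 'boltzmann': 'k', 'planck': 'h', 'gas': 'R'}

    def __init__(self):
        self.h = 6.62607e-34         # Planck
        self.k = 1.38065e-23         # Boltzmann
        self.avo = 6.02214e23        # Avogadro
        self.R = self.avo * self.k   # Gas constant

    def get_constant(self, const):
        if const not in self._names:
            # Design choice of this sketch: fail loudly on unknown names
            raise KeyError("Unknown constant " + repr(const))
        return getattr(self, self._names[const])


lookup = ConstantLookup()
print(lookup.get_constant('gas'))
```

Adding a new constant then only requires a new attribute and a new entry in the table.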

You can also implement the method *get_constant* by using a *match/case* construct (Python version >= 3.10):

In [None]:
class constant_class():
    def __init__(self):
       self.h=6.62607e-34      # Planck
       self.k=1.38065e-23      # Boltzmann
       self.avo=6.02214e23     # Avogadro
       self.R=self.avo*self.k  # Gas constant
        
    def get_avogadro(self):
        return self.avo
    
    def get_boltzmann(self):
        return self.k
    
    def get_planck(self):
        return self.h
    
    def get_gas(self):
        return self.R
    
    def get_constant(self, const):
        match const:
            case 'avogadro':
                return self.avo
            case 'boltzmann':
                return self.k
            case 'planck':
                return self.h
            case 'gas':
                return self.R
            case other:
                print("Unknown constant ", const)
              
    
fc=constant_class()

In [None]:
fc.get_constant('boltzmann')

Now, those constants are stored in SI units... but you might want them in some different units! For instance, the gas constant is stored in $\mathrm{m^3\,Pa\ /\ (mol\,K)}$, and you may want it in $\ell\,\mathrm{atm\ /\ (mol\,K)}$. 

You need a conversion factor.
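
The arithmetic behind such a factor can be sketched as follows. The variable names are chosen here for illustration only. Since 1 m³ is 1000 litres and 1 atm is 101325 Pa, the factor is 1000/101325.

```python
k = 1.38065e-23       # Boltzmann constant, J/K
avo = 6.02214e23      # Avogadro number, 1/mol
R_si = avo * k        # gas constant in m^3 Pa / (mol K)

litres_per_m3 = 1e3   # 1 m^3 = 1000 L
pa_per_atm = 101325   # 1 atm = 101325 Pa

# m^3 -> L multiplies by 1000; Pa -> atm divides by 101325
factor = litres_per_m3 / pa_per_atm
R_litre_atm = R_si * factor

print("R (SI):        %.5f" % R_si)
print("R (litre atm): %.5f" % R_litre_atm)
```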

So: 

- start by defining, in *init*, the variable *self.R_unit = 1.* This is the factor used to convert the value of your constant;
- modify each function returning the value of *R*, so that it returns *R* multiplied by this conversion factor;
- define the function *set_units* that *asks* for the constant and the system of units wanted.

In [None]:
class constant_class():
    def __init__(self):
       self.h=6.62607e-34      # Planck
       self.k=1.38065e-23      # Boltzmann
       self.avo=6.02214e23     # Avogadro
       self.R=self.avo*self.k  # Gas constant
       
       self.R_unit=1.
        
    def get_avogadro(self):
        return self.avo
    
    def get_boltzmann(self):
        return self.k
    
    def get_planck(self):
        return self.h
    
    def get_gas(self):
        return self.R*self.R_unit
    
    def get_constant(self, const):      
        if const=='avogadro':
            return self.avo
        elif const=='boltzmann':
            return self.k
        elif const=='planck':
            return self.h
        elif const=='gas':
            return self.R*self.R_unit
        else:
            print("Unknown constant ", const)           
            
    def set_units(self, const, system='SI'):
        if const=='gas':
           if system =='SI': 
              self.R_unit=1.
           elif system == 'litre_atm':
              self.R_unit = 1e3/101325                 
              
    
fc=constant_class()

Now, ask for *R*:

In [None]:
fc.get_gas()

Change units and ask for the constant again:

In [None]:
fc.set_units('gas', system='litre_atm')
fc.get_gas()

In [None]:
fc.get_gas()

Note that if you directly access the attribute *R* of the class, instead of using the method *get_gas*, you get the value in SI units:

In [None]:
fc.R

Of course, there are countless ways to change the behaviour of the class, by adding other methods or modifying the existing ones. For instance, analyze this:

In [None]:
class constant_class():
    def __init__(self):
       self.h=6.62607e-34      # Planck
       self.k=1.38065e-23      # Boltzmann
       self.avo=6.02214e23     # Avogadro
       self.R=self.avo*self.k  # Gas constant
       
       self.R_unit=1.
        
    def get_avogadro(self):
        return self.avo
    
    def get_boltzmann(self):
        return self.k
    
    def get_planck(self):
        return self.h
    
    def get_gas(self, units='default'):
        if units=='default':
           return self.R*self.R_unit
        else:
           saved_unit=self.R_unit       # remember the current default factor
           self.set_units(const='gas', units=units)
           R_value=self.R*self.R_unit
           self.R_unit=saved_unit       # restore it, whatever it was
           return R_value
    
    def get_constant(self, const, units='SI'):      
        if const=='avogadro':
            return self.avo
        elif const=='boltzmann':
            return self.k
        elif const=='planck':
            return self.h
        elif const=='gas':
            return self.get_gas(units)
        else:
            print("Unknown constant ", const)           
            
    def set_units(self, const, units='SI'):
        if const=='gas':
           if units =='SI': 
              self.R_unit=1.
           elif units == 'litre_atm':
              self.R_unit = 1e3/101325                 
              
    
fc=constant_class()

In [None]:
print(fc.get_gas())
print(fc.get_gas(units='litre_atm'))
print(fc.get_constant('gas'))
print(fc.get_constant('gas', units='litre_atm'))

With this implementation you can also change the default conversion factor for the gas constant:

In [None]:
fc.set_units(const='gas', units='litre_atm')
fc.get_gas()

Never forget to *document* your class and to extensively test it! 
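
A minimal sketch of what this can look like, assuming a reduced class with the hypothetical name *DocumentedConstants*: docstrings describe the class and its method, and a few *assert* statements act as very simple tests.

```python
class DocumentedConstants:
    """Store a few physical constants in SI units (illustrative sketch).

    Attributes
    ----------
    k   : Boltzmann constant, J/K
    avo : Avogadro number, 1/mol
    R   : gas constant, J/(mol K), computed as avo*k
    """

    def __init__(self):
        self.k = 1.38065e-23
        self.avo = 6.02214e23
        self.R = self.avo * self.k

    def get_gas(self):
        """Return the gas constant in SI units."""
        return self.R


# Minimal tests: each assert stops the program if the condition is false
dc = DocumentedConstants()
assert abs(dc.get_gas() - 8.3145) < 1e-3
assert dc.get_gas() == dc.avo * dc.k

# The documentation can be read back at run time
print(DocumentedConstants.get_gas.__doc__)
```

In a notebook, the same docstrings are what `help(DocumentedConstants)` displays.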

Classes can be used to store variables at the global level that can be modified within functions without the need for a *global* declaration, preferably by using the methods provided for that purpose. This is a *good programming practice* that can avoid a lot of mistakes.  

In [None]:
class parameter_class():
      def __init__(self, val=1):
          self.par=val
      def set_value(self, val):
          self.par=val
      def get_value(self):
          return self.par
        
par=parameter_class(val=2)

print("value of par: ", par.par)

In [None]:
def func():
    par.set_value(5)
    print("Inside func:           ", par.get_value())
    
print("Before calling func:   ", par.get_value())
func()
print("After func was called: ", par.get_value())
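
For contrast, the sketch below compares a plain global variable with an object. The names *counter*, *ParameterSketch* and *par_obj* are hypothetical. Assigning to a plain global inside a function silently creates a local variable, whereas calling a method changes the shared object.

```python
counter = 2   # plain global variable


def try_to_change():
    # This assignment creates a new *local* variable named counter;
    # the global one is left untouched.
    counter = 5
    return counter


class ParameterSketch:
    """Hypothetical stand-in for parameter_class."""

    def __init__(self, val=1):
        self.par = val

    def set_value(self, val):
        self.par = val


par_obj = ParameterSketch(val=2)


def change_through_method():
    # No global declaration needed: the function only *reads* the name
    # par_obj and asks the object to update itself.
    par_obj.set_value(5)


try_to_change()
change_through_method()
print("plain global: ", counter)      # unchanged
print("object value: ", par_obj.par)  # updated by the method
```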

Now, reconsider the factorial function, here implemented by using a *class*:

In [None]:
class factorial_class():
    def __init__(self):
        self.fact=1
        
    def set_init(self):
        self.fact=1
        
    def factorial(self, n, prn=False):
        self.set_init()
        self.fact_rec(n)
        if prn:
           print("The factorial of %3i  is %6i" % (n, self.fact))
        else:
           return self.fact
    
    def fact_rec(self, b):            # <--- recursive function
        if b <= 1:                    # base case: 0! = 1! = 1, stop recursing
           return
        self.fact=b*self.fact
        self.fact_rec(b-1)
    
ff=factorial_class()

In [None]:
ff.factorial(8, prn=True)

### Inheritance

Classes can be defined so that they inherit attributes and/or methods from *super-classes*.

Here is a very simple example: a *method_class* that only contains a method for computing the product of two scalars; then a *data_class* is defined so that it inherits that single method from *method_class* and, in addition, defines and sets two scalars: 

In [None]:
class method_class():

    def compute(self):
        return self.x*self.y
    
class data_class(method_class):
      def __init__(self, xini, yini):
            self.x=xini
            self.y=yini
      def set_x(self, x):
          self.x=x
      def set_y(self, y):
          self.y=y
    

Two instances, *my_data_1* and *my_data_2*, of the *data_class* class are defined, having different initial values of those two scalars: 

In [None]:
my_data_1=data_class(1., 2.)
my_data_2=data_class(3., 4.)

To see how to use it, have a look at the following examples:

In [None]:
my_data_1.compute()

In [None]:
my_data_2.compute()

In [None]:
my_data_1.set_x(5.)
print(my_data_1.x)
print(my_data_1.compute())
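
The relation between the two classes can be inspected directly. The sketch below uses hypothetical copies of the two classes (*MethodSketch* and *DataSketch*) so that it runs on its own. It checks that the method really lives in the super-class, and that the super-class alone cannot compute anything because it has no data.

```python
class MethodSketch:
    """Hypothetical copy of method_class: it only provides compute()."""

    def compute(self):
        return self.x * self.y


class DataSketch(MethodSketch):
    """Hypothetical copy of data_class: it only provides the data."""

    def __init__(self, xini, yini):
        self.x = xini
        self.y = yini


d = DataSketch(1., 2.)

# An instance of the subclass is also an instance of the super-class
print(isinstance(d, MethodSketch))
print(issubclass(DataSketch, MethodSketch))

# compute() is not defined in DataSketch: Python finds it in MethodSketch
print('compute' in vars(DataSketch), 'compute' in vars(MethodSketch))

# The super-class alone has no x and y, so compute() fails on it
try:
    MethodSketch().compute()
    failed = False
except AttributeError:
    failed = True
print(failed)
```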

###  Exercise:

By using inheritance, code a class system that computes average and standard deviation of a dataset, each dataset being a different instance of a data_class.

In [None]:
class stat_class():
      
      def set_size(self):
          self.size=len(self.data)
          return self.size
        
      def average(self):
          ave = 0.
          size = self.set_size()
          self.size=size
          for ix in self.data:
              ave=ave+ix
          ave=ave/size
          self.ave=ave
          self.flag=True
          return ave
        
      def standard_deviation(self, force=True):
          if (not self.flag) or (self.flag and force):
             ave=self.average() 
            
          ave=self.ave
          size=self.size
          std=0.
          for ix in self.data:
              std=std+(ix-ave)**2
                
          std=(std/(size-1))**0.5
          self.std=std
          return std
            
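      def describe(self):
          # Recompute the average only if the data changed (flag is False);
          # the standard deviation is always refreshed before printing.
          self.standard_deviation(force=not self.flag)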
                
          print("data-set: %s" % self.name)
          print("Size: %4i" % self.size)
          print("Average:    %5.2f" % self.ave)
          print("Stand. dev: %5.2f" % self.std)
          
            
class data_class(stat_class):
    def __init__(self, name='default set', x=[0., 0., 0.]):
        self.data=x
        self.name=name
        self.ave=0.
        self.std=0.
        self.flag=False
    def set_data(self, xlist):
        self.data=xlist
        self.flag=False
        
x1=[3., 4., 6.]
x2=[1., 0., 6., 7., 10.]
set_1=data_class('set_1', x1)
set_2=data_class('set_2', x2)

In [None]:
set_1.describe()

set_2.describe()

In [None]:
set_2.describe()

In [None]:
set_1.set_data([1.0, 1.5, 1.2, 0.8, 0.95])
set_1.describe()
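
As a cross-check, the same formulas can be compared with the standard-library *statistics* module. This is a sketch that does not use the classes above, and the variable names are chosen for illustration. Note that the standard deviation computed here divides by *size-1*, which makes it the *sample* standard deviation, the same quantity returned by *statistics.stdev*.

```python
import statistics

data = [3., 4., 6.]

# Same formulas as in stat_class, written out explicitly
size = len(data)
ave = sum(data) / size
std = (sum((ix - ave)**2 for ix in data) / (size - 1))**0.5

# statistics.stdev is the *sample* standard deviation (divides by size-1)
assert abs(ave - statistics.mean(data)) < 1e-12
assert abs(std - statistics.stdev(data)) < 1e-12

print("Average:    %5.2f" % ave)
print("Stand. dev: %5.2f" % std)
```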

It is also possible to define arrays of *objects*, like the array *sets* constructed here:

In [None]:
import numpy as np

x1=np.array([1.0, 1.5, 1.2, 0.8, 0.95])
x2=np.array([1.1, 1.2, 1.6, 0.7, 0.85])
x3=np.array([0.9, 0.85, 1.24, 1.3, 1.25])

set1=data_class('set_1',x1)
set2=data_class('set_2',x2)
set3=data_class('set_3',x3)

sets=np.array([set1, set2, set3], dtype='object')

To describe the second element (*object*) of the array *sets*...

In [None]:
sets[1].describe()

What follows is a rather advanced use of Python features. We exploit here the functions *exec* and *eval*, which take strings and, respectively, execute or evaluate them as Python code. For instance:

In [None]:
a=1
print(eval('a'))

that is: there is a variable whose name is *a*, and there is the string 'a'... ``` eval('a') ``` *evaluates* the string 'a' as a Python expression, which yields the value of the variable *a*, and passes the result to the *print* function. Note that for the evaluation to be successful, the variable *a* MUST already exist in the namespace where *eval* is operating.

The *exec* function works in this way:

In [None]:
my_string='b=1'
exec(my_string)
print(b)

here we have the string 'b=1' which is *executed* by the function *exec*. In this case, the result is the assignment ``` b=1 ``` and so, at the end of *exec*, we have a variable *b* which is equal to 1.   
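
Both functions also accept a dictionary to be used as the namespace. The sketch below, where the name *namespace* is an arbitrary choice, keeps the variables created by *exec* inside that dictionary instead of adding them to the global names.

```python
namespace = {}                 # dictionary that will receive the new names
exec('b = 1', namespace)       # b is created inside namespace, not globally
print(namespace['b'])

# eval can look names up in the same dictionary
result = eval('b + 1', namespace)
print(result)
```

This makes it easier to see exactly which names a string of code has defined.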

All this is used in the following code:

In [None]:
x1=np.array([1.0, 1.5, 1.2, 0.8, 0.95])
x2=np.array([1.1, 1.2, 1.6, 0.7, 0.85])
x3=np.array([0.9, 0.85, 1.24, 1.3, 1.25])
x=[x1, x2, x3]

set_list=['set1', 'set2', 'set3']
set_name=['set_1', 'set_2', 'set_3']

for iset, ix, iname in zip(set_list, x, set_name):
    exec(iset + '= data_class(iname, ix)')
    
l_set=list(eval(iset) for iset in set_list)
sets=np.array(l_set, dtype='object')

In the *for* loop, the *exec* calls execute the instructions 

```
set1 = data_class('set_1', x1)
set2 = data_class('set_2', x2)
set3 = data_class('set_3', x3)
```

The *eval* function is then used to build the list of the three *objects* corresponding to the names stored as strings in the list *set_list*. 

Now, as before, the three *objects* *set1*, *set2* and *set3* are contained in the array *sets*

In [None]:
sets[0].describe()

In [None]:
sets[2].describe()
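
When the individual variables *set1*, *set2*, ... are not needed, a similar collection can be built without *exec* and *eval*. The following is only a sketch of that alternative: *NamedSet* is a hypothetical stand-in for *data_class*, and plain lists replace the numpy arrays so that the block runs on its own.

```python
class NamedSet:
    """Hypothetical stand-in for data_class: a name plus a list of values."""

    def __init__(self, name, data):
        self.name = name
        self.data = data


x = [[1.0, 1.5, 1.2], [1.1, 1.2, 1.6], [0.9, 0.85, 1.24]]
set_name = ['set_1', 'set_2', 'set_3']

# One object per dataset, collected directly in a list...
sets_list = [NamedSet(iname, ix) for iname, ix in zip(set_name, x)]

# ...or in a dictionary, when access by name is wanted
sets_dict = {s.name: s for s in sets_list}

print(sets_list[1].name)
print(sets_dict['set_3'].data)
```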

### An aside on compilation and optimization

In addition to the *eval* and *exec* functions, the function *compile* can be used to prepare a *code object* from any legal Python string; such a code object can then be executed by using the function *exec*. For instance:

In [None]:
code_str='a,b=3,2\nprint(a*b)'
print(code_str)

code=compile(code_str, 'test', 'exec')
exec(code)
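
The third argument of *compile* selects the mode. As a small sketch (variable names chosen for illustration), the mode 'eval' is meant for a single expression, whose value is returned by *eval*, while 'exec' is meant for statements.

```python
# 'eval' mode compiles a single expression; eval() returns its value
expr_code = compile('3*2', 'test', 'eval')
value = eval(expr_code)
print(value)

# 'exec' mode compiles statements; exec() runs them and returns None.
# The names they define end up in the namespace passed to exec.
stmt_code = compile('a, b = 3, 2\nc = a*b', 'test', 'exec')
ns = {}
exec(stmt_code, ns)
print(ns['c'])
```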

A more useful and interesting feature is the possibility to compile and optimize code by using functions of the [Numba library](https://numba.readthedocs.io/en/stable/index.html). Imagine writing a function that sorts the data stored in an array:

In [None]:
def my_sort(data):

    ds=np.copy(data)
    ll=np.arange(ds.size)
        
    for ip in ll:
        ll2=np.arange(ip+1, ds.size)
        for jp in ll2:
            if ds[ip] > ds[jp]:
               ds[ip], ds[jp] = ds[jp], ds[ip]
              
    return ds

To test the function, you can run it on an array of 500 random numbers in the ``` [0, 1) ``` range, generated by means of the *random.uniform* function of numpy.  

In [None]:
data=np.random.uniform(0, 1, 500)

sorted_data=my_sort(data)

The time required to do such sorting can be measured with the *%timeit magic*

In [None]:
time1 = %timeit -o my_sort(data)

Now, let's optimize the function by using the *decorator* *jit* of the Numba library (decorators are preceded by the symbol '@' and are written immediately before the function definition):

In [None]:
from numba import jit

@jit(nopython=True)
def my_sort(data):

    ds=np.copy(data)
    ll=np.arange(ds.size)
        
    for ip in ll:
        ll2=np.arange(ip+1, ds.size)
        for jp in ll2:
            if ds[ip] > ds[jp]:
               ds[ip], ds[jp] = ds[jp], ds[ip]
              
    return ds

Test the function by measuring the execution time:

In [None]:
time2 = %timeit -o my_sort(data)

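The optimized version is much faster than the first version: compare the two timings stored in *time1* and *time2*. Keep in mind that Numba compiles the function the first time it is called, so that first call is slower than the following ones. 

Note that the numpy function for sorting data is faster still...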

In [None]:
%timeit np.sort(data)
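
The *%timeit* magic is only available in IPython and Jupyter. Outside a notebook, a similar measurement can be sketched with the standard-library *timeit* module. The function *exchange_sort* below is a pure-Python rewrite of the sort above, written for this example so that the block has no dependencies, and it is compared with the built-in *sorted* rather than with numpy or Numba.

```python
import random
import timeit


def exchange_sort(data):
    """Pure-Python version of the exchange sort used above (no NumPy)."""
    ds = list(data)                        # work on a copy
    for ip in range(len(ds)):
        for jp in range(ip + 1, len(ds)):
            if ds[ip] > ds[jp]:
                ds[ip], ds[jp] = ds[jp], ds[ip]
    return ds


random.seed(0)
values = [random.uniform(0, 1) for _ in range(500)]

# Check correctness before timing anything
assert exchange_sort(values) == sorted(values)

# timeit.repeat returns one total time per repetition; keep the best one
t_slow = min(timeit.repeat(lambda: exchange_sort(values), number=5, repeat=3)) / 5
t_fast = min(timeit.repeat(lambda: sorted(values), number=5, repeat=3)) / 5

print("exchange sort: %.2e s per call" % t_slow)
print("built-in sort: %.2e s per call" % t_fast)
```

The actual numbers depend on the machine, so only their ratio is meaningful.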

### An example 

Imagine (as is really the case here) that you have downloaded data from the INGV site concerning earthquakes (time of occurrence, location, magnitude, depth and a bunch of other information). Such data are organized in text files where columns are separated by the '|' character. A quick and nice way to import one of these files in our Python code is by means of functions of the ***pandas*** library. 

In [None]:
import pandas as pd  # import the Pandas library with the alias 'pd'
import numpy as np
import matplotlib.pyplot as plt

Here we load data for Sicilia (data refer to all the earthquakes in 2021 having magnitude greater than 2): data are loaded in the Pandas *DataFrame* we call *data_sicilia*:

In [None]:
data_sicilia=pd.read_csv('data_files/earthq_sicilia.dat', sep='|')

The first 5 rows of the data file can be seen by:

In [None]:
data_sicilia.head()

Now, imagine we are only interested in the magnitude of those earthquakes: we select the column 'Magnitude': 

In [None]:
sicilia_magnitude=data_sicilia['Magnitude']

In [None]:
sicilia_magnitude

Note the *type* of the objects *data_sicilia* and *sicilia_magnitude*:

In [None]:
type(data_sicilia)

In [None]:
type(sicilia_magnitude)

We could get the average magnitude of the earthquakes in the file by using the *mean* method of the pandas *Series*: 

In [None]:
sicilia_magnitude.mean()
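
To make the file layout more concrete, the sketch below parses a tiny in-memory table with the same '|' separator. It deliberately uses only the standard library (*csv* and *statistics*) instead of pandas, so that it runs without the data files. Only the column name 'Magnitude' comes from the text; the other column names and all the values are invented for illustration.

```python
import csv
import io
import statistics

# Invented sample in the same '|'-separated layout; only the column name
# 'Magnitude' is taken from the text, everything else is hypothetical.
sample = io.StringIO(
    "EventID|Depth|Magnitude\n"
    "1|10.0|2.1\n"
    "2|8.5|2.4\n"
    "3|12.0|3.0\n"
)

# Each row becomes a dictionary keyed by the column names of the first line
rows = list(csv.DictReader(sample, delimiter='|'))
magnitudes = [float(row['Magnitude']) for row in rows]

print(magnitudes)
print("mean magnitude: %.2f" % statistics.mean(magnitudes))
```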

A simple histogram of the data can be viewed: 

In [None]:
sicilia_magnitude.hist(bins=20)

or by embedding it in a *figure* created with the *matplotlib* library (this gives you more freedom, for instance in setting the axis labels and their size):  

In [None]:
plt.figure()
sicilia_magnitude.hist(bins=20)
plt.xlabel('Magnitude', fontsize=16)
plt.show()

We can transform this *Series* into a numpy array:

In [None]:
sicilia_m_list=np.array(sicilia_magnitude)

In [None]:
sicilia_m_list

Now, do the same with other files referring to other areas in Italy:

In [None]:
data_north=pd.read_csv('data_files/earthq_north.dat', sep='|')
data_central=pd.read_csv('data_files/earthq_central.dat', sep='|')
data_emilia=pd.read_csv('data_files/earthq_emilia.dat', sep='|')

north_magnitude=data_north['Magnitude']
central_magnitude=data_central['Magnitude']
emilia_magnitude=data_emilia['Magnitude']

north_m_list=np.array(north_magnitude)
central_m_list=np.array(central_magnitude)
emilia_m_list=np.array(emilia_magnitude)

Now, pack everything in the array *sets*:

In [None]:
set_name=['North', 'Emilia', 'Central', 'Sicilia']   # region names, used as data-set names
x=[north_m_list, emilia_m_list, central_m_list, sicilia_m_list] 

set_list=['set1', 'set2', 'set3', 'set4']

for iset, ix, iname in zip(set_list, x, set_name):
    exec(iset + '= data_class(iname, ix)')
    
l_set=list(eval(iset) for iset in set_list)
sets=np.array(l_set, dtype='object')

In [None]:
for iset in range(4):
    sets[iset].describe()
    print("")

## Other interesting features of classes

- Class attributes *vs* object attributes
- Class methods *vs* object methods

Let's create a class having name *Data*, just to store names of objects (*instances* of the class itself). 
We want to keep track of the number of objects that we create in that *Data* class. This number of objects is a variable that is clearly *logically* related to the class itself, and we would like to have a way to translate such *logical relation* into a *structural relation*. 

In what follows:
- we define the variable *number_of_obj* within the body of the *Data* class, and we initialize it to the value 0. This is a ***class attribute***: a value that is *shared* by all the objects of the class. It lives in the namespace of the class itself, so that it can be *seen* by every *method* of the class and of its objects; in other words, it can be read as an *attribute* of every object. 

- As usual, an \__init\__ method is created as a *constructor* of each newly defined object of the class. Such method asks for a *name* to be associated to the object and stored as an ***object attribute***. Now, as we create a new object, we want to *update* the class variable *number_of_obj*, and this operation should be performed within the \__init\__ method. 

One way to work is:

In [None]:
class Data:
    number_of_obj=0  
    
    def __init__(self, name):
        self.name=name        
        Data.number_of_obj += 1

It *works*:

In [None]:
x1=Data('x1')
x2=Data('x2')

print("Number of objects: ", Data.number_of_obj)
print("Name of object 1: ", x1.name)
print("Name of object 2: ", x2.name)

Indeed, we created 2 objects (*x1* and *x2*) and this number is correctly recorded in the *Data.number_of_obj* attribute of the class. However, this way to proceed is ***not*** advised. In particular, note that the name of the class (*Data*) is *hardcoded* in the definition of the \__init\__ function... if, for some reason, at a later time one decides to change the name of the class, every occurrence of that name within the body of the class must also be changed.

It is much better to modify the class attribute by using a *method of the class* itself, rather than a *method of the object*. This is the recommended way to proceed:

In [None]:
class Data:
    number_of_obj=0  
    
    def __init__(self, name):
        self.name=name        
        self.increment_number()
        
    @classmethod
    def increment_number(cls):
        cls.number_of_obj += 1       

We created a *class method* whose name is *increment_number*, which is preceded by the *@classmethod* ***function decorator***.

Such method refers to the *class* by the name *cls* (when invoked, *cls* refers to the class *Data*); it increments the class attribute *cls.number_of_obj* by 1 unit. 

The \__init\__ method of the object invokes such class method by calling *self.increment_number()*: indeed, any *class method* is also a method of each of its instances. Let's try it: 

In [None]:
x1=Data('x1')
x2=Data('x2')

print("Number of objects: ", Data.number_of_obj)
print("Name of object 1: ", x1.name)
print("Name of object 2: ", x2.name)
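
One more reason to go through a class method is a common pitfall, sketched below with a hypothetical class *Counter*: writing *self.number_of_obj += 1* inside \__init\__ does not update the class attribute. It reads the class value and then stores the result as a new attribute of the *instance*.

```python
class Counter:
    """Hypothetical class used only to contrast the two kinds of update."""
    number_of_obj = 0

    def __init__(self, use_classmethod):
        if use_classmethod:
            self.increment_number()
        else:
            # Reads the class attribute, then stores the result on the
            # *instance*: the class attribute itself is not changed.
            self.number_of_obj += 1

    @classmethod
    def increment_number(cls):
        cls.number_of_obj += 1


a = Counter(use_classmethod=False)
first = (Counter.number_of_obj, a.number_of_obj)
print(first)

b = Counter(use_classmethod=True)
second = (Counter.number_of_obj, b.number_of_obj)
print(second)

# a carries its own private copy, b reads the shared class attribute
print('number_of_obj' in vars(a), 'number_of_obj' in vars(b))
```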

Now, suppose you want to keep track of the objects you have instantiated (not only their total number). It would be nice to have a list (*obj_list*) of all such objects. Again, this list should be a *class attribute*: 

In [None]:
class Data:
    number_of_obj=0  
    obj_list=[]
    
    def __init__(self, name):
        self.name=name      
        self.increment_number()
        self.update_list(name)
        
        
    @classmethod
    def increment_number(cls):
        cls.number_of_obj += 1    
        
    @classmethod
    def update_list(cls, name):
        cls.obj_list.append(name)

With the same logic as before, we created a *class method* to properly handle the class attribute *obj_list*; the method is also called in the object *constructor* (\__init\__). Let's try: 

In [None]:
x1=Data('x1')
x2=Data('x2')

print("Number of objects: ", Data.number_of_obj)
print("List of object names", Data.obj_list)

Suppose now you are interested in creating new objects by *summing* previously defined ones, and in assigning them names built according to some given rule. For instance, from *x1* and *x2*, you want the object *x3=x1+x2* having the name *x1_x2*. Moreover, you want to really use the *operator* '+' to perform the job!

The *dunder* method \__add\__(self, other) is the way to do it...

In [None]:
class Data:
    number_of_obj=0  
    obj_list=[]
    
    def __init__(self, name):
        self.name=name
        
        self.increment_number()
        self.update_list(name)
        
    def __add__(self, other):     
        new_obj_name=self.name + '_' + other.name
        new_obj=self.join(new_obj_name)
        return new_obj
              
    @classmethod
    def increment_number(cls):
        cls.number_of_obj += 1    
        
    @classmethod
    def update_list(cls, name):
        cls.obj_list.append(name)
        
    @classmethod
    def join(cls, new_name):
        return cls(new_name)
    
    @classmethod
    def reset(cls):
        cls.number_of_obj=0 
        cls.obj_list=[]
        

Some points must be noted here:

- every time the '+' operator will be used on objects of the class *Data*, the corresponding \__add\__ method will be invoked;
- such method produces the appropriate name and stores it in the *new_obj_name* (local) variable;
- \__add\__ calls the *class method* *join*, by passing it *new_obj_name* as argument;
- the class method *join* returns (to the *caller* function \__add\__) a new instance of the class: indeed, *cls(new_name)* is equivalent to *Data(new_name)* which, in turn, is just the creation of a new instance of the *Data* class. This also means that the \__init\__ constructor will be called for the newly created object, so that the *number_of_obj* and the *obj_list* will be properly updated;
- lastly, the \__add\__ method returns the new object.

A class method *reset* was also implemented to reset the class variables to their original values. 

Let's see what happens:

In [None]:
x1=Data('x1')
x2=Data('x2')

x3=x1+x2

print("Number of objects: ", Data.number_of_obj)
print("List of object names", Data.obj_list)

Note (and try to explain) the behaviour of the '+' operator in a case like the following:  

In [None]:
Data.reset()

x1=Data('x1')
x2=Data('x2')
x3=Data('x3')

x4=x1+x2+x3

print("Number of objects: ", Data.number_of_obj)
print("List of object names", Data.obj_list)
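
One way to check your reasoning is the reduced sketch below. *Tracer* is a hypothetical class that only records the name of every object created; for brevity it refers to the class by name, which the discussion above advises against in real code. The key fact is that '+' is left-associative, so a chain of sums is evaluated pairwise from the left.

```python
class Tracer:
    """Hypothetical minimal class that records every object created."""
    created = []

    def __init__(self, name):
        self.name = name
        Tracer.created.append(name)   # class name hardcoded for brevity

    def __add__(self, other):
        return Tracer(self.name + '_' + other.name)


t1, t2, t3 = Tracer('x1'), Tracer('x2'), Tracer('x3')

# '+' is left-associative: this is evaluated as (t1 + t2) + t3,
# so an intermediate object is created before the final one
t4 = t1 + t2 + t3

print(t4.name)
print(Tracer.created)
```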