# Programming for Chemistry 2025/2026 @ UniMI

![logo](logo_small.png "Logo")

## Lecture 09: Advanced Object Oriented Programming

In this lecture I will introduce *advanced* topics of OOP.
* The first will be **special methods**, a feature common to other programming languages, that will make our classes act like **lists** or **numbers**.
* The second will be **inheritance/abstract classes** (aka **interfaces**) that are used to organize classes with common methods/behavior in a *class hierarchy* (**"is-a"** relationship).
* The third will be **composition/aggregation**, which is used to represent a **"has-a"** relationship, where one class contains or *has* instances of another class; in the case of aggregation, both can exist independently.

## 1. Special methods
We have actually already encountered a special method: `__init__(self, ...)`. The names of special methods start and end with two underscores (`__`).

They are special because they are called by Python when you create an object, or when you call `len()` of an object, or when you print it, or when you index an object with `[]`, or when you add two objects, just to name a few.

As an example, let's add the possibility:
1. to get the number of atoms simply with `len(molecule)`
2. to access each individual atom as `molecule[i]`:

In [None]:
class Molecule:
    def __init__(self, name=""):
        self.name = name
        self.atoms = []       # list of element symbols (e.g., ["C", "H", "H"])
        self.coordinates = [] # list of (x, y, z) tuples

    def add_atom(self, element, x, y, z):
        self.atoms.append(element)
        self.coordinates.append((float(x), float(y), float(z)))

    #########################################################    
    # instead of: def number_of_atoms(self):
    #########################################################    
    def __len__(self):
        return len(self.atoms)

    #########################################################    
    # access atoms by index
    #########################################################    
    def __getitem__(self, idx):
        return self.atoms[idx], self.coordinates[idx]
    
    def from_xyz(self, filename):
        with open(filename, "r") as f:
            lines = f.readlines()

        natoms = int(lines[0])
        self.name = lines[1].strip()
        
        self.atoms = []
        self.coordinates = []
        for line in lines[2:2+natoms]:
            parts = line.split()
            element, x, y, z = parts[0], parts[1], parts[2], parts[3]
            self.add_atom(element, x, y, z)

    def to_xyz(self, filename):
        """Save molecule to an XYZ file."""
        with open(filename, "w") as f:
            f.write(f"{len(self.atoms)}\n")
            f.write(f"{self.name}\n")
            for atom, (x, y, z) in zip(self.atoms, self.coordinates):
                f.write(f"{atom:2s} {x:15.8f} {y:15.8f} {z:15.8f}\n")

In [None]:
mol = Molecule()
mol.from_xyz('benzene.xyz')
print(len(mol))
print()

for i in range(len(mol)):
    atom, coord = mol[i]
    print(atom, coord)
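
A side effect worth knowing: once `__getitem__` accepts integer indices and raises `IndexError` past the end (here the internal lists do it for us), Python can also iterate directly over the object. The following is a minimal, self-contained sketch; the class is a stripped-down copy of the one above, and the water coordinates are illustrative values typed by hand, not read from a file.

```python
class Molecule:
    """Stripped-down copy of the class above, just enough for this demo."""

    def __init__(self, name=""):
        self.name = name
        self.atoms = []
        self.coordinates = []

    def add_atom(self, element, x, y, z):
        self.atoms.append(element)
        self.coordinates.append((float(x), float(y), float(z)))

    def __len__(self):
        return len(self.atoms)

    def __getitem__(self, idx):
        return self.atoms[idx], self.coordinates[idx]


# build a water molecule by hand (illustrative coordinates, in angstrom)
water = Molecule("water")
water.add_atom("O", 0.000, 0.000, 0.117)
water.add_atom("H", 0.000, 0.757, -0.469)
water.add_atom("H", 0.000, -0.757, -0.469)

# no range(len(...)) needed: Python calls water[0], water[1], ...
# and stops when the underlying list raises IndexError
for atom, coord in water:
    print(atom, coord)

# negative indices work too, because the index is passed on to the lists
last_atom, last_coord = water[-1]
print(last_atom, last_coord)
```

So the loop `for i in range(len(mol))` used above can probably be shortened to `for atom, coord in mol`, without writing any extra method.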

Two other special methods are `__str__`, which is called by `print()` and `str()` to convert the object into a human-readable string, and `__repr__`, which is called by `repr()` to convert the object into a more informative string, useful for debugging.

In addition to that we can think about providing a `+` operator that takes two molecules and places them in a new molecule.

Here is the updated code:

In [None]:
class Molecule:
    def __init__(self, name=""):
        self.name = name
        self.atoms = []       # list of element symbols (e.g., ["C", "H", "H"])
        self.coordinates = [] # list of (x, y, z) tuples

    def add_atom(self, element, x, y, z):
        self.atoms.append(element)
        self.coordinates.append((float(x), float(y), float(z)))

    def __len__(self):
        return len(self.atoms)

    def __getitem__(self, idx):
        return self.atoms[idx], self.coordinates[idx]

    # THIS IS NEW
    def __str__(self):
        return self.name if len(self.name) > 0 else "just a molecule"

    # THIS IS NEW
    def __repr__(self):
        return f'class Molecule: name={self.name} len={len(self.atoms)} atoms={self.atoms}'

    # THIS IS NEW
    def __add__(a, b):
        c = Molecule(f'{a.name} + {b.name}')
        
        for i in range(len(a)):
            atom, coord = a[i]
            c.add_atom(atom, *coord)
        
        for i in range(len(b)):
            atom, coord = b[i]
            c.add_atom(atom, *coord)
        return c

    def translate(self, dx, dy, dz):
        for i in range(len(self)):
            x, y, z = self.coordinates[i]
            self.coordinates[i] = (x+dx, y+dy, z+dz)
            
    def from_xyz(self, filename):
        with open(filename, "r") as f:
            lines = f.readlines()

        natoms = int(lines[0])
        self.name = lines[1].strip()
        
        self.atoms = []
        self.coordinates = []
        for line in lines[2:2+natoms]:
            parts = line.split()
            element, x, y, z = parts[0], parts[1], parts[2], parts[3]
            self.add_atom(element, x, y, z)

    def to_xyz(self, filename):
        """Save molecule to an XYZ file."""
        with open(filename, "w") as f:
            f.write(f"{len(self.atoms)}\n")
            f.write(f"{self.name}\n")
            for atom, (x, y, z) in zip(self.atoms, self.coordinates):
                f.write(f"{atom:2s} {x:15.8f} {y:15.8f} {z:15.8f}\n")

In [None]:
mol1 = Molecule()
mol1.from_xyz('benzene.xyz')

mol2 = Molecule()
mol2.from_xyz('water.xyz')
mol2.translate(0, 0, 2.0)

mol3 = mol1 + mol2
print(mol3)
print(len(mol3))
mol3.to_xyz('benzene+water.xyz')

In [None]:
!jmol benzene+water.xyz

The result should be like this:

![two_molecules](benzene+water.png "benzene+water")
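
To see when Python picks `__str__` and when it picks `__repr__`, here is a small self-contained sketch. The `Atom` class is hypothetical and exists only for this illustration; it is not part of the `Molecule` code above.

```python
class Atom:
    """Hypothetical helper class, used only to illustrate __str__ and __repr__."""

    def __init__(self, symbol, x, y, z):
        self.symbol = symbol
        self.position = (float(x), float(y), float(z))

    def __str__(self):
        # short, human-readable text: used by print() and str()
        return self.symbol

    def __repr__(self):
        # detailed text for debugging: used by repr() and when
        # the object is shown inside a container such as a list
        x, y, z = self.position
        return f"Atom({self.symbol!r}, {x}, {y}, {z})"


carbon = Atom("C", 0, 0, 1.5)
print(carbon)          # uses __str__  -> C
print(repr(carbon))    # uses __repr__ -> Atom('C', 0.0, 0.0, 1.5)
print([carbon])        # a list shows its elements with __repr__
```

A common convention is to make `__repr__` look like the code needed to recreate the object, as done here.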

### 1.1 Variable scopes

**From now on I will not report the code of the `Molecule` class in the notebook; you will find it in the `molecule.py` file. Please open it with a text editor.**

Similarly to local variables in functions, attributes prefixed with `self.` (instance attributes) are **local** to each object. This makes it possible for different `Molecule` objects to hold a different number of atoms.

If you create a variable in the class body, outside the methods, that variable (a **class variable**) is shared by all objects. A typical example is a global counter of the number of molecules you have created. Another example is a constant used internally by the class.

For example:
```python
class Molecule:
    # class variable: shared by all Molecule objects
    number_of_molecules = 0

    def __init__(self, name=""):
        ...
        # every new object goes through __init__, so we count it here
        Molecule.number_of_molecules += 1
```
and:
```python
class Molecule:
    # this is a class variable
    __bohr = 0.52917721

    def bohr_to_angstrom(self):
        """Convert the coordinates from bohr to angstrom"""
        for i in range(len(self)):
            x, y, z = self.coordinates[i]
            self.coordinates[i] = (x*Molecule.__bohr, y*Molecule.__bohr, z*Molecule.__bohr)
```
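
The difference between the two kinds of variables can be checked with a minimal runnable sketch. It assumes that the counter is incremented in `__init__`, which is my choice for this demo, and it leaves out everything else from the real class.

```python
class Molecule:
    # class variable: one copy, shared by all Molecule objects
    number_of_molecules = 0

    def __init__(self, name=""):
        self.name = name      # instance attribute: one copy per object
        self.atoms = []       # instance attribute: one copy per object
        Molecule.number_of_molecules += 1


water = Molecule("water")
benzene = Molecule("benzene")
water.atoms.append("O")

print(Molecule.number_of_molecules)           # shared counter -> 2
print(water.number_of_molecules)              # same value, read through an object -> 2
print(len(water.atoms), len(benzene.atoms))   # each object has its own list -> 1 0
```

Note that the counter is updated as `Molecule.number_of_molecules`, through the class name, so that the single shared copy is modified.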

### 1.2 Class methods

Did you notice that to read a molecule from an XYZ file we first have to **create** an empty molecule, then call `.from_xyz(...)`?

```python
mol = Molecule()
mol.from_xyz('water.xyz')
```

We can instead use a **class method**, which creates a `Molecule` object on the fly, as an alternative to calling `__init__(self)` directly. Here is the piece of code:

```python
class Molecule:
    ...
    @classmethod
    def from_xyz(cls, filename):
        """Create a molecule from an XYZ file."""
        with open(filename, "r") as f:
            lines = f.readlines()

        natoms = int(lines[0])
        name = lines[1].strip()
        
        mol = cls(name)                  # calls Molecule.__init__(self, name)
        for line in lines[2:2+natoms]:
            parts = line.split()
            element, x, y, z = parts[0], parts[1], parts[2], parts[3]
            mol.add_atom(element, x, y, z)
        return mol
```
A method marked with the special **decorator** `@classmethod` does not receive `self` as its first argument, because it belongs to the class, not to an object. Instead, it receives the **class itself** (by convention called `cls`) as its first argument.
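
To make the role of `cls` more concrete, here is a self-contained sketch that does not need any file on disk. The method `from_xyz_string` and the subclass `Ligand` are hypothetical names introduced only for this illustration, and the CO geometry is an illustrative value.

```python
class Molecule:
    def __init__(self, name=""):
        self.name = name
        self.atoms = []
        self.coordinates = []

    def add_atom(self, element, x, y, z):
        self.atoms.append(element)
        self.coordinates.append((float(x), float(y), float(z)))

    @classmethod
    def from_xyz_string(cls, text):
        """Hypothetical variant of from_xyz that parses a string, not a file."""
        lines = text.strip().splitlines()
        natoms = int(lines[0])
        mol = cls(lines[1].strip())      # cls is the class the method was called on
        for line in lines[2:2 + natoms]:
            element, x, y, z = line.split()[:4]
            mol.add_atom(element, x, y, z)
        return mol


class Ligand(Molecule):
    """Hypothetical subclass, only to show what cls refers to."""


xyz = """
2
carbon monoxide
C 0.0 0.0 0.0
O 0.0 0.0 1.128
"""

co = Molecule.from_xyz_string(xyz)
print(type(co).__name__, co.name, co.atoms)

# called on the subclass, cls is Ligand, so we get a Ligand object
ligand = Ligand.from_xyz_string(xyz)
print(type(ligand).__name__)
```

This is the reason for writing `cls(name)` instead of `Molecule(name)` inside the method: subclasses get the alternative constructor for free.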

In [None]:
from molecule import Molecule

benzene = Molecule.from_xyz('benzene.xyz')
print('in angstrom:')
print(benzene.get_distance(0,1))

print('in bohr units:')
benzene.angstrom_to_bohr()
print(benzene.get_distance(0,1))

One final note: unlike other languages, Python does not have an explicit keyword to mark a class variable or a method as **private**. Instead, define it with two leading underscores (`__`): Python then renames it internally (*name mangling*), so that it cannot be reached with its plain name from outside the class. In the following example, we'll see that `Molecule.__bohr` is not accessible.

In [None]:
print(benzene.__bohr)
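
The cell above is expected to stop with an `AttributeError`. The following self-contained sketch shows the same behavior without interrupting the notebook. The method `bohr_in_angstrom` is a hypothetical helper added only for this demo.

```python
class Molecule:
    __bohr = 0.52917721   # "private" class variable

    def bohr_in_angstrom(self):
        # inside the class the name works exactly as written
        return Molecule.__bohr


mol = Molecule()
print(mol.bohr_in_angstrom())

try:
    print(mol.__bohr)
except AttributeError as err:
    print("AttributeError:", err)

# the variable is only renamed to _Molecule__bohr, not really hidden
print(mol._Molecule__bohr)
```

So the two underscores protect against accidental use rather than enforce privacy: the last line still works, but it makes clear that you are bypassing the intention of the class author.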

## 2. Inheritance and abstract classes

**Inheritance** is a mechanism that allows a new class (the **subclass** or **child class**) to adopt the attributes and methods of an existing class (the **superclass** or **parent class**). This is often described as an **"is-a"** relationship (e.g., a "Car is a Vehicle").

Suppose our car dealer is also going to sell motorbikes and airplanes. Making a separate class for each vehicle leads to code duplication:

In [None]:
class Car:
    def __init__(self, maker, model, year, price, sold=False):
        self.maker = maker
        self.model = model
        self.year = year
        self.price = price
        self.sold = sold
    
    def display(self):
        return f"Car: {self.maker} {self.model} {self.year} price:{self.price} sold:{self.sold}"
    
    def is_sold(self):
        return self.sold
        

class MotorBike:
    def __init__(self, maker, model, year, price, racing=True, sold=False):
        self.maker = maker
        self.model = model
        self.year = year
        self.price = price
        self.sold = sold
        self.racing = racing

    def display(self):
        return f"Motorbike: {self.maker} {self.model} {self.year} price:{self.price} racing:{self.racing} sold:{self.sold}"
        
    def is_sold(self):
        return self.sold


class Airplane:
    # ... wingspan, ... propellers
    pass

Why not create a common class `Vehicle` that contains the common features and use it as a parent class?

In [None]:
class Vehicle:
    def __init__(self, maker, model, year, price, sold=False):
        self.maker = maker
        self.model = model
        self.year = year
        self.price = price
        self.sold = sold
        
    def display(self):
        return f"{self.maker} {self.model} {self.year} price:{self.price} sold:{self.sold}"

    
class Car(Vehicle):
    def __init__(self, maker, model, year, price, sold=False, electric=False):
        super().__init__(maker, model, year, price, sold)
        self.electric = electric
        
    def display(self):
        return f"This is car: {super().display()} electric={self.electric}"

    
class Motorbike(Vehicle):
    def __init__(self, maker, model, year, price, sold=False, racing=False):
        super().__init__(maker, model, year, price, sold)
        self.racing = racing
        
    def display(self):
        return f"This is motorbike: {super().display()} racing={self.racing}"

In [None]:
v = Vehicle('Honda', 'Civic', 1995, 12000)
c = Car('Ford', 'Mustang', 1990, 40000, True)
m = Motorbike('Ducati', 'Desmosedici', 2007, 25000, False, racing=True)

print(c.display())
print(m.display())
print(v.display())                             # what is this?

The problem is that `Vehicle` is too generic and should never be instantiated as an object. This is because `Vehicle` is an **abstract** class, which just defines a scaffold or **interface** for the other classes. Python has a special module, `abc`, to make `Vehicle` an **Abstract Base Class**. If you decorate a method with `@abstractmethod`, every derived class must implement that method, otherwise it cannot be instantiated.

In [None]:
from abc import ABC, abstractmethod

class Vehicle(ABC):
    def __init__(self, maker, model, year, price, sold=False):
        self.maker = maker
        self.model = model
        self.year = year
        self.price = price
        self.sold = sold

    @abstractmethod
    def display(self):
        return f"{self.maker} {self.model} {self.year} price:{self.price} sold:{self.sold}"

    
class Car(Vehicle):
    def __init__(self, maker, model, year, price, sold=False, electric=False):
        super().__init__(maker, model, year, price, sold)
        self.electric = electric
        
    def display(self):
        return f"This is car: {super().display()} electric={self.electric}"

    
class Motorbike(Vehicle):
    def __init__(self, maker, model, year, price, sold=False, racing=False):
        super().__init__(maker, model, year, price, sold)
        self.racing = racing
        
    def display(self):
        return f"This is motorbike: {super().display()} racing={self.racing}"

    
class Airplane(Vehicle):
    def __init__(self, maker, model, year, price, wingspan, sold=False):
        super().__init__(maker, model, year, price, sold)
        self.wingspan = wingspan

    # no display method

In [None]:
c = Car('Ford', 'Mustang', 1990, 40000, True)
m = Motorbike('Ducati', 'Desmosedici', 2007, 25000, False, racing=True)

print(c.display())
print(m.display())

In [None]:
# ERROR: you can't create a Vehicle object
v = Vehicle('Honda', 'Civic', 1995, 12000)
print(v.display())

In [None]:
# ERROR: you can't create an Airplane object because it doesn't have its own display() method
a = Airplane('Airbus', 'A330', 2002, 264_000_000, sold=False, wingspan=60.3)
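
The two errors above can also be caught and printed, so that the notebook keeps running. This is a reduced, self-contained sketch: the classes have fewer attributes than the ones above, to keep the example short.

```python
from abc import ABC, abstractmethod


class Vehicle(ABC):
    def __init__(self, maker, model):
        self.maker = maker
        self.model = model

    @abstractmethod
    def display(self):
        return f"{self.maker} {self.model}"


class Car(Vehicle):
    def display(self):
        return f"This is car: {super().display()}"


class Airplane(Vehicle):
    pass   # display() is not implemented, so Airplane stays abstract


car = Car("Ford", "Mustang")
print(car.display())
print(isinstance(car, Vehicle))   # the "is-a" relationship -> True

# both abstract classes refuse to be instantiated
for cls in (Vehicle, Airplane):
    try:
        cls("Airbus", "A330")
    except TypeError as err:
        print(f"{cls.__name__}: {err}")
```

Note that an abstract method may still contain code, and a subclass can reuse it through `super().display()`, as `Car` does here.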

### Exercise (no coding!)
Design a hierarchy of classes with applications in chemistry. For instance there could be `Molecule` at the top, from which you derive `OrganicMolecule`, `Alcohol`, `Ketone`, `Acid`, `Base`, etc., and one could apply rules of chemical reactivity according to the kind of molecule.

Other examples in the fields of physics, chemical physics, biology are welcome.  

## 3. Composition and aggregation
**Composition** and **Aggregation** are two specific types of relationships in Object-Oriented Programming (OOP) that model how objects of one class can contain or use objects of another class, often summarized as "**has-a**" relationships. The key difference lies in the **dependency** and **lifecycles** of the contained objects.

* Composition represents a **strong** "Part-of" or "Contains-A" relationship, where the contained object **cannot exist independently** of the containing object.
* Aggregation represents a **weak** "Has-A" relationship, where the contained object **can exist independently** of the containing object.

For example, to represent a crystal structure we can create a `Crystal` class that contains a `Cell` object and a `Structure` object. The `Structure` class will be made of a list of `Atom` objects, similarly to the `Molecule` class we have developed.

```python
class Cell:
    def __init__(self, lattice, angles):
        ...
    @classmethod
    def from_bravais(cls, bravais, *args):
        if bravais == 'sc' or bravais == 'simple cubic':
            ...
        elif bravais == 'fcc':
            ...

class Atom:
    def __init__(self, symbol, coordinates):
        ...

class Structure:
    def __init__(self, atoms):
        self.atoms = atoms

class Crystal:
    def __init__(self, structure, cell):
        self.structure = structure
        self.cell = cell
```

This makes it possible to write very compact code like this:
```python
cry = Crystal.from_POSCAR('CrSb2.vasp')
cry.symmetrize(tol=1e-4)
cry.to_CIF('CrSb2.cif')
```
Without aggregation you would have to write:
```python
cell, atoms, coordinates = crystal.read_POSCAR('CrSb2.vasp')
newcell, newatoms, newcoordinates = symmetry_library.symmetrize(cell, atoms, coordinates, tol=1e-4)
crystal.to_CIF(newcell, newatoms, newcoordinates)
```
**Which one do you prefer?**
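
The difference in lifecycle between composition and aggregation described above can be sketched as follows. `CubicCrystal` is a hypothetical class written only for this illustration, and the lattice parameter is an illustrative number.

```python
class Atom:
    def __init__(self, symbol, coordinates):
        self.symbol = symbol
        self.coordinates = coordinates


class Cell:
    def __init__(self, lattice, angles):
        self.lattice = lattice
        self.angles = angles


class Structure:
    """Aggregation: the atoms are created outside and only referenced here."""

    def __init__(self, atoms):
        self.atoms = atoms


class CubicCrystal:
    """Composition (hypothetical class): the cell is created inside the crystal."""

    def __init__(self, structure, a):
        self.structure = structure
        self.cell = Cell((a, a, a), (90.0, 90.0, 90.0))


po = Atom("Po", (0.0, 0.0, 0.0))
crystal = CubicCrystal(Structure([po]), a=3.35)   # illustrative value, in angstrom
print(crystal.cell.lattice, crystal.structure.atoms[0].symbol)

# the cell was created by the crystal and nothing else refers to it,
# so it goes away together with the crystal...
del crystal

# ...while the atom was created outside and is still usable
print(po.symbol)
```

In Python the distinction is mostly a matter of design: what matters is who creates the contained object and who else keeps a reference to it.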

### Exercise (no coding!)
Design a set of classes with applications in chemistry using aggregation/composition. For instance you can describe an **NMR** experiment with a class `NMRExperiment` which aggregates an `NMRMachine(larmor_frequency)` and a `ListOfNuclei`, each one with its own chemical shift and J-couplings, then an `NMRPulseSequence('Carr-Purcell')`, then `NMRExperiment.get_fid()`, `NMRExperiment.FFT()`, ...

Other examples in the fields of physics, chemical physics, biology are welcome.  