# Introduction to object-oriented programming

Clyde Fare and João Pedro Malhado, Imperial College London (contact: [python@imperial.ac.uk](mailto:python@imperial.ac.uk))

Notebook is licensed under a [Creative Commons Attribution 4.0 (CC-by) license](http://creativecommons.org/licenses/by/4.0/).

## Outline

This workshop follows up on the earlier [Introduction to Computer Programming](https://github.com/imperialchem/python-prog-intro) course, by exploring some of the more advanced programming techniques made available in Python. Namely we will be introducing the basic ideas of object-oriented programming, which is a powerful and popular technique to organize code for very large programs.

In later parts of the course we will be using these techniques to implement computer simulations of chemically relevant phenomena, and extract results from these.

## A digression on programming styles

Programming consists of composing sequences of instructions in order to solve some task. For this we use a language, a programming language, to express our reasoning in a form a machine (computer) can interpret. Most programming languages are quite rich, and allow people to express themselves in different ways, and achieve the same result writing different programs. This is much like expressing verbally the same idea using different words.

There are however different styles or types of logical approaches to solve the same problem, and different languages rely on, or are more suited for, different programming paradigms. To illustrate this we will consider several ways in which one could write a program to produce a list of the square root of all integers from 1 to 10.

### Procedural or imperative programming

This is probably the most common style of programming, and the one we covered in the previous course. Since you are already familiar with it, we ask you to provide the example yourself: write a program that produces a list of the square roots of all integers from 1 to 10.

Let us have a closer look at the program you wrote. You probably started by initializing a variable and then used a loop to update the value of this variable in order to achieve the result. The use of variables to store results and the use of loops are the hallmark of the procedural programming style.
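
A program along these lines might look like the following minimal sketch. The variable names `roots` and `n` are our own choice, and your version may well differ.

```python
# Start from an empty list and fill it inside a loop.
roots = []
for n in range(1, 11):        # integers 1, 2, ..., 10
    roots.append(n ** 0.5)    # update the variable at each iteration

print(roots)
```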

Procedural programs are structured as instructions that will modify the value of one or more variables in a sequence that very often will be wrapped in a loop.

One can encapsulate these instructions in functions that help in the organization of the program, but when looking inside them the basic logic will be sequences of instructions changing the value of variables and loops.

The inner workings of contemporary computers most closely resemble the procedural paradigm, so it may be no surprise that the most used programming languages enforce or strongly favour a procedural programming style. Some famous examples are [C](https://en.wikipedia.org/wiki/C_(programming_language)), [Fortran](https://en.wikipedia.org/wiki/Fortran) and [BASIC](https://en.wikipedia.org/wiki/BASIC).

### Functional programming

In this type of programming, functions are not just a way to encapsulate instructions: they (rather than variable values) form the central logic of the program. One of the characteristics of this programming style is that functions don't have to receive just data (e.g. an integer or a list) as input; they can also receive other functions.

Python is not a particularly good language to do functional programming, but it has some of its features. We could write our example program as:

    def sqroot(n):
        return n**0.5

    list(map(sqroot,range(1,11)))

Note that in the above code there are no variables and there are no loops! We define the function *sqroot()*, and use Python's function *map()* to apply it to all the elements resulting from the execution of the function *range()*. *list()* here is not so important and is there so that we can visualize the output as a list.

Note that the function map() is receiving the function sqroot() itself as an argument, not some result of sqroot() operation on some data!
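
To make this point concrete, here is a small sketch in which we write our own function that takes another function as input. The name `apply_to_all()` is hypothetical and not part of Python; it simply mimics what `map()` does for us.

```python
def sqroot(n):
    return n ** 0.5

def apply_to_all(func, values):
    'Return a list with func applied to every element of values'
    return [func(v) for v in values]

# The function object sqroot is passed in, not the result of calling it
result = apply_to_all(sqroot, range(1, 11))
same = list(map(sqroot, range(1, 11)))

print(result == same)
```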

Functional programming is in fashion, with languages like [Haskell](https://en.wikipedia.org/wiki/Haskell_(programming_language)) gaining popularity next to classics like [Lisp](https://en.wikipedia.org/wiki/Lisp_(programming_language)).

### Recursive programming

In this programming style, the function calls back on itself on a "simpler" problem. A solution to our example, where *l* is the list of integers would be:

    def rec_list_sqrt(l):
        if l==[]:
            return []
        else:
            return rec_list_sqrt(l[:-1])+[l[-1]**0.5]
            
    rec_list_sqrt(list(range(1,11)))

Let us analyse how the function *rec_list_sqrt()* works. We have an *if* statement that branches the code in two different paths. The first path simply states that the function should return an empty list if the input is an empty list. This could seem like a pedantic detail but it is actually crucial for how the function will work. This code path provides a well defined immediate result, termed *base result*.

The other branch of the if statement calls the function rec_list_sqrt() on a new list containing all the elements of the initial input list except the last, and appends the list of a single component with the square root of the last element of the original input. This branch is called *recursion* and it is building our result.

It is perhaps helpful to see the workings of rec_list_sqrt() on a small list:

    rec_list_sqrt([1,2,3])
    rec_list_sqrt([1,2])+[3**0.5]
    (rec_list_sqrt([1])+[2**0.5])+[3**0.5]
    ((rec_list_sqrt([])+[1**0.5])+[2**0.5])+[3**0.5]
    (([]+[1**0.5])+[2**0.5])+[3**0.5]
    ([1**0.5]+[2**0.5])+[3**0.5]
    [1**0.5,2**0.5]+[3**0.5]
    [1**0.5,2**0.5,3**0.5]
    
Up to half way in the trace above, the code is following the recursion path until it reaches the base result, when the recursion stops and the final result is reconstructed. It is crucial that the recursion step will eventually converge to the base result, otherwise the function will keep on recurring in a form of infinite loop.

Although a little awkward at first for people used to other styles of programming, recursive programming is actually quite simple to understand. It does, however, usually generate slower code. Many (but not all) programming languages allow a function to be called within its own definition, which makes recursive programming possible.

Do you want to try to write a recursive function to calculate the factorial of a number?
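
If you want to compare notes afterwards, one possible sketch is shown below. The name `rec_factorial()` is our own, and the function assumes a non-negative integer as input: for a negative number the recursion would never reach the base result.

```python
def rec_factorial(n):
    'Return n! for a non-negative integer n'
    if n == 0:                 # base result: 0! = 1
        return 1
    else:                      # recursion on a "simpler" problem
        return n * rec_factorial(n - 1)

print(rec_factorial(5))
```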

### Object-oriented programming

Another programming style, which affects more the higher level structure of the code than the lower level logic, is object-oriented programming. In this case abstract objects, with specific properties and specific functions to act on them, are defined and it is these objects which play a key role in how some algorithm is written.

Many modern programming languages allow an object-oriented approach to programming. Some languages where this feature is quite prominent are [C++](https://en.wikipedia.org/wiki/C%2B%2B), [Java](https://en.wikipedia.org/wiki/Java_(programming_language)), [Ada](https://en.wikipedia.org/wiki/Ada_(programming_language)) and ... Python.

In this course we are going to introduce some of the object-oriented features of Python and explore how they can be useful in writing bigger computer programs.

## A known problem revisited

In the [previous course](https://github.com/imperialchem/python-prog-intro/blob/v2.1/prog_workshop4/prog_workshop4.ipynb) we looked at the problem of reading the .xyz file of a triatomic molecule and determining its bond angle. We will now revisit that problem to get started and see how to build on it.

In the original version of the problem we needed to read the file containing the molecule structure. Here we get a bit of a head start and define the structure as a string.

    H2S='''S                  0.00000000    0.00000000    0.10224900
    H                  0.00000000    0.96805900   -0.81799200
    H                  0.00000000   -0.96805900   -0.81799200'''

<img src="molecule_angle.svg" style="width:400px;height:367px" />

We are interested in determining the bond angle and the bond length of the H<sub>2</sub>S structure. We saw how using Numpy arrays could greatly simplify the problem, but for the sake of the argument and the exercise we will now use lists to solve the same problem.

*This part of the workshop should not take too long, so if you find yourself stuck, do ask for help!*

The first step could be to convert the string H2S into a list we will call *H2S_xyz*, which has the form:

    [['S',x_coord,y_coord,z_coord],
     ['H',x_coord,y_coord,z_coord],
     ['H',x_coord,y_coord,z_coord]]
     
To parse the string we could use the string methods [.splitlines()](https://docs.python.org/3/library/stdtypes.html#str.splitlines) and/or [.split()](https://docs.python.org/3/library/stdtypes.html#str.split) with a loop construction in order to achieve this. Note that the element symbols in the list above will be strings, but the atom coordinates should be converted to floating point numbers.
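
One possible way to carry out this parsing step is sketched below. The loop variable names are our own choice, and the string `H2S` is repeated so that the example runs on its own.

```python
H2S = '''S                  0.00000000    0.00000000    0.10224900
H                  0.00000000    0.96805900   -0.81799200
H                  0.00000000   -0.96805900   -0.81799200'''

H2S_xyz = []
for line in H2S.splitlines():
    fields = line.split()                      # split the line on whitespace
    symbol = fields[0]                         # element symbol stays a string
    coords = [float(c) for c in fields[1:]]    # coordinates become floats
    H2S_xyz.append([symbol] + coords)

print(H2S_xyz)
```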

In solving this problem it is useful to think of the chemical bonds as vectors. It is thus useful to define a function *bond()*, which receives 2 lists with the coordinates of 2 atoms, and returns one list with the coordinates of the vector defined by their position.

We will now define a few functions that implement some useful vector operations.

The first of these is the function *length()*, which receives a list that defines a vector and returns the length of this vector.

The second is the function *inner_prod()*, which receives 2 lists as arguments and returns the inner product (or dot product) between them.

Using the functions defined above, we can now define a new function *bond_length()*, that should calculate a bond length by receiving 3 arguments: the first argument should be a list of the form of *H2S_xyz* defining the structure of the molecule, followed by 2 integer arguments specifying the atoms between which we want to calculate the bond distance (defined by their position on the list). For example, if we wanted to calculate the bond distance between the S atom and the H atom on the left of the picture, we should execute

    bond_length(H2S_xyz,0,2)

Similarly, we want to define a function *bond_angle()* to calculate the angle between 2 bonds, by receiving the list defining the structure of the molecule and the 3 integers specifying the atoms between which the bonds are formed. Apart from the functions operating on vectors defined above, you will also need the function *acos()* from the math or the numpy modules.
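
Putting the pieces together, the functions described above could be sketched as follows. This is only one of many possible implementations. Two choices here are assumptions of ours rather than requirements of the exercise: `bond_angle()` takes the second of its three indices as the atom at the apex of the angle, and it returns the angle in degrees.

```python
from math import acos, degrees

H2S_xyz = [['S', 0.0,  0.0,       0.102249],
           ['H', 0.0,  0.968059, -0.817992],
           ['H', 0.0, -0.968059, -0.817992]]

def bond(a, b):
    'Vector pointing from the point a to the point b'
    return [b[0] - a[0], b[1] - a[1], b[2] - a[2]]

def inner_prod(u, v):
    'Inner (dot) product of the vectors u and v'
    return u[0]*v[0] + u[1]*v[1] + u[2]*v[2]

def length(v):
    'Length of the vector v'
    return inner_prod(v, v) ** 0.5

def bond_length(mol, i, j):
    'Distance between atoms i and j of the structure mol'
    # mol[i][1:] drops the element symbol and keeps the coordinates
    return length(bond(mol[i][1:], mol[j][1:]))

def bond_angle(mol, i, j, k):
    'Angle i-j-k in degrees; atom j is assumed to be at the apex'
    u = bond(mol[j][1:], mol[i][1:])
    v = bond(mol[j][1:], mol[k][1:])
    return degrees(acos(inner_prod(u, v) / (length(u) * length(v))))

print(bond_length(H2S_xyz, 0, 2))
print(bond_angle(H2S_xyz, 1, 0, 2))
```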

We have followed a rather structured approach to the solution of the problem (this is a good thing):

* we represented the structure of the molecule by the list H2S_xyz;
* abstracted the concept of chemical bond and implemented these as vectors;
* implemented the vectors as Python lists;
* implemented basic operations on vectors as the functions length() and inner_prod();
* using these functions we built the higher level functions to solve our problem.

This is a fairly good implementation, general enough that we can apply it to any other molecule by defining a new list describing its structure.

It does however have some fragilities. What happens if you try to run

    length(['1','0','0'])

As you could have guessed, you obtain an error because, although ['1','0','0'] is a perfectly valid Python list, it does not have the properties of a vector.

Also if you run

    H2S_mol=[[0.,0.,0.102249,'S'],[0.,0.968059,-0.817992,'H'],[0.,-0.968059,-0.817992,'H']]
    
    bond_length(H2S_mol,0,2)

you will see your code breaking, although *H2S_mol* is a fine list and contains information about the structure of the molecule in an equivalent way to *H2S_xyz*.
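
The sketch below reproduces both failures with a compact version of *length()* and *bond_length()* (an assumed implementation; yours may differ in detail). In both cases Python only complains deep inside the arithmetic, with a `TypeError` that says nothing about vectors or molecules.

```python
def length(v):
    'Length of the vector v'
    return (v[0]*v[0] + v[1]*v[1] + v[2]*v[2]) ** 0.5

def bond_length(mol, i, j):
    'Distance between atoms i and j; assumes the symbol comes first'
    a, b = mol[i][1:], mol[j][1:]
    return length([b[0] - a[0], b[1] - a[1], b[2] - a[2]])

H2S_mol = [[0., 0., 0.102249, 'S'],
           [0., 0.968059, -0.817992, 'H'],
           [0., -0.968059, -0.817992, 'H']]

errors = []
for call in (lambda: length(['1', '0', '0']),
             lambda: bond_length(H2S_mol, 0, 2)):
    try:
        call()
    except TypeError as err:
        errors.append(str(err))   # keep the message for inspection

print(errors)
```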

It can be said that the above examples are just programming errors, and indeed they are! We are, however, looking for a way to minimize programming errors in larger codes, where a small mistake in one line can be hard to find and render the entire code useless. We also want to reduce the complexity associated with manipulating molecules: it would be easier if we didn't have to remember that the symbol must come first, followed by the coordinates.

Really we want to be able to define a vector, not a list, that really behaves like a vector, and has the operations we usually associate with vectors. And we want to define a molecular structure in some way, and be able to define operations that are specific to molecular structures (and completely inappropriate for lists of numbers for example).

We would like to be able to define abstract **classes** of **objects**, that have specific **attributes** and specific **methods** associated with them. When a programming language allows for this, it allows for *object-oriented programming*. We will see how to do this presently.

## Why use object-oriented programming?

Even relatively simple code such as that written above can span several dozen lines. You may start to appreciate that more sophisticated programs can quickly reach thousands and even millions of lines of code, with a number of variables of a similar order of magnitude, and with complexity and readability quickly getting out of hand.

The figure below can give you a feeling of how big the code base of current software can be. Note there is some scientific software included in the list.

<img src="1276_Codebases.png" />

Given large code bases with 10<sup>4</sup> - 10<sup>8</sup> lines of code the most important problem is writing code that can be easily modified, understood and maintained. 

Object-oriented programming is a way of organising data and functions into reusable modular units or 'objects'.

The benefits are:

* Code reuse and recycling
* Encapsulation (different parts of the code not interfering with each other)
* Easier design
* Easier modification

Some drawbacks are:
    
* larger programs
* some reduced performance
* some learning curve

## Defining classes of objects

In order to specify the characteristics we want certain objects to have, we need to define a **class** of objects. This is done via the class statement

    class ClassName:
        'A docstring describing the objects of the class'
        
followed by the indented class content. By convention, class names are usually capitalized, and function names written in lower case. (For the type of functionality that we are covering, we don't need to specify any arguments in the class definition. This is used for some more advanced functionality).

Like with functions, we should provide a documentation string describing the class and how to use it.

Although not mandatory, the first thing to define after the class name is usually a function with a strange name '\_\_init\_\_'. Actually, functions defined inside classes are called **methods**, and we will soon see how they work.

    class ClassName:
        'A docstring describing the elements of the class'
        
        def __init__(self, arg1, arg2, arg3):
            'Docstring for the method'
            
            #code to initialise the class

The \_\_init\_\_ **method** is run whenever a new object of the class is generated. We call this process initializing an **instance** of the class.

Inside the class definition, the name *self* is used to refer to the object being acted on (more on this below). The first argument of the \_\_init\_\_ method is always *self*, followed by the other arguments needed to define the object.

At the moment the discussion is quite abstract, so we will illustrate what this means by defining a Vector class. This will allow us to create variables that represent 3D vectors, and methods that operate on them. At a bare minimum, our vector objects should possess three **attributes**, x, y and z, representing the components of the vector.

    class Vector:   # <--- class definition
        'Implements 3D vectors and their behaviour'
        
        def __init__(self, i1,i2,i3):  #<--- initialisation method
            '''Initializes new vector objects by setting the values
               of their 3 components'''
            self.x = i1
            self.y = i2
            self.z = i3

We have just defined our Vector class, and can now create Vector objects. We do this by using the class name, and passing to it the arguments of the \_\_init\_\_ method except *self*. In this case we need to provide 3 numbers to define a vector.

    vec1 = Vector(1,2,3)

The variable vec1 is now an object of the class Vector, or in other words, an **instance** of the class Vector.

**object** = **instance of the class**

We can confirm this by checking its type

    type(vec1)

We defined Vector to be initialised with 3 values, and all that the \_\_init\_\_ method above does is to assign these values to 3 variables (in a slightly different way). Variables defined inside a class in this way are called **attributes**. In our Vector class we have defined three attributes *x*, *y* and *z*.

We can access the attributes of a given object by its name followed by a dot (.) followed by the name of the attribute (or method).
So to access the x attribute we can type:

    vec1.x

In Python, the value of attributes of a given object can be changed just as one would do for a variable assignment

    vec1.x=0
    vec1.x

When working in the notebook, we can see what attributes (and methods) are associated with a given object by pressing the tab key after the dot

    vec1.

Once we have defined our class we can create as many objects as we want, i.e. instances of the Vector class.

    vec2=Vector(-1,0,1)
    vec3=Vector(1,1,4)

and access their attributes

    [vec1.x,vec2.y,vec3.z]

We have managed to generate our Vectors, but apart from the fact that they have 3 distinct attributes, they don't look much like vectors for now. Namely, they don't have any operations associated with them. If we try to multiply a vector by a number, the operation will fail because we haven't yet defined what the result of this operation should be.

    2.5*vec2

We shall address this immediately by defining a method for scaling the vector, that is, multiplying the vector by a scalar number. We add functionality to our objects by refining our class definition, in this case by adding a new method to it.

    class Vector:   # <--- class definition
        'Implements 3D vectors and their behaviour'
        
        def __init__(self, i1,i2,i3):  #<--- initialisation method
            '''Initializes new vector objects by setting the values
               of their 3 components'''
            self.x = i1   #<--- initialisation of attributes
            self.y = i2
            self.z = i3
            
        def scale(self,a): #<--- scale method
            'Multiplies the vector by a scalar a'
            self.x=self.x*a
            self.y=self.y*a
            self.z=self.z*a 

We define the scale method just like we define a function, and in this case we need to pass as arguments *self* and the scaling factor *a*. **All methods receive *self* as their first argument** (it is the first argument to \_\_init\_\_ too). *self* is what we call the object from inside the class. That is, when we are outside our class definition we create an instance and assign it to some variable (above we used vec1, but of course we could pick any name we like), then we access attributes and methods through that variable name, as we have seen. When we are defining methods, we need some way to refer to the object before it has been created. This is the role of the *self* argument. (Strictly speaking, *self* is not a Python keyword but a naming convention that is followed almost universally.)

When we call scale, we want to update the attributes of the vector, such that each component is multiplied by *a*. To access the x,y and z attributes we thus use self.x, self.y and self.z, and update their value.

Since we have updated the class, we need to create new objects with this updated functionality. We can then see the new method by pressing the tab key after the dot

    vec1=Vector(1,2,3)

    vec1.

Let us test the new method. Note that we only need to pass the argument *a* when calling it.

    vec1.scale(2.5)
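
The reason is that Python fills in *self* for us. As a sketch of what happens behind the scenes, the two calls below are equivalent: in the second one the method is looked up on the class and the object is passed explicitly as the first argument. The class is repeated so that the example runs on its own.

```python
class Vector:
    'Implements 3D vectors and their behaviour'

    def __init__(self, i1, i2, i3):
        self.x = i1
        self.y = i2
        self.z = i3

    def scale(self, a):
        'Multiplies the vector by a scalar a'
        self.x = self.x * a
        self.y = self.y * a
        self.z = self.z * a

vec1 = Vector(1, 2, 3)
vec2 = Vector(1, 2, 3)

vec1.scale(2.5)           # usual form: vec1 is passed as self automatically
Vector.scale(vec2, 2.5)   # explicit form: vec2 plays the role of self

print((vec1.x, vec1.y, vec1.z), (vec2.x, vec2.y, vec2.z))
```

Note also that scale does not return anything: it modifies the attributes of the object it is called on.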

We have now covered the basic functionality of how to create abstract objects in Python. There are always other things to learn, but that's basically it for how classes work. It is worth recapping what we covered so far:
    
* Classes are a means to combine data (attributes) and functionality (methods). 

* We define them using the *class* keyword followed by the name of our class. Within the class definition we usually include an \_\_init\_\_ method, which is used to initialise instances of the class.

* We create objects (or instances) of the class by typing the name of our class followed by round brackets enclosing the data we want our instance to be initialised with.

* Once we have created an object and set it to a variable we can access attributes and methods of the object using the variable name followed by a dot (.) then the name of the attribute or method.

*This is the point to ask if you don't understand what's going on!*

### Enriching the Vector class

We will now add several methods that will make our objects behave more like vectors. In particular, we will implement methods that define the *dot* product between vectors, the *norm* (or length) of a vector and the sum of two vectors, and along the way we will see how to *copy* a vector. In doing this we will be illustrating further aspects of building classes of objects.

#### dot

We are interested in implementing a method, which we will call *dot*, to calculate the dot product between 2 vectors. Here's how we want *dot* to behave:

    u = Vector(1,2,3)
    v = Vector(4,5,6)
    u.dot(v) == 32
    
So we are going to create two instances of Vector, u and v, and then call the dot method of u passing in v as the argument to the method. The method will <span style="text-decoration:underline">return</span> a number equal to the dot product of the two vectors. 

We will be updating the class Vector often, so it is more practical to copy it to a separate file and edit it in a separate window. You can use any regular text editor (such as Notepad++ on Windows) and save the file with the extension .py, or you can use the IDE Spyder that is distributed with Anaconda.

    class Vector:
        'Implements 3D vectors and their behaviour'

        def __init__(self, i1,i2,i3):
            '''Initializes new vector objects by setting the values
               of their 3 components'''
            self.x = i1
            self.y = i2
            self.z = i3

        def scale(self,a):
            'Multiplies the vector by a scalar a'
            self.x=self.x*a
            self.y=self.y*a
            self.z=self.z*a

In normal circumstances, we would now import the module we have created that contains the Vector class, using the usual Python import syntax:

    from your_class_file import Vector

However, the module will not be reloaded when we modify the file, so we will use the following syntax to run our external file **each time we update our class**

    %run your_class_file

<blockquote>**Note on import and %run**:
The above statement may seem to suggest that *import* and *%run* are equivalent. This is not the case. First of all %run is not even part of the Python language, it rather is an IPython command that simply executes an external file. When writing your own code in the future, you will almost always want to use *import*. We will be using %run here for convenience, since we will be updating our class often, and once we have imported a module it is not easy to re-import without restarting the kernel.</blockquote>

Now add the *dot* method to the Vector class in your file, and check it behaves as intended (remember that the object needs to be recreated when new functionality is added to the class)

#### norm

Now that we have the *dot* method, we can add to the class another method to return the vector norm. (Don't forget to %run your modified file)
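
For comparison with your own solution, the two methods might be sketched as below. Only the parts of the class needed for the example are repeated, and the argument name `other` is our own choice.

```python
class Vector:
    'Implements 3D vectors and their behaviour'

    def __init__(self, i1, i2, i3):
        self.x = i1
        self.y = i2
        self.z = i3

    def dot(self, other):
        'Returns the dot product between this vector and other'
        return self.x*other.x + self.y*other.y + self.z*other.z

    def norm(self):
        'Returns the norm (length) of the vector'
        # the norm is the square root of the dot product with itself
        return self.dot(self) ** 0.5

u = Vector(1, 2, 3)
v = Vector(4, 5, 6)

print(u.dot(v) == 32)
print(Vector(3, 4, 0).norm())
```

Here norm is written in terms of dot, so the arithmetic on the components is only spelled out once.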

#### vector addition

In defining the operation corresponding to the addition of two vectors, we will take a few detours on the way.

We will first define a *combine* method. Here's how we want it to behave:
        
    u = Vector(1,2,3)
    v = Vector(4,5,6)
    u.combine(v)
    (u.x,u.y,u.z) == (5,7,9)
    
So we're going to create two instances of Vector, u and v. Then we're going to call the combine method of u and again pass in v. This time the method won't return anything; instead it will update u by adding v to it.

Modify the file with your Vector class to include a combine method and check it behaves as intended.
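
One way this first version of combine could look is sketched below, with the rest of the class trimmed to the minimum needed to run the example. The argument name `other` is again our own choice.

```python
class Vector:
    'Implements 3D vectors and their behaviour'

    def __init__(self, i1, i2, i3):
        self.x = i1
        self.y = i2
        self.z = i3

    def combine(self, other):
        'Adds the vector other to this vector, modifying it in place'
        self.x = self.x + other.x
        self.y = self.y + other.y
        self.z = self.z + other.z

u = Vector(1, 2, 3)
v = Vector(4, 5, 6)
result = u.combine(v)    # no return statement, so result is None

print((u.x, u.y, u.z) == (5, 7, 9))
```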

A useful operation would be to be able to generate a copy of our vector, an object of the Vector class. A first uninformed attempt could be to make an assignment to a different variable

    w=v
    (w.x,w.y,w.z)

At first glance this seems to have worked just as intended, but note what happens if we now change the original vector v

    v.x=10
    w.x

In this case w is not a copy of v but an alias for it: just another name by which to call the original object.
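
We can check this directly with the `is` operator, which tests whether two names refer to the very same object. The stripped-down Vector class below is repeated only to make the sketch self-contained.

```python
class Vector:
    'Implements 3D vectors and their behaviour'

    def __init__(self, i1, i2, i3):
        self.x = i1
        self.y = i2
        self.z = i3

v = Vector(4, 5, 6)
w = v                      # a second name for the same object, not a copy

same_object = w is v       # True: both names refer to one object
v.x = 10                   # change the object through the name v ...

print(same_object, w.x)    # ... and the change is visible through w
```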

To make a copy of the object, we note that we can actually create new instances of a class from inside the class. So the copy method to be added to the class would look like: 

    def copy(self):
        'Create a copy of the vector object'
        return Vector(self.x,self.y,self.z)

This is how the copy method should work.

    v=Vector(4,5,6)
    w=v.copy()
    v.x=10
    (w.x,w.y,w.z) == (4,5,6)

With this idea in mind, let us improve our *combine* method so that, instead of modifying the components of the object on which it acts, it returns a new instance containing the sum of the two vectors. That is, change your Vector class so that it behaves in the following way:

    u = Vector(1,2,3)
    v = Vector(4,5,6)
    w = u.combine(v)
    
    (u.x,u.y,u.z) == (1,2,3)
    (v.x,v.y,v.z) == (4,5,6)
    (w.x,w.y,w.z) == (5,7,9)

We have made some way in defining the sum of two vectors. It might however be nice if instead of writing:
 
    u.combine(v)

we could instead write:
    
    u + v
    
It turns out we can actually do that! There are special methods that let us use the same syntax that Python uses for its own built-in types. The one we want is \_\_add\_\_. Change the name of your combine method to \_\_add\_\_ and try it out!

The method \_\_add\_\_ controls how the operation '+' behaves in our class: we can decide whether it behaves like the sum of numbers or like the concatenation of lists or strings. There are several such special methods, whose names start and end with two underscores. The only other one we'll mention is the \_\_repr\_\_ method, but a full list can be found [here](https://docs.python.org/3/reference/datamodel.html#special-method-names).
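
As an illustration, a version of the class in which the combine method has been renamed to `__add__` might look like the sketch below. It returns a new Vector and leaves both operands untouched. Only the parts of the class needed here are repeated.

```python
class Vector:
    'Implements 3D vectors and their behaviour'

    def __init__(self, i1, i2, i3):
        self.x = i1
        self.y = i2
        self.z = i3

    def __add__(self, other):
        'Returns a new vector equal to the sum of this vector and other'
        return Vector(self.x + other.x, self.y + other.y, self.z + other.z)

u = Vector(1, 2, 3)
v = Vector(4, 5, 6)
w = u + v                 # Python translates this into u.__add__(v)

print((w.x, w.y, w.z) == (5, 7, 9))
```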

### Object display

If we create an instance of Vector and just look at its value, we see something not enormously eye-pleasing:

    u = Vector(1,2,3)
    u

What's being printed is telling us that u is an instance of the Vector class, and that weird string of symbols and numbers is actually the address in memory where the object is stored. It might be useful, though, to see the components of the vector at a glance. The special \_\_repr\_\_ method controls what gets printed when we just look at the object itself.

Here's how we want our \_\_repr\_\_ method to behave:

    u = Vector(1,2,3)
    u.__repr__() == '1 2 3'
   
It should return a string that contains the values of the components. Modify your Vector class to include the \_\_repr\_\_ method and check it behaves as expected.
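
A minimal sketch of such a method, using the string method `.format()` to build the output, could be the following. Any other way of assembling the string would do equally well.

```python
class Vector:
    'Implements 3D vectors and their behaviour'

    def __init__(self, i1, i2, i3):
        self.x = i1
        self.y = i2
        self.z = i3

    def __repr__(self):
        'Returns a string with the values of the 3 components'
        return '{} {} {}'.format(self.x, self.y, self.z)

u = Vector(1, 2, 3)

print(u.__repr__() == '1 2 3')
print(u)    # print() also falls back on __repr__ when __str__ is not defined
```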

Now see what happens when you look at the value of u:
        
    u = Vector(1,2,3)
    u

We have now implemented a decent Vector class with the most important properties of 3D vectors. To have an overview of your work, if you used docstrings appropriately, use the '?' on the notebook to interrogate your class

    Vector?

You can get a similar result by doing

    help(Vector)

One final note before we move on. In Python *everything* is an instance of a class: integers are actually instances of an integer class, lists are instances of a list class, etc. etc. and they all have attributes and methods. You can always check what methods and attributes are available to any object, by using the tab key after the dot:

    empty=[]
    empty.
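
Outside the notebook, where tab completion may not be available, the built-in functions `type()` and `dir()` give similar information. A brief sketch:

```python
number = 42
empty = []

print(type(number), type(empty))   # the classes these objects belong to

bits = number.bit_length()         # a method of integer objects
empty.append(number)               # a method of list objects

# Names of the public attributes and methods of a list object
public_names = [name for name in dir(empty) if not name.startswith('_')]
print(public_names)
```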

## Creating a molecule class

We will now return to the problem of analysing chemical structure posed in the beginning of the workshop. For this we will use the Vector class we created and will create a new Molecule class.

In our vector class our only attributes were three numbers: x,y,z. For a molecule we are going to want a bit more than that. We will want to initialise our molecule with two things, a list of atom symbol strings, and a list of Vectors representing the positions of the atoms.

Define a Molecule class and check you can use it as follows:

    h2o = Molecule(['O','H','H'],
                   [Vector(0,0, 0.119262),
                   Vector(0,0.763239,-0.477047),
                   Vector(0,-0.763239,-0.477047)]
                  )
    h2o.symbols == ['O','H','H']

Now implement the *bond_length* method, which will receive the indices of 2 atoms in the molecule and return the distance between them.

We now want to implement the *bond_angle* method to give the bond angle between three atoms.

Does the above defined water molecule have the appropriate equilibrium bond angle?
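
To check your own answer, one possible sketch of the class is given below. It reuses only Vector methods introduced in this workshop. Several details are assumptions of ours: the attribute name `positions`, the helper method `bond()`, the convention that the second index of `bond_angle()` is the atom at the apex, and the choice to return the angle in degrees.

```python
from math import acos, degrees

class Vector:
    'Implements 3D vectors and their behaviour'

    def __init__(self, i1, i2, i3):
        self.x = i1
        self.y = i2
        self.z = i3

    def scale(self, a):
        self.x, self.y, self.z = self.x*a, self.y*a, self.z*a

    def copy(self):
        return Vector(self.x, self.y, self.z)

    def __add__(self, other):
        return Vector(self.x + other.x, self.y + other.y, self.z + other.z)

    def dot(self, other):
        return self.x*other.x + self.y*other.y + self.z*other.z

    def norm(self):
        return self.dot(self) ** 0.5

class Molecule:
    'Implements a molecular structure as atom symbols and positions'

    def __init__(self, symbols, positions):
        self.symbols = symbols
        self.positions = positions     # attribute name is an assumption

    def bond(self, i, j):
        'Vector pointing from atom i to atom j (hypothetical helper)'
        minus_ri = self.positions[i].copy()   # copy, so the atom is not moved
        minus_ri.scale(-1)
        return self.positions[j] + minus_ri

    def bond_length(self, i, j):
        'Distance between atoms i and j'
        return self.bond(i, j).norm()

    def bond_angle(self, i, j, k):
        'Angle i-j-k in degrees, with atom j assumed to be at the apex'
        u = self.bond(j, i)
        v = self.bond(j, k)
        return degrees(acos(u.dot(v) / (u.norm() * v.norm())))

h2o = Molecule(['O', 'H', 'H'],
               [Vector(0, 0, 0.119262),
                Vector(0, 0.763239, -0.477047),
                Vector(0, -0.763239, -0.477047)])

print(h2o.bond_length(0, 1))
print(h2o.bond_angle(1, 0, 2))
```

Notice that the Molecule methods never touch x, y or z directly: all the geometry is delegated to the Vector objects.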

Let us test the methods on a slightly bigger molecule.
    
    ch4 = Molecule(['C','H','H','H','H'],
                   [Vector(0,0,0),
                    Vector(0.629118,0.629118,0.629118),
                    Vector(-0.629118,-0.629118,0.629118), 
                    Vector(0.629118, -0.629118, -0.629118),
                    Vector(-0.629118,  0.629118, -0.629118)]
                  )

This is a good point to try to get a broader picture. Why are we bothering with all of this? We could do all the same things we have done so far with separate variables for each component of each vector, and we could use functions instead of methods, as we did in the beginning of the workshop. But if we did that, the code would be much more complicated to maintain: we would have to keep track of a myriad of different variables, what they mean and how they fit together, and we would also have to keep track of which variables needed to be passed to which functions.

With the Vector and Molecule classes that is all handled for us. In fact once it is defined we don't need to pay attention to how it works at all. We can think of an instance of the Vector class just as a vector. (To think of the instance of Molecule as a molecule is a bit more far fetched, but it is a good representation of molecular structure.) A proper class should provide methods that do all the fundamental things we expect to be able to do with the object that class represents.

To illustrate the versatility of the object-oriented approach, now that we have implemented the Vector and Molecule classes, we can easily extend the Molecule class to calculate other common molecular properties.

Let us now implement a new attribute charge and a new method to calculate the dipole moment of the molecule

$$\vec{\mu}=\sum_i q_i \vec{r}_i$$

where the sum extends to all atoms of the molecule, $q_i$ is the partial charge on atom *i*, and $\vec{r}_i$ is the position vector of the atom.

What is the dipole moment of H<sub>2</sub>O given partial charges equal to [-0.68,0.34,0.34]?
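
A sketch of how the sum in the formula above could be implemented follows. The attribute name `charges`, the method name `dipole()` and the choice of making the charges an optional third argument are assumptions of ours. The result comes out in the units of the input (e·Å if the charges are in elementary charges and the coordinates in ångström).

```python
class Vector:
    'Implements 3D vectors and their behaviour'

    def __init__(self, i1, i2, i3):
        self.x = i1
        self.y = i2
        self.z = i3

    def scale(self, a):
        self.x, self.y, self.z = self.x*a, self.y*a, self.z*a

    def copy(self):
        return Vector(self.x, self.y, self.z)

    def __add__(self, other):
        return Vector(self.x + other.x, self.y + other.y, self.z + other.z)

class Molecule:
    'Implements a molecular structure with optional partial charges'

    def __init__(self, symbols, positions, charges=None):
        self.symbols = symbols
        self.positions = positions
        self.charges = charges       # needed only by dipole()

    def dipole(self):
        'Returns the dipole moment vector, the sum of q_i * r_i over all atoms'
        mu = Vector(0, 0, 0)
        for q, r in zip(self.charges, self.positions):
            term = r.copy()          # copy, so the stored position is unchanged
            term.scale(q)
            mu = mu + term
        return mu

h2o = Molecule(['O', 'H', 'H'],
               [Vector(0, 0, 0.119262),
                Vector(0, 0.763239, -0.477047),
                Vector(0, -0.763239, -0.477047)],
               [-0.68, 0.34, 0.34])

mu = h2o.dipole()
print((mu.x, mu.y, mu.z))
```

For this structure the x and y components vanish by symmetry, so the dipole lies along the z axis.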

What about the fluoroform molecule?

    cf3h = Molecule(['C','F','F','F','H'],
                    [Vector(0.,0.,0.335420),
                     Vector(0.,1.249478,-0.127344),
                     Vector(-1.082080,-0.624739,-0.127344),
                     Vector(1.082080,-0.624739,-0.127344),
                     Vector(0.,0.,1.425773)],
                    [1.109,-0.401,-0.401,-0.401,0.093]
                    )

It is equally simple to implement a method to determine the centre of mass of the molecule, or the moment of inertia along the Cartesian axes (in this case you would have to displace the molecule such that the centre of mass corresponds to the origin of the reference frame).
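
The centre of mass is the mass-weighted average of the atomic positions, $\vec{R}=\sum_i m_i \vec{r}_i / \sum_i m_i$. The sketch below shows the arithmetic, and the displacement to the centre-of-mass frame mentioned above, using plain tuples. The function names and the two-particle test system are assumptions for illustration only; in your own code these would more naturally be methods of Molecule operating on Vector objects.

```python
def centre_of_mass(masses, positions):
    'Return the mass-weighted average position as an (x, y, z) tuple.'
    total_mass = sum(masses)
    return tuple(
        sum(m * r[k] for m, r in zip(masses, positions)) / total_mass
        for k in range(3)
    )

def shift_to_centre_of_mass(masses, positions):
    'Return the positions displaced so that the centre of mass is at the origin.'
    com = centre_of_mass(masses, positions)
    return [tuple(r[k] - com[k] for k in range(3)) for r in positions]

# Hypothetical two-particle system in arbitrary units, not a molecule from this workshop.
masses = [1.0, 3.0]
positions = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]

print(centre_of_mass(masses, positions))           # (3.0, 0.0, 0.0)
print(shift_to_centre_of_mass(masses, positions))  # [(-3.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
```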

### Optional: Spin-off an Atom class

Now that we have added charges alongside the symbols and the Vectors, the Molecule class is starting to get a little full. Imagine if we wanted to add magnetic moments, atomic numbers and radii to our atoms: suddenly initialising our molecule would look a bit of a mess. The solution is of course to create an Atom class to represent atom objects and then construct Molecule objects from a list of Atom objects.

Have a go at building an Atom class and then refactor your Molecule class so that it is built from a list of Atoms.

## General advice in object-oriented design

To conclude, we present some general advice on how to design a program using objects.

When presented with a programming task, the most important stage is deciding on the design of the program. This is particularly important in object-oriented programming, as the programmer has a wider choice in how to implement their code.

A rule of thumb to aid in the design choice is based on the natural language description of the problem to be solved. As an example, imagine that we wanted to make the classic [video game Asteroids](https://en.wikipedia.org/wiki/Asteroids_(video_game)). To identify the relevant classes we would try describing it.

Wikipedia describes Asteroids as follows:

<blockquote>The objective of Asteroids is to score as many points as possible by destroying asteroids and flying saucers. The player controls a triangular-shaped ship that can rotate left and right, fire shots straight forward, and thrust forward. As the ship moves, momentum is not conserved – the ship eventually comes to a stop again when not thrusting.</blockquote>

Look for nouns mentioned that could meaningfully have some kind of 'state' and some set of 'actions' associated with them. These become our objects, the states become attributes and the actions become methods.

Thus the classes are:

* a ship
* an asteroid
* a flying saucer
* a bullet

Then, for each object, consider its states and behaviours. For example, for the ship class:
    
    states: position, momentum, orientation
    behaviours: rotate_left, rotate_right, thrust, fire
    
Below is the outline of a possible Ship class, for illustration purposes.

In [None]:
import numpy as np

class Ship:
    'Object class representing the ship in the Asteroids video game.'
    
    def __init__(self, position=(0,0), momentum=(0,0), orientation=0, bullets=None):
        '''Initialization of the ship object. The argument list includes arguments
        called by name, and includes default values. If the class is called as
        Ship(position=(1,1),orientation=0.85), the attributes will get the values specified
        in the arguments list, or the default values if they are not specified.'''
        self.position = np.array(position)
        self.momentum = np.array(momentum)
        self.orientation = orientation
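        # Start from an empty list when no bullets are given, so that fire() can append to it.
        self.bullets = bullets if bullets is not None else []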
        
    def rotate_left(self, ang):
        'Rotate ship left by ang radians.'
        self.orientation = self.orientation - ang
        
    def rotate_right(self, ang):
        'Rotate ship right by ang radians.'
        self.orientation = self.orientation + ang
        
    def thrust(self, w):
        'Controls thrust impulse of the ship.'
        x_,y_ = self.momentum
        self.momentum = np.array([x_+np.sin(self.orientation)*w, y_+np.cos(self.orientation)*w])
    
    def fire(self):
        'Fire a shot by generating a Bullet object.'
        self.bullets.append(Bullet(self.position, self.orientation))

    def check_collision(self):
        'Check whether the shot was a hit.'
        pass
    
    def update(self, dt):
        'Evolve the ship and bullets in time.'
        for bullet in self.bullets:
            bullet.update(dt)
        self.position = self.position+dt*self.momentum
        self.check_collision()
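
The Ship outline above refers to a Bullet class that is not defined in this notebook. A minimal sketch of what it might look like follows. The constructor arguments are inferred from the call in the `fire` method, while the `speed` parameter and the straight-line motion at constant speed are assumptions made purely for illustration.

```python
import math

class Bullet:
    'Hypothetical minimal bullet, travelling in a straight line at constant speed.'

    def __init__(self, position, orientation, speed=1.0):
        # Copy the coordinates so the bullet does not share state with the ship.
        x, y = position
        self.position = (float(x), float(y))
        self.orientation = orientation
        self.speed = speed

    def update(self, dt):
        'Move the bullet along its heading for a time dt.'
        x, y = self.position
        # Same angle convention as Ship.thrust: sin for x, cos for y.
        self.position = (x + math.sin(self.orientation) * self.speed * dt,
                         y + math.cos(self.orientation) * self.speed * dt)

# A bullet fired from the origin with orientation 0 moves along the y axis.
bullet = Bullet((0, 0), 0.0, speed=2.0)
bullet.update(0.5)
print(bullet.position)  # (0.0, 1.0)
```

Note how the Ship only needs to know that a Bullet can be created from a position and an orientation and that it has an `update` method. How the bullet moves is entirely the Bullet's own business.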

## Overview

In this workshop we have explored different programming paradigms, and in particular we have introduced the object-oriented features of the Python language.

We have built a class to implement the behaviour of 3D vectors in Cartesian space, and we have constructed a class representing molecular structure and allowing for the calculation of structure-related molecular properties. We contrasted this implementation with an "unstructured" approach based on variables and functions. Although the advantages of the object-oriented approach are not self-evident for simple codes, its main strength is that the resulting programs are easier to maintain and extend, which makes it suitable for managing large programs.

Chemical simulation programs can attain a high degree of complexity and sophistication. Based on the skills acquired, we will next create a simple simulation of a 2D gas, and analyse some of its properties.

## Other Python resources

There is an immense amount of documentation available about the Python language and its applications to different domains, which you can draw upon to further your studies.

A good place to start is the [tutorial on classes](https://docs.python.org/3/tutorial/classes.html) in Python's official documentation website.

Good reference books could be [Learn Python the Hard Way](http://learnpythonthehardway.org) or [Learning Scientific Programming with Python](http://scipython.com).