# Python for Psychologists - Session 6
## Classes

Of course, the concept of classes is nothing new. Basically, classes are categories of objects. In everyday life we are familiar with classes/categories for objects like "cars" or "buildings" or "dogs". The reason for the categorization of these objects is that they all share common features. For example, cars are usually used as a means of transportation and, hence, we can drive them and also use their brakes. At the same time, they also have certain features that vary between them, like their colour. Of course, every car *has* a colour, but what specific colour it is will vary from one car to another.


### Classes and instances
In programming, classes are used to describe objects and to define what one can do with them and what other kind of features they have. Each object/"exemplar" of a class is called an **instance**, and the creation of an instance is, hence, called instantiation. The following code part shows the instantiation of an object of class list:

```python
instance_of_class_list = []
```

This should look pretty familiar. Each time we create a new object, e.g., a list, we create a new instance of the class "list". We can check the class of an object by using the following syntax:

```python
instance_of_class_list.__class__ # this returns <class 'list'>
```

Moreover, we can check if an object is an instance of a certain class using the `isinstance()` function:

```python
isinstance(my_object, some_class) # returns True or False
```
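
As a small sketch of both checks on built-in classes (the variable names are made up for illustration):

```python
# Two instances of built-in classes (names are illustrative)
my_list = [1, 2, 3]
my_text = "hello"

# __class__ and type() both report the class of an instance
print(my_list.__class__)   # <class 'list'>
print(type(my_text))       # <class 'str'>

# isinstance() answers a yes/no question about class membership
is_list = isinstance(my_list, list)   # True
is_str = isinstance(my_list, str)     # False
print(is_list, is_str)
```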

The concept of classes and instances is the core of *object oriented programming*, a programming style that aims at designing code in a way that it represents/resembles structures of the real world domain/phenomenon. Python is not exclusively an object oriented programming language but it supports it and several things are implemented in an OOP fashion. 

### Methods and attributes

Classes have two types of features.

**1) Methods**

Methods describe what one can do with this kind of objects, that is, methods describe *functions* that are associated with each instance of the class. Usually, they are the same for different instances of a class. Methods can be used/applied to an object like this:

```python
some_instance.some_method()
```
This will run the respective method/function with respect to the object. We have already met list methods like `.pop()` and `.remove()`.


**2) Attributes**

Attributes describe variable features that are not functions. Attributes can be variable across different instances of a class. Attributes can be accessed like this:

```python
some_instance.some_attribute
```

As you can see the syntax is the same as the one for methods, *except for the parentheses ( )*. We have also met and used some attributes, e.g., the `.columns` attribute of pandas data frames.
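
To see the difference without any extra packages, here is a minimal sketch using Python's built-in `complex` class, which happens to have both attributes and methods (the variable names are only illustrative):

```python
z = complex(3, 4)  # an instance of the built-in class complex

# Attributes: no parentheses, they simply hold values
real_part = z.real   # 3.0
imag_part = z.imag   # 4.0

# Method: parentheses are needed, because a function is being called
mirrored = z.conjugate()   # (3-4j)

print(real_part, imag_part, mirrored)
```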

![title](classes.png)

### Finding methods and attributes of existing classes

Now that you know that each instance of a class has certain methods and attributes, you can be a little more specific when looking for functions in Python online. Inside the Jupyter Notebook you can also easily look for methods and attributes of an object/instance. Just type the name of the respective object, place a `.` after it and press the tab button. A drop-down menu will appear listing all methods and attributes available for instances of the respective class. Execute the code cell below. Afterwards, type the name `some_df.` and press tab.

In [27]:
import pandas as pd
some_df = pd.DataFrame({"name":["Sophie", "Lars", "Mia"], "age":[9, 14, 6]})[["name","age"]]

In [28]:
some_df

     name  age
0  Sophie    9
1    Lars   14
2     Mia    6


In [None]:
some_df.

If we want to find attributes and methods of a class in general without already having a specific instance of it, we can simply type the class name, again followed by a dot, and press "tab" again.

In [None]:
pd.DataFrame.

In [None]:
list.

Another way to obtain a list of methods and attributes is to use the `dir()` function.

In [None]:
dir(list)
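
The output of `dir()` also contains many names surrounded by double underscores, which Python uses internally. A possible sketch for hiding them and for telling methods apart from other attributes (the name `public_names` is just an illustration):

```python
# Keep only the names that do not start with an underscore
public_names = [name for name in dir(list) if not name.startswith("_")]
print(public_names)

# callable() is True for methods and False for plain data attributes
for name in public_names:
    print(name, callable(getattr(list, name)))
```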

### Class definition

Of course, we can create our own classes. The syntax looks like this:

```python

class some_class_name:
    
    ############### Initialization ################
    def __init__(self, some_argument):
        # e.g., define instance attributes 
        self.some_instance_attr = some_value
        self.some_other_instance_attr = some_argument
    
    ################ Methods #####################
    def some_method(self):
        # do
        # something
        # e.g., create another instance attribute
    def some_other_method(self, some_other_argument):
        # do 
        # something else
        return something_else       
        
    ################# Class Attributes #################
    some_class_attribute = some_value
        
```

Creating an instance of a class works as follows:

```python
some_instance = some_class_name(some_argument) # don't forget the parentheses
```


Here is a very simple example of a class definition: 

In [2]:
class greeter:    
    def say_hi(self):
        print("Hello!")
    
my_greeter = greeter()
my_greeter.say_hi()

Hello!


In this case, we did not specify an `__init__` method. `__init__` stands for initialisation. It is not really a constructor per se, that is, it is not necessary for an instance to be created. Instead, it simply specifies what is supposed to happen by default once a new instance has been created.

Let's take a look at an example:

In [102]:
class greeter:
    def __init__(self):
        print("You have created another greeter!")
        
    def say_hi(self):
        print("Hello!")

my_greeter = greeter()

You have created another greeter!


Test if `my_greeter` is an instance of `greeter` class.

In [105]:
isinstance(my_greeter, greeter)

True

**Adding attributes**

(Instance) Attributes are usually added inside of methods, typically already during the `__init__` method specification.

In [3]:
class greeter:
    def __init__(self, name):
        print("You have created another greeter!")
        # adding an attribute:
        self.name = name
        print("The greeter's name is " + name)
    
    def say_hi(self):
        print("Hello!")
    
    def say_your_name(self):
        print("My name is " + self.name)
    
my_greeter = greeter("John") # now we need an extra input for the name argument!   

You have created another greeter!
The greeter's name is John


In [99]:
my_greeter.say_your_name()

My name is John
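
The syntax template in this section also lists class attributes, which none of the examples uses. As a minimal sketch (the class `participant` and its values are invented for illustration): a class attribute is defined directly in the class body and is shared by all instances, whereas an instance attribute set via `self` belongs to one instance only.

```python
class participant:
    # class attribute: defined once, shared by all instances
    study = "Stroop experiment"

    def __init__(self, code):
        # instance attribute: individual for each instance
        self.code = code

p1 = participant("P01")
p2 = participant("P02")

print(p1.code, p2.code)     # P01 P02
print(p1.study, p2.study)   # both show the shared class attribute

# Changing the class attribute on the class affects all instances
participant.study = "Flanker experiment"
print(p1.study, p2.study)
```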


**What's with the self?**

You have probably already wondered what the funny 'self' argument to each method definition is supposed to mean. A crude answer is the following:

When an instance is created the methods of the instance are still inside the class definition. That is, what is happening when we try to run a method of an instance is *something like* this:

`class_name.method_name(instance_name)`

or, for our example:

`greeter.say_hi(my_greeter)`

In fact, this code does run and gives the same result as `my_greeter.say_hi()`, although you would normally not write it this way. It helps to think of it as what is happening behind the scenes. This means that each method defined in the class also needs to be told which object it is working on. Inside the class definition, the argument that stands for the specific instance is called "self" by convention. Theoretically, it could have any name, like in any other function definition. So, in short, as long as you always have "self" as the first argument of your methods, everything is fine.
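
A small sketch to make this concrete: calling a method through the class and passing the instance explicitly gives the same result as the usual call on the instance. The class `named_greeter` is a made-up variant of the greeter that returns its text instead of printing it.

```python
class named_greeter:
    def __init__(self, name):
        self.name = name

    def introduce(self):
        return "My name is " + self.name

john = named_greeter("John")

usual_call = john.introduce()                   # instance is passed as self automatically
explicit_call = named_greeter.introduce(john)   # instance is passed as self by hand

print(usual_call)
print(explicit_call)
print(usual_call == explicit_call)
```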


Look at this adder class:

In [22]:
class adder:
    def __init__(self, a, b):
        self.a = a
        self.b = b
        print("All attributes have been set!")
    
    def add_it_up(self):
        return self.a + self.b

my_adder = adder(2, 3)
my_adder.add_it_up()

All attributes have been set!


5

**Exercise 1.**

Create a class that has two attributes: the width and the length of a rectangle. Let the class have a method `.area()` that returns the area of the respective rectangle instance (that is length \* width).  

In [9]:
class rectangle:
    
    def __init__(self, length, width):
        self.length = length
        self.width = width
        
    def area(self):
        return self.length * self.width
        
r = rectangle(length=5, width = 3)
r.area()

15

**Exercise 2.**

Write a class "calculator" whose instances have two numeric attributes "a" and "b". In addition, the class should have the methods "multiply", "add", "divide" and "subtract" which have the respective result as a return value. After the class definition, create an instance of the class with the numeric attributes a = 3 and b = 2 and test all its methods.

In [41]:
class calculator:
    
    def __init__(self, a, b):
        self.a = a
        self.b = b
    def add(self):
        return(self.a + self.b)
    def subtract(self):
        return(self.a - self.b)
    def multiply(self):
        return(self.a * self.b)
    def divide(self):
        return(self.a / self.b)
    
my_calc = calculator(a=3, b=2)
print(my_calc.add())
print(my_calc.subtract())
print(my_calc.multiply())
print(my_calc.divide())

5
1
6
1.5


**Exercise 3.**

Write a Python class which has two methods `get_string` and `print_string`. `get_string` accepts a string from the user (the command is `input()`, whose result needs to be assigned to the attribute `string`) and `print_string` prints the string. Define the attribute `string` in the `__init__` method already.

In [13]:
class iostring:
    
    def __init__(self):
        self.string = ""
        
    def get_string(self):
        self.string = input()
    
    def print_string(self):
        print(self.string)
        
new = iostring()
new.get_string()

hallo


In [14]:
new.print_string()

hallo


### Instance attributes are changeable

Attributes can still be changed after the creation. If that wasn't the case we would have to create a new instance each time we apply changes to, e.g., our data frame.

Assign a new value to the attribute "b" of your "calculator" class instance from above. Use the `add()` method afterwards.

In [46]:
my_calc.b = 6
my_calc.add()

9

### Inheritance

When creating a class whose methods and attributes are very similar to those of a class that already exists, we can simply inherit all or certain methods and attributes from the existing class (which is then called the "parent class" or "base class"). This way, we can easily change or add methods and attributes to an existing class.


```python

class new_class(existing_class):

    def new_method(self):
        # do
        # stuff

```
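
As a runnable sketch of this pattern (the classes `animal` and `dog` are invented for illustration): the child class keeps the parent's `__init__`, replaces one method and adds another.

```python
class animal:
    def __init__(self, name):
        self.name = name

    def speak(self):
        return self.name + " makes a sound."

class dog(animal):
    # changes a method that already exists in the parent class
    def speak(self):
        return self.name + " says woof."

    # adds a method the parent class does not have
    def fetch(self):
        return self.name + " fetches the ball."

rex = dog("Rex")        # __init__ is inherited from animal
print(rex.speak())
print(rex.fetch())
print(isinstance(rex, animal))
```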

Let's define a new class of pandas data frames! The new class should inherit from the original `pd.DataFrame` class and have a new method called `.who_are_you()` that returns some string.

In [58]:
import pandas as pd

class new_df(pd.DataFrame):
    
    def who_are_you(self):
        return "'Hihihi, I am a new df class!'"
        
x = new_df({"name":["Sophie", "Lars", "Mia"], "age":[9, 14, 6]})
print(x)
print("\n" + "'Who are you?'"+"\n"+x.who_are_you())

   age    name
0    9  Sophie
1   14    Lars
2    6     Mia

'Who are you?'
'Hihihi, I am a new df class!'


Create a new type of list class called "new_list" that inherits from the existing list class and adds an attribute (a simple string) to it inside the `__init__` method. Afterwards, create a new instance of "new_list" and append a number to it. Then, look at the new attribute.

In [21]:
class new_list(list):
    
    def __init__(self):
        self.new_attr = "I'm a new attribute!"
    
x=new_list()
x.append(1)

In [22]:
x

[1]

In [23]:
x.new_attr

"I'm a new attribute!"

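Note that an `__init__` defined like this replaces the one of `list`, so such a class can no longer be given initial contents when it is created. A possible sketch, assuming we want to keep that ability, hands the contents on to the parent class via `super()`. The class name `labelled_list` and its arguments are made up for illustration.

```python
class labelled_list(list):

    def __init__(self, items=(), label="no label"):
        super().__init__(items)   # let the parent class list store the contents
        self.label = label        # then add our own attribute

scores = labelled_list([1, 2, 3], label="scores")
scores.append(4)

print(scores)
print(scores.label)
print(isinstance(scores, list))
```
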
### Copying objects

This topic is not specific to classes, but as we are talking about objects and OOP a little bit, this is maybe a good point to give you some very relevant information on how to copy objects properly.

Look at the code below:

In [30]:
a = ["a", "b", "c"]
b = a # making a copy
b

['a', 'b', 'c']

So far so good. But now look what happens to "b" after making changes to "a":

In [31]:
a[0] = "z"
print(a)
print(b)

['z', 'b', 'c']
['z', 'b', 'c']


As you can see, when "a" changed, "b" changed as well! Of course, it depends on what you want, but very often this effect is unintended. The reason is that by assigning one object to another using `=`, what is actually assigned to "b" is NOT the object contents of "a" *per se*, but rather a pointer to the same location in memory. We can check this using the `id()` function, which returns the identity of an object (in CPython, its address in memory).

In [32]:
print(id(a))
print(id(b))

4409597704
4409597704


As we can see, the two objects have the same id! We can also use the `is` operator to check if two variables refer to the same object. It is NOT the same as `==`, which simply checks if two variables contain the same content.

In [33]:
a is b

True

In [38]:
c = ['z', 'b', 'c'] # this new object has the same value as a

In [39]:
a==c

True

In [40]:
a is c

False

What does this mean? It means that if we make changes to the object location in memory, all pointers to that location (that is, all variables) will return the same changed object when being evaluated.

A way out of this problem is to use the `.copy()` method. 

Try to make a copy "b" of "a" again, but this time, don't assign "a" directly to "b", but rather "a.copy()" to "b". Afterwards, make the same changes to "a" as before and look at both "a" and "b", as well as their `id()`.

In [67]:
a = ["a", "b", "c"]
b = a.copy()

print(id(a))
print(id(b))

a[0] = "z"
print(a)
print(b)

print(id(a))
print(id(b))

4505434248
4505435592
['z', 'b', 'c']
['a', 'b', 'c']
4505434248
4505435592
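
For lists, there are other common ways to get the same kind of copy as with `.copy()`. A brief sketch (the names `c` and `d` are only illustrative):

```python
a = ["a", "b", "c"]

b = a.copy()   # the list method used above
c = list(a)    # calling the class list on an existing list
d = a[:]       # slicing the whole list

a[0] = "z"
print(a)
print(b, c, d)                  # the copies still start with "a"
print(b is a, c is a, d is a)   # none of them is the same object as a
```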


**Deep vs. shallow copies**

Unfortunately, the `.copy()` method does not solve all possible problems with copying. Imagine you have a list like this:

`a = ["a", "b", [1,2,3]]`

If you now use `.copy()`, the first two positions of the copy will be independent of the original list; however, the list inside the list won't be. Go ahead and change one element of the inner list of "a" and take a look at "b".

In [70]:
a = ["a", "b", [1,2,3]]
b = a.copy()

In [71]:
a[2][1] = 99
print(a)
print(b)

['a', 'b', [1, 99, 3]]
['a', 'b', [1, 99, 3]]


As advertised. The message is: `.copy()` makes **shallow** copies of objects. To overcome this problem, there are also **deep** copies. The function is called `deepcopy()` and is contained in the module `copy`. Try and see if that solves the problem.

In [77]:
import copy
a = ["a", "b", [1,2,3]]
b = copy.deepcopy(a)

a[2][1] = 99
print(a)
print(b)

['a', 'b', [1, 99, 3]]
['a', 'b', [1, 2, 3]]
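
The difference can also be checked directly with the `is` operator. In this sketch (the names `shallow` and `deep` are illustrative), a shallow copy is a new outer list that still shares the inner list with the original, while a deep copy also gets its own inner list:

```python
import copy

a = ["a", "b", [1, 2, 3]]
shallow = a.copy()
deep = copy.deepcopy(a)

print(shallow is a)         # False: the outer list is a new object
print(shallow[2] is a[2])   # True: the inner list is shared
print(deep[2] is a[2])      # False: the inner list was copied as well
```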


Interestingly, this only matters for mutable objects. Immutable objects like tuples do not even have a `.copy()` method. Why? Because there is no need for one:

Usually, we want to make a copy of an object because we want to create a changed version of it while keeping the original. But, as tuples are immutable, whenever we try to change a tuple, e.g., by adding another tuple to it, Python will automatically create a new object. So, in the end, we have what we would have had to achieve via `.copy()` for mutable objects: the original AND an independent object containing the changes.

In [23]:
a = (1,2,3)
b = a
print(id(a))
print(id(b))

4525119744
4525119744


In [24]:
b += (1,)
print(b)
print(a) # a has not changed, although a and b referred to the same object and b experienced some changing
print(id(a))
print(id(b)) # ... because whenever an immutable object gets "changed" (it can't really be changed, ...
# ... that's why it is immutable) a new immutable object that includes those changes is created

(1, 2, 3, 1)
(1, 2, 3)
4525119744
4534268376
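
One caveat, shown as a small sketch: a tuple is immutable, but it can still contain mutable objects such as lists. Those inner objects can be changed in place, and every variable referring to the tuple will see the change. If that matters, `copy.deepcopy()` also works on tuples. The variable names are only illustrative.

```python
import copy

t = (1, 2, [3, 4])       # a tuple that contains a (mutable) list
u = t                    # a second name for the same tuple
v = copy.deepcopy(t)     # independent copy, including the inner list

t[2].append(5)           # allowed: the list changes, the tuple itself does not

print(t)   # (1, 2, [3, 4, 5])
print(u)   # (1, 2, [3, 4, 5])
print(v)   # (1, 2, [3, 4])
```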
