# Chapter 4. Classes

## Objects and classes

An **object** is a data structure that contains *state* and *behaviour*.

Before we can create objects, we (usually) need to define a **class**, which is essentially a template for creating objects.

Consider an example. We have a ball which has a position and a speed. The position and the speed are therefore the *state* of the ball. The ball can also move, i.e. update its position based on the speed. This is *behaviour*.

We can define a class `Ball` (which will be the template for creating `Ball` objects in the future) like this:

In [None]:
class Ball:
    def __init__(self, pos, speed):
        self.pos = pos
        self.speed = speed

Here the state consists of the values `pos` and `speed`. Generally speaking, values that are associated with an object (and represent its state) are called **attributes**.

The `__init__` function is a special function which **initializes** an object with some *initial state*.

For example if we want to create a ball at position `10` and with speed `8`, we write:

In [None]:
my_ball = Ball(10, 8)

Let us inspect the attributes of the `my_ball` object:

In [None]:
f"pos = {my_ball.pos}, speed = {my_ball.speed}"

We can also get the class of the object using the `type` function:

In [None]:
type(my_ball)

We can also confirm that `my_ball` is an instance of the `Ball` class using the `isinstance` function. This will come in handy later:

In [None]:
isinstance(my_ball, Ball)

We could now change the state of the ball directly. For example this is how we could move the ball:

In [None]:
my_ball.pos += my_ball.speed

In [None]:
f"pos = {my_ball.pos}, speed = {my_ball.speed}"

But this isn't really the way to go. What if we wanted to change the movement calculation later (for example to take friction into account)? Then we would need to adjust the calculation in every single place we use it, which is a lot of work and could introduce bugs if we forget to adjust it somewhere.

As you may remember from the previous chapter, this is the same problem we had when introducing functions. There we stored the repetitive calculations in the function body and then simply called the function. Luckily, objects provide a similar mechanism.

We can define functions on objects - these functions are called **methods** and define the *behaviour* of the object. Note that the first parameter of all methods is the object itself. By convention this parameter is called `self`. For example here is how we could implement the `Ball` class with a `move` method:

In [None]:
class Ball:
    def __init__(self, pos, speed):
        self.pos = pos
        self.speed = speed
        
    def move(self, t):
        self.pos += self.speed * t

Note that after redefining the class we need to create a *new* object (here called `ball`), because `my_ball` is still an instance of the old class!

In [None]:
ball = Ball(10, 8)
ball.move(2)

In [None]:
f"pos = {ball.pos}, speed = {ball.speed}"

If you see the error message `AttributeError: 'Ball' object has no attribute 'move'`, this means that your object was created from the *old* `Ball` class. In that case, run the new class definition again and then recreate the object.
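To make the role of `self` concrete, the following sketch (which repeats the class definition so that it runs on its own) calls `move` in two ways. The second form, where we go through the class and pass the object explicitly, is rarely written in practice and is shown purely for illustration. The names `ball_a` and `ball_b` are made up for this example.

```python
class Ball:
    def __init__(self, pos, speed):
        self.pos = pos
        self.speed = speed

    def move(self, t):
        self.pos += self.speed * t


ball_a = Ball(10, 8)
ball_b = Ball(10, 8)

# The usual form: Python passes ball_a as `self` automatically.
ball_a.move(2)

# The explicit form: we pass ball_b as `self` ourselves.
Ball.move(ball_b, 2)

print(ball_a.pos, ball_b.pos)  # 26 26
```

Both calls do the same thing, which is why every method needs `self` as its first parameter.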

Note how the behaviour *changed* the object state. This is what objects are all about. They are initialized with some state and then we can use their behaviour to update the state.
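Coming back to the friction question from earlier, here is a minimal sketch of how such a change stays confined to a single method. The `damping` attribute and the way it reduces the speed are assumptions invented for this illustration, not a physically accurate model.

```python
class Ball:
    def __init__(self, pos, speed, damping=0.5):
        self.pos = pos
        self.speed = speed
        # Hypothetical attribute: fraction of the speed that is kept after each move
        self.damping = damping

    def move(self, t):
        self.pos += self.speed * t
        # The only place that needs to know about the slowdown
        self.speed *= self.damping


rolling_ball = Ball(10, 8)
rolling_ball.move(2)  # pos is now 26, speed is now 4.0
rolling_ball.move(2)  # pos is now 34.0, speed is now 2.0

print(f"pos = {rolling_ball.pos}, speed = {rolling_ball.speed}")
```

Every piece of code that calls `move` picks up the new behaviour without having to change.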

We should also point out that apart from having to write classes yourself, you will *use* them *all the time*. In fact, we will introduce many extremely important classes in the *next chapter*.

## The \_\_str\_\_ and \_\_repr\_\_ methods

Quite often, we want to output the object (e.g. using the `print` function) to see its current state. However, if we just try to output the object, we get a string that is pretty useless:

In [None]:
ball

In [None]:
print(ball)

Luckily, Python allows us to change this behaviour. There are two special methods called `__str__` and `__repr__`.

The `__str__` method produces a human-readable string for consumption by the end user. The `__repr__` method produces an *exact representation* of the object. This is reflected in when these methods are called. For example, if you just output an object in the REPL, you will see the output of `__repr__`. If you `print` an object, you will see the result of `__str__`.

These two methods often have the same implementation (since the end user usually wants to see the exact representation), but this doesn't have to be the case. In the example below, the ball lives in two dimensions: its position is given by `x` and `y` and its speed by `dx` and `dy`. We could argue that the end user doesn't care about the speed of the ball and only cares about its current position. Then the `__str__` method would return a string that doesn't contain `dx` and `dy`. However, the `__repr__` method should still return all attributes:

In [None]:
class Ball:
    def __init__(self, x, y, dx, dy):
        self.x = x
        self.y = y
        self.dx = dx
        self.dy = dy
        
    def move(self):
        self.x += self.dx
        self.y += self.dy
        
    def __str__(self):
        return f"Ball(x={self.x}, y={self.y})"
    
    def __repr__(self):
        return f"Ball(x={self.x}, y={self.y}, dx={self.dx}, dy={self.dy})"

In [None]:
ball = Ball(x=320, y=240, dx=-2, dy=2)

In [None]:
# Note how printing the ball results in the output of __str__

print(ball)

In [None]:
# However just outputting the ball results in the output of __repr__

ball

We can also call `__str__` and `__repr__` manually by calling the `str` and `repr` functions:

In [None]:
str(ball)

In [None]:
repr(ball)
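The same distinction shows up in a few other places. As a small sketch (with the relevant parts of the `Ball` class repeated so that the example is self-contained): f-strings use `__str__` unless the `!r` conversion is given, and containers such as lists display their elements using `__repr__`. The variable names `as_str`, `as_repr` and `in_list` are made up for this example.

```python
class Ball:
    def __init__(self, x, y, dx, dy):
        self.x = x
        self.y = y
        self.dx = dx
        self.dy = dy

    def __str__(self):
        return f"Ball(x={self.x}, y={self.y})"

    def __repr__(self):
        return f"Ball(x={self.x}, y={self.y}, dx={self.dx}, dy={self.dy})"


ball = Ball(x=320, y=240, dx=-2, dy=2)

as_str = f"{ball}"     # uses __str__
as_repr = f"{ball!r}"  # uses __repr__
in_list = str([ball])  # list elements are shown using __repr__

print(as_str)   # Ball(x=320, y=240)
print(as_repr)  # Ball(x=320, y=240, dx=-2, dy=2)
print(in_list)  # [Ball(x=320, y=240, dx=-2, dy=2)]
```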

Since `__str__` and `__repr__` are often the same, it's enough to just define `__repr__`. If no `__str__` is defined and you try to call it, Python will fall back to calling `__repr__`:

In [None]:
class Ball:
    def __init__(self, x, y, dx, dy):
        self.x = x
        self.y = y
        self.dx = dx
        self.dy = dy
        
    def move(self):
        self.x += self.dx
        self.y += self.dy
    
    def __repr__(self):
        return f"Ball(x={self.x}, y={self.y}, dx={self.dx}, dy={self.dy})"

In [None]:
ball

In [None]:
print(ball)

Writing `__repr__` for practically all your classes should become second nature. Being able to see the state of an object when outputting it is extremely valuable.

## Object identity

Let's create two objects of class `Ball` whose attributes have the same values:

In [None]:
ball1 = Ball(320, 240, -2, 2)
ball2 = Ball(320, 240, -2, 2)

In [None]:
ball1

In [None]:
ball2

Note that all the attributes of these two objects have the same values, but they are *totally different* objects:

![](images/different-objects.png)

This means that if we change the values of `ball1`, then `ball2` will be *completely unaffected*:

In [None]:
ball1.x += ball1.dx

In [None]:
# The value of ball1.x has changed

ball1

In [None]:
# However the value of ball2.x is exactly the same as before

ball2

But what happens if we write this?

In [None]:
ball1 = Ball(320, 240, -2, 2)
ball2 = ball1

Now we have a completely different situation. The symbolic names `ball1` and `ball2` are still two different names, but they *refer to the exact same object*:

![](images/same-object.png)

This means that whenever we make a change using the name `ball1`, that change will be visible through the name `ball2` as well:

In [None]:
ball1.x += ball1.dx

In [None]:
# Outputting ball1 obviously shows the new value of x
ball1

In [None]:
# However, outputting ball2 also outputs the new value of x!
ball2

We say that `ball1` *is* `ball2` (i.e. they are the same object). In fact the `is` operator can be used to check if two names represent the same object:

In [None]:
ball1 is ball2

We can also see that `ball1` and `ball2` are the same object using the `id` function. Generally speaking, this function hands us a number that is unique for each object (as long as that object exists). In the Python interpreter we downloaded in chapter 1 (which is the CPython interpreter) this is achieved by returning the **memory address** of the object.

In this case `ball1` and `ball2` point to the same object, so their memory address is the same:

In [None]:
id(ball1)

In [None]:
id(ball2)

In [None]:
id(ball1) == id(ball2)

However if we have two names that refer to different objects, then their memory address will not be the same and the `is` operator will return `False`:

In [None]:
ball1 = Ball(320, 240, -2, 2)
ball2 = Ball(320, 240, -2, 2)

In [None]:
id(ball1)

In [None]:
id(ball2)

In [None]:
id(ball1) == id(ball2)

In [None]:
ball1 is ball2
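If what we want is an independent ball with the same values rather than a second name for the same object, one option is the `copy` module from the standard library. This goes slightly beyond what the chapter has covered so far, so treat the following as a sketch. The names `original`, `alias` and `duplicate` are made up for this example.

```python
import copy


class Ball:
    def __init__(self, x, y, dx, dy):
        self.x = x
        self.y = y
        self.dx = dx
        self.dy = dy

    def __repr__(self):
        return f"Ball(x={self.x}, y={self.y}, dx={self.dx}, dy={self.dy})"


original = Ball(320, 240, -2, 2)
alias = original                 # a second name for the same object
duplicate = copy.copy(original)  # a new object with the same attribute values

original.x += original.dx

print(alias.x)                # 318, because alias is the same object
print(duplicate.x)            # 320, because duplicate is a separate object
print(alias is original)      # True
print(duplicate is original)  # False
```

`copy.copy` makes a shallow copy, which is sufficient here because all attributes are plain numbers.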

## Object equality

An interesting question is what happens if we use the equality operator `==` on objects. By default (i.e. as long as the class doesn't define its own comparison), the equality operator behaves like the `is` operator:

In [None]:
ball1 = Ball(320, 240, -2, 2)
ball2 = Ball(320, 240, -2, 2)

ball1 == ball2

In [None]:
ball1 = Ball(320, 240, -2, 2)
ball2 = ball1

ball1 == ball2

This is not particularly useful. After all, if two balls have the exact same position and the exact same speed, we would probably want them to be equal (even though the `is` operator returns `False`).

Luckily, Python gives us a way to achieve this by **overriding** the equality operator (i.e. providing a custom implementation for it). In order to do this, we need to write a custom `__eq__` method which takes `self` and the other object we want to compare this object to:

In [None]:
class Ball:
    def __init__(self, x, y, dx, dy):
        self.x = x
        self.y = y
        self.dx = dx
        self.dy = dy
        
    # some methods
    
    def __eq__(self, other):
        return (
            isinstance(other, Ball)
            and self.x == other.x
            and self.y == other.y
            and self.dx == other.dx
            and self.dy == other.dy
        )
    
    def __repr__(self):
        return f"Ball(x={self.x}, y={self.y}, dx={self.dx}, dy={self.dy})"

The `__eq__` method checks for two things. The first thing it checks for is if the other object is an instance of `Ball`. If it's not, then the whole logical chain becomes `False`. This makes sense - after all, a `Ball` object should never be equal to a non-`Ball` object.

However, if the other object is a `Ball` as well, we check whether all attributes of `self` and `other` are equal. If they are, we return `True`, otherwise we return `False`.

Now the `==` operator works as we would like it to:

In [None]:
ball1 = Ball(320, 240, -2, 2)
ball2 = Ball(320, 240, -2, 2)

ball1 == ball2
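A few more checks illustrate the points above. The sketch repeats the class so that it runs on its own. The third ball and the comparison against a string are arbitrary examples chosen for this illustration. Note that Python derives `!=` from `__eq__` automatically, so we don't need to write a separate method for it.

```python
class Ball:
    def __init__(self, x, y, dx, dy):
        self.x = x
        self.y = y
        self.dx = dx
        self.dy = dy

    def __eq__(self, other):
        return (
            isinstance(other, Ball)
            and self.x == other.x
            and self.y == other.y
            and self.dx == other.dx
            and self.dy == other.dy
        )


ball1 = Ball(320, 240, -2, 2)
ball2 = Ball(320, 240, -2, 2)
ball3 = Ball(0, 0, -2, 2)

print(ball1 == ball2)   # True: same attribute values
print(ball1 is ball2)   # False: still two different objects
print(ball1 == ball3)   # False: the positions differ
print(ball1 != ball3)   # True: != is the negation of ==
print(ball1 == "ball")  # False: a string is not a Ball
```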

Note that whether and how you override `==` depends on your needs. For example, it might actually be the case that two balls shouldn't compare equal, even *if* their attributes have the same values.

But generally speaking, *if* you override the equality operator, you should override it by checking that the two objects have the same class and their attributes have the same values.
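One common way to write such a check more compactly is to compare tuples of attributes. The sketch below is one possible variant, not the only correct one. The `_key` helper is a hypothetical name introduced here. Returning `NotImplemented` for objects of a different class (instead of `False`) lets Python ask the other object for its opinion before concluding that the two are not equal.

```python
class Ball:
    def __init__(self, x, y, dx, dy):
        self.x = x
        self.y = y
        self.dx = dx
        self.dy = dy

    def _key(self):
        # Hypothetical helper: all attributes that take part in the comparison
        return (self.x, self.y, self.dx, self.dy)

    def __eq__(self, other):
        # Require exactly the same class, as recommended above
        if type(other) is not type(self):
            return NotImplemented
        return self._key() == other._key()


same = Ball(320, 240, -2, 2) == Ball(320, 240, -2, 2)
different = Ball(320, 240, -2, 2) == Ball(320, 240, 0, 0)
other_type = Ball(320, 240, -2, 2) == (320, 240, -2, 2)

print(same, different, other_type)  # True False False
```

When a new attribute is added to the class later, only `_key` needs to be updated.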