# Creating Classes the Pythonic Way

While it's possible to get by in Python using only functions, built-in types and types from installed libraries, a more advanced programmer will want to encapsulate their data in their own types, with their own functionality and constraints, which means defining their own classes. And while defining a class is easy, doing it the Pythonic way isn't always obvious, especially since the documentation on class definition can be hard to read for a newcomer. This small notebook aims to explain the main concepts surrounding classes in Python, and how they differ from classes in other languages (looking at you, Java!).

## What are classes?

In Python, "everything is an object", as they say. An object is basically a structure that bundles data together with the functionality that can be applied to that data (i.e. _methods_, which are called using the syntax `object.method(other_arguments)`). Saying that everything is an object simply means that everything that can be manipulated in Python, even values of built-in types like `int`, `float` and `str`, is an object.

In [1]:
(1.0).is_integer()

True

In [2]:
"this is a string".capitalize()

'This is a string'

In [3]:
[1, 3, 2].index(2)

2

So what is a class and how do classes and objects work together? Basically, a class is two things:

- An object's _type_, i.e. what you get if you run `type(object)`;
- An object's _blueprint_, i.e. a declaration of the data and methods an object contains.

This means that in order to create new _types_ of objects in Python, you need to create new _classes_, i.e. create the blueprint that describes the new object type, which you can then use to _instantiate_ objects. The promise of Python's "everything is an object" claim is that doing so, you can create objects that function seamlessly within Python, as if they were built-in objects.
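As a small illustration of both roles, the sketch below uses only built-in types: `type()` returns the class of an object, and that same class can then be called to instantiate a new object. The variable names are made up for this example.

```python
# Every object has a class as its type, built-ins included.
number_type = type(1.0)
text_type = type("this is a string")
list_type = type([1, 3, 2])
print(number_type, text_type, list_type)

# The class is also the blueprint you call to instantiate new objects.
new_number = number_type("2.5")
print(new_number, type(new_number) is float)
```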

## How to create a class?

Creating a class in Python is as easy as:

In [4]:
# Empty class example
class PrettyUseless:
    pass

pretty_useless_object1 = PrettyUseless()
pretty_useless_object2 = PrettyUseless()
print(pretty_useless_object1, pretty_useless_object2)

<__main__.PrettyUseless object at 0x000001D2949330D0> <__main__.PrettyUseless object at 0x000001D294932710>


You cannot do much with instances of such an empty class, apart from comparing them via `is` and `is not`:

In [5]:
pretty_useless_object1 is pretty_useless_object2

False

In [6]:
pretty_useless_object3 = pretty_useless_object1
pretty_useless_object1 is pretty_useless_object3

True

User-defined classes are always instantiated using the `object = Class()` syntax. This is actually not that different from built-in classes: you can call `int()`, `list()` and `str()`, for instance (those also have other ways of being instantiated, for historical and practical reasons). By convention, user-defined class names should start with a capital letter (the `CapWords` style recommended by PEP 8). That built-in class names don't is purely a historical accident.

In order to create useful objects, you need to add to your class _attributes_ and _methods_. Attributes are the data contained in your object, and can be of any type (including user-defined classes, even the very class you're creating!). Methods are the functions that work on that data.

In [7]:
class FancyList:
    """A simple list with a name."""
    def __init__(self, name: str, contents: list) -> None:
        self.name = name
        self.contents = contents

    def append(self, added) -> None:
        self.contents.append(added)

In [8]:
fancy_list = FancyList("Simple Title", [1, 2, 3])
print(fancy_list.name, fancy_list.contents)

fancy_list.append(4)
fancy_list.name = "Slightly Less Simple Title"
print(fancy_list.name, fancy_list.contents)

Simple Title [1, 2, 3]
Slightly Less Simple Title [1, 2, 3, 4]


As you can see, attributes are defined with the `self.attribute` syntax _within a method_ of the defined class, and can be accessed using the `object.attribute` syntax (and can be modified using the `object.attribute = something` syntax), while methods are defined with the `def method(self, other_arguments):` syntax, and are accessed using the `object.method(other_arguments)` syntax.

In the class definition, `self` stands for the instantiated object the method is called on. The name itself is a convention rather than a requirement (although if you want to be understood by other programmers I urge you to stick to it!). Having that parameter as the first argument of such methods is mandatory, though.
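To make the role of `self` more concrete, here is a small sketch that repeats the `FancyList` definition from above and calls `append()` in two equivalent ways. The variable `demo_list` is introduced only for this illustration.

```python
class FancyList:
    """A simple list with a name (same definition as above)."""
    def __init__(self, name: str, contents: list) -> None:
        self.name = name
        self.contents = contents

    def append(self, added) -> None:
        self.contents.append(added)

demo_list = FancyList("Simple Title", [1, 2, 3])

# The usual call: Python passes `demo_list` as `self` for you.
demo_list.append(4)
# The same call spelled out through the class, passing `self` by hand.
FancyList.append(demo_list, 5)

print(demo_list.contents)  # [1, 2, 3, 4, 5]
```

The second form is rarely written in practice, but it shows why `self` has to appear in the method's signature: it is an ordinary parameter that Python fills in for you.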

Also, note the `__init__()` method. It's a special method that makes it possible to pass arguments to the object instantiation syntax `obj = Class(arguments)`, and it's used to initialise instantiated objects with the data passed to it. We'll get back to it later.

## Properties and encapsulation

While you may be happy with creating classes with just plain attributes and methods, one issue with this is that your attributes are fully read-write to anyone using your class. This means they can go about and change them however they want, including putting data in them that can break your class's functionality.

In [9]:
# Let's break FancyList
fancy_list.contents = 1.5
fancy_list.append(5) # Oops

AttributeError: 'float' object has no attribute 'append'

In other words, Python classes are wide open by default: every attribute can be overwritten by the user of your class, even if doing so breaks the class's functionality. This can be a big problem, especially if your class is supposed to ensure that its data follows some specific rules (usually called _invariants_). In our case, the invariant is that `contents` must always be a list, but an invariant can be any rule you need.

You could, for instance, want a fancy list that can only contain positive numbers, and will stop you from instantiating it with negative numbers, or from appending negative numbers to it.

In [10]:
class FancyList:
    """A simple list with a name that only accepts positive numbers."""
    def __init__(self, name: str, contents: list[int | float]) -> None:
        self.name = name
        if any(el < 0 for el in contents):
            raise ValueError("No negative numbers allowed")
        self.contents = contents

    def append(self, added) -> None:
        if isinstance(added, list) and any(el < 0 for el in added):
            raise ValueError("No negative numbers allowed")
        if isinstance(added, (int, float)) and added < 0:
            raise ValueError("No negative numbers allowed")
        self.contents.append(added)

In [11]:
fancy_list = FancyList("Won't work", [1, 2, -3])

ValueError: No negative numbers allowed

In [12]:
fancy_list = FancyList("Will work", [1, 2, 3])
print(fancy_list.name, fancy_list.contents)
fancy_list.append(-5) # won't work

Will work [1, 2, 3]


ValueError: No negative numbers allowed

But nobody can stop a user from doing:

In [13]:
fancy_list.contents = [-1, -2, -3]
print(fancy_list.name, fancy_list.contents)

Will work [-1, -2, -3]


Your class cannot make any guarantee about its data as long as attributes can be overwritten at will.

The solution is called encapsulation, i.e. making object-internal data unreachable by outside users and giving access to it only through methods. This way, the class programmer has full control over what the users of a class can or cannot do.

Here's a naive implementation of encapsulation to solve the problem of disallowing negative numbers in your fancy list:

In [15]:
class FancyList:
    """A simple list with a name that only accepts positive numbers."""
    def __init__(self, name: str, contents: list[int | float]) -> None:
        self.__name = name
        if any(el < 0 for el in contents):
            raise ValueError("No negative numbers allowed")
        self.__contents = contents

    def get_name(self) -> str:
        return self.__name

    def get_contents(self) -> list[int | float]:
        return self.__contents
    
    def set_contents(self, contents: list[int | float]) -> None:
        if any(el < 0 for el in contents):
            raise ValueError("No negative numbers allowed")
        self.__contents = contents

    def append(self, added) -> None:
        if isinstance(added, list) and any(el < 0 for el in added):
            raise ValueError("No negative numbers allowed")
        if isinstance(added, (int, float)) and added < 0:
            raise ValueError("No negative numbers allowed")
        self.__contents.append(added)

In [16]:
fancy_list = FancyList("Will work", [1, 2, 3])
print(fancy_list.get_name(), fancy_list.get_contents())
fancy_list.set_contents([-1, -2, -3]) # won't work

Will work [1, 2, 3]


ValueError: No negative numbers allowed

In [None]:
fancy_list.__name # Won't work either

There are two things to notice here:

1. A double underscore was added in front of the attribute names. A leading double underscore is Python's way of creating _private attributes_, i.e. attributes that cannot be accessed by that name from outside the class definition. They are not _truly_ private (Python merely renames them behind the scenes, so there is a way to get to them if you really want to), but they are the strongest tool Python has to offer to encapsulate data;
2. Attributes are instead accessed using an `object.get_attribute()` method, and set to a different value with an `object.set_attribute(new_value)` method, which can do whatever invariant testing it needs before actually overwriting the attribute.
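The "way to get to them" mentioned in the first point is called _name mangling_: inside a class body, Python rewrites `__attribute` to `_ClassName__attribute`. The sketch below uses a hypothetical `Secretive` class (not part of the running example) to show both the failed access and the mangled name.

```python
class Secretive:
    """Hypothetical class used only to illustrate name mangling."""
    def __init__(self) -> None:
        self.__hidden = 42  # stored as `_Secretive__hidden`

secretive = Secretive()

try:
    secretive.__hidden  # no mangling happens outside the class body
except AttributeError as error:
    print(error)

# The attribute still exists, under its mangled name.
print(secretive._Secretive__hidden)  # 42
```

Reaching in through the mangled name is possible but clearly signals that you are bypassing the class's intended interface.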

Note also that by creating a `get_name()` method but no `set_name()` method, we've effectively made the `name` attribute of `FancyList` _read-only_, a pattern that is very useful for object data that you want to prevent from being overwritten wholesale (you can still modify it via another method if need be).

Now, while it works, there's a big problem with this pattern: the `get_*()` and `set_*()` methods are very un-Python-like. They look like something right out of Java, and look very out of place, especially when built-in Python types and other libraries make liberal use of the attribute syntax. What we actually want is to be able to use plain attribute syntax, but with the control granted by getter and setter methods. It turns out that we can do just that, using the `@property` decorator. Using it to rewrite the `FancyList` class, we get the following:

In [17]:
class FancyList:
    """A simple list with a name that only accepts positive numbers."""
    def __init__(self, name: str, contents: list[int | float]) -> None:
        self.__name = name
        if any(el < 0 for el in contents):
            raise ValueError("No negative numbers allowed")
        self.__contents = contents

    @property
    def name(self) -> str:
        return self.__name

    @property
    def contents(self) -> list[int | float]:
        return self.__contents
    
    @contents.setter
    def contents(self, contents: list[int | float]) -> None:
        if any(el < 0 for el in contents):
            raise ValueError("No negative numbers allowed")
        self.__contents = contents

    def append(self, added) -> None:
        if isinstance(added, list) and any(el < 0 for el in added):
            raise ValueError("No negative numbers allowed")
        if isinstance(added, (int, float)) and added < 0:
            raise ValueError("No negative numbers allowed")
        self.__contents.append(added)

In [18]:
fancy_list = FancyList("Will work", [1, 2, 3])
print(fancy_list.name, fancy_list.contents)
fancy_list.contents = [4, 5, 6] # will work
print(fancy_list.name, fancy_list.contents)
fancy_list.contents = [-1, -2, -3] # won't work

Will work [1, 2, 3]
Will work [4, 5, 6]


ValueError: No negative numbers allowed

In [19]:
fancy_list.name = "Won't work"

AttributeError: property 'name' of 'FancyList' object has no setter

As you can see, using the `@property` decorator, you can create _getter_ methods, which although called using attribute syntax are actually methods under the hood. And once you have a property defined, you can create the corresponding _setter_ method by using a decorator of the form `@attribute.setter`, and defining the setter method using the same function name as the getter method. This setter method, while also called using attribute syntax, is a method under the hood and can thus do whatever the class programmer wants.

Properties are very powerful. They allow you to create getter and setter methods even for things that do not have an underlying attribute in the class!

## Dunder methods, also called "magic" methods

While properties get you quite a long way into making user-defined classes indistinguishable from built-in classes, we're still a long way from the promise of seamless function. For instance, in the case of our fancy list class, we still need to access the `contents` attribute if we want to do pretty much any operation on the underlying list. But really, we want our fancy list to be a drop-in replacement of plain lists, with full symmetry between the two.

To do that, we can keep on doing what we've been doing by defining the `append()` method: define the corresponding list methods in our fancy list class, and _delegate_ the actual execution of the method to the underlying list. For instance, since lists can be sorted in place using the `sort()` method, we want to be able to do the same with our fancy list. This can be done like this:

In [20]:
class FancyList:
    """A simple list with a name."""
    def __init__(self, name: str, contents: list) -> None:
        self.__name = name
        self.__contents = contents

    @property
    def name(self) -> str:
        return self.__name

    @property
    def contents(self) -> list:
        return self.__contents
    
    @contents.setter
    def contents(self, contents: list) -> None:
        self.__contents = contents    

    def append(self, added) -> None:
        self.contents.append(added)

    def sort(self, *args, **kwargs):
        self.contents.sort(*args, **kwargs)

In [21]:
fancy_list = FancyList("Unsorted", [1, 3, 2])
print(fancy_list.name, fancy_list.contents)
fancy_list.sort()
print(fancy_list.name, fancy_list.contents)

Unsorted [1, 3, 2]
Unsorted [1, 2, 3]


The `*args, **kwargs` syntax is Python's way of saying: "collect any extra positional arguments into a tuple (`args`) and any extra keyword arguments into a dictionary (`kwargs`)". Unpacking them again in the inner call passes them wholesale to the other function, which is vital to this type of delegation.
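The following sketch shows what `*args` and `**kwargs` actually contain, and how forwarding them lets a wrapper accept every option of `list.sort()` without naming any of them. The function names `describe` and `sort_in_place` are made up for this example.

```python
def describe(*args, **kwargs) -> tuple[tuple, dict]:
    """Return the extra arguments exactly as Python collected them."""
    return args, kwargs

print(describe(1, 2, reverse=True))  # ((1, 2), {'reverse': True})

def sort_in_place(values: list, *args, **kwargs) -> None:
    """Forward every extra argument to list.sort() untouched."""
    values.sort(*args, **kwargs)

words = ["ccc", "a", "bb"]
sort_in_place(words, key=len, reverse=True)
print(words)  # ['ccc', 'bb', 'a']
```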

Now, while you can do that with any _method_ from the `list` class, you soon hit a brick wall with one of the most common operations done on lists: getting their length. That's because getting the length of a list in Python isn't done using a method, but using the _function_ `len()`, which is defined outside of the definition of `list` (and works with many other things besides lists).

Since `len()` is a built-in function, its definition is set and you cannot redefine it for your fancy list class. Of course, you could define a `len()` _method_ for your class, but then you'd be stuck with having to write `object.len()` to get its length, while other types of lists would use `len(object)` instead. This would break the symmetry and promise of seamless function we've been striving for. Was it all in vain?

Of course not, and Python provides a neat solution to this problem: the _dunder_ methods (also called _magic_ methods).

Dunder methods are basically reserved method _names_ that, when defined in your class, allow it (via the power of _duck typing_) to make use of built-in Python functions, operators and syntax. They are called _dunder_ methods because their names always _start and end_ with a _d_ouble _under_score, and they are sometimes called _magic_ methods because they are rarely called directly by a user. Instead, they are defined by the class programmer, and this definition alone is enough for a seemingly unrelated bit of the Python language to suddenly work with instances of that class.

In fact, you've already come across dunder methods: `__init__()` is one of them. Although you define it in order to be able to pass arguments to the instantiation syntax of the class, the user never actually calls `__init__()` directly. Rather, Python itself, when seeing the `object = Class(initial_arguments)` syntax, will first create an empty instance of that class, and _then_ call the `__init__()` method of that object, if it exists, passing it the full `initial_arguments`. By defining `__init__()` for your class, you've made it able to use a bit of Python syntax it wouldn't have been able to use otherwise.
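The two steps described above can be sketched by hand. The empty instance is created by another dunder method, `__new__()`, which is not covered in this notebook. The `Greeter` class is hypothetical, and the by-hand version is only a rough equivalent of what Python does, not something you would normally write.

```python
class Greeter:
    """Hypothetical class used only to trace instantiation."""
    def __init__(self, greeting: str) -> None:
        self.greeting = greeting

# What you normally write:
usual = Greeter("hello")

# Roughly what Python does for you behind the scenes:
by_hand = Greeter.__new__(Greeter)  # 1. create an empty instance
by_hand.__init__("hello")           # 2. initialise it with the arguments

print(usual.greeting == by_hand.greeting)  # True
```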

What does this mean for our `len()` issue? Well, as it happens, in order to be able to use `len()`, the only thing a class needs is to implement the `__len__()` dunder method.

In [22]:
len(fancy_list)

TypeError: object of type 'FancyList' has no len()

In [23]:
class FancyList:
    """A simple list with a name."""
    def __init__(self, name: str, contents: list) -> None:
        self.__name = name
        self.__contents = contents

    @property
    def name(self) -> str:
        return self.__name

    @property
    def contents(self) -> list:
        return self.__contents
    
    @contents.setter
    def contents(self, contents: list) -> None:
        self.__contents = contents    

    def append(self, added) -> None:
        self.contents.append(added)

    def sort(self, *args, **kwargs):
        self.contents.sort(*args, **kwargs)

    def __len__(self) -> int:
        return len(self.contents)

In [24]:
fancy_list = FancyList("Unsorted", [1, 3, 2])
len(fancy_list)

3

But dunder methods are far more powerful than just giving access to built-in functions. There are in fact [plenty](https://mathspp.com/blog/pydonts/dunder-methods) of them, and they give you access to pretty much every part of the Python machinery.

For instance, a built-in capacity of lists is to have their contents accessed via the bracket syntax. Well, as it turns out, in Python `object[key]` is just syntactic sugar for `object.__getitem__(key)`, and `object[key] = value` is really simply `object.__setitem__(key, value)`. So making your fancy list able to use the bracket syntax is as easy as:

In [25]:
class FancyList:
    """A simple list with a name."""
    def __init__(self, name: str, contents: list) -> None:
        self.__name = name
        self.__contents = contents

    @property
    def name(self) -> str:
        return self.__name

    @property
    def contents(self) -> list:
        return self.__contents
    
    @contents.setter
    def contents(self, contents: list) -> None:
        self.__contents = contents    

    def append(self, added) -> None:
        self.contents.append(added)

    def sort(self, *args, **kwargs):
        self.contents.sort(*args, **kwargs)

    def __len__(self) -> int:
        return len(self.contents)
    
    def __getitem__(self, key):
        return self.contents[key]
    
    def __setitem__(self, key, value) -> None:
        self.contents[key] = value

In [26]:
fancy_list = FancyList("Wow!", [1, 3, 2])
print(fancy_list[1])
fancy_list[1:] = [2, 3, 4]
fancy_list.contents

3


[1, 2, 3, 4]

Other things routinely done with lists are membership testing (`if value in object:`) and iteration (`for value in object:`). As it turns out, those are defined in terms of the dunder methods `__contains__()` and `__iter__()`, and so adding them to your fancy list class just requires defining them:

In [27]:
class FancyList:
    """A simple list with a name."""
    def __init__(self, name: str, contents: list) -> None:
        self.__name = name
        self.__contents = contents

    @property
    def name(self) -> str:
        return self.__name

    @property
    def contents(self) -> list:
        return self.__contents
    
    @contents.setter
    def contents(self, contents: list) -> None:
        self.__contents = contents    

    def append(self, added) -> None:
        self.contents.append(added)

    def sort(self, *args, **kwargs):
        self.contents.sort(*args, **kwargs)

    def __len__(self) -> int:
        return len(self.contents)
    
    def __getitem__(self, key):
        return self.contents[key]
    
    def __setitem__(self, key, value) -> None:
        self.contents[key] = value

    def __contains__(self, item) -> bool:
        return item in self.contents
    
    def __iter__(self):
        return iter(self.contents)

In [28]:
fancy_list = FancyList("Iterable list", [1, 3, 2, 4, 5])
if 2 in fancy_list:
    print("It worked!")
for item in fancy_list:
    print(item)

It worked!
1
3
2
4
5
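As a side note, a class that defines `__getitem__()` with integer indices already supports `in` and `for` through an older fallback: Python calls `__getitem__(0)`, `__getitem__(1)` and so on until an `IndexError` is raised. The sketch below, with a hypothetical `OnlyGetItem` class, illustrates this. Defining `__contains__()` and `__iter__()` explicitly, as done above, is still the clearer choice and lets the underlying list do the work directly.

```python
class OnlyGetItem:
    """Hypothetical class that defines __getitem__() and nothing else."""
    def __init__(self, contents: list) -> None:
        self.contents = contents

    def __getitem__(self, key):
        return self.contents[key]  # raises IndexError past the end

fallback = OnlyGetItem([1, 3, 2])
print(2 in fallback)   # True
print(list(fallback))  # [1, 3, 2]
```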


Another thing lists can do is concatenation using `+`. This, once again, is available using the dunder method `__add__()`. We can even use it to make our fancy list concatenable with plain Python lists!

In [29]:
from typing import Self  # available from Python 3.11 onwards

class FancyList:
    """A simple list with a name."""
    def __init__(self, name: str, contents: list) -> None:
        self.__name = name
        self.__contents = contents

    @property
    def name(self) -> str:
        return self.__name

    @property
    def contents(self) -> list:
        return self.__contents
    
    @contents.setter
    def contents(self, contents: list) -> None:
        self.__contents = contents    

    def append(self, added) -> None:
        self.contents.append(added)

    def sort(self, *args, **kwargs):
        self.contents.sort(*args, **kwargs)

    def __len__(self) -> int:
        return len(self.contents)
    
    def __getitem__(self, key):
        return self.contents[key]
    
    def __setitem__(self, key, value) -> None:
        self.contents[key] = value

    def __contains__(self, item) -> bool:
        return item in self.contents
    
    def __iter__(self):
        return iter(self.contents)
    
    def __add__(self, other) -> Self:
        if isinstance(other, FancyList):
            return FancyList(f"{self.name} and {other.name}", self.contents + other.contents)
        elif isinstance(other, list):
            return FancyList(f"{self.name} and unnamed", self.contents + other)
        else:
            raise TypeError("Cannot concatenate anything other than a list or fancy list to a fancy list")

In [30]:
fancy_list = FancyList("Concatenable list", [1, 3, 2, 4, 5])
print(fancy_list + FancyList("Other list", [6, 7, 8, 9]))
print(fancy_list + [6, 7, 8])
print([-2, -1, 0] + fancy_list)

<__main__.FancyList object at 0x000001D295720CD0>
<__main__.FancyList object at 0x000001D295721050>


TypeError: can only concatenate list (not "FancyList") to list

Oops! We have two problems here:

1. Our fancy list doesn't look as nice as a normal list when we try to print it;
2. While we can concatenate a list to a fancy list, we can't do it the other way round.

To solve the first issue, we need a way to represent our fancy list as a string that works with how Python displays things. As it turns out, there are actually _two_ ways to do so: `__repr__()` and `__str__()`.

The first method is the so-called "machine-readable" representation of the object. Ideally (with emphasis on _ideally_), it is a string that, when copied and pasted into a Python REPL, recreates the object as it is at that time. Of course something like that isn't always possible, but the `__repr__()` representation should strive to achieve this goal as much as possible. In the case of our fancy list, this simply means it should return a string containing the instantiation syntax of the object (even if that object was created in a different way, like concatenation).

The second method is the so-called "human-readable" representation of the object. It makes no claims as to machine usage, and is just a way to represent the object that we will find nice to read. `__str__()` is what gets called whenever `print()` or `str()` is used on an object. For built-in types, the two methods often return the same thing (in fact, if you define `__repr__()` but not `__str__()`, calls to the latter fall back to the former), but for user-defined types it's not unusual for them to differ.
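The standard library's `datetime.date` is a handy illustration of the difference between the two representations: its `repr()` is valid Python that rebuilds the object, while its `str()` is meant for people. The variable names below are made up for this example.

```python
import datetime

day = datetime.date(2024, 1, 31)

print(repr(day))  # datetime.date(2024, 1, 31)
print(str(day))   # 2024-01-31

# The repr can be evaluated to recreate an equal object.
rebuilt = eval(repr(day))
print(rebuilt == day)  # True
```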

For the second issue, we hit another brick wall: if we try to add a fancy list to a plain list, the operation will try to call the plain list's `__add__()` method, which we have no way to modify. Does that mean we're stuck? Of course not. The Python developers anticipated this and provided the `__radd__()` method for that use case. Basically, if you type `other + object` and `other` doesn't have an `__add__()` method that can handle `object`, Python will check whether `object` implements an `__radd__()` method before raising an error. The Python developers even provide the `__iadd__()` method to handle the `object += value` syntax, as it often warrants a separate implementation (at least for mutable types, where the operation should happen in place if possible).
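A related convention, sketched here on a hypothetical `Metres` class rather than on the fancy list, is to return the special value `NotImplemented` from `__add__()` when the other operand is of an unsupported type, instead of raising `TypeError` yourself. Python then tries the other operand's `__radd__()` and raises a `TypeError` on its own if nothing works.

```python
class Metres:
    """Hypothetical class used only to illustrate reflected addition."""
    def __init__(self, value: float) -> None:
        self.value = value

    def __add__(self, other):
        if isinstance(other, Metres):
            return Metres(self.value + other.value)
        if isinstance(other, (int, float)):
            return Metres(self.value + other)
        return NotImplemented  # let Python try other.__radd__() next

    def __radd__(self, other):
        # Reached for `other + self` when other.__add__() gave up.
        return self.__add__(other)

print((Metres(1.5) + 2).value)  # 3.5
print((2 + Metres(1.5)).value)  # 3.5
```

The fancy list code in this notebook raises `TypeError` directly instead, which is simpler to read but gives the other operand no chance to handle the operation.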

Knowing this, we can add more functionality to our fancy list class this way:

In [31]:
class FancyList:
    """A simple list with a name."""
    def __init__(self, name: str, contents: list) -> None:
        self.__name = name
        self.__contents = contents

    @property
    def name(self) -> str:
        return self.__name

    @property
    def contents(self) -> list:
        return self.__contents
    
    @contents.setter
    def contents(self, contents: list) -> None:
        self.__contents = contents    

    def append(self, added) -> None:
        self.contents.append(added)

    def sort(self, *args, **kwargs):
        self.contents.sort(*args, **kwargs)

    def __len__(self) -> int:
        return len(self.contents)
    
    def __getitem__(self, key):
        return self.contents[key]
    
    def __setitem__(self, key, value) -> None:
        self.contents[key] = value

    def __contains__(self, item) -> bool:
        return item in self.contents
    
    def __iter__(self):
        return iter(self.contents)
    
    def __add__(self, other) -> Self:
        if isinstance(other, FancyList):
            return FancyList(f"{self.name} and {other.name}", self.contents + other.contents)
        elif isinstance(other, list):
            return FancyList(f"{self.name} and unnamed", self.contents + other)
        else:
            raise TypeError("Cannot concatenate anything other than a list or fancy list to a fancy list")
        
    def __radd__(self, other) -> Self:
        if isinstance(other, list):
            return FancyList(f"unnamed and {self.name}", other + self.contents)
        else:
            raise TypeError("Cannot concatenate anything other than a list or fancy list to a fancy list")
        
    def __iadd__(self, other) -> Self:
        if isinstance(other, FancyList):
            self.__name = f"{self.name} and {other.name}"
            self.contents += other.contents
            return self
        elif isinstance(other, list):
            self.__name = f"{self.name} and unnamed"
            self.contents += other
            return self
        else:
            raise TypeError("Cannot concatenate anything other than a list or fancy list to a fancy list")
    
    def __repr__(self) -> str:
        return f"FancyList({self.name!r}, {self.contents!r})"  # !r quotes and escapes the name properly
    
    def __str__(self) -> str:
        return f"{self.name} {self.contents}"

In [32]:
fancy_list = FancyList("Concatenable list", [1, 3, 2, 4, 5])
print(fancy_list + FancyList("Other list", [6, 7, 8, 9]))
print(fancy_list + [6, 7, 8])
print([-2, -1, 0] + fancy_list)

Concatenable list and Other list [1, 3, 2, 4, 5, 6, 7, 8, 9]
Concatenable list and unnamed [1, 3, 2, 4, 5, 6, 7, 8]
unnamed and Concatenable list [-2, -1, 0, 1, 3, 2, 4, 5]


In [33]:
fancy_list + FancyList("Other list", [6, 7, 8, 9])

FancyList('Concatenable list and Other list', [1, 3, 2, 4, 5, 6, 7, 8, 9])

In [34]:
fancy_list += FancyList("Other list", [6, 7, 8])
fancy_list

FancyList('Concatenable list and Other list', [1, 3, 2, 4, 5, 6, 7, 8])

There are many more of these dunder methods (check the link above!), but the most important thing to remember is that with them, it's possible to make your user-defined classes feel as integrated into Python as built-in classes are, and the only limitation is what you _choose_ to implement.

## What's more?

With properties and dunder methods, we've only scratched the surface of what you can do with Python classes, and how to make them feel truly Pythonic in every way. There are many other things to explore: _class attributes_ and _class methods_; _dataclasses_, which can help you create your classes faster; _inheritance_, which allows you to handle related classes in a smart way; and `__call__()`, which lets your objects be called like functions. On top of that, the claim "everything is an object" applies to classes themselves as well: classes, as in the blueprints themselves, are objects too, and can be manipulated in Python as such. But those are more advanced topics for another time.