# Tiny Object Benchmarks

Worth a look if you're dealing with large collections of small objects.
This is a quick set of benchmarks comparing several ways of representing an object, in terms of:

* Time taken to construct a bunch of objects
* Time taken to run an access operation on a bunch of objects
* Memory overhead of the objects

See results below. TL;DR:

* Raw tuples are fastest to construct (but you're sacrificing readability).
* There's no appreciable performance difference between the types when traversing a big collection of objects and reading one field from each: every result shown below lands between roughly 4.1 and 4.8 ms per 10**5 objects.
* Setting the `__slots__` attribute of a class reduces construction time and memory use (by roughly 15% and 12% respectively in the runs below), and the `dataclass` decorator adds essentially no overhead (despite the free stuff you get from it).
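
To make the `__slots__` point concrete, here is a minimal sketch showing that a slotted class stores its attributes without a per-instance `__dict__`, which is where the savings are expected to come from. The `PlainPoint` and `SlottedPoint` classes are hypothetical and not part of the benchmark.

```python
class PlainPoint:
    def __init__(self, x, y):
        self.x = x
        self.y = y


class SlottedPoint:
    # Only these attribute names are allowed; no per-instance __dict__ exists.
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


plain = PlainPoint(1, 2)
slotted = SlottedPoint(1, 2)

print(hasattr(plain, "__dict__"))    # True
print(hasattr(slotted, "__dict__"))  # False

# The trade-off: a slotted instance rejects attributes it did not declare.
try:
    slotted.z = 3
    rejected = False
except AttributeError:
    rejected = True
print(rejected)  # True
```

The price of the smaller footprint is flexibility: you can no longer attach ad hoc attributes to an instance.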

In [1]:
%load_ext memory_profiler
from collections import namedtuple
from dataclasses import dataclass

In [2]:
class Benchmark:
    """ Here's a good reason to use inheritance, btw... """

    @classmethod
    def create_list(cls, n):
        return [
            cls(i, i*2, i*3, i*4, i*5)
            for i in range(n)
        ]

    @staticmethod
    def sum_list(data):
        return sum(obj.value for obj in data)

    @classmethod
    def run(cls):
        print(f"\n--> {cls.BENCHNAME}")
        %timeit cls.create_list(10**5)
        data = cls.create_list(10**5)
        %timeit cls.sum_list(data)
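
The `%timeit` magic only works inside IPython or Jupyter. If you want to try something similar in a plain script, a rough equivalent can be sketched with the standard `timeit` module. The helper names `create_tuples` and `time_call` are made up for this example, and the repeat and loop counts are arbitrary, so the numbers will not match the `%timeit` output exactly.

```python
import timeit


def create_tuples(n):
    # Same construction as the tuple benchmark, as a standalone function.
    return [(i, i * 2, i * 3, i * 4, i * 5) for i in range(n)]


def time_call(func, *args, repeat=3, number=5):
    """Return the best observed time per call, in seconds."""
    timings = timeit.repeat(lambda: func(*args), repeat=repeat, number=number)
    # Each timing covers `number` calls; take the fastest batch.
    return min(timings) / number


seconds = time_call(create_tuples, 10**4)
print(f"{seconds * 1e3:.2f} ms per call")
```

Note that this reports the minimum rather than the mean ± standard deviation that `%timeit` prints.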

In [3]:
class TupleBenchmark(Benchmark):

    BENCHNAME = "tuple"

    @staticmethod
    def create_list(n):
        return [
            (i, i*2, i*3, i*4, i*5)
            for i in range(n)
        ]

    @staticmethod
    def sum_list(data):
        return sum(obj[0] for obj in data)

In [4]:
class NamedTupleBenchmark(Benchmark):

    _namedtuple = namedtuple("Named", ["a", "b", "c", "d", "e"])
    BENCHNAME = "namedtuple"

    @classmethod
    def create_list(cls, n):
        return [
            cls._namedtuple(i, i*2, i*3, i*4, i*5)
            for i in range(n)
        ]

    @staticmethod
    def sum_list(data):
        return sum(obj[0] for obj in data)
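
Note that the access benchmark above reads `obj[0]`, exactly as for a plain tuple. The usual reason to pick a namedtuple is access by name, which returns the same field, as this small sketch shows. Whether named access costs measurably more than index access is not something the results here measure.

```python
from collections import namedtuple

Named = namedtuple("Named", ["a", "b", "c", "d", "e"])
row = Named(1, 2, 3, 4, 5)

# Index access and attribute access return the same field.
by_index = row[0]
by_name = row.a
print(by_index == by_name)  # True
```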

In [5]:
class ListBenchmark(Benchmark):

    BENCHNAME = "list"

    @staticmethod
    def create_list(n):
        return [
            [i, i*2, i*3, i*4, i*5]
            for i in range(n)
        ]

    @staticmethod
    def sum_list(data):
        return sum(obj[0] for obj in data)

In [6]:
class DictBenchmark(Benchmark):

    BENCHNAME = "dict"

    @staticmethod
    def create_list(n):
        return [
            {"value": i, "a": i*2, "b": i*3, "c": i*4, "d": i*5}
            for i in range(n)
        ]

    @staticmethod
    def sum_list(data):
        return sum(obj["value"] for obj in data)

In [7]:
class ClassBenchmark(Benchmark):

    BENCHNAME = "class"

    def __init__(self, value, a, b, c, d):
        self.value = value
        self.a = a
        self.b = b
        self.c = c
        self.d = d

In [8]:
@dataclass
class DataClassBenchmark(Benchmark):

    BENCHNAME = "class + dataclass"

    value: int
    a: int
    b: int
    c: int
    d: int

In [9]:
class SlotsClassBenchmark(Benchmark):

    BENCHNAME = "class + slots"
    __slots__ = ["value", "a", "b", "c", "d"]

    def __init__(self, value, a, b, c, d):
        self.value = value
        self.a = a
        self.b = b
        self.c = c
        self.d = d

In [10]:
@dataclass
class SlotsDataClassBenchmark(Benchmark):

    BENCHNAME = "class + slots + dataclass"
    __slots__ = ["value", "a", "b", "c", "d"]

    value: int
    a: int
    b: int
    c: int
    d: int
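
One caveat worth keeping in mind when reading the slots results: according to the Python data model, a slotted class only loses its per-instance `__dict__` if every class it inherits from also defines `__slots__`. `Benchmark` does not, so instances of the slotted benchmark classes still expose a `__dict__`. How much that affects the measured numbers depends on the interpreter version and is not something this notebook measures. A minimal sketch, using hypothetical classes:

```python
class Base:
    pass


class SlottedBase:
    __slots__ = ()  # an empty slots declaration keeps the chain dict-free


class ChildOfPlain(Base):
    __slots__ = ("value",)

    def __init__(self, value):
        self.value = value


class ChildOfSlotted(SlottedBase):
    __slots__ = ("value",)

    def __init__(self, value):
        self.value = value


a = ChildOfPlain(1)
b = ChildOfSlotted(1)

print(hasattr(a, "__dict__"))  # True: inherited from the unslotted Base
print(hasattr(b, "__dict__"))  # False: every class in the chain uses slots
```

Adding `__slots__ = ()` to the shared base class would be one way to check whether this matters for the comparison.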

In [11]:
TupleBenchmark.run()
NamedTupleBenchmark.run()
ListBenchmark.run()
DictBenchmark.run()
ClassBenchmark.run()
DataClassBenchmark.run()
SlotsClassBenchmark.run()
SlotsDataClassBenchmark.run()


--> tuple
20.6 ms ± 440 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
4.1 ms ± 116 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

--> namedtuple
42.6 ms ± 253 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
4.09 ms ± 50.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

--> list
23 ms ± 300 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
4.18 ms ± 45.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

--> dict
35.5 ms ± 915 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
4.46 ms ± 72.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

--> class
51.3 ms ± 406 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
4.67 ms ± 73.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

--> class + dataclass
52.5 ms ± 130 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
4.76 ms ± 18.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

--> class + slots
43.4 ms ± 173 µs per loop (mean ± std. dev. of 7 run

I'm not 100% sure I've set these benchmarks up correctly, or how reliable a memory-profiler run is for this purpose.
The `increment` field should show the memory used by the objects created; the output lines appear in the same order as the `%memit` calls.
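
One possible cross-check on the `%memit` numbers is the standard-library `tracemalloc` module, which counts the bytes allocated by Python itself rather than sampling the process's memory. The sketch below is an assumption about how such a check could look, not part of the original notebook. `create_tuples` and `traced_bytes` are hypothetical helpers, and the object count is kept small so it runs quickly.

```python
import tracemalloc


def create_tuples(n):
    return [(i, i * 2, i * 3, i * 4, i * 5) for i in range(n)]


def traced_bytes(factory, n):
    """Return (bytes still allocated, peak bytes) while factory(n) is alive."""
    tracemalloc.start()
    try:
        data = factory(n)  # keep a reference so the objects stay alive
        current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return current, peak


current, peak = traced_bytes(create_tuples, 10**4)
print(f"current={current / 2**20:.2f} MiB, peak={peak / 2**20:.2f} MiB")
```

Tracing slows allocation down considerably, so it is best kept separate from the timing runs.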

In [12]:
%memit TupleBenchmark.create_list(10**6)
%memit NamedTupleBenchmark.create_list(10**6)
%memit ListBenchmark.create_list(10**6)
%memit DictBenchmark.create_list(10**6)
%memit ClassBenchmark.create_list(10**6)
%memit DataClassBenchmark.create_list(10**6)
%memit SlotsClassBenchmark.create_list(10**6)
%memit SlotsDataClassBenchmark.create_list(10**6)

peak memory: 277.35 MiB, increment: 203.50 MiB
peak memory: 326.72 MiB, increment: 251.92 MiB
peak memory: 342.93 MiB, increment: 261.81 MiB
peak memory: 374.43 MiB, increment: 292.41 MiB
peak memory: 359.98 MiB, increment: 277.63 MiB
peak memory: 359.90 MiB, increment: 277.38 MiB
peak memory: 327.37 MiB, increment: 244.84 MiB
peak memory: 327.45 MiB, increment: 244.96 MiB
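
Dividing each increment by the 10**6 objects created gives a rough per-object cost. The sketch below assumes the output lines map to the benchmarks in call order, and the figures include the five integers held by each object plus the outer list's pointer, so treat them as approximate.

```python
MIB = 1024 * 1024
N_OBJECTS = 10**6

# Increment values copied from the %memit output above, assumed to be in call order.
increments_mib = {
    "tuple": 203.50,
    "namedtuple": 251.92,
    "list": 261.81,
    "dict": 292.41,
    "class": 277.63,
    "class + dataclass": 277.38,
    "class + slots": 244.84,
    "class + slots + dataclass": 244.96,
}

bytes_per_object = {
    name: mib * MIB / N_OBJECTS for name, mib in increments_mib.items()
}

for name, size in bytes_per_object.items():
    print(f"{name:<28}{size:7.1f} bytes/object")
```

On these numbers the plain tuple is the smallest and the dict the largest, with the slotted classes landing close to the namedtuple.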
