# Dataobjects without cyclic GC support

One might ask why Python needs classes without garbage collection support: is that safe? To clarify, we are talking about classes that do not support *cyclic* garbage collection, the mechanism that disposes of container objects which may contain reference cycles. Ordinary disposal of an object when its reference count drops to zero works as usual.
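
The difference between the two disposal mechanisms can be seen with the standard library alone. The following is a minimal sketch that assumes CPython; the `Node` class and the variable names are illustrative and not part of any library. With the cyclic collector switched off, an ordinary object is still freed as soon as its reference count reaches zero, while two objects that refer to each other survive until `gc.collect()` runs.

```python
import gc
import weakref

class Node:
    """Illustrative container that can point at another Node."""
    def __init__(self):
        self.other = None

gc.disable()  # switch off automatic cyclic GC; reference counting still works

# Case 1: no cycle. Dropping the last reference frees the object at once.
a = Node()
alive = weakref.ref(a)
del a
freed_by_refcount = alive() is None

# Case 2: a reference cycle. The reference counts never reach zero on their own.
a, b = Node(), Node()
a.other, b.other = b, a
alive = weakref.ref(a)
del a, b
survives_without_gc = alive() is not None

gc.collect()  # an explicit cyclic collection reclaims the pair
freed_by_gc = alive() is None

gc.enable()
print(freed_by_refcount, survives_without_gc, freed_by_gc)
```

On CPython all three flags are expected to be `True`. Objects that never form such cycles only ever need the first mechanism.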

When is this reasonable? For example, when the object represents a record/struct whose fields hold values of simple data types: numbers, strings, dates/times, etc.

There is a more general class of objects that have no reference cycles by construction. Reference cycles are theoretically possible, but the instances are created and used in such a way that cycles never actually arise. For such objects there is no need to allocate the additional `3*8=24` bytes of memory for the GC header (on 64-bit platforms; since Python 3.8 only `2*8=16`). If memory is limited or you create millions of tiny or small objects, this extra memory can make a difference. A further benefit is that Python spends less time on garbage collection.
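
To put these numbers in perspective, here is a back-of-the-envelope estimate. It takes the header sizes quoted above at face value (three pointer-sized words before Python 3.8, two afterwards); the count of ten million objects is an arbitrary illustration.

```python
import struct
import sys

pointer_size = struct.calcsize("P")  # 8 on 64-bit platforms, 4 on 32-bit

# Header size in machine words, following the figures quoted in the text.
header_words = 2 if sys.version_info >= (3, 8) else 3
gc_header_bytes = header_words * pointer_size

n_objects = 10_000_000  # arbitrary example count
saved_bytes = gc_header_bytes * n_objects

print("GC header per object:", gc_header_bytes, "bytes")
print("Saved for", n_objects, "objects:", round(saved_bytes / 2**20, 1), "MiB")
```

On a 64-bit build of Python 3.8 or later this comes to 16 bytes per object, or roughly 150 MiB for ten million objects.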

Nowadays there is a well-known way of defining classes with `__slots__`, whose instances have no `__dict__` or `__weakref__` and therefore require less memory. Let's demonstrate this with an example.

In [1]:
import sys

class PointSlots:
    __slots__ = 'x', 'y'

    def __init__(self, x, y):
        self.x = x
        self.y = y
        
p = PointSlots(1,2)

print("Own size:", p.__sizeof__(), "bytes")
print("Own size + gc header:", sys.getsizeof(p), "bytes")

Own size: 32 bytes
Own size + gc header: 48 bytes


The additional 16 bytes (24 before Python 3.8) are the GC header. Strictly speaking, though, cyclic garbage collection is not needed for our class.
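
The gap between the two numbers can be checked for other objects as well. The sketch below, which assumes CPython, subtracts `__sizeof__()` from `sys.getsizeof()` and asks `gc.is_tracked()` whether the collector is watching the object. The helper name `gc_overhead` is made up for this illustration.

```python
import gc
import sys

class PointSlots:
    __slots__ = 'x', 'y'

    def __init__(self, x, y):
        self.x = x
        self.y = y

def gc_overhead(obj):
    """Bytes that sys.getsizeof() adds on top of the object's own size."""
    return sys.getsizeof(obj) - obj.__sizeof__()

p = PointSlots(1, 2)
n = 12345  # an int cannot reference other objects, so it is never GC-tracked

print("PointSlots:", gc_overhead(p), "extra bytes, tracked:", gc.is_tracked(p))
print("int:       ", gc_overhead(n), "extra bytes, tracked:", gc.is_tracked(n))
```

The `__slots__` instance is tracked by the collector even though it only holds two integers, which is exactly the overhead that a class without cyclic GC support avoids.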

The new release `0.10` of the [recordclass](https://pypi.org/project/recordclass/) library introduces tools for creating such classes, which have no GC support by default.

In [2]:
from recordclass import dataobject

class PointDO(dataobject):
    x:int
    y:int
    __options__ = {'fast_new':True}
        
p2 = PointDO(1,2)
print("Own size:", p2.__sizeof__(), "bytes")
print("Own size + gc header:", sys.getsizeof(p2), "bytes")

Own size: 32 bytes
Own size + gc header: 32 bytes


As with a class that defines `__slots__`, only attributes with the given names `x` and `y` are allowed.
There is no `__dict__`.
There is no weak reference support.
But all of these can be enabled explicitly, as with `__slots__`.
The class can also be subclassed.
Any field of an instance can be made read-only, and default values are also possible. Finally, let's compare performance to show that there is no regression:

In [3]:
def test_PointSlots():
    for i in range(1000000):
        p = PointSlots(1, 2)
        p.x = 5; p.y = 6
        a = p.x; b = p.y

def test_PointDO():
    for i in range(1000000):
        p = PointDO(1, 2)
        p.x = 5; p.y = 6
        a = p.x; b = p.y        

In [4]:
print("Point (__slots__):")
%timeit test_PointSlots()
print("Point (dataobject):")
%timeit test_PointDO()

Point (__slots__):
255 ms ± 2.22 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Point (dataobject):
155 ms ± 8.16 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [5]:
import psutil
import os
import gc

def memory_usage_psutil():
    # Return this process's memory usage as a percentage of total RAM (like %MEM in top)
    process = psutil.Process(os.getpid())
    mem = process.memory_percent()
    return mem


In [6]:
class NodeDO(dataobject):
    left: 'NodeDO'
    right: 'NodeDO'
    __options__ = {'fast_new':True}
        
class TreeDO(dataobject):
    root: NodeDO

def add_nodes_do(depth):
    if depth == 0:
        return None
    return NodeDO(add_nodes_do(depth-1), add_nodes_do(depth-1))

def test_do():
    root = add_nodes_do(22)
    tree = TreeDO(root)
#     return c


In [7]:
class NodeSlots:
    __slots__ = 'left', 'right'
    
    def __init__(self, left, right):
        self.left = left
        self.right = right
        
class TreeSlots:
    __slots__ = 'root',
    
    def __init__(self, root):
        self.root = root

def add_nodes_slots(depth):
    if depth == 0:
        return None
    return NodeSlots(add_nodes_slots(depth-1), add_nodes_slots(depth-1))

def test_slots():
    root = add_nodes_slots(22)
    tree = TreeSlots(root)
#     return c


In [8]:
%timeit test_do()

962 ms ± 6.33 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [9]:
%timeit test_slots()

1.36 s ± 6.56 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
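
`%timeit` is an IPython magic. Outside a notebook, a comparable measurement can be sketched with the standard `timeit` module. `NodeSlots` is redefined here so the snippet runs on its own, and the tree depth of 12 and the repeat counts are arbitrary choices made to keep it fast, not the settings used above.

```python
import timeit

class NodeSlots:
    __slots__ = 'left', 'right'

    def __init__(self, left, right):
        self.left = left
        self.right = right

def add_nodes_slots(depth):
    # Build a complete binary tree of the given depth.
    if depth == 0:
        return None
    return NodeSlots(add_nodes_slots(depth - 1), add_nodes_slots(depth - 1))

def test_slots(depth=12):
    return add_nodes_slots(depth)

# Run the function 5 times per measurement, take 3 measurements,
# and report the best per-call time.
timings = timeit.repeat(test_slots, number=5, repeat=3)
best = min(timings) / 5
print("test_slots: %.3f ms per call (best of 3)" % (best * 1e3))
```

Keep in mind that `timeit` switches the garbage collector off during timing by default; its documentation describes passing `'gc.enable()'` as the setup statement to turn it back on.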
