# Python memory management


## C memory management

Static vs dynamic memory allocation.

Static:
- You know in advance how much memory you need
- So you allocate it before the program executes

Dynamic:
- You request memory at run time, as new objects are created
- Allocated memory can be released and reused.


You request new memory with the `malloc` function (or a similar one), which in turn gets it
from the operating system, and you release it with `free`.

### Memory leak - when you allocate memory but forget to release it.
No object uses that memory any more, but the OS still thinks you need it. If this
keeps happening, your app will eat all available memory, and the OS will probably kill it.

### Life is too short to know C very well.

## Python memory manager

In Python you don't need to bother with allocating and releasing memory. The
Python memory manager does it for you, so most of the time you don't need to think
about it.

Allocating memory for an object is not a problem; it's much harder to figure
out when to release it.

Python has two mechanisms for managing the memory of objects:

### Reference counter

The simplest mechanism you can come up with is to count the number of references to
the object: if it is 0, the object is no longer accessible and its memory can be released.

How does it look?

This is the `PyObject` declaration, the base structure for all other object types such as int, list, function, etc.

https://docs.python.org/dev/c-api/structures.html#c.PyObject

https://github.com/python/cpython/blob/master/Include/object.h#L105

```
typedef struct _object {
    _PyObject_HEAD_EXTRA
    Py_ssize_t ob_refcnt;
    PyTypeObject *ob_type;
} PyObject;
```

Every object in Python contains `ob_refcnt`.

You can see how this counter is incremented and decremented by the C functions
`Py_INCREF` and `Py_DECREF`:

```
static inline void _Py_INCREF(PyObject *op)
{
#ifdef Py_REF_DEBUG
    _Py_RefTotal++;
#endif
    op->ob_refcnt++;
}

#define Py_INCREF(op) _Py_INCREF(_PyObject_CAST(op))
```


Here is an example from `list.append(v)`:

https://github.com/python/cpython/blob/master/Objects/listobject.c#L304

```
static int
app1(PyListObject *self, PyObject *v)
{
    Py_ssize_t n = PyList_GET_SIZE(self);

    assert (v != NULL);
    assert((size_t)n + 1 < PY_SSIZE_T_MAX);
    if (list_resize(self, n+1) < 0)
        return -1;

    Py_INCREF(v);
    PyList_SET_ITEM(self, n, v);
    return 0;
}
```

So in this chunk of code, when we add a new object `v` to the list, we increment
the number of references to `v`.

The same thing happens when we create a new variable or pass an object as a function argument.
The counter is decremented again when a function finishes executing and its scope is destroyed, and so on.

This is the code for `_Py_DECREF`

https://github.com/python/cpython/blob/master/Include/object.h#L431

```
static inline void _Py_DECREF(
#ifdef Py_REF_DEBUG
    const char *filename, int lineno,
#endif
    PyObject *op)
{
#ifdef Py_REF_DEBUG
    _Py_RefTotal--;
#endif
    if (--op->ob_refcnt != 0) {
#ifdef Py_REF_DEBUG
        if (op->ob_refcnt < 0) {
            _Py_NegativeRefcount(filename, lineno, op);
        }
#endif
    }
    else {
        _Py_Dealloc(op);
    }
}
```

As soon as the reference counter reaches 0, the memory is deallocated immediately, and
the reference counts of all objects that the current one referenced are decreased too.
So objects are deleted recursively if necessary.
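
The immediate, recursive release can be observed from Python. This is a minimal sketch that assumes CPython; the `Tracked` class and the `events` list are made up for illustration, and `__del__` is used only as a convenient hook that runs when an object is about to be destroyed.

```python
events = []  # records the order in which objects are destroyed


class Tracked:
    """Hypothetical helper that logs its own destruction."""

    def __init__(self, name, child=None):
        self.name = name
        self.child = child  # strong reference to another Tracked, or None

    def __del__(self):
        events.append(self.name)


# The only reference to "child" lives inside "parent".
parent = Tracked("parent", child=Tracked("child"))

# Dropping the last reference to "parent" destroys it right away,
# which in turn drops the last reference to "child".
del parent

print(events)
```

No garbage collector run is needed here: both objects are gone as soon as `del parent` finishes.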

You can check the reference count in Python using `sys.getrefcount`,
but keep in mind that in jupyter-notebook or ipython it will show different numbers, because those environments
do a lot of additional things. So check it in the plain Python interpreter!

And remember that the `sys.getrefcount` call itself also temporarily increases the number of references
to the object.

```
import sys

a = 1000
print(sys.getrefcount(a))
b = [1, 2, a]
print(sys.getrefcount(a))
```

Output:

```
3
4
```
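
Absolute numbers from `sys.getrefcount` depend on the interpreter and its version, so comparing counts is usually more robust than reading them directly. A minimal sketch, assuming CPython; the `Payload` class and the variable names are hypothetical.

```python
import sys


class Payload:
    """Hypothetical empty class, used so that we get a fresh object."""


obj = Payload()
base = sys.getrefcount(obj)          # baseline, includes the temporary reference

alias = obj                          # a second name for the same object
after_alias = sys.getrefcount(obj)

container = [obj]                    # the list stores one more reference
after_list = sys.getrefcount(obj)

del alias                            # remove the extra name
container.clear()                    # remove the reference held by the list
after_cleanup = sys.getrefcount(obj)

print(after_alias - base, after_list - base, after_cleanup - base)
```

Each new name or container slot adds one reference, and removing it brings the counter back to the baseline.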


Quick knowledge check: what value do you expect in this case?

```
c = 3
print(sys.getrefcount(c))
```

Surprised? Check 02_int_caching topic.

So reference counting is a very efficient mechanism! Work happens only when the number of
references changes, and only for the specific objects involved.

But it can't handle some cases =\

## Cyclic reference problem

```
d = []  # 1 reference to list1: variable d
e = [d]  # 1 reference to list2: variable e; 2 references to list1: variable d and list2
d.append(e) # 2 references to list2: variable e and list1

del d, e  # deleting variables
# 1 reference to list1: list2
# 1 reference to list2: list1
```
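
After `del`, neither counter can reach 0, so reference counting alone never frees the pair. The sketch below makes this visible, assuming CPython. Plain lists cannot be weakly referenced, so it uses a hypothetical `Node` class, and a weak reference (covered later in these notes) serves only as a probe that does not keep the object alive.

```python
import gc
import weakref


class Node:
    """Hypothetical class standing in for the two lists above."""

    def __init__(self):
        self.other = None


gc.disable()                # keep automatic collection out of the experiment
gc.collect()                # start from a clean state

a = Node()
b = Node()
a.other = b                 # a -> b
b.other = a                 # b -> a, the cycle is closed

probe = weakref.ref(a)      # lets us check whether the object still exists
del a, b                    # no variable refers to the pair any more

alive_after_del = probe() is not None
gc.collect()                # the cycle detector finds and frees the pair
alive_after_collect = probe() is not None

gc.enable()
print(alive_after_del, alive_after_collect)
```

The object survives `del` and disappears only after the garbage collector has run.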

## Garbage collector

### Mark and Sweep algorithm

1. Build a graph from your entrypoint and mark all objects that are reachable from the entrypoint.
2. Delete all unreachable objects.

Python doesn't use it.

### Generational Garbage collector

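For every container object in the generation being collected, take a copy of its reference counter and subtract the references that come from other objects of the same generation. If the result is 0, the object is referenced only from inside that group; unless it is still reachable from an object whose result is above 0, it is unreachable and can be deleted.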
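
The idea can be sketched as a toy model in pure Python. This is not CPython's implementation: the function `find_unreachable`, the plain dictionaries, and the object names are assumptions chosen for illustration.

```python
def find_unreachable(refcounts, edges):
    """Toy model of cycle detection.

    refcounts: name -> total number of references to that object
    edges:     name -> list of names that the object references
    """
    # Step 1: work on a copy of the counters.
    gc_refs = dict(refcounts)

    # Step 2: subtract every reference that comes from inside the group.
    for targets in edges.values():
        for target in targets:
            gc_refs[target] -= 1

    # Step 3: objects with a positive result are referenced from outside;
    # everything reachable from them must stay alive as well.
    alive = set()
    stack = [name for name, count in gc_refs.items() if count > 0]
    while stack:
        name = stack.pop()
        if name in alive:
            continue
        alive.add(name)
        stack.extend(edges.get(name, []))

    return set(refcounts) - alive


# The two lists after `del d, e`: they reference only each other.
cycle = find_unreachable(
    {"list1": 1, "list2": 1},
    {"list1": ["list2"], "list2": ["list1"]},
)

# The same pair while a variable still points at list1.
held = find_unreachable(
    {"list1": 2, "list2": 1},
    {"list1": ["list2"], "list2": ["list1"]},
)

print(sorted(cycle), sorted(held))
```

In the first case both adjusted counters drop to 0 and the pair is reported as unreachable; in the second, the outside reference keeps `list1` alive and `list2` is reachable through it.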

### When does it start?

When the number of allocations of new objects exceeds the number of deallocations by more than 700 (the default threshold).

### Optimizing with generations

There are 3 generations by default!

Instead of checking all objects for cyclic references, we check only the objects from a specific generation.

- New objects go to generation 0.
- If an object survives a garbage collection of its generation, it is moved to the next generation, unless it's already
in the last one.
- GC of generation 0 is triggered when there are `700` more allocations than deallocations.
- GC of generation 1 is triggered when GC of generation 0 has happened `10` times.
- GC of generation 2 is triggered when GC of generation 1 has happened `10` times; all objects that survive stay here.

This way, the most recent objects have a higher chance of being collected.
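
The cascade of triggers can be sketched as a small simulation. It is a simplified model, not the interpreter's code: the function `simulate` is hypothetical, it ignores deallocations, and it assumes strict "greater than" comparisons against the thresholds, so its numbers can differ by one from the rounded figures in the list above.

```python
def simulate(net_allocations, thresholds=(700, 10, 10)):
    """Return how many times each generation is collected (toy model)."""
    counts = [0, 0, 0]        # per-generation counters
    collections = [0, 0, 0]   # how often each generation was collected

    for _ in range(net_allocations):
        counts[0] += 1
        if counts[0] <= thresholds[0]:
            continue          # generation 0 threshold not exceeded yet

        # Pick the oldest generation whose counter exceeds its threshold.
        gen = 0
        for candidate in (2, 1):
            if counts[candidate] > thresholds[candidate]:
                gen = candidate
                break

        collections[gen] += 1

        # Collecting a generation also covers the younger ones.
        for younger in range(gen + 1):
            counts[younger] = 0
        if gen < 2:
            counts[gen + 1] += 1  # one more collection seen by the next generation

    return collections


print(simulate(700))        # threshold not exceeded yet
print(simulate(701))        # first collection of generation 0
print(simulate(701 * 12))   # enough generation 0 runs to reach generation 1
```

Under these assumptions, young objects are examined often while older generations are visited only after many young collections.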

```
import gc
gc.get_threshold()  # get current thresholds
gc.set_threshold(1000, 20, 20)  # change them
gc.disable()  # disable gc
gc.enable()  # enable
gc.collect(generation=2)  # forcibly trigger collection
```

Output:

```
334
```

### What will happen if we disable GC?

https://instagram-engineering.com/dismissing-python-garbage-collection-at-instagram-4dca40b29172


### Are all objects tracked by the garbage collector?

No. It tracks only "container" objects: objects that can hold references to other objects, like list, dict, classes and their instances.

It doesn't track objects such as int, str, float, because they can't form a reference cycle;
they are handled by the reference counter alone.
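
You can ask the collector directly with `gc.is_tracked`. A short sketch, assuming CPython; the `Box` class is hypothetical, and tuples and dicts are left out on purpose because their tracking rules are more subtle.

```python
import gc


class Box:
    """Hypothetical user-defined class."""


# Atomic values cannot reference other objects, so they are not tracked.
atomic = [gc.is_tracked(42), gc.is_tracked(3.14), gc.is_tracked("text")]

# Containers can take part in a reference cycle, so they are tracked.
containers = [gc.is_tracked([]), gc.is_tracked(Box())]

print(atomic, containers)
```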


### Weak references

Weak references (weakrefs) are references that do not stop an object from being destroyed.

```
class World:
    def __init__(self):
        self.animals = []

class Animal:
    def __init__(self, world):
        self.world = world  # strong reference

world = World()
a1 = Animal(world)
world.animals.append(a1)

print(a1.world)
del world  # only the name is removed; a1.world keeps the object alive
print(a1.world)
```

Output:

```
<__main__.World object at 0x10b1b4040>
<__main__.World object at 0x10b1b4040>
```


```
import weakref

class World:
    def __init__(self):
        self.animals = []

class Animal:
    def __init__(self, world):
        self.world = weakref.ref(world)  # weak reference

world = World()
a1 = Animal(world)
world.animals.append(a1)

print(a1.world)
print(a1.world())  # call the weakref to get the object
del world  # last strong reference is gone, so the object is destroyed
print(a1.world)
print(a1.world())  # a dead weakref returns None
```

Output:

```
<weakref at 0x10f0081d0; to 'World' at 0x10f006640>
<__main__.World object at 0x10f006640>
<weakref at 0x10f0081d0; dead>
None
```


Study links:

- https://pythonpedia.com/en/tutorial/2532/garbage-collection
- https://scoutapm.com/blog/python-garbage-collection
- https://towardsdatascience.com/memory-management-and-garbage-collection-in-python-c1cb51d1612c


todo

`__del__` problem

Todo exercise / interview task
