# Memory Management in Python

### Everything is in memory as an object

In [1]:
a_int = 10
a_str = "hello"
a_list = [1, 2, 3, 4]
a_dict = {"A":1, "B": 2}

def a_func():
    pass

class A:
    pass

print(hex(id(a_int)))
print(hex(id(a_str)))
print(hex(id(a_list)))
print(hex(id(a_dict)))
print(hex(id(a_func)))
print(hex(id(A)))

0x558624008e20
0x7f90245fbbb0
0x7f90245c8480
0x7f90245fbfc0
0x7f9024616c10
0x558624ce8590


In [2]:
print(isinstance(a_int, object))
print(isinstance(a_str, object))
print(isinstance(a_list, object))
print(isinstance(a_dict, object))
print(isinstance(a_func, object))
print(isinstance(A, object))

True
True
True
True
True
True


This means we can `assign` them to a variable, `pass` them to a function, or `return` them from a function.
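
As a small illustration of that point, the sketch below treats a function like any other value. The names `shout`, `apply`, and `make_greeter` are made up for this example and do not appear elsewhere in the notebook.

```python
def shout(text):
    """Return the text in upper case."""
    return text.upper()


def apply(func, value):
    """Call whatever function object was passed in."""
    return func(value)


def make_greeter(greeting):
    """Build and return a new function object."""
    def greeter(name):
        return f"{greeting}, {name}!"
    return greeter


# Assign a function object to another name
alias = shout
assigned_result = alias("hello")

# Pass a function object as an argument
passed_result = apply(shout, "hello")

# Receive a function object as a return value
say_hi = make_greeter("Hi")
returned_result = say_hi("Python")

print(assigned_result, passed_result, returned_result)
```

Both `alias` and `shout` refer to the same function object, so nothing is copied by the assignment.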

### id() function

In [3]:
# In CPython, id() returns the memory address of the object a variable refers to
a = 5

a_address = id(a)

print(a_address)
print(hex(a_address))

94034617994624
0x558624008d80


In [4]:
b = "hello"

b_address = id(b)
print(b_address)
print(hex(b_address))

140257062271920
0x7f90245fbbb0


In [5]:
# We can use ctypes.cast to get back the Python object stored at a given address (CPython only)
import ctypes

# *a_address
print(ctypes.cast(a_address, ctypes.py_object).value)

# *b_address
print(ctypes.cast(b_address, ctypes.py_object).value)

5
hello
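
The cast does not copy anything: it returns a reference to the object that already lives at that address. A minimal sketch of this, assuming CPython (where `id()` is the memory address) and using a throwaway list named `data`:

```python
import ctypes

data = ["a", "b"]

# Assumption: CPython, where id() is the object's memory address
recovered = ctypes.cast(id(data), ctypes.py_object).value

# The cast returns the very same object, not a copy
same_object = recovered is data

# Mutating through the recovered reference is visible through the original name
recovered.append("c")
print(same_object, data)
```

This only works safely while the object is still alive, which is why the address is taken from a variable that is still in scope.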


### sys.getrefcount(...)

In [7]:
import sys


# Reference counters track the number of references to an object.
# They are part of Python's memory management system and aid in garbage collection.
my_var = 12324

In [8]:
hex(id(my_var))

'0x7f9024603f90'

In [9]:
# The counter starts at 1 when an object is created.
before = sys.getrefcount(my_var)

# It increments when a reference is created
my_var2 = my_var
after_reference_by_other = sys.getrefcount(my_var)

print(f"Starting reference: {before}")
print(f"After another variable reference: {after_reference_by_other}")

Starting reference: 2
After another variable reference: 3


> Note that since values are passed by reference in Python, the call to `sys.getrefcount(...)` itself holds a temporary reference to its argument, so it reports one more than the number of references you would expect. The count can be higher still if you run the Python interpreter in a different environment, such as a Jupyter notebook.
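
Because the absolute number depends on the environment, it is often easier to look at how the count changes. The sketch below is one way to do that, assuming a standard CPython build; the list `data` and the other names are illustrative only.

```python
import sys

data = ["any", "fresh", "object"]
baseline = sys.getrefcount(data)

alias = data                       # one more name bound to the same list
with_alias = sys.getrefcount(data)

holder = {"key": data}             # a container also holds a reference
with_holder = sys.getrefcount(data)

del alias
del holder
after_cleanup = sys.getrefcount(data)

# Print the changes relative to the baseline rather than absolute counts
print(with_alias - baseline, with_holder - baseline, after_cleanup - baseline)
```

A freshly created list is used here instead of a small integer, since small integers are shared across the interpreter and their counts are not meaningful.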

In [10]:
print(id(my_var))
print(id(my_var2))
id(my_var) == id(my_var2)

140257062305680
140257062305680


True

In [11]:
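# In a standard CPython build the reference count is stored at the start of the object,
# so ctypes can read it directly, without the extra reference added by sys.getrefcount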
address = id(my_var)
ctypes.c_long.from_address(address).value

2

In [12]:
# reference counter decrements when a reference is deleted.
del my_var2
ctypes.c_long.from_address(address).value

1

In [13]:
ctypes.cast(address, ctypes.py_object).value

12324

In [14]:
# The counter reaching zero indicates no more references to the object.
del my_var

# This may now print an arbitrary value.
# The reference count of the object 12324 dropped to 0, so Python freed its memory and the address may have been reused
ctypes.c_long.from_address(address).value

7

In [15]:
# Doing this may crash the Python interpreter, as the address no longer contains a valid Python object
# ctypes.cast(address, ctypes.py_object).value

- Objects with a reference count of zero are considered garbage.
- Garbage collection reclaims memory occupied by unreferenced objects.
- Reference counting offers efficient memory management and immediate resource reclamation.
- However, it doesn't handle cyclic references (objects referencing each other).
- Python uses additional garbage collection mechanisms to handle cyclic references.
- The reference counter ensures timely deallocation of objects and efficient memory usage.
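
The cyclic case can be observed with the standard `gc` and `weakref` modules. This is a minimal sketch, assuming CPython; the `Node` class is hypothetical, and automatic collection is switched off temporarily only to make the result predictable.

```python
import gc
import weakref


class Node:
    """Hypothetical object that can point at another Node."""
    def __init__(self):
        self.other = None


gc.disable()  # keep the cycle collector from running on its own during the demo

a = Node()
b = Node()
a.other = b   # a references b
b.other = a   # b references a, forming a cycle

probe = weakref.ref(a)  # a weak reference does not keep the object alive

del a
del b
# The names are gone, but each object is still referenced by the other
alive_before_collect = probe() is not None

gc.collect()  # run the cycle collector explicitly
alive_after_collect = probe() is not None

gc.enable()
print(alive_before_collect, alive_after_collect)
```

Reference counting alone leaves both objects in memory after the `del` statements; only the explicit `gc.collect()` call reclaims them.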

### Python is dynamically typed

In [16]:
my_var = 10
print(hex(id(my_var)))
print(type(my_var))

0x558624008e20
<class 'int'>


In [17]:
my_var = "Test string"
print(hex(id(my_var)))
print(type(my_var))

0x7f90245e35f0
<class 'str'>


### Mutability

In [18]:
# Mutable objects provide a way to mutate them (change the internal state)
a_list = [10, 20, 30]

print(hex(id(a_list)))

a_list.append(40)

print(hex(id(a_list)))

0x7f902460d840
0x7f902460d840


In [19]:
# Note that this creates a new object in memory
a_list2 = [10, 20, 30]

print(hex(id(a_list2)))

a_list2 = a_list2 + [40]

# Note how Python created a new list here:
# it evaluates the right-hand side first and then rebinds the name a_list2 to the result
print(hex(id(a_list2)))

0x7f90245c9580
0x7f90245c9840
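
The difference between mutating and rebinding is easiest to see when a second name refers to the same list. A short sketch, with `original` and `alias` as illustrative names:

```python
original = [10, 20, 30]
alias = original

# += on a list extends the existing object in place
original += [40]
in_place = original is alias

# + builds a new list, and the assignment rebinds the name to it
original = original + [50]
rebound = original is alias

print(in_place, alias)
print(rebound, original)
```

After the second statement `alias` still refers to the old list, which is why it does not see the value `50`.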


In [20]:
a_tuple = ([10, 20, 30], 50, 60)

a_tuple[0].append(40)
print(a_tuple)

([10, 20, 30, 40], 50, 60)


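> Note that in Python almost every user-defined data structure is mutable. The common built-in immutable types are `int`, `float`, `complex`, `bool` (`True`, `False`), `str`, `bytes`, `tuple`, `frozenset`, and `NoneType` (`None`). As the tuple example above shows, an immutable container can still hold mutable objects such as lists.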
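
To make the note concrete, the sketch below shows what immutability means for a tuple: item assignment fails, and an apparent update produces a new object. The variable names are illustrative only.

```python
point = (1, 2)
alias = point

# Item assignment is rejected for immutable types
try:
    point[0] = 99
    assignment_failed = False
except TypeError:
    assignment_failed = True

# "Updating" a tuple creates a new object and rebinds the name
point += (3,)
rebound = point is not alias

# A tuple that contains a mutable list cannot be hashed
try:
    hash(([10, 20], 50))
    hashable = True
except TypeError:
    hashable = False

print(assignment_failed, point, alias, hashable)
```

The old tuple is untouched and still reachable through `alias`, in contrast to the in-place `append` on a list.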

### Variable Equality

- `a == b` checks whether the value of `a` is equal to the value of `b`
- `a is b` checks whether `a` and `b` refer to the same object, i.e. whether `id(a) == id(b)`
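
The two checks can be told apart with a user-defined class. This is a minimal sketch; the `Point` class is hypothetical and defines `__eq__` so that `==` compares coordinates.

```python
class Point:
    """Hypothetical 2D point that compares by value."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)


p1 = Point(1, 2)
p2 = Point(1, 2)   # equal value, separate object
p3 = p1            # second name for the same object

print(p1 == p2, p1 is p2)
print(p1 == p3, p1 is p3)
```

Without a custom `__eq__`, `==` on user-defined objects falls back to an identity comparison, so `p1 == p2` would be `False` as well.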

In [21]:
a_int = 10
b_int = 10

print(a_int == b_int)
print(a_int is b_int) # id(a) == id(b)

True
True


In [22]:
a_int = 999
b_int = 999

print(a_int == b_int)
print(a_int is b_int) # id(a) == id(b)

True
False


In [23]:
a_str = "python"
b_str = "python"

print(a_str == b_str)
print(a_str is b_str)

True
True


In [24]:
a_str = "python course"
b_str = "python course"

print(a_str == b_str)
print(a_str is b_str)

True
False


We will check why this happens when we discuss what Python does for memory optimization.

In [25]:
a_list = [1, 2, 3, 4]
b_list = [1, 2, 3, 4]

print(a_list == b_list)
print(a_list is b_list)

True
False


In [26]:
a_int = 10
a_float = 10.0

print(a_int == a_float)
print(a_int is a_float)

True
False


In [27]:
a_obj = None
b_obj = None

print(a_obj == None)
print(a_obj is None)
print(a_obj == b_obj)
print(a_obj is b_obj)

True
True
True
True


### Memory optimization

#### Interning

[From Wikipedia](https://en.wikipedia.org/wiki/Interning_(computer_science))
> In computer science, interning is re-using objects of equal value on-demand instead of creating new objects. This creational pattern is frequently used for numbers and strings in different programming languages.

##### Number Interning:
- CPython interns small integers in the range [-5, 256]. Any variable referencing an integer within this range points to the same object in memory. For example, after `x = 5` and `y = 5`, `x is y` evaluates to `True`.
- Integers outside this range are not interned, so equal values may live in separate objects. For example, after `x = 1000` and `y = 1000`, `x is y` can evaluate to `False`, as in the notebook cells above. The result depends on how the code is compiled, so never rely on `is` to compare numbers.
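
The cache can be observed without literals by building the integers at runtime. This sketch assumes CPython, since the small-integer cache is an implementation detail; the variable names are illustrative.

```python
# Build the integers at runtime so the compiler cannot merge equal constants
small_a = int("256")
small_b = int("256")

large_a = int("1000000")
large_b = int("1000000")

small_shared = small_a is small_b   # both names refer to the cached object
large_shared = large_a is large_b   # two separate objects with equal value

print(small_shared, large_shared)
print(large_a == large_b)
```

Equality with `==` holds in both cases; only the identity check differs.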

##### String Interning:
- String literals that look like valid identifiers are interned by default (a CPython implementation detail).
- Strings created at runtime (i.e., not string literals) are typically not interned. This includes strings obtained through concatenation or string formatting.
- You can intern a string manually with `sys.intern`.
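
The last two points can be sketched with strings assembled at runtime. The example assumes CPython for the "not interned" part, while the behavior of `sys.intern` is documented; the names `parts`, `first`, and `second` are illustrative.

```python
import sys

parts = ["python", " ", "course"]

# Two equal strings built at runtime
first = "".join(parts)
second = "".join(parts)

equal = first == second
same_before = first is second        # separate objects in CPython

# sys.intern returns the single canonical object for a given string value
first = sys.intern(first)
second = sys.intern(second)
same_after = first is second

print(equal, same_before, same_after)
```

Interning pays off mainly when the same strings are compared or used as dictionary keys very often, since identical interned strings can be recognized by identity.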

In [28]:
a_string = "python"
b_string = "python"

print(a_string == b_string)
print(a_string is b_string)

True
True


In [29]:
a_string = "python!"
b_string = "python!"

print(a_string == b_string)
print(a_string is b_string)

True
False


In [30]:
a_string = "_this_string_will_be_intern_since_it_is_valid_identifier"
b_string = "_this_string_will_be_intern_since_it_is_valid_identifier"

print(a_string == b_string)
print(a_string is b_string)

True
True


In [31]:
a_string = "Generally this is not interned, but we can intern it using sys.intern"
b_string = "Generally this is not interned, but we can intern it using sys.intern"

print(a_string == b_string)
print(a_string is b_string)

a_string = sys.intern("Generally this is not interned, but we can intern it using sys.intern")
b_string = sys.intern("Generally this is not interned, but we can intern it using sys.intern")

print(a_string == b_string)
print(a_string is b_string)

True
False
True
True


#### Peephole

[From Wikipedia](https://en.wikipedia.org/wiki/Peephole_optimization)

> Peephole optimization is an optimization technique performed on a small set of compiler-generated instructions; the small set is known as the peephole or window.

In [32]:
c = compile("24 * 60", '<string>', 'eval')
print(c.co_consts)

(1440,)


In [33]:
c = compile("(1, 2) * 5", '<string>', 'eval')
print(c.co_consts)

((1, 2, 1, 2, 1, 2, 1, 2, 1, 2),)


In [34]:
c = compile("\"xyz\" * 4", '<string>', 'eval')
print(c.co_consts)

('xyzxyzxyzxyz',)


In [35]:
def a_func():
    a_int = 24 * 60
    # only short sequences are folded (a limit such as 20 items; the exact value depends on the CPython version)
    a_tuple = (1, 2) * 5  
    a_string = "xyz" * 4

a_func.__code__.co_consts

(None, 1440, (1, 2, 1, 2, 1, 2, 1, 2, 1, 2), 'xyzxyzxyzxyz')
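
Folding only applies when every operand is a constant known at compile time. A minimal sketch, assuming CPython; `minutes` is a made-up variable name used to prevent folding.

```python
# Both operands are constants, so the compiler stores the result
folded = compile("24 * 60", "<string>", "eval")

# One operand is a name, so the multiplication must happen at runtime
unfolded = compile("minutes * 60", "<string>", "eval")

folded_has_result = 1440 in folded.co_consts
unfolded_has_result = 1440 in unfolded.co_consts

# The unfolded expression still evaluates to the same value once the name is supplied
value = eval(unfolded, {"minutes": 24})

print(folded_has_result, unfolded_has_result, value)
```

The saving is that the folded expression never executes a multiplication at runtime; it just loads the precomputed constant.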