<a href="https://colab.research.google.com/github/ValentinoVizner/Python_Deep_Dive_1/blob/master/Section_3_Variables_and_Memory.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# SECTION 3. VARIABLES AND MEMORY

## 1. Variables Are Memory References

![alt text](https://drive.google.com/uc?id=1VdPKXbWvyWOzxXhgh12LnQWaySAmVhCn)

When we store an object in memory, it may occupy more than one slot (as objects 1 and 2 do).
</br>
But as long as we know where the object starts in memory, that's good enough.
</br>
Object 1 starts at 0x1000 and extends into the next memory address, 0x1001.
</br>
Object 2 starts at memory address 0x1002 and extends into two more memory addresses.
</br>
Object 3 fits precisely into one slot of memory.
</br>
</br>
The **heap** is the range of memory addresses where our objects are stored.
The **Python Memory Manager** is responsible for storing objects on the **heap** and retrieving them from it.

Here is an example:

![alt text](https://drive.google.com/uc?id=18UOEEgkBpnZqZDmr-Fha7t70wUukcxcd)

We can find the memory address that a variable *references*, by using the `id()` function.

In CPython, the `id()` function returns the memory address of its argument as a base-10 integer.

We can use the function `hex()` to convert the base-10 number to base-16.

In [0]:
my_var = 10
print('my_var = {0}'.format(my_var))
print('memory address of my_var (decimal): {0}'.format(id(my_var)))
print('memory address of my_var (hex): {0}'.format(hex(id(my_var))))

In [0]:
greeting = 'Hello'
print('greeting = {0}'.format(greeting))
print('memory address of greeting (decimal): {0}'.format(id(greeting)))
print('memory address of greeting (hex): {0}'.format(hex(id(greeting))))

Note how the memory address of `my_var` is **different** from that of `greeting`.

Strictly speaking, `my_var` is not "equal" to 10. 

Instead, `my_var` is a **reference** to an (*integer*) object (*containing the value 10*) located at the memory address `id(my_var)`.

Similarly for the variable `greeting`.
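To make the idea of a variable as a reference more concrete, here is a minimal sketch. The names `first`, `second`, and `alias` are made up for this illustration. It relies only on the fact that two separately created lists are distinct objects, while plain assignment copies the reference rather than the object.

```python
# Two separately created lists: equal contents, but two distinct objects.
first = [1, 2, 3]
second = [1, 2, 3]

# Plain assignment does not copy the list; it stores the same reference.
alias = first

same_value = first == second         # compares contents
same_object = first is second        # compares identity (the reference)
alias_same_object = alias is first

print(same_value, same_object, alias_same_object)  # True False True
print(id(alias) == id(first))                      # True

# A change made through one name is visible through the other,
# because both names reference the same object.
alias.append(4)
print(first)  # [1, 2, 3, 4]
```

The `is` operator compares references, which is why `first is second` is `False` even though the two lists compare equal.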

## 2. Reference Counting

![alt text](https://drive.google.com/uc?id=1tC5p8CU48NWqyRoeVlxBky7JbUudg2yo)

REMEMBER: We are dealing with references, so `other_var` references the same memory address as `my_var`, 0x1000. It does not hold its own copy of the value 10.
</br>
Both variables reference the same object, so its reference count is 2.

Now if we delete `my_var`, the `count` decreases to `1`.
</br>
Once we delete the last variable referencing that memory address, i.e. `other_var`, the `count` drops to `0` and the **Python Memory Manager** destroys the object and frees its memory so that it can be reused.
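The sequence described above can be observed without reading raw memory. The sketch below is one way to do it, assuming CPython, where an object is destroyed as soon as its reference count reaches 0. `Payload` is a hypothetical class used only for this illustration, and `weakref.ref` is used because a weak reference does not add to the reference count.

```python
import weakref


class Payload:
    """Hypothetical placeholder class; instances of it can be weakly referenced."""


my_var = Payload()
other_var = my_var            # second reference to the same object

# A weak reference lets us check whether the object still exists
# without keeping it alive ourselves.
probe = weakref.ref(my_var)

del my_var                    # count drops from 2 to 1
alive_after_first_del = probe() is not None

del other_var                 # count drops to 0, the object is destroyed
alive_after_second_del = probe() is not None

print(alive_after_first_del)   # True
print(alive_after_second_del)  # False
```

Other Python implementations may destroy the object later than CPython does, so the second result is specific to CPython's reference counting.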

![alt text](https://drive.google.com/uc?id=1Zr4LR4XYFKQ9ilOxFchLCMtQeFGGT8xQ)
</br>
</br>
The downside of using `sys.getrefcount(my_var)` is that passing `my_var` to the function creates an extra reference, which increases the count by 1.

![alt text](https://drive.google.com/uc?id=1ZaVrmuDDRfrPL1UTM0hBZGAxQ5F6y5Ci)
</br>
</br>

This approach is much better because we do not pass a reference to the object. We only pass its memory address (an integer), so it does NOT affect the REFERENCE COUNT!

In [0]:
import sys

a = [1, 2, 3]

sys.getrefcount(a)

2

But why does this return 2 instead of the expected 1 that we obtained with the previous function?

Answer: The `sys.getrefcount()` function takes `a` as an argument, which means it receives (and holds) a reference to the same object **as well**, hence the count is off by 1.
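Because that extra reference is present on every call, the differences between two `sys.getrefcount()` results are still meaningful even though the absolute number is not. The following is a small sketch of that idea, assuming CPython; the names `baseline`, `with_alias`, and `after_del` are illustrative only.

```python
import sys

a = [1, 2, 3]

# Absolute value includes the temporary reference held during the call.
baseline = sys.getrefcount(a)

b = a                          # one more reference to the same list
with_alias = sys.getrefcount(a)

del b                          # remove that reference again
after_del = sys.getrefcount(a)

print(with_alias - baseline)   # 1
print(after_del - baseline)    # 0
```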

In [0]:
# BETTER WAY
import ctypes

def ref_count(address: int):
    # In a standard CPython build the reference count is stored at the start of the object
    return ctypes.c_long.from_address(address).value

my_var = [1, 2, 3, 4]

ref_count(id(my_var))

1

In [0]:
other_var = my_var

print(id(my_var))
print(id(other_var))

140195183375240
140195183375240


In [0]:
ref_count(id(other_var))

2

In [0]:
other_other = my_var

ref_count(id(other_var))

3

In [0]:
other_other = 100

In [0]:
ref_count(id(other_other))

189

Note that `other_other` now references the integer object `100`, so this is the reference count of that `100` object, not of our list. The interpreter itself holds many references to small integers, and the exact number varies between Python versions and sessions.

You'll probably never need to do anything like this in Python. Memory management is completely transparent - this is just to illustrate some of what is going on behind the scenes, as it helps to understand upcoming concepts.

In [0]:
my_var_id = id(my_var)
other_var = None
my_var = None


ref_count(my_var_id)

64142

In [0]:
ref_count(my_var_id)

64140

What's happening here is that once the memory is freed, we no longer know what is stored at that address; it is probably JUNK. It is dangerous to access or manage memory manually like this in Python.

## 3. Garbage Collection

![alt text](https://drive.google.com/uc?id=1ig7qPO-YVKlP13KBbKbJuF7sqNupNOIk)
</br>
What happens if we remove the `my_var` pointer?
</br>
The reference count of object A drops to 0, while the reference count of object B is still 1.
</br>
Because its reference count is 0, object A is destroyed. Once it is destroyed, the reference count of object B also drops to 0, and object B is destroyed as well.
</br>
So far so good; that is what we expected to happen.
</br>
</br>
Now consider that object B also has a variable `var_2` that points back to object A (which holds `var_1`), i.e. a circular reference:


![alt text](https://drive.google.com/uc?id=11SbqFbGcF7fyqe0zku2Ivc5ndpbFFsyt)
</br>
Now when we remove `my_var`, the reference count of object A is still 1, because object B still references it.
</br>
So in this case, removing `my_var` does not destroy objects A and B as we might expect.

![alt text](https://drive.google.com/uc?id=1naSNs4OJ5Xlc6aOZXdoR__DXRDBxvCap)
</br>
Because the **Python Memory Manager** cannot clean this up through reference counting alone, leaving things as they are results in a **MEMORY LEAK!!**
</br>
</br>
That's where the **GARBAGE COLLECTOR** comes in!

![alt text](https://drive.google.com/uc?id=1hhcGms4SQf83N5LDCXbmSQL4x78K9IR5)

![alt text](https://drive.google.com/uc?id=1DbKKhD0-bK-jdbKhF8cR3z6jj4lMA3fJ)

In [0]:
import ctypes
import gc

In [0]:
def ref_count(address: int):
    return ctypes.c_long.from_address(address).value

We create a function that will search the objects in the GC for a specified id and tell us if the object was found or not:

In [0]:
def object_by_id(object_id):
    for obj in gc.get_objects():
        if id(obj) == object_id:
            return "Object exists"
    return "Not Found"

Next we define two classes that we will use to create a circular reference.

Class A's constructor will create an instance of class B and pass itself to class B's constructor that will then store that reference in some instance variable.

In [0]:
class A:
    def __init__(self):
        self.b = B(self)
        print(f"A: self: {hex(id(self))}, b: {hex(id(self.b))}")

In [0]:
class B:
    def __init__(self, a):
        self.a = a
        print(f"B: self: {hex(id(self))}, a: {hex(id(self.a))}")

We turn off the GC so we can see how reference counts are affected when the GC does not run and when it does (by running it manually).

In [0]:
gc.disable()

In [0]:
my_var = A()

B: self: 0x7f81bc103128, a: 0x7f81bc1030f0
A: self: 0x7f81bc1030f0, b: 0x7f81bc103128


In [0]:
hex(id(my_var))

'0x7f81bc1030f0'

In [0]:
print('a: \t{0}'.format(hex(id(my_var))))
print('a.b: \t{0}'.format(hex(id(my_var.b))))
print('b.a: \t{0}'.format(hex(id(my_var.b.a))))

a: 	0x7f81bc1030f0
a.b: 	0x7f81bc103128
b.a: 	0x7f81bc1030f0


In [0]:
a_id = id(my_var)
b_id = id(my_var.b)

In [0]:
ref_count(a_id)

2

In [0]:
ref_count(b_id)

1

In [0]:
print('refcount(a) = {0}'.format(ref_count(a_id)))
print('refcount(b) = {0}'.format(ref_count(b_id)))
print('a: {0}'.format(object_by_id(a_id)))
print('b: {0}'.format(object_by_id(b_id)))

refcount(a) = 2
refcount(b) = 1
a: Object exists
b: Object exists


As we can see, the A instance has two references (one from `my_var`, the other from the instance variable `a` in the B instance).

The B instance has one reference (from the instance variable `b` in the A instance).

Now, let's remove the reference to the A instance that is being held by `my_var`:

In [0]:
my_var = None

In [0]:
print('refcount(a) = {0}'.format(ref_count(a_id)))
print('refcount(b) = {0}'.format(ref_count(b_id)))
print('a: {0}'.format(object_by_id(a_id)))
print('b: {0}'.format(object_by_id(b_id)))

refcount(a) = 1
refcount(b) = 1
a: Object exists
b: Object exists


As we can see, the reference counts are now both equal to 1 (a pure circular reference), and reference counting alone did not destroy the A and B instances - they're still around. If no garbage collection is performed this would result in a memory leak.

</br>
</br>
Let's run the GC manually and re-check whether the objects still exist:

In [0]:
gc.collect()
print('refcount(a) = {0}'.format(ref_count(a_id)))
print('refcount(b) = {0}'.format(ref_count(b_id)))
print('a: {0}'.format(object_by_id(a_id)))
print('b: {0}'.format(object_by_id(b_id)))

refcount(a) = 3112880742897218958
refcount(b) = 0
a: Not Found
b: Not Found


In [0]:
ref_count(a_id)

3112880742897218958

In [0]:
ref_count(b_id)

0

In [0]:
object_by_id(b_id)

'Not Found'

So why do we still get some number back after the objects at `a_id` and `b_id` were garbage collected?
</br>
The objects were destroyed and their memory was released, so we do not know what is now stored at those addresses. Whatever we read there is certainly no longer the reference count of the objects we saw earlier.
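
The same experiment can be sketched without reading freed memory at all. The version below is only an illustration, assuming CPython: `Node` is a hypothetical class, and a weak reference (`weakref.ref`) is used as a probe because it does not keep the object alive.

```python
import gc
import weakref


class Node:
    """Hypothetical class used only to build a two-object reference cycle."""

    def __init__(self):
        self.partner = None


was_enabled = gc.isenabled()
gc.disable()                   # keep automatic collection out of the way
try:
    first = Node()
    second = Node()
    first.partner = second     # first -> second
    second.partner = first     # second -> first (circular reference)

    probe = weakref.ref(first)

    # Remove our own references; only the cycle keeps the objects alive.
    del first, second
    alive_before_collect = probe() is not None

    gc.collect()               # the garbage collector finds and frees the cycle
    alive_after_collect = probe() is not None
finally:
    if was_enabled:
        gc.enable()            # restore the previous GC state

print(alive_before_collect)    # True
print(alive_after_collect)     # False
```

Unlike `ref_count(a_id)`, the weak reference simply returns `None` once the object is gone, so nothing is ever read from memory that has already been released.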

## 4. Dynamic vs Static Typing

![alt text](https://drive.google.com/uc?id=1D8zc-W3_MG3zIdW-6hWxsmifgwYeBa9O)

In [0]:
a = 'hello'

In [2]:
type(a)

str

In [0]:
a = lambda x: x**2

In [4]:
a(2)

4

In [5]:
type(a)

function

In [0]:
a = 3 + 4j

In [7]:
type(a)

complex
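
The cells above rebind the same name `a` to a string, a function, and a complex number. As a compact sketch of the same idea (the names `value` and `types_seen` are illustrative only), the type belongs to the object a variable currently references, not to the variable itself:

```python
# The same name is bound to three objects of different types in turn.
value = 'hello'
types_seen = [type(value).__name__]

value = lambda x: x ** 2
types_seen.append(type(value).__name__)

value = 3 + 4j
types_seen.append(type(value).__name__)

print(types_seen)  # ['str', 'function', 'complex']
```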