# Variables and Memory

- memory references
    - what variables really are
- memory management
    - reference counting
    - garbage collection
- dynamic vs static typing
- mutability and immutability
- shared references
- variable equality
- everything is an object

## Variables are Memory References

![](./images/img9.jpg)

- Storing and retrieving objects from the heap is taken care of for us by the `Python Memory Manager`.

![](./images/img10.jpg)

- Note: `my_var_1` is not itself the value 10. It holds a reference to the memory address `0x1000` in this case, and `0x1000` is where the data we are actually interested in is stored.
- In other words, `my_var_1` is a name (an alias) for the object stored at that memory location.

![](./images/img11.jpg)

- It's important to understand that variables in Python are references to objects in memory.
- In Python we can find the memory address of the object a variable references using the `id()` function (in CPython, `id()` returns the object's memory address).
     - This returns a base-10 number. We can convert it to hexadecimal using the `hex()` function.
     

In [2]:
a = 10
print(hex(id(a)))

0x7fffe07763b0


In [3]:
greeting = 'hello'
print(greeting)

hello


In [4]:
print(id(greeting))

1912861025872


In [5]:
print(hex(id(greeting)))

0x1bd5f66e650


- Bottom line: variables are just references to memory addresses. They are not themselves the values that we think they are equal to.
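
The idea that a name is only a reference can be checked directly. The sketch below is a minimal illustration (the names `first` and `second` are made up for this example): assigning one variable to another copies the reference, not the object, so both names report the same `id()` and a change made through one name is visible through the other.

```python
first = [1, 2, 3]
second = first                    # copies the reference, not the list

print(id(first) == id(second))    # True: both names refer to one object
print(first is second)            # True: `is` compares identity

second.append(4)                  # mutate the object through the second name
print(first)                      # [1, 2, 3, 4]: visible through the first name too
```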

# Reference Counting

![](./images/img_12.jpg)
![](./images/img_13.jpg)

- Now suppose `my_var` goes away, either because it goes out of scope or because it is reassigned to `None`. The reference count drops to 1.

![](./images/img_14.jpg)

- Now let's say `other_var` also goes away. Since no variable references `0x1000` any more, the reference count drops to 0.

![](./images/img_15.jpg)
- At this point the Python memory manager recognizes that there are no references left, so it throws away the object stored at memory address `0x1000`.
    - This space can now be reused to store another object.

- This is called reference counting, and it is done automatically by the Python memory manager.
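
The rise and fall of the count can be watched with `sys.getrefcount()`. This is only a sketch: `Payload` is a hypothetical placeholder class, and the absolute numbers printed depend on the Python version (the call itself holds a temporary reference), so only the differences between the readings are meaningful.

```python
import sys

class Payload:
    """Hypothetical placeholder class used only for this illustration."""

my_var = Payload()
base = sys.getrefcount(my_var)         # baseline reading, includes the call's own temporary reference

other_var = my_var                     # a second name now refers to the same object
after_alias = sys.getrefcount(my_var)  # one higher than the baseline

other_var = None                       # drop the second reference
after_drop = sys.getrefcount(my_var)   # back to the baseline

print(after_alias - base, after_drop - base)
```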

## Finding the Reference Count

- `sys.getrefcount(my_var)`
     - When we pass `my_var` to the `getrefcount()` function, the call itself creates another reference to that same object in memory.
     - This is the downside of `getrefcount()`: the count it reports is always 1 higher than expected, because arguments in Python are passed as references to the object (no copy is made), so the function's parameter is one more reference to it.

In [29]:
# Understanding the above concept

def func_1(my_var_test):
    print('Inside func_1: ', hex(id(my_var_test)))
my_var_test = 10
print('Inside main:', hex(id(my_var_test)))
func_1(my_var_test)
# Both addresses match: the function received a reference to the same object, not a copy.

Inside main: 0x7fffe07763b0
Inside func_1:  0x7fffe07763b0


In [39]:
import sys
sys.getrefcount(my_var_test)
#134

134

- We get 134 because this is not the reference count of `my_var_test`; it is the reference count of the object `10`. Variable names don't have references; they are the references. The integer `10` is used in many other places inside the interpreter, so it already has many references.

[stackoverflow related thread](https://stackoverflow.com/questions/61738531/why-does-a-new-python-variable-have-a-reference-count-of-108)
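
One way to see why the count for `10` is so large: CPython keeps a single shared object for each small integer, so every use of `10` anywhere in the interpreter refers to the same object. The sketch below assumes CPython; other implementations may behave differently, and the exact count varies between versions and sessions, so it is only compared against a small threshold.

```python
import sys

x = 10
y = int("10")                      # built at run time, yet CPython hands back the shared object

print(x is y)                      # True on CPython: one shared object for the value 10
print(sys.getrefcount(10) > 2)     # True: many references exist besides our own
```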

In [40]:
a = hex(id(my_var_test))
sys.getrefcount(a)

2

- Here the refcount is 2: one for `a`, and one created by passing `a` as an argument to `sys.getrefcount()`.

### Another method to find the reference count without having the drawbacks

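- `ctypes.c_long.from_address(address).value`
    - Here we pass only the memory address (the integer returned by `id()`), not a reference to the object, so the reference count is not affected.
    - This relies on a CPython implementation detail: in a standard build, the reference count is stored at the very start of the object in memory.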

In [882]:
#example
import sys

In [883]:
a = [1, 2, 3]

In [884]:
id(a)

1912865986760

In [885]:
sys.getrefcount(a)

2

In [886]:
import ctypes

In [887]:
#defining a wrapper function
def refcount(address: int):
    return ctypes.c_long.from_address(address).value

In [888]:
refcount(id(a))
# Note: id(a) is evaluated first. While id(a) is running, the reference count of the object is 2: 1 for `a` and 1 because
# we passed a as an argument to id(). When id() finishes it returns the memory address and releases its reference,
# so by the time refcount() is called the reference count is back to 1.

1

In [889]:
refcount(1912865986760)

1

In [890]:
b = a

In [891]:
id(b)
# a and b are both pointing to the same location in memory

1912865986760

In [892]:
refcount(id(a))

2

In [893]:
c = a
refcount(id(a))

3

In [894]:
c = 10

In [895]:
refcount(id(a))

2

In [896]:
b = None

In [897]:
id(b)

140736958835936

In [898]:
refcount(id(a))

1

In [62]:
# Digressing a little: None is a real object in memory, and every variable assigned None references the same memory address
q = None
id(q)

140736958835936

In [319]:
a_id = id(a)

In [320]:
a_id

1912865715144

In [479]:
a = None

In [480]:
id(a)

140736958835936

In [526]:
a_id

1912865715144

In [899]:
#Note
a_id = id(a)
a = None
refcount(a_id)

1

In [299]:
a_id

140736958835936

In [938]:
refcount(a_id)

1

In [939]:
refcount(a_id)

1

In [940]:
refcount(a_id)

1

In [1255]:
refcount(a_id)

1

In [68]:
refcount(a_id)

0

In [69]:
refcount(a_id)

2

- This weird behaviour occurs because we are reading raw memory through the C library (`ctypes`), bypassing Python's own checks.
- When we set `a = None`, the Python memory manager frees the memory address: it tosses away the object, and that address becomes available for something else.

- In Python we normally never deal with memory addresses directly; it is very dangerous to do so, and you cannot rely on what you find there.
- The above code is just for illustration purposes. You are never going to use it unless you are trying to debug or understand what's going on.

In [694]:
cat = 253

In [695]:
id(cat)

140736959316496

In [696]:
bat = id(cat)

In [697]:
bat

140736959316496

In [698]:
refcount(bat)

30

In [745]:
cat = None

In [700]:
id(cat)

140736958835936

In [701]:
bat

140736959316496

In [881]:
cat = None
refcount(bat)

29

# Garbage Collection

## Circular References

![](./images/img_16.jpg)

- Consider a case where a variable `my_var` points to `Object_A`. `Object_A` has an instance variable called `var_1` which points to `Object_B`.

- At this point, if we get rid of the variable `my_var` (for example by pointing it to `None`), the reference count of `Object_A` goes down to zero.

- The reference count of `Object_B` is still 1, since the instance variable `var_1` of `Object_A` is pointing to `Object_B`.

- Because the reference count of `Object_A` has gone down to zero, `Object_A` will be destroyed. Once it is destroyed, the reference count of `Object_B` also goes down to zero and it is destroyed as well.

- Therefore, by removing the reference from `my_var` to `Object_A`, the Python memory manager gets rid of both `Object_A` and `Object_B` through reference counting.
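
The chain described above can be sketched as follows. `Node` is a hypothetical stand-in for `Object_A` and `Object_B`, and `weakref` is used here only as a way to ask whether an object still exists without adding to its reference count. The garbage collector is switched off for the duration so that only reference counting is at work; the behaviour shown assumes CPython.

```python
import gc
import weakref

class Node:
    """Hypothetical stand-in for Object_A / Object_B."""

gc.disable()                         # make sure only reference counting is involved

my_var = Node()                      # Object_A
my_var.var_1 = Node()                # Object_B, referenced only by Object_A

a_ref = weakref.ref(my_var)          # weak references do not keep the objects alive
b_ref = weakref.ref(my_var.var_1)

my_var = None                        # drop the only reference to Object_A

print(a_ref() is None)               # True: Object_A has been destroyed
print(b_ref() is None)               # True: Object_B was destroyed along with it

gc.enable()                          # restore the default setting
```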

![](./images/img_17.jpg)

- Now consider a case where `Object_B` also has an instance variable `var_2` pointing back to `Object_A`.
- This is a case of __circular referencing__.
- *What happens if we get rid of `my_var`?*
- If `my_var` goes away, the reference count of `Object_A` drops from 2 to 1.
- The reference count of `Object_B` is also 1, since `var_1` is pointing to `Object_B`.
- In this case reference counting will not destroy either `Object_A` or `Object_B`, because both still have a non-zero reference count.
- Reference counting alone cannot free circularly referenced memory. Left like this, it would lead to a __memory leak__.


![](./images/img_18.jpg)

- This is where the __Garbage Collector__ comes in.
- The Garbage Collector looks for circularly referenced objects, identifies them, and cleans them up. Hence we get rid of memory leaks.
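
A minimal sketch of this on CPython, without reading raw memory addresses. `Node` is a hypothetical stand-in for the two objects, and `weakref` is used only to check whether `Object_A` still exists. With the collector switched off, the cycle survives the loss of `my_var`; a manual `gc.collect()` then cleans it up.

```python
import gc
import weakref

class Node:
    """Hypothetical stand-in for Object_A / Object_B."""

gc.disable()                          # keep the collector from running on its own

my_var = Node()                       # Object_A
my_var.var_1 = Node()                 # Object_B
my_var.var_1.var_2 = my_var           # Object_B points back at Object_A: a cycle

a_ref = weakref.ref(my_var)
my_var = None                         # drop the only outside reference

survived = a_ref() is not None        # the cycle keeps Object_A alive
gc.collect()                          # run the garbage collector by hand
collected = a_ref() is None           # the cycle has now been cleaned up

print(survived, collected)

gc.enable()                           # restore the default setting
```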


- The __Garbage Collector__ can be controlled programmatically using the __gc__ module.
- By default it is turned __on__.
- You may turn it off if you are sure your code does not create circular references, but beware!
- It runs periodically on its own (if turned on).
- The use case for turning off the garbage collector is a performance boost: the gc runs periodically, so it takes up some computing power.
- Leave the gc turned on unless you have a specific need to turn it off.
- You can also call it manually, and even do your own cleanup.
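
The controls listed above map onto a handful of functions in the `gc` module. The snippet below is a small sketch of how they might be used together; the placeholder comment marks where your own code would go.

```python
import gc

print(gc.isenabled())         # reports whether automatic collection is currently on
print(gc.get_threshold())     # allocation thresholds that trigger automatic runs

gc.disable()                  # stop automatic collection
# ... code believed to create no circular references would run here ...
unreachable = gc.collect()    # a manual run still works; returns the number of unreachable objects found
gc.enable()                   # switch automatic collection back on

print(unreachable)
```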


- In general the __gc__ works just fine, but not always.

- ### For Python < 3.4
    - If __even one__ of the objects in the circular reference has a __destructor__ (e.g. `__del__()`), the destruction order of the objects may be important, but the GC does not know what that order should be.
    - So the object is marked as __uncollectable__.
    - The objects in the circular reference are then not cleaned up, leading to __memory leaks__.
  
- That is no longer the case in Python 3.4 and later, where this problem is resolved.
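
On Python 3.4 or later this can be checked with a small experiment. The sketch below uses a hypothetical `Tracked` class whose `__del__()` records its name in a list; after the cycle is dropped and the collector is run, both destructors should have been called.

```python
import gc

finalized = []                        # records which objects had __del__ called

class Tracked:
    """Hypothetical class with a destructor, used only for this illustration."""
    def __init__(self, name):
        self.name = name
        self.partner = None

    def __del__(self):
        finalized.append(self.name)

a = Tracked("a")
b = Tracked("b")
a.partner, b.partner = b, a           # circular reference between the two objects

del a, b                              # no outside references remain
gc.collect()                          # the cycle is collected even though __del__ is defined

print(sorted(finalized))              # ['a', 'b']
```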


### Understanding the Garbage collector workings

In [3]:
import ctypes
import gc

In [4]:
def ref_count(address: int):
    return ctypes.c_long.from_address(address).value

In [36]:
#function to check whether a particular object is tracked by the gc, based on its memory address
def object_by_id(object_id: int):
    for obj in gc.get_objects():   #gc.get_objects() returns a list of the objects tracked by the gc
        if id(obj) == object_id:
            return "Object Exists"
    return "Object Not Found"

In [21]:
# Creating the two classes involved in Circular Reference

class A:
    def __init__(self):
        self.b = B(self)  # Create an instance of class B and store it in the instance variable b of class A.
                          # self (this instance of class A) is passed to the constructor of class B,
                          # which stores it in its own instance variable a,
                          # thus giving rise to a circular reference
        print('A: self:{0}, b: {1}'.format(hex(id(self)), hex(id(self.b))))

In [22]:
class B:
    def __init__(self, a):
        self.a = a
        print('B: self:{0}, a: {1}'.format(hex(id(self)), hex(id(self.a))))
        

In [23]:
gc.disable()
#disabling the gc to see what is happening: if we don't disable it, it may clean up
#the circularly referenced memory itself once the references go away.

In [24]:
my_var = A()

B: self:0x14f76346588, a: 0x14f76346550
A: self:0x14f76346550, b: 0x14f76346588


In [25]:
hex(id(my_var))
#my_var points to object A

'0x14f76346550'

In [26]:
print(hex(id(my_var.b))) #b is an instance property of class A, since my_var is pointing to Object A therefore it can access instance property b
print(hex(id(my_var.b.a)))

0x14f76346588
0x14f76346550


In [30]:
a_id = id(my_var)
b_id = id(my_var.b)
#we are storing the memory addresses because once my_var goes away we cannot access the objects via my_var.

In [32]:
print(hex(a_id))
print(hex(b_id))

0x14f76346550
0x14f76346588


In [33]:
ref_count(a_id)
#2 since my_var is pointing to A and object B is pointing to A

2

In [34]:
ref_count(b_id)
#1 since only object A is pointing to object B

1

In [37]:
object_by_id(a_id)

'Object Exists'

In [38]:
object_by_id(b_id)

'Object Exists'

In [40]:
my_var = None

In [41]:
ref_count(a_id)

1

In [42]:
ref_count(b_id)

1

In [43]:
object_by_id(a_id)

'Object Exists'

In [44]:
object_by_id(b_id)

'Object Exists'

In [45]:
gc.collect()

972

In [46]:
object_by_id(a_id)

'Object Not Found'

In [47]:
object_by_id(b_id)

'Object Not Found'

In [58]:
ref_count(a_id)

-1453878798

In [73]:
ref_count(b_id)

-1

- The ref_count is fluctuating. What's going on?
- Objects A and B are no longer tracked by the GC, i.e. they have been destroyed, yet reading their old addresses still returns a "ref_count" that keeps changing over time.
- This is why we don't work with memory addresses in Python, unless we are debugging or doing our own manual memory cleanup.
- You have to be really careful when using memory addresses.
- Whatever now sits at the addresses that used to hold Object_A and Object_B could be something else, or nothing at all; either way it is no longer A or B.
- We are trying to read the reference count at a memory address that is not necessarily managed by the Python memory manager any more, or, if it is, may have been reclaimed for some other object.
- Therefore be careful when using memory addresses in Python: use them for debugging purposes, and only for an object that you know still exists.
- In the above case, once the garbage collector ran, it destroyed all the objects involved in the circular reference, and the memory addresses that we stored are no longer valid.

