# Variables and Memory

## 1) Variables and Memory References

- Variables are placeholders for data in Python. A variable we create is just a reference to memory: it is not the data value itself, but refers to the memory address where the actual data is stored.

- Conceptually, memory is divided into slots, and each slot has a unique memory address. The data itself lives at these addresses. Since we cannot remember memory addresses, we create variables as aliases for them.

- So, when we display the value of a variable, Python looks up the memory address referenced by that variable, reads the value stored there, and prints that value to the console.

- To check the memory address referenced by a variable, we use the `id()` function. In CPython it returns the memory address as a decimal integer. For the hexadecimal form, we use `hex(id(variable))`.

In [1]:
my_var = 10

print(id(my_var))

2509153370640


In [2]:
print(hex(id(my_var)))

0x24835320210


## 2) Reference Counting

- Reference counting means tracking how many references each memory address has. Suppose we have a variable `a` which references the memory address `0x1000`. Now we have another variable `b` which also references the same memory address `0x1000` (this can be obtained with `b = a`). Now the reference count for memory address `0x1000` is 2.

- Suppose we set `b` to `None`; then the reference count for `0x1000` is 1. If we do the same for variable `a`, the reference count for `0x1000` becomes 0. Python now knows that the object is no longer in use and deletes it from that memory address. This is how Python removes garbage.

- To find the reference count of a memory address, we use the `sys` module, which has a function called `getrefcount()`.

  **Syntax** : `sys.getrefcount(variable)`

- But this function doesn't give the actual reference count. When we pass the variable as an argument, the function's own parameter becomes another reference to the same memory address. Because of this, the reported count is one too high (actual + 1).

- To get the actual reference count, we can use the `ctypes` module. This reads the count directly from the object's memory address, which relies on CPython's object layout.

  **Syntax** : `ctypes.c_long.from_address(id(variable)).value`

In [3]:
import sys

a = -10

sys.getrefcount(a)

2

In [4]:
import ctypes

ctypes.c_long.from_address(id(a)).value

1

In [5]:
b = a

ctypes.c_long.from_address(id(a)).value

2

In [6]:
b = None

ctypes.c_long.from_address(id(a)).value

1

In [None]:
c = id(a)

a = None

ctypes.c_long.from_address(c).value

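# Here the reference count read from the old address is zero.
# The object has already been freed, so this read is only illustrative: the memory may be reused at any time.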

0
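
As a hedged aside, absolute reference counts can differ between CPython versions (in newer versions some built-in objects may report very large counts), so it is often more reliable to look at how the count changes. The sketch below assumes CPython and uses a fresh `object()`; the names `target`, `alias`, `before`, `after` and `released` are only illustrative.

```python
import sys

target = object()                  # a fresh object that nothing else refers to
before = sys.getrefcount(target)   # includes the temporary argument reference

alias = target                     # one more reference to the same object
after = sys.getrefcount(target)

alias = None                       # drop the extra reference again
released = sys.getrefcount(target)

print(after - before)              # change caused by adding `alias`
print(released - before)           # change after removing `alias` again
```

Comparing differences avoids depending on the extra reference that `sys.getrefcount()` itself adds.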

## 3) Python Garbage Collector

- From the reference counting method, we can say that when the reference count for a memory address becomes zero, Python automatically deletes that object from memory. This is how Python manages memory.

- But reference counting fails when there are circular references. A circular reference occurs when an instance variable of object1 references object2 and an instance variable of object2 references object1.

- To manage these circular references, Python uses an additional memory management technique called the garbage collector. The garbage collector periodically scans groups of objects, tracks their internal references, and looks for groups that are unreachable from outside. If it finds any cycles that are unreachable from your code, it marks them as collectable and frees them.

- The garbage collector is controlled programmatically through the `gc` module. It is turned on by default. If your code doesn't create any circular references, you can disable it.

In [8]:
# Now let's see an example of circular references

import ctypes

import gc

class B:

    def __init__(self,a):

        self.a = a
        print(f"A : {hex(id(self.a))} B : {hex(id(self))}")

class A:

    def __init__(self):

        self.b = B(self)

        print(f"A : {hex(id(self))} B : {hex(id(self.b))}")

# Here the instance variable of class A (b) references an object of class B, and the instance variable of class B (a) references an object of class A.


In [10]:
# This function checks whether an object with the given id is still tracked by the garbage collector.

def object_by_id(object_id):
    for obj in gc.get_objects():
        if id(obj) == object_id:
            return "Object Exists"
        
    return "Not Found"

def ref_count(address):
    return ctypes.c_long.from_address(address).value

In [None]:
# Now let's create an object of class A, which automatically creates an object of class B and so creates a circular reference

my_var = A()

# In the output you can see that an object of class B is also created, because we have a memory address for it

A : 0x24839bce3b0 B : 0x24839bce440
A : 0x24839bce3b0 B : 0x24839bce440


In [None]:
# Now you can see that the cyclic references have been created

print(hex(id(my_var))) # Object of class A
print(hex(id(my_var.b.a))) # a -> instance variable of class B

# From these we can see that both my_var and a refer to the same memory address, where the object of class A is stored.

0x24839bce3b0
0x24839bce3b0


In [13]:

# Similarly for class B. Even though we did not create an object of class B ourselves, we have a memory address for one.

# This is because the instance variable of class A (b) refers to an object of class B.

print(hex(id(my_var.b)))
print(hex(id(my_var.b.a.b)))

0x24839bce440
0x24839bce440


In [14]:
a_id = id(my_var)
b_id = id(my_var.b)

In [16]:
# Now let's check reference count of each object

print(f"Reference Count of A : {ref_count(a_id)}")
print(f"Reference Count of B : {ref_count(b_id)}")

Reference Count of A : 2
Reference Count of B : 1


In [17]:
# Now I am disabling the garbage collector and setting my_var to None.
# No outside variable refers to either object any more, but each still has a reference count of one because of the circular references.

gc.disable()

my_var = None

print(f"Reference Count of A : {ref_count(a_id)}")
print(f"Reference Count of B : {ref_count(b_id)}")

Reference Count of A : 1
Reference Count of B : 1


In [18]:
# Now let's check whether the objects still exist in memory

print(f" Object A is : {object_by_id(a_id)}")
print(f" Object B is : {object_by_id(b_id)}")

 Object A is : Object Exists
 Object B is : Object Exists


In [19]:
# Now let's run the garbage collector manually and check whether the objects still exist

gc.collect()

print(f" Object A is : {object_by_id(a_id)}")
print(f" Object B is : {object_by_id(b_id)}")

 Object A is : Not Found
 Object B is : Not Found


In [20]:
print(f"Reference Count of A : {ref_count(a_id)}")
print(f"Reference Count of B : {ref_count(b_id)}")

# From these outputs we can say the garbage collector has freed the two unreachable objects.
# The counts are read from freed memory, so the values themselves are only illustrative.

Reference Count of A : 0
Reference Count of B : 0
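
Reading a reference count from an address after the object has been freed depends on what happens to be in that memory. As a more cautious way to observe the same behaviour, the sketch below uses `weakref.ref`, which does not keep its target alive. It assumes CPython; the class `Node` and the helper `make_cycle` are hypothetical names introduced only for this illustration.

```python
import gc
import weakref


class Node:
    """A tiny object that can point at a partner, forming a cycle."""

    def __init__(self):
        self.partner = None


def make_cycle():
    # Build two objects that reference each other, then return only weak
    # references, so no outside strong reference survives this function.
    first, second = Node(), Node()
    first.partner = second
    second.partner = first
    return weakref.ref(first), weakref.ref(second)


gc.disable()                       # stop automatic cycle collection
ref_a, ref_b = make_cycle()

alive_before = ref_a() is not None and ref_b() is not None
gc.collect()                       # a manual collection still works while disabled
alive_after = ref_a() is not None or ref_b() is not None
gc.enable()                        # restore the default behaviour

print(alive_before, alive_after)
```

A weak reference returns `None` once its target is gone, so no freed memory is ever read.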


## 4) Dynamic vs Static Typing

- Python is a dynamically typed language, which means the type a variable refers to can change over time. In statically typed languages the type of a variable is fixed: once we have created a variable, we cannot assign a value of a different data type to it.

- In Python we can reassign any variable to a value of any type whenever we want. This is because variables in Python are just memory references. They do not store values (data) themselves, as variables do in some other languages; they are just aliases for memory addresses. If you reassign a variable to a value of a new data type, the variable simply references the other object.

- For example, take `a = "Hello"`. Here variable `a` references the object `"Hello"`. If you now reassign `a` to `10`, `a` references the integer object `10`, and the string object `"Hello"` is deleted from memory if its reference count becomes zero.
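
The last point can be sketched with a weak reference, which lets us observe an object without keeping it alive. This is a minimal illustration assuming CPython, where an object is freed as soon as its reference count reaches zero; the class `Payload` and the name `probe` are hypothetical.

```python
import weakref


class Payload:
    """A stand-in for any value; instances of plain classes support weak references."""


a = Payload()
probe = weakref.ref(a)          # observes the object without owning it
alive_before = probe() is not None

a = 10                          # rebind the name to an object of another type
alive_after = probe() is not None

print(type(a), alive_before, alive_after)
```

The name `a` is reused for an `int`, and the `Payload` instance disappears because nothing references it any more.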

In [21]:
my_var = "Hello"

print(type(my_var))

my_var = 10

print(type(my_var))

<class 'str'>
<class 'int'>


## 5) Variable Reassignment

- As we know, Python deletes the previous object (if its reference count becomes zero) whenever we reassign a variable to a new value. We generally think this happens only when we assign a completely new value, but it also happens when we increment the original variable.

- Suppose `a = 10`, and we then perform `a = a + 10`. A new object `20` is created and variable `a` starts referencing that new object; the original object `10` is not changed. This happens because integers are immutable.

In [None]:
my_var = 10

print(id(my_var))

my_var = my_var + 10

print(id(my_var))

# Here you can see address of my_var in these two cases are different.

2509153370640
2509153370960
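
A small sketch of the same idea with a second name kept on the original object. The variable names are illustrative only.

```python
a = 1000
b = a              # b refers to the same integer object as a
a += 10            # builds a new integer object and rebinds a to it

print(a, b)        # the object b refers to was never modified
print(a is b)
```

Because the original integer is untouched, `b` still sees the old value after `a` is incremented.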


## 6) Objects Mutability

- In Python every object has a type and an internal state (its data). Changing the data inside the object is called changing the internal state of the object.

- Mutability means the internal state of an object can be changed without creating a new object. So for mutable objects, we can change the data and the object still stays at the same memory address. Some of the mutable objects in Python are lists, sets, dictionaries and instances of user-defined classes.

- Immutability is the opposite: we cannot change the internal state of the object. An operation that appears to change it actually creates a new object, the variable then references the new object, and the original object stays in memory as long as its reference count is not zero. Some of the immutable objects are numbers, strings, tuples and frozen sets. User-defined classes can be designed to be either mutable or immutable.
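
To illustrate the remark about user-defined classes, here is a minimal sketch using the standard `dataclasses` module. The class names `MutablePoint` and `FrozenPoint` are hypothetical; `frozen=True` is one of several possible ways to make instances reject attribute assignment.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass
class MutablePoint:
    x: int
    y: int


@dataclass(frozen=True)
class FrozenPoint:
    x: int
    y: int


p = MutablePoint(1, 2)
address = id(p)
p.x = 10                          # changes internal state in place
same_object = id(p) == address    # still the same object

q = FrozenPoint(1, 2)
try:
    q.x = 10                      # assignment is rejected for frozen instances
    frozen_rejected = False
except FrozenInstanceError:
    frozen_rejected = True

print(same_object, frozen_rejected)
```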


In [24]:
# Consider a list, which is mutable

my_list = [1,3,5]
id(my_list)

2509230440704

In [25]:
my_list.append(7)
id(my_list)

# Below you can see the memory address has not changed. No new object was created, so the list is mutable.

2509230440704

In [26]:
# Lists are modified in place by specific operations, such as:

#  1) Changing the elements
#  2) Adding or deleting elements, etc.

# But concatenation with the '+' operator creates a new list.


my_list = my_list + [8]

id(my_list)

# Here you can see the address is different, so a new object was created.

2509230590784

In [27]:
# But with the '+=' operator, the list is modified in place

my_list1 = [1,2,3]
print(id(my_list1))

my_list1 += [8]

print(id(my_list1))

# With this kind of concatenation, no new object is created.

2509229937792
2509229937792


In [28]:
# Now consider tuple

my_tuple = (1,2,3)
id(my_tuple)

2509230226432

In [29]:
my_tuple +=(4,)
print(my_tuple)
print(id(my_tuple))

# Here you can see the two memory addresses are different. A new tuple was created, because a tuple is immutable.

(1, 2, 3, 4)
2509230508688


In [30]:
# Even though a tuple is immutable, the objects it contains may still change.
# If an element is itself mutable (such as a list), that element can be modified in place.
# The tuple itself still cannot have elements added, removed or replaced.

my_tuple1 = ([1,2],[3,4])
print(id(my_tuple1))

my_tuple1[0].append(5)
my_tuple1[1].append(6)

print(my_tuple1)
print(id(my_tuple1))

# Here you can see the address of the tuple object is the same. An immutable collection only fixes which objects it refers to;
# the objects themselves may still be mutable, so cases like this can arise.

2509230443904
([1, 2, 5], [3, 4, 6])
2509230443904


## 7) Shared References

- A shared reference is when two variables reference the same object in memory. That means the two variables hold the same memory address.

- This generally occurs when we explicitly assign one variable to another. For example, consider `a = 10` and then `b = a`: now `b` references the same memory address that `a` references.

- For some strings and integers, Python automatically creates shared references when their internal state is equal. This happens only for certain strings and integers (the concepts are called string interning and integer interning). It is safe because strings and integers are immutable, so sharing them cannot cause any damage.

- For lists, Python will not create shared references automatically even when they are equal, because lists are mutable.

In [31]:
my_str1 = "hello"
hex(id(my_str1))

'0x248398687b0'

In [32]:
my_str2 = "hello"
hex(id(my_str2))

# Here you can see the memory address is the same for both. The same holds for integers.

'0x248398687b0'

In [33]:
a = 10

b = 10

print(hex(id(a)))
print(hex(id(b)))

# For these two, Python automatically creates a shared reference. We can also create shared references explicitly.

0x24835320210
0x24835320210


In [None]:
a = [1,2,3]
b = [1,2,3]


print(hex(id(a)))
print(hex(id(b)))

# Here we can see Python does not create shared references for lists. The reason is explained below.

0x24839cc54c0
0x24839cc5180


In [35]:
# For mutable objects, shared references can be dangerous.

a = [1,2,3]

b = a

b.append(100)

print(a)
print(b)

# Here you can see that changing 'b' also changes 'a', because both names refer to the same list.
# That is why shared references to mutable objects are dangerous.

[1, 2, 3, 100]
[1, 2, 3, 100]
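
When an independent list is wanted instead of a shared reference, one common approach is to copy it. The sketch below uses `list.copy()` and the standard `copy` module; the variable names are illustrative only.

```python
import copy

a = [1, 2, 3]
b = a.copy()                 # a new list object with the same elements
b.append(100)
print(a, b)                  # a is unaffected by the change to b

nested = [[1, 2], [3, 4]]
shallow = copy.copy(nested)       # new outer list, inner lists still shared
deep = copy.deepcopy(nested)      # inner lists are copied as well
nested[0].append(99)
print(shallow[0], deep[0])
```

A shallow copy only duplicates the outer container, so nested mutable elements remain shared unless a deep copy is made.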


## 8) Variable Equality

- In Python, variable equality covers two things:

  1. Whether two variables share the same memory address.

  2. Whether the internal state of two variables is the same.

- To check whether two variables share the same memory address, we use the identity operator (`is`). This operator returns a Boolean value, `True` or `False`.

  **Syntax** : `<variable1> is <variable2>`

- To check whether two variables have the same internal state (data), we use the equality operator (`==`).

  **Syntax** : `<variable1> == <variable2>`

- In Python, assigning `None` to a variable always creates a shared reference, because Python keeps a single `None` object in memory for the lifetime of the interpreter. It is never deleted, no matter how few variables refer to it. If you assign `None` to multiple variables, they all share the same memory address.

In [36]:
a = 10

b = 10

print('a is b :', a is b)

print('a == b :', a == b)

a is b : True
a == b : True


In [37]:
a = [1,2,3]
b = [1,2,3]

print('a is b :' ,a is b )
print('a == b :', a == b)

# Here the memory addresses of the two lists are different, but their contents are the same.

a is b : False
a == b : True


In [38]:
a = None 
b = None
c = None

print( (a is b) and (b is c) and (a is c) )

True
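
The difference between the two operators also applies to user-defined classes, where `==` can be customized by defining `__eq__`. This is a minimal sketch; the class `Point` is a hypothetical example.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        # Equality compares internal state, not memory addresses.
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)


p1 = Point(1, 2)
p2 = Point(1, 2)      # equal state, separate object
p3 = p1               # shared reference

print(p1 == p2, p1 is p2)
print(p1 == p3, p1 is p3)
```

Identity cannot be customized: `is` always compares whether the two names refer to the same object.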


## 9) Everything is an Object

- In Python, we encounter many data types, such as integers (`int`), Booleans (`bool`), floats (`float`), strings, lists, tuples, etc., and many other constructs such as operators (`+`, `-`, `==`, etc., which are implemented as methods on objects), functions, classes, types (data types) and more. One thing they all have in common is that they are objects.

- Functions are instances of the `function` class, and classes and types are instances of the `type` class. To find out which class an object belongs to, we use `type(<object>)`; `help(<type>)` shows the documentation of that class. Since all of these are objects, they all have memory addresses.

- Since everything is an object, we can assign anything to a variable: for example, we can assign functions to variables and we can assign classes to variables.

In [41]:
a = 10

print(type(a))

# Here we can see 'a' is instance of class 'int'

<class 'int'>


In [42]:
# To see the documentation of the int class, we use the help function.
help(int)

Help on class int in module builtins:

class int(object)
 |  int([x]) -> integer
 |  int(x, base=10) -> integer
 |  
 |  Convert a number or string to an integer, or return 0 if no arguments
 |  are given.  If x is a number, return x.__int__().  For floating point
 |  numbers, this truncates towards zero.
 |  
 |  If x is not a number or if base is given, then x must be a string,
 |  bytes, or bytearray instance representing an integer literal in the
 |  given base.  The literal can be preceded by '+' or '-' and be surrounded
 |  by whitespace.  The base defaults to 10.  Valid bases are 0 and 2-36.
 |  Base 0 means to interpret the base from the string as an integer literal.
 |  >>> int('0b100', base=0)
 |  4
 |  
 |  Built-in subclasses:
 |      bool
 |  
 |  Methods defined here:
 |  
 |  __abs__(self, /)
 |      abs(self)
 |  
 |  __add__(self, value, /)
 |      Return self+value.
 |  
 |  __and__(self, value, /)
 |      Return self&value.
 |  
 |  __bool__(self, /)
 |      self != 

In [43]:
# Now, let's look at functions

def square(a):

    return a**2

In [44]:
# type(square) returns <class 'function'>

print(type(square))

# So a function is also an object, and we can assign it to another variable

<class 'function'>


In [45]:
square(2)

4

In [46]:
f = square

f(2)

4

In [47]:
# Now let's see how to return functions from a function

def cube(a):

    return a**3

In [48]:
def select_function(fun_id):

    if fun_id == 1:
        return square
    else :
        return cube

In [49]:
f = select_function(1)

f(2)

4

In [50]:
select_function(1)(2)

4

In [51]:
# Now let's see how to pass functions to functions

def do(func, n):

    return func(n)

In [52]:
do(square,5)

25

In [53]:
do(cube,5)

125
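
The claims about classes and types can be checked in the same way as for functions. This is a small sketch; `Greeter`, `Alias` and `operations` are hypothetical names used only here.

```python
def square(a):
    return a ** 2


class Greeter:
    def greet(self):
        return "hello"


print(type(square))        # functions are instances of the function class
print(type(Greeter))       # classes are instances of type
print(type(int))           # built-in types are instances of type too

Alias = Greeter            # a class can be bound to another name
message = Alias().greet()

operations = {"square": square}   # functions can be stored in containers
result = operations["square"](4)

print(message, result)
```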

## 10) Integer Interning

- Interning simply means reusing objects on demand. At startup, Python (CPython) preloads a global list of integers in the range [-5, 256]. Any time an integer in that range is referenced, Python uses the cached version of that object. That means all the integers in that range exist as objects by default.

- So when we assign the integer 10 to a variable (`a = 10`), Python makes `a` refer to the existing object for the integer `10`.

- If we assign `a = 257`, Python creates a new integer object `257` and makes `a` refer to the new object.

- We might wonder: since integers are immutable, why doesn't Python share one object for every pair of equal integers? Sharing would be safe for integers, but caching every integer ever created would cost memory and lookup time, so only the small, frequently used range is cached. Equal immutable objects in general are not shared automatically either. Consider a tuple of lists: the tuple is immutable, but the lists inside it are mutable, so we can change their contents and the tuple's visible state changes with them. An immutable container can therefore still hold mutable objects, which is one reason Python does not create shared references to immutable objects just because their internal state is equal.

In [54]:
a = 10
b = 10

print(id(a), id(b))

# We can see both are the same, since Python does not create a new object for them.

2509153370640 2509153370640


In [None]:
a = 257
b = 257

print(id(a),id(b))

# Here you can see Python does not create a shared reference for 'a' and 'b'.


2509251921744 2509251935984
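
Whether two equal integer literals share an object can also depend on how the code is compiled, so the sketch below builds the integers at run time with `int()` instead. It assumes CPython's small-integer cache as described above; other implementations may behave differently, and the variable names are illustrative.

```python
small_a = int("256")        # inside the cached range
small_b = int("256")

big_a = int("257")          # just outside the cached range
big_b = int("257")

print(small_a is small_b)   # identity: do both names refer to one cached object?
print(big_a is big_b)
print(big_a == big_b)       # equality is unaffected by caching
```

This is why integers should be compared with `==` rather than `is`.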


## 11) String Interning

- String interning is similar to integer interning: Python reuses string objects on demand. Only some strings are interned automatically. For example, when Python code is compiled, identifiers such as the names of variables, functions and classes are interned.

- Some string literals are also interned automatically if they follow these rules:

  1. String literals that look like identifiers, i.e. they follow the naming conventions of identifiers.
  2. A literal that starts with a digit is not a valid identifier, but it may still get interned. Any other string can be interned explicitly.

- Python does this for speed and memory optimization. Python, both internally and in the code you write, performs a great many dictionary-style lookups on string keys, which means a lot of string equality testing.

- Suppose we want to check whether two strings are equal: `a = 'Some_Long_String'` and `b = 'Some_Long_String'`. Using `a == b`, the two strings may need to be compared character by character. But if we know that `'Some_Long_String'` has been interned, then `a` and `b` are the same string object at the same memory address. In that case we can use `a is b`, which compares only two integers (the memory addresses), so far fewer comparison operations are needed.

- In general the `==` operator compares strings character by character, but when both names refer to the same interned object, the identity check succeeds first and the character-by-character comparison is skipped. Interning is also useful for the compiler, because the same names are often repeated many times in code. Instead of storing each occurrence separately, Python interns them to reduce memory usage.

- Python automatically interns only the strings that follow certain conventions. To intern a string explicitly, we use the `sys` module, which has a function called `intern`.

  **Syntax** : `sys.intern(<String>)`

  **Syntax** : `sys.intern(<String>)`

In [56]:
a = "hello"
b = "hello"

print(id(a),id(b))

# Here you can see both a and b have the same address, because the string literal 'hello' follows identifier naming
# conventions.

2509226018736 2509226018736


In [57]:
a = 'hello world'
b = 'hello world'

print(id(a),id(b))

# Here you can see the two addresses are different. Because of the space between the words, the literal does not look like an identifier.

2509252274352 2509252273840


In [58]:
a = 'hello_world'
b = 'hello_world'

print(id(a),id(b))

2509252403760 2509252403760


In [59]:
# To intern strings explicitly, we use the sys module

import sys

a = sys.intern('hello world')
b = sys.intern('hello world')

print(id(a),id(b))


2509252407088 2509252407088
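
Explicit interning is most useful for strings built at run time, which are not interned automatically. The sketch below assumes CPython and builds the same text twice with `str.join`; the variable names are illustrative.

```python
import sys

words = ["hello", "world"]
a = " ".join(words)          # built at run time
b = " ".join(words)          # equal text, built separately

equal = a == b               # compares the characters
same_before = a is b         # compares identity of the two runtime strings

a = sys.intern(a)            # returns the canonical interned string
b = sys.intern(b)
same_after = a is b          # both names now refer to the interned object

print(equal, same_before, same_after)
```

After interning, an identity check is enough to tell that the two strings are equal.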


## Summary

- In Python, variables are just references to the memory addresses where the actual data is stored, i.e. they are aliases for memory addresses.

- Python uses both reference counting and a garbage collector to remove unneeded objects from memory. When the reference count of an object reaches zero, Python frees that memory so it can be reused. To get the reference count we can use either `sys.getrefcount()` or `ctypes.c_long.from_address(id(variable)).value`.

- Python is a dynamically typed language, which means it does not force a variable to hold only one particular data type.

- In Python we have two kinds of objects: mutable and immutable. A mutable object's internal state can be changed without creating a new object; an immutable object's cannot.

- In Python, we use the `is` (identity) and `==` (equality) operators to check variable equality. Everything we work with in Python is an object.

- Python uses an optimization technique called interning to make effective use of memory and computational resources.
