In [1]:
#hide
import utils
utils.hero("Getting To Know 'int' Better")

In [2]:
#hide
utils.h1("Numerical Representation")

Numerical representation is used to describe measurable quantities. To store such values in memory, Python provides dedicated built-in data types. \
The two we will look at are:
1. int: For storing integer values
2. float: For storing real numbers with a fractional part (floating point)

In [3]:
#hide
utils.note("Numbers can also be stored as text (a string) by treating each digit as a character, but the built-in arithmetic operations defined for int/float will not be available")
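
To illustrate the note above, here is a minimal sketch (the variable names are only for demonstration) of how the same operator behaves on digits stored as strings and on actual integers:

```python
# Digits stored as strings: '+' concatenates the characters
x_text, y_text = "10", "20"
text_result = x_text + y_text

# The same digits stored as int: '+' performs arithmetic
x_num, y_num = int(x_text), int(y_text)
num_result = x_num + y_num

print(f"{text_result!r} {type(text_result)}")
print(f"{num_result} {type(num_result)}")
```

The string version produces '1020', while the int version produces 30.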

In [4]:
#hide
utils.h1("Important information about data type")

For any data type, you should know the following:
1. How to create an object/instance of a data type class?
2. How much memory does it consume?
3. How to find out the memory location where it is stored?
4. Can you make changes in the data once created? (Mutable and Immutable datatypes)
5. What are the common methods that can be applied to the value?
6. How is it represented internally? (small integer caching, string interning)
7. Is it iterable or indexable?

We will answer these questions for each of the data types.

In [5]:
#hide
utils.h1("Int")

In [6]:
#hide
utils.h2("Declaration")

In [7]:
# Declaration
a = 10 # Literal declaration, no additional overhead
print(f"{a} {type(a)}")
# OR 
b = int(20) # Calling the constructor, additional overhead
print(f"{b} {type(b)}")
print("-"*100)

10 <class 'int'>
20 <class 'int'>
----------------------------------------------------------------------------------------------------


In [8]:
#hide
utils.important("The variables 'a' and 'b' only store the memory locations of the objects, not the values themselves")

In [9]:
#hide
utils.tip("Use literal declaration unless you actually need type conversion")
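
When type conversion is the goal, the constructor is the right tool. The sketch below (variable names are illustrative) shows a few conversions that int() supports:

```python
# int() as a converter: each call builds an int from a different source type
from_text = int("42")        # parse a decimal string
from_float = int(3.9)        # truncates toward zero, does not round
from_negative = int(-3.9)    # also truncates toward zero
from_hex = int("ff", 16)     # parse a string written in base 16
from_bool = int(True)        # bool is a subclass of int

print(from_text, from_float, from_negative, from_hex, from_bool)
```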

In [10]:
#hide
utils.h2("Finding Memory Consumption")

In [11]:
# To find the memory consumption, we can use "sys" module
import sys
print(f"Memory used by the object a={a} is {sys.getsizeof(a)} bytes")
print(f"Memory used by the object b={b} is {sys.getsizeof(b)} bytes")

# Objects also have a '__sizeof__' method, which returns the size without the garbage collector overhead that sys.getsizeof adds for GC-managed objects
print(f"Memory used by the object a={a} is {a.__sizeof__()} bytes")
print(f"Memory used by the object b={b} is {b.__sizeof__()} bytes")
print("-"*100)

Memory used by the object a=10 is 28 bytes
Memory used by the object b=20 is 28 bytes
Memory used by the object a=10 is 28 bytes
Memory used by the object b=20 is 28 bytes
----------------------------------------------------------------------------------------------------
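
The two numbers agree for int because int objects are not managed by the garbage collector. As a rough sketch, assuming CPython, the difference becomes visible with a container such as a list (the names below are illustrative):

```python
import gc
import sys

number = 10
items = [1, 2, 3]

# int is not tracked by the garbage collector, so both sizes agree
print(gc.is_tracked(number), sys.getsizeof(number), number.__sizeof__())

# list is tracked, so sys.getsizeof reports some extra bytes on top of __sizeof__
gc_overhead = sys.getsizeof(items) - items.__sizeof__()
print(gc.is_tracked(items), sys.getsizeof(items), items.__sizeof__(), gc_overhead)
```

The exact byte counts depend on the interpreter version and platform, so treat them as indicative only.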


In [12]:
#hide
utils.important("The additional size of a Python int is due to its flexible internal design, which allows it \
to handle arbitrary-sized integers (unlike fixed-width integers in C)")

In [13]:
#hide
utils.exercise("Find the sizes of integers ranging from 1e0 to 1e90")

In [14]:
# Your solution

In [15]:
# Solution
import sys

vals = [10**val for val in range(0, 100, 10)] # [1, 1e10, 1e20, 1e30, ...] (all integer type)

for val in vals:
    print(f"{val:.2e} || Memory: {sys.getsizeof(val)} bytes")
print("-"*100)

1.00e+00 || Memory: 28 bytes
1.00e+10 || Memory: 32 bytes
1.00e+20 || Memory: 36 bytes
1.00e+30 || Memory: 40 bytes
1.00e+40 || Memory: 44 bytes
1.00e+50 || Memory: 48 bytes
1.00e+60 || Memory: 52 bytes
1.00e+70 || Memory: 56 bytes
1.00e+80 || Memory: 60 bytes
1.00e+90 || Memory: 64 bytes
----------------------------------------------------------------------------------------------------


In [16]:
#hide
utils.note("The size of a Python int object adapts to the value it contains. \
This gives 'int' arbitrary precision (limited only by available memory)")
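
A small sketch of what arbitrary precision means in practice, assuming CPython (sys.int_info describes its internal layout, which is an implementation detail):

```python
import sys

# How CPython splits an int into internal "digits" (implementation detail)
print(sys.int_info.bits_per_digit, sys.int_info.sizeof_digit)

small = 1
huge = 2 ** 1000

# Integer arithmetic stays exact no matter how large the value gets
exact_product = huge * huge == 2 ** 2000
exact_difference = (huge + 1) - huge == 1

# A float, by contrast, has fixed precision and silently loses the "+ 1"
float_loses_one = float(huge) + 1 == float(huge)

print(exact_product, exact_difference, float_loses_one)
print(sys.getsizeof(small), sys.getsizeof(huge))
```

The larger value occupies more bytes because more internal digits are needed to hold it.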

In [17]:
#hide
utils.h2("Finding Memory Location")

In [18]:
# To find the memory location of an object, use the built-in function 'id' (in CPython it returns the object's memory address)
a = 20
print(f"'a' points to {id(a)} | {hex(id(a))}")
print("-"*100)

'a' points to 4585058624 | 0x1114a7140
----------------------------------------------------------------------------------------------------
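
Because a variable only holds a reference, assigning one variable to another does not copy the object. A minimal sketch (names are illustrative):

```python
a = 20
b = a            # 'b' is bound to the same object as 'a'; no copy is made

same_id = id(a) == id(b)
same_object = a is b   # 'is' compares identity, which is what id() reports

print(same_id, same_object)
```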


In [19]:
#hide
utils.exercise(f"Given the memory location {hex(id(a))}, how can you find the value stored in that location?")

In [20]:
# Your solution (hint: Use ctypes module)

In [21]:
# Solution

import ctypes
value = ctypes.cast(obj=id(a), typ=ctypes.py_object).value # For demonstration only; use with extreme caution, an invalid address can crash the interpreter
print(value, type(value))
print("-"*100)

20 <class 'int'>
----------------------------------------------------------------------------------------------------


In [22]:
#hide
utils.h2("Mutable/Immutable?")

Everything we declare in Python is an object, that is, an instance of some class. \
When you create an object and assign it to a variable, the variable stores the memory location of the object; we say the variable points to the object. If the object at that memory location can be modified in place at runtime, the object and its data type are said to be **mutable**; otherwise they are **immutable**.
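
The distinction can be sketched by comparing a list (mutable) with an int (immutable). The variable names are only for illustration:

```python
# Mutable: a list can be changed in place, so its id stays the same
numbers = [1, 2, 3]
id_before = id(numbers)
numbers.append(4)
list_same_object = id(numbers) == id_before

# Immutable: an int offers no in-place change; "changing" it rebinds the name
count = 10
id_before = id(count)
count = count + 1
int_same_object = id(count) == id_before

print(list_same_object, int_same_object)
```

The list keeps its identity after the append, while the name 'count' ends up pointing to a different object.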

In [23]:
a = 10
print(f"Variable 'a' points to {id(a)} where the stored value is {a}")
a = 20
print(f"Variable 'a' points to {id(a)} where the stored value is {a}")
print("-"*100)

Variable 'a' points to 4585058304 where the stored value is 10
Variable 'a' points to 4585058624 where the stored value is 20
----------------------------------------------------------------------------------------------------


In [24]:
#hide
utils.question("What happened when we changed the value of a to 20? Is 'int' mutable?")

No, 'int' is immutable. It looks as if we changed the value from 10 to 20, but under the hood reassigning 'a' to 20 did not modify the object 10. Instead, 'a' was rebound to a different object (20) at a different memory location, which is why the two ids differ.
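
A second name makes the rebinding visible. In this sketch, 'alias' is a hypothetical extra variable that keeps pointing to the original object:

```python
a = 10
alias = a        # a second name for the same object 10

a += 1           # looks like an in-place update, but rebinds 'a' to a new object

print(a, alias)          # the object 10 is untouched
print(a is alias)
```

If int were mutable, 'alias' would have seen the change as well.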

In [25]:
#hide
utils.exercise("Find out the value stored at the memory location that 'a' was pointing to before the reassignment")

In [26]:
# Your solution

In [27]:
#hide
utils.h2("Available Methods for Int")

In [28]:
#hide
utils.tip("To find out all the attributes and methods available on an object, Python provides 'dir' (which calls '__dir__'), returning a list \
of their names")

In [29]:
a = 50
print(dir(a))
print("-"*100)

['__abs__', '__add__', '__and__', '__bool__', '__ceil__', '__class__', '__delattr__', '__dir__', '__divmod__', '__doc__', '__eq__', '__float__', '__floor__', '__floordiv__', '__format__', '__ge__', '__getattribute__', '__getnewargs__', '__getstate__', '__gt__', '__hash__', '__index__', '__init__', '__init_subclass__', '__int__', '__invert__', '__le__', '__lshift__', '__lt__', '__mod__', '__mul__', '__ne__', '__neg__', '__new__', '__or__', '__pos__', '__pow__', '__radd__', '__rand__', '__rdivmod__', '__reduce__', '__reduce_ex__', '__repr__', '__rfloordiv__', '__rlshift__', '__rmod__', '__rmul__', '__ror__', '__round__', '__rpow__', '__rrshift__', '__rshift__', '__rsub__', '__rtruediv__', '__rxor__', '__setattr__', '__sizeof__', '__str__', '__sub__', '__subclasshook__', '__truediv__', '__trunc__', '__xor__', 'as_integer_ratio', 'bit_count', 'bit_length', 'conjugate', 'denominator', 'from_bytes', 'imag', 'is_integer', 'numerator', 'real', 'to_bytes']
----------------------------------------------------------------------------------------------------

In [31]:
#hide
utils.exercise("Try out some of the methods from the list!")

In [32]:
# Your solution

In [33]:
# Solution
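
One possible starting point, not the only solution: the sketch below tries a handful of the public methods and attributes from the list above.

```python
a = 50

bits = a.bit_length()                    # bits needed to represent 50 (0b110010)
raw = a.to_bytes(2, byteorder="big")     # 2-byte big-endian representation
restored = int.from_bytes(raw, byteorder="big")
ratio = a.as_integer_ratio()             # (numerator, denominator)

print(bits, raw, restored, ratio)
print(a.real, a.imag, a.numerator, a.denominator)
```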

In [34]:
#hide
utils.h2("Small Integer Caching")

In [35]:
a = 10
print(f"Variable a points at {id(a)} which has {a}")
b = 10
print(f"Variable b points at {id(b)} which has {b}")
print("-"*100)

Variable a points at 4585058304 which has 10
Variable b points at 4585058304 which has 10
----------------------------------------------------------------------------------------------------


From the output above, both 'a' and 'b' point to the same address. This is because CPython caches small integer values for efficiency.

In [36]:
a = 600 # Changing to a larger value
print(f"Variable a points at {id(a)} which has {a}")
b = 600
print(f"Variable b points at {id(b)} which has {b}")
print("-"*100)

Variable a points at 4617895120 which has 600
Variable b points at 4617895216 which has 600
----------------------------------------------------------------------------------------------------
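
A practical consequence is that integers should be compared by value, not by identity. The sketch below builds the values at runtime, on the assumption that this avoids any sharing of constants by the compiler:

```python
# Built at runtime so the two values are not shared compile-time constants
a = int("600")
b = int("600")

print(a == b)   # value comparison: the reliable way to compare ints
print(a is b)   # identity comparison: depends on caching, an implementation detail
```

The result of the identity check can differ between interpreters and versions, so it should not be relied upon.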


In [37]:
#hide
utils.exercise("Find out all the integer values that python caches")

In [38]:
# Your solution

In [39]:
# Solution
def is_cached(val: int) -> bool:
    # Create two variables storing the same value
    a = int(str(val)) # Build the int at runtime so Python cannot reuse a compile-time constant
    b = int(str(val)) 
    return a is b

cached = [] # To store all the values that are cached
for i in range(-200, 400):
    if is_cached(i):
        cached.append(i)
print(f"{min(cached)} to {max(cached)} are cached")

-5 to 256 are cached


In [41]:
#hide
utils.h2("Iterable and Indexable?")

A data type is said to be **iterable** if it implements either of these methods:
1. \_\_iter\_\_(): returns an iterator
2. \_\_getitem\_\_(): with integer indices starting from 0

A datatype is said to be **indexable**:
- **by integer** if it implements \_\_getitem\_\_(index)
- **by key** if it implements \_\_getitem\_\_(key)
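
The two protocols can be sketched with a pair of small classes. Both classes are hypothetical examples written for this illustration:

```python
class Countdown:
    """Iterable through __iter__ (hypothetical example class)."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # A generator function returns an iterator when called
        current = self.start
        while current > 0:
            yield current
            current -= 1


class Squares:
    """Iterable and indexable through __getitem__ only (hypothetical example class)."""

    def __init__(self, size):
        self.size = size

    def __getitem__(self, index):
        # Raising IndexError tells the for-loop where the sequence ends
        if not 0 <= index < self.size:
            raise IndexError(index)
        return index * index


print(list(Countdown(3)))
print(list(Squares(4)), Squares(4)[2])
```

Countdown can be looped over but not indexed, while Squares supports both looping and integer indexing.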

In [43]:
#hide
utils.tip("The 'collections.abc' module in Python can be used to find out if an object is iterable")

In [44]:
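# Code to check if an object is iterable and indexable
from collections.abc import Iterable

def is_iter_indexable(obj):
    # Note: this isinstance check only detects '__iter__'; objects that are
    # iterable purely through '__getitem__' are reported as not iterable
    is_iterable = isinstance(obj, Iterable)
    is_indexable = hasattr(obj, "__getitem__")
    print(f"{type(obj)}   Iterable: {is_iterable} || Indexable: {is_indexable}")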

is_iter_indexable(10)

<class 'int'>   Iterable: False || Indexable: False
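
The isinstance check against Iterable only looks for '__iter__'. A more permissive alternative, sketched below with a hypothetical helper 'can_iterate' and a hypothetical class 'Letters', is to call iter() and see whether it raises TypeError:

```python
from collections.abc import Iterable


class Letters:
    """Hypothetical class that supports only __getitem__."""

    def __getitem__(self, index):
        return "abc"[index]   # the IndexError raised by the string ends iteration


def can_iterate(obj):
    # iter() accepts both protocols; TypeError means neither is supported
    try:
        iter(obj)
    except TypeError:
        return False
    return True


for obj in (10, "abc", Letters()):
    print(type(obj).__name__, isinstance(obj, Iterable), can_iterate(obj))
```

For Letters the two checks disagree: the isinstance test reports it as not iterable, yet a for-loop over it works.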


In [45]:
#hide
utils.upcoming("We will look into another data type, 'float'")