In [2]:
from IPython.display import HTML
from IPython.display import display

tag = HTML('''
<style>
.advanced-cell {
    background-color: #e84c2250;
}
.advanced-cell::after {
    position: absolute;
    display: block;
    top: -2px;
    right: -2px;
    width: 5px;
    height: calc(100% + 3px);
    content: '';
    background: #e84c22;
}
.advanced-label-row {
    border-bottom: 1px solid #e84c22;
    display: flex;
    font-weight: bold;
}
.advanced-label {
    margin-left: auto;
    background-color: #e84c22;
    padding: 5px 8px;
    color: white;
    margin-right: -2px;
}
</style>
<script>

// Toggle the highlighting of advanced topics in the notebook
var highlighted = false;
function highlight_advanced_topics() {
    $(".advanced-cell").removeClass("advanced-cell");
    $(".advanced-label-row").remove();
    if(highlighted) {
        highlighted = false;
        return;
    }
    var advanced = false;
    $(".jp-Cell.jp-MarkdownCell,.jp-Cell.jp-CodeCell").each(function(){
        if(!advanced) {
            if($(this).find(".advanced-start").length > 0) {
                $(this).before("<div class='advanced-label-row'><span class='advanced-label'>Advanced Topic</span></div>");
                $(this).addClass("advanced-cell");
                advanced = true;
            }        
        } else {
            if($(this).find(".advanced-stop").length > 0) {
                if($(this).find(".advanced-start").length > 0) {
                    $(this).before("<div class='advanced-label-row' style='margin-top: 10px;'><span class='advanced-label'>Advanced Topic</span></div>");
                    $(this).addClass("advanced-cell");
                } else {
                    advanced = false;
                }
            } else {
                $(this).addClass("advanced-cell");
            }
        }
    });
    highlighted = true;
}

(function() {
  // Load the script
  const script = document.createElement("script");
  script.src = 'https://ajax.googleapis.com/ajax/libs/jquery/3.6.0/jquery.min.js';
  script.type = 'text/javascript';
  script.addEventListener('load', () => {
    $(document).ready(highlight_advanced_topics);
  });
  document.head.appendChild(script);
})();
</script>
<div class="m-5 p-5"><span class="alert alert-block alert-danger">Advanced topics in notebook are highlighted!</span></div>''')
display(tag)

# Introduction to the Python Data Model
Python is an [object-oriented programming](https://en.wikipedia.org/wiki/Object-oriented_programming) language. When learning Python, one of the first things we are told is that everything in Python is an object. Not only primitive data such as numbers and strings, but also collections, functions, modules, etc. are objects. Objects are all over the place. But what exactly is an object? And what are the implications of treating every entity in the language as one? Let's flesh it out. 

Consider a simple integer number like `42`:

In [1]:
import sys

answer = 42
print(f"Value: {answer}")
print(f"Size: {sys.getsizeof(answer)} bytes")
print(f"Number of attributes: {len(dir(answer))}")
print(f"Attributes list: {dir(answer)}")

Value: 42
Size: 28 bytes
Number of attributes: 71
Attributes list: ['__abs__', '__add__', '__and__', '__bool__', '__ceil__', '__class__', '__delattr__', '__dir__', '__divmod__', '__doc__', '__eq__', '__float__', '__floor__', '__floordiv__', '__format__', '__ge__', '__getattribute__', '__getnewargs__', '__gt__', '__hash__', '__index__', '__init__', '__init_subclass__', '__int__', '__invert__', '__le__', '__lshift__', '__lt__', '__mod__', '__mul__', '__ne__', '__neg__', '__new__', '__or__', '__pos__', '__pow__', '__radd__', '__rand__', '__rdivmod__', '__reduce__', '__reduce_ex__', '__repr__', '__rfloordiv__', '__rlshift__', '__rmod__', '__rmul__', '__ror__', '__round__', '__rpow__', '__rrshift__', '__rshift__', '__rsub__', '__rtruediv__', '__rxor__', '__setattr__', '__sizeof__', '__str__', '__sub__', '__subclasshook__', '__truediv__', '__trunc__', '__xor__', 'as_integer_ratio', 'bit_length', 'conjugate', 'denominator', 'from_bytes', 'imag', 'numerator', 'real', 'to_bytes']


Simple as it may be, the integer `42` has a rather complex representation in Python. This is because all data in a Python program, even simple integer numbers like `42`, is represented by objects. Different programming languages may define what an "object" is in different ways. In Python:
> An **object** is any data that has a *state* and a defined *behavior*. 

The state of an object tells us what information it holds. The behavior of an object tells us what we can do with it. The state of an object is defined by its value and (data) attributes; the behavior of an object is defined by its type and methods (more on all these terms later). In the example above, the state of the object named `answer` is represented by its value `42`, and its behavior is defined by the methods whose names appear in the output of `dir(answer)`.

## Value, type and identity of objects
Every object in Python has an *identity*, a *type* and a *value*. The **identity** is a number that uniquely identifies the object for as long as it exists (in CPython, it is the object's memory address). The **type** (or *class*) determines the operations that the object supports and defines its possible values (more on that later). The **value** of an object is the actual data the object contains. We can inspect the identity and type of objects using `id()` and `type()`.

In [36]:
print(f"Id: {id(answer)}")
print(f"Type: {type(answer)}")
print(f"Value: {answer}")

Id: 140041725855248
Type: <class 'int'>
Value: 42


An object's identity and type never change once it has been created. The value of some objects can change. Objects whose value can change are called **mutable**. Objects whose value cannot change are called **immutable**. An object's mutability is determined by its type; for instance, numbers, strings and tuples are immutable, while dictionaries and lists are mutable.

In [40]:
l = ["you", "cannot", "change", "me"]
print(f"Mutable value: {l}")
print(f"Id: {id(l)}")
l[1] = "can"
print(f"Mutable value: {l}")
print(f"Id: {id(l)}")
t = ("you", "cannot", "change", "me")
print(f"Immutable value: {t}")
print(f"Id: {id(t)}")
t[1] = "can"  # raises TypeError: tuples do not support item assignment
print(f"Immutable value: {t}")
print(f"Id: {id(t)}")

Mutable value: ['you', 'cannot', 'change', 'me']
Id: 140041653018368
Mutable value: ['you', 'can', 'change', 'me']
Id: 140041653018368
Immutable value: ('you', 'cannot', 'change', 'me')
Id: 140041576381744


TypeError: 'tuple' object does not support item assignment

When we assign a new value to a variable that refers to an immutable object, the object itself is not modified. Instead, the variable is rebound to a different object:

In [83]:
age = 34
print(id(age))
age = 35
print(id(age))

140041725854992
140041725855024


## Attributes and methods
Objects also have **attributes**. An attribute is any value (data or function) that is associated with an object through a **name**. Attributes with a data value are called *data attributes* (or sometimes simply *attributes*), while attributes with a function value are called *methods*. We can use the built-in function `dir()` to retrieve a list of valid attributes for an object, and we can access individual attributes by name using *dot notation*.

In [56]:
print(answer.real, answer.imag)
print(answer.bit_length())

42 0
6
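
As a minimal sketch of the distinction between data attributes and methods, we can combine `dir()` with the built-ins `getattr()` and `callable()`. The variable names `public_names`, `methods` and `data_attributes` are chosen here for illustration only, and the exact lists may vary between Python versions.

```python
answer = 42

# Skip the double-underscore names only to keep the output short.
public_names = [name for name in dir(answer) if not name.startswith("__")]

# getattr(obj, "name") is equivalent to obj.name; callable() tells us
# whether the attribute can be called like a function (i.e. is a method).
methods = [n for n in public_names if callable(getattr(answer, n))]
data_attributes = [n for n in public_names if not callable(getattr(answer, n))]

print(f"Methods: {methods}")
print(f"Data attributes: {data_attributes}")
```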


## Variables and aliases
In many programming languages, variables are best thought of as containers that hold our data. In Python, by contrast, variables are best thought of as pointers or *references* to our data. Variables are just **names** that we use to refer to objects. A variable does not contain the actual object; instead, it contains a **reference** to that object.

Since a variable always contains a reference, there is no need to declare variables or their types in advance. A variable is created at the moment we first assign a value to it. When we assign an object to a variable using the `=` operator, we create a new name (or reuse an existing one) and make it refer to that object.

An object can be referenced by multiple variables at the same time. Multiple variables referencing the same object are known as **aliases**.

<span class="advanced-start"></span>
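We can count the number of references to an object with `sys.getrefcount()`. The count it returns is generally one higher than we might expect, because it includes the temporary reference created by passing the object as an argument.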

In [80]:
import sys
coord = (35.658581, 139.745438)
print(sys.getrefcount(coord))
tokyo = coord
print(sys.getrefcount(coord))

2
3


<span class="advanced-stop"></span>
We must be careful when we alias a mutable object, because any change we make to that object through one variable is visible through all of its aliases as well!

In [81]:
anakin = {
    'name' : 'Anakin',
    'surname' : 'Skywalker',
    'rank' : 'Jedi Knight'
}
vader = anakin
print(vader is anakin)
print(id(anakin), id(vader))
vader['rank'] = 'Sith Lord'
print(anakin)

True
140041653213504 140041653213504
{'name': 'Anakin', 'surname': 'Skywalker', 'rank': 'Sith Lord'}


Note that even though `=` is known as the *assignment operator*, it may be misleading to think of a statement like `answer = 42` as "the object `42` is assigned to the variable `answer`". A better way to think about it is "the variable `answer` is bound to the object `42`".
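
The difference between binding a name and changing an object can be sketched as follows. The names `first`, `second` and `third` are illustrative only.

```python
first = [1, 2, 3]
second = first        # both names are now bound to the same list

first = [4, 5, 6]     # rebinding: `first` now refers to a different list
print(second)         # the original list is untouched
print(first is second)

third = second        # another alias of the original list
third.append(4)       # mutation: the object itself changes
print(second)         # the change is visible through every alias
```

Rebinding a name never affects other names bound to the old object, whereas mutating an object affects every name bound to it.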

## Identity vs. Equality
We use the `==` operator to compare the values of objects, while `is` compares their identities. Usually we care more about values than identities, so most of the time, when checking whether two objects are equal, we use `==` and not `is`.

In [43]:
x = []
y = []
print(id(x), id(y))
print(x == y, x is y)

140041653218368 140041576518720
True False


A common use case of `is` is to compare an object with a singleton, that is, an object of which only one instance exists, such as `None`, `True` or `False`. For instance, the recommended way to check if a variable is bound to `None` is by using `is`.

In [49]:
import timeit
x = None
print(x is None)
%timeit x is None
%timeit x == None

True
21.2 ns ± 0.906 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
30 ns ± 1.17 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)


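In this measurement, `==` is about 40% slower than `is`. The `is` operator only compares identities, while `==` has to look up and call the comparison method of the operands. Since that method can be overridden, `==` may also give a surprising answer where `is` cannot.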
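
Speed is not the only reason to prefer `is None`. A minimal sketch, using a hypothetical class `AlwaysEqual` that is not part of any library, shows how a custom `__eq__` method can mislead an `== None` check:

```python
class AlwaysEqual:
    """Hypothetical class whose instances claim to be equal to anything."""

    def __eq__(self, other):
        return True


obj = AlwaysEqual()
equal_to_none = (obj == None)   # calls AlwaysEqual.__eq__, which returns True
identical_to_none = (obj is None)  # compares identities, cannot be overridden

print(equal_to_none)
print(identical_to_none)
```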

## Immutability is subtle
Immutability is not strictly the same as having an unchangeable value. For instance, an immutable container that holds a mutable object changes its value whenever the value of that object changes.

In [4]:
t1 = (1, 2, 3, [4, 5])
t2 = (1, 2, 3, [4, 5])
print(t1 == t2)
print(f"Value: {t1}")
t1[3].append(6)
print(f"Value: {t1}")
print(t1 == t2)

True
Value: (1, 2, 3, [4, 5])
Value: (1, 2, 3, [4, 5, 6])
False


That is because the immutability of an object refers only to the contents of the object's own data structure. If an immutable object contains references to other objects, as a tuple does, those references cannot change, but the referenced objects themselves may still be mutable.
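
A short sketch makes this concrete: the list inside the tuple can grow, yet the tuple still refers to the very same list, and trying to replace that reference fails. The names `inner` and `reassignment_failed` are illustrative only.

```python
t = (1, 2, 3, [4, 5])
inner = t[3]          # keep a second reference to the list inside the tuple

inner.append(6)       # mutate the list, not the tuple
print(t)
print(t[3] is inner)  # the tuple still refers to the same list object

# Replacing the reference stored in the tuple is not allowed.
reassignment_failed = False
try:
    t[3] = [4, 5, 6]
except TypeError as error:
    reassignment_failed = True
    print(f"TypeError: {error}")
```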

## Everything is an object
When we say that in Python everything is an object, we really mean *everything*.

In [63]:
import sys

def print_type(o):
    print(type(o))
    
print_type(42)
print_type([])
print_type(sys)
print_type(print_type)
print_type(type)
print_type([].append)
print_type(len)

<class 'int'>
<class 'list'>
<class 'module'>
<class 'function'>
<class 'type'>
<class 'builtin_function_or_method'>
<class 'builtin_function_or_method'>


<span class="advanced-start"></span>
This language design choice allows for some very convenient language constructs, but has some costs in terms of performance. For instance, when we perform the simple addition of two integers, there are a few more steps involved besides retrieving the values of the two operands and performing the addition. The Python interpreter has to check the type of both operands, see if they support the addition operation, call the appropriate addition method, extract the values to be added, perform the operation and construct a new object for the result.
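
The method lookup involved in an addition can be approximated by calling the special methods directly. This is only a rough sketch of the dispatch, not a description of how the interpreter is implemented, and the variable names are illustrative.

```python
a, b = 40, 2

# For two integers, `a + b` ends up calling the `__add__` method of int.
int_sum = type(a).__add__(a, b)
print(int_sum)

# int does not know how to add a float, so its method returns the special
# value NotImplemented instead of a result...
unsupported = type(a).__add__(a, 2.0)
print(unsupported)

# ...and Python then tries the reflected method of the other operand.
reflected_sum = type(2.0).__radd__(2.0, a)
print(reflected_sum)
print(a + 2.0 == reflected_sum)
```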

<span class="advanced-stop"></span>
A practical implication of the fact that everything in Python is an object is that everything can be:
- Assigned to a variable
- Passed as an argument to a function
- Returned as the return value of a function
- Set as an attribute of another object
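
The four points above can be sketched using a function as the object being handed around. All names here (`shout`, `apply`, `make_repeater`, `Robot`) are hypothetical and exist only for this illustration.

```python
def shout(text):
    return text.upper() + "!"


speak = shout                      # assigned to a variable


def apply(func, value):            # passed as an argument to a function
    return func(value)


def make_repeater(times):          # returned as the return value of a function
    def repeat(text):
        return text * times
    return repeat


class Robot:
    pass


robot = Robot()
robot.voice = shout                # set as an attribute of another object

print(speak("hello"))
print(apply(shout, "hello"))
print(make_repeater(2)("ab"))
print(robot.voice("beep"))
```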

<span class="advanced-stop"></span>