## 1-minute introduction to Jupyter ##

A Jupyter notebook consists of cells. Each cell contains either text or code.

A text cell will not have any text to the left of the cell. A code cell has `In [ ]:` to the left of the cell.

If the cell contains code, you can edit it. Press <kbd>Enter</kbd> to edit the selected cell. While editing the code, press <kbd>Enter</kbd> to create a new line, or <kbd>Shift</kbd>+<kbd>Enter</kbd> to run the code. If you are not editing the code, select a cell and press <kbd>Ctrl</kbd>+<kbd>Enter</kbd> to run the code.

# Lesson 8: Identity and mutability

So far, we have been working with various data types while Python was keeping track of everything in the backend for us.

For the most part, this works as we expect. But sometimes, it produces results we do not expect.

This is a function that takes in a sentence and returns a list of word statistics:

In [None]:
def word_stats(text: str) -> "list[dict]":
    """Takes in text as a string.
    Returns a list of word records.
    Each record is a dict with the following info about the word:
    - word: the word itself
    - length: number of letters in the word
    - vowels: number of vowels in the word
    """
    record = {"length": 0, "vowels": 0}  # A placeholder record
    records = []
    for word in text.split(" "):
        record["word"] = word
        for char in word:
            if char in "aeiou":
                record["vowels"] = record["vowels"] + 1
            record["length"] = record["length"] + 1
        records.append(record)
    return records
        
paragraph = "Beneath the dense canopy of the ancient forest, the sunlight struggled to pierce through the thick foliage, casting dappled shadows on the moss-covered ground. The air was filled with the earthy scent of damp soil and the soft hum of hidden creatures. As a cool breeze rustled the leaves, a faint melody seemed to drift between the towering trees—a sound so delicate it could easily be mistaken for a trick of the mind. In this tranquil setting, time felt as if it had slowed, leaving the forest suspended in an eternal moment of peace and quiet mystery."
word_records = word_stats(paragraph)
for record in word_records:
    print(record)

That ... was certainly unexpected. What happened?

## Value and Identity

In Python, objects have a **value** and an **identity**. These concepts are easy to confuse, but we can illustrate with examples.

Python keeps track of the objects it has created using a unique id. We can find out the id of an object using the built-in function `id()`:

In [None]:
a = 2
b = 2
print("id of a:", id(a))
print("id of b:", id(b))

Notice that even when we assign the same integer value to two different variables, both variables have the same id: Python reuses a single object for the integer `2` instead of creating two. (CPython, the standard Python implementation, does this for small integers.) This aligns with our instinct about numbers; `2` means the same thing everywhere.

#### Exercise 1

Test this property with other primitive types: `float`, `bool`, `None`. Is this always true?

## Primitive vs complex values in Python

In Python, equal primitive values often share the same identity. But this is not the case for more complex value types, such as strings:

In [None]:
a = "hello"
b = "world"
c = a + b
d = "helloworld"
print("id of c:", id(c))
print("id of d:", id(d))

Notice that `c` and `d` both have a value of `'helloworld'`, but different identities; `c` and `d` are two different strings that happen to have the same sequence of characters. We use two different operators to distinguish these concepts:

In [None]:
if c == d:
    print("c and d have the same value")
else:
    print("c and d do not have the same value")

if c is d:
    print("c and d are the same object")
else:
    print("c and d are not the same object")


## Data structures in Python

This concept also applies to lists and dicts. In particular:

In [None]:
e = {"word": "hello", "length": 5, "vowels": 2}
f = {"word": "hello", "length": 5, "vowels": 2}

if e == f:
    print("e and f have the same value")
else:
    print("e and f do not have the same value")

if e is f:
    print("e and f are the same object")
else:
    print("e and f are not the same object")


Dicts are equal in value if they have the same key-value pairs. They are identical if they are the same object. (Use `id()` to check the ids of the two dicts; they are not the same dict!)
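Identity matters most when a dict is changed. As a minimal sketch (the names `first`, `second` and `alias` are made up for this illustration), assigning a dict to another variable does not create a new dict, so a change made through one name is visible through the other:

```python
# Hypothetical names for illustration: two separate dicts with equal values...
first = {"word": "hello", "length": 5, "vowels": 2}
second = {"word": "hello", "length": 5, "vowels": 2}
# ...and a second name for the first dict. No new dict is created here.
alias = first

alias["length"] = 99  # change the dict through the second name

print(first["length"])   # 99, because alias and first are the same object
print(second["length"])  # 5, because second is a separate dict
print(alias is first)    # True
print(first == second)   # False, because the values have now diverged
```

Two dicts that are merely equal in value can be changed independently; two names for the same dict cannot.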

In [None]:
g = [a, b, c, d, e, f]
h = [a, b, c, d, e, f]

if g == h:
    print("g and h have the same value")
else:
    print("g and h do not have the same value")

if g is h:
    print("g and h are the same object")
else:
    print("g and h are not the same object")

The same idea applies to lists.
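Two different lists can also hold the very same objects. The sketch below assumes a made-up dict called `sample` and two hypothetical lists that both contain it:

```python
# Hypothetical names for illustration.
sample = {"word": "hello", "length": 5, "vowels": 2}
list_one = [sample, 1]
list_two = [sample, 1]

print(list_one == list_two)        # True: same values in the same order
print(list_one is list_two)        # False: two separate list objects
print(list_one[0] is list_two[0])  # True: both lists hold the same dict
```

The lists are separate objects, but their first elements are one and the same dict.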

## Debugging identity problems

Back to our function; what went wrong with it?

    {'length': 458, 'vowels': 167, 'word': 'mystery.'}
    {'length': 458, 'vowels': 167, 'word': 'mystery.'}
    {'length': 458, 'vowels': 167, 'word': 'mystery.'}
    {'length': 458, 'vowels': 167, 'word': 'mystery.'}
    ...
    
Notice that it became a list of identical records. Curiously, the `word` in this record is the last word of the paragraph, while `length` and `vowels` are totals accumulated over every word. If we check the ids of the dicts in this list:

In [None]:
for record in word_records:
    print(id(record))

Notice how they all have the same id? We have appended the same dict into our list multiple times!
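The same effect can be reproduced in a few lines. This is a minimal sketch, not the lesson's function; the names `shared`, `items` and `distinct_ids` are made up for illustration:

```python
# Hypothetical names for illustration.
shared = {"count": 0}   # one dict, created once outside the loop
items = []
for n in range(3):
    shared["count"] = n   # overwrite the value in the one and only dict
    items.append(shared)  # append the same dict again

print(items)  # [{'count': 2}, {'count': 2}, {'count': 2}]

distinct_ids = {id(item) for item in items}
print(len(distinct_ids))  # 1: the list holds one dict, three times
```

Every entry in the list shows the last value written, because every entry is the same dict.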

To avoid this, we have to be careful not to reuse records if we do not intend to refer to the same object:

In [None]:
def word_stats(text: str) -> "list[dict]":
    """Takes in text as a string.
    Returns a list of word records.
    Each record is a dict with the following info about the word:
    - word: the word itself
    - length: number of letters in the word
    - vowels: number of vowels in the word
    """
    records = []
    for word in text.split(" "):
        record = {"length": 0, "vowels": 0}  # <-- start a new record for each word
        record["word"] = word
        for char in word:
            if char in "aeiou":
                record["vowels"] = record["vowels"] + 1
            record["length"] = record["length"] + 1
        records.append(record)
    return records
        
paragraph = "Beneath the dense canopy of the ancient forest, the sunlight struggled to pierce through the thick foliage, casting dappled shadows on the moss-covered ground. The air was filled with the earthy scent of damp soil and the soft hum of hidden creatures. As a cool breeze rustled the leaves, a faint melody seemed to drift between the towering trees—a sound so delicate it could easily be mistaken for a trick of the mind. In this tranquil setting, time felt as if it had slowed, leaving the forest suspended in an eternal moment of peace and quiet mystery."
word_records = word_stats(paragraph)
for record in word_records:
    print(record)

Now we get the correct result. Our word list now contains different dicts instead of the same dict.
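One way to confirm this kind of fix is to count how many different objects a list holds. The helper `count_distinct_objects` below is hypothetical (it is not part of the lesson), and the two small lists stand in for the buggy and fixed results:

```python
def count_distinct_objects(items: list) -> int:
    """Hypothetical helper: number of different objects (by id) in a list."""
    return len({id(item) for item in items})


reused = {"n": 0}
buggy = [reused for _ in range(3)]    # the same dict, three times
fixed = [{"n": 0} for _ in range(3)]  # a new dict on every iteration

print(count_distinct_objects(buggy))  # 1
print(count_distinct_objects(fixed))  # 3
print(buggy == fixed)                 # True: equal values, different identities
```

Comparing with `==` cannot tell the two lists apart; only the ids reveal that `buggy` reuses one dict.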

# Summary

Research shows that **active recall**, the mental effort of attempting to remember, helps strengthen neural connections. For each of the questions below, try to recall what you learnt from this lesson before you click to reveal.

<ol>

<li><details>
    <summary>What is <code>==</code> used for? (click to reveal)</summary>
    <p><code>==</code> is the equality operator. It checks if two objects have the same value.</p>
</details></li>
    
<li><details>
    <summary>What is <code>is</code> used for? (click to reveal)</summary>
    <p><code>is</code> is the identity operator. It checks if two objects are identical (have the same id).</p>
</details></li>
