# Object References, Mutability, and Recycling

## Variables Are Not Boxes

In 1997, I took a summer course on Java at MIT. The professor, Lynn Stein, made the point that the usual “variables as boxes” metaphor actually hinders the understanding of reference variables in object-oriented languages. Python variables are like reference variables in Java; a **better metaphor is to think of variables as labels with names attached to objects**. The next example and figure will help you understand why.

Example 6-1 is a simple interaction that the “variables as boxes” idea cannot explain. Figure 6-1 illustrates why the box metaphor is wrong for Python, while sticky notes provide a helpful picture of how variables actually work.

In [3]:
a = [1, 2, 3]  # Create a list [1, 2, 3] and bind the variable a to it
b = a          # Bind the variable b to the same value that a is referencing.
a.append(4)    # Modify the list referenced by a, by appending another item.

print(b)       # You can see the effect via the b variable. If we think of b as a box 
               # that stored a copy of the [1, 2, 3] from the a box, this behavior makes no sense.

[1, 2, 3, 4]


Therefore, the b = a statement does not copy the contents of box a into box b. **It attaches the label b to the object that already has the label a.**

Prof. Stein also spoke about assignment in a very deliberate way. For example, when talking about a seesaw object in a simulation, she would say: “Variable s is assigned to the seesaw,” but never “The seesaw is assigned to variable s.” 

With **reference variables**, it makes much more sense to say that **the variable is assigned to an object**, and not the other way around. After all, the object is created before the assignment. Example 6-2 proves that the righthand side of an assignment happens first.

Since the verb “to assign” is used in contradictory ways, a useful alternative is “to bind”: Python’s assignment statement `x = …` binds the x name to the object created or referenced on the righthand side. And the object must exist before a name can be bound to it, as Example 6-2 proves.

In [4]:
class Gizmo:
    def __init__(self):
        print(f'Gizmo id: {id(self)}')

In [5]:
x = Gizmo() # The output Gizmo id: … is a side effect of creating a Gizmo instance

Gizmo id: 140334639386576


In [6]:
y = Gizmo() * 10 # Multiplying a Gizmo instance will raise an exception
# Here is proof that a second Gizmo was actually instantiated before the multiplication was attempted.

Gizmo id: 140334639384560


TypeError: unsupported operand type(s) for *: 'Gizmo' and 'int'

In [9]:
print(dir())
# But variable y was never created, because the exception happened while the 
# righthand side of the assignment was being evaluated.

['Gizmo', 'In', 'Out', '_', '_1', '_7', '__', '___', '__builtin__', '__builtins__', '__doc__', '__loader__', '__name__', '__package__', '__spec__', '_dh', '_i', '_i1', '_i2', '_i3', '_i4', '_i5', '_i6', '_i7', '_i8', '_i9', '_ih', '_ii', '_iii', '_oh', 'a', 'b', 'exit', 'get_ipython', 'quit', 'x']
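
The same check can be scripted without inspecting `dir()` in an interactive session: attempt the failing assignment inside a function and look at its local namespace afterwards. This is only a minimal sketch; the `created` list and the `attempt_assignment` helper are illustrative names that do not appear in the original examples.

```python
created = []  # every Gizmo that gets instantiated is recorded here (illustrative)


class Gizmo:
    def __init__(self):
        created.append(self)


def attempt_assignment():
    """Try `y = Gizmo() * 10` and report whether the name y was ever bound."""
    try:
        y = Gizmo() * 10  # the right-hand side runs first and raises TypeError
    except TypeError:
        pass
    return 'y' in locals()  # locals() lists only names that are currently bound


y_was_bound = attempt_assignment()
print(len(created), y_was_bound)
```

One `Gizmo` ends up in `created`, yet `y_was_bound` is `False`: the object existed before the binding was ever attempted.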


> To understand an assignment in Python, read the righthand side first: that’s where the object is created or retrieved. After that, the variable on the left is bound to the object, like a label stuck to it. Just forget about the boxes.

Because variables are mere labels, nothing prevents an object from having several labels assigned to it. When that happens, you have aliasing, our next topic.

## Identity, Equality, and Aliases

Lewis Carroll is the pen name of Prof. Charles Lutwidge Dodgson. Mr. Carroll is not only equal to Prof. Dodgson, they are one and the same. Example 6-3 expresses this idea in Python.

In [10]:
charles = {'name': 'Charles L. Dodgson', 'born': 1832}

In [11]:
lewis = charles # lewis is an alias for charles

In [12]:
lewis is charles

True

In [13]:
id(charles), id(lewis) # The is operator and the id function confirm it.

(140334502055040, 140334502055040)

In [14]:
lewis['balance'] = 950 # Adding an item to lewis is the same as adding an item to charles

In [15]:
charles

{'name': 'Charles L. Dodgson', 'born': 1832, 'balance': 950}

However, suppose an impostor (let’s call him Dr. Alexander Pedachenko) claims he is Charles L. Dodgson, born in 1832. His credentials may be the same, but Dr. Pedachenko is not Prof. Dodgson. Figure 6-2 illustrates this scenario.

In [16]:
alex = {'name': 'Charles L. Dodgson', 'born': 1832, 'balance': 950}
# alex refers to an object that is a replica of the object assigned to charles

In [17]:
alex == charles
# The objects compare equal because of the __eq__ implementation in the dict class.

True

In [18]:
alex is not charles
# But they are distinct objects. This is the Pythonic way of writing the negative identity comparison: a is not b.

True

Example 6-3 is an example of **aliasing**. In that code, lewis and charles are aliases: two variables bound to the same object. On the other hand, alex is not an alias for charles: these variables are bound to distinct objects. The objects bound to alex and charles have the **same value—that’s what == compares—but they have different identities**.

> An object’s identity never changes once it has been created; you may think of it as the object’s address in memory. The is operator compares the identity of two objects; the id() function returns an integer representing its identity.

The real meaning of an object’s ID is implementation dependent. In CPython, `id()` returns the **memory address of the object**, but it may be something else in another Python interpreter. The **key point is that the ID is guaranteed to be a unique integer label among objects that exist at the same time, and it will never change during the life of the object**.

In practice, we rarely use the `id()` function while programming. Identity checks are most often done with the is operator, which compares the object IDs, so our code doesn’t need to call `id()` explicitly.

### Choosing Between == and is

The `==` operator compares the values of objects (the data they hold), while `is` compares their identities.

While programming, we often care more about values than object identities, so == appears more frequently than is in Python code.

However, if you are comparing a variable to a singleton, then it makes sense to use is. By far, the most common case is checking whether a variable is bound to None. This is the recommended way to do it:

In [20]:
x = 10

In [21]:
x is None

False

And the proper way to write its negation is:

In [22]:
x is not None

True

None is the most common singleton we test with is. Sentinel objects are another example of singletons we test with is. Here is one way to create and test a sentinel object:

In [None]:
END_OF_DATA = object()
# ... many lines
def traverse(...):
    # ... more lines
    if node is END_OF_DATA:
        return
    # etc.
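
The fragment above elides the traversal itself. As a self-contained sketch of the same idea, assume the data arrives as a plain iterable in which `END_OF_DATA` marks the logical end. The `collect_until_end` function and the sample `stream` are hypothetical and serve only to make the sentinel test runnable.

```python
END_OF_DATA = object()  # unique sentinel: no other object shares its identity


def collect_until_end(stream):
    """Return the items that appear before the END_OF_DATA sentinel."""
    items = []
    for node in stream:
        if node is END_OF_DATA:  # identity test, so None and 0 stay valid data
            break
        items.append(node)
    return items


stream = [3, None, 0, END_OF_DATA, 'ignored']
result = collect_until_end(stream)
print(result)
```

Using a fresh `object()` instead of `None` as the marker lets `None` itself travel through the stream as ordinary data.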

The **`is` operator is faster than `==`**, because it cannot be overloaded, so Python does not have to find and invoke special methods to evaluate it, and computing is as simple as comparing two integer IDs. In contrast, a == b is syntactic sugar for `a.__eq__(b)`. The `__eq__` method inherited from object compares object IDs, so it produces the same result as is. But most built-in types override `__eq__` with more meaningful implementations that actually take into account the values of the object attributes. **Equality may involve a lot of processing—for example, when comparing large collections or deeply nested structures.**
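
To see the difference between an operator that can be overloaded and one that cannot, consider a deliberately misbehaving class. `AlwaysEqual` is a hypothetical example, not something you would write in practice: its `__eq__` claims equality with everything and counts how often it is invoked.

```python
class AlwaysEqual:
    """Hypothetical class whose __eq__ claims equality with any object."""

    calls = 0  # how many times __eq__ has been invoked

    def __eq__(self, other):
        AlwaysEqual.calls += 1
        return True


a = AlwaysEqual()
b = AlwaysEqual()

equal = a == b           # dispatches to AlwaysEqual.__eq__
identical = a is b       # compares identities; __eq__ is not consulted
equals_none = a == None  # also dispatches to __eq__, with a misleading result
is_none = a is None      # identity test cannot be fooled

print(equal, identical, equals_none, is_none, AlwaysEqual.calls)
```

Only the two `==` comparisons reach `__eq__`, so the counter ends at 2. This is one reason `x is None` is preferred over `x == None`: the result cannot depend on how the class of `x` implements equality.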


> Usually we are more interested in object equality than identity. Checking for None is the only common use case for the is operator. Most other uses I see while reviewing code are wrong. If you are not sure, use ==. It’s usually what you want, and also works with None—albeit not as fast.

### The Relative Immutability of Tuples

Tuples, like most Python collections—lists, dicts, sets, etc.—are containers: they hold references to objects. If the referenced items are mutable, they may change even if the tuple itself does not. In other words, **the immutability of tuples really refers to the physical contents of the tuple data structure (i.e., the references it holds), and does not extend to the referenced objects.**

Example 6-5 illustrates the situation in which the value of a tuple changes as a result of changes to a mutable object referenced in it. What can never change in a tuple is the identity of the items it contains.

In [24]:
t1 = (1, 2, [30, 40]) # t1 is immutable, but t1[-1] is mutable
t2 = (1, 2, [30, 40]) # Build a tuple t2 whose items are equal to those of t1

In [25]:
t1 == t2 # Although distinct objects, t1 and t2 compare equal, as expected.

True

In [26]:
id(t1[-1]) # Inspect the identity of the list at t1[-1]

140334502067584

In [27]:
t1[-1].append(99) # Modify the t1[-1] list in place.

In [28]:
t1

(1, 2, [30, 40, 99])

In [29]:
id(t1[-1]) # The identity of t1[-1] has not changed, only its value

140334502067584

In [30]:
t1 == t2 # t1 and t2 are now different.

False

This relative immutability of tuples is behind the riddle “A += Assignment Puzzler”. It’s also the reason why some tuples are unhashable, as we’ve seen in “What Is Hashable”.
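
The link between relative immutability and hashability can be checked directly. In this short sketch, `is_hashable` is an illustrative helper that simply reports whether `hash()` succeeds.

```python
def is_hashable(obj):
    """Return True if hash(obj) succeeds, False if it raises TypeError."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


all_immutable = (1, 2, (30, 40))  # every item is itself hashable
holds_a_list = (1, 2, [30, 40])   # the last item is a mutable list

print(is_hashable(all_immutable), is_hashable(holds_a_list))
```

The tuple that holds a list cannot be hashed, because its value could change whenever the list changes.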


## Copies Are Shallow by Default

The easiest way to copy a list (or most built-in mutable collections) is to use the built-in constructor for the type itself. For example:

In [31]:
l1 = [3, [55, 44], (7, 8, 9)]

In [32]:
l2 = list(l1) # list(l1) creates a copy of l1.

In [33]:
l2

[3, [55, 44], (7, 8, 9)]

In [34]:
l2 == l1 # The copies are equal…

True

In [35]:
l2 is l1 # …but refer to two different objects

False

For lists and other mutable sequences, the shortcut `l2 = l1[:]` also makes a copy.

However, using the constructor or [:] **produces a shallow copy** (i.e., **the outermost container is duplicated**, but the copy is filled with references to the same items held by the original container). This saves memory and **causes no problems if all the items are immutable**. But if there are mutable items, this may lead to unpleasant surprises.
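
A quick way to confirm what "shallow" means is to compare identities item by item. The sketch below uses three common ways of copying a list; the `copies` dict and its labels are illustrative.

```python
import copy

original = [3, [55, 44], (7, 8, 9)]

copies = {
    'list(original)': list(original),
    'original[:]': original[:],
    'copy.copy(original)': copy.copy(original),
}

for label, dup in copies.items():
    new_container = dup is not original                        # outer list is duplicated
    shared_items = all(x is y for x, y in zip(dup, original))  # items are the same objects
    print(f'{label}: new container={new_container}, items shared={shared_items}')
```

Each technique builds a new outer list whose slots refer to the very same inner objects as the original.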

In Example 6-6, we create a shallow copy of a list containing another list and a tuple, and then make changes to see how they affect the referenced objects.

In [39]:
l1 = [3, [66, 55, 44], (7, 8, 9)]
l2 = list(l1) # l2 is a shallow copy of l1.

In [40]:
l1.append(100) # Appending 100 to l1 has no effect on l2.   
l1[1].remove(55) # Here we remove 55 from the inner list l1[1]. This affects l2 because l2[1] 
                 # is bound to the same list as l1[1].

print('l1:', l1)
print('l2:', l2)

l1: [3, [66, 44], (7, 8, 9), 100]
l2: [3, [66, 44], (7, 8, 9)]


In [41]:
l2[1] += [33, 22]  # For a mutable object like the list referred to by l2[1], the operator += changes the
                   # list in place. This change is visible at l1[1], which is an alias for l2[1].
    
l2[2] += (10, 11)  # += on a tuple creates a new tuple and rebinds the variable l2[2] here. This is the same 
                   # as doing l2[2] = l2[2] + (10, 11). Now the tuples in the last position of l1 and l2 are 
                   # no longer the same object

print('l1:', l1)
print('l2:', l2)

l1: [3, [66, 44, 33, 22], (7, 8, 9), 100]
l2: [3, [66, 44, 33, 22], (7, 8, 9, 10, 11)]


It should be clear now that shallow copies are easy to make, but they may or may not be what you want. How to make deep copies is our next topic.

### Deep and Shallow Copies of Arbitrary Objects

Working with shallow copies is not always a problem, but sometimes you need to make deep copies (i.e., duplicates that do not share references of embedded objects). The copy module provides the `deepcopy` and `copy` functions that return deep and shallow copies of arbitrary objects.

To illustrate the use of `copy()` and `deepcopy()`, Example 6-8 defines a simple class, Bus, representing a school bus that is loaded with passengers and then picks up or drops off passengers on its route.

In [42]:
class Bus:
    def __init__(self, passengers=None):
        if passengers is None:
            self.passengers = []
        else:
            self.passengers = list(passengers)

    def pick(self, name):
        self.passengers.append(name)

    def drop(self, name):
        self.passengers.remove(name)

Now, in the interactive Example 6-9, we will create a bus object (bus1) and two clones—a shallow copy (bus2) and a deep copy (bus3)—to observe what happens as bus1 drops off a student.

In [43]:
import copy

bus1 = Bus(['Alice', 'Bill', 'Claire', 'David'])
bus2 = copy.copy(bus1)
bus3 = copy.deepcopy(bus1)
id(bus1), id(bus2), id(bus3) # Using copy and deepcopy, we create three distinct Bus instances.

(140334636661536, 140334638570800, 140334638535296)

In [44]:
bus1.drop('Bill')

In [45]:
bus2.passengers # After bus1 drops 'Bill', he is also missing from bus2.

['Alice', 'Claire', 'David']

In [46]:
id(bus1.passengers), id(bus2.passengers), id(bus3.passengers)
# Inspection of the passengers attributes shows that bus1 and bus2 share the same 
# list object, because bus2 is a shallow copy of bus1

(140334501494848, 140334501494848, 140334501741376)

In [47]:
bus3.passengers
# bus3 is a deep copy of bus1, so its passengers attribute refers to another list.

['Alice', 'Bill', 'Claire', 'David']

Note that making deep copies is not a simple matter in the general case. Objects may have cyclic references that would cause a naïve algorithm to enter an infinite loop. The deepcopy function remembers the objects already copied to handle cyclic references gracefully. This is demonstrated in Example 6-10.

In [48]:
a = [10, 20]
b = [a, 30]

In [49]:
b

[[10, 20], 30]

In [50]:
a.append(b)

In [51]:
a

[10, 20, [[...], 30]]

In [52]:
from copy import deepcopy

In [53]:
c = deepcopy(a)

In [54]:
c

[10, 20, [[...], 30]]
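
One way to confirm that the copy reproduces the cycle, instead of sharing it with the original, is to check identities inside `c`. The snippet rebuilds the lists from the cells above so that it runs on its own; the three flag names are illustrative.

```python
from copy import deepcopy

a = [10, 20]
b = [a, 30]
a.append(b)  # now a is [10, 20, b] and b is [a, 30]: a reference cycle

c = deepcopy(a)

copy_is_new = c is not a        # the outer list was duplicated
inner_is_new = c[2] is not b    # so was the embedded list
cycle_preserved = c[2][0] is c  # the copy's inner list points back to the copy

print(copy_is_new, inner_is_new, cycle_preserved)
```

All three flags are `True`: the copy has the same shape as the original, including the cycle, but shares no list with it.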

Also, a deep copy may be too deep in some cases. For example, objects may refer to external resources or singletons that should not be copied. You can control the behavior of both copy and deepcopy by implementing the `__copy__()` and `__deepcopy__()` special methods, as described in the copy module documentation.
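
As a sketch of how `__deepcopy__()` can keep a copy from going too deep, assume a hypothetical `Session` class that holds a `Connection` standing in for an external resource. Neither class comes from the original text. The method duplicates the session's own history but deliberately shares the connection.

```python
import copy


class Connection:
    """Hypothetical stand-in for an external resource that must not be duplicated."""


class Session:
    def __init__(self, connection, history=None):
        self.connection = connection
        self.history = [] if history is None else list(history)

    def __deepcopy__(self, memo):
        cls = type(self)
        clone = cls.__new__(cls)  # build the clone without running __init__
        memo[id(self)] = clone    # register early so cycles back to self resolve
        clone.connection = self.connection                 # shared on purpose
        clone.history = copy.deepcopy(self.history, memo)  # fully duplicated
        return clone


conn = Connection()
s1 = Session(conn, [['login'], ['query']])
s2 = copy.deepcopy(s1)

print(s2.connection is s1.connection, s2.history[0] is s1.history[0])
```

Passing `memo` along to the nested `copy.deepcopy` call is what lets the standard machinery keep track of objects that were already copied.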


## Function Parameters as References

The only mode of parameter passing in Python is call by sharing. That is the same mode used in most object-oriented languages, including JavaScript, Ruby, and Java (this applies to Java reference types; primitive types use call by value). Call by sharing means that each formal parameter of the function gets a copy of each reference in the arguments. In other words, the **parameters inside the function become aliases of the actual arguments**.

The result of this scheme is that **a function may change any mutable object passed as a parameter, but it cannot change the identity of those objects** (i.e., it cannot altogether replace an object with another). Example 6-11 shows a simple function using += on one of its parameters. As we pass numbers, lists, and tuples to the function, the actual arguments passed are affected in different ways.

In [55]:
def f(a, b):
    a += b
    return a

In [56]:
x = 1
y = 2

f(x, y)

3

In [57]:
x, y # The number x is unchanged.

(1, 2)

In [58]:
a = [1, 2]
b = [3, 4]
f(a, b)

[1, 2, 3, 4]

In [59]:
a, b # The list a is changed

([1, 2, 3, 4], [3, 4])

In [60]:
t = (10, 20)
u = (30, 40)
f(t, u)  

(10, 20, 30, 40)

In [61]:
t, u # The tuple t is unchanged.

((10, 20), (30, 40))
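
The results above follow from the difference between mutating the object a parameter refers to and rebinding the parameter name. A minimal sketch with two hypothetical functions, `mutate` and `rebind`, isolates each case.

```python
def mutate(items):
    items.append(99)  # changes the object that the caller also refers to


def rebind(items):
    items = items + [99]  # builds a new list and rebinds only the local name
    return items


data = [1, 2]

mutate(data)
after_mutate = list(data)  # snapshot of data after the in-place change

result = rebind(data)      # data itself is untouched by the rebinding

print(after_mutate, data, result, result is data)
```

The caller sees the in-place change made by `mutate`, but the assignment inside `rebind` never replaces the object that `data` is bound to.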

Another issue related to function parameters is the use of mutable values for defaults, as discussed next.

### Mutable Types as Parameter Defaults: Bad Idea

Optional parameters with default values are a great feature of Python function definitions, allowing our APIs to evolve while remaining backward compatible. However, **you should avoid mutable objects as default values for parameters**.

To illustrate this point, in Example 6-12, we take the Bus class from Example 6-8 and change its `__init__` method to create HauntedBus. Here we tried to be clever, and instead of having a default value of passengers=None, we have passengers=[], thus avoiding the if in the previous `__init__`. This “cleverness” gets us into trouble.

In [64]:
class HauntedBus:
    """A bus model haunted by ghost passengers"""

    def __init__(self, passengers=[]):  # When the passengers argument is not passed, this parameter is bound to the 
                                        # default list object, which is initially empty.
        self.passengers = passengers  # This assignment makes self.passengers an alias for passengers, which is itself 
                                      # an alias for the default list, when no passengers argument is given.

    def pick(self, name):
        self.passengers.append(name)   # When the methods .remove() and .append() are used with self.passengers, 
                                       # we are actually mutating the default list, which is an attribute of the 
                                       # function object.

    def drop(self, name):
        self.passengers.remove(name)

In [65]:
bus1 = HauntedBus(['Alice', 'Bill'])  # bus1 starts with a two-passenger list.
bus1.passengers 

['Alice', 'Bill']

In [66]:
bus1.pick('Charlie')
bus1.drop('Alice')
bus1.passengers # So far, so good: no surprises with bus1.

['Bill', 'Charlie']

In [67]:
bus2 = HauntedBus()  # bus2 starts empty, so the default empty list is assigned to self.passengers
bus2.pick('Carrie')
bus2.passengers

['Carrie']

In [68]:
bus3 = HauntedBus()  # bus3 also starts empty, again the default list is assigned.
bus3.passengers # The default is no longer empty!

['Carrie']

In [69]:
bus3.pick('Dave')
bus2.passengers # Now Dave, picked by bus3, appears in bus2

['Carrie', 'Dave']

In [70]:
bus2.passengers is bus3.passengers 
# The problem: bus2.passengers and bus3.passengers refer to the same list

True

In [71]:
bus1.passengers # But bus1.passengers is a distinct list.

['Bill', 'Charlie']

The problem is that HauntedBus instances that don’t get an initial passenger list end up sharing the same passenger list among themselves.

Such bugs may be subtle. As Example 6-13 demonstrates, when a HauntedBus is instantiated with passengers, it works as expected. Strange things happen only when a HauntedBus starts empty, because then self.passengers becomes an alias for the default value of the passengers parameter. The problem is that each default value is evaluated when the function is defined—i.e., usually when the module is loaded—and the default values become attributes of the function object. So if a default value is a mutable object, and you change it, the change will affect every future call of the function.

After running the lines in Example 6-13, you can inspect the `HauntedBus.__init__` object and see the ghost students haunting its `__defaults__` attribute:

In [73]:
print(dir(HauntedBus.__init__))

['__annotations__', '__call__', '__class__', '__closure__', '__code__', '__defaults__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__get__', '__getattribute__', '__globals__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__kwdefaults__', '__le__', '__lt__', '__module__', '__name__', '__ne__', '__new__', '__qualname__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__']


In [74]:
HauntedBus.__init__.__defaults__

(['Carrie', 'Dave'],)

Finally, we can verify that `bus2.passengers` is an alias bound to the first element of the `HauntedBus.__init__.__defaults__` attribute:

In [75]:
HauntedBus.__init__.__defaults__[0] is bus2.passengers

True

The issue with mutable defaults explains why None is commonly used as the default value for parameters that may receive mutable values. In Example 6-8, `__init__` checks whether the passengers argument is None. If it is, self.passengers is bound to a new empty list. If passengers is not None, the correct implementation binds a copy of that argument to self.passengers. The next section explains why copying the argument is a good practice.
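
The same contrast can be reduced to two plain functions. The names `append_bad` and `append_good` are illustrative. The first uses a mutable default, the second uses the `None` idiom described above.

```python
def append_bad(item, bucket=[]):
    """The default list is created once, when the def statement runs."""
    bucket.append(item)
    return bucket


def append_good(item, bucket=None):
    """A fresh list is created on each call that omits bucket."""
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


bad_first = append_bad(1)
bad_second = append_bad(2)  # same list object as bad_first
good_first = append_good(1)
good_second = append_good(2)

print(bad_first, bad_second, append_bad.__defaults__)
print(good_first, good_second, append_good.__defaults__)
```

Both calls to `append_bad` return the single list stored in `append_bad.__defaults__`, while `append_good` keeps `None` as its default and returns independent lists.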

### Defensive Programming with Mutable Parameters

When you are coding a function that receives a mutable parameter, you **should carefully consider whether the caller expects the argument passed to be changed.**

For example, if your function receives a dict and needs to modify it while processing it, should this side effect be visible outside of the function or not? Actually it depends on the context. It’s really a matter of aligning the expectation of the coder of the function and that of the caller.

The last bus example in this chapter shows how a TwilightBus breaks expectations by sharing its passenger list with its clients. Before studying the implementation, see in Example 6-14 how the TwilightBus class works from the perspective of a client of the class.

In [77]:
class TwilightBus:
    """A bus model that makes passengers vanish"""

    def __init__(self, passengers=None):
        if passengers is None:
            self.passengers = []  # Here we are careful to create a new empty list when passengers is None.
        else:
            self.passengers = passengers  
            # However, this assignment makes self.passengers an alias for passengers, which is itself an 
            # alias for the actual argument passed to __init__

    def pick(self, name):
        self.passengers.append(name)

    def drop(self, name):
        self.passengers.remove(name)
        # When the methods .remove() and .append() are used with self.passengers, we are actually 
        # mutating the original list received as an argument to the constructor

In [79]:
basketball_team = ['Sue', 'Tina', 'Maya', 'Diana', 'Pat']  # basketball_team holds five student names
bus = TwilightBus(basketball_team)  # A TwilightBus is loaded with the team.
bus.drop('Tina')  
bus.drop('Pat') # The bus drops one student, then another

basketball_team # The dropped passengers vanished from the basketball team!

['Sue', 'Maya', 'Diana']

TwilightBus violates the “Principle of least astonishment,” a best practice of interface design. It surely is astonishing that when the bus drops a student, their name is removed from the basketball team roster.

The problem here is that the bus is aliasing the list that is passed to the constructor. Instead, it should keep its own passenger list. The fix is simple: in `__init__`, when the passengers parameter is provided, self.passengers should be initialized with a copy of it, as we did correctly in Example 6-8:

In [None]:
def __init__(self, passengers=None):
    if passengers is None:
        self.passengers = []
    else:
        self.passengers = list(passengers)  # Make a copy of the passengers list, or convert it to a list if it’s not one.

Now our internal handling of the passenger list will not affect the argument used to initialize the bus. As a bonus, this solution is more flexible: now the argument passed to the passengers parameter may be a tuple or any other iterable, like a set or even database results, because the list constructor accepts any iterable. As we create our own list to manage, we ensure that it supports the necessary `.remove()` and `.append()` operations we use in the `.pick()` and `.drop()` methods.
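
To check the fix end to end, the sketch below wraps the corrected `__init__` in a complete class and repeats the basketball team scenario. The class name `SafeBus` and the tuple of extra names are hypothetical.

```python
class SafeBus:
    """Hypothetical name for the corrected bus: it keeps its own passenger list."""

    def __init__(self, passengers=None):
        if passengers is None:
            self.passengers = []
        else:
            self.passengers = list(passengers)  # private copy, built from any iterable

    def pick(self, name):
        self.passengers.append(name)

    def drop(self, name):
        self.passengers.remove(name)


basketball_team = ['Sue', 'Tina', 'Maya', 'Diana', 'Pat']
bus = SafeBus(basketball_team)
bus.drop('Tina')
bus.drop('Pat')

tuple_bus = SafeBus(('Ann', 'Bob'))  # a tuple works because list() accepts any iterable
tuple_bus.pick('Cy')

print(basketball_team)
print(bus.passengers, tuple_bus.passengers)
```

The team roster still has five names after the bus drops two passengers, and a bus created from a tuple can pick up passengers normally.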

> Unless a method is explicitly intended to mutate an object received as an argument, you should think twice before aliasing the argument object by simply assigning it to an instance variable in your class. If in doubt, make a copy. Your clients will be happier. Of course, making a copy is not free: there is a cost in CPU and memory. However, an API that causes subtle bugs is usually a bigger problem than one that is a little slower or uses more resources.

## del and Garbage Collection

> Objects are never explicitly destroyed; however, when they become unreachable they may be garbage-collected.

The first strange fact about del is that it’s not a function, it’s a statement. We write del x and not del(x). The latter also works, but only because the expressions x and (x) usually mean the same thing in Python.

The second surprising fact is that **del deletes references, not objects**. **Python’s garbage collector may discard an object from memory as an indirect result of del, if the deleted variable was the last reference to the object**. Rebinding a variable may also cause the number of references to an object to reach zero, causing its destruction.

In [80]:
a = [1, 2] # Create object [1, 2] and bind a to it.

In [81]:
b = a # Bind b to the same [1, 2] object.

In [82]:
del a  # Delete reference a.

In [83]:
b # [1, 2] was not affected, because b still points to it.

[1, 2]

In [84]:
b = [3] # Rebinding b to a different object removes the last 
        # remaining reference to [1, 2]. Now the garbage collector can discard that object.

> There is a `__del__` special method, but it does not cause the disposal of the instance, and should not be called by your code. `__del__` is invoked by the Python interpreter when the instance is about to be destroyed to give it a chance to release external resources. You will seldom need to implement `__del__` in your own code, yet some Python programmers spend time coding it for no good reason. The proper use of `__del__` is rather tricky. See the `__del__` special method documentation in the “Data Model” chapter of The Python Language Reference.

In CPython, the **primary algorithm for garbage collection is reference counting**. Essentially, each object keeps count of how many references point to it. As soon as that refcount reaches zero, the object is immediately destroyed: CPython calls the `__del__` method on the object (if defined) and then frees the memory allocated to the object. In CPython 2.0, a generational garbage collection algorithm was added to detect groups of objects involved in reference cycles—which may be unreachable even with outstanding references to them, when all the mutual references are contained within the group. Other implementations of Python have more sophisticated garbage collectors that do not rely on reference counting, which means the `__del__` method may not be called immediately when there are no more references to the object. See “PyPy, Garbage Collection, and a Deadlock” by A. Jesse Jiryu Davis for discussion of improper and proper use of `__del__`.
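
The role of the cycle collector can be observed with the standard `gc` and `weakref` modules. The sketch below assumes CPython, where reference counting alone cannot reclaim a cycle. The `Node` class and the `cycle_demo` function are hypothetical, and automatic collection is paused during the experiment so that the timing is predictable.

```python
import gc
import weakref


class Node:
    """Hypothetical object that can point to a partner node."""


def cycle_demo():
    was_enabled = gc.isenabled()
    gc.disable()  # keep the cycle collector from running on its own
    try:
        a, b = Node(), Node()
        a.partner, b.partner = b, a  # mutual references form a cycle
        probe = weakref.ref(a)       # observes a without keeping it alive
        del a, b                     # no references from outside the cycle remain
        alive_before = probe() is not None
        gc.collect()                 # run the cycle collector explicitly
        alive_after = probe() is not None
    finally:
        if was_enabled:
            gc.enable()
    return alive_before, alive_after


alive_before, alive_after = cycle_demo()
print(alive_before, alive_after)
```

On CPython the first flag is `True` and the second is `False`: the refcounts inside the cycle never reach zero, so the objects survive `del` until the cycle collector runs.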

To demonstrate the end of an object’s life, Example 6-16 uses `weakref.finalize` to register a callback function to be called when an object is destroyed.

In [85]:
import weakref

s1 = {1, 2, 3}
s2 = s1        # s1 and s2 are aliases referring to the same set, {1, 2, 3}.

# This function must not be a bound method of the object about to be destroyed or otherwise hold a reference to it.
def bye():      
    print('...like tears in the rain.')

In [86]:
# Register the bye callback on the object referred to by s1.
ender = weakref.finalize(s1, bye)

In [87]:
ender.alive  # The .alive attribute is True before the finalize object is called.

True

In [88]:
del s1 

In [89]:
ender.alive # As discussed, del did not delete the object, just the s1 reference to it.

True

In [90]:
s2 = 'spam'  # Rebinding the last reference, s2, makes {1, 2, 3} unreachable. 
# It is destroyed, the bye callback is invoked, and ender.alive becomes False.

...like tears in the rain.


In [91]:
ender.alive

False

The point of Example 6-16 is to make explicit that del does not delete objects, but objects may be deleted as a consequence of being unreachable after del is used.

You may be wondering why the {1, 2, 3} object was destroyed in Example 6-16. After all, the s1 reference was passed to the finalize function, which must have held on to it in order to monitor the object and invoke the callback. This works because finalize holds a weak reference to {1, 2, 3}. Weak references to an object do not increase its reference count. Therefore, a weak reference does not prevent the target object from being garbage collected. Weak references are useful in caching applications because you don’t want the cached objects to be kept alive just because they are referenced by the cache.
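
As a sketch of the caching use case, the standard library offers `weakref.WeakValueDictionary`, a mapping whose values are held through weak references. The `Cheese` class and the `cache` name are hypothetical, and the immediate removal shown here relies on CPython's reference counting.

```python
import weakref


class Cheese:
    """Hypothetical cacheable object; by default, instances of user-defined classes accept weak references."""

    def __init__(self, kind):
        self.kind = kind


cache = weakref.WeakValueDictionary()

brie = Cheese('Brie')
cache[brie.kind] = brie  # the cache holds only a weak reference

cached_while_alive = 'Brie' in cache
del brie                 # drop the only strong reference
cached_after_del = 'Brie' in cache

print(cached_while_alive, cached_after_del)
```

Once the last strong reference is gone, the entry disappears from the mapping, so the cache never keeps an object alive by itself.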

## Tricks Python Plays with Immutables

> This optional section discusses some Python details that are not really important for users of Python, and that may not apply to other Python implementations or even future versions of CPython. Nevertheless, I’ve seen people stumble upon these corner cases and then start using the is operator incorrectly, so I felt they were worth mentioning.

I was surprised to learn that, for a tuple t, t[:] does not make a copy, but returns a reference to the same object. You also get a reference to the same tuple if you write tuple(t). Example 6-17 proves it.

In [92]:
t1 = (1, 2, 3)
t2 = tuple(t1)

In [93]:
t2 is t1 # t1 and t2 are bound to the same object

True

In [94]:
t3 = t1[:]

In [95]:
t3 is t1

True

The same behavior can be observed with instances of str, bytes, and frozenset. Note that a frozenset is not a sequence, so fs[:] does not work if fs is a frozenset. But fs.copy() has the same effect: it cheats and returns a reference to the same object, and not a copy at all. Example 6-18 shows a related surprise with literals.
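
The `frozenset.copy()` and `str` cases can be checked in a few lines. The results marked as CPython behavior are implementation details and may differ in other interpreters; the variable names are illustrative.

```python
fs = frozenset({1, 2, 3})
text = 'ABC'
numbers = [1, 2, 3]

fs_copy_is_same = fs.copy() is fs           # True in CPython: no new frozenset is built
str_slice_is_same = text[:] is text         # True in CPython: the same str is returned
list_slice_is_same = numbers[:] is numbers  # False: a mutable list needs a real copy

print(fs_copy_is_same, str_slice_is_same, list_slice_is_same)
```

The list is the contrast case: because it is mutable, returning the same object from `numbers[:]` would not be harmless.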

In [96]:
t1 = (1, 2, 3)
t3 = (1, 2, 3)   # Creating a new tuple from scratch.
t3 is t1  # t1 and t3 are equal, but not the same object. 

False

In [97]:
s1 = 'ABC'
s2 = 'ABC'  # Creating a second str from scratch.
s2 is s1 # Surprise: s1 and s2 refer to the same str!

True

**The sharing of string literals is an optimization technique called interning**. CPython uses a similar technique with small integers to avoid unnecessary duplication of numbers that appear frequently in programs, such as 0, 1, and -1. Note that CPython does not intern all strings or integers, and the criteria it uses to do so are an undocumented implementation detail.
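
The sketch below contrasts a string literal with an equal string assembled at run time. Whether the two are the same object is exactly the kind of implementation detail that code should not rely on, so only the equality result and the explicit `sys.intern` call are meaningful here. The variable names are illustrative.

```python
import sys

literal = 'spam eggs'
built = ' '.join(['spam', 'eggs'])  # same characters, assembled at run time

same_value = literal == built   # equality compares the characters
same_object = literal is built  # implementation detail; do not rely on it

# sys.intern returns the canonical interned copy of a string, so equal
# strings passed through it come back as one and the same object.
interned_match = sys.intern(literal) is sys.intern(built)

print(same_value, interned_match)
```

`same_value` is `True` regardless of how either string was created, which is why `==` is the right operator for comparing strings.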

> Never depend on str or int interning! Always use == instead of is to compare strings or integers for equality. Interning is an optimization for internal use of the Python interpreter.

The tricks discussed in this section, including the behavior of frozenset.copy(), are harmless “lies” that save memory and make the interpreter faster. Do not worry about them: they should not give you any trouble because they only apply to immutable types. Probably the best use of these bits of trivia is to win bets with fellow Pythonistas.