***
# Mutable object use in Python: a cautionary tale

Met Office *Python Guild* Bug of the Month, July '18

Sadie Bartholomew (**@sadielbartholomew**)
***

## The bare bones...

I've written up a simple function to add 'final item' to the end of a list. What could possibly go wrong?

In [1]:
def mark_end_of_my_list(my_list=[]):
    my_list.append('final item')
    return my_list

print(mark_end_of_my_list(['first item', 'intermediate item']))

['first item', 'intermediate item', 'final item']


It does 'as it says on the tin'. Great.

Let's just check its behaviour with no list input as an argument:

In [2]:
print(mark_end_of_my_list())

['final item']


All fine... or is it?

In [3]:
print(mark_end_of_my_list())
print(mark_end_of_my_list())
print(mark_end_of_my_list())
# you get the idea...

['final item', 'final item']
['final item', 'final item', 'final item']
['final item', 'final item', 'final item', 'final item']


**Wait a minute...** why are there *multiple* elements in the output? The ``append()`` list method was *only called once* to add in a single 'final item'...

**This is not the behaviour many would expect!**

Let's do some investigation. First we can try multiple calls with our original, unremarkable, check using a test list:

In [4]:
test_list = ['first item', 'intermediate item']
print(mark_end_of_my_list(test_list))
print(mark_end_of_my_list(test_list))
print(mark_end_of_my_list(test_list))
# etc...

['first item', 'intermediate item', 'final item']
['first item', 'intermediate item', 'final item', 'final item']
['first item', 'intermediate item', 'final item', 'final item', 'final item']


Nothing to see here; this is all exactly as anticipated, since each call appends to the very ``test_list`` we passed in! The problem, therefore, seems to originate from calling the function without an argument.

Okay, let's try calling the argument-less function again, but this time with a slight adaptation to the function. We'll *make a copy* of the input (or default) list before we do anything internally, i.e. before we append the 'final item'.

In [5]:
def mark_end_of_my_copied_list(my_list=[]):
    copy_of_list = list(my_list)  # copy the list before we do anything to it
    copy_of_list.append('final item')
    return copy_of_list

print(mark_end_of_my_copied_list())
print(mark_end_of_my_copied_list())
print(mark_end_of_my_copied_list())
# etc...

['final item']
['final item']
['final item']


Now *this* is the nice, *intuitive* behaviour I wanted from my simple function. Each function call adds a single 'final item' to either an input list or to the default empty list.

Contrast this with the equivalent processing to add a fixed element to the end of a *string*, or to an empty string by default:

In [6]:
def mark_end_of_my_string(my_string=''):
    my_string = my_string + '.'  # might as well end with a full stop in this case, as is customary!
    return my_string

test_sentence = 'A standard sentence should end with a full stop'
print(mark_end_of_my_string(test_sentence))
print(mark_end_of_my_string(test_sentence))
print(mark_end_of_my_string(test_sentence))

print(mark_end_of_my_string())
print(mark_end_of_my_string())  # what will happen at this point, '.' or '..'?
print(mark_end_of_my_string())  # what will happen at this point, '.' or '...'?
# etc...

A standard sentence should end with a full stop.
A standard sentence should end with a full stop.
A standard sentence should end with a full stop.
.
.
.


In this case, without needing to copy the object first, each call adds just a single element to the end of our empty default object.

**So what is the cause of this (at least to the uninitiated) peculiar behaviour? And why does it apply to lists but *not* to strings?**

***

## Mutable default arguments: a classic Python 'gotcha'

The crux of it all:

<div class="alert alert-block alert-danger">
Default argument values are evaluated when a function (or method, or class) is <b>defined</b>, <i>not</i> when it is called.
<br><br>
Repeat calls to a function (etc.) with one or more <i>mutable</i> objects as default argument(s) will therefore apply internal changes to the <i>same</i> mutable object(s) created upon definition. Changes made with each successive call to these objects <i>persist</i>, as they are being applied to the same object.
<br><br>
However, this does not apply to <i>immutable</i> objects, which are fixed after creation. They can't be changed, so changes cannot persist. In this sense, this makes them safer to use.
</div>
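As a rough illustration of the point above, we can peek at the defaults Python stores on a function object via its ``__defaults__`` attribute. The two functions below are stand-ins for the earlier examples (the names ``append_marker`` and ``append_full_stop`` are made up for this sketch):

```python
def append_marker(items=[]):   # the default list is created once, at definition
    items.append('final item')  # mutates that one stored list in place
    return items


def append_full_stop(text=''):  # the default string is also created once...
    text = text + '.'  # ...but this builds a NEW string and rebinds the local name
    return text


print(append_marker.__defaults__)  # ([],)
append_marker()
append_marker()
print(append_marker.__defaults__)  # (['final item', 'final item'],)

append_full_stop()
append_full_stop()
print(append_full_stop.__defaults__)  # ('',)
```

The stored list default has grown with each call, whereas the stored string default is untouched, because a string cannot be modified in place.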

***

### The standard get-around:

Let's see how to avoid this behaviour (if we don't intend to use it to our advantage). For variety, we'll use a new basic example with a class. Lists are used here again, but this behaviour also applies with other mutables, e.g. dictionaries & sets:

In [7]:
"""Some basic classes for recording shopping list items for a default shop."""

class Ignorant_Shopper:

    def __init__(self, shop_name='Tesco', shop_id='a485', shopping_list=[]):
        self.shop_name = shop_name
        self.shop_id = shop_id

        # With ignorance:
        self.shopping_list = shopping_list

    def add_item_to_list(self, grocery):
        self.shopping_list.append(grocery)


class Wise_Shopper:

    def __init__(self, shop_name='Tesco', shop_id='a485', shopping_list=None):
        self.shop_name = shop_name
        self.shop_id = shop_id

        # With wisdom/awareness:
        if shopping_list is None:  # or 'if not shopping_list:' (also replaces an empty list passed in)
            shopping_list = []
        self.shopping_list = shopping_list

    def add_item_to_list(self, grocery):
        self.shopping_list.append(grocery)

Note the only difference between the ``Ignorant_Shopper`` and ``Wise_Shopper`` classes: the default settings & initialisation of the ``shopping_list`` attribute.
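A small aside on the two possible checks in ``Wise_Shopper``: testing ``is None`` and testing truthiness are not quite interchangeable. Here is a minimal sketch of the difference, using two hypothetical functions (not part of the classes above) that differ only in that check:

```python
def keep_callers_list(shopping_list=None):
    if shopping_list is None:  # only replaces a missing argument
        shopping_list = []
    shopping_list.append('milk')
    return shopping_list


def replace_empty_list(shopping_list=None):
    if not shopping_list:  # also replaces an EMPTY list the caller passed in
        shopping_list = []
    shopping_list.append('milk')
    return shopping_list


household_list = []
keep_callers_list(household_list)
print(household_list)  # ['milk']

household_list = []
replace_empty_list(household_list)
print(household_list)  # []
```

If the caller expects their own (possibly empty) list to be updated, the ``is None`` check is the safer choice.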

In [8]:
shopper = Ignorant_Shopper()
shopper.add_item_to_list('milk')
print(shopper.shopping_list)

another_shopper = Ignorant_Shopper()
print(another_shopper.shopping_list)

['milk']
['milk']


Oh dear: ``another_shopper`` has mistakenly been asked to buy milk! The ``Ignorant_Shopper`` class has our default argument persistence problem.

The shopping lists of both instances point to the same object:

In [9]:
object.__repr__(shopper.shopping_list)

'<list object at 0x7ff758655230>'

In [10]:
object.__repr__(another_shopper.shopping_list)

'<list object at 0x7ff758655230>'

With the standard get-around using a placeholder default argument, typically ``None``, as in ``Wise_Shopper``, we get what we wanted all along:

In [11]:
shopper = Wise_Shopper()
shopper.add_item_to_list('milk')
another_shopper = Wise_Shopper()

print(shopper.shopping_list)
print(another_shopper.shopping_list)

['milk']
[]


In this case, each instance creates its own mutable, here the ``shopping_list`` list:

In [12]:
object.__repr__(shopper.shopping_list)

'<list object at 0x7ff758652780>'

In [13]:
object.__repr__(another_shopper.shopping_list)

'<list object at 0x7ff75aeeeb40>'
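Rather than comparing memory addresses by eye, the same check can be made directly with the ``is`` operator, which tests object identity. The sketch below uses cut-down copies of the two classes (keeping only the ``shopping_list`` argument) so that it runs on its own:

```python
class Ignorant_Shopper:
    def __init__(self, shopping_list=[]):
        self.shopping_list = shopping_list


class Wise_Shopper:
    def __init__(self, shopping_list=None):
        if shopping_list is None:
            shopping_list = []
        self.shopping_list = shopping_list


ignorant_one, ignorant_two = Ignorant_Shopper(), Ignorant_Shopper()
wise_one, wise_two = Wise_Shopper(), Wise_Shopper()

# 'is' compares identity (same object), not equality (same contents)
print(ignorant_one.shopping_list is ignorant_two.shopping_list)  # True
print(wise_one.shopping_list is wise_two.shopping_list)  # False
```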

### Disclaimer:

<div class="alert alert-block alert-info">
<b>This is a Python <i>feature</i>, not a <i>bug</i>.</b> It is not going to be 'fixed'!
<br><br>
Mutable default arguments & their behaviour, as outlined, can be <b>used advantageously</b> if one is aware of the proper behaviour. For example our <code>Ignorant_Shopper</code> class might not be so ignorant after all, & have been designed with <b>caching</b> in mind.</div>
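To give a flavour of deliberate use, here is a minimal sketch of caching with a mutable default. The function and its ``_cache`` argument are invented for illustration; the dictionary is created once at definition and then remembers results across calls:

```python
def fibonacci(n, _cache={0: 0, 1: 1}):
    """Return the n-th Fibonacci number, remembering results between calls."""
    if n not in _cache:
        # Recursive calls omit _cache, so they all share the same default dict
        _cache[n] = fibonacci(n - 1) + fibonacci(n - 2)
    return _cache[n]


print(fibonacci(30))  # 832040
print(len(fibonacci.__defaults__[0]))  # 31 (entries for n = 0 to 30)
```

For real code the standard library's ``functools.lru_cache`` decorator is usually a clearer way to get the same effect, but the sketch shows the persistence being put to work.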

***

## But mutable objects are not only deceptive as default arguments...

Mutables can also cause trouble elsewhere. Here's a basic example:

In [19]:
import random

def shuffle_my_list(my_list):
    # ...
    # some processing code
    # ...
    random.shuffle(my_list)
    # ...
    # some more processing code
    # ...
    return my_list

In [20]:
listing_one_to_five = ["1", "2", "3", "4", "5"]

# ... doing some processing, including:
shuffle_my_list(listing_one_to_five)
# ... doing some more processing, including:
listing_one_to_five.append("once I caught a fish alive")

print("My favourite song goes: '" + " ".join(listing_one_to_five) + "'")

My favourite song goes: '3 4 2 1 5 once I caught a fish alive'


Wait a minute, that's not quite my favourite song...

Let's again investigate by copying the list in question:

In [16]:
new_listing_one_to_five = ["1", "2", "3", "4", "5"]

# ... doing some processing, including:
shuffle_my_list(list(new_listing_one_to_five))  # note we input a copy of the list
# ... doing some more processing, including:
new_listing_one_to_five.append("once I caught a fish alive")

print("My favourite song goes: '" + ", ".join(new_listing_one_to_five) + "'")

My favourite song goes: '1, 2, 3, 4, 5, once I caught a fish alive'


That's better!

Compare the two examples. The first printed the numbers one to five in an unintentionally shuffled order within the song lyric. The only difference in the second is the input to ``shuffle_my_list()``: passing a copy made with ``list()`` prevents the original list from being changed by that function.

**In general, if you want to utilise a mutable object but want to prevent it changing & causing trouble downstream, copy it by some means before using it in a function, method, class etc.**
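One way to apply this advice, sketched below under the assumption that you control the function, is to make the copy *inside* it so that callers cannot forget. The name ``shuffled_copy`` is hypothetical. The second half shows a caveat: ``list()`` makes a *shallow* copy, so nested mutables are still shared unless ``copy.deepcopy()`` is used.

```python
import copy
import random


def shuffled_copy(my_list):
    working_list = list(my_list)  # shallow copy: the caller's list is left alone
    random.shuffle(working_list)
    return working_list


numbers = ["1", "2", "3", "4", "5"]
shuffled = shuffled_copy(numbers)
print(numbers)  # ['1', '2', '3', '4', '5']

# Caveat: a shallow copy still shares any mutable objects nested inside it
nested = [["milk"], ["eggs"]]
shallow = list(nested)
deep = copy.deepcopy(nested)
nested[0].append("bread")
print(shallow[0])  # ['milk', 'bread']
print(deep[0])  # ['milk']
```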

***

## A summary of lessons learned:

* Default argument values are evaluated when a function/method/class is defined, not when called. If mutable, changes made to them persist across calls (and across class instances) when used naively.<br>
* The standard get-around for mutable defaults is to instead default to ``None`` & then use an ``if`` statement checking for ``None`` to update accordingly.<br>
* This is a classic 'gotcha': you just need to be aware of the behaviour! Can you use it to your advantage?

***