# Section 02 - Classes

##  Objects and Classes

A class is a type of object. In Python we create classes using the `class` keyword.

In [1]:
class Person:
    pass

Now this class doesn't do much, but it is an object of type `type` (which is itself an object).

In [2]:
type(Person)

type

In [3]:
type(type)

type

Classes have "built-in" attributes, even though we did not specifically add any to the class ourselves.

For example, they have a name:

In [4]:
Person.__name__

'Person'

They are also callables, and calling a class results in the creation and return of a new **instance** of that class:

In [5]:
p = Person()

Now the type of the object is the class used to build that object:

In [6]:
type(p)

__main__.Person

These instances also have "built_in" properties, which we will cover throughout this course.

For example, they have a `__class__` property that tells us which class was used to create the instance:

In [7]:
p.__class__

__main__.Person

As you can see that returns the class object used to instantiate `p`.

In fact:

In [8]:
type(p) is p.__class__

True

We can also use `isinstance` to test if an object is an instance of a particular class - now this gets a bit more complicated when we use inheritance, but right now we're not, so it's quite straightforward:

In [9]:
isinstance(p, Person)

True

In [10]:
isinstance(p, str)

False

We can even use `isinstance` with our class, since we know it's type is `type`:

In [11]:
isinstance(Person, type)

True

`type` is like the most generic kind of **class** object - we'll come back to this when discussing meta programming.

We really need inheritance to understand how this works, but every class **is** a `type` object (it inherits all the properties of `type`).

For now let's just see what functionality `type` has:

In [12]:
help(type)

Help on class type in module builtins:

class type(object)
 |  type(object_or_name, bases, dict)
 |  type(object) -> the object's type
 |  type(name, bases, dict) -> a new type
 |  
 |  Methods defined here:
 |  
 |  __call__(self, /, *args, **kwargs)
 |      Call self as a function.
 |  
 |  __delattr__(self, name, /)
 |      Implement delattr(self, name).
 |  
 |  __dir__(...)
 |      __dir__() -> list
 |      specialized __dir__ implementation for types
 |  
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |  
 |  __init__(self, /, *args, **kwargs)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  __instancecheck__(...)
 |      __instancecheck__() -> bool
 |      check if an object is an instance
 |  
 |  __new__(*args, **kwargs)
 |      Create and return a new object.  See help(type) for accurate signature.
 |  
 |  __prepare__(...)
 |      __prepare__() -> dict
 |      used to create the namespace for the class statement
 |  
 

As you can see it has a `__call__` method (that's how our class becomes callable), and a bunch of other attributes and methods that we'll see throughout this course.

Our class objects also have these properties, because they inherit from the `type` object.

And in fact, `type` is an instance of itself - that's kind of weird, and not the case for our own classes:

In [13]:
isinstance(type, type)

True

In [14]:
isinstance(Person, Person)

False

##  Class Attributes

As we saw, when we create a class Python automatically builds-in properties and behaviors into our class object, like making it callable, and properties like `__name__`.

In [1]:
class Person:
    pass

In [2]:
Person.__name__

'Person'

`__name__` is a **class attribute**. We can add our own class attributes easily this way:

In [3]:
class Program:
    language = 'Python'
    version = '3.6'

In [4]:
Program.__name__

'Program'

In [5]:
Program.language

'Python'

In [6]:
Program.version

'3.6'

Here we used "dotted notation" to access the class attributes. In fact we can also use dotted notation to set the class attribute:

In [7]:
Program.version = '3.7'

In [8]:
Program.version

'3.7'

But we can also use the functions `getattr` and `setattr` to read and write these attributes:

In [9]:
getattr(Program, 'version')

'3.7'

In [10]:
setattr(Program, 'version', '3.6')

In [11]:
Program.version, getattr(Program, 'version')

('3.6', '3.6')

Python is a dynamic language, and we can create attributes at run-time, outside of the class definition itself:

In [12]:
Program.x = 100

Using dotted notation we added an attribute `x` to the Person class:

In [13]:
Program.x, getattr(Program, 'x')

(100, 100)

We could also just have used a `setattr` function call:

In [14]:
setattr(Program, 'y', 200)

In [15]:
Program.y, getattr(Program, 'y')

(200, 200)

So where is the state stored? Usually in a dictionary that is attached to the **class** object (often referred to as the **namespace** of the class):

In [16]:
Program.__dict__

mappingproxy({'__module__': '__main__',
              'language': 'Python',
              'version': '3.6',
              '__dict__': <attribute '__dict__' of 'Program' objects>,
              '__weakref__': <attribute '__weakref__' of 'Program' objects>,
              '__doc__': None,
              'x': 100,
              'y': 200})

As you can see that dictionary contains our attributes: `language`, `version`, `x`, `y` with their corresponding current values.

Notice also that `Program.__dict__` does not return a dictionary, but a `mappingproxy` object - this is essentially a read-only dictionary that we cannot modify directly (but we can modify it by using `setattr`, or dotted notation).

For example, if we change the value of an attribute:

In [17]:
setattr(Program, 'x', -10)

We'll see this reflected in the underlying dictionary:

In [18]:
Program.__dict__

mappingproxy({'__module__': '__main__',
              'language': 'Python',
              'version': '3.6',
              '__dict__': <attribute '__dict__' of 'Program' objects>,
              '__weakref__': <attribute '__weakref__' of 'Program' objects>,
              '__doc__': None,
              'x': -10,
              'y': 200})

#### Deleting Attributes

So, we can create and mutate class attributes at run-time. Can we delete attributes too?

The answer of course is yes. We can either use the `del` keyword, or the `delattr` function:

In [19]:
del Program.x

In [20]:
Program.__dict__

mappingproxy({'__module__': '__main__',
              'language': 'Python',
              'version': '3.6',
              '__dict__': <attribute '__dict__' of 'Program' objects>,
              '__weakref__': <attribute '__weakref__' of 'Program' objects>,
              '__doc__': None,
              'y': 200})

In [21]:
delattr(Program, 'y')

#### Direct Namespace Access

In [22]:
Program.__dict__

mappingproxy({'__module__': '__main__',
              'language': 'Python',
              'version': '3.6',
              '__dict__': <attribute '__dict__' of 'Program' objects>,
              '__weakref__': <attribute '__weakref__' of 'Program' objects>,
              '__doc__': None})

Although `__dict__` returns a `mappingproxy` object, it still is a hash map and essentially behaves like a read-only dictionary:

In [23]:
Program.__dict__['language']

'Python'

In [24]:
list(Program.__dict__.items())

[('__module__', '__main__'),
 ('language', 'Python'),
 ('version', '3.6'),
 ('__dict__', <attribute '__dict__' of 'Program' objects>),
 ('__weakref__', <attribute '__weakref__' of 'Program' objects>),
 ('__doc__', None)]

One word of caution: not every attribute that a class has lives in that dictionary (we'll come back to this later).

For example, you'll notice that the `__name__` attribute is not there:

In [25]:
Program.__name__

'Program'

In [26]:
__name__ in Program.__dict__

False

##  Callable Class Attributes

Class attributes can be any object type, including callables such as functions:

In [1]:
class Program:
    language = 'Python'
    
    def say_hello():
        print(f'Hello from {Program.language}!')

In [2]:
Program.__dict__

mappingproxy({'__module__': '__main__',
              'language': 'Python',
              'say_hello': <function __main__.Program.say_hello()>,
              '__dict__': <attribute '__dict__' of 'Program' objects>,
              '__weakref__': <attribute '__weakref__' of 'Program' objects>,
              '__doc__': None})

As we can see, the `say_hello` symbol is in the class dictionary.

We can also retrieve it using either `getattr` or dotted notation:

In [3]:
Program.say_hello, getattr(Program, 'say_hello')

(<function __main__.Program.say_hello()>,
 <function __main__.Program.say_hello()>)

And of course we can call it, since it is a callable:

In [4]:
Program.say_hello()

Hello from Python!


In [5]:
getattr(Program, 'say_hello')()

Hello from Python!


We can even access it via the namespace dictionary as well:

In [6]:
Program.__dict__['say_hello']()

Hello from Python!


##  Classes are Callable

As we saw earlier, one of the things Python does for us when we create a class is to make it callable.

Calling a class creates a new instance of the class - an object of that particular type.

In [1]:
class Program:
    language = 'Python'
    
    def say_hello():
        print(f'Hello from {Program.language}!')

In [2]:
p = Program()

In [3]:
type(p)

__main__.Program

In [4]:
isinstance(p, Program)

True

These instances have their own namespace, and their own `__dict__` that is distinct from the class `__dict__`:

In [5]:
p.__dict__

{}

In [6]:
Program.__dict__

mappingproxy({'__module__': '__main__',
              'language': 'Python',
              'say_hello': <function __main__.Program.say_hello()>,
              '__dict__': <attribute '__dict__' of 'Program' objects>,
              '__weakref__': <attribute '__weakref__' of 'Program' objects>,
              '__doc__': None})

Instances also have attributes that may not be visible in their `__dict__` (they are being stored elsewhere, as we'll examine later):

In [7]:
p.__class__

__main__.Program

Although we can use `__class__` we can also use `type`:

In [8]:
type(p) is p.__class__

True

Generally we use `type` instead of using `__class__` just like we usually use `len()` instead of accessing `__len__`.

Why? Well, one reason is that people can mess around with the `__class__` attribute:

In [9]:
class MyClass:
    pass

In [10]:
m = MyClass()

In [11]:
type(m), m.__class__

(__main__.MyClass, __main__.MyClass)

But look at what happens here:

In [12]:
class MyClass:
    __class__ = str

In [13]:
m = MyClass()

In [14]:
type(m), m.__class__

(__main__.MyClass, str)

So as you can see, `type` wasn't fooled!

##  Data Attributes

Let's focus on data attributes first (non-callables).

As we saw before we can have class attributes - they live in the class dictionary:

In [1]:
class BankAccount:
    apr = 1.2

In [2]:
BankAccount.__dict__

mappingproxy({'__module__': '__main__',
              'apr': 1.2,
              '__dict__': <attribute '__dict__' of 'BankAccount' objects>,
              '__weakref__': <attribute '__weakref__' of 'BankAccount' objects>,
              '__doc__': None})

In [3]:
BankAccount.apr

1.2

Now when we create instances of that class:

In [4]:
acc_1 = BankAccount()
acc_2 = BankAccount()

The instance dictionaries are currently empty:

In [5]:
acc_1.__dict__, acc_2.__dict__

({}, {})

Yet, these instances do have an `apr` attribute:

In [6]:
acc_1.apr, acc_2.apr

(1.2, 1.2)

Where is that value coming from? The class the objects were created from!

In fact, if we modify the class attribute:

In [7]:
BankAccount.apr = 2.5

We'll see this reflected in the instances as well:

In [8]:
acc_1.apr, acc_2.apr

(2.5, 2.5)

And if we a a class attribute to `BankAccount`:

In [9]:
BankAccount.account_type = 'Savings'

In [10]:
acc_1.account_type, acc_2.account_type

('Savings', 'Savings')

As you can see modifying attributes in the **class** are reflected in the instances too - that's because Python does not find an `apr` attribute in the instance dic tionary, so next it looks in the class that was used to create the instance.

Which raises the question, what happens if we add `apr` to the **instance** dictionary?

In [11]:
acc_1.apr = 0

Well that did not raise an exception - so what's happening now:

In [12]:
acc_1.__dict__, acc_2.__dict__

({'apr': 0}, {})

As you can see, we actually create an entry for `apr` in the state dictionary of `acc_1`.

Now that we have it there, it we try to get the attribute value `apr` for `acc_1`, Python will find it in the instance dictionary, so it will use that instead!

In [13]:
acc_1.apr, acc_2.apr

(0, 2.5)

In effect, the instance attribute `apr` is **hiding** the class attribute.

You'll notice also that `acc_2` was **not** affected - this is because we did not modify `acc_2`'s dictionary, just the dictionary for `acc_1`.

And the `getattr` and `setattr` functions work the same way as dotted notation:

In [14]:
acc_1 = BankAccount()
print(acc_1.__dict__)
print(acc_1.apr)
print(getattr(acc_1, 'apr'))

{}
2.5
2.5


In [15]:
setattr(acc_1, 'apr', 0)
print(acc_1.__dict__)
print(acc_1.apr)
print(getattr(acc_1, 'apr'))

{'apr': 0}
0
0


We can even add instance attributes directly to an instance:

In [16]:
acc_1.bank = 'Acme Savings & Loans'

In [17]:
acc_1.__dict__

{'apr': 0, 'bank': 'Acme Savings & Loans'}

But this is specific to the instance, and only that specific instance:

In [18]:
acc_2 = BankAccount()

In [19]:
acc_2.__dict__

{}

As you can see `acc_2` has an empty instance dictionary.

So it is really important to distingush between **class attributes** and **instance attributes**.

**Class attributes** are like attributes that are "common" to all instances - because the attribute does not live in the instance, but in the class itself.

On the other hand, **instance attributes** are specific to each instance, and values for the same attribute can be different across multiple instances, as we just saw with `acc_1.apr` and `acc_2.apr`.

So, in summary, classes and instances each have their own state - usually maintained in a dictionary, available through `__dict__`. Irrespective of where the state is stored, when we look up an attribute on an instance, Python will first look for the attribute in the instance's local state. If it does not find it there, it will next look for it in the class of the instance.

One other thing to note is the difference in type between class and instance `__dict__`.

Classes as we saw, return a `mapping proxy` object:

In [20]:
BankAccount.__dict__

mappingproxy({'__module__': '__main__',
              'apr': 2.5,
              '__dict__': <attribute '__dict__' of 'BankAccount' objects>,
              '__weakref__': <attribute '__weakref__' of 'BankAccount' objects>,
              '__doc__': None,
              'account_type': 'Savings'})

But instances, return a real dictionary:

In [21]:
acc_1.__dict__

{'apr': 0, 'bank': 'Acme Savings & Loans'}

So with instances, unlike with classes, we can manipulate that dictionary directly:

In [22]:
class Program:
    language = 'Python'

In [23]:
p = Program()

In [24]:
p.__dict__

{}

In [25]:
p.__dict__['version'] = '3.7'

In [26]:
p.__dict__

{'version': '3.7'}

In [27]:
p.version, getattr(p, 'version')

('3.7', '3.7')

But once again, this only affects that specific **instance**.

##  Function Attributes

So far, we have been dealing with non-callable attributes. When attributes are actually functions, things behave differently.

In [1]:
class Person:
    def say_hello():
        print('Hello!')

In [2]:
Person.say_hello

<function __main__.Person.say_hello()>

In [3]:
type(Person.say_hello)

function

As we can see it is just a plain function, and be called as usual:

In [4]:
Person.say_hello()

Hello!


Now let's create an instance of that class:

In [5]:
p = Person()

In [6]:
hex(id(p))

'0x7f88a06937b8'

We know we can access class attributes via the instance, so we should also be able to access the function attribute in the same way:

In [7]:
p.say_hello

<bound method Person.say_hello of <__main__.Person object at 0x7f88a06937b8>>

In [8]:
type(p.say_hello)

method

Hmm, the type has changed from `function` to `method`, and the function representation states that it is a **bound method** of the **specific object** `p` we created (notice the memory address).

And if we try to call the function from the instance, here's what happens:

In [9]:
try:
    p.say_hello()
except Exception as ex:
    print(type(ex).__name__, ex)

TypeError say_hello() takes 0 positional arguments but 1 was given


`method` is an actual type in Python, and, like functions, they are callables, but they have one distinguishing feature. They need to be bound to an object, and that object reference is passed to the underlying function.

Often when we define functions in a class and call them from the instance we need to know which **specific** instance was used to call the function. This allows us to interact with the instance variables.

To do this, Python will automatically transform an ordinary function defined in a class into a method when it is called from an instance of the class.

Further, it will "bind" the method to the instance - meaning that the instance will be passed as the **first** argument to the function being called.

It does this using **descriptors** which we'll come back to in detail later.

For now let's just explore this a bit more:

In [10]:
class Person:
    def say_hello(*args):
        print('say_hello args:', args)

In [11]:
Person.say_hello()

say_hello args: ()


As we can see, calling `say_hello` from the **class**, just calls the function (it is just a function).

But when we call it from an instance:

In [12]:
p = Person()
hex(id(p))

'0x7f88d0428748'

In [13]:
p.say_hello()

say_hello args: (<__main__.Person object at 0x7f88d0428748>,)


You can see that the object `p` was passed as an argument to the class function `say_hello`.

The obvious advantage is that we can now interact with instance attributes easily:

In [14]:
class Person:
    def set_name(instance_obj, new_name):
        instance_obj.name = new_name  # or setattr(instance_obj, 'name', new_name)
        

In [15]:
p = Person()

In [16]:
p.set_name('Alex')

In [17]:

p.__dict__

{'name': 'Alex'}

This has essentially the same effect as doing this:

In [18]:
Person.set_name(p, 'John')

In [19]:
p.__dict__

{'name': 'John'}

By convention, the first argument is usually named `self`, but asd you just saw we can name it whatever we want - it just will be in the instance when the method variant of the function is called - and it is called an **instance method**.

But **methods** are objects created by Python when calling class functions from an instance.

They have their own unique attributes too:

In [20]:
class Person:
    def say_hello(self):
        print(f'{self} says hello')

In [21]:
p = Person()

In [22]:
p.say_hello

<bound method Person.say_hello of <__main__.Person object at 0x7f88d0428c18>>

In [23]:
m_hello = p.say_hello

In [24]:
type(m_hello)

method

For example it has a `__func__` attribute:

In [25]:
m_hello.__func__

<function __main__.Person.say_hello(self)>

which happens to be the class function used to create the method (the underlying function).

But remember that a method is bound to an instance. In this case we got the method from the `p` object:

In [26]:
hex(id(p))

'0x7f88d0428c18'

In [27]:
m_hello.__self__

<__main__.Person at 0x7f88d0428c18>

As you can see, the method also has a reference to the object it is **bound** to.

So think of methods as functions that have been bound to a specific object, and that object is passed in as the first argument of the function call. The remaining arguments are then passed after that.

Instance methods are created automatically for us, when we define functions inside our class definitions.

This even holds true if we monkey-patch our classes at run-time:

In [28]:
class Person:
    def say_hello(self):
        print(f'instance method called from {self}')

In [29]:
p = Person()
hex(id(p))

'0x7f88d0435f28'

In [30]:
p.say_hello()

instance method called from <__main__.Person object at 0x7f88d0435f28>


In [31]:
Person.do_work = lambda self: f"do_work called from {self}"

In [32]:
Person.__dict__

mappingproxy({'__module__': '__main__',
              'say_hello': <function __main__.Person.say_hello(self)>,
              '__dict__': <attribute '__dict__' of 'Person' objects>,
              '__weakref__': <attribute '__weakref__' of 'Person' objects>,
              '__doc__': None,
              'do_work': <function __main__.<lambda>(self)>})

OK, so both functions are in the class `__dict__`.

let's create an instance and see what happens:

In [33]:
p.say_hello

<bound method Person.say_hello of <__main__.Person object at 0x7f88d0435f28>>

In [34]:
p.do_work

<bound method <lambda> of <__main__.Person object at 0x7f88d0435f28>>

In [35]:
p.do_work()

'do_work called from <__main__.Person object at 0x7f88d0435f28>'

But be careful, if we add a function to the **instance** directly, this does not work the same - we have create a function in the instance, so it is not considered a method (since it was not defined in the class):

In [36]:
p.other_func = lambda *args: print(f'other_func called with {args}')

In [37]:
p.other_func

<function __main__.<lambda>(*args)>

In [38]:
'other_func' in Person.__dict__

False

In [39]:
p.other_func()

other_func called with ()


As you can see, `other_func` is, and behaves, like an ordinary function.

Long story short, functions defined in a class are transformed into methods when called from instances of the class. So of course, we have to account for that extra argument that is passed to the method.

##  Initializing Class Instances

When we create a new instance of a class two separate things are happening:
1. The object instance is **created**
2. The object instance is then further **initialized**

We can "intercept" both the creating and initialization phases, by using special methods `__new__` and `__init__`.

We'll come back to `__new__` later. For now we'll focus on `__init__`.

What's important to remember, is that `__init__` is an **instance method**. By the time `__init__` is called, the new object has **already** been created, and our `__init__` function defined in the class is now treated like a **method** bound to the instance.

In [1]:
class Person:
    def __init__(self):
        print(f'Initializing a new Person object: {self}')

In [2]:
p = Person()

Initializing a new Person object: <__main__.Person object at 0x7f80a022b0f0>


And we can see that `p` has the same memory address:

In [3]:
hex(id(p))

'0x7f80a022b0f0'

Because `__init__` is an instance method, we have access to the object (instance) state within the method, so we can use it to manipulate the object state:

In [4]:
class Person:
    def __init__(self, name):
        self.name = name

In [5]:
p = Person('Eric')

In [6]:
p.__dict__

{'name': 'Eric'}

What actually happens is that after the new instance has been created, Python sees and automatically calls `<instance>.__init__(self, *args, **kwargs)`

So this is no different that if we had done it this way:

In [7]:
class Person:
    def initialize(self, name):
        self.name = name

In [8]:
p = Person()

In [9]:
p.__dict__

{}

In [10]:
p.initialize('Eric')

In [11]:
p.__dict__

{'name': 'Eric'}

But by using the `__init__` method both these things are done automatically for us.

Just remember that by the time `__init__` is called, the instance has **already** been created, and `__init__` is an instance method.

##  Creating Attributes at Run-Time

We already saw that we can add attributes to instances at run-time, and that it affects just that single instance:

In [1]:
class Person:
    pass

In [2]:
p1 = Person()
p2 = Person()

p1.name = 'Alex'

In [3]:
p1.__dict__

{'name': 'Alex'}

In [4]:
p2.__dict__

{}

So what happens if we add a function as an attribute to our instances directly (we can even do the same within an `__init__` method, works the same way)?

Remember that if we add a function to the class itself, calling the function from the instance will result in a method. 

Here, the result is different, since we are adding the function directly to the instance, not the class:

In [5]:
p1.say_hello = lambda: 'Hello!'

In [6]:
p1.__dict__

{'name': 'Alex', 'say_hello': <function __main__.<lambda>()>}

In [7]:
p1.say_hello

<function __main__.<lambda>()>

As you can see, that attribute is a **plain** function - it is **not** being interpreted as a **method**.

In [8]:
p1.say_hello()

'Hello!'

Of course, the other instances do not know anything about that function:

In [9]:
p2.__dict__

{}

So, the question becomes, **can** we create a **method** on a specific instance?

The answer (of course!) is yes, but we have to explicitly tell Python we are setting up a method bound to that specific instance.

We do this by creating a `method` type object:

In [10]:
from types import MethodType

In [11]:
class Person:
    def __init__(self, name):
        self.name = name

In [12]:
p1 = Person('Eric')
p2 = Person('Alex')

Now let's create a `method` object, and bind it to `p1`. First we create a function that will handle the bound object as it's first argument, and use the instance `name` property.

In [13]:
def say_hello(self):
    return f'{self.name} says hello!'

Now we can use that function just by itself, passing in any object that has a `name` attribute:

In [14]:
say_hello(p1), say_hello(p2)

('Eric says hello!', 'Alex says hello!')

Now however, we are going to create a method bound to `p1` specifically:

In [15]:
p1_say_hello = MethodType(say_hello, p1)

In [16]:
p1_say_hello

<bound method say_hello of <__main__.Person object at 0x7f9750295630>>

As you can see that method is bound to the instance `p1`. But how do we call it?

If we try to use dotted notation or a `getattr`, that won't work because the `p1` object does not know anything about that method:

In [17]:
try:
    p1.p1_say_hello()
except AttributeError as ex:
    print(ex)

'Person' object has no attribute 'p1_say_hello'


All we need to do is add that method to the instance dictionary - giving it whatever name we want:

In [18]:
p1.say_hello = p1_say_hello

In [19]:
p1.__dict__

{'name': 'Eric',
 'say_hello': <bound method say_hello of <__main__.Person object at 0x7f9750295630>>}

OK, so now out instance knows about that method that we stored under the name `say_hello`:

In [20]:
p1.say_hello()

'Eric says hello!'

or, we can use the `getattr` function:

In [21]:
getattr(p1, 'say_hello')()

'Eric says hello!'

And of course, othe instances know nothing about this:

In [22]:
p2.__dict__

{'name': 'Alex'}

So, to create a bound method after the object has initially been created, we just create a bound method and add it to the instance itself.

We can do it this way (what we just saw):

In [23]:
p1 = Person('Alex')
p1.__dict__

{'name': 'Alex'}

In [24]:
p1.say_hello = MethodType(lambda self: f'{self.name} says hello', p1)

In [25]:
p1.say_hello()

'Alex says hello'

But we can also do this from any instance method too.

#### Example

Suppose we want some class to have some functionality that is called the same way but will differ from instance to instance. Although we could use inheritance, here I want some kind of 'plug-in' approach and we can do this without inheritance, mixins, or anything like that!

In [26]:
from types import MethodType

class Person:
    def __init__(self, name):
        self.name = name
        
    def register_do_work(self, func):
        setattr(self, '_do_work', MethodType(func, self))
        
    def do_work(self):
        do_work_method = getattr(self, '_do_work', None)
        # if attribute exists we'll get it back, otherwise it will be None
        if do_work_method:
            return do_work_method()
        else:
            raise AttributeError('You must first register a do_work method')

In [27]:
math_teacher = Person('Eric')
english_teacher = Person('John')

Right now neither the math nor the english teacher can do any woirk because we haven't "registered" a worker yet:

In [28]:
try:
    math_teacher.do_work()
except AttributeError as ex:
    print(ex)

You must first register a do_work method


Ok, so let's do that:

In [29]:
def work_math(self):
     return f'{self.name} will teach differentials today.'

In [30]:
math_teacher.register_do_work(work_math)

In [31]:
math_teacher.__dict__

{'name': 'Eric',
 '_do_work': <bound method work_math of <__main__.Person object at 0x7f97584cdac8>>}

In [32]:
math_teacher.do_work()

'Eric will teach differentials today.'

And we can create a different `do_work` method for the English teacher:

In [33]:
def work_english(self):
    return f'{self.name} will analyze Hamlet today.'

In [34]:
english_teacher.register_do_work(work_english)

In [35]:
english_teacher.do_work()

'John will analyze Hamlet today.'

##  Properties

To be clear, here we are examining **instance** properties. That is, we define the property in the class we are defining, but the property itself is going to be **instance** specific, i.e. different instances will support different values for the property. Just like instance attributes. The main difference is that we will use accessor method to get, set (and optionally) delete the associated instance value.

As I mentioned in the lecture, because properties use the same dotted notation (and the same `getattr`, `setattr` and `delattr` functions), we do not need to **start** with properties. Often a bare attribute works just fine, and if, later, we decide we need to manage getting/setting/deleting the attribute value, we can switch over to properties without breaking our class interface. This is unlike languages like Java - and hence why in those languages it is recommended to **always** use getter and setter functions. *Not so* in Python!

A **property** in Python is essentially a class instance - we'll come back to what that class looks like when we study descriptors. For now, we are going to use the `property` function in Python which is a convenience callable essentially.

Let's start with a simple example and a bare attribute:

In [1]:
class Person:
    def __init__(self, name):
        self.name = name

So this class has a single instance **attribute**, `name`.

In [2]:
p = Person('Alex')

And we can access and modify that attribute using either dotted notation or the `getattr` and `setattr` methods:

In [3]:
p.name

'Alex'

In [4]:
getattr(p, 'name')

'Alex'

p.name = 'John'

In [5]:
p.name

'Alex'

In [6]:
setattr(p, 'name', 'Eric')

In [7]:
p.name

'Eric'

Now suppose we wan't to disallow setting an empty string or `None` for the name. Also, we'll require the name to be a string.

To do that we are going to create an instance method that will handle the logic and setting of the value. We also create an instance method to retrieve the attribute value.

We'll use `_name` as the instance variable to store the name.

In [8]:
class Person:
    def __init__(self, name):
        self.set_name(name)
        
    def get_name(self):
        return self._name
    
    def set_name(self, value):
        if isinstance(value, str) and len(value.strip()) > 0:
            # this is valid
            self._name = value.strip()
        else:
            raise ValueError('name must be a non-empty string')

In [9]:
p = Person('Alex')

In [10]:
try:
    p.set_name(100)
except ValueError as ex:
    print(ex)

name must be a non-empty string


In [11]:
p.set_name('Eric')

In [12]:
p.get_name()

'Eric'

Of course, our users can still manipulate the atribute directly if they want by using the "private" attribute `_name`. You can't stop anyone from doing this in Python - they should know better than to do that, but we're all good programmers, and know what and what not to do!

The way we set up our initializer, the validation will work too:

In [13]:
try:
    p = Person('')
except ValueError as ex:
    print(ex)

name must be a non-empty string


So this works, but it's a bit of pain to use the method names. So let's turn this into a property instead. We start with the class we just had and tweak it a bit:

In [14]:
class Person:
    def __init__(self, name):
        self.name = name  # note how we are actually setting the value for name using the property!
        
    def get_name(self):
        return self._name
    
    def set_name(self, value):
        if isinstance(value, str) and len(value.strip()) > 0:
            # this is valid
            self._name = value.strip()
        else:
            raise ValueError('name must be a non-empty string')
            
    name = property(fget=get_name, fset=set_name)

In [15]:
p = Person('Alex')

In [16]:
p.name

'Alex'

In [17]:
p.name = 'Eric'

In [18]:
try:
    p.name = None
except ValueError as ex:
    print(ex)

name must be a non-empty string


So now we have the benefit of using accessor methods, without having to call the methods explicitly.

In fact, even `getattr` and `setattr` will work the same way:

In [19]:
setattr(p, 'name', 'John')  # or p.name = 'John'

In [20]:
getattr(p, 'name')  # or simply p.name

'John'

Now let's examine the instance dictionary:

In [21]:
p.__dict__

{'_name': 'John'}

You'll see we can find the underlying "private" attribute we are using to store the name. But the property itself (`name`) is not in the dictionary.

The property was defined in the class:

In [22]:
Person.__dict__

mappingproxy({'__module__': '__main__',
              '__init__': <function __main__.Person.__init__(self, name)>,
              'get_name': <function __main__.Person.get_name(self)>,
              'set_name': <function __main__.Person.set_name(self, value)>,
              'name': <property at 0x7fbad886e138>,
              '__dict__': <attribute '__dict__' of 'Person' objects>,
              '__weakref__': <attribute '__weakref__' of 'Person' objects>,
              '__doc__': None})

And you can see it's type is `property`.

So when we type `p.name` or `p.name = value`, Python recognizes that `'name` is a `property` and therefore uses the accessor methods. (How it does we'll see later when we study descriptors).

What's interesting is that even if we muck around with the instance dictionary, Python does not get confused - (and as usual in Python, just because you **can** do something does not mean you **should**!)

In [23]:
p = Person('Alex')

In [24]:
p.name

'Alex'

In [25]:
p.__dict__

{'_name': 'Alex'}

In [26]:
p.__dict__['name'] = 'John'

In [27]:
p.__dict__

{'_name': 'Alex', 'name': 'John'}

As you can see, we now have `name` in our instance dictionary.

Let's retrieve the `name` via dotted notation:

In [28]:
p.name

'Alex'

That's obviously still using the getter method.

And setting the name:

In [29]:
p.name = 'Raymond'

In [30]:
p.__dict__

{'_name': 'Raymond', 'name': 'John'}

As you can see, it used the setter method.

And the same thing happens with `setattr` and `getattr`:

In [31]:
getattr(p, 'name')

'Raymond'

In [32]:
setattr(p, 'name', 'Python')

In [33]:
p.__dict__

{'_name': 'Python', 'name': 'John'}

As you can see the `setattr` method used the property setter.

For completeness, let's see how the deleter method works:

In [34]:
class Person:
    def __init__(self, name):
        self._name = name
        
    def get_name(self):
        print('getting name')
        return self._name
    
    def set_name(self, value):
        print('setting name')
        self._name = value
        
    def del_name(self):
        print('deleting name')
        del self._name  # or whatever "cleanup" we want to do
        
    name = property(get_name, set_name, del_name)

In [35]:
p = Person('Alex')

In [36]:
p.__dict__

{'_name': 'Alex'}

In [37]:
p.name

getting name


'Alex'

In [38]:
p.name = 'Eric'

setting name


In [39]:
p.__dict__

{'_name': 'Eric'}

In [40]:
del p.name

deleting name


In [41]:
p.__dict__

{}

Now, the property still exists (since it is defined in the class) - all we did was remove the underlying value for the property (the `_name` instance attribute):

In [42]:
try:
    p.name
except AttributeError as ex:
    print(ex)

getting name
'Person' object has no attribute '_name'


As you can see the issue is that the getter function is trying to find `_name` in the attribute, which no longer exists. So the getter and setter still exist (i.e. the property still exists), so we can still assign to it:

In [43]:
p.name = 'Alex'

setting name


In [44]:
p.name

getting name


'Alex'

The last param in `property` is just a docstring. So we could do this:

In [45]:
class Person:
    """This is a Person object"""
    def __init__(self, name):
        self._name = name
        
    def get_name(self):
        return self._name
    
    def set_name(self, value):
        self._name = value
        
    name = property(get_name, set_name, doc="The person's name.")

In [46]:
p = Person('Alex')

In [47]:
help(Person.name)

Help on property:

    The person's name.



In [48]:
help(Person)

Help on class Person in module __main__:

class Person(builtins.object)
 |  This is a Person object
 |  
 |  Methods defined here:
 |  
 |  __init__(self, name)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  get_name(self)
 |  
 |  set_name(self, value)
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak references to the object (if defined)
 |  
 |  name
 |      The person's name.



##  Property Decorators

As I explain in the lecture video, the `property` callable actually returns itself:

In [1]:
p = property(fget=lambda self: print('getting property'))

In [2]:
p

<property at 0x7f8348671778>

As you can see `p` is a property, and in fact is the same property that was created.

Think back to how decorators work:

In [3]:
def my_decorator(fn):
    print('decorating function')
    def inner(*args, **kwargs):
        print('running decorated function')
        return fn(*args, **kwargs)
    return inner

In [4]:
def undecorated_function(a, b):
    print('running original function')
    return a + b

Now we can decorate our undecorated function this way:

In [5]:
decorated_func = my_decorator(undecorated_function)

decorating function


And we can call the decorated function:

In [6]:
decorated_func(10, 20)

running decorated function
running original function


30

Now instead of giving the decorate function a new symbol, we could have just re-used the same symbol:

In [7]:
def my_func(a, b):
    print('running original function')
    return a + b

my_func = my_decorator(my_func)

decorating function


In [8]:
my_func(10, 20)

running decorated function
running original function


30

And of course this is exactly what the decorator `@` syntax does:

In [9]:
@my_decorator
def my_func(a, b):
    print('running original function')
    return a + b

decorating function


In [10]:
my_func(10, 20)

running decorated function
running original function


30

Ok, now that we've refreshed our memory on decorators, we should be ready to look at the `property` callable.

The `property` callable creates a property object, **and returns it**.

In other words, we could create our property this way, as usual:

In [11]:
class Person:
    def __init__(self, name):
        self._name = name
        
    def name(self):
        return self._name
    
    name = property(name)

In [12]:
p = Person('Alex')

p.name

'Alex'

But you'll notice that line: `name = property(name)` - that's exactly what the decorator syntax does for us!

So instead we can write:

In [13]:
class Person:
    def __init__(self, name):
        self._name = name
        
    @property
    def name(self):
        return self._name

In [14]:
p = Person('Guido')
p.name

'Guido'

If you refresh your memory on the single dispatch generic function decorator, you'll remember that the decorated function included another property, the `register` property that was itself a decorator.

Well, the `property` object has some properties, like `setter` that will basically accept a reference to the setter method, and return itself also.

In [15]:
p = property(lambda self: 'getter')

In [16]:
dir(p)

['__class__',
 '__delattr__',
 '__delete__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__get__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__isabstractmethod__',
 '__le__',
 '__lt__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__set__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'deleter',
 'fdel',
 'fget',
 'fset',
 'getter',
 'setter']

So, we can "register" and setter method, using the `setter` callable, and get our property back as well:

In [17]:
p

<property at 0x7f83581afe58>

In [18]:
p2 = p.setter(lambda self: 'setter')

In [19]:
id(p), id(p2)

(140202095607384, 140202095618152)

Now you'll notice that the property id has changed. The setter callable actually creates a new property (with both the original getter, and the new setter assigned).

But that does not really matter, we just have a new property object that we can use to assign to a symbol - and that property will have both the getter and the setter defined.

Let's do this manually (without the decorator syntax first):

In [20]:
class Person:
    def __init__(self, name):
        self._name = name
        
    def name(self):
        return self._name
    
    name = property(name)
    
    # creating another symbol that holds on to 
    # the name property
    name_prop = name 
    
    # because herte I'm redefining name, so we lose 
    # our original reference to the property object
    def name(self, value):
        self._name = value
        
    name = name_prop.setter(name)
    
    # finally delete name_prop which we no longer need
    del name_prop

In [21]:
Person.__dict__

mappingproxy({'__module__': '__main__',
              '__init__': <function __main__.Person.__init__(self, name)>,
              'name': <property at 0x7f83581b2bd8>,
              '__dict__': <attribute '__dict__' of 'Person' objects>,
              '__weakref__': <attribute '__weakref__' of 'Person' objects>,
              '__doc__': None})

And we now have a `name` property that we created in two steps: first create the property with just a getter.

Then we replaced our property with a new property that had both the getter and the setter.

In [22]:
p = Person('Alex')
p.name

'Alex'

In [23]:
p.name = 'Raymond'
p.name

'Raymond'

Hopefully you can now see where the original property (with just the getter), had a callable `setter` that "added" the setter to the property (by creating a new property with both getter and setter), that also returned the (new) property object.

So, we can simplify our code this way:

In [24]:
class Person:
    def __init__(self, name):
        self._name = name
        
    @property
    def name(self):
        return self._name
    
    # what's the property name now? --> name
    # so name has a setter callable
    @name.setter
    def name(self, value):
        self._name = value

Note that if we had not named our setter function `name` the property name would have changed!

Remember that:
```
@dec
def my_func():
    pass
 ```
 returns a decorated function with the same name as the original function

In [25]:
Person.__dict__

mappingproxy({'__module__': '__main__',
              '__init__': <function __main__.Person.__init__(self, name)>,
              'name': <property at 0x7f83581c46d8>,
              '__dict__': <attribute '__dict__' of 'Person' objects>,
              '__weakref__': <attribute '__weakref__' of 'Person' objects>,
              '__doc__': None})

In [26]:
p = Person('Alex')

In [27]:
p.name

'Alex'

In [28]:
p.name = 'Guido'
p.name

'Guido'

Just to show you, if we had not used the same name for the setter function:

In [29]:
class Person:
    def __init__(self, name):
        self._name = name
        
    @property
    def name(self):
        return self._name
    
    # property is now called name
    
    @name.setter
    def full_name(self, value):
        self._name = value

In [30]:
Person.__dict__

mappingproxy({'__module__': '__main__',
              '__init__': <function __main__.Person.__init__(self, name)>,
              'name': <property at 0x7f83581c4db8>,
              'full_name': <property at 0x7f83581c4f48>,
              '__dict__': <attribute '__dict__' of 'Person' objects>,
              '__weakref__': <attribute '__weakref__' of 'Person' objects>,
              '__doc__': None})

As you can see we now have two properties on the class! The first one `name` will only work as a getter. And the second one `full_name` will work as both a getter and a setter:

In [31]:
p = Person('Alex')

In [32]:
p.name

'Alex'

In [33]:
p.full_name

'Alex'

In [34]:
p.full_name = 'Raymond'

In [35]:
p.full_name

'Raymond'

But this won't work:

In [36]:
try:
    p.name = 'Guido'
except AttributeError as ex:
    print(ex)

can't set attribute


Technically, the property callable has both a getter and setter method - so we can create the setter first, then "add in" the getter. But since the first argument to `property` is the getter, we have to work a bit more to do it:

In [37]:
class Person:
    def __init__(self, name):
        self._name = name
        
    name = property()  # an "empty" prroperty - no getter or setter
    
    @name.setter
    def name(self, value):
        self._name = value

By the way, we now have a property that can be set, but not read back!

In [38]:
p = Person('Alex')

In [39]:
p.__dict__

{'_name': 'Alex'}

In [40]:
p.name = 'Raymond'

In [41]:
p.__dict__

{'_name': 'Raymond'}

In [42]:
try:
    p.name
except AttributeError as ex:
    print(ex)

unreadable attribute


So, if you ever need an attribute that is "write-only" - you can do it. Maybe the data is sensitive, and you want to set it, but not show back to users... But the data is never truly private, so at best you're obfuscating the data - so in my experience I've never had to do something like that. Just wanted you to see this in case the need ever came up.

But let's finish this up and make the property read/write:

In [43]:
class Person:
    def __init__(self, name):
        self._name = name
        
    name = property()  # an "empty" prroperty - no getter or setter
    
    @name.setter
    def name(self, value):
        self._name = value
        
    @name.getter
    def name(self):
        return self._name

In [44]:
p = Person('Alex')

In [45]:
p.name

'Alex'

In [46]:
p.name = 'Raymond'

In [47]:
p.name

'Raymond'

The deleter works the same way, and we'll come back to it soon.

Lastly you'll recall that we could set up a docstring when using the `property` callable.

The standard technique is to simply define the docstring in the getter function:

In [48]:
class Person:
    def __init__(self, name):
        self._name = name
        
    @property
    def name(self):
        """The Person's name."""
        return self._name
    
    @name.setter
    def name(self, value):
        self._name = value

In [49]:
help(Person.name)

Help on property:

    The Person's name.



In [50]:
help(Person)

Help on class Person in module __main__:

class Person(builtins.object)
 |  Methods defined here:
 |  
 |  __init__(self, name)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak references to the object (if defined)
 |  
 |  name
 |      The Person's name.



What happens if we set it in the setter instead?

In [51]:
class Person:
    def __init__(self, name):
        self._name = name
        
    @property
    def name(self):
        return self._name
    
    @name.setter
    def name(self, value):
        """The Person's name."""
        self._name = value

In [52]:
help(Person.name)

Help on property:




In [53]:
help(Person)

Help on class Person in module __main__:

class Person(builtins.object)
 |  Methods defined here:
 |  
 |  __init__(self, name)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak references to the object (if defined)
 |  
 |  name



As you can see, the property docstring is only set on the getter. So how to set a docstring with a write-only property? We can do that when we create the initial property:

In [54]:
class Person:
    def __init__(self, name):
        self._name = name
        
    name = property(doc='Write-only name property.')
    
    @name.setter
    def name(self, value):
        self._name = value

In [55]:
help(Person.name)

Help on property:

    Write-only name property.



##  Read-Only and Computed Properties

Although write-only properties are not that common, read-only properties (i.e. that define a getter but not a setter) are quite common for a number of things.

Of course, we can create read-only properties, but since nothing is private, at best we are "suggesting" to the users of our class they should treat the property as read-only. There's always a way to hack around that of course.

But still, it's good to be able to at least explicitly indicate to a user that a property is meant to be read-only.

The use case I'm going to focus on in this video, is one of computed properties. Those are properties that may not actually have a backing variable, but are instead calculated on the fly.

Consider this simple example of a `Circle` class where we can read/write the radius of the circle, but want a computed property for the area. We don't need to store the area value, we can alway calculate it given the current radius value.

In [1]:
from math import pi

class Circle:
    def __init__(self, radius):
        self.radius = radius
        
    @property
    def area(self):
        print('calculating area...')
        return pi * (self.radius ** 2)

In [2]:
c = Circle(1)
c.area

calculating area...


3.141592653589793

We could certainly just use a class method `area()`, but the area is more a property of the circle, so it makes more sense to just retrive it as a property, without the extra `()` to make the call.

The advantage of how we did this is that shoudl the radius of the circle ever change, the area property will immediately reflect that.

In [3]:
c.radius = 2
c.area

calculating area...


12.566370614359172

On the other hand, it's also a weakness - every time we need the area of the circle, it gets recalculated, even if the radius has not changed!

In [4]:
c.area
c.area

calculating area...
calculating area...


12.566370614359172

So now we can use properties to fix this problem without breaking our interface!

We are going to cache the area value, and only-recalculate it if the radius has changed.

In order for us to know if the radius has changed, we are going to make it into a property, and the setter will keep track of whether the radius is set, in which case it will invalidate the cached area value.

In [5]:
class Circle:
    def __init__(self, radius):
        self.radius = radius
        self._area = None
        
    @property
    def radius(self):
        return self._radius
    
    @radius.setter
    def radius(self, value):
        # if radius value is set we invalidate our cached _area value
        # we could make this more intelligent and see if the radius has actually changed
        # but keeping it simple
        self._area = None
        # we could even add validation here, like value has to be numeric, non-negative, etc
        self._radius = value
        
    @property
    def area(self):
        if self._area is None:
            # value not cached - calculate it
            print('Calculating area...')
            self._area = pi * (self.radius ** 2)
        return self._area

In [6]:
c = Circle(1)

In [7]:
c.area

Calculating area...


3.141592653589793

In [8]:
c.area

3.141592653589793

In [9]:
c.radius = 2

In [10]:
c.area

Calculating area...


12.566370614359172

In [11]:
c.area

12.566370614359172

There are a lot of other uses for calculate properties.

Some properties may even do a lot work, like retrieving data from a database, making a call to some external API, and so on.

### Example

Let's write a class that takes a URL, downloads the web page for that URL and provides us some metrics on that URL - like how long it took to download, the size (in bytes) of the page.

Although I am going to use the `urllib` module for this, I strongly recommend you use the `requests` 3rd party library instead: http://docs.python-requests.org

In [12]:
import urllib
from time import perf_counter

In [13]:
class WebPage:
    def __init__(self, url):
        self.url = url
        self._page = None
        self._load_time_secs = None
        self._page_size = None
        
    @property
    def url(self):
        return self._url
    
    @url.setter
    def url(self, value):
        self._url = value
        self._page = None
        # we'll lazy load the page - i.e. we wait until some property is requested
        
    @property
    def page(self):
        if self._page is None:
            self.download_page()
        return self._page
    
    @property
    def page_size(self):
        if self._page is None:
            # need to first download the page
            self.download_page()
        return self._page_size
        
    @property
    def time_elapsed(self):
        if self._page is None:
            self.download_page()
        return self._load_time_secs
            
    def download_page(self):
        self._page_size = None
        self._load_time_secs = None
        start_time = perf_counter()
        with urllib.request.urlopen(self.url) as f:
            self._page = f.read()
        end_time = perf_counter()
        
        self._page_size = len(self._page)
        self._load_time_secs = end_time - start_time

In [14]:
urls = [
    'https://www.google.com',
    'https://www.python.org',
    'https://www.yahoo.com'
]

for url in urls:
    page = WebPage(url)
    print(f'{url} \tsize={format(page.page_size, "_")} \telapsed={page.time_elapsed:.2f} secs')

https://www.google.com 	size=11_489 	elapsed=0.20 secs
https://www.python.org 	size=49_132 	elapsed=0.18 secs
https://www.yahoo.com 	size=524_548 	elapsed=0.77 secs


##  Deleting Properties

Just like we can delete an attribute from an instance object, we can also delete a property from an instance object.

Note that this action simply runs the deleter method, but the propertu remains defined **on the class**. It does not remove the property from the class, instead it is generally used to remove the property value from the **instance**.

Properties, like attributes, can be deleted by using the `del` keyword, or the `delattr` function.

In [1]:
class Person:
    def __init__(self, name):
        self.name = name

    def get_name(self):
        print('getting name property value...')
        return self._name
    
    def set_name(self, value):
        print(f'setting name property to {value}...')
        self._name = value
    
    def del_name(self):
        # delete the underlying data
        print('deleting name property value...')
        del self._name
        
    name = property(fget=get_name, fset=set_name, fdel=del_name, doc='Person name.')


In [2]:
p = Person('Guido')

setting name property to Guido...


In [3]:
p.name

getting name property value...


'Guido'

And the underlying `_name` property is in our instance dictionary:

In [4]:
p.__dict__

{'_name': 'Guido'}

In [5]:
del p.name

deleting name property value...


As we can see, the underlying `_name` attribute is no longer present in the instance dictionary:

In [6]:
p.__dict__

{}

In [7]:
try:
    print(p.name)
except AttributeError as ex:
    print(ex)

getting name property value...
'Person' object has no attribute '_name'


As you can see, the property deletion did not remove the property definition, that still exists.

Alternatively, we can use the `delattr` function as well:

In [8]:
 p = Person('Raymond')

setting name property to Raymond...


In [9]:
delattr(p, 'name')

deleting name property value...


And we can of course use the decorator syntax as well:

In [10]:
class Person:
    def __init__(self, name):
        self.name = name

    @property
    def name(self):
        print('getting name property value...')
        return self._name
    
    @name.setter
    def name(self, value):
        """Person name"""
        print(f'setting name property to {value}...')
        self._name = value
    
    @name.deleter
    def name(self):
        # delete the underlying data
        print('deleting name property value...')
        del self._name

In [11]:
p = Person('Alex')

setting name property to Alex...


In [12]:
p.name

getting name property value...


'Alex'

In [13]:
del p.name

deleting name property value...


##  Class and Static Methods

Asd we saw, when we define a function inside a class, how it behaves (as a function or a method) depends on how the function is accessed: from the class, or from the instance. (We'll cover how that works when we look at descriptors later in this course).

In [1]:
class Person:
    def hello(arg='default'):
        print(f'Hello, with arg={arg}')

If we call `hello` from the class:

In [2]:
Person.hello()

Hello, with arg=default


You'll notice that `hello` was called without any arguments, in fact, `hello` is a regular function:

In [3]:
Person.hello

<function __main__.Person.hello(arg='default')>

But if we call `hello` from an instance, things are different:

In [4]:
p = Person()

In [5]:
p.hello

<bound method Person.hello of <__main__.Person object at 0x7f8f287fb860>>

In [6]:
p.hello()

Hello, with arg=<__main__.Person object at 0x7f8f287fb860>


In [7]:
hex(id(p))

'0x7f8f287fb860'

And as you can see the instance `p` was passed as an argument to `hello`. 

Sometimes however, we define functions in a class that do not interact with the instance itself, but may need something from the class. In those cases, we want the class to be passed to the function as an argument, whether it is called from the class or from an instance of the class.

These are called **class methods**. You'll note that the behavior needs to be different - we don't want the instance to be passed to the function when called from an instance, we want the **class** to be passed to it. In addition, when called from the class, we **also** want the class to be passed to it (this is similar to `static` methods in Java, not to be confused with, as we'll see in a bit, static methods in Python).

We use the `@classmethod` decorator to define class methods, and the first argument of these methods will always be the class where the method is defined.

Let's see a simple example first:

In [8]:
class MyClass:
    def hello():
        # this IS an instance method, we just forgot to add a parameter to capture the instance
        # when this is called from an instance - so this will fail
        print('hello...')
        
    def instance_hello(arg):
        print(f'hello from {arg}')
        
    @classmethod
    def class_hello(arg):
        print(f'hello from {arg}')
        

In [9]:
m = MyClass()

In [10]:
MyClass.hello()

hello...


But, as expected, this won't work:

In [11]:
try:
    m.hello()
except TypeError as ex:
    print(ex)

hello() takes 0 positional arguments but 1 was given


On the other hand, notice now the instance method when called from the instance and the class:

In [12]:
m.instance_hello()

hello from <__main__.MyClass object at 0x7f8ed87fff60>


In [13]:
try:
    MyClass.instance_hello()
except TypeError as ex:
    print(ex)

instance_hello() missing 1 required positional argument: 'arg'


As you can see, the instance method needs to be called from the instance. If we call it from the class, no argument is passed to the function, so we end up with an exception.

This is not the case with class methods - whether we call the method from the class, or the instance, that first argument will always be provided by Python, and will be the class object (not the instance).

Notice how the bindings are different:

In [14]:
MyClass.class_hello

<bound method MyClass.class_hello of <class '__main__.MyClass'>>

In [15]:
m.class_hello

<bound method MyClass.class_hello of <class '__main__.MyClass'>>

As you can see in both these cases, `class_hello` is bound to the class.

But with an instance method, the bindings behave differently:

In [16]:
MyClass.instance_hello

<function __main__.MyClass.instance_hello(arg)>

In [17]:
m.instance_hello

<bound method MyClass.instance_hello of <__main__.MyClass object at 0x7f8ed87fff60>>

So, whenever we call `class_hello` the method is bound to the **class**, and the first argument is the class:

In [18]:
MyClass.class_hello()

hello from <class '__main__.MyClass'>


In [19]:
m.class_hello()

hello from <class '__main__.MyClass'>


Although in this example I used `arg` as the parameter name in our methods, the normal **convention** is to use `self` and `cls` - that way everyone knows what we're talking about!

We sometimes also want to define functions in a class and always have them be just that - functions, never bound to either the class or the instance, however we call them. Often we do this because we need to utility function that is specific to our class, and we want to keep our class self-contained, or maybe we're writing a library of functions (though modules and packages may be more appropriate for this).

These are called **static** methods. (So be careful here, Python static methods and Java static methods do not have the same meaning!)

We can define static methods using the `@staticmethod` decorator:

In [20]:
class MyClass:
    def instance_hello(self):
        print(f'Instance method bound to {self}')
        
    @classmethod
    def class_hello(cls):
        print(f'Class method bound to {cls}')
        
    @staticmethod
    def static_hello():
        print('Static method not bound to anything')

In [21]:
m = MyClass()

In [22]:
m.instance_hello()

Instance method bound to <__main__.MyClass object at 0x7f8ed8811a58>


In [23]:
MyClass.class_hello()

Class method bound to <class '__main__.MyClass'>


In [24]:
m.class_hello()

Class method bound to <class '__main__.MyClass'>


And the static method can be called either from the class or the instance, but is never bound:

In [25]:
MyClass.static_hello

<function __main__.MyClass.static_hello()>

In [26]:
m.static_hello

<function __main__.MyClass.static_hello()>

In [27]:
MyClass.static_hello()

Static method not bound to anything


In [28]:
m.static_hello()

Static method not bound to anything


#### Example

Let's see a more concrete example of using these different method types.

We're going to create a `Timer` class that will allow us to get the current time (in both UTC and some timezone), as well as record start/stop times.

We want to have the same timezone for all instances of our `Timer` class with an easy way to change the timezone for all instances when needed.

If you need to work with timezones, I recommend you use the `pyrz` 3rd party library. Here, I'll just use the standard library, which is definitely not as easy to use as `pytz`.

In [29]:
from datetime import datetime, timezone, timedelta

class Timer:
    tz = timezone.utc  # class variable to store the timezone - default to UTC
    
    @classmethod
    def set_tz(cls, offset, name):
        cls.tz = timezone(timedelta(hours=offset), name)

So `tz` is a class attribute, and we can set it using a class method `set_timezone` - any instances will share the same `tz` value (unless we override it at the instance level)

In [30]:
Timer.set_tz(-7, 'MST')

In [31]:
Timer.tz

datetime.timezone(datetime.timedelta(-1, 61200), 'MST')

In [32]:
t1 = Timer()
t2 = Timer()

In [33]:
t1.tz, t2.tz

(datetime.timezone(datetime.timedelta(-1, 61200), 'MST'),
 datetime.timezone(datetime.timedelta(-1, 61200), 'MST'))

In [34]:
Timer.set_tz(-8, 'PST')

In [35]:
t1.tz, t2.tz

(datetime.timezone(datetime.timedelta(-1, 57600), 'PST'),
 datetime.timezone(datetime.timedelta(-1, 57600), 'PST'))

Next we want a function to return the current UTC time. Obviously this has nothing to do with either the class or the instance, so it is a prime candidate for a static method:

In [36]:
class Timer:
    tz = timezone.utc  # class variable to store the timezone - default to UTC
    
    @staticmethod
    def current_dt_utc():
        return datetime.now(timezone.utc)
    
    @classmethod
    def set_tz(cls, offset, name):
        cls.tz = timezone(timedelta(hours=offset), name)

In [37]:
Timer.current_dt_utc()

datetime.datetime(2019, 6, 2, 23, 25, 59, 714761, tzinfo=datetime.timezone.utc)

In [38]:
t = Timer()

In [39]:
t.current_dt_utc()

datetime.datetime(2019, 6, 2, 23, 25, 59, 723565, tzinfo=datetime.timezone.utc)

Next we want a method that will return the current time based on the set time zone. Obviously the time zone is a class variable, so we'll need to access that, but we don't need any instance data, so this is a prime candidate for a class method:

In [40]:
class Timer:
    tz = timezone.utc  # class variable to store the timezone - default to UTC
    
    @staticmethod
    def current_dt_utc():
        return datetime.now(timezone.utc)
    
    @classmethod
    def set_tz(cls, offset, name):
        cls.tz = timezone(timedelta(hours=offset), name)
        
    @classmethod
    def current_dt(cls):
        return datetime.now(cls.tz)

In [41]:
Timer.current_dt_utc(), Timer.current_dt()

(datetime.datetime(2019, 6, 2, 23, 25, 59, 733420, tzinfo=datetime.timezone.utc),
 datetime.datetime(2019, 6, 2, 23, 25, 59, 733423, tzinfo=datetime.timezone.utc))

In [42]:
t1 = Timer()
t2 = Timer()

In [43]:
t1.current_dt_utc(), t1.current_dt()

(datetime.datetime(2019, 6, 2, 23, 25, 59, 741248, tzinfo=datetime.timezone.utc),
 datetime.datetime(2019, 6, 2, 23, 25, 59, 741251, tzinfo=datetime.timezone.utc))

In [44]:
t2.current_dt()

datetime.datetime(2019, 6, 2, 23, 25, 59, 745699, tzinfo=datetime.timezone.utc)

And if we change the time zone (we can do so either via the class or the instance, no difference, since the `set_tz` method is always bound to the class):

In [45]:
t2.set_tz(-7, 'MST')

In [46]:
Timer.__dict__

mappingproxy({'__module__': '__main__',
              'tz': datetime.timezone(datetime.timedelta(-1, 61200), 'MST'),
              'current_dt_utc': <staticmethod at 0x7f8ed8836d30>,
              'set_tz': <classmethod at 0x7f8ed8836d68>,
              'current_dt': <classmethod at 0x7f8ed8836da0>,
              '__dict__': <attribute '__dict__' of 'Timer' objects>,
              '__weakref__': <attribute '__weakref__' of 'Timer' objects>,
              '__doc__': None})

In [47]:
Timer.current_dt_utc(), Timer.current_dt(), t1.current_dt(), t2.current_dt()

(datetime.datetime(2019, 6, 2, 23, 25, 59, 761523, tzinfo=datetime.timezone.utc),
 datetime.datetime(2019, 6, 2, 16, 25, 59, 761526, tzinfo=datetime.timezone(datetime.timedelta(-1, 61200), 'MST')),
 datetime.datetime(2019, 6, 2, 16, 25, 59, 761526, tzinfo=datetime.timezone(datetime.timedelta(-1, 61200), 'MST')),
 datetime.datetime(2019, 6, 2, 16, 25, 59, 761527, tzinfo=datetime.timezone(datetime.timedelta(-1, 61200), 'MST')))

So far we have not needed any instances to work with this class!

Now we're going to add functionality to start/stop a timer. Obviously we want this to be instance based, since we want to be able to create multiple timers.

In [48]:
class TimerError(Exception):
    """A custom exception used for Timer class"""
    # (since """...""" is a statement, we don't need to pass)
    
class Timer:
    tz = timezone.utc  # class variable to store the timezone - default to UTC
    
    def __init__(self):
        # use these instance variables to keep track of start/end times
        self._time_start = None
        self._time_end = None
        
    @staticmethod
    def current_dt_utc():
        """Returns non-naive current UTC"""
        return datetime.now(timezone.utc)
    
    @classmethod
    def set_tz(cls, offset, name):
        cls.tz = timezone(timedelta(hours=offset), name)
        
    @classmethod
    def current_dt(cls):
        return datetime.now(cls.tz)
    
    def start(self):
        # internally we always non-naive UTC
        self._time_start = self.current_dt_utc()
        self._time_end = None
        
    def stop(self):
        if self._time_start is None:
            # cannot stop if timer was not started!
            raise TimerError('Timer must be started before it can be stopped.')
        self._time_end = self.current_dt_utc()
        
    @property
    def start_time(self):
        if self._time_start is None:
            raise TimerError('Timer has not been started.')
        # since tz is a class variable, we can just as easily access it from self
        return self._time_start.astimezone(self.tz)  
        
    @property
    def end_time(self):
        if self._time_end is None:
            raise TimerError('Timer has not been stopped.')
        return self._time_end.astimezone(self.tz)
    
    @property
    def elapsed(self):
        if self._time_start is None:
            raise TimerError('Timer must be started before an elapsed time is available')
            
        if self._time_end is None:
            # timer has not ben stopped, calculate elapsed between start and now
            elapsed_time = self.current_dt_utc() - self._time_start
        else:
            # timer has been stopped, calculate elapsed between start and end
            elapsed_time = self._time_end - self._time_start
            
        return elapsed_time.total_seconds()

In [49]:
from time import sleep

t1 = Timer()
t1.start()
sleep(2)
t1.stop()
print(f'Start time: {t1.start_time}')
print(f'End time: {t1.end_time}')
print(f'Elapsed: {t1.elapsed} seconds')

Start time: 2019-06-02 23:25:59.777250+00:00
End time: 2019-06-02 23:26:01.781431+00:00
Elapsed: 2.004181 seconds


In [50]:
t2 = Timer()
t2.start()
sleep(3)
t2.stop()
print(f'Start time: {t2.start_time}')
print(f'End time: {t2.end_time}')
print(f'Elapsed: {t2.elapsed} seconds')

Start time: 2019-06-02 23:26:01.787596+00:00
End time: 2019-06-02 23:26:04.792814+00:00
Elapsed: 3.005218 seconds


So our timer works. Furthermore, we want to use `MST` throughout our application, so we'll set it, and since it's a class level attribute, we only need to change it once:

In [51]:
Timer.set_tz(-7, 'MST')

In [52]:
print(f'Start time: {t1.start_time}')
print(f'End time: {t1.end_time}')
print(f'Elapsed: {t1.elapsed} seconds')

Start time: 2019-06-02 16:25:59.777250-07:00
End time: 2019-06-02 16:26:01.781431-07:00
Elapsed: 2.004181 seconds


In [53]:
print(f'Start time: {t2.start_time}')
print(f'End time: {t2.end_time}')
print(f'Elapsed: {t2.elapsed} seconds')

Start time: 2019-06-02 16:26:01.787596-07:00
End time: 2019-06-02 16:26:04.792814-07:00
Elapsed: 3.005218 seconds


##  Class Body Scope

The class body is a scope and therefore has it's own namespace. Inside that scope we can reference symbols like we would within any other scope:

In [1]:
class Language:
    MAJOR = 3
    MINOR = 7
    REVISION = 4
    FULL = '{}.{}.{}'.format(MAJOR, MINOR, REVISION)

In [2]:
Language.FULL

'3.7.4'

However, functions defined inside the class are not nested in the body scope - instead they are nested in whatever scope the class itself is in.

This means that we cannot reference the class symbols inside a function without also telling Python where to look for it:

In [3]:
class Language:
    MAJOR = 3
    MINOR = 7
    REVISION = 4
    
    @property
    def version(self):
        return '{}.{}.{}'.format(self.MAJOR, self.MINOR, self.REVISION)
    
    @classmethod
    def cls_version(cls):
        return '{}.{}.{}'.format(cls.MAJOR, cls.MINOR, cls.REVISION)
    
    @staticmethod
    def static_version():
        return '{}.{}.{}'.format(Language.MAJOR, Language.MINOR, Language.REVISION)

In [4]:
l = Language()
l.version

'3.7.4'

In [5]:
Language.cls_version()

'3.7.4'

In [6]:
Language.static_version()

'3.7.4'

Basically think that the function symbols are in the class body namespace, but the functions themselves are defined externally to the class - just as if we had written it this way:

In [7]:
def full_version():
 return '{}.{}.{}'.format(Language.MAJOR, Language.MINOR, Language.REVISION)

In [8]:
full_version()

'3.7.4'

So writing something like this will not work:

In [9]:
class Language:
    MAJOR = 3
    MINOR = 7
    REVISION = 4
    
    @classmethod
    def cls_version(cls):
        return '{}.{}.{}'.format(MAJOR, MINOR, REVISION)

In [10]:
Language.cls_version()

NameError: name 'MAJOR' is not defined

This behavior can lead to subtle bugs if we aren't careful. 

What happens if the names `MAJOR`, `MINOR` and `REVISION` **are** defined in the enclosing scope?

In [11]:
MAJOR = 0
MINOR = 0
REVISION = 1

In [12]:
Language.cls_version()

'0.0.1'

See what happened?!!

Now of course, the nested scopes follow the same usual rules, so we could technically have something like this:

In [13]:
MAJOR = 0
MINOR = 0
REVISION = 1

def gen_class():
    MAJOR = 0
    MINOR = 4
    REVISION = 2
    
    class Language:
        MAJOR = 3
        MINOR = 7
        REVISION = 4

        @classmethod
        def version(cls):
            return '{}.{}.{}'.format(MAJOR, MINOR, REVISION)
        
    return Language

In [14]:
cls = gen_class()

In [15]:
cls.version()

'0.4.2'

Notice how the scope of `version` was nested inside `gen_class` which itself is nested in the `global` scope.

When we called the `version` method, it found the `MAJOR`, `MINOR` and `REVISION` in the closest enclosing scope - which turned out to be the `gen_class` scope.

This means by the way, that `version` is not only a method, but actually a closure.

In [16]:
import inspect

In [17]:
inspect.getclosurevars(cls.version)

ClosureVars(nonlocals={'MAJOR': 0, 'MINOR': 4, 'REVISION': 2}, globals={}, builtins={'format': <built-in function format>}, unbound=set())

This last example of "unexpected" behavior I want to show you was show to me by a friend who was puzzled by it:

In [18]:
name = 'Guido'

class MyClass:
    name = 'Raymond'
    list_1 = [name] * 3
    list_2 = [name.upper() for i in range(3)]
    
    @classmethod
    def hello(cls):
        return '{} says hello'.format(name)

In [19]:
MyClass.list_1

['Raymond', 'Raymond', 'Raymond']

Since the expression `[name] * 3` lives in the class body, it uses `name` that it finds in the class namespace.

In [20]:
MyClass.hello()

'Guido says hello'

Here, `name` is used inside a function, so the closest `name` symbol is the one in the module/global scope. Hence we see that `Guido` was used.

In [21]:
MyClass.list_2

['GUIDO', 'GUIDO', 'GUIDO']

That one is more puzzling... Why is the expression `[name.upper() for i in range(3)]` using `name` from the enclosing (module/global) scope, and not the one from the class namespace like `[name] * 3` did?

Remember what we discussed about comprehensions?

They are essentially thinly veiled **functions**!!!

So they behave like a function would, and therefore are not nested in the class body scope, but, in this case, in the module/global scope!

# Section 03 - Project 1

##  Project 1

We need to design an dimplement a class that will be used to represent bank accounts.

We want the following functionality and characteristics:
- accounts are uniquely identified by an **account number** (assume it will just be passed in the initializer)
- account holders have a **first** and **last** name
- accounts have an associated **preferred time zone offset** (e.g. -7 for MST)
- **balances** need to be zero or higher, and should not be directly settable.
- but, **deposits** and **withdrawals** can be made (given sufficient funds)
    - if a withdrawal is attempted that would result in nagative funds, the transaction should be declined.
- a **monthly interest rate** exists and is applicable to all accounts **uniformly**. There should be a method that can be called to calculate the interest on the current balance using the current interest rate, and **add it** to the balance.
- each deposit and withdrawal must generate a **confirmation number** composed of:
    - the transaction type: `D` for deposit, and `W` for withdrawal, `I` for interest deposit, and `X` for declined (in which case the balance remains unaffected)
    - the account number
    - the time the transaction was made, using UTC
    - an incrementing number (that increments across all accounts and transactions)
    - for (extreme!) simplicity assume that the transaction id starts at zero (or whatever number you choose) whenever the program starts
    - the confirmation number should be returned from any of the transaction methods (deposit, withdraw, etc)
- create a **method** that, given a confirmation number, returns:
    - the account number, transaction code (D, W, etc), datetime (UTC format), date time (in whatever timezone is specified in te argument, but more human readable), the transaction ID
    - make it so it is a nicely structured object (so can use dotted notation to access these three attributes)
    - I purposefully made it so the desired timezone is passed as an argument. Can you figure out why? (hint: does this method require any information from any instance?)

For example, we may have an account with:
- account number `140568` 
- preferred time zone offset of -7 (MST) 
- an existing balance of `100.00`

Suppose the last transaction ID in the system was `123`, and a deposit is made for `50.00` on `2019-03-15T14:59:00` (UTC) on that account (or `2019-03-15T07:59:00` in account's preferred time zone offset)

The new balance should reflect `150.00` and the confirmation number returned should look something like this:

```D-140568-20190315145900-124```

We also want a method that given the confirmation number returns an object with attributes:
- `result.account_number` --> `140568`
- `result.transaction_code` --> `D`
- `result.transaction_id` --> `124`
- `result.time` --> `2019-03-15 07:59:00 (MST)`
- `result.time_utc` --> `2019-03-15T14:59:00`

Furthermore, if current interest rate is `0.5%`, and the account's balance is `1000.00`, then the result of calling the `deposit_interest` (or whatever name you choose) method, should result in a new transaction and a new balance of `1050.00`. Calling this method should also return a confirmation number.

For simplicty, just use floats, but be aware that for these types of situations you'll probably want to use `Decimal` objects instead of floats.

There are going to be many ways to design something like this, especially since I have not nailed down all the specific requirements, so you'll have to fill the gaps yourself and decide what other things you may want to implement (like is the account number going to be a mutable property, or "read-only" and so on).

See how many different ideas you can use from what we covered in the last section. 

My approach will end up creating two classes: a `TimeZone` class used to store the time zone name and offset definition (in hours and minutes), and a main class called `Account` that will have the following "public" interface:
- initializer with account number, first name, last name, optional preferred time zone, starting balance (defaults to 0)
- a first name property (read/write)
- a last name property (read/write)
- a full name property (computed, read-only)
- a balance property (read-only)
- an interest rate property (class level property)
- deposit, withdraw, pay_interest methods
- parse confirmation code

Class will have additional state and methods, but those will be used for implementation.

You should also remember to test your code! In the solutions I will introduce you to Python's `unittest` package. Even if you skip this project, at least review that video and/or notebook if you are unfamiliar with `unittest`.

##  Project 1: TimeZone class

Let's start with the timezone class. This one will have two instance attributes, offset and name. I'm going to create those as read-only properties. Offsets should be provided as a timespan (timedelta) of hours and minutes - we'll allow specifying the hour and minute offsets separately in the __init__, but the offset property will combine those as a timespan object.

In [1]:
import numbers
from datetime import timedelta


class TimeZone:
    def __init__(self, name, offset_hours, offset_minutes):
        if name is None or len(str(name).strip()) == 0:
            raise ValueError('Timezone name cannot be empty.')
            
        self._name = str(name).strip()
        
        if not isinstance(offset_hours, numbers.Integral):
            raise ValueError('Hour offset must be an integer.')
        
        if not isinstance(offset_minutes, numbers.Integral):
            raise ValueError('Minutes offset must be an integer.')
            
        if offset_minutes < -59 or offset_minutes > 59:
            raise ValueError('Minutes offset must between -59 and 59 (inclusive).')
            
        # for time delta sign of minutes will be set to sign of hours
        offset = timedelta(hours=offset_hours, minutes=offset_minutes)

        # offsets are technically bounded between -12:00 and 14:00
        # see: https://en.wikipedia.org/wiki/List_of_UTC_time_offsets
        if offset < timedelta(hours=-12, minutes=0) or offset > timedelta(hours=14, minutes=0):
            raise ValueError('Offset must be between -12:00 and +14:00.')
            
        self._offset_hours = offset_hours
        self._offset_minutes = offset_minutes
        self._offset = offset
        
    @property
    def offset(self):
        return self._offset
    
    @property
    def name(self):
        return self._name
    
    def __eq__(self, other):
        return (isinstance(other, TimeZone) and 
                self.name == other.name and 
                self._offset_hours == other._offset_hours and
                self._offset_minutes == other._offset_minutes)
    def __repr__(self):
        return (f"TimeZone(name='{self.name}', "
                f"offset_hours={self._offset_hours}, "
                f"offset_minutes={self._offset_minutes})")

Let's try it out and make sure it's working:

In [2]:
tz1 = TimeZone('ABC', -2, -15)

In [3]:
tz1.name

'ABC'

In [4]:
from datetime import datetime

dt = datetime.utcnow()
print(dt)

2019-06-02 23:27:15.937254


In [5]:
print(dt + tz1.offset)

2019-06-02 21:12:15.937254


As we can see the offset seems to be working (-2:15 from current time)

(We really should be writing unit tests as we write our code - but I'll show you unit tests in the last section of this project, and next project we can code and unit test in parallel)

##  Project 1: Transaction Numbers

Next I want something for the transaction ID logic.

Basically, we just need an initialized counter that returns the next value every time it is called.

We have different ways of doing this - since this is an OOP course, you might be tempted to jump straight into a class implementation for this.

We could do it this way:

In [1]:
class TransactionID:
    def __init__(self, start_id):
        self._start_id = start_id
        
    def next(self):
        self._start_id += 1
        return self._start_id

We could then use an instance of this class as a class attribute for `Account` thusly:

In [2]:
class Account:
    transaction_counter = TransactionID(100)
    
    def make_transaction(self):
        new_trans_id = Account.transaction_counter.next()
        return new_trans_id

In [3]:
a1 = Account()
a2 = Account()

print(a1.make_transaction())
print(a2.make_transaction())
print(a1.make_transaction())

101
102
103


So, that works just fine, but if we think about this a bit more, we should see that we really do not need a class to solve this particular problem. We could even implement the iterator protocol for this class (implementing `__iter__` and `__next__`), but a simple infinite generator will work just as well.

So moral of the story here, don't just jump into solving every problem using classes - this isn't Java, we don't need a class for everything!

Instead, I'm going to implement it this way:

In [4]:
def transaction_ids(start_id):
    while True:
        start_id += 1
        yield start_id

So we can use it this way:

In [5]:
class Account:
    transaction_counter = transaction_ids(100)
    
    def make_transaction(self):
        new_trans_id = next(Account.transaction_counter)
        return new_trans_id

In [6]:
a1 = Account()
a2 = Account()

print(a1.make_transaction())
print(a2.make_transaction())
print(a1.make_transaction())

101
102
103


So this works equally well, but if we recall the `counter` method in the `itertools` module, we can simplify this even further:

In [7]:
import itertools

class Account:
    transaction_counter = itertools.count(100)
    
    def make_transaction(self):
        new_trans_id = next(Account.transaction_counter)
        return new_trans_id

In [8]:
a1 = Account()
a2 = Account()

print(a1.make_transaction())
print(a2.make_transaction())
print(a1.make_transaction())

100
101
102


##  Project 1: Account Number, First Name, Last Name

Here's our code so far:

In [1]:
import itertools
import numbers
from datetime import timedelta

class TimeZone:
    def __init__(self, name, offset_hours, offset_minutes):
        if name is None or len(str(name).strip()) == 0:
            raise ValueError('Timezone name cannot be empty.')
            
        self._name = str(name).strip()
        # technically we should check that offset is a
        if not isinstance(offset_hours, numbers.Integral):
            raise ValueError('Hour offset must be an integer.')
        
        if not isinstance(offset_minutes, numbers.Integral):
            raise ValueError('Minutes offset must be an integer.')
            
        if offset_minutes < -59 or offset_minutes > 59:
            raise ValueError('Minutes offset must between -59 and 59 (inclusive).')
            
        # for time delta sign of minutes will be set to sign of hours
        offset = timedelta(hours=offset_hours, minutes=offset_minutes)

        # offsets are technically bounded between -12:00 and 14:00
        # see: https://en.wikipedia.org/wiki/List_of_UTC_time_offsets
        if offset < timedelta(hours=-12, minutes=0) or offset > timedelta(hours=14, minutes=0):
            raise ValueError('Offset must be between -12:00 and +14:00.')
            
        self._offset_hours = offset_hours
        self._offset_minutes = offset_minutes
        self._offset = offset
        
    @property
    def offset(self):
        return self._offset
    
    @property
    def name(self):
        return self._name
    
    def __eq__(self, other):
        return (isinstance(other, TimeZone) and 
                self.name == other.name and 
                self._offset_hours == other._offset_hours and
                self._offset_minutes == other._offset_minutes)
    def __repr__(self):
        return (f"TimeZone(name='{self.name}', "
                f"offset_hours={self._offset_hours}, "
                f"offset_minutes={self._offset_minutes})")
    
class Account:
    transaction_counter = itertools.count(100)

Let's implement the properties for account number, first name and last name.

In [2]:
class Account:
    transaction_counter = itertools.count(100)
    
    def __init__(self, account_number, first_name, last_name):
        # in practice we probably would want to add checks to make sure these values are valid / non-empty
        self._account_number = account_number
        self.first_name = first_name
        self.last_name = last_name
        
    @property
    def account_number(self):
        return self._account_number
    
    @property 
    def first_name(self):
        return self._first_name
    
    @first_name.setter
    def first_name(self, value):
        if value is None or len(str(value).strip()) == 0:
            raise ValueError('First name cannot be empty.')
        self._first_name = value
        
    @property
    def last_name(self):
        return self._last_name
    
    @last_name.setter
    def last_name(self, value):
        if value is None or len(str(value).strip()) == 0:
            raise ValueError('Last name cannot be empty.')
        self._last_name = value
        
    # also going to create a full_name computed property, for ease of use
    @property
    def full_name(self):
        return f'{self.first_name} {self.last_name}'

You'll notice how we are using the same basic functionality to validate the first and last names. Most likely we'll need to add additional validations, in which case we'd have to add it to both places.

I don't like repetitive code, so I'm going to move the validation into a separate function. That function won't need access to the instance data, so that's a prime candidate for a static method:

In [3]:
class Account:
    transaction_counter = itertools.count(100)
    
    def __init__(self, account_number, first_name, last_name):
        # in practice we probably would want to add checks to make sure these values are valid / non-empty
        self._account_number = account_number
        self.first_name = first_name
        self.last_name = last_name
        
    @property
    def account_number(self):
        return self._account_number
    
    @property 
    def first_name(self):
        return self._first_name
    
    @first_name.setter
    def first_name(self, value):
        self._first_name = Account.validate_name(value, 'First Name')
        
    @property
    def last_name(self):
        return self._last_name
    
    @last_name.setter
    def last_name(self, value):
        self._last_name = Account.validate_name(value, 'Last Name')
        
    # also going to create a full_name computed property, for ease of use
    @property
    def full_name(self):
        return f'{self.first_name} {self.last_name}'
    
    
    @staticmethod
    def validate_name(value, field_title):
        if value is None or len(str(value).strip()) == 0:
            raise ValueError(f'{field_title} cannot be empty.')
        return str(value).strip()

In [4]:
try:
    a = Account('12345', 'John', '')
except ValueError as ex:
    print(ex)

Last Name cannot be empty.


Now, just to show you how we could use `setattr`, we could also do something like this instead:

In [5]:
class Account:
    transaction_counter = itertools.count(100)
    
    def __init__(self, account_number, first_name, last_name):
        # in practice we probably would want to add checks to make sure these values are valid / non-empty
        self._account_number = account_number
        self.first_name = first_name
        self.last_name = last_name
        
    @property
    def account_number(self):
        return self._account_number
    
    @property 
    def first_name(self):
        return self._first_name
    
    @first_name.setter
    def first_name(self, value):
        self.validate_and_set_name('_first_name', value, 'First Name')
        
    @property
    def last_name(self):
        return self._last_name
    
    @last_name.setter
    def last_name(self, value):
        self.validate_and_set_name('_last_name', value, 'Last Name')
        
    # also going to create a full_name computed property, for ease of use
    @property
    def full_name(self):
        return f'{self.first_name} {self.last_name}'
    
    
    def validate_and_set_name(self, property_name, value, field_title):
        if value is None or len(str(value).strip()) == 0:
            raise ValueError(f'{field_title} cannot be empty.')
        setattr(self, property_name, value)

In [6]:
try:
    a = Account('12345', 'John', '')
except ValueError as ex:
    print(ex)

Last Name cannot be empty.


In [7]:
a = Account('12345', 'Alex', 'Martelli')

In [8]:
a.first_name, a.last_name, a.full_name

('Alex', 'Martelli', 'Alex Martelli')

So, whichever approach you prefer - I favor the second one because that way I have the same validation apply to both first and last name properties.

##  Project 1: Adding the Preferred TimeZone Property

Here's where we left off with our code:

In [1]:
import itertools
import numbers
from datetime import timedelta

class TimeZone:
    def __init__(self, name, offset_hours, offset_minutes):
        if name is None or len(str(name).strip()) == 0:
            raise ValueError('Timezone name cannot be empty.')
            
        self._name = str(name).strip()
        # technically we should check that offset is a
        if not isinstance(offset_hours, numbers.Integral):
            raise ValueError('Hour offset must be an integer.')
        
        if not isinstance(offset_minutes, numbers.Integral):
            raise ValueError('Minutes offset must be an integer.')
            
        if offset_minutes < -59 or offset_minutes > 59:
            raise ValueError('Minutes offset must between -59 and 59 (inclusive).')
            
        # for time delta sign of minutes will be set to sign of hours
        offset = timedelta(hours=offset_hours, minutes=offset_minutes)

        # offsets are technically bounded between -12:00 and 14:00
        # see: https://en.wikipedia.org/wiki/List_of_UTC_time_offsets
        if offset < timedelta(hours=-12, minutes=0) or offset > timedelta(hours=14, minutes=0):
            raise ValueError('Offset must be between -12:00 and +14:00.')
            
        self._offset_hours = offset_hours
        self._offset_minutes = offset_minutes
        self._offset = offset
        
    @property
    def offset(self):
        return self._offset
    
    @property
    def name(self):
        return self._name
    
    def __eq__(self, other):
        return (isinstance(other, TimeZone) and 
                self.name == other.name and 
                self._offset_hours == other._offset_hours and
                self._offset_minutes == other._offset_minutes)
    def __repr__(self):
        return (f"TimeZone(name='{self.name}', "
                f"offset_hours={self._offset_hours}, "
                f"offset_minutes={self._offset_minutes})")
    
    
class Account:
    transaction_counter = itertools.count(100)
    
    def __init__(self, account_number, first_name, last_name):
        # in practice we probably would want to add checks to make sure these values are valid / non-empty
        self._account_number = account_number
        self.first_name = first_name
        self.last_name = last_name
        
    @property
    def account_number(self):
        return self._account_number
    
    @property 
    def first_name(self):
        return self._first_name
    
    @first_name.setter
    def first_name(self, value):
        self.validate_and_set_name('_first_name', value, 'First Name')
        
    @property
    def last_name(self):
        return self._last_name
    
    @last_name.setter
    def last_name(self, value):
        self.validate_and_set_name('_last_name', value, 'Last Name')
        
    # also going to create a full_name computed property, for ease of use
    @property
    def full_name(self):
        return f'{self.first_name} {self.last_name}'
    
    
    def validate_and_set_name(self, property_name, value, field_title):
        if value is None or len(str(value).strip()) == 0:
            raise ValueError(f'{field_title} cannot be empty.')
        setattr(self, property_name, value)

The preferred time zone is in instance attribute, and should be read-write (account holder may want to change time zones later). We'll add it to the `__init__` method, and default it to UTC (so 0 offset), but provide getters and setters to read/write it later if needed.

In [2]:
class Account:
    transaction_counter = itertools.count(100)
    
    def __init__(self, account_number, first_name, last_name, 
                 timezone=None):
        # in practice we probably would want to add checks to make 
        # sure these values are valid / non-empty
        self._account_number = account_number
        self.first_name = first_name
        self.last_name = last_name
        
        if timezone is None:
            timezone = TimeZone('UTC', 0, 0)
        self.timezone = timezone
        
    @property
    def account_number(self):
        return self._account_number
    
    @property 
    def first_name(self):
        return self._first_name
    
    @first_name.setter
    def first_name(self, value):
        self.validate_and_set_name('_first_name', value, 'First Name')
        
    @property
    def last_name(self):
        return self._last_name
    
    @last_name.setter
    def last_name(self, value):
        self.validate_and_set_name('_last_name', value, 'Last Name')
        
    # also going to create a full_name computed property, for ease of use
    @property
    def full_name(self):
        return f'{self.first_name} {self.last_name}'
    
    
    @property
    def timezone(self):
        return self._timezone
    
    @timezone.setter
    def timezone(self, value):
        if not isinstance(value, TimeZone):
            raise ValueError('Time zone must be a valid TimeZone object.')
        self._timezone = value
            
    def validate_and_set_name(self, property_name, value, field_title):
        if value is None or len(str(value).strip()) == 0:
            raise ValueError(f'{field_title} cannot be empty.')
        setattr(self, property_name, value)

In [3]:
try:
    a = Account('123', 'John', 'Smith', '-7:00')
except ValueError as ex:
    print(ex)

Time zone must be a valid TimeZone object.


In [4]:
a = Account('123', 'John', 'Smith')
print(a.timezone)

TimeZone(name='UTC', offset_hours=0, offset_minutes=0)


##  Project 1 - Balance

Here's where we left off with our code:

In [1]:
import itertools
import numbers
from datetime import timedelta

class TimeZone:
    def __init__(self, name, offset_hours, offset_minutes):
        if name is None or len(str(name).strip()) == 0:
            raise ValueError('Timezone name cannot be empty.')
            
        self._name = str(name).strip()
        # technically we should check that offset is a
        if not isinstance(offset_hours, numbers.Integral):
            raise ValueError('Hour offset must be an integer.')
        
        if not isinstance(offset_minutes, numbers.Integral):
            raise ValueError('Minutes offset must be an integer.')
            
        if offset_minutes < -59 or offset_minutes > 59:
            raise ValueError('Minutes offset must between -59 and 59 (inclusive).')
            
        # for time delta sign of minutes will be set to sign of hours
        offset = timedelta(hours=offset_hours, minutes=offset_minutes)

        # offsets are technically bounded between -12:00 and 14:00
        # see: https://en.wikipedia.org/wiki/List_of_UTC_time_offsets
        if offset < timedelta(hours=-12, minutes=0) or offset > timedelta(hours=14, minutes=0):
            raise ValueError('Offset must be between -12:00 and +14:00.')
            
        self._offset_hours = offset_hours
        self._offset_minutes = offset_minutes
        self._offset = offset
        
    @property
    def offset(self):
        return self._offset
    
    @property
    def name(self):
        return self._name
    
    def __eq__(self, other):
        return (isinstance(other, TimeZone) and 
                self.name == other.name and 
                self._offset_hours == other._offset_hours and
                self._offset_minutes == other._offset_minutes)
    def __repr__(self):
        return (f"TimeZone(name='{self.name}', "
                f"offset_hours={self._offset_hours}, "
                f"offset_minutes={self._offset_minutes})")

In [2]:
class Account:
    transaction_counter = itertools.count(100)
    
    def __init__(self, account_number, first_name, last_name, timezone=None):
        # in practice we probably would want to add checks to make sure these values are valid / non-empty
        self._account_number = account_number
        self.first_name = first_name
        self.last_name = last_name
        
        if timezone is None:
            timezone = TimeZone('UTC', 0, 0)
        self.timezone = timezone
        
    @property
    def account_number(self):
        return self._account_number
    
    @property 
    def first_name(self):
        return self._first_name
    
    @first_name.setter
    def first_name(self, value):
        self.validate_and_set_name('_first_name', value, 'First Name')
        
    @property
    def last_name(self):
        return self._last_name
    
    @last_name.setter
    def last_name(self, value):
        self.validate_and_set_name('_last_name', value, 'Last Name')
        
    # also going to create a full_name computed property, for ease of use
    @property
    def full_name(self):
        return f'{self.first_name} {self.last_name}'    
    
    @property
    def timezone(self):
        return self._timezone
    
    @timezone.setter
    def timezone(self, value):
        if not isinstance(value, TimeZone):
            raise ValueError('Time zone must be a valid TimeZone object.')
        self._timezone = value
            
    def validate_and_set_name(self, property_name, value, field_title):
        if value is None or len(str(value).strip()) == 0:
            raise ValueError(f'{field_title} cannot be empty.')
        setattr(self, property_name, value)

Next we need to add something to maintain the balance. Again this is going to be a "read-only" property, and we'll add it as an optional param to the `__init__` method as well:

In [3]:
class Account:
    transaction_counter = itertools.count(100)
    
    def __init__(self, account_number, first_name, last_name, 
                 timezone=None, initial_balance=0):
        # in practice we probably would want to add checks to make sure these values are valid / non-empty
        self._account_number = account_number
        self.first_name = first_name
        self.last_name = last_name
        
        if timezone is None:
            timezone = TimeZone('UTC', 0, 0)
        self.timezone = timezone
        
        self._balance = float(initial_balance)  # force use of floats here, but maybe Decimal would be better
        
    @property
    def account_number(self):
        return self._account_number
    
    @property 
    def first_name(self):
        return self._first_name
    
    @first_name.setter
    def first_name(self, value):
        self.validate_and_set_name('_first_name', value, 'First Name')
        
    @property
    def last_name(self):
        return self._last_name
    
    @last_name.setter
    def last_name(self, value):
        self.validate_and_set_name('_last_name', value, 'Last Name')
        
    # also going to create a full_name computed property, for ease of use
    @property
    def full_name(self):
        return f'{self.first_name} {self.last_name}'
        
    @property
    def timezone(self):
        return self._timezone
    
    @property
    def balance(self):
        return self._balance
    
    @timezone.setter
    def timezone(self, value):
        if not isinstance(value, TimeZone):
            raise ValueError('Time zone must be a valid TimeZone object.')
        self._timezone = value
            
    def validate_and_set_name(self, property_name, value, field_title):
        if value is None or len(str(value).strip()) == 0:
            raise ValueError(f'{field_title} cannot be empty.')
        setattr(self, property_name, value)

In [4]:
a = Account('1234', 'John', 'Cleese', initial_balance=100)
print(a.balance)

100.0


In [5]:
try:
    a.balance = 200
except AttributeError as ex:
    print(ex)

can't set attribute


##  Project 1 - Interest Rate

Our code so far:

In [1]:
import itertools
import numbers
from datetime import timedelta

class TimeZone:
    def __init__(self, name, offset_hours, offset_minutes):
        if name is None or len(str(name).strip()) == 0:
            raise ValueError('Timezone name cannot be empty.')
            
        self._name = str(name).strip()
        # technically we should check that offset is a
        if not isinstance(offset_hours, numbers.Integral):
            raise ValueError('Hour offset must be an integer.')
        
        if not isinstance(offset_minutes, numbers.Integral):
            raise ValueError('Minutes offset must be an integer.')
            
        if offset_minutes < -59 or offset_minutes > 59:
            raise ValueError('Minutes offset must between -59 and 59 (inclusive).')
            
        # for time delta sign of minutes will be set to sign of hours
        offset = timedelta(hours=offset_hours, minutes=offset_minutes)

        # offsets are technically bounded between -12:00 and 14:00
        # see: https://en.wikipedia.org/wiki/List_of_UTC_time_offsets
        if offset < timedelta(hours=-12, minutes=0) or offset > timedelta(hours=14, minutes=0):
            raise ValueError('Offset must be between -12:00 and +14:00.')
            
        self._offset_hours = offset_hours
        self._offset_minutes = offset_minutes
        self._offset = offset
        
    @property
    def offset(self):
        return self._offset
    
    @property
    def name(self):
        return self._name
    
    def __eq__(self, other):
        return (isinstance(other, TimeZone) and 
                self.name == other.name and 
                self._offset_hours == other._offset_hours and
                self._offset_minutes == other._offset_minutes)
    def __repr__(self):
        return (f"TimeZone(name='{self.name}', "
                f"offset_hours={self._offset_hours}, "
                f"offset_minutes={self._offset_minutes})")

In [2]:
class Account:
    transaction_counter = itertools.count(100)
    
    def __init__(self, account_number, first_name, last_name, timezone=None, initial_balance=0):
        # in practice we probably would want to add checks to make sure these values are valid / non-empty
        self._account_number = account_number
        self.first_name = first_name
        self.last_name = last_name
        
        if timezone is None:
            timezone = TimeZone('UTC', 0, 0)
        self.timezone = timezone
        
        self._balance = float(initial_balance)  # force use of floats here, but maybe Decimal would be better
        
    @property
    def account_number(self):
        return self._account_number
    
    @property 
    def first_name(self):
        return self._first_name
    
    @first_name.setter
    def first_name(self, value):
        self.validate_and_set_name('_first_name', value, 'First Name')
        
    @property
    def last_name(self):
        return self._last_name
    
    @last_name.setter
    def last_name(self, value):
        self.validate_and_set_name('_last_name', value, 'Last Name')
        
    # also going to create a full_name computed property, for ease of use
    @property
    def full_name(self):
        return f'{self.first_name} {self.last_name}'
        
    @property
    def timezone(self):
        return self._timezone
    
    @property
    def balance(self):
        return self._balance
    
    @timezone.setter
    def timezone(self, value):
        if not isinstance(value, TimeZone):
            raise ValueError('Time zone must be a valid TimeZone object.')
        self._timezone = value
            
    def validate_and_set_name(self, property_name, value, field_title):
        if value is None or len(str(value).strip()) == 0:
            raise ValueError(f'{field_title} cannot be empty.')
        setattr(self, property_name, value)

The interest rate is common across all bank accounts, so we can use a class attribute for it. 

We haven't studied how to write class properties (we'll see how when we look at descriptors), so for now we could just use a bare attribute - not ideal, but let's try that first.

In [3]:
class Account:
    transaction_counter = itertools.count(100)
    interest_rate = 0.5  # percentage
    
    def __init__(self, account_number, first_name, last_name, timezone=None, initial_balance=0):
        # in practice we probably would want to add checks to make sure these values are valid / non-empty
        self._account_number = account_number
        self.first_name = first_name
        self.last_name = last_name
        
        if timezone is None:
            timezone = TimeZone('UTC', 0, 0)
        self.timezone = timezone
        
        self._balance = float(initial_balance)  # force use of floats here, but maybe Decimal would be better
        
    @property
    def account_number(self):
        return self._account_number
    
    @property 
    def first_name(self):
        return self._first_name
    
    @first_name.setter
    def first_name(self, value):
        self.validate_and_set_name('_first_name', value, 'First Name')
        
    @property
    def last_name(self):
        return self._last_name
    
    @last_name.setter
    def last_name(self, value):
        self.validate_and_set_name('_last_name', value, 'Last Name')
        
    # also going to create a full_name computed property, for ease of use
    @property
    def full_name(self):
        return f'{self.first_name} {self.last_name}'
        
    @property
    def timezone(self):
        return self._timezone
    
    @property
    def balance(self):
        return self._balance
    
    @timezone.setter
    def timezone(self, value):
        if not isinstance(value, TimeZone):
            raise ValueError('Time zone must be a valid TimeZone object.')
        self._timezone = value
            
    def validate_and_set_name(self, property_name, value, field_title):
        if value is None or len(str(value).strip()) == 0:
            raise ValueError(f'{field_title} cannot be empty.')
        setattr(self, property_name, value)

In [4]:
a1 = Account(1234, 'Monty', 'Python', initial_balance=0)
a2 = Account(2345, 'John', 'Cleese', initial_balance=0)

In [5]:
a1.interest_rate, a2.interest_rate

(0.5, 0.5)

In [6]:
Account.interest_rate = 0.025

In [7]:
a1.interest_rate, a2.interest_rate

(0.025, 0.025)

If we don't feel comfortable with the bare attribute and want to validate changing the interest rate, then we could use a more Java-like approach with getter and setter class methods. 

We'll start by changing the class variable name to indicate it is a "private" attribute, then add a getter and a setter class method:

In [8]:
class Account:
    transaction_counter = itertools.count(100)
    _interest_rate = 0.5  # percentage
    
    def __init__(self, account_number, first_name, last_name, timezone=None, initial_balance=0):
        # in practice we probably would want to add checks to make sure these values are valid / non-empty
        self._account_number = account_number
        self.first_name = first_name
        self.last_name = last_name
        
        if timezone is None:
            timezone = TimeZone('UTC', 0, 0)
        self.timezone = timezone
        
        self._balance = float(initial_balance)  # force use of floats here, but maybe Decimal would be better
        
    @property
    def account_number(self):
        return self._account_number
    
    @property 
    def first_name(self):
        return self._first_name
    
    @first_name.setter
    def first_name(self, value):
        self.validate_and_set_name('_first_name', value, 'First Name')
        
    @property
    def last_name(self):
        return self._last_name
    
    @last_name.setter
    def last_name(self, value):
        self.validate_and_set_name('_last_name', value, 'Last Name')
        
    # also going to create a full_name computed property, for ease of use
    @property
    def full_name(self):
        return f'{self.first_name} {self.last_name}'
        
    @property
    def timezone(self):
        return self._timezone
    
    @property
    def balance(self):
        return self._balance
    
    @timezone.setter
    def timezone(self, value):
        if not isinstance(value, TimeZone):
            raise ValueError('Time zone must be a valid TimeZone object.')
        self._timezone = value
          
    @classmethod
    def get_interest_rate(cls):
        return cls._interest_rate
    
    @classmethod
    def set_interest_rate(cls, value):
        if not isinstance(value, numbers.Real):
            raise ValueError('Interest rate must be a real number')
        if value < 0:
            raise ValueError('Interest rate cannot be negative.')
        cls._interest_rate = value
        
    def validate_and_set_name(self, property_name, value, field_title):
        if value is None or len(str(value).strip()) == 0:
            raise ValueError(f'{field_title} cannot be empty.')
        setattr(self, property_name, value)

In [9]:
Account.get_interest_rate()

0.5

In [10]:
Account.set_interest_rate(10)

In [11]:
Account.get_interest_rate()

10

In [12]:
try:
    Account.set_interest_rate(-10)
except ValueError as ex:
    print(ex)

Interest rate cannot be negative.


In [13]:
try:
    Account.set_interest_rate(1+1j)
except ValueError as ex:
    print(ex)

Interest rate must be a real number


##  Project 1: Setting up Transaction Codes

Our code so far:

In [1]:
import itertools
import numbers
from datetime import timedelta

class TimeZone:
    def __init__(self, name, offset_hours, offset_minutes):
        if name is None or len(str(name).strip()) == 0:
            raise ValueError('Timezone name cannot be empty.')
            
        self._name = str(name).strip()
        # technically we should check that offset is a
        if not isinstance(offset_hours, numbers.Integral):
            raise ValueError('Hour offset must be an integer.')
        
        if not isinstance(offset_minutes, numbers.Integral):
            raise ValueError('Minutes offset must be an integer.')
            
        if offset_minutes < -59 or offset_minutes > 59:
            raise ValueError('Minutes offset must between -59 and 59 (inclusive).')
            
        # for time delta sign of minutes will be set to sign of hours
        offset = timedelta(hours=offset_hours, minutes=offset_minutes)

        # offsets are technically bounded between -12:00 and 14:00
        # see: https://en.wikipedia.org/wiki/List_of_UTC_time_offsets
        if offset < timedelta(hours=-12, minutes=0) or offset > timedelta(hours=14, minutes=0):
            raise ValueError('Offset must be between -12:00 and +14:00.')
            
        self._offset_hours = offset_hours
        self._offset_minutes = offset_minutes
        self._offset = offset
        
    @property
    def offset(self):
        return self._offset
    
    @property
    def name(self):
        return self._name
    
    def __eq__(self, other):
        return (isinstance(other, TimeZone) and 
                self.name == other.name and 
                self._offset_hours == other._offset_hours and
                self._offset_minutes == other._offset_minutes)
    def __repr__(self):
        return (f"TimeZone(name='{self.name}', "
                f"offset_hours={self._offset_hours}, "
                f"offset_minutes={self._offset_minutes})")

In [2]:
class Account:
    transaction_counter = itertools.count(100)
    _interest_rate = 0.5  # percentage
    
    def __init__(self, account_number, first_name, last_name, 
                 timezone=None, initial_balance=0):
        # in practice we probably would want to add checks to 
        # make sure these values are valid / non-empty
        self._account_number = account_number
        self.first_name = first_name
        self.last_name = last_name
        
        if timezone is None:
            timezone = TimeZone('UTC', 0, 0)
        self.timezone = timezone
        
        self._balance = float(initial_balance)  # force use of floats here, but maybe Decimal would be better
        
    @property
    def account_number(self):
        return self._account_number
    
    @property 
    def first_name(self):
        return self._first_name
    
    @first_name.setter
    def first_name(self, value):
        self.validate_and_set_name('_first_name', value, 'First Name')
        
    @property
    def last_name(self):
        return self._last_name
    
    @last_name.setter
    def last_name(self, value):
        self.validate_and_set_name('_last_name', value, 'Last Name')
        
    # also going to create a full_name computed property, for ease of use
    @property
    def full_name(self):
        return f'{self.first_name} {self.last_name}'
        
    @property
    def timezone(self):
        return self._timezone
    
    @property
    def balance(self):
        return self._balance
    
    @timezone.setter
    def timezone(self, value):
        if not isinstance(value, TimeZone):
            raise ValueError('Time zone must be a valid TimeZone object.')
        self._timezone = value
          
    @classmethod
    def get_interest_rate(cls):
        return cls._interest_rate
    
    @classmethod
    def set_interest_rate(cls, value):
        if not isinstance(value, numbers.Real):
            raise ValueError('Interest rate must be a real number')
        if value < 0:
            raise ValueError('Interest rate cannot be negative.')
        cls._interest_rate = value
        
    def validate_and_set_name(self, property_name, value, field_title):
        if value is None or len(str(value).strip()) == 0:
            raise ValueError(f'{field_title} cannot be empty.')
        setattr(self, property_name, value)

Although we _could_ use hardcoded values for the `D`, `W`, `I`, and `X` transaction codes, I prefer to store them in a dictionary and lookup the code whenever I need to. That way, if we ever need to change those codes for some reason, we don't have to hunt them down in the code itself.

So for that, I'm going to use a "private" class attribute (dictionary), with the assumption that the keys will **not** change, but the associated values (codes) **can**. We could actually go on step further and define "constants" for the keys as well, but I don't think that's really necessary.

A better approach would be to use an **enumeration** type - but we're not there yet!

In [3]:
class Account:
    transaction_counter = itertools.count(100)
    _interest_rate = 0.5  # percentage
    
    _transaction_codes = {
        'deposit': 'D',
        'withdraw': 'W',
        'interest': 'I',
        'rejected': 'X'
    }
    
    def __init__(self, account_number, first_name, last_name, timezone=None, initial_balance=0):
        # in practice we probably would want to add checks to make sure these values are valid / non-empty
        self._account_number = account_number
        self.first_name = first_name
        self.last_name = last_name
        
        if timezone is None:
            timezone = TimeZone('UTC', 0, 0)
        self.timezone = timezone
        
        self._balance = float(initial_balance)  # force use of floats here, but maybe Decimal would be better
        
    @property
    def account_number(self):
        return self._account_number
    
    @property 
    def first_name(self):
        return self._first_name
    
    @first_name.setter
    def first_name(self, value):
        self.validate_and_set_name('_first_name', value, 'First Name')
        
    @property
    def last_name(self):
        return self._last_name
    
    @last_name.setter
    def last_name(self, value):
        self.validate_and_set_name('_last_name', value, 'Last Name')
        
    # also going to create a full_name computed property, for ease of use
    @property
    def full_name(self):
        return f'{self.first_name} {self.last_name}'
        
    @property
    def timezone(self):
        return self._timezone
    
    @property
    def balance(self):
        return self._balance
    
    @timezone.setter
    def timezone(self, value):
        if not isinstance(value, TimeZone):
            raise ValueError('Time zone must be a valid TimeZone object.')
        self._timezone = value
          
    @classmethod
    def get_interest_rate(cls):
        return cls._interest_rate
    
    @classmethod
    def set_interest_rate(cls, value):
        if not isinstance(value, numbers.Real):
            raise ValueError('Interest rate must be a real number')
        if value < 0:
            raise ValueError('Interest rate cannot be negative.')
        cls._interest_rate = value
        
    def validate_and_set_name(self, property_name, value, field_title):
        if value is None or len(str(value).strip()) == 0:
            raise ValueError(f'{field_title} cannot be empty.')
        setattr(self, property_name, value)

##  Project 1 - Implementing Confirmation Codes

Our code so far:

In [1]:
import itertools
import numbers
from datetime import timedelta

class TimeZone:
    def __init__(self, name, offset_hours, offset_minutes):
        if name is None or len(str(name).strip()) == 0:
            raise ValueError('Timezone name cannot be empty.')
            
        self._name = str(name).strip()
        # technically we should check that offset is a
        if not isinstance(offset_hours, numbers.Integral):
            raise ValueError('Hour offset must be an integer.')
        
        if not isinstance(offset_minutes, numbers.Integral):
            raise ValueError('Minutes offset must be an integer.')
            
        if offset_minutes < -59 or offset_minutes > 59:
            raise ValueError('Minutes offset must between -59 and 59 (inclusive).')
            
        # for time delta sign of minutes will be set to sign of hours
        offset = timedelta(hours=offset_hours, minutes=offset_minutes)

        # offsets are technically bounded between -12:00 and 14:00
        # see: https://en.wikipedia.org/wiki/List_of_UTC_time_offsets
        if offset < timedelta(hours=-12, minutes=0) or offset > timedelta(hours=14, minutes=0):
            raise ValueError('Offset must be between -12:00 and +14:00.')
            
        self._offset_hours = offset_hours
        self._offset_minutes = offset_minutes
        self._offset = offset
        
    @property
    def offset(self):
        return self._offset
    
    @property
    def name(self):
        return self._name
    
    def __eq__(self, other):
        return (isinstance(other, TimeZone) and 
                self.name == other.name and 
                self._offset_hours == other._offset_hours and
                self._offset_minutes == other._offset_minutes)
    def __repr__(self):
        return (f"TimeZone(name='{self.name}', "
                f"offset_hours={self._offset_hours}, "
                f"offset_minutes={self._offset_minutes})")

In [2]:
class Account:
    transaction_counter = itertools.count(100)
    _interest_rate = 0.5  # percentage
    
    _transaction_codes = {
        'deposit': 'D',
        'withdraw': 'W',
        'interest': 'I',
        'rejected': 'X'
    }
    
    def __init__(self, account_number, first_name, last_name, timezone=None, initial_balance=0):
        # in practice we probably would want to add checks to make sure these values are valid / non-empty
        self._account_number = account_number
        self.first_name = first_name
        self.last_name = last_name
        
        if timezone is None:
            timezone = TimeZone('UTC', 0, 0)
        self.timezone = timezone
        
        self._balance = float(initial_balance)  # force use of floats here, but maybe Decimal would be better
        
    @property
    def account_number(self):
        return self._account_number
    
    @property 
    def first_name(self):
        return self._first_name
    
    @first_name.setter
    def first_name(self, value):
        self.validate_and_set_name('_first_name', value, 'First Name')
        
    @property
    def last_name(self):
        return self._last_name
    
    @last_name.setter
    def last_name(self, value):
        self.validate_and_set_name('_last_name', value, 'Last Name')
        
    # also going to create a full_name computed property, for ease of use
    @property
    def full_name(self):
        return f'{self.first_name} {self.last_name}'
        
    @property
    def timezone(self):
        return self._timezone
    
    @property
    def balance(self):
        return self._balance
    
    @timezone.setter
    def timezone(self, value):
        if not isinstance(value, TimeZone):
            raise ValueError('Time zone must be a valid TimeZone object.')
        self._timezone = value
          
    @classmethod
    def get_interest_rate(cls):
        return cls._interest_rate
    
    @classmethod
    def set_interest_rate(cls, value):
        if not isinstance(value, numbers.Real):
            raise ValueError('Interest rate must be a real number')
        if value < 0:
            raise ValueError('Interest rate cannot be negative.')
        cls._interest_rate = value
        
    def validate_and_set_name(self, property_name, value, field_title):
        if value is None or len(str(value).strip()) == 0:
            raise ValueError(f'{field_title} cannot be empty.')
        setattr(self, property_name, value)

As I mentioned earlier the code should contain:
- the transaction code
- the account number
- the date / time in UTC of the transaction
- the transaction number

Something like:
```D-140568-20190115154500-124```

Let's first build this function in isolation, and then we'll add it to the class once we're happy with it:

In [3]:
from datetime import datetime

In [4]:
def generate_confirmation_code(account_number, transaction_id, transaction_code):
    # main difficulty here is to generate the current time in UTC using this formatting:
    # YYYYMMDDHHMMSS
    dt_str = datetime.utcnow().strftime('%Y%m%d%H%M%S')
    return f'{transaction_code}-{account_number}-{dt_str}-{transaction_id}'

In [5]:
generate_confirmation_code(123, 1000, 'X')

'X-123-20190602232855-1000'

Now let's incorporate that as an instance method in our class - we won't need to pass account_numberor transaction_id - we can get those ourselves from within the class:

In [6]:
class Account:
    transaction_counter = itertools.count(100)
    _interest_rate = 0.5  # percentage
    
    _transaction_codes = {
        'deposit': 'D',
        'withdraw': 'W',
        'interest': 'I',
        'rejected': 'X'
    }
    
    def __init__(self, account_number, first_name, last_name, timezone=None, initial_balance=0):
        # in practice we probably would want to add checks to make sure these values are valid / non-empty
        self._account_number = account_number
        self.first_name = first_name
        self.last_name = last_name
        
        if timezone is None:
            timezone = TimeZone('UTC', 0, 0)
        self.timezone = timezone
        
        self._balance = float(initial_balance)  # force use of floats here, but maybe Decimal would be better
        
    @property
    def account_number(self):
        return self._account_number
    
    @property 
    def first_name(self):
        return self._first_name
    
    @first_name.setter
    def first_name(self, value):
        self.validate_and_set_name('_first_name', value, 'First Name')
        
    @property
    def last_name(self):
        return self._last_name
    
    @last_name.setter
    def last_name(self, value):
        self.validate_and_set_name('_last_name', value, 'Last Name')
        
    # also going to create a full_name computed property, for ease of use
    @property
    def full_name(self):
        return f'{self.first_name} {self.last_name}'
        
    @property
    def timezone(self):
        return self._timezone
    
    @property
    def balance(self):
        return self._balance
    
    @timezone.setter
    def timezone(self, value):
        if not isinstance(value, TimeZone):
            raise ValueError('Time zone must be a valid TimeZone object.')
        self._timezone = value
          
    @classmethod
    def get_interest_rate(cls):
        return cls._interest_rate
    
    @classmethod
    def set_interest_rate(cls, value):
        if not isinstance(value, numbers.Real):
            raise ValueError('Interest rate must be a real number')
        if value < 0:
            raise ValueError('Interest rate cannot be negative.')
        cls._interest_rate = value
        
    def validate_and_set_name(self, property_name, value, field_title):
        if value is None or len(str(value).strip()) == 0:
            raise ValueError(f'{field_title} cannot be empty.')
        setattr(self, property_name, value)
        
    def generate_confirmation_code(self, transaction_code):
        # main difficulty here is to generate the current time in UTC using this formatting:
        # YYYYMMDDHHMMSS
        dt_str = datetime.utcnow().strftime('%Y%m%d%H%M%S')
        return f'{transaction_code}-{self.account_number}-{dt_str}-{next(Account.transaction_counter)}'

So we can test it out, let's write a dummy transaction method:

In [7]:
class Account:
    transaction_counter = itertools.count(100)
    _interest_rate = 0.5  # percentage
    
    _transaction_codes = {
        'deposit': 'D',
        'withdraw': 'W',
        'interest': 'I',
        'rejected': 'X'
    }
    
    def __init__(self, account_number, first_name, last_name, timezone=None, initial_balance=0):
        # in practice we probably would want to add checks to make sure these values are valid / non-empty
        self._account_number = account_number
        self.first_name = first_name
        self.last_name = last_name
        
        if timezone is None:
            timezone = TimeZone('UTC', 0, 0)
        self.timezone = timezone
        
        self._balance = float(initial_balance)  # force use of floats here, but maybe Decimal would be better
        
    @property
    def account_number(self):
        return self._account_number
    
    @property 
    def first_name(self):
        return self._first_name
    
    @first_name.setter
    def first_name(self, value):
        self.validate_and_set_name('_first_name', value, 'First Name')
        
    @property
    def last_name(self):
        return self._last_name
    
    @last_name.setter
    def last_name(self, value):
        self.validate_and_set_name('_last_name', value, 'Last Name')
        
    # also going to create a full_name computed property, for ease of use
    @property
    def full_name(self):
        return f'{self.first_name} {self.last_name}'
        
    @property
    def timezone(self):
        return self._timezone
    
    @property
    def balance(self):
        return self._balance
    
    @timezone.setter
    def timezone(self, value):
        if not isinstance(value, TimeZone):
            raise ValueError('Time zone must be a valid TimeZone object.')
        self._timezone = value
          
    @classmethod
    def get_interest_rate(cls):
        return cls._interest_rate
    
    @classmethod
    def set_interest_rate(cls, value):
        if not isinstance(value, numbers.Real):
            raise ValueError('Interest rate must be a real number')
        if value < 0:
            raise ValueError('Interest rate cannot be negative.')
        cls._interest_rate = value
        
    def validate_and_set_name(self, property_name, value, field_title):
        if value is None or len(str(value).strip()) == 0:
            raise ValueError(f'{field_title} cannot be empty.')
        setattr(self, property_name, value)
        
    def generate_confirmation_code(self, transaction_code):
        # main difficulty here is to generate the current time in UTC using this formatting:
        # YYYYMMDDHHMMSS
        dt_str = datetime.utcnow().strftime('%Y%m%d%H%M%S')
        return f'{transaction_code}-{self.account_number}-{dt_str}-{next(Account.transaction_counter)}'
    
    def make_transaction(self):
        return self.generate_confirmation_code('dummy')

In [8]:
a = Account('A100', 'John', 'Cleese', initial_balance=100)

In [9]:
a.make_transaction()

'dummy-A100-20190602232855-100'

In [10]:
a.make_transaction()

'dummy-A100-20190602232855-101'

#### Confirmation Parser

This one is actually going to be a static method. We don't really require the instance to reverse-engineer the confirmation code. We could, but then we probably would also want to check that the account number embedded in the conformation code is the same as the instance account number we are decoding from. There's not really a need for that.

The other thing is we want to have a neat dotted notation to recover the various properties of the partsed confirmation code.
Once again, before jumping into creating a utility class for that, we can really get away with a named tuple here instead!

Let's see what a confirmation code looks like:
```
dummy-A100-20190325224918-101
```

So, we can split things on a `-` symbol, and we'll have to parse the date time string into an actual date time (and we know confirmation codes use UTC dates)

In [11]:
from collections import namedtuple

Confirmation = namedtuple('Confirmation', 'account_number, transaction_code, transaction_id, time_utc, time')

In [12]:
class Account:
    transaction_counter = itertools.count(100)
    _interest_rate = 0.5  # percentage
    
    _transaction_codes = {
        'deposit': 'D',
        'withdraw': 'W',
        'interest': 'I',
        'rejected': 'X'
    }
    
    def __init__(self, account_number, first_name, last_name, timezone=None, initial_balance=0):
        # in practice we probably would want to add checks to make sure these values are valid / non-empty
        self._account_number = account_number
        self.first_name = first_name
        self.last_name = last_name
        
        if timezone is None:
            timezone = TimeZone('UTC', 0, 0)
        self.timezone = timezone
        
        self._balance = float(initial_balance)  # force use of floats here, but maybe Decimal would be better
        
    @property
    def account_number(self):
        return self._account_number
    
    @property 
    def first_name(self):
        return self._first_name
    
    @first_name.setter
    def first_name(self, value):
        self.validate_and_set_name('_first_name', value, 'First Name')
        
    @property
    def last_name(self):
        return self._last_name
    
    @last_name.setter
    def last_name(self, value):
        self.validate_and_set_name('_last_name', value, 'Last Name')
        
    # also going to create a full_name computed property, for ease of use
    @property
    def full_name(self):
        return f'{self.first_name} {self.last_name}'
        
    @property
    def timezone(self):
        return self._timezone
    
    @property
    def balance(self):
        return self._balance
    
    @timezone.setter
    def timezone(self, value):
        if not isinstance(value, TimeZone):
            raise ValueError('Time zone must be a valid TimeZone object.')
        self._timezone = value
          
    @classmethod
    def get_interest_rate(cls):
        return cls._interest_rate
    
    @classmethod
    def set_interest_rate(cls, value):
        if not isinstance(value, numbers.Real):
            raise ValueError('Interest rate must be a real number')
        if value < 0:
            raise ValueError('Interest rate cannot be negative.')
        cls._interest_rate = value
        
    def validate_and_set_name(self, property_name, value, field_title):
        if value is None or len(str(value).strip()) == 0:
            raise ValueError(f'{field_title} cannot be empty.')
        setattr(self, property_name, value)
        
    def generate_confirmation_code(self, transaction_code):
        # main difficulty here is to generate the current time in UTC using this formatting:
        # YYYYMMDDHHMMSS
        dt_str = datetime.utcnow().strftime('%Y%m%d%H%M%S')
        return f'{transaction_code}-{self.account_number}-{dt_str}-{next(Account.transaction_counter)}'
    
    @staticmethod
    def parse_confirmation_code(confirmation_code, preferred_time_zone=None):
        # dummy-A100-20190325224918-101
        parts = confirmation_code.split('-')
        if len(parts) != 4:
            # really simplistic validation here - would need something better
            raise ValueError('Invalid confirmation code')
        
        # unpack into separate variables
        transaction_code, account_number, raw_dt_utc, transaction_id = parts
        
        # need to convert raw_dt_utc into a proper datetime object
        try:
            dt_utc = datetime.strptime(raw_dt_utc, '%Y%m%d%H%M%S')
        except ValueError as ex:
            # again, probably need better error handling here
            raise ValueError('Invalid transaction datetime') from ex
          
        if preferred_time_zone is None:
            preferred_time_zone = TimeZone('UTC', 0, 0)
            
        if not isinstance(preferred_time_zone, TimeZone):
            raise ValueError('Invalid TimeZone specified.')
            
        dt_preferred = dt_utc + preferred_time_zone.offset
        dt_preferred_str = f"{dt_preferred.strftime('%Y-%m-%d %H:%M:%S')} ({preferred_time_zone.name})"
        
        return Confirmation(account_number, transaction_code, transaction_id, dt_utc.isoformat(), dt_preferred_str)
    
    def make_transaction(self):
        return self.generate_confirmation_code('dummy')

OK, so let's try this out, and make sure it's working (within reason, this will not catch all badly formatted confirmation codes by any means):

In [13]:
a = Account('A100', 'John', 'Cleese', initial_balance=100)
conf_code = a.make_transaction()
print(conf_code)

dummy-A100-20190602232855-100


In [14]:
Account.parse_confirmation_code(conf_code)

Confirmation(account_number='A100', transaction_code='dummy', transaction_id='100', time_utc='2019-06-02T23:28:55', time='2019-06-02 23:28:55 (UTC)')

In [15]:
Account.parse_confirmation_code(conf_code, TimeZone('MST', -7, 0))

Confirmation(account_number='A100', transaction_code='dummy', transaction_id='100', time_utc='2019-06-02T23:28:55', time='2019-06-02 16:28:55 (MST)')

In [16]:
try:
    Account.parse_confirmation_code('X-A100-asdasd-123')
except ValueError as ex:
    print(ex)

Invalid transaction datetime


##  Project 1 - Transactions

Our code so far:

In [1]:
import itertools
import numbers
from datetime import timedelta, datetime
from collections import namedtuple


class TimeZone:
    def __init__(self, name, offset_hours, offset_minutes):
        if name is None or len(str(name).strip()) == 0:
            raise ValueError('Timezone name cannot be empty.')
            
        self._name = str(name).strip()
        # technically we should check that offset is a
        if not isinstance(offset_hours, numbers.Integral):
            raise ValueError('Hour offset must be an integer.')
        
        if not isinstance(offset_minutes, numbers.Integral):
            raise ValueError('Minutes offset must be an integer.')
            
        if offset_minutes < -59 or offset_minutes > 59:
            raise ValueError('Minutes offset must between -59 and 59 (inclusive).')
            
        # for time delta sign of minutes will be set to sign of hours
        offset = timedelta(hours=offset_hours, minutes=offset_minutes)

        # offsets are technically bounded between -12:00 and 14:00
        # see: https://en.wikipedia.org/wiki/List_of_UTC_time_offsets
        if offset < timedelta(hours=-12, minutes=0) or offset > timedelta(hours=14, minutes=0):
            raise ValueError('Offset must be between -12:00 and +14:00.')
            
        self._offset_hours = offset_hours
        self._offset_minutes = offset_minutes
        self._offset = offset
        
    @property
    def offset(self):
        return self._offset
    
    @property
    def name(self):
        return self._name
    
    def __eq__(self, other):
        return (isinstance(other, TimeZone) and 
                self.name == other.name and 
                self._offset_hours == other._offset_hours and
                self._offset_minutes == other._offset_minutes)
    def __repr__(self):
        return (f"TimeZone(name='{self.name}', "
                f"offset_hours={self._offset_hours}, "
                f"offset_minutes={self._offset_minutes})")

In [2]:
class Account:
    transaction_counter = itertools.count(100)
    _interest_rate = 0.5  # percentage
    
    _transaction_codes = {
        'deposit': 'D',
        'withdraw': 'W',
        'interest': 'I',
        'rejected': 'X'
    }
    
    def __init__(self, account_number, first_name, last_name, timezone=None, initial_balance=0):
        # in practice we probably would want to add checks to make sure these values are valid / non-empty
        self._account_number = account_number
        self.first_name = first_name
        self.last_name = last_name
        
        if timezone is None:
            timezone = TimeZone('UTC', 0, 0)
        self.timezone = timezone
        
        self._balance = float(initial_balance)  # force use of floats here, but maybe Decimal would be better
        
    @property
    def account_number(self):
        return self._account_number
    
    @property 
    def first_name(self):
        return self._first_name
    
    @first_name.setter
    def first_name(self, value):
        self.validate_and_set_name('_first_name', value, 'First Name')
        
    @property
    def last_name(self):
        return self._last_name
    
    @last_name.setter
    def last_name(self, value):
        self.validate_and_set_name('_last_name', value, 'Last Name')
        
    # also going to create a full_name computed property, for ease of use
    @property
    def full_name(self):
        return f'{self.first_name} {self.last_name}'
        
    @property
    def timezone(self):
        return self._timezone
    
    @property
    def balance(self):
        return self._balance
    
    @timezone.setter
    def timezone(self, value):
        if not isinstance(value, TimeZone):
            raise ValueError('Time zone must be a valid TimeZone object.')
        self._timezone = value
          
    @classmethod
    def get_interest_rate(cls):
        return cls._interest_rate
    
    @classmethod
    def set_interest_rate(cls, value):
        if not isinstance(value, numbers.Real):
            raise ValueError('Interest rate must be a real number')
        if value < 0:
            raise ValueError('Interest rate cannot be negative.')
        cls._interest_rate = value
        
    def validate_and_set_name(self, property_name, value, field_title):
        if value is None or len(str(value).strip()) == 0:
            raise ValueError(f'{field_title} cannot be empty.')
        setattr(self, property_name, value)
        
    def generate_confirmation_code(self, transaction_code):
        # main difficulty here is to generate the current time in UTC using this formatting:
        # YYYYMMDDHHMMSS
        dt_str = datetime.utcnow().strftime('%Y%m%d%H%M%S')
        return f'{transaction_code}-{self.account_number}-{dt_str}-{next(Account.transaction_counter)}'
    
    @staticmethod
    def parse_confirmation_code(confirmation_code, preferred_time_zone=None):
        # dummy-A100-20190325224918-101
        parts = confirmation_code.split('-')
        if len(parts) != 4:
            # really simplistic validation here - would need something better
            raise ValueError('Invalid confirmation code')
        
        # unpack into separate variables
        transaction_code, account_number, raw_dt_utc, transaction_id = parts
        
        # need to convert raw_dt_utc into a proper datetime object
        try:
            dt_utc = datetime.strptime(raw_dt_utc, '%Y%m%d%H%M%S')
        except ValueError as ex:
            # again, probably need better error handling here
            raise ValueError('Invalid transaction datetime') from ex
          
        if preferred_time_zone is None:
            preferred_time_zone = TimeZone('UTC', 0, 0)
            
        if not isinstance(preferred_time_zone, TimeZone):
            raise ValueError('Invalid TimeZone specified.')
            
        dt_preferred = dt_utc + preferred_time_zone.offset
        dt_preferred_str = f"{dt_preferred.strftime('%Y-%m-%d %H:%M:%S')} ({preferred_time_zone.name})"
        
        return Confirmation(account_number, transaction_code, transaction_id, dt_utc.isoformat(), dt_preferred_str)
    
    def make_transaction(self):
        return self.generate_confirmation_code('dummy')

Now it's time to add in support for the various transaction methods:

In [3]:
class Account:
    transaction_counter = itertools.count(100)
    _interest_rate = 0.5  # percentage
    
    _transaction_codes = {
        'deposit': 'D',
        'withdraw': 'W',
        'interest': 'I',
        'rejected': 'X'
    }
    
    def __init__(self, account_number, first_name, last_name, timezone=None, initial_balance=0):
        # in practice we probably would want to add checks to make sure these values are valid / non-empty
        self._account_number = account_number
        self.first_name = first_name
        self.last_name = last_name
        
        if timezone is None:
            timezone = TimeZone('UTC', 0, 0)
        self.timezone = timezone
        
        self._balance = float(initial_balance)  # force use of floats here, but maybe Decimal would be better
        
    @property
    def account_number(self):
        return self._account_number
    
    @property 
    def first_name(self):
        return self._first_name
    
    @first_name.setter
    def first_name(self, value):
        self.validate_and_set_name('_first_name', value, 'First Name')
        
    @property
    def last_name(self):
        return self._last_name
    
    @last_name.setter
    def last_name(self, value):
        self.validate_and_set_name('_last_name', value, 'Last Name')
        
    # also going to create a full_name computed property, for ease of use
    @property
    def full_name(self):
        return f'{self.first_name} {self.last_name}'
        
    @property
    def timezone(self):
        return self._timezone
    
    @property
    def balance(self):
        return self._balance
    
    @timezone.setter
    def timezone(self, value):
        if not isinstance(value, TimeZone):
            raise ValueError('Time zone must be a valid TimeZone object.')
        self._timezone = value
          
    @classmethod
    def get_interest_rate(cls):
        return cls._interest_rate
    
    @classmethod
    def set_interest_rate(cls, value):
        if not isinstance(value, numbers.Real):
            raise ValueError('Interest rate must be a real number')
        if value < 0:
            raise ValueError('Interest rate cannot be negative.')
        cls._interest_rate = value
        
    def validate_and_set_name(self, property_name, value, field_title):
        if value is None or len(str(value).strip()) == 0:
            raise ValueError(f'{field_title} cannot be empty.')
        setattr(self, property_name, value)
        
    def generate_confirmation_code(self, transaction_code):
        # main difficulty here is to generate the current time in UTC using this formatting:
        # YYYYMMDDHHMMSS
        dt_str = datetime.utcnow().strftime('%Y%m%d%H%M%S')
        return f'{transaction_code}-{self.account_number}-{dt_str}-{next(Account.transaction_counter)}'
    
    @staticmethod
    def parse_confirmation_code(confirmation_code, preferred_time_zone=None):
        # dummy-A100-20190325224918-101
        parts = confirmation_code.split('-')
        if len(parts) != 4:
            # really simplistic validation here - would need something better
            raise ValueError('Invalid confirmation code')
        
        # unpack into separate variables
        transaction_code, account_number, raw_dt_utc, transaction_id = parts
        
        # need to convert raw_dt_utc into a proper datetime object
        try:
            dt_utc = datetime.strptime(raw_dt_utc, '%Y%m%d%H%M%S')
        except ValueError as ex:
            # again, probably need better error handling here
            raise ValueError('Invalid transaction datetime') from ex
          
        if preferred_time_zone is None:
            preferred_time_zone = TimeZone('UTC', 0, 0)
            
        if not isinstance(preferred_time_zone, TimeZone):
            raise ValueError('Invalid TimeZone specified.')
            
        dt_preferred = dt_utc + preferred_time_zone.offset
        dt_preferred_str = f"{dt_preferred.strftime('%Y-%m-%d %H:%M:%S')} ({preferred_time_zone.name})"
        
        return Confirmation(account_number, transaction_code, transaction_id, dt_utc.isoformat(), dt_preferred_str)
    
    def deposit(self, value):
        if not isinstance(value, numbers.Real):
            raise ValueError('Deposit value must be a real number.')
        if value <= 0:
            raise ValueError('Deposit value must be a positive number.')
        
        # get transaction code
        transaction_code = Account._transaction_codes['deposit']
        
        # generate a confirmation code
        conf_code = self.generate_confirmation_code(transaction_code)
        
        # make deposit and return conf code
        self._balance += value
        return conf_code
    
    def withdraw(self, value):
        # hmmm... repetitive code! we'll need to fix this
        # TODO: refactor a function to validate a valid positive number
        #       and use in __init__, deposit and 
        
        accepted = False
        if self.balance - value < 0:
            # insufficient funds - we'll reject this transaction
            transaction_code = Account._transaction_codes['rejected']
        else:
            transaction_code = Account._transaction_codes['withdraw']
            accepted = True
            
        conf_code = self.generate_confirmation_code(transaction_code)
        
        # Doing this here in case there's a problem generating a confirmation code
        # - do not want to modify the balance if we cannot generate a transaction code successfully
        if accepted:
            self._balance -= value
            
        return conf_code
    
    def pay_interestt(self):
        interest = self.balance * Account.get_interest_rate() / 100
        conf_code = self.generate_confirmation_code(Account._transaction_codes['interest'])
        self._balance += interest
        return conf_code

In [4]:
a = Account('A100', 'Eric', 'Idle', TimeZone('MST', -7, 0), 100.0)

In [5]:
a.balance

100.0

In [6]:
a.deposit(100)

'D-A100-20190602232906-100'

In [7]:
a.balance

200.0

In [8]:
a.withdraw(100)

'W-A100-20190602232907-101'

In [9]:
a.balance

100.0

In [10]:
a.withdraw(1000)

'X-A100-20190602232907-102'

In [11]:
a.balance

100.0

#### Refactoring

So now, let's refactor some code. We have to do that real positive number validation. We'll deal with it in the same we dealt with the name validations, except we won't be storing anything - this is going to be a good candidate for a static method.

While we're at it, we'll modify our `__init__` method as well to validate the balance. Remember I had stated in the project description that balances should never be negative. So, I'm going to introduce a subtle bug here where the initialization will validate that the initial balance is a real number, but will not validate that it be non-negative. We'll "catch" that bug later.

In [12]:
class Account:
    transaction_counter = itertools.count(100)
    _interest_rate = 0.5  # percentage
    
    _transaction_codes = {
        'deposit': 'D',
        'withdraw': 'W',
        'interest': 'I',
        'rejected': 'X'
    }
    
    def __init__(self, account_number, first_name, last_name, timezone=None, initial_balance=0):
        # in practice we probably would want to add checks to make sure these values are valid / non-empty
        self._account_number = account_number
        self.first_name = first_name
        self.last_name = last_name
        
        if timezone is None:
            timezone = TimeZone('UTC', 0, 0)
        self.timezone = timezone
        
        self._balance = Account.validate_real_number(initial_balance)
        
    @property
    def account_number(self):
        return self._account_number
    
    @property 
    def first_name(self):
        return self._first_name
    
    @first_name.setter
    def first_name(self, value):
        self.validate_and_set_name('_first_name', value, 'First Name')
        
    @property
    def last_name(self):
        return self._last_name
    
    @last_name.setter
    def last_name(self, value):
        self.validate_and_set_name('_last_name', value, 'Last Name')
        
    # also going to create a full_name computed property, for ease of use
    @property
    def full_name(self):
        return f'{self.first_name} {self.last_name}'
        
    @property
    def timezone(self):
        return self._timezone
    
    @property
    def balance(self):
        return self._balance
    
    @timezone.setter
    def timezone(self, value):
        if not isinstance(value, TimeZone):
            raise ValueError('Time zone must be a valid TimeZone object.')
        self._timezone = value
          
    @classmethod
    def get_interest_rate(cls):
        return cls._interest_rate
    
    @classmethod
    def set_interest_rate(cls, value):
        if not isinstance(value, numbers.Real):
            raise ValueError('Interest rate must be a real number')
        if value < 0:
            raise ValueError('Interest rate cannot be negative.')
        cls._interest_rate = value
        
    def validate_and_set_name(self, property_name, value, field_title):
        if value is None or len(str(value).strip()) == 0:
            raise ValueError(f'{field_title} cannot be empty.')
        setattr(self, property_name, value)
        
    @staticmethod
    def validate_real_number(value, min_value=None):
        if not isinstance(value, numbers.Real):
            raise ValueError('Value must be a real number.')
            
        if min_value is not None and value < min_value:
            raise ValueError(f'Value must be at least {min_value}')
            
        # validation passed, return valid value
        return value
    
    def generate_confirmation_code(self, transaction_code):
        # main difficulty here is to generate the current time in UTC using this formatting:
        # YYYYMMDDHHMMSS
        dt_str = datetime.utcnow().strftime('%Y%m%d%H%M%S')
        return f'{transaction_code}-{self.account_number}-{dt_str}-{next(Account.transaction_counter)}'
    
    @staticmethod
    def parse_confirmation_code(confirmation_code, preferred_time_zone=None):
        # dummy-A100-20190325224918-101
        parts = confirmation_code.split('-')
        if len(parts) != 4:
            # really simplistic validation here - would need something better
            raise ValueError('Invalid confirmation code')
        
        # unpack into separate variables
        transaction_code, account_number, raw_dt_utc, transaction_id = parts
        
        # need to convert raw_dt_utc into a proper datetime object
        try:
            dt_utc = datetime.strptime(raw_dt_utc, '%Y%m%d%H%M%S')
        except ValueError as ex:
            # again, probably need better error handling here
            raise ValueError('Invalid transaction datetime') from ex
          
        if preferred_time_zone is None:
            preferred_time_zone = TimeZone('UTC', 0, 0)
            
        if not isinstance(preferred_time_zone, TimeZone):
            raise ValueError('Invalid TimeZone specified.')
            
        dt_preferred = dt_utc + preferred_time_zone.offset
        dt_preferred_str = f"{dt_preferred.strftime('%Y-%m-%d %H:%M:%S')} ({preferred_time_zone.name})"
        
        return Confirmation(account_number, transaction_code, transaction_id, dt_utc.isoformat(), dt_preferred_str)
    
    def deposit(self, value):
        value = Account.validate_real_number(value, min_value=0.01)
       
        # get transaction code
        transaction_code = Account._transaction_codes['deposit']
        
        # generate a confirmation code
        conf_code = self.generate_confirmation_code(transaction_code)
        
        # make deposit and return conf code
        self._balance += value
        return conf_code
    
    def withdraw(self, value):
        value = Account.validate_real_number(value, min_value=0.01)
        accepted = False
        if self.balance - value < 0:
            # insufficient funds - we'll reject this transaction
            transaction_code = Account._transaction_codes['rejected']
        else:
            transaction_code = Account._transaction_codes['withdraw']
            accepted = True
            
        conf_code = self.generate_confirmation_code(transaction_code)
        
        # Doing this here in case there's a problem generating a confirmation code
        # - do not want to modify the balance if we cannot generate a transaction code successfully
        if accepted:
            self._balance -= value
            
        return conf_code
    
    def pay_interest(self):
        interest = self.balance * Account.get_interest_rate() / 100
        conf_code = self.generate_confirmation_code(self._transaction_codes['interest'])
        self._balance += interest
        return conf_code

We should now be seeing exceptions when using non-real or negative numbers

In [13]:
a = Account('A100', 'Eric', 'Idle', initial_balance=100)

In [14]:
try:
    a.deposit(-100)
except ValueError as ex:
    print(ex)

Value must be at least 0.01


In [15]:
try:
    a.withdraw("100")
except ValueError as ex:
    print(ex)

Value must be a real number.


So this is where I am going to leave this implementation. In reality, there are a lof of other things we should consider such as additional validations, defining equality for two `Account` instances, and so on.

Also, one major piece we have not really done is any kind of testing! We manually tested various pieces as we were coding, but there are multiple problems with this:
- the tests are not easily repeatable - every time we change code in our `Account` class we **really** need to re-run all the tests to make sure everything still works as expected and we have not broken anything.
- our tests were woefully inadequate - we tested just a few cases, and almost no edge cases.

For serious production work, we need to test at least at the following levels:
- unit testing - where we test each piece of code in isolation and make sure all code branches are tested, edge cases are tested, etc (coverage)
- integration testing - where we test that multiple parts of our code works together correctly. For example, testing that multiple account creation, deposits, interest deposits, withdrawals, confirmation codes, etc work properly as a whole.

##  Project 1 - Unit Testing

As you can imagine to fully test this code is going to take quite a bit of effort. I'm not going to do this here, so I very well may have bugs in my code. 

In practice, a whole suite of tests should be written to ensure this code is working properly.

If you're thinking this will be as much work as writing the code itself - well... It will probably be **more** work.

In my experience, designing and implementing unit tests takes longer than writing the code in the first place!

Here's a manual approach to a bit of integration testing:

First, our code:

In [1]:
import itertools
import numbers
from datetime import timedelta, datetime
from collections import namedtuple


class TimeZone:
    def __init__(self, name, offset_hours, offset_minutes):
        if name is None or len(str(name).strip()) == 0:
            raise ValueError('Timezone name cannot be empty.')
            
        self._name = str(name).strip()
        # technically we should check that offset is a
        if not isinstance(offset_hours, numbers.Integral):
            raise ValueError('Hour offset must be an integer.')
        
        if not isinstance(offset_minutes, numbers.Integral):
            raise ValueError('Minutes offset must be an integer.')
            
        if offset_minutes < -59 or offset_minutes > 59:
            raise ValueError('Minutes offset must between -59 and 59 (inclusive).')
            
        # for time delta sign of minutes will be set to sign of hours
        offset = timedelta(hours=offset_hours, minutes=offset_minutes)

        # offsets are technically bounded between -12:00 and 14:00
        # see: https://en.wikipedia.org/wiki/List_of_UTC_time_offsets
        if offset < timedelta(hours=-12, minutes=0) or offset > timedelta(hours=14, minutes=0):
            raise ValueError('Offset must be between -12:00 and +14:00.')
            
        self._offset_hours = offset_hours
        self._offset_minutes = offset_minutes
        self._offset = offset
        
    @property
    def offset(self):
        return self._offset
    
    @property
    def name(self):
        return self._name
    
    def __eq__(self, other):
        return (isinstance(other, TimeZone) and 
                self.name == other.name and 
                self._offset_hours == other._offset_hours and
                self._offset_minutes == other._offset_minutes)
    def __repr__(self):
        return (f"TimeZone(name='{self.name}', "
                f"offset_hours={self._offset_hours}, "
                f"offset_minutes={self._offset_minutes})")

In [2]:
class Account:
    transaction_counter = itertools.count(100)
    _interest_rate = 0.5  # percentage
    
    _transaction_codes = {
        'deposit': 'D',
        'withdraw': 'W',
        'interest': 'I',
        'rejected': 'X'
    }
    
    def __init__(self, account_number, first_name, last_name, timezone=None, initial_balance=0):
        # in practice we probably would want to add checks to make sure these values are valid / non-empty
        self._account_number = account_number
        self.first_name = first_name
        self.last_name = last_name
        
        if timezone is None:
            timezone = TimeZone('UTC', 0, 0)
        self.timezone = timezone
        
        self._balance = Account.validate_real_number(initial_balance)
        
    @property
    def account_number(self):
        return self._account_number
    
    @property 
    def first_name(self):
        return self._first_name
    
    @first_name.setter
    def first_name(self, value):
        self.validate_and_set_name('_first_name', value, 'First Name')
        
    @property
    def last_name(self):
        return self._last_name
    
    @last_name.setter
    def last_name(self, value):
        self.validate_and_set_name('_last_name', value, 'Last Name')
        
    # also going to create a full_name computed property, for ease of use
    @property
    def full_name(self):
        return f'{self.first_name} {self.last_name}'
        
    @property
    def timezone(self):
        return self._timezone
    
    @property
    def balance(self):
        return self._balance
    
    @timezone.setter
    def timezone(self, value):
        if not isinstance(value, TimeZone):
            raise ValueError('Time zone must be a valid TimeZone object.')
        self._timezone = value
          
    @classmethod
    def get_interest_rate(cls):
        return cls._interest_rate
    
    @classmethod
    def set_interest_rate(cls, value):
        if not isinstance(value, numbers.Real):
            raise ValueError('Interest rate must be a real number')
        if value < 0:
            raise ValueError('Interest rate cannot be negative.')
        cls._interest_rate = value
        
    def validate_and_set_name(self, property_name, value, field_title):
        if value is None or len(str(value).strip()) == 0:
            raise ValueError(f'{field_title} cannot be empty.')
        setattr(self, property_name, value)
        
    @staticmethod
    def validate_real_number(value, min_value=None):
        if not isinstance(value, numbers.Real):
            raise ValueError('Value must be a real number.')
            
        if min_value is not None and value < min_value:
            raise ValueError(f'Value must be at least {min_value}')
            
        # validation passed, return valid value
        return value
    
    def generate_confirmation_code(self, transaction_code):
        # main difficulty here is to generate the current time in UTC using this formatting:
        # YYYYMMDDHHMMSS
        dt_str = datetime.utcnow().strftime('%Y%m%d%H%M%S')
        return f'{transaction_code}-{self.account_number}-{dt_str}-{next(Account.transaction_counter)}'
    
    @staticmethod
    def parse_confirmation_code(confirmation_code, preferred_time_zone=None):
        # dummy-A100-20190325224918-101
        parts = confirmation_code.split('-')
        if len(parts) != 4:
            # really simplistic validation here - would need something better
            raise ValueError('Invalid confirmation code')
        
        # unpack into separate variables
        transaction_code, account_number, raw_dt_utc, transaction_id = parts
        
        # need to convert raw_dt_utc into a proper datetime object
        try:
            dt_utc = datetime.strptime(raw_dt_utc, '%Y%m%d%H%M%S')
        except ValueError as ex:
            # again, probably need better error handling here
            raise ValueError('Invalid transaction datetime') from ex
          
        if preferred_time_zone is None:
            preferred_time_zone = TimeZone('UTC', 0, 0)
            
        if not isinstance(preferred_time_zone, TimeZone):
            raise ValueError('Invalid TimeZone specified.')
            
        dt_preferred = dt_utc + preferred_time_zone.offset
        dt_preferred_str = f"{dt_preferred.strftime('%Y-%m-%d %H:%M:%S')} ({preferred_time_zone.name})"
        
        return Confirmation(account_number, transaction_code, transaction_id, dt_utc.isoformat(), dt_preferred_str)
    
    def deposit(self, value):
        value = Account.validate_real_number(value, min_value=0.01)
       
        # get transaction code
        transaction_code = Account._transaction_codes['deposit']
        
        # generate a confirmation code
        conf_code = self.generate_confirmation_code(transaction_code)
        
        # make deposit and return conf code
        self._balance += value
        return conf_code
    
    def withdraw(self, value):
        value = Account.validate_real_number(value, min_value=0.01)
        accepted = False
        if self.balance - value < 0:
            # insufficient funds - we'll reject this transaction
            transaction_code = Account._transaction_codes['rejected']
        else:
            transaction_code = Account._transaction_codes['withdraw']
            accepted = True
            
        conf_code = self.generate_confirmation_code(transaction_code)
        
        # Doing this here in case there's a problem generating a confirmation code
        # - do not want to modify the balance if we cannot generate a transaction code successfully
        if accepted:
            self._balance -= value
            
        return conf_code
    
    def pay_interest(self):
        interest = self.balance * Account.get_interest_rate() / 100
        conf_code = self.generate_confirmation_code(Account._transaction_codes['interest'])
        self._balance += interest
        return conf_code

In [3]:
a = Account('A100', 'Eric', 'Idle', timezone=TimeZone('MST', -7, 0), initial_balance=100)
print(a.balance)
print(a.deposit(150.02))
print(a.balance)
print(a.withdraw(0.02))
print(a.balance)
Account.set_interest_rate(1.0)
print(a.get_interest_rate())
print(a.pay_interest())
print(a.balance)
print(a.withdraw(1000))

100
D-A100-20190602232958-100
250.02
W-A100-20190602232958-101
250.0
1.0
I-A100-20190602232958-102
252.5
X-A100-20190602232958-103


OK, so that works, but of course we really need to test things a whole lot more, including various scenarios (like withdrawing with insufficient funds, and so on). Also our tests really need to be easily repeatable so we can re-run our tests every time we make a code change.

For that I'm going to introduce you to the `unitest` framework in Python.

In production environments, a 3rd party library (that leverages `unittest`) is used that extends the base `unittest` framework. But for now, `unittest` will work just fine for us.

Normally, unit tests are invoked from the command line, which in turn sets up a test runner and seamlessly runs our tests. In this case, so I can stay within a Jupyter Notebook I'll have to add a few extra chunks of code to set up a test runner manually - this is usually not necessary.

In [4]:
import unittest

In [5]:
def run_tests(test_class):
    suite = unittest.TestLoader().loadTestsFromTestCase(test_class)
    runner = unittest.TextTestRunner(verbosity=2)
    result = runner.run(suite)

Let's see how some simple tests are setup and executed:

In [6]:
class TestAccount(unittest.TestCase):
    def test_ok(self):
        self.assertEqual(1, 1)

In [7]:
run_tests(TestAccount)

test_ok (__main__.TestAccount) ... ok

----------------------------------------------------------------------
Ran 1 test in 0.001s

OK


In cases where the a test fails:

In [8]:
class TestAccount(unittest.TestCase):
    def test_ok(self):
        self.assertEqual(1, 0)

In [9]:
run_tests(TestAccount)

test_ok (__main__.TestAccount) ... FAIL

FAIL: test_ok (__main__.TestAccount)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "<ipython-input-8-b9c1ce72eb43>", line 3, in test_ok
    self.assertEqual(1, 0)
AssertionError: 1 != 0

----------------------------------------------------------------------
Ran 1 test in 0.001s

FAILED (failures=1)


So now we can write some simple unit tests. The only thing is that each unit test should be a function in the class that starts with the word `test` - that way it is automatically identified as a unit test.

We also have the option of defining setup and tear down functionality - these are just methods that will be executed before **each** test method, and right **after**. - here's a simple example that shows how that works:

In [10]:
class TestAccount(unittest.TestCase):
    def setUp(self):
        print('Running setup...')
        self.account_number = 'A100'
        
    def tearDown(self):
        print('Running tear down...')
        
    def test_1(self):
        self.account_number = 'A200'
        self.assertTrue('A200', self.account_number)
        
    def test_2(self):
        self.assertTrue('A100', self.account_number)

In [11]:
run_tests(TestAccount)

test_1 (__main__.TestAccount) ... ok
test_2 (__main__.TestAccount) ... 

Running setup...
Running tear down...
Running setup...
Running tear down...


ok

----------------------------------------------------------------------
Ran 2 tests in 0.001s

OK


Even, if the test fails, the tear down method will still run:

In [12]:
class TestAccount(unittest.TestCase):
    def setUp(self):
        print('Running setup...')
        
    def tearDown(self):
        print('Running tear down...')
        
    def testOK(self):
        self.assertTrue(False)

In [13]:
run_tests(TestAccount)

testOK (__main__.TestAccount) ... 

Running setup...
Running tear down...


FAIL

FAIL: testOK (__main__.TestAccount)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "<ipython-input-12-12820de134fd>", line 9, in testOK
    self.assertTrue(False)
AssertionError: False is not true

----------------------------------------------------------------------
Ran 1 test in 0.001s

FAILED (failures=1)


So we could use `setUp` to maybe create some bank accounts that we can use throughout our tests. Remember that `TestAccount` is a class, so we can create instance attributes in the `setUp` method, and access them in any of the instance methods (like the test methods).

Another thing to watch out for is that there is no guarantee of the order in which the unit tests are run. Best practice is that unit tests should be independent of each other.

Let's add some simple unit tests for the TimeZone class first:

In [14]:
class TestAccount(unittest.TestCase):
   
    def test_create_timezone(self):
        tz = TimeZone('ABC', -1, -30)
        self.assertEqual('ABC', tz.name)
        self.assertEqual(timedelta(hours=-1, minutes=-30), tz.offset)
        
    def test_timezones_equal(self):
        tz1 = TimeZone('ABC', -1, -30)
        tz2 = TimeZone('ABC', -1, -30)
        self.assertEqual(tz1, tz2)
        
    def test_timezones_not_equal(self):
        tz = TimeZone('ABC', -1, -30)
        
        test_timezones = (
            TimeZone('DEF', -1, -30),
            TimeZone('ABC', -1, 0),
            TimeZone('ABC', 1, -30)
        )
        for test_tz in test_timezones:
            self.assertNotEqual(tz, test_tz)

In [15]:
run_tests(TestAccount)

test_create_timezone (__main__.TestAccount) ... ok
test_timezones_equal (__main__.TestAccount) ... ok
test_timezones_not_equal (__main__.TestAccount) ... ok

----------------------------------------------------------------------
Ran 3 tests in 0.002s

OK


Notice how we needed to run multiple scenarios for testing non-equal time zones. This is a fairly common occurrence, and there's a better way to set this up so we actually have separate tests, that are distinguishable from each other (it's slightly easier when using pytest, but the end result is similar):

In [16]:
class TestAccount(unittest.TestCase):
    
    def test_create_timezone(self):
        tz = TimeZone('ABC', -1, -30)
        self.assertEqual('ABC', tz.name)
        self.assertEqual(timedelta(hours=-1, minutes=-30), tz.offset)
        
    def test_timezones_equal(self):
        tz1 = TimeZone('ABC', -1, -30)
        tz2 = TimeZone('ABC', -1, -30)
        self.assertEqual(tz1, tz2)
        
    def test_timezones_not_equal(self):
        tz = TimeZone('ABC', -1, -30)
        
        test_timezones = (
            TimeZone('DEF', -1, -30),
            TimeZone('ABC', -1, 0),
            TimeZone('ABC', 1, -30)
        )
        for i, test_tz in enumerate(test_timezones):
            with self.subTest(test_number=i):
                self.assertNotEqual(tz, test_tz)

In [17]:
run_tests(TestAccount)

test_create_timezone (__main__.TestAccount) ... ok
test_timezones_equal (__main__.TestAccount) ... ok
test_timezones_not_equal (__main__.TestAccount) ... ok

----------------------------------------------------------------------
Ran 3 tests in 0.002s

OK


Where this might be handy is in the case of a test failure:

In [18]:
class TestAccount(unittest.TestCase):
    
    def test_create_timezone(self):
        tz = TimeZone('ABC', -1, -30)
        self.assertEqual('ABC', tz.name)
        self.assertEqual(timedelta(hours=-1, minutes=-30), tz.offset)
        
    def test_timezones_equal(self):
        tz1 = TimeZone('ABC', -1, -30)
        tz2 = TimeZone('ABC', -1, -30)
        self.assertEqual(tz1, tz2)
        
    def test_timezones_not_equal(self):
        tz = TimeZone('ABC', -1, -30)
        
        test_timezones = (
            TimeZone('DEF', -1, -30),
            TimeZone('ABC', -1, 0),
            TimeZone('ABC', 1, -30),
            TimeZone('ABC', -1, -30)
        )
        for i, test_tz in enumerate(test_timezones):
            with self.subTest(test_number=i):
                self.assertNotEqual(tz, test_tz)

In [19]:
run_tests(TestAccount)

test_create_timezone (__main__.TestAccount) ... ok
test_timezones_equal (__main__.TestAccount) ... ok
test_timezones_not_equal (__main__.TestAccount) ... 
FAIL: test_timezones_not_equal (__main__.TestAccount) (test_number=3)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "<ipython-input-18-72e5ac56299a>", line 24, in test_timezones_not_equal
    self.assertNotEqual(tz, test_tz)
AssertionError: TimeZone(name='ABC', offset_hours=-1, offset_minutes=-30) == TimeZone(name='ABC', offset_hours=-1, offset_minutes=-30)

----------------------------------------------------------------------
Ran 3 tests in 0.002s

FAILED (failures=1)


As you can see we have a message associated with the failed test.

Let's go back and remove that incorrect test:

In [20]:
class TestAccount(unittest.TestCase):
    
    def test_create_timezone(self):
        tz = TimeZone('ABC', -1, -30)
        self.assertEqual('ABC', tz.name)
        self.assertEqual(timedelta(hours=-1, minutes=-30), tz.offset)
        
    def test_timezones_equal(self):
        tz1 = TimeZone('ABC', -1, -30)
        tz2 = TimeZone('ABC', -1, -30)
        self.assertEqual(tz1, tz2)
        
    def test_timezones_not_equal(self):
        tz = TimeZone('ABC', -1, -30)
        
        test_timezones = (
            TimeZone('DEF', -1, -30),
            TimeZone('ABC', -1, 0),
            TimeZone('ABC', 1, -30)
        )
        for i, test_tz in enumerate(test_timezones):
            with self.subTest(test_number=i):
                self.assertNotEqual(tz, test_tz)

Now we can start adding additional unit tests for our Account class.

Remember that unit tests are meant to test one specific piece of functionality - don't try to group too much in your tests, as otherwise the error messages can become less meaningful, making harder to track down the actual problem.

A recommended practice is either to set up unit tests **before** you write your code, or soon after. Here I left writing unit tests to the end, and this leads to badly written, or simply omitted unit tests because it becomes too tedious!

I'm only going to add a few more tests, and you can continue writing them on your own. We'll use unit tests again, and introduce additional functionality over time.

In [21]:
class TestAccount(unittest.TestCase):
    
    def test_create_timezone(self):
        tz = TimeZone('ABC', -1, -30)
        self.assertEqual('ABC', tz.name)
        self.assertEqual(timedelta(hours=-1, minutes=-30), tz.offset)
        
    def test_timezones_equal(self):
        tz1 = TimeZone('ABC', -1, -30)
        tz2 = TimeZone('ABC', -1, -30)
        self.assertEqual(tz1, tz2)
        
    def test_timezones_not_equal(self):
        tz = TimeZone('ABC', -1, -30)
        
        test_timezones = (
            TimeZone('DEF', -1, -30),
            TimeZone('ABC', -1, 0),
            TimeZone('ABC', 1, -30)
        )
        for i, test_tz in enumerate(test_timezones):
            with self.subTest(test_number=i):
                self.assertNotEqual(tz, test_tz)
                
    def test_create_account(self):
        account_number = 'A100'
        first_name = 'FIRST'
        last_name = 'LAST'
        tz = TimeZone('TZ', 1, 30)
        balance = 100.00
        
        a = Account(account_number, first_name, last_name, tz, balance)
        self.assertEqual(account_number, a.account_number)
        self.assertEqual(first_name, a.first_name)
        self.assertEqual(last_name, a.last_name)
        self.assertEqual(first_name + ' ' + last_name, a.full_name)
        self.assertEqual(tz, a.timezone)
        self.assertEqual(balance, a.balance)

In [22]:
run_tests(TestAccount)

test_create_account (__main__.TestAccount) ... ok
test_create_timezone (__main__.TestAccount) ... ok
test_timezones_equal (__main__.TestAccount) ... ok
test_timezones_not_equal (__main__.TestAccount) ... ok

----------------------------------------------------------------------
Ran 4 tests in 0.003s

OK


One last piece of unit testing functionality, is handling exceptions when they are **expected**, for example creating an account with an empty first name should result in a `ValueError` exception. We can write a unit test that will test this expected exception, and which will fail if the exception is not encountered (or is a different exception).

To do this we need to indicate that an exception is expected, as well as the expected exception class.

In [23]:
class TestAccount(unittest.TestCase):
    
    def test_create_timezone(self):
        tz = TimeZone('ABC', -1, -30)
        self.assertEqual('ABC', tz.name)
        self.assertEqual(timedelta(hours=-1, minutes=-30), tz.offset)
        
    def test_timezones_equal(self):
        tz1 = TimeZone('ABC', -1, -30)
        tz2 = TimeZone('ABC', -1, -30)
        self.assertEqual(tz1, tz2)
        
    def test_timezones_not_equal(self):
        tz = TimeZone('ABC', -1, -30)
        
        test_timezones = (
            TimeZone('DEF', -1, -30),
            TimeZone('ABC', -1, 0),
            TimeZone('ABC', 1, -30)
        )
        for i, test_tz in enumerate(test_timezones):
            with self.subTest(test_number=i):
                self.assertNotEqual(tz, test_tz)
                
    def test_create_account(self):
        account_number = 'A100'
        first_name = 'FIRST'
        last_name = 'LAST'
        tz = TimeZone('TZ', 1, 30)
        balance = 100.00
        
        a = Account(account_number, first_name, last_name, tz, balance)
        self.assertEqual(account_number, a.account_number)
        self.assertEqual(first_name, a.first_name)
        self.assertEqual(last_name, a.last_name)
        self.assertEqual(first_name + ' ' + last_name, a.full_name)
        self.assertEqual(tz, a.timezone)
        self.assertEqual(balance, a.balance)
        
    def test_create_account_blank_first_name(self):
        account_number = 'A100'
        first_name = ''
        last_name = 'LAST'
        tz = TimeZone('TZ', 1, 30)
        balance = 100.00
        
        with self.assertRaises(ValueError):
            a = Account(account_number, first_name, last_name, tz, balance)

In [24]:
run_tests(TestAccount)

test_create_account (__main__.TestAccount) ... ok
test_create_account_blank_first_name (__main__.TestAccount) ... ok
test_create_timezone (__main__.TestAccount) ... ok
test_timezones_equal (__main__.TestAccount) ... ok
test_timezones_not_equal (__main__.TestAccount) ... ok

----------------------------------------------------------------------
Ran 5 tests in 0.003s

OK


But, if we were looking for a different exception:

In [25]:
class TestAccount(unittest.TestCase):
    
    def test_create_timezone(self):
        tz = TimeZone('ABC', -1, -30)
        self.assertEqual('ABC', tz.name)
        self.assertEqual(timedelta(hours=-1, minutes=-30), tz.offset)
        
    def test_timezones_equal(self):
        tz1 = TimeZone('ABC', -1, -30)
        tz2 = TimeZone('ABC', -1, -30)
        self.assertEqual(tz1, tz2)
        
    def test_timezones_not_equal(self):
        tz = TimeZone('ABC', -1, -30)
        
        test_timezones = (
            TimeZone('DEF', -1, -30),
            TimeZone('ABC', -1, 0),
            TimeZone('ABC', 1, -30)
        )
        for i, test_tz in enumerate(test_timezones):
            with self.subTest(test_number=i):
                self.assertNotEqual(tz, test_tz)
                
    def test_create_account(self):
        account_number = 'A100'
        first_name = 'FIRST'
        last_name = 'LAST'
        tz = TimeZone('TZ', 1, 30)
        balance = 100.00
        
        a = Account(account_number, first_name, last_name, tz, balance)
        self.assertEqual(account_number, a.account_number)
        self.assertEqual(first_name, a.first_name)
        self.assertEqual(last_name, a.last_name)
        self.assertEqual(first_name + ' ' + last_name, a.full_name)
        self.assertEqual(tz, a.timezone)
        self.assertEqual(balance, a.balance)
        
    def test_create_account_blank_first_name(self):
        account_number = 'A100'
        first_name = ''
        last_name = 'LAST'
        tz = TimeZone('TZ', 1, 30)
        balance = 100.00
        
        with self.assertRaises(TypeError):
            a = Account(account_number, first_name, last_name, tz, balance)

In [26]:
run_tests(TestAccount)

test_create_account (__main__.TestAccount) ... ok
test_create_account_blank_first_name (__main__.TestAccount) ... ERROR
test_create_timezone (__main__.TestAccount) ... ok
test_timezones_equal (__main__.TestAccount) ... ok
test_timezones_not_equal (__main__.TestAccount) ... ok

ERROR: test_create_account_blank_first_name (__main__.TestAccount)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "<ipython-input-25-ff9e533b4ff8>", line 48, in test_create_account_blank_first_name
    a = Account(account_number, first_name, last_name, tz, balance)
  File "<ipython-input-2-aa0deed29ed0>", line 15, in __init__
    self.first_name = first_name
  File "<ipython-input-2-aa0deed29ed0>", line 34, in first_name
    self.validate_and_set_name('_first_name', value, 'First Name')
  File "<ipython-input-2-aa0deed29ed0>", line 77, in validate_and_set_name
    raise ValueError(f'{field_title} cannot be empty.')
ValueError: First Name cannot be 

As you can start to see, there are a lot of gaps in our `Account` implementation. For example, we allow empty account numbers, negative starting balances. As you start writing unit tests you will not only discover bugs in your code, but also gaps in your design and implementation!

Let's fix our unit test back to expecting a `ValueError`, and write a few more.

In [27]:
class TestAccount(unittest.TestCase):
    
    def test_create_timezone(self):
        tz = TimeZone('ABC', -1, -30)
        self.assertEqual('ABC', tz.name)
        self.assertEqual(timedelta(hours=-1, minutes=-30), tz.offset)
        
    def test_timezones_equal(self):
        tz1 = TimeZone('ABC', -1, -30)
        tz2 = TimeZone('ABC', -1, -30)
        self.assertEqual(tz1, tz2)
        
    def test_timezones_not_equal(self):
        tz = TimeZone('ABC', -1, -30)
        
        test_timezones = (
            TimeZone('DEF', -1, -30),
            TimeZone('ABC', -1, 0),
            TimeZone('ABC', 1, -30)
        )
        for i, test_tz in enumerate(test_timezones):
            with self.subTest(test_number=i):
                self.assertNotEqual(tz, test_tz)
                
    def test_create_account(self):
        account_number = 'A100'
        first_name = 'FIRST'
        last_name = 'LAST'
        tz = TimeZone('TZ', 1, 30)
        balance = 100.00
        
        a = Account(account_number, first_name, last_name, tz, balance)
        self.assertEqual(account_number, a.account_number)
        self.assertEqual(first_name, a.first_name)
        self.assertEqual(last_name, a.last_name)
        self.assertEqual(first_name + ' ' + last_name, a.full_name)
        self.assertEqual(tz, a.timezone)
        self.assertEqual(balance, a.balance)
        
    def test_create_account_blank_first_name(self):
        account_number = 'A100'
        first_name = ''
        last_name = 'LAST'
        tz = TimeZone('TZ', 1, 30)
        balance = 100.00
        
        with self.assertRaises(ValueError):
            a = Account(account_number, first_name, last_name, tz, balance)
            
    def test_account_deposit_ok(self):
        account_number = 'A100'
        first_name = 'FIRST'
        last_name = 'LAST'
        balance = 100.00
        
        a = Account(account_number, first_name, last_name, initial_balance=balance)
        conf_code = a.deposit(100)
        self.assertEqual(200, a.balance)
        self.assertIn('D-', conf_code)
    
    def test_account_deposit_negative_amount(self):
        account_number = 'A100'
        first_name = 'FIRST'
        last_name = 'LAST'
        balance = 100.00
        
        a = Account(account_number, first_name, last_name, initial_balance=balance)
        with self.assertRaises(ValueError):
            conf_code = a.deposit(-100)
        
    def test_account_withdraw_ok(self):
        account_number = 'A100'
        first_name = 'FIRST'
        last_name = 'LAST'
        balance = 100.00
        
        a = Account(account_number, first_name, last_name, initial_balance=balance)
        conf_code = a.withdraw(20)
        self.assertEqual(80, a.balance)
        self.assertIn('W-', conf_code)
        
    
    def test_account_withdraw_overdraw(self):
        account_number = 'A100'
        first_name = 'FIRST'
        last_name = 'LAST'
        balance = 100.00
        
        a = Account(account_number, first_name, last_name, initial_balance=balance)
        conf_code = a.withdraw(200)
        self.assertIn('X-', conf_code)
        self.assertEqual(balance, a.balance)
        

In [28]:
run_tests(TestAccount)

test_account_deposit_negative_amount (__main__.TestAccount) ... ok
test_account_deposit_ok (__main__.TestAccount) ... ok
test_account_withdraw_ok (__main__.TestAccount) ... ok
test_account_withdraw_overdraw (__main__.TestAccount) ... ok
test_create_account (__main__.TestAccount) ... ok
test_create_account_blank_first_name (__main__.TestAccount) ... ok
test_create_timezone (__main__.TestAccount) ... ok
test_timezones_equal (__main__.TestAccount) ... ok
test_timezones_not_equal (__main__.TestAccount) ... ok

----------------------------------------------------------------------
Ran 9 tests in 0.005s

OK


Let's add one more unit test, that checkes to make sure we cannot create accounts with negative balances:

In [29]:
class TestAccount(unittest.TestCase):
    
    def test_create_timezone(self):
        tz = TimeZone('ABC', -1, -30)
        self.assertEqual('ABC', tz.name)
        self.assertEqual(timedelta(hours=-1, minutes=-30), tz.offset)
        
    def test_timezones_equal(self):
        tz1 = TimeZone('ABC', -1, -30)
        tz2 = TimeZone('ABC', -1, -30)
        self.assertEqual(tz1, tz2)
        
    def test_timezones_not_equal(self):
        tz = TimeZone('ABC', -1, -30)
        
        test_timezones = (
            TimeZone('DEF', -1, -30),
            TimeZone('ABC', -1, 0),
            TimeZone('ABC', 1, -30)
        )
        for i, test_tz in enumerate(test_timezones):
            with self.subTest(test_number=i):
                self.assertNotEqual(tz, test_tz)
                
    def test_create_account(self):
        account_number = 'A100'
        first_name = 'FIRST'
        last_name = 'LAST'
        tz = TimeZone('TZ', 1, 30)
        balance = 100.00
        
        a = Account(account_number, first_name, last_name, tz, balance)
        self.assertEqual(account_number, a.account_number)
        self.assertEqual(first_name, a.first_name)
        self.assertEqual(last_name, a.last_name)
        self.assertEqual(first_name + ' ' + last_name, a.full_name)
        self.assertEqual(tz, a.timezone)
        self.assertEqual(balance, a.balance)
        
    def test_create_account_blank_first_name(self):
        account_number = 'A100'
        first_name = ''
        last_name = 'LAST'
        tz = TimeZone('TZ', 1, 30)
        balance = 100.00
        
        with self.assertRaises(ValueError):
            a = Account(account_number, first_name, last_name, tz, balance)
            
    def test_create_account_negative_balance(self):
        account_number = 'A100'
        first_name = 'FIRST'
        last_name = 'LAST'
        tz = TimeZone('TZ', 1, 30)
        balance = -100.00
        
        with self.assertRaises(ValueError):
            a = Account(account_number, first_name, last_name, tz, balance)
            
    def test_account_deposit_ok(self):
        account_number = 'A100'
        first_name = 'FIRST'
        last_name = 'LAST'
        balance = 100.00
        
        a = Account(account_number, first_name, last_name, initial_balance=balance)
        conf_code = a.deposit(100)
        self.assertEqual(200, a.balance)
        self.assertIn('D-', conf_code)
    
    def test_account_deposit_negative_amount(self):
        account_number = 'A100'
        first_name = 'FIRST'
        last_name = 'LAST'
        balance = 100.00
        
        a = Account(account_number, first_name, last_name, initial_balance=balance)
        with self.assertRaises(ValueError):
            conf_code = a.deposit(-100)
        
    def test_account_withdraw_ok(self):
        account_number = 'A100'
        first_name = 'FIRST'
        last_name = 'LAST'
        balance = 100.00
        
        a = Account(account_number, first_name, last_name, initial_balance=balance)
        conf_code = a.withdraw(20)
        self.assertEqual(80, a.balance)
        self.assertIn('W-', conf_code)
        
    
    def test_account_withdraw_overdraw(self):
        account_number = 'A100'
        first_name = 'FIRST'
        last_name = 'LAST'
        balance = 100.00
        
        a = Account(account_number, first_name, last_name, initial_balance=balance)
        conf_code = a.withdraw(200)
        self.assertIn('X-', conf_code)
        self.assertEqual(balance, a.balance)
        

In [30]:
run_tests(TestAccount)

test_account_deposit_negative_amount (__main__.TestAccount) ... ok
test_account_deposit_ok (__main__.TestAccount) ... ok
test_account_withdraw_ok (__main__.TestAccount) ... ok
test_account_withdraw_overdraw (__main__.TestAccount) ... ok
test_create_account (__main__.TestAccount) ... ok
test_create_account_blank_first_name (__main__.TestAccount) ... ok
test_create_account_negative_balance (__main__.TestAccount) ... FAIL
test_create_timezone (__main__.TestAccount) ... ok
test_timezones_equal (__main__.TestAccount) ... ok
test_timezones_not_equal (__main__.TestAccount) ... ok

FAIL: test_create_account_negative_balance (__main__.TestAccount)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "<ipython-input-29-94814e310335>", line 58, in test_create_account_negative_balance
    a = Account(account_number, first_name, last_name, tz, balance)
AssertionError: ValueError not raised

-------------------------------------------------

Oh-oh, we are not raising an exception! That's a bug in our code.

So, let's fix it, and re-run the tests:

In [31]:
class Account:
    transaction_counter = itertools.count(100)
    _interest_rate = 0.5  # percentage
    
    _transaction_codes = {
        'deposit': 'D',
        'withdraw': 'W',
        'interest': 'I',
        'rejected': 'X'
    }
    
    def __init__(self, account_number, first_name, last_name, timezone=None, initial_balance=0):
        # in practice we probably would want to add checks to make sure these values are valid / non-empty
        self._account_number = account_number
        self.first_name = first_name
        self.last_name = last_name
        
        if timezone is None:
            timezone = TimeZone('UTC', 0, 0)
        self.timezone = timezone
        
        self._balance = Account.validate_real_number(initial_balance, min_value=0)
        
    @property
    def account_number(self):
        return self._account_number
    
    @property 
    def first_name(self):
        return self._first_name
    
    @first_name.setter
    def first_name(self, value):
        self.validate_and_set_name('_first_name', value, 'First Name')
        
    @property
    def last_name(self):
        return self._last_name
    
    @last_name.setter
    def last_name(self, value):
        self.validate_and_set_name('_last_name', value, 'Last Name')
        
    # also going to create a full_name computed property, for ease of use
    @property
    def full_name(self):
        return f'{self.first_name} {self.last_name}'
        
    @property
    def timezone(self):
        return self._timezone
    
    @property
    def balance(self):
        return self._balance
    
    @timezone.setter
    def timezone(self, value):
        if not isinstance(value, TimeZone):
            raise ValueError('Time zone must be a valid TimeZone object.')
        self._timezone = value
          
    @classmethod
    def get_interest_rate(cls):
        return cls._interest_rate
    
    @classmethod
    def set_interest_rate(cls, value):
        if not isinstance(value, numbers.Real):
            raise ValueError('Interest rate must be a real number')
        if value < 0:
            raise ValueError('Interest rate cannot be negative.')
        cls._interest_rate = value
        
    def validate_and_set_name(self, property_name, value, field_title):
        if len(str(value).strip()) == 0:
            raise ValueError(f'{field_title} cannot be empty.')
        setattr(self, property_name, value)
        
    @staticmethod
    def validate_real_number(value, min_value=None):
        if not isinstance(value, numbers.Real):
            raise ValueError('Value must be a real number.')
            
        if min_value is not None and value < min_value:
            raise ValueError(f'Value must be at least {min_value}')
            
        # validation passed, return valid value
        return value
    
    def generate_confirmation_code(self, transaction_code):
        # main difficulty here is to generate the current time in UTC using this formatting:
        # YYYYMMDDHHMMSS
        dt_str = datetime.utcnow().strftime('%Y%m%d%H%M%S')
        return f'{transaction_code}-{self.account_number}-{dt_str}-{next(Account.transaction_counter)}'
    
    @staticmethod
    def parse_confirmation_code(confirmation_code, preferred_time_zone=None):
        # dummy-A100-20190325224918-101
        parts = confirmation_code.split('-')
        if len(parts) != 4:
            # really simplistic validation here - would need something better
            raise ValueError('Invalid confirmation code')
        
        # unpack into separate variables
        transaction_code, account_number, raw_dt_utc, transaction_id = parts
        
        # need to convert raw_dt_utc into a proper datetime object
        try:
            dt_utc = datetime.strptime(raw_dt_utc, '%Y%m%d%H%M%S')
        except ValueError as ex:
            # again, probably need better error handling here
            raise ValueError('Invalid transaction datetime') from ex
          
        if preferred_time_zone is None:
            preferred_time_zone = TimeZone('UTC', 0, 0)
            
        if not isinstance(preferred_time_zone, TimeZone):
            raise ValueError('Invalid TimeZone specified.')
            
        dt_preferred = dt_utc + preferred_time_zone.offset
        dt_preferred_str = f"{dt_preferred.strftime('%Y-%m-%d %H:%M:%S')} ({preferred_time_zone.name})"
        
        return Confirmation(account_number, transaction_code, transaction_id, dt_utc.isoformat(), dt_preferred_str)
    
    def deposit(self, value):
        value = Account.validate_real_number(value, min_value=0.01)
       
        # get transaction code
        transaction_code = Account._transaction_codes['deposit']
        
        # generate a confirmation code
        conf_code = self.generate_confirmation_code(transaction_code)
        
        # make deposit and return conf code
        self._balance += value
        return conf_code
    
    def withdraw(self, value):
        value = Account.validate_real_number(value, min_value=0.01)
        accepted = False
        if self.balance - value < 0:
            # insufficient funds - we'll reject this transaction
            transaction_code = Account._transaction_codes['rejected']
        else:
            transaction_code = Account._transaction_codes['withdraw']
            accepted = True
            
        conf_code = self.generate_confirmation_code(transaction_code)
        
        # Doing this here in case there's a problem generating a confirmation code
        # - do not want to modify the balance if we cannot generate a transaction code successfully
        if accepted:
            self._balance -= value
            
        return conf_code
    
    def pay_interest(self):
        interest = self.balance * Account.get_interest_rate() / 100
        conf_code = self.generate_confirmation_code(Account._transaction_codes['interest'])
        self._balance += interest
        return conf_code

In [32]:
run_tests(TestAccount)

test_account_deposit_negative_amount (__main__.TestAccount) ... ok
test_account_deposit_ok (__main__.TestAccount) ... ok
test_account_withdraw_ok (__main__.TestAccount) ... ok
test_account_withdraw_overdraw (__main__.TestAccount) ... ok
test_create_account (__main__.TestAccount) ... ok
test_create_account_blank_first_name (__main__.TestAccount) ... ok
test_create_account_negative_balance (__main__.TestAccount) ... ok
test_create_timezone (__main__.TestAccount) ... ok
test_timezones_equal (__main__.TestAccount) ... ok
test_timezones_not_equal (__main__.TestAccount) ... ok

----------------------------------------------------------------------
Ran 10 tests in 0.006s

OK


Now our tests pass!

And so on... !

You should add at least a few more unit tests to this to get some extra practice.

# Section 04 - Polymorphism and Special Methods

##  `__str__` and `__repr__`

Let's see how this works by first implementing the `__repr__` method:

In [1]:
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age
        
    def __repr__(self):
        print('__repr__ called')
        return f"Person(name='{self.name}, age={self.age}')"

In [2]:
p = Person('Python', 30)

Here's how Jupyter shows us the string representation for the object `p`:

In [3]:
p

__repr__ called


Person(name='Python, age=30')

Here's what it looks like when we use the `print` function:

In [4]:
print(p)

__repr__ called
Person(name='Python, age=30')


Here's what happens if we call the `repr` function:

In [5]:
repr(p)

__repr__ called


"Person(name='Python, age=30')"

And here's what happens when we call the `str` function:

In [6]:
str(p)

__repr__ called


"Person(name='Python, age=30')"

As you can see, in all cases, our `__repr__` method was called.

Now, let's implement a `__str__` method:

In [7]:
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age
        
    def __repr__(self):
        print('__repr__ called')
        return f"Person(name='{self.name}, age=self.age')"
    
    def __str__(self):
        print('__str__ called')
        return self.name

In [8]:
p = Person('Python', 30)

And let's try out each of the ways to get a string representation for `p`:

In [9]:
p

__repr__ called


Person(name='Python, age=self.age')

So, same as before - uses the `__repr__` method.

In [10]:
print(p)

__str__ called
Python


As you can see, `print` will try to use `__str__` if present, otherwise it will fall back to using `__repr__`.

In [11]:
str(p)

__str__ called


'Python'

As expected, `str()` will try to use the `__str__` method first.

In [12]:
repr(p)

__repr__ called


"Person(name='Python, age=self.age')"

Whereas the `repr()` method will use the `__repr__` method directly.

What happens if we define a `__str__` method, but not `__repr__` method.

We'll look at inheritance later, but for now think of it as Python providing "defaults" for those methods when they are not present.

Let's first see how it works if we do not have either of those methods for two different classes:

In [13]:
class Person:
    pass

class Point:
    pass

In [14]:
person = Person()
point = Point()

In [15]:
repr(person), repr(point)

('<__main__.Person object at 0x7fbfe954b860>',
 '<__main__.Point object at 0x7fbfe954b9e8>')

As we can see, Python provides a default representation for objects that contains the class name, and the instance memory address.

If we use `str()` instead, we get the same result:

In [16]:
str(person), str(point)

('<__main__.Person object at 0x7fbfe954b860>',
 '<__main__.Point object at 0x7fbfe954b9e8>')

Now let's go back to our original `Person` class and remove the `__repr__` method:

In [17]:
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age
    
    def __str__(self):
        print('__str__ called')
        return self.name

In [18]:
p = Person('Python', 30)

In [19]:
p

<__main__.Person at 0x7fbfe9569e48>

In [20]:
repr(p)

'<__main__.Person object at 0x7fbfe9569e48>'

Since we do not have a `__repr__` method, Python uses the "default" - it does not use our custom `__str__` method!

But if we use `print()` or `str()`:

In [21]:
print(p)

__str__ called
Python


In [22]:
str(p)

__str__ called


'Python'

Lastly, various formatting functions will also prefer using the `__str__` method when available. Lert's first go back to our `Person` class that implements both:

In [23]:
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age
        
    def __repr__(self):
        print('__repr__ called')
        return f"Person(name='{self.name}, age=self.age')"
    
    def __str__(self):
        print('__str__ called')
        return self.name

In [24]:
p = Person('Python', 30)

In [25]:
f'The person is {p}'

__str__ called


'The person is Python'

In [26]:
'The person is {}'.format(p)

__str__ called


'The person is Python'

In [27]:
'The person is %s' % p

__str__ called


'The person is Python'

##  Arithmetic Operators

Let's first look at some simple example of using the straightforward `__add__`, `__sub__`, etc.

Let's say we want to implement a `Vector` class that supports various arithmetic operations. We won't assume a specific number of dimensions - that will be determined by how many arguments are passed to the `__init__` method. We will however require the arguments to be Real numbers.


In [1]:
from numbers import Real

class Vector:
    def __init__(self, *components):
        # validate number of components is at least one, and all of them are real numbers
        if len(components) < 1:
            raise ValueError('Cannot create an empty Vector.')
        for component in components:
            if not isinstance(component, Real):
                raise ValueError(f'Vector components must all be real numbers - {component} is invalid.')
        
        # use immutable storage for vector
        self._components = tuple(components)
        
    def __len__(self):
        return len(self._components)
        
    @property
    def components(self):
        return self._components
    
    def __repr__(self):
        # works - but unwieldy for high dimension vectors
        return f'Vector{self._components}'

Now, let's support addition and subtraction of vectors - they'll need to be of the same dimension, othwerwise we should raise a `TypeError` exception (consistent with the exception Python raises if you try to add a string and an int for example).

In [2]:
from numbers import Real

class Vector:
    def __init__(self, *components):
        # validate number of components is at least one, and all of them are real numbers
        if len(components) < 1:
            raise ValueError('Cannot create an empty Vector.')
        for component in components:
            if not isinstance(component, Real):
                raise ValueError(f'Vector components must all be real numbers - {component} is invalid.')
        
        # use immutable storage for vector
        self._components = tuple(components)
        
    def __len__(self):
        return len(self._components)
        
    @property
    def components(self):
        return self._components
    
    def __repr__(self):
        # works - but unwieldy for high dimension vectors
        return f'Vector{self._components}'
    
    def validate_type_and_dimension(self, v):
        return isinstance(v, Vector) and len(v) == len(self)
            
    def __add__(self, other):
        if not self.validate_type_and_dimension(other):
            return NotImplemented
        components = (x + y for x, y in zip(self.components, other.components))
        return Vector(*components)
            
    def __sub__(self, other):
        if not self.validate_type_and_dimension(other):
            return NotImplemented
        components = (x - y for x, y in zip(self.components, other.components))
        return Vector(*components)

Let's try out our class and see how things work at this point:

In [3]:
v1 = Vector(1, 2)
v2 = Vector(10, 10)
v3 = Vector(1, 2, 3, 4)

In [4]:
v1

Vector(1, 2)

In [5]:
v1 + v2

Vector(11, 12)

In [6]:
v2 + v1 

Vector(11, 12)

In [7]:
try:
    print(v1 + v3)
except TypeError as ex:
    print(ex)

unsupported operand type(s) for +: 'Vector' and 'Vector'


In [8]:
try:
    print(v1 + 100)
except TypeError as ex:
    print(ex)

unsupported operand type(s) for +: 'Vector' and 'int'


Now, let's add support for multiplication by a scalar value - e.g. multipliying a vector by a real num ber (not another vector).

To do that we'll implement the `__mul__` method:

In [9]:
from numbers import Real

class Vector:
    def __init__(self, *components):
        # validate number of components is at least one, and all of them are real numbers
        if len(components) < 1:
            raise ValueError('Cannot create an empty Vector.')
        for component in components:
            if not isinstance(component, Real):
                raise ValueError(f'Vector components must all be real numbers - {component} is invalid.')
        
        # use immutable storage for vector
        self._components = tuple(components)
        
    def __len__(self):
        return len(self._components)
        
    @property
    def components(self):
        return self._components
    
    def __repr__(self):
        # works - but unwieldy for high dimension vectors
        return f'Vector{self._components}'
    
    def validate_type_and_dimension(self, v):
        return isinstance(v, Vector) and len(v) == len(self)
            
    def __add__(self, other):
        if not self.validate_type_and_dimension(other):
            return NotImplemented
        components = (x + y for x, y in zip(self.components, other.components))
        return Vector(*components)
            
    def __sub__(self, other):
        if not self.validate_type_and_dimension(other):
            return NotImplemented
        components = (x - y for x, y in zip(self.components, other.components))
        return Vector(*components)
    
    def __mul__(self, other):
        print('__mul__ called...')
        if not isinstance(other, Real):
            return NotImplemented
        components = (other * x for x in self.components)
        return Vector(*components)

In [10]:
v1 = Vector(1, 2)

In [11]:
v1 * 10

__mul__ called...


Vector(10, 20)

But what happens if we reverse the operation:

In [12]:
try:
    10 * v1
except TypeError as ex:
    print(ex)

unsupported operand type(s) for *: 'int' and 'Vector'


What happened here is that Python first tried calling the addition operation on the `int` object, using the `Vector` as the second operand. Integers of course do no support this type, so Python tried using our `Vector` class - but not the `__mul__` since that is called when the `Vector` is the **left** operand. Instead, it is looking for (and does not find) a method to use when the `Vector` is the **right** operand.

We can implement this method, using `__rmul__`:

In [13]:
from numbers import Real

class Vector:
    def __init__(self, *components):
        # validate number of components is at least one, and all of them are real numbers
        if len(components) < 1:
            raise ValueError('Cannot create an empty Vector.')
        for component in components:
            if not isinstance(component, Real):
                raise ValueError(f'Vector components must all be real numbers - {component} is invalid.')
        
        # use immutable storage for vector
        self._components = tuple(components)
        
    def __len__(self):
        return len(self._components)
        
    @property
    def components(self):
        return self._components
    
    def __repr__(self):
        # works - but unwieldy for high dimension vectors
        return f'Vector{self._components}'
    
    def validate_type_and_dimension(self, v):
        return isinstance(v, Vector) and len(v) == len(self)
            
    def __add__(self, other):
        if not self.validate_type_and_dimension(other):
            return NotImplemented
        components = (x + y for x, y in zip(self.components, other.components))
        return Vector(*components)
            
    def __sub__(self, other):
        if not self.validate_type_and_dimension(other):
            return NotImplemented
        components = (x - y for x, y in zip(self.components, other.components))
        return Vector(*components)
    
    def __mul__(self, other):
        print('__mul__ called...')
        if not isinstance(other, Real):
            return NotImplemented
        components = (other * x for x in self.components)
        return Vector(*components)
    
    def __rmul__(self, other):
        print('__rmul__ called...')
        # for us, multiplication is commutative, so we can leverage our existing __mul__ method
        return self * other

In [14]:
v1 = Vector(1, 2)

In [15]:
v1 * 10

__mul__ called...


Vector(10, 20)

In [16]:
10 * v1

__rmul__ called...
__mul__ called...


Vector(10, 20)

Now, let's say we want to implement the dot product of two vectors.

If you are rusty on this, just do a quick read of this: https://en.wikipedia.org/wiki/Dot_product

Basically we need vectors of equal dimension, and we calculate the sum of the product of components (pairwise) in each vector.

We can implement it by differentiating between a `Real` and ` Vector` type in our `__mul__` method - of course we won't need it in the `__rmul__` method because if we implement multiplication between two `Vectors` we'll always have a `Vector` as the left operand, so `__mul__` will get called first.

In [17]:
from numbers import Real

class Vector:
    def __init__(self, *components):
        # validate number of components is at least one, and all of them are real numbers
        if len(components) < 1:
            raise ValueError('Cannot create an empty Vector.')
        for component in components:
            if not isinstance(component, Real):
                raise ValueError(f'Vector components must all be real numbers - {component} is invalid.')
        
        # use immutable storage for vector
        self._components = tuple(components)
        
    def __len__(self):
        return len(self._components)
        
    @property
    def components(self):
        return self._components
    
    def __repr__(self):
        # works - but unwieldy for high dimension vectors
        return f'Vector{self._components}'
    
    def validate_type_and_dimension(self, v):
        return isinstance(v, Vector) and len(v) == len(self)
            
    def __add__(self, other):
        if not self.validate_type_and_dimension(other):
            return NotImplemented
        components = (x + y for x, y in zip(self.components, other.components))
        return Vector(*components)
            
    def __sub__(self, other):
        if not self.validate_type_and_dimension(other):
            return NotImplemented
        components = (x - y for x, y in zip(self.components, other.components))
        return Vector(*components)
    
    def __mul__(self, other):
        print('__mul__ called...')
        if isinstance(other, Real):
            components = (other * x for x in self.components)
            return Vector(*components)
        if self.validate_type_and_dimension(other):
            # dot product
            components = (x * y for x, y in zip(self.components, other.components))
            return sum(components)
        return NotImplemented
    
    def __rmul__(self, other):
        print('__rmul__ called...')
        # for us, multiplication is commutative, so we can leverage our existing __mul__ method
        return self * other

In [18]:
v1 = Vector(1, 2)
v2 = Vector(3, 4)

In [19]:
v1 * v2

__mul__ called...


11

We could also implement the **cross** product of two vectors (which would return another vector).

The calculations get a little more complicated, so I won't show you those details, but let's see how we could use the `@` operator to implement this:

In [20]:
from numbers import Real

class Vector:
    def __init__(self, *components):
        # validate number of components is at least one, and all of them are real numbers
        if len(components) < 1:
            raise ValueError('Cannot create an empty Vector.')
        for component in components:
            if not isinstance(component, Real):
                raise ValueError(f'Vector components must all be real numbers - {component} is invalid.')
        
        # use immutable storage for vector
        self._components = tuple(components)
        
    def __len__(self):
        return len(self._components)
        
    @property
    def components(self):
        return self._components
    
    def __repr__(self):
        # works - but unwieldy for high dimension vectors
        return f'Vector{self._components}'
    
    def validate_type_and_dimension(self, v):
        return isinstance(v, Vector) and len(v) == len(self)
            
    def __add__(self, other):
        if not self.validate_type_and_dimension(other):
            return NotImplemented
        components = (x + y for x, y in zip(self.components, other.components))
        return Vector(*components)
            
    def __sub__(self, other):
        if not self.validate_type_and_dimension(other):
            return NotImplemented
        components = (x - y for x, y in zip(self.components, other.components))
        return Vector(*components)
    
    def __mul__(self, other):
        print('__mul__ called...')
        if isinstance(other, Real):
            components = (other * x for x in self.components)
            return Vector(*components)
        if self.validate_type_and_dimension(other):
            # dot product
            components = (x * y for x, y in zip(self.components, other.components))
            return sum(components)
        return NotImplemented
    
    def __rmul__(self, other):
        print('__rmul__ called...')
        # for us, multiplication is commutative, so we can leverage our existing __mul__ method
        return self * other
    
    def __matmul__(self, other):
        print('__matmul__ called...')

In [21]:
v1 = Vector(1, 2)
v2 = Vector(3, 4)

In [22]:
v1 * v2

__mul__ called...


11

In [23]:
v1 @ v2

__matmul__ called...


#### In-Place Operators

We also have the in-place operators. Typically in-place operators will try to **mutate** the object on the left of the expression:

In [24]:
l = [1, 2]

In [25]:
id(l)

140460867509512

In [26]:
l += [3]

In [27]:
id(l), l

(140460867509512, [1, 2, 3])

As you can see, the list `l` mas mutated (memory address remained the same). This is not the same effect as:

In [28]:
l = [1, 2]
print(id(l))

l = l + [3]
print(id(l), l)

140460867621128
140460867592136 [1, 2, 3]


As you can see, here we ended up with a **new** list object.

But in-place does **not** *guarantee* a mutation. For example, tuples are immutable objects:

In [29]:
t = (1, 2)
print(id(t))
t += (3, )
print(id(t), t)

140460615377736
140460867314816 (1, 2, 3)


As you can see we ended up with a new tuple. Same thing happens with strings, integers, floats and so on, that are also immutable types. 

Let's go back to our `Vector` class and implement in-place addition - but we'll implement it in such a way that we do not mutate the Vector, instead just returning a new Vector - similar to what we just saw with tuples:

In [30]:
from numbers import Real

class Vector:
    def __init__(self, *components):
        # validate number of components is at least one, and all of them are real numbers
        if len(components) < 1:
            raise ValueError('Cannot create an empty Vector.')
        for component in components:
            if not isinstance(component, Real):
                raise ValueError(f'Vector components must all be real numbers - {component} is invalid.')
        
        # use immutable storage for vector
        self._components = tuple(components)
        
    def __len__(self):
        return len(self._components)
        
    @property
    def components(self):
        return self._components
    
    def __repr__(self):
        # works - but unwieldy for high dimension vectors
        return f'Vector{self._components}'
    
    def validate_type_and_dimension(self, v):
        return isinstance(v, Vector) and len(v) == len(self)
            
    def __add__(self, other):
        if not self.validate_type_and_dimension(other):
            return NotImplemented
        components = (x + y for x, y in zip(self.components, other.components))
        return Vector(*components)
            
    def __sub__(self, other):
        if not self.validate_type_and_dimension(other):
            return NotImplemented
        components = (x - y for x, y in zip(self.components, other.components))
        return Vector(*components)
    
    def __mul__(self, other):
        print('__mul__ called...')
        if isinstance(other, Real):
            components = (other * x for x in self.components)
            return Vector(*components)
        if self.validate_type_and_dimension(other):
            # dot product
            components = (x * y for x, y in zip(self.components, other.components))
            return sum(components)
        return NotImplemented
    
    def __rmul__(self, other):
        print('__rmul__ called...')
        # for us, multiplication is commutative, so we can leverage our existing __mul__ method
        return self * other
    
    def __iadd__(self, other):
        print('__radd__ called...')
        return self + other

In [31]:
v1 = Vector(1, 2)
v2 = Vector(10, 10)

print(id(v1))

v1 += v2

print(id(v1), v1)

140460867485200
__radd__ called...
140460867485312 Vector(11, 12)


As you can see, we end up with a new `Vector` object.

Now let's modify this so we actually mutate the `Vector` object:

In [32]:
from numbers import Real

class Vector:
    def __init__(self, *components):
        # validate number of components is at least one, and all of them are real numbers
        if len(components) < 1:
            raise ValueError('Cannot create an empty Vector.')
        for component in components:
            if not isinstance(component, Real):
                raise ValueError(f'Vector components must all be real numbers - {component} is invalid.')
        
        # use immutable storage for vector
        self._components = tuple(components)
        
    def __len__(self):
        return len(self._components)
        
    @property
    def components(self):
        return self._components
    
    def __repr__(self):
        # works - but unwieldy for high dimension vectors
        return f'Vector{self._components}'
    
    def validate_type_and_dimension(self, v):
        return isinstance(v, Vector) and len(v) == len(self)
            
    def __add__(self, other):
        if not self.validate_type_and_dimension(other):
            return NotImplemented
        components = (x + y for x, y in zip(self.components, other.components))
        return Vector(*components)
            
    def __sub__(self, other):
        if not self.validate_type_and_dimension(other):
            return NotImplemented
        components = (x - y for x, y in zip(self.components, other.components))
        return Vector(*components)
    
    def __mul__(self, other):
        print('__mul__ called...')
        if isinstance(other, Real):
            components = (other * x for x in self.components)
            return Vector(*components)
        if self.validate_type_and_dimension(other):
            # dot product
            components = (x * y for x, y in zip(self.components, other.components))
            return sum(components)
        return NotImplemented
    
    def __rmul__(self, other):
        print('__rmul__ called...')
        # for us, multiplication is commutative, so we can leverage our existing __mul__ method
        return self * other
    
    def __iadd__(self, other):
        print('__radd__ called...')
        if self.validate_type_and_dimension(other):
            components = (x + y for x, y in zip(self.components, other.components))
            self._components = tuple(components)  # mutating our Vector object
            return self # don't forget to return the result of the operation!
        return NotImplemented
        

In [33]:
v1 = Vector(1, 2)
v2 = Vector(10, 20)

print(id(v1))

v1 += v2

print(id(v1), v1)

140460867518080
__radd__ called...
140460867518080 Vector(11, 22)


As you can see we **mutated** the object `v1`.

Let's also implement the unary minus on our `Vector` class. In this case we just want to return a new `Vector` with each component negated:

In [34]:
from numbers import Real

class Vector:
    def __init__(self, *components):
        # validate number of components is at least one, and all of them are real numbers
        if len(components) < 1:
            raise ValueError('Cannot create an empty Vector.')
        for component in components:
            if not isinstance(component, Real):
                raise ValueError(f'Vector components must all be real numbers - {component} is invalid.')
        
        # use immutable storage for vector
        self._components = tuple(components)
        
    def __len__(self):
        return len(self._components)
        
    @property
    def components(self):
        return self._components
    
    def __repr__(self):
        # works - but unwieldy for high dimension vectors
        return f'Vector{self._components}'
    
    def validate_type_and_dimension(self, v):
        return isinstance(v, Vector) and len(v) == len(self)
            
    def __add__(self, other):
        if not self.validate_type_and_dimension(other):
            return NotImplemented
        components = (x + y for x, y in zip(self.components, other.components))
        return Vector(*components)
            
    def __sub__(self, other):
        if not self.validate_type_and_dimension(other):
            return NotImplemented
        components = (x - y for x, y in zip(self.components, other.components))
        return Vector(*components)
    
    def __mul__(self, other):
        print('__mul__ called...')
        if isinstance(other, Real):
            components = (other * x for x in self.components)
            return Vector(*components)
        if self.validate_type_and_dimension(other):
            # dot product
            components = (x * y for x, y in zip(self.components, other.components))
            return sum(components)
        return NotImplemented
    
    def __rmul__(self, other):
        print('__rmul__ called...')
        # for us, multiplication is commutative, so we can leverage our existing __mul__ method
        return self * other
    
    def __iadd__(self, other):
        print('__radd__ called...')
        if self.validate_type_and_dimension(other):
            components = (x + y for x, y in zip(self.components, other.components))
            self._components = tuple(components)  # mutating our Vector object
            return self # don't forget to return the result of the operation!
        return NotImplemented
        
    def __neg__(self):
        print('__neg__ called...')
        components = (-x for x in self.components)
        return Vector(*components)

In [35]:
v1 = Vector(1, 2)
-v1

__neg__ called...


Vector(-1, -2)

So we can use it in arithmetic operations such as:

In [36]:
v2 = Vector(10, 10)

v2 + -v1

__neg__ called...


Vector(9, 8)

Lastly, let's implement the `abs` function for our Vector. Right now it won't work:

In [37]:
try:
    abs(v1)
except TypeError as ex:
    print(ex)

bad operand type for abs(): 'Vector'


But we can fix that:

In [38]:
from numbers import Real
from math import sqrt

class Vector:
    def __init__(self, *components):
        # validate number of components is at least one, and all of them are real numbers
        if len(components) < 1:
            raise ValueError('Cannot create an empty Vector.')
        for component in components:
            if not isinstance(component, Real):
                raise ValueError(f'Vector components must all be real numbers - {component} is invalid.')
        
        # use immutable storage for vector
        self._components = tuple(components)
        
    def __len__(self):
        return len(self._components)
        
    @property
    def components(self):
        return self._components
    
    def __repr__(self):
        # works - but unwieldy for high dimension vectors
        return f'Vector{self._components}'
    
    def validate_type_and_dimension(self, v):
        return isinstance(v, Vector) and len(v) == len(self)
            
    def __add__(self, other):
        if not self.validate_type_and_dimension(other):
            return NotImplemented
        components = (x + y for x, y in zip(self.components, other.components))
        return Vector(*components)
            
    def __sub__(self, other):
        if not self.validate_type_and_dimension(other):
            return NotImplemented
        components = (x - y for x, y in zip(self.components, other.components))
        return Vector(*components)
    
    def __mul__(self, other):
        print('__mul__ called...')
        if isinstance(other, Real):
            components = (other * x for x in self.components)
            return Vector(*components)
        if self.validate_type_and_dimension(other):
            # dot product
            components = (x * y for x, y in zip(self.components, other.components))
            return sum(components)
        return NotImplemented
    
    def __rmul__(self, other):
        print('__rmul__ called...')
        # for us, multiplication is commutative, so we can leverage our existing __mul__ method
        return self * other
    
    def __iadd__(self, other):
        print('__radd__ called...')
        if self.validate_type_and_dimension(other):
            components = (x + y for x, y in zip(self.components, other.components))
            self._components = tuple(components)  # mutating our Vector object
            return self # don't forget to return the result of the operation!
        return NotImplemented
        
    def __neg__(self):
        print('__neg__ called...')
        components = (-x for x in self.components)
        return Vector(*components)
    
    def __abs__(self):
        print('__abs__ called...')
        return sqrt(sum(x ** 2 for x in self.components))

In [39]:
v1 = Vector(1, 1)

In [40]:
abs(v1)

__abs__ called...


1.4142135623730951

#### Other Uses

Of course, these arithmetic operators are not restricted to working with numbers. We've seen them work with strings as well for example, or lists even.

We can also use them in our custom classes in different ways where we want to implement and attach special meaning to these operators.

For example, we might have a `Family` class that holds together:
- mother and father `Person` objects
- a list of children `Person` objects

We want to make it such that we can add children simply by using inplace addition.

In [41]:
class Person:
    def __init__(self, name):
        self.name = name
        
    def __repr__(self):
        return f"Person('{self.name}')"

In [42]:
p1 = Person('John')

In [43]:
class Family:
    def __init__(self, mother, father):
        self.mother = mother
        self.father = father
        self.children = []
        
    def __iadd__(self, other):
        self.children.append(other)
        return self
    

In [44]:
f = Family(Person('Mary'), Person('John'))
print(id(f))

140460867516680


In [45]:
f += Person('Eric')
print(id(f))
print(f.children)

140460867516680
[Person('Eric')]


In [46]:
f += Person('Michael')
print(id(f))
print(f.children)

140460867516680
[Person('Eric'), Person('Michael')]


So, don't feel restricted to using these operators for numerical use cases only.

##  Rich Comparisons

This is quite staightforward. We can choose to implement any number of these rich comparison operators in our classes.

Furthermore, if one comparison does not exist, Python will try to the reverse the operands and the operator (and unlike the arithmetic operators, both operands can be of the same type).

Let's use a 2D `Vector` class to check this out:

In [1]:
class Vector:
    def __init__(self, x, y):
        self.x = x
        self.y = y
        
    def __repr__(self):
        return f'Vector(x={self.x}, y={self.y})'

In [2]:
v1 = Vector(0, 0)
v2 = Vector(0, 0)
print(id(v1), id(v2))

140301551452112 140301551452000


In [3]:
v1 == v2

False

By default, Python will use `is` when we do not provide an implementation for `==`. In this case we have two different objects, so they do not compare `==`.

Let's change that:

In [4]:
class Vector:
    def __init__(self, x, y):
        self.x = x
        self.y = y
        
    def __repr__(self):
        return f'Vector(x={self.x}, y={self.y})'
        
    def __eq__(self, other):
        if isinstance(other, Vector):
            return self.x == other.x and self.y == other.y
        return NotImplemented

In [5]:
v1 = Vector(1, 1)
v2 = Vector(1, 1)
v3 = Vector(10, 10)

In [6]:
v1 == v2, v1 is v2

(True, False)

In [7]:
v1 == v3

False

We could even support an equality comparison with  other iterable types. Let's say we want to support equality comparisons with tuples:

In [8]:
class Vector:
    def __init__(self, x, y):
        self.x = x
        self.y = y
        
    def __repr__(self):
        return f'Vector(x={self.x}, y={self.y})'
        
    def __eq__(self, other):
        if isinstance(other, tuple):
            other = Vector(*other)
        if isinstance(other, Vector):
            return self.x == other.x and self.y == other.y
        return NotImplemented

In [9]:
v1 = Vector(10, 11)

In [10]:
v1 == (10, 11)

True

In fact, although tuples do not implement equality against a `Vector`, it will still work because Python will reflect the operation:

In [11]:
(10, 11) == v1

True

We can also implement the other rich comparison operators in the same way.

Let's implement the `<` operator:

We'll consider a Vector to be less than another vector if it's length (Euclidean) is less than the other.

We're actually going to make use of the `abs` function for this, so we'll define the `__abs__` method as well.

In [12]:
from math import sqrt

class Vector:
    def __init__(self, x, y):
        self.x = x
        self.y = y
        
    def __repr__(self):
        return f'Vector(x={self.x}, y={self.y})'
        
    def __eq__(self, other):
        if isinstance(other, tuple):
            other = Vector(*other)
        if isinstance(other, Vector):
            return self.x == other.x and self.y == other.y
        return NotImplemented
    
    def __abs__(self):
        return sqrt(self.x ** 2 + self.y ** 2)
    
    def __lt__(self, other):
        if isinstance(other, tuple):
            other = Vector(*other)
        if isinstance(other, Vector):
            return abs(self) < abs(other)
        return NotImplemented

In [13]:
v1 = Vector(0, 0)
v2 = Vector(1, 1)

In [14]:
v1 < v2

True

What's interesting is that `>` between two vectors will work as well:

In [15]:
v2 > v1

True

What happened is that since `__gt__` was not implemented, Python decided to reflect the operation, so instead of actually running this comparison:

```v2 > v1```

Python actually ran:

```v1 < v2```

What about with tuples?

In [16]:
v1 < (1, 1)

True

And the reverse?

In [17]:
(1, 1) > v1

True

That worked too. How about `<=`, since we have `,` and `==` defined, will Python be able to use both to come up with a result?

In [18]:
v1, v2

(Vector(x=0, y=0), Vector(x=1, y=1))

In [19]:
try:
    v1 <= v2
except TypeError as ex:
    print(ex)

'<=' not supported between instances of 'Vector' and 'Vector'


Nope - so we have to implement it ourselves. Let's do that:

In [20]:
from math import sqrt

class Vector:
    def __init__(self, x, y):
        self.x = x
        self.y = y
        
    def __repr__(self):
        return f'Vector(x={self.x}, y={self.y})'
        
    def __eq__(self, other):
        if isinstance(other, tuple):
            other = Vector(*other)
        if isinstance(other, Vector):
            return self.x == other.x and self.y == other.y
        return NotImplemented
    
    def __abs__(self):
        return sqrt(self.x ** 2 + self.y ** 2)
    
    def __lt__(self, other):
        if isinstance(other, tuple):
            other = Vector(*other)
        if isinstance(other, Vector):
            return abs(self) < abs(other)
        return NotImplemented
    
    def __le__(self, other):
        return self == other or self < other

In [21]:
v1 = Vector(0, 0)
v2 = Vector(0, 0)
v3 = Vector(1, 1)

In [22]:
v1 <= v2

True

In [23]:
v1 <= v3

True

In [24]:
v1 <= (0.5, 0.5)

True

What about `>=`?

In [25]:
v1 >= v2

True

Again, Python was able to reverse the operation:

```v1 >= v2```

and run:

```v2 <= v1```

We also have the `!=` operator:

In [26]:
v1 != v2

False

How did that work?
Well Python could not find a `__ne__` method, so it delegated to `__eq__` instead:

```
not(v1 == v2)
```

We can easily see this by adding a print statement to our `__eq__` method:

In [27]:
from math import sqrt

class Vector:
    def __init__(self, x, y):
        self.x = x
        self.y = y
        
    def __repr__(self):
        return f'Vector(x={self.x}, y={self.y})'
        
    def __eq__(self, other):
        print('__eq__ called...')
        if isinstance(other, tuple):
            other = Vector(*other)
        if isinstance(other, Vector):
            return self.x == other.x and self.y == other.y
        return NotImplemented
    
    def __abs__(self):
        return sqrt(self.x ** 2 + self.y ** 2)
    
    def __lt__(self, other):
        if isinstance(other, tuple):
            other = Vector(*other)
        if isinstance(other, Vector):
            return abs(self) < abs(other)
        return NotImplemented
    
    def __le__(self, other):
        return self == other or self < other

In [28]:
v1 = Vector(0, 0)
v2 = Vector(1, 1)

In [29]:
v1 != v2

__eq__ called...


True

In many cases, we can derive most of the rich comparisons from just two base ones: the `__eq__` and one other one, maybe `__lt__`, or `__le__`, etc.

For example, if `==` and `<` is defined, then:
- `a <= b` is `a == b or a < b`
- `a > b` is `b < a`
- `a >= b` is `a == b or b < a`
- `a != b` is `not(a == b)`

On the other hand if we define `==` and `<=`, then:
- `a < b` is `a <= b and not(a == b)`
- `a >= b` is `b <= a`
- `a > b` is `b <= a and not(b == a)`
- `a != b` is `not(a == b)`

So, instead of us defining all the various methods, we can use the `@total_ordering` decorator in the `functools` module, that will work with `__eq__` and **one** other rich comparison method, filling in all the gaps for us:

In [30]:
from functools import total_ordering

@total_ordering
class Number:
    def __init__(self, x):
        self.x = x
        
    def __eq__(self, other):
        print('__eq__ called...')
        if isinstance(other, Number):
            return self.x == other.x
        return NotImplemented
    
    def __lt__(self, other):
        print('__lt__ called...')
        if isinstance(other, Number):
            return self.x < other.x
        return NotImplemented

In [31]:
a = Number(1)
b = Number(2)
c = Number(1)

In [32]:
a < b

__lt__ called...


True

In [33]:
a <= b

__lt__ called...


True

You'll notice that `__eq__` was not called - that's because `a < b` was True, and short-circuit evaluation. In this next example though, you'll see both methods are called:

In [34]:
a <= c

__lt__ called...
__eq__ called...


True

One thing I want to point out, according to the documentation the `__eq__` is not actually **required**. That's because as we saw earlier, all objects have a **default** implementation for `==` based on the memory address. That's usually not what we want, so we normally end up defining a custom `__eq__` implementation as well.

##  Hashing and Equality

By default, when we create a custom class, we inherit `__eq__` and `__hash__` from the `object` class.

In [1]:
dir(object)

['__class__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__']

This means that by default our custom classes produce hashable objects that can be used in mapping types such as dictionaries and sets.

In [2]:
class Person:
    pass

In [3]:
p1 = Person()
p2 = Person()

In [4]:
hash(p1), hash(p2)

(8779945178916, -9223363256909596903)

In [5]:
p1 == p2

False

By default `__hash__` uses the object's identity, and `__eq__` will only evaluate to `True` if the two objects are the same objects (identity).

We can override those default implementations ourselves. 

If we override the `__eq__` method, Python will automatically make our class unhashable:

In [6]:
class Person:
    def __init__(self, name):
        self.name = name
        
    def __eq__(self, other):
        return isinstance(other, Person) and self.name == other.name
            

In [7]:
p1 = Person('John')
p2 = Person('John')
p3 = Person('Eric')

In [8]:
p1 == p2, p1 == p3

(True, False)

But now we have lost hashing:

In [9]:
try:
    hash(p1)
except TypeError as ex:
    print(ex)

unhashable type: 'Person'


This is because two objects that compare equal should also have the same hash. However, Python's default is to use the object's identity. So if that were the case then `p1` and `p2` would be equal, but would not have the same hash.

So Python sets the `__hash__` property to `None`:

In [10]:
type(p1.__hash__)

NoneType

The downside to this is that we can no longer use instances of this class as keys in a dictionary or elements of a set:

In [11]:
try:
    d = {p1: 'person 1'}
except TypeError as ex:
    print(ex)

unhashable type: 'Person'


We can however provide our own override for `__hash__`:

In [12]:
class Person:
    def __init__(self, name):
        self.name = name
        
    def __eq__(self, other):
        return isinstance(other, Person) and self.name == other.name
            
    def __hash__(self):
        return hash(self.name)

We now have a `Person` class that supports equality based on the state of the class (the `name` in this instance) and is hashable too.

We should also keep in mind that for this to work well in data structurfes such as dictionaries, what we use to create a hash of the class should remain immutable.

So, a better approach would be to make the `name` property a read-only property:

In [13]:
class Person:
    def __init__(self, name):
        self._name = name
        
    @property
    def name(self):
        return self._name
    
    def __eq__(self, other):
        return isinstance(other, Person) and self.name == other.name
            
    def __hash__(self):
        return hash(self.name)

And now our Person instances can be used in sets and dictionaries (keys)

In [14]:
p1 = Person('Eric')

In [15]:
d = {p1: 'Eric'}

In [16]:
d

{<__main__.Person at 0x7fc3d838f0f0>: 'Eric'}

In [17]:
s = {p1}

And of course since we now have equality defined in terms of the object state (and not the default of, essentially, the memory address), we can recover an element from a dictionary using different objects (identity wise) that have the same state (equality wise).

##  Booleans

As we know every object in Python has an associated truth value. Empty container types are falsy, non-zero numbers are truthy, zero numbers are falsy, etc.

The way Python determines the truth value of our custom classes is to:
1. first look for an implementation of the `__bool__` method (which needs to return a boolean)
2. if not present, looks for `__len__` and will return `False` if that is `0`, and `True` otherwise
3. otherwise returns `True`

Let's look at some example which illustrate this behavior:

First let's not define anything, so our objects should always have a `True` associated truth value:

In [1]:
class Person:
    pass

In [2]:
p = Person()

In [3]:
bool(p)

True

Now let's implement the `__len__` method:

In [4]:
class MyList:
    def __init__(self, length):
        self._length = length
        
    def __len__(self):
        print('__len__ called')
        return self._length

In [5]:
l1 = MyList(0)  # so __len__ will return 0
l2 = MyList(10)  # so __len__ will return 10

In [6]:
bool(l1)

__len__ called


False

In [7]:
bool(l2)

__len__ called


True

So when we create custom iterables, as long as we have a `__len__` method implemented, we can actually skip implementing the `__bool__` method, and our class will remain consistent with other collection types behaviors (empty collections are falsy, otherwise truthy).

Let's implement the `__bool__` method though, just to see that if it is present it will get called instead of the `__len__` method:

In [8]:
class MyList:
    def __init__(self, length):
        self._length = length
        
    def __len__(self):
        print('__len__ called')
        return self._length
    
    def __bool__(self):
        print('__bool__ called')
        return self._length > 0

In [9]:
p1 = MyList(0)
p2 = MyList(100)

In [10]:
bool(p1)

__bool__ called


False

In [11]:
bool(p2)

__bool__ called


True

For classes that do not define `__len__` we may want to use the `__bool__` method. For example, consider a 2D `Point` class where we want to consider the origin point `(0,0)` falsy, and everything else truthy.

By default, all instances of our `Point` class will be truthy (they have neither a `__len__` nor a `__bool__` method):

In [12]:
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

In [13]:
p1 = Point(0, 0)
p2 = Point(1, 1)

In [14]:
bool(p1), bool(p2)

(True, True)

So now let's implement `__bool__` to get our desired functionality:

In [15]:
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y
        
    def __bool__(self):
        return self.x != 0 or self.y != 0

In [16]:
p1 = Point(0, 0)
p2 = Point(1, 1)

In [17]:
bool(p1)

False

In [18]:
bool(p2)

True

Note that with associated values, we could technically do something like this:

In [19]:
bool(p1.x or p1.y)

False

In [20]:
bool(p2.x or p2.y)

True

This works because any `0` number is falsy.

So we might think we can use this approach instead of the explicit `!= 0` comparisons:

In [21]:
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y
        
    def __bool__(self):
        return self.x or self.y

In [22]:
p1 = Point(0, 0)
p2 = Point(1, 1)

Then if we call `__bool__` directly:

In [23]:
bool(p1.__bool__()), bool(p2.__bool__())

(False, True)

But it we try to use the `bool()` function:

In [24]:
try:
    bool(p1)
except TypeError as ex:
    print(ex)

__bool__ should return bool, returned int


we can see that we have an exception. Although we can work with truth values in most circumstances, Python insists that `__bool__` should return an actual boolean type.

If we really wanted to, we could write:

In [25]:
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y
        
    def __bool__(self):
        return bool(self.x or self.y)

In [26]:
p1 = Point(0, 0)
p2 = Point(1, 1)

In [27]:
bool(p1), bool(p2)

(False, True)

##  Making Objects Callable

We can make instances of our classes callables by implementing the `__call__` method.

Let's first see a simple example:

In [1]:
class Person:
    def __call__(self):
        print('__call__ called...')

In [2]:
p = Person()

And now we can use `p` as a callable too:

In [3]:
type(p)

__main__.Person

In [4]:
p()

__call__ called...


This is actually quite useful, and is widely used by Python itself.

For example, the `functools` module has a `partial` function that we can use to create partial functions, like this:

In [5]:
from functools import partial

In [6]:
def my_func(a, b, c):
    return a, b, c

We can call this function with three arguments, but we could also create a partial function that essentially pre-sets some of the positional arguments:

In [7]:
partial_func = partial(my_func, 10, 20)

And now we can (indirectly) call `my_func` using `partial_func` using only an argument for `c`, with `a` and `b` pre-set to `10` and `20` respectively:

In [8]:
partial_func(30)

(10, 20, 30)

So I referred to `partial` as a function, but in reality it's just a callable (and this is why in Python we generally refer to things as callables, not just functions, because an object might be callable without being an actual function). In fact, we've seen this before with properties - these are callables, but they are not functions!

Back to `partial`, you'll notice that the `type` of `partial` is not a `function` at all!

In [9]:
type(partial)

type

So the type is `type` which means `partial` is actually a class, not a function.

We can easily re-create a simplified approximation of `partial` ourselves using the `__call__` method in a custom class.

In [10]:
class Partial:
    def __init__(self, func, *args):
        self._func = func
        self._args = args
        
    def __call__(self, *args):
        return self._func(*self._args, *args)

In [11]:
partial_func = Partial(my_func, 10, 20)

In [12]:
type(partial_func)

__main__.Partial

In [13]:
partial_func(30)

(10, 20, 30)

Many such "functions" in Python are actually just general callables. The distinction is often not important.

There is a built-in function in Python, `callable` that can be used to determine if an object is callable:

In [14]:
callable(print)

True

In [15]:
callable(partial)

True

In [16]:
callable(partial_func)

True

As you can see our `Partial` class **instance** is callable, but the `Person` class instances will not be (the class itself is callable of course):

In [17]:
class Person:
    def __init__(self, name):
        self.name = name

In [18]:
callable(Person)

True

In [19]:
p = Person('Alex')

In [20]:
callable(p)

False

#### Example: Cache with a cache-miss counter

Let's take a look at another example. I want to implement a dictionary to act as a cache, but I also want to keep track of the cache misses so I can later evaluate if my caching strategy is effective or not.

The `defaultdict` class can be useful as a cache.

Recall that I can specify a default callable to use when requesting a non-existent key from a `defaultdict`:

In [21]:
from collections import defaultdict

In [22]:
def default_value():
    return 'N/A'

In [23]:
d = defaultdict(default_value)

In [24]:
d['a']

'N/A'

In [25]:
d.items()

dict_items([('a', 'N/A')])

Now, I want to use this `default_value` callable to keep track of the number of times it has been called - this will tell me how may times a non-existent key was requested from my `defaultdict`.

I could try to create a global counter, and use that in my `default_value` function:

In [26]:
miss_counter = 0

In [27]:
def default_value():
    global miss_counter
    miss_counter += 1
    return 'N/A'

And now we can use it this way:

In [28]:
d = defaultdict(default_value)

In [29]:
d['a'] = 1
d['a']
d['b']
d['c']

'N/A'

In [30]:
miss_counter

2

This works, but is not very good - the `default_value` function **relies** on us having a global `miss_counter` variable - if we don't have it our function won't work. Additionally we cannot use it to keep track of different cache instances since they would all use the same instance of `miss_counter`.

In [31]:
del miss_counter

In [32]:
d = defaultdict(default_value)

In [33]:
try:
    d['a']
except NameError as ex:
    print(ex)

name 'miss_counter' is not defined


So nmaybe we can just pass in the counter (defined in our current scope) we want to use to the `default_value` function:

In [34]:
def default_value(counter):
    counter += 1
    return 'N/A'

But this **won't work**, because counter is now local to the function so the local `counter` will be incremented, not the `counter` from the outside scope.

Instead, we could use a class to maintain both a counter state, and return the default value for a cache miss:

In [35]:
class DefaultValue:
    def __init__(self):
        self.counter = 0
        
    def __iadd__(self, other):
        if isinstance(other, int):
            self.counter += other
            return self
        raise ValueError('Can only increment with an integer value.')

So we can use this class a a counter:

In [36]:
default_value_1 = DefaultValue()

In [37]:
default_value_1 += 1

In [38]:
default_value_1.counter

1

So this works as a counter, but `default_value_1` is not callable, which is what we need to the `defaultdict`.

So let's make it callable, and implement the behavior we need:

In [39]:
class DefaultValue:
    def __init__(self):
        self.counter = 0
        
    def __iadd__(self, other):
        if isinstance(other, int):
            self.counter += other
            return self
        raise ValueError('Can only increment with an integer value.')
        
    def __call__(self):
        self.counter += 1
        return 'N/A'

And now we can use this as our default callable for our default dicts:

In [40]:
def_1 = DefaultValue()
def_2 = DefaultValue()

In [41]:
cache_1 = defaultdict(def_1)
cache_2 = defaultdict(def_2)

In [42]:
cache_1['a'], cache_1['b']

('N/A', 'N/A')

In [43]:
def_1.counter

2

In [44]:
cache_2['a']

'N/A'

In [45]:
def_2.counter

1

As one last little enhancement, I'm going to make the returned default value an instance attribute for more flexibility:

In [46]:
class DefaultValue:
    def __init__(self, default_value):
        self.default_value = default_value
        self.counter = 0
        
    def __iadd__(self, other):
        if isinstance(other, int):
            self.counter += other
            return self
        raise ValueError('Can only increment with an integer value.')
        
    def __call__(self):
        self.counter += 1
        return self.default_value

And now we could use it this way:

In [47]:
cache_def_1 = DefaultValue(None)
cache_def_2 = DefaultValue(0)

cache_1 = defaultdict(cache_def_1)
cache_2 = defaultdict(cache_def_2)

In [48]:
cache_1['a'], cache_1['b'], cache_1['a']

(None, None, None)

In [49]:
cache_def_1.counter

2

In [50]:
cache_2['a'], cache_2['b'], cache_2['c']

(0, 0, 0)

In [51]:
cache_def_2.counter

3

So the `__call__` method can essentially be used to make **instances** of our classes callable.

This is also very useful to create **decorator** classes.

Often we just use closures to create decorators, but sometimes it is easier to use a class instead, or if we want our class to provide functionality beyond just being used as a decorator.

Let's look at an example.

#### Example: Profiling Functions

For simplicity I will assume here that we only want to decorate functions defined at the module level. For creating a decorator that also works for methods (bound functions) we have to do a bit more work and will need to understand descriptors - more on descriptors later.

So we want to easily be able to keep track of how many times our functions are called and how long they take to run on average.

Although we could cretainly implement code directly inside our function to do this, it becomes repetitive if we need to do it for multiple functions - so a decorator is ideal for that.

Let's look at how we can use a decorator class to keep track of how many times our function is called and also keep track of the time it takes to run on average.

We could certainly try a closure-based approach, maybe something like this:

In [52]:
from time import perf_counter
from functools import wraps

def profiler(fn):
    counter = 0
    total_elapsed = 0
    avg_time = 0
    
    @wraps(fn)
    def inner(*args, **kwargs):
        nonlocal counter
        nonlocal total_elapsed
        nonlocal avg_time
        counter += 1
        start = perf_counter()
        result = fn(*args, **kwargs)
        end = perf_counter()
        total_elapsed += (end - start)
        avg_time = total_elapsed / counter
        return result
    
    # we need to give a way to our users to look at the
    # counter and avg_time values - spoiler: this won't work!
    inner.counter = counter
    inner.avg_time = avg_time
    return inner

So, we added `counter` and `avg_time` as attributes to the `inner` function (the decorated function) - that works but looks a little weird - also notice that we calculate `avg_time` every time we call our decorated fuinction, even though the user may never request it - seems wasteful.

In [53]:
from time import sleep
import random

random.seed(0)

@profiler
def func1():
    sleep(random.random())

In [54]:
func1(), func1()

(None, None)

In [55]:
func1.counter

0

Hmm, that's weird - `counter` still shows zero. This is because we have to understand what we did in the decorator - we made `inner.counter` the value of `counter` **at the time the decorator function was called** - this is **not** the counter value that we keep updating!!

So instead we could try to fix it this way:

In [56]:
from time import perf_counter
from functools import wraps

def profiler(fn):
    _counter = 0
    _total_elapsed = 0
    _avg_time = 0
    
    @wraps(fn)
    def inner(*args, **kwargs):
        nonlocal _counter
        nonlocal _total_elapsed
        nonlocal _avg_time
        _counter += 1
        start = perf_counter()
        result = fn(*args, **kwargs)
        end = perf_counter()
        _total_elapsed += (end - start)
        return result
    
    # we need to give a way to our users to look at the
    # counter and avg_time values - but we need to make sure
    # it is using a cell reference!
    def counter():
        # this will now be a closure with a cell pointing to 
        # _counter
        return _counter
    
    def avg_time():
        return _total_elapsed / _counter
    
    inner.counter = counter
    inner.avg_time = avg_time
    return inner

In [57]:
@profiler
def func1():
    sleep(random.random())

In [58]:
func1(), func1()

(None, None)

In [59]:
func1.counter()

2

In [60]:
func1.avg_time()

0.3425700559746474

OK, so that works, but it's a little convoluted. In this case a decorator class will be much easier to write and read!

In [61]:
class Profiler:
    def __init__(self, fn):
        self.counter = 0
        self.total_elapsed = 0
        self.fn = fn
        
    def __call__(self, *args, **kwargs):
        self.counter += 1
        start = perf_counter()
        result = self.fn(*args, **kwargs)
        end = perf_counter()
        self.total_elapsed += (end - start)
        return result
        
    @property
    def avg_time(self):
        return self.total_elapsed / self.counter
        
        

So we can now use `Profiler` as a decorator!

In [62]:
@Profiler
def func_1(a, b):
    sleep(random.random())
    return (a, b)

In [63]:
func_1(1, 2)

(1, 2)

In [64]:
func_1.counter

1

In [65]:
func_1(2, 3)

(2, 3)

In [66]:
func_1.counter

2

In [67]:
func_1.avg_time

0.46242688701022416

And of course we can use it for other functions too:

In [68]:
@Profiler
def func_2():
    sleep(random.random())

In [69]:
func_2(), func_2(), func_2()

(None, None, None)

In [70]:
func_2.counter, func_2.avg_time

(3, 0.5231811150054758)

As you can see, it was much easier to implement this more complex decorator using a class and the `__call__` method than using a purely function approach. But of course, if the decorator is simple enough to implement using a functional approach, that's my preferred way of doing things! 

Just because I have a hammer does not mean everything is a nail!!

##  The `__del__` Method

The `__del__` method as we discussed in the lecture is called right before the object is about to be garbage collected. This is sometimes called the **finalizer**. It is sometimes referred to as the **destructor**, but that's not really accurate since that method does not destroy the object - that's the GC's responsibility - `__del__` just gets called prior to the GC destroying the object.

Although this method can be useful in some circumstances we need to be aware of some pitfalls:

1. Using the `del` keyword does not call `__del__` directly - it just removes the symbol for wehatever namespace it is being deleted from and reduces the reference count by 1.
2. The `__del__` method is not called until the object is about to be destroyed - so using `del obj` decreases the ref count by 1, but if something else is referencing that object then `__del__` is **not** called.
3. Unhandled exceptions that occur in the `__del__` method are essentially ignored, and the exceptions are written to `sys.stderr`.

It's actually pretty easy to have unwitting references to an object.

Let's first write a small helper function to calculate the reference count for an object using it's memory address (which only works correctly if the object actually exists):

In [1]:
import ctypes

def ref_count(address):
    return ctypes.c_long.from_address(address).value

Now let's write a class that implements the `__del__` method:

In [2]:
class Person:
    def __init__(self, name):
        self.name = name
        
    def __repr__(self):
        return f'Person({self.name})'
    
    def __del__(self):
        print(f'__del__ called for {self}...')

Let's first see how the `__del__` gets called when we create then remove a reference to an instance in our global scope:

In [3]:
p = Person('Alex')

We can now remove that reference from the symbol `p` to the instance either by using `del p` or even just setting `p` to `None`:

In [4]:
p = None

__del__ called for Person(Alex)...


As you can see the `__del__` was called.

It works the same way with the `del` statement:

In [5]:
p = Person('Alex')

In [6]:
del p

__del__ called for Person(Alex)...


Now let's see how we might create an unwitting extra reference to the object.

Let's implement a method that is going to create an exception:

In [7]:
class Person:
    def __init__(self, name):
        self.name = name
    
    def gen_ex(self):
        raise ValueError('Something went bump...')
        
    def __repr__(self):
        return f'Person({self.name})'
    
    def __del__(self):
        print(f'__del__ called for {self}...')

In [8]:
p = Person('Alex')

At this point we have one reference to the object, the reference held by `p`:

In [9]:
p_id = id(p)
ref_count(p_id)

1

Now let's make that exception happen and store the exception in a variable:

In [10]:
try:
    p.gen_ex()
except ValueError as ex:
    error = ex
    print(ex)

Something went bump...


In [11]:
ref_count(p_id)

2

As you can see our reference count is now `2`. Why?

Let's look at the `error` variable:

In [12]:
dir(error)

['__cause__',
 '__class__',
 '__context__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__setstate__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__suppress_context__',
 '__traceback__',
 'args',
 'with_traceback']

In [13]:
dir(error.__traceback__)

['tb_frame', 'tb_lasti', 'tb_lineno', 'tb_next']

In [14]:
dir(error.__traceback__.tb_frame)

['__class__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'clear',
 'f_back',
 'f_builtins',
 'f_code',
 'f_globals',
 'f_lasti',
 'f_lineno',
 'f_locals',
 'f_trace']

In [15]:
for key, value in error.__traceback__.tb_frame.f_locals.copy().items():
    if isinstance(value, Person):
        print(key, value, id(value), id(key))

p Person(Alex) 140665691193640 140665683500032


As you can see the traceback contains a refererence to our object in it's dictionary - so we have a second reference to our object.

Let's check our reference count now, to make sure we did not inadvertently create even more references:

In [16]:
ref_count(p_id)

2

Now, even if we remove our reference to the object, we will still have something handing on to it, and the `__del__` method will not get called:

In [17]:
del p

See! `__del__` was not called!

But now let's get rid of that exception we stored:

In [18]:
del error

__del__ called for Person(Alex)...


And now, as you can see, we finally had the `__del__` method called. (Note that depending on what you were doing in your notebook, you may not even see this call at all - which just means that something else is holding on to our object somewhere!)

For this reason it is rare for devs to use the `__del__` method for critical things like closing a file, or closing committing a transaction in a database, etc - instead use a context manager, and avoid using the `__del__` method.

Because you do not know when the `__del__` method is going to get called (unless you know exactly how your code might be creating references to the object), you could also get into a situation where other objects (like global objects) referenced in the `__del__` method will even still be around by the time `__del__` is called (it would get called when the module is destroyed, such as at program shutdown).

The last point to make about `__del__` is that any unhandled exceptions in the `__del__` method are essentially ignored by Python (although their output is sent to `sys.stderr`).

Let's see this:

In [19]:
class Person:
    def __del__(self):
        raise ValueError('Something went bump...')

In [20]:
p = Person()

In [21]:
del p

Exception ignored in: <bound method Person.__del__ of <__main__.Person object at 0x7fef381d8e48>>
Traceback (most recent call last):
  File "<ipython-input-19-6ed6e62e38b8>", line 3, in __del__
ValueError: Something went bump...


What we are seeing here is actually the `stderr` output, which Jupyter redirects into our notebook.

In [22]:
import sys

In [23]:
sys.stderr, sys.stdout

(<ipykernel.iostream.OutStream at 0x7fef18b90978>,
 <ipykernel.iostream.OutStream at 0x7fef18b903c8>)

What I'm going to do here is redirect `stderr` to a file instead, using a context manager:

In [24]:
class ErrToFile:
    def __init__(self, fname):
        self._fname = fname
        self._current_stderr = sys.stderr
        
    def __enter__(self):
        self._file = open(self._fname, 'w')
        sys.stderr = self._file
        
    def __exit__(self, exc_type, exc_value, exc_tb):
        sys.stderr = self._current_stderr
        if self._file:
            self._file.close()
        return False

In [25]:
p = Person()

In [26]:
with ErrToFile('err.txt'):
    del p

As you can see, no exception was generated and our code continues to run happily along.

But let's examine the contents of that file:

In [27]:
with open('err.txt') as f:
    print(f.readlines())

['Exception ignored in: <bound method Person.__del__ of <__main__.Person object at 0x7fef381cc9e8>>\n', 'Traceback (most recent call last):\n', '  File "<ipython-input-19-6ed6e62e38b8>", line 3, in __del__\n', 'ValueError: Something went bump...\n']


So, as you can see the exception was silenced and the exception data was just sent to `stderr`.

What this means is that you cannot trap exceptions that occur in the `__del__` method (from outside the `__del__` method to be exact):

In [28]:
p = Person()

try:
    del p
    print('p was deleted (succesfully)')
except ValueError as ex:
    print('Exception caught!')
else:
    print('No exception seen!')

p was deleted (succesfully)
No exception seen!


Exception ignored in: <bound method Person.__del__ of <__main__.Person object at 0x7fef381ee898>>
Traceback (most recent call last):
  File "<ipython-input-19-6ed6e62e38b8>", line 3, in __del__
ValueError: Something went bump...


Now all this does not mean you should just altogether avoid using the `__del__` method - you just need to be aware of its limitations, and be extra careful in your code with circular references or unintentional extra references to your objects.
Things get even dicier when using multi-threading, but that's beyond the scope of this course!

Personally I never use `__del__`. Instead I use context managers to manage releasing resources such as files, sockets, database connections, etc.

##  The `__format__` Method

We saw before the use of `__str__` and `__repr__`.

However we have one more formatting function to look at!

The `format()` function.

For example we can use `format()` with a format specification for floats:

In [1]:
a = 0.1

In [2]:
format(a, '.20f')

'0.10000000000000000555'

Or we can use it with a datetime object:

In [3]:
from datetime import datetime

In [4]:
now = datetime.utcnow()

In [5]:
now

datetime.datetime(2019, 6, 13, 3, 43, 48, 904829)

In [6]:
format(now, '%a %Y-%m-%d  %I:%M %p')

'Thu 2019-06-13  03:43 AM'

We can implement support for format specifiers in our own classes by implementing the `__format__` method.

This is actually quite complicated to do, so we usually delegate back to some other type's formatting.

Just like with `__str__` and `__repr__`, `__format__` should return a string.

In [7]:
class Person:
    def __init__(self, name, dob):
        self.name = name
        self.dob = dob
        
    def __repr__(self):
        print('__repr__ called...')
        return f'Person(name={self.name}, dob={self.dob.isoformat()})'
    
    def __str__(self):
        print('__str__ called...')
        return f'Person({self.name})'
    
    def __format__(self, date_format_spec):
        print(f'__format__ called with {repr(date_format_spec)}...')
        dob = format(self.dob, date_format_spec)
        return f'Person(name={self.name}, dob={dob})'

So now have:

In [8]:
from datetime import date

p = Person('Alex', date(1900, 10, 20))

In [9]:
str(p)

__str__ called...


'Person(Alex)'

In [10]:
repr(p)

__repr__ called...


'Person(name=Alex, dob=1900-10-20)'

In [11]:
format(p, '%B %d, %Y')

__format__ called with '%B %d, %Y'...


'Person(name=Alex, dob=October 20, 1900)'

If we do not specify a format, then the `format` function will use an empty string:

In [12]:
format(p)

__format__ called with ''...


'Person(name=Alex, dob=1900-10-20)'

# Section 05 - Project 2

##  Project 2 - Solution

In [1]:
from functools import total_ordering

@total_ordering
class Mod:
    def __init__(self, value, modulus):
        if not isinstance(modulus, int):
            raise TypeError('Unsupported type for modulus')
        if not isinstance(value, int):
            raise TypeError('Unsupported type for value')
        if modulus <= 0:
            raise ValueError('Modulus must be positive')

        self._modulus = modulus
        self._value = value % modulus  # store residue as the value
        
    @property
    def modulus(self):
        return self._modulus
    
    @property
    def value(self):
        return self._value
    
    @value.setter
    def value(self, value):
        self._value = value
    
    def __repr__(self):
        return f'Mod({self._value}, {self._modulus})'
    
    def __int__(self):
        # calculates the value (residue)
        return self.value

    def __eq__(self, other):
        # calculates congruence (same equivalence class)
        if isinstance(other, Mod):
            if self.modulus != other.modulus:
                return NotImplemented
            else:
                return self.value == other.value
        elif isinstance(other, int):
            return other % self.modulus == self.value
        else:
            return NotImplemented
    
    def __hash__(self):
        return hash((self.value, self.modulus))
    
    def __neg__(self):
        return Mod(-self.value, self.modulus)
    
    def __add__(self, other):
        if isinstance(other, Mod) and self.modulus == other.modulus:
            return Mod(self.value + other.value, self.modulus)
        if isinstance(other, int):
            return Mod(self.value + other, self.modulus)
        return NotImplemented
    
    def __iadd__(self, other):
        if isinstance(other, Mod) and self.modulus == other.modulus:
            self.value = (self.value + other.value) % self.modulus
            return self
        elif isinstance(other, int):
            self.value = (self.value + other) % self.modulus
            return self
        return NotImplemented
    
    def __sub__(self, other):
        if isinstance(other, Mod) and self.modulus == other.modulus:
            return Mod(self.value - other.value, self.modulus)
        if isinstance(other, int):
            return Mod(self.value - other, self.modulus)
        return NotImplemented
    
    def __isub__(self, other):
        if isinstance(other, Mod) and self.modulus == other.modulus:
            self.value = (self.value - other.value) % self.modulus
            return self
        if isinstance(other, int):
            self.value = (self.value - other) % self.modulus
            return self
        return NotImplemented
        
    def __mul__(self, other):
        if isinstance(other, Mod) and self.modulus == other.modulus:
            return Mod(self.value * other.value, self.modulus)
        if isinstance(other, int):
            return Mod(self.value * other, self.modulus)
        return NotImplemented
    
    def __imul__(self, other):
        if isinstance(other, Mod) and self.modulus == other.modulus:
            self.value = (self.value * other.value) % self.modulus
            return self
        if isinstance(other, int):
            self.value = (self.value * other) % self.modulus
            return self
        return NotImplemented
    
    def __pow__(self, other):
        if isinstance(other, Mod) and self.modulus == other.modulus:
            return Mod(self.value ** other.value, self.modulus)
        if isinstance(other, int):
            # use residue of other, to make computation potentially smaller
            return Mod(self.value ** (other % self.modulus), self.modulus)
        return NotImplemented
    
    def __ipow__(self, other):
        if isinstance(other, Mod) and self.modulus == other.modulus:
            self.value = (self.value ** other.value) % self.modulus
            return self
        if isinstance(other, int):
            # use residue of other, to make computation potentially smaller
            self.value = (self.value ** (other % self.modulus)) % self.modulus
            return self
        return NotImplemented
    
    def __lt__(self, other):
        if isinstance(other, Mod) and self.modulus == other.modulus:
            return self.value < other.value
        if isinstance(other, int):
            return self.value < other % self.modulus
        return NotImplemented

You should test this class writing some unit tests!

OK, so this class implementation seems to work, but I'm not happy about the amount of repetitive code we had to write (all those checks to make sure we either have a comparable Mod instance, and then either using the value of the Mod instance or the int depending on what was passed in).

I really want to do something about that.

First thing is I'm going to add a "private" method that will indicate whether two objects are compatible. Maybe something like this:

In [2]:
def _is_compatible(self, other):
    return isinstance(other, int) or (isinstance(other, Mod) and self.modulus == other.modulus)

But then I'm still left with which value do I use,  `.value` or the `int` itself. So, I'm going to make that part of the compatibility check. 

Here, I'm going to use exceptions to indicate an incompatible type, otherwise I'll return the value we should use. 

Something like this:

In [3]:
def _get_value(self, other):
    if isinstance(other, int):
        return other % self.modulus  # return the residue
    if isinstance(other, Mod) and self.modulus == other.modulues:
        return other.value
    raise TypeError('Incompatible types.')

And then I can refactor my class accordingly. Also, even though we should technically return `NotImplemented` (to allow Python to use reflection), in this case the reflection is not going to be needed, so I'm just going to let the `TypeError` exception through.

The only exception to this is for ordering - we **do** want Python to try to reflect a `<` if `>` is not implemented (although using `@total_ordering` means this does not really matter anyway).

In [4]:
from functools import total_ordering

@total_ordering
class Mod:
    def __init__(self, value, modulus):
        if not isinstance(modulus, int):
            raise TypeError('Unsupported type for modulus')
        if not isinstance(value, int):
            raise TypeError('Unsupported type for value')
        if modulus <= 0:
            raise ValueError('Modulus must be positive')

        self._modulus = modulus
        self._value = value % modulus  # store residue as the value
        
    @property
    def modulus(self):
        return self._modulus
    
    @property
    def value(self):
        return self._value
    
    @value.setter
    def value(self, value):
        self._value = value
    
    def __repr__(self):
        return f'Mod({self._value}, {self._modulus})'
    
    def __int__(self):
        # calculates the value (residue)
        return self.value

    def _get_value(self, other):
        if isinstance(other, int):
            return other % self.modulus  # return the residue
        if isinstance(other, Mod) and self.modulus == other.modulus:
            return other.value
        raise TypeError('Incompatible types.')
    
    def __eq__(self, other):
        # calculates congruence (same equivalence class)
        other_value = self._get_value(other)
        return other_value == self.value
    
    def __hash__(self):
        return hash((self.value, self.modulus))
    
    def __neg__(self):
        return Mod(-self.value, self.modulus)
    
    def __add__(self, other):
        other_value = self._get_value(other)
        return Mod(self.value + other_value, self.modulus)
    
    def __iadd__(self, other):
        other_value = self._get_value(other)
        self.value = (self.value + other_value) % self.modulus
        return self
    
    def __sub__(self, other):
        other_value = self._get_value(other)
        return Mod(self.value - other_value, self.modulus)
    
    def __isub__(self, other):
        other_value = self._get_value(other)
        self.value = (self.value - other_value) % self.modulus
        return self
    
    def __mul__(self, other):
        other_value = self._get_value(other)
        return Mod(self.value * other_value, self.modulus)
    
    def __imul__(self, other):
        other_value = self._get_value(other)
        self.value = (self.value * other_value) % self.modulus
        return self
    
    def __pow__(self, other):
        other_value = self._get_value(other)
        return Mod(self.value ** other_value, self.modulus)
        
    def __ipow__(self, other):
        other_value = self._get_value(other)
        self.value = (self.value ** other_value) % self.modulus
        return self
    
    def __lt__(self, other):
        # here, raising a TypeError instead of returning NotImplemented
        # would result in Python not trying the reflection - which we DO want
        # although since we are using @total_ordering this does not really matter
        try:
            other_value = self._get_value(other)
            return self.value < other_value
        except TypeError:
            return NotImplemented

Ok, so this is better, but there still quite a bit of repetitive code for the addition, subtraction, multiplcation and power operations - the only thing that changes there is which particular arithmetic operator we are delegating to.

So, I'm going to use the `operator` module to simplify things even further.

In [5]:
import operator

In [6]:
operator.mul(2, 3)

6

In [7]:
operator.add(2, 3)

5

So, now I'm going to write a generic `_compute` that will perform the requested operation, and return either a new `Mod` object, or do the in-place calculation (I'll use an optional keyword-only arg for this).

So, something like this:

In [8]:
def _perform_operation(self, other, op, *, in_place=False):
    other_value = self._get_value(other)
    new_value = op(self.value, other_value)
    if in_place:
        self.value = new_value % self.modulus
        return self
    else:
        return Mod(new_value, self.modulus)

Let's add that to our class and refactor one more time:

In [9]:
from functools import total_ordering

@total_ordering
class Mod:
    def __init__(self, value, modulus):
        if not isinstance(modulus, int):
            raise TypeError('Unsupported type for modulus')
        if not isinstance(value, int):
            raise TypeError('Unsupported type for value')
        if modulus <= 0:
            raise ValueError('Modulus must be positive')

        self._modulus = modulus
        self._value = value % modulus  # store residue as the value
        
    @property
    def modulus(self):
        return self._modulus
    
    @property
    def value(self):
        return self._value
    
    @value.setter
    def value(self, value):
        self._value = value
    
    def __repr__(self):
        return f'Mod({self._value}, {self._modulus})'
    
    def __int__(self):
        # calculates the value (residue)
        return self.value

    def _get_value(self, other):
        if isinstance(other, int):
            return other % self.modulus  # return the residue
        if isinstance(other, Mod) and self.modulus == other.modulus:
            return other.value
        raise TypeError('Incompatible types.')
    
    def _perform_operation(self, other, op, *, in_place=False):
        other_value = self._get_value(other)
        new_value = op(self.value, other_value)
        if in_place:
            self.value = new_value % self.modulus
            return self
        else:
            return Mod(new_value, self.modulus)
    
    def __eq__(self, other):
        # calculates congruence (same equivalence class)
        other_value = self._get_value(other)
        return other_value == self.value
    
    def __hash__(self):
        return hash((self.value, self.modulus))
    
    def __neg__(self):
        return Mod(-self.value, self.modulus)
    
    def __add__(self, other):
        return self._perform_operation(other, operator.add)
    
    def __iadd__(self, other):
        return self._perform_operation(other, operator.add, in_place=True)
    
    def __sub__(self, other):
        return self._perform_operation(other, operator.sub)
    
    def __isub__(self, other):
        return self._perform_operation(other, operator.sub, in_place=True)
    
    def __mul__(self, other):
        return self._perform_operation(other, operator.mul)
    
    def __imul__(self, other):
        return self._perform_operation(other, operator.mul, in_place=True)
    
    def __pow__(self, other):
        return self._perform_operation(other, operator.pow)
        
    def __ipow__(self, other):
        return self._perform_operation(other, operator.pow, in_place=True)
    
    def __lt__(self, other):
        # here, raising a TypeError instead of returning NotImplemented
        # would result in Python not trying the reflection - which we DO want
        # although since we are using @total_ordering this does not really matter
        try:
            other_value = self._get_value(other)
            return self.value < other_value
        except TypeError:
            return NotImplemented

OK, so if you had your unit tests set up, each refactor we did would only have needed re-running the unit tests to make sure we did not break anything!

# Section 06 - Single Inheritance

##  Single Inheritance

For now we're just going to define classes that inherit from another class, but we aren't going to bother implementing any functionality or state for these classes.

We just want to explore the relationships between objects created from classes that inherit from each other.

In [1]:
class Shape:
    pass

class Ellipse(Shape):
    pass

class Circle(Ellipse):
    pass

class Polygon(Shape):
    pass

class Rectangle(Polygon):
    pass

class Square(Rectangle):
    pass

class Triangle(Polygon):
    pass

As you can see we created a single inheritance chain that looks something like this:

```
                         Shape
     Ellipse                            Polygon
     
      Circle                   Rectangle          Triangle
                               Square
```

It is important to understand that these **classes** are subclasses of each other - just remember that **subclass** contains the word **class** - so it defines a relationship between classes, not instances:

In [2]:
issubclass(Ellipse, Shape)

True

But if we create instances of those two classes:

In [3]:
s = Shape()
e = Ellipse()
try:
    issubclass(e, s)
except TypeError as ex:
    print(ex)

issubclass() arg 1 must be a class


When we deal with instances of classes, we can instead use the `isinstance()` function:

In [4]:
isinstance(e, Ellipse)

True

But, not only is `e` an instance of an `Ellipse`, since `Ellipse` IS-A `Shape`, i.e. `Ellipse` is a **subclass** of `Shape`, it tunrs out thet `e` is also considered an instance of `Shape`:

In [5]:
isinstance(e, Shape)

True

Subclasses behave similarly in that a class may be a subclass of another class without being a **direct** subclass.

In our example here, every class we defined is a subclass of `Shape` because the inheritance chains all go back up to the `Shape` class:

In [6]:
issubclass(Square, Shape)

True

And of course, the same works for instances when we look at `isinstance`:

In [7]:
sq = Square()

In [8]:
isinstance(sq, Square)

True

In [9]:
isinstance(sq, Rectangle)

True

In [10]:
isinstance(sq, Polygon)

True

In [11]:
isinstance(sq, Shape)

True

But of course, a `Square` is not a subclass of `Ellipse` and `Square` instances are not instances of `Ellipse`:

In [12]:
issubclass(Square, Ellipse)

False

In [13]:
isinstance(sq, Ellipse)

False

We'll come back to this later, but when we define a class in Python 3 that does not explicitly inherit from another class:

In [14]:
class Person:
    pass

it is actually implicitly inheriting from a class!

There is a class in Python called `object` - yes, it is a **class**, even though the name says `object` (but classes are objects - everything in Python is an object):

In [15]:
issubclass(Person, object)

True

In [16]:
p = Person()

In [17]:
isinstance(p, Person)

True

This means that our `Shape` class we created actually inherits from `object`, and therefore every other class we created also inherits from `object`:

In [18]:
issubclass(Square, object)

True

In [19]:
isinstance(sq, object)

True

We'll look at the `object` class in the next lecture.

##  The `object` Class

As we discussed earlier, `object` is a built-in Python **class**, and every class in Python inherits from that class.

In [1]:
type(object)

type

As you can see the type of `object` is `type` - this means it is a class, just like `int`, `str`, `dict` are also classes (types):

In [2]:
type(int), type(str), type(dict)

(type, type, type)

When we create a class that does not explicitly inherit from anything, we are implicitly inheriting from `object`:

In [3]:
class Person:
    pass

In [4]:
issubclass(Person, object)

True

And it's not just our custom classes that inherit from `object`, every type in Python does too:

In [5]:
issubclass(int, object)

True

Even modules, which are objects and instances of `module` are subclasses of `object`:

In [6]:
import math

In [7]:
type(math)

module

So the `math` module is an instance of the `module` type:

In [8]:
ty = type(math)

In [9]:
type(ty)

type

In [10]:
issubclass(ty, object)

True

If you're wondering where the `module` type (class) lives, you can get a reference to it the way I did here, or you can look for it in the `types` module where you can it and the other built-in types.

In [11]:
import types

In [12]:
dir(types)

['AsyncGeneratorType',
 'BuiltinFunctionType',
 'BuiltinMethodType',
 'CodeType',
 'CoroutineType',
 'DynamicClassAttribute',
 'FrameType',
 'FunctionType',
 'GeneratorType',
 'GetSetDescriptorType',
 'LambdaType',
 'MappingProxyType',
 'MemberDescriptorType',
 'MethodType',
 'ModuleType',
 'SimpleNamespace',
 'TracebackType',
 '_GeneratorWrapper',
 '__all__',
 '__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 '_ag',
 '_calculate_meta',
 '_collections_abc',
 '_functools',
 'coroutine',
 'new_class',
 'prepare_class']

For example, if we define a function:

In [13]:
def my_func():
    pass

In [14]:
type(my_func)

function

In [15]:
types.FunctionType is type(my_func)

True

And `FunctionType` inherits from `object`:

In [16]:
issubclass(types.FunctionType, object)

True

and of course, instances of that type are therefore also instances of `object`:

In [17]:
isinstance(my_func,  object)

True

as well as being instances of `FunctionType`:

In [18]:
isinstance(my_func, types.FunctionType)

True

The `object` class implements a certain amount of base functionality.

We can see some of them here:

In [19]:
dir(object)

['__class__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__']

So as you can see `object` implements methods such as `__eq__`, `__hash__`, `__repr__` and `__str__`.

Let's investigate some of those, starting with `__repr__` and `__str__`:

In [20]:
o1 = object()

In [21]:
str(o1)

'<object object at 0x7faa0003bd10>'

In [22]:
repr(o1)

'<object object at 0x7faa0003bd10>'

You probably recognize that output! If we define our own class that does not **override** the `__repr__` or `__str__` methods, when we call those methods on instances of that class it will actually call the implementation in the `object` class:

In [23]:
class Person:
    pass

In [24]:
p = Person()
str(p)

'<__main__.Person object at 0x7fa9a0077f28>'

So this actually called the `__str__` method in the `object` class (but it is an instance method, so it applies to our specific instance `p`).

Similarly, the `__eq__` method in the object class is  implemented, and uses the object **id** to determine equality:

In [25]:
o1 = object()
o2 = object()

In [26]:
id(o1), id(o2)

(140368121412912, 140368121412880)

In [27]:
o1 is o2, o1 == o2, o1 is o1, o1 == o1

(False, False, True, True)

So we can use the `==` operator with our custom classes even if we did not implement `__eq__` explicitly - because it inherits it from the `object` class. 

And so we have the same functionality - our custom objects will compare equal only if they are the same object (id):

In [28]:
p1 = Person()
p2 = Person()

p1 is p2, p1 == p2, p1 is p1, p1 == p1

(False, False, True, True)

We can actually see what specific method is being called by looking at the id of the method in our object, and in the object class:

In [29]:
id(Person.__eq__)

140368389620424

In [30]:
id(object.__eq__)

140368389620424

See? Same method!

In the same way, we can write classes that do not have `__init__` or `__new__` methods - because they just inherit it from `object`:

In [31]:
id(Person.__init__), id(object.__init__)

(140368389620744, 140368389620744)

But of course, if we override those methods, then the `object` methods will not be used:

In [32]:
class Person:
    def __init__(self):
        pass

In [33]:
id(Person.__init__), id(object.__init__)

(140366511098464, 140368389620744)

Different methods...

We'll look at overriding in more detail next.

##  Overriding

As we saw in the lecture, classes that inherit from another class **inherit** the functionality from the parent class (and all parent classes up the chain).

Let's look at what happens when we override the `__str__` method in a custom class (which remember inherits it from the `object` class):

In [1]:
class Person:
    pass

In [2]:
p = Person()
str(p)

'<__main__.Person object at 0x7fbcb04c3908>'

What happened here is that `str()` tries to call a `__str__` method. Since the `Person` class does not define it, Python continues looking up the inheritance chain until it finds it - in this case it finds it in the `object` class, so it uses it.

Now let's override the `__str__` method in the `Person` class:

In [3]:
class Person:
    def __str__(self):
        return 'Person class'

In [4]:
p = Person()

In [5]:
str(p)

'Person class'

What happens if we implement a `__repr__` method only, and still call the `str()` method:

In [6]:
class Person:
    def __repr__(self):
        return 'Person()'

In [7]:
p = Person()

In [8]:
str(p)

'Person()'

As you can see it ended calling `__repr__` **in the Person class**, even though we did not have a `__str__` method defined - that's because `objects` delegates `str` to `__repr__` which in turn will find it in our class.

As we discussed in the lecture, in an inheritance chain we have to be very aware of how overrides are handled.

Let's create a simple chain:

In [9]:
class Shape:
    def __init__(self, name):
        self.name = name
        
    def info(self):
         return f'Shape.info called for Shape({self.name})'
    
    def extended_info(self):
        return f'Shape.extended_info called for Shape({self.name})'
    
class Polygon(Shape):
    def __init__(self, name):
        self.name = name  # we'll come back to this later in the context of using the super()
        
    def info(self):
        return f'Polygon info called for Polygon({self.name})'

In [10]:
p = Polygon('square')

In [11]:
p.info()

'Polygon info called for Polygon(square)'

But if we call `extended_info`:

In [12]:
p.extended_info()

'Shape.extended_info called for Shape(square)'

That makes sense, it uses `extended_info` in the superclass - but now let's add a twist - let's have `extended_info` in the `Shape` class also call `info`:

In [13]:
class Shape:
    def __init__(self, name):
        self.name = name
        
    def info(self):
         return f'Shape.info called for Shape({self.name})'
    
    def extended_info(self):
        return f'Shape.extended_info called for Shape({self.name})', self.info()
    
class Polygon(Shape):
    def __init__(self, name):
        self.name = name  # we'll come back to this later in the context of using the super()
        
    def info(self):
        return f'Polygon.info called for Polygon({self.name})'

In [14]:
p = Polygon('Square')

In [15]:
p.info()

'Polygon.info called for Polygon(Square)'

That works the same as before. But what about `extended_info`? Remember it will use the definition in `Shape`, which in turn calls `info`. Keep in mind that `self` in that context refers to `p` - a `Polygon` class which overrides `info`:

In [16]:
print(p.extended_info())

('Shape.extended_info called for Shape(Square)', 'Polygon.info called for Polygon(Square)')


And this is the same mechanism that results in `str(Person)` ending up calling the `__repr__` method in the `Person` class instead of the `__repr__` method in the `object` class which would have just printed out the name and memory address of the `Person` instance.

In fact we can see how this happens exactly this way:

In [17]:
class Person:
    def __str__(self):
        return 'Person.__str__ called'
    
class Student(Person):
    def __repr__(self):
        return 'Student.__repr__ called'

In [18]:
s = Student()

In [19]:
str(s)

'Person.__str__ called'

In [20]:
repr(s)

'Student.__repr__ called'

And if we now have `__str__` delegate to `__repr__` instead:

In [21]:
class Person:
    def __str__(self):
        print('Person.__str__ called')
        return self.__repr__()
    
class Student(Person):
    def __repr__(self):
        return 'Student.__repr__ called'

In [22]:
s = Student()

In [23]:
str(s)

Person.__str__ called


'Student.__repr__ called'

In [24]:
repr(s)

'Student.__repr__ called'

Basically just keep track of which instance the methods are bound to and always start working you way from there to find the "closest" relevant method.

##  Extending

So far we have seen inheriting and overriding methods from a parent class.

We can also provide additional functionality in child classes. This is very straightforward, we simply define methods (or attributes) in the child class.

In fact we have already done this multiple times - whenever we create a class and define attributes and methods, we are essentially extending the functionality of the `object` class!

In [1]:
class Person:
    pass

In [2]:
class Student(Person):
    def study(self):
        return 'study... study... study...'

In [3]:
p = Person()

In [4]:
try:
    p.study()
except AttributeError as ex:
    print(ex)

'Person' object has no attribute 'study'


In [5]:
s = Student()

In [6]:
isinstance(s, Person)

True

In [7]:
s.study()

'study... study... study...'

Now, think back to what happened when we provided an override in a child class and called the method from inside a method in the parent class.

Since the method being called was bound to an instance of the child class we ended up calling the override method in the child class.

The same thing happens here:

In [8]:
class Person:
    def routine(self):
        return self.eat() + self.study() + self.sleep()
        
    def eat(self):
        return 'Person eats...'
    
    def sleep(self):
        return 'Person sleeps...'
        

Now we have a problem here! We call `self.study()` in the `routine` method of `Person`, but of course that method does not exist.

We get this exception if we try to call `routine`:

In [9]:
p = Person()

try:
    p.routine()
except AttributeError as ex:
    print(ex)

'Person' object has no attribute 'study'


But watch what happens if we create a `Student` class that inherits from `Person` and extends that class by implementing a `study` method:

In [10]:
class Student(Person):
    def study(self):
        return 'Student studies...'

In [11]:
s = Student()

In [12]:
s.routine()

'Person eats...Student studies...Person sleeps...'

So, `Person` does not implement `sleep`, but `Student` does. In this case, since we are directly calling `sleep` from the `Person` class we really want that method to exist. Or we could check if the instance has that method before we call it.

Let's do the latter first:

In [13]:
class Person:
    def routine(self):
        result = self.eat()
        if hasattr(self, 'study'):
            result += self.study()
        result += self.sleep()
        return result
    
    def eat(self):
        return 'Person eats...'
    
    def sleep(self):
        return 'Person sleeps...'

In [14]:
p = Person()

In [15]:
p.routine()

'Person eats...Person sleeps...'

So that works, and if our child class implements the `study` method:

In [16]:
class Student(Person):
    def study(self):
        return 'Student studies...'

In [17]:
s = Student()

In [18]:
s.routine()

'Person eats...Student studies...Person sleeps...'

There are times when we want our base class to be used as a base class only, and not really directly. This starts getting into abstract classes, so I won't cover it now beyond a few basics.

Suppose we want our "base" class to be something that is used via inheritance, and not really directly. If you've studied Java OOP, you probably are aware of this coconcept alreads: **abstract** classes.

Abstract classes are basically classes that are not meant to be instantiated directly, but instead used in some inheritance chain.

For now, we can achieve this quite simply in Python by actually implementing the method in the "base" class, but returning a `NotImplemented` value, letting the users of our class know that they need to implement the functionality by overriding the method.

We could do it this way:

In [19]:
class Person:
    def __init__(self, name):
        self.name = name
        
    def routine(self):
        return NotImplemented

In [20]:
p = Person('Alex')

In [21]:
p.routine()

NotImplemented

And now we can extend this class, providing an override for that method:

In [22]:
class Student(Person):
    def routine(self):
        return 'Eat...Study...Sleep'

In [23]:
class Teacher(Person):
    def routine(self):
        return 'Eat...Teach...Sleep'

In [24]:
s = Student('Alex')

In [25]:
t = Teacher('Fred')

In [26]:
s.routine()

'Eat...Study...Sleep'

In [27]:
t.routine()

'Eat...Teach...Sleep'

The drawback of our current approach is that we can still create instances of the `Person` class - but doing so does not make much sense since we really need the `routine` method to be defined.

To address this properly we will need to look at the framework Python provides for abstract base classes (*ABC*), but is beyond our current scope.

Everything I have explained concerning the method being always bound to the instance applies equally well to any instance or class attribute.

Let's look at an example of this:

In [28]:
class Account:
    apr = 3.0
    
    def __init__(self, account_number, balance):
        self.account_number = account_number
        self.balance = balance
        self.account_type = 'Generic Account'
        
    def calc_interest(self):
        return f'Calc interest on {self.account_type} with APR = {self.apr}'
        

In [29]:
a = Account(123, 100)

In [30]:
a.apr, a.account_type, a.calc_interest()

(3.0, 'Generic Account', 'Calc interest on Generic Account with APR = 3.0')

In [31]:
class Savings(Account):
    apr = 5.0
    
    def __init__(self, account_number, balance):
        self.account_number = account_number  # We'll revisit this later - this is clumsy
        self.balance = balance
        self.account_type = 'Savings Account'

In [32]:
s = Savings(234, 200)

In [33]:
s.apr, s.account_type, s.calc_interest()

(5.0, 'Savings Account', 'Calc interest on Savings Account with APR = 5.0')

Notice how the `calc_interest` method defined in the `Account` class used the correct instance value for `account_type` as well as the class level variable `apr`.

Now let's look at the class variable a bit closer.

You'll notice that I referenced it by using `self.apr`.

Now as we know, we can also access class attributes directly from the class, not just from the instance:

In [34]:
Account.apr, Savings.apr

(3.0, 5.0)

But we have to be careful here when we use it in the `calc_interest` method:

In [35]:
class Account:
    apr = 3.0
    
    def __init__(self, account_number, balance):
        self.account_number = account_number
        self.balance = balance
        self.account_type = 'Generic Account'
        
    def calc_interest(self):
        return f'Calc interest on {self.account_type} with APR = {Account.apr}'
        
        
class Savings(Account):
    apr = 5.0
    
    def __init__(self, account_number, balance):
        self.account_number = account_number  # We'll revisit this later - this is clumsy
        self.balance = balance
        self.account_type = 'Savings Account' 

In [36]:
s = Savings(123, 100)
s.calc_interest()

'Calc interest on Savings Account with APR = 3.0'

Notice how even though this was a `Savings` account, we still used the `apr` defined in the `Account` class. That's because we explicitly used `Account.apr`.

This is why I chose to use `self.apr` in the first example. We can also use the `__class__` method to recover the actual class of the specific instance:

In [37]:
a = Account(123, 100)
s = Savings(234, 200)

In [38]:
a.__class__

__main__.Account

In [39]:
s.__class__

__main__.Savings

Fairly often we need to get a handle on the class of the instance, but we cannot assume it is necessarily the class our code is *defined* in, as was the case in his example. Even though `calc_interest` is defined in the `Account` class, it is actually bound to an instance of the `Savings` class when we call `s.calc_interest()`.

So we can also do it this way:

In [40]:
class Account:
    apr = 3.0
    
    def __init__(self, account_number, balance):
        self.account_number = account_number
        self.balance = balance
        self.account_type = 'Generic Account'
        
    def calc_interest(self):
        return f'Calc interest on {self.account_type} with APR = {self.__class__.apr}'
        
        
class Savings(Account):
    apr = 5.0
    
    def __init__(self, account_number, balance):
        self.account_number = account_number  # We'll revisit this later - this is clumsy
        self.balance = balance
        self.account_type = 'Savings Account' 

In [41]:
a = Account(123, 100)
s = Savings(234, 200)

In [42]:
a.calc_interest(), s.calc_interest()

('Calc interest on Generic Account with APR = 3.0',
 'Calc interest on Savings Account with APR = 5.0')

So why use this `self.__class__.apr` technique instead of using `self.apr`? Basically if we want to protect from someone shadowing the `apr` class attribute with an instance attribute:

Remember that instances can define instance attributes that can shadow class attributes:

In [43]:
s1 = Savings(123, 100)

In [44]:
s1.__dict__

{'account_number': 123, 'balance': 100, 'account_type': 'Savings Account'}

In [45]:
s1.apr

5.0

In [46]:
s2 = Savings(234, 200)
s2.apr = 10

In [47]:
s2.__dict__

{'account_number': 234,
 'balance': 200,
 'account_type': 'Savings Account',
 'apr': 10}

In [48]:
s2.apr

10

So now watch what happens when we use the `self.apr`:

In [49]:
class Account:
    apr = 3.0
    
    def __init__(self, account_number, balance):
        self.account_number = account_number
        self.balance = balance
        self.account_type = 'Generic Account'
        
    def calc_interest(self):
        return f'Calc interest on {self.account_type} with APR = {self.apr}'
        
        
class Savings(Account):
    apr = 5.0
    
    def __init__(self, account_number, balance):
        self.account_number = account_number  # We'll revisit this later - this is clumsy
        self.balance = balance
        self.account_type = 'Savings Account' 

In [50]:
s1 = Savings(123, 100)
s2 = Savings(234, 200)
s1.apr = 10

In [51]:
s1.calc_interest(), s2.calc_interest()

('Calc interest on Savings Account with APR = 10',
 'Calc interest on Savings Account with APR = 5.0')

As you can see `self.apr` used the "overriding" instance attribute for the class attribute `apr`.

If instead we use `self.__class__.apr`:

In [52]:
class Account:
    apr = 3.0
    
    def __init__(self, account_number, balance):
        self.account_number = account_number
        self.balance = balance
        self.account_type = 'Generic Account'
        
    def calc_interest(self):
        return f'Calc interest on {self.account_type} with APR = {self.__class__.apr}'
        
        
class Savings(Account):
    apr = 5.0
    
    def __init__(self, account_number, balance):
        self.account_number = account_number  # We'll revisit this later - this is clumsy
        self.balance = balance
        self.account_type = 'Savings Account' 

In [53]:
s1 = Savings(123, 100)
s2 = Savings(234, 200)
s1.apr = 10

In [54]:
s1.calc_interest(), s2.calc_interest()

('Calc interest on Savings Account with APR = 5.0',
 'Calc interest on Savings Account with APR = 5.0')

As you can see we forced our code to use the **class** attribute. Depending on what you are designing, you may want to choose one or the other.

More often, we use `type(a)` instead of `a.__class__`, like so:

In [55]:
class Account:
    apr = 3.0
    
    def __init__(self, account_number, balance):
        self.account_number = account_number
        self.balance = balance
        self.account_type = 'Generic Account'
        
    def calc_interest(self):
        return f'Calc interest on {self.account_type} with APR = {type(self).apr}'
        
        
class Savings(Account):
    apr = 5.0
    
    def __init__(self, account_number, balance):
        self.account_number = account_number  # We'll revisit this later - this is clumsy
        self.balance = balance
        self.account_type = 'Savings Account' 

And it works exactly the same way:

In [56]:
a = Account(100, 100)
s1 = Savings(101, 100)
s2 = Savings(102, 100)

In [57]:
s2.apr = 10

In [58]:
a.calc_interest()

'Calc interest on Generic Account with APR = 3.0'

In [59]:
s1.calc_interest()

'Calc interest on Savings Account with APR = 5.0'

In [60]:
s2.calc_interest()

'Calc interest on Savings Account with APR = 5.0'

##  Delegating to Parent

You'll most likely encounter `super()` in the `__init__` method of custom classes, but delegation is not restricted to `__init__`. You can use `super()` anywhere you need to explicitly instruct Python to use a callable definition that is higher up in the inheritance chain. In these cases you only need to use `super()` if there is some ambiguity - i.e. your current class overrides an ancestor's callable and you need to specifically tell Python to use the callable in the ancestry chain.

In [1]:
class Person:
    def work(self):
        return 'Person works...'
    
class Student(Person):
    def work(self):
        result = super().work()
        return f'Student works... and {result}'

In [2]:
s = Student()

In [3]:
s.work()

'Student works... and Person works...'

Now the `super().work()` call in the `Student` class looks up the hierarchy chain until it finds the first definition for that callable.

We can easily see this:

In [4]:
class Person:
    def work(self):
        return 'Person works...'
    
class Student(Person):
    pass

class PythonStudent(Student):
    def work(self):
        result = super().work()
        return f'PythonStudent codes... and {result}'

In [5]:
ps = PythonStudent()

In [6]:
ps.work()

'PythonStudent codes... and Person works...'

Of course every class can delegate up the chain in turn:

In [7]:
class Person:
    def work(self):
        return 'Person works...'
    
class Student(Person):
    def work(self):
        result = super().work()
        return f'Student studies... and {result}'
    
class PythonStudent(Student):
    def work(self):
        result = super().work()
        return f'PythonStudent codes... and {result}'

In [8]:
ps = PythonStudent()
ps.work()

'PythonStudent codes... and Student studies... and Person works...'

Do note that when there is **no ambiguity** there is no need to use `super()`:

In [9]:
class Person:
    def work(self):
        return 'Person works...'
    
class Student(Person):
    def study(self):
        return 'Student studies...'
    
class PythonStudent(Student):
    def code(self):
        result_1 = self.work()
        result_2 = self.study()
        return f'{result_1} and {result_2} and PythonStudent codes...'

In [10]:
ps = PythonStudent()

In [11]:
ps.code()

'Person works... and Student studies... and PythonStudent codes...'

The really important thing to understand is which object (instance) is bound when a delegated method is called. It is **always** the calling object:

In [12]:
class Person:
    def work(self):
        return f'{self} works...'
    
class Student(Person):
    def work(self):
        result = super().work()
        return f'{self} studies... and {result}'

class PythonStudent(Student):
    def work(self):
        result = super().work()
        return f'{self} codes... and {result}'
    

In [13]:
ps = PythonStudent()

In [14]:
hex(id(ps))

'0x7fd388308f98'

In [15]:
ps.work()

'<__main__.PythonStudent object at 0x7fd388308f98> codes... and <__main__.PythonStudent object at 0x7fd388308f98> studies... and <__main__.PythonStudent object at 0x7fd388308f98> works...'

As you can see each of the methods in the parent classes were called bound to the original `PythonStudent` instance `ps`.

What this means is that when a class sets an instance attribute, it will be set in the namespace of the original object. Here's a simple example that illustrates this:

In [16]:
class Person:
    def set_name(self, value):
        print('Setting name using Person set_name method...')
        self.name = value
        
class Student(Person):
    def set_name(self, value):
        print('Student class delegating back to parent...')
        super().set_name(value)

In [17]:
s = Student()

As you can see, the dictionary for `s` is currently empty:

In [18]:
s.__dict__

{}

But if we call set_name:

In [19]:
s.set_name('Eric')

Student class delegating back to parent...
Setting name using Person set_name method...


As you can see the `Person` class `set_name` method did the actual work, but the `name` attribute is created in the `Student` instance `s`:

In [20]:
s.__dict__

{'name': 'Eric'}

So just to re-emphasize, whenever you use `super()`, any `self` in the called methods actually refers to the object used to make the initial call.

One place where this is really handy is in class initialization - we use it to leverage the parent class initializer so we don't have to re-write a lot of initialization code in our child class.

Let's use a simple example first:

In [21]:
class Person:
    def __init__(self, name):
        self.name = name
        
class Student(Person):
    def __init__(self, name, student_number):
        super().__init__(name)
        self.student_number = student_number

In [22]:
s = Student('Python', 30)

In [23]:
s.__dict__

{'name': 'Python', 'student_number': 30}

I do want to point out that if your parent class has initializer and your child class does not, then Python will attempt to call the parent `__init__` automatically - because the `__init__` is **inherited** from the parent class!

In [24]:
class Person:
    def __init__(self):
        print('Person __init__')
        
class Student(Person):
    pass

In [25]:
s = Student()

Person __init__


But watch what happens if the parent class requires an argument:

In [26]:
class Person:
    def __init__(self, name):
        print('Person __init__ called...')
        self.name = name
        
class Student(Person):
    pass

In [27]:
try:
    s = Student()
except TypeError as ex:
    print(ex)

__init__() missing 1 required positional argument: 'name'


In fact, we can pass this argument to the `Student` class and Python will automatically pass it along to the (inherited) `Person` class `__init__`:

In [28]:
s = Student('Alex')

Person __init__ called...


In [29]:
s.__dict__

{'name': 'Alex'}

However, if we provide a custom `__init__` in our child class, then Python will not automatically call the parent init:

In [30]:
class Person:
    def __init__(self):
        print('Person __init__ called...')
        
class Student(Person):
    def __init__(self):
        print('Student __init__ called...')

In [31]:
s = Student()

Student __init__ called...


To do so, we need to call `super().__init__`:

In [32]:
class Person:
    def __init__(self):
        print('Person __init__ called...')

class Student(Person):
    def __init__(self):
        super().__init__()
        print('Student __init__ called...')

In [33]:
s = Student()

Person __init__ called...
Student __init__ called...


Let's take a look at a more practical example:

Let's first create a `Circle` class:

In [34]:
from math import pi
from numbers import Real

class Circle:
    def __init__(self, r):
        self._r = r
        self._area = None
        self._perimeter = None
        
    @property
    def radius(self):
        return self._r
    
    @radius.setter
    def radius(self, r):
        if isinstance(r, Real) and r > 0:
            self._r = r
            self._area = None
            self._perimeter = None
        else:
            raise ValueError('Radius must a positive real number.')
            
    @property
    def area(self):
        if self._area is None:
            self._area = pi * self.radius ** 2
        return self._area
            
    @property
    def perimeter(self):
        if self._perimeter is None:
            self._perimeter = 2 * pi * self.radius
        return self._perimeter

Now let's make a specialized circle class, a `UnitCircle` which is simply a circle with a radius of `1`:

In [35]:
class UnitCircle(Circle):
    def __init__(self):
        super().__init__(1)

And now we can use it this way:

In [36]:
u = UnitCircle()

In [37]:
u.radius, u.area, u.perimeter

(1, 3.141592653589793, 6.283185307179586)

Now one thing that's off here is that we can actually set the radius on the `UnitCircle` - which we probably don't want to allow.

My approach here is to redefine the `radius` property in the unit circle class and disallow setting the radius altogether:

In [38]:
class UnitCircle(Circle):
    def __init__(self):
        super().__init__(1)
        
    @property
    def radius(self):
        return super().radius

In [39]:
u = UnitCircle()

In [40]:
u.radius

1

In [41]:
u.radius = 10

AttributeError: can't set attribute

Note how my overriding property uses `super().radius` - I cannot use `self.radius` as that would be trying to call the radius getter defined in the `UnitCircle` class (the one I am currently defining) - instead I specifically want to access the property from the parent class.

Finally I want to come back to another example that also helps underscore the fact that methods called via `super()` are still bound to the original (child) object, and hence will use methods defined in the child class if they override any in the parent class - this is a little tricky, but fundamental to understand:

In [None]:
class Person:
    def method_1(self):
        print('Person.method_1')
        self.method_2()
        
    def method_2(self):
        print('Person.method_2')
        
class Student(Person):
    def method_1(self):
        print('Student.method_1')
        super().method_1()
        

In [None]:
s = Student()
s.method_1()

So `Student.method_1` called `Person.method_1` via `super`, which in turn called `Person.method_2` - all of these methods were bound to the `Student` instance `s`.

Now watch what happens when we also override `method_2` in the `Student` class:

In [None]:
class Person:
    def method_1(self):
        print('Person.method_1')
        self.method_2()
        
    def method_2(self):
        print('Person.method_2')
        
class Student(Person):
    def method_1(self):
        print('Student.method_1')
        super().method_1()
        
    def method_2(self):
        print('Student.method_2')

In [None]:
s = Student()
s.method_1()

Since `self.method_2()` in the Person class was called from `s`, that `self` is the instance `s`, and hence `method_2` from the `Student` class was called, not the one defined in the `Person` class!

##  Slots

Let's start with an example of how we use slots:

In [1]:
class Location:
    __slots__ = 'name', '_longitude', '_latitude'
    
    def __init__(self, name, longitude, latitude):
        self._longitude = longitude
        self._latitude = latitude
        self.name = name
        
    @property
    def longitude(self):
        return self._longitude
    
    @property
    def latitude(self):
        return self._latitude

`Location` still has that mapping proxy, and we can still add and remove **class** attributes from `Location`:

In [2]:
Location.__dict__

mappingproxy({'__module__': '__main__',
              '__slots__': ('name', '_longitude', '_latitude'),
              '__init__': <function __main__.Location.__init__(self, name, longitude, latitude)>,
              'longitude': <property at 0x7feed0329ae8>,
              'latitude': <property at 0x7feed0329b38>,
              '_latitude': <member '_latitude' of 'Location' objects>,
              '_longitude': <member '_longitude' of 'Location' objects>,
              'name': <member 'name' of 'Location' objects>,
              '__doc__': None})

In [3]:
Location.map_service = 'Google Maps'

In [4]:
Location.__dict__

mappingproxy({'__module__': '__main__',
              '__slots__': ('name', '_longitude', '_latitude'),
              '__init__': <function __main__.Location.__init__(self, name, longitude, latitude)>,
              'longitude': <property at 0x7feed0329ae8>,
              'latitude': <property at 0x7feed0329b38>,
              '_latitude': <member '_latitude' of 'Location' objects>,
              '_longitude': <member '_longitude' of 'Location' objects>,
              'name': <member 'name' of 'Location' objects>,
              '__doc__': None,
              'map_service': 'Google Maps'})

But the use of `slots` affects **instances** of the class:

In [5]:
l = Location('Mumbai', 19.0760, 72.8777)

In [6]:
l.name, l.longitude, l.latitude

('Mumbai', 19.076, 72.8777)

The **instance** no longer has a dictionary for maintaining state:

In [7]:
try:
    l.__dict__
except AttributeError as ex:
    print(ex)

'Location' object has no attribute '__dict__'


This means we can no longer add attributes to the instance:

In [8]:
try:
    l.map_link = 'http://maps.google.com/...'
except AttributeError as ex:
    print(ex)

'Location' object has no attribute 'map_link'


Now we can actually delete the attribute from the instance:

In [9]:
del l.name

And as we can see the instance now longer has that attribute:

In [10]:
try:
    print(l.name)
except AttributeError as ex:
    print(f'Attribute Error: {ex}')

Attribute Error: name


However we can still re-assign a value to that same attribute:

In [11]:
l.name = 'Mumbai'

In [12]:
l.name

'Mumbai'

Mainly we use slots when we expect to have many instances of a class and to gain a performance boost (mostly storage, but also attribute lookup speed). 

##  Slots and Single Inheritance

First let's create a simple class hierarchy that does not use slots:

In [1]:
class Person:
    def __init__(self, name):
        self.name = name
        
class Student(Person):
    pass

If we create an instance of `Student`, we'll see that the `name` attribute is stored in the instance dictionary:

In [2]:
s = Student('Alex')
s.__dict__

{'name': 'Alex'}

Now let's do the same thing, but use slots for the `Person` class:

In [3]:
class Person:
    __slots__ = 'name',
    
    def __init__(self, name):
        self.name = name

class Student(Person):
    pass

We know `Person` instances do not have a dictionary:

In [4]:
p = Person('Eric')
try:
    print(p.__dict__)
except AttributeError as ex:
    print(ex)

'Person' object has no attribute '__dict__'


But the sub class does:

In [5]:
s = Student('Alex')

In [6]:
s.name, s.__dict__

('Alex', {})

As you can see, the `Student` instance `s` has a dictionary - but note that the dictionary does not contain the `name` property - that is still stored in a slot.

So, `name` uses a slot, but the `Student` instance has an instance dictionary, which means we can add instance attributes to it:

In [7]:
s.age = 19

In [8]:
s.__dict__

{'age': 19}

In [9]:
s.name, s.age

('Alex', 19)

If we want our subclass to only use slots, we just need to specify a `__slots__` class attribute for it too:

In [10]:
class Student(Person):
    __slots__ = tuple()

In [11]:
s = Student('Alex')
s.name

'Alex'

And the `Student` instance no longer has an instance dictionary:

In [12]:
try:
    print(s.__dict__)
except AttributeError as ex:
    print(ex)

'Student' object has no attribute '__dict__'


Of course, we did not add to the slots for the `Student` class, so basically our `Student` instances can only have a `name` attribute. We can add additional attributes by just specifying them in the slots for `Student`:

In [13]:
class Student(Person):
    __slots__ = 'school', 'student_number'
    
    def __init__(self, name, school, student_number):
        super().__init__(name)
        self.school = school
        self.student_number = student_number

In [14]:
s = Student('James', 'MI6 Prep', '007')

In [15]:
s.name, s.school, s.student_number

('James', 'MI6 Prep', '007')

Although Python does not currently disallow redefining slota in a subclass, it may in the future, and it can also cause unexpected behavior, so don't do it.

When we subclass a slot-less class, and define slots for the subclass, then we get a similar behavior to oiur first example - the subclass has both an instance dictionary and slots:

In [16]:
class Person:
    def __init__(self, name):
        self.name = name
        
class Student(Person):
    __slots__ = 'age', 
    
    def __init__(self, name, age):
        super().__init__(name)
        self.age = age

In [17]:
s = Student('Python', 30)

In [18]:
s.name, s.age, s.__dict__

('Python', 30, {'name': 'Python'})

As you can see, the `age` attribute is stored in a slot, but the `name` attribute, defined in the slot-less `Person` class ends up in `Student`'s instance dictionary.

As we'll see later, behave essentially the same as properties - neither of them are actually stored in an instance dictionary - the additional effect of slots is that it (may) remove the need for an instance dictionary entirely.

So, when we define a property in a class, we don't need to specify it in the slots if we want to use slots:

In [19]:
class Person:
    __slots__ = '_name', 'age'
    
    def __init__(self, name, age):
        self.name = name
        self.age = age
        
    @property
    def name(self):
        return self._name
    
    @name.setter
    def name(self, name):
        self._name = name

In [20]:
p = Person('Eric', 78)

So `p` has a property `name` and a (slotted) attribute `age`:

In [21]:
p.name, p.age

('Eric', 78)

And we also do not have an instance dictionary:

In [22]:
try:
    print(p.__dict__)
except AttributeError as ex:
    print(ex)

'Person' object has no attribute '__dict__'


So, as we can see neither the property `name` not the slotted attribute `age` are stored in an instance dictionary.

In fact, they are very much related - to something called descriptors, which we'll study later.

But let me just show you a quick preview of it.

Descriptors are objects that implement certain special functions (of course!) - just like iterators are objects that implement the special functions `__iter__` asnd `__next__`.

For data descriptors, we implement the `__get__` and `__set__` methods (some others too, but those are enough for now).

So let's look at the attributes of the property `name` first:

In [23]:
hasattr(Person.name, '__get__'), hasattr(Person.name, '__set__')

(True, True)

And now let's see the slotted attribute `age`:

In [24]:
hasattr(Person.age, '__get__'), hasattr(Person.age, '__set__')

(True, True)

Aha! See, both implement these methods!

And by the way, remember when I said that the `property` class was just a convenience class? Well, in fact it basically creates an class for us that implements the `__get__`, `__set__`, etc methods based on the methods we specify for `fget`, `fset`, etc respectively.

Lastly, we have seen that we can have classes that have both a dictionary and slots - we got those when we used inheritance.

But when we define without the `__slots__` attribute then it has an instance dictionary but no slots, and when we define `__slots__` it has slots but no instance dictionary.

We can actually define classes that have both, simply by specifying `__dict__` as **one of the slots**:

In [25]:
class Person:
    __slots__ = 'name', '__dict__'
    
    def __init__(self, name, age):
        self.name = name
        self.age = age

In [26]:
p = Person('Alex', 19)

In [27]:
p.name, p.age, p.__dict__

('Alex', 19, {'age': 19})

As we can see, we have an instance dictionary (that contains `age` since it was not defined in the `__slots__`, and `name` which was defined as a slot however, is not fouind in the instance dictionary.

Of course, since we have an instance dictionary, we can add and remove arbitrary attributes at "run-time":

In [28]:
p.school = 'Berkeley'

In [29]:
p.__dict__

{'age': 19, 'school': 'Berkeley'}

# Section 07 - Project 3

##  Project 3 - Single Inheritance - Solution

You are writing an inventory application for a budding tech guy who has a video channel featuring computer builds.
Basically they have a pool of inventory, (for example 5 x AMD Ryzen 2-2700 CPUs) that they use for builds. When they take a CPU from the pool, they will indicate this using the object that tracks that sepcific type of CPU. They may also purchase additional CPUs, or retire some (because they overclocked it too much and burnt them out!).

Technically we would want a database to back all this data, but here we're just going to build the classes we'll use while our program is running and not worry about retrieving or saving the state of the inventory.

The base class is going to be a general `Resource`. This class should provide functionality common to all the actual resources (CPU, GPU, Memory, HDD, SSD) - for this exercise we're only going to implement CPU, HDD and SSD.

It should provide this at a minimum:

- `name` : user-friendly name of resource instance (e.g.` Intel Core i9-9900K`)
- `manufacturer` - resource instance manufacturer (e.g. `Nvidia`)
- `total` : inventory total (how many are in the inventory pool)
- `allocated` : number allocated (how many are already in use)
- a `__str__` representation that just returns the resource name
- a mode detailed `__repr__` implementation
- `claim(n)` : method to take n resources from the pool (as long as inventory is available)
- `freeup(n)` : method to return n resources to the pool (e.g. disassembled some builds)
- `died(n)` : method to return and permanently remove inventory from the pool (e.g. they broke something) - as long as total available allows it
- `purchased(n)` - method to add inventory to the pool (e.g. they purchased a new CPU)
- `category` - computed property that returns a lower case version of the class name

Next we are going to define child classes for each of CPU, HDD and SDD.

For the `CPU` class:
- `cores` (e.g. `8`)
- `socket` (e.g. `AM4`)
- `power_watts` (e.g. `94`)

For the HDD and SDD classes, we're going to create an intermediate class called `Storage` with these additional properties:
- `capacity_GB` (e.g. `120`)

The `HDD` class extends `Storage` and has these additional properties:
- `size` (e.g. ``2.5"``)
- `rpm` (e.g. `7000`)

The `SSD` class extends `Storage` and has these additional properties:
- `interface` (e.g. `PCIe NVMe 3.0 x4`)

For all your classes, implement a full constructor that can be used to initialize all the properties, some form of validation on numeric types, as well as customized `__repr__` as you see fit.

For the `total` and `allocated` values in the `Resource` init, think of the arguments there as the **current** total and allocated counts. Those `total` and `allocated` attributes should be private **read-only** properties, but they are modifiable through the various methods such as `claim`, `return`, `died` and `purchased`. Other attributes like `name`, `manufacturer_name`, etc should be read-only.

##  Project 3 - Single Inheritance - Approach

I'm going to use an actual Python project with folders, modules, etc for this solution.

This notebook is simply detailing the sequence of steps I took to get at my final solution.

You can download the full solution from the resources in this video, or (preferrably) directly from the
[github repo](https://github.com/fbaptiste/python-deepdive)

#### Virtual Environment and pytest

I'm going to use `pytest` for testing in this project, so you should install it into your virtual environment.

Note that if you are not already using virtual environments for your projects I strongly suggest you do so.

Creating a virtual environment is incredibly easy.

1. create a folder for your project
2. create a virtual environment named `env` (or any name you prefer) by typing this in a console from inside your new folder:
  - `python -m venv env`
  - note: if you have both Python 2.x and 3.x installed, you'll probably need to specify it as `python3 -m venv env`
  - you should now have a new folder called `env` inside your project folder.
3. Next you should activate your virtual environment. How you do this will differ on Windows vs Mac/Linux:
  - Windows: `env\Scripts\activate`
  - Linux/Mac: `source env/bin/activate`
  - Your command prompt shoudl now reflect the activation of the virtual environment something like `(env)` at the beginning of the prompt.

To deactivate a virtual environment, simply type `deactivate`.

Next we need to install the `pytest` library. We want to install `pytest` in our virtual environment, so do this after **activating** your virtual environment - make sure your prompt reflects that first.

Then install `pytest` by typing this:
`pip install -U pytest`

That's it, you now have a virtual environment that has `pytest`.

#### Project Steps

I'm going to provide proper docstrings for every module, class, function, etc. I will use the Google style of docstrings, which is documented [here](https://google.github.io/styleguide/pyguide.html#38-comments-and-docstrings)

1. Create this folder hierarchy in the project root:

```
<project root>
....app
........models
........utils
....tests
........unit
```

Note: there is no need to create packages (no `__init.py__`), we will simply use implicit namespace packages.

2. Create a new module (`validators.py`) inside the `app/utils` package. In that module create a helper function `validate_integer` that will allow us to validate that a value is an integer, optionally between a min and max (inclusive), and raises a `TypeError`, `ValueError` with a custom error message that can be overriden when bound checks fail.

3. Inside the `tests/unit` folder, create a new module called `test_validators.py` and create the unit tests for the `validate_integer` function.

4. Run the unit tests and make sure all the tests pass.
    - to run the unit tests, you can use your IDE's built-in way of doing it, or you can just use the command line, from the root of your project: 
    
    `python -m pytest tests` 
    
    (this will run all the tests found in that folder - you can specify more specific path to limit your tests further)

5. In the `models` folder, create a new module file called `inventory.py`.

6. Implement the `Resource` class

7. Create a new file `test_resource.py` in the `tests` folder

8. Create unit tests for the `Resource` class and make sure they all pass

9. Create `CPU` class

10. Unit test `CPU` class

11. Create `Storage` Class

12. Unit test `Storage` class

13. Create `HDD` class

14. Unit test `HDD` class

15. Create `SDD` class

16. Unit test `SDD` class

# Section 08 - Descriptors

##  Descriptors

Python **descriptors** are simply objects that implement the **descriptor protocol**.

The protocol is comprised of the following special methods - not all are required.
- `__get__`: used to retrieve the property value
- `__set__`: used to store the property value (we'll see where we can do this in a bit)
- `__del__`: delete a property from the instance
- `__set_name__`: new to Python 3.6, we can use this to capture the property name as it is being defined in the owner class (the class where the property is defined).

There are two types of descriptors we need to distingush as I explain in the video lecture:
- non-data descriptors: these are descriptors that only implement `__get__` (and optionally `__set_name__`)
- data descriptors: these implement the `__set__` method, and normally, also the `__get__` method.

Let's create a simple non-data descriptor:

In [1]:
from datetime import datetime

class TimeUTC:
    def __get__(self, instance, owner_class):
        return datetime.utcnow().isoformat()

So `TimeUTC` is a class that implements the `__get__` method only, and is therefore considered a non-data descriptor.

We can now use it to create properties in other classes:

In [2]:
class Logger:
    current_time = TimeUTC()

Note that `current_time` is a class attribute:

In [3]:
Logger.__dict__

mappingproxy({'__module__': '__main__',
              'current_time': <__main__.TimeUTC at 0x7fdcd84bbd68>,
              '__dict__': <attribute '__dict__' of 'Logger' objects>,
              '__weakref__': <attribute '__weakref__' of 'Logger' objects>,
              '__doc__': None})

We can access that attribute from an instance of the `Logger` class:

In [4]:
l = Logger()

In [5]:
l.current_time

'2019-07-13T20:47:06.391770'

We can also access it from the class itself, and for now it behaves the same (we'll come back to that later):

In [6]:
Logger.current_time

'2019-07-13T20:47:06.405059'

Let's consider another example.

Suppose we want to create class that allows us to select a random suit and random card from that suit from a deck of cards (with replacement, i.e. the same card can be picked more than once).

We could approach it this way:

In [7]:
from random import choice, seed

class Deck:
    @property
    def suit(self):
        return choice(('Spade', 'Heart', 'Diamond', 'Club'))
        
    @property
    def card(self):
        return choice(tuple('23456789JQKA') + ('10',))

In [8]:
d = Deck()

In [9]:
seed(0)

for _ in range(10):
    print(d.card, d.suit)

8 Club
2 Diamond
J Club
8 Diamond
9 Diamond
Q Heart
J Heart
6 Heart
10 Spade
Q Diamond


This was pretty easy, but as you can see both properties essentially did the same thing - they picked a random choice from some iterable.

Let's rewrite this using a custom descriptor:

In [10]:
class Choice:
    def __init__(self, *choices):
        self.choices = choices
        
    def __get__(self, instance, owner_class):
        return choice(self.choices)

And now we can rewrite our `Deck` class this way:

In [11]:
class Deck:
    suit = Choice('Spade', 'Heart', 'Diamond', 'Club')
    card = Choice(*'23456789JQKA', '10')

In [12]:
seed(0)

d = Deck()

for _ in range(10):
    print(d.card, d.suit)

8 Club
2 Diamond
J Club
8 Diamond
9 Diamond
Q Heart
J Heart
6 Heart
10 Spade
Q Diamond


Of course we are not limited to just cards, we could use it in other classes too:

In [13]:
class Dice:
    die_1 = Choice(1,2,3,4,5,6)
    die_2 = Choice(1,2,3,4,5,6)
    die_3 = Choice(1,2,3,4,5,6)

In [14]:
seed(0)

dice = Dice()
for _ in range(10):
    print(dice.die_1, dice.die_2, dice.die_3)

4 4 1
3 5 4
4 3 4
3 5 2
5 2 3
2 1 5
3 5 6
5 2 3
1 6 1
6 3 4


##  Getters and Setters

So far we have seen how the `__get__` method is called when we assign an instance of a descriptors to a class attribute.

But we can access that attribute either from the class itself, or the instance - as we saw in the last lecture, both accesses end up calling the `__get__` method.

But what changes are the arguments passed to the method. Let's explore this:

In [1]:
from datetime import datetime

class TimeUTC:
    def __get__(self, instance, owner_class):
        print(f'__get__ called, self={self}, instance={instance}, owner_class={owner_class}')
        return datetime.utcnow().isoformat()

In [2]:
class Logger1:
    current_time = TimeUTC()
    
class Logger2:
    current_time = TimeUTC()

Now let's access `current_time` from the class itself:

In [3]:
Logger1.current_time

__get__ called, self=<__main__.TimeUTC object at 0x7f83d035be48>, instance=None, owner_class=<class '__main__.Logger1'>


'2019-07-13T20:47:14.961760'

As you can see, the `instance` was `None` - this was because we called the descriptor from the `Logger1` class, not an instance of it. The `owner_class` tells us this descriptor instance is defined in the `Logger1` class.

The same holds if we use `Logger2`:

In [4]:
Logger2.current_time

__get__ called, self=<__main__.TimeUTC object at 0x7f83d035be80>, instance=None, owner_class=<class '__main__.Logger2'>


'2019-07-13T20:47:14.997577'

But if we call the descriptor via an instance instead:

In [5]:
l1 = Logger1()
print(hex(id(l1)))

0x7f83d03864a8


In [6]:
l1.current_time

__get__ called, self=<__main__.TimeUTC object at 0x7f83d035be48>, instance=<__main__.Logger1 object at 0x7f83d03864a8>, owner_class=<class '__main__.Logger1'>


'2019-07-13T20:47:15.027484'

As you can see, `instance` is now the `l1` instance, and the owner class is still `Logger1`.

The sme holds for instance of `Logger2`:

In [7]:
l2 = Logger2()
print(hex(id(l2)))
l2.current_time

0x7f83d0386b38
__get__ called, self=<__main__.TimeUTC object at 0x7f83d035be80>, instance=<__main__.Logger2 object at 0x7f83d0386b38>, owner_class=<class '__main__.Logger2'>


'2019-07-13T20:47:15.043101'

This means that we can differentiate, inside our `__get__` method whether the descriptor was accessed via the class or via an instance.

Typically when a descriptor is access from the class we return the descriptor instance, and when accessed from the instance we return the instance specific value we want:

In [8]:
from datetime import datetime

class TimeUTC:
    def __get__(self, instance, owner_class):
        if instance is None:
            # called from class
            return self
        else:
            # called from instance
            return datetime.utcnow().isoformat()

In [9]:
class Logger:
    current_time = TimeUTC()

In [10]:
Logger.current_time

<__main__.TimeUTC at 0x7f83d039a128>

In [11]:
l = Logger()

In [12]:
l.current_time

'2019-07-13T20:47:15.109595'

This is consistent with the way properties work:

In [13]:
class Logger:
    @property
    def current_time(self):
        return datetime.utcnow().isoformat()

In [14]:
Logger.current_time

<property at 0x7f83d0395d68>

This returned the property instance, whereas calling it from an instance:

In [15]:
l = Logger()
l.current_time

'2019-07-13T20:47:15.162299'

Now, there is one subtle point we have to understand when we create multiple instances of a class that uses a descriptor as a class attribute.

Since the descriptor is assigned to an **class attribute**, all instances of the class will **share** the same descriptor instance!

In [16]:
class TimeUTC:
    def __get__(self, instance, owner_class):
        if instance is None:
            # called from class
            return self
        else:
            # called from instance
            print(f'__get__ called in {self}')
            return datetime.utcnow().isoformat()
        
class Logger:
    current_time = TimeUTC()

In [17]:
l1 = Logger()
l2 = Logger()

But look at the `current_time` for each of those instances

In [18]:
l1.current_time, l2.current_time

__get__ called in <__main__.TimeUTC object at 0x7f83d039aeb8>
__get__ called in <__main__.TimeUTC object at 0x7f83d039aeb8>


('2019-07-13T20:47:15.209930', '2019-07-13T20:47:15.210094')

As you can see the **same** instance of `TimeUTC` was used.

This does not matter in this particular example, since we just return the current time, but watch what happens if our property relies on some kind of state in the descriptor:

In [19]:
class Countdown:
    def __init__(self, start):
        self.start = start + 1
        
    def __get__(self, instance, owner):
        if instance is None:
            return self
        else:
            self.start -= 1
            return self.start

In [20]:
class Rocket:
    countdown = Countdown(10)

Now let's say we want to launch two rockets:

In [21]:
rocket1 = Rocket()
rocket2 = Rocket()

And let's start the countdown for each one:

In [22]:
rocket1.countdown

10

In [23]:
rocket2.countdown

9

In [24]:
rocket1.countdown

8

As you can see, the current countdown value is shared by both `rocket1` and `rocket2` instances of `Rocket` - this is because the `Countdown` instance is a class attribute of `Rocket`. So we have to be careful how we deal with instance level state.

The `__set__` method works in a similar way to `__get__` but it is used when we assign a value to the class attribute.

In [25]:
class IntegerValue:
    def __set__(self, instance, value):
        print(f'__set__ called, instance={instance}, value={value}')
        
    def __get__(self, instance, owner_class):
        if instance is None:
            print('__get__ called from class')
        else:
            print(f'__get__ called, instance={instance}, owner_class={owner_class}')

In [26]:
class Point2D:
    x = IntegerValue()
    y = IntegerValue()

In [27]:
Point2D.x

__get__ called from class


In [28]:
p = Point2D()

In [29]:
p.x

__get__ called, instance=<__main__.Point2D object at 0x7f83d03a8f28>, owner_class=<class '__main__.Point2D'>


In [30]:
p.x = 100

__set__ called, instance=<__main__.Point2D object at 0x7f83d03a8f28>, value=100


So, where should we store the values `x` and `y`? 

Many "tutorials" I see on the web naively store the value in the descriptor itself:

In [31]:
class IntegerValue:
    def __set__(self, instance, value):
        self._value = int(value)
        
    def __get__(self, instance, owner_class):
        if instance is None:
            return self
        else:
            return self._value

In [32]:
class Point2D:
    x = IntegerValue()
    y = IntegerValue()

At first blush, this seems to work just fine:

In [33]:
p1 = Point2D()

In [34]:
p1.x = 1.1
p1.y = 2.2

In [35]:
p1.x, p1.y

(1, 2)

But, remember the point I was making about the instance of the descriptor (`IntegeraValue` in this case) being shared by all instances of the class (`Point2D` in this case)?

In [36]:
p2 = Point2D()

In [37]:
p2.x, p2.y

(1, 2)

And of course if we set the value:

In [38]:
p2.x = 100.9

In [39]:
p2.x, p1.x

(100, 100)

So, obviously using the descriptor instance dictionary for storage at the instance level is probably not going to work in most cases!

And this is the reason both the `__get__` and `__set__` methods need to know which instance we are dealing with.

##  Using as Instance Properties

So let's start exploring how we can use descriptors to read and write instance properties.

We might try something like this first:

In [1]:
class IntegerValue:
    def __set__(self, instance, value):
        instance.stored_value = int(value)
        
    def __get__(self, instance, owner_class):
        if instance is None:
            return self
        else:
            return getattr(instance, 'stored_value', None)

Basically we are going to use the instance dictionary to store the value under some name (symbol) in it - what name should we use? That could be an issue, and we'll come back to that.

In [2]:
class Point1D:
    x = IntegerValue()

In [3]:
p1, p2 = Point1D(), Point1D()

In [4]:
p1.x = 10.1
p2.x = 20.2

In [5]:
p1.x, p2.x

(10, 20)

As you can see, we now have a descriptor that uses the instances themselves to store the data:

In [6]:
p1.__dict__, p2.__dict__

({'stored_value': 10}, {'stored_value': 20})

But you'll notice that our descriptor is hard coded to using the same key in the instance dictionaries - which leads us to this problem:

In [7]:
class Point2D:
    x = IntegerValue()
    y = IntegerValue()

In [8]:
p = Point2D()

In [9]:
p.x = 10.1

In [10]:
p.__dict__

{'stored_value': 10}

And what happens if we set `y`? What symbol is the descriptor going to use to store the value in the instance?

In [11]:
p.y = 20.2

In [12]:
p.__dict__

{'stored_value': 20}

Yep, the **same** symbol!

In [13]:
p.x, p.y

(20, 20)

So that appropach is not going to work either. Somehow we would need to have a distinct storage name for each property.

We could do this by using the `__init__` of our descriptor:

In [14]:
class IntegerValue:
    def __init__(self, name):
        self.storage_name = '_' + name 
        
    def __set__(self, instance, value):
        setattr(instance, self.storage_name, int(value))
        
    def __get__(self, instance, owner_class):
        if instance is None:
            return self
        else:
            return getattr(instance, self._storage_name, None)
        
class Point2D:
    x = IntegerValue('x')
    y = IntegerValue('y')

In [15]:
p1 = Point2D()
p2 = Point2D()

In [16]:
p1.x = 10.1
p1.y = 20.2

In [17]:
p1.__dict__

{'_x': 10, '_y': 20}

In [18]:
p2.x = 100.1
p2.y = 200.2

In [19]:
p2.__dict__

{'_x': 100, '_y': 200}

So this approach can work just fine, but there are a few drawbacks:

1. The user needs to specify the name of the property twice
2. We assume that `_` + `name` is not also used by the class in which the descriptor exists (so that could be a major problem)
3. We assume we can add an attribute to the instance - but what if it uses slots?

One way we could get around each of those problems is by using the descriptor instance itself to store the instance values. But as we saw earlier, we can't just set an attribute in the descriptor instance, since that would be shared across multiple instances of the class containing the descriptor.

Instead, we are going to **assume** that the `instance` is a hashable object, and use a dictionary in the descriptor to store instance specific values:

In [20]:
class IntegerValue:
    def __init__(self):
        self.values = {}
        
    def __set__(self, instance, value):
        self.values[instance] = int(value)
        
    def __get__(self, instance, owner_class):
        if instance is None:
            return self
        else:
            return self.values.get(instance)

In [21]:
class Point2D:
    x = IntegerValue()
    y = IntegerValue()

In [22]:
p1 = Point2D()
p2 = Point2D()

In [23]:
p1.x = 10.1
p1.y = 20.2

In [24]:
p1.x, p1.y

(10, 20)

In fact, we can see the dictionary in the descriptor instances:

In [25]:
Point2D.x.values

{<__main__.Point2D at 0x7fa8e8204828>: 10}

In [26]:
Point2D.x.values

{<__main__.Point2D at 0x7fa8e8204828>: 10}

where the key in both of these is our `p1` object:

In [27]:
hex(id(p1))

'0x7fa8e8204828'

We can now create a second point, and go through the same steps:

In [28]:
p2 = Point2D()
p2.x = 100.1
p2.y = 200.2

In [29]:
hex(id(p2))

'0x7fa8b801bb00'

In [30]:
Point2D.x.values

{<__main__.Point2D at 0x7fa8e8204828>: 10,
 <__main__.Point2D at 0x7fa8b801bb00>: 100}

In [31]:
Point2D.y.values

{<__main__.Point2D at 0x7fa8e8204828>: 20,
 <__main__.Point2D at 0x7fa8b801bb00>: 200}

And everything works just fine:

In [32]:
p1.x, p1.y, p2.x, p2.y

(10, 20, 100, 200)

Or does it??

We actually have a potential memory leak - notice how the dictionary in the desccriptor instance is **also** storing a reference to the point object - as a **key** in the dictionary.

Let's write a simple utility function that allows us to get the reference count for an object given it's id (and it only makes sense if the id we use still has a valid non-destroyed object):

In [33]:
import ctypes

def ref_count(address):
    return ctypes.c_long.from_address(address).value

In [34]:
p1 = Point2D()
id_p1 = id(p1)

In [35]:
ref_count(id_p1)

1

Now let's set the `x` property of `p1`:

In [36]:
p1.x = 100.1

And let's check the ref count again:

In [37]:
ref_count(id_p1)

2

As you can see it's now `2`. if we delete our main reference to `p1` that is in our global namespace:

In [38]:
'p1' in globals()

True

In [39]:
del p1

In [40]:
'p1' in globals()

False

In [41]:
ref_count(id_p1)

1

And our reference count is still `1`, which means the object itself has not been destroyed!

In fact, we can see that object referenced in our data descriptor dictionary:

In [42]:
Point2D.x.values.items()

dict_items([(<__main__.Point2D object at 0x7fa8e8204828>, 10), (<__main__.Point2D object at 0x7fa8b801bb00>, 100), (<__main__.Point2D object at 0x7fa8e820a550>, 100)])

In [43]:
hex(id_p1)

'0x7fa8e820a550'

As you can see, the last element's key is the same id as what `p1` was referencing.

So, although we deleted `p1`, the object was not destroyed - this can result in a memory leak.

There are a few ways we can handle this issue. The first one we are going to look at is something called **weak references**. So let's segway into that next.

##  Strong and Weak References

First let's bring back the function we can use to determine the reference count of an object by id:

In [1]:
import ctypes

def ref_count(address):
    return ctypes.c_long.from_address(address).value

Note that this counts the **strong** references to that object.

So far, we have always worked with strong references. 

In [2]:
class Person:
    def __init__(self, name):
        self.name = name
        
    def __repr__(self):
        return f'Person(name={self.name})'

In [3]:
p1 = Person('Guido')
p2 = p1

In this case both `p1` and `p2` are **strong** references to the same `Person` instance (*Guido*).

In [4]:
p1_id = id(p1)
p2_id = id(p2)

In [5]:
p1_id == p2_id, ref_count(p1_id)

(True, 2)

So we have two strong references. If we delete one of them:

In [6]:
del p2

We should have a strong reference count of `1` now:

In [7]:
ref_count(p1_id)

1

We can delete the last reference:

In [8]:
del p1

Now our reference count function will not work anymore, since the last reference to the object at that mempry address was removed and that memory address is now meaningless:

In [9]:
ref_count(p1_id)

-370994432650002694

So, the garbage collector will destroy any object whose **strong** reference count goes down to `0`.

There is another type of reference to an object that we can use that **does not** affect the (strong) reference count - these are called **weak references**.

We can create weak references to objects in Python using the `weakref` module:

In [10]:
import weakref

In [11]:
p1 = Person('Guido')

In [12]:
p1_id = id(p1)

In [13]:
ref_count(p1_id)

1

Now let's make another strong reference:

In [14]:
p2 = p1

In [15]:
ref_count(p1_id)

2

And finally let's make a weak reference to the same object:

In [16]:
weak1 = weakref.ref(p1)

Let's look at the ref count again:

In [17]:
ref_count(p1_id)

2

As you can see, it's still `2`.

The `weak1` object is a weak reference object:

In [18]:
weak1

<weakref at 0x7fbae83667c8; to 'Person' at 0x7fbae8359908>

As you can see form the representation it is it's own object, but it points to the same object `p1` is currently pointing to:

In [19]:
hex(p1_id)

'0x7fbae8359908'

So `weak1` is not the `Person` instance:

In [20]:
weak1 is p1

False

In [21]:
ref_count(p1_id)

2

But it is callable (so it implements a `__call__` method) that will return the object it is pointing to:

In [22]:
weak1() is p1

True

And we can see the object it is pointing to:

In [23]:
print(weak1())

Person(name=Guido)


Now we have to watch out here, if we did not use the `print` statement, Jupyter would be holding on to strong references to our object! Be sure to use `print` when using Jupyter...

So our reference count should still be `2`:

In [24]:
ref_count(p1_id)

2

Another word of caution, if we do this:

In [25]:
p3 = weak1()

`p3` is now a strong reference to whatever object `weak1()` returned! In this case our *Guido* `Person`:

In [26]:
p1 is p3

True

In [27]:
ref_count(p1_id)

3

And as you can see we now have three strong references.

How many weak references do we have? We should have `1` only.

We can see how many weak references exist from some object by using the `getweakrefcount` function in the `weakref` module:

In [28]:
weakref.getweakrefcount(p1), ref_count(p1_id)

(1, 3)

Another way of getting the strong ref count is in the `sys` module:

In [29]:
import sys

In [30]:
sys.getrefcount(p1)

4

But you'll notice one thing, the ref count is increased by `1` - that's because we have to pass the object itself as an extra argument, so that's an extra strong reference! (so basically always subtract `1` from that ref count to get the true ref count)

Now let's delete some of the strong references:

In [31]:
del p3
del p2

In [32]:
ref_count(p1_id)

1

Our strong ref count is down to 1, and we still have one weak reference (`weak1`).

Now let's delete the final strong reference:

In [33]:
del p1

Our strong ref count wnet down to `0`, so the garbage collector destroyed the object.

So what happened to our weak reference?

In [34]:
weak1

<weakref at 0x7fbae83667c8; dead>

The weak reference object still exists, but the object it is pointing to is **dead**.

In fact, if we try to get the object, we will get `None` back:

In [35]:
obj = weak1()

In [36]:
obj is None

True

As you can see, having a weak reference did not stop our object from being destroyed once all the strong references were gone.

Note that not every object in Python supports weak references. Many of the built-in types do not:

In [37]:
l = [1, 2, 3]
try:
    w = weakref.ref(l)
except TypeError as ex:
    print(ex)

cannot create weak reference to 'list' object


In [38]:
l = {'a': 1}
try:
    w = weakref.ref(l)
except TypeError as ex:
    print(ex)

cannot create weak reference to 'dict' object


In [39]:
l = 100
try:
    w = weakref.ref(l)
except TypeError as ex:
    print(ex)


cannot create weak reference to 'int' object


In [40]:
l = 'python'
try:
    w = weakref.ref(l)
except TypeError as ex:
    print(ex)

cannot create weak reference to 'str' object


But our custom classes do, and that's what we need here.

For our data descriptors, we want to use the instance objects as keys in our dictionary. But as we saw earlier, storing the object itself as the key can lead to memory leaks. So instead, we are going to store weak references to the object in the dictionary.

We could use our own dictionary, but `weakref` also provides a specialized dictionary type, that will store a weak reference to the object being used as the key:

In [41]:
p1 = Person('Guido')

In [42]:
d = weakref.WeakKeyDictionary()

In [43]:
ref_count(id(p1))

1

In [44]:
weakref.getweakrefcount(p1)

0

In [45]:
d[p1] = 'Guido'

Now, notice the reference counts:

In [46]:
ref_count(id(p1)), weakref.getweakrefcount(p1)

(1, 1)

We still have only one strong reference, but now we have a weak reference to `p1` as well! That weak reference is in the `WeakKeyDictionary`.

We can easily see the weak references contained in that dictionary:

In [47]:
hex(id(p1)), list(d.keyrefs())

('0x7fbae83635c0',
 [<weakref at 0x7fbae8381958; to 'Person' at 0x7fbae83635c0>])

Now watch what happens to the dictionary when we delete the last strong reference to `p1`:

In [48]:
del p1

In [49]:
list(d.keyrefs())

[]

It was automatically removed when the object it was pointing to (weakly) was destroyed by the garbage collector!

Now be careful, you can only use keys in the `WeakKeyDictionary` that Python can create weak references to:

So this will not work:

In [50]:
try:
    d['python'] = 'test'
except TypeError as ex:
    print(ex)

cannot create weak reference to 'str' object


Also, even though we are using a weak reference as a key in the dictionary, the object must still be **hashable**.

Let's see an example of this:

In [51]:
class Person:
    def __init__(self, name):
        self.name = name
        
    def __eq__(self, other):
        return isinstance(other, Person) and self.name == other.name

Now `Person` is no longer hashable:

In [52]:
p1 = Person('Guido')
p2 = Person('Guido')

In [53]:
p1 == p2

True

In [54]:
try:
    hash(p1)
except TypeError as ex:
    print(ex)

unhashable type: 'Person'


And so we cannot use it as a key in our `WeakKeyDictionary`:

In [55]:
try:
    d[p1] = 'Guido'
except TypeError as ex:
    print(ex)

unhashable type: 'Person'


So we can certainly use `WeakKeyDictionary` objects in our data descriptors, but that will only work with hashable objects. In the next lectures we'll look at how to use `WeakKeyDictionary` as a storage mechanism for our data descriptors, as well as how to deal with the unhashable issue.

##  Back to Instance Properties

Let's try using `WeakKeyDictionary` to store our instance data in our data descriptor.

Basically, this is exactly the same as what we were doing before, but instead of using a standard dictionary (that potentially causes memory leaks), we'll use a `WeakKeyDictionary`.

Recall what we had before:

In [1]:
class IntegerValue:
    def __init__(self):
        self.values = {}
        
    def __set__(self, instance, value):
        self.values[instance] = int(value)
        
    def __get__(self, instance, owner_class):
        if instance is None:
            return self
        else:
            return self.values.get(instance)

Now, we are going to refactor this to use the weak key dictionary:

In [2]:
import weakref

In [3]:
class IntegerValue:
    def __init__(self):
        self.values = weakref.WeakKeyDictionary()
        
    def __set__(self, instance, value):
        self.values[instance] = int(value)
        
    def __get__(self, instance, owner_class):
        if instance is None:
            return self
        else:
            return self.values.get(instance)

And that's all there is to it. We now have weak references instead of strong references in our dictionary, and the dictionary cleans up after itself (removes "dead" entries) when the reference object has been destroyed by the GC.

In [4]:
class Point:
    x = IntegerValue()

In [5]:
p = Point()
print(hex(id(p)))

0x7fa760414400


In [6]:
p.x = 100.1

In [7]:
p.x

100

In [8]:
Point.x.values.keyrefs()

[<weakref at 0x7fa76041d048; to 'Point' at 0x7fa760414400>]

And if we delete `p`, thereby deleting the last strong reference to that object:

In [9]:
del p

In [10]:
Point.x.values.keyrefs()

[]

So this is almost a perfect general solution:

1. We do not need to store the data in the instances themseves (so we can handle objects whose class uses `__slots__`)
2. We are protected from memory leaks

But this only works for **hashable** objects.

So, now let's try to address this hashability issue.

Since we cannot use the object itself as the key in a dictionary (weak or otherwise), we could try using the `id` of the object (which is an int) as the key in a standard dictionary:

In [11]:
class IntegerValue:
    def __init__(self):
        self.values = {}
        
    def __set__(self, instance, value):
        self.values[id(instance)] = int(value)
        
    def __get__(self, instance, owner_class):
        if instance is None:
            return self
        else:
            return self.values.get(id(instance))

Now we can use this approach with non-hashable objects:

In [12]:
class Point:
    x = IntegerValue()
    
    def __init__(self, x):
        self.x = x
        
    def __eq__(self, other):
        return isinstance(other, Point) and self.x == other.x

In [13]:
p = Point(10.1)

In [14]:
p.x

10

In [15]:
p.x = 20.2

In [16]:
p.x

20

In [17]:
id(p), Point.x.values

(140356851267288, {140356851267288: 20})

Now we no longer have a memory leak:

In [18]:
import ctypes

def ref_count(address):
    return ctypes.c_long.from_address(address).value

In [19]:
p_id = id(p)

In [20]:
ref_count(p_id)

1

In [21]:
del p

In [22]:
ref_count(p_id)

-1

But, we now have a "dead" entry in our dictionary - that memory address is still present as a key. Now, you might think it's not a big deal, but Python does reuse memory addresses, so we could run into potential issues there (where the data descriptor would have a value for a property already set from a previous object), and also the fact that our dictionary is cluttered with these dead entries:

In [23]:
Point.x.values

{140356851267288: 20}

So we need a way to determine if the object has been destroyed.

We know that weak references are aware of when objects are destroyed:

In [24]:
p = Point(10.1)
weak_p = weakref.ref(p)

In [25]:
print(hex(id(p)), weak_p)  
# again note how I need to use print to avoid affecting the ref count

0x7fa76043c588 <weakref at 0x7fa760439318; to 'Point' at 0x7fa76043c588>


In [26]:
ref_count(id(p))

1

And if I remove the last strong reference to `p`:

In [27]:
del p

In [28]:
print(weak_p)

<weakref at 0x7fa760439318; dead>


You can see that the weak reference was made aware of that change - in fact we can as well, by specifying a **callback** function that Python will call once the weak reference becomes dead (i.e. the object was destroyed by the GC):

In [29]:
def obj_destroyed(obj):
    print(f'{obj} is being destroyed')

In [30]:
p = Point(10.1)
w = weakref.ref(p, obj_destroyed)

In [31]:
del p

<weakref at 0x7fa760439f48; dead> is being destroyed


As you can see the callback function receives the weak ref object as the argument.

So, we can use this to our advantage in our data descriptor, by registering a callback that we can use to remove the "dead" entry from our values dictionary.

This means we do need to store a weak reference to the object as well - we'll do that in the value of the `values` dictionary as part of a tuple containing a weak reference to the object, and the corresponding value):

In [32]:
class IntegerValue:
    def __init__(self):
        self.values = {}
        
    def __set__(self, instance, value):
        self.values[id(instance)] = (weakref.ref(instance, self._remove_object), 
                                     int(value)
                                    )
        
    def __get__(self, instance, owner_class):
        if instance is None:
            return self
        else:
            value_tuple = self.values.get(id(instance))
            return value_tuple[1]  # return the associated value, not the weak ref
        
    def _remove_object(self, weak_ref):
        print(f'removing dead entry for {weak_ref}')
        # how do we find that weak reference?

Let's just make sure our call back is being called as expected:

In [33]:
class Point:
    x = IntegerValue()

In [34]:
p1 = Point()
p2 = Point()

In [35]:
p1.x, p2.x = 10.1, 100.1

In [36]:
p1.x, p2.x

(10, 100)

Now let's delete those objects:

In [37]:
ref_count(id(p1)), ref_count(id(p2))

(1, 1)

In [38]:
del p1

removing dead entry for <weakref at 0x7fa760420cc8; dead>


In [39]:
del p2

removing dead entry for <weakref at 0x7fa760451098; dead>


OK, so now all that's left is to remove the corresponding entry from the dictionary. Problem is that we do not have the object itself at that point (and therefore do not have it's id either), so we cannot get to the dictionary item using the key - we'll simply have to iterate through the values in the dictionary until we find the value whose first item is the weak reference that caused the call back:

In [40]:
class IntegerValue:
    def __init__(self):
        self.values = {}
        
    def __set__(self, instance, value):
        self.values[id(instance)] = (weakref.ref(instance, self._remove_object), 
                                     int(value)
                                    )
        
    def __get__(self, instance, owner_class):
        if instance is None:
            return self
        else:
            value_tuple = self.values.get(id(instance))
            return value_tuple[1]  # return the associated value, not the weak ref
        
    def _remove_object(self, weak_ref):
        reverse_lookup = [key for key, value in self.values.items()
                         if value[0] is weak_ref]
        if reverse_lookup:
            # key found
            key = reverse_lookup[0]
            del self.values[key]

In [41]:
class Point:
    x = IntegerValue()

In [42]:
p = Point()

In [43]:
p.x = 10.1

In [44]:
p.x

10

In [45]:
Point.x.values

{140356851302352: (<weakref at 0x7fa760451db8; to 'Point' at 0x7fa760437fd0>,
  10)}

Now let's delete our (only) strong reference to `p`:

In [46]:
ref_count(id(p))

1

In [47]:
del p

In [48]:
Point.x.values

{}

And as you can see our dictionary was cleaned up.

There is one last caveat, when we create weak references to objects, the weak reference objects are actually stored in the instance itself, in a property called `__weakref__`:

In [49]:
class Person:
    pass

In [50]:
Person.__dict__

mappingproxy({'__module__': '__main__',
              '__dict__': <attribute '__dict__' of 'Person' objects>,
              '__weakref__': <attribute '__weakref__' of 'Person' objects>,
              '__doc__': None})

Notice that `__weakref__` attribute. It is technically a data descriptor:

In [51]:
hasattr(Person.__weakref__, '__get__'), hasattr(Person.__weakref__, '__set__')

(True, True)

And instances will therefore have that property:

In [52]:
p = Person()

In [53]:
hasattr(p, '__weakref__')

True

In [54]:
print(p.__weakref__)

None


As you can see, that `__weakref__` attribute exists, but is currently `None`.

Now let's create a weak reference to `p`:

In [55]:
w = weakref.ref(p)

And `__weakref__` is no longer `None` (internally it is implemented as doubly linked list of all the weak references to that object - but this is an implementation detail and Python does not expose functionality to iterate through the weak references ourselves)

In [56]:
p.__weakref__

<weakref at 0x7fa760451db8; to 'Person' at 0x7fa7603f2d68>

Now the problem if we use slots, is that the instances will no longer have that attribute!

In [57]:
class Person:
    __slots__ = 'name',

In [58]:
Person.__dict__

mappingproxy({'__module__': '__main__',
              '__slots__': ('name',),
              'name': <member 'name' of 'Person' objects>,
              '__doc__': None})

As you can see `__weakref__` is no longer an attribute in our class, and the instances do not have it:

In [59]:
p = Person()

In [60]:
hasattr(p, '__weakref__')

False

So, the problem is that we can no longer create weak references to this object!!

In [61]:
try:
    weakref.ref(p)
except TypeError as ex:
    print(ex)

cannot create weak reference to 'Person' object


In order to enable weak references in objects that use slots, we need to specify `__weakref__` as one of the slots:

In [62]:
class Person:
    __slots__ = 'name', '__weakref__'

In [63]:
Person.__dict__

mappingproxy({'__module__': '__main__',
              '__slots__': ('name', '__weakref__'),
              'name': <member 'name' of 'Person' objects>,
              '__weakref__': <attribute '__weakref__' of 'Person' objects>,
              '__doc__': None})

As you can see `__weakref__` is back, and exists in our instances:

In [64]:
p = Person()

In [65]:
hasattr(p, '__weakref__')

True

Which means we can create weak references to our `Person` object again:

In [66]:
w = weakref.ref(p)

So, if we want to use data descriptors using weak references (whether using our own dictionary or a weak key dictionary) with classes that define slots, we'll need to make sure we add `__weakref__` to the slots!

Let's do another example using this latest technique:

In [67]:
class ValidString:
    def __init__(self, min_length=0, max_length=255):
        self.data = {}
        self._min_length = min_length
        self._max_length = max_length
        
    def __set__(self, instance, value):
        if not isinstance(value, str):
            raise ValueError('Value must be a string.')
        if len(value) < self._min_length:
            raise ValueError(
                f'Value should be at least {self._min_length} characters.'
            )
        if len(value) > self._max_length:
            raise ValueError(
                f'Value cannot exceed {self._max_length} characters.'
            )
        self.data[id(instance)] = (weakref.ref(instance, self._finalize_instance), 
                                   value
                                  )
        
    def __get__(self, instance, owner_class):
        if instance is None:
            return self
        else:
            value_tuple = self.data.get(id(instance))
            return value_tuple[1]  
        
    def _finalize_instance(self, weak_ref):
        reverse_lookup = [key for key, value in self.data.items()
                         if value[0] is weak_ref]
        if reverse_lookup:
            # key found
            key = reverse_lookup[0]
            del self.data[key]

We can now use `ValidString` as many times as we need:

In [68]:
class Person:
    __slots__ = '__weakref__',
    
    first_name = ValidString(1, 100)
    last_name = ValidString(1, 100)
    
    def __eq__(self, other):
        return (
            isinstance(other, Person) and 
            self.first_name == other.first_name and 
            self.last_name == other.last_name
        )
    
class BankAccount:
    __slots__ = '__weakref__',
    
    account_number = ValidString(5, 255)
    
    def __eq__(self, other):
        return (
            isinstance(other, BankAccount) and 
            self.account_number == other.account_number
        )

In [69]:
p1 = Person()

In [70]:
try:
    p1.first_name = ''
except ValueError as ex:
    print(ex)

Value should be at least 1 characters.


In [71]:
p2 = Person()

In [72]:
p1.first_name, p1.last_name = 'Guido', 'van Rossum'
p2.first_name, p2.last_name = 'Raymond', 'Hettinger'

In [73]:
b1, b2 = BankAccount(), BankAccount()

In [74]:
b1.account_number, b2.account_number = 'Savings', 'Checking'

In [75]:
p1.first_name, p1.last_name

('Guido', 'van Rossum')

In [76]:
p2.first_name, p2.last_name

('Raymond', 'Hettinger')

In [77]:
b1.account_number, b2.account_number

('Savings', 'Checking')

We can look at the data dictionary in each of the data descriptor instances:

In [78]:
Person.first_name.data

{140356851360776: (<weakref at 0x7fa76043e818; to 'Person' at 0x7fa760446408>,
  'Guido'),
 140356851360152: (<weakref at 0x7fa7400752c8; to 'Person' at 0x7fa760446198>,
  'Raymond')}

In [79]:
Person.last_name.data

{140356851360776: (<weakref at 0x7fa740075138; to 'Person' at 0x7fa760446408>,
  'van Rossum'),
 140356851360152: (<weakref at 0x7fa740075598; to 'Person' at 0x7fa760446198>,
  'Hettinger')}

In [80]:
BankAccount.account_number.data

{140356851360536: (<weakref at 0x7fa76043e868; to 'BankAccount' at 0x7fa760446318>,
  'Savings'),
 140356851361256: (<weakref at 0x7fa740075868; to 'BankAccount' at 0x7fa7604465e8>,
  'Checking')}

And if our objects are garbage collected:

In [81]:
del p1
del p2
del b1
del b2

In [82]:
Person.first_name.data

{}

In [83]:
Person.last_name.data

{}

In [84]:
BankAccount.account_number.data

{}

we can see that our dictionaries were cleaned up too!

OK, so this was a long journey, but it now allows us to handle classes that use slots and are not hashable. 

Depending on your needs, you may not need all this functionality (for example your objects may be guaranteed to be hashable and supports weak refs, in which case you can use the weak key dictionary approach), or maybe your class is guaranteed not to use slots (or contains `__dict__` as one of the slots), in which case you can just use the instance itself for storage (although the name to use is still an outstanding issue).

We'll circle back to using the instance for storage instead of using the data descripor itself in the next set of lectures.

##  The `__set_name__`  Method

Starting in Python 3.6, the `__set_name__` method is an additional method defined in the descriptor protocol.

It gets called once when the descriptor instance is created (so when the class containing it is compiled), and passes the property name as the argument.

Let's see a simple example illustrating this:

In [1]:
class ValidString:
    def __set_name__(self, owner_class, property_name):
        print(f'__set_name__ called: owner={owner_class}, prop={property_name}')

In [2]:
class Person:
    name = ValidString()

__set_name__ called: owner=<class '__main__.Person'>, prop=name


As you can see `__set_name__` was called when the `Person` class was created. This is the only time it gets called.

The main advantage of this is that we can capture the property name:

In [3]:
class ValidString:
    def __set_name__(self, owner_class, property_name):
        print(f'__set_name__ called: owner={owner_class}, prop={property_name}')
        self.property_name = property_name
        
    def __get__(self, instance, owner_class):
        if instance is None:
            return self
        else:
            print(f'__get__ called for property {self.property_name} '
                  f'of instance {instance}')

In [4]:
class Person:
    first_name = ValidString()
    last_name = ValidString()

__set_name__ called: owner=<class '__main__.Person'>, prop=first_name
__set_name__ called: owner=<class '__main__.Person'>, prop=last_name


Now watch what happens when we get the property form the instances:

In [5]:
p = Person()

In [6]:
p.first_name

__get__ called for property first_name of instance <__main__.Person object at 0x7fa4604f3cf8>


In [7]:
p.last_name

__get__ called for property last_name of instance <__main__.Person object at 0x7fa4604f3cf8>


So basically we know which property name was assigned to the instance of the descriptor. 

That can be handy for messages that can reference the property name, or even storing values in the instance dictionary (assuming we can):

In [8]:
class ValidString():
    def __init__(self, min_length):
        self.min_length = min_length
        
    def __set_name__(self, owner_class, property_name):
        self.property_name = property_name

    def __set__(self, instance, value):
        if not isinstance(value, str):
            raise ValueError(f'{self.property_name} must be a string.')
        if len(value) < self.min_length:
            raise ValueError(f'{self.property_name} must be at least '
                             f'{self.min_length} characters'
                            )
        key = '_' + self.property_name
        setattr(instance, key, value)
        
    def __get__(self, instance, owner_class):
        if instance is None:
            return self
        else:
            key = '_' + self.property_name
            return getattr(instance, key, None)

In [9]:
class Person:
    first_name = ValidString(1)
    last_name = ValidString(2)

In [10]:
p = Person()

In [11]:
try:
    p.first_name = 'Alex'
    p.last_name = 'M'
except ValueError as ex:
    print(ex)

last_name must be at least 2 characters


Nice to know that `last_name` is the property raising the exception!

We also used the property name as the basis for an attribute in the instance itself:

In [12]:
p = Person()
p.first_name = 'Alex'

In [13]:
p.first_name, p.__dict__

('Alex', {'_first_name': 'Alex'})

So although this now fixes the issue we saw at the beginning of this section (having the user specify the property name twice), we still have the issue of potentially overwriting an existing instance attribute:

In [14]:
p = Person()

In [15]:
p._first_name = 'some data I need to store'

In [16]:
p.__dict__

{'_first_name': 'some data I need to store'}

In [17]:
p.first_name = 'Alex'

In [18]:
p.__dict__

{'_first_name': 'Alex'}

So that wiped away our data - this is not good, so we need to do something about it.

How about storing the value in the instance using the exact same name?

Think back to how instance attributes shadow class attributes:

In [19]:
class BankAccount:
    apr = 10

In [20]:
b = BankAccount()

In [21]:
b.apr, b.__dict__

(10, {})

In [22]:
b.apr = 20

In [23]:
b.apr, b.__dict__

(20, {'apr': 20})

So as you can see, the descriptor is a **class** attribute. So if we store the value under the same name in the instance, are we not going to run into this shadowing issue where the attribute will now use the attribute in the instance rather than using the class descriptor attribute?

And the answer is it depends!

Data vs non-data descriptors - that distinction is important, and we'll look at this in the next lectures.

Let's preview this quickly:

In [24]:
class ValidString:
    def __init__(self, min_length):
        self.min_length = min_length
        
    def __set_name__(self, owner_class, property_name):
        self.property_name = property_name

    def __set__(self, instance, value):
        if not isinstance(value, str):
            raise ValueError(f'{self.property_name} must be a string.')
        if len(value) < self.min_length:
            raise ValueError(f'{self.property_name} must be at least '
                             f'{self.min_length} characters'
                            )
        instance.__dict__[self.property_name] = value
        
    def __get__(self, instance, owner_class):
        if instance is None:
            return self
        else:
            print (f'calling __get__ for {self.property_name}')
            return instance.__dict__.get(self.property_name, None)

In [25]:
class Person:
    first_name = ValidString(1)
    last_name = ValidString(2)

In [26]:
p = Person()

In [27]:
p.__dict__

{}

In [28]:
p.first_name = 'Alex'

In [29]:
p.__dict__

{'first_name': 'Alex'}

So, `first_name` is in the instance dictionary, and we would expect that accessing `first_name` would use the instance dictionary:

In [30]:
p.first_name

calling __get__ for first_name


'Alex'

Aha, it used the descriptor!!

Let's look at that in detail next.

##  Property Lookup Resolution

As we saw in the last set of lectures, something odd is happening when our class uses a data descriptor, and instances contain the same attribute name in the instance dictionary.

Contrary to what we expected, the descriptor was **still** used.

This boils down to data vs non-data descriptors. Python has a default way of where it looks for attributes depending on whether the descriptor is a data-descriptor or not.

As I explain the lecture video, for data descriptors Python will choose to use the descriptor attribute (in the class), even if the same symbol is found in the instance dictionary.

Let's see this again with a simple example:

In [1]:
class IntegerValue:
    def __set__(self, instance, value):
        print('__set__ called...')
        
    def __get__(self, instance, owner_class):
        print('__get__ called...')

In [2]:
class Point:
    x = IntegerValue()

In [3]:
p = Point()

In [4]:
p.x = 100

__set__ called...


In [5]:
p.x

__get__ called...


Ok, so the descriptor's `__set__` and `__get__` methods were called.

Let's set an attribute named `x` directly on the instance dictionary:

In [6]:
p.__dict__

{}

In [7]:
p.__dict__['x'] = 'hello'

In [8]:
p.__dict__

{'x': 'hello'}

And now let's get the value:

In [9]:
p.x

__get__ called...


As you can see the descriptor was **still** used. The same if we set the value:

In [10]:
p.x = 100

__set__ called...


This works this way because we have a **data descriptor** - the instance attributes do not shadow class descriptors of the same name!

The behavior for a non-data descriptor is different, and the shadowing effect is present:

In [11]:
from datetime import datetime

class TimeUTC:
    def __get__(self, instance, owner_class):
        print('__get__ called...')
        return datetime.utcnow().isoformat()

In [12]:
class Logger:
    current_time = TimeUTC()

In [13]:
l = Logger()

In [14]:
l.current_time

__get__ called...


'2019-07-13T20:47:59.473945'

As you can see the descriptor's `__get__` was called. 

Now let's inject the same symbol directly into our instance dictionary:

In [15]:
l.__dict__

{}

In [16]:
l.__dict__['current_time'] = 'this is not a timestamp'

In [17]:
l.__dict__

{'current_time': 'this is not a timestamp'}

And if we try to get the value for that key:

In [18]:
l.current_time

'this is not a timestamp'

we get the value stored in the instance dictionary, **not** the descriptor's `__get__` method.

Of course we can go back to "normal" by removing that key from the instance dictionary:

In [19]:
del l.__dict__['current_time']

And now:

In [20]:
l.current_time

__get__ called...


'2019-07-13T20:47:59.556109'

What this means is that for data descriptors, where we usually need instance-based storage, we can actually use the property name itself to store the value in the instance **under the same name**. It will **not** shadow the class attribute (the descriptor instance), and it has no risk of overwriting any existing instance attributes our class may have!

Of course, this assume that the class does not use slots, or at least specifies `__dict__` as one of the slots if it does.

Let's apply this to a data descriptor under that assumption:

In [21]:
class ValidString:
    def __init__(self, min_length):
        self.min_length = min_length
        
    def __set_name__(self, owner_class, prop_name):
        self.prop_name = prop_name
        
    def __set__(self, instance, value):
        if not isinstance(value, str):
            raise ValueError(f'{self.prop_name} must be a string.')
        if len(value) < self.min_length:
            raise ValueError(f'{self.prop_name} must be '
                             f'at least {self.min_length} characters.'
                            )
        instance.__dict__[self.prop_name] = value
        
    def __get__(self, instance, owner_class):
        if instance is None:
            return self
        else:
            return instance.__dict__.get(self.prop_name, None)

In [22]:
class Person:
    first_name = ValidString(1)
    last_name = ValidString(2)

In [23]:
p = Person()

In [24]:
p.__dict__

{}

In [25]:
p.first_name = 'Alex'
p.last_name = 'Martelli'

In [26]:
p.__dict__

{'first_name': 'Alex', 'last_name': 'Martelli'}

In [27]:
p.first_name, p.last_name

('Alex', 'Martelli')

Note that I am **not** using attributes (either dot notation or `getattr`/`setattr`) when setting and getting the values from the instance `__dict__`. If I did, it would actually be calling the descriptors `__get__` and `__set__` methods, resulting in an infinite recursion!!

So be careful with that!

In [28]:
class ValidString:
    def __init__(self, min_length):
        self.min_length = min_length
        
    def __set_name__(self, owner_class, prop_name):
        self.prop_name = prop_name
        
    def __set__(self, instance, value):
        print('calling __set__ ...')
        if not isinstance(value, str):
            raise ValueError(f'{self.prop_name} must be a string.')
        if len(value) < self.min_length:
            raise ValueError(f'{self.prop_name} must be '
                             f'at least {self.min_length} characters.'
                            )
        setattr(instance, self.prop_name, value)
        
    def __get__(self, instance, owner_class):
        if instance is None:
            return self
        else:
            return instance.__dict__.get(self.prop_name, None)

In [29]:
class Person:
    name = ValidString(1)

In [30]:
p = Person()

In [31]:
p.name = 'Alex'

calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...


calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...
calling __set__ ...


RecursionError: maximum recursion depth exceeded in comparison

##  Properties and Descriptors

Let's start by creating a property using the decorator syntax:

In [1]:
from numbers import Integral

class Person:
    @property
    def age(self):
        return getattr(self, '_age', None)
    
    @age.setter
    def age(self, value):
        if not isinstance(value, Integral):
            raise ValueError('age: must be an integer.')
        if value < 0:
            raise ValueError('age: must be a non-negative integer.')
        self._age = value

In [2]:
p = Person()

In [3]:
try:
    p.age = -10
except ValueError as ex:
    print(ex)

age: must be a non-negative integer.


And notice how the instance dictionary does not contain `age`, even though we have that instance `age` attribute:

In [4]:
p.age = 10

In [5]:
p.age, p.__dict__

(10, {'_age': 10})

Next, let's rewrite this using a `property` class instead of the decorators:

In [6]:
class Person:
    def get_age(self):
        return getattr(self, '_age', None)
    
    def set_age(self, value):
        if not isinstance(value, Integral):
            raise ValueError('age: must be an integer.')
        if value < 0:
            raise ValueError('age: must be a non-negative integer.')
        self._age = value
        
    age = property(fget=get_age, fset=set_age)

And this works the exact same way as before:

In [7]:
p = Person()

In [8]:
try:
    p.age = -10
except ValueError as ex:
    print(ex)

age: must be a non-negative integer.


In [9]:
p.age = 10

In [10]:
p.age, p.__dict__

(10, {'_age': 10})

Now, in both cases the property object instance can be accessed by using the class:

In [11]:
prop = Person.age

In [12]:
prop

<property at 0x7f9f503d1db8>

And this property, is actually a data descriptor!

In [13]:
hasattr(prop, '__set__')

True

In [14]:
hasattr(prop, '__get__')

True

In this case, our property has both the `__get__` and `__set__` methods so we ended up with a data descriptor.

Even if we only defined a read-only property, we would still end up with a data descriptor:

In [15]:
from datetime import datetime

class TimeUTC:
    @property
    def current_time(self):
        return datetime.utcnow().isoformat()

In [16]:
t = TimeUTC()
t.current_time

'2019-07-13T20:48:18.993428'

In [17]:
prop = TimeUTC.current_time

In [18]:
hasattr(prop, '__get__')

True

In [19]:
hasattr(prop, '__set__')

True

But the internal implemetation of the `__set__` method would refuse to set a value:

In [20]:
try:
    t.current_time = datetime.utcnow().isoformat()
except AttributeError as ex:
    print(ex)

can't set attribute


So, if properties are implemented using data descriptors - this means that instance attributes with the same name will not shadow the descriptor:

In [21]:
t.__dict__

{}

In [22]:
t.__dict__['current_time'] = 'not a time'

In [23]:
t.__dict__

{'current_time': 'not a time'}

In [24]:
t.current_time

'2019-07-13T20:48:19.099088'

OK, so given what we know about data descriptors all this should make sense.

Now let's try to implement our own version of the property type, decorators and all!

In [25]:
class MakeProperty:
    def __init__(self, fget=None, fset=None):
        self.fget = fget
        self.fset = fset
        
    def __set_name__(self, owner_class, prop_name):
        self.prop_name = prop_name
        
    def __get__(self, instance, owner_class):
        print('__get__ called...')
        if instance is None:
            return self
        if self.fget is None:
            raise AttributeError(f'{self.prop_name} is not readable.')
        return self.fget(instance)
            
    def __set__(self, instance, value):
        print('__set__ called...')
        if self.fset is None:
            raise AttributeError(f'{self.prop_name} is not writable.')
        self.fset(instance, value)

This is now sufficient to start creating properties using this data descriptor:

In [26]:
class Person:
    def get_name(self):
        return self._name
    
    def set_name(self, value):
        self._name = value
        
    name = MakeProperty(fget=get_name, fset=set_name)

In [27]:
p = Person()

In [28]:
p.__dict__

{}

In [29]:
p.name = 'Guido'

__set__ called...


In [30]:
p.name

__get__ called...


'Guido'

And even if we try to shadow the property name in the instance, things will work just fine:

In [31]:
p.__dict__['name'] = 'Alex'

In [32]:
p.__dict__

{'_name': 'Guido', 'name': 'Alex'}

In [33]:
p.name

__get__ called...


'Guido'

Next we would like to have a decorator approach as well. To do that we're going to mimic the way the property decorators work (you may want to go back to those lectures and refresh your memory if needed).

So how should the `@MakeProperty` decorator work?

It should take a function and return a descriptor object. 

In turn, that descriptor object should have a `setter` method that we can call to *add* the setter method to the descriptor, that also returns the descriptor object - just like we have with `property` types:

In [34]:
class MakeProperty:
    def __init__(self, fget=None, fset=None):
        self.fget = fget
        self.fset = fset
        
    def __set_name__(self, owner_class, prop_name):
        self.prop_name = prop_name
        
    def __get__(self, instance, owner_class):
        print('__get__ called...')
        if instance is None:
            return self
        if self.fget is None:
            raise AttributeError(f'{self.prop_name} is not readable.')
        return self.fget(instance)
            
    def __set__(self, instance, value):
        print('__set__ called...')
        if self.fset is None:
            raise AttributeError(f'{self.prop_name} is not writable.')
        self.fset(instance, value)
        
    def setter(self, fset):
        self.fset = fset
        return self
        

So both the `__init__` and the `setter` methods can be used like decorators, and we can now use our `MakeProperty` class with decorator syntax:

We can do it the "long" way first:

In [35]:
class Person:
    def get_first_name(self):
        return getattr(self, '_first_name', None)
    
    def set_first_name(self, value):
        self._first_name = value
        
    def get_last_name(self):
        return getattr(self, '_last_name', None)
    
    def set_last_name(self, value):
        self._last_name = value
        
    first_name = MakeProperty(fget=get_first_name, fset=set_first_name)
    last_name = MakeProperty(fget=get_last_name, fset=set_last_name)

Or, we can use the "shorthand" decorator syntax:

In [36]:
class Person:
    @MakeProperty
    def first_name(self):
        return getattr(self, '_first_name', None)
    
    @first_name.setter
    def first_name(self, value):
        self._first_name = value
        
    @MakeProperty
    def last_name(self):
        return getattr(self, '_last_name', None)
    
    @last_name.setter
    def last_name(self, value):
        self._last_name = value

In [37]:
p1 = Person()

In [38]:
p1.first_name = 'Raymond'

__set__ called...


In [39]:
p1.last_name = 'Hettinger'

__set__ called...


In [40]:
p1.first_name

__get__ called...


'Raymond'

In [41]:
p1.last_name

__get__ called...


'Hettinger'

And of course this will work with multiple instances of the `Person` class since we are using the instances themselves for the underlying storage:

In [42]:
p2 = Person()
p2.first_name, p2.last_name = 'Alex', 'Martelli'

__set__ called...
__set__ called...


In [43]:
p1.first_name, p1.last_name, p2.first_name, p2.last_name

__get__ called...
__get__ called...
__get__ called...
__get__ called...


('Raymond', 'Hettinger', 'Alex', 'Martelli')

Of course our implementation is quite simplistic, but it should help solidy our understanding of properties, descriptors, and decorators too!

##  Application - Example 1

Now let's look at some further examples of using descriptors that provides better better reusability than using `property` types (remember the repeated code issue we were trying to solve in the first place!)

We have already seen that data validation works well with descriptors.

For example, we may want our object attributes to have valid values for some of it's attributes:

In [1]:
class Int:
    def __set_name__(self, owner_class, prop_name):
        self.prop_name = prop_name
        
    def __set__(self, instance, value):
        if not isinstance(value, int):
            raise ValueError(f'{self.prop_name} must be an integer.')
        instance.__dict__[self.prop_name] = value
        
    def __get__(self, instance, owner_class):
        if instance is None:
            return self
        else:
            return instance.__dict__.get(self.prop_name, None)
            

In [2]:
class Float:
    def __set_name__(self, owner_class, prop_name):
        self.prop_name = prop_name
        
    def __set__(self, instance, value):
        if not isinstance(value, float):
            raise ValueError(f'{self.prop_name} must be a float.')
        instance.__dict__[self.prop_name] = value
        
    def __get__(self, instance, value):
        if instance is None:
            return self
        else:
            return instance.__dict__.get(self.prop_name, None)

In [3]:
class List:
    def __set_name__(self, owner_class, prop_name):
        self.prop_name = prop_name
        
    def __set__(self, instance, value):
        if not isinstance(value, list):
            raise ValueError(f'{self.prop_name} must be a list.')
        instance.__dict__[self.prop_name] = value
        
    def __get__(self, instance, value):
        if instance is None:
            return self
        else:
            return instance.__dict__.get(self.prop_name, None)
        
    

We can now use these descriptors in multiple class definitions, and as many times as we want in each class:

In [4]:
class Person:
    age = Int()
    height = Float()
    tags = List()
    favorite_foods = List()

In [5]:
p = Person()

In [6]:
try:
    p.age = 12.5
except ValueError as ex:
    print(ex)

age must be an integer.


In [7]:
try:
    p.height = 'abc'
except ValueError as ex:
    print(ex)

height must be a float.


In [8]:
try:
    p.tags = 'python'
except ValueError as ex:
    print(ex)

tags must be a list.


One thing here, is that I got rather tired of writing the same code multiple times for the descriptor classes! (beats having to re-write the same code over and over again that we would have had with properties, but still, we can do better than that!)

So let's rewrite this to be a bit more generic:

In [9]:
class ValidType:
    def __init__(self, type_):
        self._type = type_
        
    def __set_name__(self, owner_clasds, prop_name):
        self.prop_name = prop_name
        
    def __set__(self, instance, value):
        if not isinstance(value, self._type):
            raise ValueError(f'{self.prop_name} must be of type '
                             f'{self._type.__name__}'
                            )
        instance.__dict__[self.prop_name] = value
        
    def __get__(self, instance, owner_class):
        if instance is None:
            return self
        else:
            return instance.__dict__.get(self.prop_name, None)

And now we can achieve the same functionality as before:

In [10]:
class Person:
    age = ValidType(int)
    height = ValidType(float)
    tags = ValidType(list)
    favorite_foods = ValidType(tuple)
    name = ValidType(str)

In [11]:
p = Person()

In [12]:
try:
    p.age = 10.5
except ValueError as ex:
    print(ex)

age must be of type int


In [13]:
try:
    p.height = 10
except ValueError as ex:
    print(ex)

height must be of type float


Now I'd like to allow setting the height to an integer value, since those are a subset of floats (in the mathematicel sense). That's easy, all I need to do is to use the `numbers.Real` class:

In [14]:
import numbers

In [15]:
isinstance(10.1, numbers.Real)

True

In [16]:
isinstance(10, numbers.Real)

True

So let's tweak our `Person` class:

In [17]:
class Person:
    age = ValidType(int)
    height = ValidType(numbers.Real)
    tags = ValidType(list)
    favorite_foods = ValidType(tuple)
    name = ValidType(str)

In [18]:
p = Person()

In [19]:
p.height = 10

In [20]:
p.height

10

##  Application - Example 2

Suppose we have a `Polygon` class that has a vertices property that needs to be defined as a sequence of `Point2D` instances. So here, not only do we want the `vertices` attribute of our `Polygon` to be an iterable of some kind, we also want the elements to all be instances of the `Point2D` class. In turn we'll also want to make sure that coordinates for `Point2D` are non-negative integer values (as might be expected in computer screen coordinates):

Let's start by defining the `Point2D` class, but we'll need a descriptor for the coordinates to ensure they are integer values, possibly bounded between min and max values:

In [1]:
class Int:
    def __init__(self, min_value=None, max_value=None):
        self.min_value = min_value
        self.max_value = max_value
        
    def __set_name__(self, owner_class, name):
        self.name = name
        
    def __set__(self, instance, value):
        if not isinstance(value, int):
            raise ValueError(f'{self.name} must be an int.')
        if self.min_value is not None and value < self.min_value:
            raise ValueError(f'{self.name} must be at least {self.min_value}')
        if self.max_value is not None and value > self.max_value:
            raise ValueError(f'{self.name} cannot exceed {self.max_value}')
        instance.__dict__[self.name] = value
        
    def __get__(self, instance, owner_class):
        if instance is None:
            return self
        else:
            return instance.__dict__.get(self.name, None)

In [2]:
class Point2D:
    x = Int(min_value=0, max_value=800)
    y = Int(min_value=0, max_value=400)
    
    def __init__(self, x, y):
        self.x = x
        self.y = y
        
    def __repr__(self):
        return f'Point2D(x={self.x}, y={self.y})'
    
    def __str__(self):
        return f'({self.x}, {self.y})'

And our `Point2D` class will now only allow integer values within the defined range:

In [3]:
p = Point2D(0, 10)

In [4]:
str(p)

'(0, 10)'

In [5]:
repr(p)

'Point2D(x=0, y=10)'

In [6]:
p.x, p.y

(0, 10)

But:

In [7]:
try:
    p = Point2D(0, 500)
except ValueError as ex:
    print(ex)

y cannot exceed 400


Next let's create a validator that checks that we have a sequence (mutable or immutable, does not matter) of `Point2D` objects. 

To check of something is a sequence, we can use the abstract base classes defined in the `collections` module:

In [8]:
import collections

In [9]:
isinstance([1, 2, 3], collections.abc.Sequence)

True

In [10]:
isinstance([1, 2, 3], collections.abc.MutableSequence)

True

In [11]:
isinstance((1, 2, 3), collections.abc.Sequence)

True

In [12]:
isinstance((1, 2, 3), collections.abc.MutableSequence)

False

So let's write the validator:

In [13]:
class Point2DSequence:
    def __init__(self, min_length=None, max_length=None):
        self.min_length = min_length
        self.max_length = max_length
        
    def __set_name__(self, cls, name):
        self.name = name
        
    def __set__(self, instance, value):
        if not isinstance(value, collections.abc.Sequence):
            raise ValueError(f'{self.name} must be a sequence type.')
        if self.min_length is not None and len(value) < self.min_length:
            raise ValueError(f'{self.name} must contain at least '
                             f'{self.min_length} elements'
                            )
        if self.max_length is not None and len(value) > self.max_length:
            raise ValueError(f'{self.name} cannot contain more than  '
                             f'{self.max_length} elements'
                            )
        for index, item in enumerate(value):
            if not isinstance(item, Point2D):
                raise ValueError(f'Item at index {index} is not a Point2D instance.')
                
        # value passes checks - want to store it as a mutable sequence so we can 
        # append to it later
        instance.__dict__[self.name] = list(value)
        
    def __get__(self, instance, cls):
        if instance is None:
            return self
        else:
            if self.name not in instance.__dict__:
                # current point list has not been defined,
                # so let's create an empty list
                instance.__dict__[self.name] = []
            return instance.__dict__.get(self.name)

And now we can use this for our `Polygon` class:

In [14]:
class Polygon:
    vertices = Point2DSequence(min_length=3)
    
    def __init__(self, *vertices):
        self.vertices = vertices

In [15]:
try:
    p = Polygon()
except ValueError as ex:
    print(ex)

vertices must contain at least 3 elements


In [16]:
try:
    p = Polygon(Point2D(-100,0), Point2D(0, 1), Point2D(1, 0))
except ValueError as ex:
    print(ex)

x must be at least 0


In [17]:
p = Polygon(Point2D(0,0), Point2D(0, 1), Point2D(1, 0))

In [18]:
p.vertices

[Point2D(x=0, y=0), Point2D(x=0, y=1), Point2D(x=1, y=0)]

OK, so, for completeness, let's write a method that we can use to append new points to the vertices list (that's why we made it a mutable sequence type!)

In [19]:
class Polygon:
    vertices = Point2DSequence(min_length=3)
    
    def __init__(self, *vertices):
        self.vertices = vertices
        
    def append(self, pt):
        if not isinstance(pt, Point2D):
            raise ValueError('Can only append Point2D instances.')
        max_length = type(self).vertices.max_length
        if max_length is not None and len(self.vertices) >= max_length:
            # cannot add more points!
            raise ValueError(f'Vertices length is at max ({max_length})')
        self.vertices.append(pt)
                

In [20]:
p = Polygon(Point2D(0,0), Point2D(1,0), Point2D(0,1))

In [21]:
p.vertices

[Point2D(x=0, y=0), Point2D(x=1, y=0), Point2D(x=0, y=1)]

In [22]:
p.append(Point2D(10, 10))

In [23]:
p.vertices

[Point2D(x=0, y=0), Point2D(x=1, y=0), Point2D(x=0, y=1), Point2D(x=10, y=10)]

Now we could set a `max_length` directly when we define the `Polygon` class:

In [24]:
class Polygon:
    vertices = Point2DSequence(min_length=3, max_length=3)
    
    def __init__(self, *vertices):
        self.vertices = vertices
        
    def append(self, pt):
        if not isinstance(pt, Point2D):
            raise ValueError('Can only append Point2D instances.')
        max_length = type(self).vertices.max_length
        if max_length is not None and len(self.vertices) >= max_length:
            # cannot add more points!
            raise ValueError(f'Vertices length is at max ({max_length})')
        self.vertices.append(pt)
                

In [25]:
p = Polygon(Point2D(0,0), Point2D(1,0), Point2D(0,1))

In [26]:
try:
    p.append(Point2D(10, 10))
except ValueError as ex:
    print(ex)

Vertices length is at max (3)


But instead, let's use inheritance to create special `Polygon` types!

First let's go back to our original `Polygon` definition:

In [27]:
class Polygon:
    vertices = Point2DSequence(min_length=3)
    
    def __init__(self, *vertices):
        self.vertices = vertices
        
    def append(self, pt):
        if not isinstance(pt, Point2D):
            raise ValueError('Can only append Point2D instances.')
        max_length = type(self).vertices.max_length
        if max_length is not None and len(self.vertices) >= max_length:
            # cannot add more points!
            raise ValueError(f'Vertices length is at max ({max_length})')
        self.vertices.append(pt)
                

In [28]:
class Triangle(Polygon):
    vertices = Point2DSequence(min_length=3, max_length=3)

So `Triangle` redefines the vertices property, but inherits both the `__init__` and `append` methods:

In [29]:
p = Polygon(Point2D(0,0), Point2D(1,0), Point2D(0,1))

In [30]:
p.append(Point2D(10, 10))

In [31]:
p.vertices

[Point2D(x=0, y=0), Point2D(x=1, y=0), Point2D(x=0, y=1), Point2D(x=10, y=10)]

That works fine, but this does not:

In [32]:
t = Triangle(Point2D(0,0), Point2D(1,0), Point2D(0,1))

In [33]:
try:
    t.append(Point2D(10, 10))
except ValueError as ex:
    print(ex)

Vertices length is at max (3)


And we can also do a square:

In [34]:
class Square(Polygon):
    vertices = Point2DSequence(min_length=4, max_length=4)

In [35]:
s = Square(Point2D(0,0), Point2D(1,0), Point2D(0,1), Point2D(1, 1))

In [36]:
s.vertices

[Point2D(x=0, y=0), Point2D(x=1, y=0), Point2D(x=0, y=1), Point2D(x=1, y=1)]

In [37]:
try:
    s.append(Point2D(10, 10))
except ValueError as ex:
    print(ex)

Vertices length is at max (4)


We could actually improve this even more by making our `Polygon` class an actual sequence type. To do that we only need to implement a few special methods:

In [38]:
class Polygon:
    vertices = Point2DSequence(min_length=3)
    
    def __init__(self, *vertices):
        self.vertices = vertices
        
    def append(self, pt):
        if not isinstance(pt, Point2D):
            raise ValueError('Can only append Point2D instances.')
        max_length = type(self).vertices.max_length
        if max_length is not None and len(self.vertices) >= max_length:
            # cannot add more points!
            raise ValueError(f'Vertices length is at max ({max_length})')
        self.vertices.append(pt)
                
    def __len__(self):
        return len(self.vertices)
        
    def __getitem__(self, idx):
        return self.vertices[idx]
        

In [39]:
p = Polygon(Point2D(0,0), Point2D(1,0), Point2D(1,1))

In [40]:
len(p)

3

In [41]:
list(p)

[Point2D(x=0, y=0), Point2D(x=1, y=0), Point2D(x=1, y=1)]

In [42]:
p[0], p[1], p[2]

(Point2D(x=0, y=0), Point2D(x=1, y=0), Point2D(x=1, y=1))

In [43]:
p[0:2]

[Point2D(x=0, y=0), Point2D(x=1, y=0)]

We could even implement in-place addition and containment:

In [44]:
class Polygon:
    vertices = Point2DSequence(min_length=3)
    
    def __init__(self, *vertices):
        self.vertices = vertices
        
    def append(self, pt):
        if not isinstance(pt, Point2D):
            raise ValueError('Can only append Point2D instances.')
        max_length = type(self).vertices.max_length
        if max_length is not None and len(self.vertices) >= max_length:
            # cannot add more points!
            raise ValueError(f'Vertices length is at max ({max_length})')
        self.vertices.append(pt)
                
    def __len__(self):
        return len(self.vertices)
        
    def __getitem__(self, idx):
        return self.vertices[idx]
        
    def __iadd__(self, pt):
        self.append(pt)
        return self
    
    def __contains__(self, pt):
        return pt in self.vertices

In [45]:
p = Polygon(Point2D(0,0), Point2D(1,0), Point2D(1,1))

In [46]:
list(p)

[Point2D(x=0, y=0), Point2D(x=1, y=0), Point2D(x=1, y=1)]

In [47]:
p += Point2D(10, 10)

In [48]:
list(p)

[Point2D(x=0, y=0), Point2D(x=1, y=0), Point2D(x=1, y=1), Point2D(x=10, y=10)]

What about containment?

In [49]:
Point2D(0, 0) in p

False

Why `False`? The point (0,0) is in the vertices list... 

Well, we didn't override the `__eq__` method in our `Point2D` class, so it's using the implementation in `object`, which uses object identity.

We can easily fix that:

In [50]:
class Point2D:
    x = Int(min_value=0, max_value=800)
    y = Int(min_value=0, max_value=400)
    
    def __init__(self, x, y):
        self.x = x
        self.y = y
        
    def __repr__(self):
        return f'Point2D(x={self.x}, y={self.y})'
    
    def __str__(self):
        return f'({self.x}, {self.y})'
    
    def __eq__(self, other):
        return isinstance(other, Point2D) and self.x == other.x and self.y == other.y
        
    def __hash__(self):
        return hash((self.x, self.y))

In [51]:
class Polygon:
    vertices = Point2DSequence(min_length=3)
    
    def __init__(self, *vertices):
        self.vertices = vertices
        
    def append(self, pt):
        if not isinstance(pt, Point2D):
            raise ValueError('Can only append Point2D instances.')
        max_length = type(self).vertices.max_length
        if max_length is not None and len(self.vertices) >= max_length:
            # cannot add more points!
            raise ValueError(f'Vertices length is at max ({max_length})')
        self.vertices.append(pt)
                
    def __len__(self):
        return len(self.vertices)
        
    def __getitem__(self, idx):
        return self.vertices[idx]
        
    def __iadd__(self, pt):
        self.append(pt)
        return self
    
    def __contains__(self, pt):
        return pt in self.vertices

In [52]:
p = Polygon(Point2D(0,0), Point2D(1,0), Point2D(1,1))

In [53]:
Point2D(0,0) in p

True

##  Functions and Descriptors

As I mentioned in the lecture video, Python functions actually implement the non-data descriptor protocol, i.e. they implement the `__get__` method

In [1]:
def add(a, b):
    return a + b

In [2]:
hasattr(add, '__get__')

True

So what does that `__get__` actually return?

We know the arguments for `__get__` are `self, instance, owner_class`, so let's try to call the `__get__` method with `instance` set to `None` and `owner_class` set to our main module:

In [3]:
import sys

In [4]:
me = sys.modules['__main__']

In [5]:
p = add.__get__(None, me)

In [6]:
p, id(p)

(<function __main__.add(a, b)>, 140554287212472)

In [7]:
add, id(add)

(<function __main__.add(a, b)>, 140554287212472)

As you can see, when `instance` is `None`, the `__get__` method just returns the function itself, with owner set to `__main__` in this case.

Now let's see what happens when we define a function inside a class:

In [8]:
class Person:
    def __init__(self, name):
        self.name = name
        
    def say_hello(self):
        return f'{self.name} says hello'

Let's first access the `say_hello` callable from the class:

In [9]:
Person.say_hello

<function __main__.Person.say_hello(self)>

As you can see the owner class is now `__main__.Person`, and we get a plain function back.

What essentially happened is that when we retrieved the attribute `say_hello` from the `Person` class, since functions are descriptors, Python called the `__get__` method, in this case with `instance` set to `None`, and the owner class set to the `Person` class.

And when we call it from an instance:

In [10]:
p = Person('Alex')

In [11]:
hex(id(p))

'0x7fd5585f5470'

In [12]:
p.say_hello

<bound method Person.say_hello of <__main__.Person object at 0x7fd5585f5470>>

Again, since `say_hello` is actually a descriptor, Python invoked the `__get__` method, this time with an instance (`p`) and with owner class set to `Person`.

The descriptor then returns a method object, which it binds to the instance.

So we could retrieve it this way too:

In [13]:
bound_method = Person.say_hello.__get__(p, Person)

In [14]:
bound_method

<bound method Person.say_hello of <__main__.Person object at 0x7fd5585f5470>>

In [15]:
p.say_hello()

'Alex says hello'

In [16]:
bound_method()

'Alex says hello'

So the question is, since `p.say_hello`, a non-data descriptor, does not return a function, but a `method` object, where is the *actual* function stored?

Turns out methods have a special attribute, `__func__` that is is used to keep a reference to the original function that can then be called when needed:

In [17]:
p.say_hello.__func__, id(p.say_hello.__func__)

(<function __main__.Person.say_hello(self)>, 140554287397880)

As you can see, `__func__` is a reference to the `say_hello` function object defined in the `Person` class, and to make sure we can do this:

In [18]:
p.say_hello.__func__ is Person.say_hello

True

We could try to mimic this behavior ourselves by writing our own descriptor. The problem is that we need to define a function using Python functions, so this is a bit circular, but we can try to somewhat mimic instance methods to gain a better understanding of how they work.

Let's say we want to mimic something like this:

In [19]:
class Person:
    def __init__(self, name):
        self.name = name
        
    def say_hello(self):
        return f'{self.name} says hello!'

We want to write a descriptor to replace `say_hello`.

First we're going to write a plain function, directly in our main module:

In [20]:
def say_hello(self):
    if self and hasattr(self, 'name'):
        return f'{self.name} says hello!'
    else:
        return 'Hello!'

Now we can call this as an ordinary function:

In [21]:
say_hello(None)

'Hello!'

But what we really want is to make a descriptor that either returns the function itself when accessed via the class it is contained in (`Person` in this case), or a bound method when it is accessed via an instance of that class.

First a slight detour to look at method types.

A `method` is an actual type in Python, and it is available in the `types` module:

In [22]:
import types

In [23]:
help(types.MethodType)

Help on class method in module builtins:

class method(object)
 |  method(function, instance)
 |  
 |  Create a bound instance method object.
 |  
 |  Methods defined here:
 |  
 |  __call__(self, /, *args, **kwargs)
 |      Call self as a function.
 |  
 |  __delattr__(self, name, /)
 |      Implement delattr(self, name).
 |  
 |  __eq__(self, value, /)
 |      Return self==value.
 |  
 |  __ge__(self, value, /)
 |      Return self>=value.
 |  
 |  __get__(self, instance, owner, /)
 |      Return an attribute of instance, which is of type owner.
 |  
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |  
 |  __gt__(self, value, /)
 |      Return self>value.
 |  
 |  __hash__(self, /)
 |      Return hash(self).
 |  
 |  __le__(self, value, /)
 |      Return self<=value.
 |  
 |  __lt__(self, value, /)
 |      Return self<value.
 |  
 |  __ne__(self, value, /)
 |      Return self!=value.
 |  
 |  __new__(*args, **kwargs) from builtins.type
 |      Create and return

As we can see the constructor for the `MethodType` requires a function, and an object to bind it to.

Let's try this out:

In [24]:
class Person:
    def __init__(self, name):
        self.name = name

In [25]:
p = Person('Alex')
m = types.MethodType(say_hello, p)

In [26]:
p, m

(<__main__.Person at 0x7fd5585f5358>,
 <bound method say_hello of <__main__.Person object at 0x7fd5585f5358>>)

As we can see, `m` is a `method` object, bound to the object `p`. And we can call this method:

In [27]:
m()

'Alex says hello!'

Ok, so now we can start planning how we are going to implement our descriptor.

When the `__get__` method is called from the class, we will want to return the plain `say_hello` function. But when `__get__` is called from an instance we'll want to return a method object bound to the specific instance.

In [28]:
class MyFunc:
    def __init__(self, func):
        self._func = func
    
    def __get__(self, instance, owner):
        if instance is None:
            # called from class
            print('__get__ called from class')
            return self._func
        else:
            # called from instance
            print('__get__ called from an instance')
            return types.MethodType(self._func, instance)

I made a slight tweak here to allow us to specify any function we want in the init - this make this descriptor a little more generic.

Now let's go ahead and use that in a class:

In [29]:
def hello(self):
    print(f'{self.name} says hello!')
    
class Person:
    def __init__(self, name):
        self.name = name
        
    say_hello = MyFunc(hello)

Now let's see what happens when we access `say_hello` from the class:

In [30]:
Person.say_hello

__get__ called from class


<function __main__.hello(self)>

We get the original function back.

And when we access it from an instance of `Person`:

In [31]:
p = Person('Alex')
p.say_hello

__get__ called from an instance


<bound method hello of <__main__.Person object at 0x7fd5585f5d68>>

We get a bound method.

In [32]:
p.say_hello()

__get__ called from an instance
Alex says hello!


Moreover, the original function `hello` is referenced by the bound method:

In [33]:
p.say_hello.__func__

__get__ called from an instance


<function __main__.hello(self)>

Hopefully it is now a little clearer how methods actually work in Python!

# Section 09 - Project 4

##  Solution - Part 1

Let's go ahead and just create the descriptors one by one first:

In [1]:
import numbers

In [2]:
class IntegerField:
    def __init__(self, min_, max_):
        self._min = min_
        self._max = max_

    def __set_name__(self, owner_class, prop_name):
        self.prop_name = prop_name
        
    def __set__(self, instance, value):
        if not isinstance(value, numbers.Integral):
            raise ValueError(f'{self.prop_name} must be an integer.')
        if value < self._min:
            raise ValueError(f'{self.prop_name} must be >= {self._min}.')
        if value > self._max:
            raise ValueError(f'{self.prop_name} must be <= {self._max}')
        instance.__dict__[self.prop_name] = value
        
    def __get__(self, instance, owner_class):
        if instance is None:
            return self
        else:
            return instance.__dict__.get(self.prop_name, None)
    

Let's just make sure this works as expected:

In [3]:
class Person:
    age = IntegerField(0, 100)

In [4]:
p = Person()

In [5]:
p.age = 5

In [6]:
p.age

5

In [7]:
try:
    p.age = 200
except ValueError as ex:
    print(ex)

age must be <= 100


But of course, we really need unit testing. So let's write some unit tests to test this functionality. If you're rusty you may want to go back to Project 1 and review the unit test section in there.

In [8]:
import unittest

def run_tests(test_class):
    suite = unittest.TestLoader().loadTestsFromTestCase(test_class)
    runner = unittest.TextTestRunner(verbosity=2)
    result = runner.run(suite)

For each test we are going to need a class that defines an instance of our descriptor as an attribute.

We could do it this way:

In [9]:
class TestIntegerField(unittest.TestCase):
    class Person:
        age = IntegerField(0, 10)
        
    def test_set_age_ok(self):
        p = self.Person()
        p.age = 0
        self.assertEqual(0, p.age)

In [10]:
run_tests(TestIntegerField)

test_set_age_ok (__main__.TestIntegerField) ... ok

----------------------------------------------------------------------
Ran 1 test in 0.001s

OK


So this kind of testing works just fine, but  our `Person` class `age` is hardcoded to min and max values. We would ideally like to be able to modify those settings for every test (so we can test later with and without those values).

So, we'll override the descriptor attribute when we run the test!

In [11]:
class TestIntegerField(unittest.TestCase):
    class Person:
        age = IntegerField(0, 10)
        
    def test_set_age_ok(self):
        min_ = 5
        max_ = 10
        self.Person.age = IntegerField(5, 10)
        p = self.Person()
        
        p.age = 5
        self.assertEqual(5, p.age)

In [12]:
run_tests(TestIntegerField)

test_set_age_ok (__main__.TestIntegerField) ... ERROR

ERROR: test_set_age_ok (__main__.TestIntegerField)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "<ipython-input-11-52de3d0f3544>", line 11, in test_set_age_ok
    p.age = 5
  File "<ipython-input-2-f3204d7bb071>", line 16, in __set__
    instance.__dict__[self.prop_name] = value
AttributeError: 'IntegerField' object has no attribute 'prop_name'

----------------------------------------------------------------------
Ran 1 test in 0.001s

FAILED (errors=1)


Hmm... that's not working.

That's because we defined the instance of our descriptor outside of a class, so the `__set_name__` method was never called!

We could fix this by calling `__set_name__` ourselves, but a cleaner approach would be to do a bit of meta programming. 

I'll show you both approaches.

In [13]:
class TestIntegerField(unittest.TestCase):
    class Person:
        pass
    
    def create_person(self, min_, max_):
        self.Person.age = IntegerField(min_, max_)
        self.Person.age.__set_name__(Person, 'age')
        return self.Person()
        
    def test_set_age_ok(self):
        min_ = 5
        max_ = 10
        p = self.create_person(min_, max_)
        p.age = 5
        self.assertEqual(5, p.age)

In [14]:
run_tests(TestIntegerField)

test_set_age_ok (__main__.TestIntegerField) ... ok

----------------------------------------------------------------------
Ran 1 test in 0.001s

OK


Let's avoid using this hardcoded `Person` class and this weird patching we had to do by creating a class using a functional approach instead of a declarative one (using the `class` keyword).

We already know that the type of any custom class we create is `type`. It is a metaclass, and classes are actually instances of the `type` metaclass.

The `type` metaclass is actually callable, and can be used to create classes, without having to write a `class` definition.

The constructor for `type` is: `type(class_name, parent_classes, class_attributes)`
where `class_attributes` is a dictionary contain the names and values of the class attributes we want to define for our class.

In [15]:
Person = type('Person', (), {'a': 10})

In [16]:
type(Person)

type

In [17]:
Person.__dict__

mappingproxy({'a': 10,
              '__module__': '__main__',
              '__dict__': <attribute '__dict__' of 'Person' objects>,
              '__weakref__': <attribute '__weakref__' of 'Person' objects>,
              '__doc__': None})

As you can see we have the same as if we had done this:

In [18]:
class Person:
    age = 10

In [19]:
type(Person)

type

In [20]:
Person.__dict__

mappingproxy({'__module__': '__main__',
              'age': 10,
              '__dict__': <attribute '__dict__' of 'Person' objects>,
              '__weakref__': <attribute '__weakref__' of 'Person' objects>,
              '__doc__': None})

The blank argument we provided is there for inheritance - but we're not using inheritance here, hence the empty tuple.

So let's refactor our test class to use this approach:

In [21]:
class TestIntegerField(unittest.TestCase):
    @staticmethod
    def create_test_class(min_, max_):
        obj = type('TestClass', (), {'age': IntegerField(min_, max_)})
        return obj()
        
    def test_set_age_ok(self):
        min_ = 5
        max_ = 10
        p = self.create_test_class(min_, max_)
        p.age = 5
        self.assertEqual(5, p.age)

In [22]:
run_tests(TestIntegerField)

test_set_age_ok (__main__.TestIntegerField) ... ok

----------------------------------------------------------------------
Ran 1 test in 0.001s

OK


OK, now that this is out of the way, let's continue writing our unit tests:

In [23]:
class TestIntegerField(unittest.TestCase):
    @staticmethod
    def create_test_class(min_, max_):
        obj = type('TestClass', (), {'age': IntegerField(min_, max_)})
        return obj()
        
    def test_set_age_ok(self):
        """Tests that valid values can be assigned/retrieved"""
        min_ = 5
        max_ = 10
        obj = self.create_test_class(min_, max_)
        valid_values = range(min_, max_)
        
        for i, value in enumerate(valid_values):
            with self.subTest(test_number=i):
                obj.age = value
                self.assertEqual(value, obj.age)

In [24]:
run_tests(TestIntegerField)

test_set_age_ok (__main__.TestIntegerField)
Tests that valid values can be assigned/retrieved ... ok

----------------------------------------------------------------------
Ran 1 test in 0.001s

OK


Now let's add failure tests and a check that we have implemented `__get__` such that using it from the class returns the descriptor instance.

In [25]:
class TestIntegerField(unittest.TestCase):
    @staticmethod
    def create_test_class(min_, max_):
        obj = type('TestClass', (), {'age': IntegerField(min_, max_)})
        return obj()
        
    def test_set_age_ok(self):
        """Tests that valid values can be assigned/retrieved"""
        min_ = 5
        max_ = 10
        obj = self.create_test_class(min_, max_)
        valid_values = range(min_, max_)
        
        for i, value in enumerate(valid_values):
            with self.subTest(test_number=i):
                obj.age = value
                self.assertEqual(value, obj.age)
                
    def test_set_age_invalid(self):
        """Tests that invalid values raise ValueErrors"""
        min_ = -10
        max_ = 10
        obj = self.create_test_class(min_, max_)
        bad_values = list(range(min_ - 5, min_))
        bad_values += list(range(max_ + 1, max_ + 5))
        bad_values += [10.5, 1 + 0j, 'abc', (1, 2)]
        
        for i, value in enumerate(bad_values):
            with self.subTest(test_number=i):
                with self.assertRaises(ValueError):
                    obj.age = value
                    
    def test_class_get(self):
        """Tests that class attribute retrieval returns the descriptor instance"""
        obj = self.create_test_class(0, 0)
        obj_class = type(obj)
        self.assertIsInstance(obj_class.age, IntegerField)
        

In [26]:
run_tests(TestIntegerField)

test_class_get (__main__.TestIntegerField)
Tests that class attribute retrieval returns the descriptor instance ... ok
test_set_age_invalid (__main__.TestIntegerField)
Tests that invalid values raise ValueErrors ... ok
test_set_age_ok (__main__.TestIntegerField)
Tests that valid values can be assigned/retrieved ... ok

----------------------------------------------------------------------
Ran 3 tests in 0.002s

OK


OK, so that's our `IntegerField` so far. Let's modify it (and the unit tests) so that we can optionally not specify min/max.

We're actually going to write the tests **first**, run them and make sure they fail, then implement the functionality, re-run the tests and make sure they now pass. (This is an example of test-driven development - we write the tests first, then implement the functionality making sure our tests fail before, and pass after).

In [27]:
class TestIntegerField(unittest.TestCase):
    @staticmethod
    def create_test_class(min_, max_):
        obj = type('TestClass', (), {'age': IntegerField(min_, max_)})
        return obj()
        
    def test_set_age_ok(self):
        """Tests that valid values can be assigned/retrieved"""
        min_ = 5
        max_ = 10
        obj = self.create_test_class(min_, max_)
        valid_values = range(min_, max_)
        
        for i, value in enumerate(valid_values):
            with self.subTest(test_number=i):
                obj.age = value
                self.assertEqual(value, obj.age)
                
    def test_set_age_invalid(self):
        """Tests that invalid values raise ValueErrors"""
        min_ = -10
        max_ = 10
        obj = self.create_test_class(min_, max_)
        bad_values = list(range(min_ - 5, min_))
        bad_values += list(range(max_ + 1, max_ + 5))
        bad_values += [10.5, 1 + 0j, 'abc', (1, 2)]
        
        for i, value in enumerate(bad_values):
            with self.subTest(test_number=i):
                with self.assertRaises(ValueError):
                    obj.age = value
                    
    def test_class_get(self):
        """Tests that class attribute retrieval returns the descriptor instance"""
        obj = self.create_test_class(0, 0)
        obj_class = type(obj)
        self.assertIsInstance(obj_class.age, IntegerField)
        
    def test_set_age_min_only(self):
        """Tests that we can specify a min value only"""
        min_ = 0
        max_ = None
        obj = self.create_test_class(min_, max_)
        values = range(min_, min_ + 100, 10)
        for i, value in enumerate(values):
            with self.subTest(test_number=i):
                obj.age = value
                self.assertEqual(value, obj.age)
                
    def test_set_age_max_only(self):
        """Tests that we can specify a max value only"""
        min_ = None
        max_ = 10
        obj = self.create_test_class(min_, max_)
        values = range(max_ - 100, max_, 10)
        for i, value in enumerate(values):
            with self.subTest(test_number=i):
                obj.age = value
                self.assertEqual(value, obj.age)
                
    def test_set_age_no_limits(self):
        """Tests that we can use IntegerField without any limits at all"""
        min_ = None
        max_ = None
        obj = self.create_test_class(min_, max_)
        values = range(-100, 100, 10)
        for i, value in enumerate(values):
            with self.subTest(test_number=i):
                obj.age = value
                self.assertEqual(value, obj.age)

In [28]:
run_tests(TestIntegerField)

test_class_get (__main__.TestIntegerField)
Tests that class attribute retrieval returns the descriptor instance ... ok
test_set_age_invalid (__main__.TestIntegerField)
Tests that invalid values raise ValueErrors ... ok
test_set_age_max_only (__main__.TestIntegerField)
Tests that we can specify a max value only ... test_set_age_min_only (__main__.TestIntegerField)
Tests that we can specify a min value only ... test_set_age_no_limits (__main__.TestIntegerField)
Tests that we can use IntegerField without any limits at all ... test_set_age_ok (__main__.TestIntegerField)
Tests that valid values can be assigned/retrieved ... ok

ERROR: test_set_age_max_only (__main__.TestIntegerField) (test_number=0)
Tests that we can specify a max value only
----------------------------------------------------------------------
Traceback (most recent call last):
  File "<ipython-input-27-17f43502dd35>", line 58, in test_set_age_max_only
    obj.age = value
  File "<ipython-input-2-f3204d7bb071>", line 12, i

OK, so now that we have the tests written (and that they all fail), let's implement the functionality and re-test:

In [29]:
class IntegerField:
    def __init__(self, min_, max_):
        self._min = min_
        self._max = max_

    def __set_name__(self, owner_class, prop_name):
        self.prop_name = prop_name
        
    def __set__(self, instance, value):
        if not isinstance(value, numbers.Integral):
            raise ValueError(f'{self.prop_name} must be an integer.')
        if self._min is not None and value < self._min:
            raise ValueError(f'{self.prop_name} must be >= {self._min}.')
        if self._max is not None and value > self._max:
            raise ValueError(f'{self.prop_name} must be <= {self._max}')
        instance.__dict__[self.prop_name] = value
        
    def __get__(self, instance, owner_class):
        if instance is None:
            return self
        else:
            return instance.__dict__.get(self.prop_name, None)
    

In [30]:
run_tests(TestIntegerField)

test_class_get (__main__.TestIntegerField)
Tests that class attribute retrieval returns the descriptor instance ... ok
test_set_age_invalid (__main__.TestIntegerField)
Tests that invalid values raise ValueErrors ... ok
test_set_age_max_only (__main__.TestIntegerField)
Tests that we can specify a max value only ... ok
test_set_age_min_only (__main__.TestIntegerField)
Tests that we can specify a min value only ... ok
test_set_age_no_limits (__main__.TestIntegerField)
Tests that we can use IntegerField without any limits at all ... ok
test_set_age_ok (__main__.TestIntegerField)
Tests that valid values can be assigned/retrieved ... ok

----------------------------------------------------------------------
Ran 6 tests in 0.006s

OK


Cool!

Now there are some additional tests we could create, like testing if things work when one of the bounds is `0` (this would catch errors such as 

```
if self._min and value < self._min:
```

which would not work correctly for `_min = 0`

But I'll leave this and other tests for you :-)

Let's move on to the `CharField` descriptor - it's pretty much the same as `IntegerField` so, I'm going to copy/paste and refactor. One main difference is that it does not make sense for `min_` to be a negative number, or to be `None`.

In [31]:
class CharField:
    def __init__(self, min_=None, max_=None):
        min_ = min_ or 0  # in case min_ is None
        min_ = max(min_, 0)  # replaces negative value with zero
        self._min = min_
        self._max = max_

    def __set_name__(self, owner_class, prop_name):
        self.prop_name = prop_name
        
    def __set__(self, instance, value):
        if not isinstance(value, str):
            raise ValueError(f'{self.prop_name} must be a string.')
        if self._min is not None and len(value) < self._min:
            raise ValueError(f'{self.prop_name} must be >= {self._min} chars.')
        if self._max is not None and len(value) > self._max:
            raise ValueError(f'{self.prop_name} must be <= {self._max} chars')
        instance.__dict__[self.prop_name] = value
        
    def __get__(self, instance, owner_class):
        if instance is None:
            return self
        else:
            return instance.__dict__.get(self.prop_name, None)
    

Let's do a quick manual test:

In [32]:
class Person:
    name = CharField(1, 10)

In [33]:
p = Person()

In [34]:
try:
    p.name = ''
except ValueError as ex:
    print(ex)

name must be >= 1 chars.


In [35]:
try:
    p.name = 'Python Rocks!'
except ValueError as ex:
    print(ex)

name must be <= 10 chars


In [36]:
p.name = 'John'

In [37]:
class Person:
    name = CharField(-10, 10)

In [38]:
p = Person()
p.name = ''
p.name

''

In [39]:
class Person:
    name = CharField(1)

In [40]:
p = Person()
p.name = "I'm a lumberjack and I'm OK, I sleep all night and I work all day."
p.name

"I'm a lumberjack and I'm OK, I sleep all night and I work all day."

Of course, we really should write unit tests. These will basically be very similar to the unit tests we created for `IntegerField`, so let's get cracking!

In [41]:
class TestCharField(unittest.TestCase):
    @staticmethod
    def create_test_class(min_, max_):
        obj = type('TestClass', (), {'name': CharField(min_, max_)})
        return obj()
        
    def test_set_name_ok(self):
        """Tests that valid values can be assigned/retrieved"""
        min_ = 1
        max_ = 10
        obj = self.create_test_class(min_, max_)
        valid_lengths = range(min_, max_)
        
        for i, length in enumerate(valid_lengths):
            value = 'a' * length
            with self.subTest(test_number=i):
                obj.name = value
                self.assertEqual(value, obj.name)
            
    def test_set_name_invalid(self):
        """Tests that invalid values raise ValueErrors"""
        min_ = 5
        max_ = 10
        obj = self.create_test_class(min_, max_)
        bad_lengths = list(range(min_ - 5, min_))
        bad_lengths += list(range(max_ + 1, max_ + 5))
        for i, length in enumerate(bad_lengths):
            value = 'a' * length
            with self.subTest(test_number=i):
                with self.assertRaises(ValueError):
                    obj.name = value
                    
    def test_class_get(self):
        """Tests that class attribute retrieval returns the descriptor instance"""
        obj = self.create_test_class(0, 0)
        obj_class = type(obj)
        self.assertIsInstance(obj_class.name, CharField)
        
    def test_set_name_min_only(self):
        """Tests that we can specify a min length only"""
        min_ = 0
        max_ = None
        obj = self.create_test_class(min_, max_)
        valid_lengths = range(min_, min_ + 100, 10)
        for i, length in enumerate(valid_lengths):
            value = 'a' * length
            with self.subTest(test_number=i):
                obj.name = value
                self.assertEqual(value, obj.name)
    
    def test_set_name_min_negative_or_none(self):
        """Tests that setting a negative or None length results in a zero length"""
        obj = self.create_test_class(-10, 100)
        self.assertEqual(type(obj).name._min, 0)
        self.assertEqual(type(obj).name._max, 100)
        
        obj = self.create_test_class(None, None)
        self.assertEqual(type(obj).name._min, 0)
        self.assertIsNone(type(obj).name._max)
        
    def test_set_name_max_only(self):
        """Tests that we can specify a max length only"""
        min_ = None
        max_ = 10
        obj = self.create_test_class(min_, max_)
        valid_lengths = range(max_ - 100, max_, 10)
        for i, length in enumerate(valid_lengths):
            value = 'a' * length
            with self.subTest(test_number=i):
                obj.name = value
                self.assertEqual(value, obj.name)
                
    def test_set_name_no_limits(self):
        """Tests that we can use CharField without any limits at all"""
        min_ = None
        max_ = None
        obj = self.create_test_class(min_, max_)
        valid_lengths = range(0, 100, 10)
        for i, length in enumerate(valid_lengths):
            value = 'a' * length
            with self.subTest(test_number=i):
                obj.name = value
                self.assertEqual(value, obj.name)

In [42]:
run_tests(TestCharField)

test_class_get (__main__.TestCharField)
Tests that class attribute retrieval returns the descriptor instance ... ok
test_set_name_invalid (__main__.TestCharField)
Tests that invalid values raise ValueErrors ... ok
test_set_name_max_only (__main__.TestCharField)
Tests that we can specify a max length only ... ok
test_set_name_min_negative_or_none (__main__.TestCharField)
Tests that setting a negative or None length results in a zero length ... ok
test_set_name_min_only (__main__.TestCharField)
Tests that we can specify a min length only ... ok
test_set_name_no_limits (__main__.TestCharField)
Tests that we can use CharField without any limits at all ... ok
test_set_name_ok (__main__.TestCharField)
Tests that valid values can be assigned/retrieved ... ok

----------------------------------------------------------------------
Ran 7 tests in 0.005s

OK


##  Solution - Part 2

Here's where we left off in the last video:

In [1]:
import numbers
import unittest

In [2]:
class TestIntegerField(unittest.TestCase):
    @staticmethod
    def create_test_class(min_, max_):
        obj = type('TestClass', (), {'age': IntegerField(min_, max_)})
        return obj()
        
    def test_set_age_ok(self):
        """Tests that valid values can be assigned/retrieved"""
        min_ = 5
        max_ = 10
        obj = self.create_test_class(min_, max_)
        valid_values = range(min_, max_)
        
        for i, value in enumerate(valid_values):
            with self.subTest(test_number=i):
                obj.age = value
                self.assertEqual(value, obj.age)
                
    def test_set_age_invalid(self):
        """Tests that invalid values raise ValueErrors"""
        min_ = -10
        max_ = 10
        obj = self.create_test_class(min_, max_)
        bad_values = list(range(min_ - 5, min_))
        bad_values += list(range(max_ + 1, max_ + 5))
        bad_values += [10.5, 1 + 0j, 'abc', (1, 2)]
        
        for i, value in enumerate(bad_values):
            with self.subTest(test_number=i):
                with self.assertRaises(ValueError):
                    obj.age = value
                    
    def test_class_get(self):
        """Tests that class attribute retrieval returns the descriptor instance"""
        obj = self.create_test_class(0, 0)
        obj_class = type(obj)
        self.assertIsInstance(obj_class.age, IntegerField)
        
    def test_set_age_min_only(self):
        """Tests that we can specify a min value only"""
        min_ = 0
        max_ = None
        obj = self.create_test_class(min_, max_)
        values = range(min_, min_ + 100, 10)
        for i, value in enumerate(values):
            with self.subTest(test_number=i):
                obj.age = value
                self.assertEqual(value, obj.age)
                
    def test_set_age_max_only(self):
        """Tests that we can specify a max value only"""
        min_ = None
        max_ = 10
        obj = self.create_test_class(min_, max_)
        values = range(max_ - 100, max_, 10)
        for i, value in enumerate(values):
            with self.subTest(test_number=i):
                obj.age = value
                self.assertEqual(value, obj.age)
                
    def test_set_age_no_limits(self):
        """Tests that we can use IntegerField without any limits at all"""
        min_ = None
        max_ = None
        obj = self.create_test_class(min_, max_)
        values = range(-100, 100, 10)
        for i, value in enumerate(values):
            with self.subTest(test_number=i):
                obj.age = value
                self.assertEqual(value, obj.age)

class TestCharField(unittest.TestCase):
    @staticmethod
    def create_test_class(min_, max_):
        obj = type('TestClass', (), {'name': CharField(min_, max_)})
        return obj()
        
    def test_set_name_ok(self):
        """Tests that valid values can be assigned/retrieved"""
        min_ = 1
        max_ = 10
        obj = self.create_test_class(min_, max_)
        valid_lengths = range(min_, max_)
        
        for i, length in enumerate(valid_lengths):
            value = 'a' * length
            with self.subTest(test_number=i):
                obj.name = value
                self.assertEqual(value, obj.name)
            
    def test_set_name_invalid(self):
        """Tests that invalid values raise ValueErrors"""
        min_ = 5
        max_ = 10
        obj = self.create_test_class(min_, max_)
        bad_lengths = list(range(min_ - 5, min_))
        bad_lengths += list(range(max_ + 1, max_ + 5))
        for i, length in enumerate(bad_lengths):
            value = 'a' * length
            with self.subTest(test_number=i):
                with self.assertRaises(ValueError):
                    obj.name = value
                    
    def test_class_get(self):
        """Tests that class attribute retrieval returns the descriptor instance"""
        obj = self.create_test_class(0, 0)
        obj_class = type(obj)
        self.assertIsInstance(obj_class.name, CharField)
        
    def test_set_name_min_only(self):
        """Tests that we can specify a min length only"""
        min_ = 0
        max_ = None
        obj = self.create_test_class(min_, max_)
        valid_lengths = range(min_, min_ + 100, 10)
        for i, length in enumerate(valid_lengths):
            value = 'a' * length
            with self.subTest(test_number=i):
                obj.name = value
                self.assertEqual(value, obj.name)
    
    def test_set_name_min_negative_or_none(self):
        """Tests that setting a negative or None length results in a zero length"""
        obj = self.create_test_class(-10, 100)
        self.assertEqual(type(obj).name._min, 0)
        self.assertEqual(type(obj).name._max, 100)
        
        obj = self.create_test_class(None, None)
        self.assertEqual(type(obj).name._min, 0)
        self.assertIsNone(type(obj).name._max)
        
    def test_set_name_max_only(self):
        """Tests that we can specify a max length only"""
        min_ = None
        max_ = 10
        obj = self.create_test_class(min_, max_)
        valid_lengths = range(max_ - 100, max_, 10)
        for i, length in enumerate(valid_lengths):
            value = 'a' * length
            with self.subTest(test_number=i):
                obj.name = value
                self.assertEqual(value, obj.name)
                
    def test_set_name_no_limits(self):
        """Tests that we can use CharField without any limits at all"""
        min_ = None
        max_ = None
        obj = self.create_test_class(min_, max_)
        valid_lengths = range(0, 100, 10)
        for i, length in enumerate(valid_lengths):
            value = 'a' * length
            with self.subTest(test_number=i):
                obj.name = value
                self.assertEqual(value, obj.name)
                
def run_tests(test_class):
    suite = unittest.TestLoader().loadTestsFromTestCase(test_class)
    runner = unittest.TextTestRunner(verbosity=2)
    result = runner.run(suite)

In [3]:
class IntegerField:
    def __init__(self, min_, max_):
        self._min = min_
        self._max = max_

    def __set_name__(self, owner_class, prop_name):
        self.prop_name = prop_name
        
    def __set__(self, instance, value):
        if not isinstance(value, numbers.Integral):
            raise ValueError(f'{self.prop_name} must be an integer.')
        if self._min is not None and value < self._min:
            raise ValueError(f'{self.prop_name} must be >= {self._min}.')
        if self._max is not None and value > self._max:
            raise ValueError(f'{self.prop_name} must be <= {self._max}')
        instance.__dict__[self.prop_name] = value
        
    def __get__(self, instance, owner_class):
        if instance is None:
            return self
        else:
            return instance.__dict__.get(self.prop_name, None)
    
    
class CharField:
    def __init__(self, min_=None, max_=None):
        min_ = min_ or 0  # in case min_ is None
        min_ = max(min_, 0)  # replaces negative value with zero
        self._min = min_
        self._max = max_

    def __set_name__(self, owner_class, prop_name):
        self.prop_name = prop_name
        
    def __set__(self, instance, value):
        if not isinstance(value, str):
            raise ValueError(f'{self.prop_name} must be a string.')
        if self._min is not None and len(value) < self._min:
            raise ValueError(f'{self.prop_name} must be >= {self._min} chars.')
        if self._max is not None and len(value) > self._max:
            raise ValueError(f'{self.prop_name} must be <= {self._max} chars')
        instance.__dict__[self.prop_name] = value
        
    def __get__(self, instance, owner_class):
        if instance is None:
            return self
        else:
            return instance.__dict__.get(self.prop_name, None)

And of course, our unit tests should run just fine:

In [4]:
run_tests(TestIntegerField)

test_class_get (__main__.TestIntegerField)
Tests that class attribute retrieval returns the descriptor instance ... ok
test_set_age_invalid (__main__.TestIntegerField)
Tests that invalid values raise ValueErrors ... ok
test_set_age_max_only (__main__.TestIntegerField)
Tests that we can specify a max value only ... ok
test_set_age_min_only (__main__.TestIntegerField)
Tests that we can specify a min value only ... ok
test_set_age_no_limits (__main__.TestIntegerField)
Tests that we can use IntegerField without any limits at all ... ok
test_set_age_ok (__main__.TestIntegerField)
Tests that valid values can be assigned/retrieved ... ok

----------------------------------------------------------------------
Ran 6 tests in 0.004s

OK


In [5]:
run_tests(TestCharField)

test_class_get (__main__.TestCharField)
Tests that class attribute retrieval returns the descriptor instance ... ok
test_set_name_invalid (__main__.TestCharField)
Tests that invalid values raise ValueErrors ... ok
test_set_name_max_only (__main__.TestCharField)
Tests that we can specify a max length only ... ok
test_set_name_min_negative_or_none (__main__.TestCharField)
Tests that setting a negative or None length results in a zero length ... ok
test_set_name_min_only (__main__.TestCharField)
Tests that we can specify a min length only ... ok
test_set_name_no_limits (__main__.TestCharField)
Tests that we can use CharField without any limits at all ... ok
test_set_name_ok (__main__.TestCharField)
Tests that valid values can be assigned/retrieved ... ok

----------------------------------------------------------------------
Ran 7 tests in 0.005s

OK


As you may noticed, quite a bit of code was redundant between the `IntegerField` and `CharField` descriptors.

So, let's restructure things a bit to make use of inheritance for the common bits.

Notice that the implementation of `__set_name__` and `__get__` are actually identical. The `__init__` methods are slightly different, but there is still some commonality. And same goes for the `__set__` - although the validations are different, the storage mechanism is the same - so we could factor that out.

We're going to create a base class as follows:

In [6]:
class BaseValidator:
    def __init__(self, min_=None, max_=None):
        self._min = min_
        self._max = max_
        
    def __set_name__(self, owner_class, prop_name):
        self.prop_name = prop_name
        
    def __get__(self, instance, owner_class):
        if instance is None:
            return self
        else:
            return instance.__dict__.get(self.prop_name, None)
        
    def validate(self, value):
        # this will need to be implemented specifically by each subclass
        # here we just default to not raising any exceptions
        pass
    
    def __set__(self, instance, value):
        self.validate(value)
        instance.__dict__[self.prop_name] = value

Of course we can use this `BaseValidator` directly, but it won't be very useful:

In [7]:
class Person:
    name = BaseValidator()

In [8]:
p = Person()

In [9]:
p.name = 'Alex'

In [10]:
p.name

'Alex'

Now let's leverage this class to create our integer and char descriptors:

In [11]:
class IntegerField(BaseValidator):
    def validate(self, value):
        if not isinstance(value, numbers.Integral):
            raise ValueError(f'{self.prop_name} must be an integer.')
        if self._min is not None and value < self._min:
            raise ValueError(f'{self.prop_name} must be >= {self._min}.')
        if self._max is not None and value > self._max:
            raise ValueError(f'{self.prop_name} must be <= {self._max}')

In [12]:
class CharField(BaseValidator):
    def __init__(self, min_, max_):
        min_ = max(min_ or 0, 0)
        super().__init__(min_, max_)
        
    def validate(self, value):
        if not isinstance(value, str):
            raise ValueError(f'{self.prop_name} must be a string.')
        if self._min is not None and len(value) < self._min:
            raise ValueError(f'{self.prop_name} must be >= {self._min} chars.')
        if self._max is not None and len(value) > self._max:
            raise ValueError(f'{self.prop_name} must be <= {self._max} chars')
        

And this should work just as before. Lucky for us we don't have to test anything manually, we can just re-run our unit tests and make sure nothing broke!

In [13]:
run_tests(TestIntegerField)

test_class_get (__main__.TestIntegerField)
Tests that class attribute retrieval returns the descriptor instance ... ok
test_set_age_invalid (__main__.TestIntegerField)
Tests that invalid values raise ValueErrors ... ok
test_set_age_max_only (__main__.TestIntegerField)
Tests that we can specify a max value only ... ok
test_set_age_min_only (__main__.TestIntegerField)
Tests that we can specify a min value only ... ok
test_set_age_no_limits (__main__.TestIntegerField)
Tests that we can use IntegerField without any limits at all ... ok
test_set_age_ok (__main__.TestIntegerField)
Tests that valid values can be assigned/retrieved ... ok

----------------------------------------------------------------------
Ran 6 tests in 0.004s

OK


In [14]:
run_tests(TestCharField)

test_class_get (__main__.TestCharField)
Tests that class attribute retrieval returns the descriptor instance ... ok
test_set_name_invalid (__main__.TestCharField)
Tests that invalid values raise ValueErrors ... ok
test_set_name_max_only (__main__.TestCharField)
Tests that we can specify a max length only ... ok
test_set_name_min_negative_or_none (__main__.TestCharField)
Tests that setting a negative or None length results in a zero length ... ok
test_set_name_min_only (__main__.TestCharField)
Tests that we can specify a min length only ... ok
test_set_name_no_limits (__main__.TestCharField)
Tests that we can use CharField without any limits at all ... ok
test_set_name_ok (__main__.TestCharField)
Tests that valid values can be assigned/retrieved ... ok

----------------------------------------------------------------------
Ran 7 tests in 0.006s

OK


Woohoo!!

One thing I want to mention here: now that we are using unittests, and iterating our development, using Jupyter notebooks, even with relatively simple programs like this one, is getting unwieldy. Best would be to use a proper Python app, with a root and multiple modules, including one for unit tests. An IDE like PyCharm or VSCode works really great, but of course you can choose to use any text editor, and the command line to run your app instead of an IDE.

# Section 10 - Enumerations

##  Enumerations

We'll need the `enum` module:

In [1]:
import enum

The base class for enums is `Enum`. To create an enumeration we need to **subclass** it:

In [2]:
class Color(enum.Enum):
    red = 1
    green = 2
    blue = 3

Associated values can be anything, not just integer values:

In [3]:
class Status(enum.Enum):
    PENDING = 'pending'
    RUNNING = 'running'
    COMPLETED = 'completed'    

In [4]:
class UnitVector(enum.Enum):
    V1D = (1, )
    V2D = (1, 1)
    V3D = (1, 1, 1)

Each member of an enumeration has a type of the enumeration class itself:

In [5]:
Status.PENDING

<Status.PENDING: 'pending'>

In [6]:
type(Status.PENDING)

<enum 'Status'>

In [7]:
isinstance(Status.PENDING, Status)

True

Each member (instance of the enumeration) has properties, just like any object:

In [8]:
Status.PENDING.name, Status.PENDING.value

('PENDING', 'pending')

Although `==` is supported, member equality is generally tested using identity, `is`. It is also faster than using `==`:

In [9]:
Status.PENDING is Status.PENDING

True

In [10]:
Status.PENDING == Status.PENDING

True

Note that although `==` (and `!=`) is supported, rich comparison operators are not (it would not make sense, except maybe if the values are values such as integers - we'll come back to that):

In [11]:
class Constants(enum.Enum):
    ONE = 1
    TWO = 2
    THREE = 3

In [12]:
try:
    Constants.ONE > Constants.TWO
except TypeError as ex:
    print(ex)

'>' not supported between instances of 'Constants' and 'Constants'


Membership can be tested using `in`:

In [13]:
Status.PENDING in Status

True

Note that the names (strings) and associated values are not themselves members of the enumeration - remember that enumeration members are instances of the enumeration class:

In [14]:
Status.PENDING.name, Status.PENDING.value

('PENDING', 'pending')

In [15]:
'PENDING' in Status, 'pending' in Status

(False, False)

Enums are callables, and we can look up a member by **value** by calling the enumeration:

In [16]:
Status('pending'), UnitVector((1,1))

(<Status.PENDING: 'pending'>, <UnitVector.V2D: (1, 1)>)

But if we try to lookup a member with a non-existent value, we get a `ValueError` exception:

In [17]:
try:
    Status('invalid')
except ValueError as ex:
    print(ex)

'invalid' is not a valid Status


Recall that a class that implements the `__getitem__` method supports the [] operation:

In [18]:
class Person:
    def __getitem__(self, val):
        return f'__getitem__({val}) called...'

In [19]:
p = Person()
p['some value']

'__getitem__(some value) called...'

Enumerations implement this `__getitem__` method:

In [20]:
hasattr(Status, '__getitem__')

True

So we can look up a member by it's name (think of it as a key):

In [21]:
Status['PENDING']

<Status.PENDING: 'pending'>

But the enumeration members, although instances of the enumeration, are also class attributes of the enumeration, so we can also use `getattr` like we would with any standard class attribute:

In [22]:
getattr(Status, 'PENDING')

<Status.PENDING: 'pending'>

Enumeration members are always hashable, even if their associated values are not (makes sense, since member names are basically strings):

In [23]:
class Person:
    __hash__ = None

In [24]:
p = Person()
try:
    hash(p)
except TypeError as ex:
    print(ex)

unhashable type: 'Person'


So, although `Person` objects are not hashable:

In [25]:
class Family(enum.Enum):
    person_1 = Person()
    person_2 = Person()

In [26]:
Family.person_1

<Family.person_1: <__main__.Person object at 0x7fbb083d1320>>

We can still use members as keys in a dictionary:

In [27]:
{
    Family.person_1: 'person 1',
    Family.person_2: 'person 2'
}

{<Family.person_1: <__main__.Person object at 0x7fbb083d1320>>: 'person 1',
 <Family.person_2: <__main__.Person object at 0x7fbb083d1390>>: 'person 2'}

Enumerations are iterables:

In [28]:
hasattr(Status, '__iter__')

True

So we can iterate over the members:

In [29]:
for member in Status:
    print(repr(member))

<Status.PENDING: 'pending'>
<Status.RUNNING: 'running'>
<Status.COMPLETED: 'completed'>


Note that iteration order is the order in which the members are declared in the enumeration, and has nothing to do with the associated values:

In [30]:
class Numbers1(enum.Enum):
    ONE = 1
    TWO = 2
    THREE = 3
    
class Numbers2(enum.Enum):
    THREE = 3
    TWO = 2
    ONE = 1

In [31]:
list(Numbers1)

[<Numbers1.ONE: 1>, <Numbers1.TWO: 2>, <Numbers1.THREE: 3>]

In [32]:
list(Numbers2)

[<Numbers2.THREE: 3>, <Numbers2.TWO: 2>, <Numbers2.ONE: 1>]

Lastly, enumerations are immutable: we cannot add/remove elements from the enumeration, **and** we canniot modify the associated values:

In [33]:
try:
    Status.PENDING.value = 10
except AttributeError as ex:
    print(ex)

can't set attribute


In [34]:
try:
    Status['NEW'] = 100
except TypeError as ex:
    print(ex)

'EnumMeta' object does not support item assignment


We'll come back to this later, but we cannot extend an enumeration once it has members defined:

In [35]:
class EnumBase(enum.Enum):
    pass

In [36]:
class EnumExt(EnumBase):
    ONE = 1
    TWO = 2

In [37]:
EnumExt.ONE

<EnumExt.ONE: 1>

But this would not work:

In [38]:
class EnumBase(enum.Enum):
    ONE = 1

In [39]:
try:
    class EnumExt(EnumBase):
        TWO = 2
except TypeError as ex:
    print(ex)

Cannot extend enumerations


##### Example

So the basics of enumerations are quite straightforward. You might be wondering though why we have two ways of referencing members by name:

In [40]:
Status.PENDING, Status['PENDING']

(<Status.PENDING: 'pending'>, <Status.PENDING: 'pending'>)

This is because sometimes we might get a string from some input, and need to match it up with a member in the enumeration.

For example it might be a status that comes back from an API call in a JSON payload:

In [41]:
payload = """
{
  "name": "Alex",
  "status": "PENDING"
}
"""

In [42]:
import json

data = json.loads(payload)

In [43]:
data['status']

'PENDING'

And now we can look up the status in the enumeration, but we have to use the `__getitem__` method:

In [44]:
Status[data['status']]

<Status.PENDING: 'pending'>

##### Example 2

A natural question given the last example might be: how do we determine if some string corresponds to a member name in our enumeration?

We have three basic ways of doing this.

First we could simply lookup the value by name, and trap the `KeyError` exception:

In [45]:
def is_member(en, name):
    try:
        en[name]
    except KeyError:
        return False
    return True

In [46]:
is_member(Status, 'PENDING')

True

In [47]:
is_member(Status, 'pending')

False

We could also just use the `getattr` function:

In [48]:
getattr(Status, 'PENDING', None), getattr(Status, 'OK', None)

(<Status.PENDING: 'pending'>, None)

But we could also just use the `__members__` property:

In [49]:
Status.__members__

mappingproxy({'PENDING': <Status.PENDING: 'pending'>,
              'RUNNING': <Status.RUNNING: 'running'>,
              'COMPLETED': <Status.COMPLETED: 'completed'>})

As you can see we get a `mappingproxy` object back, so we can use membership in that object (that defaults to using the keys), or the `keys()` view if we want to be more explicit:

In [50]:
'PENDING' in Status.__members__

True

In [51]:
'PENDING' in Status.__members__.keys()

True

##  Aliases

Although member values are considered unique in enumerations, we can still define multiple member names with the same value. But they do not create different members!


They are, in fact, considered aliases of each other, with the first member becoming the "master" member.

Let's see a simple example of this first:

In [1]:
import enum

In [2]:
class NumSides(enum.Enum):
    Triangle = 3
    Rectangle = 4
    Square = 4
    Rhombus = 4

As you can see we have two members with different names (names must **always** be unique), but with the **same** value.

However, the `Square` and `Rhombus` members are considered **aliases** of the `Rectangle` member since `Rectangle` is defined first.

This means that `Rectangle` and `Square` are actually considered the **same** member:

In [3]:
NumSides.Rectangle is NumSides.Square

True

And of course aliases are equal to each other too:

In [4]:
NumSides.Square is NumSides.Rhombus

True

Aliases can be referenced just like an ordinary member, and are considered *contained* in the enumeration:

In [5]:
NumSides.Square in NumSides

True

And when we look up the member, by value:

In [6]:
NumSides(4)

<NumSides.Rectangle: 4>

we always get the "master" back.

Same holds when when looking up by key:

In [7]:
NumSides['Square']

<NumSides.Rectangle: 4>

When we iterate an enumeration that contains aliases, none of the aliases are returned in the iteration:

In [8]:
list(NumSides)

[<NumSides.Triangle: 3>, <NumSides.Rectangle: 4>]

The only way to get all the members, including aliases, is to use the `__members__` property:

In [9]:
NumSides.__members__

mappingproxy({'Triangle': <NumSides.Triangle: 3>,
              'Rectangle': <NumSides.Rectangle: 4>,
              'Square': <NumSides.Rectangle: 4>,
              'Rhombus': <NumSides.Rectangle: 4>})

Notice how the aliases are treated. Although the keys in the mapping proxy are different, the object they point to are all the "master" member.

#### Example

There are times when the ability to define these aliases can be useful. Let's say you have to deal with statuses that are returned as strings from different systems.

These systems may not always define exactly the same strings to mean the same thing (maybe they were developed independently). In a case like this, being able to create aliases could be useful to bring uniformity to our own code.

Let's say that the statuses from system 1 are: `ready, busy, finished_no_error, finished_with_errors`

And for system 2 we have correspondingly: `ready, processing, ran_ok, errored`

And in our own system we might want the statuses: `ready, running, ok, errors`

In other words we have:

```
Us        System 1               System 2
-------------------------------------------
ready     ready                  ready
running   busy                   processing
ok        finished_no_error      ran_ok
errors    finished_with_errors   errored
```

We can the easily achieve this using this class with aliases:

In [10]:
class Status(enum.Enum):
    ready = 'ready'
    
    running = 'running'
    busy = 'running'
    processing = 'running'
    
    ok = 'ok'
    finished_no_error = 'ok'
    ran_ok = 'ok'
    
    errors = 'errors'
    finished_with_errors = 'errors'
    errored = 'errors'

Then when we list our own statuses, we only see our (master) members:

In [11]:
list(Status)

[<Status.ready: 'ready'>,
 <Status.running: 'running'>,
 <Status.ok: 'ok'>,
 <Status.errors: 'errors'>]

But now we can look up a status from any of the other two systems, and automatically get our "master" member:

In [12]:
Status['busy']

<Status.running: 'running'>

In [13]:
Status['processing']

<Status.running: 'running'>

Note that in our case the actual value of the members does not matter. I used strings, but we could equally well just use numbers:

In [14]:
class Status(enum.Enum):
    ready = 1
    
    running = 2
    busy = 2
    processing = 2
    
    ok = 3
    finished_no_error = 3
    ran_ok = 3
    
    errors = 4
    finished_with_errors = 4
    errored = 4

This will work the same way:

In [15]:
Status.ran_ok

<Status.ok: 3>

In [16]:
status = 'ran_ok'

In [17]:
status in Status.__members__

True

In [18]:
Status[status]

<Status.ok: 3>

#### Ensuring No Aliases

Sometimes we want to make sure we are creating enumerations that do **not** contain aliases.

Of course, we can just be careful and not define aliases, but the `enum` module provides a special decorator that can enforce this:

In [19]:
@enum.unique
class Status(enum.Enum):
    ready = 1
    done_ok = 2
    errors = 3

And if we try to create aliases, our code will not compile - we'll get an exception as soon as the class is compiled:

In [20]:
try:
    @enum.unique
    class Status(enum.Enum):
        ready = 1
        waiting = 1
        done_ok = 2
        errors = 3
except ValueError as ex:
    print(ex)

duplicate values found in <enum 'Status'>: waiting -> ready


So if you know that your enumeration should never contain aliases, go ahead and use the decorator for extra safety.

##  Customizing and Extending Enumerations

Enumerations, although they behave a little differently than normal classes, are **still** classes.

This means there are many things we can customize about them.

Keep in mind that members of the enumerations are **instances** of the enumeration class, so we can implement methods in that class, and each member will have that method (boud to itself) available.

In [1]:
from enum import Enum

In [2]:
class Color(Enum):
    red = 1
    green = 2
    blue = 3
    
    def purecolor(self, value):
        return {self: value}

In [3]:
Color.red.purecolor(100), Color.blue.purecolor(200)

({<Color.red: 1>: 100}, {<Color.blue: 3>: 200})

Amongst other things, we can implement some of the "standard" dunder methods. For example we may wish to override the default representation:

In [4]:
Color.red

<Color.red: 1>

In [5]:
class Color(Enum):
    red = 1
    green = 2
    blue = 3
    
    def __repr__(self):
        return f'{self.name} ({self.value})'

In [6]:
Color.red

red (1)

Of course, we can implement other more interesting dunder methods.

For example, in standard enums, we do not have ordering defined for the members:

In [7]:
class Number(Enum):
    ONE = 1
    TWO = 2
    THREE = 3

In [8]:
try:
    Number.ONE > Number.TWO
except TypeError as ex:
    print(ex)

'>' not supported between instances of 'Number' and 'Number'


But in this particular example it might make sense to actually have ordering defined. We can simply implement some of the rich comparison operators:

In [9]:
class Number(Enum):
    ONE = 1
    TWO = 2
    THREE = 3
    
    def __lt__(self, other):
        return isinstance(other, Number) and self.value < other.value

And now we have an ordering defined:

In [10]:
Number.ONE < Number.TWO

True

In [11]:
Number.TWO > Number.ONE

True

We could also potentially override the definition for equality (`==`):

In [12]:
class Number(Enum):
    ONE = 1
    TWO = 2
    THREE = 3
    
    def __lt__(self, other):
        return isinstance(other, Number) and self.value < other.value
    
    def __eq__(self, other):
        if isinstance(other, Number):
            return self is other
        elif isinstance(other, int):
            return self.value == other
        else:
            return False

In [13]:
Number.ONE == Number.ONE

True

In [14]:
Number.ONE == 1.0

False

In [15]:
Number.ONE == 1

True

A good question to ask ourselves is whether our members are still hashable since we implemented a custom `__eq__` method:

In [16]:
try:
    hash(Number.ONE)
except TypeError as ex:
    print(ex)

unhashable type: 'Number'


And of course, they are not. We could remedy this by implementing our own `__hash__` method.

Going back to ordering:

In [17]:
class Number(Enum):
    ONE = 1
    TWO = 2
    THREE = 3
    
    def __lt__(self, other):
        return isinstance(other, Number) and self.value < other.value

Although we have `<` (and by reflection `>`) defined, we still do not have operators such as `<=`:

In [18]:
try:
    Number.ONE <= Number.TWO
except TypeError as ex:
    print(ex)

'<=' not supported between instances of 'Number' and 'Number'


We could of course define a `__le__` method, but we could also just use the `@totalordering` decorator:

In [19]:
from functools import total_ordering

In [20]:
@total_ordering
class Number(Enum):
    ONE = 1
    TWO = 2
    THREE = 3
    
    def __lt__(self, other):
        return isinstance(other, Number) and self.value < other.value

In [21]:
Number.ONE <= Number.TWO, Number.ONE != Number.TWO

(True, True)

A slightly more useful application of this ability to implement these special methods might be in this example:

In [22]:
class Phase(Enum):
    READY = 'ready'
    RUNNING = 'running'
    FINISHED = 'finished'
    
    def __str__(self):
        return self.value

    def __eq__(self, other):
        if isinstance(other, Phase):
            return self is other
        elif isinstance(other, str):
            return self.value == other
        return False
    
    def __lt__(self, other):
        ordered_items = list(Phase)
        self_order_index = ordered_items.index(self)
        
        if isinstance(other, Phase):
            other_order_index = ordered_items.index(other)
            return self_order_index < other_order_index
        
        if isinstance(other, str):
            try:
                other_member = Phase(other)
                other_order_index = ordered_items.index(other_member)
                return self_order_index < other_order_index
            except ValueError:
                # other is not a value in our enum
                return False
            

In [23]:
Phase.READY == 'ready'

True

In [24]:
Phase.READY < Phase.RUNNING

True

In [25]:
Phase.READY < 'running'

True

One thing to watch out for, is that, by default, all members of an enumeration are **truthy** - irrespective of their value:

In [26]:
class State(Enum):
    READY = 1
    BUSY = 0    

In [27]:
bool(State.READY), bool(State.BUSY)

(True, True)

We can of course override the `__bool__` method to customize this:

In [28]:
class State(Enum):
    READY = 1
    BUSY = 0    
    
    def __bool__(self):
        return bool(self.value)

In [29]:
bool(State.READY), bool(State.BUSY)

(True, False)

So we might implement this ready/not-ready flag in our application by simply testing the truthyness of the member:

In [30]:
request_state = State.READY

In [31]:
if request_state:
    print('Launching next query')
else:
    print('Not ready for another query yet')

Launching next query


We could also easily implement a default associated truth value that reflects the truthyness of the member **values**:

In [32]:
class Dummy(Enum):
    A = 0
    B = 1
    C = ''
    D = 'python'
    
    def __bool__(self):
        return bool(self.value)

In [33]:
bool(Dummy.A), bool(Dummy.B), bool(Dummy.C), bool(Dummy.D)

(False, True, False, True)

#### Extending Custom Enumerations

We can also extend (subclass) our custom enumerations - but only under certain circumstances: as long as the enumeration we are extending does not define any **members**:

In [34]:
class Color(Enum):
    RED = 1
    GREEN = 2
    BLUE = 3

In [35]:
try:
    class ColorAlpha(Color):
        ALPHA = 4
except TypeError as ex:
    print(ex)

Cannot extend enumerations


But this would work:

In [36]:
class ColorBase(Enum):
    def hello(self):
        return f'{str(self)} says hello!'
    
class Color(ColorBase):
    RED = 'red'
    GREEN = 'green'
    BLUE = 'blue'

In [37]:
Color.RED.hello()

'Color.RED says hello!'

This might not seem particularly useful (we cannot use subclassing to extended the members), but remember that we can add methods to our enumerations - this means we could define a base class that implements some common functionality for all our instances, and then extend this enumeration class to concrete enumerations that define the members.

Here's an example of where this might be useful:

In [38]:
@total_ordering
class OrderedEnum(Enum):
    """Creates an ordering based on the member values. 
    So member values have to support rich comparisons.
    """
    
    def __lt__(self, other):
        if isinstance(other, OrderedEnum):
            return self.value < other.value
        return NotImplemented

And now we can create other enumerations that will support ordering without having to retype the `__lt__` implementation, or even the decorator:

In [39]:
class Number(OrderedEnum):
    ONE = 1
    TWO = 2
    THREE = 3
    
class Dimension(OrderedEnum):
    D1 = 1,
    D2 = 1, 1
    D3 = 1, 1, 1

In [40]:
Number.ONE < Number.THREE

True

In [41]:
Dimension.D1 < Dimension.D3

True

In [42]:
Number.ONE >= Number.ONE

True

In [43]:
Dimension.D1 >= Dimension.D2

False

Of course we could implement other functionality in our base enum (maybe customized `__str__`, `__repr__`, `__bool__`, etc).

We'll actually come back to this when we discuss auto numbering in enums.

#### Example

Here's a handy enumeration that's built-in to Python (handy if you work with http requests that is :-) )

In [44]:
from http import HTTPStatus

In [45]:
type(HTTPStatus)

enum.EnumMeta

It's technically an `EnumMeta`, but that's beyond our current scope. Still, it's easy to use and you don't need to know anything about meta classes:

In [46]:
list(HTTPStatus)[0:10]

[<HTTPStatus.CONTINUE: 100>,
 <HTTPStatus.SWITCHING_PROTOCOLS: 101>,
 <HTTPStatus.PROCESSING: 102>,
 <HTTPStatus.OK: 200>,
 <HTTPStatus.CREATED: 201>,
 <HTTPStatus.ACCEPTED: 202>,
 <HTTPStatus.NON_AUTHORITATIVE_INFORMATION: 203>,
 <HTTPStatus.NO_CONTENT: 204>,
 <HTTPStatus.RESET_CONTENT: 205>,
 <HTTPStatus.PARTIAL_CONTENT: 206>]

In [47]:
HTTPStatus(200)

<HTTPStatus.OK: 200>

In [48]:
HTTPStatus.OK, HTTPStatus.OK.name, HTTPStatus.OK.value

(<HTTPStatus.OK: 200>, 'OK', 200)

In [49]:
HTTPStatus(200)

<HTTPStatus.OK: 200>

In [50]:
HTTPStatus['OK']

<HTTPStatus.OK: 200>

It even has a `phrase` property that provides a more readable version of the HTTP status (name):

In [51]:
HTTPStatus.NOT_FOUND.value, HTTPStatus.NOT_FOUND.name, HTTPStatus.NOT_FOUND.phrase

(404, 'NOT_FOUND', 'Not Found')

Now we could implement similar functionality very easily - maybe for our own error codes in our application:

In [52]:
class AppStatus(Enum):
    OK = (0, 'No problem!')
    FAILED = (1, 'Crap!')

In [53]:
AppStatus.OK

<AppStatus.OK: (0, 'No problem!')>

In [54]:
AppStatus.OK.value

(0, 'No problem!')

What we really want is to separate the code (lie `0`) from the phrase (like `No problem!`). We could do this:

In [55]:
class AppStatus(Enum):
    OK = (0, 'No problem!')
    FAILED = (1, 'Crap!')
    
    @property
    def code(self):
        return self.value[0]
    
    @property
    def phrase(self):
        return self.value[1]

In [56]:
AppStatus.OK.code, AppStatus.OK.phrase

(0, 'No problem!')

As you can see, it's close, but not quite the same as `HTTPStatus`...

One major problem is that we can no longer lookup a member by just the code:

In [57]:
try:
    AppStatus(0)
except ValueError as ex:
    print(ex)

0 is not a valid AppStatus


We would have to do this:

In [58]:
AppStatus((0, 'No problem!'))

<AppStatus.OK: (0, 'No problem!')>

Not ideal...

#### Let's dig in...

OK, so, we can actually fix this issue by making use of the `__new__` method (which we have not studied yet, but I did mention it). 

Remember that this is the method that gets called to **instantiate** the class - so it should return a new instance of the class. 

Furthemore we'll have it set the value property - for that `Enum` has a special class attribute we can use, called `_value_`. 

This is probably going to be a little confusing, but we'll circle back to this later:

In [59]:
class AppStatus(Enum):
    OK = (0, 'No Problem!')
    FAILED = (1, 'Crap!')
    
    def __new__(cls, member_value, member_phrase):
        # create a new instance of cls
        member = object.__new__(cls)
        
        # set up instance attributes
        member._value_ = member_value
        member.phrase = member_phrase
        return member

In [60]:
AppStatus.OK.value, AppStatus.OK.name, AppStatus.OK.phrase

(0, 'OK', 'No Problem!')

And now even looking up by numeric code works:

In [61]:
AppStatus(0)

<AppStatus.OK: 0>

Now, we could easily break this out into a base class:

In [62]:
class TwoValueEnum(Enum):
    def __new__(cls, member_value, member_phrase):
        member = object.__new__(cls)
        member._value_ = member_value
        member.phrase = member_phrase
        return member

And then inherit this for any enumeration where we want to support a value as a `(code, phrase)` tuple:

In [63]:
class AppStatus(TwoValueEnum):
    OK = (0, 'No Problem!')
    FAILED = (1, 'Crap!')

In [64]:
AppStatus.FAILED, AppStatus.FAILED.name, AppStatus.FAILED.value, AppStatus.FAILED.phrase

(<AppStatus.FAILED: 1>, 'FAILED', 1, 'Crap!')

##  Automatic Values

Enumerations have a builtin mechanism to auto assign values to members.

This is often useful when you migth have a simple associated integer value that is sequential, for example `1, 2, 3, 4, ...`

We can easily let enums assign their own values this way, using the `auto()` function in the enum module.

By default it will use sequential integers, starting at `1`:

In [1]:
import enum

In [2]:
class State(enum.Enum):
    WAITING = enum.auto()
    STARTED = enum.auto()
    FINISHED = enum.auto()

In [3]:
for member in State:
    print(member.name, member.value)

WAITING 1
STARTED 2
FINISHED 3


We can actually mix in our own values too, but we have to be really careful - nothing in the Python documentation states what will/will not work - their only advice is ```Care must be taken if you mix auto with other values```. That's not saying much, and so I **never** mix auto-generated values and my own - just to be on the safe side.

This seems to work fine:

In [4]:
class State(enum.Enum):
    WAITING = 5
    STARTED = enum.auto()
    FINISHED = enum.auto()

In [5]:
for member in State:
    print(member.name, member.value)

WAITING 5
STARTED 6
FINISHED 7


But observe what happens here:

In [6]:
class State(enum.Enum):
    WAITING = enum.auto()
    STARTED = 1
    FINISHED = enum.auto()
    
for member in State:
    print(member.name, member.value)
    
State.__members__

WAITING 1
FINISHED 2


mappingproxy({'WAITING': <State.WAITING: 1>,
              'STARTED': <State.WAITING: 1>,
              'FINISHED': <State.FINISHED: 2>})

As you can see, `STARTED` ended up being an alias for `WAITING` - not what my intention was.

Using `@unique` does not solve the issue, although it does make it immediately clear that there is a problem:

In [7]:
try:
    @enum.unique
    class State(enum.Enum):
        WAITING = enum.auto()
        STARTED = 1
        FINISHED = enum.auto()
except ValueError as ex:
    print(ex)

duplicate values found in <enum 'State'>: STARTED -> WAITING


Enum classes use the `_generate_next_value_` method to generate these automatic values, and we can actually override this to provide our implementation of an automatic value. The default implemtation currently generates a sequence of numbers, but the actual algorithm is an implementation detail - i.e. we cannot rely on any specific sequence of values being generated.

We can however override it if we wish:

In [8]:
class State(enum.Enum):
    def _generate_next_value_(name, start, count, last_values):
        print(name, start, count, last_values)
        return 100
    
    a = enum.auto()
    b = enum.auto()
    c = enum.auto()

a 1 0 []
b 1 1 [100]
c 1 2 [100, 100]


As we can see the `last_values` property is a list of all the preceding values used for member. The `count` property is simply the number of enum members already created (including aliases!). The `name` property is the name of the member. The `start` argument is actually only used when we create enumerations using a functional approach (very similar to how we created named tuples) - but I am not going to cover this in this course (feel free to explore the Python docs, it's quite straightforward).

Let's see a more interesting example of how we could use this override. Let's say we want the associated values to be random integers, where we do not want duplicates.

In [9]:
import random

random.seed(0)

class State(enum.Enum):
    def _generate_next_value_(name, start, count, last_values):
        while True:
            new_value = random.randint(1, 100)
            if new_value not in last_values:
                return new_value
            
    a = enum.auto()
    b = enum.auto()
    c = enum.auto()

In [10]:
for member in State:
    print(member.name, member.value)

a 50
b 98
c 54


Another example, shown in the Python docs is using the string of the member name as the value. In this example I choose to title case the name:

In [11]:
class State(enum.Enum):
    def _generate_next_value_(name, start, count, last_values):
        return name.title()  
    
    WAITING = enum.auto()
    STARTED = enum.auto()
    FINISHED = enum.auto()
    
for member in State:
    print(member.name, member.value)

WAITING Waiting
STARTED Started
FINISHED Finished


If we want to make our `_generate_next_value_` implementation reusable across more than one enumeration, we could create an enumeration that only implements this functionality, and then use that as the parent class to our other enumerations:

In [12]:
class NameAsString(enum.Enum):
    def _generate_next_value_(name, start, count, last_values):
        return name.lower()

In [13]:
class Enum1(NameAsString):
    A = enum.auto()
    B = enum.auto()
    
class Enum2(NameAsString):
    WAIT = enum.auto()
    RUNNING = enum.auto()
    FINISHED = enum.auto()

In [14]:
for member in Enum1:
    print(member.name, member.value)
    
for member in Enum2:
    print(member.name, member.value)

A a
B b
WAIT wait
RUNNING running
FINISHED finished


### Note

Sometimes, we don't actually care about the associated value for each member. In that case we can certainly use `auto()`, but the problem might be that users of our enumeration rely on that associated value.

Later, if we want to add items to the enumeration (somewhere in the middle), our users' code would break.

We might therefore want to discourage our users from ever using the associated value, and only using the keys.

Although we can (and should) document this, we can also enforce this using a simple trick. We assign an instance of `object` as the value for each member. There is very little our users can then do with that value, and so we are ensuring their safety.

In [15]:
class State(enum.Enum):
    WAIT = object()
    RUNNING = object()
    FINISHED = object()

In [16]:
State.WAIT, State.RUNNING, State.FINISHED

(<State.WAIT: <object object at 0x7fab7807bd30>>,
 <State.RUNNING: <object object at 0x7fab7807bd40>>,
 <State.FINISHED: <object object at 0x7fab7807bda0>>)

In order for a user to use the value, they would have to first get a handle to the object instance itself - they would never get that back from a literal string, integer, etc.

Now, instead of remembering to use `object()` for every member, we could use a base class to make it reusable (and a consistent implementation), and the auto functionality:

In [17]:
class ValuelessEnum(enum.Enum):
    def _generate_next_value_(name, start, count, last_values):
        return object()
    
class State(ValuelessEnum):
    WAIT = enum.auto()
    RUNNING = enum.auto()
    FINISHED = enum.auto()
    
class Errors(ValuelessEnum):
    NumberError = enum.auto()
    IndexError = enum.auto()
    TimeoutError = enum.auto()

In [18]:
State.WAIT, Errors.TimeoutError

(<State.WAIT: <object object at 0x7fab7807bdd0>>,
 <Errors.TimeoutError: <object object at 0x7fab7807be40>>)

By using a base class, we could technically change our implementation of how the values are generated without having to touch our subclassed enumerations:

In [19]:
class ValuelessEnum(enum.Enum):
    def _generate_next_value_(name, start, count, last_values):
        while True:
            new_value = random.randint(1, 100)
            if new_value not in last_values:
                return new_value
    
class State(ValuelessEnum):
    WAIT = enum.auto()
    RUNNING = enum.auto()
    FINISHED = enum.auto()
    
class Errors(ValuelessEnum):
    NumberError = enum.auto()
    IndexError = enum.auto()
    TimeoutError = enum.auto()

In [20]:
State.WAIT, Errors.TimeoutError

(<State.WAIT: 6>, <Errors.TimeoutError: 39>)

### Auto and Aliases

I want to touch back on the `count` argument of `_generate_next_value_` when are are dealing with aliases.

Since the default implementation of `_generate_next_value_` generates sequential integer numbers, we can never create aliases using this default.

However, nothing stops us from doing so when we have our own implementation of that function. In that case `count` will reflect the number of items created, **including** any aliases.

In [21]:
class Aliased(enum.Enum):
    def _generate_next_value_(name, start, count, last_values):
        print(f'count={count}')
        if count % 2 == 1:
            # odd, make this member an alias of the previous one
            return last_values[-1]
        else:
            # make a new value
            return last_values[-1] + 1
       
    GREEN = 1
    GREEN_ALIAS = 1
    RED = 10
    CRIMSON = enum.auto()
    BLUE = enum.auto()
    AQUA = enum.auto()

count=3
count=4
count=5


As you can see `_generate_next_value_` was called for the last three members of our enum, and reflect the number of items that were created to that point, including aliases.

In [22]:
list(Aliased)

[<Aliased.GREEN: 1>, <Aliased.RED: 10>, <Aliased.BLUE: 11>]

In [23]:
Aliased.__members__

mappingproxy({'GREEN': <Aliased.GREEN: 1>,
              'GREEN_ALIAS': <Aliased.GREEN: 1>,
              'RED': <Aliased.RED: 10>,
              'CRIMSON': <Aliased.RED: 10>,
              'BLUE': <Aliased.BLUE: 11>,
              'AQUA': <Aliased.BLUE: 11>})

# Section 11 - Project 5

##  Project 5 - Solution

Suppose we are writing an application that uses exceptions and we want our exception messages (and type) to be very consistent, as well as provide some way to easily list out all the possible exceptions used in our app.

Although there are many other approaches to doing this (as with any problem), let's use enumerations specifically to implement this functionality.

What we want is a mechanism whereby we can raise an exception this way:

```
AppException.Timeout.throw()
```
which will raise a custom exception `ConnectionException('100 - Timeout connecting to resource')`

And something like this as well:
```
AppException.NotAnInteger.throw()
```
which will raise a `ValueError('200 - Value is not an integer')`

This means our exception will need to contain the exception key (such as `Timeout` or `NotAnInteger`) as well as the exception class we want to raise, and the default message itself. We also want to have consistent error codes (integer values) for each exception.

We'll need to implement a `throw` method (we can't use the reserved name `raise`) that will raise the exception with the default message. In addition we'd like to be able to override the default message with a custom one if we prefer:
```
AppException.Timeout.throw('Timeout connecting to database')
```

We'll also need to implement some properties for the exception code, class (type), and message.

First let's create a few custom exceptions that we can use, but of course we can also use all the builtin exceptions too.

In [1]:
class GenericException(Exception):
    pass

class Timeout(Exception):
    pass

We'll come back to exceptions later and see why we may actually want to build a hierarchy of exception instead of this flat appropach I took here.

In [2]:
from enum import Enum

First we're going to need to store a tuple for each key's value and that tuple will need to contain the error code, the exception class, and a custom message. So three entities.

We'll use the same approach we took when we looked at extending enums, and use the `__new__` method to achieve our goals.    

In [3]:
class AppException(Enum):
    Generic = (100, GenericException, 'Application exception.')
    Timeout = (101, Timeout, 'Timeout connecting to resource.')
    NotAnInteger = (200, ValueError, 'Value must be an integer.')
    NotAList = (201, ValueError, 'Value must be a list.')
    
    def __new__(cls, ex_code, ex_class, ex_message):
        # create a new instance of cls
        member = object.__new__(cls)
        
        # set up instance attributes
        member._value_ = ex_code
        member.exception = ex_class
        member.message = ex_message
        return member

So this is a good start. We can use our enum this way:

In [4]:
AppException.Timeout.value, AppException.Timeout.message, AppException.Timeout.exception

(101, 'Timeout connecting to resource.', __main__.Timeout)

So we could technically raise an exception directly from this:

In [5]:
try:
    raise AppException.Timeout.exception(f'{AppException.Timeout.value} - {AppException.Timeout.message}')
except Timeout as ex:
    print(ex)

101 - Timeout connecting to resource.


But we really do not want to have to raise exceptions this way - it's a lot of typing. I also don't like using `value` for the exception code, I'd rather have a property called `code` that is maybe a better name for it.

So, we'll immplement a `code` property (we'll leave value as is, because we can look up an exception by it's code that way), and we'll implement a `raise` method to actually raise the exception for us.

In [6]:
class AppException(Enum):
    Generic = (100, GenericException, 'Application exception.')
    TimeOut = (101, Timeout, 'Timeout connecting to resource.')
    NotAnInteger = (200, ValueError, 'Value must be an integer.')
    NotAList = (201, ValueError, 'Value must be a list.')
    
    def __new__(cls, ex_code, ex_class, ex_message):
        # create a new instance of cls
        member = object.__new__(cls)
        
        # set up instance attributes
        member._value_ = ex_code
        member.exception = ex_class
        member.message = ex_message
        return member
    
    @property
    def code(self):
        return self.value
    
    def throw(self):
        raise self.exception(f'{self.code} - {self.message}')

Now it becomes much easier to raise an exception:

In [7]:
try:
    AppException.NotAnInteger.throw()
except ValueError as ex:
    print(ex)

200 - Value must be an integer.


We can easily access exceptions by name (key) or code (value):

In [8]:
AppException.NotAList.code, AppException.NotAList.message

(201, 'Value must be a list.')

or:

In [9]:
AppException(201), AppException['NotAList']

(<AppException.NotAList: 201>, <AppException.NotAList: 201>)

One additional thing is that I would like the ability to override the default error message. So let's add this to the `throw` method:

In [10]:
class AppException(Enum):
    Generic = (100, GenericException, 'Application exception.')
    Timeout = (101, Timeout, 'Timeout connecting to resource.')
    NotAnInteger = (200, ValueError, 'Value must be an integer.')
    NotAList = (201, ValueError, 'Value must be a list.')
    
    def __new__(cls, ex_code, ex_class, ex_message):
        # create a new instance of cls
        member = object.__new__(cls)
        
        # set up instance attributes
        member._value_ = ex_code
        member.exception = ex_class
        member.message = ex_message
        return member
    
    @property
    def code(self):
        return self.value
    
    def throw(self, message=None):
        message = message or self.message
        raise self.exception(f'{self.code} - {message}')

In [11]:
try:
    AppException.Timeout.throw()
except Exception as ex:
    print(ex)

101 - Timeout connecting to resource.


In [12]:
try:
    AppException.Timeout.throw('Timeout connecting to database.')
except Exception as ex:
    print(ex)

101 - Timeout connecting to database.


And of course we can list out all the errors in our app:

In [13]:
list(AppException)

[<AppException.Generic: 100>,
 <AppException.Timeout: 101>,
 <AppException.NotAnInteger: 200>,
 <AppException.NotAList: 201>]

We can get a more usable list of exception names, codes and messages this way:

In [14]:
[(ex.name, ex.code, ex.message) for ex in AppException]

[('Generic', 100, 'Application exception.'),
 ('Timeout', 101, 'Timeout connecting to resource.'),
 ('NotAnInteger', 200, 'Value must be an integer.'),
 ('NotAList', 201, 'Value must be a list.')]

# Section 12 - Exceptions

##  Python Exceptions

Exceptions are objects - instances of classes.

In Python, all exceptions inherit from `BaseException`, but most of the majority of the builtin exceptions we work with derive from a subclass of that class, the `Exception` class.

As I showed you in the lecture there is a hierarchy to those classes.

When exceptions are `raised` (either by Python, or by ourselves), it triggers an exception workflow.

Let's first see that exceptions are objects:

In [1]:
type(Exception)

type

In [2]:
ex = Exception()

As you can see, creating an exception object does **not** trigger an exception workflow.

Let's examine this `Exception` instance:

In [3]:
ex.__class__, type(ex)

(Exception, Exception)

And it is indeed a subclass of `BaseException`:

In [4]:
isinstance(ex, BaseException)

True

Other exceptions, such as an `IndexError`, inherit from a hierarchy of exceptions that go back all the way to `BaseException` (and `object` as well of course!)

In [5]:
issubclass(IndexError, LookupError)

True

In [6]:
issubclass(LookupError, Exception)

True

Exception workflows can be triggered by Python itself:

In [7]:
l = [1, 2, 3]
l[4]

IndexError: list index out of range

As you can see Python raised an `IndexError` exception.

We can "handle" an exception workflow by using the a `try` statement and handling the exception (if any) in the `except` clause of the handler:

In [8]:
try:
    l[4]
except IndexError as ex:
    print(ex.__class__, ':', str(ex))

<class 'IndexError'> : list index out of range


As you can see we **handled** the `IndexError` exception.

But since `IndexError` inherits from `LookupError` which itself inherits from `Exception`, we could actually handle any of those exception types with the same effect:

In [9]:
try:
    l[4]
except LookupError as ex:
    print(ex.__class__, ':', str(ex))

<class 'IndexError'> : list index out of range


As you may have noticed, the exception that is raised is **still** an `IndexError`, but it was handled by the `except LookupError` handler.

So when we handle an exception, the handler will "catch" the exception type we specify, **and any subclass of it**.

We can broaden our handler to include any subclass of `Exception`:

In [10]:
try:
    l[4]
except Exception as ex:
    print(ex.__class__, ':', str(ex))

<class 'IndexError'> : list index out of range


But be careful of writing broad handlers like that - it is unlikely (but not always) that you can write handlers that do any meaningful error handling for such broad exceptions - the better approach is to handle specific exceptions in specific ways.

By the way, most standard exceptions implement both `str` and `repr` custom representations:

In [11]:
ex = ValueError('custom message')

In [12]:
str(ex)

'custom message'

In [13]:
repr(ex)

"ValueError('custom message',)"

Next we should talk about the stack trace. Recall what I mentioned in the lecture about exceptions propagating up if they are no handled.

Let's start with an example of some nested function calls, and we'll raise an exception in the innermost function call.

In [14]:
def func_1():
    func_2()
    
def func_2():
    func_3()
    
def func_3():
    # create an instance of a ValueError exception, and raise it
    raise ValueError()

Now if I call `func_3` directly, we'll see an unhandled `ValueError` exception:

In [15]:
func_3()

ValueError: 

But now let's call `func_1`:

In [16]:
func_1()

ValueError: 

Notice the stack trace above. 

The bottom of the stack is where the exception started, then each "frame" above it tells us that the exception propagated - first to `func_2` (in the line that called `func_3`), and then finally in `func_1` (in the line that called `func_2`)

Now of course we can handle the exception at any level we wish. When we handle an exception it is up to us to decide what to do with it - at that point we have interrupted the exception propagation, and we could either do something and continue running our code, or we could raise another exception, or we could re-raise the exception. We'll come back to that later.

For now, let's see how we could handle the exception in `func_2` and silence it:

In [17]:
def func_2():
    try:
        func_3()
    except ValueError:
        print('error occurred - silencing it')

In [18]:
func_1()

error occurred - silencing it


As you can see we essentially stopped the exception propagation in `func_2`.

I just want to go back to the statement I made about not making our handlers too broad.

Suppose we have a function that, given a sequence, returns the square of the numbers, up to (but not including) a specific index number in the sequence:

In [19]:
def square(seq, index):
    return seq[index] ** 2

def squares(seq, max_n):
    for i in range(max_n):
        yield square(seq, i)

Now if we have a problem with our max index, we'll get an `IndexError` exception:

In [20]:
l = [1, 2, 3]

In [21]:
list(squares(l, 4))

IndexError: list index out of range

So, we may want to trap that exception using a broad `Exception` handler:

In [22]:
def square(seq, index):
    return seq[index] ** 2

def squares(seq, max_n):
    for i in range(max_n):
        try:
            yield square(seq, i)
        except Exception:
            return

In [23]:
l = [1, 2, 3]
list(squares(l, 5))

[1, 4, 9]

So that seems to work, and we can now deal with a bad max index. But what happens if I pass a seq where one of the values is not squarable?

This is the exception we should be seeing:

In [24]:
'a' ** 2

TypeError: unsupported operand type(s) for ** or pow(): 'str' and 'int'

But watch what happens when we iterate:

In [25]:
l = [1, 2, '3', 4, 5]
list(squares(l, 10))

[1, 4]

As you can see that exception was handled just like the index exception. That's probably not what I want - so it would be much better to write it this way:

In [26]:
def square(seq, index):
    return seq[index] ** 2

def squares(seq, max_n):
    for i in range(max_n):
        try:
            yield square(seq, i)
        except IndexError:
            return

In [27]:
l = [1, 2, '3', 4, 5]
list(squares(l, 10))

TypeError: unsupported operand type(s) for ** or pow(): 'str' and 'int'

And now I get an exception, which means I am aware of the problem, whereas the broad exception handler earlier completely hid from me.

And of course this still works as expected:

In [28]:
l = [1, 2, 3]
list(squares(l, 10))

[1, 4, 9]

So be careful - broad exception handlers can easily hide bugs in your code. They are not recommended in practice, but are sometimes useful.

For example, you might start a database transaction, and start writing some data to a database. 

Your application specs call for rolling back the transaction should **any** exception occur. 

In that case, a broad exception handler might make sense.

Better yet though, use a context manager!!

In fact, we can make our exception handler even broader, by using a **bare** except:

In [29]:
try:
    1 / 0
except:
    print('exception occurred')

exception occurred


Again, not a good idea in general, but there are some valid use cases for this, which we'll see later.

##  Handling Exceptions

We'll come back to how we can raise exceptions, but we've used it before, so I'll use it again now without explanation, just so we can raise some exceptions to examine exception **handling**.

In [1]:
raise ValueError('custom exception')

ValueError: custom exception

If this exception had occured at the module level when running the module, the Python application would exit. We did not **handle** the exception, so the exception propagated all the way to the top and ended up aborting the program execution.

In here though, Jupyter basically handles any exception (prints it out and silences it) so our notebook does not crash! 

(By the way, this is a very good use case for a bare exception handler!)

Let's try a simple handler first:

In [2]:
try:
    raise ValueError('custom message')
except ValueError as ex:
    print(ex)

custom message


As you can see, the string representation of the `ValueError` exception object is just the custom message we supplied as an argument to the exception. Most standard exceptions will actually support multiple arguments in their constructor, so we can actually do something like this:

In [3]:
try:
    raise ValueError('custom message', 'secondary message')
except ValueError as ex:
    print(ex)

('custom message', 'secondary message')


Alternatively, we could use the `repr()` of the exception when printing it out:

In [4]:
try:
    raise ValueError('custom message', 'secondary message')
except ValueError as ex:
    print(repr(ex))

ValueError('custom message', 'secondary message')


When we guard code (in a `try` block), we can handle different exception types in separate exception **handlers**:

In [5]:
def func_1():
    raise ValueError('bad value')
    
try:
    func_1()
except ValueError as ex:
    print('handling a value error', repr(ex))
except IndexError as ex:
    print('handling an index error', repr(ex))

handling a value error ValueError('bad value',)


But if `func_1` caused an `IndexError` exception to be raised, our second handler would catch it:

In [6]:
def func_1():
    raise IndexError('bad index')
    
try:
    func_1()
except ValueError as ex:
    print('handling a value error', repr(ex))
except IndexError as ex:
    print('handling an index error', repr(ex))

handling an index error IndexError('bad index',)


The first exception handler that "matches" (subclass!) the exception type will be used - so be careful about not catching broad exceptions first.

For example, this will not handle the exception in the `ValueError` handler, because it is a subclass of `Exception` and that handler is defined first:

In [7]:
try:
    raise ValueError('value error')
except Exception as ex:
    print('handling Exception', repr(ex))
except ValueError as ex:
    print('handling ValueError', repr(ex))

handling Exception ValueError('value error',)


Note that the exception is still an instance of `ValueError`, but is being handled by the code in the `except Exception` handler.

If we write exception handlers, and none of them match the exception type, then the exception is essentially unhandled, and it will propagate up:

In [8]:
try:
    raise KeyError('bad key')
except ValueError:
    print('handling value error...')
except IndexError:
    print('handling index error...')

KeyError: 'bad key'

The `finally` block is guaranteed to execute, whether an exception is raised or not, and whether it is handled or not!

In [9]:
try:
    raise ValueError('bad value')
except ValueError:
    print('handling value error...')
finally:
    print('running finally...')

handling value error...
running finally...


If no exception occurs:

In [10]:
try:
    pass
except ValueError:
    print('handling value error...')
finally:
    print('running finally...')    

running finally...


And with an unhandled exception:

In [11]:
try:
    raise ValueError('bad value')
except IndexError:
    print('handling index error...')
finally:
    print('running finally...')

running finally...


ValueError: bad value

This means that the `finally` block will execute even if there are no exception handlers defined, and whether or not an exception is raised:

In [12]:
try:
    pass
finally:
    print('finally...')

finally...


In [13]:
try:
    raise ValueError()
finally:
    print('finally...')

finally...


ValueError: 

The `except` clause on the other hand is a block that excues if no exceptions occurred - it requires at least one `except` clause to be present:

In [14]:
try:
    pass
except ValueError:
    print('value error...')
else:
    print('no exception occurred...')

no exception occurred...


In [15]:
try:
    raise ValueError();
except ValueError:
    print('value error...')
else:
    print('no exception occurred...')

value error...


In [16]:
try:
    raise ValueError()
except IndexError:
    print('index error...')
else:
    print('no exception occurred...')
    

ValueError: 

Some developers often ignore the `else` clause altogether, and write the following:

In [17]:
try:
    pass
except ValueError:
    print('value error...')
else:
    print('no exception occurred...')

no exception occurred...


this way:

In [18]:
try:
    pass
except ValueError:
    print('value error...')
print('no exception occurred')

no exception occurred


These two are in fact **not** equivalent!

What happens if a `ValueError` exception does occur?

In [19]:
try:
    raise ValueError()
except ValueError:
    print('value error...')
else:
    print('no exception occurred...')

value error...


In [20]:
try:
    raise ValueError()
except ValueError:
    print('value error...')
print('no exception occurred')

value error...
no exception occurred


As you can see we do **not** have the same functionality.

`try` statement can be nested. Obviously they can be nested if one `try` clause calls another function that itself contains a `try`. But they can also be nested, one within the other directly.

Let's first see the direct nesting:

Suppose we want to create a list of `Person` objects from a deserialized `json` object:

In [21]:
import json

In [22]:
json_data = """{
    "Alex": {"age": 18},
    "Bryan": {"age": 21, "city": "London"},
    "Guido": {"age": "unknown"}
}"""

First we can deserialize the json string into a dictionary:

In [23]:
data = json.loads(json_data)

In [24]:
data

{'Alex': {'age': 18},
 'Bryan': {'age': 21, 'city': 'London'},
 'Guido': {'age': 'unknown'}}

Next we are going to create a list of `Person` objects, and iterate through the properties of each person in the `data` dict and set them directly on the `Person` instance. 

Firstly, the `city` attribute is going to be a problem since `Person` only has two slots defined (`name` and `age`). 
This will be an `AttributeError`.

Secondly, `Guido`'s age is not a valid value - this is going to cause a `ValueError`.

In [25]:
class Person:
    __slots__ = 'name', '_age'
    
    def __init__(self, name):
        self.name = name
        self._age = None
        
    @property
    def age(self):
        return self._age
    
    @age.setter
    def age(self, value):
        if isinstance(value, int) and value >= 0:
            self._age = value
        else:
            raise ValueError('Invalid age')
            
    def __repr__(self):
        return f'Person(name={self.name}, age={self.age})'

The way we want to handle this is that if some "extra" attributes exist we just want to ignore them, but if a value is of the wrong type, we do not want to create the object in our list.

In [26]:
persons = []
for name, attributes in data.items():
    try:
        p = Person(name)
        
        for attrib_name, attrib_value in attributes.items():
            try:
                setattr(p, attrib_name, attrib_value)
            except AttributeError:
                print(f'ignoring attribute: {name}.{attrib_name}={attrib_value}')
    except ValueError as ex:
        print(f'Data for Person({name}) contains an invalid attribute value: {ex}')
    else:
        # note that this runs if the outer try does not encounter an exception
        # since the inner try catches and does not propagate an `AttributeError`
        # this does not affect this else - the outer try never sees the inner exception
        # since it was handled (and essentially silenced)
        persons.append(p)
        
print(persons)

ignoring attribute: Bryan.city=London
Data for Person(Guido) contains an invalid attribute value: Invalid age
[Person(name=Alex, age=18), Person(name=Bryan, age=21)]


While we could certainly handle the `ValueError` in the nested `for` loop, it makes the logic a bit more difficult:

In [27]:
persons = []
for name, attributes in data.items():
    p = Person(name)

    for attrib_name, attrib_value in attributes.items():
        skip_person = False
        try:
            setattr(p, attrib_name, attrib_value)
        except AttributeError:
            print(f'ignoring attribute: {name}.{attrib_name}={attrib_value}')
        except ValueError as ex:
            print(f'Data for Person({name}) contains an invalid attribute value: {ex}')
            skip_person = True
            break
    if not skip_person:
        persons.append(p)
        
print(persons)

ignoring attribute: Bryan.city=London
Data for Person(Guido) contains an invalid attribute value: Invalid age
[Person(name=Alex, age=18), Person(name=Bryan, age=21)]


Obviously the nested `try` is more elegant, and easier to understand.

Exception handlers may also be nested a different levels of the call stack, and either an exception is handled, or it is propagated up.

Here we want to create a simple function to transform `0`, `1`, `"0"`, `"1"`, `"T"`, `"F"`, `"True"`, `"False"`, `True` and `False` into the equivalent boolean type, as well as case insensitive versions of the strings.

In [28]:
def convert_int(val):
    if not isinstance(val, int):  # remember this will work for booleans too!
        raise TypeError()
    if val not in {0, 1}:
        raise ValueError("Integer values 0 or 1 only")
    return bool(val)

In [29]:
def convert_str(val):
    if not isinstance(val, str):
        raise TypeError()
        
    val = val.casefold()  # for case-insensitive comparisons
    if val in {'0', 'f', 'false'}:
        return False
    elif val in {'1', 't', 'true'}:
        return True
    else:
        raise ValueError('Admissible string values are: T, F, True, False (case insensitive)')

Now let's write the main converter function:

In [30]:
class ConversionError(Exception):
    pass

def make_bool(val):
    try:
        try:
            b = convert_int(val)
        except TypeError:
            # it wasn't an int/bool, so let's try it as a string
            try:
                b = convert_str(val)
            except TypeError:
                raise ConversionError(f'The type {type(val).__name__} cannot be converted to a bool')
    except ValueError as ex:
        # this will catch ValueError exceptions from either convert_int or convert_str
        raise ConversionError(f'The value {val} cannot be converted to a bool: {ex}')
    else:
        return b
    

The way we have this written, a `ConversionError` exception will be raised, both on a type error, and a value error.

Notice how we are using exception handling to control the execution flow of our code.

In particular, we are not testing for conditions prior to attempting something (i.e. we do not check if something is an instance of an `int` before calling `convert_int` - we just try it, and catch the exception if that did not work, and then proceed to do the same with `convert_str`).

This is called "asking for forgiveness later". Just try the code, and handle the exception (ask forgiveness) later.

Now we can convert our values:

In [31]:
values = [True, 0, 'T', 'false', 10, 'ABC', 1.0]

for value in values:
    try:
        result = make_bool(value)
    except ConversionError as ex:
        result = str(ex)

    print(value, result)

True True
0 False
T True
false False
10 The value 10 cannot be converted to a bool: Integer values 0 or 1 only
ABC The value ABC cannot be converted to a bool: Admissible string values are: T, F, True, False (case insensitive)
1.0 The type float cannot be converted to a bool


If having three levels of nested try's in a single function is too much for you, we could simplify it a little, at the expense of some repetitive code (usually not a good idea):

In [32]:
class ConversionError(Exception):
    pass

def make_bool(val):
    try:
        b = convert_int(val)
    except TypeError:
        pass  # for now we ignore type errors
    except ValueError as ex:
        # it wasn't an int/bool, so let's try it as a string
        raise ConversionError(f'The value {val} cannot be converted to a bool: {ex}')
    else:
        return b
    
    # reached here so we must have had a type error
    try:
        b = convert_str(val)
    except TypeError:
        pass  # silence this again
    except ValueError as ex:
        raise ConversionError(f'The value {val} cannot be converted to a bool: {ex}')
    else:
        return b
        
    # reached here, so neither an int nor a string
    raise ConversionError(f'The type {type(val).__name__} cannot be converted to a bool')

In [33]:
values = [True, 0, 'T', 'false', 10, 'ABC', 1.0]

for value in values:
    try:
        result = make_bool(value)
    except ConversionError as ex:
        result = str(ex)

    print(value, result)

True True
0 False
T True
false False
10 The value 10 cannot be converted to a bool: Integer values 0 or 1 only
ABC The value ABC cannot be converted to a bool: Admissible string values are: T, F, True, False (case insensitive)
1.0 The type float cannot be converted to a bool


We could have tried a different strategy here, the "look before you leap" strategy. In this case we try to not to cause exceptions by guarding against them.

Here's an equivalent functionality using this approach. Note that we cannot really break out the `int` and `str` conversions cleanly, because we need to test for admissible types and values before we even try the conversion:

In [34]:
def make_bool(val):
    if isinstance(val, int):
        if val in {0, 1}:
            return bool(val)
        else:
            raise ConversionError('Invalid integer value.')
    if isinstance(val, str):
        if val.casefold() in {'1', 'true', 't'}:
            return True
        if val.casefold() in {'0', 'false', 'f'}:
            return False
        raise ConversionError('Invalid string value')
    raise ConversionError('Invalid type')

In [35]:
values = [True, 0, 'T', 'false', 10, 'ABC', 1.0]

for value in values:
    try:
        result = make_bool(value)
    except ConversionError as ex:
        result = str(ex)

    print(value, result)

True True
0 False
T True
false False
10 Invalid integer value.
ABC Invalid string value
1.0 Invalid type


Usually the "ask forgiveness later" approach is favored over the "look before you leap" approach in Python. This is sometimes referred to as **EAFP** - easier to ask for permission.

But the above example shows you that that is not always clear cut - honestly I think the second version is more comprehensible than the first.

Here's a much clear example. Let's write a function that needs to use an element at some index of a sequence type, and use a default value it it's not there:

The "forgiveness" approach first:

In [36]:
def get_item_forgive_me(seq, idx, default=None):
    try:
        return seq[idx]
    except (IndexError, TypeError, KeyError):
        # catch either not indexable (TypeError), or index out of bounds, 
        # or even a KeyError for mapping types
        return default

The "ask permission" first is not that simple! How do we determine if an object is a sequence type?

In [37]:
def get_item_ask_perm(seq, idx, default=None):
    if hasattr(seq, '__getitem__'):
        if idx < len(seq):
            return seq[idx]
    return default

The first one works quite well:

In [38]:
get_item_forgive_me([1, 2, 3], 0)

1

In [39]:
get_item_forgive_me([1, 2, 3], 10, 'Nope')

'Nope'

The second one seems to work ok:

In [40]:
get_item_ask_perm([1, 2, 3], 0)

1

In [41]:
get_item_ask_perm([1, 2, 3], 10, 'Nope')

'Nope'

But what about this:

In [42]:
get_item_forgive_me({'a': 100}, 'a')

100

In [43]:
get_item_ask_perm({'a': 1}, 'a')

TypeError: '<' not supported between instances of 'str' and 'int'

So, now we would have to do a lot more work to support getting a key from a mapping using this approach. The dictionary has a `__getitem__` method, but does not support numerical indexing.

We could get bogged down in more and more checks:

In [44]:
def get_item_ask_perm(seq, idx, default=None):
    if hasattr(seq, '__getitem__'):
        # could be sequence type or mapping type, or something else altogether??
        if isinstance(seq, dict):
            return seq.get(idx, default)
        elif isinstance(idx, int):
            # looks like a numerical index...
            if idx < len(seq):
                return seq[idx]
    return default

That fixes the problem somewhat:

In [45]:
get_item_ask_perm({'a': 100}, 'a')

100

In [46]:
get_item_ask_perm([1, 2, 3], 0)

1

But now we are also relying on the sequence type having a length!

In [47]:
class ConstantSequence:
    def __init__(self, val):
        self.val = val
        
    def __getitem__(self, idx):
        return self.val

This is a sequence, an infinite sequence in fact:

In [48]:
seq = ConstantSequence(10)

In [49]:
seq[0]

10

And watch what happens with both our functions:

In [50]:
get_item_forgive_me(seq, 10, 'Nope')

10

In [51]:
get_item_ask_perm(seq, 10, 'Nope')

TypeError: object of type 'ConstantSequence' has no len()

And so on, we could really dig ourselves into a hole here. When all we're interested in in making this call `seq[idx]`, and using a default if that does not work.

And that's why EAFP is favored - in Python, we are more interested in can an object perform this type of work, versus

##  Raising Exceptions

An exception workflow can be initiated by using the `raise` statement.

To *raise* an exception we need to `raise` an **instance** of an exception type (one that is a subclass of `BaseException`).

You cannot raise an instance of a class that is not a subclass of `BaseException`.

In [1]:
class Person:
    pass

In [2]:
try:
    raise Person()
except TypeError as ex:
    print(repr(ex))

TypeError('exceptions must derive from BaseException',)


All the standard exceptions derive from `BaseException` and it allows for any number of positional arguments in the initializer (`*args`). The only place those arguments are actually used in `BaseException` is in the `args` attribute and the string representations:

In [3]:
ex = BaseException('a', 'b', 'c')

In [4]:
ex.args

('a', 'b', 'c')

In [5]:
str(ex)

"('a', 'b', 'c')"

In [6]:
repr(ex)

"BaseException('a', 'b', 'c')"

This means that other standard exceptions, that inherit from `BaseException` support this too:

In [7]:
ex = ValueError('a', 'b', 'c')
print(ex.args)
print(str(ex))
print(repr(ex))

('a', 'b', 'c')
('a', 'b', 'c')
ValueError('a', 'b', 'c')


Often we only use a single argument, some type of explanatory message, but it is handy to have the option of extra arguments available.

So raising an exception is very easy:

In [8]:
try:
    raise ValueError('some message here')
except ValueError as ex:
    print(repr(ex))

ValueError('some message here',)


But there are some useful variations on the `raise` statement.

Sometimes we want to catch an exception, try to handle it, maybe because we realize we can't handle that specific exception, or because we want to perform some action before letting the exception continue to propagate - essentially inserting ourselves in the propagation workflow, but letting it continue once we're done.

Here's a more concrete example:

In [9]:
def div(a, b):
    try:
        return a // b
    except ZeroDivisionError as ex:
        print('logging zero division exception: ', type(ex).__name__, ex.args)
        raise

In [10]:
div(1, 0)

logging zero division exception:  ZeroDivisionError ('integer division or modulo by zero',)


ZeroDivisionError: integer division or modulo by zero

As you can see, we interrupted the flow, logged what we needed, and resume the propagation flow.

Sometimes we may want to change the particular exception we are raising - this is particularly useful when using custom exceptions, as we'll cover later.

But here's what I mean:

In [11]:
class CustomError(Exception):
    """a custom exception"""
    
def my_func(a, b):
    try:
        return a // b
    except ZeroDivisionError as ex:
        print('logging...')
        raise CustomError(*ex.args)

In [12]:
my_func(1, 0)

logging...


CustomError: integer division or modulo by zero

So, the exception we obtained was a `CustomError` exception - what we substituted for the `ZeroDivisionError` exception that occurred.

One very important note here, is the traceback.

Notice how we can see precisely the exception stack - first a `ZeroDivisionError`, that then resulted in a `CustomError` exception.

Whenever we raise an exception in this way, the stack trace of the current exception is maintained and added to our new exception being raised.

We could see this nested more levels:

In [13]:
try:
    raise ValueError('level 1')
except ValueError:
    try:
        raise TypeError('level 2')
    except TypeError:
        raise KeyError('level 3')

KeyError: 'level 3'

As you can see the entire stack trace is preserved.

Sometimes we may want to modify whether we want to keep the original stack trace - we may be writing a function where the specific exceptions that result in the final exception we want to raise are implementation details we don't want our user to have to wade through.

In that case, we can squash the current traceback completely, by using `raise Exc from None` - the `from` here tells Python what traceback to use - in this case `None`.

Let's see where this might be handy. Remember that set of functions we wrote earlier to convert a value to it's boolean equivalent?

Here it is again:

In [14]:
class ConversionError(Exception):
    pass

def convert_int(val):
    if not isinstance(val, int):  # remember this will work for booleans too!
        raise TypeError()
    if val not in {0, 1}:
        raise ValueError("Integer values 0 or 1 only")
    return bool(val)

def convert_str(val):
    if not isinstance(val, str):
        raise TypeError()
        
    val = val.casefold()  # for case-insensitive comparisons
    if val in {'0', 'f', 'false'}:
        return False
    elif val in {'1', 't', 'true'}:
        return True
    else:
        raise ValueError('Admissible string values are: T, F, True, False (case insensitive)')
        
def make_bool(val):
    try:
        try:
            b = convert_int(val)
        except TypeError:
            # it wasn't an int/bool, so let's try it as a string
            try:
                b = convert_str(val)
            except TypeError:
                raise ConversionError(f'The type {type(val).__name__} cannot be converted to a bool')
    except ValueError as ex:
        # this will catch ValueError exceptions from either convert_int or convert_str
        raise ConversionError(f'The value {val} cannot be converted to a bool: {ex}')
    else:
        return b
    

And when we call the function with a bad value:

In [15]:
make_bool('ABC')

ConversionError: The value ABC cannot be converted to a bool: Admissible string values are: T, F, True, False (case insensitive)

Notice how the stack trace is quite complicated. Do we really want users of our function to see this? The internal implementation details of our function is not of interest to them, we just want to raise a "clean" `ConversionError` exception.

We can do so by using `from None` when we raise our custom exception:

In [16]:
class ConversionError(Exception):
    pass

def convert_int(val):
    if not isinstance(val, int):  # remember this will work for booleans too!
        raise TypeError()
    if val not in {0, 1}:
        raise ValueError("Integer values 0 or 1 only")
    return bool(val)

def convert_str(val):
    if not isinstance(val, str):
        raise TypeError()
        
    val = val.casefold()  # for case-insensitive comparisons
    if val in {'0', 'f', 'false'}:
        return False
    elif val in {'1', 't', 'true'}:
        return True
    else:
        raise ValueError('Admissible string values are: T, F, True, False (case insensitive)')
        
def make_bool(val):
    try:
        try:
            b = convert_int(val)
        except TypeError:
            # it wasn't an int/bool, so let's try it as a string
            try:
                b = convert_str(val)
            except TypeError:
                raise ConversionError(f'The type {type(val).__name__} cannot be converted to a bool') from None
    except ValueError as ex:
        # this will catch ValueError exceptions from either convert_int or convert_str
        raise ConversionError(f'The value {val} cannot be converted to a bool: {ex}') from None
    else:
        return b
    

In [17]:
make_bool('ABC')

ConversionError: The value ABC cannot be converted to a bool: Admissible string values are: T, F, True, False (case insensitive)

In [18]:
make_bool(1.0)

ConversionError: The type float cannot be converted to a bool

As you can see, the traceback is much cleaner.

We can also be very specific as to which traceback to use when we raise an exception. 

In [19]:
try:
    raise ValueError('level 1')
except ValueError as ex_1:
    try:
        raise ValueError('level 2')
    except ValueError as ex_2:
        try:
            raise ValueError('level 3')
        except ValueError as ex_3:
            raise ValueError('value error occurred')

ValueError: value error occurred

Notice how the traceback contains the entire exception stack. We could of course remove it entirely:

In [20]:
try:
    raise ValueError('level 1')
except ValueError as ex_1:
    try:
        raise ValueError('level 2')
    except ValueError as ex_2:
        try:
            raise ValueError('level 3')
        except ValueError as ex_3:
            raise ValueError('value error occurred') from None

ValueError: value error occurred

But we could also choose to only skip `level2` by using the traceback from `level1`:

In [21]:
try:
    raise ValueError('level 1')
except ValueError as ex_1:
    try:
        raise ValueError('level 2')
    except ValueError as ex_2:
        try:
            raise ValueError('level 3')
        except ValueError as ex_3:
            raise ValueError('value error occurred') from ex_1

ValueError: value error occurred

As you can see, we used the traceback from `ex_1` when we raised our final `ValueError`.

This can be useful if you trap some exception, try to handle it, and in the process cause another exception to be raised. 

When you handle that secondary exception, you may very well consider it an implementation detail and wish to shield the user from that particular exception - but the original one is important enough to include it in the traceback.

Let's look at an example that uses the `convert_int` function from earlier. We know that if we pass it a non-integer value, it will give us a type exception:

In [22]:
convert_int(1.0)

TypeError: 

Now suppose we are writing a function that makes use of it:

In [23]:
def calc(b):
    try:
        b_bool = convert_int(b)
    except TypeError as ex_1:
        # bad type, but maybe it was a float and we could try to convert it to an int first
        try:
            b_int = int(b)
        except (ValueError, TypeError):
            raise CustomError('Bad type')
            
        b_bool = convert_int(b_int)

    return b_bool   

In [24]:
calc(1), calc(0)

(True, False)

In [25]:
calc(1.0)

True

In [26]:
calc('A')

CustomError: Bad type

As you can see we get an ugly stack trace here, that includes the exception when we tried to cast our argument to an int. We can hide it by using the traceback from `ex_1` instead:

In [27]:
def calc(b):
    try:
        b_bool = convert_int(b)
    except TypeError as ex_1:
        # bad type, but maybe it was a float and we could try to convert it to an int first
        try:
            b_int = int(b)
        except (ValueError, TypeError):
            raise CustomError('Bad type') from ex_1
            
        b_bool = convert_int(b_int)

    return b_bool   

In [28]:
calc('ab')

CustomError: Bad type

##  Custom Exceptions

We can create our own exception types, by simply inheriting from `Exception`. (Usually, we want to inherit from `Exception`, not `BaseException` since `BaseException` includes exceptions such as `SystemExit`, `KeyboardInterrupt` and a few others - our custom exceptions mostly do not fall under the same *base* type of exceptions, but rather under `Exception`. 

Plus, it is usually expected that custom exceptions inherit from `Exception`, and people will think that trapping `Exception` will trap your exceptions as well.

So, to create a custom exception we simply inherit from `Exception`, or any subclass thereof.

In [1]:
class TimeoutError(Exception):
    """Timeout exception"""

Note: we should really always provide a docstring for any class or function we create. If we do so, a docstring **is** a valid Python statement, and it is enough for an "empty" class - we do not need to use `pass`.

Now we can trap an instance of `TimeoutError` with `TimeoutError`, `Exception`, or even `BaseException`.

In [2]:
try:
    raise TimeoutError('timeout occurred')
except TimeoutError as ex:
    print(ex)

timeout occurred


Note that we do now need to provide an `__init__` since that is inherited from `BaseException`, and we get the variable number of arguments functionality, as well as `args` and the traceback. It works just like any standard Python exception.

We don't have to inherit from `Exception`, we can inherit from any exception type, including our own custom exceptions.

In [3]:
class ReadOnlyError(AttributeError):
    """Indicates an attribute is read-only"""

In [4]:
try:
    raise ReadOnlyError('Account number is read-only', 'BA10001')
except ReadOnlyError as ex:
    print(repr(ex))

ReadOnlyError('Account number is read-only', 'BA10001')


Often when we have a relatively complex application, we create our own hierarchy of exceptions, where we use some base exception for our application, and every other exception is a subclass of that exception.

For example, suppose we are writing a library that is used to scrape some web sites and extract product information and pricing.

Let's say our library's name is *WebScraper*.

We might first create a base exception for our library:

In [5]:
class WebScraperException(Exception):
    """Base exception for WebScraper"""

In [6]:
class HTTPException(WebScraperException):
    """General HTTP exception for WebScraper"""
    
class InvalidUrlException(HTTPException):
    """Indicates the url is invalid (dns lookup fails)"""
    
class TimeoutException(HTTPException):
    """Indicates a general timeout exception in http connectivity"""
    
class PingTimeoutException(TimeoutException):
    """Ping time out"""
    
class LoadTimeoutException(TimeoutException):
    """Page load time out"""
    
class ParserException(WebScraperException):
    """General page parsing exception"""

As you can see we have this hierarchy:

```
WebScraperException
   - HTTPException
       - InvalidUrlException
       - TimeoutException
           - PingTimeoutException
           - LoadTimeoutException
    - ParserException
```

Now someone using our library can expect to trap **any** exception we raise by catching the `WebScraperException` type, or anything more specific if they prefer:

In [7]:
try:
    raise PingTimeoutException('Ping to www.... timed out')
except HTTPException as ex:
    print(repr(ex))

PingTimeoutException('Ping to www.... timed out',)


or more broadly:

In [8]:
try:
    raise PingTimeoutException('Ping time out')
except WebScraperException as ex:
    print(repr(ex))

PingTimeoutException('Ping time out',)


So this is very useful when we write modules or packages and want to keep our exception hierarchy neatly contained with some base exception class. This way, users of our class are not forced to use `except Exception` to trap exceptions we might raise from inside our library.

Custom exception classes are like any custom class, which means we can add custom attributes, properties and methods to the class.

This might be useful to provide additional context and functionality to our exceptions.

For example, suppose we are writing a REST API. When we raise a custom exception, we'll also want to return an HTTP exception response to the API caller. We could write code like this in our API calls:

Suppose we need to retrieve an account (by ID) from a database. Here I'm just going to mock this:

In [9]:
class APIException(Exception):
    """Base API exception"""

In [10]:
class ApplicationException(APIException):
    """Indicates an application error (not user caused) - 5xx HTTP type errors"""
    
class DBException(ApplicationException):
    """General database exception"""
    
class DBConnectionError(DBException):
    """Indicates an error connecting to database"""
    
class ClientException(APIException):
    """Indicates exception that was caused by user, not an internal error"""
    
class NotFoundError(ClientException):
    """Indicates resource was not found"""

class NotAuthorizedError(ClientException):
    """User is not authorized to perform requested action on resource"""
    
    
class Account:
    def __init__(self, account_id, account_type):
        self.account_id = account_id
        self.account_type = account_type

So we have this exception hierarchy:

```
APIException
   - ApplicationException (5xx errors)
       - DBException
           - DBConnectionError
   - ClientException
       - NotFoundError
       - NotAuthorizedError
```

In [11]:
def lookup_account_by_id(account_id):
    # mock of various exceptions that could be raised getting an account from database
    if not isinstance(account_id, int) or account_id <= 0:
        raise ClientException(f'Account number {account_id} is invalid.')
        
    if account_id < 100:
        raise DBConnectionError('Permanent failure connecting to database.')
    elif account_id < 200:
        raise NotAuthorizedError('User does not have permissions to read this account')
    elif account_id < 300:
        raise NotFoundError(f'Account not found.')
    else:
        return Account(account_id, 'Savings')

Now suppose we have this endpoint for a **GET** on the **Account** resource, and we need to return the appropriate HTTP exception, and message to the user.

We're going to make use of the `HTTPStatus` enumeration we have seen before.

In [12]:
from http import HTTPStatus

In [13]:
def get_account(account_id):
    try:
        account = lookup_account_by_id(account_id)
    except ApplicationException as ex:
        return HTTPStatus.INTERNAL_SERVER_ERROR, str(ex)
    except NotFoundError as ex:
        return HTTPStatus.NOT_FOUND, 'The account {} does not exist.'.format(account_id)
    except NotAuthorizedError as ex:
        return HTTPStatus.UNAUTHORIZED, 'You do not have the proper authorization.'
    except ClientException as ex:
        return HTTPStatus.BAD_REQUEST, str(ex)
    else:
        return HTTPStatus.OK, {"id": account.account_id, "type": account.account_type}

Now when we call our end point with different account numbers:

In [14]:
get_account('abc')

(<HTTPStatus.BAD_REQUEST: 400>, 'Account number abc is invalid.')

In [15]:
get_account(50)

(<HTTPStatus.INTERNAL_SERVER_ERROR: 500>,
 'Permanent failure connecting to database.')

In [16]:
get_account(150)

(<HTTPStatus.UNAUTHORIZED: 401>, 'You do not have the proper authorization.')

In [17]:
get_account(250)

(<HTTPStatus.NOT_FOUND: 404>, 'The account 250 does not exist.')

In [18]:
get_account(350)

(<HTTPStatus.OK: 200>, {'id': 350, 'type': 'Savings'})

As you can see this was quite a lot of exception handling we had to do. And really, the HTTP status and message shoudl remain consistent with any exception type.

So instead of dealing with it the way we did, we are going to do the work in the exception classes themselves.

First we know we need an `HTTPStatus` for each exception, as well as an error message to present to our user that may need to be different from the internal error message we would want to log for example.

In [19]:
class APIException(Exception):
    """Base API exception"""
    http_status = HTTPStatus.INTERNAL_SERVER_ERROR
    internal_err_msg = 'API exception occurred.'
    user_err_msg = "We are sorry. An unexpected error occurred on our end."

Now having the default `internal_err_msg` and `user_err_msg` is great, but what if we ever wanted to override it for some reason?

Let's create an `__init__` to take care of that:

In [20]:
class APIException(Exception):
    """Base API exception"""
    
    http_status = HTTPStatus.INTERNAL_SERVER_ERROR
    internal_err_msg = 'API exception occurred.'
    user_err_msg = "We are sorry. An unexpected error occurred on our end."
    
    def __init__(self, *args, user_err_msg = None):
        if args:
            self.internal_err_msg = args[0]
            super().__init__(*args)
        else:
            super().__init__(self.internal_err_msg)
            
        if user_err_msg is not None:
            self.user_err_msg = user_err_msg

And we can use this exception quite easily:

In [21]:
try:
    raise APIException()
except APIException as ex:
    print(repr(ex))
    print(ex.user_err_msg)

APIException('API exception occurred.',)
We are sorry. An unexpected error occurred on our end.


Or with a custom (internal) message:

In [22]:
try:
    raise APIException('custom message...', 10, 20)
except APIException as ex:
    print(repr(ex))

APIException('custom message...', 10, 20)


And of course, the user message can be customized too:

In [23]:
try:
    raise APIException('custom message...', 10, 20, user_err_msg='custom user message')
except APIException as ex:
    print(repr(ex))
    print(ex.user_err_msg)

APIException('custom message...', 10, 20)
custom user message


While we're at it, we know that we'll need to return the same JSON format when an exception occurs - so let's write it into our base exception class:

In [24]:
import json

class APIException(Exception):
    """Base API exception"""
    
    http_status = HTTPStatus.INTERNAL_SERVER_ERROR
    internal_err_msg = 'API exception occurred.'
    user_err_msg = "We are sorry. An unexpected error occurred on our end."
    
    def __init__(self, *args, user_err_msg = None):
        if args:
            self.internal_err_msg = args[0]
            super().__init__(*args)
        else:
            super().__init__(self.internal_err_msg)
            
        if user_err_msg is not None:
            self.user_err_msg = user_err_msg
            
    def to_json(self):
        err_object = {'status': self.http_status, 'message': self.user_err_msg}
        return json.dumps(err_object)

Now we can easily use this base class, and get consistent results:

In [25]:
try:
    raise APIException()
except APIException as ex:
    print(repr(ex), ex.to_json())

APIException('API exception occurred.',) {"status": 500, "message": "We are sorry. An unexpected error occurred on our end."}


And because we'll want to log exceptions, let's also write a logger directly into our base class:

In [26]:
from datetime import datetime

class APIException(Exception):
    """Base API exception"""
    
    http_status = HTTPStatus.INTERNAL_SERVER_ERROR
    internal_err_msg = 'API exception occurred.'
    user_err_msg = "We are sorry. An unexpected error occurred on our end."
    
    def __init__(self, *args, user_err_msg = None):
        if args:
            self.internal_err_msg = args[0]
            super().__init__(*args)
        else:
            super().__init__(self.internal_err_msg)
            
        if user_err_msg is not None:
            self.user_err_msg = user_err_msg
    
    def to_json(self):
        err_object = {'status': self.http_status, 'message': self.user_err_msg}
        return json.dumps(err_object)
    
    def log_exception(self):
        exception = {
            "type": type(self).__name__,
            "http_status": self.http_status,
            "message": self.args[0] if self.args else self.internal_err_msg,
            "args": self.args[1:]
        }
        print(f'EXCEPTION: {datetime.utcnow().isoformat()}: {exception}')

In [27]:
try:
    raise APIException()
except APIException as ex:
    ex.log_exception()
    print(ex.to_json())

EXCEPTION: 2019-08-09T23:53:42.088051: {'type': 'APIException', 'http_status': <HTTPStatus.INTERNAL_SERVER_ERROR: 500>, 'message': 'API exception occurred.', 'args': ()}
{"status": 500, "message": "We are sorry. An unexpected error occurred on our end."}


Now let's finish up our hierarchy:

In [28]:
class ApplicationException(APIException):
    """Indicates an application error (not user caused) - 5xx HTTP type errors"""
    http_status = HTTPStatus.INTERNAL_SERVER_ERROR
    internal_err_msg = "Generic server side exception."
    user_err_msg = "We are sorry. An unexpected error occurred on our end."
    
class DBException(ApplicationException):
    """General database exception"""
    http_status = HTTPStatus.INTERNAL_SERVER_ERROR
    internal_err_msg = "Database exception."
    user_err_msg = "We are sorry. An unexpected error occurred on our end."
    
class DBConnectionError(DBException):
    """Indicates an error connecting to database"""
    http_status = HTTPStatus.INTERNAL_SERVER_ERROR
    internal_err_msg = "DB connection error."
    user_err_msg = "We are sorry. An unexpected error occurred on our end."
    
class ClientException(APIException):
    """Indicates exception that was caused by user, not an internal error"""
    http_status = HTTPStatus.BAD_REQUEST
    internal_err_msg = "Client submitted bad request."
    user_err_msg = "A bad request was received."
    
class NotFoundError(ClientException):
    """Indicates resource was not found"""
    http_status = HTTPStatus.NOT_FOUND
    internal_err_msg = "Resource was not found."
    user_err_msg = "Requested resource was not found."

class NotAuthorizedError(ClientException):
    """User is not authorized to perform requested action on resource"""
    http_status = HTTPStatus.UNAUTHORIZED
    internal_err_msg = "Client not authorized to perform operation."
    user_err_msg = "You are not authorized to perform this request."

Also, since we have a but more functionality available to us with our exceptions, let's refine the function that raises these exceptions:

In [29]:
def lookup_account_by_id(account_id):
    # mock of various exceptions that could be raised getting an account from database
    if not isinstance(account_id, int) or account_id <= 0:
        raise ClientException(f'Account number {account_id} is invalid.', 
                              f'account_id = {account_id}',
                              'type error - account number not an integer')
        
    if account_id < 100:
        raise DBConnectionError('Permanent failure connecting to database.', 'db=db01')
    elif account_id < 200:
        raise NotAuthorizedError('User does not have permissions to read this account', f'account_id={account_id}')
    elif account_id < 300:
        raise NotFoundError(f'Account not found.', f'account_id={account_id}')
    else:
        return Account(account_id, 'Savings')

Now we can re-write our API endpoint and very easily handle those exceptions:

In [30]:
def get_account(account_id):
    try:
        account = lookup_account_by_id(account_id)
    except APIException as ex:
        ex.log_exception()
        return ex.to_json()
    else:
        return HTTPStatus.OK, {"id": account.account_id, "type": account.account_type}

In [31]:
get_account('ABC')

EXCEPTION: 2019-08-09T23:53:43.380819: {'type': 'ClientException', 'http_status': <HTTPStatus.BAD_REQUEST: 400>, 'message': 'Account number ABC is invalid.', 'args': ('account_id = ABC', 'type error - account number not an integer')}


'{"status": 400, "message": "A bad request was received."}'

In [32]:
get_account(50)

EXCEPTION: 2019-08-09T23:53:43.569481: {'type': 'DBConnectionError', 'http_status': <HTTPStatus.INTERNAL_SERVER_ERROR: 500>, 'message': 'Permanent failure connecting to database.', 'args': ('db=db01',)}


'{"status": 500, "message": "We are sorry. An unexpected error occurred on our end."}'

In [33]:
get_account(150)

EXCEPTION: 2019-08-09T23:53:43.738034: {'type': 'NotAuthorizedError', 'http_status': <HTTPStatus.UNAUTHORIZED: 401>, 'message': 'User does not have permissions to read this account', 'args': ('account_id=150',)}


'{"status": 401, "message": "You are not authorized to perform this request."}'

In [34]:
get_account(250)

EXCEPTION: 2019-08-09T23:53:43.934897: {'type': 'NotFoundError', 'http_status': <HTTPStatus.NOT_FOUND: 404>, 'message': 'Account not found.', 'args': ('account_id=250',)}


'{"status": 404, "message": "Requested resource was not found."}'

In [35]:
get_account(350)

(<HTTPStatus.OK: 200>, {'id': 350, 'type': 'Savings'})

#### Inheriting from Multiple Exceptions

We haven't covered multiple inheritance yet, but Python supports it, and it is very easy to use to solve a specific problem we may encounter with exceptions, so i want to mention it here.

Although we may want to raise a custom exception for some specific error, sometimes we may be wondering whether to raise a built-in exception that would work just as well, or raise a custom exception.

Here's an example of where this might occur:

Suppose we have a custom exception we use to tell a user of our function/library that the value they provided to some function is not the right value - maybe it needs to be a integer greater than or equal to 0.

We might have a custom exception just for that - remember what we discussed earlier, we might want our application to raise custom exceptions for everything, based off some application base exception our users could broadly trap.

In [36]:
class AppException(Exception):
    """generic application exception"""
    
class NegativeIntegerError(AppException):
    """Used to indicate an error when an integer is negative."""

In [37]:
def set_age(age):
    if age < 0:
        raise NegativeIntegerError('age cannot be negative')

In [38]:
try:
    set_age(-10)
except NegativeIntegerError as ex:
    print(repr(ex))

NegativeIntegerError('age cannot be negative',)


But the problem is that this is also a `ValueError`, and our users may want to trap it as a `ValueError` for some reason, not a `NegativeIntegerError` (or `AppException` as is possible here).

The beauty of multiple inheritance is that we can have our custom exception inherit from **more than one** exception.

All we need to understand here, is that if we inherit from more than one class, then our subclass is considered a subclass of **both** parents.

In [39]:
class BaseClass1:
    pass

class BaseClass2:
    pass

class MyClass(BaseClass1, BaseClass2):
    pass

In [40]:
issubclass(MyClass, BaseClass1)

True

In [41]:
issubclass(MyClass, BaseClass2)

True

So, we can do the same thing with our exception:

In [42]:
class NegativeIntegerError(AppException, ValueError):
    """Used to indicate an error when an integer is negative."""

Now this exception is a subclass of **both** `AppException` and `ValueError`:

In [43]:
issubclass(NegativeIntegerError, AppException)

True

In [44]:
issubclass(NegativeIntegerError, ValueError)

True

And we can trap it with either of those exception types:

In [45]:
def set_age(age):
    if age < 0:
        raise NegativeIntegerError('age cannot be negative')

In [46]:
try:
    set_age(-10)
except NegativeIntegerError as ex:
    print(repr(ex))

NegativeIntegerError('age cannot be negative',)


In [47]:
try:
    set_age(-10)
except ValueError as ex:
    print(repr(ex))

NegativeIntegerError('age cannot be negative',)


So this solves the problem - deciding between a custom exception vs a standard exception - we can just use both (or more!)

# Section 13 - Project 6

##  Project 6 - Solution

```
1. Supplier exceptions
    a. Not manufactured anymore
    b. Production delayed
    c. Shipping delayed
    
2. Checkout exceptions
    a. Inventory type exceptions
        - out of stock
    b. Pricing exceptions
        - invalid coupon code
        - cannot stack coupons
```

In [1]:
from datetime import datetime 

class WidgetException(Exception):
    message = 'Generic Widget exception.'
    
    def __init__(self, *args, customer_message=None):
        super().__init__(args)
        if args:
            self.message = args[0]
        self.customer_message = customer_message if customer_message is not None else self.message
        
    def log_exception(self):
        exception = {
            "type": type(self).__name__,
            "message": self.message,
            "args": self.args[1:]
        }
        print(f'EXCEPTION: {datetime.utcnow().isoformat()}: {exception}')

In [2]:
ex1 = WidgetException('some custom message', 10, 100)
ex2 = WidgetException(customer_message='A custom user message.')

In [3]:
ex1.log_exception()

EXCEPTION: 2019-08-15T05:25:05.724235: {'type': 'WidgetException', 'message': 'some custom message', 'args': ()}


In [4]:
ex2.log_exception()

EXCEPTION: 2019-08-15T05:25:05.732242: {'type': 'WidgetException', 'message': 'Generic Widget exception.', 'args': ()}


Now we can create our hierarchy, and override the appropriate values for `message` to make it more specific:

In [5]:
class SupplierException(WidgetException):
    message = 'Supplier exception.'

class NotManufacturedException(SupplierException):
    message = 'Widget is no longer manufactured by supplier.'
    
class ProductionDelayedException(SupplierException):
    message = 'Widget production has been delayed by supplier.'
    
class ShippingDelayedException(SupplierException):
    message = 'Widget shipping has been delayed by supplier.'
    
class CheckoutException(WidgetException):
    message = 'Checkout exception.'
    
class InventoryException(CheckoutException):
    message = 'Checkout inventory exception.'
    
class OutOfStockException(InventoryException):
    message = 'Inventory out of stock'
    
class PricingException(CheckoutException):
    message = 'Checkout pricing exception.'
    
class InvalidCouponCodeException(PricingException):
    message = 'Invalid checkout coupon code.'
    
class CannotStackCouponException(PricingException):
    message = 'Cannot stack checkout coupon codes.'

And now we can use any of these exceptions in our code, and use the defined "logger" we implemented:

In [6]:
try:
    raise CannotStackCouponException()
except WidgetException as ex:
    ex.log_exception()
    raise

EXCEPTION: 2019-08-15T05:25:05.748971: {'type': 'CannotStackCouponException', 'message': 'Cannot stack checkout coupon codes.', 'args': ()}


CannotStackCouponException: ()

Next let's add the http status codes we want to assign to each exception type.

In [7]:
from http import HTTPStatus

In [8]:
class WidgetException(Exception):
    message = 'Generic Widget exception.'
    http_status = HTTPStatus.INTERNAL_SERVER_ERROR
    
    def __init__(self, *args, customer_message=None):
        super().__init__(*args)
        if args:
            self.message = args[0]
        self.customer_message = customer_message if customer_message is not None else self.message
        
    def log_exception(self):
        exception = {
            "type": type(self).__name__,
            "message": self.message,
            "args": self.args[1:]
        }
        print(f'EXCEPTION: {datetime.utcnow().isoformat()}: {exception}')

Before we redefine our child classes, let's also implement the `to_json` function that we can use to send back to our users:

In [9]:
import json

In [10]:
class WidgetException(Exception):
    message = 'Generic Widget exception.'
    http_status = HTTPStatus.INTERNAL_SERVER_ERROR
    
    def __init__(self, *args, customer_message=None):
        super().__init__(*args)
        if args:
            self.message = args[0]
        self.customer_message = customer_message if customer_message is not None else self.message
        
    def log_exception(self):
        exception = {
            "type": type(self).__name__,
            "message": self.message,
            "args": self.args[1:]
        }
        print(f'EXCEPTION: {datetime.utcnow().isoformat()}: {exception}')
        
    def to_json(self):
        response = {
            'code': self.http_status.value,
            'message': '{}: {}'.format(self.http_status.phrase, self.customer_message),
            'category': type(self).__name__,
            'time_utc': datetime.utcnow().isoformat()            
        }
        return json.dumps(response)

In [11]:
e = WidgetException('same custom message for log and user')

In [12]:
e.log_exception()

EXCEPTION: 2019-08-15T05:25:13.484482: {'type': 'WidgetException', 'message': 'same custom message for log and user', 'args': ()}


In [13]:
json.loads(e.to_json())

{'code': 500,
 'message': 'Internal Server Error: same custom message for log and user',
 'category': 'WidgetException',
 'time_utc': '2019-08-15T05:25:13.650056'}

In [14]:
e = WidgetException('custom internal message', customer_message='custom user message')

In [15]:
e.log_exception()

EXCEPTION: 2019-08-15T05:25:13.973345: {'type': 'WidgetException', 'message': 'custom internal message', 'args': ()}


In [16]:
e.to_json()

'{"code": 500, "message": "Internal Server Error: custom user message", "category": "WidgetException", "time_utc": "2019-08-15T05:25:14.136676"}'

Now for the bonus exercise - I asked you to try and log the stack trace as well.

To do that we could cannot simply use the `str` or `repr` of the  `__traceback__` property of the exception:

In [17]:
try:
    raise WidgetException('custom error message')
except WidgetException as ex:
    print(repr(ex.__traceback__))

<traceback object at 0x7fecb03b1f88>


Instead we can use the `traceback` module:

In [18]:
import traceback

In [19]:
try:
    raise ValueError
except ValueError:
    try:
        raise WidgetException('custom error message')
    except WidgetException as ex:
        print(list(traceback.TracebackException.from_exception(ex).format()))

['Traceback (most recent call last):\n', '  File "<ipython-input-19-2a9225338511>", line 2, in <module>\n    raise ValueError\n', 'ValueError\n', '\nDuring handling of the above exception, another exception occurred:\n\n', 'Traceback (most recent call last):\n', '  File "<ipython-input-19-2a9225338511>", line 5, in <module>\n    raise WidgetException(\'custom error message\')\n', 'WidgetException: custom error message\n']


So we can use that to implement logging the traceback. What would be nice too would be to expose the formatted traceback in our exception class while we're at it.

Since tracebacks can be huge, we're not going to materialize the traceback generator in that property (we'll still have to when we log the exception):

In [20]:
class WidgetException(Exception):
    message = 'Generic Widget exception.'
    http_status = HTTPStatus.INTERNAL_SERVER_ERROR
    
    def __init__(self, *args, customer_message=None):
        super().__init__(*args)
        if args:
            self.message = args[0]
        self.customer_message = customer_message if customer_message is not None else self.message
        
    @property
    def traceback(self):
        return traceback.TracebackException.from_exception(self).format()
    
    def log_exception(self):
        exception = {
            "type": type(self).__name__,
            "message": self.message,
            "args": self.args[1:],
            "traceback": list(self.traceback)
        }
        print(f'EXCEPTION: {datetime.utcnow().isoformat()}: {exception}')
        
    def to_json(self):
        response = {
            'code': self.http_status.value,
            'message': '{}: {}'.format(self.http_status.phrase, self.customer_message),
            'category': type(self).__name__,
            'time_utc': datetime.utcnow().isoformat()            
        }
        return json.dumps(response)

In [21]:
try:
    raise WidgetException('custom internal message', customer_message='custom user message')
except WidgetException as ex:
    ex.log_exception()
    print('------------')
    print(ex.to_json())

EXCEPTION: 2019-08-15T05:25:15.569467: {'type': 'WidgetException', 'message': 'custom internal message', 'args': (), 'traceback': ['Traceback (most recent call last):\n', '  File "<ipython-input-21-472686457160>", line 2, in <module>\n    raise WidgetException(\'custom internal message\', customer_message=\'custom user message\')\n', 'WidgetException: custom internal message\n']}
------------
{"code": 500, "message": "Internal Server Error: custom user message", "category": "WidgetException", "time_utc": "2019-08-15T05:25:15.569634"}


What's nice now, is that we could just print the traceback wihout logging the exception:

In [22]:
try:
    a = 1 / 0
except ZeroDivisionError:
    try:
        raise WidgetException()
    except WidgetException as ex:
        print(''.join(ex.traceback))

Traceback (most recent call last):
  File "<ipython-input-22-2212fef7bb30>", line 2, in <module>
    a = 1 / 0
ZeroDivisionError: division by zero

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<ipython-input-22-2212fef7bb30>", line 5, in <module>
    raise WidgetException()
WidgetException



Now we can define our exception sub types, including the http status for each:

In [23]:
class SupplierException(WidgetException):
    message = 'Supplier exception.'
    http_status = HTTPStatus.INTERNAL_SERVER_ERROR

class NotManufacturedException(SupplierException):
    message = 'Widget is no longer manufactured by supplier.'
    http_status = HTTPStatus.INTERNAL_SERVER_ERROR
    
class ProductionDelayedException(SupplierException):
    message = 'Widget production has been delayed by supplier.'
    http_status = HTTPStatus.INTERNAL_SERVER_ERROR
    
class ShippingDelayedException(SupplierException):
    message = 'Widget shipping has been delayed by supplier.'
    http_status = HTTPStatus.INTERNAL_SERVER_ERROR
    
class CheckoutException(WidgetException):
    message = 'Checkout exception.'
    http_status = HTTPStatus.INTERNAL_SERVER_ERROR
    
class InventoryException(CheckoutException):
    message = 'Checkout inventory exception.'
    http_status = HTTPStatus.INTERNAL_SERVER_ERROR
    
class OutOfStockException(InventoryException):
    message = 'Inventory out of stock'
    http_status = HTTPStatus.INTERNAL_SERVER_ERROR
    
class PricingException(CheckoutException):
    message = 'Checkout pricing exception.'
    http_status = HTTPStatus.INTERNAL_SERVER_ERROR
    
class InvalidCouponCodeException(PricingException):
    message = 'Invalid checkout coupon code.'
    http_status = HTTPStatus.BAD_REQUEST
    
class CannotStackCouponException(PricingException):
    message = 'Cannot stack checkout coupon codes.'
    http_status = HTTPStatus.BAD_REQUEST

In [24]:
e = InvalidCouponCodeException('User tried to use an old coupon', customer_message='Sorry. This coupon has expired.')

In [25]:
e.log_exception()

EXCEPTION: 2019-08-15T05:25:16.939141: {'type': 'InvalidCouponCodeException', 'message': 'User tried to use an old coupon', 'args': (), 'traceback': ['InvalidCouponCodeException: User tried to use an old coupon\n']}


In [26]:
e.to_json()

'{"code": 400, "message": "Bad Request: Sorry. This coupon has expired.", "category": "InvalidCouponCodeException", "time_utc": "2019-08-15T05:25:17.108099"}'

As you can see our traceback was empty above (the exception is present, but there is no call stack) - because we did not actually raise the exception!

In [27]:
try:
    raise ValueError
except ValueError:
    try:
        raise InvalidCouponCodeException(
            'User tried to use an old coupon', customer_message='Sorry. This coupon has expired.'
        )
    except InvalidCouponCodeException as ex:
        ex.log_exception()
        print('------------')
        print(ex.to_json())
        print('------------')
        print(''.join(ex.traceback))

EXCEPTION: 2019-08-15T05:25:17.441852: {'type': 'InvalidCouponCodeException', 'message': 'User tried to use an old coupon', 'args': (), 'traceback': ['Traceback (most recent call last):\n', '  File "<ipython-input-27-775351168ae0>", line 2, in <module>\n    raise ValueError\n', 'ValueError\n', '\nDuring handling of the above exception, another exception occurred:\n\n', 'Traceback (most recent call last):\n', '  File "<ipython-input-27-775351168ae0>", line 6, in <module>\n    \'User tried to use an old coupon\', customer_message=\'Sorry. This coupon has expired.\'\n', 'InvalidCouponCodeException: User tried to use an old coupon\n']}
------------
{"code": 400, "message": "Bad Request: Sorry. This coupon has expired.", "category": "InvalidCouponCodeException", "time_utc": "2019-08-15T05:25:17.442103"}
------------
Traceback (most recent call last):
  File "<ipython-input-27-775351168ae0>", line 2, in <module>
    raise ValueError
ValueError

During handling of the above exception, another

##  Project 6 - Exceptions

Suppose we have a Widget online sales application and we are writing the backend for it. We want a base `WidgetException` class that we will use as the base class for all our custom exceptions we raise from our Widget application.

Furthermore we have determined that we will need the following categories of exceptions:

```
1. Supplier exceptions
    a. Not manufactured anymore
    b. Production delayed
    c. Shipping delayed
    
2. Checkout exceptions
    a. Inventory type exceptions
        - out of stock
    b. Pricing exceptions
        - invalid coupon code
        - cannot stack coupons
```

Write an exception class hierarchy to capture this. In addition, we would like to implement the following functionality:
* implement separate internal error message and user error message
* implement an http status code associated to each exception type (keep it simple, use a 500 (server error) error for everything except invalid coupon code, and cannot stack coupons, these can be 400 (bad request)
* implement a logging function that can be called to log the exception details, time it occurred, etc.
* implement a function that can be called to produce a json string containing the exception details you want to display to your user (include http status code (e.g. 400), the user error message, etc)

##### Bonus

Log the traceback too - you'll have to do a bit of research for that! 

I'm going to use the `TracebackException` class in the `traceback` module:

https://docs.python.org/3/library/traceback.html#tracebackexception-objects

In particular, look at the class method `from_exception` (and remember that inside your exception class, the exception will be `self`!) and the `format` instance method. That method returns a generator, so you'll need to `list` it to print out everything in that traceback.

Good luck!

# Section 14 - Metaprogramming

##  Decorators and Descriptors - Review

#### Decorators

Decorators are in fact a form of metaprogramming.

Decorators are pieces of code that modify the behavior of another piece of code.

For example, if we want to write a "super duper debugger", that prints out every function call and the arguments it was called with, we can easily modify (decorate) any function we want to "debug" without modifying the function body directly:

In [1]:
from functools import wraps

In [2]:
def debugger(fn):
    @wraps(fn)
    def inner(*args, **kwargs):
        print(f'{fn.__qualname__}', args, kwargs)
        return fn(*args, **kwargs)
    return inner

And now we can just decorate our functions:

In [4]:
@debugger
def func_1(*args, **kwargs):
    pass

@debugger
def func_2(*args, **kwargs):
    pass

In [5]:
func_1(10, 20, kw='a')

func_1 (10, 20) {'kw': 'a'}


In [6]:
func_2(10)

func_2 (10,) {}


The advantage of this decorator approach is that if we want to modify our debugger output, we only need to modify the decorator function once, and when we re-run our program the new changes take effect for every decorated function.

#### Descriptors

Although it may not seem like it, descriptors are also a form of metaprogramming.

Descriptors essentially allow us to modify the behavior of the dot (`.`) operator.

If we have a simple class like so:

In [7]:
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

Then the dot operator works directly against the object's dictionary (namespace):

In [8]:
p = Point(10, 20)

In [9]:
p.x

10

In [10]:
p.x=100

In [11]:
p.__dict__

{'x': 100, 'y': 20}

So here, `p.x` was basically referencing the instance dictionary. But descriptors allows us to essentially redefine how the `.` operator works.

We saw properties, and properties are based on descriptors, but they are not always conducive to DRY code as we saw earlier.

Let's say we want to provide type checking on the `x` and `y` components of the `Point` class.

We can use a data descriptor to essentially modify the way the `.` operator works by passing it through getter and setter (and deleter) functions - and we also eliminate repetitive code:

In [14]:
class IntegerField:
    def __set_name__(self, owner, name):
        self.name = name
        
    def __get__(self, instance, owner):
        return instance.__dict__.get(self.name, None)
    
    def __set__(self, instance, value):
        if not isinstance(value, int):
            raise TypeError('Must be an integer.')
        instance.__dict__[self.name] = value

In [15]:
class Point:
    x = IntegerField()
    y = IntegerField()
    
    def __init__(self, x, y):
        self.x = x
        self.y = y

In [16]:
p = Point(10, 20)

In [17]:
p.x, p.y

(10, 20)

In [18]:
try:
    p.x = 10.5
except TypeError as ex:
    print(ex)

Must be an integer.


So, without changing the interface of our class, we replaced the default functionality of the `.` operator with another piece of code (that implemented the descriptor protocol).

We'll come back to decorators, and see how we can actually decorate entire classes (so-called **class decorators**).

##  The `__new__` Method

We've studied the `__init__` method quite a bit so far. It is basically a method that gets called right after the class instance has been created, usually invoked when we call the Class with arguments to instantiate an instance.

The `__new__` method is the method that is invoked to actually create the new object, as an instance of the desired class.

Since the `object` class provides a default implementation for `__new__` we rarely have to bother with it, but sometimes we want to intercept the instance creation to tweak things a bit.

The `__new__` method, unlike the `__init__` method is actually a **static** method, not an **instance** method. Which kinds of make sense since the instance does not exist yet - that's what the `__new__` method is trying to create.

Why it's not a **class** method is more complicated. We'll see why that's the case as we explore `__new__`.

Remember how we create instances of a class - we call the class with whatever arguments we need to initialize the class state:

```
p = Person(name, age)
```

The creation of the class instance is then done in two steps:
1. The `__new__` method is called. It receives, as arguments, the class object we want an instance of, and any additional arguments we pass to the creation call (e.g. `name` and `age`). It should return a new instance of the class (and it may have used the arguments to initialize stuff in the class too, that's up to how you write your `__new__ method)
2. If the object returned by `__new__` is an instance of the class specified in the call to `__new__`, then the `__init__` method is also called. The `__init__` method is an instance method and does not return anything (well, it returns None).

The `__new__` method is present in the `object` class, so we can easily use it to create an instance of a class, without calling the class itself.

Let's take a look:

In [1]:
class Point:
    pass

In [2]:
p = object.__new__(Point)

In [3]:
type(p)

__main__.Point

So as you see, we created an instance of `Point` by using the `__new__` method defined in `object`.

Let's take that a step further and include the initialization as well:

In [4]:
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

In [5]:
p = object.__new__(Point)

In [6]:
p.__init__(10, 20)

In [7]:
p.__dict__

{'x': 10, 'y': 20}

One thing to note is that although `object.__new__` will accept `*args` and `**kwargs` it does not actually use them:

In [8]:
p = object.__new__(Point, 10, 20)

In [9]:
p.__dict__

{}

Remember that this automatic chaining of `__new__` and `__init__` happens when we create a class using it as a callable (e.g.  `Person(10, 20)`).

So, since `__new__` is just another method, we can choose to override it in our custom classes.

In [10]:
class Point:
    def __new__(cls, x, y):
        print('Creating instance...', x, y)
        instance = object.__new__(cls)  # delegate to object.__new__
        return instance  # don't forget to return the new instance!
    
    def __init__(self, x, y):
        print('Initializing instance...', x, y)
        self.x = x
        self.y = y

In [11]:
p = Point(10, 20)

Creating instance... 10 20
Initializing instance... 10 20


What's interesting also about the `__new__` method is that we can override it even when we inherit from the **built-in** types, whereas often the same does not work with `__init__` (we'll come back to the topic of inheriting from built-in types later when we look at abstract base classes.)

Let's see an example of this:

In [12]:
class Squared(int):
    def __new__(cls, x):
        return super().__new__(cls, x**2)  # delegate creating an int instance to the int class itself

In [13]:
result = Squared(4)

In [14]:
result

16

Anf of course, the `type` of result is `Squared`:

In [15]:
type(result)

__main__.Squared

But, it is **also** an `int` since we **inherited** from `int`:

In [16]:
isinstance(result, int)

True

Trying to do this using `__init__` would not work - the built-in `__init__` for integers does not actually do anything, and does not allow for an argument to be passed:

In [17]:
class Squared(int):
    def __init__(self, x):
        print('calling init...')
        super().__init__(x ** 2)

In [18]:
try:
    result = Squared(4)
except TypeError as ex:
    print(ex)

calling init...
object.__init__() takes exactly one argument (the instance to initialize)


Most often when we override the `__new__` method we use delegation to the parent class to do some of the work. But of course, as we saw just now we don't have to, we can just use `object.__new__` directly. 

Here's how we did it using `object` explicitly:

In [19]:
class Person:
    def __new__(cls, name):
        print(f'Person: Instantiating {cls.__name__}...')
        instance = object.__new__(cls)
        return instance
        
    def __init__(self, name):
        print(f'Person: Initializing instance...')
        self.name = name

In [20]:
p = Person('Guido')

Person: Instantiating Person...
Person: Initializing instance...


But the problem here is that this technique does not play well with inheritance.

Let's do the same thing with a sub class of `Person`:

In [21]:
class Student(Person):
    def __new__(cls, name, major):
        print(f'Student: Instantiating {cls.__name__}...')
        instance = object.__new__(cls)
        return instance
    
    def __init__(self, name, major):
        print(f'Student: Initializing instance...')
        super().__init__(name)
        self.major = major

In [22]:
s = Student('John', 'Major')

Student: Instantiating Student...
Student: Initializing instance...
Person: Initializing instance...


You'll notice that the `__new__` method of `Person` was not called - that's because we called `object.__new__` directly.

So instead we really should do it this way:

In [23]:
class Person:
    def __new__(cls, name):
        print(f'Person: Instantiating {cls.__name__}...')
        instance = super().__new__(cls)
        return instance
        
    def __init__(self, name):
        print(f'Person: Initializing instance...')
        self.name = name
        
class Student(Person):
    def __new__(cls, name, major):
        print(f'Student: Instantiating {cls.__name__}...')
        instance = super().__new__(cls, name)
        return instance
    
    def __init__(self, name, major):
        print(f'Student: Initializing instance...')
        super().__init__(name)
        self.major = major

In [24]:
s = Student('John', 'Major')

Student: Instantiating Student...
Person: Instantiating Student...
Student: Initializing instance...
Person: Initializing instance...


So why override `__new__`? We saw one example where we can inherit from a built-in type and modify the behavior (the `Squared` class - the value is still an `int`, since we inherited from `int`

It allows us to tweak how the class is created. For example we could inject some extra attributes onto the class before creating the instance:

In [25]:
class Square:
    def __new__(cls, w, l):
        cls.area = lambda self: self.w * self.l
        # or use setattr(cls, 'area', lambda self: self.w * self.l)
        instance = super().__new__(cls)  
        return instance
    
    def __init__(self, w, l):
        self.w = w
        self.l = l

In [26]:
s = Square(3, 4)

In [27]:
s.area()

12

As you see we injected a function into the class before creating it. We could also tweak the instance before returning it.

In [28]:
class Square:
    def __new__(cls, w, l):
        setattr(cls, 'area', lambda self: self.w * self.l)
        instance = super().__new__(cls)
        instance.w = w
        instance.l = l
        return instance

Notice that since we are setting the instance variables inside the `__new__`, we don't even need to provide an override for the `__init__`.

In [29]:
s = Square(3, 4)

In [30]:
s.__dict__

{'w': 3, 'l': 4}

In [31]:
s.area()

12

Keep in mind that `__new__` is a static method, and we can also invoke it explicitly ourselves - we just need to remember that we need to pass the class (type) we want to create an instance of to the `__new__` method as the first argument.

In [32]:
s = Square.__new__(Square, 3, 4)

In [33]:
s.__dict__, s.area()

({'w': 3, 'l': 4}, 12)

Another important point. Remember that I said that when we call `MyClass(args, kwargs)`, it will essentially call:
```
MyClass.__new__(MyClass, args, kwargs)
```

But that's not the only thing that happens - the `__init__` is also automatically called right after.

**But only if the type returned by `__new__` matches the type specified as the first argument of `__new__`**

Let's see this:

In [34]:
class Person:
    def __new__(cls, name):
        print(f'Creating instance of {cls.__name__}... not really...')
        instance = str(name)
        return instance

In [35]:
p = Person('Alex')

Creating instance of Person... not really...


In [36]:
p, type(p)

('Alex', str)

As you can see, we requested a new instance of `Person`, but `__new__` ignored that and created an instance of `str` instead.

Now let's add an init method:

In [37]:
class Person:
    def __new__(cls, name):
        print(f'Creating instance of {cls.__name__}... not really...')
        instance = str(name)
        return instance
    
    def __init__(self, name):
        print('Init called...')
        self.name = name

In [38]:
p = Person('Raymond')

Creating instance of Person... not really...


In [39]:
type(p), p

(str, 'Raymond')

As you can see the `__init__` was not called - and that makes sense - if `__new__` is not returning an instance of `Person` it does not make sense to invoke the `__init__` for `Person`, nor for the newly created instance (the signature might not even be compatible!)

One more point to make, is that if we override the `__new__` method, there is probably no reason to also override the `__init__` method, since we can take care of any custom initialization in the `__new__` method ourselves.

In [40]:
class Person:
    def __new__(cls, name, age):
        instance = super().__new__(cls)
        instance.name = name
        instance.age = age
        return instance

In [41]:
p = Person('Guido', 42)

In [42]:
p.__dict__

{'name': 'Guido', 'age': 42}

##  How are Classes Constructed?

When we write a class such as this:

In [1]:
import math

class Circle(object):
    def __init__(self, x, y, r):
        self.x = x
        self.y = y
        self.r = r
        
    def area(self):
        return math.pi * self.r ** 2

Remember that a class is an **instance** of the `type` class:

In [2]:
type(Circle)

type

And `type` is a class itself, so it is callable (with some arguments), and is used to create classes, instances of the `type` class.

There are four main steps involved with creating instances of a class:

1. The class body is extracted - think of it as just a lump of text that contains code.
2. The class dictionary (used for the **class** state) is created for the class namespace
3. The body (extracted in 1), is executed in the class namespace (created in 2), thereby populating the class dictionary (in this case with two symbols, `__init__` and `area`)
4. A new `type` **instance** is constructed using the name of the class, the base classes (remember Python supports multiple inheritance), and that dictionary.

Let's actually step through this process manually ourselves:

First we need to look at the `exec` built-in method:

Let's try it out with a simple example first:

In [3]:
namespace = {}

exec('''
a = 10
b = 20
''', globals(), namespace)

And now let's see what's in the `namespace` dictionary:

In [4]:
namespace

{'a': 10, 'b': 20}

As you can see, that dictionary was used as the local namespace when the code (in the string) was executed. Of course, the code can contain any valid Python code, including function definitions:

In [5]:
exec('''
def add(a, b):
    return a + b
    
def mul(a, b):
    return a * b
''', globals(), namespace)

In [6]:
namespace

{'a': 10,
 'b': 20,
 'add': <function __main__.add(a, b)>,
 'mul': <function __main__.mul(a, b)>}

And we can use those functions, since now they are actual function objects in the namespace (dictionary):

In [7]:
namespace['add'](10, 20)

30

Remember what I told you about the class body scope? Well, this is it! And you should now understand why functions defined in that scope do not actually know anything about what else is in that scope - those functions are created independently of the dictionary into which they are inserted.

So, this is how we are going to "run" the class **body** in the context of the class namespace dictionary.

We'll also need to create a new `type` instance, so let's see what the signature for the `type` constructor is:

In [8]:
help(type)

Help on class type in module builtins:

class type(object)
 |  type(object_or_name, bases, dict)
 |  type(object) -> the object's type
 |  type(name, bases, dict) -> a new type
 |  
 |  Methods defined here:
 |  
 |  __call__(self, /, *args, **kwargs)
 |      Call self as a function.
 |  
 |  __delattr__(self, name, /)
 |      Implement delattr(self, name).
 |  
 |  __dir__(self, /)
 |      Specialized __dir__ implementation for types.
 |  
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |  
 |  __init__(self, /, *args, **kwargs)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  __instancecheck__(self, instance, /)
 |      Check if an object is an instance.
 |  
 |  __repr__(self, /)
 |      Return repr(self).
 |  
 |  __setattr__(self, name, value, /)
 |      Implement setattr(self, name, value).
 |  
 |  __sizeof__(self, /)
 |      Return memory consumption of the type object.
 |  
 |  __subclasscheck__(self, subclass, /)
 |     

The constructor variant we are interested in is the third one. That one requires three things:
1. the `name` of the class
2. an tuple containing the `bases` - the classes this class inherits from (can be empty, in which case it just inherits from `object`)
3. the class namespace `dict`

Remember when I said that classes were basically just dictionaries? As you can see here, apart from the `name` and `bases`, all the functionality of the class is stored in the namespace dictionary!!

So now, let's go ahead and create our `Circle` class using this approach:

In [9]:
class_name = 'Circle'

In [10]:
class_body = """
def __init__(self, x, y, r):
    self.x = x
    self.y = y
    self.r = r

def area(self):
    return math.pi * self.r ** 2
"""

In [11]:
class_bases = ()  # defaults to object

In [12]:
class_dict = {}

In [13]:
exec(class_body, globals(), class_dict)

Now that we have executed that code in that namespace, that dictionary has some content:

In [14]:
class_dict

{'__init__': <function __main__.__init__(self, x, y, r)>,
 'area': <function __main__.area(self)>}

And we can now create the `Circle` class, or type, by creating a new instance of `type`:

In [15]:
Circle = type(class_name, class_bases, class_dict)

In [16]:
Circle

__main__.Circle

In [17]:
type(Circle)

type

In [18]:
Circle.__dict__

mappingproxy({'__init__': <function __main__.__init__(self, x, y, r)>,
              'area': <function __main__.area(self)>,
              '__module__': '__main__',
              '__dict__': <attribute '__dict__' of 'Circle' objects>,
              '__weakref__': <attribute '__weakref__' of 'Circle' objects>,
              '__doc__': None})

As you can see the `Circle` namespace dict contains our functions `__init__` and `area`.

And we now have a `Circle` class that we can use just like before:

In [19]:
c = Circle(0, 0, 1)

In [20]:
c.x, c.y, c.r

(0, 0, 1)

In [21]:
c.area()

3.141592653589793

So as you can see, we use the `type` class to construct new types (classes), basically creating instances of `type`.

This is why we refer to `type` as a **metaclass**. It is a class used to construct classes.

Also, make sure you understand that `type` is callable in two different ways - depending on what arguments are passed to `type()` it will do different things:

Creates a new `type` instance:

In [22]:
Circle = type(class_name, class_bases, class_dict)

Returns the `type` of an object:

In [23]:
type(Circle)

type

##  Inheriting from type

In the last lectures, we saw how classes can be created by calling the `type` class.

But what if we wanted to use something other than `type` to construct classes?

Since `type` is a class, maybe we could define a class that inherits from `type` (so we can leverage the actual type creation process), and override some things that would enable us to inject something in the class creation process.

Here we want to intercept the creation of the `type` instance before it is created, so we would want to use the `__new__` method.

Remember that the `__new__` method basically needs to build and return the new instance. So we'll do the customizations we want, but ultimately we'll punt (delegate) to the `type` class to do the actual work, just adding the tweaks (before and/or after the class creation) we want.

Just a quick reminder of how the static `__new__` method works in general:

In [1]:
class Test:
    def __new__(cls, *args, **kwargs):
        print(f'New instance of {cls} being created with these values:', args, kwargs)

In [2]:
t = Test(10, 20, kw='a')

New instance of <class '__main__.Test'> being created with these values: (10, 20) {'kw': 'a'}


And it's really the same as doing this:

In [3]:
Test.__new__(Test, 10, 20, kw='a')

New instance of <class '__main__.Test'> being created with these values: (10, 20) {'kw': 'a'}


Of course, it's now up to us to return an object from the `__new__` function.

So, instead of calling `type` to create the class (type), let's create a custom type generator by subclassing `type`.

We'll inherit from `type`, and override the `__new__` function to create the instance of the class.

In [4]:
import math

class CustomType(type):
    def __new__(cls, name, bases, class_dict):
        # above is the signature that type.__new__ has - 
        # and args are collected and passed by Python when we call a class (to create an instance of that class)
        # we'll see where those actually come from later
        print('Customized type creation!')
        cls_obj = super().__new__(cls, name, bases, class_dict)  # delegate to super (type in this case)
        cls_obj.circ = lambda self: 2 * math.pi * self.r  # basically attaching a function to the class
        return cls_obj

Now let's go through the same process to create our `Circle` class that we used in the last lecture, the manual way, but using `CustomType` instead of `type`.

In [5]:
class_body = """
def __init__(self, x, y, r):
    self.x = x
    self.y = y
    self.r = r

def area(self):
    return math.pi * self.r ** 2
"""

And we create our class dictionary by executing the above code in the context of that dictionary:

In [6]:
class_dict = {}
exec(class_body, globals(), class_dict)

Then we create the `Circle` class:

In [7]:
Circle = CustomType('Circle', (), class_dict)

Customized type creation!


We basically customized the class creation, and `Circle` is just a standard object, but, as you can see below, the type of our class is no longer `type`, but `CustomType`.

In [8]:
type(Circle)

__main__.CustomType

Of course, `Circle` is still an instance of `type` since `CustomType` is a subclass of `type`:

In [9]:
isinstance(Circle, CustomType), issubclass(CustomType, type)

(True, True)

And just like before, `Circle` still has the `__init__` and `area` methods:

In [10]:
hasattr(Circle, '__init__'), hasattr(Circle, 'area')

(True, True)

So we can use `Circle` just as normal:

In [11]:
c = Circle(0, 0, 1)

In [12]:
c.area()

3.141592653589793

Additionally, we injected a new function, `circ`, into the class while we were constructing it in the `__new__` method of `CustomType`:

In [13]:
hasattr(Circle, 'circ')

True

In [14]:
c.circ()

6.283185307179586

So, this is another example of metaprogramming!

But yeah, creating classes (types) this way is a bit tedious!!!

This is where the concept of a `metaclass` comes in, which we'll cover in the next set of lectures.

##  Metaclasses

In the last lecture, we saw how we could create new types (new classes), using `type` or custom types (classes that inherit from `type`).

But the actual creation process in either case is difficult.

We have to get the code text somehow, execute it in the context of a dictionary, and then call `type(name, bases, dict)` or `CustomType(name, bases, dict)`.

Not the best user experience!

When we define classes in Python:

In [1]:
class Person:
    def __init__(self, name):
        self.name = name
        
class Student(Person):
    def __init__(self, name, major):
        super().__init__(name)
        self._major = major
        
    @property
    def major(self):
        return self._major

This code is executed by Python and we end up with a new type (like `Person`, or `Student`) that has been created.

This means Python has done all the steps we were doing manually for us, and called `type` with the name, bases and class dictionary. Makes it a lot easier for us...

But why did Python call `type` to create our `Student` class?

This is where the concept of a `metaclass` comes in.

When Python encounters `class Student(Person):`, it decides what class to use to create the class.

This class is called the metaclass, and by default it is the `type` class.

But, there is a way we can actually tell Python to use something other than `type` to do this - we can specify a different **metaclass** in the class definition itself, by passing it as a named argument:

So technically, this is what happens by default:

In [2]:
import math

class Circle(metaclass=type):
    def __init__(self, x, y, r):
        self.x = x
        self.y = y
        self.r = r
        
    def area(self):
        return math.pi * self.r ** 2

In [3]:
type(Circle), Circle.__name__

(type, 'Circle')

In [4]:
c = Circle(0, 0, 1)
c.area()

3.141592653589793

As you can see this worked as normal, and the default `metaclass` is `type`.

The `metaclass` argument essentially allows us to specify what class we want to use to construct our class. So we could create a custom class that will build a new type, injecting whatever functionality we want into the creation process - essentially allowing us to modify the definition/functionality of the class we are creating using code.

Python will call our metaclass with the same arguments it would pass to the `type` constructor: `name`, `bases` and `class_dict`, so we need to handle those arguments, but it does the work of creating the class dictionary and executing the code in that context, gathering the bases and the name of the class we are defining.

In [5]:
class CustomType(type):
    def __new__(mcls, name, bases, class_dict):
        print(f'Using custom metaclass {mcls} to create class {name}...')
        cls_obj = super().__new__(mcls, name, bases, class_dict)
        cls_obj.circ = lambda self: 2 * math.pi * self.r
        return cls_obj

In [6]:
class Circle(metaclass=CustomType):
    def __init__(self, x, y, r):
        self.x = x
        self.y = y
        self.r = r
        
    def area(self):
        return math.pi * self.r ** 2

Using custom metaclass <class '__main__.CustomType'> to create class Circle...


As you can see from the print output, our custom metaclass was used, and here's the class info:

In [7]:
Circle

__main__.Circle

And just like before, it has the `__init__`, `area` and `circ` functions:

In [8]:
vars(Circle)

mappingproxy({'__module__': '__main__',
              '__init__': <function __main__.Circle.__init__(self, x, y, r)>,
              'area': <function __main__.Circle.area(self)>,
              '__dict__': <attribute '__dict__' of 'Circle' objects>,
              '__weakref__': <attribute '__weakref__' of 'Circle' objects>,
              '__doc__': None,
              'circ': <function __main__.CustomType.__new__.<locals>.<lambda>(self)>})

And we can use it just like before:

In [9]:
c = Circle(0, 0, 1)
print(c.area())
print(c.circ())

3.141592653589793
6.283185307179586


And that's how we use metaclasses declaratively. Python handles the complexity of creating the instance of the metaclass, getting the name, bases and class dictionary we otherwise have to create ourselves and pass as arguments when we call the metaclass.

Much of the difficulty with metaclasses, is how to use them, and, especially, not overdoing it.

Just because you know how to create metaclasses, does not mean every problem you encounter should be solved with one!

Don't be the person who invents problems because they have a solution!

##  Class Decorators

Let's come back to decorators.

So far, we have been using decorators to decorate functions - but of course, we could also use them to decorate classes:

Let's start with a simple example first, like we saw in the lecture:

In [1]:
def savings(cls):
    cls.account_type = 'savings'
    return cls
    
def checking(cls):
    cls.account_type = 'checking'
    return cls

In [2]:
class Account:
    pass

@savings
class Bank1Savings(Account):
    pass

@savings
class Bank2Savings(Account):
    pass

@checking
class Bank1Checking(Account):
    pass

@checking
class Bank2Checking(Account):
    pass

And if we inspect our classes, we'll see that the `account_type` attribute has been injected by the decorator:

In [3]:
Bank1Savings.account_type, Bank1Checking.account_type

('savings', 'checking')

Of course, we could make even this simple example a little DRYer, by making a parameterized decorator:

In [4]:
def account_type(type_):
    def decorator(cls):
        cls.account_type = type_
        return cls
    return decorator

In [5]:
@account_type('Savings')
class Bank1Savings:
    pass

@account_type('Checking')
class Bank1Checking:
    pass

In [6]:
Bank1Savings.account_type, Bank1Checking.account_type

('Savings', 'Checking')

We're not restricted to just adding data attributes either.

Let's create a class decorator to inject a new function into the class before we return it:

In [7]:
def hello(cls):
    cls.hello = lambda self: f'{self} says hello!'
    return cls

In [8]:
@hello
class Person:
    def __init__(self, name):
        self.name = name
        
    def __str__(self):
        return self.name

In [9]:
vars(Person)

mappingproxy({'__module__': '__main__',
              '__init__': <function __main__.Person.__init__(self, name)>,
              '__str__': <function __main__.Person.__str__(self)>,
              '__dict__': <attribute '__dict__' of 'Person' objects>,
              '__weakref__': <attribute '__weakref__' of 'Person' objects>,
              '__doc__': None,
              'hello': <function __main__.hello.<locals>.<lambda>(self)>})

As you can see, the `Person` class now has an attribute `hello` which is a function.

So, it will then become a bound method when we call it from an instance of `Person`:

In [10]:
p = Person('Guido')

In [11]:
p.hello()

'Guido says hello!'

These examples are simple enough to understand what's going on, but not very useful.

But we can do some interesting things.

For example, suppose we want to log every call to every callable in some class.

We could certainly do it this way:

In [12]:
from functools import wraps

def func_logger(fn):
    @wraps(fn)
    def inner(*args, **kwargs):
        result = fn(*args, **kwargs)
        print(f'log: {fn.__qualname__}({args}, {kwargs}) = {result}')
        return result
    return inner    

In [13]:
class Person:
    @func_logger
    def __init__(self, name, age):
        self.name = name
        self.age = age
    
    @func_logger
    def greet(self):
        return f'Hello, my name is {self.name} and I am {self.age}'

In [14]:
p = Person('John', 78)

log: Person.__init__((<__main__.Person object at 0x7fad8914b3d0>, 'John', 78), {}) = None


In [15]:
p.greet()

log: Person.greet((<__main__.Person object at 0x7fad8914b3d0>,), {}) = Hello, my name is John and I am 78


'Hello, my name is John and I am 78'

But this is kind of tedious if we have many methods in our class. Not very DRY!

Instead, how about creating a class decorator that will decorate every callable in a given class with the logger decorator:

In [16]:
def class_logger(cls):
    for name, obj in vars(cls).items():
        if callable(obj):
            print('decorating:', cls, name)
            setattr(cls, name, func_logger(obj))
    return cls

So now we could do this:

In [17]:
@class_logger
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age
    
    def greet(self):
        return f'Hello, my name is {self.name} and I am {self.age}'

decorating: <class '__main__.Person'> __init__
decorating: <class '__main__.Person'> greet


In [18]:
vars(Person)

mappingproxy({'__module__': '__main__',
              '__init__': <function __main__.Person.__init__(self, name, age)>,
              'greet': <function __main__.Person.greet(self)>,
              '__dict__': <attribute '__dict__' of 'Person' objects>,
              '__weakref__': <attribute '__weakref__' of 'Person' objects>,
              '__doc__': None})

In [19]:
p = Person('John', 78)

log: Person.__init__((<__main__.Person object at 0x7fad6820ca10>, 'John', 78), {}) = None


In [20]:
p.greet()

log: Person.greet((<__main__.Person object at 0x7fad6820ca10>,), {}) = Hello, my name is John and I am 78


'Hello, my name is John and I am 78'

Now we have to be a bit careful. Although this class decorator seems to work fine, it will have issues with static and class methods!

In [21]:
@class_logger
class Person:
    @staticmethod
    def static_method():
        print('static_method invoked...')
    
    @classmethod
    def cls_method(cls):
        print(f'cls_method invoked for {cls}...')
        
    def instance_method(self):
        print(f'instance_method invoked for {self}')

decorating: <class '__main__.Person'> instance_method


In [22]:
Person.static_method()

static_method invoked...


In [23]:
Person.cls_method()

cls_method invoked for <class '__main__.Person'>...


In [24]:
Person().instance_method()

instance_method invoked for <__main__.Person object at 0x7fad8914bd50>
log: Person.instance_method((<__main__.Person object at 0x7fad8914bd50>,), {}) = None


You'll notice that in the `cls_method` and `instance_method` cases, the logger printout never showed up! In fact, we did not get the message that these methods had been decorated.

What happened?

The problem is that static and class methods are not functions, they are actually descriptors, not callables.

In [25]:
class Person:
    @staticmethod
    def static_method():
        pass

In [26]:
Person.__dict__['static_method']

<staticmethod at 0x7fad6820c910>

In [27]:
callable(Person.__dict__['static_method'])

False

So, they were not decorated at all.

Which is probably a good thing, because our decorator is expecting to decorate a function, not a class!

This, by the way, is why when you decorate static or class methods using a function decorator in your classes, you should do so before you decorate it with the `@staticmethod` or `@classmethod` decorators:

In [28]:
class Person:
    @staticmethod
    @func_logger
    def static_method():
        pass

In [29]:
Person.static_method()

log: Person.static_method((), {}) = None


But if you try it this way around, things aren't so happy:

In [30]:
class Person:
    @func_logger
    @staticmethod
    def static_method():
        pass

In [31]:
Person.static_method()

TypeError: 'staticmethod' object is not callable

We can actually fix this problem in our class decorator if we really wanted to.

Let's first examine two things separately.

First let's make sure we can recognize the type of a class or static method in our class:

In [32]:
class Person:
    @staticmethod
    def static_method():
        pass
    
    @classmethod
    def class_method(cls):
        pass

In [33]:
type(Person.__dict__['static_method'])

staticmethod

In [34]:
type(Person.__dict__['class_method'])

classmethod

Next, can we somehow get back to the original function that was wrapped by the `@staticmethod` and `@classmethod` decorators?

The answer is yes, since these are method objects - we've seen this before when we studied the relationship between functions and descriptors and how methods were created.

In [35]:
Person.__dict__['static_method'].__func__

<function __main__.Person.static_method()>

In [36]:
Person.__dict__['class_method'].__func__

<function __main__.Person.class_method(cls)>

So now, we could modify our class decorator needs to unwrap any class or static methods, decorate the original function, and then re-wrap it with the appropriate `classmethod` or `instancemethod` decorator:

In [37]:
def class_logger(cls):
    for name, obj in vars(cls).items():
        if callable(obj):
            print('decorating:', cls, name)
            setattr(cls, name, func_logger(obj))
        elif isinstance(obj, staticmethod):
            original_func = obj.__func__
            print('decorating static method', original_func)
            decorated_func = func_logger(original_func)
            method = staticmethod(decorated_func)
            print(method, type(method))
            setattr(cls, name, method)
        elif isinstance(obj, classmethod):
            original_func = obj.__func__
            print('decorating class method', original_func)
            decorated_func = func_logger(original_func)
            method = classmethod(decorated_func)
            setattr(cls, name, method)
    return cls

In [38]:
@class_logger
class Person:
    @staticmethod
    def static_method(a, b):
        print('static_method called...', a, b)
        
    @classmethod
    def class_method(cls, a, b):
        print('class_method called...', cls, a, b)
        
    def instance_method(self, a, b):
        print('instance_method called...', self, a, b)

decorating static method <function Person.static_method at 0x7fad8914a680>
<staticmethod object at 0x7fad6822af10> <class 'staticmethod'>
decorating class method <function Person.class_method at 0x7fad8914ab00>
decorating: <class '__main__.Person'> instance_method


In [39]:
Person.static_method(10, 20)

static_method called... 10 20
log: Person.static_method((10, 20), {}) = None


In [40]:
Person.class_method(10, 20)

class_method called... <class '__main__.Person'> 10 20
log: Person.class_method((<class '__main__.Person'>, 10, 20), {}) = None


In [41]:
Person().instance_method(10, 20)

instance_method called... <__main__.Person object at 0x7fad98071950> 10 20
log: Person.instance_method((<__main__.Person object at 0x7fad98071950>, 10, 20), {}) = None


Not bad... Not what about properties?

In [42]:
@class_logger
class Person:
    def __init__(self, name):
        self._name = name
        
    @property
    def name(self):
        return self._name

decorating: <class '__main__.Person'> __init__


Hmm, the property was not decorated...

Let's see what the type of that property is (you should already know this):

In [43]:
type(Person.__dict__['name'])

property

In [44]:
isinstance(Person.__dict__['name'], property)

True

And how do we get the original functions on a property?

In [45]:
prop = Person.__dict__['name']

In [46]:
prop.fget

<function __main__.Person.name(self)>

In [47]:
prop.fset, prop.fdel

(None, None)

Hmm, so maybe we can decorate the `fget`, `fset`, and `fdel` functions of the property (if they are not `None`).

We can't just replace the functions, because `fget`, `fset` and `fdel` are actually read-only properties themselves, that return the original functions. But we could create a new property based off thge original one, substituting our decorated getter, setter and deleter.

Recall that the `getter()`, `setter()` and `deleter()` methods are methods that will create a copy of the original property, but substitute the `fget`, `fset` and `fdel` methods (that's how these are used as decorators).

In [48]:
def class_logger(cls):
    for name, obj in vars(cls).items():
        if callable(obj):
            print('decorating:', cls, name)
            setattr(cls, name, func_logger(obj))
        elif isinstance(obj, staticmethod):
            original_func = obj.__func__
            print('decorating static method', original_func)
            decorated_func = func_logger(original_func)
            method = staticmethod(decorated_func)
            print(method, type(method))
            setattr(cls, name, method)
        elif isinstance(obj, classmethod):
            original_func = obj.__func__
            print('decorating class method', original_func)
            decorated_func = func_logger(original_func)
            method = classmethod(decorated_func)
            setattr(cls, name, method)
        elif isinstance(obj, property):
            print('decorating property', obj)
            if obj.fget:
                obj = obj.getter(func_logger(obj.fget))
            if obj.fset:
                obj = obj.setter(func_logger(obj.fset))
            if obj.fdel:
                obj = obj.deleter(func_logger(obj.fdel))
            setattr(cls, name, obj)
    return cls

In [49]:
@class_logger
class Person:
    def __init__(self, name):
        self._name = name
        
    @property
    def name(self):
        return self._name

decorating: <class '__main__.Person'> __init__
decorating property <property object at 0x7fad89142ad0>


In [50]:
p = Person('David')

log: Person.__init__((<__main__.Person object at 0x7fad6821db90>, 'David'), {}) = None


In [51]:
p.name

log: Person.name((<__main__.Person object at 0x7fad6821db90>,), {}) = David


'David'

Ha!! Pretty cool...

Let's make sure it works if we have setters and deleters as well:

In [52]:
@class_logger
class Person:
    def __init__(self, name):
        self._name = name
        
    @property
    def name(self):
        return self._name
    
    @name.setter
    def name(self, value):
        self._name = value
        
    @name.deleter
    def name(self):
        print('deleting name...')

decorating: <class '__main__.Person'> __init__
decorating property <property object at 0x7fad68232bf0>


In [53]:
p = Person('David')

log: Person.__init__((<__main__.Person object at 0x7fad6821d7d0>, 'David'), {}) = None


In [54]:
p.name

log: Person.name((<__main__.Person object at 0x7fad6821d7d0>,), {}) = David


'David'

In [55]:
p.name = 'Beazley'

log: Person.name((<__main__.Person object at 0x7fad6821d7d0>, 'Beazley'), {}) = None


In [56]:
del p.name

deleting name...
log: Person.name((<__main__.Person object at 0x7fad6821d7d0>,), {}) = None


Success!!

A bit mind-bending, but nonetheless, cool stuff!

Still, this is not perfect... :(

We can still run into trouble because not every callable is a function that can be decorated:

In [57]:
@class_logger
class Person:
    class Other:
        def __call__(self):
            print('called instance of Other...')
            
    other = Other()

decorating: <class '__main__.Person'> Other
decorating: <class '__main__.Person'> other


So, as you see it decorated both the class `Other` (since classes are callables), and it decorated `other` since we made instances of `Other` callable too.

How does that work with the logger though:

In [58]:
Person.Other

<function __main__.Person.Other()>

In [59]:
Person.other

<function __main__.func_logger.<locals>.inner()>

And that's the problem, because `Other` and `other` are callables, they have been replaced in our class by what comes out of the decorator - a function.

So maybe we can use the `inspect` module to restrict our callables further:

In [60]:
import inspect

In [61]:
class MyClass:
    @staticmethod
    def static_method():
        pass
    
    @classmethod
    def cls_method(cls):
        pass
    
    def inst_method(self):
        pass
    
    @property
    def name(self):
        pass
    
    def __add__(self, other):
        pass
    
    class Other:
        def __call__(self):
            pass
        
    other = Other()
    

In [62]:
keys = ('static_method', 'cls_method', 'inst_method', 'name', '__add__', 'Other', 'other')
inspect_funcs = ('isroutine', 'ismethod', 'isfunction', 'isbuiltin', 'ismethoddescriptor')

In [63]:
print(keys)

('static_method', 'cls_method', 'inst_method', 'name', '__add__', 'Other', 'other')


In [64]:
max_header_length = max(len(key) for key in keys)
max_fname_length = max(len(func) for func in inspect_funcs)
print(format('', f'{max_fname_length}s'), '\t'.join(format(key, f'{max_header_length}s') for key in keys))
for inspect_func in inspect_funcs:
    fn = getattr(inspect, inspect_func)
    inspect_results = (format(str(fn(MyClass.__dict__[key])), f'{max_header_length}s') for key in keys)
    print(format(inspect_func, f'{max_fname_length}s'), '\t'.join(inspect_results))

                   static_method	cls_method   	inst_method  	name         	__add__      	Other        	other        
isroutine          True         	True         	True         	False        	True         	False        	False        
ismethod           False        	False        	False        	False        	False        	False        	False        
isfunction         False        	False        	True         	False        	True         	False        	False        
isbuiltin          False        	False        	False        	False        	False        	False        	False        
ismethoddescriptor True         	True         	False        	False        	False        	False        	False        


As you can see we could use inspect to only pick things that are routines instead of more general callables. Properties, static and class methods we are already handling specially, so I'm going to move the callable check last in the `if...elif` block so we handle static and class methods first (since they are classified as routines too).

In [65]:
import inspect

def class_logger(cls):
    for name, obj in vars(cls).items():
        if isinstance(obj, staticmethod):
            original_func = obj.__func__
            print('decorating static method', original_func)
            decorated_func = func_logger(original_func)
            method = staticmethod(decorated_func)
            setattr(cls, name, method)
        elif isinstance(obj, classmethod):
            original_func = obj.__func__
            print('decorating class method', original_func)
            decorated_func = func_logger(original_func)
            method = classmethod(decorated_func)
            setattr(cls, name, method)
        elif isinstance(obj, property):
            print('decorating property', obj)
            if obj.fget:
                obj = obj.getter(func_logger(obj.fget))
            if obj.fset:
                obj = obj.setter(func_logger(obj.fset))
            if obj.fdel:
                obj = obj.deleter(func_logger(obj.fdel))
            setattr(cls, name, obj)
        elif inspect.isroutine(obj):
            print('decorating:', cls, name)
            setattr(cls, name, func_logger(obj))
    return cls

In [66]:
@class_logger
class MyClass:
    @staticmethod
    def static_method():
        print('static_method called...')
    
    @classmethod
    def cls_method(cls):
        print('class method called...')
    
    def inst_method(self):
        print('instance method called...')
    
    @property
    def name(self):
        print('name getter called...')
    
    def __add__(self, other):
        print('__add__ called...')
    
    class Other:
        def __call__(self):
            print(f'{self}.__call__ called...')
        
    other = Other()
    

decorating static method <function MyClass.static_method at 0x7fad6821c0e0>
decorating class method <function MyClass.cls_method at 0x7fad6821c320>
decorating: <class '__main__.MyClass'> inst_method
decorating property <property object at 0x7fad6820f5f0>
decorating: <class '__main__.MyClass'> __add__


In [67]:
MyClass.Other, MyClass.other

(__main__.MyClass.Other, <__main__.MyClass.Other at 0x7fad6828bb10>)

In [68]:
MyClass.other()

<__main__.MyClass.Other object at 0x7fad6828bb10>.__call__ called...


No log, that was expected.

In [69]:
MyClass.static_method()

static_method called...
log: MyClass.static_method((), {}) = None


In [70]:
MyClass.cls_method()

class method called...
log: MyClass.cls_method((<class '__main__.MyClass'>,), {}) = None


In [71]:
MyClass().inst_method()

instance method called...
log: MyClass.inst_method((<__main__.MyClass object at 0x7fad682c19d0>,), {}) = None


In [72]:
MyClass().name

name getter called...
log: MyClass.name((<__main__.MyClass object at 0x7fad6828b650>,), {}) = None


In [73]:
MyClass() + MyClass()

__add__ called...
log: MyClass.__add__((<__main__.MyClass object at 0x7fad8914bb50>, <__main__.MyClass object at 0x7fad8914b5d0>), {}) = None


If we really wanted to, we could also decorate the `Other` class:

In [74]:
@class_logger
class MyClass:
    @staticmethod
    def static_method():
        print('static_method called...')
    
    @classmethod
    def cls_method(cls):
        print('class method called...')
    
    def inst_method(self):
        print('instance method called...')
    
    @property
    def name(self):
        print('name getter called...')
    
    def __add__(self, other):
        print('__add__ called...')
    
    @class_logger
    class Other:
        def __call__(self):
            print(f'{self}.__call__ called...')
        
    other = Other()
    

decorating: <class '__main__.MyClass.Other'> __call__
decorating static method <function MyClass.static_method at 0x7fad68239b90>
decorating class method <function MyClass.cls_method at 0x7fad68239170>
decorating: <class '__main__.MyClass'> inst_method
decorating property <property object at 0x7fad682185f0>
decorating: <class '__main__.MyClass'> __add__


In [75]:
MyClass.other()

<__main__.MyClass.Other object at 0x7fad8912f3d0>.__call__ called...
log: MyClass.Other.__call__((<__main__.MyClass.Other object at 0x7fad8912f3d0>,), {}) = None


We could also do a bit of DRYing on our decorator code.

First let's handle the static and class methods:

In [76]:
import inspect

def class_logger(cls):
    for name, obj in vars(cls).items():
        if isinstance(obj, staticmethod) or isinstance(obj, classmethod):
            type_ = type(obj)
            original_func = obj.__func__
            print(f'decorating {type_.__name__} method', original_func)
            decorated_func = func_logger(original_func)
            method = type_(decorated_func)
            setattr(cls, name, method)
        elif isinstance(obj, property):
            print('decorating property', obj)
            if obj.fget:
                obj = obj.getter(func_logger(obj.fget))
            if obj.fset:
                obj = obj.setter(func_logger(obj.fset))
            if obj.fdel:
                obj = obj.deleter(func_logger(obj.fdel))
            setattr(cls, name, obj)
        elif inspect.isroutine(obj):
            print('decorating:', cls, name)
            setattr(cls, name, func_logger(obj))
    return cls

In [77]:
@class_logger
class MyClass:
    @staticmethod
    def static_method():
        print('static_method called...')
    
    @classmethod
    def cls_method(cls):
        print('class method called...')
    
    def inst_method(self):
        print('instance method called...')
    
    @property
    def name(self):
        print('name getter called...')
    
    def __add__(self, other):
        print('__add__ called...')
    
    @class_logger
    class Other:
        def __call__(self):
            print(f'{self}.__call__ called...')
        
    other = Other()

decorating: <class '__main__.MyClass.Other'> __call__
decorating staticmethod method <function MyClass.static_method at 0x7fad6820bb00>
decorating classmethod method <function MyClass.cls_method at 0x7fad6820b0e0>
decorating: <class '__main__.MyClass'> inst_method
decorating property <property object at 0x7fad68218e30>
decorating: <class '__main__.MyClass'> __add__


In [78]:
MyClass.static_method()

static_method called...
log: MyClass.static_method((), {}) = None


In [79]:
MyClass.cls_method()

class method called...
log: MyClass.cls_method((<class '__main__.MyClass'>,), {}) = None


Finally, let's see if we can clean up the block to handle properties - I don't like these repeated nested if statements that basically do the almost same thing:

In [80]:
import inspect

def class_logger(cls):
    for name, obj in vars(cls).items():
        if isinstance(obj, staticmethod) or isinstance(obj, classmethod):
            type_ = type(obj)
            original_func = obj.__func__
            print(f'decorating {type_.__name__} method', original_func)
            decorated_func = func_logger(original_func)
            method = type_(decorated_func)
            setattr(cls, name, method)
        elif isinstance(obj, property):
            print('decorating property', obj)
            methods = (('fget', 'getter'), ('fset', 'setter'), ('fdel', 'deleter'))
            for prop, method in methods:
                if getattr(obj, prop):
                    obj = getattr(obj, method)(func_logger(getattr(obj, prop)))
            setattr(cls, name, obj)
        elif inspect.isroutine(obj):
            print('decorating:', cls, name)
            setattr(cls, name, func_logger(obj))
    return cls

In [81]:
@class_logger
class MyClass:
    @staticmethod
    def static_method():
        print('static_method called...')
    
    @classmethod
    def cls_method(cls):
        print('class method called...')
    
    def inst_method(self):
        print('instance method called...')
    
    @property
    def name(self):
        print('name getter called...')
        
    @name.setter
    def name(self, value):
        print('name setter called...')
        
    @name.deleter
    def name(self):
        print('name deleter called...')
    
    def __add__(self, other):
        print('__add__ called...')
    
    @class_logger
    class Other:
        def __call__(self):
            print(f'{self}.__call__ called...')
        
    other = Other()

decorating: <class '__main__.MyClass.Other'> __call__
decorating staticmethod method <function MyClass.static_method at 0x7fad8914a170>
decorating classmethod method <function MyClass.cls_method at 0x7fad8914a3b0>
decorating: <class '__main__.MyClass'> inst_method
decorating property <property object at 0x7fad682be950>
decorating: <class '__main__.MyClass'> __add__


In [82]:
MyClass().name

name getter called...
log: MyClass.name((<__main__.MyClass object at 0x7fad98076fd0>,), {}) = None


In [83]:
MyClass().name = 'David'

name setter called...
log: MyClass.name((<__main__.MyClass object at 0x7fad980764d0>, 'David'), {}) = None


In [84]:
del MyClass().name

name deleter called...
log: MyClass.name((<__main__.MyClass object at 0x7fad98076410>,), {}) = None


##  Decorator Classes

I've already covered the topic of decorator classes, but let's review it quickly.

First off, don't confuse this with class decorators - here I'm talking about using a class to create decorators - that can be used to decorate functions, or classes - but instead of the decorator being a function, it is a class whose instances will act as decorators.

We do this by making instances of the decorator class **callable**, by implementing the `__call__` method.

Let's see a quick example of rewriting a regular decorator function into a decorator class:

In [1]:
from functools import wraps

def logger(fn):
    @wraps(fn)
    def wrapped(*args, **kwargs):
        print(f'Log: {fn.__name__} called.')
        return fn(*args, **kwargs)
    return wrapped

And we can use this decorator to log function calls:

In [2]:
@logger
def say_hello():
    pass

In [3]:
say_hello()

Log: say_hello called.


We can rewrite this decorator function into a class, by making `__init__` take the function being decorated as an argument, and implementing the `__call__` method to actually run the original function (and output the log):

In [4]:
class Logger:
    def __init__(self, fn):
        self.fn = fn
        
    def __call__(self, *args, **kwargs):
        print(f'Log: {self.fn.__name__} called.')
        return self.fn(*args, **kwargs)

In [5]:
@Logger
def say_hello():
    pass

In [6]:
say_hello()

Log: say_hello called.


Remember also that the decorator syntax we used is the same as having done it this way:

In [7]:
def say_hello():
    pass

In [8]:
type(say_hello)

function

In [9]:
say_hello = Logger(say_hello)

In [10]:
say_hello()

Log: say_hello called.


But the **big** difference is that `say_hello` is no longer a function, but a **callable** object - an instance of the `Logger` class.

In [11]:
type(say_hello)

__main__.Logger

And this actually leads us to an issue.

Let's try to use the same decorator to decorate methods in a class.

We'll start with instance methods first.

In [12]:
class Person:
    def __init__(self, name):
        self.name = name
        
    @Logger
    def say_hello(self):
        return f'{self.name} says hello!'

In [13]:
p = Person('David')

In [14]:
p.say_hello()

Log: say_hello called.


TypeError: say_hello() missing 1 required positional argument: 'self'

What's going on here? Why is Python complaining that `self` has not been passed to `say_hello`?

We called it from an instance, so why is `self` not being passed to it.

Well, you have to remember what `say_hello` is now that it has been decorated - it is an instance of a class, not a function!

And do you remember how functions are turned into methods?

The descriptor protocol... Functions implement a `__get__` method, and that is ultimately used to create the bound method.

Our class does not implement the `__get__` method, so that callable remain a plain callable, not a bound method, and that's why our implementation is broken.

In [15]:
p.say_hello

<__main__.Logger at 0x7facc02c7b10>

But it's actually an easy fix, we can implement the `__get__` method in our class, to turn it into a (non-data) descriptor, just like a function does, and we just need to return a bound method.

Remember how we can create a method bound to an object.

We can use `types.MethodType`. the first argument is the callable we want to bind, and the second argument is the instance we want to bind it to.

In [16]:
from types import MethodType

class Logger:
    def __init__(self, fn):
        self.fn = fn
        
    def __call__(self, *args, **kwargs):
        print(f'Log: {self.fn.__name__} called.')
        return self.fn(*args, **kwargs)
    
    def __get__(self, instance, owner_class):
        print(f'__get__ called: self={self}, instance={instance}')
        if instance is None:
            print('\treturning self unbound...')
            return self
        else:
            # self is callable, since it implements __call__
            print('\treturning self as a method bound to instance')
            return MethodType(self, instance)

In [17]:
class Person:
    def __init__(self, name):
        self.name = name
        
    @Logger
    def say_hello(self):
        return f'{self.name} says hello!'

In [18]:
p = Person('David')

In [19]:
p.say_hello

__get__ called: self=<__main__.Logger object at 0x7facc02c8610>, instance=<__main__.Person object at 0x7facc02c8750>
	returning self as a method bound to instance


<bound method ? of <__main__.Person object at 0x7facc02c8750>>

As you can see `say_hello` is now considered a bound method. And it bound the callable instance of Logger to the Person instance.

In [20]:
p.say_hello()

__get__ called: self=<__main__.Logger object at 0x7facc02c8610>, instance=<__main__.Person object at 0x7facc02c8750>
	returning self as a method bound to instance
Log: say_hello called.


'David says hello!'

We can still use our `Logger` decorator class to decorate functions, since in that case `__get__` doesn't even come into play:

In [21]:
@Logger
def say_bye():
    pass

In [22]:
say_bye

<__main__.Logger at 0x7face0d1e850>

As you can see, the `__get__` method does not even get called.

The last thing we should check is that the decorator works with class and static methods too.

Just remember that the order of the decorators is important - we need to decorate with our logger before we decorate with the static and class decorators. that way we end up decorating the decorated function (so just a plain fuinction decorator), and then making it into a class or static method.

In [23]:
class Person:
    @classmethod
    @Logger
    def cls_method(cls):
        print('class method called...')
        
    @staticmethod
    @Logger
    def static_method():
        print('static method called...')
        

In [24]:
Person.cls_method()

Log: cls_method called.
class method called...


In [25]:
Person.static_method()

Log: static_method called.
static method called...


##  Metaclasses vs Class Decorators

As we have seen, class decorators can achieve a lot of the metaprogramming goals we might have.

But there is one area where they fall short of metaclasses - inheritance.

Metaclasses are carried through inheritance, whereas decorators are not.

Let's go back to the previous class decorator example we had (and I'll use the original one to keep the code simple):

In [1]:
from functools import wraps

def func_logger(fn):
    @wraps(fn)
    def inner(*args, **kwargs):
        result = fn(*args, **kwargs)
        print(f'log: {fn.__qualname__}({args}, {kwargs}) = {result}')
        return result
    return inner    

def class_logger(cls):
    for name, obj in vars(cls).items():
        if callable(obj):
            print('decorating:', cls, name)
            setattr(cls, name, func_logger(obj))
    return cls

And as we saw, we can decorate a class with it:

In [2]:
@class_logger
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age
    
    def greet(self):
        return f'Hello, my name is {self.name} and I am {self.age}'

In [3]:
Person('Alex', 10).greet()

We could do this with a metaclass too:

In [4]:
class ClassLogger(type):
    def __new__(mcls, name, bases, class_dict):
        new_cls = super().__new__(mcls, name, bases, class_dict)
        for key, obj in vars(new_cls).items():
            if callable(obj):
                setattr(new_cls, key, func_logger(obj))
        return new_cls        

In [5]:
class Person(metaclass=ClassLogger):
    def __init__(self, name, age):
        self.name = name
        self.age = age
    
    def greet(self):
        return f'Hello, my name is {self.name} and I am {self.age}'

In [6]:
p = Person('John', 78).greet()

log: Person.__init__((<__main__.Person object at 0x7f9be0ce18d0>, 'John', 78), {}) = None
log: Person.greet((<__main__.Person object at 0x7f9be0ce18d0>,), {}) = Hello, my name is John and I am 78


So, why not just use a class decorator?

Now let's see how inheritance works with both those methods.

Let's do the decorator approach first:

In [7]:
@class_logger
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age
    
    def greet(self):
        return f'Hello, my name is {self.name} and I am {self.age}'

decorating: <class '__main__.Person'> __init__
decorating: <class '__main__.Person'> greet


Now let's inherit `Person`:

In [8]:
class Student(Person):
    def __init__(self, name, age, student_number):
        super().__init__(name, age)
        self.student_number = student_number
        
    def study(self):
        return f'{self.name} studies...'

In [9]:
s = Student('Alex', 19, 'abcdefg')

log: Person.__init__((<__main__.Student object at 0x7f9be0cec790>, 'Alex', 19), {}) = None


So first off, you can see that the print worked, but only for the `__init__` in the `Person` class, no logs were generated for the `__init__` in the `Student` class.

By the same token, we don't get logging on the `study` method:

In [10]:
s.study()

'Alex studies...'

So we would need to remember to decorate the `Student` class as well:

In [11]:
@class_logger
class Student(Person):
    def __init__(self, name, age, student_number):
        super().__init__(name, age)
        self.student_number = student_number
        
    def study(self):
        return f'{self.name} studies...'

decorating: <class '__main__.Student'> __init__
decorating: <class '__main__.Student'> study


In [12]:
s = Student('Alex', 19, 'abcdefg')

log: Person.__init__((<__main__.Student object at 0x7f9be0ce1090>, 'Alex', 19), {}) = None
log: Student.__init__((<__main__.Student object at 0x7f9be0ce1090>, 'Alex', 19, 'abcdefg'), {}) = None


In [13]:
s.greet()

log: Person.greet((<__main__.Student object at 0x7f9be0ce1090>,), {}) = Hello, my name is Alex and I am 19


'Hello, my name is Alex and I am 19'

In [14]:
s.study()

log: Student.study((<__main__.Student object at 0x7f9be0ce1090>,), {}) = Alex studies...


'Alex studies...'

So, we just have to remember to decorate **every** subclass as well.

But if we use a metaclass, watch what happens when inherit:

In [15]:
class Person(metaclass=ClassLogger):
    def __init__(self, name, age):
        self.name = name
        self.age = age
    
    def greet(self):
        return f'Hello, my name is {self.name} and I am {self.age}'
    
class Student(Person):
    def __init__(self, name, age, student_number):
        super().__init__(name, age)
        self.student_number = student_number
        
    def study(self):
        return f'{self.name} studies...'

In [16]:
s = Student('Alex', 19, 'abcdefg')

log: Person.__init__((<__main__.Student object at 0x7f9be0cff210>, 'Alex', 19), {}) = None
log: Student.__init__((<__main__.Student object at 0x7f9be0cff210>, 'Alex', 19, 'abcdefg'), {}) = None


In [17]:
s.study()

log: Student.study((<__main__.Student object at 0x7f9be0cff210>,), {}) = Alex studies...


'Alex studies...'

This works because `Student` inherits from `Person`, and since `Person` uses a metaclass for the creation, this follows down to the `Student` class as well.

In [18]:
type(Person)

__main__.ClassLogger

In [19]:
type(Student)

__main__.ClassLogger

As you can see the type of both the parent and the subclass is `ClassLogger` even though we did not explicitly state that `Student` shouls use the metaclass for creation.

It happened automatically because we did not have a `__new__` method in the `Student` class, so the parent's `__new__` was essentially used, and that one uses the metaclass.

We can see this more explicitly this way:

In [20]:
class Student(Person):
    def __new__(cls, name, age, student_number):
        return super().__new__(cls)
    
    def __init__(self, name, age, student_number):
        super().__init__(name, age)
        self.student_number = student_number
        
    def study(self):
        return f'{self.name} studies...'

In [21]:
s = Student('Alex', 19, 'ABC')

log: Person.__init__((<__main__.Student object at 0x7f9be0d041d0>, 'Alex', 19), {}) = None
log: Student.__init__((<__main__.Student object at 0x7f9be0d041d0>, 'Alex', 19, 'ABC'), {}) = None


In [22]:
s.study()

log: Student.study((<__main__.Student object at 0x7f9be0d041d0>,), {}) = Alex studies...


'Alex studies...'

One of the disadvantages of metaclasses vs class decorators is that only a "single" metaclass can be used. (Actually it's a bit more subtle than that, we can use a different metaclass in for a subclass if the metclass is a subclass of the parent's metaclass - we'll cover this point again when we look at multiple inheritance.)

In [26]:
class Metaclass1(type):
    pass

class Metaclass2(type):
    pass

In [27]:
class Person(metaclass=Metaclass1):
    pass

In [28]:
class Student(Person, metaclass=Metaclass2):
    pass

TypeError: metaclass conflict: the metaclass of a derived class must be a (non-strict) subclass of the metaclasses of all its bases

As you can see we cannot specify a custom metaclass for `Student` because that would conflict with the class it is inheriting from.

An exception is if we inherit from a parent who has `type` as its metaclass:

In [29]:
class Person:
    pass

class Student(Person, metaclass=Metaclass1):
    pass

In [30]:
p = Person()
s = Student()

In [31]:
type(Person), type(Student)

(type, __main__.Metaclass1)

It can also cause problems in multiple inheritance.

We haven't covered multiple inheritance yet, but let me show you the issue at least:

In [32]:
class Class1(metaclass=Metaclass1):
    pass

class Class2(metaclass=Metaclass2):
    pass

Here we have created two classes that use different custom metaclasses.

If we try to create a new class that inherits from both:

In [33]:
class MultiClass(Class1, Class2):
    pass

TypeError: metaclass conflict: the metaclass of a derived class must be a (non-strict) subclass of the metaclasses of all its bases

Again, if one of the base classes is `type` and the other is a custom metaclass, then this is allowed (this is because `Metaclass1` is itself a subclass of `type`:

In [36]:
class Class1(metaclass=type):
    pass

class Class2(metaclass=Metaclass1):
    pass

In [37]:
class MultiClass(Class1, Class2):
    pass

On the other hand we can stack decorators as much as we want (we just have to be careful with the order in which we stack them sometimes).

##  Metaclass Parameters

When we use a metaclass we typically have something like this:

In [1]:
class Metaclass(type):
    def __new__(mcls, name, bases, cls_dict):
        return super().__new__(mcls, name, bases, cls_dict)
    
class MyClass(metaclass=Metaclass):
    pass

In [2]:
type(MyClass), type(MyClass())

(__main__.Metaclass, __main__.MyClass)

But is there a way to pass *additional* arguments to the metaclass `__new__` method?

Starting in Python 3.6, the answer is yes. The restriction is that they **must** be passed as named arguments (positional args being used for specifying inheritance).

First let's just try out a simple example to understand the syntax:

In [3]:
class Metaclass(type):
    def __new__(mcls, name, bases, cls_dict, arg1, arg2, arg3=None):
        print(arg1, arg2, arg3)
        return super().__new__(mcls, name, bases, cls_dict)

In [4]:
class MyClass(metaclass=Metaclass, arg1=10, arg2=20, arg3=30):
    pass

10 20 30


In [5]:
class MyClass(metaclass=Metaclass, arg1=10, arg2=20):
    pass

10 20 None


As you can see our metaclass `__new__` method received those arguments.

Let's look at a more practical example of this:

In [6]:
class AutoClassAttrib(type):
    def __new__(cls, name, bases, cls_dict, extra_attrs=None):
        if extra_attrs:
            print('Creating class with some extra attributes: ', extra_attrs)
            # here I'm going to things directly into the cls_dict namespace
            # but could also create the class first, then add using setattr
            for attr_name, attr_value in extra_attrs:
                cls_dict[attr_name] = attr_value
        return super().__new__(cls, name, bases, cls_dict)
                

In [7]:
class Account(metaclass=AutoClassAttrib, extra_attrs=[('account_type', 'Savings'), ('apr', 0.5)]):
    pass

Creating class with some extra attributes:  [('account_type', 'Savings'), ('apr', 0.5)]


In [8]:
vars(Account)

mappingproxy({'__module__': '__main__',
              'account_type': 'Savings',
              'apr': 0.5,
              '__dict__': <attribute '__dict__' of 'Account' objects>,
              '__weakref__': <attribute '__weakref__' of 'Account' objects>,
              '__doc__': None})

As you can see the class now has these two extra attributes.

We could also have just done it this way:

In [9]:
class AutoClassAttrib(type):
    def __new__(cls, name, bases, cls_dict, extra_attrs=None):
        new_cls = super().__new__(cls, name, bases, cls_dict)
        if extra_attrs:
            print('Creating class with some extra attributes: ', extra_attrs)
            for attr_name, attr_value in extra_attrs:
                setattr(new_cls, attr_name, attr_value)
        return new_cls
                

In [10]:
class Account(metaclass=AutoClassAttrib, extra_attrs=[('account_type', 'Savings'), ('apr', 0.5)]):
    pass

Creating class with some extra attributes:  [('account_type', 'Savings'), ('apr', 0.5)]


In [11]:
vars(Account)

mappingproxy({'__module__': '__main__',
              '__dict__': <attribute '__dict__' of 'Account' objects>,
              '__weakref__': <attribute '__weakref__' of 'Account' objects>,
              '__doc__': None,
              'account_type': 'Savings',
              'apr': 0.5})

Of course, we could just use `**kwargs` instead, to make it easier:

In [12]:
class AutoClassAttrib(type):
    def __new__(cls, name, bases, cls_dict, **kwargs):
        new_cls = super().__new__(cls, name, bases, cls_dict)
        if kwargs:
            print('Creating class with some extra attributes: ', kwargs)
            for attr_name, attr_value in kwargs.items():
                setattr(new_cls, attr_name, attr_value)
        return new_cls
                

In [13]:
class Account(metaclass=AutoClassAttrib, account_type='Savings', apr=0.5):
    pass

Creating class with some extra attributes:  {'account_type': 'Savings', 'apr': 0.5}


In [14]:
vars(Account)

mappingproxy({'__module__': '__main__',
              '__dict__': <attribute '__dict__' of 'Account' objects>,
              '__weakref__': <attribute '__weakref__' of 'Account' objects>,
              '__doc__': None,
              'account_type': 'Savings',
              'apr': 0.5})

##  The `__prepare__` Method

We know that when we create a class, the metaclass `__new__` method is invoked with an argument (`cls_dict`) for the class dictionary.

It is not in fact an empty dictionary at first:

In [1]:
class MyMeta(type):
    def __new__(mcls, name, bases, cls_dict, **kwargs):
        print('MyMeta.__new__ called...')
        print('\tcls: ', mcls, type(mcls))
        print('\tname:', name, type(name))
        print('\tbases: ', bases, type(bases))
        print('\tcls_dict:', cls_dict, type(cls_dict))
        print('\tkwargs:', kwargs)
        return super().__new__(mcls, name, bases, cls_dict)

In [2]:
class MyClass(metaclass=MyMeta):
    pass

MyMeta.__new__ called...
	cls:  <class '__main__.MyMeta'> <class 'type'>
	name: MyClass <class 'str'>
	bases:  () <class 'tuple'>
	cls_dict: {'__module__': '__main__', '__qualname__': 'MyClass'} <class 'dict'>
	kwargs: {}


So, as we see, `cls_dict` is a dictionary and it also contains some information already. It is obviously being created somewhere before being passed to the `__new__` method.

The class dictionary is actually created by calling the `__prepare__` method, which the `type` class implements.

When the class is created, Python calls `__prepare__` and uses the return value of that method as the initialized class dictionary.
Then right before calling `__new__` it adds a few items into that dictionary, and then calls the `__new__` method using that pre-created and initialized dictionary.

Since `__prepare__` is just a method in `type`, we can override it.

In [3]:
class MyMeta(type):
    @staticmethod
    def __prepare__(name, bases, **kwargs):
        print('MyMeta.__prepare__ called...')
        print('\tname:', name)
        print('\tkwargs:', kwargs)
        return {'a': 100, 'b': 200}
    
    def __new__(mcls, name, bases, cls_dict, **kwargs):
        print('MyMeta.__new__ called...')
        print('\tcls: ', mcls, type(mcls))
        print('\tname:', name, type(name))
        print('\tbases: ', bases, type(bases))
        print('\tcls_dict:', cls_dict, type(cls_dict))
        print('\tkwargs:', kwargs)
        return super().__new__(mcls, name, bases, cls_dict)

In [4]:
class MyClass(metaclass=MyMeta, kw1=10, kw2=20):
    pass

MyMeta.__prepare__ called...
	name: MyClass
	kwargs: {'kw1': 10, 'kw2': 20}
MyMeta.__new__ called...
	cls:  <class '__main__.MyMeta'> <class 'type'>
	name: MyClass <class 'str'>
	bases:  () <class 'tuple'>
	cls_dict: {'a': 100, 'b': 200, '__module__': '__main__', '__qualname__': 'MyClass'} <class 'dict'>
	kwargs: {'kw1': 10, 'kw2': 20}


Notice how the `__prepare__` method was called **before** the `__new__` method was called.

Also notice how it contains the items `'a': 100` and `'b': 200` which we injected in the `__prepare__` method.

The `cls_dict` argument in `__new__` has a couple of extra items that it injects for us prior to calling the `__new__` method.

Of course, if we do not specify a `__prepare__` method in our metaclass, we inherit the one that is already defined in `type` - which returns an empty dictionary.

In [5]:
type.__prepare__()

{}

Here's an example where using this method can simplify things somewhat.

Recall the example where we passed named arguments to the metaclass in order to create some additional class attributes:

In [6]:
class MyMeta(type):
    def __new__(mcls, name, bases, class_dict, **kwargs):
        class_dict.update(kwargs)
        return super().__new__(mcls, name, bases, class_dict)
    
class MyClass(metaclass=MyMeta, arg1=100, arg2=200):
    pass        

In [7]:
vars(MyClass)

mappingproxy({'__module__': '__main__',
              'arg1': 100,
              'arg2': 200,
              '__dict__': <attribute '__dict__' of 'MyClass' objects>,
              '__weakref__': <attribute '__weakref__' of 'MyClass' objects>,
              '__doc__': None})

We were able to override the `__new__` method and inject the additional arguments right into the class dictionary.

But we could just as easily inject those items in the class dictionary right in the `__prepare__` method.

What's important to understand is that whatever extra arguments we pass to the metaclass are also passed along to the `__prepare__` method, just like they are eventually passed to `__new__`.

In [8]:
class MyMeta(type):
    def __prepare__(name, bases, **kwargs):
        print(f'MyMeta.__prepare__ called... with {kwargs}')
        # we could create a new dictionary and insert everything we need from kwargs
        # or we could just use the kwargs dictionary directly
        kwargs['bonus_attr'] = 'Python rocks!'
        return kwargs
    
    def __new__(cls, name, bases, cls_dict, **kwargs):
        print('MyMeta.__new__ called...')
        print('\tcls: ', cls, type(cls))
        print('\tname:', name, type(name))
        print('\tbases: ', bases, type(bases))
        print('\tcls_dict:', cls_dict, type(cls_dict))
        print('\tkwargs:', kwargs)
        return super().__new__(cls, name, bases, cls_dict)

In [9]:
class MyClass(metaclass=MyMeta, kw1=1, kw2=2):
    pass

MyMeta.__prepare__ called... with {'kw1': 1, 'kw2': 2}
MyMeta.__new__ called...
	cls:  <class '__main__.MyMeta'> <class 'type'>
	name: MyClass <class 'str'>
	bases:  () <class 'tuple'>
	cls_dict: {'kw1': 1, 'kw2': 2, 'bonus_attr': 'Python rocks!', '__module__': '__main__', '__qualname__': 'MyClass'} <class 'dict'>
	kwargs: {'kw1': 1, 'kw2': 2}


In [10]:
vars(MyClass)

mappingproxy({'kw1': 1,
              'kw2': 2,
              'bonus_attr': 'Python rocks!',
              '__module__': '__main__',
              '__dict__': <attribute '__dict__' of 'MyClass' objects>,
              '__weakref__': <attribute '__weakref__' of 'MyClass' objects>,
              '__doc__': None})

And as you can see, we have our class attributes, and we did not have to use `__new__`. So often, `__prepare__` is a much simpler alternative to overriding `__new__`.

The return value of `__prepare__` must be a mapping type:

In [11]:
class MyMeta(type):
    def __prepare__(name, bases):
        return 'some string'

In [12]:
class MyClass(metaclass=MyMeta):
    pass

TypeError: string indices must be integers

This exception is raised because Python is trying to use the class dictionary as a mapping type.

In [13]:
cls_dict = 'some string'
cls_dict['__module__']

TypeError: string indices must be integers

The return value must therefore be a mapping type, but it does not have to be a dict - it could be an OrderedDict for example, or even a custom dictionary.

In [14]:
from collections import OrderedDict

In [15]:
class MyMeta(type):
    def __prepare__(name, bases):
        d = OrderedDict()
        d['bonus'] = 'Python rocks!'
        return d

In [16]:
class MyClass(metaclass=MyMeta):
    pass

In [17]:
vars(MyClass)

mappingproxy({'bonus': 'Python rocks!',
              '__module__': '__main__',
              '__dict__': <attribute '__dict__' of 'MyClass' objects>,
              '__weakref__': <attribute '__weakref__' of 'MyClass' objects>,
              '__doc__': None})

Or it could even be a custom dictionary type:

In [18]:
from collections import UserDict

In [19]:
class CustomDict(UserDict):
    def __setitem__(self, key, value):
        print(f'Setting {key} = {value} in custom dictionary')
        super().__setitem__(key, value)
        
    def __getitem__(self, key):
        print(f'Getting {key} from custom dictionary')
        return int(super().__getitem__(key))   

In [20]:
class MyMeta(type):
    def __prepare__(name, bases):
        return CustomDict()

In [21]:
class MyClass(metaclass=MyMeta):
    pass

Getting __name__ from custom dictionary
Setting __module__ = __main__ in custom dictionary
Setting __qualname__ = MyClass in custom dictionary


TypeError: type.__new__() argument 3 must be dict, not CustomDict

As you can see, we have a slight problem here. The `__new__` method actually expects a `dict`. Even though `CustomDict` essentially behaves like a dictionary, it is not in fact a subclass of `dict`:

In [22]:
issubclass(CustomDict, dict)

False

But as long as our custom dictionary inherits from `dict` we should be fine:

In [23]:
class CustomDict(dict):
    def __setitem__(self, key, value):
        print(f'Setting {key} = {value} in custom dictionary')
        super().__setitem__(key, value)
        
    def __getitem__(self, key):
        print(f'Getting {key} from custom dictionary')
        return int(super().__getitem__(key))   

In [24]:
class MyMeta(type):
    def __prepare__(name, bases):
        return CustomDict()
    
    def __new__(mcls, name, bases, cls_dict):
        print('metaclass __new__ called...')
        print(f'\ttype(cls_dict) = {type(cls_dict)}')
        print(f'\tcls_dict={cls_dict}')

In [25]:
class MyClass(metaclass=MyMeta):
    pass

Getting __name__ from custom dictionary
Setting __module__ = __main__ in custom dictionary
Setting __qualname__ = MyClass in custom dictionary
metaclass __new__ called...
	type(cls_dict) = <class '__main__.CustomDict'>
	cls_dict={'__module__': '__main__', '__qualname__': 'MyClass'}


As you can see, the dictionary we returned from `__prepare__` was a `CustomDict` instance that is eventually passed to `__new__` when it is called. 

And between `__prepare__` and `__new__`, Python accessed our dictionary to read/write a few items.

##  Metaprogramming - Application 1

Are you tired of writing boiler-plate code like this:

In [1]:
class Point2D:
    __slots__ = ('_x', '_y')
    
    def __init__(self, x, y):
        self._x = x
        self._y = y
        
    @property
    def x(self):
        return self._x
    
    @property
    def y(self):
        return self._y
    
    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)
    
    def __hash__(self):
        return hash((self.x, self.y))
    
    def __repr__(self):
        return f'Point2D({self.x}, {self.y})'
    
    def __str__(self):
        return f'({self.x}, {self.y})'
        
class Point3D:
    __slots__ = ('_x', '_y', '_z')
    
    def __init__(self, x, y, z):
        self._x = x
        self._y = y
        self._z = z
    
    @property
    def x(self):
        return self._x
    
    @property
    def y(self):
        return self._y
    
    @property
    def z(self):
        return self._z
    
    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y, self.z) == (other.x, other.y, other.z)
    
    def __hash__(self):
        return hash((self.x, self.y, self.z))

    def __repr__(self):
        return f'Point2D({self.x}, {self.y}, {self.z})'
    
    def __str__(self):
        return f'({self.x}, {self.y}, {self.z})'


It's basically the opposite of DRY!

Let's try to solve this problem using metaclasses (because we might care about inheritance).

First we are going to define our fields using a class attribute, like so:

In [2]:
class Point2D:
    _fields = ['x', 'y']
    
    def __init__(self, x, y):
        self._x = x
        self._y = y
    
class Point3D:
    _fields = ['x', 'y', 'z']
    
    def __init__(self, x, y, z):
        self._x = x
        self._y = y
        self._z = z

For now, we'll keep the `__init__` in our classes themselves, but we'll come back to that later.

Next we are going to define a metaclass that will create the properties and slots, as well as implement the `__eq__`, `__hash__`, `__repr__` and `__str__` methods.

In [3]:
class SlottedStruct(type):
    def __new__(cls, name, bases, class_dict):
        cls_object = super().__new__(cls, name, bases, class_dict)
        
        # setup the __slots__
        setattr(cls_object, '__slots__', [f'_{field}' for field in cls_object._fields])
            
        # create read-only property for each field
        for field in cls_object._fields:
            slot = f'_{field}'
            # this will not work!
            # remember about how closures work! The free variable is resolved when the function is called!
            #     setattr(cls_object, field, property(fget=lambda self: getattr(self, slot)))
            # so instead we have to use this workaround, by specifying the slot as a defaulted argument
            setattr(cls_object, field, property(fget=lambda self, attrib=slot: getattr(self, attrib)))

        return cls_object

Let's see how this is looking so far:

In [4]:
class Person(metaclass=SlottedStruct):
    _fields = ['name', 'age']
    
    def __init__(self, name, age):
        self._name = name
        self._age = age

In [5]:
vars(Person)

mappingproxy({'__module__': '__main__',
              '_fields': ['name', 'age'],
              '__init__': <function __main__.Person.__init__(self, name, age)>,
              '__dict__': <attribute '__dict__' of 'Person' objects>,
              '__weakref__': <attribute '__weakref__' of 'Person' objects>,
              '__doc__': None,
              '__slots__': ['_name', '_age'],
              'name': <property at 0x7fc7d0255f48>,
              'age': <property at 0x7fc7d0255f98>})

As you can see we have `__slots__` defined, and properties for `name` and `age`. Let's try it out:

In [6]:
p = Person('Alex', 19)

In [7]:
p.name

'Alex'

In [8]:
p.age

19

So far so good, now let's continue implementing the rest of the functions:

In [9]:
class SlottedStruct(type):
    def __new__(cls, name, bases, class_dict):
        cls_object = super().__new__(cls, name, bases, class_dict)
        
        # setup the __slots__
        setattr(cls_object, '__slots__', [f'_{field}' for field in cls_object._fields])
            
        # create read-only property for each field
        for field in cls_object._fields:
            slot = f'_{field}'
            # this will not work!
            #     setattr(cls_object, field, property(fget=lambda self: getattr(self, slot)))
            # Remember about how closures work! The free variable is resolved when the function is called!
            # So instead we have to use this workaround, by specifying the slot as a defaulted argument
            setattr(cls_object, field, property(fget=lambda self, attrib=slot: getattr(self, attrib)))

        # create __eq__ method
        def eq(self, other):
            if isinstance(other, cls_object):
                # ensure each corresponding field is equal
                self_fields = [getattr(self, field) for field in cls_object._fields]
                other_fields = [getattr(other, field) for field in cls_object._fields]
                return self_fields == other_fields
            return False
        setattr(cls_object, '__eq__', eq)

        # create __hash__ method
        def hash_(self):
            field_values = (getattr(self, field) for field in cls_object._fields)
            return hash(tuple(field_values))
        setattr(cls_object, '__hash__', hash_)
        
        # create __str__ method
        def str_(self):
            field_values = (getattr(self, field) for field in cls_object._fields)
            field_values_joined = ', '.join(map(str, field_values))  # make every value a string
            return f'{cls_object.__name__}({field_values_joined})'
        setattr(cls_object, '__str__', str_)
        
        # create __repr__ method
        def repr_(self):
            field_values = (getattr(self, field) for field in cls_object._fields)
            field_key_values = (f'{key}={value}' for key, value in zip(cls_object._fields, field_values))
            field_key_values_str = ', '.join(field_key_values)
            return f'{cls_object.__name__}({field_key_values_str})'
        setattr(cls_object, '__repr__', repr_)
        
        return cls_object

In [10]:
class Person(metaclass=SlottedStruct):
    _fields = ['name']
    
    def __init__(self, name):
        self._name = name

Let's try this out:

In [11]:
type(Person)

__main__.SlottedStruct

In [12]:
p1 = Person('Alex')
p2 = Person('Alex')

In [13]:
type(p1), isinstance(p1, Person)

(__main__.Person, True)

In [14]:
p1 == p2

True

In [15]:
hash(p1), hash(p2)

(-4434760416215847140, -4434760416215847140)

In [16]:
repr(p1)

'Person(name=Alex)'

In [17]:
str(p1)

'Person(Alex)'

And now, we can use this metaclass for any of our other classes too that need to follow the same pattern: slots for all the fields, read-only properties for all the fields, and equality, hashing, repr and str as implemented.

In [18]:
class Point2D(metaclass=SlottedStruct):
    _fields = ['x', 'y']
    
    def __init__(self, x, y):
        self._x = x
        self._y = y
        
class Point3D(metaclass=SlottedStruct):
    _fields = ['x', 'y', 'z']
    
    def __init__(self, x, y, z):
        self._x = x
        self._y = y
        self._z = z

In [19]:
p1 = Point2D(1, 2)
p2 = Point2D(1, 2)
p3 = Point2D(0, 0)

In [20]:
repr(p1), str(p1), hash(p1), p1.x, p1.y

('Point2D(x=1, y=2)', 'Point2D(1, 2)', 3713081631934410656, 1, 2)

In [21]:
repr(p2), str(p2), hash(p2), p2.x, p2.y

('Point2D(x=1, y=2)', 'Point2D(1, 2)', 3713081631934410656, 1, 2)

In [22]:
p1 is p2, p1 == p2

(False, True)

In [23]:
p1 is p3, p1 == p3

(False, False)

And `Point3D` works exactly the same:

In [24]:
p1 = Point3D(1, 2, 3)
p2 = Point3D(1, 2, 3)
p3 = Point3D(0, 0, 0)

In [25]:
p1.x, p1.y, p1.z

(1, 2, 3)

In [26]:
p1 == p2, p1 == p3

(True, False)

Here's an additional twist!

I don't like writing `metaclass=SlottedStruct` every time - so I'm going to use a class decorator to do that for me!!

We already know that a class has properties named `__name__` and `__dict__`.

An additional property it has is `__bases__`:

In [27]:
Point2D.__name__, Point2D.__bases__, Point2D.__dict__

('Point2D',
 (object,),
 mappingproxy({'__module__': '__main__',
               '_fields': ['x', 'y'],
               '__init__': <function __main__.Point2D.__init__(self, x, y)>,
               '__dict__': <attribute '__dict__' of 'Point2D' objects>,
               '__weakref__': <attribute '__weakref__' of 'Point2D' objects>,
               '__doc__': None,
               '__slots__': ['_x', '_y'],
               'x': <property at 0x7fc7d0256778>,
               'y': <property at 0x7fc7d02567c8>,
               '__eq__': <function __main__.SlottedStruct.__new__.<locals>.eq(self, other)>,
               '__hash__': <function __main__.SlottedStruct.__new__.<locals>.hash_(self)>,
               '__str__': <function __main__.SlottedStruct.__new__.<locals>.str_(self)>,
               '__repr__': <function __main__.SlottedStruct.__new__.<locals>.repr_(self)>}))

So, our class decorator will need to take the class, and rebuild it, but specifying the metaclass we want to use:

In [28]:
def struct(cls):
    return SlottedStruct(cls.__name__, cls.__bases__, dict(cls.__dict__))

In [29]:
@struct
class Point2D:
    _fields = ['x', 'y']
    
    def __init__(self, x, y):
        self._x = x
        self._y = y

In [30]:
type(Point2D)

__main__.SlottedStruct

In [31]:
p = Point2D(1, 2)

In [32]:
type(p)

__main__.Point2D

In [33]:
p.x, p.y

(1, 2)

In [34]:
repr(p)

'Point2D(x=1, y=2)'

All this takes a little bit of getting used to, but the basic concepts are not particularly difficult. The applications thereof do mean you have to use just about everything you've learned about Python in this series!

This was a good exercise to see metaprogramming in action, but as far as this example is concerned we have a much better alternative, starting in Python 3.7 - **dataclasses**.

We'll come back to those later.

##  Metaprogramming - Application 2

There's another pattern we can implement using metaprogramming - Singletons.

If you read online, you'll see that singleton objects are controversial in Python. 

I'm not going to get into a debate on this, other than to say I do not use singleton objects, not because I have deep thoughts about it (or even shallow ones for that matter), but rather because I have never had a need for them.

However, the question often comes up, so here it is - the metaclass way of implementing the singleton pattern.

Whether you think you should use it or not, is entirely up to you!

We have seen singleton objects - objects such as `None`, `True` or `False` for example.

No matter where we create them in our code, they always refer to the **same** object.

We can recover the type used to create `None` objects:

In [361]:
NoneType = type(None)

And now we can create multiple instances of that type:

In [362]:
n1 = NoneType()
n2 = NoneType()

In [363]:
id(n1), id(n2)

(4466448280, 4466448280)

As you can see, any instance of `NoneType` is actually the **same** object.

The same holds true for booleans:

In [364]:
b1 = bool([])
b2 = bool("")

In [365]:
id(b1), id(b2)

(4466348224, 4466348224)

These are all examples of singleton objects. Now matter how we create them, we always end up with a reference to the same instance.

There is no built-in mechanism to Python for singleton objects, so we have to do it ourselves.

The basic idea is this:

When an instance of the class is being created (but **before** the instance is actually created), check if an instance has already been created, in which case return that instance, otherwise, create a new instance and store that instance reference somewhere so we can recover it the next time an instance is requested.

We could do it entirely in the class itself, without any metaclasses, using the `__new__` method.

We can start with this:

In [27]:
class Hundred:
    def __new__(cls):
        new_instance = super().__new__(cls)
        setattr(new_instance, 'name', 'hundred')
        setattr(new_instance, 'value', 100)
        return new_instance

In [31]:
h1 = Hundred()

In [32]:
vars(h1)

{'name': 'hundred', 'value': 100}

But of course, this is not a singleton object.

In [33]:
h2 = Hundred()

In [34]:
h1 is h2

False

So, let's fix this to make it a singleton:

In [36]:
class Hundred:
    _existing_instance = None  # a class attribute!
    
    def __new__(cls):
        if not cls._existing_instance:
            print('creating new instance...')
            new_instance = super().__new__(cls)
            setattr(new_instance, 'name', 'hundred')
            setattr(new_instance, 'value', 100)
            cls._existing_instance = new_instance
        else:
            print('instance exists already, using that one...')
        return cls._existing_instance

In [37]:
h1 = Hundred()

creating new instance...


In [38]:
h2 = Hundred()

instance exists already, using that one...


In [39]:
h1 is h2

True

And there you are, we have a singleton object.

So this works, but if you need to have multiple of these singleton objects, the code will just become repetitive.

Metaclasses to the rescue!

Remember what we are trying to do:

If we create two instances of our class `Hundred` we expect the same instance back.

But how do we create an instance of a class - we **call** it, so `Hundred()`.

Which `__call__` method is that? It is not the one in the `Hundred` class, that would make **instances** of `Hundred` callable, it is the `__call__` method in the **metaclass**.

So, we need to override the `__call__` in our metaclass.

In [84]:
class Singleton(type):
    def __call__(cls, *args, **kwargs):
        print(f'Request received to create an instance of class: {cls}...')
        return super().__call__(*args, **kwargs)

In [85]:
class Hundred(metaclass=Singleton):
    value = 100

In [86]:
h = Hundred()

Request received to create an instance of class: <class '__main__.Hundred'>...


In [87]:
h.value

100

OK, that works, but now we need to make it into a singleton instance.

We have to be careful here. Initially we had used the class itself (`Hundred`) to store, as a class variable, whether an instance had already been created. 

And here we could try to do the same thing. 

We could store the instance as a class variable in the class of the instance being created

That's actually quite simple, since the class is received as the first argument of the `__call__` method.

In [100]:
class Singleton(type):
    def __call__(cls, *args, **kwargs):
        print(f'Request received to create an instance of class: {cls}...')
        if getattr(cls, 'existing_instance', None) is None:
            print('Creating instance for the first time...')
            setattr(cls, 'existing_instance', super().__call__(*args, **kwargs))
        else:
            print('Using existing instance...')
        return getattr(cls, 'existing_instance')

In [101]:
class Hundred(metaclass=Singleton):
    value = 100

In [102]:
h1 = Hundred()

Request received to create an instance of class: <class '__main__.Hundred'>...
Creating instance fior the first time...


In [103]:
h2 = Hundred()

Request received to create an instance of class: <class '__main__.Hundred'>...
Using existing instance...


In [108]:
h1 is h2, h1.value, h2.value

(True, 100, 100)

So that seems to work just fine. Let's create another singleton class and see if things still work.

In [105]:
class Thousand(metaclass=Singleton):
    value = 1000

In [106]:
t1 = Thousand()

Request received to create an instance of class: <class '__main__.Thousand'>...
Creating instance fior the first time...


In [107]:
t2 = Thousand()

Request received to create an instance of class: <class '__main__.Thousand'>...
Using existing instance...


In [109]:
h1 is h2, h1.value, h2.value

(True, 100, 100)

In [110]:
t1 is t2, t1.value, t2.value

(True, 1000, 1000)

In [111]:
h1 is t1, h2 is t2

(False, False)

So far so good.

Finally let's make sure everything works with **inheritance** too - if we inherit from a Singleton class, that subclass should also be a singleton.

In [112]:
class HundredFold(Hundred):
    value = 100 * 100

In [113]:
hf1 = HundredFold()

Request received to create an instance of class: <class '__main__.HundredFold'>...
Using existing instance...


Whaaat? Using existing instance? But this is the first time we created it!!

The problem is this: How are we checking if an instance has already been created?

We did this:
```if getattr(cls, 'existing_instance')```

But since `HundredFold` inherits from `Hundred`, it also inherited the class attribute `existing_instance`.

This means we have to be a bit more careful in our metaclass, we need to see if we have an instance of the **specific** class already created - and we cannot rely on storing a class attribute in the classes themselves since that breaks the pattern when subclassing.

So, instead, we are going to store the class, and the instance of that class, in a dictionary **in the metaclass** itself, and use that dictionary to lookup the existing instance (if any) for a specific class.

In [127]:
class Singleton(type):
    instances = {}
    
    def __call__(cls, *args, **kwargs):
        print(f'Request received to create an instance of class: {cls}...')
        existing_instance = Singleton.instances.get(cls, None)
        if existing_instance is None:
            print('Creating instance for the first time...')
            existing_instance = super().__call__(*args, **kwargs)
            Singleton.instances[cls] = existing_instance
        else:
            print('Using existing instance...')
        return existing_instance

In [128]:
class Hundred(metaclass=Singleton):
    value = 100
    
class Thousand(metaclass=Singleton):
    value = 1000
    
class HundredFold(Hundred):
    value = 100 * 100

In [129]:
h1 = Hundred()
h2 = Hundred()

Request received to create an instance of class: <class '__main__.Hundred'>...
Creating instance for the first time...
Request received to create an instance of class: <class '__main__.Hundred'>...
Using existing instance...


In [130]:
t1 = Thousand()
t2 = Thousand()

Request received to create an instance of class: <class '__main__.Thousand'>...
Creating instance for the first time...
Request received to create an instance of class: <class '__main__.Thousand'>...
Using existing instance...


In [131]:
hf1 = HundredFold()
hf2 = HundredFold()

Request received to create an instance of class: <class '__main__.HundredFold'>...
Creating instance for the first time...
Request received to create an instance of class: <class '__main__.HundredFold'>...
Using existing instance...


In [132]:
h1 is h2, t1 is t2, hf1 is hf2

(True, True, True)

In [133]:
h1.value, h2.value, t1.value, t2.value, hf1.value, hf2.value

(100, 100, 1000, 1000, 10000, 10000)

And just to make sure :-)

In [135]:
h1 is hf1

False

##  Metaprogramming - Application 3

Let's say we have some `.ini` files that hold various application configurations. We want to read those `.ini` files into an object structure so that we can access the data in our config files using dot notation.

Let's start by creating some `.ini` files:

In [1]:
with open('prod.ini', 'w') as prod, open('dev.ini', 'w') as dev:
    prod.write('[Database]\n')
    prod.write('db_host=prod.mynetwork.com\n')
    prod.write('db_name=my_database\n')
    prod.write('\n[Server]\n')
    prod.write('port=8080\n')
    
    dev.write('[Database]\n')
    dev.write('db_host=dev.mynetwork.com\n')
    dev.write('db_name=my_database\n')
    dev.write('\n[Server]\n')
    dev.write('port=3000\n')

Note: I could have used the `configparser` module to write out these ini files, but we don't have to - generally these config files are created and edited manually. We will use `configparser` to load up the config files though.

When we start our program, we want to load up one of these files into a config object of some sort.

We could certainly do it this way:

In [2]:
import configparser

class Config:
    def __init__(self, env='dev'):
        print(f'Loading config from {env} file...')
        config = configparser.ConfigParser()
        file_name = f'{env}.ini'
        config.read(file_name)
        self.db_host = config['Database']['db_host']
        self.db_name = config['Database']['db_name']
        self.port = config['Server']['port']

In [3]:
config = Config('dev')

Loading config from dev file...


In [4]:
config.__dict__

{'db_host': 'dev.mynetwork.com', 'db_name': 'my_database', 'port': '3000'}

but whenever we need access to this config object again, we either have to store the object somewhere in a global variable (common, and extremely simple!), or we need to re-create it:

In [5]:
config = Config('dev')

Loading config from dev file...


Which means we end up parsing the `ini` file over and over again.

In [6]:
config.db_name

'my_database'

In [7]:
help(config)

Help on Config in module __main__ object:

class Config(builtins.object)
 |  Config(env='dev')
 |  
 |  Methods defined here:
 |  
 |  __init__(self, env='dev')
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak references to the object (if defined)



Furthermore, `help` is not very useful to us here.

The other thing is that we had to "hardcode" each config value in our `Config` class. 

That's a bit of a pain. 

Could we maybe create instance attributes from inspecting what's inside the `ini` files instead?

In [8]:
class Config:
    def __init__(self, env='dev'):
        print(f'Loading config from {env} file...')
        config = configparser.ConfigParser()
        file_name = f'{env}.ini'
        config.read(file_name)
        for section_name in config.sections():
            for key, value in config[section_name].items():
                setattr(self, key, value)

In [9]:
config = Config('prod')

Loading config from prod file...


In [10]:
config.__dict__

{'db_host': 'prod.mynetwork.com', 'db_name': 'my_database', 'port': '8080'}

So this is good, we can access our config values using dot notation:

In [11]:
config.port

'8080'

The next issue we need to deal with is that our config files are organized into sections, and here we've essentially ignored this and create just a "flat" data structure.

So let's deal with that next.

Let's write a custom class for representing sections:

In [12]:
class Section:
    def __init__(self, name, item_dict):
        """
        name: str
            name of section
        item_dict : dictionary
            dictionary of named (key) config values (value)
        """
        self.name = name
        for key, value in item_dict.items():
            setattr(self, key, value)

And now we can rewrite our `Config` class this way:

In [13]:
class Config:
    def __init__(self, env='dev'):
        print(f'Loading config from {env} file...')
        config = configparser.ConfigParser()
        file_name = f'{env}.ini'
        config.read(file_name)
        for section_name in config.sections():
            section = Section(section_name, config[section_name])
            setattr(self, section_name.lower(), section)

In [14]:
config = Config()

Loading config from dev file...


Now we have sections:

In [15]:
vars(config)

{'database': <__main__.Section at 0x7f8ce09f6e48>,
 'server': <__main__.Section at 0x7f8ce09f65f8>}

And each section has its config values:

In [16]:
vars(config.database)

{'name': 'Database', 'db_host': 'dev.mynetwork.com', 'db_name': 'my_database'}

But that still does not solve our documentation issue:

In [17]:
help(Config)

Help on class Config in module __main__:

class Config(builtins.object)
 |  Config(env='dev')
 |  
 |  Methods defined here:
 |  
 |  __init__(self, env='dev')
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak references to the object (if defined)



Most modern IDE's will still be able to provide us some auto-completion on the attributes though, using some form of introspection.

But let's assume we really want `help` to give us some useful information, or we're working with an IDE that isn't sophisticated enough.

To do that, we are going to switch to metaclasses.

Our custom metaclass will load up the `ini` file and use it to create class attributes instead:

And we'll need to do this for both the sections and the overall config.

To keep things a little simpler, we're going to create two distinct metaclasses. One for the sections in the config file, and one that combines the sections together - very similar to what we did with our original `Config` class.

One key difference, is that each `Section` class instance, will be a brand new class, created via its metaclass.

Let's write the `Section` metaclass first.

In [18]:
class SectionType(type):
    def __new__(cls, name, bases, cls_dict, section_name, items_dict):
        cls_dict['__doc__'] = f'Configs for {section_name} section'
        cls_dict['section_name'] = section_name
        for key, value in items_dict.items():
            cls_dict[key] = value
        return super().__new__(cls, name, bases, cls_dict)

We can now create `Section` classes for different sections in our configs, passing the metaclass the section name, and a dictionary of the values it should create as class attributes.

In [19]:
class DatabaseSection(metaclass=SectionType, section_name='database', items_dict={'db_name': 'my_database', 'host': 'myhost.com'}):
    pass

In [20]:
vars(DatabaseSection)

mappingproxy({'__module__': '__main__',
              '__doc__': 'Configs for database section',
              'section_name': 'database',
              'db_name': 'my_database',
              'host': 'myhost.com',
              '__dict__': <attribute '__dict__' of 'DatabaseSection' objects>,
              '__weakref__': <attribute '__weakref__' of 'DatabaseSection' objects>})

As you can see, our items `db_name` and `host` are in the class.

In [21]:
DatabaseSection.db_name

'my_database'

And the `help` function introspection will work too:

In [22]:
help(DatabaseSection)

Help on class DatabaseSection in module __main__:

class DatabaseSection(builtins.object)
 |  Configs for database section
 |  
 |  Data descriptors defined here:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak references to the object (if defined)
 |  
 |  ----------------------------------------------------------------------
 |  Data and other attributes defined here:
 |  
 |  db_name = 'my_database'
 |  
 |  host = 'myhost.com'
 |  
 |  section_name = 'database'



And we can now create any section we want using this metaclass, for example:

In [23]:
class PasswordsSection(metaclass=SectionType, section_name='passwords', items_dict={'db': 'secret', 'site': 'super secret'}):
    pass

In [24]:
vars(PasswordsSection)

mappingproxy({'__module__': '__main__',
              '__doc__': 'Configs for passwords section',
              'section_name': 'passwords',
              'db': 'secret',
              'site': 'super secret',
              '__dict__': <attribute '__dict__' of 'PasswordsSection' objects>,
              '__weakref__': <attribute '__weakref__' of 'PasswordsSection' objects>})

Just like we can create classes programmatically by calling the `type` metaclass:

In [25]:
MyClass = type('MyClass', (object,), {})

In [26]:
MyClass

__main__.MyClass

We can also create `Section` **classes** by calling the `SectionType` metaclass:

In [27]:
MySection = SectionType('DBSection', (object,), {}, section_name='databases', items_dict={'db_name': 'my_db', 'port': 8000})

In [28]:
MySection

__main__.DBSection

In [29]:
vars(MySection)

mappingproxy({'__doc__': 'Configs for databases section',
              'section_name': 'databases',
              'db_name': 'my_db',
              'port': 8000,
              '__module__': '__main__',
              '__dict__': <attribute '__dict__' of 'DBSection' objects>,
              '__weakref__': <attribute '__weakref__' of 'DBSection' objects>})

Now that we have a metaclass to create section classes, we can build our main config metaclass to build the `Config` class.

In [30]:
class ConfigType(type):
    def __new__(cls, name, bases, cls_dict, env):
        """
        env : str
            The environment we are loading the config for (e.g. dev, prod)
        """
        cls_dict['__doc__'] = f'Configurations for {env}.'
        cls_dict['env'] = env
        config = configparser.ConfigParser()
        file_name = f'{env}.ini'
        config.read(file_name)
        for section_name in config.sections():
            class_name = section_name.capitalize()
            class_attribute_name = section_name.casefold()
            section_items = config[section_name]
            bases = (object, )
            section_cls_dict = {}
            # create a new Section class for this section
            Section = SectionType(
                class_name, bases, section_cls_dict, section_name=section_name, items_dict=section_items
            )
            # And assign it to an attribute in the main config class
            cls_dict[class_attribute_name] = Section
        return super().__new__(cls, name, bases, cls_dict)

Now we can create config classes for each of our environments:

In [31]:
class DevConfig(metaclass=ConfigType, env='dev'):
    pass

class ProdConfig(metaclass=ConfigType, env='prod'):
    pass

In [32]:
vars(DevConfig)

mappingproxy({'__module__': '__main__',
              '__doc__': 'Configurations for dev.',
              'env': 'dev',
              'database': __main__.Database,
              'server': __main__.Server,
              '__dict__': <attribute '__dict__' of 'DevConfig' objects>,
              '__weakref__': <attribute '__weakref__' of 'DevConfig' objects>})

In [33]:
help(DevConfig)

Help on class DevConfig in module __main__:

class DevConfig(builtins.object)
 |  Configurations for dev.
 |  
 |  Data descriptors defined here:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak references to the object (if defined)
 |  
 |  ----------------------------------------------------------------------
 |  Data and other attributes defined here:
 |  
 |  database = <class '__main__.Database'>
 |      Configs for Database section
 |  
 |  env = 'dev'
 |  
 |  server = <class '__main__.Server'>
 |      Configs for Server section



In [34]:
vars(DevConfig.database)

mappingproxy({'__doc__': 'Configs for Database section',
              'section_name': 'Database',
              'db_host': 'dev.mynetwork.com',
              'db_name': 'my_database',
              '__module__': '__main__',
              '__dict__': <attribute '__dict__' of 'Database' objects>,
              '__weakref__': <attribute '__weakref__' of 'Database' objects>})

In [35]:
help(DevConfig.database)

Help on class Database in module __main__:

class Database(builtins.object)
 |  Configs for Database section
 |  
 |  Data descriptors defined here:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak references to the object (if defined)
 |  
 |  ----------------------------------------------------------------------
 |  Data and other attributes defined here:
 |  
 |  db_host = 'dev.mynetwork.com'
 |  
 |  db_name = 'my_database'
 |  
 |  section_name = 'Database'



In [36]:
DevConfig.database.db_host, ProdConfig.database.db_host

('dev.mynetwork.com', 'prod.mynetwork.com')

##  Attribute Read Accessors

We saw in the lecture how `__getattribute__` and `__getattr__` work.

The more common approach is to let Python's default `__getattribute__` method do it's thing, and then override `__getattr__` to handle cases where an attribute could not be found.

#### Overriding `__getattr__`

Let's first see how the override works:

If we do not override the `__getattr__` method, here's what happens when we try to look up an non-existent attribute:

In [1]:
class Person:
    pass

In [2]:
p = Person()

try:
    p.name
except AttributeError as ex:
    print('AttributeError', ex)

AttributeError 'Person' object has no attribute 'name'


Now let's override the `__getattr__` method:

In [3]:
class Person:
    def __getattr__(self, name):
        print(f'__getattribute__ did not find {name}')
        return 'not found!'

In [4]:
p = Person()
p.name

__getattribute__ did not find name


'not found!'

As you can see we did not get an `AttributeError`, and our custom `__getattr__` method was called.

You do have to be careful to avoid infinite recursion though - remember that every attribute lookup that does not exist calls the `__getattr__` method, so something like this is going to cause us problems:

Suppose we want to implement functionality where if an attribute is not found we try to look up the corresponding "private" variable, e.g. if `attr` is not found, maybe try to look up `_attr`:

In [5]:
class Person:
    def __getattr__(self, name):
        print(f'Could not find {name}')
        alt_name = '_' + name
        if getattr(self, alt_name, None) is not None:
            return getattr(self, alt_name)
        else:
            raise AttributeError(f'Could not find {name} or {alt_name}')

In [6]:
p = Person()

In [7]:
p.age

Could not find age
Could not find _age
Could not find __age
Could not find ___age
Could not find ____age
Could not find _____age
Could not find ______age
Could not find _______age
Could not find ________age
Could not find _________age
Could not find __________age
Could not find ___________age
Could not find ____________age
Could not find _____________age
Could not find ______________age
Could not find _______________age
Could not find ________________age
Could not find _________________age
Could not find __________________age
Could not find ___________________age
Could not find ____________________age
Could not find _____________________age
Could not find ______________________age
Could not find _______________________age
Could not find ________________________age
Could not find _________________________age
Could not find __________________________age
Could not find ___________________________age
Could not find ____________________________age
Could not find ____________________________

RecursionError: maximum recursion depth exceeded while calling a Python object

The problem of course is the code `getattr(self, alt_name, None)`.

We have an attribute lookup for `alt_name` which does not exist, so `__getattr__` gets called again. and again. and again...

There's two ways we can fix this issue: reach directly into the instance dictionary, but attributes are not always stored in the instance dictionary, so instead we should use the attribute lookup mechanism in the `super()` object:

In [8]:
class Person:
    def __getattr__(self, name):
        alt_name = '_' + name
        print(f'Could not find {name}, trying {alt_name}...')
        try:
            return super().__getattribute__(alt_name)
        except AttributeError:
            raise AttributeError(f'Could not find {name} or {alt_name}')

In [9]:
p = Person()

In [10]:
try:
    p.age
except AttributeError as ex:
    print(type(ex).__name__, ex)

Could not find age, trying _age...
AttributeError Could not find age or _age


And of course if we have our class defined thus:

In [11]:
class Person:
    def __init__(self, age):
        self._age = age
        
    def __getattr__(self, name):
        print(f'Could not find {name}')
        alt_name = '_' + name
        try:
            return super().__getattribute__(alt_name)
        except AttributeError:
            raise AttributeError(f'Could not find {name} or {alt_name}')

In [12]:
p = Person(100)

In [13]:
p.age

Could not find age


100

Here you can see it succesfully looked up `_age` and returned that for us.

#### Example 1

Here we're going to create a class that behaves a little bit like `defaultdict`. 

If an attribute is requested that does not exist, we're going to set in in the instance, to some default value, and then return it.

In [14]:
class DefaultClass:
    def __init__(self, attribute_default=None):
        self._attribute_default = attribute_default
        
    def __getattr__(self, name):
        print(f'{name} not found. creating it and setting it to default...')
        setattr(self, name, self._attribute_default)
        return self._attribute_default

In [15]:
d = DefaultClass('NotAvailable')

In [16]:
d.test

test not found. creating it and setting it to default...


'NotAvailable'

In [17]:
d.__dict__

{'_attribute_default': 'NotAvailable', 'test': 'NotAvailable'}

And of course, the next time we request it, the `__getattr__` is no longer called:

In [18]:
d.test

'NotAvailable'

Which means we can set it to a different value and not have `__getattr__` stomp over the value we set:

In [19]:
d.test = 'hello'

In [20]:
d.test

'hello'

In [21]:
d.__dict__

{'_attribute_default': 'NotAvailable', 'test': 'hello'}

Now that we have this class defined, we could also inherit from it to provide this functionality to other classes:

In [22]:
class Person(DefaultClass):
    def __init__(self, name):
        super().__init__('Unavailable')
        self.name = name

In [23]:
p = Person('Raymond')

In [24]:
p.name

'Raymond'

In [25]:
p.age

age not found. creating it and setting it to default...


'Unavailable'

#### Example 2

Another use case might be logging the fact that a non-existent attribute was requested - sometimes useful in debugging complex applications and monitoring things.

When we do that we need to make sure we raise an `AttributeError` from our `__getattr__` method, since we don't actually want to provide a value for the attribute (in this particular case):

In [26]:
class AttributeNotFoundLogger:
    def __getattr__(self, name):
        err_msg = f"'{type(self).__name__}' object has no attribute '{name}'"
        print(f'Log: {err_msg}')
        raise AttributeError(err_msg)

In [27]:
class Person(AttributeNotFoundLogger):
    def __init__(self, name):
        self.name = name

In [28]:
p = Person('Raymond')

In [29]:
p.name

'Raymond'

In [30]:
try:
    p.age
except AttributeError as ex:
    print(f'AttributeError raised: {ex}')

Log: 'Person' object has no attribute 'age'
AttributeError raised: 'Person' object has no attribute 'age'


#### Example 3: Overriding `__getattribute__`

As we discussed in the lecture, `__getattribute__` is called for **every** attribute access on our object.

We'll come back to more examples of this, but let's do a simple example, where we want to disallow accessing any attribute names that start with an underscore:

In [31]:
class Person:
    def __init__(self, name, age):
        self._name = name
        self._age = age
        
    def __getattribute__(self, name):
        if name.startswith('_'):
            raise AttributeError(f'Forbidden access to {name}')
        return super().__getattribute__(name)

In [32]:
p = Person('Alex', 19)

In [33]:
try:
    p._name
except AttributeError as ex:
    print(ex)

Forbidden access to _name


We have a problem now, we don't have access to `_name` and no property for `name`. We could try to reach into the instance dictionary (assuming the attribute was stored there):

In [34]:
p.__dict__['_name']

AttributeError: Forbidden access to __dict__

Oh-oh... We have another problem - we can't even get to `__dict__`. LOL.

First let's fix the `__dict__` issue by preventing access to attribute names that start with `_` and not `__`:

In [35]:
class Person:
    def __init__(self, name, age):
        self._name = name
        self._age = age
        
    def __getattribute__(self, name):
        if name.startswith('_') and not name.startswith('__'):
            raise AttributeError(f'Forbidden access to {name}')
        return super().__getattribute__(name)

In [36]:
p = Person('Eric', 78)

In [37]:
p.__dict__

{'_name': 'Eric', '_age': 78}

Now let's implement properties for `name` and `age`:

In [38]:
class Person:
    def __init__(self, name, age):
        self._name = name
        self._age = age
        
    def __getattribute__(self, name):
        if name.startswith('_') and not name.startswith('__'):
            raise AttributeError(f'Forbidden access to {name}')
        return super().__getattribute__(name)
    
    @property
    def name(self):
        return self._name
    
    @property
    def age(self):
        return self._age

I hope before we even run this, that you realize we are going to have an issue...

In the properties, what did we do? We accessed `self._name` and `self._age`.

How is Python going to look up those attributes? By using the `__getattribute__` method - and we just stopped access to variables that start with a single underscore!

In [39]:
p = Person('Python', 42)

In [40]:
try:
    p.name
except AttributeError as ex:
    print(ex)

Forbidden access to _name


Somehow we need to bypass our custom implementation of `__getattribute__`. And we do that by delegating the attribute lookup to `super()` - that will use the standard lookup method (define in `object` in this case), and not our custom method.

In [41]:
class Person:
    def __init__(self, name, age):
        self._name = name
        self._age = age
        
    def __getattribute__(self, name):
        if name.startswith('_') and not name.startswith('__'):
            raise AttributeError(f'Forbidden access to {name}')
        return super().__getattribute__(name)
    
    @property
    def name(self):
        return super().__getattribute__('_name')
    
    @property
    def age(self):
        return super().__getattribute__('_age')

In [42]:
p = Person('Python', 42)

In [43]:
p.name

'Python'

Now let's mix in the functionality we had for `DefaultClass` by inheriting it.

Here's what that class looked like:

In [44]:
class DefaultClass:
    def __init__(self, attribute_default=None):
        self._attribute_default = attribute_default
        
    def __getattr__(self, name):
        print(f'{name} not found. creating it and setting it to default...')
        setattr(self, name, self._attribute_default)
        return self._attribute_default

Now this is going to create some problems if we just use it. Because we are trying to get `self._attribute_default`.
Since our custom `__getattribute__` forbids that, we'll have a problem. So here again, we'll start by delegating back to `super()` to use the `__getattribute__` from the parent:

In [45]:
class DefaultClass:
    def __init__(self, attribute_default=None):
        self._attribute_default = attribute_default
        
    def __getattr__(self, name):
        print(f'{name} not found. creating it and setting it to default...')
        default_value = super().__getattribute__('_attribute_default')
        setattr(self, name, default_value)
        return default_value

And now we can inherit `DefaultClass`:

In [46]:
class Person(DefaultClass):
    def __init__(self, name=None, age=None):
        super().__init__('Not Available')
        if name is not None:
            self._name = name
        if age is not None:
            self._age = age
        
    def __getattribute__(self, name):
        if name.startswith('_') and not name.startswith('__'):
            raise AttributeError(f'Forbidden access to {name}')
        return super().__getattribute__(name)
    
    @property
    def name(self):
        return super().__getattribute__('_name')
    
    @property
    def age(self):
        return super().__getattribute__('_age')


In [47]:
p = Person('Python', 42)

In [48]:
p.name, p.age

('Python', 42)

In [49]:
p.language

language not found. creating it and setting it to default...


'Not Available'

In [50]:
p.__dict__

{'_attribute_default': 'Not Available',
 '_name': 'Python',
 '_age': 42,
 'language': 'Not Available'}

### Overriding Class Attribute Accessors

So far we've been overriding these accessors as instance methods in our class - this means we are dealing with instance attribute access.

What about class attributes instead?

Since `__getattribute__` and `__getattr__` are always instance methods, this means we need to define them in the **metaclass** in order to override our class attribute access.

In [51]:
class MetaLogger(type):
    def __getattribute__(self, name):
        print('class __getattribute__ called...')
        return super().__getattribute__(name)
    
    def __getattr__(self, name):
        print('class __getattr__ called...')
        return 'Not Found'

In [52]:
class Account(metaclass=MetaLogger):
    apr = 10

In [53]:
Account.apr

class __getattribute__ called...


10

In [54]:
Account.apy

class __getattribute__ called...
class __getattr__ called...


'Not Found'

Apart from the fact that we defined these methods in the metaclass, everything else works the same way.

### Gets called for Method access as well

When we call our custom methods in a custom class, the method needs to be retrieved from the instance as well - so it uses the `__getattribute__` and `__getattr__` methods as well.

In [55]:
class MyClass:
    def __getattribute__(self, name):
        print(f'__getattribute__ called... for {name}')
        return super().__getattribute__(name)
    
    def __getattr__(self, name):
        print(f'__getattr__ called... for {name}')
        raise AttributeError(f'{name} not found')
    
    def say_hello(self):
        return 'hello'

In [56]:
m = MyClass()

In [57]:
m.say_hello()

__getattribute__ called... for say_hello


'hello'

In [58]:
m.other()

__getattribute__ called... for other
__getattr__ called... for other


AttributeError: other not found

##  Attribute Write Accessors

As we saw in the lecture there is one special method for attribute writes: `__setattribute__`.

Let's just see when it gets called:

In [1]:
class Person:
    def __setattr__(self, name, value):
        print('setting instance attribute...')
        super().__setattr__(name, value)

In [2]:
p = Person()

In [3]:
p.name = 'Guido'

setting instance attribute...


Of course, if we set a class attribute it does not get called:

In [4]:
Person.class_attr = 'test'

In order to override this setter for class attributes we would have to define it in the metaclass:

In [5]:
class MyMeta(type):
    def __setattr__(self, name, value):
        print('setting class attribute...')
        return super().__setattr__(name, value)
    
class Person(metaclass=MyMeta):
    def __setattr__(self, name, value):
        print('setting instance attribute...')
        super().__setattr__(name, value)

In [6]:
Person.test = 'test'

setting class attribute...


In [7]:
p = Person()
p.test = 'test'

setting instance attribute...


And as we discussed in the lecture, if our `__setattr__` is setting a **data** descriptor, then it calls the descriptor's `__set__` method instead:

In [8]:
class MyNonDataDesc:
    def __get__(self, instance, owner_class):
        print('__get__ called on non-data descriptor...')
        
class MyDataDesc:
    def __set__(self, instance, value):
        print('__set__ called on data descriptor...')
        
    def __get__(self, instance, owner_class):
        print('__get__ called on data descriptor...')

In [9]:
class MyClass:
    non_data_desc = MyNonDataDesc()
    data_desc = MyDataDesc()
    
    def __setattr__(self, name, value):
        print('__setattr__ called...')
        super().__setattr__(name, value)

In [10]:
m = MyClass()

In [11]:
m.__dict__

{}

In [12]:
m.data_desc = 100

__setattr__ called...
__set__ called on data descriptor...


In [13]:
m.non_data_desc = 200

__setattr__ called...


In [14]:
m.__dict__

{'non_data_desc': 200}

So `__setattr__` can be used to intercept and customize any attribute set operation on the instance that the method is defined for.

Just as with `__getattr__` or `__getattribute__` we have to extra careful with infinite recursion.

Suppose we want to disallow setting values for variables that start with a single underscore (but not a double underscore). We might try something like this:

In [15]:
class MyClass:
    def __setattr__(self, name, value):
        print('__setattr__ called...')
        if name.startswith('_') and not name.startswith('__'):
            raise AttributeError('Sorry, this attribute is read-only.')
        setattr(self, name, value)

In [16]:
m = MyClass()

This works fine:

In [17]:
try:
    m._test = 'test'
except AttributeError as ex:
    print(ex)

__setattr__ called...
Sorry, this attribute is read-only.


But this will not:

In [18]:
try:
    m.test = 'test'
except RecursionError as ex:
    print(ex)

__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr_

__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr__ called...
__setattr_

And of course this is because the line `self.name = value` we have in `__setattr__` is itself calling `__setattr__`. So instead, we have to delegate this back to the parent:

In [19]:
class MyClass:
    def __setattr__(self, name, value):
        print('__setattr__ called...')
        if name.startswith('_') and not name.startswith('__'):
            raise AttributeError('Sorry, this attribute is read-only.')
        super().__setattr__(name, value)

In [20]:
m = MyClass()

In [21]:
m.test = 'test'

__setattr__ called...


In [22]:
m.__dict__

{'test': 'test'}

So, just as with the getters, when we want to actually get to the attributes in our instance, we just need to distinguish wether we want the default way of getting/setting the attribute, or our custom override, and use `super()` accordingly. As long as you remember that, you should be fine :-)

##  Accessors - Application

Another useful application of `__getattr__` and `__setattr__` is dealing with objects where we may not know the attributes in advance.

Consider this scenario where we have a database with various tables and fields. We want to create a class that allows us to retrieve data from these tables.

We could certainly write a class for each specific table, and hardcode the fields as properties in the class - but that's going to create repetitive code, and anytime there is a new table or the schema of an existing table changes we'll have to revise our code.

I'm going to simulate a database here by using dictionaries. The outer dictionary will contain tables (as keys), and each table will contain records with a numeric key for each record.

In [1]:
DB = {
    'Person': {
        1: {'first_name': 'Isaac', 'last_name': 'Newton', 'born': 1642, 'country_id': 1},
        2: {'first_name': 'Gottfried', 'last_name': 'von Leibniz', 'born': 1646, 'country_id': 5},
        3: {'first_name': 'Joseph', 'last_name': 'Fourier', 'born': 1768, 'country_id': 3},
        4: {'first_name': 'Bernhard', 'last_name': 'Riemann', 'born': 1826, 'country_id': 5},
        5: {'first_name': 'David', 'last_name': 'Hilbert', 'born': 1862 , 'country_id': 5},
        6: {'first_name': 'Srinivasa', 'last_name': 'Ramanujan', 'born': 1887, 'country_id': 4},
        7: {'first_name': 'John', 'last_name': 'von Neumann', 'born': 1903, 'country_id': 2},
        8: {'first_name': 'Andrew', 'last_name': 'Wiles', 'born': 1928, 'country_id': 6}
    },
    'Country': {
        1: {'name': 'United Kingdom', 'capital': 'London', 'continent': 'Europe'},
        2 :{'name': 'Hungary', 'capital': 'Budapest', 'continent': 'Europe'},
        3: {'name': 'France', 'capital': 'Paris', 'continent': 'Europe'},
        4: {'name': 'India', 'capital': 'New Delhi', 'continent': 'Asia'},
        5: {'name': 'Germany', 'capital': 'Berlin', 'continent': 'Europe'},
        6: {'name': 'USA', 'capital': 'Washington DC', 'continent': 'North America'}
        }
}

Now we could certainly do something like this for each table:

In [2]:
class Country:
    def __init__(self, id_):
        if _id in DB['Country']:
            self._db_record = DB['Country'][id_]
        else:
            raise ValueError(f'Record not found (Country.id={id_})')

    @property
    def name(self):
        return self._db_record['name']
    
    @property
    def capital(self):
        return self._db_record['capital']
    
    @property
    def continent(self):
        return self._db_record['continent']

And we would have to do the same thing with the `Person` table, and any other table we want from our database. Tedious and repetitive code!!

We could create a metaclass that inspects the table structure and creates the appropriate fields, that would work well with code completion for example. 

But if we don't want to get too fancy, we can instead just use `__getattr__`. We'll implement the `__setattr__` as well, but of course in a real database situation you would need to implement some mechanism to persist the changes back to the database.

We are going to create a `DBTable` class that will be used to represent a table in the database, and we'll make it callable so we can pass the record id to the instance, which will return a `DBRecord` object that we can then use to access the fields in the table.

Let's write the `DBRecord` class first. This class will be passed a database record (so a dictionary in this example), and will be tasked with looking up "fields" (keys in this example) in the table (dictionary).

In [3]:
class DBRecord:
    def __init__(self, db_record_dict):
        # again, careful how you set a property on instances of this class
        # because we are overriding __setattr__ we cannot just use 
        # self._record = db_record_dict
        # this will call OUR version of `__setattr__`, which attempts to 
        # see if name is in _record - but _record does not exist yet, so it will
        # call __getattr__, which in turn tries to check if that is contained in _record
        # so, infinite recursion.
        # What we want to here is BYPASS our custom __setattr__ - so we'll use
        # the one in the superclass.
        super().__setattr__('_record', db_record_dict)    
        
    def __getattr__(self, name):
        # here we could write
        #     if name in self._record 
        # since this method should not get called
        # before _record as been created.
        # But just to be on the safe side, I'm still going to use super
        if name in super().__getattribute__('_record'):
            return self._record[name]
        else:
            raise AttributeError(f'Field name {name} does not exist.')

    def __setattr__(self, name, value):
        # and again here, we could write
        # if name in self._record, but I'm still going to use super
        if name in super().__getattribute__('_record'):
            # super().__setattr__(name, value)
            self._record[name] = value
        else:
            raise AttributeError(f'Field name {name} does not exist.')

Next, we define the `DBTable` class. It will be initialized with the name of the table we want to use in our instance. Furthermore we'll make it callable (passing in the record id) and that shoudl return an instance of `DBRecord` for the particular record.

In [4]:
class DBTable:
    def __init__(self, db, table_name):
        if table_name not in db:
            raise ValueError(f'The table {table_name} does not exist in the database.')
        self._table_name = table_name
        self._table = db[table_name]
        
    @property
    def table_name(self):
        return self._table_name
    
    def __call__(self, record_id):
        if record_id not in self._table:
            raise ValueError(f'Specified id ({record_id}) does not exist '
                             f'in table {self._table_name}')
        return DBRecord(self._table[record_id])

And now we can use our classes this way:

In [5]:
tbl_person = DBTable(DB, 'Person')
tbl_country = DBTable(DB, 'Country')

In [6]:
person_1 = tbl_person(1)

In [7]:
person_1.first_name, person_1.last_name, person_1.born, person_1.country_id

('Isaac', 'Newton', 1642, 1)

In [8]:
country_1 = tbl_country(person_1.country_id)

In [9]:
country_1.name, country_1.capital

('United Kingdom', 'London')

There's quite a bit more functionality we might want to add - maybe a way to determine all the fields available in a record for example:

In [10]:
class DBRecord:
    def __init__(self, db_record_dict):
        # again, careful how you set a property on instances of this class
        # because we are overriding __setattr__ we cannot just use 
        # self._record = db_record_dict
        # this will call OUR version of `__setattr__`, which attempts to 
        # see if name is in _record - but _record does not exist yet, so it will
        # call __getattr__, which in turn tries to check if that is contained in _record
        # so, infinite recursion.
        # What we want to here is BYPASS our custom __setattr__ - so we'll use
        # the one in the superclass.
        super().__setattr__('_record', db_record_dict)    
        
    def __getattr__(self, name):
        # here we could write
        #     if name in self._record 
        # since this method should not get called
        # before _record as been created.
        # But just to be on the safe side, I'm still going to use super
        if name in super().__getattribute__('_record'):
            return self._record[name]
        else:
            raise AttributeError(f'Field name {name} does not exist.')

    def __setattr__(self, name, value):
        # and again here, we could write
        # if name in self._record, but I'm still going to use super
        if name in super().__getattribute__('_record'):
            self._record[name] = value
        else:
            raise AttributeError(f'Field name {name} does not exist.')
            
    @property
    def fields(self):
        return tuple(self._record.keys())

In [11]:
tbl_person = DBTable(DB, 'Person')

In [12]:
person_1 = tbl_person(2)

In [13]:
person_1.fields

('first_name', 'last_name', 'born', 'country_id')

We can of course set the field values, via the `__setattr__`:

In [14]:
person_1.last_name

'von Leibniz'

In [15]:
person_1.last_name = 'Leibniz'

In [16]:
person_1.last_name

'Leibniz'

In [17]:
person_1.__dict__

{'_record': {'first_name': 'Gottfried',
  'last_name': 'Leibniz',
  'born': 1646,
  'country_id': 5}}

There are many more improvements we could make, but this is good enough to show how we can use `__getattr__` and `__setattr__`.

The main difficulty with using `__getattr__` and, especially, `__setattr__` is to make sure we do not accidentally create recursive calls.