## Single Dispatch Generic Functions
First, let's define what overloading is:

`Overloading` in object-oriented programming is the ability to define more than one function with the same name, as long as their signatures differ (essentially, the functions must be distinguishable, e.g. by a different number or type of arguments). In a statically typed language, the compiler uses the signature at each call site to work out which of the same-named functions we are referring to.

In Python, argument types are not part of a function's signature, and a second `def` with the same name simply rebinds that name; therefore overloading, in its strict sense, is not possible. A workaround to this problem is the `single dispatch generic function`, which allows us to overload a function based on the type of its first argument (if we want to consider the types of more arguments we need `multiple dispatch`).
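A quick way to convince ourselves that same-name overloading does not work is to define a function twice. This is only a sketch; the name `describe` is hypothetical and chosen just for illustration.

```python
def describe(x):
    return f"one argument: {x}"

def describe(x, y):  # rebinds the name; the one-argument version is gone
    return f"two arguments: {x}, {y}"

two_args = describe(1, 2)

try:
    describe(1)  # the first definition no longer exists under this name
    one_arg_error = None
except TypeError as exc:
    one_arg_error = exc
```

The second `def` wins, so calling `describe` with a single argument raises a `TypeError` instead of selecting the first definition.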


# `HTMLizer.py`
HTMLizer is a library for converting Python values to HTML text. It will make use of single dispatch generic functions to choose the HTML representation dynamically, based on the type of the object encountered.

As a first attempt, we are going to create a series of functions, each able to convert one kind of object to HTML. Our single dispatch function will then delegate to them: in the end it should recognize which type of object it is receiving and use the correct function to htmlize the content.

In [2]:
from html import escape     

def html_escape(arg):
    return escape(str(arg))

def html_int(a):
    return f'{a}(<i>{hex(a)}</i>)'

def html_real(a):
    return f'{round(a, 2):.2f}'  # keep two decimals; round(a) alone would drop the fractional part

def html_str(s):
    return html_escape(s).replace('\n', '<br/>\n')

def html_list(l):
    items = (f'<li>{htmlize(item)}</li>' for item in l)
    return '<ul>\n'+'\n'.join(items)+'\n</ul>'

def html_dict(d):
    items = (f'<li>{html_escape(key)}={htmlize(value)}</li>' for key, value in d.items())
    return '<ul>\n' + '\n'.join(items) + '\n</ul>'


# N.B. the function `htmlize` called in html_list and html_dict is the single dispatcher defined below.
# This works because Python looks up a name used inside a function body only when the function is called,
# so `htmlize` just needs to exist before html_list or html_dict is first executed.
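The call-time lookup described in the note above can be seen in isolation with a small sketch (the names `outer` and `helper` are hypothetical):

```python
def outer():
    # `helper` is looked up when outer() runs, not when outer is defined
    return helper()

try:
    outer()  # helper does not exist yet
    early_call_failed = False
except NameError:
    early_call_failed = True

def helper():
    return 'resolved at call time'

late_result = outer()  # now the lookup succeeds
```

Defining `outer` before `helper` is fine; only calling it too early fails.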



In [3]:
print(html_str("""This is a very long sentence
that span on multiple lines and contains
special characters 10 > 9 """))

This is a very long sentence<br/>
that span on multiple lines and contains<br/>
special characters 10 &gt; 9 


In [4]:
html_int(100)

'100(<i>0x64</i>)'

Now let's define the first version of our single dispatcher:    

In [5]:
def htmlize(arg):

    if isinstance(arg, int): 
        return html_int(arg)

    elif isinstance(arg, float):
        return html_real(arg)

    elif isinstance(arg, str):
        return html_str(arg)

    elif isinstance(arg, (list, tuple)):
        return html_list(arg)

    elif isinstance(arg, dict):
        return html_dict(arg)

    else: # if the instance is not included in our functions
        return html_escape(arg)

Now we can call our single dispatcher on any object and it will automatically select the correct function to convert it to HTML.

In [6]:
print(htmlize(['10 > 1 \n vero!', (1,2), {0:'zero'}]))

<ul>
<li>10 &gt; 1 <br/>
 vero!</li>
<li><ul>
<li>1(<i>0x1</i>)</li>
<li>2(<i>0x2</i>)</li>
</ul></li>
<li><ul>
<li>0=zero</li>
</ul></li>
</ul>


Now, it works, but there is a fundamental design problem: each time we need to add a `type` to the dispatcher, we first have to write the appropriate function and then extend the horrendous if/elif statement inside htmlize. This is not a good approach, since each time we have to go back into our function and add new code. What we want to achieve is to be able to add instructions to the `htmlize` function from outside its body, i.e. to dynamically update its ability to recognize new types.

As a first step, we are going to get rid of the if/elif stack and replace it with a more elegant dictionary.

In [7]:
def htmlize(arg):

    registry = {
        object: html_escape, # everything is an object, therefore any unknown type will fall here
        int: html_int,
        float: html_real,
        str: html_str,
        list: html_list,
        tuple: html_list,
        dict: html_dict
    }

    # Now we check the instance of arg and look up in the dictionary for its associated function

    fn = registry.get(type(arg), registry[object]) # if the type(arg) is not found, fall back to key `object`

    '''
    N.B. the problem is that now we look only at the exact `type` of the object, therefore, if arg is an instance of a class
    that inherits from, let's say, list, the dispatcher won't be able to recognize it (the isinstance version above could).
    What we should do, but it gets more complicated for now, is not to rely on the exact `type` but to use abstract base classes (import abc)
    '''

    return fn(arg)
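To make the subclass limitation concrete, here is a minimal sketch comparing the exact-type lookup with one that walks the method resolution order. `MyList`, `exact_lookup` and `mro_lookup` are hypothetical names, and this is only one possible fix; it does not handle abstract base classes.

```python
from html import escape

class MyList(list):
    """Hypothetical subclass used only to illustrate the lookup problem."""

def html_escape(arg):
    return escape(str(arg))

def html_list(l):
    items = (f'<li>{html_escape(item)}</li>' for item in l)
    return '<ul>\n' + '\n'.join(items) + '\n</ul>'

registry = {object: html_escape, list: html_list}

def exact_lookup(arg):
    # Same strategy as above: only an exact type match is found.
    return registry.get(type(arg), registry[object])

def mro_lookup(arg):
    # Walk the class hierarchy from most to least specific.
    for cls in type(arg).__mro__:
        if cls in registry:
            return registry[cls]
    return registry[object]

data = MyList([1, 2])
exact_fn = exact_lookup(data)  # falls back to html_escape
mro_fn = mro_lookup(data)      # finds html_list through MyList -> list
```

The exact lookup misses `MyList` and falls back to the default, while the MRO walk finds the `list` entry.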


In [8]:
print(htmlize([1,2,3]))

<ul>
<li>1(<i>0x1</i>)</li>
<li>2(<i>0x2</i>)</li>
<li>3(<i>0x3</i>)</li>
</ul>


We have cleaned up our htmlize function a lot, but we still need to go inside it each time we want to add new entries to the registry dictionary.

In [9]:
def singledispatch(fn):
    registry = {}

    registry[object] = fn # default function like html_escape

    def inner(arg): # single dispatcher == we expect that the function fn we are decorating only requires one argument
        return registry[object](arg) # call the function from the registry and then apply it to (arg)

    return inner

In [10]:
@singledispatch
def htmlize(arg):
    return escape(str(arg))

htmlize('100 > 10')

'100 &gt; 10'

This is essentially a simple decorator that, right now, is pointless, since it only takes a function and applies it to an argument. But let's expand the concept.

We are going to expand the registry with more entries, for int and str, and the inner function will now look at type(arg) to decide which key of the registry to use.

In [11]:
def singledispatch(fn):
    registry = {}

    registry[object] = fn # default function like html_escape
    registry[int] = lambda a: f'{a}(<i>{hex(a)}</i>)'
    registry[str] = lambda s: escape(s).replace('\n', '<br/>\n')


    def inner(arg): # single dispatcher == we expect that the function fn we are decorating only requires one argument
        selected = registry.get(type(arg), registry[object]) # find the association with the type of arg in the registry
        return selected(arg) # call the selected function from the registry and then apply it to (arg)

    return inner

We still need the initial decoration of htmlize, which returns escape(str(arg)), for two reasons: to initialize the registry and to associate the default function with the key `object` (registry[object]).

In [12]:
@singledispatch
def htmlize(arg):
    return escape(str(arg))

htmlize(100)

'100(<i>0x64</i>)'

Very cool, but we are still writing the functions into the registry from inside the single dispatcher, and that is not what we want. We want to be able to inject entries into the registry from outside the function! The dispatcher is not generic enough: the registry keys and their associated functions are still hardcoded!

To do this, we are going to create a decorator factory (a decorator that takes arguments, also called a parametrized decorator) inside the single dispatcher. The purpose of this decorator is to add key-value pairs to the registry. This is possible because `register` (the decorator factory) lives inside the local scope of `singledispatch` and therefore has access to its variable `registry` (a nonlocal, or free, variable from the decorator's point of view).

In [13]:
def singledispatch(fn):
    registry = {}

    registry[object] = fn # default function like html_escape

    def decorated(arg): # changed the name to highlight the scope of this decorator
        selected = registry.get(type(arg), registry[object]) # find the association with the type of arg in the registry
        return selected(arg) # call the selected function from the registry and then apply it to (arg)

    def register(type_): # decorator factory that takes the type of arg

        def inner(fn): # actual decorator that take the function to associate with type(arg)
            registry[type_] = fn # registry is a nonlocal variable of single dispatch so we have access to it
            return fn

        return inner
    
    return decorated


In [32]:
@singledispatch
def htmlize(arg):
    return escape(str(arg))

htmlize # htmlize has become the `decorated` function

<function __main__.singledispatch.<locals>.decorated(arg)>

As we can see, `htmlize` has now been decorated by the single dispatcher, and since the registry holds only the default entry, associated with `escape(str(arg))`, it can do nothing more than that. E.g. if we call htmlize on an int, it will just escape it:

In [15]:
htmlize(100)

'100'

The problem now is how to get to the `register` function, i.e. how to access it from outside the single dispatcher.

The solution is elegant and easy: we assign `register` as an attribute of the `decorated` function that the single dispatcher returns. (This is not really monkey patching, which usually means altering existing classes or modules at runtime; here we simply set an attribute on a function object that we created ourselves.)
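The same trick works for any closure, not just for dispatchers. As a minimal sketch (the names `make_counter`, `increment` and `reset` are hypothetical), an inner helper can be exposed through an attribute of the function we return:

```python
def make_counter():
    count = 0  # free variable shared by the two closures below

    def increment():
        nonlocal count
        count += 1
        return count

    def reset():
        nonlocal count
        count = 0

    # Functions are objects, so we can hang `reset` on `increment`
    # and reach it from outside through the returned function.
    increment.reset = reset
    return increment

counter = make_counter()
counter()
second = counter()       # count is now 2
counter.reset()
after_reset = counter()  # counting starts again from 1
```

`reset` is never returned directly, yet it is reachable as `counter.reset` and still shares `count` with `increment`.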

In [16]:
def singledispatch(fn):
    registry = {}

    registry[object] = fn 

    def decorated(arg): 
        selected = registry.get(type(arg), registry[object]) 
        return selected(arg) 

    def register(type_): 

        def inner(fn): 
            registry[type_] = fn 
            return fn

        return inner

    '''
    we assign `register` as an attribute to `decorated`
    in this way, since `singledispatch` is returning
    `decorated`, we are able to access `register` from the outside
    '''
    decorated.register = register

    '''
    N.B. the name .register is arbitrary: it is just the name we give to
    the attribute of `decorated` to which we assign the `register`
    decorator factory
    '''

    return decorated

Now, if we decorate `htmlize`, we are going to be able to access `register`:

In [17]:
@singledispatch
def htmlize(arg):
    return escape(str(arg))

htmlize.register

<function __main__.singledispatch.<locals>.register(type_)>

Here we go: now we can use `register`, i.e. use the decorator factory, to populate the `registry` dictionary.

`register` is coded so that it can receive an argument (type_) and `htmlize.register(int)` actually returns a decorator (the function `inner` inside `register`). Therefore, we can use it to decorate our function (for example `html_int`) passing the `type()` we want to associate to this function as the argument of `register`.

In [18]:
@htmlize.register(int)
def html_int(a):
    return f'{a}(<i>{hex(a)}</i>)'

N.B. the role of the decorator is only to insert the pairs (type: html_function) in the dictionary `registry`. The function itself is not touched


```py
def register(type_):

    def inner(fn):
        registry[type_] = fn
        # N.B. we are returning fn itself and not a wrapper around it.
        # In a general decorator factory we would have another closure
        # inside inner, and that closure would be the return value.
        return fn  # <---

    return inner
```
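It may help to see what the `@` syntax expands to. The sketch below uses a standalone module-level `registry` and `register` (an assumption made only to keep the example self-contained; in the notes they live inside `singledispatch`):

```python
registry = {}

def register(type_):
    def inner(fn):
        registry[type_] = fn
        return fn  # the function comes back unchanged
    return inner

def html_int(a):
    return f'{a}(<i>{hex(a)}</i>)'

original = html_int

# Equivalent to writing @register(int) above the def:
html_int = register(int)(html_int)
```

First `register(int)` is called and returns `inner`; then `inner(html_int)` stores the function in the registry and hands the very same object back.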

As a matter of fact, if we inspect the html_int function, it is still itself and not a decorated version.

In [19]:
html_int

<function __main__.html_int(a)>

But if we call the function `htmlize` on an integer, it will correctly look in the registry and find the association `int`:`html_int`

In [20]:
htmlize(100)

'100(<i>0x64</i>)'

In the same way, we are now able to "register" in the `registry` dictionary every function we need to htmlize our objects, without having to hardcode anything in the `htmlize` function itself (it can even be imported from another module).

In [22]:
@htmlize.register(str)
def html_str(s):
    return escape(s).replace('\n', '<br/>\n')

@htmlize.register(list)
def html_list(l):
    items = (f'<li>{htmlize(item)}</li>' for item in l)
    return '<ul>\n'+'\n'.join(items)+'\n</ul>'

Since the decorator returns the function without modifying it, we can stack multiple decorators in order to associate the same function with more than one key (type).

In [23]:
@htmlize.register(tuple)
@htmlize.register(list)
def html_sequence(l):
    items = (f'<li>{htmlize(item)}</li>' for item in l)
    return '<ul>\n'+'\n'.join(items)+'\n</ul>'
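As a small sketch of what stacking does, the standalone `register` below also records the order in which types are registered (the `order` list and the module-level `registry` are assumptions added only for illustration):

```python
registry = {}
order = []  # records registration order, for illustration only

def register(type_):
    def inner(fn):
        order.append(type_)
        registry[type_] = fn
        return fn
    return inner

@register(tuple)
@register(list)
def html_sequence(seq):
    return '<ul>' + ''.join(f'<li>{item}</li>' for item in seq) + '</ul>'
```

Decorators are applied from the bottom up, so `list` is registered first and `tuple` second, and both keys end up pointing at the same function object.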

Now, something useful that is missing is the ability to access the `registry` dictionary to see which functions have been registered. To do this, we only need to add another attribute to the `decorated` function, in the same way we added the `register` attribute.

In [24]:
def singledispatch(fn):
    registry = {}

    registry[object] = fn 

    def decorated(arg): 
        selected = registry.get(type(arg), registry[object]) 
        return selected(arg) 

    def register(type_): 

        def inner(fn): 
            registry[type_] = fn 
            return fn

        return inner

    decorated.register = register
    decorated.registry = registry

    return decorated

Now we can access the registry attribute from htmlize and see that it already contains the default association with the `object` type.

In [29]:
@singledispatch
def htmlize(arg):
    return escape(str(arg))

htmlize.registry

{object: <function __main__.htmlize(arg)>}

In [31]:
@htmlize.register(str)
def html_str(s):
    return escape(s).replace('\n', '<br/>\n')

@htmlize.register(tuple)
@htmlize.register(list)
def html_sequence(l):
    items = (f'<li>{htmlize(item)}</li>' for item in l)
    return '<ul>\n'+'\n'.join(items)+'\n</ul>'

htmlize.registry

{object: <function __main__.htmlize(arg)>,
 str: <function __main__.html_str(s)>,
 list: <function __main__.html_sequence(l)>,
 tuple: <function __main__.html_sequence(l)>}

N.B. this is something that I usually don't want in a production environment; I don't want the user to have write access to the registry dictionary.

Something more appropriate would be to let the user access the `registry` indirectly, to find which function is associated with a particular `type`. Therefore, we create a dispatch function that takes a type as its argument and returns the associated function from the dictionary if it exists, or the default registered under `object` otherwise. Again, the only way to access it from the outside is to make it an attribute of the `decorated` function that `singledispatch` returns.
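One possible middle ground, sketched here as an assumption rather than as part of the original design, is to expose the registry through `types.MappingProxyType`, which gives a read-only live view of the dictionary:

```python
from html import escape
from types import MappingProxyType

def singledispatch(fn):
    registry = {object: fn}

    def decorated(arg):
        return registry.get(type(arg), registry[object])(arg)

    def register(type_):
        def inner(fn):
            registry[type_] = fn
            return fn
        return inner

    decorated.register = register
    decorated.registry = MappingProxyType(registry)  # read-only live view
    return decorated

@singledispatch
def htmlize(arg):
    return escape(str(arg))

@htmlize.register(int)
def html_int(a):
    return f'{a}(<i>{hex(a)}</i>)'

try:
    htmlize.registry[str] = str.upper  # writing through the view is refused
    write_blocked = False
except TypeError:
    write_blocked = True
```

The view reflects later registrations, but users can only change the registry through `register`.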

In [34]:
def singledispatch(fn):
    registry = {}

    registry[object] = fn 

    def decorated(arg): 
        selected = registry.get(type(arg), registry[object]) 
        return selected(arg) 

    def register(type_): 

        def inner(fn): 
            registry[type_] = fn 
            return fn

        return inner

    def dispatch(type_): # same lookup as `decorated`, except that it returns the function instead of calling it
        selected = registry.get(type_, registry[object]) 
        return selected


    decorated.register = register
    #decorated.registry = registry # use only for debug
    decorated.dispatch = dispatch 

    return decorated

Now we can call the dispatch function on a particular type and see which function, if any, is associated with it in the registry dictionary.

In [35]:
@singledispatch
def htmlize(arg):
    return escape(str(arg))

@htmlize.register(str)
def html_str(s):
    return escape(s).replace('\n', '<br/>\n')

@htmlize.register(tuple)
@htmlize.register(list)
def html_sequence(l):
    items = (f'<li>{htmlize(item)}</li>' for item in l)
    return '<ul>\n'+'\n'.join(items)+'\n</ul>'

htmlize.dispatch(list)

<function __main__.html_sequence(l)>

Everything works just fine! But there is a limitation: we look up the exact `type` of the argument passed to our htmlize function, so subclasses and related types are not recognized. The lookup can be made more general by using `abstract base classes`. For example, list and tuple are both recognized as subclasses of the abstract `Sequence` class (in `collections.abc`), just as int and bool are recognized as subclasses of the `Integral` class (in `numbers`).

Of course, Python is a batteries-included programming language, and its standard library has a single dispatcher that is able to use abstract base classes:
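The relationships with the abstract base classes can be checked directly. Note, in this small sketch, that the ABCs do not appear in the built-in types' MRO: the built-ins are registered with them as virtual subclasses, which is why a plain `type()` lookup cannot find them.

```python
from collections.abc import Sequence
from numbers import Integral

checks = {
    'list is a Sequence': issubclass(list, Sequence),
    'tuple is a Sequence': issubclass(tuple, Sequence),
    'str is a Sequence': issubclass(str, Sequence),
    'int is Integral': issubclass(int, Integral),
    'bool is Integral': issubclass(bool, Integral),
    'Sequence in list MRO': Sequence in list.__mro__,
}
```

Every check is `True` except the last one. It is also worth noticing that `str` counts as a `Sequence`, which matters when a handler is registered for sequences.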

In [47]:
from functools import singledispatch
from numbers import Integral
from collections.abc import Sequence

@singledispatch
def htmlize(arg):
    return escape(str(arg))

Just as with the dispatcher we defined, it is possible to access the registry, to register functions and to dispatch on a type:

In [48]:
htmlize.registry

mappingproxy({object: <function __main__.htmlize(arg)>})

In [49]:
htmlize.dispatch

<function functools.singledispatch.<locals>.dispatch(cls)>

The main difference is that the built-in dispatcher does not require an exact match on type(arg): it follows the class hierarchy of the argument (its method resolution order), so it also recognizes subclasses and the abstract base classes of objects.
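As a sketch of how this plays out, we can register handlers on `Integral` and `Sequence` with `functools.singledispatch`. The handler names are our own; the explicit `str` handler is needed because `str` is itself a `Sequence`.

```python
from collections.abc import Sequence
from functools import singledispatch
from html import escape
from numbers import Integral

@singledispatch
def htmlize(arg):
    return escape(str(arg))

@htmlize.register(Integral)
def html_integral(a):
    return f'{a}(<i>{hex(a)}</i>)'

@htmlize.register(Sequence)
def html_sequence(seq):
    items = (f'<li>{htmlize(item)}</li>' for item in seq)
    return '<ul>\n' + '\n'.join(items) + '\n</ul>'

# str is itself a Sequence, so it needs its own, more specific handler;
# otherwise html_sequence would recurse on one-character strings forever.
@htmlize.register(str)
def html_str(s):
    return escape(s).replace('\n', '<br/>\n')

result = htmlize([1, (2, 'a > b')])
```

Here `htmlize.dispatch(list)` and `htmlize.dispatch(tuple)` both resolve to `html_sequence`, and `htmlize.dispatch(bool)` resolves to `html_integral`, even though none of those concrete types was registered explicitly.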