# Python: dive into the functions

Recently I found a <a href="http://eli.thegreenplace.net/">really useful blog</a>, and it motivated me to write my own posts about Python and some other technologies. As its author said in <a href="http://eli.thegreenplace.net/2003/05/06/why-start-a-weblog">his first post</a>, writing about the things you discover helps you remember how they work, and I hope I won't be too lazy to keep writing :)

Let's define an easy function.

In [1]:
def foo(a=[]):
    a.append(5)
    return a

What does the following code do?
```python
for i in xrange(4):
    foo()

result = foo()
```

If you think that `result == [5]`, there are a few things to learn about Python: the actual value is `[5, 5, 5, 5, 5]`.
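
Here is a minimal sketch of what happens. It is written with `range` and the `__defaults__` spelling (the Python 3 name of `func_defaults`, covered later in this post) so that it runs on a current interpreter. `foo_fixed` is a hypothetical name for the usual `None`-sentinel workaround; it is not part of the original example.

```python
def foo(a=[]):
    a.append(5)
    return a

# The default list is created once, when `def` is executed,
# and stored on the function object itself.
assert foo.__defaults__ == ([],)

for i in range(4):
    foo()

result = foo()
assert result == [5, 5, 5, 5, 5]
# Every call appended to the very same list object.
assert result is foo.__defaults__[0]


def foo_fixed(a=None):
    # Hypothetical variant: build a fresh list on each call instead.
    if a is None:
        a = []
    a.append(5)
    return a

assert foo_fixed() == [5]
assert foo_fixed() == [5]
```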

As you may know, everything in Python is an object. `3`, `3.5`, `"Hello World"`... are objects, and so are the types (`int`, `float` or `type`) and the functions. The question is: *what is a function object?*

## The function object ?

At this point, you can see that in your globals, the name `foo` refers to a `function` object:

```python
>>> globals()
{
    # .../...
    'foo': <function foo at 0x2b919f6a17d0>,
    # .../...
}
>>> type(foo)
<type 'function'>
>>> print type(foo).__doc__
function(code, globals[, name[, argdefs[, closure]]])

Create a function object from a code object and a dictionary.
The optional name string overrides the name from the code object.
The optional argdefs tuple specifies the default argument values.
The optional closure tuple supplies the bindings for free variables.
```

Functions are first-class objects. This means that, as with any other object, you can:
- store them in variables or data structures (e.g. classes)
- compare them with other entities
- pass them as a parameter to, or return them as the result of, another function
- construct them at runtime
- print or read them
- manipulate their attributes
- etc.
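
A short sketch of a few of these properties in action. All the names below (`double`, `square`, `compose`, `operations`, `unit`) are made up for the illustration.

```python
def double(x):
    return x * 2

def square(x):
    return x * x

# Store functions in a data structure.
operations = {"double": double, "square": square}

# Pass functions as parameters and get a new one back as a result.
def compose(f, g):
    def composed(x):
        return f(g(x))
    return composed  # constructed at runtime

double_then_square = compose(square, double)
assert double_then_square(3) == 36

# Compare them with other objects.
assert operations["double"] is double
assert double != square

# Manipulate their attributes.
double.unit = "meters"
assert double.unit == "meters"
```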


## The function object's attributes

In [2]:
print [e for e in dir(type(foo)) if not e.startswith("_")]

['func_closure', 'func_code', 'func_defaults', 'func_dict', 'func_doc', 'func_globals', 'func_name']


A function object exposes 7 public attributes. This notebook was written with the Python 2.7 interpreter.

From the <a href="https://docs.python.org/3.0/whatsnew/3.0.html#operators-and-special-methods">What's new in Python 3.0</a> :
<quote>The function attributes named `func_X` have been renamed to use the `__X__` form, freeing up these names in the function attribute namespace for user-defined attributes.</quote>

So Python 3.x no longer supports these attributes and uses the following instead:

In [3]:
print ["__%s__" % e[5:] for e in dir(type(foo)) if not e.startswith("_")]

['__closure__', '__code__', '__defaults__', '__dict__', '__doc__', '__globals__', '__name__']
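
As an illustration, here is a small sketch that reads a few of these attributes through their `__X__` spelling, so it runs on Python 3. `describe` is a hypothetical helper, not a standard function, and a docstring has been added to `foo` for the sake of the example.

```python
def foo(a=[]):
    """Append 5 to `a` and return it."""
    a.append(5)
    return a

def describe(func):
    # Hypothetical helper: collect a few attributes of a function
    # through their `__X__` names.
    code = func.__code__
    return {
        "name": func.__name__,
        "doc": func.__doc__,
        "defaults": func.__defaults__,
        "dict": func.__dict__,
        "closure": func.__closure__,
        "argnames": code.co_varnames[:code.co_argcount],
    }

info = describe(foo)
assert info["name"] == "foo"
assert info["defaults"] == ([],)
assert info["dict"] == {}
assert info["closure"] is None
assert info["argnames"] == ("a",)
```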


In [4]:
public_attributes = (e for e in dir(foo) if not e.startswith('_') and e != 'func_globals')
print "\n".join("%-14s : %s" % (e, getattr(foo, e)) for e in public_attributes)

func_closure   : None
func_code      : <code object foo at 0x13bdbb0, file "<ipython-input-1-a62784216969>", line 1>
func_defaults  : ([],)
func_dict      : {}
func_doc       : None
func_name      : foo


### The function describers

The following attributes help to represent the Python function to the user:

- the `func_name` (or `__name__`) attribute is the original function name. This value is used when you want to represent your object (e.g. in the `__repr__` method).

```python
    def my_func(): pass
    bar = my_func
    assert bar.func_name == my_func.func_name == "my_func"  # True
    assert str(bar).startswith("<function my_func at")  # True
```

- the `func_doc` (or `__doc__`) attribute contains the docstring of your function. It is heavily used by tools such as <a href="http://sphinx-doc.org/">Sphinx</a> (or even the `help` function) to generate the function's documentation (so don't forget to write your docstring!)
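
A minimal sketch of how the docstring can be read back, using the `__doc__` spelling and the standard `inspect.getdoc` function. The `area` function is only an example.

```python
import inspect

def area(width, height):
    """Return the area of a rectangle.

    Both arguments are expected to be numbers.
    """
    return width * height

# The raw docstring keeps the indentation of the source code...
assert area.__doc__.startswith("Return the area of a rectangle.")

# ...while inspect.getdoc() strips that indentation.
doc = inspect.getdoc(area)
assert doc.splitlines() == [
    "Return the area of a rectangle.",
    "",
    "Both arguments are expected to be numbers.",
]

# The name follows the function object, not the variable it is bound to.
alias = area
assert alias.__name__ == "area"
```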

### The end-user storage properties

Sometimes it is useful to tag your function with extra data, for whatever reason. If you often use decorators, you may have heard about the `synchronized` decorator. This decorator is intended to do the same thing as the `synchronized` keyword in Java. In a threaded context, sometimes you don't want two threads to run a function at the same time. The decorator looks like this:

```python
from functools import wraps
import threading

def synchronized(func):
    """ Decorator to make a function thread-safe """
    lock = threading.Lock()
    
    @wraps(func)
    def _wrapper(*args, **kwargs):
        """ Wrapper of the function to lock the thread until the lock is acquired """
        with lock:
            return func(*args, **kwargs)
    
    return _wrapper

@synchronized
def foo():
    print "I'm synchronized !"
```

Sounds cool, doesn't it? Yes, but sometimes you want a specific lock on your function. For example, if you have 2 functions that must share the same lock (because they read the same socket, for example), a good way could be to register the lock in a shared space and then adapt your `synchronized` decorator a bit.

Good news: there is an end-user dictionary for registering metadata of the function (in this example, a lock). This dictionary is available as the `func_dict` attribute, and here is an example of how to use it.

```python
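def synchronized_v2(lock=None):
    if lock is None:
        lock = threading.Lock()
    
    def _decorator(func):
        @wraps(func)
        def _wrapper(*args, **kwargs):
            # Look the lock up at call time, on the wrapper itself,
            # so that replacing func_dict["lock"] later takes effect
            with _wrapper.func_dict["lock"]:
                return func(*args, **kwargs)
        
        # Store the lock on the wrapper: it is the object the caller sees
        _wrapper.func_dict["lock"] = lock
        return _wrapper
    return _decorator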

@synchronized_v2()
def bar(): pass

@synchronized_v2(bar.func_dict["lock"])
def joe(): pass
```

The functions `bar` and `joe` are now synchronized with the same lock.

With this kind of code, you can easily maintain or change the lock of your functions by changing their `func_dict["lock"]` entry. This is a good way to handle it, but there are tons of other ways, as you can see on <a href="http://blog.dscpl.com.au/2014/01/the-missing-synchronized-decorator.html">Graham Dumpleton's blog</a>.
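
To see the shared lock at work, here is a self-contained sketch written so that it runs on Python 3. It stores the lock as a plain attribute of the wrapper (`_wrapper.lock`), which is the same storage as the `func_dict` / `__dict__` dictionary. `synchronized_attr`, `increment`, `decrement`, `worker` and `counter` are hypothetical names chosen for the example.

```python
import threading
from functools import wraps

def synchronized_attr(lock=None):
    # Hypothetical variant of the decorator above, storing the lock as a
    # plain attribute of the wrapper.
    if lock is None:
        lock = threading.Lock()

    def _decorator(func):
        @wraps(func)
        def _wrapper(*args, **kwargs):
            # Looked up at call time, so the lock can be replaced later.
            with _wrapper.lock:
                return func(*args, **kwargs)
        _wrapper.lock = lock
        return _wrapper
    return _decorator

counter = {"value": 0}

@synchronized_attr()
def increment():
    counter["value"] += 1

@synchronized_attr(increment.lock)
def decrement():
    counter["value"] -= 1

# Both functions are guarded by the very same lock object,
# which lives in the function's attribute dictionary.
assert increment.lock is decrement.lock
assert increment.__dict__["lock"] is increment.lock

def worker(func, times):
    for _ in range(times):
        func()

threads = [threading.Thread(target=worker, args=(f, 1000))
           for f in (increment, decrement, increment)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# 2000 increments and 1000 decrements, none of them lost.
assert counter["value"] == 1000
```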

### A scope and closure overview

One of the most important aspects of a Python function is its scopes. Before Python 2.2, Python defined 3 scopes:

- the local namespace, which references all the object names defined in the block of the function, the class or the method;
- the global namespace, which references all the global objects defined in the module;
- the built-in namespace, which references all the built-in functions.
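
A quick sketch of these three namespaces, to be run at module level on Python 3 (where the built-in module is called `builtins`). `greeting`, `shout` and `suffix` are illustrative names.

```python
import builtins  # this module is named `__builtin__` on Python 2

greeting = "hello"          # lives in the module's global namespace

def shout():
    suffix = "!"            # lives in the function's local namespace
    # `greeting` is found in the globals, `len` in the built-ins.
    return greeting.upper() + suffix * len(greeting)

assert shout() == "HELLO!!!!!"
assert "greeting" in globals()
assert "suffix" not in globals()
assert hasattr(builtins, "len")
```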

This was useful, but not enough. A common issue arises with nested functions:
```python
def outer():
    somevar = []
    
    def inner():
        somevar.append(4)
        return somevar
        
    return inner
```

The function `outer` is a function factory. It creates the `inner` function, which uses a variable `somevar`. But what is the scope of this variable? Obviously not built-in. To be global, the function definition would have to look like this:
```python
def outer_global():
    global somevar
    somevar = []
    
    def inner():
        global somevar
        somevar.append(4)
        assert "somevar" in globals() and not "somevar" in locals()
        return somevar
        
    return inner
```

That is clearly not how the first version of `outer` is written, so is `somevar` a local variable of `inner`? Let's check.

In [5]:
def outer_test():
    somevar = []
    
    def inner():
        assert "somevar" in locals()
        somevar.append(5)
        return somevar
       
    return inner

inner = outer_test()
inner()

[5]

Alright, the assertion passed. A first guess is that all the variables from the outer scope are copied into the current scope (unless the outer scope is the module scope, in which case we fall back to the global namespace).

Let's see...

In [6]:
def outer_test_2():
    somevar, someothervar = [], []
    
    def inner():
        assert "somevar" in locals() and "someothervar" in locals()
        somevar.append(5)
        return somevar
    
    return inner

inner = outer_test_2()
inner()

AssertionError: 

Okay, so only some of the variables are copied (those used in the current scope and defined in the outer scope).
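
The same observation can be made without a failing assertion, by looking at the code object of the nested function. This sketch uses the `__code__` spelling (`func_code` in this notebook) so that it runs on Python 3.

```python
def outer_test_2():
    somevar, someothervar = [], []

    def inner():
        somevar.append(5)
        # locals() lists the names captured from the enclosing scope
        # alongside the function's own local variables.
        return sorted(locals())

    return inner

inner = outer_test_2()
# Only the name that `inner` actually uses is recorded as a free variable.
assert inner.__code__.co_freevars == ("somevar",)
assert inner() == ["somevar"]
```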

And what if the variables are not copied at all?

In [7]:
def outer_test_3():
    somevar = []
    
    def inner_1():
        somevar.append(5)
        return somevar
    
    def inner_2():
        somevar.append(6)
        return somevar
    
    return inner_1, inner_2

inner_1, inner_2 = outer_test_3()
assert inner_1() == [5]
inner_2()

[5, 6]

Hmm, it looks like the variables are just references to the objects of the outer scope.

Let's destroy all your dreams.

In [8]:
del outer_test_3

try:
    _ = outer_test_3()
except NameError:
    print "Yes, it does not exist anymore."

print inner_1()
print inner_2()

Yes, it does not exist anymore.
[5, 6, 5]
[5, 6, 5, 6]


As you may know, Python objects have a reference counter. When the reference counter drops to 0, the object is ready to be garbage collected.

A reference can be a variable bound to the object, the fact that the object is stored in another object (for example in a list or as an attribute), or anything else pointing to it. Here, the `inner` function has a local variable `somevar` pointing to the list created in the `outer` function. Because it is local, the `somevar` variable of `inner` is automatically dereferenced when the function finishes executing.

So once the `outer` function has returned, and even been deleted, it looks like nothing references the list anymore. That is not true. Python stores a reference to the object in an attribute of the inner function, so that the function can keep using it and the object is not garbage collected. This is known as a function closure.
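
This keep-alive effect can be observed with a weak reference. The sketch below assumes CPython, where an object is freed as soon as its reference counter drops to 0. `Payload` and `watcher` are hypothetical names; a small class replaces the list because lists cannot be weakly referenced.

```python
import gc
import weakref

class Payload(object):
    """Hypothetical stand-in for the list of the examples above."""

def outer():
    somevar = Payload()

    def inner():
        return somevar

    # Hand back a weak reference too, so we can watch the object
    # without keeping it alive ourselves.
    return inner, weakref.ref(somevar)

inner, watcher = outer()
del outer
gc.collect()
# `outer` has returned and is even deleted, yet the object is alive.
assert watcher() is not None
assert inner() is watcher()

# Dropping the last function that closes over it releases the object.
del inner
gc.collect()
assert watcher() is None
```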

In [9]:
print "Function closure content  : ", inner_1.func_closure
print "Closure cell contents     : ", inner_1.func_closure[0].cell_contents
print "Id of the closure object  : ", id(inner_1.func_closure[0].cell_contents)
print "Id of the returned object : ", id(inner_1())
print "Returned == closure ?     : ", id(inner_1()) == id(inner_1.func_closure[0].cell_contents)

With `inner_1` from the cells above, the closure is a tuple holding a single cell, that cell contains the list `[5, 6, 5, 6]`, both ids are identical and the last line prints `True` (the actual addresses change on every run).

### How does Python manage closures?

When Python compiles a function, it decides once, for each name used in the body, where that name will be looked up:

- If the name is assigned anywhere in the function body (and is not declared `global`), it is a local variable, created at runtime in the locals
- Every other use of that same name in the body is then resolved at runtime in the locals too, which is why reading it before the assignment fails
- If the name is not assigned in the function but is a local of an enclosing function (searching outward until the global scope is reached), the compiler adds a closure associated with this reference
- Otherwise, the name is looked up at runtime in the globals, then in the built-ins (raising a `NameError` if it is not found).
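
The first rules can be checked with a classic pitfall. In this sketch (illustrative names, Python 3 spelling of the attributes), the mere presence of an assignment makes the name local, so reading it beforehand raises `UnboundLocalError`, a subclass of `NameError`.

```python
counter = 0

def read_only():
    # No assignment to `counter` in this body: resolved in the globals.
    return counter + 1

def broken():
    # The assignment below makes `counter` local for the WHOLE body,
    # so this read happens before the local name is bound.
    value = counter + 1
    counter = value
    return counter

assert read_only() == 1

try:
    broken()
except UnboundLocalError as error:
    message = str(error)
else:
    message = None

assert message is not None
# Local names are fixed at compile time and visible on the code object.
assert "counter" in broken.__code__.co_varnames
assert "counter" not in read_only.__code__.co_varnames
```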

Once a closure variable is detected, a cell is created in the `func_closure` attribute of the function, holding the reference to the object. The name itself is recorded in the code objects: in `co_freevars` for the `inner` function and in `co_cellvars` for the `outer` function.
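
A minimal sketch of these attributes on the `outer` / `inner` pair used earlier, written with the `__code__` and `__closure__` spellings so that it runs on Python 3.

```python
def outer():
    somevar = []

    def inner():
        somevar.append(4)
        return somevar

    return inner

inner = outer()

# The enclosing function lists the name among its cell variables...
assert outer.__code__.co_cellvars == ("somevar",)
# ...and the nested function lists it among its free variables.
assert inner.__code__.co_freevars == ("somevar",)

# One cell per free variable, in the same order as co_freevars.
assert len(inner.__closure__) == len(inner.__code__.co_freevars)
assert inner() is inner.__closure__[0].cell_contents
```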

I will write another article on this, because it is a bit hard to understand and requires a deeper analysis of Python's behavior, which is not the subject here.

### Closures side effects

Closures also cause a little-known issue. If the variable captured by the closure is changed outside of the function, the function no longer behaves the same way. Let's see an example:

In [10]:
def closure_issue(value):
    funcs = []
    
    for i in xrange(3):
        def my_func(x):
            return x * i
        funcs.append(my_func)
        
    for func in funcs:
        print func(value),
        
    return funcs

funcs = closure_issue(2)

 4 4 4


Here we expected a result like "0 2 4", and we got "4 4 4". However, 3 different functions have been created:

In [11]:
# All three ids are distinct (a chained != would never compare the first with the last)
assert len(set(id(func) for func in funcs)) == 3

The point is the function closure. For every instance of `my_func`, `i` is a closure variable, looked up when the function is called, not when it is defined. This means that if `i` is rebound in the enclosing scope, the `i` seen by `my_func` changes too. Even though `funcs` is a list of 3 different functions, all their closures point to the same object:

In [12]:
closures_contents = [func.func_closure[0].cell_contents for func in funcs]
assert id(closures_contents[0]) == id(closures_contents[1]) == id(closures_contents[2])

A way to deal with this kind of issue is to bind the current value as a default parameter, which is evaluated once, when the function is defined.

In [13]:
def closure_solved(value):
    funcs = []
    
    for i in xrange(3):
        def my_func(x, i=i):
            print "Value of i:", i, " - Id of i:", id(i)
            return x * i
        funcs.append(my_func)
    
    print " ".join(str(func(value)) for func in funcs)
    
    return funcs

funcs = closure_solved(2)

Value of i: 0  - Id of i: 9369584
Value of i: 1  - Id of i: 9369560
Value of i: 2  - Id of i: 9369536
0 2 4
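
The default parameter is not the only option. As a sketch of two common alternatives (not taken from the notebook above), a factory function gives each closure its own scope, and `functools.partial` freezes the value as an argument. `multiply` and `make_multiplier` are hypothetical names.

```python
from functools import partial

def multiply(x, i):
    return x * i

def make_multiplier(i):
    # Each call creates a new scope, hence a new cell for `i`.
    def my_func(x):
        return x * i
    return my_func

with_factory = [make_multiplier(i) for i in range(3)]
with_partial = [partial(multiply, i=i) for i in range(3)]

assert [func(2) for func in with_factory] == [0, 2, 4]
assert [func(2) for func in with_partial] == [0, 2, 4]

# The factory-made functions no longer share a single cell.
cells = [func.__closure__[0] for func in with_factory]
assert cells[0] is not cells[1] and cells[1] is not cells[2]
```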
