# Closure

Wikipedia defines *closures* as "techniques for implementing lexically scoped name binding in languages with first-class functions."

In plain language, a closure is a function that remembers variables from the namespace of an enclosing function even after the enclosing function has returned.

An analogy will shed more light. In object-oriented programming, an object is a data structure that comes with methods to operate on its data. A closure is the opposite: it is a function that comes with its own data, namely the variable(s) it captures from the enclosing function.

Wikipedia gives the following example of a closure:

<pre>
function startAt(x)
   function incrementBy(y)
       return x + y
   return incrementBy

variable closure1 = startAt(1)
variable closure2 = startAt(5)
</pre>

*startAt()* returns a function that adds two numbers. The first number is supplied by *startAt()* (this is the argument *x* from the enclosing namespace) and the second number is passed to the enclosed function when it is called.

Let's implement the above Wikipedia closure example in Python.

In [1]:
def startAt(x):
    """
    Define the enclosing function with a free variable called 'x'
    """
    def incrementBy(y):
        """
        Define the enclosed function that binds free variable 'x'
        """
        return x+y
    return incrementBy

Now, let's create the two closures from the Wikipedia example above:

In [2]:
closure1 = startAt(1)
closure2 = startAt(5)

Check the types of these 2 variables:

In [3]:
print("The types of 'closure1' and 'closure2' are {} and {}, respectively.".\
        format(type(closure1), type(closure2)))

The types of 'closure1' and 'closure2' are <class 'function'> and <class 'function'>, respectively.


*closure1* and *closure2* are functions that remember the variable *x* from their enclosing namespace. This variable has the value 1 in the case of *closure1* and 5 in that of *closure2*, as these were the values passed to *startAt()* when the closures were created. Let's use these functions to perform additions.

In [4]:
# pass 5 to closure1. This performs 1 + 5, returning 6
closure1(5)

6

In [5]:
# pass 1 million (the float 1e6) to closure2. This performs 5 + 1e6, returning the float 1000005.0

In [6]:
closure2(1e6)

1000005.0
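The remembered values are not hidden: CPython exposes them on the function object itself. The sketch below redefines *startAt()* so that it runs on its own, and then reads the free variable names from `__code__.co_freevars` and the stored values from `__closure__`. The names `free_names`, `remembered_1` and `remembered_2` are introduced here only for illustration.

```python
def startAt(x):
    def incrementBy(y):
        return x + y
    return incrementBy

closure1 = startAt(1)
closure2 = startAt(5)

# Names that incrementBy() uses but does not define itself
free_names = closure1.__code__.co_freevars

# Each free name is backed by a "cell" object that holds the remembered value
remembered_1 = closure1.__closure__[0].cell_contents
remembered_2 = closure2.__closure__[0].cell_contents

print(free_names)    # ('x',)
print(remembered_1)  # 1
print(remembered_2)  # 5
```

Both closures share the same code; only the contents of their cells differ.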

These examples suggest a simple closure use case, which is that of closures as function factories to produce customised functions. In the above example, *closure1* is a custom function that increments any number passed to it by 1 whereas *closure2* is another custom function that increments passed values by 5. *startAt()* acts as a function factory to produce these custom functions.
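To make the earlier object analogy concrete, here is a minimal sketch of the same factory written both ways. The class `StartAt` is a hypothetical counterpart invented for this comparison; it stores *x* as an attribute and implements `__call__` so that its instances can be called like functions.

```python
class StartAt:
    """Object-based counterpart of startAt(): data with a method attached."""
    def __init__(self, x):
        self.x = x          # the data lives on the instance

    def __call__(self, y):
        return self.x + y   # the method operates on that data


def startAt(x):
    def incrementBy(y):
        return x + y        # the data lives in the closure
    return incrementBy


add_one_obj = StartAt(1)
add_one_fn = startAt(1)

print(add_one_obj(5), add_one_fn(5))  # 6 6
```

The two behave the same when called. The closure version is shorter, while the class version makes the stored data easy to inspect or change through `add_one_obj.x`.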

## Another Use Case
Let's look at a more interesting closure use case.

It was noted above that closures can be thought of as functions that come with their own data. This becomes useful when a function is handed to other code that decides how to call it, so any extra parameters have to travel with the function itself.

Let's create a function that takes the mean of a Series object between two quantiles.

In [7]:
def quantile_range_mean(s, low=0.25, high=0.75):
    """
    This function takes the mean of a Series object 's' between two quantiles low and high 
    inclusive. If no values are supplied for low and high, this function takes the mean of the
    inter-quartile range
    """
    return s[(s>=s.quantile(low)) & (s<=s.quantile(high))].mean()

Create another function that returns the minimum of a series between two quantiles.

In [8]:
def quantile_range_min(s, low=0.25, high=0.75):
    """
    This function takes the min of a Series object 's' between two quantiles low and high 
    inclusive. If no values are supplied for low and high, this function takes the min of the
    inter-quartile range
    """
    return s[(s>=s.quantile(low)) & (s<=s.quantile(high))].min()

Finally, create another function that returns the maximum of a series between two quantiles.

In [9]:
def quantile_range_max(s, low=0.25, high=0.75):
    """
    This function takes the max of a Series object 's' between two quantiles low and high 
    inclusive. If no values are supplied for low and high, this function takes the max of the
    inter-quartile range
    """
    return s[(s>=s.quantile(low)) & (s<=s.quantile(high))].max()
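Before applying these functions to random data, it may help to check them on a series small enough to verify by hand. The sketch below repeats the three definitions in compact form so that it runs on its own; the series `s` is made up for illustration. With the values 1 to 11 and the default linear interpolation, the 0.25 and 0.75 quantiles are 3.5 and 8.5, so only the values 4 to 8 are kept.

```python
import pandas as pd

def quantile_range_mean(s, low=0.25, high=0.75):
    return s[(s >= s.quantile(low)) & (s <= s.quantile(high))].mean()

def quantile_range_min(s, low=0.25, high=0.75):
    return s[(s >= s.quantile(low)) & (s <= s.quantile(high))].min()

def quantile_range_max(s, low=0.25, high=0.75):
    return s[(s >= s.quantile(low)) & (s <= s.quantile(high))].max()

s = pd.Series(range(1, 12))    # 1, 2, ..., 11

# Quartiles are 3.5 and 8.5, so the filter keeps 4, 5, 6, 7, 8
print(quantile_range_mean(s))  # 6.0
print(quantile_range_min(s))   # 4
print(quantile_range_max(s))   # 8
```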

Let's create a dataframe out of random values.

In [10]:
import numpy as np
import pandas as pd
from numpy import random
np.random.seed(10)
df = pd.DataFrame(random.rand(1000, 4), columns=list('abcd'))
df.shape

(1000, 4)

Add a column of randomly selected values from a list of words.

In [11]:
df['nick_names'] = random.choice(['Beetrooter', 'Mad Monk', 'Lord Downer', \
                'Desiccated Coconut', 'Man of Steel', 'Witch of Ipswich'], size=1000)

In [12]:
df.head()

|   | a | b | c | d | nick_names |
|---|---|---|---|---|---|
| 0 | 0.771321 | 0.020752 | 0.633648 | 0.748804 | Mad Monk |
| 1 | 0.498507 | 0.224797 | 0.198063 | 0.760531 | Man of Steel |
| 2 | 0.169111 | 0.08834 | 0.68536 | 0.953393 | Desiccated Coconut |
| 3 | 0.003948 | 0.512192 | 0.812621 | 0.612526 | Man of Steel |
| 4 | 0.721755 | 0.291876 | 0.917774 | 0.714576 | Man of Steel |


Let's take the mean of column 'a' by nick_names. 

In [13]:
df.groupby('nick_names')['a'].agg('mean')

nick_names
Beetrooter            0.487092
Desiccated Coconut    0.479433
Lord Downer           0.488855
Mad Monk              0.483189
Man of Steel          0.489713
Witch of Ipswich      0.507703
Name: a, dtype: float64

Rather than taking the mean of all the values in column 'a', take the mean of only those values that fall between the 0.10 and 0.90 quantiles (the 10th and 90th percentiles). For this, let's use *quantile_range_mean*, defined above. Besides the Series itself, this function takes two optional arguments, the lower and the upper quantile, which default to 0.25 and 0.75, respectively.

In [14]:
df.groupby('nick_names')['a'].agg(quantile_range_mean, 0.10, 0.90)

nick_names
Beetrooter            0.483863
Desiccated Coconut    0.480264
Lord Downer           0.486858
Mad Monk              0.481985
Man of Steel          0.491312
Witch of Ipswich      0.508748
Name: a, dtype: float64

As can be seen from the preceding example, the pandas *agg* function can take a user-defined function along with an arbitrary number of arguments, which it passes on to that function.
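The forwarding of arguments is easier to verify on a tiny frame. The following sketch assumes a made-up DataFrame `toy` with two groups of eleven evenly spaced values, and passes 0.05 and 0.55 so that the result differs from what the defaults would give. For group 'x' (values 1 to 11) the quantiles are 1.5 and 6.5, which keeps 2 to 6.

```python
import pandas as pd

def quantile_range_mean(s, low=0.25, high=0.75):
    """Mean of the values of Series 's' lying between two quantiles, inclusive."""
    return s[(s >= s.quantile(low)) & (s <= s.quantile(high))].mean()

# Toy frame (made up for illustration): two groups of eleven values each
toy = pd.DataFrame({
    'group': ['x'] * 11 + ['y'] * 11,
    'a': list(range(1, 12)) + list(range(101, 112)),
})

# 0.05 and 0.55 are forwarded to quantile_range_mean as 'low' and 'high'
result = toy.groupby('group')['a'].agg(quantile_range_mean, 0.05, 0.55)
print(result)
```

With the defaults, the means would have been 6.0 and 106.0; with the forwarded arguments they are 4.0 and 104.0.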

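Let's say that, as well as the mean over a user-defined quantile range, we also want to display the minimum and maximum value of column 'a' for each nick_name within a specified quantile range. A first attempt might pass the three functions as a list, followed by a pair of quantile arguments for each:

In [15]:
df.groupby('nick_names')['a'].agg([quantile_range_min, quantile_range_max, \
                quantile_range_mean], 0.10, 0.90, 0.50, 0.60, .20, .60)

| nick_names | quantile_range_min | quantile_range_max | quantile_range_mean |
|---|---|---|---|
| Beetrooter | 0.227403 | 0.723634 | 0.47616 |
| Desiccated Coconut | 0.234289 | 0.727197 | 0.479607 |
| Lord Downer | 0.206459 | 0.747156 | 0.48586 |
| Mad Monk | 0.229195 | 0.698495 | 0.491532 |
| Man of Steel | 0.27909 | 0.703795 | 0.481207 |
| Witch of Ipswich | 0.301847 | 0.73588 | 0.50628 |

The call runs, but the result is not what was asked for. The numbers are consistent with the default 0.25 to 0.75 range, not with the ranges passed in: for Beetrooter, the minimum and maximum sit near 0.23 and 0.72, and *quantile_range_mean* returns 0.47616 here against 0.483863 in the single-function call with 0.10 and 0.90 above. When *agg* is given a list of functions, the extra positional arguments do not appear to reach them in the pandas version used here, and in any case there is no way to say which arguments belong to which function.

Closures solve both problems. Each function carries its own quantile range, and the call to *agg* is free of the visual noise created by the arguments to the user-defined functions.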

In [16]:
def agg_fun_factory(func, func_name, *args, **kwargs):
    """
    A function to create a closure by passing *args and **kwargs to func()
    """
    def wrapper(s):
        return func(s, *args, **kwargs)
    # rename wrapper(): pandas agg() requires each function in a list to have a unique name
    wrapper.__name__ = func_name
    return wrapper
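The standard library offers a related tool, `functools.partial`, which also binds arguments to a function ahead of time. The sketch below compares it with *agg_fun_factory* on plain Python lists. `range_sum` is a hypothetical helper written only for this comparison, and the factory is repeated so that the example runs on its own. Note that a partial object has no `__name__` unless one is assigned.

```python
from functools import partial

def range_sum(values, low, high):
    """Hypothetical helper: sum the values that fall within [low, high]."""
    return sum(v for v in values if low <= v <= high)

def agg_fun_factory(func, func_name, *args, **kwargs):
    def wrapper(s):
        return func(s, *args, **kwargs)
    wrapper.__name__ = func_name
    return wrapper

# Closure: 2 and 4 are remembered by wrapper()
via_closure = agg_fun_factory(range_sum, 'sum_2_to_4', 2, 4)

# partial: 2 and 4 are stored on the partial object
via_partial = partial(range_sum, low=2, high=4)
via_partial.__name__ = 'sum_2_to_4'  # not set automatically on partial objects

data = [1, 2, 3, 4, 5]
print(via_closure(data), via_partial(data))  # 9 9
```

The hand-written factory keeps naming and argument binding in one call, which is convenient when several such functions are built in a row.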

Create closures and pass them to the pandas groupby aggregation function.

In [17]:
range_max = agg_fun_factory(quantile_range_max, 'q_range_max', 0.10, 0.90)
range_min = agg_fun_factory(quantile_range_min, 'q_range_min', 0.10, 0.90)
range_mean = agg_fun_factory(quantile_range_mean, 'q_range_mean', 0.10, 0.90)

In [18]:
df.groupby('nick_names')['a'].agg(['min', range_min, 'max', range_max, 'mean', range_mean])

| nick_names | min | q_range_min | max | q_range_max | mean | q_range_mean |
|---|---|---|---|---|---|---|
| Beetrooter | 0.002195 | 0.102244 | 0.992005 | 0.924139 | 0.487092 | 0.483863 |
| Desiccated Coconut | 0.00424 | 0.067951 | 0.996232 | 0.861263 | 0.479433 | 0.480264 |
| Lord Downer | 0.003839 | 0.104957 | 0.996208 | 0.885314 | 0.488855 | 0.486858 |
| Mad Monk | 0.004974 | 0.092848 | 0.990937 | 0.839341 | 0.483189 | 0.481985 |
| Man of Steel | 0.00176 | 0.106938 | 0.962916 | 0.861383 | 0.489713 | 0.491312 |
| Witch of Ipswich | 0.024939 | 0.161427 | 0.987625 | 0.877607 | 0.507703 | 0.508748 |
