# Decorators for Better Data Science Projects

## Primer

Say we have to write a function that gets input from a user and adds 2 to whatever number they enter. We can do this with a single function. Let's say the function must return nothing if the input cannot be converted to a number. Let's say further that the user does not care about the result of adding 2 to their input; think of it as a background process that does something with the user input and updates a database.

In [1]:
import time
import typing
import random

In [2]:
def get_input_add_2():
    """
    Take some number from a user and add 2
    """
    try:
        return float(input("Please Enter a Number: ")) + 2
    except ValueError as _:
        return None

In [3]:
get_input_add_2()

Please Enter a Number: 10


12.0

In [4]:
get_input_add_2()

Please Enter a Number: Jonah


Great! Say you get some feedback from the project manager and now have to make the input function a bit more user friendly. Keep in mind that the user does not care about the output of your `add 2` process, so it makes sense to separate the two steps, like so:

In [5]:
def add_2(some_number: typing.Optional[float]) -> typing.Optional[float]:
    """
    Add 2 to some number if it is not None; otherwise return None
    """
    return some_number + 2 if some_number is not None else None

def get_input(user: str):
    """
    asks user for input
    """
    try:
        return float(input(f"Hi {user}! Please Enter a Number: "))
    except ValueError as _:
        return None

In [6]:
user_input = get_input("Josiah")

Hi Josiah! Please Enter a Number: 10


In [7]:
add_2(user_input)

12.0

That's nice! We are just doing `add_2(get_input("Josiah"))`. But note that this always involves:

<ul>
    <li>Calling get_input on its argument right away, which can take time if it is an expensive process</li>
</ul>

What if we want to do `add_2(get_input)("Josiah")`? How can we go about it?

First, why is `add_2(get_input)("Josiah")` better?

<ul>
    <li>
        add_2(get_input) is itself a function that takes an input. In other words, it does not run
        whatever expensive process we have until we actually call it.
    </li>
</ul>

So what we need now is to make `add_2` a function that takes a function and returns another function. Let's see a skeleton:

In [8]:
# DO NOT RUN THIS: skeleton only, another_function is not defined yet
def add_2(some_function):
    # do something
    return another_function

The function returned by `add_2`, as sketched above, must do whatever its input function does and then add 2! That means that somewhere in there, we need a mechanism to accept the arguments for `some_function` (in this case `get_input`) and call it as expected.

In [9]:
def add_2(some_function):
    """
    This should take any function that returns a number or None (like get_input)
    """
    
    def another_function(user):
        """
        This function should take the arguments for get_input
        """
        some_function_output = some_function(user)
        # get_input returns None on invalid input; pass it through instead of failing on None + 2
        if some_function_output is None:
            return None
        return some_function_output + 2
    
    return another_function

In [10]:
another_function = add_2(get_input)
another_function

<function __main__.add_2.<locals>.another_function(user)>

In [11]:
another_function("Josiah")

Hi Josiah! Please Enter a Number: 10


12.0

This is nice, as we are simply manipulating functions without calling them until we need them! This can save time when the wrapped function is expensive, and it keeps the code more readable and maintainable.
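To make the "nothing runs until we call it" point concrete, here is a minimal sketch. The names `expensive_number`, `add_two`, and `calls` are made up for illustration (they are not part of the notebook), and `expensive_number` stands in for `get_input` so that no keyboard input is needed.

```python
calls = []  # records every time the wrapped function actually runs


def expensive_number(user):
    """Hypothetical stand-in for get_input: no prompt, it only records the call."""
    calls.append(user)
    return 10.0


def add_two(some_function):
    """Copy of the one-argument add_2 idea under a different name."""
    def another_function(user):
        return some_function(user) + 2
    return another_function


wrapped = add_two(expensive_number)   # composing the functions: nothing has run yet
calls_after_wrapping = list(calls)    # still an empty list

result = wrapped("Josiah")            # the expensive call happens only here
print(calls_after_wrapping, calls, result)
```

The call log stays empty after `add_two(expensive_number)` and only gets an entry once `wrapped("Josiah")` is called.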

Note how we hard-coded the `user` parameter in `another_function`. We did this because we only wanted to use `add_2` on `get_input`. But this defeats the purpose of using functions: reusability! How do we remedy this?

Answer: `*args` and `**kwargs`.

Let's look at a better implementation:

In [12]:
def add_2(some_function):
    """
    This should take any function that returns a number or None (like get_input)
    """
    
    def another_function(*args, **kwargs):
        """
        This function can take any positional and keyword arguments needed for some_function to run
        """
        some_function_output = some_function(*args, **kwargs)
        # pass None through (invalid input) instead of failing on None + 2
        if some_function_output is None:
            return None
        return some_function_output + 2
    
    return another_function

In [13]:
add_2(get_input)("Josiah")

Hi Josiah! Please Enter a Number: 10


12.0

With the use of `*args` and `**kwargs`, we can now use `add_2` for any variant of our `get_input` function, whatever its signature.

In [14]:
def get_input_new(first_name: str, last_name: str):
    """
    asks user for input
    """
    try:
        return float(input(f"Hi {first_name}, {last_name}! Please Enter a Number: "))
    except ValueError as _:
        return None

In [15]:
add_2(get_input_new)("Jonah", "Hill")

Hi Jonah, Hill! Please Enter a Number: 23


25.0
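The same forwarding also covers keyword arguments and functions that have nothing to do with `get_input`. A small sketch, assuming a made-up helper `scaled` and a copy of the `*args`/`**kwargs` wrapper under the name `add_two` so that it does not overwrite the notebook's `add_2`:

```python
def add_two(some_function):
    """Copy of the *args/**kwargs version of add_2 under a different name."""
    def another_function(*args, **kwargs):
        # Forward whatever we receive, unchanged, to the wrapped function
        return some_function(*args, **kwargs) + 2
    return another_function


def scaled(value, factor=1.0):
    """Hypothetical helper with one positional and one keyword argument."""
    return value * factor


positional_only = add_two(scaled)(10)            # 10 * 1.0 + 2
with_keyword = add_two(scaled)(10, factor=3)     # 10 * 3 + 2
built_in = add_two(len)([1, 2, 3])               # len(...) + 2

print(positional_only, with_keyword, built_in)
```

Because the wrapper never inspects its arguments, it does not need to know the signature of the function it wraps.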

But this is not really how decorators are used. Let's change things up so that the function composition happens at the very moment we define our functions:

In [16]:
@add_2
def get_input(user: str):
    """
    asks user for input and adds 2
    """
    try:
        return float(input(f"Hi {user}! Please Enter a Number: "))
    except ValueError as _:
        return None
    
@add_2
def get_input_new(first_name: str, last_name: str):
    """
    asks user for input and adds 2
    """
    try:
        return float(input(f"Hi {first_name}, {last_name}! Please Enter a Number: "))
    except ValueError as _:
        return None

In [17]:
get_input("Josiah")

Hi Josiah! Please Enter a Number: 20


22.0

In [18]:
get_input_new("James", "Franco")

Hi James, Franco! Please Enter a Number: 20


22.0

This is, I believe, the bulk of it. You should now understand decorators; if it helps, they are just **functions** that take in **functions** and return **functions**. You can stack them, and you can use `functools.wraps` to keep the documentation of your functions intact, but we do not need to go over that.
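For readers who want a quick look anyway, here is a minimal sketch of both ideas. The decorator `double` and the function `fixed_number` are made up for illustration, and `add_two` is a copy of `add_2` under a different name; only `functools.wraps` comes from the standard library.

```python
import functools


def add_two(some_function):
    @functools.wraps(some_function)  # copies __name__, __doc__, etc. onto the wrapper
    def another_function(*args, **kwargs):
        return some_function(*args, **kwargs) + 2
    return another_function


def double(some_function):
    """Hypothetical second decorator, used only to show stacking."""
    @functools.wraps(some_function)
    def another_function(*args, **kwargs):
        return some_function(*args, **kwargs) * 2
    return another_function


@add_two   # applied second: adds 2 to the doubled value
@double    # applied first: doubles the raw value
def fixed_number():
    """Return a fixed number (stand-in for get_input)."""
    return 10.0


print(fixed_number())          # (10.0 * 2) + 2
print(fixed_number.__name__)   # still 'fixed_number' thanks to functools.wraps
print(fixed_number.__doc__)
```

Stacked decorators apply from the bottom up, so this is equivalent to `add_two(double(fixed_number))`. Without `functools.wraps`, `fixed_number.__name__` would read `another_function` and the docstring would be lost.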

Say we want to add 2 to `get_input` and add 5 to `get_input_new`. Should we create two different decorators?

Although we could, we do not need to, and we should not. Instead, we need a new parameter that tells us how much to add. There are two ways to do this:

<ul>
    <li>We could partially evaluate a function that takes 2 arguments, n and some_function. I recommend looking into [`functools.partial`](https://docs.python.org/3.7/library/functools.html#functools.partial). As the name (might) indicate, it takes a function and partially evaluates it.
    </li>
    <li>
        We can create a function that takes an argument n and returns a function that takes as argument some_function
    </li>
</ul>
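The first option could be sketched as follows. The two-argument function `add_n` and the names `add_5`, `fixed_number`, and `other_number` are assumptions made for this example; `functools.partial` is the only library call used.

```python
import functools


def add_n(n, some_function):
    """Hypothetical two-argument version: n first, then the function to wrap."""
    def another_function(*args, **kwargs):
        return some_function(*args, **kwargs) + n
    return another_function


# Fix n=5 now; the result still expects some_function, so it works as a decorator
add_5 = functools.partial(add_n, 5)


@add_5
def fixed_number():
    return 10.0


@functools.partial(add_n, 2)  # the partial can also be built inline
def other_number():
    return 10.0


print(fixed_number(), other_number())
```

This works, but every use site has to spell out `functools.partial`, which is one reason to prefer the second option.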

We will do the latter:

In [19]:
def add_a_number(n: typing.Union[int, float]):
    """
    This will take n and add it to the output of some function
    """
    def decorator(some_function):
        """
        This should take any function that returns a number or None (like get_input)
        """
        def another_function(*args, **kwargs):
            """
            This function can take any positional and keyword arguments needed for some_function to run
            """
            some_function_output = some_function(*args, **kwargs)
            
            # pass None through (invalid input) instead of failing on None + n
            if some_function_output is None:
                return None
            return some_function_output + n

        return another_function
    
    return decorator

In [20]:
@add_a_number(2)
def get_input(user: str):
    """
    asks user for input and adds 2
    """
    try:
        return float(input(f"Hi {user}! Please Enter a Number: "))
    except ValueError as _:
        return None
    
@add_a_number(5)
def get_input_new(first_name: str, last_name: str):
    """
    asks user for input and adds 5
    """
    try:
        return float(input(f"Hi {first_name}, {last_name}! Please Enter a Number: "))
    except ValueError as _:
        return None

In [21]:
get_input("Josiah")

Hi Josiah! Please Enter a Number: 2


4.0

In [22]:
get_input_new("James", "Franco")

Hi James, Franco! Please Enter a Number: 2


7.0

What happens if we use `@add_a_number` instead of `@add_a_number(2)` on top of `get_input`?
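One way to find out is to try it on a function that needs no keyboard input. In the sketch below, `add_a_number` is repeated in compact form so the block runs on its own, and `fixed_number` is a made-up stand-in for `get_input`.

```python
import typing


def add_a_number(n: typing.Union[int, float]):
    # Same shape as the decorator factory above, repeated so this block is self-contained
    def decorator(some_function):
        def another_function(*args, **kwargs):
            return some_function(*args, **kwargs) + n
        return another_function
    return decorator


@add_a_number  # parentheses forgotten: fixed_number itself is passed in as n
def fixed_number(user: str):
    """Hypothetical stand-in for get_input that needs no keyboard input."""
    return 10.0


# The name fixed_number is now bound to the inner `decorator` function
print(fixed_number.__name__)

# Calling it treats "Josiah" as the function to wrap and returns another_function
result = fixed_number("Josiah")
print(callable(result))

# Only calling that result fails, because it tries to call the string "Josiah"
try:
    result()
except TypeError as error:
    print(f"TypeError: {error}")
```

Nothing fails at definition time, which makes this mistake easy to miss: the decorated name silently becomes a function that returns another function instead of a number.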

## Personal Experience

Where did I benefit from using decorators?

I worked on a project with some complex rules and data processing. There are different components to the project, and the Python functions we developed must only run after results from different (non-Python) processes are posted. This means that we have to check a database for some conditions.

<ul>
    <li>These conditions can vary, and they came about months after the code was productionized and running. We could go back, update the code, and rerun the tests, but that is fairly risky.
    </li>
    <li>
        When the conditions are not met, we must wait a certain number of seconds (about an hour) and check the conditions again before running the complex process.
    </li>
    <li>
        We also have to send an email/notification informing the other process owners to fix the problems. 
    </li>
</ul>

There is really no reason why this part should be added to a process that is already productionized and running. Instead, we can keep this part separate, in a function that controls how our other process runs.

This calls for decorators!

In [23]:
# say we had this function already defined and tested and productionized. We do not want to touch it
def complex_process(*args, **kwargs):
    # a lot of convoluted stuff
    return "Complicated"

In [24]:
# ideally you should have some small functions that return true or false so
# you can call them again after n seconds. This is an overly simplified example
def dummy_condition():
    return random.choice([True, False])

In [25]:
def run_after_n_seconds(seconds: int, email_receivers: typing.Sequence[str] = ('someguy@mail.com',)):
    # seconds: how long to wait between two checks of the condition
    # email_receivers: who to notify (unused in this simplified example); a tuple avoids a mutable default
    
    def check_conditions_and_run(complex_function):
        
        def run_things(*args, **kwargs):
            while not dummy_condition():
                # use a logger instead of printing :)
                print(f"Condition not met! Waiting {seconds} seconds!")
                
                # send the email to email_receivers here, then wait before checking again
                time.sleep(seconds)
            
            output_of_complex = complex_function(*args, **kwargs)
            
            return output_of_complex
        
        return run_things
    
    return check_conditions_and_run
            

In [26]:
# Now all I need to do is, go to my python script where the complex function lies and just add the decorator
@run_after_n_seconds(5)
def complex_process(*args, **kwargs):
    # a lot of convoluted stuff
    return "Complicated"

In [27]:
complex_process()

Condition not met! Waiting 5 seconds!


'Complicated'
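A possible refinement, sketched here under assumptions rather than taken from the original project: pass the condition check and the notification function into the decorator, and stop after a fixed number of checks instead of looping forever. The names `run_when_ready`, `condition`, `notify`, and `max_checks` are all hypothetical.

```python
import functools
import time
import typing


def run_when_ready(
    condition: typing.Callable[[], bool],
    seconds: float,
    notify: typing.Callable[[str], None] = print,
    max_checks: int = 3,
):
    """Hypothetical variant of run_after_n_seconds with an injectable check and notifier."""
    def check_conditions_and_run(complex_function):
        @functools.wraps(complex_function)
        def run_things(*args, **kwargs):
            for attempt in range(1, max_checks + 1):
                if condition():
                    return complex_function(*args, **kwargs)
                # In a real setup, notify could send an email or write to a logger
                notify(f"Check {attempt}/{max_checks} failed; waiting {seconds} seconds")
                if attempt < max_checks:
                    time.sleep(seconds)
            raise TimeoutError(f"Condition still not met after {max_checks} checks")
        return run_things
    return check_conditions_and_run


# Deterministic demo: the condition fails twice, then passes
answers = iter([False, False, True])
messages = []


@run_when_ready(condition=lambda: next(answers), seconds=0, notify=messages.append)
def complex_process():
    return "Complicated"


result = complex_process()
print(result, len(messages))
```

Passing `condition` and `notify` as arguments keeps the decorator testable without a database or a mail server, and the `max_checks` limit avoids waiting indefinitely when the upstream process never recovers.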

# Conclusion