# Writing Functions in Python

* Link: https://learn.datacamp.com/courses/writing-functions-in-python

## Course Description

You've done your analysis, built your report, and trained a model. What's next? Well, if you want to deploy your model into production, your code will need to be more reliable than exploratory scripts in a Jupyter notebook. Writing Functions in Python will give you a strong foundation in writing complex and beautiful functions so that you can contribute research and engineering skills to your team. You'll learn useful tricks, like how to write context managers and decorators. You'll also learn best practices around how to write maintainable reusable functions with good documentation. They say that people who can do good research and write high-quality code are unicorns. Take this course and discover the magic!

In [None]:
import pandas as pd

datapath = '/content/drive/MyDrive/Colab Notebooks/Career Track - Data Scientist with Python/18 - Course - Write Functions in Python/datasets/'

# Chapter 1 - Best Practices

The goal of this course is to transform you into a Python expert, and so the first chapter starts off with best practices when writing functions. We'll cover docstrings and why they matter and how to know when you need to turn a chunk of code into a function. You will also learn the details of how Python passes arguments to functions, as well as some common gotchas that can cause debugging headaches when calling functions.

## Docstrings

* Docstrings are a Python best practice that will make your code much easier to use, read, and maintain.
* With a docstring it is much easier to tell what the expected inputs and outputs should be, as well as what the function does.

### Anatomy of a docstring

* Every docstring has some (although usually not all) of these five key pieces of information: 
  * what the function does;
  * what the arguments are;
  * what the return value or values should be;
  * info about any errors raised; and 
  * anything else you'd like to say about the function.

In [None]:
def function_name(arguments):
    """
    Description of what the function does.

    Description of the arguments, if any.

    Description of the return value(s), if any.

    Description of errors raised, if any.

    Optional extra notes or examples of usage.
    """

### Docstring formats

* Most popular:
  * Google Style
  * Numpydoc
* Others:
  * reStructuredText
  * EpyText

### Google Style - description

* **Start**: Concise description of what the function does. This should be in imperative language. 
  * For instance: "Split the data frame and stack the columns" instead of "This function will split the data frame and stack the columns".

* **Args**: Where you list each argument name, followed by its expected type in parentheses, and then what its role is in the function. 
  * If you need extra space, you can break to the next line and indent, as in the example below.
  * If an argument has a default value, mark it as "optional" when describing the type. 
  * If the function does not take any parameters, feel free to leave this section out.

* **Returns**: Where you list the expected type or types of what gets returned. You can also provide some comment about what gets returned, but often the name of the function and the description will make this clear. Additional lines should not be indented.

* **Raises**: If your function intentionally raises any errors. 

* **Notes**: Additional section for notes or examples of usage in free form text at the end (optional).

In [None]:
def function(arg_1, arg_2=42):
    """Description of what the function does.

    Args:
        arg_1 (str): Description of arg_1 that can break onto the next line
            if needed.
        arg_2 (int, optional): Write optional when an argument has a default
            value.

    Returns:
        bool: Optional description of the return value
        Extra lines are not indented.

    Raises:
        ValueError: Include any error types that the function intentionally
            raises.

    Notes:
        See https://www.datacamp.com/community/tutorials/docstrings-python
        for more info.
    """

### Numpydoc

* Very similar to Google Style, and the most common format in the scientific Python community.
* It takes up more vertical space.

In [None]:
def my_function(arg_1, arg_2=42):
    """
    Description of what the function does.
    
    Parameters
    ----------
    arg_1 : expected type of arg_1
        Description of arg_1.
    arg_2 : int, optional
        Write optional when an argument has a default value.
        Default=42.

    Returns
    -------
    The type of the return value
        Can include a description of the return value.
        Replace "Returns" with "Yields" if this function is a generator.
    """

### Retrieving docstrings

* Use `function_name.__doc__`

In [None]:
print(my_function.__doc__)


    Description of what the function does.
    
    Parameters
    ----------
    arg_1 : expected type of arg_1
        Description of arg_1.
    arg_2 : int, optional
        Write optional when an argument has a default value.
        Default=42.

    Returns
    -------
    The type of the return value
        Can include a description of the return value.
        Replace "Returns" with "Yields" if this function is a generator.
    


* To get a cleaner version, with those leading spaces removed, you can use the `getdoc()` function from the `inspect` module.
  * The `inspect` module contains a lot of useful functions for gathering information about functions.

In [None]:
import inspect
print(inspect.getdoc(my_function))

Description of what the function does.

Parameters
----------
arg_1 : expected type of arg_1
    Description of arg_1.
arg_2 : int, optional
    Write optional when an argument has a default value.
    Default=42.

Returns
-------
The type of the return value
    Can include a description of the return value.
    Replace "Returns" with "Yields" if this function is a generator.


### Exercise - Crafting a docstring

You've decided to write the world's greatest open-source natural language processing Python package. It will revolutionize working with free-form text, the way numpy did for arrays, pandas did for tabular data, and scikit-learn did for machine learning.

The first function you write is count_letter(). It takes a string and a single letter and returns the number of times the letter appears in the string. You want the users of your open-source package to be able to understand how this function works easily, so you will need to give it a docstring. Build up a Google Style docstring for this function by following these steps.

While it does require a bit more typing, the information presented here will make it very easy for others to use this code in the future. Remember that even though computers execute it, code is actually written for humans to read (otherwise you'd just be writing the 1s and 0s that the computer operates on).

In [None]:
def count_letter(content, letter):
  """Count the number of times `letter` appears in `content`.

  Args:
    content (str): The string to search.
    letter (str): The letter to search for.

  Returns:
    int

  Raises:
    ValueError: If `letter` is not a one-character string.
  """
  if (not isinstance(letter, str)) or len(letter) != 1:
    raise ValueError('`letter` must be a single character string.')
  return len([char for char in content if char == letter])
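
A quick check can confirm both the normal path and the error described in the docstring. The sketch below repeats `count_letter()` (with a shortened docstring) so that it runs on its own; the sample string is made up for illustration.

```python
def count_letter(content, letter):
    """Count the number of times `letter` appears in `content`."""
    if (not isinstance(letter, str)) or len(letter) != 1:
        raise ValueError('`letter` must be a single character string.')
    return len([char for char in content if char == letter])


# Normal use: the sample string is illustrative only
n_s = count_letter('mississippi', 's')
print(n_s)

# Error path: more than one character raises the documented ValueError
try:
    count_letter('mississippi', 'ss')
except ValueError as err:
    error_message = str(err)
    print(error_message)
```

Documenting the `ValueError` in the `Raises` section tells callers which exception to catch without reading the function body.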

### Exercise 2 - Retrieving docstrings

You and a group of friends are working on building an amazing new Python IDE (integrated development environment -- like PyCharm, Spyder, Eclipse, Visual Studio, etc.). The team wants to add a feature that displays a tooltip with a function's docstring whenever the user starts typing the function name. That way, the user doesn't have to go elsewhere to look up the documentation for the function they are trying to use. You've been asked to complete the build_tooltip() function that retrieves a docstring from an arbitrary function.

Note that in Python, you can pass a function as an argument to another function. 

In [None]:
# Get the docstring with an attribute of count_letter()
docstring = count_letter.__doc__

border = '#' * 28
print('{}\n{}\n{}'.format(border, docstring, border))

############################
Count the number of times `letter` appears in `content`.

  Args:
    content (str): The string to search.
    letter (str): The letter to search for.

  Returns:
    int

  Raises:
    ValueError: If `letter` is not a one-character string.
  
############################


In [None]:
import inspect

# Get the docstring with a function from the inspect module
docstring = inspect.getdoc(count_letter)

border = '#' * 28
print('{}\n{}\n{}'.format(border, docstring, border))

############################
Count the number of times `letter` appears in `content`.

Args:
  content (str): The string to search.
  letter (str): The letter to search for.

Returns:
  int

Raises:
  ValueError: If `letter` is not a one-character string.
############################


In [None]:
# Using a function as an object, an argument for another function

def build_tooltip(function):
  """Create a tooltip for any function that shows the 
  function's docstring.
  
  Args:
    function (callable): The function we want a tooltip for.
    
  Returns:
    str
  """
  # Use 'inspect' to get the docstring
  docstring = inspect.getdoc(function)
  border = '#' * 28
  return '{}\n{}\n{}'.format(border, docstring, border)

print(build_tooltip(count_letter))
print(build_tooltip(range))
print(build_tooltip(print))

############################
Count the number of times `letter` appears in `content`.

Args:
  content (str): The string to search.
  letter (str): The letter to search for.

Returns:
  int

Raises:
  ValueError: If `letter` is not a one-character string.
############################
############################
range(stop) -> range object
range(start, stop[, step]) -> range object

Return an object that produces a sequence of integers from start (inclusive)
to stop (exclusive) by step.  range(i, j) produces i, i+1, i+2, ..., j-1.
start defaults to 0, and stop is omitted!  range(4) produces 0, 1, 2, 3.
These are exactly the valid indices for a list of 4 elements.
When step is given, it specifies the increment (or decrement).
############################
############################
print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)

Prints the values to a stream, or to sys.stdout by default.
Optional keyword arguments:
file:  a file-like object (stream); defaults to the

## DRY and "Do One Thing"

* DRY (also known as "don't repeat yourself") and the "Do One Thing" principle are good ways to ensure that your functions are well designed and easy to test.

### Don't repeat yourself (DRY)

* When you are writing code to look for answers to a research question, it is totally normal to copy and paste a bit of code, tweak it slightly, and re-run it. 
* However, this kind of repeated code can lead to real problems.
* One of the problems with copying and pasting is that it is easy to accidentally introduce errors that are hard to spot.
* **Instead of repeated code, you should write a function!**


In [None]:
if False:
    # Assumed imports for this illustration (scikit-learn and matplotlib)
    from sklearn.decomposition import PCA
    import matplotlib.pyplot as plt

    train = pd.read_csv('train.csv')
    train_y = train['labels'].values  # 'labels' is hard-coded here...
    train_X = train[[col for col in train.columns if col != 'labels']].values  # ...and here
    train_pca = PCA(n_components=2).fit_transform(train_X)
    plt.scatter(train_pca[:,0], train_pca[:,1])

    val = pd.read_csv('validation.csv')
    val_y = val['labels'].values  # 'labels' is hard-coded here...
    val_X = val[[col for col in val.columns if col != 'labels']].values  # ...and here
    val_pca = PCA(n_components=2).fit_transform(val_X)
    plt.scatter(val_pca[:,0], val_pca[:,1])

    test = pd.read_csv('test.csv')
    test_y = test['labels'].values  # 'labels' is hard-coded here...
    test_X = test[[col for col in test.columns if col != 'labels']].values  # ...and here
    test_pca = PCA(n_components=2).fit_transform(test_X)
    plt.scatter(test_pca[:,0], test_pca[:,1])

### Use functions to avoid repetition

* Wrapping the repeated logic in a function and then calling that function several times makes it much easier to avoid the kind of errors introduced by copying and pasting. 
* And if you ever need to change the column "label" back to "labels", or you want to swap out PCA for some other dimensionality reduction technique, you only have to do it in one or two places.

In [None]:
def load_and_plot(path):
    """Load a data set and plot the first two principal components.
    
    Args:
        path (str): The location of a CSV file.
    
    Returns:
        tuple of ndarray: (features, labels)
    """

    data = pd.read_csv(path)
    y = data['label'].values
    X = data[[col for col in data.columns if col != 'label']].values
    pca = PCA(n_components=2).fit_transform(X)
    plt.scatter(pca[:,0], pca[:,1])
    return X, y

In [None]:
# Using the functions instead
if False:
    train_X, train_y = load_and_plot('train.csv')
    val_X, val_y = load_and_plot('validation.csv')
    test_X, test_y = load_and_plot('test.csv')

### Do One Thing

* The previous function still has a big problem.
  * It violates another software engineering principle: Do One Thing. 
  * Every function should have a single responsibility.

* Instead of one big function, we could have a more nimble function that just loads the data and a second one for plotting.

In [None]:
def load_data(path):
    """Load a data set.
    
    Args:
        path (str): The location of a CSV file.
    
    Returns:
        tuple of ndarray: (features, labels)
    """
    data = pd.read_csv(path)
    y = data['labels'].values
    X = data[[col for col in data.columns
        if col != 'labels']].values
    return X, y


def plot_data(X):
    """Plot the first two principal components of a matrix
    
    Args:
        X (numpy.ndarray): The data to plot.
    """
    pca = PCA(n_components=2).fit_transform(X)
    plt.scatter(pca[:,0], pca[:,1])

* Our code has become more flexible. 
* Now, it is possible to load the data with just the `load_data()` function. Likewise, if you want to do some transformation to the data before plotting, you can do the transformation and then call the `plot_data()` function.
* We have decoupled the loading functionality from the plotting functionality.

### Advantages of doing one thing

The code becomes:
* More flexible
* More easily understood
* Simpler to test
* Simpler to debug
* Easier to change

### Code smells and refactoring

* Repeated code and functions that do more than one thing are examples of "code smells", which are indications that you may need to refactor. 
* Refactoring is the process of improving code by changing it a little bit at a time. 
* This process is well described in **Martin Fowler's book, "Refactoring"**, which is a good read for any aspiring software engineer.

### Exercise - Extract a function

While you were developing a model to predict the likelihood of a student graduating from college, you wrote this bit of code to get the z-scores of students' yearly GPAs. Now you're ready to turn it into a production-quality system, so you need to do something about the repetition. Writing a function to calculate the z-scores would improve this code.

In [None]:
# Code to be refactored

if False:
    # Standardize the GPAs for each year
    df['y1_z'] = (df.y1_gpa - df.y1_gpa.mean()) / df.y1_gpa.std()
    df['y2_z'] = (df.y2_gpa - df.y2_gpa.mean()) / df.y2_gpa.std()
    df['y3_z'] = (df.y3_gpa - df.y3_gpa.mean()) / df.y3_gpa.std()
    df['y4_z'] = (df.y4_gpa - df.y4_gpa.mean()) / df.y4_gpa.std()

In [None]:
# Refactored code

def standardize(column):
  """Standardize the values in a column.

  Args:
    column (pandas Series): The data to standardize.

  Returns:
    pandas Series: the values as z-scores
  """
  # Finish the function so that it returns the z-scores
  z_score = (column - column.mean()) / column.std()
  return z_score

if False:
    df['y1_z'] = standardize(df['y1_gpa'])
    df['y2_z'] = standardize(df['y2_gpa'])
    df['y3_z'] = standardize(df['y3_gpa'])
    df['y4_z'] = standardize(df['y4_gpa'])
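
Once the logic lives in one function, the four near-identical assignments can also be driven by a loop. The sketch below is a standard-library stand-in rather than the pandas version above: it assumes plain lists instead of Series, uses `statistics.stdev` (the sample standard deviation, which matches the default of pandas' `.std()`), and the GPA values are hypothetical.

```python
import statistics


def standardize(values):
    """Standardize a list of numbers into z-scores."""
    mean = statistics.mean(values)
    std = statistics.stdev(values)  # sample standard deviation
    return [(v - mean) / std for v in values]


# Hypothetical data standing in for the DataFrame columns
gpas = {
    'y1_gpa': [2.0, 3.0, 4.0],
    'y2_gpa': [1.0, 2.0, 3.0],
}

# One loop replaces the repeated, copy-pasted lines
z_scores = {}
for name, column in gpas.items():
    z_scores[name.replace('_gpa', '_z')] = standardize(column)

print(z_scores)
```

If the standardization formula ever changes, only `standardize()` needs to be edited.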

### Exercise 2 - Split up a function

Another engineer on your team has written this function to calculate the mean and median of a list. You want to show them how to split it into two simpler functions: mean() and median().

In [None]:
# Code to be refactored
def mean_and_median(values):
  """Get the mean and median of a list of `values`

  Args:
    values (iterable of float): A list of numbers

  Returns:
    tuple (float, float): The mean and median
  """
  mean = sum(values) / len(values)
  midpoint = int(len(values) / 2)
  if len(values) % 2 == 0:
    median = (values[midpoint - 1] + values[midpoint]) / 2
  else:
    median = values[midpoint]

  return mean, median

In [None]:
# Refactored code
def mean(values):
  """Get the mean of a list of values

  Args:
    values (iterable of float): A list of numbers

  Returns:
    float
  """
  # Write the mean() function
  mean = sum(values) / len(values)
  return mean

def median(values):
  """Get the median of a list of values

  Args:
    values (iterable of float): A list of numbers

  Returns:
    float
  """
  # Write the median() function
  midpoint = int(len(values) / 2)
  if len(values) % 2 == 0:
    median = (values[midpoint - 1] + values[midpoint]) / 2
  else:
    median = values[midpoint]
  return median
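
One thing to watch: `median()` as written indexes straight into `values`, so it appears to assume the list is already sorted. Below is a minimal sketch of a variant that sorts a copy first, compared against `statistics.median` from the standard library. The name `median_sorted` and the sample lists are hypothetical.

```python
import statistics


def median_sorted(values):
    """Get the median of a list of values, sorting a copy first."""
    ordered = sorted(values)  # leaves the caller's list untouched
    midpoint = len(ordered) // 2
    if len(ordered) % 2 == 0:
        return (ordered[midpoint - 1] + ordered[midpoint]) / 2
    return ordered[midpoint]


odd_values = [3, 1, 2]
even_values = [4, 1, 3, 2]

print(median_sorted(odd_values), statistics.median(odd_values))
print(median_sorted(even_values), statistics.median(even_values))
```

Because the median now lives in its own small function, a fix like this touches one place and is easy to test in isolation.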

## Pass by assignment

* The way that Python passes information to functions is different from many other languages. 
* It is referred to as "pass by assignment".

### A surprising example

In [None]:
# Lists are mutable
def foo(x):
    x[0] = 99

my_list = [1, 2, 3]
print(my_list)
foo(my_list)
print(my_list)

[1, 2, 3]
[99, 2, 3]


In [None]:
# Integers are immutable
def bar(x):
    x = x + 90

my_var = 3
print(my_var)
bar(my_var)
print(my_var)

3
3


### Another example

In [None]:
a = [1, 2, 3]
b = a
print("Step 1: ", a, b) # 'a' and 'b' point to the same list

a.append(4)
print("Step 2: ", a, b)

b.append(5)
print("Step 3: ", a, b)

a = 10
print("Step 4: ", a, b) # variable 'a' no longer points to the list

Step 1:  [1, 2, 3] [1, 2, 3]
Step 2:  [1, 2, 3, 4] [1, 2, 3, 4]
Step 3:  [1, 2, 3, 4, 5] [1, 2, 3, 4, 5]
Step 4:  10 [1, 2, 3, 4, 5]


### Immutable or Mutable?

* **Immutable**: int, float, bool, str, bytes, tuple, frozenset, None
* **Mutable**: list, dict, set, bytearray, objects, functions, almost everything else!
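
The difference between changing an object and rebinding a name can be checked with the `is` operator. A minimal sketch; the function names `mutate` and `rebind` are hypothetical:

```python
def mutate(x):
    # x refers to the same list object as the caller's variable,
    # so this change is visible outside the function
    x.append(4)


def rebind(x):
    # Assignment makes the local name x refer to a new list;
    # the caller's variable still refers to the original object
    x = x + [4]
    return x


my_list = [1, 2, 3]
alias = my_list

mutate(my_list)
print(my_list)
print(alias is my_list)

result = rebind(my_list)
print(my_list)
print(result is my_list)
```

Immutable objects such as integers offer no in-place operations, so only the rebinding case applies to them.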

### Exercise 1 - Mutable or immutable?
The following function adds a mapping between a string and the lowercase version of that string to a dictionary. What do you expect the values of d and s to be after the function is called?


In [None]:
def store_lower(_dict, _string):
  """Add a mapping between `_string` and a lowercased version of `_string` to `_dict`

  Args:
    _dict (dict): The dictionary to update.
    _string (str): The string to add.
  """
  orig_string = _string
  _string = _string.lower()
  _dict[orig_string] = _string

d = {}
s = 'Hello'

print(d, s)
store_lower(d, s)
print(d, s)

{} Hello
{'Hello': 'hello'} Hello


### Exercise 2 - Best practice for default arguments

One of your co-workers (who obviously didn't take this course) has written this function for adding a column to a pandas DataFrame. Unfortunately, they used a mutable variable as a default argument value! Please show them a better way to do this so that they don't get unexpected behavior.



In [None]:
# Problematic code
import pandas as pd

def add_column(values, df=pd.DataFrame()):
  """Add a column of `values` to a DataFrame `df`.
  The column will be named "col_<n>" where "n" is
  the numerical index of the column.

  Args:
    values (iterable): The values of the new column
    df (DataFrame, optional): The DataFrame to update.
      If no DataFrame is passed, one is created by default.

  Returns:
    DataFrame
  """
  df['col_{}'.format(len(df.columns))] = values
  return df

In [None]:
# Use an immutable variable for the default argument 
def better_add_column(values, df=None):
  """Add a column of `values` to a DataFrame `df`.
  The column will be named "col_<n>" where "n" is
  the numerical index of the column.

  Args:
    values (iterable): The values of the new column
    df (DataFrame, optional): The DataFrame to update.
      If no DataFrame is passed, one is created by default.

  Returns:
    DataFrame
  """
  # Update the function to create a default DataFrame
  if df is None:
    df = pd.DataFrame()
  df['col_{}'.format(len(df.columns))] = values
  return df

* When you need to set a mutable variable as a default argument, always use `None` and then set the value in the body of the function. This prevents unexpected behavior like adding multiple columns if you call the function more than once.
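
The same pitfall can be reproduced without pandas. The sketch below uses a list as the default value; `append_bad` and `append_good` are hypothetical names chosen for illustration.

```python
def append_bad(value, items=[]):
    # The default list is created once, when the function is defined,
    # and then shared by every call that omits `items`
    items.append(value)
    return items


def append_good(value, items=None):
    # A fresh list is created on each call that omits `items`
    if items is None:
        items = []
    items.append(value)
    return items


first_bad = append_bad(1)
second_bad = append_bad(2)
print(second_bad)               # the first call's value is still there
print(first_bad is second_bad)  # both names refer to the shared default

first_good = append_good(1)
second_good = append_good(2)
print(first_good, second_good)
```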

# Chapter 2 - Context Managers

If you've ever seen the "with" keyword in Python and wondered what its deal was, then this is the chapter for you! Context managers are a convenient way to provide connections in Python and guarantee that those connections get cleaned up when you are done using them. This chapter will show you how to use context managers, as well as how to write your own.

## Using context managers

* What is a context manager?
  * Sets up a context
  * Runs your code
  * Removes the context

* Syntax:
  * The keyword `with` lets Python know that you are trying to enter a context.
  * Then you call a function. You can call any function that is built to work as a context manager.
  * A context manager can take arguments like any normal function.
  * You end the "with" statement with a colon as if you were writing a for loop or an if statement.
  * Statements in Python that have an indented block after them, like for loops, if/else statements, function definitions, etc. are called "compound statements". 
    * The "with" statement is another type of compound statement, so the code you want to run inside the context manager needs to be indented.
  * Some context managers want to return a value that you can use inside the context. By adding "as" and a variable name at the end of the "with" statement, you can assign the returned value to the variable name. 



In [None]:
with <context-manager>(<args>) as <variable-name>:
    # Run your code here
    # This code is running "inside the context"

# This code runs after the context is removed


### A real-world example

`open()` does three things:
* Sets up a context by opening a file
* Lets you run any code you want on that file
* Removes the context by closing the file

In [None]:
if False:
    with open('my_file.txt') as my_file:
        text = my_file.read()
        length = len(text)

    print('The file is {} characters long'.format(length)) # print is outside the context manager

* The `open()` context manager returns a file that we can read from or write to. Adding `as my_file` to the `with` statement assigns that file to the variable `my_file`.
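
To see the context being removed, you can inspect the file object's `closed` attribute inside and after the block. This is a small self-contained sketch: it writes a throwaway file in a temporary directory first, and the file name and contents are made up.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Create a throwaway file so the example does not depend on existing data
    path = os.path.join(tmp_dir, 'my_file.txt')
    with open(path, 'w') as my_file:
        my_file.write('hello')

    with open(path) as my_file:
        text = my_file.read()
        open_inside = not my_file.closed  # still open inside the context

    closed_after = my_file.closed  # closed once the block ends

print(open_inside, closed_after, len(text))
```

Note that `tempfile.TemporaryDirectory()` is itself used as a context manager here: it creates the directory on entry and deletes it on exit.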

### Exercise 1 - The number of cats

You are working on a natural language processing project to determine what makes great writers so great. Your current hypothesis is that great writers talk about cats a lot. To prove it, you want to count the number of times the word "cat" appears in "Alice's Adventures in Wonderland" by Lewis Carroll. You have already downloaded a text file, alice.txt, with the entire contents of this great book.

In [None]:
# Open "alice.txt" and assign the file to "file"
with open(datapath + 'alice.txt') as file:
  text = file.read()

n = 0
for word in text.split():
  if word.lower() in ['cat', 'cats']:
    n += 1

print('Lewis Carroll uses the word "cat" {} times'.format(n))

Lewis Carroll uses the word "cat" 24 times


* By opening the file using the with open() statement, you were able to read in the text of the file. More importantly, when you were done reading the text, the context manager closed the file for you.

### Exercise 2 - The speed of cats

You're working on a new web service that processes Instagram feeds to identify which pictures contain cats (don't ask why -- it's the internet). The code that processes the data is slower than you would like it to be, so you are working on tuning it up to run faster. Given an image, `image`, you have two functions that can process it:

* `process_with_numpy(image)`
* `process_with_pytorch(image)`

Your colleague wrote a context manager, `timer()`, that will print out how long the code inside the context block takes to run. She is suggesting you use it to see which of the two options is faster. Time each function to determine which one to use in your web service.

In [None]:
if False:
    image = get_image_from_instagram()

    # Time how long process_with_numpy(image) takes to run
    with timer():
        print('Numpy version')
        process_with_numpy(image)

    # Time how long process_with_pytorch(image) takes to run
    with timer():
        print('Pytorch version')
        process_with_pytorch(image)

In [None]:
# Output
"""
Numpy version
Processing..........done!
Elapsed: 1.52 seconds
Pytorch version
Processing..........done!
Elapsed: 0.33 seconds
"""


* Now that you know the pytorch version is faster, you can use it in your web service to ensure your users get the rapid response time they expect.

* You may have noticed there was no `as <variable name>` at the end of the `with` statements that use `timer()`. That is because `timer()` is a context manager that does not return a value, so the `as <variable name>` clause isn't necessary.

## Writing context managers

* There are two ways to define a context manager
  * Class-based
  * Function-based **<-- focus of this course**
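
For comparison only, here is a rough sketch of what the class-based form can look like, since the rest of these notes use the function-based form. A class becomes a context manager by defining `__enter__()` and `__exit__()`. The `Timer` class and its attributes are hypothetical.

```python
import time


class Timer:
    """Class-based context manager that records elapsed time."""

    def __enter__(self):
        # Setup: runs when the "with" block is entered
        self.start = time.time()
        return self  # this is the value bound by "as"

    def __exit__(self, exc_type, exc_value, traceback):
        # Teardown: runs when the "with" block is left
        self.elapsed = time.time() - self.start
        return False  # do not suppress exceptions from the block


with Timer() as t:
    time.sleep(0.05)

print('Elapsed: {:.2f}s'.format(t.elapsed))
```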

* There are five parts to creating a context manager.

### How to create a context manager

1. Define a function.
2. (optional) Add any set up code your context needs.
3. Use the "yield" keyword.
   * Used to signal to Python that this is a special kind of function.
4. (optional) Add any teardown code your context needs.
5. Add the `@contextlib.contextmanager` decorator.

In [None]:
import contextlib

@contextlib.contextmanager # Step 5: Add decorator "contextmanager" from the "contextlib" module
def my_context(): # Step 1: Define a function
    # Step 2: (Optional) Add any set up code you need
    yield # Step 3: Must have the "yield" keyword
    # Step 4: (Optional) Add any teardown code you need

### The "yield" keyword

* When you write `yield` it means that you are going to return a value, but you expect to finish the rest of the function at some point in the future. 

* The value that your context manager yields can be assigned to a variable in the "with" statement by adding `as <variable name>`.

* In our example, we've assigned the value `42` that `my_context()` yields to the variable `foo`. 

* By running this code, you can see that after the context block is done executing, the rest of the `my_context()` function gets run, printing "goodbye".

* The keyword `yield` is also used when creating generators.
  * A context manager function is technically a generator that yields a single value.

In [None]:
import contextlib

@contextlib.contextmanager
def my_context():
    print('hello')
    yield 42
    print('goodbye')

with my_context() as foo:
    print('foo is {}'.format(foo))

hello
foo is 42
goodbye
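
To make the "generator that yields a single value" idea concrete, the sketch below steps through a plain generator by hand with `next()`. It is only a rough picture of the sequence of events; the real `with` statement and the `contextmanager` decorator also deal with errors. The names `my_generator` and `events` are hypothetical.

```python
events = []


def my_generator():
    events.append('setup')     # plays the role of the code before yield
    yield 42
    events.append('teardown')  # plays the role of the code after yield


gen = my_generator()
value = next(gen)              # runs the function up to the yield
events.append('block')         # stands in for the body of the "with" block
try:
    next(gen)                  # resumes after the yield and runs to the end
except StopIteration:
    pass                       # a finished generator signals completion this way

print(value, events)
```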


### Setup and teardown

* The ability for a function to yield control and know that it will get to finish running later is what makes context managers so useful. 

* The example below is a context manager that accesses a database. 
  * Like most context managers, it has some setup code that runs before the function yields. 
  * This context manager uses that setup code to connect to the database.
  * This setup/teardown behavior allows a context manager to hide things like connecting and disconnecting from a database so that a programmer using the context manager can just perform operations on the database without worrying about the underlying details.

In [None]:
@contextlib.contextmanager
def database(url):
    # Set up database connection
    db = postgres.connect(url)
    
    yield db
    
    # Tear down database connection
    db.disconnect()

if False:
    url = 'http://datacamp.com/data'
    with database(url) as my_db:
        course_list = my_db.execute('SELECT * FROM courses')

### Yielding a value or None

* The database() context manager that we've been looking at yields a specific value - the database connection - that can be used in the context block.

* Some context managers don't yield an explicit value. 

* In our example `in_dir()` is a context manager that changes the current working directory to a specific path and then changes it back after the context block is done.
  * It does not need to return anything with its "yield" statement.

In [None]:
import os

@contextlib.contextmanager
def in_dir(path):
    # Save current working directory
    old_dir = os.getcwd()
    
    # Switch to new working directory
    os.chdir(path)
    
    yield
    
    # Change back to previous working directory
    os.chdir(old_dir)

if False:
    with in_dir('/data/project_1/'):
        project_files = os.listdir()

### Exercise 1 - The timer() context manager

A colleague of yours is working on a web service that processes Instagram photos. Customers are complaining that the service takes too long to identify whether or not an image has a cat in it, so your colleague has come to you for help. You decide to write a context manager that they can use to time how long their functions take to run.

In [None]:
# Add a decorator that will make timer() a context manager
import time

@contextlib.contextmanager
def timer():
    """Time the execution of a context block.

    Yields:
        None
    """
    start = time.time()
    
    # Send control back to the context block
    yield
    
    end = time.time()
    print('Elapsed: {:.2f}s'.format(end - start))

if False:
    with timer():
        print('This should take approximately 0.25 seconds')
        time.sleep(0.25)

In [None]:
# Output
"""
This should take approximately 0.25 seconds
Elapsed: 0.25s
"""


* Your colleague can now use your `timer()` context manager to figure out which of their functions is running too slowly.

* Notice that the three elements of a context manager are all here: a function definition, a yield statement, and the @contextlib.contextmanager decorator. 

* It's also worth noticing that timer() is a context manager that does not return an explicit value, so yield is written by itself without specifying anything to return.

### Exercise 2 - A read-only open() context manager

You have a bunch of data files for your next deep learning project that took you months to collect and clean. It would be terrible if you accidentally overwrote one of those files when trying to read it in for training, so you decide to create a read-only version of the `open()` context manager to use in your project.

The regular `open()` context manager:

* takes a filename and a mode (`'r'` for read, `'w'` for write, or `'a'` for append)
* opens the file for reading, writing, or appending
* yields control back to the context, along with a reference to the file
* waits for the context to finish
* and then closes the file before exiting

Your context manager will do the same thing, except it will only take the filename as an argument and it will only open the file for reading.

In [None]:
@contextlib.contextmanager
def open_read_only(filename):
    """Open a file in read-only mode.

    Args:
        filename (str): The location of the file to read

    Yields:
        file object
    """
    read_only_file = open(filename, mode='r')
    
    # Yield read_only_file so it can be assigned to my_file
    yield read_only_file
    
    # Close read_only_file
    read_only_file.close()

if False:
    with open_read_only('my_file.txt') as my_file:
        print(my_file.read())

* Now you can relax, knowing that every time you use with open_read_only() your files are safe from being accidentally overwritten. 

* This function is an example of a context manager that does return a value, so we write `yield read_only_file` instead of just `yield`. Then the `read_only_file` object gets assigned to `my_file` in the with statement so that whoever is using your context can call its `.read()` method in the context block.
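
To see the yielded value reach the `as` target without relying on an existing `my_file.txt`, here is a minimal sketch. The temporary file and its contents are assumptions made purely for illustration; the context manager itself is the one from the exercise.

```python
import contextlib
import os
import tempfile

@contextlib.contextmanager
def open_read_only(filename):
    """Open a file in read-only mode and yield the file object."""
    read_only_file = open(filename, mode='r')
    # The yielded object is what `as my_file` receives
    yield read_only_file
    read_only_file.close()

# Create a throwaway file so the example is self-contained (illustrative only)
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('hello')
    path = tmp.name

with open_read_only(path) as my_file:
    contents = my_file.read()

print(contents)        # the text read inside the context block
print(my_file.closed)  # the file is closed once the block exits

os.remove(path)
```

The name `my_file` stays bound after the block, but the file object it refers to has already been closed by the code after the `yield`.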

## Advanced topics

Covered topics:
* Nested contexts;
* Handling errors; and
* How to know when to create a context manager.

### Nested contexts

* First, an example of a copy() function that copies the contents of one file to another file. 
  * One way you could write this function would be to open the source file, store the contents of the file in the "contents" variable, then open the destination file and write the contents to it. 
  * This approach works fine until you try to copy a file that is too large to fit in memory.

In [None]:
def copy(src, dst):
    """Copy the contents of one file to another.
    Args:
    src (str): File name of the file to be copied.
    dst (str): Where to write the new file.
    """
    # Open the source file and read in the contents
    with open(src) as f_src:
        contents = f_src.read()
    
    # Open the destination file and write out the contents
    with open(dst, 'w') as f_dst:
        f_dst.write(contents)

* What would be ideal is if we could open both files at once and copy over one line at a time. 
  * The statement `for line in my_file` reads in the contents of `my_file` one line at a time until the end of the file.

In [None]:
def copy(src, dst):
    """Copy the contents of one file to another.
    Args:
    src (str): File name of the file to be copied.
    dst (str): Where to write the new file.
    """
    # Open both files
    with open(src) as f_src:
        with open(dst,'w') as f_dst:
            # Read and write each line, one at a time
            for line in f_src:
                f_dst.write(line)
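
Python also lets you open several context managers in a single `with` statement, separated by commas, which behaves like the nested form. The sketch below is one way to write the line-by-line `copy()` that way; the temporary directory and file names are assumptions for the demonstration only.

```python
import os
import tempfile

def copy(src, dst):
    """Copy src to dst one line at a time, opening both files in one `with`."""
    # Equivalent to nesting: f_src is opened first, then f_dst
    with open(src) as f_src, open(dst, 'w') as f_dst:
        for line in f_src:
            f_dst.write(line)

# Throwaway files for the demonstration (names are illustrative)
with tempfile.TemporaryDirectory() as tmp_dir:
    src_path = os.path.join(tmp_dir, 'source.txt')
    dst_path = os.path.join(tmp_dir, 'destination.txt')

    with open(src_path, 'w') as f:
        f.write('first line\nsecond line\n')

    copy(src_path, dst_path)

    with open(dst_path) as f:
        copied = f.read()

print(copied)
```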

### Handling errors

```
try:
    # code that might raise an error
except:
    # do something about the error
finally:
    # this code runs no matter what
```

* The example below connects to a printer and tries to print a page.
    * Accessing `doc['txt']` instead of `doc['text']` raises a `KeyError`.
    * The `try` statement does not suppress the error, but the `finally` clause guarantees that the printer is disconnected, so other users can still use it.

In [None]:
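@contextlib.contextmanager
def get_printer(ip):
    # connect_to_printer() is assumed to be defined elsewhere
    p = connect_to_printer(ip)
    
    try:
        # Yield the printer so it can be assigned to `printer`
        yield p
    finally:
        p.disconnect()
        print('disconnected from printer')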

doc = {'text': 'This is my text.'}

with get_printer('10.0.34.111') as printer:
    printer.print_page(doc['txt'])
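
Since `connect_to_printer()` is not defined in these notes, the following sketch swaps in a hypothetical `FakePrinter` class so the behaviour can actually be run. It is only an illustration of the `try`/`finally` pattern, not a real printer API.

```python
import contextlib

class FakePrinter:
    """Hypothetical stand-in for a real printer connection."""
    def __init__(self, ip):
        self.ip = ip
        self.connected = True

    def print_page(self, text):
        print('printing:', text)

    def disconnect(self):
        self.connected = False

@contextlib.contextmanager
def get_printer(ip):
    p = FakePrinter(ip)  # stands in for connect_to_printer(ip)
    try:
        yield p
    finally:
        # Runs whether or not the context block raised
        p.disconnect()
        print('disconnected from printer')

doc = {'text': 'This is my text.'}
caught = None

try:
    with get_printer('10.0.34.111') as printer:
        printer.print_page(doc['txt'])  # wrong key: raises KeyError
except KeyError as err:
    caught = err

print('error raised:', repr(caught))
print('still connected?', printer.connected)
```

The `KeyError` still propagates out of the `with` block (it is caught here only so the script can continue), but the printer has been disconnected by the time it does.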

### Context manager patterns

* Patterns that suggest using a context manager:
  * Open / Close
  * Lock / Release
  * Change / Reset
  * Enter / Exit
  * Start / Stop
  * Setup / Teardown
  * Connect / Disconnect
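
As one illustration of the Lock / Release pattern, the standard library's `threading.Lock` can be used directly in a `with` statement. The shared `totals` dict and the `increment()` helper below are made up for this sketch.

```python
import threading

lock = threading.Lock()
totals = {'count': 0}  # shared state updated by several threads

def increment(n):
    """Add 1 to the shared count n times, holding the lock for each update."""
    for _ in range(n):
        # Entering the block acquires the lock; leaving it releases the lock,
        # even if the body raises
        with lock:
            totals['count'] += 1

threads = [threading.Thread(target=increment, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(totals['count'])
print(lock.locked())
```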

### Exercise 1 - Scraping the NASDAQ

Training deep neural nets is expensive! You might as well invest in NVIDIA stock since you're spending so much on GPUs. To pick the best time to invest, you are going to collect and analyze some data on how their stock is doing. The context manager `stock('NVDA')` will connect to the NASDAQ and return an object that you can use to get the latest price by calling its `.price()` method.

You want to connect to `stock('NVDA')` and record 10 timesteps of price data by writing it to the file `NVDA.txt`.

In [None]:
# Use the "stock('NVDA')" context manager and assign the result to the variable "nvda"
with stock('NVDA') as nvda:
    # Open "NVDA.txt" for writing as f_out
    with open('NVDA.txt', 'w') as f_out:
        for _ in range(10):
            value = nvda.price()
            print('Logging ${:.2f} for NVDA'.format(value))
            f_out.write('{:.2f}\n'.format(value))

* Now you can monitor the NVIDIA stock price and decide when is the exact right time to buy. Nesting context managers like this allows you to connect to the stock market (the CONNECT/DISCONNECT pattern) and write to a file (the OPEN/CLOSE pattern) at the same time.

### Exercise 2 - Changing the working directory

You are using an open-source library that lets you train deep neural networks on your data. Unfortunately, during training, this library writes out checkpoint models (i.e., models that have been trained on a portion of the data) to the current working directory. You find that behavior frustrating because you don't want to have to launch the script from the directory where the models will be saved.

You decide that one way to fix this is to write a context manager that changes the current working directory, lets you build your models, and then resets the working directory to its original location. You'll want to be sure that any errors that occur during model training don't prevent you from resetting the working directory to its original location.

In [None]:
import os

@contextlib.contextmanager
def in_dir(directory):
    """Change current working directory to `directory`,
    allow the user to run some code, and change back.

    Args:
    directory (str): The path to a directory to work in.
    """
    current_dir = os.getcwd()
    os.chdir(directory)

    # Add code that lets you handle errors
    try:
        yield
    # Ensure the directory is reset,
    # whether there was an error or not
    finally:
        os.chdir(current_dir)

* Now, even if someone writes buggy code when using your context manager, you will be sure to change the current working directory back to what it was when they called `in_dir()`. This is important to do because your users might be relying on their working directory being what it was when they started the script. `in_dir()` is a great example of the CHANGE/RESET pattern that indicates you should use a context manager.
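
A quick way to check the reset behaviour is to raise an error on purpose inside the context. The sketch below does that with a temporary directory; the `RuntimeError` is a simulated failure, not something a real training library is known to raise.

```python
import contextlib
import os
import tempfile

@contextlib.contextmanager
def in_dir(directory):
    """Temporarily change the working directory to `directory`."""
    current_dir = os.getcwd()
    os.chdir(directory)
    try:
        yield
    finally:
        os.chdir(current_dir)

start_dir = os.getcwd()

with tempfile.TemporaryDirectory() as tmp_dir:
    try:
        with in_dir(tmp_dir):
            inside_dir = os.getcwd()
            raise RuntimeError('simulated training failure')
    except RuntimeError:
        pass
    after_dir = os.getcwd()

print(inside_dir != start_dir)  # the directory changed inside the block
print(after_dir == start_dir)   # and was reset despite the error
```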

# Chapter 3 - Decorators

Decorators are an extremely powerful concept in Python. They allow you to modify the behavior of a function without changing the code of the function itself. This chapter will lay the foundational concepts needed to thoroughly understand decorators (functions as objects, scope, and closures), and give you a good introduction into how decorators are used and defined. This deep dive into Python internals will set you up to be a superstar Pythonista.

## Functions as objects

* Some Python objects:
  * Functions
  * Modules
  * Lists
  * DataFrames
  * Strings
  * Integers
  * Floats
  * ...

In [None]:
def x():
    pass

x = [1, 2, 3]
x = {'foo': 42}
x = pandas.DataFrame()
x = 'This is a sentence.'
x = 3
x = 71.2
import x

### Functions as variables

In [None]:
def my_function():
    print('Hello')

x = my_function
type(x)

function

In [None]:
my_function()

Hello


In [None]:
x()

Hello


In [None]:
# Renaming print function
PrintyMcPrintface = print
PrintyMcPrintface('Python is awesome!')

Python is awesome!


### Lists and dictionaries of functions

In [None]:
# Using lists
list_of_functions = [my_function, open, print]
list_of_functions[2]('I am printing with an element of a list!')

I am printing with an element of a list!


In [None]:
# Using dictionaries
dict_of_functions = {
    'func1': my_function,
    'func2': open,
    'func3': print
}

dict_of_functions['func3']('I am printing with a value of a dict!')

I am printing with a value of a dict!


### Referencing a function

* Important note:
  * When you assign a function to a variable, you do not include the parentheses after the function name. This is a subtle but very important distinction. 
  * When you type `my_function()` with the parentheses, you are **calling that function**. It evaluates to the value that the function returns. 
  * However, when you type `my_function` without the parentheses, you are **referencing the function itself**. It evaluates to a function object.

In [None]:
def my_function():
    return 42
x = my_function

In [None]:
# Calling the function
my_function()

42

In [None]:
# Referencing the function itself
my_function

<function __main__.my_function>

### Functions as arguments

In [None]:
def has_docstring(func):
    """Check to see if the function
    `func` has a docstring.
    
    Args:
        func (callable): A function.
    
    Returns:
        bool
    """
    return func.__doc__ is not None

In [None]:
def no_docstring():
    return 42

def with_docstring():
    """Return the value 42
    """
    return 42

In [None]:
# Testing the function 'has_docstring'
print(has_docstring(no_docstring))
print(has_docstring(with_docstring))

False
True


### Defining a function inside another function

* A function inside a function may be called:
  * Nested functions
  * Inner functions
  * Helper functions
  * Child functions

In [None]:
# Example of function inside a function
def foo():
    x = [3, 6, 9]
    
    def bar(y):
        print(y)
    
    for value in x:
        bar(value)

In [None]:
# Both do the same thing, but 'foo2' is clearer
def foo1(x, y):
    if x > 4 and x < 10 and y > 4 and y < 10:
        print(x * y)

def foo2(x, y):
    
    def in_range(v):
        return v > 4 and v < 10
    
    if in_range(x) and in_range(y):
        print(x * y)

### Functions as return values



In [None]:
# We can then call new_func() as if it were the print_me() function.

def get_function():
    def print_me(s):
        print(s)
    
    return print_me

new_func = get_function()
new_func('This is a sentence.')

This is a sentence.


### Exercise 1 - Returning functions for a math game

You are building an educational math game where the player enters a math term, and your program returns a function that matches that term. For instance, if the user types "add", your program returns a function that adds two numbers. So far you've only implemented the "add" function. Now you want to include a "subtract" function.

In [None]:
def create_math_function(func_name):
  if func_name == 'add':
    def add(a, b):
      return a + b
    return add
  elif func_name == 'subtract':
    # Define the subtract() function
    def subtract(a, b):
      return a - b
    return subtract
  else:
    print("I don't know that one")
    
add = create_math_function('add')
print('5 + 2 = {}'.format(add(5, 2)))

subtract = create_math_function('subtract')
print('5 - 2 = {}'.format(subtract(5, 2)))

5 + 2 = 7
5 - 2 = 3


* Now that you've implemented the `subtract()` function, you can keep going to include `multiply()` and `divide()`. I predict this game is going to be even bigger than Fortnite!

Notice how we assign the return value from `create_math_function()` to the add and subtract variables in the script. Since `create_math_function()` returns a function, we can then call those variables as functions.
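
Because functions are objects, the same game could also be sketched with a dictionary of functions instead of an `if`/`elif` chain. This variant is an alternative design, not the course's solution; the `MATH_FUNCTIONS` name is made up, and the functions come from the standard library's `operator` module.

```python
import operator

# Map each term the player might type to a function object (no parentheses)
MATH_FUNCTIONS = {
    'add': operator.add,
    'subtract': operator.sub,
    'multiply': operator.mul,
    'divide': operator.truediv,
}

def create_math_function(func_name):
    """Return the function registered under func_name, or None if unknown."""
    func = MATH_FUNCTIONS.get(func_name)
    if func is None:
        print("I don't know that one")
    return func

multiply = create_math_function('multiply')
print('5 * 2 = {}'.format(multiply(5, 2)))
```

Adding a new operation then only needs a new dictionary entry rather than another branch.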

## Scope

* Scope determines which variables can be accessed at different points in your code.

In [None]:
def foo():
    x = 999 # This 'x' is in the local scope
    print('Local scope, x: ', x) 
    print('Local scope, y: ', y) # 'y' is not defined in the local scope, so Python looks in the global scope

x = 7 # This 'x' is in the global scope
y = 200
foo()
print('Global scope, x: ', x) # The function's scope does not change the value of 'x' in the global scope

Local scope, x:  999
Local scope, y:  200
Global scope, x:  7


### Scope order

* First, the interpreter looks for the variable in the **local scope**.
  * If functions are nested, Python next checks the parent function's scope. This is called the **nonlocal scope**.
* Second, it looks for the variable in the **global scope**.
* Finally, it looks for the variable in the **builtin scope**.

* Example:
  * `print()` is a function from the builtin scope.
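
The lookup order can be sketched with one small script in which the same name, `x`, exists at several levels. All function names here are made up for illustration.

```python
x = 'global x'

def outer():
    x = 'nonlocal x'   # local to outer(), nonlocal from inner()'s point of view

    def inner():
        x = 'local x'  # found first: local scope
        return x

    def inner_no_local():
        return x       # no local x, so the enclosing function's x is used

    return inner(), inner_no_local()

def uses_global():
    return x           # no local or nonlocal x, so the global x is used

def uses_builtin():
    return len('abc')  # len is not defined anywhere above: builtin scope

results = outer() + (uses_global(), uses_builtin())
print(results)
```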

### Using the global keyword

* Even though you can use global variables in a local scope, it's recommended not to do that, because it can make testing and debugging harder.

In [None]:
x = 7

def foo():
    x = 42 # Creates a new local 'x'
    print(x)

foo()
print(x) # Prints the global 'x'

42
7


In [None]:
x = 7

def foo():
    global x # Explicitly indicates the use of the global 'x'
    x = 42 
    print(x)

foo()
print(x) # Prints the global 'x'

42
42


### Using the nonlocal keyword

In [None]:
def foo():
    x = 10 # Creates a nonlocal 'x'
    
    def bar():
        x = 200 # Creates a local 'x'
        print(x)
    
    bar()
    print(x)

foo()

200
10


In [None]:
def foo():
    x = 10 # Creates a nonlocal 'x'
    
    def bar():
        nonlocal x # Uses the nonlocal 'x'
        x = 200 
        print(x)
    
    bar()
    print(x)

foo()

200
200


## Closures

* A closure in Python is a tuple of variables that are no longer in scope, but that a function needs in order to run.

### Attaching nonlocal variables to nested functions

In [None]:
def foo():
    a = 5
    def bar():
        print(a)
    return bar

func = foo()
func()

5


* How can func() see 'a' if it is defined in the scope of foo(), not bar()'s?
  * Because of closures.

In [None]:
# Python attached any nonlocal variable that bar() was going to need to the function object
# Those variables get stored in a tuple in the "__closure__" attribute of the function. 
type(func.__closure__)

tuple

In [None]:
# The closure for "func" has one variable
len(func.__closure__)

1

In [None]:
# The value of that variable is in "cell_contents"
func.__closure__[0].cell_contents

5

### Closures and deletion

In [None]:
x = 25

def foo(value):
    def bar():
        print(value)
    return bar

my_func = foo(x)
my_func()

25


In [None]:
del(x)
my_func()

25


In [None]:
# foo()'s 'value' argument is added to the closure, even though 'x' has been deleted
len(my_func.__closure__)

1

In [None]:
# Even though 'x' doesn't exist anymore, the value persists in its closure
my_func.__closure__[0].cell_contents

25

### Closures and overwriting

In [None]:
x = 25
def foo(value):
    def bar():
        print(value)
    return bar

x = foo(x) # Overwrites 'x'
x() # But the value 25 persists in the closure

25


In [None]:
len(x.__closure__)

1

In [None]:
x.__closure__[0].cell_contents

25

### Definitions - nested function

* **Nested function**: A function defined inside another function.

In [None]:
# outer function
def parent():
    # nested function
    def child():
        pass
    return child

### Definitions - nonlocal variables

* **Nonlocal variables**: Variables defined in the parent function that are used by the child function.

In [None]:
def parent(arg_1, arg_2):
    # From child()'s point of view,
    # `value` and `my_dict` are nonlocal variables,
    # as are `arg_1` and `arg_2`.
    value = 22
    my_dict = {'chocolate': 'yummy'}
    
    def child():
        print(2 * value)
        print(my_dict['chocolate'])
        print(arg_1 + arg_2)
        
    return child

### Closure

* A closure is Python's way of attaching nonlocal variables to a returned function so that the function can operate even when it is called outside of its parent's scope.

* **Closure**: Nonlocal variables attached to a returned function.

In [None]:
def parent(arg_1, arg_2):
    value = 22
    my_dict = {'chocolate': 'yummy'}
    
    def child():
        print(2 * value)
        print(my_dict['chocolate'])
        print(arg_1 + arg_2)
        
    return child

new_function = parent(3, 4)

print([cell.cell_contents for cell in new_function.__closure__])

[3, 4, {'chocolate': 'yummy'}, 22]
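
The cells in `__closure__` carry values but not names. One way to see which nonlocal variable each cell belongs to is to pair them with `__code__.co_freevars`, which lists the names in the same order. This is a small inspection sketch, and `closure_by_name` is a name chosen here for illustration.

```python
def parent(arg_1, arg_2):
    value = 22
    my_dict = {'chocolate': 'yummy'}

    def child():
        print(2 * value)
        print(my_dict['chocolate'])
        print(arg_1 + arg_2)

    return child

new_function = parent(3, 4)

# co_freevars lists the names; __closure__ holds the matching cells
names = new_function.__code__.co_freevars
values = [cell.cell_contents for cell in new_function.__closure__]
closure_by_name = dict(zip(names, values))
print(closure_by_name)
```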


### Why does all of this matter?

* Decorators use:
  * Functions as objects
  * Nested functions
  * Nonlocal scope
  * Closures

### Exercise 1 - Checking for closure

You're teaching your niece how to program in Python, and she is working on returning nested functions. She thinks she has written the code correctly, but she is worried that the returned function won't have the necessary information when called. Show her that all of the nonlocal variables she needs are in the new function's closure.

In [None]:
def return_a_func(arg1, arg2):
  def new_func():
    print('arg1 was {}'.format(arg1))
    print('arg2 was {}'.format(arg2))
  return new_func
    
my_func = return_a_func(2, 17)

print(my_func.__closure__ is not None)
print(len(my_func.__closure__) == 2)

# Get the values of the variables in the closure
closure_values = [
  my_func.__closure__[i].cell_contents for i in range(2)
]
print(closure_values == [2, 17])

True
True
True


## Decorators

* A decorator is a wrapper that you can place around a function that changes that function's behaviour.
  * You can modify the inputs;
  * You can modify the outputs;
  * You can modify the behaviour of the function itself.
* Decorators are just functions that take a function as an argument and return a modified version of that function.

### What does a decorator look like?

In [None]:
# The 'double_args' decorator modifies the behaviour of the multiply() function.
@double_args
def multiply(a, b):
    return a * b
multiply(1, 5)

NameError: ignored

### The double_args decorator

In [None]:
# double_args decorator still does nothing

def multiply(a, b):
    return a * b
    
def double_args(func):
    return func

new_multiply = double_args(multiply)
new_multiply(1, 5)

5

In [None]:
# double_args decorator now modifies the multiply() function

def multiply(a, b):
    return a * b

def double_args(func):
    # Define a new function that we can modify
    def wrapper(a, b):
        # Call the passed in function, but double each argument
        return func(a * 2, b * 2)
    return wrapper


'''
* new_multiply() is equal to wrapper(), which calls multiply() 
after doubling each argument. 
* So 1 becomes 2 and 5 becomes 10, giving us 2 times 10, which
equals 20.
'''
new_multiply = double_args(multiply)
new_multiply(1, 5)

20

In [None]:
# Overwriting the multiply() function

def multiply(a, b):
    return a * b

def double_args(func):
    # Define a new function that we can modify
    def wrapper(a, b):
        # Call the passed in function, but double each argument
        return func(a * 2, b * 2)
    return wrapper

multiply = double_args(multiply)
multiply(1, 5)

20

### Decorator syntax

In [None]:
def double_args(func):
    def wrapper(a, b):
        return func(a * 2, b * 2)
    return wrapper

def multiply(a, b):
    return a * b

multiply = double_args(multiply)
multiply(1, 5)

20

In [None]:
def double_args(func):
    def wrapper(a, b):
        return func(a * 2, b * 2)
    return wrapper

@double_args # Using decorator syntax
def multiply(a, b):
    return a * b

multiply(1, 5)

20

### Exercise 1 - Using decorator syntax

You have written a decorator called print_args that prints out all of the arguments and their values any time a function that it is decorating gets called.

In [None]:
# Decorate my_function() with the print_args() decorator by redefining my_function().

def my_function(a, b, c):
  print(a + b + c)

if False:
    # Decorate my_function() with the print_args() decorator
    my_function = print_args(my_function)

    my_function(1, 2, 3)

In [None]:
# Decorate my_function() with the print_args() decorator
@print_args
def my_function(a, b, c):
  print(a + b + c)

my_function(1, 2, 3)

* Note that `@print_args` before the definition of my_function is exactly equivalent to `my_function = print_args(my_function)`. 

* Remember, even though decorators are functions themselves, when you use decorator syntax with the `@` symbol you do not include the parentheses after the decorator name.
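
The exercise assumes `print_args` already exists, so the cells above cannot run on their own. The following is a hypothetical sketch of what such a decorator might look like, not the course's implementation. It uses `inspect.signature()` to match values to parameter names, and `functools.wraps()` to keep the decorated function's metadata.

```python
import inspect
from functools import wraps

def print_args(func):
    """Hypothetical sketch: print each argument name and value on every call."""
    sig = inspect.signature(func)

    @wraps(func)
    def wrapper(*args, **kwargs):
        # Match the passed values to the parameter names of func
        bound = sig.bind(*args, **kwargs)
        bound.apply_defaults()
        described = ', '.join(
            '{}={!r}'.format(name, value)
            for name, value in bound.arguments.items()
        )
        print('{} was called with {}'.format(func.__name__, described))
        return func(*args, **kwargs)
    return wrapper

@print_args
def my_function(a, b, c):
    return a + b + c

result = my_function(1, 2, c=3)
print(result)
```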

### Exercise 2 - Defining a decorator

Your buddy has been working on a decorator that prints a "before" message before the decorated function is called and prints an "after" message after the decorated function is called. They are having trouble remembering how wrapping the decorated function is supposed to work. Help them out by finishing their `print_before_and_after()` decorator.

In [None]:
def print_before_and_after(func):
  def wrapper(*args):
    print('Before {}'.format(func.__name__))
    # Call the function being decorated with *args
    func(*args)
    print('After {}'.format(func.__name__))
  # Return the nested function
  return wrapper

@print_before_and_after
def multiply(a, b):
  print(a * b)

multiply(5, 10)

Before multiply
50
After multiply


* The decorator `print_before_and_after()` defines a nested function `wrapper()` that calls whatever function gets passed to `print_before_and_after()`. 

* `wrapper()` adds a little something else to the function call by printing one message before the decorated function is called and another right afterwards. 

* Since `print_before_and_after()` returns the new `wrapper()` function, we can use it as a decorator to decorate the `multiply()` function.

# Chapter 4 - More on Decorators

Now that you understand how decorators work under the hood, this chapter gives you a bunch of real-world examples of when and how you would write decorators in your own code. You will also learn advanced decorator concepts like how to preserve the metadata of your decorated functions and how to write decorators that take arguments.

## Real-world examples

### Time a function

* The timer() decorator runs the decorated function and then prints how long it took for the function to run. 
  * This is a pretty easy way to figure out where your computational bottlenecks are. 
  

In [None]:
import time

def timer(func):
    """A decorator that prints how long a function took to run.
    
    Args:
        func (callable): The function being decorated.
    
    Returns:
        callable: The decorated function.
    """

In [None]:
import time

def timer(func):
    """A decorator that prints how long a function took to run.
    """

    # Define the wrapper function to return.
    def wrapper(*args, **kwargs):
        # When wrapper() is called, get the current time.
        t_start = time.time()
        # Call the decorated function and store the result.
        result = func(*args, **kwargs)
        # Get the total time it took to run, and print it.
        t_total = time.time() - t_start
        print('{} took {}s'.format(func.__name__, t_total))
        return result
    
    return wrapper

In [None]:
@timer
def sleep_n_seconds(n):
    time.sleep(n)

sleep_n_seconds(5)

sleep_n_seconds took 5.00506591796875s


### Memoizing

* Memoizing is the process of storing the results of a function so that the next time the function is called with the same arguments you can just look up the answer. 

In [None]:
def memoize(func):
    """Store the results of the decorated function for fast lookup
    """
    # Store results in a dict that maps arguments to results
    cache = {}
    # Define the wrapper function to return.
    def wrapper(*args):
        # If these arguments haven't been seen before,
        if (args) not in cache:
            # Call func() and store the result.
            cache[args] = func(*args)
        return cache[args]
    return wrapper

In [None]:
@memoize
def slow_function(a, b):
    print('Sleeping...')
    time.sleep(5)
    return a + b

In [None]:
slow_function(3, 4)

Sleeping...


7

In [None]:
# Now the arguments are in the cache, so the result is retrieved immediately.
slow_function(3, 4)

7
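
The standard library ships a ready-made memoizing decorator, `functools.lru_cache`. The sketch below uses it in place of the hand-written `memoize()`; the `call_log` list and `slow_add()` function are made up so that cache hits are visible without sleeping.

```python
from functools import lru_cache

call_log = []

@lru_cache(maxsize=None)
def slow_add(a, b):
    # Record each real execution so cache hits are visible
    call_log.append((a, b))
    return a + b

first = slow_add(3, 4)   # computed
second = slow_add(3, 4)  # looked up in the cache
third = slow_add(4, 3)   # different arguments, computed again

print(first, second, third)
print(len(call_log))
print(slow_add.cache_info())
```

As with `memoize()`, the arguments must be hashable because they are used as dictionary keys.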

### When to use decorators

* Add common behavior to multiple functions

In [None]:
@timer
def foo():
    # do some computation
    pass

@timer
def bar():
    # do some other computation
    pass

@timer
def baz():
    # do something else
    pass

### Exercise 1 - Print the return type

You are debugging a package that you've been working on with your friends. Something weird is happening with the data being returned from one of your functions, but you're not even sure which function is causing the trouble. You know that sometimes bugs can sneak into your code when you are expecting a function to return one thing, and it returns something different. For instance, if you expect a function to return a numpy array, but it returns a list, you can get unexpected behavior. To ensure this is not what is causing the trouble, you decide to write a decorator, `print_return_type()`, that will print out the type of the variable that gets returned from every call of any function it is decorating.

In [None]:
def print_return_type(func):
  # Define wrapper(), the decorated function
  def wrapper(*args, **kwargs):
    # Call the function being decorated
    result = func(*args, **kwargs)
    print('{}() returned type {}'.format(
      func.__name__, type(result)
    ))
    return result
  # Return the decorated function
  return wrapper
  
@print_return_type
def foo(value):
  return value
  
print(foo(42))
print(foo([1, 2, 3]))
print(foo({'a': 42}))

foo() returned type <class 'int'>
42
foo() returned type <class 'list'>
[1, 2, 3]
foo() returned type <class 'dict'>
{'a': 42}


* Your new decorator helps you examine the results of your functions at runtime. 

* Now you can apply this decorator to every function in the package you are developing and run your scripts. 

* Being able to examine the types of your return values will help you understand what is happening and will hopefully help you find the bug.

### Exercise 2 - Counter

You're working on a new web app, and you are curious about how many times each of the functions in it gets called. So you decide to write a decorator that adds a counter to each function that you decorate. You could use this information in the future to determine whether there are sections of code that you could remove because they are no longer being used by the app.

In [None]:
def counter(func):
  def wrapper(*args, **kwargs):
    wrapper.count += 1
    # Call the function being decorated and return the result
    return func(*args, **kwargs)
  wrapper.count = 0
  # Return the new decorated function
  return wrapper

# Decorate foo() with the counter() decorator
@counter
def foo():
  print('calling foo()')
  
foo()
foo()

print('foo() was called {} times.'.format(foo.count))

calling foo()
calling foo()
foo() was called 2 times.


* Now you can go decorate a bunch of functions with the counter() decorator, let your program run for a while, and then print out how many times each function was called.

* It seems a little magical that you can reference the wrapper() function from inside the definition of wrapper() as we do here on line 3. That's just one of the many neat things about functions in Python -- any function, not just decorators.

## Decorators and metadata

In [1]:
def sleep_n_seconds(n=10):
    """Pause processing for n seconds.
    
    Args:
        n (int): The number of seconds to pause for.
    """
    time.sleep(n)

In [2]:
# Docstring
print(sleep_n_seconds.__doc__)

Pause processing for n seconds.
    
    Args:
        n (int): The number of seconds to pause for.
    


In [3]:
# Function name
print(sleep_n_seconds.__name__)

sleep_n_seconds


In [4]:
# Default arguments
print(sleep_n_seconds.__defaults__)

(10,)


### The timer decorator

In [6]:
def timer(func):
    """A decorator that prints how long a function took to run.
    """
    
    def wrapper(*args, **kwargs):
        t_start = time.time()
        
        result = func(*args, **kwargs)
        
        t_total = time.time() - t_start
        print('{} took {}s'.format(func.__name__, t_total))
        
        return result
    return wrapper

In [7]:
@timer
def sleep_n_seconds(n=10):
    """Pause processing for n seconds.
    
    Args:
        n (int): The number of seconds to pause for.
    """
    time.sleep(n)

In [8]:
# Function name changed
print(sleep_n_seconds.__name__)

wrapper


In [9]:
# No Docstring
print(sleep_n_seconds.__doc__)

None


### functools.wraps()

* To deal with this problem, Python provides the `wraps()` function from the `functools` module. It is a decorator that you use when defining a decorator.

* If you use it to decorate the wrapper function that your decorator returns, it will modify wrapper()'s metadata to look like the function you are decorating. 

* Notice that the wraps() decorator takes the function you are decorating as an argument.

In [10]:
from functools import wraps

def timer(func):
    """A decorator that prints how long a function took to run.
    """
    
    @wraps(func)
    def wrapper(*args, **kwargs):
        t_start = time.time()
        
        result = func(*args, **kwargs)
        
        t_total = time.time() - t_start
        print('{} took {}s'.format(func.__name__, t_total))
        
        return result
    return wrapper

In [11]:
@timer
def sleep_n_seconds(n=10):
    """Pause processing for n seconds.
    
    Args:
        n (int): The number of seconds to pause for.
    """
    time.sleep(n)

In [13]:
# Docstring ok again
print(sleep_n_seconds.__doc__)

Pause processing for n seconds.
    
    Args:
        n (int): The number of seconds to pause for.
    


In [12]:
# Working again
print(sleep_n_seconds.__name__)

sleep_n_seconds


### Exercise 1 - Preserving docstrings when decorating functions

Your friend has come to you with a problem. They've written some nifty decorators and added them to the functions in the open-source library they've been working on. However, they were running some tests and discovered that all of the docstrings have mysteriously disappeared from their decorated functions. Show your friend how to preserve docstrings and other metadata when writing decorators.

In [14]:
from functools import wraps

def add_hello(func):
  # Decorate wrapper() so that it keeps func()'s metadata
  @wraps(func)
  def wrapper(*args, **kwargs):
    """Print 'hello' and then call the decorated function."""
    print('Hello')
    return func(*args, **kwargs)
  return wrapper
  
@add_hello
def print_sum(a, b):
  """Adds two numbers and prints the sum"""
  print(a + b)
  
print_sum(10, 20)
print(print_sum.__doc__)

Hello
30
Adds two numbers and prints the sum


* Your friend was concerned that they couldn't print the docstrings of their functions. 

* They now realize that the strange behavior they were seeing was caused by the fact that they were accidentally printing the wrapper() docstring instead of the docstring of the original function. 

* After adding @wraps(func) to all of their decorators, they see that the docstrings are back where they expect them to be.

### Exercise 2 - Measuring decorator overhead

Your boss wrote a decorator called `check_everything()` that they think is amazing, and they are insisting you use it on your function. However, you've noticed that when you use it to decorate your functions, it makes them run much slower. You need to convince your boss that the decorator is adding too much processing time to your function. To do this, you are going to measure how long the decorated function takes to run and compare it to how long the undecorated function would have taken to run. This is the decorator in question:

In [15]:
def check_everything(func):
  @wraps(func)
  def wrapper(*args, **kwargs):
    check_inputs(*args, **kwargs)
    result = func(*args, **kwargs)
    check_outputs(result)
    return result
  return wrapper

In [17]:
import time

@check_everything
def duplicate(my_list):
  """Return a new list that repeats the input twice"""
  return my_list + my_list

t_start = time.time()
duplicated_list = duplicate(list(range(50)))
t_end = time.time()
decorated_time = t_end - t_start

t_start = time.time()
# Call the original function instead of the decorated one
duplicated_list = duplicate.__wrapped__(list(range(50)))
t_end = time.time()
undecorated_time = t_end - t_start

print('Decorated time: {:.5f}s'.format(decorated_time))
print('Undecorated time: {:.5f}s'.format(undecorated_time))

NameError: ignored

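* In the course's environment, the function ran approximately 10,000 times faster without your boss's decorator. (The cell above raises a `NameError` in this notebook because `check_inputs()` and `check_outputs()` are not defined here.)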

* At least they were smart enough to add @wraps(func) to the nested wrapper() function so that you were able to access the original function. 

* You should show them the results of this test.
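
To make the comparison runnable outside the course environment, the sketch below defines stand-in versions of `check_inputs()` and `check_outputs()`. Both are hypothetical and do arbitrary extra work, so the measured ratio depends entirely on them and will not match the course's figure. It also uses `timeit` instead of a single `time.time()` pair, which is a substitution made here to reduce timer noise.

```python
import timeit
from functools import wraps

def check_inputs(*args, **kwargs):
    """Hypothetical expensive input check (stand-in only)."""
    for arg in args:
        sorted(arg)  # do some deliberately wasteful work

def check_outputs(result):
    """Hypothetical expensive output check (stand-in only)."""
    sorted(result)

def check_everything(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        check_inputs(*args, **kwargs)
        result = func(*args, **kwargs)
        check_outputs(result)
        return result
    return wrapper

@check_everything
def duplicate(my_list):
    """Return a new list that repeats the input twice"""
    return my_list + my_list

data = list(range(50))

# timeit repeats each call many times, which smooths out timer noise
decorated_time = timeit.timeit(lambda: duplicate(data), number=10000)
undecorated_time = timeit.timeit(
    lambda: duplicate.__wrapped__(data), number=10000
)

print('Decorated time: {:.5f}s'.format(decorated_time))
print('Undecorated time: {:.5f}s'.format(undecorated_time))
```

The `__wrapped__` attribute is only available because the decorator applies `@wraps(func)` to its wrapper.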

## Decorators that take arguments

In [18]:
def run_three_times(func):
    
    def wrapper(*args, **kwargs):
        for i in range(3):
            func(*args, **kwargs)
    
    return wrapper

@run_three_times
def print_sum(a, b):
    print(a + b)

print_sum(3, 5)

8
8
8


### A decorator factory

In [20]:
def run_n_times(n):
    """Define and return a decorator"""
    def decorator(func):
        def wrapper(*args, **kwargs):
            for i in range(n):
                func(*args, **kwargs)
        return wrapper
    return decorator

@run_n_times(5)
def print_sum(a, b):
    print(a + b)

print_sum(3, 5)

8
8
8
8
8

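It may help to spell out what the `@run_n_times(5)` line does. The sketch below applies the factory in two explicit steps instead of using the `@` syntax. The names `record_sum()`, `calls`, `run_three_times`, and `record_sum_x3` are hypothetical; results are collected in a list rather than printed so they are easy to inspect.

```python
def run_n_times(n):
    """Define and return a decorator"""
    def decorator(func):
        def wrapper(*args, **kwargs):
            for _ in range(n):
                func(*args, **kwargs)
        return wrapper
    return decorator

calls = []

def record_sum(a, b):
    # Hypothetical function: remember each result instead of printing it
    calls.append(a + b)

# Step 1: calling the factory returns a decorator
run_three_times = run_n_times(3)
# Step 2: calling that decorator returns the wrapped function
record_sum_x3 = run_three_times(record_sum)

record_sum_x3(3, 5)
print(calls)  # [8, 8, 8]
```

Writing `@run_n_times(3)` above a function definition performs the same two calls and then rebinds the function's name to the result.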

### Exercise 1 - run_n_times()

* We have a decorator that takes an argument: run_n_times().

* Practice different ways of applying the decorator to the function print_sum().

In [21]:
# Make print_sum() run 10 times with the run_n_times() decorator
@run_n_times(10)
def print_sum(a, b):
  print(a + b)
  
print_sum(15, 20)

35
35
35
35
35
35
35
35
35
35


In [22]:
# Use run_n_times() to create the run_five_times() decorator
run_five_times = run_n_times(5)

@run_five_times
def print_sum(a, b):
  print(a + b)
  
print_sum(4, 100)

104
104
104
104
104


In [32]:
# Modify the print() function to always run 5 times
print = run_n_times(5)(print)
print('What is happening?!?!')

What is happening?!?!
What is happening?!?!
What is happening?!?!
What is happening?!?!
What is happening?!?!


### Exercise 2 - HTML Generator

You are writing a script that generates HTML for a webpage on the fly. So far, you have written two decorators that will add bold or italics tags to any function that returns a string. You notice, however, that these two decorators look very similar. Instead of writing a bunch of other similar looking decorators, you want to create one decorator, `html()`, that can take any pair of opening and closing tags.

In [24]:
def bold(func):
  @wraps(func)
  def wrapper(*args, **kwargs):
    msg = func(*args, **kwargs)
    return '<b>{}</b>'.format(msg)
  return wrapper

In [25]:
def italics(func):
  @wraps(func)
  def wrapper(*args, **kwargs):
    msg = func(*args, **kwargs)
    return '<i>{}</i>'.format(msg)
  return wrapper

In [26]:
def html(open_tag, close_tag):
  def decorator(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
      msg = func(*args, **kwargs)
      return '{}{}{}'.format(open_tag, msg, close_tag)
    # Return the decorated function
    return wrapper
  # Return the decorator
  return decorator

In [33]:
# Make hello() return bolded text
@html('<b>', '</b>')
def hello(name):
  return 'Hello {}!'.format(name)

# Remove the run_n_times(5) override from the earlier cell so the built-in print() is used again
del print
print(hello('Alice'))

<b>Hello Alice!</b>


In [34]:
# Make goodbye() return italicized text
@html('<i>', '</i>')
def goodbye(name):
  return 'Goodbye {}.'.format(name)
  
print(goodbye('Alice'))

<i>Goodbye Alice.</i>


## timeout(): a real-world example

* Let's imagine that we have some functions that occasionally either run for longer than we want them to or just hang and never return. 

* It would be nice if we could add some kind of timeout() decorator to those functions that will raise an error if the function runs for longer than expected.

In [67]:
import signal
import time
from functools import wraps

def raise_timeout(*args, **kwargs):
    print('Alarm: ', time.ctime())
    raise TimeoutError()

# When an "alarm" signal goes off, call raise_timeout()
signal.signal(signalnum=signal.SIGALRM, handler=raise_timeout)

def timeout_in_5s(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        # Set an alarm for 5 seconds; if func is still running when the
        # alarm goes off, Python calls raise_timeout()
        signal.alarm(5)
        try:
            # Call the decorated func
            return func(*args, **kwargs)
        finally:
            # Cancel alarm
            signal.alarm(0)
    return wrapper

In [69]:
@timeout_in_5s
def foo():
    print('foo start')
    time.sleep(10)
    print('foo end')

foo()

foo start
Alarm:  Sun Dec  6 05:49:29 2020


TimeoutError: ignored

### Another timeout decorator

In [70]:
def timeout(n_seconds):
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            # Set an alarm for n seconds
            signal.alarm(n_seconds)
            try:
                # Call the decorated func
                return func(*args, **kwargs)
            finally:
                # Cancel alarm
                signal.alarm(0)
        return wrapper
    return decorator

In [72]:
@timeout(5)
def foo():
    time.sleep(10)
    print('foo!')

@timeout(20)
def bar():
    time.sleep(10)
    print('bar!')

foo()

Alarm:  Sun Dec  6 05:52:06 2020


TimeoutError: ignored

In [73]:
bar()

bar!

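The decorators above rely on a handler installed once at module level, and `signal.alarm()` only accepts whole seconds. A possible variation, sketched here under the assumption of a Unix-like system and code running in the main thread (`SIGALRM` is not available on Windows), installs the handler per call, restores the previous one afterwards, and uses `signal.setitimer()` so that fractional timeouts work. The names `safe_timeout()`, `slow()`, and `fast()` are hypothetical.

```python
import signal
import time
from functools import wraps

def safe_timeout(n_seconds):
    """Hypothetical variant of timeout() that restores the old handler."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            def handler(signum, frame):
                raise TimeoutError(
                    '{}() exceeded {}s'.format(func.__name__, n_seconds))
            # Install our handler and remember the previous one
            old_handler = signal.signal(signal.SIGALRM, handler)
            if old_handler is None:
                # Previous handler was not installed from Python
                old_handler = signal.SIG_DFL
            # setitimer() accepts floats, unlike alarm()
            signal.setitimer(signal.ITIMER_REAL, n_seconds)
            try:
                return func(*args, **kwargs)
            finally:
                # Cancel the timer and put the previous handler back
                signal.setitimer(signal.ITIMER_REAL, 0)
                signal.signal(signal.SIGALRM, old_handler)
        return wrapper
    return decorator

@safe_timeout(0.2)
def slow():
    time.sleep(2)
    return 'done'

@safe_timeout(0.2)
def fast():
    return 'done'

timed_out = False
try:
    slow()
except TimeoutError as err:
    timed_out = True
    print('Timed out:', err)

print(fast())
```

Restoring the previous handler keeps the decorator from silently replacing a handler that other code may have registered for `SIGALRM`.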

### Exercise 1 - Tag your functions

Tagging something means that you have given that thing one or more strings that act as labels. For instance, we often tag emails or photos so that we can search for them later. You've decided to write a decorator that will let you tag your functions with an arbitrary list of tags. You could use these tags for many things:

* Adding information about who has worked on the function, so a user can look up who to ask if they run into trouble using it.

* Labeling functions as "experimental" so that users know that the inputs and outputs might change in the future.

* Marking any functions that you plan to remove in a future version of the code.

* Etc.

In [74]:
def tag(*tags):
  # Define a new decorator, named "decorator", to return
  def decorator(func):
    # Ensure the decorated function keeps its metadata
    @wraps(func)
    def wrapper(*args, **kwargs):
      # Call the function being decorated and return the result
      return func(*args, **kwargs)
    wrapper.tags = tags
    return wrapper
  # Return the new decorator
  return decorator

@tag('test', 'this is a tag')
def foo():
  pass

print(foo.tags)

('test', 'this is a tag')


* With this new decorator, you can do some really interesting things. For instance, you could tag a set of image-transforming functions, and then write code that searches for all of the functions tagged as image transforms and applies them, one after the other, to a given input image.
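
A minimal sketch of that idea follows, using string transformations in place of image transformations to keep it self-contained. The functions `shout()`, `exclaim()`, `length()`, and the helper `functions_with_tag()` are hypothetical and are not part of the course code.

```python
from functools import wraps

def tag(*tags):
  def decorator(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
      return func(*args, **kwargs)
    wrapper.tags = tags
    return wrapper
  return decorator

@tag('text', 'transform')
def shout(s):
  return s.upper()

@tag('text', 'transform')
def exclaim(s):
  return s + '!'

@tag('text', 'report')
def length(s):
  return len(s)

def functions_with_tag(candidates, wanted):
  """Hypothetical helper: keep only the functions that carry the tag."""
  return [f for f in candidates if wanted in getattr(f, 'tags', ())]

# Apply every function tagged 'transform', one after the other
result = 'hello'
for transform in functions_with_tag([shout, exclaim, length], 'transform'):
  result = transform(result)

print(result)  # HELLO!
```

Using `getattr(f, 'tags', ())` means untagged functions are skipped rather than raising an `AttributeError`.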

### Exercise 2 - Check the return type

Python's flexibility around data types is usually cited as one of the benefits of the language. It can occasionally cause problems though if incorrect data types go unnoticed. You've decided that in order to make sure your code is doing exactly what you want it to do, you will explicitly check the return types of all of your functions and make sure they are what you expect them to be. To do that, you are going to create a decorator that checks that the return type of the decorated function is correct.

Note: `assert condition` is a statement (not a function) that you can use to test whether something is true. If `condition` is `True`, the statement doesn't do anything. If `condition` is `False`, it raises an error. The type of error that it raises is called an `AssertionError`.

In [75]:
def returns_dict(func):
  # Complete the returns_dict() decorator
  def wrapper(*args, **kwargs):
    result = func(*args, **kwargs)
    assert type(result) == dict
    return result
  return wrapper
  
@returns_dict
def foo(value):
  return value

try:
  print(foo([1,2,3]))
except AssertionError:
  print('foo() did not return a dict!')

foo() did not return a dict!


In [76]:
def returns(return_type):
  # Complete the returns() decorator
  def decorator(func):
    def wrapper(*args, **kwargs):
      result = func(*args, **kwargs)
      assert type(result) == return_type
      return result
    return wrapper
  return decorator
  
@returns(dict)
def foo(value):
  return value

try:
  print(foo([1,2,3]))
except AssertionError:
  print('foo() did not return a dict!')

foo() did not return a dict!

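One caveat with the `assert`-based check is that Python removes `assert` statements when it is run with the `-O` option, so the check would silently disappear. A possible alternative, sketched below, raises a `TypeError` explicitly and uses `@wraps(func)` to keep the metadata. The names `returns_checked()` and `identity()` are hypothetical. Note that `isinstance()` also accepts subclasses of `return_type`, which differs from the exact `type(result) == return_type` comparison used above.

```python
from functools import wraps

def returns_checked(return_type):
  """Hypothetical variant of returns() that raises TypeError."""
  def decorator(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
      result = func(*args, **kwargs)
      # Explicit check: not removed when Python runs with -O
      if not isinstance(result, return_type):
        raise TypeError('{}() returned {}, expected {}'.format(
          func.__name__, type(result).__name__, return_type.__name__))
      return result
    return wrapper
  return decorator

@returns_checked(dict)
def identity(value):
  return value

print(identity({'a': 1}))  # {'a': 1}

message = None
try:
  identity([1, 2, 3])
except TypeError as err:
  message = str(err)
  print(message)  # identity() returned list, expected dict
```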

# Wrap-up

* Chapter 1 - Best Practices
  * Docstrings
  * DRY and Do One Thing
  * Pass by assignment (mutable vs immutable)

* Chapter 2 - Context Managers
  * Use of the keyword 'with'
  * Writing your own context managers by using the `contextmanager()` decorator.

* Chapter 3 - Decorators
  * How they work, how to use them, and how to write decorators of your own.

* Chapter 4 - More on Decorators
  * Use of `wraps()` to make sure your decorators maintain their metadata.
  * **Create decorators that take arguments**