**Writing Functions in Python**

October 2019

Noah Markowitz

In [1]:
import pandas as pd
import numpy as np

## Best Practices


### Docstrings

Writing docstrings is good practice: a docstring states the purpose of a function or class. It goes at the very top of the function or class body, immediately after the `def` or `class` line. Example below:

```
"""Description of what function does

Args: 
    arg_1 (str): Description of arg_1 that can break into another line 
        if needed. Extra lines are indented
    arg_2 (int, optional): Write "optional" when the argument has a default value
    
Returns:
    bool: Optional description of the return value
    Extra lines not indented
    
Raises:
    ValueError: Include any error types that the function intentionally
        raises
        
Notes:
    See <website url> for more info
"""
```

To access the docstring of a function or object, do one of two things:
1. `print(my_function.__doc__)`
2. `import inspect; print(inspect.getdoc(my_function))`

In [2]:
def count_letter(content, letter):
  """Count the number of times `letter` appears in `content`.

  Args:
    content (str): The string to search.
    letter (str): The letter to search for.

  Returns:
    int

  # Add a section detailing what errors might be raised
  Raises:
    ValueError: If `letter` is not a one-character string.
  """
  if (not isinstance(letter, str)) or len(letter) != 1:
    raise ValueError('`letter` must be a single character string.')
  return len([char for char in content if char == letter])

Ways to print a docstring:

In [3]:
# Get the docstring with an attribute of count_letter()
docstring = count_letter.__doc__
border = '#' * 28
print('{}\n{}\n{}'.format(border, docstring, border))

# Get the docstring with a function from the inspect module
import inspect
docstring = inspect.getdoc(count_letter)
border = '#' * 28
print('{}\n{}\n{}'.format(border, docstring, border))

# Print docstring for any function passed into build_tooltip
def build_tooltip(function):
  """Create a tooltip for any function that shows the 
  function's docstring.
  
  Args:
    function (callable): The function we want a tooltip for.
    
  Returns:
    str
  """
  # Use 'inspect' to get the docstring
  docstring = inspect.getdoc(function)
  border = '#' * 28
  return '{}\n{}\n{}'.format(border, docstring, border)

print(build_tooltip(count_letter))
print(build_tooltip(range))
print(build_tooltip(print))

############################
Count the number of times `letter` appears in `content`.

  Args:
    content (str): The string to search.
    letter (str): The letter to search for.

  Returns:
    int

  # Add a section detailing what errors might be raised
  Raises:
    ValueError: If `letter` is not a one-character string.
  
############################
############################
Count the number of times `letter` appears in `content`.

Args:
  content (str): The string to search.
  letter (str): The letter to search for.

Returns:
  int

# Add a section detailing what errors might be raised
Raises:
  ValueError: If `letter` is not a one-character string.
############################
############################
Count the number of times `letter` appears in `content`.

Args:
  content (str): The string to search.
  letter (str): The letter to search for.

Returns:
  int

# Add a section detailing what errors might be raised
Raises:
  ValueError: If `letter` is not a one-character s

### Dry and "Do One Thing"

DRY - Don't repeat yourself
* Copying and pasting code is convenient, but it makes errors likely: every copy has to be adjusted to reference the right variables and functions, and it is easy to miss one

Do One Thing - Every function should have a single responsibility
* Rather than having one function do many things (like loading and then plotting data), split it into its separate responsibilities. Advantages include:
    * More flexible
    * More easily understood
    * Simpler to test
    * Simpler to debug
    
Refactoring - Changing/improving code a little bit at a time

Improve the code below by writing a function:

```
# Standardize the GPAs for each year
df['y1_z'] = (df.y1_gpa - df.y1_gpa.mean()) / df.y1_gpa.std()
df['y2_z'] = (df.y2_gpa - df.y2_gpa.mean()) / df.y2_gpa.std()
df['y3_z'] = (df.y3_gpa - df.y3_gpa.mean()) / df.y3_gpa.std()
df['y4_z'] = (df.y4_gpa - df.y4_gpa.mean()) / df.y4_gpa.std()
```

It calculates the z-scores of each student's yearly GPAs

>Note: df is a pandas DataFrame where each row is a student with 4 columns of yearly student GPAs: y1_gpa, y2_gpa, y3_gpa, y4_gpa

In [4]:
# Make a sample dataset to work with: 4 columns of random floats from 0-4
df = pd.DataFrame(4*np.random.random((100, 4)), columns = ['y1_gpa', 'y2_gpa','y3_gpa','y4_gpa'])

In [5]:
def standardize(column):
  """Standardize the values in a column.

  Args:
    column (pandas Series): The data to standardize.

  Returns:
    pandas Series: the values as z-scores
  """
  # Finish the function so that it returns the z-scores
  z_score = (column - column.mean()) / column.std()
  return z_score

# Use the standardize() function to calculate the z-scores
df['y1_z'] = standardize(df['y1_gpa'])
df['y2_z'] = standardize(df['y2_gpa'])
df['y3_z'] = standardize(df['y3_gpa'])
df['y4_z'] = standardize(df['y4_gpa'])

# Show results
display(df.head())

| | y1_gpa | y2_gpa | y3_gpa | y4_gpa | y1_z | y2_z | y3_z | y4_z |
|---|---|---|---|---|---|---|---|---|
| 0 | 2.729334 | 1.588427 | 2.89493 | 0.544082 | 0.487456 | -0.467757 | 0.752001 | -1.212174 |
| 1 | 2.153205 | 2.309773 | 3.581811 | 3.530328 | -0.018477 | 0.130562 | 1.34204 | 1.231154 |
| 2 | 1.90215 | 1.870188 | 3.922563 | 3.116415 | -0.238943 | -0.234051 | 1.63475 | 0.892493 |
| 3 | 1.60654 | 3.63475 | 3.338554 | 3.903153 | -0.498536 | 1.229562 | 1.133079 | 1.536197 |
| 4 | 1.221571 | 3.598102 | 1.163339 | 2.620465 | -0.836599 | 1.199164 | -0.735457 | 0.486709 |


Now try splitting this function that calculates mean and median into two separate functions

```
def mean_and_median(values):
  """Get the mean and median of a list of `values`

  Args:
    values (iterable of float): A list of numbers

  Returns:
    tuple (float, float): The mean and median
  """
  mean = sum(values) / len(values)
  # Sort a copy first: the median is the middle of the ordered values
  ordered = sorted(values)
  midpoint = len(ordered) // 2
  if len(ordered) % 2 == 0:
    median = (ordered[midpoint - 1] + ordered[midpoint]) / 2
  else:
    median = ordered[midpoint]

  return mean, median
```

In [6]:
def mean(values):
  """Get the mean of a list of values

  Args:
    values (iterable of float): A list of numbers

  Returns:
    float
  """
  # Write the mean() function
  mean = sum(values) / len(values)
  return mean

def median(values):
  """Get the median of a list of values

  Args:
    values (iterable of float): A list of numbers

  Returns:
    float
  """
  # Write the median() function
  # Sort a copy first: the median is the middle of the ordered values
  ordered = sorted(values)
  midpoint = len(ordered) // 2
  if len(ordered) % 2 == 0:
    median = (ordered[midpoint - 1] + ordered[midpoint]) / 2
  else:
    median = ordered[midpoint]
  return median

### Use Immutable Object for Default Function Values

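A default argument value is created once, when the function is defined, so a mutable default (such as a DataFrame or a list) is shared by every call. Use `None` as the default and create the object inside the function: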
```
# Use an immutable variable for the default argument 
def better_add_column(values, df=None):
  # Update the function to create a default DataFrame
  if df is None:
    df = pd.DataFrame()
  df['col_{}'.format(len(df.columns))] = values
  return df
```

As opposed to a mutable default, which keeps accumulating columns across calls:

```
def add_column(values, df=pd.DataFrame()):
  df['col_{}'.format(len(df.columns))] = values
  return df
```
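The difference is easiest to see with a plain list instead of a DataFrame. The sketch below uses two hypothetical helpers, `add_item_bad()` and `add_item_good()`, which are not part of the original notes; they only illustrate how a mutable default is shared between calls.

```python
def add_item_bad(value, items=[]):
    # The default list is created once, when the function is defined,
    # so every call without `items` appends to the same list
    items.append(value)
    return items

def add_item_good(value, items=None):
    # None is immutable; a fresh list is created on each call
    if items is None:
        items = []
    items.append(value)
    return items

bad_first = add_item_bad(1)
bad_second = add_item_bad(2)
good_first = add_item_good(1)
good_second = add_item_good(2)

print(bad_second)   # the value from the first call is still there
print(good_second)  # only the value from this call
```

Note that `bad_first` and `bad_second` are the same list object, which is why the first call's value shows up in the second result.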

## Context Managers

Context managers are typically used with files: a file opened with a context manager is closed automatically once you're done with it. The key is the `with` keyword

Ex:

```
with open('file.txt') as f:
  x = f.readlines()
```

The file will be opened, read and then automatically closed

---

Ex 2: Count the number of times the word "cat" appears in "Alice's Adventures in Wonderland" by Lewis Carroll. You have already downloaded a text file, alice.txt, with the entire contents of this great book.

```
# Open "alice.txt" and assign the file to "file"
with open('alice.txt') as file:
  text = file.read()

n = 0
for word in text.split():
  if word.lower() in ['cat', 'cats']:
    n += 1

print('Lewis Carroll uses the word "cat" {} times'.format(n))
```

Ex 3: You're testing code that processes data and determines whether a cat is in a picture on Instagram. You want to make it faster. Someone created a new context manager called `timer` to measure how long a process takes. Test it on `process_with_numpy()` and `process_with_pytorch()`

```
image = get_image_from_instagram()

# Time how long process_with_numpy(image) takes to run
with timer():
  print('Numpy version')
  process_with_numpy(image)

# Time how long process_with_pytorch(image) takes to run
with timer():
  print('Pytorch version')
  process_with_pytorch(image)
```

There was no `as <variable_name>` at the end of the `with` statement because `timer()` is a context manager that doesn't yield a value

### Creating Context Managers

Two ways to define context managers
1. Class-based
2. Function-based
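The rest of these notes use the function-based approach. For comparison, here is a minimal sketch of the class-based approach, which relies on the `__enter__` and `__exit__` methods. The `Timer` class and its `elapsed` attribute are hypothetical names chosen for illustration.

```python
import time

class Timer:
    """Class-based context manager that records elapsed time."""

    def __enter__(self):
        # Setup: runs when the `with` block is entered
        self.start = time.perf_counter()
        # The returned value is what `as <name>` is bound to
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Teardown: runs when the block exits, even after an error
        self.elapsed = time.perf_counter() - self.start
        # Returning False lets any exception propagate to the caller
        return False

with Timer() as t:
    time.sleep(0.1)

print('Elapsed: {:.2f}s'.format(t.elapsed))
```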

Five parts to creating a context manager
1. Define function
2. Add any set up code your context needs (optional)
3. Use the `yield` keyword (hands a value to the `with` block and pauses the function; the code after `yield` runs once the block finishes)
4. Add any teardown code your context needs (optional)
5. Add the `@contextlib.contextmanager` decorator

Setup and Teardown

```
@contextlib.contextmanager
def database(url):
  # set up database connection
  db = postgres.connect(url)
  yield db
  # tear down database connection
  db.disconnect()
  
url = 'http://datacamp.com/data'
with database(url) as my_db:
  course_list = my_db.execute(
    'SELECT * FROM courses'
  )
```

In [7]:
# The timer() context manager
# Add a decorator that will make timer() a context manager
import contextlib
import time
@contextlib.contextmanager
def timer():
  """Time the execution of a context block.

  Yields:
    None
  """
  start = time.time()
  # Send control back to the context block
  yield
  end = time.time()
  print('Elapsed: {:.2f}s'.format(end - start))

# Test it out
with timer():
  print('This should take approximately 0.25 seconds')
  time.sleep(0.25)

This should take approximately 0.25 seconds
Elapsed: 0.25s


Above, `yield` is written by itself as it doesn't return an explicit value

It's also useful to wrap `yield` in a `try`/`finally` statement so that teardown still happens when an error occurs inside the `with` block

```
try:
  yield
finally:
  file.close()
```

Above, the file is closed no matter what happens inside the context block
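As a rough sketch of this behavior, the hypothetical `managed_resource()` context manager below records its setup and teardown in a list, so you can see that teardown runs even when the `with` block raises. The function name and the `events` list are assumptions made for the example.

```python
import contextlib

@contextlib.contextmanager
def managed_resource(log):
    # Setup
    log.append('setup')
    try:
        yield log
    finally:
        # Teardown: runs whether or not the block raised
        log.append('teardown')

events = []
try:
    with managed_resource(events):
        raise RuntimeError('something went wrong')
except RuntimeError:
    events.append('error handled')

print(events)
```

The teardown entry is recorded before the error reaches the outer `except`, because `finally` runs as the exception passes through the context manager.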

Context Manager Patterns (when a context manager may be useful)
* Open-Close
* Lock-Release
* Change-Reset
* Enter-Exit
* Start-Stop
* Setup-Teardown
* Connect-Disconnect
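To illustrate the Change-Reset pattern, here is a sketch of a context manager that temporarily changes the working directory and restores it afterwards. The name `in_dir()` is a hypothetical choice; the example uses a temporary directory from the standard library so it can run anywhere.

```python
import contextlib
import os
import tempfile

@contextlib.contextmanager
def in_dir(path):
    # Change: remember the current directory, then switch
    old_dir = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Reset: always switch back, even after an error
        os.chdir(old_dir)

start = os.getcwd()
with tempfile.TemporaryDirectory() as tmp:
    with in_dir(tmp):
        inside = os.getcwd()
end = os.getcwd()

print(start == end)
```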

## Decorators


### Functions as objects

Functions are just another type of object like lists, arrays, dictionaries etc.

You can assign it to a variable and call that function from the new variable

ex:

```
x = my_function  # No parentheses: assign the function itself, not its result
x() # Execute my_function()
```

You can also assign it to a list or dictionary and then call it from there. 

```
# Add the missing function references to the function map
function_map = {
  'mean': mean,
  'std': std,
  'minimum': minimum,
  'maximum': maximum
}

data = load_data()
print(data)

func_name = get_user_input()

# Call the chosen function and pass "data" as an argument
function_map[func_name](data)
```

You can also pass a function to another function. Below, `has_docstring()` checks whether the function `log_product` has a docstring

```
def has_docstring(func):
  """Check to see if the function 
  `func` has a docstring.

  Args:
    func (callable): A function.

  Returns:
    bool
  """
  return func.__doc__ is not None
  
# Call has_docstring() on the log_product() function
ok = has_docstring(log_product)

if not ok:
  print("log_product() doesn't have a docstring!")
else:
  print("log_product() looks ok")
```

You can even return a function as a value

```
def create_math_function(func_name):
  if func_name == 'add':
    def add(a, b):
      return a + b
    return add
  elif func_name == 'subtract':
    # Define the subtract() function
    def subtract(a,b):
      return a - b
    return subtract
  else:
    print("I don't know that one")
    
add = create_math_function('add')
print('5 + 2 = {}'.format(add(5, 2)))

subtract = create_math_function('subtract')
print('5 - 2 = {}'.format(subtract(5, 2)))
```

### Scope

From broadest to narrowest (Python searches them in the reverse order, starting with local)
1. Builtins (like the `print()` function)
2. Global
3. Nonlocal - In the case of nested functions, variables defined in the parent function that are used by the child function
4. Local
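A small sketch of how the same name resolves in each scope. The functions `outer()` and `inner()` and the variable `x` are placeholders for the example.

```python
x = 'global'

def outer():
    x = 'nonlocal'  # Local to outer(), nonlocal from inner()'s point of view

    def inner():
        x = 'local'  # Local to inner(); shadows the outer names
        return x

    return inner(), x

inner_value, outer_value = outer()
print(inner_value, outer_value, x)

# `len` is not defined in this file, so Python finds it in the builtins
print(len('abc'))
```

Each assignment creates a new variable in its own scope, so the three values of `x` never overwrite one another.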

To assign to a global variable from inside a function, declare it with the `global` keyword

```
call_count = 0

def my_function():
  # Use a keyword that lets us update call_count 
  global call_count
  call_count += 1
  
  print("You've called my_function() {} times!".format(
    call_count
  ))
  
for _ in range(20):
  my_function()
```

You can assign to a variable of an enclosing function using the `nonlocal` keyword. It only applies to nested functions, and it is only needed when the inner function assigns to the outer variable; reading the variable works without it

```
def read_files():
  file_contents = None
  
  def save_contents(filename):
    # Add a keyword that lets us modify file_contents
    nonlocal file_contents
    if file_contents is None:
      file_contents = []
    with open(filename) as fin:
      file_contents.append(fin.read())
      
  for filename in ['1984.txt', 'MobyDick.txt', 'CatsEye.txt']:
    save_contents(filename)
    
  return file_contents

print('\n'.join(read_files()))
```

### Closures

Closure - The nonlocal variables attached to a returned function so that it can still run outside its parent's scope

In Python it is stored as a tuple of cells holding the variables that are no longer in scope but that the function needs in order to run

In [8]:
# Closures example 1
x = 25

def foo(value):
    def bar():
        print(value)
    return bar

my_func = foo(x)
my_func()

del(x)
my_func()

# Even though x no longer exists, the needed value exists in the function's closure
print( len(my_func.__closure__) )
my_func.__closure__[0].cell_contents

25
25
1


25

In [9]:
# Closures example 2
def parent(arg_1, arg_2):
    value = 22
    my_dict = {'chocolate': 'yummy'}
    def child():
        print(2 * value)
        print(my_dict['chocolate'])
        print(arg_1 + arg_2)
    return child

new_function = parent(3, 4)
print([cell.cell_contents for cell in new_function.__closure__])
new_function()

[3, 4, {'chocolate': 'yummy'}, 22]
44
yummy
7


In [10]:
def return_a_func(arg1, arg2):
    def new_func():
        print('arg1 was {}'.format(arg1))
        print('arg2 was {}'.format(arg2))
    return new_func
    
my_func = return_a_func(2, 17)

# Show the closure has content
print(my_func.__closure__ is not None)

# Show closure has length of 2
print(len(my_func.__closure__) == 2)

# Get the values of the variables in the closure
closure_values = [my_func.__closure__[i].cell_contents for i in range(2)]
print(closure_values == [2, 17])

True
True
True


In this example the closure holds `arg1` and then `arg2`. The exact ordering of the cells is a CPython implementation detail, so it is safer to pair each cell with its name through `my_func.__code__.co_freevars` than to rely on position. A variable is only captured if the nested function uses it: if `return_a_func()` took a third argument (e.g., `arg3`) that wasn't used by `new_func()`, it would not be captured in `new_func()`'s closure.
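A minimal sketch of that last point, using a variant of `return_a_func()` that takes a hypothetical third argument `arg3` which the nested function never uses. The names are matched to their cells through `__code__.co_freevars`.

```python
def return_a_func(arg1, arg2, arg3):
    def new_func():
        # Only arg1 and arg2 are used here, so only they are captured
        return arg1 + arg2
    return new_func

my_func = return_a_func(2, 17, 99)

# co_freevars lists the captured names in the same order as __closure__
names = my_func.__code__.co_freevars
values = [cell.cell_contents for cell in my_func.__closure__]
captured = dict(zip(names, values))

print(captured)
```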

In [11]:
def my_special_function():
    print('You are running my_special_function()')

def get_new_func(func):
    def call_func():
        func()
    return call_func

new_func = get_new_func(my_special_function)

# Redefine my_special_function() to just print "hello"
def my_special_function():
    print('hello')

new_func()

You are running my_special_function()


In [12]:
# Delete my_special_function()
del(my_special_function)
new_func()

You are running my_special_function()


In [13]:
def my_special_function():
    print('You are running my_special_function()')

# Overwrite `my_special_function` with the new function
my_special_function = get_new_func(my_special_function)
my_special_function()

You are running my_special_function()


**Even if things are overwritten or deleted, the needed values are stored safely in the function's closure so that it can still run**

### Starting Decorators

Decorator - A wrapper around a function that changes its behavior, for example by modifying its inputs or its outputs

The two blocks of code below are the same except one uses decorator syntax. However they both work the same way.

In this example the inputs to the `multiply` function are changed before being evaluated

In [14]:
# No decorator
def double_args(func):
    def wrapper(a, b):
        return func(a * 2, b * 2)
    return wrapper

def multiply(a, b):
    return a * b

multiply = double_args(multiply)
multiply(1, 5)

20

In [15]:
def double_args(func):
    def wrapper(a, b):
        return func(a * 2, b * 2)
    return wrapper

@double_args
def multiply(a, b):
    return a * b

multiply(1, 5)

20

Here's the process broken down for `multiply = double_args(multiply)`

* `double_args()` returns a function, `wrapper`
* Calling `wrapper(a, b)` calls the original `multiply(a * 2, b * 2)`, which `wrapper` keeps in its closure

In [16]:
# Define a wrapper that prints statements before and after function execution
def print_before_and_after(func):
    def wrapper(*args):
        print('Before {}'.format(func.__name__))
        # Call the function being decorated with *args
        result = func(*args)
        print('After {}'.format(func.__name__))
        return result
    # Return the nested function
    return wrapper

@print_before_and_after
def multiply(a, b):
    print(a * b)

multiply(5, 10)

Before multiply
50
After multiply


## More on Decorators

When to use decorators: Add common code to multiple functions
* Like timing how long a function takes or printing before and after messages
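As a sketch of the timing use case, the hypothetical `timed()` decorator below prints how long each call takes. It is named `timed` to avoid clashing with the `timer()` context manager defined earlier; `slow_add()` is a placeholder function for the demo.

```python
import time

def timed(func):
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        # Call the decorated function and keep its result
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print('{}() took {:.4f}s'.format(func.__name__, elapsed))
        return result
    return wrapper

@timed
def slow_add(a, b):
    time.sleep(0.05)
    return a + b

total = slow_add(1, 2)
print(total)
```

Any function can now be timed by adding `@timed` above its definition, without repeating the timing code inside each one.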

In [17]:
# A wrapper that can determine the type of the data returned
def print_return_type(func):
    # Define wrapper(), the decorated function
    def wrapper(*args, **kwargs):
        # Call the function being decorated
        result = func(*args, **kwargs)
        print('{}() returned type {}'.format(
            func.__name__, type(result)
        ))
        return result

    # Return the decorated function
    return wrapper
  
@print_return_type
def foo(value):
    return value
  
print(foo(42))
print(foo([1, 2, 3]))
print(foo({'a': 42}))

foo() returned type <class 'int'>
42
foo() returned type <class 'list'>
[1, 2, 3]
foo() returned type <class 'dict'>
{'a': 42}


In [18]:
# Count how many times a function was called
def counter(func):
    def wrapper(*args, **kwargs):
        wrapper.count += 1
        # Call the function being decorated and return the result
        return func(*args, **kwargs)

    wrapper.count = 0
    # Return the new decorated function
    return wrapper

# Decorate foo() with the counter() decorator
@counter
def foo():
    print('calling foo()')
  
foo()
foo()

print('foo() was called {} times.'.format(foo.count))

calling foo()
calling foo()
foo() was called 2 times.


### Decorators and Metadata

A problem with decorators is that they obscure the function's metadata
    
If you look at the decorated function's name and docstring, you get the wrapper's name and docstring instead
* `print(func.__doc__)` and `print(func.__name__)`

To fix this, decorate the wrapper function with `functools.wraps(func)`
Below is an example

In [19]:
# Will not print the docstring of the function
def add_hello(func):
    def wrapper(*args, **kwargs):
        """Print 'hello' and then call the decorated function."""
        print('Hello')
        return func(*args, **kwargs)
    return wrapper

# Decorate print_sum() with the add_hello() decorator
@add_hello
def print_sum(a, b):
    """Adds two numbers and prints the sum"""
    print(a + b)

print_sum(10, 20)
print(print_sum.__doc__)

Hello
30
Print 'hello' and then call the decorated function.


In [20]:
# Will print the appropriate docstring
from functools import wraps

def add_hello(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        """Print 'hello' and then call the decorated function."""
        print('Hello')
        return func(*args, **kwargs)
    return wrapper

# Decorate print_sum() with the add_hello() decorator
@add_hello
def print_sum(a, b):
    """Adds two numbers and prints the sum"""
    print(a + b)
  
print_sum(10, 20)
print(print_sum.__doc__)

Hello
30
Adds two numbers and prints the sum
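The same applies to `__name__`. The sketch below compares two hypothetical pass-through decorators, `plain_decorator()` and `wrapping_decorator()`, which differ only in whether they use `wraps()`.

```python
from functools import wraps

def plain_decorator(func):
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

def wrapping_decorator(func):
    @wraps(func)  # Copy name, docstring, etc. from func onto wrapper
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

@plain_decorator
def add(a, b):
    """Return a + b."""
    return a + b

@wrapping_decorator
def subtract(a, b):
    """Return a - b."""
    return a - b

print(add.__name__, add.__doc__)
print(subtract.__name__, subtract.__doc__)
```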


Even if the function is wrapped, you still have access to the original version through the `__wrapped__` attribute, which `functools.wraps()` sets. The example below shows the time it takes to run the function decorated and undecorated

In [21]:
def check_everything(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        return result
    return wrapper

@check_everything
def duplicate(my_list):
    """Return a new list that repeats the input twice"""
    return my_list + my_list

t_start = time.time()
duplicated_list = duplicate(list(range(50)))
t_end = time.time()
decorated_time = t_end - t_start

t_start = time.time()
# Call the original function instead of the decorated one
duplicated_list = duplicate.__wrapped__(list(range(50)))
t_end = time.time()
undecorated_time = t_end - t_start

print('Decorated time: {:.5f}s'.format(decorated_time))
print('Undecorated time: {:.5f}s'.format(undecorated_time))

Decorated time: 0.00009s
Undecorated time: 0.00007s


### Decorators with arguments

Make a function that RETURNS a decorator rather than a function that IS a decorator

In [22]:
# This function takes an argument and returns a decorator
def run_n_times(n):
    """Define and return a decorator"""
    def decorator(func):
        def wrapper(*args, **kwargs):
            for i in range(n):
                func(*args, **kwargs)
        return wrapper
    return decorator

In [23]:
# Make print_sum() run 10 times with the run_n_times() decorator
@run_n_times(10)
def print_sum(a, b):
    print(a + b)

print_sum(15, 20)

35
35
35
35
35
35
35
35
35
35


In [24]:
# Use run_n_times() to create the run_five_times() decorator
run_five_times = run_n_times(5)

@run_five_times
def print_sum(a, b):
    print(a + b)

print_sum(4, 100)

104
104
104
104
104


In [25]:
# Modify the print() function to always run 7 times
# run_n_times(7) returns a decorator; the second call applies that decorator to print()
print2 = run_n_times(7)(print)
print2('What is happening?!?!')

What is happening?!?!
What is happening?!?!
What is happening?!?!
What is happening?!?!
What is happening?!?!
What is happening?!?!
What is happening?!?!


The example below is a decorator that wraps text in an HTML tag such as `<i>` for italics

In [26]:
def html(open_tag, close_tag):
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            msg = func(*args, **kwargs)
            return '{}{}{}'.format(open_tag, msg, close_tag)
    
        # Return the decorated function
        return wrapper
    # Return the decorator
    return decorator

# Make hello() return italicized text
@html('<i>', '</i>')
def hello(name):
    return 'Hello {}!'.format(name)
  
print(hello('Alice'))

<i>Hello Alice!</i>


#### More examples

The first example below just reiterates how to create a decorator that accepts arguments

In [27]:
def tag(*tags):
  # Define a new decorator, named "decorator", to return
  def decorator(func):
    # Ensure the decorated function keeps its metadata
    @wraps(func)
    def wrapper(*args, **kwargs):
      # Call the function being decorated and return the result
      return func(*args, **kwargs)
    wrapper.tags = tags
    return wrapper
  # Return the new decorator
  return decorator

@tag('test', 'this is a tag')
def foo():
  pass

print(foo.tags)

('test', 'this is a tag')


The example below checks whether the return value is a dict

In [28]:
def returns(return_type):
  # Complete the returns() decorator
  def decorator(func):
    def wrapper(*args, **kwargs):
      result = func(*args, **kwargs)
      assert(type(result) == return_type)
      return result
    return wrapper
  return decorator
  
@returns(dict)
def foo(value):
  return value

try:
  print(foo([1,2,3]))
except AssertionError:
  print('foo() did not return a dict!')

foo() did not return a dict!
