## Introducing docstrings

In this mission, we'll cover some best practices that will make your code much easier to use, read, and maintain, including:

- How to document your code so that others can easily understand it.
- How to create functions that are easier to test, debug, and change.
- How to set up default arguments in functions so that your code doesn't behave unexpectedly.

Let's start by looking at this `split_and_stack()` function:

```
import numpy as np
import pandas as pd


def split_and_stack(df, new_names):
    half = len(df.columns) // 2
    left = df.iloc[:, :half]
    right = df.iloc[:, half:]
    return pd.DataFrame(data=np.vstack([left.values, right.values]), columns=new_names)
```

If we wanted to understand what the function does, what the arguments are supposed to be, and what it returns, we would have to spend some time deciphering the code.

With a **docstring**, though, it is much easier to tell what the function does and what its expected inputs and outputs are. A docstring is a string literal written as the first statement in a function's body. Because docstrings usually span multiple lines, they are enclosed in triple quotes, Python's way of writing multi-line strings:

```
def split_and_stack(df, new_names):
    """Split a DataFrame's columns into two halves and stack them
    vertically, returning a new DataFrame with `new_names` as the
    column names.

    Args:
      df (DataFrame): The DataFrame to split.
      new_names (iterable of str): The column names for the new DataFrame.

    Returns:
      DataFrame
    """
    half = len(df.columns) // 2
    left = df.iloc[:, :half]
    right = df.iloc[:, half:]
    return pd.DataFrame(
        data=np.vstack([left.values, right.values]),
        columns=new_names
    )
```
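To see what the docstring describes in practice, here is a small usage sketch. The four-column DataFrame and its column names (`a1`, `b1`, `a2`, `b2`) are made up for illustration, and the function is repeated with a shortened docstring so the snippet runs on its own.

```python
import numpy as np
import pandas as pd


def split_and_stack(df, new_names):
    """Split a DataFrame's columns into two halves and stack them vertically."""
    half = len(df.columns) // 2
    left = df.iloc[:, :half]
    right = df.iloc[:, half:]
    return pd.DataFrame(
        data=np.vstack([left.values, right.values]),
        columns=new_names,
    )


# Hypothetical input: an (a, b) pair of columns repeated side by side.
wide = pd.DataFrame({"a1": [1, 2], "b1": [3, 4], "a2": [5, 6], "b2": [7, 8]})

# The left half (a1, b1) ends up on top of the right half (a2, b2).
stacked = split_and_stack(wide, ["a", "b"])
print(stacked.shape)          # (4, 2)
print(stacked["a"].tolist())  # [1, 2, 5, 6]
print(stacked["b"].tolist())  # [3, 4, 7, 8]
```

Note that the number of names in `new_names` has to match the number of columns in each half.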

Every docstring has some (although usually not all) of these five key pieces of information:

- Description of what the function does.
- Description of the arguments, if any.
- Description of the return value(s), if any.
- Description of errors raised, if any.
- Optional extra notes or examples of usage.

Docstrings make it easier for you and other data scientists or engineers to use, read, and maintain your code in the future. Remember that even though computers execute it, code is actually written for humans to read (otherwise you'd just be writing the 1s and 0s that the computer operates on).

### Retrieving docstrings

Every function in Python comes with a `__doc__` attribute that holds the contents of the function's docstring.

```
def the_answer():
    """Returns the answer to life,
    the universe, and everything.

    Returns:
        int
    """
    return 42
```

```
print(the_answer.__doc__)
```

```
Returns the answer to life,
    the universe, and everything.

    Returns:
        int
```


Notice that the `__doc__` attribute contains the raw docstring, including any tabs or spaces that were added to make the words visually line up.

To get a cleaner version, with those leading spaces removed, we can use the `getdoc()` function from the `inspect` module.

```
import inspect

print(inspect.getdoc(the_answer))
```

```
Returns the answer to life,
the universe, and everything.

Returns:
    int
```


The `inspect` module contains many useful functions for gathering information about functions and other objects, so we recommend taking some time at the end of this mission to read through its documentation.
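As one illustration of what else `inspect` offers, `inspect.signature()` reports a function's parameters and their default values. The sketch below uses a made-up `greet()` function (not part of this mission) and pairs the signature with the first line of the cleaned docstring.

```python
import inspect


def greet(name, greeting="Hello"):  # hypothetical function for illustration
    """Build a greeting for `name`.

    Args:
      name (str): Who to greet.
      greeting (str, optional): The word to open with.

    Returns:
      str
    """
    return f"{greeting}, {name}!"


sig = inspect.signature(greet)
print(sig)                                 # (name, greeting='Hello')
print(sig.parameters["greeting"].default)  # Hello

# The first line of the cleaned docstring is the short description.
summary = inspect.getdoc(greet).splitlines()[0]
print(summary)                             # Build a greeting for `name`.
```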

In a Jupyter notebook, there's also a keyboard shortcut for viewing a function's docstring: press `Shift` + `Tab` while the cursor is inside the parentheses of a function call.

### Google style docstrings

Now that we know how to retrieve a function's docstring, let's learn how to write our own.

Consistent style makes a project easier to read, and the Python community has evolved several standards for how to format docstrings. Google style and Numpydoc are the most popular formats. However, since Numpydoc takes up more vertical space, we'll focus on Google style in this mission to keep the examples compact and legible.

#### Description of what the function does

In Google style, the docstring starts with a concise description of what the function does. This should be in **imperative language**. For instance, we would write "Split the data frame and stack the columns" instead of "This function will split the data frame and stack the columns."

```
def function(arg_1, arg_2=42):
    """Description of what the function does.
    """
```

#### Description of the arguments, if any

Next comes the "Args" section where you list each argument name, followed by its expected type in parentheses, and then its role in the function. If you need extra space, break to the next line and indent, like below. If an argument has a default value, mark it as "optional" when describing the type. If the function does not take any parameters, leave this section out.

```
def function(arg_1, arg_2=42):
    """Description of what the function does.

    Args:
      arg_1 (str): Description of arg_1 that can break onto the next line
        if needed.
      arg_2 (int, optional): Write optional when an argument has a default
        value.
    """
```
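Filled in for a concrete case, the template might look like the sketch below. The `append_copies()` function and its arguments are invented for illustration. It returns nothing, so it needs no section beyond "Args".

```python
def append_copies(items, value, times=1):  # hypothetical function
    """Append copies of a value to the end of a list, in place.

    Args:
      items (list): The list to extend. It is modified in place.
      value (object): The value to append.
      times (int, optional): How many copies of `value` to append.
        Defaults to 1.
    """
    for _ in range(times):
        items.append(value)


scores = [1, 2]
append_copies(scores, 0)           # uses the default, times=1
append_copies(scores, 9, times=2)  # overrides the default
print(scores)  # [1, 2, 0, 9, 9]
```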
  
#### Description of the return value(s), if any

The next section is the "Returns" section, where you list the expected type or types of what gets returned. You can also add a short comment about what gets returned, but often the name of the function and its description already make this clear. Unlike in the "Args" section, additional lines are not indented.

```
def function(arg_1, arg_2=42):
    """Description of what the function does.
    
    Args:
      arg_1 (str): Description of arg_1 that can break onto the next line
        if needed.
      arg_2 (int, optional): Write optional when an argument has a default
        value.
        
    Returns:
      bool: Optional description of the return value
      Extra lines are not indented.
    """
```
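For a function that returns more than one value, the "Returns" section can describe the tuple that comes back. The `min_and_max()` function below is a made-up example, and the wording of the type is a convention for readers rather than something Python checks.

```python
def min_and_max(values):  # hypothetical function for illustration
    """Find the smallest and largest values in a list.

    Args:
      values (list of float): The numbers to search. Must not be empty.

    Returns:
      tuple of (float, float): The smallest value, then the largest value.
    """
    return min(values), max(values)


lowest, highest = min_and_max([3.5, 1.0, 7.25])
print(lowest, highest)  # 1.0 7.25
```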


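#### Description of errors raised, if any

If the function intentionally raises any errors, list each error type in a "Raises" section, followed by a description of the error.

#### Optional extra notes or examples of usage

Finally, any extra notes or examples of usage go in a "Notes" section at the end of the docstring. The complete template looks like this: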

```
def function(arg_1, arg_2=42):
    """Description of what the function does.

    Args:
      arg_1 (str): Description of arg_1 that can break onto the next line
        if needed.
      arg_2 (int, optional): Write optional when an argument has a default
        value.

    Returns:
      bool: Optional description of the return value
      Extra lines are not indented.

    Raises:
      ValueError: Include any error types that the function intentionally
        raises.

    Notes:
      See https://www.dataquest.io for more info.
    """
```
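As a concrete sketch of the full template, the made-up `safe_divide()` function below documents a `ValueError` that it raises and adds an "Examples" section. Writing the example in interactive-prompt form (`>>>`) is a common convention, and it lets the standard library's `doctest` module check that the example still works. Neither the function nor the use of `doctest` comes from this mission.

```python
import doctest


def safe_divide(numerator, denominator, digits=2):  # hypothetical function
    """Divide two numbers and round the result.

    Args:
      numerator (float): The number to divide.
      denominator (float): The number to divide by.
      digits (int, optional): Decimal places to round to. Defaults to 2.

    Returns:
      float: The rounded quotient.

    Raises:
      ValueError: If `denominator` is zero.

    Examples:
      >>> safe_divide(1, 4)
      0.25
    """
    if denominator == 0:
        raise ValueError("denominator must not be zero")
    return round(numerator / denominator, digits)


# Run the ">>>" example from the docstring; nothing is printed if it passes.
doctest.run_docstring_examples(safe_divide, {"safe_divide": safe_divide})

# The documented error is raised for a zero denominator.
try:
    safe_divide(1, 0)
except ValueError as error:
    print(error)  # denominator must not be zero
```

Keeping the "Raises" entry next to the `raise` statement it describes makes it easier to notice when the two drift apart.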


