# Lesson I 

## Docstrings

In this course, you'll learn how to write functions that others can use. Docstrings are a Python best practice that will make your code much easier to use, read, and maintain.

### A Complex Function

```python
import numpy as np
import pandas as pd

def split_and_stack(df, new_names):
    half = int(len(df.columns) / 2)
    left = df.iloc[:, :half]
    right = df.iloc[:, half:]
    return pd.DataFrame(
        data=np.vstack([left.values, right.values]),
        columns=new_names
    )
```

Look at this ``split_and_stack()`` function. If you wanted to understand what the function does, what the arguments are supposed to be, and what it returns, you would have to spend some time deciphering the code.

#### A Complex Function with a Docstring

With a docstring though, it is much easier to tell what the expected inputs and outputs should be, as well as what the function does. This makes it easier for you and other engineers to use your code in the future.

```python
import numpy as np
import pandas as pd

def split_and_stack(df, new_names):
    """Split a DataFrame's columns into two halves and then stack them vertically,
    returning a new DataFrame with 'new_names' as the column names.

    Args:
        df (DataFrame): The DataFrame to split.
        new_names (iterable of str): The column names for the new DataFrame.

    Returns:
        DataFrame
    """
    half = int(len(df.columns) / 2)
    left = df.iloc[:, :half]   # first half of the columns
    right = df.iloc[:, half:]  # second half of the columns
    # Stack the two halves on top of each other, row-wise
    return pd.DataFrame(
        data=np.vstack([left.values, right.values]),
        columns=new_names
    )
```
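One practical payoff is that Python's built-in `help()` displays a function's docstring. The sketch below uses a hypothetical `celsius_to_fahrenheit()` function (not part of the lesson) and the standard library's `pydoc.render_doc()`, which returns that kind of help text as a string so it can be printed without an interactive pager.

```python
import pydoc


def celsius_to_fahrenheit(celsius):
    """Convert a temperature from Celsius to Fahrenheit.

    Args:
        celsius (float): Temperature in degrees Celsius.

    Returns:
        float: Temperature in degrees Fahrenheit.
    """
    return celsius * 9 / 5 + 32


# In an interactive session, help(celsius_to_fahrenheit) shows similar text.
# render_doc() returns it as a string; plaintext avoids terminal formatting.
help_text = pydoc.render_doc(celsius_to_fahrenheit, renderer=pydoc.plaintext)
print(help_text)
```

Anyone calling the function can see its inputs and output without opening the source file.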

### Anatomy of a Docstring

A docstring is a string written as the first statement in a function's body. Because docstrings usually span multiple lines, they are enclosed in triple quotes, Python's way of writing multi-line strings. Every docstring has some (although usually not all) of these five key pieces of information:

* Description of what the function does.
* Description of the arguments, if any.
* Description of the return value, if any.
* Description of errors that may occur, if any.
* Optional extra notes or examples of usage.
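As a minimal sketch, the hypothetical function below (`mean_of()`, invented for illustration) carries all five pieces in a single docstring: a description, the arguments, the return value, the errors, and a usage example.

```python
def mean_of(values):
    """Compute the arithmetic mean of a sequence of numbers.

    Args:
        values (list of float): The numbers to average.

    Returns:
        float: The sum of the values divided by their count.

    Raises:
        ValueError: If 'values' is empty.

    Examples:
        >>> mean_of([1, 2, 3])
        2.0
    """
    if len(values) == 0:
        raise ValueError("'values' must contain at least one number.")
    return sum(values) / len(values)


print(mean_of([1, 2, 3]))  # 2.0
```

Most real docstrings need only a subset of these sections; a function that takes no arguments and raises nothing can stop after the description and return value.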

### Docstring Formats

Consistent style makes a project easier to read, and the Python community has evolved several standards for how to format your docstrings.

* Google Style
* Numpydoc
* reStructuredText
* Epytext

Google-style and Numpydoc are the most popular formats.
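For comparison with the two formats this lesson covers in detail, here is a rough sketch of a reStructuredText-style docstring. The field names (`:param:`, `:type:`, `:returns:`, `:rtype:`, `:raises:`) follow the convention commonly used with Sphinx; the function `repeat_text()` is hypothetical.

```python
def repeat_text(text, times=2):
    """Repeat a string a given number of times.

    :param text: The string to repeat.
    :type text: str
    :param times: How many copies to join together, defaults to 2.
    :type times: int, optional
    :returns: The repeated string.
    :rtype: str
    :raises ValueError: If ``times`` is negative.
    """
    if times < 0:
        raise ValueError("'times' must not be negative.")
    return text * times


print(repeat_text("ab", times=3))  # ababab
```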

#### Google Style - description

In Google style, the docstring starts with a concise description of what the function does. This should be in imperative language. For instance: "Split the data frame and stack the columns" instead of "This function will split the data frame and stack the columns".

```python
def function(arg_1, arg_2=42):
    """Description of what the function does.
    """
```

#### Google Style - arguments

Next comes the *"Args"* section where you list each argument name, followed by its expected type in parentheses, and then what its role is in the function.

```python
def function(arg_1, arg_2=42):
    """Description of what the function does.

    Args:
        arg_1 (str): Description of arg_1 that can break onto the next line
            if needed.
        arg_2 (int, optional): Write optional when argument has a default value.
    """
```

If you need extra space, you can break to the next line and indent the continuation, as shown here. If an argument has a default value, mark it as "optional" when describing the type. If the function does not take any parameters, feel free to leave this section out.

#### Google Style - return value

The next section is the "Returns" section, where you list the expected type or types of what gets returned. You can also provide some comment about what gets returned, but often the name of the function and the description will make this clear. Additional lines should not be indented.

```python
def function(arg_1, arg_2=42):
    """Description of what the function does.

    Args:
        arg_1 (str): Description of arg_1 that can break onto the next line
            if needed.
        arg_2 (int, optional): Write optional when argument has a default value.

    Returns:
        bool: Optional description of the return value
        Extra lines are not indented.
    """
```

#### Google Style - errors and extra notes

Finally, if your function intentionally raises any errors, you should add a "Raises" section. You can also include any additional notes or examples of usage in free form text at the end.

```python
def function(arg_1, arg_2=42):
    """Description of what the function does.

    Args:
        arg_1 (str): Description of arg_1 that can break onto the next line
            if needed.
        arg_2 (int, optional): Write optional when argument has a default value.

    Returns:
        bool: Optional description of the return value
        Extra lines are not indented.

    Raises:
        ValueError: Include any error types that the function intentionally raises.

    Notes:
        See the docstring for the function for more information.
    """
```

### Numpydoc

The Numpydoc format is very similar and is the most common format in the scientific Python community.

```python
def function(arg_1, arg_2=42):
    """
    Description of what the function does.

    Parameters
    ----------
    arg_1 : str
        Description of arg_1 that can break onto the next line
        if needed.
    arg_2 : int, optional
        Write optional when argument has a default value.

    Returns
    -------
    bool
        Optional description of the return value
        Extra lines are not indented.
    """
```
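To make the template concrete, here is a small sketch of a Numpydoc-style docstring on a hypothetical function, `scale()`. It also shows a `Raises` section, which uses the same underlined-heading layout as `Parameters` and `Returns`.

```python
def scale(values, factor=2.0):
    """
    Multiply every number in a list by a constant factor.

    Parameters
    ----------
    values : list of float
        The numbers to scale.
    factor : float, optional
        The multiplier applied to each value. Defaults to 2.0.

    Returns
    -------
    list of float
        A new list; 'values' itself is not modified.

    Raises
    ------
    TypeError
        If 'factor' is not a number.
    """
    if not isinstance(factor, (int, float)):
        raise TypeError("'factor' must be a number.")
    return [value * factor for value in values]


print(scale([1.0, 2.5], factor=3))  # [3.0, 7.5]
```

Numpydoc takes more vertical space than Google style, but each type and description sits on its own line, which some readers find easier to scan.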

### Retrieving the Docstring

Sometimes it is useful for your code to access the contents of a function's docstring. Every function in Python has a ``__doc__`` attribute that holds this information. Notice that in the example here the ``__doc__`` attribute contains the *raw docstring*, including any tabs or spaces that were added to make the words line up visually. (Starting with Python 3.13, the compiler strips the common leading indentation from docstrings, so newer interpreters print less leading whitespace.)

```python
def the_answer():
    """Return the answer to life,
    the universe, and everything.
    
    Returns:
     int
    """
    return 42

print(the_answer.__doc__)
```

```
Return the answer to life,
    the universe, and everything.
    
    Returns:
     int
    
```


To get a cleaner version, with those leading spaces removed, you can use the ``getdoc()`` function from the ``inspect`` module. The ``inspect`` module contains many useful functions for gathering information about functions.

```python
import inspect
print(inspect.getdoc(the_answer))
```

```
Return the answer to life,
the universe, and everything.

Returns:
 int
```
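Two related details are worth a quick sketch: `inspect.cleandoc()` applies the same kind of cleanup to any string, and a function without a docstring has `__doc__` set to `None` (`inspect.getdoc()` returns `None` for it as well). The functions `documented()` and `undocumented()` are hypothetical names used only for this illustration.

```python
import inspect


def documented():
    """Return a greeting.

    Returns:
        str
    """
    return "hello"


def undocumented():
    return "hello"


raw = documented.__doc__
clean = inspect.getdoc(documented)

# repr() makes the newlines and any leading spaces visible
print(repr(raw))
print(repr(clean))

# cleandoc() works on the raw string directly and gives the same result
print(inspect.cleandoc(raw) == clean)  # True

# Without a docstring, both lookups give None
print(undocumented.__doc__)              # None
print(inspect.getdoc(undocumented))      # None
```

Code that displays docstrings to users should therefore be prepared to handle `None`.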


## Exercise

### Crafting a docstring

You've decided to write the world's greatest open-source natural language processing Python package. It will revolutionize working with free-form text, the way numpy did for arrays, pandas did for tabular data, and scikit-learn did for machine learning.

The first function you write is ``count_letter()``. It takes a string and a single letter and returns the number of times the letter appears in the string. You want the users of your open-source package to be able to understand how this function works easily, so you will need to give it a docstring. Build up a Google Style docstring for this function.

```python
# Add a docstring to count_letter()
def count_letter(content, letter):
    """
    Count the number of times 'letter' 
    appears in 'Content'
    
    # Add a Google Style Arguments section
     Args:
      content (str) : The string to search
      letter (str): The letter to search for.
    
    # Add a returns section
     Returns:
       int
    # Add a section detailing what errors might be raised
     Raises:
      ValueError: If 'letter' is not a one-character string.   
    """
    if (not isinstance(letter, str)) or len(letter) != 1:
        raise ValueError('`letter` must be a single character string.')
    return len([char for char in content if char == letter])
```
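A quick usage check confirms that the behavior matches what the docstring promises. The snippet restates `count_letter()` with the docstring condensed (exercise prompts removed) so that it runs on its own; the sample inputs are arbitrary.

```python
def count_letter(content, letter):
    """Count the number of times 'letter' appears in 'content'.

    Args:
        content (str): The string to search.
        letter (str): The letter to search for.

    Returns:
        int

    Raises:
        ValueError: If 'letter' is not a one-character string.
    """
    if (not isinstance(letter, str)) or len(letter) != 1:
        raise ValueError('`letter` must be a single character string.')
    return len([char for char in content if char == letter])


print(count_letter('banana', 'a'))  # 3

# The Raises section documents this failure mode
try:
    count_letter('banana', 'an')
except ValueError as error:
    print(error)  # `letter` must be a single character string.
```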

### Retrieving the docstring

You and a group of friends are working on building an amazing new Python IDE (integrated development environment -- like PyCharm, Spyder, Eclipse, Visual Studio, etc.). The team wants to add a feature that displays a tooltip with a function's docstring whenever the user starts typing the function name. That way, the user doesn't have to go elsewhere to look up the documentation for the function they are trying to use. You've been asked to complete the ``build_tooltip()`` function that retrieves a docstring from an arbitrary function.

You will be reusing the ``count_letter()`` function that you developed in the last exercise to show that we can properly extract its docstring.

```python
# Get the "count_letter" docstring by using an attribute of the function
docstring = count_letter.__doc__

border = '#' * 28
print('{}\n{}\n{}'.format(border, docstring, border))
```

```
############################

    Count the number of times 'letter' 
    appears in 'Content'
    
    # Add a Google Style Arguments section
     Args:
      content (str) : The string to search
      letter (str): The letter to search for.
    
    # Add a returns section
     Returns:
       int
    # Add a section detailing what errors might be raised
     Raises:
      ValueError: If 'letter' is not a one-character string.   
    
############################
```


```python
import inspect

# Inspect the count_letter() function to get its docstring
docstring = inspect.getdoc(count_letter)

border = '#' * 28
print('{}\n{}\n{}'.format(border, docstring, border))
```

```
############################
Count the number of times 'letter' 
appears in 'Content'

# Add a Google Style Arguments section
 Args:
  content (str) : The string to search
  letter (str): The letter to search for.

# Add a returns section
 Returns:
   int
# Add a section detailing what errors might be raised
 Raises:
  ValueError: If 'letter' is not a one-character string.   
############################
```


```python
import inspect

def build_tooltip(function):
  """Create a tooltip for any function that shows the
  function's docstring.

  Args:
    function (callable): The function we want a tooltip for.

  Returns:
    str
  """
  # Get the docstring for the "function" argument by using inspect
  docstring = inspect.getdoc(function)
  border = '#' * 28
  return '{}\n{}\n{}'.format(border, docstring, border)

print(build_tooltip(count_letter))
print(build_tooltip(range))
print(build_tooltip(print))
```

############################
Count the number of times 'letter' 
appears in 'Content'

# Add a Google Style Arguments section
 Args:
  content (str) : The string to search
  letter (str): The letter to search for.

# Add a returns section
 Returns:
   int
# Add a section detailing what errors might be raised
 Raises:
  ValueError: If 'letter' is not a one-character string.   
############################
############################
range(stop) -> range object
range(start, stop[, step]) -> range object

Return an object that produces a sequence of integers from start (inclusive)
to stop (exclusive) by step.  range(i, j) produces i, i+1, i+2, ..., j-1.
start defaults to 0, and stop is omitted!  range(4) produces 0, 1, 2, 3.
These are exactly the valid indices for a list of 4 elements.
When step is given, it specifies the increment (or decrement).
############################
############################
print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)

Prints the value