# Best Practices
## Crafting a docstring
+ Copy the following string and add it as the docstring for the function: Count the number of times `letter` appears in `content`.
+ Now add the arguments section, using the Google style for docstrings. Use str to indicate a string.
+ Add a returns section that informs the user the return value is an int.
+ Finally, add some information about the ValueError that gets raised when the arguments aren't correct.

In [1]:
def count_letter(content, letter):
  """Count the number of times `letter` appears in `content`.

  Args:
    content (str): The string to search.
    letter (str): The letter to search for.

  Returns:
    int

  # Add a section detailing what errors might be raised
  Raises:
    ValueError: If `letter` is not a one-character string.
  """
  if (not isinstance(letter, str)) or len(letter) != 1:
    raise ValueError('`letter` must be a single character string.')
  return len([char for char in content if char == letter])

In [2]:
help(count_letter)

Help on function count_letter in module __main__:

count_letter(content, letter)
    Count the number of times `letter` appears in `content`.
    
    Args:
      content (str): The string to search.
      letter (str): The letter to search for.
    
    Returns:
      int
    
    # Add a section detailing what errors might be raised
    Raises:
      ValueError: If `letter` is not a one-character string.
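
To see the documented behavior in action, here is a minimal sketch that calls `count_letter()` once with a valid `letter` and once with an invalid one. The sample string `'banana'` is an arbitrary choice for illustration, and the function is repeated with a shortened docstring so that the snippet runs on its own.

```python
def count_letter(content, letter):
    """Count the number of times `letter` appears in `content`.

    Raises:
      ValueError: If `letter` is not a one-character string.
    """
    if (not isinstance(letter, str)) or len(letter) != 1:
        raise ValueError('`letter` must be a single character string.')
    return len([char for char in content if char == letter])


# A valid call: 'a' occurs three times in 'banana'
print(count_letter('banana', 'a'))

# An invalid call: 'an' is two characters, so the documented error is raised
try:
    count_letter('banana', 'an')
except ValueError as error:
    print('ValueError:', error)
```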



## Retrieving docstrings
+ Begin by getting the docstring for the function count_letter(). Use an attribute of the count_letter() function.
+ Now use a function from the inspect module to get a better-formatted version of count_letter()'s docstring.
+ Now create a build_tooltip() function that can extract the docstring from any function that we pass to it.


In [3]:
# Get the docstring with an attribute of count_letter()
docstring = count_letter.__doc__

border = '#' * 28
print('{}\n{}\n{}'.format(border, docstring, border))

############################
Count the number of times `letter` appears in `content`.

  Args:
    content (str): The string to search.
    letter (str): The letter to search for.

  Returns:
    int

  # Add a section detailing what errors might be raised
  Raises:
    ValueError: If `letter` is not a one-character string.
  
############################


In [4]:
import inspect

# Inspect the count_letter() function to get its docstring
docstring = inspect.getdoc(count_letter)

border = '#' * 28
print('{}\n{}\n{}'.format(border, docstring, border))

############################
Count the number of times `letter` appears in `content`.

Args:
  content (str): The string to search.
  letter (str): The letter to search for.

Returns:
  int

# Add a section detailing what errors might be raised
Raises:
  ValueError: If `letter` is not a one-character string.
############################
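
The difference between the two outputs comes from indentation handling: `__doc__` returns the docstring as stored on the function, while `inspect.getdoc()` passes it through `inspect.cleandoc()`, which removes the common leading whitespace and any trailing blank lines. A minimal sketch, using a hypothetical `greet()` function that is not part of the exercise. On newer Python versions (3.13 and later) the compiler already strips common indentation from docstrings, so the raw and cleaned strings may look nearly identical there.

```python
import inspect


def greet(name):
    """Return a greeting.

    Args:
      name (str): Who to greet.
    """
    return 'Hello, ' + name


raw = greet.__doc__            # docstring as stored on the function
clean = inspect.getdoc(greet)  # indentation and trailing blank lines removed

print(repr(raw))
print(repr(clean))

# getdoc() gives the same result as running cleandoc() on the raw docstring
print(clean == inspect.cleandoc(raw))
```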


In [5]:
import inspect

def build_tooltip(function):
  """Create a tooltip for any function that shows the
  function's docstring.

  Args:
    function (callable): The function we want a tooltip for.

  Returns:
    str
  """
  # Get the docstring for the "function" argument by using inspect
  docstring = inspect.getdoc(function)
  border = '#' * 28
  return '{}\n{}\n{}'.format(border, docstring, border)

print(build_tooltip(count_letter))
print(build_tooltip(range))
print(build_tooltip(print))

############################
Count the number of times `letter` appears in `content`.

Args:
  content (str): The string to search.
  letter (str): The letter to search for.

Returns:
  int

# Add a section detailing what errors might be raised
Raises:
  ValueError: If `letter` is not a one-character string.
############################
############################
range(stop) -> range object
range(start, stop[, step]) -> range object

Return an object that produces a sequence of integers from start (inclusive)
to stop (exclusive) by step.  range(i, j) produces i, i+1, i+2, ..., j-1.
start defaults to 0, and stop is omitted!  range(4) produces 0, 1, 2, 3.
These are exactly the valid indices for a list of 4 elements.
When step is given, it specifies the increment (or decrement).
############################
############################
print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)

Prints the values to a stream, or to sys.stdout by default.
Optional keyword argument
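
One case `build_tooltip()` does not cover is a function without a docstring: `inspect.getdoc()` returns `None`, which `format()` renders as the text `None`. Below is a minimal sketch of a variant with a fallback message. The name `build_tooltip_safe` and the placeholder text are assumptions for illustration, not part of the exercise.

```python
import inspect


def build_tooltip_safe(function):
    """Create a tooltip from a function's docstring, with a fallback.

    Args:
      function (callable): The function we want a tooltip for.

    Returns:
      str
    """
    # getdoc() returns None when no docstring can be found
    docstring = inspect.getdoc(function) or '(no docstring available)'
    border = '#' * 28
    return '{}\n{}\n{}'.format(border, docstring, border)


def undocumented():
    pass


print(build_tooltip_safe(undocumented))
print(build_tooltip_safe(len))
```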

## Docstrings to the rescue!

### Instruction
+ Examine each of these functions' docstrings in the IPython shell to determine which of them is actually numpy.histogram().

### Answer
+ numpy.fywdkxa()
+ help(numpy.fywdkxa)

## DRY and "Do one thing"

While we were developing a model to predict the likelihood of a student graduating from college, we wrote this bit of code to get the `z-scores` of students' yearly GPAs. Now we're ready to turn it into a production-quality system, so we need to do something about the repetition. Writing a function to calculate the z-scores would improve this code.

```python
# Standardize the GPAs for each year
df['y1_z'] = (df.y1_gpa - df.y1_gpa.mean()) / df.y1_gpa.std()
df['y2_z'] = (df.y2_gpa - df.y2_gpa.mean()) / df.y2_gpa.std()
df['y3_z'] = (df.y3_gpa - df.y3_gpa.mean()) / df.y3_gpa.std()
df['y4_z'] = (df.y4_gpa - df.y4_gpa.mean()) / df.y4_gpa.std()
```

> **Note:** `df` is a pandas DataFrame in which each row is a student and the `4` columns hold yearly student GPAs: `y1_gpa`, `y2_gpa`, `y3_gpa`, and `y4_gpa`.

### Instructions
+ Finish the function so that it returns the z-scores of a column.
+ Use the function to calculate the z-scores for each year (df['y1_z'], df['y2_z'], etc.) from the raw GPA scores (df.y1_gpa, df.y2_gpa, etc.).

In [18]:
import pandas as pd
df = pd.read_csv('students.csv', index_col=0)
df.head()

Unnamed: 0,y1_gpa,y2_gpa,y3_gpa,y4_gpa
0.0,2.785877,2.052513,2.170544,0.06557
1.0,1.144557,2.666498,0.267098,2.884737
2.0,0.907406,0.423634,2.613459,0.03095
3.0,2.205259,0.52358,3.984345,0.339289
4.0,2.877876,1.287922,3.077589,0.901994


In [19]:
def standardize(column):
  """Standardize the values in a column.

  Args:
    column (pandas Series): The data to standardize.

  Returns:
    pandas Series: the values as z-scores
  """
  # Finish the function so that it returns the z-scores.
  # Operate on the Series that was passed in, as the docstring promises,
  # rather than looking the column up in the global `df`.
  z_score = (column - column.mean()) / column.std()
  return z_score

# Use the standardize() function to calculate the z-scores
df['y1_z'] = standardize(df.y1_gpa)
df['y2_z'] = standardize(df.y2_gpa)
df['y3_z'] = standardize(df.y3_gpa)
df['y4_z'] = standardize(df.y4_gpa)

df.head()

Unnamed: 0,y1_gpa,y2_gpa,y3_gpa,y4_gpa,y1_z,y2_z,y3_z,y4_z
0.0,2.785877,2.052513,2.170544,0.06557,0.875398,0.682172,-0.182695,-0.653112
1.0,1.144557,2.666498,0.267098,2.884737,-0.916844,1.315169,-1.562311,1.710661
2.0,0.907406,0.423634,2.613459,0.03095,-1.175802,-0.997144,0.138329,-0.68214
3.0,2.205259,0.52358,3.984345,0.339289,0.241391,-0.894103,1.131946,-0.423609
4.0,2.877876,1.287922,3.077589,0.901994,0.975857,-0.106094,0.47473,0.0482
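
Once the z-score logic lives in one function, the four assignments can be reduced further with a loop over the year prefixes. Below is a minimal sketch of a `standardize()` that takes a Series, applied to a small made-up DataFrame in place of `students.csv`. The GPA values are illustrative and not taken from the dataset.

```python
import pandas as pd

# Illustrative data only; the real exercise loads students.csv
df = pd.DataFrame({
    'y1_gpa': [2.0, 3.0, 4.0],
    'y2_gpa': [1.0, 2.5, 4.0],
    'y3_gpa': [3.5, 3.0, 2.5],
    'y4_gpa': [0.5, 2.0, 3.5],
})


def standardize(column):
    """Standardize the values in a column.

    Args:
      column (pandas Series): The data to standardize.

    Returns:
      pandas Series: the values as z-scores
    """
    return (column - column.mean()) / column.std()


# One call per year instead of four copy-pasted lines
for year in ['y1', 'y2', 'y3', 'y4']:
    df[year + '_z'] = standardize(df[year + '_gpa'])

print(df[['y1_z', 'y2_z', 'y3_z', 'y4_z']])
```

After standardizing, each `_z` column has a mean of approximately 0 and a sample standard deviation of approximately 1, which is a quick sanity check on the result.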


## Split up a function
Another engineer on your team has written this function to calculate the mean and median of a sorted list. You want to show them how to split it into two simpler functions: mean() and median().

````python
def mean_and_median(values):
  """Get the mean and median of a sorted list of `values`

  Args:
    values (iterable of float): A list of numbers

  Returns:
    tuple (float, float): The mean and median
  """
  mean = sum(values) / len(values)
  midpoint = int(len(values) / 2)
  if len(values) % 2 == 0:
    median = (values[midpoint - 1] + values[midpoint]) / 2
  else:
    median = values[midpoint]

  return mean, median
````

### Instructions 
+ Write the mean() function.
+ Write the median() function.

In [20]:
def mean(values):
  """Get the mean of a sorted list of values

  Args:
    values (iterable of float): A list of numbers

  Returns:
    float
  """
  # Write the mean() function
  mean = sum(values) / len(values)
  return mean

In [21]:
def median(values):
  """Get the median of a sorted list of values

  Args:
    values (iterable of float): A list of numbers

  Returns:
    float
  """
  # Write the median() function
  midpoint = len(values) // 2
  if len(values) % 2 == 0:
    median = (values[midpoint - 1] + values[midpoint]) / 2
  else:
    median = values[midpoint]
  return median
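
Once split, each function can be checked on its own. Note that `median()` indexes into `values` and assumes they are already sorted, so unsorted input should be sorted first. Below is a minimal sketch with made-up numbers that uses the standard library's `statistics` module as a cross-check.

```python
import statistics


def mean(values):
    """Get the mean of a list of values."""
    return sum(values) / len(values)


def median(values):
    """Get the median of a sorted list of values."""
    midpoint = len(values) // 2
    if len(values) % 2 == 0:
        return (values[midpoint - 1] + values[midpoint]) / 2
    return values[midpoint]


unsorted_values = [7, 1, 4, 2]    # made-up sample data
values = sorted(unsorted_values)  # median() expects sorted input

print(mean(values))    # (1 + 2 + 4 + 7) / 4 = 3.5
print(median(values))  # (2 + 4) / 2 = 3.0

# Cross-check against the standard library
print(mean(values) == statistics.mean(values))
print(median(values) == statistics.median(values))
```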

## Best practice for default arguments
One of your co-workers (who obviously didn't take this course) has written this function for adding a column to a pandas DataFrame. Unfortunately, they used a mutable object as a default argument value! Please show them a better way to do this so that they don't get unexpected behavior.

````python
def add_column(values, df=pandas.DataFrame()):
  """Add a column of `values` to a DataFrame `df`.
  The column will be named "col_<n>" where "n" is
  the numerical index of the column.

  Args:
    values (iterable): The values of the new column
    df (DataFrame, optional): The DataFrame to update.
      If no DataFrame is passed, one is created by default.

  Returns:
    DataFrame
  """
  df['col_{}'.format(len(df.columns))] = values
  return df
````
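
The unexpected behavior comes from Python evaluating default values once, when the `def` statement runs, so every call that omits the argument shares the same object. Below is a minimal sketch of the pitfall, using a list instead of a DataFrame to keep it free of dependencies. `append_item` is a hypothetical function, not part of the exercise.

```python
def append_item(item, items=[]):
    """Append `item` to `items` (buggy: the default list is shared)."""
    items.append(item)
    return items


first = append_item('a')
second = append_item('b')

# Both names refer to the single default list created at definition time
print(first)            # ['a', 'b']
print(first is second)  # True
```
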
### Instructions

+ Change the default value of df to an immutable value to follow best practices.
+ Update the code of the function so that a new DataFrame is created if the caller didn't pass one.

In [None]:
import pandas

# Use an immutable value for the default argument
def better_add_column(values, df=None):
  """Add a column of `values` to a DataFrame `df`.
  The column will be named "col_<n>" where "n" is
  the numerical index of the column.

  Args:
    values (iterable): The values of the new column
    df (DataFrame, optional): The DataFrame to update.
      If no DataFrame is passed, one is created by default.

  Returns:
    DataFrame
  """
  # Update the function to create a default DataFrame
  if df is None:
    df = pandas.DataFrame()
  df['col_{}'.format(len(df.columns))] = values
  return df
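
With `None` as the default, each call that omits `df` gets its own DataFrame. Below is a minimal sketch that checks this with made-up column values. The function is repeated with a shortened docstring so that the snippet runs on its own.

```python
import pandas


def better_add_column(values, df=None):
    """Add a column of `values` to a DataFrame `df`, creating one if needed."""
    if df is None:
        df = pandas.DataFrame()
    df['col_{}'.format(len(df.columns))] = values
    return df


first = better_add_column([1, 2, 3])
second = better_add_column([4, 5, 6])

# Each call created a fresh DataFrame, so neither sees the other's column
print(list(first.columns))
print(list(second.columns))
print(first is second)

# Passing a DataFrame explicitly still updates that DataFrame
third = better_add_column([7, 8, 9], df=first)
print(list(third.columns))
```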