# Python Style Guide

Documentation is even more valuable in data science than in computer science in general, because your Jupyter notebook is expected to be a self-guided presentation of your analysis. Don’t forget: a strong analysis is not sufficient; you must also be persuasive!

Python comes with its own style recommendations, [PEP 8](https://www.python.org/dev/peps/pep-0008/). This guide prunes them down to the themes relevant to this course. It is also influenced by [Google’s style guide](https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html).

### Warning

Submitted Python code that deviates from the format in this guide may be penalized.


# Functions and Variable Names

Function and variable names should be all lowercase. Separate distinct words with underscores to improve readability. Use brief, simple language to name variables, since the names are themselves documentation. Pro tip: give corresponding variables names of identical length for extra readability:

```python
# poor form
FirstGuysScoreInTooLongVariableName += 1
y += 1

# same functionality, but documented properly:
score_player0 += 1
score_player1 += 1
```
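The same lowercase-with-underscores convention applies to function names. A minimal sketch, where `mean_score` and `score_list` are hypothetical names chosen only for illustration:

```python
def mean_score(score_list):
    """ computes the average of a list of scores

    Args:
        score_list (list): numeric scores, assumed non-empty

    Returns:
        mean (float): average of score_list
    """
    return sum(score_list) / len(score_list)


# corresponding variables keep names of identical length
score_player0 = mean_score([3, 4, 5])
score_player1 = mean_score([1, 2, 6])
```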

# Function Docstrings

Each function’s docstring should begin with a single line (typically no more than 79 characters wide, though we won’t be counting for this course) which describes what the function does. Additionally, include the name and expected type (and shape, for arrays) of all inputs and outputs as shown below:

```python
def get_gcd(x, y):
    """ computes the greatest common divisor of two ints
    Args:
        x (int): input integer
        y (int): input integer

    Returns:
        gcd (int): gcd of x, y
    """
```
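The example above shows only the docstring and leaves the function body out. One way to complete it, assuming Euclid’s algorithm is an acceptable implementation (the guide itself does not specify one), might look like this:

```python
def get_gcd(x, y):
    """ computes the greatest common divisor of two ints

    Args:
        x (int): input integer
        y (int): input integer

    Returns:
        gcd (int): gcd of x, y (non-negative)
    """
    # Euclid's algorithm: replace (x, y) with (y, x mod y) until y is 0
    while y != 0:
        x, y = y, x % y

    return abs(x)
```

Once a docstring is in place, calling `help(get_gcd)` in a notebook cell displays it, so readers can check the expected inputs and outputs without opening the source.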

# Comments and whitespace (within Python code)

Separate your code into “chunks” of lines that perform a related task, with a blank line between chunks. Label each chunk with a short comment which describes what it does. The ideal comment tells readers what they need to know but could not have learned from the code itself and, if space remains, primes them for the following few lines. Add a few comments to describe important variables along the way too.

Consider the following function. Taken out of context, we expect readers to have a tough time understanding why it does what it does, but the chunking and comments (hopefully) provide an easy on-ramp for readers to begin learning about it. Notice how critical the documentation becomes when you’re tossed into this function without proper context:

```python
import numpy as np


def snip_trial(df_mode, trial_len, feat_list, start_stamp=None, start_idx=None):
    """ extracts a single trial from a dataframe

    Args:
        df_mode (pd.DataFrame): dataframe, contains timestamp and trial data
        trial_len (int): number of samples in trial
        feat_list (list): columns of dataframe which make up trial data
        start_stamp (float): timestamp @ start of trial (inclusive)
        start_idx (int): index of start of trial (inclusive) in df_mode

    Returns:
        trial (np.array): (trial_len, len(feat_list)) trial data
    """
    # check that exactly one of start_stamp, start_idx is passed
    assert (start_stamp is None) != (start_idx is None)

    # get start_idx from start_stamp
    if start_idx is None:
        timestamp = df_mode['timestamp'].to_numpy()
        start_idx = np.searchsorted(timestamp, v=start_stamp, side='left')
    # np.size also accepts a plain int passed in as start_idx
    assert np.size(start_idx) == 1, 'non unique start'

    # extract trial (in time)
    start_idx = int(start_idx)
    stop_idx = start_idx + trial_len
    trial = df_mode.iloc[start_idx: stop_idx, :]

    # extract trial (just relevant features) and cast to array
    trial = trial.loc[:, feat_list].to_numpy()

    # check that trial has proper shape
    if trial.shape[0] != trial_len:
        raise IOError('data stream ends before trial')

    return trial
```

# Jupyter Notebook Style Notes

- Your Jupyter Notebook should be shared empty or with results which are consistent with a fresh “Kernel -> Restart & Run All Cells”. To do otherwise is clumsy and could be considered misleading in professional contexts.
- Use cells to chunk your program into pieces which perform a similar function.
- Suppress all output which you do not want to draw the reader’s attention to. (A semicolon at the end of a cell’s last line will prevent the Out[] block from appearing.)
- Markdown provides you a chance to talk to your reader as they move through your analysis. Use it. Having clear language (and crisp visuals) goes a long way towards teaching the reader just what you’ve accomplished. Be as clear and brief as possible.
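The semicolon tip above can be sketched as a single notebook cell. This is a minimal illustration with a hypothetical `score_list`; the suppression is notebook behavior, so in a plain Python script the trailing semicolon has no visible effect:

```python
score_list = [3, 1, 2]

# without the trailing semicolon, a notebook would echo the
# sorted list in an Out[] block because this is the cell's last line
sorted(score_list);
```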

# Odds and ends

- Only use one import per line:

```python
import numpy as np
import sklearn as skl
```

- Use single or double quotes for all strings, but don’t mix them in the same file:

```python
# preferred (if used consistently throughout)
string0 = 'this is how Prof Higger does it'

# acceptable (if used consistently throughout code)
string1 = "I feel like such a rebel"

# don't mix and match
string2 = 'sometimes you feel like a nut'
string3 = "sometimes you don't"
```