# Covered here

* [Purpose](#Purpose)
* [Style Guide](#Style-Guide)
 * [General code layout](#General-code-layout)
 * [Indentation](#Indentation)
 * [Naming conventions](#Naming-conventions)
 * [Line length](#Line-length)
 * [Line breaks and binary operators](#Line-breaks-and-binary-operators)
 * [Testing](#Testing)
 * [Imports](#Imports)
 * [Whitespace and blank lines](#Whitespace-and-blank-lines)
 * [Module dunder attributes](#Module-dunder-attributes)
 * [Comments (besides docstrings)](#Comments-(besides-docstrings))
 * [Docstrings](#Docstrings)
 * [TODOs](#TODOs)
 * [Return statements](#Return-statements)
 * [List comprehension](#List-comprehension)
 * [Lambda functions](#Lambda-functions)
 * [Conditional expressions](#Conditional-expressions)
 * [Other](#Other)

# References & resources

* [PEP 8 -- Style Guide for Python Code](https://www.python.org/dev/peps/pep-0008/)
* [PEP 257 -- Docstring Conventions](https://www.python.org/dev/peps/pep-0257/)
* [Google Python Style Guide](https://google.github.io/styleguide/pyguide.html)
* [GNU Mailman Coding Style Guide](https://barry.warsaw.us/software/STYLEGUIDE.txt)
* robinwinslow.uk: [A summary of python code style conventions](https://robinwinslow.uk/2014/01/05/summary-of-python-code-style-conventions/)
* voidspace.org.uk: [Python Coding Style & Standards - My Personal Style Guide for Python Source Code](http://www.voidspace.org.uk/python/articles/python_style_guide.shtml)
* The Chromium Projects: [Python Style Guidelines](https://www.chromium.org/chromium-os/python-style-guidelines#TOC-Official-Style-Guide)
* The Hitchhiker's Guide to Python: [Code Style](http://docs.python-guide.org/en/latest/writing/style/)
* memonic.com: [Python Idioms and Efficiency](https://www.memonic.com/user/pneff/folder/python/id/1bufp)
* David Goodger: [Code Like a Pythonista: Idiomatic Python](http://python.net/~goodger/projects/pycon/2007/idiomatic/handout.html#coding-style-readability-counts)
* [Top 10 Python idioms I wish I'd learned earlier](http://prooffreaderplus.blogspot.com/2014/11/top-10-python-idioms-i-wished-id.html)

# Purpose

This notebook codifies **my own** set of standardized style conventions for writing Python.  Nothing here should be outlandish--standard conventions such as 4-space indents are assumed and this document largely attempts to build on top of existing conventions with finer-grained rules.  Some sections here simply repeat/mimic existing conventions from places like PEP 8.  **Some directives and examples here are ported word-for-word directly from the links above and I do not claim this as my own work; it is only for my own reference.**

# Style Guide

## General code layout

* Generally, use 1 statement per line.
* You shouldn't need semicolons.
* It is okay to put the body of a _test_ on the same line as the test, but **only if** the entire statement fits on one line.
 * This never applies to `try/except`, since the `try` and `except` clauses can't both fit on the same line.
 * Only do so with an `if` if there is no `else`.

In [32]:
# Bad
a = 1; b = 2

# Good
a = 1
b = 2

# -----------

# Okay
if a: print(b * 2)
    
# No
if a: print('no')
else: print('else')

4
no


Use single quotes rather than double quotes.  **But, prefer double quotes to needing to escape single quotes.**

In [47]:
# Yes
a = 'a string'
b = "a string with 'quotes'"

# No
a = "a string"
b = 'a string with \'quotes\''

Using non-mandatory parentheses: Some expressions can get complicated. Parentheses can (and should) be used to make them less ambiguous. This is for the sake of people who read the code, even if it doesn't matter to the Python parser.

In [48]:
if (True and True) or (False or True):
    print(True)

True


## Indentation

Use 4 spaces per indentation level.

Continuation lines should align wrapped elements either vertically with the opening delimiter or using a hanging indent.



In [3]:
# Yes
# More indentation included to distinguish this from the rest.
def long_function_name(
        var_one, var_two, var_three,
        var_four):
    return var_one

# Yes
# Aligned with opening delimiter.
foo = long_function_name(var_one=2, var_two=2,
                         var_three=2, var_four=4)

# No
# Further indentation required as indentation is not distinguishable.
def long_function_name(
    var_one, var_two, var_three,
    var_four):
    return var_one

Multiline constructs:
* The first element should begin on the second line
* The closing brace/bracket/parenthesis should line up under the first non-whitespace character of the previous line

In [1]:
d = {
    1: 'first value',
    2: 'the second value'
    }

my_list = [
    1, 2, 3,
    4, 5, 6,
    ]

Indentation on method chaining:
* the periods beginning each line should be **four spaces from the opening construct following assignment**
* the closing construct should be indented four spaces

In [57]:
import pandas as pd

arr = [[1, 2, 3],
       [10, 9, 8]]

df = (pd.DataFrame(arr)
         .transpose()  # Four spaces from the opening construct following assignment
         .sort_index(ascending=False)
         .dropna(how='all')
    )  # Four spaces

## Naming conventions

* `class` names: use `CamelCase`
 * When using abbreviations in CamelCase, capitalize all the letters of the abbreviation. Thus `HTTPServerError` is better than `HttpServerError`.
* Method, function and variable names: use `lowercase_with_underscores`
* Always use `self` for the first argument to instance methods
* Always use `cls` for the first argument to class methods
* Never declare functions using `lambda` (`f = lambda x: 2*x`)
* Constants (module-level names bound to fixed values such as 5 or 5.0) should use `ALLCAPS`, with underscores separating words (e.g. `MAX_OVERFLOW`)
* Never use the following characters as single-character variable names.  In some fonts, they are indistinguishable from the numerals one and zero.
 * 'l' (lowercase letter el)
 * 'O' (uppercase letter oh)
 * 'I' (uppercase letter eye) 
* Modules should have short, all-lowercase names. Underscores can be used in the module name if it improves readability. 
* Python packages should also have short, all-lowercase names, although the use of underscores is discouraged.
* **Avoid modules that contain a variable or function with the same name as the module.**  For instance, `import parse.parse` is not ideal.
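
The naming rules above can be seen together in one short sketch. Every name in it (`MAX_RETRIES`, `HTTPServerError`, `from_status_line`, `is_retryable`) is hypothetical and chosen only to illustrate the conventions.

```python
# All names here are hypothetical, chosen only to illustrate the rules.

# Constant: all caps, with underscores between words
MAX_RETRIES = 3


# Class: CamelCase, with the abbreviation fully capitalized
class HTTPServerError(Exception):
    """Signal that a server returned an error status."""

    def __init__(self, status_code):
        super().__init__('server returned {}'.format(status_code))
        self.status_code = status_code

    # Class method: first argument is `cls`
    @classmethod
    def from_status_line(cls, status_line):
        """Build an error from a line such as 'HTTP/1.1 503 Unavailable'."""
        status_code = int(status_line.split()[1])

        return cls(status_code)

    # Instance method: first argument is `self`
    def is_retryable(self):
        """Return whether the status code is in the 5xx range."""
        res = 500 <= self.status_code < 600

        return res


err = HTTPServerError.from_status_line('HTTP/1.1 503 Unavailable')
print(err.is_retryable())  # True
```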

Using underscores before and after:
* Double-**leading** underscore: when naming a class attribute, invokes name mangling (inside `class FooBar`, `__boo` becomes `_FooBar__boo`).  Generally, double leading underscores should be used only to avoid name conflicts with attributes in classes designed to be subclassed.
* Double-**leading and trailing** underscore: "magic" objects or attributes that live in user-controlled namespaces. E.g. `__init__`, `__import__` or `__file__`. Never invent such names; only use them as documented.
* (Quasi-) protected methods and properties start with a **single leading** underscore (`_name`).  These are weak "internal use" indicators. E.g. `from M import *` does not import objects whose name starts with an underscore.
* If you need to use a reserved word, add a `_` to the end (e.g. `class_`; `Tkinter.Toplevel(master, class_='ClassName')`)

In [6]:
# Name mangling example
class Cls(object):
    def func1(self):
        pass
    def __func2(self):
        return True
c = Cls()
# can't access:
# c.__func2
c._Cls__func2() # only way to access

True
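
The main use case for double leading underscores, avoiding attribute clashes in classes designed to be subclassed, might look like the following sketch. The `Base` and `Child` classes are hypothetical.

```python
class Base(object):
    def __init__(self):
        self.__token = 'base'  # Stored on the instance as `_Base__token`

    def base_token(self):
        return self.__token


class Child(Base):
    def __init__(self):
        super().__init__()
        self.__token = 'child'  # Stored as `_Child__token`, so no clash

    def child_token(self):
        return self.__token


c = Child()
print(c.base_token())   # base
print(c.child_token())  # child
```

Because each class mangles the name with its own prefix, the subclass cannot accidentally overwrite the attribute of the base class.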

![naming.PNG](./imgs/naming.PNG)

## Line length

Max 79 characters (not 80).  The Python standard library requires limiting docstrings and comments to 72.

Make exceptions in rare cases.  One would be with URLs, which you should not break.  From PEP 8:

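_Two good reasons to break a particular rule:_
1. _When applying the rule would make the code less readable, even for someone who is used to reading code that follows the rules._
2. _To be consistent with surrounding code that also breaks it (maybe for historic reasons), although this is also an opportunity to clean up someone else's mess._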

In [51]:
"""
Resources
=========
http://www.python.org/dev/peps/pep-0008/#a-foolish-consistency-is-the-hobgoblin-of-little-minds
"""

pass

## Line breaks and binary operators

In [2]:
gross_wages = 100
taxable_interest = 10
dividends = 10
qualified_dividends = 5
ira_deduction = 30

# No: operators sit far away from their operands
income = (gross_wages +
          taxable_interest +
          (dividends - qualified_dividends) -
          ira_deduction)

# Yes: easy to match operators with operands
income = (gross_wages
          + taxable_interest
          + (dividends - qualified_dividends)
          - ira_deduction)

## Testing

Use "implicit" `True/False` rules where possible _and appropriate_.

In [71]:
a = []

# Good
if not a:
    print('a is an empty list')
    
# Bad
if a == []:
    # print('a is empty')
    pass

a is an empty list


**However, be careful.** Python evaluates certain values as `False` when in a boolean context. 

A quick "rule of thumb" is that all "empty" values are considered `False`:
* `None`
* 0
* `[]`
* `{}`
* `''`

So (from PEP 8), **beware of writing `if x` when you really mean `if x is not None`** -- e.g. when testing whether a variable or argument that defaults to `None` was set to some other value. The other value might have a type (such as a container) that could be `False` in a boolean context!

In [68]:
# Okay
def func(a=None):
    if a is None:
        print('a is None')
    elif not a:
        # `None` is already ruled out, so `a` is some other "empty" value
        print('a is empty')

# Bad
def func2(a=None):
    if not a:
        print('a is an empty list')  # No, it's not! It may be None
func()

a is None


Never use `==` or `!=` to compare singletons like `None`. Use `is` or `is not`.

In [66]:
a = None

# Good
print(a is None)

# Bad
print(a == None)

True
True


Never compare a boolean variable to `False` using `==`. Use `if not x:` instead. If you need to distinguish `False` from `None` then chain the expressions, such as `if not x and x is not None:`

In [70]:
var = False

# Good
if not var:
    print("it's probably false, or could be none")

# More explicit, distinguish false from none
if (not var) and (var is not None):
    print('definitely false now')

# Bad
if var == False:
    print("it's false")

it's probably false, or could be none
definitely false now
it's false


## Imports

Don't use wildcards, ever.

In [34]:
# No
from collections import *

Imports should usually be on separate lines.

In [3]:
# No
import os, sys

# Yes
import os
import sys

Imports should be grouped in the following order:

1. Standard library imports
2. Related third party imports
3. Local application/library specific imports

You should put a blank line between each group of imports.  Within each grouping, **imports should be sorted lexicographically, ignoring case, according to each module's full package path.**

In [9]:
import collections
import warnings

import pandas as pd
import scipy.stats as scs

# from .datasets import somefunction

Where you are importing lots of names, use the following syntax:

In [50]:
# TODO: indentation?
from collections import (defaultdict, OrderedDict,
                         deque)

Use absolute imports:
* Even if the module is in the same package, use the full package name
* This helps prevent unintentionally importing a package twice
* Absolute imports are more readable and tend to be better behaved

## Whitespace and blank lines

Trailing whitespace: make sure to delete it.  In Notepad++: _Edit > Blank Operations > Trim Trailing Whitespace_.

Blank lines: 
* Two blank lines between (**before and after**) top-level **function and class** definitions
* One blank line between method definitions.

In [22]:
class Cls1(object):
    
    def __init__(self, x):
        self.x = x
        
    def method1(self):
        pass
    

def _helper():
    pass

Follow standard typographic rules for the use of spaces around punctuation.  Avoid extraneous whitespace:
* immediately inside parentheses, brackets or braces
* between a trailing comma and a following close parenthesis
* immediately before a comma, semicolon, or colon
* immediately before the open parenthesis that starts the argument list of a function call
* immediately before the open bracket that starts an indexing or slicing
* around the `=` sign when used to indicate a keyword argument or a default parameter value.

In [45]:
# Yes
res = sum((sum((1, 2)), sum((1, 2))))

d = {}
d['key'] = 1

if not d: print(1, 2)

def complex(real, imag=0.0):
    return magic(r=real, i=imag)


# No
res = sum(( sum((1, 2)), sum((1, 2)) )) # spaces inside brackets

d ['key'] = 1 # space before dictionary key

if not d: print(1 , 2)  # Space before comma


def complex(real, imag = 0.0):
    return magic(r = real, i = imag) # Spaces in default values

Don't use whitespace to line up assignment operators (=, :).

In [35]:
# No
firstvar = 5
y        = 2

_Always_ surround these binary operators with a single space on either side:
* assignment (`=`)
* augmented assignment (`+=`, `-=` etc.)
* comparisons (`==, <, >, !=, <=, >=, in, not in, is, is not`)
* Booleans (`and`, `or`, `not`).

If operators with different priorities are used, consider adding whitespace around the operators with the lowest priority(ies). Use your own judgment; however, never use more than one space, and always have the same amount of whitespace on both sides of a binary operator.

In [38]:
a = b = 1
x = 5
y = 2

# Preferred
x = x*2 - 1
hypot2 = x*x + y*y
c = (a+b) * (a-b)

# No
x = x * 2 - 1
hypot2 = x * x + y * y
c = (a + b) * (a - b)

## Module _dunder_ attributes

Every module should have the following non-null attributes at a bare minimum:
* `__author__`
* [`__all__`](https://docs.python.org/3/tutorial/modules.html#importing-from-a-package)

Module-level "dunders" should be placed after the module docstring but before any import statements except `from __future__` imports.  (Python mandates that future-imports must appear in the module before any other code except docstrings.)

In [4]:
"""This is the example module.

This module does stuff.
"""

# from __future__ import stuff

__all__ = ['a', 'b', 'c']
__version__ = '0.1'
__author__ = 'Brad Solomon'

import os
import sys

## Comments (besides docstrings)

Write comments in complete sentences, and try to write in ["Strunk & White" English](https://en.wikipedia.org/wiki/The_Elements_of_Style).  A short, single-sentence comment may omit its final period; in a multi-sentence comment, end every sentence with a period.

In [4]:
# This is a complete comment
# This is another comment.  Second sentence of the comment.

Multi-line comments should indent 4 spaces from the start of the comment, beginning on the second line (hanging indents).

In [4]:
# This is a comment
#     that takes up multiple lines.
# The next comment starts here.

Refer to other Python objects with surrounding "\`" characters.

In [5]:
# This is a comment referring to the `func1` function

Inline comments:
* Use them sparingly for _non-obvious_ code.  _Complicated_ operations deserve standalone comments.
* **They should be separated by at least two spaces from the statement.**

In [6]:
x = 5
x = x + 1  # Compensate for border

## Docstrings

Conventions for writing good documentation strings (a.k.a. "docstrings") are immortalized in [PEP 257](https://www.python.org/dev/peps/pep-0257/).

A docstring should be organized as:
1. A summary line (one physical line) terminated by a period, question mark, or exclamation point
2. Followed by a blank line
3. Followed by the rest of the docstring _starting at the same cursor position as the first quote of the first line_.

In [27]:
def f(x=None):
    """This is the title-level docstring.  Fit on 1 line.

    Some additional description.

    Parameters
    ==========
    x : str or None, default None
        Definition of `x`
    """
    
    return x

Try to give every function a docstring, unless it meets all of the following criteria:
* not externally visible
* very short
* obvious

Use triple double quotes rather than triple single quotes for docstrings.

In [26]:
"""Preferred"""

'''Okay, just not preferred'''

pass

* Docstring "underlines" should use the equals sign ("===")
* If a parameter has a default value, state it next to the type
* If a parameter has some membership restriction, state the possible values in a tuple (see `b` below)

    """
    Parameters
    ==========
    a : bool, default False
        The definition of a
    b : str, one of ('ignore', 'catch')
        The definition of b
    """

* All modules should normally have docstrings
* All functions and classes exported by a module should also have docstrings
* Public methods should also have docstrings. Most modules in the Python Standard Library don't use docstrings for `__init__` methods
* A package itself may be documented in the module docstring of the `__init__.py` file in the package directory.

Classes should have a docstring below the class definition describing the class. **If your class has public attributes, they should be documented here in an Attributes section and follow the same formatting as a function's Parameters section.**

In [28]:
class SampleClass(object):
    """Summary of class here.

    Longer class information....
    Longer class information....

    Attributes
    ==========
    likes_spam : bool
        Whether we like SPAM or not
    eggs : int
        The count of eggs we have laid
    """

    def __init__(self, likes_spam=False):
        """Inits SampleClass with blah."""
        self.likes_spam = likes_spam
        self.eggs = 0

    def public_method(self):
        """Performs operation blah."""

### One-line v. multi-line docstrings

One-liners are for really obvious cases. They should really fit on one line.
* Triple quotes are used even though the string fits on one line. This makes it easy to later expand it.
* The closing quotes are on the same line as the opening quotes. This looks better for one-liners.
* **There's no blank line either before or after the docstring.**
* The docstring is a phrase ending in a period. 
 * It prescribes the function or method's effect as a command ("Do this", "Return that"), not as a description; e.g. don't write "Returns the pathname ...".
* The one-line docstring should NOT be a "signature" reiterating the function/method parameters (which can be obtained by introspection).

In [2]:
# Yes
def function(a, b):
    """Perform operation X."""
    return a + b

# No - a docstring is not just a 'signature'
def function(a, b):
    """function(a, b) -> list"""

Multi-line docstrings:
* The entire docstring is indented the same as the quotes at its first line.
* Insert a blank line after all docstrings (one-line or multi-line) that document a class.
* The docstring for a module should generally list the classes, exceptions and functions (and any other objects) that are exported by the module, with a one-line summary of each.
* The docstring for a package (i.e., the docstring of the package's `__init__.py` module) should also list the modules and subpackages exported by the package.

Use `r"""raw triple double quotes"""` if you use any backslashes in your docstrings.
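
A minimal sketch of a raw docstring follows. The `split_windows_path` function is hypothetical and exists only to give the docstring a reason to contain backslashes.

```python
def split_windows_path(path):
    r"""Split a Windows-style path on the '\' separator.

    Without the `r` prefix, sequences such as '\n' or '\t' in this
    docstring would be interpreted as escape characters.
    """
    parts = path.split('\\')

    return parts


print(split_windows_path('C:\\temp\\file.txt'))  # ['C:', 'temp', 'file.txt']
```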

## TODOs

It is okay to leave TODOs in code. Format examples:

    # TODO: a single thing
    # TODO: 
    # - First thing
    # - Second thing

## `return` statements

* In multi-line functions, try to avoid doing any computation in the return statement itself.

In [6]:
# Bad
def f(x):
    a = x ** 2
    b = x - 5
    return x * a + b

# Good
def f(x):
    a = x ** 2
    b = x - 5
    res = x * a + b

    return res

Separate the function body and `return` by 1 blank line, always.

In [7]:
# Bad
def f(x):
    if isinstance(x, str):
        x = float(x)
    res = x + 5
    return res

# Good
def f(x):
    if isinstance(x, str):
        x = float(x)
    res = x + 5
    
    return res

* Be consistent in return statements. Either all return statements in a function should return an expression, or none of them should. 
* If any return statement returns an expression, any return statements where no value is returned should explicitly state this as `return None`, and an explicit return statement should be present at the end of the function (if reachable).

In [None]:
import math

# Yes
def foo(x):
    if x >= 0:
        return math.sqrt(x)
    else:
        return None

# Yes
def bar(x):
    if x < 0:
        return None
    return math.sqrt(x)

# Yes
def foo(x):
    if x >= 0:
        res = math.sqrt(x)
    else:
        res = None
    
    return res

# No
def foo(x):
    if x >= 0:
        return math.sqrt(x)

# No
def bar(x):
    if x < 0:
        return
    return math.sqrt(x)



## List comprehension

Okay to use for simple cases.  Don't use nested list comprehensions with multiple `for` clauses.

In [19]:
# Okay
order = list('acebd')
target = list('edba')
res = [m for m in order if m in target]

# Bad
res2 = [(x, y) for x in range(10) for y in range(5) if x * y > 10]
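
One way to rewrite the `res2` comprehension above is with explicit loops. This is only a sketch of the alternative; it trades brevity for a structure that is easier to scan.

```python
# Same result as the nested comprehension, written as explicit loops
res2 = []
for x in range(10):
    for y in range(5):
        if x * y > 10:
            res2.append((x, y))

print(res2[0])  # (3, 4)
```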

## Lambda functions

Okay to use them for one-liners. If the code inside the lambda function is any longer than 60–80 chars, it's probably better to define it as a regular (nested) function.
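
As a sketch of an acceptable one-liner, a lambda works well as a short, single-use sort key. The `pairs` data is made up for illustration.

```python
pairs = [('b', 2), ('a', 3), ('c', 1)]

# Short, single-use key function: a lambda is fine here
by_count = sorted(pairs, key=lambda pair: pair[1])
print(by_count)  # [('c', 1), ('b', 2), ('a', 3)]
```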

Always use a `def` statement instead of an assignment statement that binds a lambda expression directly to an identifier.

In [9]:
# Yes
def f(x): 
    return 2*x

# No:
f = lambda x: 2*x

For common operations like multiplication, use the functions from the [`operator`](https://docs.python.org/3/library/operator.html) module instead of lambda functions. For example, prefer `operator.mul` to `lambda x, y: x * y`.
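
A small sketch of where this matters: passing a multiplication function to `functools.reduce`. The `factors` list is made up for illustration.

```python
import functools
import operator

factors = [1, 2, 3, 4]

# `operator.mul` takes the place of `lambda x, y: x * y`
product = functools.reduce(operator.mul, factors)
print(product)  # 24
```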

In [3]:
import operator

# Yes
operator.mul

# No
lambda x, y: x * y

<function __main__.<lambda>>

## Conditional expressions

Okay to use for one-liners. In other cases prefer to use a complete `if` statement.

In [20]:
y = 2
x = 1 if y > 3 else 2
print(x)

2


Object type comparisons should always use `isinstance()` instead of comparing types directly.

In [10]:
obj = 5

# Yes
if isinstance(obj, int):
    pass

# No
if type(obj) == int:
    pass

## Other

* Write in UTF-8 in Python 3.
* Avoid bare `except` statements unless you have a great reason to use one.
* Random seed values (i.e. `np.random.seed`): use 123.  Put this statement after all imports.
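
As a sketch of the alternative to a bare `except`, name the exceptions you actually expect. A bare `except:` would also catch `KeyboardInterrupt` and `SystemExit`, which is rarely what you want. The `parse_port` function and its default value are hypothetical.

```python
def parse_port(text, default=8080):
    """Return `text` as an integer port, or `default` if it is invalid."""
    try:
        port = int(text)
    except (TypeError, ValueError):
        # Only the errors `int()` raises for bad input are handled here
        port = default

    return port


print(parse_port('443'))  # 443
print(parse_port('abc'))  # 8080
```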