# Some Best Practices for Python Code

Most of what is covered here is defined in <a href="https://www.python.org/dev/peps/pep-0008/">PEP 8</a>, the official style guide for writing code in Python.

PEP 8 is long, so I will cover only the recommendations that matter most for keeping your code readable.

At the end of this notebook, I'll just list some tips and tricks that I've found to be very handy over the years.

## Line Lengths

When writing Python code, the recommended maximum line length is 79 characters. This is not a hard limit in the sense that your code won't run if it contains a line longer than 79 characters; it's a hard _suggestion_. The motivation behind the 79 characters is to let you keep multiple files open side by side on the same screen.

<img src="example_line_length.png" alt="side by side python code">
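
As a minimal sketch of how to stay under the limit, PEP 8's preferred way of wrapping a long line is the implied line continuation inside parentheses, brackets, and braces. The variable names and values below are made up purely for illustration.

```python
# Hypothetical values, used only to illustrate wrapping.
first_measurement = 1.5
second_measurement = 2.5
third_measurement = 3.0

# A long expression can be wrapped inside parentheses
# (implied line continuation), breaking before each operator.
total = (
    first_measurement
    + second_measurement
    + third_measurement
)

# Adjacent string literals inside parentheses are joined automatically.
message = (
    'The total of the three measurements is '
    f'{total:.1f}'
)
print(message)
```

No backslashes are needed because the open parenthesis tells Python that the statement continues on the next line.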

## Indentation

Suppose you have a function that requires a large number of arguments and looks something like this,

```python
def my_func(arg1, arg2, arg3, arg4, arg5, arg6, arg7, arg8=10, arg9='hello', arg10=[1, 2], arg11=None):
    ...
    return result
```

While this might be readable on a larger monitor, it's very difficult to read on a laptop screen, especially if your editor is not full screen. A better way to write this code is to put each argument on its own line,

```python
def my_func(
    arg1,
    arg2,
    arg3,
    arg4,
    arg5,
    arg6,
    arg7,
    arg8=10,
    arg9='hello',
    arg10=[1, 2],
    arg11=None
):
    ...
    return result
```

This doesn't just apply to functions with a long list of arguments; it applies to any long Python expression, such as a container literal. For example, consider the Python dictionaries given below,

```python

data1 = {'date': [date1, date2, date3], 'param1': [val1, val2, val3], 'param2': [val1, val2, val3]}

data2 = {
    'date': [date1, date2, date3],
    'param1': [val1, val2, val3],
    'param2': [val1, val2, val3]
}
```

Which of those is more readable? `data2` of course!

## Naming Conventions

Having a consistent approach to naming variables, functions, classes, and the like is essential for the readability of shared code. Variable names should be explicit and should describe the information they store. For example, please, please, please, at all costs, avoid doing something like this,
```python
fruit = 'Jacksonville, Florida'
```
Additionally, try to avoid single-character variable names for important variables, unless they are used to track iterations in a loop.

The official conventions defined in the PEP 8 standard are,
- Variables
  - lowercase or lower_case_with_underscores, e.g. `myvar` or `my_var`
- Functions
  - lowercase or lower_case_with_underscores, e.g. `myfunc()` or `my_func()`
- Classes
  - CapitalizedWords, e.g. `MyClass()`
- Global Variables
  - The official PEP 8 recommendation is the same as for functions
  - I disagree here and find that it is much nicer to use the following scheme,
    - `_GLOBAL_VARIABLE`
  - It makes it much easier to immediately recognize whether a variable you are looking at is defined in the local namespace of the function or the global namespace of the script.

## Catching Exceptions

When using a `try/except`, please name a specific exception as opposed to catching a general one. For example,

```python
b = '1'
try:
    a = b + 2
except TypeError as e:
    print(e)
else:
    print(f'The value of a: {a}')
```
Here we see that `b` is a `str`, and since we can't add a `str` to an `int`, Python will raise a `TypeError` exception. This `try/except` block is especially handy for dealing with cases where a variable can be either `None` or exactly one other type.
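
As a rough sketch of the `None` case, the hypothetical helper below (`add_two` is a made-up name) catches only `TypeError`, so any other kind of failure would still surface instead of being silently swallowed.

```python
def add_two(value):
    """Return value + 2, or None if the addition is not supported.

    add_two is a hypothetical helper used only for illustration.
    """
    try:
        result = value + 2
    except TypeError as e:
        # None + 2 raises TypeError, so a None input lands here.
        print(f'Could not add 2: {e}')
        return None
    else:
        return result


print(add_two(3))
print(add_two(None))
```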

## Some Tips and Tricks

### `defaultdict`
Probably one of my most used objects in Python is `defaultdict`, which is contained in the `collections` module (this module is part of the standard library, so no external package is required). This object allows you to create a Python `dict` where the default value associated with each new key can be defined by passing a callable such as `list` or `int`. The default value can be anything you like: a string, a list, an integer. I've found the most useful to be a list.
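
Here is a small sketch of a few different defaults. The argument to `defaultdict` is a callable that takes no arguments and returns the default value, which is why a fixed string needs a `lambda`. The variable names and sample words are invented for illustration.

```python
from collections import defaultdict

counts = defaultdict(int)                # missing keys start at 0
groups = defaultdict(list)               # missing keys start as []
labels = defaultdict(lambda: 'unknown')  # missing keys start as 'unknown'

for word in ['a', 'b', 'a']:
    counts[word] += 1
    groups[len(word)].append(word)

print(dict(counts))
print(dict(groups))
print(labels['missing'])
```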

Consider the following scenario:
- You have some list of data files. 
- Each data file has some parameters stored in it.
- You want to loop through each data file and retrieve the data for each parameter.
- Once the process is done you want a dictionary containing key/value pairs for each of the parameters stored in each of the files.
    - Each key in the `dict` object corresponds to a list of all the values for the given parameter in each of the files.

Without using the `defaultdict` object, your code would look something like this:

```python
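# filename1 ... filenameN and retrieve_data() are placeholders
flist = [filename1, filename2, ..., filenameN]
data = {
    'param1': [],
    'param2': [],
    'param3': [],
    'param4': [],
}
for fname in flist:
    parameters = retrieve_data(fname)
    # start=1 so the keys match 'param1' ... 'param4' defined above
    for i, param in enumerate(parameters, start=1):
        data[f'param{i}'].append(param)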
```
This is fine when the number of parameters is small, but you can immediately see that it won't scale very well and will ultimately require some copy/pasting. 

We can accomplish the same task using a `defaultdict` as follows,

```python
from collections import defaultdict

flist = [filename1, filename2, ..., filenameN]
data = defaultdict(list)
for fname in flist:
    parameters = retrieve_data(fname)
    for i, param in enumerate(parameters, start=1):
        data[f'param{i}'].append(param)
```

Nice, right? Each time we access a key that doesn't exist yet in the `defaultdict` object, its value is initialized as an empty `list`!
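
For a version you can actually run, here is a sketch with a stand-in for `retrieve_data`. The stub, the file names, and the choice to number the keys from 1 are all assumptions made for this example, not part of any real pipeline.

```python
from collections import defaultdict


def retrieve_data(fname):
    """Hypothetical stand-in: return three parameter values per file."""
    return [len(fname), fname.upper(), fname.endswith('.dat')]


flist = ['a.dat', 'bb.dat']  # made-up file names
data = defaultdict(list)
for fname in flist:
    parameters = retrieve_data(fname)
    for i, param in enumerate(parameters, start=1):
        data[f'param{i}'].append(param)

print(dict(data))
```

One thing to keep in mind is that simply reading a missing key also creates it, so converting with `dict(data)` once you're done can help avoid surprises later.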

### Keyword arguments

When calling a function in Python, each argument is matched to a parameter either by its position or by a keyword.
- A positional argument is matched to a parameter by where it appears in the list of arguments passed to the function.

- A keyword argument is matched to a parameter by name, using `name=value` in the call. Parameters that are given a default value in the function definition can be left out of the call entirely.


An example of a function with positional and keyword arguments is given below.
```python

def my_func(
    positional1, 
    positional2,
    keyword1='hello',
    keyword2=None
):
    ...
```

Keyword arguments are really useful for simplifying a complex function call because you can do things like this,

```python

keywords = {
    'keyword1': 'wooohooooo',
    'keyword2': 3.14159
}

my_func(1, 2, **keywords)
```
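
To see the different call styles side by side, here is a runnable sketch. The function body is an assumption (it just returns its arguments as a tuple) so that the effect of each call is visible.

```python
def my_func(positional1, positional2, keyword1='hello', keyword2=None):
    """Illustrative body (an assumption): return the arguments as a tuple."""
    return positional1, positional2, keyword1, keyword2


# Positional arguments are matched by order; defaults fill in the rest.
print(my_func(1, 2))

# Keyword arguments are matched by name, so their order does not matter.
print(my_func(1, 2, keyword2=3.14159, keyword1='wooohooooo'))

# Unpacking a dict with ** is equivalent to the call above.
keywords = {'keyword1': 'wooohooooo', 'keyword2': 3.14159}
print(my_func(1, 2, **keywords))
```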

### The `logging` module

The Python `logging` module provides a convenient interface for logging any errors or warnings that occur during runtime. This makes it extremely useful for monitoring the progress of a long processing pipeline. One of the key features of `logging` is the ability to define a format for the output. In the snippet below, I set up the logger to include information about the module, function, and line number where the logging statement is called.


```python
import logging

logging.basicConfig(format='%(levelname)-4s '
                           '[%(module)s:%(funcName)s:%(lineno)d]'
                           ' %(message)s')
LOG = logging.getLogger()
LOG.setLevel(logging.INFO)

LOG.info('info!')
LOG.warning('warning!')
LOG.error('error!')
...
```
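
If you want to experiment with the format on its own, here is a minimal, self-contained sketch. It is not the downloadable example: it assumes a dedicated named logger (`pipeline_demo`) writing to an in-memory stream instead of the root logger, and `process_step` is a made-up function.

```python
import io
import logging

# A named logger with its own handler, so the root logger is untouched.
# 'pipeline_demo' and process_step are hypothetical names.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter(
    '%(levelname)-4s [%(module)s:%(funcName)s:%(lineno)d] %(message)s'
))

log = logging.getLogger('pipeline_demo')
log.setLevel(logging.INFO)
log.addHandler(handler)
log.propagate = False  # keep messages out of the root logger


def process_step(value):
    log.debug('not shown: DEBUG is below the INFO threshold')
    log.info('processing %s', value)
    try:
        return value + 2
    except TypeError as e:
        log.error(e)
        return None


process_step(1)
process_step('1')
print(stream.getvalue())
```

Because the level is set to `INFO`, the `debug` call is dropped, while the `info` and `error` calls each produce a line tagged with the function name and line number.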

A full example can be downloaded here: <a href="./logging_example.py">logging_example.py</a>. Example output from this demo script is shown below.

    import logging_example

    inputs = {'input1':'3.15989', 'input2':'asdf'}
    logging_example.cool_function(**inputs)

Output:

    INFO [logging_example:<module>:9] info!
    ERROR [logging_example:<module>:11] error!
    ERROR [logging_example:cool_function:22] Unknown format code 'f' for object of type 'str'
    INFO [logging_example:cool_function:29] 
    input1: 3.15989
    input2: asdf
    ----------------------------------------------------------------------
    ERROR [logging_example:cool_function:34] can only concatenate str (not "int") to str
