# Overview

In the first two workshops, we learned the basics of Python programming, followed by programming strategies that help avoid copying and repeating code (e.g. methods, object-oriented techniques, inheritance). In this session, we will build on these concepts by introducing tools and other best practices to ensure that your code is not only functional, but also maintainable and easy to extend.

Some key concepts include:

* debugging
* version control
* documentation

# Debugging

Debugging is a challenging skill to master. In general, it involves deciphering the error messages generated by the Python interpreter when you've made a mistake somewhere in the code. One of the most useful debugging tools is the step-by-step debugger, which lets you run code one line at a time to see where things break.

Let's take an example piece of code:

In [0]:
def mistake():
    
    a = 1
    b = 'one'
    print(a + b)
    
mistake()

Oops! What went wrong? Let's use an interactive debugger to find out:

In [0]:
import pdb

def mistake():
    
    # Set a breakpoint to pause here (uncomment the next line to activate it)
    # pdb.set_trace()
    
    a = '1 ' 
    b = 'one'
    print(a + b)
    
mistake()
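Alongside the debugger, the error message itself usually names the problem. As a minimal sketch (the name `mistake_original` and the `try`/`except` wrapper are illustrative additions, not part of the workshop code), the first version of the function can be run safely and its exception inspected:

```python
import traceback

def mistake_original():
    # Same bug as the first example: adding an int to a str
    a = 1
    b = 'one'
    print(a + b)

try:
    mistake_original()
except TypeError as err:
    # The exception type and message describe what went wrong
    error_name = type(err).__name__
    error_message = str(err)
    # The last traceback entry describes where it went wrong
    last_frame = traceback.extract_tb(err.__traceback__)[-1]
    print(error_name, '-', error_message)
    print('Raised inside:', last_frame.name, 'at line', last_frame.lineno)
```

The message should point at the `+` between an `int` and a `str`, which is exactly the line to place a breakpoint in front of.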

## Tips

Some tips for using the `pdb` debugger:

* place the breakpoint just before the line where your code broke
* (l)ist to see a bit of context
* (n)ext for next line
* (p)rint, e.g. `p a`, to see variable contents
* (s)tep to step *into* a method
* (b)reak, e.g. `b 12`, to set another breakpoint at a later line
* (q)uit, or the `exit()` command, to finish debugging
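To see what a few of these commands print without typing at the prompt, the session can be scripted. This is only a sketch under some assumptions: the function `inspect_values` is hypothetical, and the commands are fed to `pdb.Pdb` through an in-memory `stdin` rather than the keyboard, which is not how you would normally work in the notebook.

```python
import io
import pdb

# Commands that would normally be typed at the (Pdb) prompt
commands = io.StringIO('p a\np b\np type(a), type(b)\nc\n')
session = io.StringIO()

# A Pdb instance that reads from `commands` and writes to `session`
debugger = pdb.Pdb(stdin=commands, stdout=session,
                   nosigint=True, readrc=False)

def inspect_values():
    a = 1
    b = 'one'
    debugger.set_trace()  # Pause here; the scripted commands run next
    return str(a) + ' ' + b

result = inspect_values()
transcript = session.getvalue()
print(transcript)
```

The printed transcript shows each `(Pdb)` prompt followed by the value pdb reported, ending when `c` (continue) lets the function finish.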

## Practice

Try debugging these examples:

In [0]:
# -------------------------------------------------
# Example 1
# -------------------------------------------------

def multiply(values):
    """
    Method to multiply all values in list
    
    :params
    
      (list) values
      
    """
    total = 1 
    for value in values:
        total = total * int(value) 
    
    return total

def main():
    
    a = input('Enter value for A:')
    b = input('Enter value for B:')
    
    c = multiply([a, b])
    print('A x B = %i' % c)

main()

In [0]:
# -------------------------------------------------
# Example 2 
# -------------------------------------------------

def add_series(values):
    """
    Method to add all values together
    
    :params
    
      (list) values
    
    """
    total = 0
    
    for value in values:
        print('Adding %i to the total' % value)
        total = total + value
    
    return total

def create_series(max_value):
    """
    Method to create a list of numbers from 1 through max_value
    
    :params
    
      (int) max_value
      
    """
    values = []
    for i in range(max_value):
        values.append(i)
    
    return values

def calculate():
    """
    Method to add a series together
    
    """
    max_value = input('Add together all numbers between 1 and ? (inclusive):')
    max_value = int(max_value)
    values = create_series(max_value)
    total = add_series(values)
    
    print('The sum is %i' % total)

calculate()
    

# Basic Linux and Bash

One of the most challenging mental blocks to the world of programming is learning to use the command line (Terminal). In Linux, the most common Terminal flavor is known as **bash**. Luckily, at the CAIDM we have set up many user-friendly interfaces like this Jupyter notebook to make most of the process quite simple. However, every once in a while you will need to use some basic Terminal commands.

There are two ways to access the terminal. The first and easiest is straight from the Jupyter notebook: any command prefixed with `!` runs in the Bash terminal instead of the Python interpreter. The second option, available only when connected to our in-house CAIDM servers, is the terminal screen itself.

## Common commands

In [0]:
# Print the present working directory
!pwd

In [0]:
# List the files and directories in current folder
!ls

In [0]:
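# Change directories (use %cd rather than !cd: each ! command runs in its
# own temporary shell, so a !cd is forgotten as soon as the line finishes)
%cd datalab

# Move up one directory
%cd ..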

In [0]:
# Make directory
!mkdir temp

# Remove a directory (note the -r)
!rm -r temp

In [0]:
# Move a file or folder
!mkdir temp
!mkdir sub
!mv sub temp
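The same operations are also available from plain Python, which can be handy inside a script. The sketch below mirrors the commands above using the standard library. The temporary sandbox directory is an addition for safety, so that nothing in your real folders is touched.

```python
import os
import shutil
import tempfile
from pathlib import Path

# Work inside a throwaway directory so nothing real is touched
start = Path.cwd()                      # pwd
sandbox = Path(tempfile.mkdtemp())
os.chdir(sandbox)                       # cd <sandbox>

Path('temp').mkdir()                    # mkdir temp
Path('sub').mkdir()                     # mkdir sub
shutil.move('sub', 'temp')              # mv sub temp

listing = sorted(p.name for p in Path('.').iterdir())     # ls
nested = sorted(p.name for p in Path('temp').iterdir())   # ls temp
print(listing, nested)

os.chdir(start)                         # cd back to where we started
shutil.rmtree(sandbox)                  # rm -r <sandbox>
```

Because `temp` already exists when `sub` is moved, `sub` ends up inside it, just as with `mv sub temp`.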

# Version control

Inevitably during the course of writing some code, you will inadvertently do something that breaks your entire program. No matter how much time you spend tracing through your code with your ninja debugging techniques, you can't seem to pinpoint the error. Every time you fix one error, another one pops up!

The solution here is known as version control, the programmer's version of Dropbox (or Google cloud, OneDrive, etc.) for backing up your work to the cloud. Even more useful, every **commit** (a version you save to the **repository**) is tracked, and there are cool tools to see how things changed between commits.

One of the most popular tools for version control is known as *git*, which is what we will be using in the CAIDM. There are many online cloud systems (and software you can install on your own personal servers) to manage *git* based version control, but perhaps the most popular website is GitHub. 

We'll cover a few basic use cases here. For more information, refer to the well-written tutorials [here](https://guides.github.com/activities/hello-world/) or [here](https://product.hubspot.com/blog/git-and-github-tutorial-for-beginners).

## Setting up a repository

Each project and its contents are organized into a collection known as a **repository**. A repository can be created at any time by running the `git init` command in the root folder of the new repository.

In [0]:
# Make a new repo directory using your name, then move into it
# (%cd persists between lines, whereas !cd does not)
!mkdir peter_chang
%cd peter_chang
!git init

Next, log into your GitHub account and connect to the CAIDMRes organization (link [here](https://github.com/CAIDMRes)). First we will create the new repo in our online GitHub account. To do so:

1. Click on the green button to create a new repository.
2. Give it a name (your name)
3. Click create repository
4. Link your local repository with this newly created remote (cloud) repository

In [0]:
# Link repositories
!git remote add origin https://github.com/CAIDMRes/peter_chang.git

## Make your first commit

Copy over the model checkpoint file from assignment #2 into your new repo. Then use the following series of commands to save your repository changes to the cloud.

In [0]:
# Move files in
!mv

# Check the repo status
!git status

# Add new files to be tracked
!git add model.ckpt model.ckpt.meta

# Commit the changes 
!git commit -m 'new checkpoint'

# Sync the changes with the cloud
!git push origin master
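The whole cycle can be rehearsed locally before touching GitHub. The following is a sketch under several assumptions: a local bare repository stands in for the GitHub remote, `notes.txt` is a placeholder for the checkpoint file, the `git` helper function and the demo user name and email are hypothetical, and the example does nothing if git is not installed.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

def git(*args, cwd):
    """Run one git command in `cwd` and return its standard output."""
    command = ['git', '-c', 'user.name=Workshop Demo',
               '-c', 'user.email=demo@example.com',
               '-c', 'commit.gpgsign=false', *args]
    done = subprocess.run(command, cwd=cwd, check=True,
                          capture_output=True, text=True)
    return done.stdout

remote_log = None
if shutil.which('git'):  # Skip quietly when git is not installed
    with tempfile.TemporaryDirectory() as sandbox:
        remote = Path(sandbox) / 'remote.git'   # Stand-in for GitHub
        local = Path(sandbox) / 'local'
        remote.mkdir()
        local.mkdir()

        git('init', '--bare', cwd=remote)
        git('init', cwd=local)
        git('remote', 'add', 'origin', str(remote), cwd=local)

        # Placeholder file standing in for the model checkpoint
        (local / 'notes.txt').write_text('first version\n')
        git('add', 'notes.txt', cwd=local)
        git('commit', '-m', 'new checkpoint', cwd=local)

        # Push the current branch, whatever it is called, to the remote
        git('push', 'origin', 'HEAD', cwd=local)
        remote_log = git('log', '--oneline', '--all', cwd=remote)
        print(remote_log)
```

Pushing `HEAD` rather than a named branch sidesteps the question of whether your git installation calls the default branch `master` or `main`.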

# Documentation

No matter how well you think you know the code you've written, you **will** forget the details in the *very* near future, sooner than you realize. This is why it's so critical to document your code as much as possible while you are writing. You've already been introduced to the most common commenting styles and best practices, but let's review them here for emphasis.

## General comments

General comments are preceded by a simple hash `#` sign and are the most common way you will document your code. From PEP 8, the official Python style guide:

* Comments that contradict the code are worse than no comments. Always make a priority of keeping the comments up-to-date when the code changes!
* Comments should be complete sentences. The first word should be capitalized, unless it is an identifier that begins with a lower case letter (never alter the case of identifiers!).
* You should use two spaces after a sentence-ending period in multi-sentence comments, except after the final sentence.
* When writing English, follow Strunk and White.
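As a small illustration of these guidelines, here is a sketch of a block comment and an inline comment written as complete, capitalized sentences. The function `mean` is a hypothetical example and is not part of the workshop code.

```python
def mean(values):
    # Guard against an empty list first.  Dividing by zero would raise
    # a ZeroDivisionError that says little about the real problem.
    if not values:
        raise ValueError('values must not be empty')

    total = sum(values)
    return total / len(values)  # True division, so the result is a float.

average = mean([1, 2, 3, 4])
print(average)
```

Note that the comments explain why the code does something, not merely what the line already says.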

## Docstrings

In Python, methods and classes can be documented with a special type of comment known as a docstring. By definition, a docstring is placed immediately after the method (or class) signature and is enclosed in triple quotation marks. Docstrings are useful and unique because Python tools such as the built-in `help()` function and many editors read this section when giving the end-user a hint as to the method's functionality. As a result, the information provided here is quite important and often follows some general rules.

In [0]:
def my_method(arg1, arg2, arg3):
    """
    This is a brief explanation of my_method() functionality.
    
    If I would like to describe the functionality in further detail I would put 
    it here in the second paragraph. Sometimes this can be used to describe
    exceptional use cases or important considerations.
    
    :params
    
      (str) arg1 : a string for some purpose
      (int) arg2 : an integer representing something else
      (np.array) arg3 : a NumPy array
      
    :return
    
      (dict) result : the result of this method is a dictionary
      
    """
    pass

In [0]:
# Display the docstring as an end-user would see it
help(my_method)
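Docstrings can also be read programmatically, which is how help tools find them. A minimal sketch follows, using a hypothetical `area` function documented in the same style as above:

```python
import inspect

def area(width, height):
    """
    Return the area of a rectangle.

    :params

      (float) width : horizontal side length
      (float) height : vertical side length

    :return

      (float) area : width multiplied by height

    """
    return width * height

# The raw docstring is stored on the function object itself
raw = area.__doc__

# inspect.getdoc() returns the same text with indentation cleaned up
clean = inspect.getdoc(area)
summary = clean.splitlines()[0]
print(summary)
```

Keeping the first line a short, self-contained summary pays off here, since many tools show only that line.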

# PEP

The official guide and recommendations for Python best practices can be found in a series of documents known as **Python Enhancement Proposals** (PEPs). There are several hundred of them, more than any single developer could assimilate, but the general ideas do become ingrained as you work more with the language.

The full index of PEP documents can be found [here](https://www.python.org/dev/peps/). Of these, the most famous is likely PEP 20 - The Zen of Python. Enjoy!

## PEP 20 - The Zen of Python

```
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
```
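The same text ships with Python itself as an Easter egg: running `import this` prints it. The sketch below captures that printout instead of echoing it, then decodes the copy that the module stores in ROT13 form. The variable name `zen` is just for illustration.

```python
import codecs
import contextlib
import io

# Importing `this` prints the Zen the first time; capture that output
with contextlib.redirect_stdout(io.StringIO()):
    import this

# The module keeps the text ROT13-encoded in the attribute `this.s`
zen = codecs.decode(this.s, 'rot13')
print(zen)
```

In a notebook cell, a bare `import this` is all you need.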