![floss](floss.jpg)

# Coding Best Practices

UW CLMS Treehouse  
01 February 2018

# Roadmap

- 'Best practices': Why do we care?  
- Principles to follow  
- Petite exercise
- Version control  

# Why do we care?

A famous quote:  
    "Always code as if the guy who ends up maintaining your code   
    will be a violent psychopath who knows where you live.   
    Code for readability."
    
    
![yikes](scream.jpg)

# More realistically, the person maintaining your code will be...  

## **YOU**, one year from now

- Maybe you're boning up for a job interview?
- Maybe you need to submit a code sample for an application?
- Maybe you're recycling code for a new project?

**Do yourself a favor!**

## Your coworkers & collaborators

- Build communication skills by making your code easy to understand!
- Increase productivity by reducing time wasted explaining your code!
- Help newcomers to your team get up to speed quickly!  

**Nice!**

# Some guidelines to follow

Write code that is:
- Readable
- Maintainable
- Portable

## Readable
- Document your code!
    - Use sensible naming
        - Don't call variables 'x' and 'y'
        - Give functions names that describe what they do (verb phrases!)
        - EXCEPTION: Math functions, e.g. `cosine_similarity(x, y)`
    - Use descriptive comments
        - Explain things that aren't immediately clear from function and variable names
        - Explain how to use a module
        - Sometimes you want to explain a particular design choice
    - Follow style conventions
        - It's OK to be a prescriptivist about code style!
    - For Python: 
        - The PEP 8 Guidelines are a great place to look
        - It's like The Elements of Style for Python code
        - Following conventions makes it easier for other people to read your code.
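
As a small illustration of the naming and commenting advice above, here is a sketch that contrasts a terse function with a more descriptive one. The function names and the stopword scenario are hypothetical, chosen only for this example.

```python
# Hard to read: the names say nothing about what the function does.
def f(x, y):
    return [i for i in x if i not in y]


# Easier to read: a verb-phrase name, descriptive arguments, and a docstring.
def remove_stopwords(tokens, stopwords):
    """Return the tokens that do not appear in the stopword collection."""
    return [token for token in tokens if token not in stopwords]
```

Both functions do the same thing, but only the second one tells a reader what it is for without reading the body.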

# Portable

- Make it easy to run your code in a new setting
    - Don't hard-code file paths! 
    - Don't hard-code any sort of environment variables or inputs
    - In other words, make your code flexible / recyclable
    - For example:
        - Confusion matrix: Don't hard-code the number of classes or the names of classes
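
One way to follow the confusion matrix advice above is to infer the classes from the data. This is a minimal sketch, not the only way to do it; the function name `build_confusion_matrix` and the example labels are made up for illustration.

```python
from collections import Counter


def build_confusion_matrix(gold_labels, predicted_labels):
    """Count (gold, predicted) pairs without assuming which classes exist."""
    # Infer the class names from the data instead of hard-coding them.
    classes = sorted(set(gold_labels) | set(predicted_labels))
    pair_counts = Counter(zip(gold_labels, predicted_labels))
    # Rows are gold labels, columns are predicted labels.
    matrix = [[pair_counts[(gold, pred)] for pred in classes]
              for gold in classes]
    return classes, matrix


classes, matrix = build_confusion_matrix(
    ["cat", "dog", "cat", "bird"],
    ["cat", "cat", "cat", "bird"],
)
```

The same function works unchanged for two classes or twenty, because nothing in it depends on a fixed number or set of class names.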

# Maintainable

- Make your code modular
    - If you have multiple nested loops...
    - If you find yourself writing the same block of code multiple times...
    - **Factor it out! Make it a function or a class.**
        - This makes it easy to pinpoint where the problems are in your code
        - It also makes it easier to adapt your code to new specifications
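
Here is a small sketch of what factoring out can look like. The text-processing functions are hypothetical; the point is that the shared normalization step lives in exactly one place.

```python
def normalize(text):
    """Lowercase the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def count_words(text):
    """Count whitespace-separated words after normalization."""
    return len(normalize(text).split())


def find_shared_words(first_text, second_text):
    """Return the sorted words that appear in both texts."""
    first_words = set(normalize(first_text).split())
    second_words = set(normalize(second_text).split())
    return sorted(first_words & second_words)
```

If the normalization rules change later (say, you also want to strip punctuation), only `normalize` needs to be edited, and a bug in it only needs to be fixed once.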

# Tools that can help

## Linters, e.g. pylint

- Pull the 'lint' out of your code
- Automatically check for style issues, unused code, errors
- Provide tips for refactoring
- Pylint gives your code a 'score' after it runs 
    - You might get hooked on trying to get a perfect score
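
To make this concrete, here is a small hypothetical snippet containing the kind of issues a linter typically reports. Pylint, for example, has `unused-import` and `unused-variable` checks. The exact messages and the score depend on the Pylint version and configuration, so no output is shown here.

```python
import os  # never used below: a candidate for an unused-import warning


def compute_average(values):
    total = 0
    count = len(values)
    unused_total = 0  # assigned but never read: an unused-variable candidate
    for value in values:
        total += value
    return total / count
```

The code runs and returns the right answer, which is exactly why this kind of clutter tends to survive until a tool points it out.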

## Books!

- Effective Python, by Slatkin (https://effectivepython.com/)
- Programming Pearls, by Bentley 
- Probably 9000 other books

Coding books make excellent leisure reading. 




# Time for a petite example!

Given a list of scores, return a list of each score minus the mean score.

Let's call this 'shifting.'

```python
scores = [1,2,3,4,5]

def shift(scores):
    new_scores = []
    for score in scores:
        mean = sum(scores) / 5
        new_scores.append((float(score - mean)))
    return new_scores

shift(scores)
```

```
[-2.0, -1.0, 0.0, 1.0, 2.0]
```

What's wrong with this?

- You don't need to calculate the mean 5 times!
    - It is recalculated on every pass through the loop. Do it once, before the loop.
- This function is not portable!
    - The denominator is hard-coded as 5. What if I get a list with 6 values?

Let's fix it...

```python
scores = [1,2,3,4,5]

def shift(scores):
    new_scores = []
    mean = sum(scores) / len(scores)
    for score in scores:
        new_scores.append((float(score - mean)))
    return new_scores

shift(scores)
```

```
[-2.0, -1.0, 0.0, 1.0, 2.0]
```

Can we do even better than this?

YOU BET! Python has list comprehensions.

```python
scores = [1,2,3,4,5]

def shift(scores):
    # Still calculate the mean ahead of time!
    mean = sum(scores) / len(scores)
    return [s - mean for s in scores]

shift(scores)
```

```
[-2.0, -1.0, 0.0, 1.0, 2.0]
```

OK. Now I have THREE lists of scores for you to shift.

```python
score_list = [[1,2,3,4,5],
              [2,3,4,5,6],
              [5,4,3,2,9]]

shift(score_list[0])
shift(score_list[1])
shift(score_list[2])  # the notebook displays only this last result
```

```
[0.40000000000000036,
 -0.5999999999999996,
 -1.5999999999999996,
 -2.5999999999999996,
 4.4]
```

```python
# What did I tell you? Don't write the same line of code twice!
for score in score_list:
    # Only printing this so it shows up in my presentation
    print(shift(score))
```

```
[-2.0, -1.0, 0.0, 1.0, 2.0]
[-2.0, -1.0, 0.0, 1.0, 2.0]
[0.40000000000000036, -0.5999999999999996, -1.5999999999999996, -2.5999999999999996, 4.4]
```


This all seems REALLY obvious now... but it's easy to make little 'mistakes' like these as your code gets more complex and has more moving parts (especially if you are working under pressure). 

Keep those moving parts factored out from the start to make your code easier to debug and tweak!

Give yourself the time to proofread and optimize your code.

![blop](blop.jpg)



# Version control

- A way to control your versions: track every change to your code and go back to any earlier state.
- Git is the most commonly used.

## What is it useful for? 
- Reverting to a previous state if you really fr\*ck something up
- Collaboration on a project
    - Dropbox is OK, but not great for code projects!
- Documenting changes to your codebase
- Create a public portfolio of your code with GitHub or GitLab
    - Impress people with your awesome work!

## If you go into industry, it is highly likely that you will need to learn a version control system, so start early! 

# Getting started with version control

- Tutorial: [Try Git](http://try.github.io)
- Some IDEs integrate with version control
    - e.g., PyCharm, which has a very nice graphical interface for resolving conflicts in code
    - (What's an IDE? 'Integrated development environment')
        - (Often has built-in linter, compiler / interpreter, other features to streamline your workflow)


# Miscellaneous tips

- Rubber duck debugging
    - Explain your code to a rubber duck who has never programmed before and needs each line explained step by step
- Take advantage of resources!
    - Read up on the Python standard library
    - Develop your code on patas and use condor
- Believe in yourself

# Did you learn something new? 

If you found this utterly boring --- that's good!  
You know as much, or more, than I do.

Or, I am really boring.  

# Thanks!

# Bonus material: IPython

## IPython: Interactive Python
- Enhanced REPL
    - Scroll up through history for **whole functions**, not just lines!
    - Supports Unix commands (no need to `import os`!)
    - Tab completion
    - Magic commands to show you names of variables, time your code, ...
- IPython Notebook
    - Make slides like these
    - Showcase projects in a lab notebook format
        - With in-line graphics!
    - Easily export to .py



# Bonus material: The famous Fibonacci problem
 
```python
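def fibonacci(n):
    # Reject 0 and negatives; otherwise the recursion never reaches a base case
    if n < 1:
        raise ValueError("n must be a positive integer")
    # First Fibonacci number is 0
    elif n == 1:
        return 0
    # Second Fibonacci number is 1
    elif n == 2:
        return 1
    else:
        return fibonacci(n - 1) + fibonacci(n - 2)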
```

## What's wrong with the above?

### You're duplicating operations... WHAT A WASTE
Use dynamic programming! Store each result the first time you compute it, then look it up instead of recomputing it.


```python

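# Fibonacci numbers computed so far (the 1st is 0, the 2nd is 1)
fib_cache = [0, 1]

def fibonacci(n):
    # Reject 0 and negatives instead of silently returning a wrong value
    if n < 1:
        raise ValueError("n must be a positive integer")
    elif n <= len(fib_cache):
        # Already computed: look it up instead of recursing
        return fib_cache[n - 1]
    else:
        temp_fib = fibonacci(n - 1) + fibonacci(n - 2)
        fib_cache.append(temp_fib)
        return temp_fib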
    
```
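
The standard library can also do the caching for you. The sketch below uses `functools.lru_cache`, which is not part of the original slides; the name `cached_fibonacci` is just for illustration, and the numbering (the first Fibonacci number is 0) follows the examples above.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def cached_fibonacci(n):
    """Return the n-th Fibonacci number, counting from 1 (so n=1 gives 0)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    if n <= 2:
        return n - 1
    # Each value is computed once; repeated calls are served from the cache.
    return cached_fibonacci(n - 1) + cached_fibonacci(n - 2)
```

This is still recursive, so a very large `n` on a cold cache can hit Python's recursion limit. For that case, a plain loop that builds the sequence from the bottom up would be the safer choice.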