<img src='img/logo.png'>
<img src='img/title.png'>
<img src='img/py3k.png'>

# Table of Contents
* [Motivation](#Motivation)
* [How long should a function be?](#How-long-should-a-function-be?)
* [How many functions should be in a module?](#How-many-functions-should-be-in-a-module?)
* [When should I refactor?](#When-should-I-refactor?)
	* [Exercise (refactor into smaller functions)](#Exercise-%28refactor-into-smaller-functions%29)


In [None]:
# Some heuristic rules for writing good code
import this

# Motivation

Often, when we sit down to write code, our first pass may work but not be easy to reuse or generalize. The temptation to settle for something that just does the job has a number of downsides:
  * it becomes hard to understand later what the code was meant to do
  * when sharing the code, other people will not understand how it works
  * other, similar problems may need parts of the same algorithm, and you don't want to keep rewriting the same techniques
  * smaller logical pieces lend themselves to testing: if a function does only one thing, we can test that it does that one thing correctly for all valid inputs (see chapter Testing). Smaller functions also lend themselves to version control (e.g., git), so that when you change a function, you know immediately what it affects.
  
How to program is a question of personal style, but only to some extent. Keep in mind that the objective is to minimize the time it takes for someone to come to the code and understand what each piece of it does. That someone may well be you some time in the future.

# How long should a function be?

Humans have a limited capacity to remember things in working memory. Try to keep the number of variables small and give them meaningful names. Keep the amount of processing a function does to a minimum. This will result in functions that are typically 20 lines of code or shorter, although there are always exceptions.
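As a rough illustration of this advice, the sketch below splits a small calculation into two short functions, each with only a few descriptively named variables. The functions `mean` and `variance` are hypothetical examples chosen for this sketch and are not used elsewhere in this chapter.

```python
def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)


def variance(values):
    """Return the population variance of a non-empty sequence of numbers."""
    centre = mean(values)
    squared_deviations = [(v - centre) ** 2 for v in values]
    # Reuse mean() rather than repeating the sum/len arithmetic.
    return mean(squared_deviations)


print(variance([1.0, 2.0, 3.0, 4.0]))
```

Each function can be read, and tested, without keeping the other one in mind.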

# How many functions should be in a module?

As with functions, the contents of a module should be logically related, and limited in scope and number so that someone looking at the module understands what it is for and what its components do.

# When should I refactor?

Refactoring is the process of reshaping code that works as advertised into a better structure that follows the suggestions above. With practice, a programmer will tend to design and write code that better follows style guides and conventions, but in real life, refactoring will always become necessary from time to time. The *when* can be roughly described as either when a function has become too complex to understand easily, or when some part (not all) of its functionality is required again in a different context. Remember, repeating yourself means you have multiple opportunities to make mistakes.
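A small, hypothetical illustration of the second trigger: two functions need the same clean-up step, so the step is moved into a helper of its own. All names here (`greet`, `label`, `clean_name`) are made up for this sketch.

```python
# Before: the same clean-up expression is written out twice.
def greet_before(name):
    cleaned = name.strip().title()
    return "Hello, " + cleaned


def label_before(name):
    cleaned = name.strip().title()
    return "[" + cleaned + "]"


# After: the shared step lives in one place, so a fix is made only once.
def clean_name(name):
    """Strip surrounding whitespace and capitalise each word."""
    return name.strip().title()


def greet(name):
    return "Hello, " + clean_name(name)


def label(name):
    return "[" + clean_name(name) + "]"


# The refactored functions should behave exactly like the originals.
print(greet("  ada lovelace ") == greet_before("  ada lovelace "))
print(label("  ada lovelace ") == label_before("  ada lovelace "))
```

Comparing old and new outputs, as in the last two lines, is a cheap check that a refactoring has not changed behaviour.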

In [None]:
# Example
import numpy as np
from math import pi

def sphere_dist(P1, P2):
    """Compute the surface distance (km) between two locations on Earth.

    Assumes a spherical Earth. P1 and P2 are (latitude, longitude) pairs in degrees.
    """
    R = 6365 # Radius of earth (km)
    lat1, long1 = P1 # extract lat/long coordinates
    x1 = np.cos(long1 * pi/180) * np.cos(lat1 * pi/180)
    y1 = np.sin(long1 * pi/180) * np.cos(lat1 * pi/180)
    z1 = np.cos(lat1 * pi/180) # trig functions are in radians
    R1 = np.array([x1,y1,z1]) # cartesian location of P1

    lat2, long2 = P2
    x2 = np.cos(long2 * pi/180) * np.cos(lat2 * pi/180)
    y2 = np.sin(long2 * pi/180) * np.cos(lat2 * pi/180)
    z2 = np.cos(lat2 * pi/180)
    R2 = np.array([x2,y2,z2]) # cartesian location of P2
    alpha = np.arccos(np.dot(R1,R2)) # angle in radians
    return R * alpha

In [None]:
# London to New York
sphere_dist((51.5, -0.1), (40.7, -74.0))

In the function, we had the following variables defined: `P1, P2, lat1, long1, lat2, long2, x1, y1, z1, x2, y2, z2, R, R1, R2, alpha`. That's quite a few! Also, the expression `* pi/180` (to convert from degrees to radians) shows up many times, and the first half of the function is identical to the second half, except for which variables are being processed. All this repetition invites mistakes, as we see: the z-coordinate is computed with `np.cos`, where a latitude measured from the equator calls for `np.sin`, and the slip appears in both halves.

## Exercise (refactor into smaller functions)

Rewrite the function as several smaller functions to remove the repetition and reduce the number of variables in use at any one time. The signature of `sphere_dist` should not change, and it should return the same result for the same inputs (checking this is a simple test).

In [None]:
def spherical_to_cartesian(P):
    pass
 
def deg_to_rad(a):
    pass

def sphere_dist(P1, P2):
    """Compute the surface distance (km) between two locations on Earth.

    Assumes a spherical Earth. P1 and P2 are (latitude, longitude) pairs in degrees.
    """
    pass

sphere_dist((51.5, -0.1), (40.7, -74.0))

Next, consider generalising the code: what if we wanted to calculate the Cartesian coordinates of any point given in spherical coordinates `(lat, long, R)`? Would our Earth-specific distance implementation need to change?
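One possible starting point for this generalisation is sketched below. It is not the linked solution: the name `to_cartesian`, its separate arguments and the default `radius=1.0` are assumptions made for this sketch. It takes latitude as measured from the equator, so the z-coordinate uses the sine of the latitude.

```python
import numpy as np


def to_cartesian(lat_deg, lon_deg, radius=1.0):
    """Convert (latitude, longitude) in degrees to Cartesian x, y, z.

    Hypothetical helper: latitude is measured from the equator and
    radius defaults to 1.0, which gives a unit vector.
    """
    lat = np.radians(lat_deg)
    lon = np.radians(lon_deg)
    return radius * np.array([
        np.cos(lon) * np.cos(lat),
        np.sin(lon) * np.cos(lat),
        np.sin(lat),  # z depends only on latitude
    ])


print(to_cartesian(0.0, 0.0))        # point on the equator at longitude 0
print(to_cartesian(90.0, 0.0, 2.0))  # north pole of a sphere of radius 2
```

With a helper like this, the radius is the only Earth-specific input; whether the rest of `sphere_dist` needs to change is left as part of the exercise.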

---
<a href='best_refactoring_soln.ipynb' class='btn btn-primary'>Solution</a>

<img src='img/copyright.png'>