In [1]:
# imports
import numpy as np

# Formatting Guidelines

#### Question: What are some features of well-written code?

- Good variable names
- Plenty of spacing
- Comments when needed*
- Reasonably sized functions
- "Shallow" or not deeply nested

## Well-Written Code

Here, we'll go over each of the features of well-written code we just discussed and how to make sure your code flows logically and is easy to read.

These are only guidelines that help people (including your future self) read your code.

### Naming Objects

(Adapted from "Naming Things in Code" by [CodeAesthetic](https://www.youtube.com/@CodeAesthetic/videos))

#### Descriptive Names

First, make sure your object names are descriptive. Outside of coordinates (```x, y, r```), variables should not be a single letter. To the chagrin of many, this rule applies to indices too 😢.

Here is an example of bad object naming. Can you guess what ```s``` or ```nqr``` is?

In [2]:
# non-descriptive variable names
k = 1.380649e-23

def s(nqr):
    return k * (5/2 - np.log(nqr))

Let's change the object names to be more descriptive.

In [3]:
# descriptive variable names
k = 1.380649e-23

def entropy_ideal_gas(density_ratio):
    # Sackur-Tetrode: S = k * (ln(n_Q / n) + 5/2), with density_ratio = n / n_Q
    return k * (5/2 - np.log(density_ratio))

Anyone who guessed ```s``` is the entropy of an ideal gas particle given by the Sackur-Tetrode equation and ```nqr``` is the ratio of observed density to quantum density was correct!

#### Avoid Abbreviations

Generally, you should not have abbreviations in your object names as they narrow the audience that can read your code. However, like everything in life, context matters. For this class, do not feel obligated to name your diffusion coefficient ```diffusion_coefficient``` instead of ```D``` -- this applies to your other variables too.

Depending on the context, you can also use abbreviations like avg for average and stddev for standard deviation. Just keep in mind who the end user of the code will be.

In [4]:
# with abbreviations
def a_to_s(a):
    d = a * 365.2422
    h = d * 24
    m = h * 60
    s = m * 60

    return s

In [5]:
# without abbreviations
def year_to_second(years):
    days = years * 365.2422
    hours = days * 24
    minutes = hours * 60
    seconds = minutes * 60

    return seconds

You may have been able to guess what ```a_to_s``` meant by looking at the source code. However, imagine how annoyed you would be if you had to look through another person's (this includes your past self) source code to discern what a name meant.

### Blank Lines

Adding blank lines is an easy way to improve the readability of your code. Blank lines group your code into logical units.

Let's see an example from one of our NPRE 451 labs where we did not have any blank lines in our code. (You will get an error if you try to run this.)

In [None]:
# without blank lines
def fit_all_models():
    fit_results = [] # An empty array for storing the fitted parameters
    models = [poisson, gaussian, binomial]
    for model in models:
        if model == poisson:
            initial_guess = [sample_mean] # initial_guess = [YOUR GUESS FOR MU]
            bounds = (250, 300) # bounds = (MU LOWER BOUND, MU UPPER BOUND)
        if model == gaussian:
            initial_guess = [sample_mean, sample_stdev] # initial_guess = [YOUR GUESS FOR MU, YOUR GUESS FOR SIGMA]
            bounds = ([250, 10], [300, 20]) # bounds = ([MU LOWER BOUND, SIGMA LOWER BOUND], [MU UPPER BOUND, SIGMA UPPER BOUND])
        if model == poisson or model == gaussian:
            # the following line performs the fit
            best_fit_parameters, cov_matrix = optimize.curve_fit(model, counts, probabilities, p0 = initial_guess, bounds = bounds)
        if model == binomial:
            guess_of_p = 1 - sample_stdev**2/sample_mean
            guess_of_N = sample_mean / guess_of_p
            initial_guess = [guess_of_N, guess_of_p] # initial_guess = [YOUR GUESS FOR N, YOUR GUESS FOR P]
            best_fit_parameters, cov_matrix = optimize.curve_fit(model, counts, probabilities, p0 = initial_guess, maxfev = 10000)
        fit_results.append([best_fit_parameters, cov_matrix])
    return fit_results

In [None]:
# with blank lines
def fit_all_models():
    fit_results = [] # An empty array for storing the fitted parameters
    models = [poisson, gaussian, binomial]
    
    for model in models:
    
        if model == poisson:
            initial_guess = [sample_mean] # initial_guess = [YOUR GUESS FOR MU]
            bounds = (250, 300) # bounds = (MU LOWER BOUND, MU UPPER BOUND)
        
        if model == gaussian:
            initial_guess = [sample_mean, sample_stdev] # initial_guess = [YOUR GUESS FOR MU, YOUR GUESS FOR SIGMA]
            bounds = ([250, 10], [300, 20]) # bounds = ([MU LOWER BOUND, SIGMA LOWER BOUND], [MU UPPER BOUND, SIGMA UPPER BOUND])
        
        if model == poisson or model == gaussian:
            # the following line performs the fit
            best_fit_parameters, cov_matrix = optimize.curve_fit(model, counts, probabilities, p0 = initial_guess, bounds = bounds)
        
        if model == binomial:
            guess_of_p = 1 - sample_stdev**2/sample_mean
            guess_of_N = sample_mean / guess_of_p
    
            initial_guess = [guess_of_N, guess_of_p] # initial_guess = [YOUR GUESS FOR N, YOUR GUESS FOR P]
            best_fit_parameters, cov_matrix = optimize.curve_fit(model, counts, probabilities, p0 = initial_guess, maxfev = 10000)
        
        fit_results.append([best_fit_parameters, cov_matrix])
    return fit_results

Although this code could be refactored, a few blank lines that visually group the code dramatically improve its readability.

Blank lines are free, so don't hesitate to add more. If you think your code would be easier to read with an extra blank line, it probably would be, so add the line.

### Comments

In addition to blank lines, do not hesitate to add comments if you think the comment would improve readability.

If you find yourself writing verbose comments, your code might benefit from simplification or refactoring.
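As a rough illustration of that last point, compare a line that needs a long comment with a version where descriptive names carry the explanation instead. The function names and the saturation limit here are made up for this example and are not taken from any lab.

```python
# Version 1: the comment has to explain what the expression means
def count_valid(measurements):
    # keep only the measurements that are non-negative and below the
    # detector saturation limit of 1000 counts, then count them
    return len([m for m in measurements if m >= 0 and m < 1000])


# Version 2: the names carry the explanation, so little commenting is needed
SATURATION_LIMIT = 1000  # counts (hypothetical value for illustration)


def is_valid_count(count):
    """Return True if the count is non-negative and below saturation."""
    return 0 <= count < SATURATION_LIMIT


def count_valid_measurements(measurements):
    return sum(is_valid_count(count) for count in measurements)
```

Both versions return the same number; the second simply moves the explanation out of the comment and into the names.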

### Reasonably Sized Functions

Each function should not have too much responsibility. Not only is a function that does too much hard to understand, gigantic functions are almost impossible to debug.

Let's go back to the code we used for NPRE 451 and see if we can break up the function a bit better.

In [None]:
# all in one function
def fit_all_models():
    fit_results = [] # An empty array for storing the fitted parameters
    models = [poisson, gaussian, binomial]
    
    for model in models:
    
        if model == poisson:
            initial_guess = [sample_mean] # initial_guess = [YOUR GUESS FOR MU]
            bounds = (250, 300) # bounds = (MU LOWER BOUND, MU UPPER BOUND)
        
        if model == gaussian:
            initial_guess = [sample_mean, sample_stdev] # initial_guess = [YOUR GUESS FOR MU, YOUR GUESS FOR SIGMA]
            bounds = ([250, 10], [300, 20]) # bounds = ([MU LOWER BOUND, SIGMA LOWER BOUND], [MU UPPER BOUND, SIGMA UPPER BOUND])
        
        if model == poisson or model == gaussian:
            # the following line performs the fit
            best_fit_parameters, cov_matrix = optimize.curve_fit(model, counts, probabilities, p0 = initial_guess, bounds = bounds)
        
        if model == binomial:
            guess_of_p = 1 - sample_stdev**2/sample_mean
            guess_of_N = sample_mean / guess_of_p
    
            initial_guess = [guess_of_N, guess_of_p] # initial_guess = [YOUR GUESS FOR N, YOUR GUESS FOR P]
            best_fit_parameters, cov_matrix = optimize.curve_fit(model, counts, probabilities, p0 = initial_guess, maxfev = 10000)
        
        fit_results.append([best_fit_parameters, cov_matrix])
    return fit_results

In [None]:
# multiple functions
def poisson_params(mean=sample_mean, mu_bounds=(250, 300)):
    """Generates best-fit parameters for a Poisson distribution"""
    initial_guess = [mean]
    
    fit_parameters, cov_matrix = optimize.curve_fit(
        poisson, counts, probabilities, p0=initial_guess, bounds=mu_bounds
    )

    return fit_parameters, cov_matrix


def gaussian_params(
    mean=sample_mean,
    std=sample_stdev,
    mu_bounds=(250, 300),
    sigma_bounds=(10, 20)
):
    """Generates best-fit parameters for a Gaussian distribution"""
    initial_guess = [mean, std]
    bounds = (
        [mu_bounds[0], sigma_bounds[0]],
        [mu_bounds[1], sigma_bounds[1]]
    )

    fit_parameters, cov_matrix = optimize.curve_fit(
        gaussian, counts, probabilities, p0=initial_guess, bounds=bounds
    )

    return fit_parameters, cov_matrix


def binomial_params(
    mean=sample_mean,
    std=sample_stdev,
):
    """Generates best-fit parameters for a binomial distribution"""
    P_guess = 1 - std**2 / mean
    N_guess = mean / P_guess

    initial_guess = [N_guess, P_guess]

    fit_parameters, cov_matrix = optimize.curve_fit(
        binomial, counts, probabilities, p0=initial_guess, maxfev=10000
    )
    
    return fit_parameters, cov_matrix


fit_results = [
    poisson_params(),
    gaussian_params(),
    binomial_params()
]

For the computer project, reasonably sized functions would be individual functions for generating diffusion coefficients, getting material properties, and performing different steps required to solve the problem.

From experience, it is much easier to debug a function that only finds the diffusion coefficient at a given point than a function that solves the entire question at once.
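A minimal sketch of what such small functions might look like is below. The function names, the ```materials``` dictionary, and the formula D = 1 / (3 * sigma_transport) (the usual diffusion-theory approximation) are assumptions for illustration; use whatever definitions the project statement gives you. The cross-section values are placeholders chosen to make the arithmetic easy, not real data.

```python
def get_transport_cross_section(material, materials):
    """Look up the macroscopic transport cross section [1/cm] of a material."""
    return materials[material]["sigma_transport"]


def diffusion_coefficient(sigma_transport):
    """Return D [cm], assuming D = 1 / (3 * sigma_transport)."""
    if sigma_transport <= 0:
        raise ValueError("sigma_transport must be positive")

    return 1 / (3 * sigma_transport)


# placeholder numbers for illustration only, not real material data
materials = {
    "fuel": {"sigma_transport": 0.5},
    "moderator": {"sigma_transport": 2.0},
}

fuel_sigma = get_transport_cross_section("fuel", materials)
fuel_D = diffusion_coefficient(fuel_sigma)
```

Because each function does one thing, a wrong answer can be traced to either the lookup or the formula by checking them separately.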

### "Shallow" Code

Code being "shallow" means it is not deeply nested. Nesting refers to a loop or statement inside another loop or statement. Each loop or statement adds another "layer" to your code. Generally, 2-3 layers should be the maximum depth for your code, although exceptions do apply.

Here is an example of counting layers in your code.

In [None]:
# 0 layers deep
def testing(end):
    # 1 layer deep, functions move you down a layer
    for i in range(end + 1):
        # 2 layers deep, loops move you down a layer
        if i % 2 == 0:
            # 3 layers deep, statements move you down a layer
            print(f"{i} is even")
        # 2 layers deep
        else:
            # 3 layers deep
            print(f"{i} is odd")
    # 1 layer deep
    return


# 0 layers deep
testing(3)

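In addition to deeper code being, generally, harder to read, layers relate to the scope of your variables. In Python, the layer that matters for scope is the function: a variable defined inside a function is out of scope once you move back to layer 0. Loops and ```if``` statements do not create a new scope in Python, so a variable defined in layer 3 is still accessible from layers 1 and 2 of the same function (many other languages, such as C++, behave differently). Relying on this is a common source of bugs, because the variable only exists if the deeper layer actually ran.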
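In Python specifically, the function layer is the one that creates a new scope; loops and ```if``` statements do not. The sketch below (with made-up names) shows both behaviours: a variable assigned three layers deep is still usable back in layer 1, but not in layer 0.

```python
def find_first_even(numbers):
    # layer 1: inside the function
    for number in numbers:
        # layer 2: inside the loop
        if number % 2 == 0:
            # layer 3: inside the if statement
            first_even = number
            break

    # back in layer 1: first_even is still accessible here, because loops
    # and if statements do not create a new scope in Python
    # (if no even number was found, this line raises UnboundLocalError)
    return first_even


result = find_first_even([3, 5, 8, 9])

# layer 0: names defined inside the function are not visible out here
try:
    first_even
except NameError:
    visible_outside_function = False
else:
    visible_outside_function = True
```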

An example of deeply nested code and how to fix it will be shown later, but try to limit the depth of your code. If you need a for loop inside an if statement inside another if statement, all in another for loop, there is probably a better way to write the code.

# Formatting Example using ```sum_evens()```

Let's look at a poorly written implementation of a function that adds all the even numbers between some bounds. This function works and does the intended job; however, it is somewhat difficult to sight-read. (Example from "Why You Shouldn't Nest Your Code" by [CodeAesthetic on YT](https://www.youtube.com/@CodeAesthetic/videos))

#### Question: What about this code makes it difficult to read?

The main issue with ```sum_evens()``` is how nested everything is. Two things stand out in particular:
1. the else statement at the bottom and
2. the amount of logic inside the while loop.

### Bad Example of ```sum_evens()```

In [10]:
def sum_evens(bottom, top):
    if (top >= bottom):
        total = 0
        
        while (bottom <= top):
            if (bottom % 2 == 0):
                total += bottom
            bottom += 1

        return total

    else:
        return 0

In [11]:
# you can test the code here
sum_evens(5, 10)

24

The code works as expected, but we can format ```sum_evens()``` in a much clearer way.

## Reformatting ```sum_evens()```

### Extraction

Two ways we can improve the legibility of ```sum_evens()``` are extraction and inversion. We will start with extraction and come back to inversion in a minute.

Extraction is a fancy way of saying "this one function is doing too much, so I am going to extract some responsibility of the main function and create smaller, more isolated functions." We said before that the while loop is doing too much, so let's take some of the logic out of the while loop and create a new function to do the same thing.

In [12]:
def filter_evens(number):
    if (number % 2 == 0):
        return number

    return 0


def sum_evens(bottom, top):
    if (top >= bottom):
        total = 0
        
        while (bottom <= top):
            total += filter_evens(bottom)
            bottom += 1

        return total

    else:
        return 0

Here, we **extract** the responsibility of filtering the even numbers out of ```sum_evens``` and into ```filter_evens```. This example is fairly straightforward, but you can imagine doing something similar for a gigantic function that, for example, solves an entire problem on the computer project.

### Inversion

As previously mentioned, inversion is another way we can make code more readable. Instead of wrapping the main body in an ```if``` statement that checks the bounds are in order, we can invert the condition and return early if ```bottom > top```.

Let's see what this looks like.

In [13]:
def filter_evens(number):
    if (number % 2 == 0):
        return number

    return 0


def sum_evens(bottom, top):
    if (bottom > top):
        return 0
    
    total = 0
    
    while (bottom <= top):
        total += filter_evens(bottom)
        bottom += 1

    return total

Instead of entering the main body of our code by checking a positive condition (the bounds are in order), we **invert** the condition to a negative one (```bottom > top```) and return early when it holds.

With inversion, a section of our code "validates the input" before we enter the main body. That validation section can hold an arbitrary number of conditions, each ensuring your input follows an expected form.
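For instance, a version of ```sum_evens()``` with one more guard clause might look like the sketch below. The type check and its error message are assumptions added for illustration; they are not part of the function used in the rest of this notebook.

```python
def filter_evens(number):
    if number % 2 == 0:
        return number

    return 0


def sum_evens(bottom, top):
    # input validation: each guard handles one problem and exits early
    if not isinstance(bottom, int) or not isinstance(top, int):
        raise TypeError("bottom and top must both be integers")

    if bottom > top:
        return 0

    # main body: only reached when the input has the expected form
    total = 0

    while bottom <= top:
        total += filter_evens(bottom)
        bottom += 1

    return total
```

Adding another check later means adding another guard at the top, and the main body stays at the same depth.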

Once again, ```sum_evens()``` is fairly straightforward, so let's look at a particularly egregious example.

In [14]:
def check_if_2(number):
    if (number == int(number)):
        if (number >= 0):
            if (number % 2 == 0):
                if (number == 2):
                    return "Yep, that's 2"
                else:
                    return f"{number} is an even natural number, but not 2"
            else:
                return f"{number} is a natural number, but not even"
        else:
            return f"{number} is an integer, but not a natural number"
    else:
        return f"{number} is not an integer"

With inversion, ```check_if_2()``` becomes:

In [15]:
def check_if_2(number):
    if (number != int(number)):
        return f"{number} is not an integer"

    if (number < 0):
        return f"{number} is an integer, but not a natural number"

    if (number % 2 != 0):
        return f"{number} is a natural number, but not even"

    if (number != 2):
        return f"{number} is an even natural number, but not 2"
    
    return "Yep, that's 2"

Which implementation do you think is easier to follow? Notice, both have the same number of lines as well.

Instead of going through ```if else``` statements, the input gets sequentially validated. If the validation fails, the code returns early. 

When using inversion, it is best to start with the most common case that will cause an early return and work your way to the least common case. In this example, the most and least common cases are easy to determine, which may not be true of actual problems.

# Unit Tests

Until this point, you have hopefully trusted me when I described what the function is doing. To reiterate, we expect ```sum_evens()``` to sum all the even numbers between the bottom and the top including the top. Additionally, if the bottom is larger than the top, we return 0.

Using this logic, for a given ```(bottom, top)```, we expect the following results:
- ```(0, 2)``` = 0 + 2 = 2
- ```(0, 5)``` = 0 + 2 + 4 = 6
- ```(4, 4)``` = 4
- ```(10, 4)``` = 0
- ```(-2, 5)``` = (-2) + 0 + 2 + 4 = 4
- ```(-6, -3)``` = (-6) + (-4) = -10

If we want to check our code against the expected results, we can implement unit tests. Unit tests assert that our function returns the expected value for a wide range of simple cases.

Let's see what unit tests using ```assert``` might look like for our ```sum_evens()``` function.

In [16]:
def test_sum_even():
    assert sum_evens(0, 2) == 2
    assert sum_evens(0, 5) == 6
    assert sum_evens(4, 4) == 4
    assert sum_evens(10, 4) == 0
    assert sum_evens(-2, 5) == 4
    assert sum_evens(-6, -3) == -10

test_sum_even()

Unit tests reduce the need for temporary print statements in your functions whose output you have to check manually against the expected values.

When constructing unit tests, try to come up with every scenario that could be an issue. For this example, we checked:
1. an even bottom to an even top
2. an even bottom to an odd top
3. a bottom equal to top
4. a bottom greater than the top
5. a negative bottom to a positive top
6. a negative bottom to a negative top

This is by no means a comprehensive list. Still, each unit test provides a unique check not covered by the others.

For this computer project, unit tests are by no means required. However, if you have questions about the code, easily showing that your code works as intended, or how it fails, will make it much easier for Professor Novak and the TAs to help you debug.

(If you want a more robust testing suite, there is a module called [pytest](https://docs.pytest.org/en/stable/) designed with unit tests in mind.)
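As a rough sketch of what that could look like: pytest collects functions whose names start with ```test_``` from files named ```test_*.py``` and reports each one separately, so plain ```assert``` statements are enough. The file name and the test names below are made up for illustration, and ```sum_evens()``` is repeated so the file is self-contained.

```python
# contents of a hypothetical file named test_sum_evens.py


def sum_evens(bottom, top):
    if bottom > top:
        return 0

    total = 0

    while bottom <= top:
        if bottom % 2 == 0:
            total += bottom
        bottom += 1

    return total


# one small test per scenario, named after what it checks
def test_even_bottom_to_even_top():
    assert sum_evens(0, 2) == 2


def test_bottom_equal_to_top():
    assert sum_evens(4, 4) == 4


def test_bottom_greater_than_top():
    assert sum_evens(10, 4) == 0


def test_negative_bottom_to_negative_top():
    assert sum_evens(-6, -3) == -10
```

Running ```pytest test_sum_evens.py``` from a terminal would then report which of the four tests passed or failed, with the failing scenario identified by name.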

# Debugging

## Introduction

If you write any amount of code, you will inevitably have to debug your code. Debugging is the process of locating errors in your code, isolating those errors, and modifying your code to behave as expected.

The term debugging is often credited to Admiral Grace Hopper, who worked on early computers at Harvard in the 1940s. As the story goes, the computer her team was working on had a moth stuck inside that was causing problems, and removing it became known as "debugging" the system.

Picture this: you ask another person for help debugging your code and they see a jumbled mess of code that has one-letter variables, no functions, and quantities defined nowhere near where they are used. Trying to find the bug in your code is like trying to find a needle in a haystack.

#### For Jupyter Lab users

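Jupyter Lab keeps the variables from every cell you have previously run in memory, even after you edit or delete the cell that defined them. If your code behaves strangely, restart the kernel and run all cells from the top to rule out stale variables.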

#### Question: If you were helping somebody debug their code, what would you hope to see?

- Functions
  - short, isolated, simple
- Descriptive variables
- Feedback mechanisms
  - unit tests
  - print statements
- Anything else?

## Thinking about Writing Code

When you are writing code, think about the final product as a jigsaw puzzle. Each function or block of code you write is a puzzle piece. If you are trying to put 20 puzzle pieces together at once, you will take longer to solve the puzzle. Work on one function at a time, ensure each function works as you expect, and then couple the function with your existing functions.

Breaking up your code allows you to isolate bugs more easily and prevents you from combing through code that works.
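As a small sketch of that workflow, the year-to-second conversion from earlier can be built one piece at a time, checking each piece before connecting it to the next. The function and constant names here are made up for illustration.

```python
DAYS_PER_YEAR = 365.2422
SECONDS_PER_DAY = 24 * 60 * 60


# piece 1: write it, then check it on its own
def years_to_days(years):
    return years * DAYS_PER_YEAR


assert years_to_days(0) == 0
assert years_to_days(1) == DAYS_PER_YEAR


# piece 2: also checked on its own
def days_to_seconds(days):
    return days * SECONDS_PER_DAY


assert days_to_seconds(1) == 86400


# final step: couple the two pieces only after each one works
def years_to_seconds(years):
    return days_to_seconds(years_to_days(years))


assert years_to_seconds(0) == 0
assert abs(years_to_seconds(1) - 31556926.08) < 1e-3
```

If the last check fails while the first two pass, the bug has to be in how the pieces were joined, not in the pieces themselves.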