# Introductory Programming with Python (Part 2)
In our last lesson, we discovered something suspicious was going on in our inflammation data by drawing some plots. How can we use Python to automatically recognize the different features we saw, and take a different action for each? 

 
Let's start this session by learning how to write code that runs only when certain conditions are true.

## Conditionals
In Python, conditionals allow you to make decisions in your code based on certain conditions. The most common way to do this is by using if, elif, and else statements.

Here’s how they work:

- **if**: checks if a condition is True and runs the code inside the block.
- **elif**: (short for "else if") checks another condition if the previous one was False.
- **else**: runs if none of the previous conditions were True.

For example:

In [93]:
#Your code goes here

The second line of this code uses the if keyword to tell Python to make a decision. If the condition after if is true, the code under if runs, and "greater" is printed. If the condition is false, the code under else runs, and "not greater" is printed. Only one of these will run, then the program moves on to print "done."
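
As a minimal sketch of the kind of code this paragraph describes (the variable name `num`, its value 37, and the threshold 100 are assumptions chosen for illustration, not necessarily what the worksheet cell contains):

```python
# Assumed example values: num and the threshold 100 are illustrative only.
num = 37

if num > 100:
    message = 'greater'       # runs only when the condition is true
else:
    message = 'not greater'   # runs only when the condition is false

print(message)
print('done')                 # runs in either case
```

Changing `num` to a value above 100 would switch the first printed line to "greater", while "done" is printed either way.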

![Image of if](https://swcarpentry.github.io/python-novice-inflammation/fig/python-flowchart-conditional.png)

Conditional statements don’t have to include an else. If there isn’t one, Python simply does nothing if the test is false:

In [94]:
#Your code goes here

We can also chain several tests together using elif, which is short for “else if”. The following Python code uses elif to print the sign of a number.

In [95]:
#Your code goes here

Note that to test for equality we use a double equals sign == rather than a single equals sign = which is used to assign values.
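
A possible sketch of such a chain, assuming a hypothetical value `num = -3`:

```python
num = -3  # assumed example value

if num > 0:
    sign = 'positive'
elif num == 0:      # == compares two values; a single = would assign one
    sign = 'zero'
else:
    sign = 'negative'

print(num, 'is', sign)
```

Only the first branch whose test is true runs, so at most one of the three assignments is executed.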

### Comparing in Python

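Along with the > and == operators we have already used for comparing values in our conditionals, there are a few more options to know about:

- **>**: greater than
- **<**: less than
- **==**: equal to
- **!=**: not equal to
- **>=**: greater than or equal to
- **<=**: less than or equal to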

We can also combine conditions using **and** and **or**.

- **and** is only true if both conditions are true:

In [96]:
#Your code goes here

- **or** is true if at least one of the conditions is true:

In [97]:
#Your code goes here
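
The two operators could be tried out along these lines; the comparisons themselves are arbitrary examples, not taken from the worksheet:

```python
# and: both sides must be true
both_true = (1 > 0) and (-1 >= 0)    # False, because -1 >= 0 is false

# or: one true side is enough
either_true = (1 < 0) or (1 >= 0)    # True, because 1 >= 0 is true

if both_true:
    print('both parts are true')
else:
    print('at least one part is false')

if either_true:
    print('at least one test is true')
```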

### True and False
In Python, True and False are special values called booleans, which represent whether something is true or false.

For example:

- **1 < 0** (Is 1 less than 0?) returns False, because this statement is not true.
- **-1 < 0** (Is -1 less than 0?) returns True, because this statement is true.

Booleans help us make decisions in our programs.

In [98]:
#Your code goes here

In [99]:
#Your code goes here

## Checking our Data
Now that we’ve seen how conditionals work, we can use them to check for the suspicious features we saw in our inflammation data. We are about to use functions provided by the numpy module again. 

Therefore, since we’re working in a new Python notebook, we will make sure to load the module and data again:

In [100]:
#Your code goes here

From the first couple of plots, we saw that maximum daily inflammation exhibits a strange behavior and rises by one unit a day. Wouldn’t it be a good idea to detect such behavior and report it as suspicious? Let’s do that!

However, instead of checking every single day of the study, let’s merely check if maximum inflammation in the beginning (day 0) and in the middle (day 20) of the study are equal to the corresponding day numbers.

In [102]:
#Your code goes here

We also saw a different problem in the third dataset; the minima per day were all zero (looks like a healthy person snuck into our study). We can also check for this with an elif condition:

And if neither of these conditions is true, we can use else to give the all-clear:

else:
    print('Seems OK!')

Let’s test that out:

In [103]:
#Your code goes here

In this way, we have asked Python to do something different depending on the condition of our data. Here we printed messages in all cases, but we could also imagine not using the else catch-all so that messages are only printed when something is wrong, freeing us from having to manually examine every plot for features we’ve seen before.
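
The whole check could be sketched as follows. To keep the sketch self-contained, two short hypothetical lists (`max_by_day` and `min_by_day`) stand in for the per-day maxima and minima that the worksheet computes from the inflammation files with numpy. The messages other than 'Seems OK!' are also assumptions.

```python
# Hypothetical stand-ins for the per-day maxima and minima of one data set.
max_by_day = list(range(40))     # rises by exactly one unit per day
min_by_day = [0, 0, 1, 1] * 10   # not all zero

if max_by_day[0] == 0 and max_by_day[20] == 20:
    verdict = 'Suspicious looking maxima!'
elif sum(min_by_day) == 0:
    # Non-negative minima can only add up to zero if every one is zero.
    verdict = 'Minima add up to zero!'
else:
    verdict = 'Seems OK!'

print(verdict)
```

Because the branches are tried in order, a data set showing both problems would only report the first one.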

___
### Check your understanding
Consider this code:

Which of the following would be printed if you were to run this code? Why did you pick this answer?
1. A
2. B
3. C
4. B and C
___

### Sorting a list into buckets
In our data folder, large data sets are stored in files whose names start with *“inflammation-”*, and small data sets are stored in files whose names start with *“small-”*. We also have some other files that we do not care about at this point.

We’d like to break all these files into three lists called **large_files**, **small_files**, and **other_files**, respectively.

The **startswith()** method in Python checks if a string begins with a specific sequence of characters (a prefix). It returns *True* if the string starts with the exact characters passed to it, and *False* otherwise.

**Key Points**
1. **Case-Sensitivity**: The method is case-sensitive, meaning 'Str' and 'str' are treated differently.

In [104]:
#Your code goes here

In [105]:
#Your code goes here

2. **Exact Match**: The prefix must match exactly in both characters and case for startswith() to return True.
3. **Practical Use**: This method is useful for categorizing or sorting strings, like checking if filenames start with specific prefixes, making it efficient and straightforward to use.
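
The key points above can be tried out with a short sketch; the strings used here are arbitrary examples:

```python
text = 'String'   # assumed example string

matches_exact = text.startswith('Str')   # True: same characters, same case
matches_lower = text.startswith('str')   # False: 's' differs from 'S'

print(matches_exact, matches_lower)

# A filename check of the kind that is useful when sorting files by prefix
is_large = 'inflammation-01.csv'.startswith('inflammation-')
print(is_large)
```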

Now let's break all the files into three lists called *large_files*, *small_files*, and *other_files*.

In [107]:
filenames = ['inflammation-01.csv',
         'myscript.py',
         'inflammation-02.csv',
         'small-01.csv',
         'small-02.csv']
#Your code goes here

## Creating Functions
At this point, we’ve seen that code can have Python make decisions about what it sees in our data. What if we want to convert some of our data, like taking a temperature in Fahrenheit and converting it to Celsius? We could write something like this for converting a single number:

In [108]:
#Your code goes here

and for a second number we could just copy the line and rename the variables:

In [109]:
#Your code goes here

But we would be in trouble as soon as we had to do this more than a couple of times. Cutting and pasting is going to make our code very long and very repetitive, very quickly. We’d like a way to package our code so that it is easier to reuse, a shorthand way of re-executing longer pieces of code. In Python we can use **functions**.

In Python, a ***function*** is a block of reusable code that performs a specific task. Functions help organize your code, make it reusable, and simplify complex programs.

Let’s start by defining a function fahr_to_celsius that converts temperatures from Fahrenheit to Celsius:

In [110]:
#Your code goes here

The function definition opens with the keyword ***def*** followed by the name of the function (*fahr_to_celsius*) and a parenthesized list of parameter names (*temp*).

The body of the function — the statements that are executed when it runs — is indented below the definition line. The body concludes with a return keyword followed by the return value.

When we call the function, the values we pass to it are assigned to those variables so that we can use them inside the function. Inside the function, we use a return statement to send a result back to whoever asked for it.
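
A minimal sketch of such a definition is shown here. The function name and the parameter name `temp` come from the text above; the body uses the standard Fahrenheit-to-Celsius formula and may differ in detail from the worksheet's own cell.

```python
def fahr_to_celsius(temp):
    # Subtract the freezing-point offset, then rescale degrees F to degrees C.
    return (temp - 32) * (5 / 9)

print('freezing point of water:', fahr_to_celsius(32), 'C')
print('boiling point of water:', fahr_to_celsius(212), 'C')
```

When the function is called with 32, the value 32 is assigned to `temp` inside the function and the result of the expression is handed back to the caller.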

![Image of function](https://swcarpentry.github.io/python-novice-inflammation/fig/python-function.svg)


Now, let's try running our function:

In [111]:
#Your code goes here

This command should call our function with 32 as the input and return the converted value.

In fact, calling our own function is no different from calling any other function:


In [112]:
#Your code goes here

We’ve successfully called the function that we defined, and we have access to the value that we returned.

### Composing Functions
Now that we’ve seen how to turn Fahrenheit into Celsius, we can also write the function to turn Celsius into Kelvin:

In [113]:
#Your code goes here

What about converting Fahrenheit to Kelvin? We could write out the formula, but we don’t need to. Instead, we can compose the two functions we have already created:

In [114]:
#Your code goes here
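
One way this composition might look is sketched below. The name `celsius_to_kelvin` is an assumption; `fahr_to_kelvin` and the variable names `temp_c`, `temp_f`, and `temp_k` follow the names used later in this lesson.

```python
def fahr_to_celsius(temp):
    return (temp - 32) * (5 / 9)

def celsius_to_kelvin(temp_c):
    return temp_c + 273.15

def fahr_to_kelvin(temp_f):
    # Reuse the two smaller conversions instead of writing a new formula.
    temp_c = fahr_to_celsius(temp_f)
    temp_k = celsius_to_kelvin(temp_c)
    return temp_k

print('boiling point of water in Kelvin:', fahr_to_kelvin(212.0))
```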

This is our first taste of how larger programs are built: we define basic operations, then combine them in ever-larger chunks to get the effect we want. Real-life functions will usually be larger than the ones shown here — typically half a dozen to a few dozen lines — but they shouldn’t ever be much longer than that, or the next person who reads it won’t be able to understand what’s going on.

### Variable Scope
In composing our temperature conversion functions, we created variables inside of those functions, **temp, temp_c, temp_f, and temp_k**. 

We refer to these variables as local variables because they no longer exist once the function is done executing. If we try to access their values outside of the function, we will encounter an error:

In [115]:
#Your code goes here

If you want to reuse the temperature in Kelvin after you have calculated it with fahr_to_kelvin, you can store the result of the function call in a variable:

In [116]:
#Your code goes here

The variable temp_kelvin, being defined outside any function, is said to be *global*.

Inside a function, one can read the value of such global variables:

In [117]:
#Your code goes here
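
The difference between local and global variables can be sketched like this. The helper `describe_temp` is a hypothetical name introduced only for this illustration.

```python
def celsius_to_kelvin(temp_c):
    temp_k = temp_c + 273.15   # temp_k is local to this function
    return temp_k

# Store the returned value in a global variable so it can be reused.
temp_kelvin = celsius_to_kelvin(100.0)

def describe_temp():
    # Reading a global variable from inside a function is allowed.
    return 'temperature in Kelvin was: ' + str(temp_kelvin)

print(describe_temp())

# Referring to temp_k at this point would raise a NameError,
# because that name only exists while celsius_to_kelvin is running.
```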

### Tidying up
Now that we know how to wrap bits of code up in functions, we can make our inflammation analysis easier to read and easier to reuse. First, let’s make a visualize function that generates our plots:

In [118]:
#Your code goes here

and another function called detect_problems that checks for the systematic problems we noticed:

In [119]:
#Your code goes here

Wait! Didn’t we forget to specify what both of these functions should return? 

Well, we didn’t. In Python, functions are not required to include a return statement and can be used for the sole purpose of grouping together pieces of code that conceptually do one thing. 

In such cases, function names usually describe what they do, e.g. visualize, detect_problems.

Notice that rather than jumbling this code together in one giant for loop, we can now read and reuse both ideas separately. We can reproduce the previous analysis with a much simpler for loop:

In [121]:
import glob
filenames = sorted(glob.glob('inflammation*.csv'))

#Your code goes here

By giving our functions human-readable names, we can more easily read and understand what is happening in the for loop. Even better, if at some later date we want to use either of those pieces of code again, we can do so in a single line.

### Testing and Documenting 
Once we start putting things in functions so that we can re-use them, we need to start testing that those functions are working correctly. To see how to do this, let’s write a function to offset a dataset so that its mean value shifts to a user-defined value:

In [122]:
#Your code goes here

We could test this on our actual data, but since we don’t know what the values ought to be, it will be hard to tell if the result was correct. Instead, let’s use NumPy to create a matrix of 0’s and then offset its values to have a mean value of 3:


In [123]:
import numpy
#Your code goes here

That looks right, so let’s try **offset_mean** on our real data:

In [124]:
#Your code goes here

It’s hard to tell from the default output whether the result is correct, but there are a few tests that we can run to reassure us:

In [125]:
#Your code goes here

That seems almost right: the original mean was about 6.1, so the lower bound from zero is now about -6.1. The mean of the offset data isn’t quite zero, but it’s pretty close. We can even go further and check that the standard deviation hasn’t changed:

In [126]:
#Your code goes here

Those values look the same, but we probably wouldn’t notice if they were different in the sixth decimal place. Let’s do this instead:

In [127]:
#Your code goes here

Everything looks good, and we should probably get back to doing our analysis. We have one more task first, though: we should write some documentation for our function to remind ourselves later what it’s for and how to use it.

The usual way to put documentation in software is to add comments like this:

In [128]:
# offset_mean(data, target_mean_value):
# return a new array containing the original data with its mean offset to match the desired value.

#Your code goes here

There’s a better way, though. If the first thing in a function is a string that isn’t assigned to a variable, that string is attached to the function as its documentation:

In [129]:
def offset_mean(data, target_mean_value):
    """Return a new array containing the original data
       with its mean offset to match the desired value."""
    #Your code goes here

This is better because we can now ask Python’s built-in help system to show us the documentation for the function:

In [130]:
#Your code goes here

A string like this is called a docstring. We don’t need to use triple quotes when we write one, but if we do, we can break the string across multiple lines:

In [131]:
def offset_mean(data, target_mean_value):
    """Return a new array containing the original data
       with its mean offset to match the desired value.

    Examples
    --------
    >>> offset_mean([1, 2, 3], 0)
    array([-1.,  0.,  1.])
    """
    #Your code goes here
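
To see how a docstring and the help system fit together, here is a sketch that runs without numpy. It is a plain-list stand-in for the worksheet's function, so it returns a list rather than an array, and its body is an assumption about what the function does.

```python
def offset_mean(data, target_mean_value):
    """Return a new list containing the original data
       with its mean offset to match the desired value."""
    # Plain-list stand-in for the numpy version used in the worksheet.
    current_mean = sum(data) / len(data)
    return [value - current_mean + target_mean_value for value in data]

# help() prints the function signature together with the docstring.
help(offset_mean)

# The same text is stored on the function as its __doc__ attribute.
print(offset_mean.__doc__)
print(offset_mean([1, 2, 3], 0))
```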

### Defining Defaults
We have passed parameters to functions in two ways: directly, as in type(data), and by name, as in numpy.loadtxt(fname='something.csv', delimiter=','). In fact, we can pass the filename to loadtxt without the fname=:

In [36]:
numpy.loadtxt('data/inflammation-01.csv', delimiter=',')

array([[0., 0., 1., ..., 3., 0., 0.],
       [0., 1., 2., ..., 1., 0., 1.],
       [0., 1., 1., ..., 2., 1., 1.],
       ...,
       [0., 1., 1., ..., 1., 1., 1.],
       [0., 0., 0., ..., 0., 2., 0.],
       [0., 0., 1., ..., 1., 1., 0.]])

but we still need to say delimiter=:

In [132]:
#Your code goes here

To understand what’s going on, and make our own functions easier to use, let’s re-define our offset_mean function like this:

In [133]:
def offset_mean(data, target_mean_value=0.0):
    """Return a new array containing the original data
       with its mean offset to match the desired value (0 by default).

    Examples
    --------
    >>> offset_mean([1, 2, 3])
    array([-1.,  0.,  1.])
    """
    #Your code goes here

The key change is that the second parameter is now written target_mean_value=0.0 instead of just target_mean_value. If we call the function with two arguments, it works as it did before:

In [134]:
#Your code goes here

But we can also now call it with just one parameter, in which case target_mean_value is automatically assigned the default value of 0.0:

In [135]:
#Your code goes here

This is handy: if we usually want a function to work one way, but occasionally need it to do something else, we can allow people to pass a parameter when they need to but provide a default to make the normal case easier. The example below shows how Python matches values to parameters:

In [136]:
#Your code goes here

As this example shows, parameters are matched up from left to right, and any that haven’t been given a value explicitly get their default value. We can override this behavior by naming the value as we pass it in:

In [137]:
#Your code goes here
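
The matching rules can be illustrated with a small hypothetical function, `display`, whose three parameters all have defaults. It returns its text instead of printing it so that the result is easy to inspect.

```python
def display(a=1, b=2, c=3):
    # Build a string that shows which value each parameter received.
    return 'a: ' + str(a) + ' b: ' + str(b) + ' c: ' + str(c)

print('no parameters:  ', display())          # all defaults
print('one parameter:  ', display(55))        # 55 goes to a
print('two parameters: ', display(55, 66))    # 55 to a, 66 to b
print('only setting c: ', display(c=77))      # a and b keep their defaults
```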

With that in hand, let’s look at the help for numpy.loadtxt:

In [138]:
#Your code goes here

There’s a lot of information here, but the most important part is the first couple of lines:

This tells us that loadtxt has one parameter called fname that doesn’t have a default value, and eight others that do. If we call the function like this:

then the filename is assigned to fname (which is what we want), but the delimiter string ',' is assigned to dtype rather than delimiter, because dtype is the second parameter in the list. However ',' isn’t a known dtype so our code produced an error message when we tried to run it. When we call loadtxt we don’t have to provide fname= for the filename because it’s the first item in the list, but if we want the ',' to be assigned to the variable delimiter, we do have to provide delimiter= for the second parameter since delimiter is not the second parameter in the list.

### Readable functions
Consider these two functions:

In [None]:
def s(p):
    a = 0
    for v in p:
        a += v
    m = a / len(p)
    d = 0
    for v in p:
        d += (v - m) * (v - m)
    return numpy.sqrt(d / (len(p) - 1))

def std_dev(sample):
    sample_sum = 0
    for value in sample:
        sample_sum += value

    sample_mean = sample_sum / len(sample)

    sum_squared_devs = 0
    for value in sample:
        sum_squared_devs += (value - sample_mean) * (value - sample_mean)

    return numpy.sqrt(sum_squared_devs / (len(sample) - 1))

The functions **s** and **std_dev** are computationally equivalent (they both calculate the sample standard deviation), but to a human reader, they look very different. You probably found std_dev much easier to read and understand than s.

As this example illustrates, both documentation and a programmer’s coding style combine to determine how easy it is for others to read and understand the programmer’s code. Choosing meaningful variable names and using blank spaces to break the code into logical “chunks” are helpful techniques for producing readable code. This is useful not only for sharing code with others, but also for the original programmer. If you need to revisit code that you wrote months ago and haven’t thought about since then, you will appreciate the value of readable code!

### Return vs Print
Note that return and print are not interchangeable. print is a Python function that prints data to the screen. It enables us, the users, to see the data. A return statement, on the other hand, makes data visible to the program. Let’s have a look at the following function:


In [139]:
#Your code goes here

Python will first execute the function add with a = 7 and b = 3, and, therefore, print 10. However, because function add does not have a line that starts with return (no return “statement”), it will, by default, return nothing which, in Python world, is called None. Therefore, A will be assigned to None and the last line (print(A)) will print None.
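
A sketch of the behaviour described above follows. The function `add` and the variable `A` follow the names in the text; `add_and_return` and `B` are hypothetical additions for comparison.

```python
def add(a, b):
    print(a + b)   # shows the sum on screen, but does not hand it back

A = add(7, 3)      # prints 10; add has no return statement
print(A)           # prints None

def add_and_return(a, b):   # hypothetical variant that returns the sum
    return a + b

B = add_and_return(7, 3)
print(B)           # prints 10
```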

### Check your Understanding
“Adding” two strings produces their concatenation: 'a' + 'b' is 'ab'. Write a function called fence that takes two parameters called original and wrapper and returns a new string that has the wrapper character at the beginning and end of the original. A call to your function should look like this:


In [140]:
#Your code goes here

## Errors and Exceptions
Every programmer encounters errors, both those who are just beginning, and those who have been programming for years. Encountering errors and exceptions can be very frustrating at times, and can make coding feel like a hopeless endeavour. However, understanding what the different types of errors are and when you are likely to encounter them can help a lot. Once you know why you get certain types of errors, they become much easier to fix.

Errors in Python have a very specific form, called a traceback. Let’s examine one:

In [170]:
#Your code goes here

This particular traceback has two levels. You can determine the number of levels by looking for the number of arrows on the left hand side. In this case:

- The first shows code from the cell above, with an arrow pointing to **Line 10** (which is favorite_ice_cream()).

- The second shows some code in the function favorite_ice_cream, with an arrow pointing to **Line 8** (which is print(ice_creams[3])).

The last level is the actual place where the error occurred. The other level(s) show what function the program executed to get to the next level down. So, in this case, the program first performed a function call to the function favorite_ice_cream. Inside this function, the program encountered an error on Line 8, when it tried to run the code print(ice_creams[3]).

___
### Long Tracebacks
Sometimes, you might see a traceback that is very long -- sometimes they might even be 20 levels deep! This can make it seem like something horrible happened, but the length of the error message does not reflect severity, rather, it indicates that your program called many functions before it encountered the error. Most of the time, the actual place where the error occurred is at the bottom-most level, so you can skip down the traceback to the bottom.
___

So what error did the program actually encounter? In the last line of the traceback, Python helpfully tells us the category or type of error (in this case, it is an IndexError) and a more detailed error message (in this case, it says “list index out of range”).

If you encounter an error and don’t know what it means, it is still important to read the traceback closely. That way, if you fix the error, but encounter a new one, you can tell that the error changed. Additionally, sometimes knowing where the error occurred is enough to fix it, even if you don’t entirely understand the message.

If you do encounter an error you don’t recognize, try looking at the [official documentation](https://docs.python.org/3/library/exceptions.html) on errors. However, note that you may not always be able to find the error there, as it is possible to create custom errors. In that case, hopefully the custom error message is informative enough to help you figure out what went wrong.

### Syntax Errors
When you forget a colon at the end of a line, accidentally add one space too many when indenting under an if statement, or forget a parenthesis, you will encounter a syntax error. This means that Python couldn’t figure out how to read your program. This is similar to forgetting punctuation in English: for example, this text is difficult to read there is no punctuation there is also no capitalization why is this hard because you have to figure out where each sentence ends you also have to figure out where each sentence begins to some extent it might be ambiguous if there should be a sentence break or not

People can typically figure out what is meant by text with no punctuation, but people are much smarter than computers. If Python doesn’t know how to read the program, it will give up and inform you with an error. For example:

In [None]:
def some_function()
    msg = 'hello, world!'
    print(msg)
     return msg

Here, Python tells us that there is a SyntaxError on line 1, and even puts a little arrow in the place where there is an issue. In this case the problem is that the function definition is missing a colon at the end.

Actually, the function above has two issues with syntax. If we fix the problem with the colon, we see that there is also an IndentationError, which means that the lines in the function definition do not all have the same indentation:



In [None]:
def some_function():
    msg = 'hello, world!'
    print(msg)
     return msg

Both SyntaxError and IndentationError indicate a problem with the syntax of your program, but an IndentationError is more specific: it always means that there is a problem with how your code is indented.

### Variable Name Errors
Another very common type of error is called a NameError, and occurs when you try to use a variable that does not exist. For example:

In [143]:
#Your code goes here

Variable name errors come with some of the most informative error messages, which are usually of the form “name ‘the_variable_name’ is not defined”.

Why does this error message occur? That’s a harder question to answer, because it depends on what your code is supposed to do. However, there are a few very common reasons why you might have an undefined variable. The first is that you meant to use a string, but forgot to put quotes around it:

In [144]:
#Your code goes here

The second reason is that you might be trying to use a variable that does not yet exist. In the following example, count should have been defined (e.g., with count = 0) before the for loop:

In [145]:
#Your code goes here

Finally, the third possibility is that you made a typo when you were writing your code. Let’s say we fixed the error above by adding the line Count = 0 before the for loop. Frustratingly, this actually does not fix the error. Remember that variables are case-sensitive, so the variable count is different from Count. We still get the same error, because we still have not defined count:

In [146]:
#Your code goes here

### Index Errors
Next up are errors having to do with containers (like lists and strings) and the items within them. If you try to access an item in a list or a string that does not exist, then you will get an error. This makes sense: if you asked someone what day they would like to get coffee, and they answered “caturday”, you might be a bit annoyed. Python gets similarly annoyed if you try to ask it for an item that doesn’t exist:

In [147]:
#Your code goes here

Here, Python is telling us that there is an IndexError in our code, meaning we tried to access a list index that did not exist.
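
A small sketch of this situation, using a hypothetical three-item list. The try/except wrapper is not part of this lesson; it is used here only so that the sketch runs to completion instead of stopping at the error.

```python
letters = ['a', 'b', 'c']   # valid indices are 0, 1 and 2

print('Letter #1 is', letters[0])
print('Letter #3 is', letters[2])

# try/except is used here only so the sketch runs to completion.
try:
    print('Letter #4 is', letters[3])   # there is no item at index 3
except IndexError as error:
    error_message = str(error)
    print('IndexError:', error_message)
```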

### File Errors

The last type of error we’ll cover today is the kind associated with reading and writing files. If you try to read a file that does not exist, you will receive a FileNotFoundError telling you so. If you attempt to write to a file that was opened read-only, Python 3 raises an UnsupportedOperation error (io.UnsupportedOperation). More generally, problems with input and output manifest as OSErrors, which may show up as a more specific subclass; you can see the [list in the Python docs](https://docs.python.org/3/library/exceptions.html#os-exceptions). Each of them carries a UNIX errno, which you can see in the error message.


In [148]:
#Your code goes here

One reason for receiving this error is that you specified an incorrect path to the file. For example, if I am currently in a folder called myproject, and I have a file in myproject/writing/myfile.txt, but I try to open myfile.txt, this will fail. The correct path would be writing/myfile.txt. It is also possible that the file name or its path contains a typo.

A related issue can occur if you use the “read” flag instead of the “write” flag. Python will not give you an error if you try to open a file for writing when the file does not exist. However, if you meant to open a file for reading, but accidentally opened it for writing, and then try to read from it, you will get an UnsupportedOperation error telling you that the file was not opened for reading:

In [149]:
#Your code goes here
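
Both file errors can be reproduced safely inside a temporary folder, as in the sketch below. The file name is hypothetical, and try/except is used only so that the sketch runs to completion and can show which error type was raised.

```python
import io
import os
import tempfile

# A fresh temporary folder guarantees that the file does not exist yet.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'myfile.txt')   # hypothetical file name

    # Reading a file that does not exist raises FileNotFoundError.
    try:
        open(path, 'r')
    except FileNotFoundError as error:
        missing_error = type(error).__name__
        print(missing_error, 'with errno', error.errno)

    # Opening for writing creates the file, but reading from it then fails.
    with open(path, 'w') as handle:
        try:
            handle.read()
        except io.UnsupportedOperation as error:
            read_error = type(error).__name__
            print(read_error + ':', error)
```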

These are the most common errors with files, though many others exist. If you get an error that you’ve never seen before, searching the Internet for that error type often reveals common reasons why you might get that error.

### Check your Understanding
1. Read the code below, and (without running it) try to identify what the errors are.
2. Run the code, and read the error message. What type of NameError do you think this is? In other words, is it a string with no quotes, a misspelled variable, or a variable that should have been defined but was not?
3. Fix the error.
4. Repeat steps 2 and 3, until you have fixed all the errors.

In [None]:
for number in range(10):
    # use a if the number is a multiple of 3, otherwise use b
    if (Number % 3) == 0:
        message = message + a
    else:
        message = message + 'b'
print(message)

## Defensive Programming
So far we have learned the basic tools of programming: variables and lists, file I/O, loops, conditionals, and functions. What we haven’t yet seen is how to tell whether a program is getting the right answer, and how to tell if it’s still getting the right answer as we make changes to it.

To achieve that, we need to:

- Write programs that check their own operation.
- Write and run tests for widely-used functions.
- Make sure we know what “correct” actually means.
  
The good news is, doing these things will speed up our programming, not slow it down!

### Assertions
The first step toward getting the right answers from our programs is to assume that mistakes will happen and to guard against them. This is called defensive programming, and the most common way to do it is to add assertions to our code so that it checks itself as it runs. 

An assertion is simply a statement that something must be true at a certain point in a program. When Python sees one, it evaluates the assertion’s condition. If it’s true, Python does nothing, but if it’s false, Python halts the program immediately and prints the error message if one is provided.

For example, let's write code that calculates the sum of a list of positive numbers, asserting that each number is greater than zero, and then prints the total sum. The code will halt as soon as the loop encounters a value that isn’t positive:

In [150]:
#Your code goes here
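
A sketch of such a loop, with assumed data values and an assumed error message. Normally the failed assertion would stop the program with an AssertionError; the try/except wrapper is added here only so that the sketch can show the message and the partial total.

```python
numbers = [1.5, 2.3, 0.7, -0.001, 4.4]   # assumed example data
total = 0.0

try:
    for num in numbers:
        # Halts the loop at the first value that is not positive.
        assert num > 0.0, 'Data should only contain positive values'
        total += num
    print('total is:', total)
except AssertionError as error:
    message = str(error)
    print('AssertionError:', message)
    print('sum of the values seen before the failure:', total)
```

The loop never reaches 4.4, because the assertion fails on -0.001 first.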

Programs like the Firefox browser are full of assertions: 10–20% of the code they contain is there to check that the other 80–90% is working correctly. Broadly speaking, assertions fall into three categories:

- A precondition is something that must be true at the start of a function in order for it to work correctly.

- A postcondition is something that the function guarantees is true when it finishes.

- An invariant is something that is always true at a particular point inside a piece of code.

For example, suppose we are representing rectangles using a tuple of four coordinates (x0, y0, x1, y1), representing the lower left and upper right corners of the rectangle. In order to do some calculations, we need to normalize the rectangle so that the lower left corner is at the origin and the longest side is 1.0 units long. 

The following function will do this, but checks that its input is correctly formatted and that its result makes sense:

In [151]:
def normalize_rectangle(rect):
    """Normalizes a rectangle so that it is at the origin and 1.0 units long on its longest axis.
    Input should be of the format (x0, y0, x1, y1).
    (x0, y0) and (x1, y1) define the lower left and upper right corners
    of the rectangle, respectively."""
    
    #Your code goes here


The preconditions on lines 6, 8, and 9 catch invalid inputs:

In [153]:
# missing the fourth coordinate
#Your code goes here

In [154]:
# X axis inverted
#Your code goes here

The postconditions on lines 20 and 21 help us catch bugs by telling us when our calculations might have been incorrect. For example, if we normalize a rectangle that is taller than it is wide everything seems OK:

In [155]:
#Your code goes here

but if we normalize one that’s wider than it is tall, the assertion is triggered:

In [156]:
#Your code goes here

Re-reading the function, we realize that on line 14, we should divide the height by the width (dy/dx) instead of the width by the height (dx/dy). If we hadn’t included the final assertion checks, the function would have returned a result that looked right but was actually incorrect. Debugging this mistake would have taken longer than simply adding the assertion checks, which help catch errors early.
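
For comparison, here is a sketch of the function with that fix applied. The assertion messages and the exact layout are assumptions, so the line numbers mentioned in the text refer to the worksheet's own cell rather than to this sketch.

```python
def normalize_rectangle(rect):
    """Normalizes a rectangle so that it is at the origin
    and 1.0 units long on its longest axis.
    Input should be of the format (x0, y0, x1, y1)."""
    # Preconditions: reject malformed input before doing any arithmetic.
    assert len(rect) == 4, 'Rectangles must contain 4 coordinates'
    x0, y0, x1, y1 = rect
    assert x0 < x1, 'Invalid X coordinates'
    assert y0 < y1, 'Invalid Y coordinates'

    dx = x1 - x0
    dy = y1 - y0
    if dx > dy:
        # Wider than tall: the width becomes 1.0 and the height shrinks.
        upper_x, upper_y = 1.0, dy / dx
    else:
        # Taller than wide (or square): the height becomes 1.0.
        upper_x, upper_y = dx / dy, 1.0

    # Postconditions: the result must fit inside the unit square.
    assert 0 < upper_x <= 1.0, 'Calculated upper X coordinate invalid'
    assert 0 < upper_y <= 1.0, 'Calculated upper Y coordinate invalid'

    return (0, 0, upper_x, upper_y)

print(normalize_rectangle((0.0, 0.0, 1.0, 5.0)))   # taller than wide
print(normalize_rectangle((0.0, 0.0, 5.0, 1.0)))   # wider than tall
```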

But assertions aren’t just about catching errors: they also help people understand programs. Each assertion gives the person reading the program a chance to check (consciously or otherwise) that their understanding matches what the code is doing.

Most good programmers follow two rules when adding assertions to their code. The first is, fail early, fail often. The greater the distance between when and where an error occurs and when it’s noticed, the harder the error will be to debug, so good code catches mistakes as early as possible.

The second rule is to turn bugs into assertions or tests. When you fix a bug, write an assertion that will catch the same mistake if it happens again. Mistakes often occur in similar places, or could be repeated when you change the code later. Adding assertions helps prevent old problems from coming back and can warn future programmers (including yourself) that this part of the code is tricky. It saves time by catching errors early.

### Test-Driven Development
An assertion checks that something is true at a particular point in a program. The next step is to check the overall behavior of a piece of code, that is, to make sure it produces the right output for a specific input. For example, suppose we need to find where two or more time series overlap. Each series is represented by a pair of numbers, its start and end times, and the goal is to find the largest time range that all of them share.

![graph](https://swcarpentry.github.io/python-novice-inflammation/fig/python-overlapping-ranges.svg)

Many beginners would solve this by writing a function, testing it a few times, and fixing it if necessary. However, a better approach is test-driven development (TDD). With TDD, you write small tests first, then create the function to pass those tests. This helps avoid confirmation bias and clarifies what the function should actually do.

We start by defining an empty function range_overlap:

In [67]:
def range_overlap(ranges):
    pass

Next, we create some test cases to check how the function should work:

In [157]:
#Your code goes here

The error is actually reassuring: we haven’t implemented any logic in range_overlap yet, so if the tests had passed, it would mean we had written an entirely ineffective test.

And as a bonus of writing these tests, we’ve implicitly defined what our input and output look like: we expect a list of pairs as input, and produce a single pair as output.

We also need a test for when the ranges don’t overlap, like this:

In [158]:
#Your code goes here

In these cases, we expect the function to return None because there’s no overlap.

Again, we get an error because we haven’t written our function, but we’re now ready to do so:

In [160]:
#Your code goes here

We calculate the left endpoint of the overlap by finding the maximum of all the starting points because the overlap can only start at the latest of those points. Similarly, we calculate the right endpoint by finding the minimum of the ending points because the overlap can only end at the earliest of those points.
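To make the idea concrete, here is a small worked example using Python's built-in `max` and `min` on three ranges chosen for illustration. It shows only the endpoint calculation, not a complete function:

```python
# Three example ranges, each a (start, end) pair
ranges = [(0.0, 1.0), (0.0, 2.0), (-1.0, 1.0)]

# The overlap cannot start before the latest start ...
left = max(start for start, end in ranges)
# ... and cannot extend past the earliest end
right = min(end for start, end in ranges)

print(left, right)  # prints: 0.0 1.0
```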

To make running the tests easier, let’s put them all in a function:

In [161]:
#Your code goes here

We can now test range_overlap with a single function call:

In [162]:
#Your code goes here

The first test that was supposed to produce None fails, so we know something is wrong with our function. We don’t know whether the other tests passed or failed because Python halted the program as soon as it spotted the first error. 

Still, some information is better than none, and if we trace the behavior of the function with that input, we realize that the mistake in the **range_overlap function** comes from the way the variables max_left and min_right are initialized. They are both set to fixed values, 0.0 and 1.0, instead of being initialized based on the actual data from the input ranges.

For example, in the first test, the function should return None because there is no overlap between the intervals (0.0, 1.0) and (5.0, 6.0). However, since max_left starts at 0.0 and min_right starts at 1.0, the function mistakenly calculates an overlap even when there shouldn't be one.

This highlights a programming principle: you should always initialize variables based on the input data, not with fixed values that might not fit all cases.
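One possible fix is sketched below under two assumptions: the input list is never empty, and ranges that merely touch at a single point count as not overlapping. The name `range_overlap_sketch` is used to avoid implying that this is the lesson's own solution:

```python
def range_overlap_sketch(ranges):
    """Return the common overlap of a list of (low, high) pairs, or None."""
    # Initialize from the data: start with the first range, not fixed values
    max_left, min_right = ranges[0]

    for low, high in ranges[1:]:
        max_left = max(max_left, low)
        min_right = min(min_right, high)

    # Assumption: an empty or single-point intersection means "no overlap"
    if max_left >= min_right:
        return None
    return (max_left, min_right)


no_overlap = range_overlap_sketch([(0.0, 1.0), (5.0, 6.0)])
shared = range_overlap_sketch([(0.0, 1.0), (0.0, 2.0), (-1.0, 1.0)])
```

Because the starting values now come from the first range, the result no longer depends on whether the data happen to lie near 0.0 and 1.0.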

## Debugging 
Once testing has uncovered problems, the next step is to fix them. Many novices do this by making more-or-less random changes to their code until it seems to produce the right answer, but that’s very inefficient (and the result is usually only correct for the one case they’re testing). The more experienced a programmer is, the more systematically they debug, and most follow some variation on the rules explained below.



### Know What It’s Supposed to Do
The first step in debugging is understanding the expected output. Without this, it’s difficult to identify or fix issues. Writing a test case that specifies what inputs produce what results helps guide the debugging process. If we can't define what the correct output looks like, it’s hard to know when the issue is fixed.

Debugging scientific software is often more challenging because the output might be unknown, which is why we're running the analysis in the first place. Common strategies include:

- Test with simplified data: Use small, simple datasets where you can manually calculate the correct result (e.g., for one or two records).

- Test simplified cases: Start with basic, stripped-down versions of your model.

- Compare to an oracle: An oracle is a trusted source (e.g., existing software, experimental data) that gives you a reference for correct results.

- Check conservation laws: Ensure that quantities like mass, energy, or the number of data records remain consistent unless a change is expected.

- Visualize: Use visualizations to spot discrepancies, but this should be a last resort, as comparing visualizations automatically is difficult.
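Two of these strategies can be sketched in a few lines. The functions `mean_value` and `split_by_threshold` are hypothetical helpers invented for this illustration:

```python
def mean_value(values):
    """Arithmetic mean of a non-empty list (hypothetical helper)."""
    return sum(values) / len(values)


def split_by_threshold(records, threshold):
    """Split records into those below and those at or above a threshold."""
    low = [r for r in records if r < threshold]
    high = [r for r in records if r >= threshold]
    # Conservation check: no record may be lost or duplicated by the split
    assert len(low) + len(high) == len(records), 'Record count changed'
    return low, high


# Simplified data: the mean of 1, 2, 3 is 2, which is easy to verify by hand
assert mean_value([1.0, 2.0, 3.0]) == 2.0

low, high = split_by_threshold([3, 8, 1, 5], 5)
```

If the hand-checked case fails, the problem is in the code rather than in the full dataset.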

Before you start debugging, make sure you know what your program is supposed to do. Write test cases to check if the output matches what you expect.

In [163]:
# Function to add two numbers
#Your code goes here

# Test case
#Your code goes here

If the test fails, the program will let you know, and you can start debugging from there.

### Make It Fail Every Time
When debugging code, it's crucial to find a consistent test case that causes the error to happen every time. If a bug only appears occasionally (an intermittent issue), it becomes much harder to track down and fix. Imagine having to run your program multiple times, hoping for the error to show up just once — it’s frustrating and time-consuming. The key to effective debugging is ensuring that the bug occurs reliably, so you can focus on fixing it without wasting time.

If the failure only happens randomly, you might not catch it during testing. You could scroll past the output where the bug occurs, or worse, you might think you’ve fixed the issue just because the bug didn’t show up during a particular run. By making sure the failure occurs every time, you create a solid foundation for investigating the cause.
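When the intermittent behavior comes from random numbers, one common way to make it repeatable is to fix the seed of the random number generator. The sketch below uses a hypothetical function, `flaky_ratio`, that fails only on some random draws. It searches for a seed that triggers the failure and then replays it:

```python
import random


def flaky_ratio(rng):
    """Hypothetical function that fails only when the random draw is zero."""
    denominator = rng.randint(0, 3)
    return 1 / denominator


def find_failing_seed(max_seed=1000):
    """Return the first seed for which flaky_ratio raises, or None."""
    for seed in range(max_seed):
        try:
            flaky_ratio(random.Random(seed))
        except ZeroDivisionError:
            return seed
    return None


seed = find_failing_seed()

# Replaying the same seed reproduces the failure on every run
failed_again = False
if seed is not None:
    try:
        flaky_ratio(random.Random(seed))
    except ZeroDivisionError:
        failed_again = True
```

Once a failing seed is known, it can be written into a test so the error shows up on demand instead of by chance.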

Let’s say we have a simple function that divides two numbers:



In [164]:
#Your code goes here

If we try to divide by zero, Python will raise a ZeroDivisionError, but let’s say this error only happens occasionally because of how the function is called.

- **Bad debugging**: If we only sometimes test with zero, the failure may not always occur, making it hard to identify the problem.

- **Good debugging**: We can create a test that always triggers the error.

In [166]:
#Your code goes here

By making it fail consistently, we can easily debug and fix it, like adding a check for zero before dividing:

In [167]:
#Your code goes here

Now, we’ve handled the error. Testing will no longer fail, and the bug is fixed.
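One way such a check might look is sketched below. Returning `None` for a zero divisor is an assumed convention chosen for illustration, and raising a more descriptive error would be an equally reasonable design. The name `safe_divide_sketch` is hypothetical:

```python
def safe_divide_sketch(a, b):
    """Divide a by b, returning None when b is zero (assumed convention)."""
    if b == 0:
        return None
    return a / b


# The input that used to fail is now pinned down as a permanent test
assert safe_divide_sketch(10, 2) == 5.0
assert safe_divide_sketch(10, 0) is None
```

Keeping the zero case in the test suite means the bug cannot quietly return if the function is changed later.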

### Make It Fail Fast
To debug efficiently, your code should fail quickly. If a bug takes too long to show up, you waste valuable time waiting instead of diagnosing the problem. The goal is to create tests that reveal errors as soon as possible.

**Key Points**
- **Quick Failures**: Design tests that trigger errors immediately, so you can quickly identify the source of the problem.
- **Simplify Tests**: Instead of running complex scenarios, start with straightforward inputs that will fail fast.

Imagine we have a function that calculates the average of a list:

In [168]:
#Your code goes here

Solution:

In [169]:
#Your code goes here

**Why This is Effective**
- **Immediate Feedback**: If you run the above code, it raises a ValueError right away, alerting you that you cannot calculate an average from an empty list.
- **Focused Debugging**: You can address the issue of handling empty lists without needing to wait for the function to process data.


By designing your tests to fail quickly, you can efficiently identify and fix issues, making the debugging process smoother and faster.
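A minimal sketch of this fail-fast pattern, assuming the average is computed over a plain Python list. The name `average_sketch` and the error message are illustrative:

```python
def average_sketch(numbers):
    """Return the mean of a list, failing immediately on empty input."""
    # Check the problem case first so the error appears before any work is done
    if len(numbers) == 0:
        raise ValueError('Cannot compute the average of an empty list')
    return sum(numbers) / len(numbers)


# The smallest input that triggers the problem makes the test fail fast
try:
    average_sketch([])
    error_message = None
except ValueError as error:
    error_message = str(error)

typical = average_sketch([1, 2, 3])
```

An empty list is the quickest input to try, so the error appears right away instead of after a long run on real data.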

### Change One Thing at a Time, For a Reason
When debugging, it's crucial to change one thing at a time for several reasons:

- **Isolation of Issues**: Modifying a single aspect allows you to clearly see its effect, making it easier to identify what fixes the problem.

- **Avoiding New Bugs**: Changing multiple parts of the code simultaneously increases the risk of introducing new issues, complicating the debugging process.

- **Simplified Testing**: After each change, you can run tests to verify the fix, ensuring that your code functions as intended before moving on.

- **Building Understanding**: This approach deepens your understanding of how different parts of your code interact, helping you learn from mistakes and successes.

- **Documentation and Communication**: It makes it easier to communicate changes with team members, providing clear documentation of your debugging process.

In essence, changing one thing at a time leads to more efficient and effective debugging, resulting in cleaner and more reliable code.

### Keep Track of What You’ve Done
Keeping a record of your debugging process is crucial for several reasons:

- **Reproducibility**: It allows you to repeat tests and revisit changes, ensuring a clear understanding of what worked and what didn’t.

- **Efficiency**: Tracking your efforts prevents wasting time on the same issues by reminding you of past tests and their outcomes.

- **Clarity**: Detailed notes help clarify your thought process and identify patterns, reducing confusion when solving complex problems.

- **Collaboration**: Clear records facilitate communication when seeking help from others, making it easier for them to understand the problem and provide useful guidance.

- **Learning**: Documenting your debugging efforts enables you to learn from past mistakes and improve your coding skills over time.

In essence, systematic record-keeping enhances debugging effectiveness and contributes to your growth as a programmer.

### Be Humble
If you can’t find a bug in 10 minutes, ask for help. Explaining the issue to someone else often clarifies your thinking. Others, not emotionally attached to the code, can also spot errors more easily.

Learning from mistakes is key. Programmers tend to repeat the same errors, so identifying why an issue occurred helps prevent it in the future. Breaking code into smaller, testable pieces and learning from mistakes makes coding faster and more efficient in the long run.