# More Tips for Better Coding

## Introduction

This chapter covers more best-practice tips for coding, although the advice is geared towards what is achievable and pragmatic rather than what is obtainable by a superhuman coding machine. Here, we'll cover practical matters such as debugging code and logging.

## Debugging code

Computers are *very* literal, so literal that unless you're perfectly precise about what you want, they will end up doing something different. When that happens, one of the most difficult issues in programming is to understand *why* the code isn't doing what you expected. When the code doesn't do what we expect, it's called a bug.

Bugs could be fundamental issues with the code you're using (in fact, the term is said to have originated because of a moth causing a problem in an early computer) and, if you find one of these, you should file an issue with the maintainers of the code. However, what's much more likely is that the instructions you gave aren't quite what is needed to produce the outcome that you want. In this case, you might need to *debug* the code: to find out which part of it isn't doing what you expect.

Even with a small code base, it can be tricky to track down where the bug is. But don't fear: there are tools on hand to help you find it.

### Print statements

The simplest, and I'm afraid to say the most common, way to debug code is to plonk `print` statements in the code. Let's take a common example in which we perform some simple array operations, here multiplying an array and then summing it with another array:

```python
import numpy as np


def array_operations(in_arr_one, in_arr_two):
    out_arr = in_arr_one*1.5
    out_arr = out_arr + in_arr_two
    return out_arr


in_vals_one = np.array([3, 2, 5, 16, '7', 8, 9, 22])
in_vals_two = np.array([4, 7, 3, 23, 6, 8, 0])

result = array_operations(in_vals_one, in_vals_two)
result
```

Oh no! We've got a `UFuncTypeError` here, perhaps not the most illuminating error message we've ever seen. We'd like to know what's going wrong here. The `Traceback` did give us a hint about where the issue occurred though; it happens in the multiplication line of the function we wrote.

To debug the error with print statements, we might re-run the code like this:

```python
def array_operations(in_arr_one, in_arr_two):
    print(f'in_arr_one is {in_arr_one}')
    out_arr = in_arr_one*1.5
    out_arr = out_arr + in_arr_two
    return out_arr


in_vals_one = np.array([3, 2, 5, 16, '7', 8, 9, 22])
in_vals_two = np.array([4, 7, 3, 23, 6, 8, 0])

result = array_operations(in_vals_one, in_vals_two)
result
```

What can we tell from the values of `in_arr_one` that are now being printed? Well, they seem to have quote marks around them and what that means is that they're strings, *not* floating point numbers or integers! Multiplying a string by 1.5 doesn't make sense here, so that's our error. If we did this, we might then trace the origin of that array back to find out where it was defined and see that instead of `np.array([3, 2, 5, 16, 7, 8, 9, 22])` being declared, we have `np.array([3, 2, 5, 16, '7', 8, 9, 22])` instead and `numpy` decides to cast the whole array as a string to ensure consistency.
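Rather than reading quote marks off a printed array, you can also ask the array directly what it holds. The following is a minimal sketch, assuming the same two versions of the input array as above; the variable names `mixed`, `clean`, and `as_numbers` are made up for illustration.

```python
import numpy as np

mixed = np.array([3, 2, 5, 16, '7', 8, 9, 22])  # one stray string
clean = np.array([3, 2, 5, 16, 7, 8, 9, 22])    # all integers

# dtype.kind is a one-letter code: 'U' means Unicode strings,
# 'i' means signed integers, 'f' means floating point numbers
print(mixed.dtype.kind)
print(clean.dtype.kind)

# If every string really does represent a number, an explicit
# conversion makes the multiplication possible again
as_numbers = mixed.astype(float)
print(as_numbers * 1.5)
```

Converting with `astype` only papers over the problem, of course; fixing the stray `'7'` where the array is defined is usually the better cure.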

Let's fix that problem by turning `'7'` into `7` and run it again:

```python
def array_operations(in_arr_one, in_arr_two):
    out_arr = in_arr_one*1.5
    out_arr = out_arr + in_arr_two
    return out_arr


in_vals_one = np.array([3, 2, 5, 16, 7, 8, 9, 22])
in_vals_two = np.array([4, 7, 3, 23, 6, 8, 0])

result = array_operations(in_vals_one, in_vals_two)
result
```

Still not working! But we've moved on to a different error now. We can still use a print statement to debug this one, which seems to be related to the shapes of variables passed into the function:

```python
def array_operations(in_arr_one, in_arr_two):
    print(f'in_arr_one shape is {in_arr_one.shape}')
    out_arr = in_arr_one*1.5
    print(f'intermediate out_arr shape is {out_arr.shape}')
    print(f'in_arr_two shape is {in_arr_two.shape}')
    out_arr = out_arr + in_arr_two
    return out_arr


in_vals_one = np.array([3, 2, 5, 16, 7, 8, 9, 22])
in_vals_two = np.array([4, 7, 3, 23, 6, 8, 0])

result = array_operations(in_vals_one, in_vals_two)
result
```

The print statement now tells us the shapes of the arrays as we go through the function. We can see that in the line before the `return` statement the two arrays that are being combined using the `+` operator don't have the same shape, so we're effectively adding two vectors from two differently dimensioned vector spaces and, understandably, we are being called out on our nonsense. To fix this problem, we would have to ensure that the input arrays are the same shape (it looks like we may have just missed a value from `in_vals_two`).
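Once you know that mismatched shapes are the culprit, one option is to have the function check for them and complain in plain language. This is only a sketch: `array_operations_checked` is a hypothetical variant of the function above, and insisting on identical shapes is an assumption that rules out **numpy**'s broadcasting, which you may sometimes want to keep.

```python
import numpy as np


def array_operations_checked(in_arr_one, in_arr_two):
    # Hypothetical variant of array_operations that validates shapes up front
    if in_arr_one.shape != in_arr_two.shape:
        raise ValueError(
            f"Shape mismatch: in_arr_one has shape {in_arr_one.shape}, "
            f"in_arr_two has shape {in_arr_two.shape}"
        )
    return in_arr_one * 1.5 + in_arr_two


in_vals_one = np.array([3, 2, 5, 16, 7, 8, 9, 22])
in_vals_two = np.array([4, 7, 3, 23, 6, 8, 0])

try:
    array_operations_checked(in_vals_one, in_vals_two)
except ValueError as err:
    # The message names both shapes, so no extra print statements are needed
    message = str(err)
    print(message)
```

The error now says which shapes clashed, which saves a round of adding and removing `print` statements the next time it happens.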

`print` statements are great for a quick bit of debugging and you are likely to want to use them more frequently than any other debugging tool. However, for debugging complex, nested code they aren't always very efficient: you will sometimes feel like you are playing battleships, continually refining where they should go until you have pinpointed the actual problem. Fortunately, there are other tools in the debugging toolbox...


### Icecream and better print statements

Typing `print` statements with arguments that help you debug code can become tedious. There are better ways to work, which we'll come to, but we must also recognise that `print` is used widely in practice. So what if we had a function that was as easy to use as `print` but better geared toward debugging? Well, we do: it's called **icecream**, and it's available in most major languages, including Python, Dart, Rust, JavaScript, C++, PHP, Go, Ruby, and Java.

Let's take an example from earlier in this chapter, where we used a `print` statement to display the contents of `in_arr_one` in advance of the line that caused an error being run. All we will do now is switch out `print(f'in_arr_one is {in_arr_one}')` for `ic(in_arr_one)`.

```python
from icecream import ic

def array_operations(in_arr_one, in_arr_two):
    # Old debug line using `print`
    # print(f'in_arr_one is {in_arr_one}')
    # new debug line:
    ic(in_arr_one)
    out_arr = in_arr_one*1.5
    out_arr = out_arr + in_arr_two
    return out_arr


in_vals_one = np.array([3, 2, 5, 16, '7', 8, 9, 22])
in_vals_two = np.array([4, 7, 3, 23, 6, 8, 0])

array_operations(in_vals_one, in_vals_two)
```

```
ic| in_arr_one: array(['3', '2', '5', '16', '7', '8', '9', '22'], dtype='<U21')
---------------------------------------------------------------------------
UFuncTypeError                            Traceback (most recent call last)
<ipython-input-6-9efd5fc1a1fe> in <module>
     14 in_vals_two = np.array([4, 7, 3, 23, 6, 8, 0])
     15 
---> 16 array_operations(in_vals_one, in_vals_two)

<ipython-input-6-9efd5fc1a1fe> in array_operations(in_arr_one, in_arr_two)
      6     # new debug line:
      7     ic(in_arr_one)
----> 8     out_arr = in_arr_one*1.5
      9     out_arr = out_arr + in_arr_two
     10     return out_arr

UFuncTypeError: ufunc 'multiply' did not contain a loop with signature matching types (dtype('<U32'), dtype('<U32')) -> dtype('<U32')
```

What we get in terms of debugging output is `ic| in_arr_one: array(['3', '2', '5', '16', '7', '8', '9', '22'], dtype='<U21')`, which is quite similar to before apart from three important differences, all of which are advantages:

1. it is easier and quicker to write `ic(in_arr_one)` than `print(f'in_arr_one is {in_arr_one}')`

2. **icecream** automatically picks up the name of the variable, `in_arr_one`, and clearly displays its contents

3. **icecream** shows us that `in_arr_one` is of `type` array and that it has the `dtype` of `U`, which stands for Unicode (i.e. a string). `<U21` just means that each element can hold a string of up to 21 characters (the `<` indicates little-endian byte order).

**icecream** has some other advantages relative to print statements too, for instance it can tell you about which lines were executed in which scripts if you call it without arguments:


```python
def foo():
    ic()
    print('first')
    
    if 10 < 20:
        ic()
        print('second')
    else:
        ic()
        print('Never executed')

foo()
```

```
ic| <ipython-input-7-8ced0f8fcf82>:2 in foo() at 00:58:19.962
ic| <ipython-input-7-8ced0f8fcf82>:6 in foo() at 00:58:19.979
first
second
```

And it can wrap assignments rather than living on its own lines:

```python
def half(i):
    return ic(i) / 2

a = 6
b = ic(half(a))
```

```
ic| i: 6
ic| half(a): 3.0
```

All in all, if you find yourself using `print` to debug, you might find a one-time import of **icecream** followed by use of `ic` instead both more convenient and more effective.
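One practical wrinkle is that code sprinkled with `ic` calls will fail on a machine where **icecream** isn't installed. A minimal sketch of a fallback, along the lines of the pattern suggested in the package's README, is below; the stand-in `ic` defined here is an illustration that prints nothing and simply hands back whatever it was given, so wrapped expressions such as `ic(i) / 2` keep working.

```python
try:
    from icecream import ic
except ImportError:
    def ic(*args):
        """Stand-in for ic: return the argument(s) unchanged and print nothing."""
        if not args:
            return None
        return args[0] if len(args) == 1 else args


def half(i):
    # Works with the real ic (which prints and returns i) and with the stand-in
    return ic(i) / 2


result = half(6)
```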

### Debugging with the IDE

In this section, we'll learn about how your Integrated Development Environment, or IDE, can aid you with debugging. While we'll talk through the use of Visual Studio Code, which is free, directly supports Python, R, and other languages, and is especially rich, many of the features will be present in other IDEs too and the ideas are somewhat general. 

To begin debugging using Visual Studio Code, get a script ready, for example `script.py`, that you'd like to debug. If your script has an error in it, a debug run will automatically run into it and stop on the error; alternatively, you can click to the left of a line number in your script to create a *breakpoint* that your code will stop at when in debug mode, whether or not there is an error.

To begin a debug session, click on the play button partially covered by a bug that's on the left-hand ribbon of the VS Code window. It will bring up a menu. Click 'Run and debug' and select 'Python file'. The debugger will now start running the script you had open. When it reaches an error or a breakpoint, it will stop.

Why is this useful? Once the code stops, you can hover over any variables and see what's 'inside' them, which is useful for working out what's going on. Remember, in the examples above, we only saw variables that we asked for. Using the debugger, we can hover over any variable we're interested in without having to decide ahead of time! We can also see other useful bits of info such as the *call stack* of functions that have been called, what local (within the current scope) and global (available everywhere) variables have been defined, and we can nominate variables to watch too.

Perhaps you now want to progress the code on from a breakpoint; you can do this too. You'll see that a menu has appeared with stop, restart, play, and other buttons on it. To skip over the next line of code, use the curved arrow over the dot. To dig in to the next line of code, for example if it's a function, use the arrow pointing toward a dot. To carry on running the code, use the play button.

This is only really scratching the surface of what you can do with IDE based debugging, but even that surface layer provides lots of really useful tools for finding out what's going on when your code executes.

## Logging

Logging is a means of tracking events that happen when software runs. An event is described by a descriptive message that can optionally contain data about variables that are defined as the code is executing.

Logging has two main purposes: to record events of interest, such as an error, and to act as an auditable account of what happened after the fact.

Although Python has a built-in logger, we will see an example of logging using **loguru**, a package that makes logging a little easier and has some nice settings.

Let's see how to log a debug message:

```python
from loguru import logger

logger.debug("Simple logging!")
```

Well this is great: we have the time, the level of the message, what bit of code it happened in including a line number, and the message itself (basically all the info we need). There are different levels of log message, of which debug is just one. In decreasing order of severity, the main ones are:

- CRITICAL
- ERROR
- WARNING
- INFO
- DEBUG

You can find advice on what level to use for what message [here](https://reflectoring.io/logging-levels/), but it will depend a bit on what you're using your logs for.
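To make the ordering of the levels concrete, here is a small sketch that uses Python's built-in `logging` module rather than **loguru**. In that module, each level name maps to an integer and a logger ignores anything below its own threshold. The logger name `levels_example` is made up for this illustration.

```python
import logging

# Each named level maps to an integer; a higher number means more severe
levels = ["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"]
for name in levels:
    print(name, getattr(logging, name))

# A logger set to WARNING ignores anything less severe than a warning
example_logger = logging.getLogger("levels_example")  # hypothetical logger name
example_logger.setLevel(logging.WARNING)

print(example_logger.isEnabledFor(logging.DEBUG))
print(example_logger.isEnabledFor(logging.ERROR))
```

Setting the threshold low while developing and higher in routine use is one common way to keep logs readable.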

What we've just seen are logging messages written out to the console, which doesn't persist. This is clearly no good for auditing what happened long after the fact (and it may not be that good for debugging either) so we also need a way to write a log to a file.
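As a sketch of what writing a log to a file can look like, the example below uses Python's built-in `logging` module with a `FileHandler`. It is a swapped-in illustration rather than the **loguru** approach; with **loguru**, the analogous step is to register a file path as an extra sink using `logger.add`. The file name, the logger name, and the message format are all assumptions chosen for the example.

```python
import logging
import tempfile
from pathlib import Path

# Hypothetical log location; a temporary directory keeps the example tidy
log_path = Path(tempfile.mkdtemp()) / "example.log"

file_logger = logging.getLogger("file_example")  # hypothetical logger name
file_logger.setLevel(logging.DEBUG)

# Send every message at DEBUG level or above to the file, with a timestamp,
# the level, and the line number the message came from
handler = logging.FileHandler(log_path, encoding="utf-8")
handler.setFormatter(
    logging.Formatter("%(asctime)s | %(levelname)s | %(name)s:%(lineno)d - %(message)s")
)
file_logger.addHandler(handler)

file_logger.debug("Simple logging, now saved to a file!")
file_logger.warning("Something looks off")

# Flush and release the file before reading it back
handler.close()
file_logger.removeHandler(handler)

print(log_path.read_text(encoding="utf-8"))
```

Because the messages now live in a file, they survive after the session ends and can be inspected long after the fact.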



## Auto-magically improving your code


### Linting


### Formatting

