# Code refactoring

In this challenge you will practice:

- Moving code from notebook to `.py` files
- Writing tests and TDD
- Refactoring code
- Good `git commit` practices

# Moving code to `.py` files

We have been, and will keep, coding a lot in 📚 notebooks.

As we write more code, we'll want to move some of our code into 🐍 `.py` files.

## Why❓

- To **simplify our notebooks** and only keep the essence in the notebook
- To **reuse** the same code in multiple notebooks, e.g. to fetch and clean our source data
- To be able to **test** our code
- To use in our final **apps in production**

## 🎯 Our task: calculate the Manhattan distance

<img alt="Manhattan vs Euclidean distance" src="https://wagon-public-datasets.s3.amazonaws.com/data-science-images/lectures/manhattan_distance.png" width=500>

* **Euclidean distance**: The straight-line distance between two points in space.
  Formula:

  $$
  \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}
  $$

* **Manhattan distance**: The sum of the absolute differences of the coordinates (like navigating a grid).
  Formula:

  $$
  |x_2 - x_1| + |y_2 - y_1|
  $$
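
As a quick sanity check, here are both formulas worked out by hand for two example points, $(x_1, y_1) = (1, 1)$ and $(x_2, y_2) = (4, 5)$.

Manhattan:

$$
|4 - 1| + |5 - 1| = 3 + 4 = 7
$$

Euclidean:

$$
\sqrt{(4 - 1)^2 + (5 - 1)^2} = \sqrt{9 + 16} = 5
$$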


First, we 📝 draft some code in a notebook cell:

In [None]:
a = (1, 1)
b = (4, 5)

d_x = b[0] - a[0]
d_y = b[1] - a[1]

distance = d_x + d_y
distance

(Yes, there's a problem with this code, but let's pretend we didn't see it.)

Looks good! But we can't reuse this code.

Let's 🛠️ refactor this into a function `manhattan(a, b)` that will take two points, i.e. two tuples, as its arguments:

In [None]:
# YOUR CODE HERE

Try your function:

In [None]:
a = (1, 1)
b = (4, 5)
manhattan(a, b)

Better! But we can only use it in this 📚 notebook.

Let's 🚚 move it into a 🐍 `utils/distances.py` file.

Copy and paste your function definition into the `.py` file.

Now we can import it from there, and delete our draft code from the notebook:

In [None]:
from utils.distances import manhattan

manhattan(a, b)

### Check your code!

In [None]:
from nbresult import ChallengeResult

a = (1, 1)
b = (4, 5)

result = ChallengeResult('manhattan_from_notebook',
    manhattan=manhattan(a,b),
    manhattan_reverse=manhattan(b,a)
)
result.write()

In [None]:
print(result.check())

That didn't work out very well. You only passed one test. Why are you not passing the second one?

You should be able to find out why from the test results, but if you're in doubt, run the cell below.

In [None]:
print("Distance between a and b:", manhattan(a,b))
print("Distance between b and a:", manhattan(b,a))

Correct your `manhattan()` function in the `utils/distances.py` file:

Find this line:

```python
distance = d_x + d_y
```

and replace it with:
```python
distance = abs(d_x + d_y)
```

(Yes, that's not correct either, we know. Just copy-paste it for now. You'll see why later.)

Then try the cell above again. Does it make any difference?

No, your notebook is still running the previous version of your code. You have to restart your notebook, and then rerun all the code. 

That's annoying. Fortunately there's an easier way.

## Autoreload

💡 Use the IPython [**`autoreload`**](https://ipython.readthedocs.io/en/stable/config/extensions/autoreload.html) extension to avoid restarting the kernel every time you modify a `.py` file within your package.

Add this at the top of your notebook, restart the kernel, and run the code again.

```python
%load_ext autoreload
%autoreload 2
```

From now on, the kernel will automatically reload the code you imported. So whenever you change your code, you will be using the new version in your notebook. Nice!
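
For intuition about why the notebook kept running the old code: Python caches imported modules, so the edited file is not read again on its own. Below is a minimal sketch of a manual reload using the standard library's `importlib.reload`. The module name `demo_distances` and the function `version` are hypothetical, created only for this demo. This is not a description of how `autoreload` works internally, just the manual step it saves you from.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Avoid a stale bytecode cache when the file is rewritten within the same second.
sys.dont_write_bytecode = True

# Hypothetical throwaway module, written to a temporary folder for this demo only.
tmp_dir = Path(tempfile.mkdtemp())
module_path = tmp_dir / "demo_distances.py"
module_path.write_text("def version():\n    return 1\n")

sys.path.insert(0, str(tmp_dir))
import demo_distances

first = demo_distances.version()

# Edit the file on disk, as you would in your editor.
module_path.write_text("def version():\n    return 2\n")

# Without a reload, the already-imported module is unchanged.
stale = demo_distances.version()

# Reloading re-executes the module's source in place.
importlib.reload(demo_distances)
fresh = demo_distances.version()

print(first, stale, fresh)  # 1 1 2
```

Note that after a manual reload, names imported with `from module import name` still point to the old objects and need to be imported again.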

## Committing, the proper way

We wrote our first working function!

Now is a good time to `commit`!

Commit small, and often!

Which files would you commit?

If you don't know, go back into your terminal, and run `git status` to see which files were modified.

Let's stage our `.py` file.

In your terminal:
    
```bash
git add utils/distances.py
```

Then commit, with a meaningful message.

We'll follow the ["Conventional Commits" specification](https://www.conventionalcommits.org) for our commit messages.

```bash
git commit -m "feat(utils): manhattan distance"
```

# Testing

Let's write some more tests for our `manhattan()` function!

First, in the `tests` folder, create a `test_manhattan.py` file for our test:

```bash
touch tests/test_manhattan.py
```

Add some tests:

```python
from utils.distances import manhattan

def test_manhattan():
    assert manhattan((0, 0), (0, 0)) == 0
    assert manhattan((0, 0), (1, 1)) == 2
    assert manhattan((0, 0), (1, 0)) == 1
    assert manhattan((0, 0), (0, 1)) == 1
    assert manhattan((0, 0), (-1, 0)) == 1
    assert manhattan((0, 0), (0, -1)) == 1
    assert manhattan((0, 0), (-1, -1)) == 2
    assert manhattan((0, 0), (1, -1)) == 2
    assert manhattan((0, 0), (-1, 1)) == 2
```

Time to run the tests!

In your terminal:

```bash
pytest -v -k manhattan
```

We use the `-k` option to run only the tests with `manhattan` in their name.

Whoops! Something is wrong with our code.

👉 Fix it, and run the tests again.

Once it passes, `commit`!

Actually, do two commits:
- One with the tests
- One with your code

This way you'll be able to revert only one of the two if you ever need to.

In your terminal:

```bash
git add tests/test_manhattan.py
git commit -m "test(utils): manhattan distance"

git add utils/distances.py
git commit -m "fix(utils): manhattan distance with abs"

git push origin master
```

On Kitt you will still see one failed test. We're not done yet.

## TDD

So far we've been writing our tests after the fact. That's not the best practice: with test-driven development (TDD), we write the tests first and the code second.

### 🎯 Our next task: the Euclidean distance, the TDD way!


First, write tests in a new file `tests/test_euclidean.py`.

Add a test:

```python
from utils.distances import euclidean

def test_euclidean():
    assert euclidean((0, 0), (0, 0)) == 0
    assert euclidean((0, 0), (3, 4)) == 5
    assert euclidean((0, 0), (1, 0)) == 1
    assert euclidean((0, 0), (0, 1)) == 1
    assert euclidean((0, 0), (-1, 0)) == 1
    assert euclidean((0, 0), (0, -1)) == 1
    assert euclidean((0, 0), (-3, -4)) == 5
    assert euclidean((0, 0), (3, -4)) == 5
    assert euclidean((0, 0), (-3, 4)) == 5
```
<br>
And `commit` the test!

Add an empty function to `utils/distances.py`:

```python
# [...]

def euclidean(a, b):
    pass
```

Our code doesn't do anything yet. Our test should fail, right?

Let's make sure it does. In the terminal run:

```bash
pytest -v -k euclidean
```

You should have one failed test.
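
If you are wondering why an empty function fails rather than "passing by default", here is a minimal sketch. A function whose body is only `pass` implicitly returns `None`, and `None` is not equal to `0`, so the very first `assert` in the test raises.

```python
def euclidean(a, b):
    pass  # placeholder body: the function implicitly returns None


result = euclidean((0, 0), (0, 0))

print(result)       # None
print(result == 0)  # False, so `assert euclidean((0, 0), (0, 0)) == 0` fails
```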

Good. Time to get coding. Change the function into this (wrong) code:

```python
def euclidean(a, b):
    d_x = b[0] - a[0]
    d_y = b[1] - a[1]

    distance = d_x**2 + d_y**2
    return distance
```

And test again: we're getting closer. Even if it doesn't work yet, `commit`. 

Just in case you mess up your code in what comes next ... 🙀
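
To see what "getting closer" means here, this small sketch runs the draft above (known to be wrong) against a few of the points from the test file and prints which ones already match. The `cases` list is just a hand-picked subset for illustration.

```python
def euclidean(a, b):
    # Draft copied from above; it is known to be wrong.
    d_x = b[0] - a[0]
    d_y = b[1] - a[1]

    distance = d_x**2 + d_y**2
    return distance


# (point b, expected distance from the origin), taken from the tests above
cases = [((0, 0), 0), ((1, 0), 1), ((0, -1), 1), ((3, 4), 5), ((-3, -4), 5)]

for b, expected in cases:
    got = euclidean((0, 0), b)
    status = "ok" if got == expected else "FAIL"
    print(f"{b}: got {got}, expected {expected} -> {status}")
```

Keep in mind that `pytest` stops at the first failing `assert` inside a test function, so its report only shows one of the failing cases at a time.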

<br>
Fix the code, and run the tests again.

Once this specific test passes, it's good practice to run the whole test suite again:

```bash
pytest
```

You should pass all the tests now.

<br>

🏋️ You know the drill: `commit`

## Refactoring

Our two functions are very similar.

We can [**refactor**](https://en.wikipedia.org/wiki/Code_refactoring) them, to reuse the common parts.

The Manhattan (`p = 1`) and Euclidean (`p = 2`) distances are special cases of the [**Minkowski**](https://en.wikipedia.org/wiki/Minkowski_distance) distance:

$$AB = \left( |b_x - a_x|^p + |b_y - a_y|^p \right)^{\frac{1}{p}}$$

Let's code Minkowski, and reuse that function to calculate Manhattan and Euclidean!

Here goes our first (wrong) attempt:

```python
def manhattan(a, b):
    return minkowski(a, b, 1)


def euclidean(a, b):
    return minkowski(a, b, 2)


def minkowski(a, b, p):
    d_x = b[0] - a[0]
    d_y = b[1] - a[1]

    distance = (d_x**p + d_y**p)**(1/p)
    return distance
```

Run the tests again, fix the code:

```python
    distance = (abs(d_x)**p + abs(d_y)**p)**(1/p)
```
<br>
Finally run the tests again.

The tests allowed us to check our refactored code in one go! How cool is that!
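
One thing to keep in mind: because of the `1/p` exponent, the refactored function returns floats. The sketch below reuses the corrected `minkowski` from above and compares a result with a tolerance using `math.isclose` instead of `==`. The comparison against `math.hypot` is only an illustration, not part of the challenge's tests.

```python
import math


def minkowski(a, b, p):
    d_x = b[0] - a[0]
    d_y = b[1] - a[1]

    distance = (abs(d_x)**p + abs(d_y)**p)**(1/p)
    return distance


# p = 1 (Manhattan): swapping the two points gives the same distance.
forward = minkowski((1, 1), (4, 5), 1)
backward = minkowski((4, 5), (1, 1), 1)
print(forward, backward)  # 7.0 7.0

# p = 2 (Euclidean): compare with the standard library, using a tolerance.
matches = math.isclose(minkowski((0, 0), (1, 2), 2), math.hypot(1, 2))
print(matches)  # True
```

Inside a `pytest` test, `pytest.approx` plays a similar role for tolerant comparisons.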

Sure you didn't forget something? Did you `commit`? And push?

On Kitt you should see 5/5.

Also, don't forget to commit your notebook!

## Good style

One last thing: did you achieve 10/10 for style? No? Read the `pylint` output and fix the reported issues until you get 10/10.
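
As a hedged sketch of what a style-friendly `utils/distances.py` might look like: with its default settings, `pylint` reports missing module and function docstrings, so adding them is a common first step. The docstring wording below is only a suggestion, and depending on your `pylint` version and configuration other messages (for example about short argument names) may still appear, so treat the `pylint` output as the source of truth.

```python
"""Distance helpers for 2D points given as (x, y) tuples."""


def minkowski(a, b, p):
    """Return the Minkowski distance of order p between points a and b."""
    d_x = b[0] - a[0]
    d_y = b[1] - a[1]

    return (abs(d_x) ** p + abs(d_y) ** p) ** (1 / p)


def manhattan(a, b):
    """Return the Manhattan distance (Minkowski with p = 1)."""
    return minkowski(a, b, 1)


def euclidean(a, b):
    """Return the Euclidean distance (Minkowski with p = 2)."""
    return minkowski(a, b, 2)
```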

Then commit and push again.

🏁 Congratulations! You learned:

- How to move code from a notebook into `.py` files.
- To write your own tests.
- To refactor code to not repeat yourself (DRY, Don't Repeat Yourself).
- To use your test suite to make sure you didn't make an error while refactoring.