# Software Development

## Put Reusable code in modules

You might remember that we refactored our decorator to use a generic cache. In the same spirit, we would like to refactor our code to use our `Vector` class more generically.

To be able to reuse our `Vector` class in multiple places, we put it into a file called `vector.py` with the following code:

```python
import reprlib


class Vector:
    
    def __init__(self, lst):
        self._storage = lst
        
    def __len__(self):
        return len(self._storage)
    
    def __getitem__(self, i):
        return self._storage[i]

    def __add__(self, other_vector):
        try:
            sumlist = []
            for i, _ in enumerate(other_vector):
                sumlist.append(self._storage[i] + other_vector[i])
            return Vector(sumlist)
        except TypeError:
            return NotImplemented
    
    def __radd__(self, other_vector):
        # turn other + self around
        return self + other_vector
    
    def __mul__(self, scalar):
        return Vector([item*scalar for item in self._storage])

    def __rmul__(self, scalar):
        return self*scalar

    def __repr__(self):
        components = reprlib.repr(self._storage)
        return f"Vector({components})"
```

Using `reprlib.repr` inside `__repr__` gives us a truncated representation of the list, so long vectors print compactly.
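As a quick illustration of that truncation, the snippet below calls `reprlib.repr` directly on a ten-element list. It relies on `reprlib`'s default limit of six list items; the variable names are only for illustration.

```python
import reprlib

long_list = list(range(1, 11))

# The built-in repr shows every element.
full = repr(long_list)

# reprlib.repr keeps the first few items and abbreviates the rest with "...".
short = reprlib.repr(long_list)

print(full)   # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(short)  # [1, 2, 3, 4, 5, 6, ...]
```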

In [1]:
import vector
v1 = vector.Vector([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
v1

In [2]:
v1 = vector.Vector([4, 2, 7])
v1 + [-1, -1, 3]

In [3]:
[-1, -1, 3] + v1

In [5]:
v1 + 5 # not yet working

In [6]:
5*v1

In [7]:
v1*5
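The cells above show that `[-1, -1, 3] + v1` works while `v1 + 5` does not. The following sketch, which uses a condensed copy of the class with only the methods needed for addition, shows why: returning `NotImplemented` hands the decision back to Python, which raises `TypeError` once neither operand can handle the operation.

```python
class Vector:
    """Condensed version of the class above: just enough to show addition."""

    def __init__(self, lst):
        self._storage = lst

    def __len__(self):
        return len(self._storage)

    def __getitem__(self, i):
        return self._storage[i]

    def __add__(self, other_vector):
        try:
            return Vector([self._storage[i] + other_vector[i]
                           for i, _ in enumerate(other_vector)])
        except TypeError:
            # Signal "I do not know how to add this" instead of raising here.
            return NotImplemented

    def __radd__(self, other_vector):
        return self + other_vector


v1 = Vector([4, 2, 7])

# A plain list does not know how to add a Vector, so the result comes
# from Vector.__radd__.
reflected = [-1, -1, 3] + v1
print(list(reflected))  # [3, 1, 10]

# enumerate(5) raises TypeError, so __add__ returns NotImplemented. An int
# cannot add a Vector either, so Python raises TypeError for the expression.
try:
    v1 + 5
except TypeError as err:
    scalar_error = err
    print("TypeError:", err)
```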

## Document and Test before programming more

As we make code changes, we want to be sure that our code is not introducing errors into the computations on vectors.

Thus we take all the examples we have been collecting and put them into a test area. Now we'll make sure these examples *run the way they ran before* whenever we make *any code changes*.

This is called **testing**.

### Document your code using Docstrings

We'll start by introducing the simplest way to do this: *doctests*. Doctests put tests into the *documentation strings* of modules, classes and functions.

But we have not used documentation strings so far: these are a great way to document what our function, class, or module is doing.

Now the function below really does not need documentation, but bear with me, it's a nice small example that illustrates docstrings.

Docstrings are string literals, conventionally written in double quotes, that appear as the first statement of a module, class, or function and document it. They come in two flavors:

(1) The single line flavor:

In [1]:
def square(x):
    "Takes a number x and returns its square"
    return x*x

Look at the line just below the function definition. It describes what the function is doing. This is a docstring.

(2) The multi-line flavor:

In [2]:
def square(x):
    """
    Takes a number x and returns its square
    
    Parameters
    ----------
    
    x : number
        An int or floating-point number
       
    Returns
    -------
    
    number
        A number of the same type as the input
        
    
    """
    return x*x

Here we illustrate the [Numerical and Scientific Python docstring convention](https://numpydoc.readthedocs.io/en/latest/format.html) (the numpy docstring conventions).

For a lot of functions and classes this seems excessive (certainly is for `square`). But the numpy conventions are great when we want to communicate what our functions and classes do.
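Docstrings are more than comments: Python stores them on the object, so tools can read them back. A minimal sketch, reusing the `square` example with a slightly tightened docstring, shows two standard ways of getting at the text.

```python
import inspect


def square(x):
    """
    Takes a number x and returns its square

    Parameters
    ----------
    x : number
        An int or floating-point number

    Returns
    -------
    number
        A number of the same type as the input
    """
    return x * x


# The raw docstring as stored on the function object.
raw_doc = square.__doc__

# inspect.getdoc strips the common indentation and surrounding blank lines.
clean_doc = inspect.getdoc(square)
summary = clean_doc.splitlines()[0]

print(summary)  # Takes a number x and returns its square
```

Calling `help(square)` in an interactive session displays the same cleaned-up text.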

### Write Doctests

More importantly for us, though, we can use these docstrings to incorporate tests.

In [9]:
def square(x):
    """
    Takes a number x and returns its square
        
    >>> square(5)
    25
    >>> square(5.0)
    25.0
    """
    return x*x

You specify a test by writing the code to be tested at a fake prompt, `>>>`, followed by a space and then the code, for example `>>> square(5)`. On the next line, write the expected answer, `25`, all by itself. More details [here](https://docs.python.org/3/library/doctest.html).

The advantage of this format is that you have now given the users of your function some examples as well, and we all know that examples are the documentation that most people read. In fact, they are probably the only documentation most people read.

You can test your function like so:

In [10]:
import doctest
doctest.run_docstring_examples(square, globals(), verbose=True)

Let's mess up the implementation of `square` to see how it fails:

In [11]:
def square(x):
    """
    Takes a number x and returns its square
        
    >>> square(5)
    25
    >>> square(5.0)
    25.0
    """
    return x*x*x

In [12]:
doctest.run_docstring_examples(square, globals(), verbose=True)

You can see the failures since we implemented a cube instead of a square.
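If you want the outcome as numbers rather than as a printed report, the `doctest` module also exposes the parser and runner it uses internally. The sketch below is one way to count failures for the broken `square`; the variable names are illustrative, and the examples are parsed straight from the docstring.

```python
import doctest


def square(x):
    """
    Takes a number x and returns its square

    >>> square(5)
    25
    >>> square(5.0)
    25.0
    """
    return x * x * x  # deliberately wrong: this is a cube


# Parse the examples out of the docstring and run them quietly.
parser = doctest.DocTestParser()
test = parser.get_doctest(square.__doc__, {"square": square}, "square", None, 0)
runner = doctest.DocTestRunner(verbose=False)

# Collect the failure report in a list instead of printing it.
report = []
results = runner.run(test, out=report.append)

print(results.failed, "of", results.attempted, "examples failed")
```

Both examples fail here, because `5**3` is `125` and `5.0**3` is `125.0`.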

### Tests in modules

Usually when we want to test, we'll put our tests in the file that represents our module. So let's do that for our vector class. We create a file `vector2.py` which now has tests in it.

The tests we put in are just the examples we have been using to "informally" test our vector class so far!

In [13]:
%pycat vector2.py
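The listing itself is not reproduced in this text. As a rough sketch of how such a module can be laid out, assuming a heavily simplified `Vector` (the real `vector2.py` contains the full class and more examples), the doctests live in the docstring and a small block at the bottom runs them:

```python
"""A small vector module whose examples double as tests (illustrative sketch)."""


class Vector:
    """
    A minimal vector wrapping a list.

    >>> v1 = Vector([4, 2, 7])
    >>> len(v1)
    3
    >>> v1[0]
    4
    >>> v1 * 5
    Vector([20, 10, 35])
    """

    def __init__(self, lst):
        self._storage = lst

    def __len__(self):
        return len(self._storage)

    def __getitem__(self, i):
        return self._storage[i]

    def __mul__(self, scalar):
        return Vector([item * scalar for item in self._storage])

    def __repr__(self):
        return f"Vector({self._storage!r})"


if __name__ == "__main__":
    # Runs every doctest in this file; pass -v on the command line for a
    # verbose report.
    import doctest
    doctest.testmod()
```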

The `if __name__ == "__main__":` section at the bottom runs when the file is executed as a script, which is how you run the tests. Go to the command line and run `python vector2.py -v`. In the notebook, we simulate this by putting a bang (!) before the command:

In [14]:
!python vector2.py -v

You can see all our tests pass. You can also run these interactively in the Jupyter notebook by importing the module and passing it to `doctest.testmod`.

## Test Driven Programming

So we wrote some code, and then tested it. Sometimes you might want to write the tests first, before you have done the development, to establish what your code should do.

For instance, our addition seems to truncate the final vector, which for n-dimensional vector algebra seems just weird. If you are adding a lower-dimensional vector to a higher-dimensional one, it makes sense to "embed" the lower-dimensional vector into the space of the higher-dimensional one. Thus we will want to pad the shorter vector with zeros.

We also want to support dot products. The Python operator for dot products is `@`. (This is actually the operator for matrix multiplication, but as we shall see, dot products are a special case of matrix multiplication.) For two vectors, we compute the dot product by multiplying the vectors componentwise and then summing the products. We encounter the same dimensionality issue here as well, which we fix by padding with zeros.
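To make the arithmetic concrete, here is a small worked sketch on plain lists. The helper name `padded_dot` is hypothetical and is not part of the vector class; it only illustrates the rule described above.

```python
def padded_dot(left, right):
    """Dot product of two sequences, padding the shorter one with zeros."""
    n = max(len(left), len(right))
    left = list(left) + [0] * (n - len(left))
    right = list(right) + [0] * (n - len(right))
    # Multiply componentwise, then sum the products.
    return sum(a * b for a, b in zip(left, right))


# Same length: 4*1 + 2*(-1) + 7*3 = 4 - 2 + 21
print(padded_dot([4, 2, 7], [1, -1, 3]))  # 23

# Different lengths: [1, 2] is treated as [1, 2, 0], so 1*3 + 2*4 + 0*5
print(padded_dot([1, 2], [3, 4, 5]))  # 11
```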

### Adding new tests

So we add tests corresponding to these:

```python
>>> v1 = Vector([4, 2, 7])
>>> v2 = Vector([1, -1, 3])
>>> v1 + range(2)
Vector([4, 3, 7])
>>> range(2) + v1
Vector([4, 3, 7])
>>> λ = 3
>>> v1*λ
Vector([12, 6, 21])
>>> λ*v1
Vector([12, 6, 21])
>>> v1@v2
23
>>> v2@v1
23
```

In [15]:
%pycat vector3.py

Now we run our tests:

In [16]:
import vector3
doctest.testmod(vector3, verbose=True)

What happened here? All the tests which have anything to do with adding lower-dimensional sequences failed. So did the dot product ones!

### Add features to your code to fix the failed tests

Let's write code to fix this! We first write a function that pads any two sequences to the length of the longer one:

In [11]:
def pad_vectors(left, right):
    maxlen = max(len(left), len(right))
    outleft, outright = [], []
    for i in range(maxlen):
        if i > len(left) - 1:
            leftval = 0
        else:
            leftval = left[i]
        outleft.append(leftval)
        if i > len(right) - 1:
            rightval = 0
        else:
            rightval = right[i]
        outright.append(rightval)
    return outleft, outright

In [12]:
pad_vectors(range(2), range(5,10))

In [13]:
pad_vectors([1, 2, 3], range(10))

We'll incorporate these examples as tests. Because we implemented the sequence protocol for the vector class, we know this will work for vectors as well.

```python
def pad_vectors(left, right):
    """
    pad sequence left or right with zeros to make
    both sequences the length of the longest sequence

    >>> pad_vectors(range(2), range(5,10))
    ([0, 1, 0, 0, 0], [5, 6, 7, 8, 9])
    >>> pad_vectors([1, 2, 3], range(10))
    ([1, 2, 3, 0, 0, 0, 0, 0, 0, 0], [0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
    """
    ...
```

Now we can use this in addition and matrix multiplication:

```python
def __add__(self, other_vector):
    """
    Adding 2 vectors, pads to longest length
    """
    try:
        left, right = pad_vectors(self, other_vector)
        return Vector([a + b for a, b in zip(left, right)])
    except TypeError:
        return NotImplemented
```

```python
def __matmul__(self, other_vector):
    try:
        left, right = pad_vectors(self, other_vector)
        return sum([a * b for a, b in zip(left, right)])
    except TypeError:
        return NotImplemented
```

The entire code is here, below:

In [17]:
%pycat vector4.py
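The listing is not reproduced in this text. The sketch below assembles the pieces shown so far into one runnable module; it is a condensed reconstruction without the docstrings, uses a shortened `pad_vectors`, and may differ in detail from the actual `vector4.py`.

```python
import reprlib


def pad_vectors(left, right):
    """Pad the shorter sequence with zeros so both have the same length."""
    maxlen = max(len(left), len(right))
    outleft = [left[i] if i < len(left) else 0 for i in range(maxlen)]
    outright = [right[i] if i < len(right) else 0 for i in range(maxlen)]
    return outleft, outright


class Vector:

    def __init__(self, lst):
        self._storage = lst

    def __len__(self):
        return len(self._storage)

    def __getitem__(self, i):
        return self._storage[i]

    def __add__(self, other_vector):
        try:
            left, right = pad_vectors(self, other_vector)
            return Vector([a + b for a, b in zip(left, right)])
        except TypeError:
            return NotImplemented

    def __radd__(self, other_vector):
        return self + other_vector

    def __mul__(self, scalar):
        return Vector([item * scalar for item in self._storage])

    def __rmul__(self, scalar):
        return self * scalar

    def __matmul__(self, other_vector):
        try:
            left, right = pad_vectors(self, other_vector)
            return sum(a * b for a, b in zip(left, right))
        except TypeError:
            return NotImplemented

    def __repr__(self):
        return f"Vector({reprlib.repr(self._storage)})"


v1 = Vector([4, 2, 7])
v2 = Vector([1, -1, 3])
print(v1 + range(2))   # Vector([4, 3, 7])
print(range(2) + v1)   # Vector([4, 3, 7])
print(3 * v1)          # Vector([12, 6, 21])
print(v1 @ v2)         # 23
```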

And running the tests gives us:

In [18]:
import vector4
doctest.testmod(vector4, verbose=True)

### Using the new class

Using the new vector class is nice, but we still fail on translation:

In [17]:
v1 = vector4.Vector([1,2,3])
v1 + 5

The natural thing to do here is to add 5 to every component. This is called *broadcasting*, and we shall soon see how to use NumPy's *ndarray* to achieve this.
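Before NumPy enters the picture, a plain-Python sketch can show what broadcasting a scalar means. The helper name `broadcast_add` is hypothetical, and this is only an illustration of the idea, not how NumPy implements it.

```python
def broadcast_add(components, scalar):
    """Add the same scalar to every component of a sequence."""
    return [item + scalar for item in components]


print(broadcast_add([1, 2, 3], 5))  # [6, 7, 8]
```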