## 6.3 Developing data types

In the rest of this chapter I will show you how to implement the sequence ADT
with static arrays and other data structures.

A static array cannot grow, so it's convenient to distinguish
sequences that are **bounded**, i.e. that have a maximum length by design,
from those that aren't.
The length of any sequence is limited by the available memory, but bounded
sequences have their **capacity** (maximum length) set when they're created.
The sequence is full when its length (number of items it has) equals
its capacity (number of items it can hold).
When a bounded sequence is full, no item can be inserted or appended.
A sequence with capacity zero never grows and remains empty.
An unbounded sequence can be seen as a sequence with infinite capacity.

Introducing the notion of capacity forces us to change the sequence ADT
definition. First, we should add a function to obtain a sequence's capacity.
Second, we must change the preconditions of the insertion and append
operations: they aren't defined for full sequences.
Third, for operations that produce a new sequence, like slicing and concatenation, what should the capacity of the output sequence be? For example,
when taking a slice of length _s_ from a sequence with capacity _c_,
should the capacity of the output sequence be _s_, _c_ or some other value?

It's difficult to decide without knowing how the slice will be used.
It's best to add the new capacity as a further input of the operation,
so that the user can set a meaningful value for their intended use of the slice.
The concatenation operation should also have the capacity of the output sequence as an additional input.
Both operations need an additional precondition for the new input:
the capacity set by the user must be at least the length of the output, otherwise the slice or the concatenation won't fit in the new sequence.
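
As a rough sketch of the extra input and its precondition, the function below slices a plain Python list and returns the slice together with the requested capacity. The name `bounded_slice`, and the use of a list paired with a separate capacity value, are assumptions made only for this illustration; they aren't part of the sequence ADT.

```python
def bounded_slice(items: list, start: int, end: int, capacity: float) -> tuple:
    """Return the items from index start to end - 1, and the slice's capacity.

    Preconditions: 0 <= start <= end <= len(items); end - start <= capacity
    """
    length = end - start
    # The additional precondition: the slice must fit in the output sequence.
    assert length <= capacity, 'capacity is smaller than the slice length'
    return items[start:end], capacity

print(bounded_slice([10, 20, 30, 40], 1, 3, 5))  # prints ([20, 30], 5)
```

Here the slice has length 2, so any capacity from 2 upwards satisfies the precondition. A capacity of 1 would make the assertion fail.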

To sum up, the plan to implement the sequence ADT with a static array requires
the concept of capacity, which leads to a new operation and changes to others.

<div class="alert alert-warning">
<strong>Note:</strong> When implementing a restricted version of an ADT,
the definition of some operations may have to change.
</div>

### 6.3.1 Abstract classes

Before we write each implementation of the sequence ADT,
it's convenient to define the ADT itself as an **abstract class**:
a class that won't be instantiated.
The abstract class lists the available methods (operations),
but doesn't implement all of them.
Each sequence data type will be a subclass of the abstract class,
using a particular data structure and implementing the methods.
Each sequence object will be an instance of a subclass,
not of the abstract class.

In M269 we define an abstract class like any other class, but without
an `__init__` method because the abstract class isn't meant to be instantiated.
This in turn leads to most methods not being implemented,
as there are no instance variables to access.
The [Zen of Python](../05_TMA01-1/05_3_coding_style.ipynb#5.3-Coding-style) states that
explicit is better than implicit, so we'll use Python's `pass` statement
to reinforce that the method does nothing.

For the new operation that returns a sequence's capacity,
we must somehow represent infinity to handle unbounded sequences.
Fortunately, that's easy.
The IEEE 754&nbsp;standard that defines floating-point numbers also includes
special values to represent positive and negative infinity.
Python's `math` module defines the float constant `inf` for positive infinity.
Negative infinity is simply `-inf`.
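
The following short sketch shows how `math.inf` behaves when compared with integers, which is what makes it usable as the capacity of an unbounded sequence. The variable name `capacity` is just for illustration.

```python
import math

capacity = math.inf            # the capacity of an unbounded sequence
print(1_000_000 < capacity)    # every integer length is below infinity
print(min(5, capacity))        # the smaller of an integer and infinity
print(-math.inf < 0)           # negative infinity is below every number
print(type(capacity))          # math.inf is a float, not an integer
```

Because infinity is a float, a function that may return it can't have `int` as its output type.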

Here's a partial definition of the sequence ADT in Python.
It follows the templates that define the sequence operations in
[Section&nbsp;4.1](../04_Iteration/04_1_sequences.ipynb#4.1-The-Sequence-ADT) and
[Section&nbsp;4.6](../04_Iteration/04_6_lists.ipynb#4.6.1-Modifying-sequences), modified to
take the sequence's capacity into account and use Python's syntax.

```python
# this code is also in m269_sequence.py

class Sequence:
    """The sequence ADT."""

    def capacity(self) -> float:
        """Return how many items the sequence can hold.

        Postconditions: if the capacity is only limited by memory,
        the output is math.inf,
        otherwise it's the non-negative integer set at creation time
        """
        pass

    def length(self) -> int:
        """Return the number of items in the sequence.

        Postconditions: 0 <= self.length() <= self.capacity()
        """
        pass

    def get_item(self, index: int) -> object:
        """Return the item at position index.

        Preconditions: 0 <= index < self.length()
        Postconditions: the output is the n-th item of self, with n = index + 1
        """
        pass

    def set_item(self, index: int, item: object) -> None:
        """Replace the item at position index with the given one.

        Preconditions: 0 <= index < self.length()
        Postconditions: post-self.get_item(index) == item
        """
        pass

    def insert(self, index: int, item: object) -> None:
        """Insert item at position index.

        Preconditions: 0 <= index <= self.length() < self.capacity()
        Postconditions: post-self is the sequence
        pre-self.get_item(0), ..., pre-self.get_item(index - 1),
        item, pre-self.get_item(index), ...,
        pre-self.get_item(pre-self.length() - 1)
        """
        pass

    def append(self, item: object) -> None:
        """Add item to the end of the sequence.

        Preconditions: self.length() < self.capacity()
        Postconditions: post-self is the sequence
        pre-self.get_item(0), ..., pre-self.get_item(pre-self.length() - 1), item
        """
        self.insert(self.length(), item)

    def __str__(self) -> str:
        """Return a string representation of the sequence.

        Postconditions: the output uses Python's syntax for lists
        """
        items = []
        for index in range(self.length()):
            items.append(self.get_item(index))
        return str(items)
```

Methods `append` and `__str__` can already be implemented because
they don't need to access any instance variable.
This saves us from repeatedly implementing them in each subclass because
a subclass inherits the methods of its superclass, unless it redefines them.
This gives subclasses of `Sequence` the flexibility
to either use the above `append` and `__str__` methods or
to implement a more efficient version.
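
To make the inheritance concrete, here's a minimal sketch with a hypothetical subclass, `ListSequence`, that stores the items in a Python list. The subclass name, its `items` instance variable and the cut-down copy of `Sequence` are assumptions made so that the example is self-contained.

```python
import math

class Sequence:
    """A cut-down copy of the sequence ADT, just for this sketch."""

    def length(self) -> int:
        pass

    def get_item(self, index: int) -> object:
        pass

    def insert(self, index: int, item: object) -> None:
        pass

    def append(self, item: object) -> None:
        self.insert(self.length(), item)

    def __str__(self) -> str:
        items = []
        for index in range(self.length()):
            items.append(self.get_item(index))
        return str(items)

class ListSequence(Sequence):
    """Hypothetical unbounded sequence backed by a Python list."""

    def __init__(self) -> None:
        self.items = []

    def capacity(self) -> float:
        return math.inf

    def length(self) -> int:
        return len(self.items)

    def get_item(self, index: int) -> object:
        return self.items[index]

    def insert(self, index: int, item: object) -> None:
        self.items.insert(index, item)

numbers = ListSequence()
numbers.append(1)     # inherited method, which calls ListSequence.insert
numbers.append(2)
numbers.insert(0, 0)
print(numbers)        # inherited __str__ prints [0, 1, 2]
```

The subclass only defines the methods that need the instance variable. It gets `append` and `__str__` from `Sequence` without any further code.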

Note that the `__str__` method uses the `append` method of the `list` class:
methods in different classes can have the same name.

#### Exercise 6.3.1

Add to the `Sequence` class a `has` method that implements
the membership operation in terms of the other operations.

[Hint](../31_Hints/Hints_06_3_01.ipynb)
[Answer](../32_Answers/Answers_06_3_01.ipynb)

#### Exercise 6.3.2 (optional)

Add a method header, docstring and `pass` statement for the
[remove operation](../04_Iteration/04_6_lists.ipynb#4.6.1-Modifying-sequences).
Later optional exercises ask you to implement the method.

### 6.3.2 Testing data types

Having defined the methods to be implemented, we can write tests for them.
Each test takes an empty sequence instance, created by the subclass to be
tested, applies operations that modify the sequence and finally uses the
operations that inspect sequences to check the result.

In the following, I create a sequence of natural numbers in different ways,
by inserting, appending and replacing items, and check the result.
I must make sure that I don't create sequences longer than their capacity.

The final argument of `check` is always the sequence being tested,
so that it's printed if the test fails.

```python
# this code is also in m269_sequence.py

def test_items(test: str, items: Sequence) -> None:
    """Check that items is the sequence 0, 1, 2, ..., length - 1."""
    for index in range(items.length()):
        check(test + ': n-th item',
              items.get_item(index), index, items)
    check(test + ': length <= capacity',
          items.length() <= items.capacity(), True, items)

def test_init(items: Sequence) -> None:
    """Check that items is the empty sequence."""
    check('init length', items.length(), 0, items)
    test_items('init', items)

def test_append(items: Sequence, length: int) -> None:
    """Check a sequence created with successive appends.

    Preconditions: items is empty; 0 <= length <= items.capacity()
    """
    for number in range(min(length, items.capacity())):
        items.append(number)
    test_items('append', items)

def test_insert_start(items: Sequence, length: int) -> None:
    """Check a sequence created by successive inserts at index 0.

    Preconditions: items is empty; 0 <= length <= items.capacity()
    """
    for number in range(min(length, items.capacity()) - 1, -1, -1):
        items.insert(0, number)
    test_items('insert at 0', items)

def test_set_item(items: Sequence, length: int) -> None:
    """Check a sequence created by replacing all items.

    Preconditions: items is empty; 0 <= length <= items.capacity()
    """
    for number in range(min(length, items.capacity())):
        items.append(None)
    for number in range(min(length, items.capacity())):
        items.set_item(number, number)
    test_items('set item', items)
```

We'll run the tests after implementing each subclass of `Sequence`
in the rest of the chapter.

#### Exercise 6.3.3

Add to the previous code cell a test function for the membership method.
Like the other test functions,
it takes an empty sequence and a non-negative length, and returns nothing.

After writing the function, run the cell to add your function to the file.

[Hint](../31_Hints/Hints_06_3_03.ipynb)
[Answer](../32_Answers/Answers_06_3_03.ipynb)

#### Exercise 6.3.4 (optional)

Add to the code cell a function to test the remove operation.
The test function has the same inputs and output as the previous one, but
with precondition `length > 0`: we can't remove items from an empty sequence.

[Hint](../31_Hints/Hints_06_3_04.ipynb)
