# Native Data Types

Every value in Python has a **class** that determines what **type** of value it is. Values that share a class also share behavior under certain procedures, functions and operators.

Types that are built into the language by default are called **native data types**. So far we have only used native data types.

Properties of native data types:
- There are expressions that evaluate to values of native types, called **literals**.
- There are built-in functions and operators to manipulate values of native types.

### Python Native Numeric Types

- Floats: approximations of real numbers using _floating point representation_ with a finite amount of precision. There are minimum and maximum values of floats.
- Ints: objects representing integers exactly.
    - The subtleties of using floats and ints follow from how each is represented; the behavior of floats is governed by floating point standards. Two relevant details: dividing two ints with `/` returns a float (which has limited precision), and floats should not be compared for exact equality.
- Complex: representing complex values.
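
The two details above can be sketched as follows. This is a minimal illustration that uses only built-in operators and the standard `math` module; the variable names are chosen for the example.

```python
import math

# Dividing two ints with / always produces a float, even for a whole result.
quotient = 6 / 3            # 2.0
# Floor division // keeps the result an int.
floor_quotient = 7 // 2     # 3

# Floats are approximations, so exact equality tests can fail unexpectedly.
total = 0.1 + 0.2
exactly_equal = total == 0.3               # False
# Comparing with a tolerance is the usual alternative.
close_enough = math.isclose(total, 0.3)    # True
```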

### Python Native Non-numeric Types

Examples of non-numeric data are sounds, images, and locations. One built-in non-numeric type is the `bool` class. Most types must be defined by programmers using means of combination and abstraction.


# Data Abstraction

Sometimes we need to define a **compound data type**: a value built from existing data types that can be treated as a unit. To create a new compound data type, we must define:
- How values of the compound data type are constructed.
- How they are used.

The general technique of isolating the parts of a program that deal with how data are represented from the parts that deal with how data are manipulated is a powerful design methodology called **data abstraction**. 

Our programs should use data in such a way as to make as few assumptions about the data as possible. At the same time, a concrete data representation is defined as an independent part of the program. These two parts of a program, the part that operates on abstract data and the part that defines a concrete representation, are connected by a small set of functions that implement abstract data in terms of the concrete representation.

**Wishful thinking** consists of making some basic assumptions about the behavior of the type. These assumptions allow us to use the new data type despite not knowing how it is actually implemented (as long as the assumptions are correct).

Every abstract data type (ADT) must have **constructors** (to produce a new instance of the data type) and **selectors** (to retrieve a part of the ADT). If the collection of constructors and selectors satisfies the behavior conditions expected of the data type, then the collection constitutes a valid representation of that kind of data. Note that the constructors and selectors are not unique; they admit variations. We recognize a data abstraction by its behavior.

**Abstraction barriers**: each part of the program should act on abstract data at a single level of abstraction, i.e., treating the ADT as a unit, treating its components, treating the components of components, etc. For instance, user-defined functions that operate on an ADT should assume as little as possible about the concrete implementation and rely on the ADT's selectors instead. Abstraction barriers make programs easier to maintain and to modify.
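
As a hedged illustration of constructors, selectors, and an abstraction barrier, here is a sketch of a rational-number ADT. The names `rational`, `numer`, `denom`, `add_rationals`, and `rationals_are_equal` are chosen for this example, and the two-element list is only one possible representation.

```python
from math import gcd

# Constructor and selectors: the only functions that know the representation.
def rational(n, d):
    """Construct the rational number n/d, reduced to lowest terms."""
    g = gcd(n, d)
    return [n // g, d // g]

def numer(x):
    """Select the numerator of rational x."""
    return x[0]

def denom(x):
    """Select the denominator of rational x."""
    return x[1]

# Functions above the barrier use only the constructor and the selectors.
def add_rationals(x, y):
    nx, dx = numer(x), denom(x)
    ny, dy = numer(y), denom(y)
    return rational(nx * dy + ny * dx, dx * dy)

def rationals_are_equal(x, y):
    return numer(x) * denom(y) == numer(y) * denom(x)

half = rational(1, 2)
third = rational(1, 3)
five_sixths = add_rationals(half, third)
```

Because `add_rationals` and `rationals_are_equal` never index into the list directly, the representation could be swapped (for a tuple or a function, say) by changing only `rational`, `numer`, and `denom`.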

An example of data abstraction is the list, a built-in type in Python. A list literal has the form `[elem1, elem2, elem3, ...]` and can be bound to a name, e.g. `s = [elem1, elem2, elem3, ...]`. The list can be evaluated as a unit, and we can also access each element in three ways:
- With multiple assignment: `x, y, z, ... = s`.
- With the element selection operator: `x = s[0]`.
- With `getitem` from the `operator` module: `x = getitem(s, 0)`.

Another example is a two-element pair implemented with a function:
```python
def pair(x, y):
    """Return a function that represents a pair."""
    def get(index):
        if index == 0:
            return x
        elif index == 1:
            return y
    return get

def select(p, i):
    """Return the element at index i of pair p."""
    return p(i)
```
It consists of a constructor `pair(x, y)` and a selector `select(p, i)`.


# Sequences

A **sequence** is an ordered collection of values. Sequences are not instances of a particular built-in type or abstract data representation, but instead a collection of behaviors that are shared among several different types of data.
- Length: a sequence has a finite length. An empty sequence has length 0.
- Element selection: a sequence has an element corresponding to any non-negative integer index less than its length, starting at 0 for the first element.

## Lists and Ranges

A `list` value is a sequence that can have arbitrary length. Lists have a large set of built-in behaviors, along with specific syntax to express those behaviors. See CP for some functionalities of lists. Any values can be included in a list, including another list. Element selection can be applied multiple times in order to select a deeply nested element in a list containing lists.

`range` is another built-in type of sequence in Python, which represents a range of integers. A `range` is not a `list`, but both are sequences. Lists of integers can be made using `list(range(...))`.
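
A short sketch of nested element selection and of `range` behaving as a sequence (the variable names are illustrative):

```python
# Lists can contain other lists; selection can be chained to reach nested elements.
pairs = [[10, 20], [30, 40]]
nested_element = pairs[1][0]        # 30

# A range is a sequence too: it has a length and supports element selection.
r = range(1, 5)                     # represents 1, 2, 3, 4 (the end value is excluded)
r_length = len(r)                   # 4
r_last = r[3]                       # 4
as_list = list(r)                   # [1, 2, 3, 4]
```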

## Sequence Iteration

It is common to iterate over the elements of a sequence and perform some computation for each element in turn. The `for` statement exists for this task. A `for` loop can be thought of as a simplified version of a `while` loop that iterates over the element values.

A `for` statement consists of a single clause with the form:
```
for <name> in <expression>:
    <suite>
```
`for` statements are executed by the following procedure:
1. Evaluate the header `<expression>`, which must yield an iterable value.
2. For each element value in that iterable value, in order:
    1. Bind `<name>` to that value in the current frame (no new frame).
    2. Execute the `<suite>`.

The exact meaning of _iterable values_ is left undefined for now. Sequences (such as lists) are iterable values, so the procedure applies. An important consequence of this evaluation procedure is that `<name>` remains bound to the last element of the sequence after the `for` statement is executed (provided the sequence was not empty). If `<name>` is not used in the suite, it is common to use an underscore `_`.

**Sequence unpacking**: consider a sequence whose elements are themselves sequences of the same length. Writing `for x, y, z, ... in sequence:` binds, for each element, its first item to `x`, its second to `y`, etc. This pattern of binding multiple names to multiple values in a fixed-length sequence is called sequence unpacking.
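
A minimal sketch of a `for` statement and of sequence unpacking in its header. The function `count` and the data are made up for the example.

```python
def count(s, value):
    """Count the number of elements of sequence s that are equal to value."""
    total = 0
    for elem in s:
        if elem == value:
            total = total + 1
    return total

# Sequence unpacking in the header: each element is a two-element sequence.
pairs = [[1, 2], [2, 2], [2, 3], [4, 4]]
same_count = 0
for x, y in pairs:
    if x == y:
        same_count = same_count + 1

# After the loop, x and y remain bound to the values of the last pair.
last_x, last_y = x, y
```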

## Sequence processing

**List comprehensions**: a list comprehension is an expression that evaluates a fixed expression for each element in a sequence and collects the resulting values in a list. In Python the general form of a list comprehension is:
```
[<map expression> for <name> in <sequence expression> if <filter expression>]
```

Evaluation of list comprehension:
1. Evaluate `<sequence expression>`, which must return an iterable value.
2. For each element of that iterable value, in order:
    1. Bind the element value to `<name>`.
    2. Evaluate the `<filter expression>`, if present.
    3. If it yields a true value (or there is no filter), evaluate the `<map expression>`.
    4. Collect the values of the `<map expression>` in a list.
    
**Aggregation**: aggregate all values in a sequence into a single value. The built-in functions `sum`, `all`, `min`, and `max` are all examples of aggregation functions.
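
The following sketch shows list comprehensions with and without a filter expression, followed by a few aggregation functions. The list `odds` is an arbitrary example.

```python
odds = [1, 3, 5, 7, 9]

# Map expression only: evaluate x + 1 for every element.
incremented = [x + 1 for x in odds]                 # [2, 4, 6, 8, 10]

# Map and filter expressions: keep only the elements that divide 25.
divisors_of_25 = [x for x in odds if 25 % x == 0]   # [1, 5]

# Aggregation: combine all values of a sequence into a single value.
total = sum(odds)                                   # 25
largest = max(odds)                                 # 9
all_odd = all([x % 2 == 1 for x in odds])           # True
```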

**HOFs**: higher-order functions (HOFs) can be used to express patterns in sequence processing, such as evaluating an expression for each element of a sequence, filtering elements, or reducing a sequence to a single value:
```python
def apply_to_all(map_fn, s):
    return [map_fn(x) for x in s]

def keep_if(filter_fn, s):
    return [x for x in s if filter_fn(x)]

def reduce(reduce_fn, s, initial):
    reduced = initial
    for x in s:
        reduced = reduce_fn(reduced, x)    
    return reduced
```
In Python it is more common to use list comprehensions than HOFs. There are also built-in functions, `map` and `filter`, that perform tasks similar to `apply_to_all` and `keep_if`; we will see them later. In terms of these built-ins, the two functions can be written as:
```python
apply_to_all = lambda map_fn, s: list(map(map_fn, s))
keep_if = lambda filter_fn, s: list(filter(filter_fn, s))
```
And the `reduce` function is built into the `functools` module of the Python standard library.
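
A brief usage sketch of `functools.reduce` together with the built-in `map` and `filter`. The multiplication function comes from the standard `operator` module; the input values are arbitrary.

```python
from functools import reduce
from operator import mul

# functools.reduce takes the function, then the iterable, then an optional initial value.
product = reduce(mul, [1, 2, 3, 4, 5])          # 120
product_of_empty = reduce(mul, [], 1)           # 1: the initial value is returned

# map and filter return lazy iterators, so wrap them in list() to see the values.
squares = list(map(lambda x: x * x, [1, 2, 3]))          # [1, 4, 9]
evens = list(filter(lambda x: x % 2 == 0, range(6)))     # [0, 2, 4]
```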

## Sequence abstraction

The **richness of an abstraction** is measured in how many behaviors that abstraction includes. Sequences are very rich abstractions. In general though, most user-defined abstractions should be kept as simple as possible.

**Membership**: a value can be tested for membership in a sequence. Python has two operators `in` and `not in` that evaluate to `True` or `False` depending on whether an element appears in a sequence.

**Slicing**: a _slice_ of a sequence is any contiguous span of the original sequence, designated by a pair of integers.
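
Membership and slicing can be sketched on a small list (the values are arbitrary):

```python
digits = [1, 8, 2, 8]

has_two = 2 in digits             # True
lacks_five = 5 not in digits      # True

# A slice is designated by a start index (inclusive) and an end index (exclusive).
first_two = digits[0:2]           # [1, 8]
# Omitted bounds default to the start or the end of the sequence.
from_index_one = digits[1:]       # [8, 2, 8]
```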

## Strings

The native data type for text in Python is called a string, and corresponds to the constructor `str` (*string coercion*). Strings are another example of a rich abstraction. Strings satisfy the two basic conditions of a sequence: they have a length and they support element selection. 

Addition and multiplication work the same way as for lists. For strings, `in` matches substrings rather than single elements. String literals can also span more than one line (*multiline literals*). A string containing Python code can be executed with `exec()`.
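
A small sketch of these string behaviors, using an arbitrary example string:

```python
city = 'Berkeley'

length = len(city)                     # 8
fourth_char = city[3]                  # 'k' (an element of a string is itself a string)

# Addition concatenates and multiplication repeats, as with lists.
greeting = city + ', CA'               # 'Berkeley, CA'
repeated = 'ab' * 3                    # 'ababab'

# Unlike lists, `in` matches whole substrings rather than single elements.
has_substring = 'kel' in city          # True

# str() coerces other objects to a string.
as_text = str(2) + ' is an element of ' + str([1, 2, 3])
```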

## Trees

**Closure property**: a method for combining data values has a closure property if the result of combination can itself be combined using the same method. For lists, this means that lists can be elements of other lists, and the result is again a list. Closure is the key to power in any means of combination because it permits us to create hierarchical structures.

A **tree** is a fundamental data abstraction that imposes regularity on how hierarchical values are structured and manipulated. 
- In a **recursive description**, a tree has a **root label** and a sequence of **branches**. Each branch of a tree is a tree. A tree with no branches is called a **leaf**. Any tree contained within a tree is called a **sub-tree** of that tree (such as a branch of a branch). The root of each sub-tree of a tree is called a **node** in that tree. 
- In a **relative description** each location in a tree is called a **node**. Each node has a **label** that can be any value. One node can be the **parent** or **child** of another node.

The data abstraction for a tree consists of the constructor `tree` and the selectors `label` and `branches`. A tree has a root label and a list of branches. Each branch is a tree. We begin with a simplified version.
```python
def tree(root_label, branches=[]):
    for branch in branches:
        assert is_tree(branch), 'branches must be trees' 
    return [root_label] + list(branches)

def label(tree):
    return tree[0]

def branches(tree):
    return tree[1:]

def is_tree(tree):
    if type(tree) != list or len(tree) < 1:    
        return False
    for branch in branches(tree):
        if not is_tree(branch):
            return False
    return True

def is_leaf(tree):
    return not branches(tree)

def count_leaves(tree):
    if is_leaf(tree):
        return 1
    else:
        branch_counts = [count_leaves(b) for b in branches(tree)]
        return sum(branch_counts)
```

Tree-recursive functions can be used to construct trees, like partition trees or the Fibonacci sequence.
```python
def fib_tree(n):
    if n == 0 or n == 1:
        return tree(n)
    else:
        left, right = fib_tree(n-2), fib_tree(n-1)    
        fib_n = label(left) + label(right)
        return tree(fib_n, [left, right])
```
Tree-recursive functions are also used to process trees, like the `count_leaves` function.

We say that a tree is a **binary tree** if it and each of its subtrees have at most two branches. Partition trees are binary trees. The process of converting a non-binary tree to a binary tree is called **binarization**.

## Linked lists

A **linked list** is a pair containing the first element of the sequence and the rest of the sequence. The second element of the pair (the rest) is itself a linked list. The rest of the innermost linked list is `'empty'`, a value that represents an empty linked list. Linked lists have a recursive structure: the rest of a linked list is a linked list or `'empty'`.

A representation of linked lists with constructor `link` and selectors `first` and `rest` is:
```python
empty = 'empty'

def is_link(s):
    """s is a linked list if it is empty or a (first, rest) pair."""
    return s == empty or (len(s) == 2 and is_link(s[1]))

def link(first, rest):
    """Construct a linked list from its first element and the rest."""
    assert is_link(rest), "rest must be a linked list."
    return [first, rest]

def first(s):
    """Return the first element of a linked list s."""
    assert is_link(s), "first only applies to linked lists."
    assert s != empty, "empty linked list has no first element."
    return s[0]

def rest(s):
    """Return the rest of the elements of a linked list s."""
    assert is_link(s), "rest only applies to linked lists."
    assert s != empty, "empty linked list has no rest."
    return s[1]
```
Linked lists can be implemented differently, for instance using functions.

Linked lists are sequences: they store values in order and satisfy the sequence abstraction (length and element selection), as these implementations show:
```python
def len_link(s):
    """Return the length of linked list s."""
    length = 0
    while s != empty:
        s, length = rest(s), length + 1
    return length
    
def getitem_link(s, i):
    """Return the element at index i of linked list s."""
    while i > 0:
        s, i = rest(s), i - 1
    return first(s)

def len_link_recursive(s):
    """Return the length of a linked list s."""
    if s == empty:
        return 0
    return 1 + len_link_recursive(rest(s))

def getitem_link_recursive(s, i):
    """Return the element at index i of linked list s."""
    if i == 0:
        return first(s)
    return getitem_link_recursive(rest(s), i - 1)
```
The first two functions are iterative implementations, while the last two are recursive.


# Mutable Data

## The Object Metaphor

**Objects** combine data values with behavior. Objects are both information and processes, bundled together to represent the properties, interactions, and behaviors of complex things. Objects represent information, but also behave like the things that they represent. All values in Python are objects. That is, all values have behavior and attributes.

An object is an instance of a **class**, which represents a kind of value. The object behaves as the value it represents. Given a class, we can _construct_ **instances** of that class by calling the class with some specific arguments. Objects have **attributes**, which are named values that are part of the object. We use dot notation to designate an attribute of an object: `<expression>.<name>`, where `<expression>` evaluates to an object and `<name>` is the name of an attribute of that object. Note that attribute names are not available in the general environment; they are specific to the object. Objects also have **methods**, which are function-valued attributes. Methods are functions that compute their results from both their arguments and their object.

## Mutability

*Bibliography*: https://towardsdatascience.com/https-towardsdatascience-com-python-basics-mutable-vs-immutable-objects-829a0cb1530a, https://stackoverflow.com/questions/8056130/immutable-vs-mutable-types, https://medium.com/@meghamohan/mutable-and-immutable-side-of-python-c2145cf72747.

In Python all objects have:
- **Identity**: an integer that usually corresponds to the object's location in memory. The identity of an object does not change.
- **Type**: defines the possible values of a group of objects and the operations they support. The type of an object does not change. A type of object is called a class. Classes are first-class values in Python.
- **Value**. The **mutability** of the value of an object is defined by its type.
    - **Immutable values**: values that cannot change. Examples: int, float, decimal, bool, string, tuple, and range.
    - **Mutable values**: values that can change. Examples: lists, dictionaries, sets and user-defined classes.

Some comments about mutability:
- Assigning a new value to a name bound to an immutable object binds the name to a different object with a different identity, i.e., the name points to a different object in memory. By contrast, a mutable object can change its value without changing its identity, i.e., the object itself changes.
- Two different names can point to the same object. For mutable objects, if we mutate the object through one name, the change is visible through the other name too, because there is only one object. For immutable objects, rebinding one name leaves the other unaffected.
- Mutable default arguments in functions are dangerous. They might produce unexpected results.
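
The three comments above can be sketched as follows. The function names `append_to` and `append_to_fresh` are hypothetical and exist only for this example.

```python
# Two names bound to the same mutable object see each other's changes.
a = [1, 2]
b = a
b.append(3)
same_object = a is b            # True; a is now [1, 2, 3]

# Rebinding a name to a new immutable value leaves the other name alone.
x = 10
y = x
x = x + 1                       # x now refers to a different int object; y is still 10

# A mutable default argument is created once and shared across calls.
def append_to(item, target=[]):
    target.append(item)
    return target

first_call = append_to(1)
second_call = append_to(2)      # the same list as before, now [1, 2]

# A common workaround: use None as the default and build the list inside.
def append_to_fresh(item, target=None):
    if target is None:
        target = []
    target.append(item)
    return target
```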

## Containers

A **container** is an object that contains references to other objects. Some examples of containers are tuples, lists, and dictionaries. The value of an immutable container that contains a reference to a mutable object can be changed if that mutable object is changed. However, the container is still considered immutable because when we talk about the mutability of a container only the identities of the contained objects are implied.
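
A minimal sketch of an immutable container holding a mutable object (the names are illustrative):

```python
inner = [1, 2]
container = (inner, 3)          # an immutable tuple holding a reference to a mutable list
identity_before = id(container[0])

inner.append(99)                # mutate the contained list
# The tuple now displays different contents, but it still refers to the same objects.
identity_after = id(container[0])

# Rebinding one of the tuple's slots is not allowed.
try:
    container[0] = [0]
    rebinding_allowed = True
except TypeError:
    rebinding_allowed = False
```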

### Lists

Properties of lists in Python:
- The built-in `list` function creates a new list that contains the values of its argument, which must be an iterable value such as a sequence. If another list is passed as an argument, then a copy of that list is created, but with different identity. The objects contained within  the new copy are the same.
- Any slice will produce a new list (new object) but the objects contained within the list are the same. Therefore, mutating a list within a sliced list will affect the original list.
- Adding two lists together creates a new list that contains the values of the first list, followed by the values in the second list. 
- `append` adds an object to the end of a list (not the copy of an object). If it is a mutable object, altering the original object alters the list. The method always returns `None`. 
- The `extend` method of a list takes an iterable value as an argument and adds each of its elements to the end of the list. Modifications of mutable values of the original iterable object do affect the final list (like lists in a list), but immutable objects do not. `x.extend(y)` is equivalent to `x += y`. The method always returns `None`.
- The `pop` method removes and returns the last element of the list. When given an integer argument `i`, it removes and returns the element at index `i` of the list.
- The `remove` method takes one argument that must be equal to a value in the list. It removes the first item in the list that is equal to its argument. The method always returns `None`.
- The `insert` method takes two arguments: an index and a value to be inserted. The value is added to the list at the given index. All elements before the given index stay the same, but all elements after the index have their indices increased by one. This method mutates the list by increasing its size by one, then returns `None`. 
- A list comprehension always creates a new list. This resulting list does not share any of its contents with the iterable expression, and evaluating the list comprehension does not modify the iterable expression.
- The `count` method of a list takes in an item as an argument and returns how many times an equal item appears in the list. If the argument is not equal to any element of the list, then `count` returns 0.
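
Several of the properties above can be traced in a short sketch. The list contents and variable names are arbitrary.

```python
original = ['coin', 'string', 'myriad']
alias = original                    # same object, two names
duplicate = list(original)          # new list, same elements
tail = original[1:]                 # slicing also builds a new list

append_result = alias.append('cup') # mutates the shared list; returns None
alias.extend(['sword', 'club'])     # adds each element of the iterable
popped = alias.pop()                # removes and returns the last element
alias.remove('string')              # removes the first element equal to 'string'
alias.insert(0, 'heart')            # later elements shift one index up

occurrences = original.count('coin')    # 1
missing = original.count('diamond')     # 0
```

Since `alias` and `original` name the same object, every mutation is visible through both names, while `duplicate` and `tail` keep the contents they had when they were created.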

### Tuples

A **tuple**, an instance of the built-in `tuple` type, is an immutable sequence. Elements are separated by commas and optionally surrounded by parentheses. Tuples share the non-mutating methods of lists (such as `count` and `index`), but the contents of a tuple cannot be modified. Tuples are used implicitly in multiple assignment.

### Dictionaries

**Dictionaries** are Python's built-in data type for storing and manipulating correspondence relationships. A dictionary contains **key-value** pairs, where both the keys and values are objects. The purpose of a dictionary is to provide an abstraction for storing and retrieving values that are indexed not by consecutive integers, but by descriptive keys. Strings commonly serve as keys, because strings are our conventional representation for names of things. Dictionaries were unordered collections of key-value pairs in earlier versions of Python. CPython 3.6 began preserving insertion order as an implementation detail, and since Python 3.7 insertion order is guaranteed by the language. Dictionaries do have some restrictions:
- A key of a dictionary cannot be or contain a mutable value.
- There can be at most one value for a given key.

Dictionaries have new methods, like `keys`, `values`, and `items`. A useful method implemented by dictionaries is `get`, which returns either the value for a key, if the key is present, or a default value. The arguments to `get` are the key and the default value. Dictionaries also have a comprehension syntax analogous to those of lists, but enclosed in curly brackets.
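
A short sketch of these dictionary operations, using a made-up dictionary of Roman numerals:

```python
numerals = {'I': 1, 'V': 5, 'X': 10}

ten = numerals['X']                       # look up a value by key
numerals['L'] = 50                        # add a new key-value pair

keys = list(numerals.keys())              # ['I', 'V', 'X', 'L']
total = sum(numerals.values())            # 66
pairs = list(numerals.items())            # list of (key, value) tuples

missing = numerals.get('A', 0)            # 0: the default, since 'A' is absent
present = numerals.get('V', 0)            # 5

# Dictionary comprehension: a key expression and a value expression separated by a colon.
squares = {x: x * x for x in range(3, 6)}  # {3: 9, 4: 16, 5: 25}
```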

## Local state

Lists and dictionaries have **local state**: they are changing values that have some particular contents at any point in the execution of a program. 

Functions can also have local state. This can happen when the function is non-pure. Calling the function not only returns a value, but also has the side effect of changing the function in some way, so that the next call with the same argument will return a different result. An implementation of such a function requires a new kind of statement: a `nonlocal` statement. The `nonlocal` statement declares that whenever we change the binding of a certain name, the binding is changed in the first frame in which that name is already bound (**enclosing scope**). Recall that without the `nonlocal` statement, an assignment statement would always bind a name in the first frame of the current environment. If the name has not previously been bound to a value, then the `nonlocal` statement will give an error. The `nonlocal` statement indicates that the name appears somewhere in the environment other than the first (local) frame or the last (global) frame.

Ever since we first encountered nested `def` statements, we have observed that a locally defined function can look up names outside of its local frames. No `nonlocal` statement is required to access a non-local name. By contrast, only after a `nonlocal` statement can a function change the binding of names in these frames. Now assignment statements have a dual role. Either they change local bindings, or they change nonlocal bindings.

This pattern of non-local assignment is a general feature of programming languages with higher-order functions and lexical scope. Most other languages do not require a `nonlocal` statement at all. Instead, non-local assignment is often the default behavior of assignment statements.

Python also has an unusual restriction regarding the lookup of names: within the body of a function, all instances of a name must refer to the same frame. As a result, Python cannot look up the value of a name in a non-local frame, then bind that same name in the local frame, because the same name would be accessed in two different frames in the same function. This restriction allows Python to pre-compute which frame contains each name before executing the body of a function. When this restriction is violated, an `UnboundLocalError` error message results. As we study interpreter design, we will see that pre-computing facts about a function body before executing it is quite common.
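
The restriction can be observed with a small sketch. The function name `make_counter_broken` is hypothetical; the point is only that `count` is both read and assigned in the body of `increment` without a `nonlocal` declaration.

```python
def make_counter_broken():
    count = 0
    def increment():
        # `count` is assigned in this body, so Python treats it as local
        # everywhere in the body; reading it before the assignment fails.
        count = count + 1
        return count
    return increment

try:
    make_counter_broken()()
    error_name = None
except UnboundLocalError as e:
    error_name = type(e).__name__     # 'UnboundLocalError'
```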

Note that mutable values can be changed without a `nonlocal` statement. Also remember that a `nonlocal` statement cannot be used to rebind global names, and that a name already bound in the current (local) frame cannot be declared `nonlocal`.

Non-local assignment has given us the ability to maintain some state that is local to a function, but evolves over successive calls to that function. The non-local name associated with a particular instance of a function is shared among all calls to that function. However, the binding for the same non-local name associated with a different instance of the function is inaccessible to the rest of the program.
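
A minimal sketch of a function with local state, in the style of a bank-account withdrawal. The names `make_withdraw`, `withdraw`, and `balance` are chosen for the example.

```python
def make_withdraw(balance):
    """Return a withdraw function that draws down balance with each call."""
    def withdraw(amount):
        nonlocal balance                 # rebind balance in the enclosing frame
        if amount > balance:
            return 'Insufficient funds'
        balance = balance - amount
        return balance
    return withdraw

wd = make_withdraw(20)
wd2 = make_withdraw(7)       # a separate instance with its own balance

after_first = wd(5)          # 15
after_second = wd(5)         # 10: same argument, different result
other = wd2(6)               # 1: unaffected by calls to wd
refused = wd(25)             # 'Insufficient funds'
```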

An expression that contains only pure function calls is **referentially transparent**; its value does not change if we substitute one of its subexpressions with the value of that subexpression. Re-binding operations violate the conditions of referential transparency because they do more than return a value; they change the environment.

## Iterators

An **iterator** is an object that provides sequential access to values, one by one. The iterator abstraction has two components:
- A mechanism for retrieving the next element in the sequence being processed. 
- A mechanism for signaling that the end of the sequence has been reached and no further elements remain.

For any container an iterator can be obtained by calling the built-in `iter` function. The contents of the iterator can be accessed by calling the built-in `next` function. Python signals that there are no more values available by raising a `StopIteration` exception when `next` is called.

An iterator maintains local state to represent its position in a sequence. Two separate iterators can track two different positions in the same sequence. Calling `iter` on an iterator will return that iterator, not a copy.
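
The behavior of `iter`, `next`, and `StopIteration` can be sketched on a small list (the values and names are arbitrary):

```python
primes = [2, 3, 5]
it = iter(primes)
other = iter(primes)          # a second, independent iterator over the same list

first = next(it)              # 2
second = next(it)             # 3
other_first = next(other)     # 2: `other` tracks its own position

same_iterator = iter(it) is it   # True: iter on an iterator returns it, not a copy

third = next(it)              # 5
try:
    next(it)
    exhausted = False
except StopIteration:
    exhausted = True          # no values remain
```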

The usefulness of iterators is derived from the fact that the underlying series of data for an iterator may not be represented explicitly in memory. An iterator provides a mechanism for considering each of a series of values in turn, but all of those elements do not need to be stored simultaneously. Instead, when the next element is requested from an iterator, that element may be computed on demand instead of being retrieved from an existing memory source.  Iterators are only required to compute the next element of the series, in order, each time another element is requested. While not as flexible as **random access** (accessing arbitrary elements of a sequence in any order), **sequential access** to sequential data is often sufficient for data processing applications (**lazy sequence processing**: ready to compute values, but actually computes them when needed).

## Iterables

Any value that can produce iterators is called an **iterable** value. Examples are strings, tuples, containers (dictionaries, sets...), iterators... A `for` statement can be used to iterate over the contents of any iterable or iterator.

The programmer has no direct control over the order of iteration of a set, and historically the same was true of dictionaries (which now iterate in insertion order), but Python does guarantee certain properties about their order in its specification. If a dictionary changes in structure because a key is added or removed, then all of its iterators become invalid, and future iterators may exhibit changes to the order of their contents. On the other hand, changing the value of an existing key does not invalidate iterators or change the order of their contents.

## Built-in Iterators

- `map(fun, iterable)`: returns a map object (which is an iterator) of the results after applying the given function to each item of a given iterable.
- `filter(fun, iterable)`: returns an iterator over a subset of the values in another iterable.
- `zip(iterable1, iterable2, ...)`: returns an iterator of tuples based on the iterable objects. Each tuple has one element from all the iterables.
- `reversed(sequence)`: returns an iterator over the elements of `sequence` in reverse order.

All these iterators use lazy sequence processing. To view the contents of an iterator, place the iterator into a container: `list(iterable)`, `tuple(iterable)`, `sorted(iterable)`.
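
To make the laziness observable, the sketch below records every call made by `map`. The helper `double` and the list `calls` are hypothetical names introduced for this example.

```python
calls = []

def double(x):
    calls.append(x)           # record each call to make the laziness observable
    return 2 * x

doubled = map(double, [1, 2, 3])
calls_before = list(calls)    # []: nothing has been computed yet

first = next(doubled)         # 2; only now is double(1) called
calls_after_one = list(calls) # [1]

remaining = list(doubled)     # [4, 6]; consuming the iterator computes the rest

evens = list(filter(lambda x: x % 2 == 0, range(6)))   # [0, 2, 4]
zipped = list(zip([1, 2, 3], 'abc'))                   # [(1, 'a'), (2, 'b'), (3, 'c')]
backwards = list(reversed([1, 2, 3]))                  # [3, 2, 1]
```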

## Generators

Generators allow us to define iterations over arbitrary sequences, even infinite sequences, by leveraging the features of the Python interpreter.

A **generator** is an iterator returned by a special class of function called a **generator function**. Generator functions are distinguished from regular functions in that rather than containing `return` statements in their body, they use `yield` statements to return elements of a series.

When called, a generator function doesn't return a particular yielded value, but instead a generator (which is a type of iterator) that itself can return the yielded values. Calling `next` on the generator continues execution of the generator function from wherever it left off previously until another yield statement is executed (lazy processing).

The first time `next` is called, the program executes statements from the body of the generator function until it encounters the `yield` statement. Then, it pauses and returns the value right after `yield`. `yield` statements do not destroy the newly created environment; they preserve it for later. When `next` is called again, execution resumes where it left off. The values of any bound names in the scope of the generator function are preserved across subsequent calls to `next`.

A `yield from` statement yields all values from an iterator or iterable (available since Python 3.3). It is useful in recursive generator functions.
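
A minimal sketch of two generator functions, one infinite and one recursive with `yield from`. The names `naturals` and `countdown` are chosen for the example.

```python
def naturals():
    """Yield 0, 1, 2, ... without end."""
    n = 0
    while True:
        yield n
        n += 1

def countdown(k):
    """Yield k, k-1, ..., 1 using recursion and yield from."""
    if k > 0:
        yield k
        yield from countdown(k - 1)

gen = naturals()
first_three = [next(gen), next(gen), next(gen)]   # [0, 1, 2]
fourth = next(gen)                                # 3: execution resumed where it paused

counted = list(countdown(3))                      # [3, 2, 1]
```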

## Streams

A **stream** is a lazily computed linked list. A `Stream` instance responds to requests for its `first` element and the `rest` of the stream. The `rest` of a `Stream` is itself a `Stream`. The `rest` of a stream is only computed when it is looked up, rather than being stored in advance. That is, the `rest` of a stream is computed lazily.

To achieve this lazy evaluation, a stream stores a function that computes the rest of the stream. Whenever this function is called, its returned value is cached as part of the stream in an attribute called `_rest`, named with an underscore to indicate that it should not be accessed directly.

The accessible attribute `rest` is a property method that returns the rest of the stream, computing it if necessary. With this design, a stream stores *how to compute* the rest of the stream, rather than always storing the rest explicitly.

```python
class Stream:
    """A lazily computed linked list."""
    class empty:
        def __repr__(self):
            return 'Stream.empty'
    empty = empty()
    def __init__(self, first, compute_rest=lambda: Stream.empty):
        assert callable(compute_rest), 'compute_rest must be callable.'
        self.first = first
        self._compute_rest = compute_rest
        self._rest = None
    @property
    def rest(self):
        """Return the rest of the stream, computing it if necessary."""
        if self._compute_rest is not None:
            self._rest = self._compute_rest()
            self._compute_rest = None
        return self._rest
    def __repr__(self):    
        return 'Stream({0}, <...>)'.format(repr(self.first))
```

When a `Stream` instance is constructed, the field `self._rest` is `None`, signifying that the rest of the `Stream` has not yet been computed. When the `rest` attribute is requested via a dot expression, the `rest` property method is invoked, which triggers computation with `self._rest = self._compute_rest()`. Because of the caching mechanism within a `Stream`, the `compute_rest` function is only ever called once, then discarded. The essential properties of a `compute_rest` function are that it takes no arguments, and it returns a `Stream` or `Stream.empty`.

The same higher-order functions that manipulate sequences -- `map` and `filter` -- also apply to streams, although their implementations must change to apply their argument functions lazily. See CP for implementation. The `map_stream` and `filter_stream` functions exhibit a common pattern in stream processing: a locally defined `compute_rest` function recursively applies a processing function to the rest of the stream whenever the rest is computed.

Streams contrast with iterators in that they can be passed to pure functions multiple times and yield the same result each time.
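
As a hedged sketch of lazy stream processing, the block below restates a compact `Stream` class so that it is self-contained, then defines an infinite stream and a lazy map. The helper names `integer_stream`, `map_stream`, and `first_k_as_list` are assumptions made for this example, not necessarily the implementations in CP.

```python
class Stream:
    """A lazily computed linked list (compact version for this example)."""
    class empty:
        def __repr__(self):
            return 'Stream.empty'
    empty = empty()

    def __init__(self, first, compute_rest=lambda: Stream.empty):
        self.first = first
        self._compute_rest = compute_rest
        self._rest = None

    @property
    def rest(self):
        """Return the rest of the stream, computing and caching it if necessary."""
        if self._compute_rest is not None:
            self._rest = self._compute_rest()
            self._compute_rest = None
        return self._rest

def integer_stream(first):
    """Return an infinite stream of increasing integers starting at first."""
    def compute_rest():
        return integer_stream(first + 1)
    return Stream(first, compute_rest)

def map_stream(fn, s):
    """Return a stream of fn applied lazily to each element of s."""
    if s is Stream.empty:
        return s
    def compute_rest():
        return map_stream(fn, s.rest)
    return Stream(fn(s.first), compute_rest)

def first_k_as_list(s, k):
    """Collect up to the first k elements of stream s in a list."""
    result = []
    while s is not Stream.empty and k > 0:
        result.append(s.first)
        s, k = s.rest, k - 1
    return result

squares = map_stream(lambda x: x * x, integer_stream(1))
prefix = first_k_as_list(squares, 4)        # [1, 4, 9, 16]
prefix_again = first_k_as_list(squares, 4)  # same result: the stream is not consumed
```

Unlike an iterator, `squares` can be traversed repeatedly from its first element, because each computed `rest` is cached rather than consumed.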


## Implementing Lists and Dictionaries

The Python language does not give us access to the implementation of lists, only to the sequence abstraction and mutation methods built into the language. In the CP text a mutable linked list is implemented. The contents of the list are stored in a nonlocal variable.

**Dispatch functions**. A dispatch function has the form `dispatch(message, value)`: depending on the `message` argument, it evaluates a different operation. This approach, which encapsulates the logic for all operations on a data value within one function that responds to different messages, is a discipline called **message passing**. Local state captured by the dispatch function can be rebound from inside it by declaring the name `nonlocal`.
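A minimal sketch of a dispatch function follows. The counter and its message names (`'get'`, `'add'`, `'reset'`) are hypothetical, chosen only to illustrate the idea; the CP text uses a mutable linked list instead.

```python
def make_counter(start=0):
    """Return a dispatch function that manages a hidden count."""
    count = start

    def dispatch(message, value=None):
        nonlocal count  # rebinding count requires a nonlocal declaration
        if message == 'get':
            return count
        elif message == 'add':
            count = count + value
            return count
        elif message == 'reset':
            count = start
            return count
        else:
            raise ValueError('unknown message: ' + repr(message))

    return dispatch


counter = make_counter()
counter('add', 5)
counter('add', 2)
print(counter('get'))  # 7
```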

Instead of a chain of `elif` clauses, a dispatch function can be implemented with a dictionary (**dispatch dictionaries**). The constructor returns a dictionary with the available messages as keys; the values can be numbers or functions. Storing state in the dictionary avoids the need for `nonlocal` statements.
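The sketch below shows one way a dispatch dictionary might look, using a bank account as the data value. The message names are illustrative assumptions.

```python
def account(initial_balance):
    """Return a dispatch dictionary representing a bank account."""
    def deposit(amount):
        # State lives in the dictionary, so no nonlocal statement is needed.
        dispatch['balance'] += amount
        return dispatch['balance']

    def withdraw(amount):
        if amount > dispatch['balance']:
            return 'Insufficient funds'
        dispatch['balance'] -= amount
        return dispatch['balance']

    dispatch = {'deposit': deposit,
                'withdraw': withdraw,
                'balance': initial_balance}
    return dispatch


a = account(20)
a['deposit'](5)
print(a['withdraw'](17))  # 8
print(a['balance'])       # 8
```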

## Propagating Constraints

Mutable data allows us to simulate systems with change, but also allows us to build new kinds of abstractions. **Constraint-based systems** support computation in multiple directions. Expressing programs as constraints is a type of **declarative programming**, in which a programmer declares the structure of a problem to be solved, but abstracts away the details of exactly how the solution to the problem is computed.

**Sketch the design of a general model of linear relationships**. We define _primitive constraints_ that hold between quantities. We also define a _means of combination_, so that primitive constraints can be combined to express more complex relations.  We combine constraints by constructing a _network_ in which constraints are joined by connectors. A _connector_ is an object that "holds" a value and may participate in one or more constraints.

Computation by such a network proceeds as follows: When a connector is given a value (by the user or by a constraint box to which it is linked), it awakens all of its associated constraints (except for the constraint that just awakened it) to inform them that it has a value. Each awakened constraint box then polls its connectors to see if there is enough information to determine a value for a connector. If so, the box sets that connector, which then awakens all of its associated constraints, and so on.

**Implementing the Constraint System**: connectors are dictionaries that map message names to function and data values (message passing). Constraints are also dictionaries. See the text for the complete implementation.



# Object-Oriented Programming

**Object-oriented programming (OOP)** is a method for organizing modular programs. The object system enables a new metaphor for designing programs in which several independent agents interact within the computer. Each object bundles together local state and behavior in a way that abstracts the complexity of both (distributed state). Objects communicate with each other, and useful results are computed as a consequence of their interaction. Not only do objects pass messages, they also share behavior among other objects of the same type and inherit characteristics from related types.

We have seen that an object is a data value that has **methods** and **attributes**, accessible via dot notation. Every object also has a **type**, called its **class**. To create new types of data, we implement new classes.

## Objects and Classes

Every object is an **instance** of some particular class. A class definition specifies the attributes and methods shared among objects of that class. A class serves as a template for its instances.

An **attribute** of an object is a name-value pair associated with the object, which is accessible via dot notation. The attributes specific to a particular object, as opposed to all objects of a class, are called **instance attributes** (which may also be called fields, properties, or instance variables). 

Functions that operate on the object or perform object-specific computations are called **methods**. We say that methods are _invoked_ on a particular object. Methods can have side effects.

User-defined classes are created by `class` statements, which consist of a single clause. A `class` statement defines the class name, then includes a suite of statements to define the attributes of the class:
```
class <name>:
    <suite>
```
When a class statement is executed, a new class is created and bound to `<name>` in the first frame of the current environment. The suite is then executed. Any names bound within the `<suite>` of a class statement, through `def` or assignment statements, create or modify attributes of the class.

Classes are typically organized around manipulating instance attributes. Instances have dot expression syntax. The class specifies the instance attributes of its objects by defining a method for initializing new objects. The `<suite>` of a class statement contains `def` statements that define new methods for objects of that class. The method that initializes objects has a special name in Python, `__init__`, and is called the **constructor** for the class.

The `__init__` method has two kinds of formal parameters. The first one, `self`, is bound to the newly created object. The remaining parameters are bound to the arguments passed to the class when it is instantiated. The constructor binds instance attributes to values; these bindings persist after `__init__` returns because they are stored as attributes of `self` using dot notation. By convention, we use the parameter name `self` for the first parameter of a constructor, because it is bound to the object being instantiated.

Each new instance has its own instance attributes, whose values are independent of other objects of the same class. To enforce this separation, every object that is an instance of a user-defined class has a unique identity. Object identity is compared using the `is` and `is not` operators.
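As an illustration, here is a small sketch of a class with a constructor. The `Account` class and its attribute names are assumptions made for the example.

```python
class Account:
    """A bank account with a holder and a balance."""
    def __init__(self, account_holder):
        # self is the new object; these assignments create instance attributes.
        self.balance = 0
        self.holder = account_holder


a = Account('Kirk')
b = Account('Kirk')
c = a                # a second name for the same object

a.balance = 100
print(b.balance)     # 0: b has its own instance attributes
print(a is b)        # False: equal-looking objects, different identities
print(a is c)        # True: both names refer to one object
```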

*Observation*: functions can also have attributes. See an example in sec. 2.8 of CP.

**Methods**. Object methods are also defined by a `def` statement in the suite of a `class` statement. Methods are the _messages_ an object can receive in dispatch dictionaries. While method definitions do not differ from function definitions in how they are declared, method definitions do have a different effect when executed. The function value that is created by a `def` statement within a `class` statement is bound to the declared name, but bound locally within the class as an attribute. That value is invoked as a method using dot notation from an instance of the class. Each method definition again includes a special first parameter `self`, which is bound to the object on which the method is invoked.

## Message Passing and Dot Expressions

Instance attributes and methods replicate much of the behavior of a dispatch dictionary in a message passing implementation of a data value.

**Dot expressions**. The central idea in message passing was that data values should have behavior by responding to messages that are relevant to the abstract type they represent. Dot notation is a syntactic feature of Python that formalizes the message passing metaphor. A dot expression consists of an expression, a dot, and a name: `<expression> . <name>`. The `<expression>` can be any valid Python expression, but the `<name>` must be a simple name (not an expression that evaluates to a name). A dot expression evaluates to the value of the attribute with the given `<name>`, for the object that is the value of the `<expression>`. The built-in function `getattr` also returns an attribute for an object by name. It is the function equivalent of dot notation.

The attributes of an object include all of its instance attributes, along with all of the attributes (including methods) defined in its class. We can also test whether an object has a named attribute with `hasattr`.
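A short sketch of `getattr` and `hasattr` in use, with a hypothetical `Account` class:

```python
class Account:
    interest = 0.02  # class attribute

    def __init__(self, holder):
        self.holder = holder
        self.balance = 0

    def deposit(self, amount):
        self.balance += amount
        return self.balance


acct = Account('Spock')
print(acct.balance)                    # 0
print(getattr(acct, 'balance'))        # 0: same lookup, name given as a string
print(hasattr(acct, 'deposit'))        # True: methods count as attributes
print(hasattr(acct, 'withdraw'))       # False
print(getattr(acct, 'withdraw', None)) # None: optional default avoids an error
```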

**Methods and functions**. When a method is invoked on an object, that object is implicitly passed as the first argument to the method. That is, the object that is the value of the `<expression>` to the left of the dot is passed automatically as the first argument to the method named on the right side of the dot expression. As a result, the object is bound to the parameter `self`.
    
To achieve automatic `self` binding, Python distinguishes between functions, which we already know, and **bound methods**, which couple together a function and the object on which that method will be invoked. A bound method value is already associated with its first argument, the instance on which it was invoked, which will be named `self` when the method is called.

An example illustrates the difference between functions and bound methods. Let `my_mthd` be a method of a class `MyClass`, and let `instance` be an instance of that class. The method can be accessed in two ways. First, `MyClass.my_mthd` is a plain function whose first parameter is `self`. Second, `instance.my_mthd` is a bound method: it takes one argument fewer than `MyClass.my_mthd` because `self` has already been bound to `instance`. The calls `MyClass.my_mthd(instance, args)` and `instance.my_mthd(args)` are therefore equivalent. In both cases, dot notation looks up an attribute of the instance or of its class.
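The distinction can be observed directly in an interpreter. The sketch below uses a hypothetical `Account` class with a `deposit` method.

```python
class Account:
    def __init__(self, holder):
        self.holder = holder
        self.balance = 0

    def deposit(self, amount):
        self.balance += amount
        return self.balance


acct = Account('Spock')

print(type(Account.deposit).__name__)  # function
print(type(acct.deposit).__name__)     # method

# Two equivalent calls: explicit self versus automatically bound self.
Account.deposit(acct, 10)
acct.deposit(10)
print(acct.balance)                    # 20

# A bound method remembers the instance it was looked up on.
print(acct.deposit.__self__ is acct)   # True
```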

**Naming Conventions**. Class names are conventionally written using the CapWords convention (also called CamelCase). Method names follow the standard convention of naming functions using lowercased words separated by underscores.

In some cases, there are instance variables and methods that are related to the maintenance and consistency of an object that we don't want users of the object to see or use. They are not part of the abstraction defined by a class, but instead part of the implementation. Python's convention dictates that if an attribute name starts with an underscore, it should only be accessed within methods of the class itself, rather than by users of the class.

## Class Attributes

Some attribute values are shared across all objects of a given class. Such attributes are associated with the class itself, rather than any individual instance of the class. These attributes are called **class attributes**. 

Class attributes are created by assignment statements in the suite of a class statement, outside of any method definition. In the broader developer community, class attributes may also be called class variables or static variables.

**Attribute names**. We could easily have a class attribute and an instance attribute with the same name. To solve this evaluation problem, a dot expression has the following evaluation procedure:
1. Evaluate the `<expression>` to the left of the dot, which yields the _object_ of the dot expression.
2. `<name>` is matched against the instance attributes of that object; if an attribute with that name exists, its value is returned.
3. If `<name>` does not appear among instance attributes, then `<name>` is looked up in the class, which yields a class attribute value.
4. That value is returned unless it is a function, in which case a bound method is returned instead, in which the object of the dot expression is bound to the first argument of the function.

In this evaluation procedure, instance attributes are found before class attributes, just as local names have priority over global in an environment. Methods defined within the class are combined with the object of the dot expression to form a bound method during the fourth step of this evaluation procedure.

**Attribute assignment**. All assignment statements that contain a dot expression on their left-hand side affect attributes for the object of that dot expression. If the object is an instance, then assignment sets an instance attribute. If the object is a class, then assignment sets a class attribute. As a consequence of this rule, assignment to an attribute of an object cannot affect the attributes of its class.
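The lookup and assignment rules can be sketched as follows, assuming a hypothetical `Account` class with a class attribute `interest`.

```python
class Account:
    interest = 0.02  # class attribute, shared by all instances

    def __init__(self, holder):
        self.holder = holder


kirk = Account('Kirk')
spock = Account('Spock')

# Assignment on the class changes the value seen through every instance.
Account.interest = 0.04
print(kirk.interest, spock.interest)  # 0.04 0.04

# Assignment on an instance creates an instance attribute that shadows
# the class attribute for that instance only.
kirk.interest = 0.08
print(kirk.interest, spock.interest)  # 0.08 0.04

# Later changes to the class attribute no longer show through kirk.
Account.interest = 0.05
print(kirk.interest, spock.interest)  # 0.08 0.05
```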

## Inheritance

When working in the object-oriented programming paradigm, we often find that different types are related. In particular, we find that similar classes differ in their amount of specialization. **Inheritance** is a method for relating classes together. Syntactically:
```
class <name>(<base class>):
    <suite>
```
A **subclass** **inherits** the attributes of its **base class**, but may **override** certain attributes, including certain methods. With inheritance, we only specify what is different between the subclass and the base class. Anything that we leave unspecified in the subclass is automatically assumed to behave just as it would for the base class.

We specify inheritance by placing an expression that evaluates to the base class in parentheses after the class name.

When Python resolves a name in a dot expression that is not an attribute of the instance, it tries to find that name in every base class in the inheritance chain for the original object's class. We can define this procedure recursively. To look up a name in a class:
1. If it names an attribute in the class, return the attribute value.
2. Otherwise, look up the name in the base class, if there is one.

**Calling ancestors**. Attributes that have been overridden are still accessible via class objects. Even so, it is a good idea to look up attributes on the instance rather than on a specific class whenever possible, so that values overridden further down the hierarchy are respected.
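A sketch of inheritance, overriding, and calling an ancestor's method, using hypothetical `Account` and `CheckingAccount` classes:

```python
class Account:
    interest = 0.02

    def __init__(self, holder):
        self.balance = 0
        self.holder = holder

    def deposit(self, amount):
        self.balance += amount
        return self.balance

    def withdraw(self, amount):
        if amount > self.balance:
            return 'Insufficient funds'
        self.balance -= amount
        return self.balance


class CheckingAccount(Account):
    """An account that charges a fee for each withdrawal."""
    withdraw_charge = 1
    interest = 0.01  # overrides Account.interest

    def withdraw(self, amount):
        # Call the overridden method through the base class object, but
        # read the charge from self so that subclasses can override it.
        return Account.withdraw(self, amount + self.withdraw_charge)


checking = CheckingAccount('Sam')
print(checking.deposit(10))   # 10: deposit is inherited from Account
print(checking.withdraw(5))   # 4: 5 plus a charge of 1
print(checking.interest)      # 0.01
```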

**Interfaces**. *Bibliography*: https://www.php.net/manual/en/language.oop5.interfaces.php 

It is extremely common in object-oriented programs that different types of objects will share the same attribute names. An **object interface** is a collection of attributes and conditions on those attributes. Object interfaces allow us to create code which specifies which methods a class must implement, without having to define how these methods are implemented. In some programming languages such as Java, interface implementations must be explicitly declared. 

An interface is provided so we can describe a set of functions and then hide the final implementation of those functions in an implementing class. This allows us to change the implementation of those functions without changing how we use it. The parts of our program that use objects (rather than implementing them) are most robust to future changes if they do not make assumptions about object types, but instead only about their attribute names. That is, they use the object abstraction, rather than assuming anything about its implementation.

### Multiple Inheritance

Python supports the concept of a subclass inheriting attributes from multiple base classes, a language feature called **multiple inheritance**. Attributes are looked up first in the current class and then in its base classes, in the order in which they were listed in the class statement. More precisely, Python resolves the ordering using a recursive algorithm called the C3 Method Resolution Ordering.
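The resolution order of a class can be inspected through its `__mro__` attribute. The class names below are placeholders for illustration.

```python
class Base:
    def describe(self):
        return 'Base'


class Left(Base):
    def describe(self):
        return 'Left'


class Right(Base):
    def describe(self):
        return 'Right'

    def only_right(self):
        return 'only in Right'


class Child(Left, Right):
    pass


print([c.__name__ for c in Child.__mro__])
# ['Child', 'Left', 'Right', 'Base', 'object']

child = Child()
print(child.describe())    # Left: Left is listed before Right
print(child.only_right())  # only in Right: found later in the ordering
```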

## The Role of Objects

Object-oriented programming is particularly well-suited to programs that model systems that have separate but interacting parts. Functional abstractions provide a more natural metaphor for representing relationships between inputs and outputs.  


# Object Abstraction

A central concept in object abstraction is a **generic or polymorphic function**, which is a function that can accept values of multiple different types. We will consider three different techniques for implementing generic functions: shared interfaces, type dispatching, and type coercion.

## Special Methods

To represent data effectively, an object value should behave like the kind of data it is meant to represent, including producing a string representation of itself. Python stipulates that all objects should produce two different string representations: one that is human-interpretable text and one that is a Python-interpretable expression. The constructor function for strings, `str`, returns a human-readable string. The result of calling `str` on the value of an expression is what Python prints using `print`. Where possible, the `repr` function returns a Python expression that evaluates to an equal object. The result of calling `repr` on the value of an expression is what Python prints in an interactive session. The `str` constructor often coincides with `repr`, but provides a more interpretable text representation in some cases. For most object types, `eval(repr(object)) == object`.

`repr` and `str` are polymorphic functions. The `repr` function always invokes a method called `__repr__` on its argument. By implementing this same method in user-defined classes, we can extend the applicability of `repr` to any class we create in the future. The `str` constructor is implemented in a similar manner. For more details on how these two functions work, watch 61A Fall 2015 Lecture 16 Video 3.
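As a sketch, a user-defined class can supply both representations. The `Ratio` class is a hypothetical example, and `__eq__` is included only so that the round trip through `eval` can be compared for equality.

```python
class Ratio:
    """A rational number stored as a numerator and a denominator."""
    def __init__(self, numer, denom):
        self.numer = numer
        self.denom = denom

    def __repr__(self):
        # A Python expression that rebuilds an equal object.
        return 'Ratio({0}, {1})'.format(self.numer, self.denom)

    def __str__(self):
        # A human-readable form, used by print.
        return '{0}/{1}'.format(self.numer, self.denom)

    def __eq__(self, other):
        return (isinstance(other, Ratio)
                and self.numer == other.numer
                and self.denom == other.denom)


half = Ratio(1, 2)
print(repr(half))   # Ratio(1, 2)
print(str(half))    # 1/2
print(half)         # 1/2

round_trip = eval(repr(half))
print(round_trip == half)  # True
```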

These polymorphic functions are examples of a more general principle: certain functions should apply to multiple data types. Moreover, one way to create such a function is to use a shared attribute name with a different definition in each class.

In Python, certain special names are invoked by the Python interpreter in special circumstances.
- `__bool__`: objects of user-defined classes are considered to be true, but the special `__bool__` method can be used to override this behavior.
- `__len__`: the `len` function invokes the `__len__` method of its argument to determine its length. Python uses a sequence's length to determine its truth value, if it does not provide a `__bool__` method.
- `__getitem__`: the `__getitem__` method is invoked by the element selection operator.
- `__call__`: Python also allows us to define objects that can be "called" like functions by including a `__call__` method. With this method, we can define a class that behaves like a higher-order function.
- ...
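The special methods listed above can be sketched with a few small classes. `Deck`, `AlwaysFalse`, and `Adder` are hypothetical names used only for this illustration.

```python
class Deck:
    """A sequence-like container."""
    def __init__(self, cards):
        self._cards = list(cards)

    def __len__(self):
        return len(self._cards)

    def __getitem__(self, index):
        return self._cards[index]


class AlwaysFalse:
    def __bool__(self):
        return False


class Adder:
    """An object that can be called like a function."""
    def __init__(self, n):
        self.n = n

    def __call__(self, k):
        return self.n + k


deck = Deck(['A', 'K', 'Q'])
print(len(deck))            # 3
print(deck[0])              # A
print(bool(deck))           # True: non-zero length, no __bool__ defined
print(bool(Deck([])))       # False: length is zero
print(bool(AlwaysFalse()))  # False: __bool__ overrides the default

add_three = Adder(3)
print(add_three(4))         # 7
```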

## Multiple Representations

There might be more than one useful representation for a data object, and we might like to design systems that can deal with multiple representations. In addition to the data-abstraction barriers that isolate representation from use, we need abstraction barriers that isolate different design choices from each other and permit different choices to coexist in a single program.

**Interfaces**. Object attributes, which are a form of message passing, allow different data types to respond to the same message in different ways. A shared set of messages that elicit similar behavior from different classes is a powerful method of abstraction. An interface is a set of shared attribute names, along with a specification of their behavior.

**Properties**. The requirement that two or more attribute values maintain a fixed relationship with each other is a new problem. One solution is to store attribute values for only one representation and compute the other representation whenever it is needed. Python has a simple feature for computing attributes on the fly from zero-argument functions. The `@property` decorator allows functions to be called without call expression syntax (parentheses following an expression). We can therefore expose attributes for several representations, with those that are not stored computed on the fly from the one that is. In short, `@property` lets a method be accessed as if it were a non-method attribute.

Another useful decorator is `@<attribute>.setter`, which designates a method to be called whenever that attribute is assigned. `@property` allows a method to work as an instance attribute, and `@<attribute>.setter` allows assignment to that attribute to run custom code (the attribute must be one created using `@property`).
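A minimal sketch of both decorators, assuming a hypothetical `Temperature` class that stores only the Celsius value and derives the Fahrenheit one:

```python
class Temperature:
    def __init__(self, celsius):
        self.celsius = celsius  # the only stored representation

    @property
    def fahrenheit(self):
        # Computed on every access; no parentheses are needed to read it.
        return self.celsius * 9 / 5 + 32

    @fahrenheit.setter
    def fahrenheit(self, value):
        # Runs on assignment and updates the stored representation.
        self.celsius = (value - 32) * 5 / 9


t = Temperature(100)
print(t.fahrenheit)   # 212.0

t.fahrenheit = 32
print(t.celsius)      # 0.0
```

Because only `celsius` is stored, the two attributes cannot drift out of sync.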

The interface approach to encoding multiple representations has appealing properties. The class for each representation can be developed separately; they must only agree on the names of the attributes they share, as well as any behavior conditions for those attributes. The interface is also additive. If another programmer wanted to add another representation to the same program, they would only have to create another class with the same attributes.

## Type Dispatching and Type Coercion

**Type dispatching**. One way to implement cross-type operations is to select behavior based on the types of the arguments to a function or method. The idea of type dispatching is to write functions that inspect the type of arguments they receive, then execute code that is appropriate for those types. 

The built-in function `isinstance` takes an object and a class. It returns true if the object has a class that either is or inherits from the given class. We can also give a `type_tag` to different types to check whether they are the same. If not, we need a cross-type operation implemented in some carefully controlled way, so that we can support it without seriously violating our abstraction barriers.

To support type dispatching, the interface must be modified so that it checks the types of its arguments and acts accordingly. For instance, we can use a dictionary whose keys are tuples of types (or type tags) and whose values are the functions that operate on arguments of those types.
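One possible shape for such a dictionary is sketched below. The `Meters` and `Centimeters` classes, their `type_tag` values, and the helper names are assumptions for the example.

```python
class Meters:
    type_tag = 'm'

    def __init__(self, value):
        self.value = value


class Centimeters:
    type_tag = 'cm'

    def __init__(self, value):
        self.value = value


def add_m_m(a, b):
    return Meters(a.value + b.value)

def add_cm_cm(a, b):
    return Centimeters(a.value + b.value)

def add_m_cm(a, b):
    return Meters(a.value + b.value / 100)

def add_cm_m(a, b):
    return add_m_cm(b, a)


# One entry per pair of type tags, including the cross-type cases.
add_implementations = {
    ('m', 'm'): add_m_m,
    ('cm', 'cm'): add_cm_cm,
    ('m', 'cm'): add_m_cm,
    ('cm', 'm'): add_cm_m,
}


def add(a, b):
    """Add two lengths by dispatching on the type tags of both arguments."""
    return add_implementations[(a.type_tag, b.type_tag)](a, b)


total = add(Meters(1), Centimeters(50))
print(type(total).__name__, total.value)  # Meters 1.5
```

The number of entries grows with the square of the number of types, which is the cost that coercion tries to reduce.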

**Coercion**. In the general situation of completely unrelated operations acting on completely unrelated types, implementing explicit cross-type operations, cumbersome though it may be, is the best that one can hope for. Fortunately, we can sometimes do better by taking advantage of additional structure that may be latent in our type system. Often the different data types are not completely independent, and there may be ways by which objects of one type may be viewed as being of another type. This process is called coercion.

We can create a coercions dictionary which indexes all possible coercions by a pair of type tags, indicating that the corresponding value coerces a value of the first type to a value of the second type.
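A sketch of a coercions dictionary follows. The classes, tags, and the `coerce` helper are hypothetical. Only one conversion function is registered, and it serves both argument orders.

```python
class Meters:
    type_tag = 'm'

    def __init__(self, value):
        self.value = value


class Centimeters:
    type_tag = 'cm'

    def __init__(self, value):
        self.value = value


def cm_to_m(c):
    return Meters(c.value / 100)


# (from_tag, to_tag) -> function converting a value of the first type
coercions = {('cm', 'm'): cm_to_m}


def coerce(a, b):
    """Return a and b converted to a common type, if a coercion exists."""
    a_tag, b_tag = a.type_tag, b.type_tag
    if a_tag == b_tag:
        return a, b
    if (a_tag, b_tag) in coercions:
        return coercions[(a_tag, b_tag)](a), b
    if (b_tag, a_tag) in coercions:
        return a, coercions[(b_tag, a_tag)](b)
    raise TypeError('no coercion between {0} and {1}'.format(a_tag, b_tag))


def add(a, b):
    a, b = coerce(a, b)
    # After coercion both arguments share a class, so one rule suffices.
    return type(a)(a.value + b.value)


total = add(Centimeters(250), Meters(1))
print(type(total).__name__, total.value)  # Meters 3.5
```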


# Efficiency

**Efficiency** refers to the computational resources used by a representation or process, such as how much time and memory are required to compute the result of a function or represent an object. These amounts can vary widely depending on the details of an implementation.

## Measuring Efficiency

A reliable way to characterize the efficiency of a program is to measure how many times some event occurs. This can be achieved using a user-defined decorator.
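A minimal sketch of such a decorator, counting function calls. The names `count` and `call_count` are assumptions for this example.

```python
def count(f):
    """Return a version of f that records how many times it is called."""
    def counted(*args):
        counted.call_count += 1
        return f(*args)
    counted.call_count = 0  # stored as an attribute of the wrapper function
    return counted


@count
def fib(n):
    if n <= 1:
        return n
    return fib(n - 2) + fib(n - 1)


print(fib(10))         # 55
print(fib.call_count)  # 177
```

Because the name `fib` is rebound to the counted version, the recursive calls are counted as well.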

To understand the **space (memory) requirements** of a function, we must specify generally how memory is used, preserved, and reclaimed in our environment model of computation. In evaluating an expression, the interpreter preserves all active environments and all values and frames referenced by those environments. An environment is active if it provides the evaluation context for some expression being evaluated. An environment becomes inactive whenever the function call for which its first frame was created finally returns. A well-designed interpreter can reclaim the memory that was used to store this frame. 

## Memoization

Tree-recursive computational processes often repeat the same computation many times, which makes them inefficient. They can often be made more efficient through **memoization**, a powerful technique for increasing the efficiency of recursive functions that repeat computation. A memoized function will store the return value for any arguments it has previously received.

Memoization can be expressed naturally as a higher-order function, which can also be used as a decorator. The definition below creates a cache of previously computed results, indexed by the arguments from which they were computed. Because the cache is a dictionary, the argument to the memoized function must be hashable (in practice, immutable), and the function itself should be pure so that a cached result is always valid.
```python
def memo(f):
    cache = {}
    def memoized(n):
        if n not in cache:
            cache[n] = f(n)
        return cache[n]
    return memoized
```
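As a usage sketch, `memo` can be applied as a decorator to a tree-recursive Fibonacci function. The `stats` dictionary is an assumption added only to count how often the underlying function body runs.

```python
def memo(f):
    cache = {}
    def memoized(n):
        if n not in cache:
            cache[n] = f(n)
        return cache[n]
    return memoized


stats = {'calls': 0}


@memo
def fib(n):
    stats['calls'] += 1  # counts executions of the undecorated body
    if n <= 1:
        return n
    return fib(n - 2) + fib(n - 1)


print(fib(30))         # 832040
print(stats['calls'])  # 31: once for each n from 0 to 30
```

Without memoization the same call would execute the body more than a million times.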

## Orders of Growth

A useful way to categorize the consumption of computational resources is the **order of growth** of a process, which expresses in simple terms how the resource requirements of a process grow as a function of the input.

**Big Theta Notation**. Let $n$ be a parameter that measures the size of the input to some process, and let $R(n)$ be the amount of some resource that the process requires for an input of size $n$. We say that $R(n)$ has order of growth $\Theta(f(n))$, written $R(n)=\Theta(f(n))$ (pronounced "theta of f(n)"), if there are positive constants $k_1$ and $k_2$ independent of $n$ such that
$$ k_1 f(n) \leq R(n) \leq k_2 f(n) $$
for any value of $n$ larger than some minimum $m$. In other words, for large $n$, the value $R(n)$ is always sandwiched between two values that both scale with $f(n)$:
- A lower bound $k_1 f(n)$.
- An upper bound $k_2 f(n)$.

Orders of growth are defined only up to a multiplicative constant: $\Theta(2n)$ is equivalent to $\Theta(n)$, and $\Theta(\log_e n)$ is equivalent to $\Theta(\log_{10} n)$, since logarithms in different bases differ by a constant factor.

**Nesting**. When an inner computational process is repeated for each step in an outer process, then the order of growth of the entire process is a product of the number of steps in the outer and inner processes.

**Lower-order terms**. As the input to a process grows, the fastest growing part of a computation dominates the total resources used. Theta notation captures this intuition. In a sum, all but the fastest growing term can be dropped without changing the order of growth.
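Both observations can be checked empirically by counting steps. In the sketch below, the hypothetical function `count_steps` performs $n^2 + n$ steps; the ratio to $n^2$ approaches 1 as $n$ grows, which is consistent with an order of growth of $\Theta(n^2)$.

```python
def count_steps(n):
    """Count the steps of a nested loop followed by a single linear pass."""
    steps = 0
    for i in range(n):        # outer process: n steps
        for j in range(n):    # inner process: n steps per outer step
            steps += 1
    for k in range(n):        # lower-order term: n extra steps
        steps += 1
    return steps


for n in (10, 100, 1000):
    total = count_steps(n)
    print(n, total, total / n ** 2)
# 10 110 1.1
# 100 10100 1.01
# 1000 1001000 1.001
```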

**Big O Notation**. Big O Notation describes the upper bound for the amount of resources used. Big O says *at most*, big Theta says *at least* and *at most*.


# Recursive Objects

Objects can have other objects as attribute values. When an object of some class has an attribute value of that same class, it is a **recursive object**.

Using OOP instead of function-based abstract data types (ADTs) simplifies the constructor and selectors. With ADTs we had to invent a way to combine the pieces (lists, dictionaries...) and also a way to select those pieces. With the Python object system we do not have to decide how the pieces are combined, because we always use the same approach: each part is an attribute, which can be selected using a dot expression. Nevertheless, the code that creates and processes tree instances (as methods or functions) is equivalent in both styles.
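A linked list is a simple recursive object. The sketch below assumes a `Link` class whose `rest` attribute is either another `Link` or the empty tuple.

```python
class Link:
    """A linked list whose rest is another Link or Link.empty."""
    empty = ()

    def __init__(self, first, rest=empty):
        assert rest is Link.empty or isinstance(rest, Link)
        self.first = first
        self.rest = rest

    def __len__(self):
        # len(()) is 0, so the recursion bottoms out at Link.empty.
        return 1 + len(self.rest)

    def __getitem__(self, i):
        if i == 0:
            return self.first
        return self.rest[i - 1]

    def __repr__(self):
        if self.rest is Link.empty:
            return 'Link({0})'.format(repr(self.first))
        return 'Link({0}, {1})'.format(repr(self.first), repr(self.rest))


s = Link(3, Link(4, Link(5)))
print(len(s))        # 3
print(s[1])          # 4
print(s.rest.first)  # 4: parts are selected with dot expressions
print(s)             # Link(3, Link(4, Link(5)))
```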

## Sets

In addition to the list, tuple, and dictionary, Python has a fourth built-in container type called a **set**. Set literals follow the mathematical notation of elements enclosed in braces. Duplicate elements are removed upon construction. Sets are unordered collections, and so the printed ordering may differ from the element ordering in the set literal. Sets are mutable objects.

Python sets support a variety of operations, including membership tests, length computation, the standard set operations of union and intersection, and tests for disjointness, subsets, and supersets.
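A few of these operations, shown on an arbitrary example set:

```python
s = {3, 2, 1, 4, 4}            # the duplicate 4 is removed on construction

print(len(s))                          # 4
print(3 in s)                          # True
print(s.union({1, 5}) == {1, 2, 3, 4, 5})         # True
print(s.intersection({6, 5, 4, 3}) == {3, 4})     # True
print(s.isdisjoint({7, 8}))            # True
print({1, 2}.issubset(s))              # True
print(s.issuperset({1, 2}))            # True

# Sets are mutable: union and intersection return new sets, add mutates.
s.add(10)
print(10 in s)                         # True
```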