Some notes for my Python intro talk, part 2

This talk uses Python 3.5, which is the latest version at the time of writing. You may come across a lot of code requiring Python 2.7, the last of the Python 2 series. Some deliberate backward incompatibilities were introduced in Python 3 to fix problems that could not be handled in a backward-compatible fashion.

Previously covered in part 1:
* Numbers
* Strings
* Lists, mutability
* Dictionaries
* Control constructs: `for`-loop, `if`-statement
* Function definitions

Now we continue with our stock-control example...

In [None]:
stock = {"apples" : 5, "oranges" : 3, "pears" : 3}

## Sets

Where dictionaries store an association of keys with values, sets store only the presence of the keys.

In [None]:
citrus = {"oranges", "lemons"}

A set literal is written like a dictionary literal, but with only the keys and no “`: value`” parts. However, note that empty braces “`{}`” denote an empty *dictionary*, not an empty *set*: the latter is constructed with the built-in `set` function called with no arguments: `set()`.
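A quick way to convince yourself of the difference is to ask Python for the types. This is only a small sketch; the names `empty_braces` and `empty_set` are made up for the illustration.

```python
empty_braces = {}      # empty braces give a dictionary
empty_set = set()      # an empty set has to be spelled like this

print(type(empty_braces))   # <class 'dict'>
print(type(empty_set))      # <class 'set'>

# With at least one element and no colons, braces give a set.
citrus = {"oranges", "lemons"}
print(type(citrus))         # <class 'set'>
```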

Demonstration of set-membership test:

In [None]:
for k in stock :
    if k in citrus :
        print(k, stock[k])
    #end if
#end for

Once we have more categories, why not put them into a dictionary, keyed on the category name?

In [None]:
categories = {"citrus" : {"oranges", "lemons"}, "pome" : {"apples", "pears", "quinces"}}

Example of a function taking an argument:

In [None]:
def show_stock_in_category(category_name) :
    for k in stock :
        if k in categories[category_name] :
            print(k, stock[k])
        #end if
    #end for
#end show_stock_in_category

Example use:

In [None]:
show_stock_in_category("citrus")

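Alternatively, we can modify the original `show_stock` function to take an optional category. Note the use of short-cut boolean evaluation: the right-hand side of the `or` is only evaluated if the left-hand side is false.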

In [None]:
def show_stock(category_name = None) :
    "shows items and quantities in the stock-keeping system for the specified category, or all categories if omitted."
    for k in stock :
        if category_name is None or k in categories[category_name] :
            print(k, stock[k])
            if stock[k] < 2 :
                print(k, "running low")
            #end if
        #end if
    #end for
#end show_stock
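The short-cut evaluation matters here, because `None` is not a key in `categories`. The following sketch uses a cut-down copy of the data and shows that the lookup on its own would fail, while the combined condition does not. The names `lookup_failed` and `matches` are just for this illustration.

```python
categories = {"citrus" : {"oranges", "lemons"}}  # cut-down copy of the earlier data
category_name = None

# The right-hand side would fail if it were evaluated on its own,
# because None is not a key in `categories`...
try :
    "oranges" in categories[category_name]
    lookup_failed = False
except KeyError :
    lookup_failed = True
#end try

# ...but `or` stops as soon as the left-hand side is true,
# so the lookup never happens here.
matches = category_name is None or "oranges" in categories[category_name]
print(lookup_failed, matches)
```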

In [None]:
show_stock()

still works as before, while

In [None]:
show_stock("citrus")

now also works. As does specifying the argument name:

In [None]:
show_stock(category_name = "citrus")

Naming arguments like this can be handy for specifying them out of order, or for omitting arguments that have defaults. It is also good as a documentation aid, since the reader is more likely to remember argument names than their order.
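With only one argument, `show_stock` cannot really show off the out-of-order case, so here is a sketch using a hypothetical helper, `describe_item`, which is not part of the stock-control example:

```python
# Hypothetical helper, only here to illustrate keyword arguments.
def describe_item(name, quantity = 0, unit = "each") :
    return "{} x {} ({})".format(quantity, name, unit)
#end describe_item

# Positional: the reader has to remember the argument order.
a = describe_item("apples", 5, "kg")
# Keywords: any order, and the call documents itself.
b = describe_item(unit = "kg", quantity = 5, name = "apples")
# Leave `quantity` at its default while still setting `unit`.
c = describe_item("apples", unit = "kg")
print(a)
print(b)
print(c)
```

The first two calls produce the same string; the third falls back to the default quantity of zero.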

## Classes

You previously saw that Python lists also work as one-dimensional arrays. What if you want to define, say, two-dimensional arrays? It is easy enough to have an array of arrays, with elements referenced by *a*`[`*i*`][`*j*`]`, but what if you want to use a syntax more like multidimensional arrays in other languages, i.e. *a*`[`*i*`, `*j*`]`?

First, let us define the basic behaviour of our two-dimensional array class. We will define `get` and `set` methods which, given *i* and *j* arguments, will return or update the corresponding array elements. As in other OO languages, we need to define a special *constructor* method that will initialize newly-created class instances.

In Python, all methods are just function definitions in the class, with the class instance passed as the first argument. The function definition can give any name you like to this argument, but it is common to use the name `self`.
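To make the point about the first argument concrete, here is a minimal sketch with a hypothetical `Greeter` class (not part of the array example). Calling the method through an instance and calling the function through the class, with the instance passed explicitly, give the same result.

```python
# Hypothetical class, only to illustrate how `self` is passed.
class Greeter :

    def greet(self, name) :
        return "hello, " + name
    #end greet

#end Greeter

g = Greeter()
# The usual method-call syntax passes `g` as `self` automatically...
via_instance = g.greet("world")
# ...which is equivalent to calling the function through the class
# and passing the instance explicitly.
via_class = Greeter.greet(g, "world")
print(via_instance == via_class)
```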

There is no member visibility control (“public”/“private”/“protected” etc). All members are accessible to the caller. As GvR says, “we’re all consenting adults here”. There is a convention to begin internal member names with a single underscore, as a hint to the caller that Here Be Tygers.

The constructor is a method with the special name `__init__`.

In [None]:
class Array :

    def __init__(self, nr_rows, nr_cols, initval) :
        self.nr_rows = nr_rows
        self.nr_cols = nr_cols
        self.data = [initval] * nr_rows * nr_cols
    #end __init__

    def get(self, i, j) :
        if not isinstance(i, int) or not isinstance(j, int) or i < 0 or i >= self.nr_rows or j < 0 or j >= self.nr_cols :
            raise IndexError("invalid array indices")
        #end if
        return self.data[i * self.nr_cols + j]
    #end get

    def set(self, i, j, val) :
        if not isinstance(i, int) or not isinstance(j, int) or i < 0 or i >= self.nr_rows or j < 0 or j >= self.nr_cols :
            raise IndexError("invalid array indices")
        #end if
        self.data[i * self.nr_cols + j] = val
    #end set

#end Array

Constructing a class instance involves invoking the class name as though it were a function that returns the new instance; the arguments passed are those to the `__init__` method (skipping the first one):

In [None]:
arr = Array(3, 3, 0)

This creates a new 3×3 array, with all elements initialized to the integer 0.

In [None]:
print(arr.get(2, 1))

In [None]:
arr.set(2, 1, 9)
print(arr.get(2, 1))

OK, this `get`/`set` notation works, but how do we use regular two-dimensional array notation? The answer is to add more specially-named methods to the class definition:

    def __getitem__(self, index) :
        return self.get(index[0], index[1])
    #end __getitem__

    def __setitem__(self, index, val) :
        self.set(index[0], index[1], val)
    #end __setitem__


In [None]:
class Array :

    def __init__(self, nr_rows, nr_cols, initval) :
        self.nr_rows = nr_rows
        self.nr_cols = nr_cols
        self.data = [initval] * nr_rows * nr_cols
    #end __init__

    def get(self, i, j) :
        if not isinstance(i, int) or not isinstance(j, int) or i < 0 or i >= self.nr_rows or j < 0 or j >= self.nr_cols :
            raise IndexError("invalid array indices")
        #end if
        return self.data[i * self.nr_cols + j]
    #end get

    def set(self, i, j, val) :
        if not isinstance(i, int) or not isinstance(j, int) or i < 0 or i >= self.nr_rows or j < 0 or j >= self.nr_cols :
            raise IndexError("invalid array indices")
        #end if
        self.data[i * self.nr_cols + j] = val
    #end set

    def __getitem__(self, index) :
        return self.get(index[0], index[1])
    #end __getitem__

    def __setitem__(self, index, val) :
        self.set(index[0], index[1], val)
    #end __setitem__
    
#end Array

Don’t forget to recreate the array object:

In [None]:
arr = Array(3, 3, 0)

Now let us try the notation:

In [None]:
print(arr[2, 1])

The `__getitem__` method is used in an expression to get a value as above, while `__setitem__` comes into play on the left-hand side of an assignment:

In [None]:
arr[2, 1] = 9
print(arr[2, 1])

Note how the methods are implemented: the array indices are combined into a single tuple argument, which the methods here call `index`. See how they extract the individual indices and pass them to the regular `get` and `set` methods we previously defined.
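You can see the tuple for yourself with a throwaway class whose `__getitem__` simply returns whatever it was given. The class name `ShowIndex` is made up for this sketch.

```python
# Hypothetical class that just reports what it was indexed with.
class ShowIndex :

    def __getitem__(self, index) :
        return index
    #end __getitem__

#end ShowIndex

probe = ShowIndex()
print(probe[2, 1])      # (2, 1): two indices arrive as one tuple
print(probe[(2, 1)])    # (2, 1): same thing with explicit parentheses
print(probe[2])         # 2: a single index is passed through as-is
```

One consequence is that, with the `Array` class as written above, `arr[2]` hands `__getitem__` a plain integer, so the `index[0]` lookup fails with a `TypeError`.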

What happens if we try to print the array object itself? What do we see?

In [None]:
print(arr)

The answer is, nothing very exciting. But we can fix this, by adding yet another method with a special name: the `__repr__` method, whose job it is to return some human-readable string representation:

    def __repr__(self) :
        return "Array(%d, %d, %s)" % (self.nr_rows, self.nr_cols, repr(self.data))
    #end __repr__

This will return a string that shows the dimensions of the array, and the contents of its elements.

In [None]:
class Array :

    def __init__(self, nr_rows, nr_cols, initval) :
        self.nr_rows = nr_rows
        self.nr_cols = nr_cols
        self.data = [initval] * nr_rows * nr_cols
    #end __init__

    def get(self, i, j) :
        if not isinstance(i, int) or not isinstance(j, int) or i < 0 or i >= self.nr_rows or j < 0 or j >= self.nr_cols :
            raise IndexError("invalid array indices")
        #end if
        return self.data[i * self.nr_cols + j]
    #end get

    def set(self, i, j, val) :
        if not isinstance(i, int) or not isinstance(j, int) or i < 0 or i >= self.nr_rows or j < 0 or j >= self.nr_cols :
            raise IndexError("invalid array indices")
        #end if
        self.data[i * self.nr_cols + j] = val
    #end set

    def __getitem__(self, index) :
        return self.get(index[0], index[1])
    #end __getitem__

    def __setitem__(self, index, val) :
        self.set(index[0], index[1], val)
    #end __setitem__

    def __repr__(self) :
        return "Array(%d, %d, %s)" % (self.nr_rows, self.nr_cols, repr(self.data))
    #end __repr__

#end Array

Let us create an instance of the new class, redo the assignment to the array element, and see how it prints:

In [None]:
arr = Array(3, 3, 0)
arr[2, 1] = 9
print(arr)

A bit more readable, don’t you think?

Now, about defining a custom meaning for a built-in operator, like “+”. To do this we need to add a method with the special name `__add__`. This example definition will take an `Array` and a value, and return a new `Array` with the value added to every element:

    def __add__(self, n) :
        result = Array(self.nr_rows, self.nr_cols, None)
        for i in range(self.nr_rows) :
            for j in range(self.nr_cols) :
                result[i, j] = self[i, j] + n
            #end for
        #end for
        return result
    #end __add__

Note that the elements don’t have to be numbers: anything for which “+” is valid will work.


In [None]:
class Array :

    def __init__(self, nr_rows, nr_cols, initval) :
        self.nr_rows = nr_rows
        self.nr_cols = nr_cols
        self.data = [initval] * nr_rows * nr_cols
    #end __init__

    def get(self, i, j) :
        if not isinstance(i, int) or not isinstance(j, int) or i < 0 or i >= self.nr_rows or j < 0 or j >= self.nr_cols :
            raise IndexError("invalid array indices")
        #end if
        return self.data[i * self.nr_cols + j]
    #end get

    def set(self, i, j, val) :
        if not isinstance(i, int) or not isinstance(j, int) or i < 0 or i >= self.nr_rows or j < 0 or j >= self.nr_cols :
            raise IndexError("invalid array indices")
        #end if
        self.data[i * self.nr_cols + j] = val
    #end set

    def __getitem__(self, index) :
        return self.get(index[0], index[1])
    #end __getitem__

    def __setitem__(self, index, val) :
        self.set(index[0], index[1], val)
    #end __setitem__

    def __repr__(self) :
        return "Array(%d, %d, %s)" % (self.nr_rows, self.nr_cols, repr(self.data))
    #end __repr__

    
    def __add__(self, n) :
        result = Array(self.nr_rows, self.nr_cols, None)
        for i in range(self.nr_rows) :
            for j in range(self.nr_cols) :
                result[i, j] = self[i, j] + n
            #end for
        #end for
        return result
    #end __add__

#end Array

In [None]:
arr = Array(3, 3, 2)
print(arr + 5)

This works because

In [None]:
(2).__add__(5)

is equivalent to

In [None]:
2 + 5

If you want the `Array` instance to update itself in place, then you define a method that implements the “+=” operator, the name of which is `__iadd__`, e.g.

    def __iadd__(self, n) :
        for i in range(self.nr_rows) :
            for j in range(self.nr_cols) :
                self[i, j] += n
            #end for
        #end for
        return self
    #end __iadd__


In [None]:
class Array :

    def __init__(self, nr_rows, nr_cols, initval) :
        self.nr_rows = nr_rows
        self.nr_cols = nr_cols
        self.data = [initval] * nr_rows * nr_cols
    #end __init__

    def get(self, i, j) :
        if not isinstance(i, int) or not isinstance(j, int) or i < 0 or i >= self.nr_rows or j < 0 or j >= self.nr_cols :
            raise IndexError("invalid array indices")
        #end if
        return self.data[i * self.nr_cols + j]
    #end get

    def set(self, i, j, val) :
        if not isinstance(i, int) or not isinstance(j, int) or i < 0 or i >= self.nr_rows or j < 0 or j >= self.nr_cols :
            raise IndexError("invalid array indices")
        #end if
        self.data[i * self.nr_cols + j] = val
    #end set

    def __getitem__(self, index) :
        return self.get(index[0], index[1])
    #end __getitem__

    def __setitem__(self, index, val) :
        self.set(index[0], index[1], val)
    #end __setitem__

    def __repr__(self) :
        return "Array(%d, %d, %s)" % (self.nr_rows, self.nr_cols, repr(self.data))
    #end __repr__

    
    def __add__(self, n) :
        result = Array(self.nr_rows, self.nr_cols, None)
        for i in range(self.nr_rows) :
            for j in range(self.nr_cols) :
                result[i, j] = self[i, j] + n
            #end for
        #end for
        return result
    #end __add__

    def __iadd__(self, n) :
        for i in range(self.nr_rows) :
            for j in range(self.nr_cols) :
                self[i, j] += n
            #end for
        #end for
        return self
    #end __iadd__

#end Array

Create a new instance:

In [None]:
arr = Array(3, 3, 2)

Try the new method:

In [None]:
arr += 4
print(arr)

## More Looping Fun ##

Returning to our previous stock-control example, suppose we would like the stock printout to include an item number on each line. One way to do this is as follows:

In [None]:
def show_itemized_stock() :
    i = 0
    for k in sorted(stock) :
        i += 1
        print("{}. {:.<12}{}".format(i, k, stock[k]))
    #end for
#end show_itemized_stock
show_itemized_stock()

However, Python offers a built-in function called `enumerate`, which makes this a little easier:

In [None]:
def show_itemized_stock() :
    for i, k in enumerate(sorted(stock)) :
        print("{}. {:.<12}{}".format(i + 1, k, stock[k]))
    #end for
#end show_itemized_stock
show_itemized_stock()

Note that, in common with Python conventions elsewhere, `enumerate` numbers things from zero, so we have to add 1 to get numbering that starts from one.
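As a variation, `enumerate` also accepts an optional starting value, which saves doing the addition by hand. The sketch below repeats the sample `stock` data so that it stands alone; the `numbered` list is only there to show what `enumerate` hands out.

```python
stock = {"apples" : 5, "oranges" : 3, "pears" : 3}  # same sample data as earlier

def show_itemized_stock() :
    # The `start` argument sets the first number that enumerate hands out.
    for i, k in enumerate(sorted(stock), start = 1) :
        print("{}. {:.<12}{}".format(i, k, stock[k]))
    #end for
#end show_itemized_stock
show_itemized_stock()

numbered = list(enumerate(sorted(stock), start = 1))
print(numbered)
```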

### Permutations ###

Consider the problem of generating all permutations of a given list, e.g. the list `[1, 2, 3]` has $3! = 6$ permutations:

    [1, 2, 3]
    [1, 3, 2]
    [2, 3, 1]
    [2, 1, 3]
    [3, 1, 2]
    [3, 2, 1]

How do we generate these? The general algorithm can be expressed *recursively* as follows:

* If the list is empty, then there is only one permutation: the empty list.
* Otherwise, pick each element of the list in turn. For each such selection:
  * for each permutation of the remaining items in the list, put the previously-selected element on the front, and return this as a permutation.

This can be expressed in Python as follows:

In [None]:
def permute(l) :
    if len(l) == 0 :
        yield []
    else :
        for i, elt in enumerate(l) :
            for rest in permute(l[:i] + l[i + 1:]) :
                yield [elt] + rest
            #end for
        #end for
    #end if
#end permute

This function is called a *generator*. Instead of being entered, returning a result and terminating, it produces a sequence of results: each `yield` statement only *suspends* the execution of the function, so that it can be resumed from where it left off, until it executes another `yield` or it terminates.
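The suspend-and-resume behaviour is easier to see on something smaller than `permute`. Here is a sketch with a hypothetical generator, `count_up_to`, driven one step at a time with the built-in `next` function:

```python
def count_up_to(n) :
    # Hypothetical generator, used only to show suspend/resume.
    i = 1
    while i <= n :
        yield i  # execution pauses here until the next value is requested
        i += 1
    #end while
#end count_up_to

gen = count_up_to(3)    # nothing in the function body has run yet
first = next(gen)       # runs up to the first yield
second = next(gen)      # resumes after the yield, loops, yields again
remaining = list(gen)   # drains whatever is left
print(first, second, remaining)
```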

One way to use such a function is as the iterator expression in a loop, where the loop iterates once for each `yield` from the generator:

In [None]:
for c in permute([1, 2, 3]) :
    print(c)
#end for

What’s the advantage of using a generator? It can be handy to avoid storing the whole of a large list in memory at once, where you only need to process one element at a time. For example, the function might do a database query, and `yield` each matching record, one at a time, and there might be a million matching records.
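The same effect can be seen with `permute` itself. In this sketch, `itertools.islice` from the standard library takes just the first three permutations of a 10-element list; the other 3628797 are never constructed. The function is repeated here so the example stands alone, and `first_three` is a name made up for the illustration.

```python
import itertools

def permute(l) :
    # Same generator as defined earlier, repeated so this example stands alone.
    if len(l) == 0 :
        yield []
    else :
        for i, elt in enumerate(l) :
            for rest in permute(l[:i] + l[i + 1:]) :
                yield [elt] + rest
            #end for
        #end for
    #end if
#end permute

# A 10-element list has 10! = 3628800 permutations, but only the
# ones actually asked for are ever produced.
first_three = list(itertools.islice(permute(list(range(10))), 3))
for p in first_three :
    print(p)
#end for
```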

## Reflection/RTTI ##

*Reflection* is a fancy term for being able to examine and manipulate type information at run-time. Or a more limited form of this might be called *Run-Time Type Information* (RTTI). Some languages have a complex system for doing this, usually with only partial functionality. Python, on the other hand, being a fully dynamic language, can offer full access, even being able to do things like create new types at run time.
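As a small taste of creating a type at run time, the built-in `type` function can be called with three arguments: the class name, a tuple of base classes, and a dictionary of attributes. The `Point` class and `describe` function below are hypothetical, chosen only for this sketch.

```python
# A plain function that will become a method of the new class.
def describe(self) :
    return "point at ({}, {})".format(self.x, self.y)
#end describe

# Build the class at run time instead of with a `class` statement.
Point = type("Point", (object,), {"x" : 0, "y" : 0, "describe" : describe})

p = Point()
p.x = 3
print(type(p).__name__)   # Point
print(p.describe())       # point at (3, 0)
```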

The `issubclass` built-in function lets you query subclass/superclass relationships. This works among the built-in types as well.

In [None]:
issubclass(int, float)

In [None]:
issubclass(bool, int)

To determine the type of an object, you can use the `type` built-in function, as shown in previous examples. Types are objects too, and in particular you can compare them for equality:

In [None]:
type(3) == int

But if you need to check that a value is of an acceptable type, it is better to use the `isinstance` function, since this will also accept values of subclasses (yes, even the built-in types can be subclassed):

In [None]:
isinstance(3, int)
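The difference between the two checks only shows up once a subclass is involved. Here is a sketch using a hypothetical subclass of `int` called `Quantity`:

```python
class Quantity(int) :
    "hypothetical subclass of the built-in int, for illustration only."
    pass
#end Quantity

q = Quantity(3)
exact_match = type(q) == int          # False: the type is Quantity, not int
instance_match = isinstance(q, int)   # True: Quantity is a subclass of int
print(exact_match, instance_match)
print(q + 1)                          # 4: it still behaves like an int
```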

You can check that a value is of (or a subclass of) any of a tuple of types:

In [None]:
isinstance(3, (int, float)), isinstance(3.0, (int, float))

If you want to check that a value is of a numeric type, the `numbers` module provides *abstract base classes* to make this more convenient:

In [None]:
import numbers
issubclass(int, numbers.Real), issubclass(float, numbers.Real)

In [None]:
isinstance(3, float)

In [None]:
isinstance(3, numbers.Real)

## `any` and `all` ##

Suppose you want to check that all elements of a list are of a particular type. Rather than writing a loop statement and collecting the results, you can write what is called a *generator expression* (a close relative of the *list comprehension*) directly as the argument to `all`.

In [None]:
sample_list_1 = [1, 2, 3]
sample_list_2 = [1, "two", 3]

In [None]:
all(isinstance(i, int) for i in sample_list_1), \
all(isinstance(i, int) for i in sample_list_2)
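For comparison, here is a sketch of the same test written three ways: as an explicit loop, with a list comprehension (square brackets, builds the whole list of results first), and with the bracket-less form used above (hands results to `all` one at a time). The variable names are made up for the illustration.

```python
sample_list_2 = [1, "two", 3]

# Explicit loop, collecting the result by hand.
all_ints = True
for i in sample_list_2 :
    if not isinstance(i, int) :
        all_ints = False
        break
    #end if
#end for

# List comprehension: builds the whole list of results first.
as_list = [isinstance(i, int) for i in sample_list_2]

# Without the brackets: results are fed to all() one at a time.
as_genexp = all(isinstance(i, int) for i in sample_list_2)

print(all_ints, as_list, all(as_list), as_genexp)
```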

If you wanted the opposite condition, the obvious way would be to put a `not` on the front:

    not all(isinstance(i, int) for i in sample_list_1), \
    not all(isinstance(i, int) for i in sample_list_2)

Another way to express it is using the complementary function to `all`, which is `any`:

In [None]:
any(not isinstance(i, int) for i in sample_list_1), \
any(not isinstance(i, int) for i in sample_list_2)

In this case, it doesn’t look like one is obviously better than the other. It might be different in other cases.
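The two forms always agree, including for an empty list, where `all` returns `True` and `any` returns `False`. A small sketch to check this (the `samples` and `results` names are just for the illustration):

```python
samples = [[1, 2, 3], [1, "two", 3], []]
results = []
for s in samples :
    with_not_all = not all(isinstance(i, int) for i in s)
    with_any = any(not isinstance(i, int) for i in s)
    results.append((with_not_all, with_any))
    print(s, with_not_all, with_any)
#end for
```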

**Summary:** The core of the Python language can be defined very compactly (I estimate the language reference is about 140 printed pages), certainly compared to other general-purpose languages. Most of the power of the language comes from libraries, both standard ones that come with the language and a whole host of third-party ones. These libraries take full advantage of the power of the core, so using them becomes like using a whole lot of additional features built into the language. You can get some flavour of this power from the examples above, but more will become apparent as you delve into the libraries.

Have fun.