<center><h1>A Primer on Python Classes</h1></center>
<br>
<center><b>Collin Capano</b></center>

The `inference` module in PyCBC makes heavy use of Python classes in order to manage different samplers and models with a common API. In this tutorial we provide an overview of:
 * Object oriented programming in Python, including what a `class` is and how to define it.
 * Class inheritance.
 * Abstract base classes.
 * How these are used in PyCBC Inference.
 
In addition, the Appendix provides some more background on common decorators and different ways to define and call functions.

### Table of Contents

 * [I. Brief review of functions](#I.-Brief-review-of-functions)
 * [II. Namespaces and scope](#II.-Namespaces-and-scope)
 * [III. Classes](#III.-Classes)
   - [The need for classes](#The-need-for-classes)
   - [Defining classes](#Defining-classes)
   - [What's the deal with `self`?](#What's-the-deal-with-self?)
   - [The `__init__` method](#The-__init__-method)
 * [IV. Class inheritance](#IV.-Class-inheritance)
   - [Using `super`](#Using-super)
   - [Multiple inheritance](#Multiple-inheritance)
 * [V. Abstract Base Classes](#V.-Abstract-Base-Classes)
 * [VI. Usage in `pycbc.inference`](#VI.-Usage-in-pycbc.inference)
 * [Further reading](#Further-reading)
 * [Appendix](#Appendix)
   - [A. Decorators](#A.-Decorators)
   - [B. More on functions](#B.-More-on-functions)

### Prerequisites

In [1]:
from __future__ import print_function

## I. Brief review of functions

When we start up a Python environment, we have all of Python's built-in functions, keywords, and operators available to us, such as `print`, `import`, `for`, `if`, etc. With the `=` operator we can define variables, which we can then modify using other operators, such as `+`, `-`, etc. For example, let's define a float called `a` and set it to some initial value:

In [2]:
a = 1.234

We can now carry out various operations with `a`, e.g.:

In [3]:
print('Add one to a:', a + 1)
print('Square a:', a**2)

Add one to a: 2.234
Square a: 1.522756


Say we wanted to find the [floor](https://en.wikipedia.org/wiki/Floor_and_ceiling_functions) of `a`. We could do that with the following:

In [4]:
print('The floor of a is:')
# int() truncates toward zero, so whole numbers need no adjustment
if a > 0 or a == int(a):
    print(int(a))
else:
    print(int(a-1))

The floor of a is:
1


Now introduce two other variables, `b` and `c`:

In [5]:
b = -3.14
c = 7.2

Say we also want the floor of these. To do that, we could copy and paste the code above twice, once for `b` and once for `c`. However, repeating code is generally a bad idea. First, it makes your code long and tedious to read. Second, what if you want to modify it later, or you find there's a bug? Now you'd have to go and fix every repetition of it, which can easily lead to more bugs. This rule of thumb is so important, it's worth a call out:
<br>
<br>
<center><div class="alert alert-block alert-info">
    <b><big>Do not repeat code.</big></b>
    <br>
    <small>(Unless you really need to.)</small>
</div></center>

Instead, we can define a function, and put our code in there:

In [6]:
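def floor(x):
    # int() truncates toward zero, which is already the floor for
    # positive numbers and for whole numbers such as -3.0
    if x > 0 or x == int(x):
        x = int(x)
    else:
        x = int(x-1)
    return x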

We can now easily apply the same function to all of our variables, without repeating several lines of code:

In [7]:
print("floor(a):", floor(a))
print("floor(b):", floor(b))
print("floor(c):", floor(c))

floor(a): 1
floor(b): -4
floor(c): 7


For more on functions, including various ways to define them and call them, see the [Appendix](#B.-More-on-functions).

## II. Namespaces and scope

Before moving on to classes, it is instructive to understand how mappings between objects and values are handled in Python.

In the definition of `floor` we created an argument called `x`, which we modified within the function. No variable `x` was defined prior to the function definition. And even though we have called `floor` three times, no variable `x` exists in the notebook:

In [8]:
print(x)

NameError: name 'x' is not defined

To understand what's going on here, let's establish a few definitions. When we set `a = 1.234`, we created a *mapping* from the object called `a` to the value `1.234`. We created similar mappings for `b` and `c`. Likewise, when we defined the function `floor`, we created a mapping from the object called `floor` to the block of code defined above. A collection of mappings is called a *namespace*. Currently, the namespace of our notebook/module (called the module's `global` namespace) is `a`, `b`, `c`, and `floor`. Our notebook also has access to Python's `builtins` namespace. The *[scope](https://en.wikipedia.org/wiki/Scope_(computer_science))* of the notebook --- i.e., the collection of namespaces that are searched to resolve objects --- is the notebook's namespace plus the `builtins`.

When we define a function, we essentially define a new, *local* namespace within that function. The function's local namespace includes all of the arguments that the function takes, plus any mappings that are created within that function. So, when we called `floor(a)`, a namespace was created in which a local object `x` was mapped to the value of `a`, `1.234`. The code within `floor` was then evaluated. Since `1.234` is greater than 0, this line was executed:
```
    x = int(x)
```
At this point the local `x` is remapped to the integer `1`. The line `return x` then passes the value that `x` is mapped to back to the notebook. At this point, *the function's namespace is deleted*, meaning that the object `x` ceases to exist. The same applies to any mappings that are defined within the function. For example:

In [9]:
def quadsum(a, b):
    x = a**2
    y = b**2
    return (x + y)**(0.5)

In [10]:
print(quadsum(b, c))

7.854909292919938


In [11]:
print(y)

NameError: name 'y' is not defined
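
One way to see a function's local namespace before it is deleted is to copy it with the built-in `locals()`. The sketch below is a variant of `quadsum`; the name `quadsum_verbose` and the returned `snapshot` dictionary are illustrative additions, not part of the code above.

```python
def quadsum_verbose(a, b):
    """Like quadsum, but also returns a copy of the local namespace."""
    x = a**2
    y = b**2
    # copy the local mappings that exist at this point in the function
    snapshot = dict(locals())
    return (x + y)**(0.5), snapshot


result, snapshot = quadsum_verbose(3.0, 4.0)
print(result)            # 5.0
print(sorted(snapshot))  # ['a', 'b', 'x', 'y']
```

The snapshot holds the two arguments and the two intermediate values, and nothing from the notebook's namespace.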

If we want the result of `quadsum` to exist in the notebook's namespace, we need to assign an object to it; e.g.:

In [12]:
d = quadsum(b, c)
print(d)

7.854909292919938


Likewise, if we want other objects that were created within a function to persist, we need to pass those objects using `return`. For example:

In [13]:
def prodquot(a, b):
    prod = a*b
    quot = a / b
    return prod, quot

In [14]:
btc, bdc = prodquot(b, c)
print(btc, bdc)

-22.608 -0.4361111111111111


In both `quadsum` and `prodquot`, the arguments of the function were called `a` and `b`. But we already had an `a` and `b` defined in the notebook's namespace! This was not problematic because whenever Python is evaluating a block of code, *local mappings take precedence*. So, when we called `prodquot(b, c)`, a local namespace was created in which `a` was mapped to the notebook's `b`, and a local `b` was mapped to the notebook's `c`.

If a function uses an object that is not defined in the local namespace, then Python will step out to the next enclosing namespace to look for it. This process will repeat until it gets to the module's `global` namespace, followed by the `builtins`; if nothing can be found there, a `NameError` is raised.

For example, when the code inside `prodquot(b, c)` is being evaluated, 3 namespaces exist. In order of object resolution, they are:
  1. `prodquot` namespace: `{a: -3.14, b: 7.2}`
  2. `global` namespace: `{a: 1.234, b: -3.14, c: 7.2, prodquot: <prodquot code>, ... }`
  3. `builtins` namespace: `{print: <print code>, int: <int code>, ... }`

As we will see below, understanding nested namespaces is key to understanding how classes work.
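
The lookup order can be sketched with a few toy functions. All of the names below (`value`, `outer`, `inner`, and so on) are made up for illustration; each function resolves a name at a different level.

```python
value = 'global'


def outer():
    value = 'enclosing'  # local to outer; hides the outer-level value

    def inner():
        # `value` is not local to inner, so Python looks one level out
        return value

    return inner()


def uses_outer_level():
    # no local `value`: resolved in the namespace the function was defined in
    return value


def uses_builtin():
    # `len` is not defined by us anywhere; it is found in the builtins
    return len('abc')


def uses_missing():
    # this name exists in no namespace, so the lookup fails
    return not_defined_anywhere


print(outer())             # enclosing
print(uses_outer_level())  # global
print(uses_builtin())      # 3
try:
    uses_missing()
except NameError as err:
    print('NameError:', err)
```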

*****
<center><div class="alert alert-block alert-danger"><b>Warning</b></div></center>

*****

Due to the way mappings are resolved, it is possible to define a function that uses a variable from outside of its local namespace. For example:

In [15]:
def dontdothis(b):
    return a + b

In [16]:
dontdothis(c)

8.434000000000001

Here, because `a` was not listed in `dontdothis`'s arguments (and was not defined anywhere in the function), Python ends up using the `a` defined in the notebook's `global` namespace. We therefore end up with `1.234 + 7.2`.

If it wasn't clear from the function name, **don't do this.** The danger is that the output of the function can change depending on where it is called in the code, even if the arguments are the same. For example, if at some later point we set `a` to a different value:

In [17]:
a = 10.0

Then calling `dontdothis` with the same argument will yield a different result:

In [18]:
dontdothis(c)

17.2

This can lead to unexpected bugs that are hard to track down. Thus:
<br>
<br>
<center><div class="alert alert-block alert-info">
    <b><big>You should include all variables needed by a function in its list of arguments.</big></b>
    <br>
</div></center>
<br>
There are some cases where this rule may need to be broken, but these are rare, and should be avoided if possible.
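
A minimal sketch of a version that follows the rule. The name `add` is hypothetical; the point is only that both inputs arrive through the argument list, so rebinding a variable elsewhere cannot change the result of a call.

```python
def add(a, b):
    """Return a + b using only the values passed in."""
    return a + b


a = 1.234
c = 7.2
first = add(a, c)

a = 10.0                # rebinding the notebook-level `a` ...
second = add(1.234, c)  # ... cannot affect a call that passes its inputs explicitly

print(first == second)  # True
```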

## III. Classes

### The need for classes

It is possible to write code in Python that only uses functions and built-in data types (a style usually called [procedural programming](https://en.wikipedia.org/wiki/Procedural_programming)). However, in certain cases this can lead to unwieldy code that is difficult to manage.

As a simple example (borrowed from [here](https://stackoverflow.com/a/33072722)), say we wanted some code to keep track of a student's progress in a class. We could do this using a dictionary:

In [19]:
# create student Jane
jane = {}
jane['name'] = 'Jane'
jane['homework_grades'] = [87., 90., 82., 75., 97.]
jane['exam_grades'] = [88., 80., 94.]

You also give your students the opportunity to earn extra credit by writing a report. Jane, the hardworking student that she is, takes advantage of this and writes an excellent report. You add that to her grades:

In [20]:
jane['extra_credit'] = 99.

You now write a function to calculate your students' GPA:

In [21]:
def gpa(student):
    homeworkavg = float(sum(student['homework_grades']))/len(student['homework_grades'])
    examavg = float(sum(student['exam_grades']))/len(student['exam_grades'])
    totalavg = (homeworkavg + examavg + student['extra_credit'])/3.
    return 4.0 * totalavg / 100.

... which you use to get Jane's score:

In [22]:
print(gpa(jane))

3.6337777777777776


Now you want to do the same for your other student, Susie:

In [23]:
susie = {}
susie['name'] = 'Susie'
susie['homework_grades'] = [80., 73., 77., 50., 0.]
susie['exam_grades'] = [70., 63., 50.]

In [24]:
print(gpa(susie))

KeyError: 'extra_credit'

Oh! Susie - a bit of a slacker - didn't do the extra credit (even though she needed it more). Since she had no score to record, you never added an extra credit entry for her, and your function fails on the missing key. You can fix this by going back and changing your `gpa` function, or by adding the missing data to Susie's dictionary. However, this highlights a disadvantage of this type of programming: the object (in this case a student, represented by a dictionary) *is not well defined.* This can lead to pitfalls when trying to write functions that will manipulate your objects.
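
As a sketch of the first option, the function can treat the extra credit as optional by using `dict.get`. The name `gpa_tolerant` is hypothetical, and averaging over only the scores that are present is an assumption about the grading policy.

```python
def gpa_tolerant(student):
    """GPA on a 4.0 scale; the 'extra_credit' entry is optional."""
    homework = student['homework_grades']
    exams = student['exam_grades']
    scores = [sum(homework) / float(len(homework)),
              sum(exams) / float(len(exams))]
    # dict.get returns None instead of raising KeyError when the key is absent
    extra = student.get('extra_credit')
    if extra is not None:
        scores.append(extra)
    return 4.0 * sum(scores) / len(scores) / 100.


susie = {'name': 'Susie',
         'homework_grades': [80., 73., 77., 50., 0.],
         'exam_grades': [70., 63., 50.]}
jane = {'name': 'Jane',
        'homework_grades': [87., 90., 82., 75., 97.],
        'exam_grades': [88., 80., 94.],
        'extra_credit': 99.}

print(gpa_tolerant(susie))
print(gpa_tolerant(jane))
```

This works, but every function that touches a student now has to remember which keys are optional.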

### Defining classes

Python classes are a way to bundle data and functions together into logically coherent objects (this is known as [object oriented programming](https://en.wikipedia.org/wiki/Object-oriented_programming)). Like functions, they create their own local namespaces, within which you can manipulate objects. Unlike functions, these namespaces persist even when you are not interacting with the class.

To illustrate how classes work, let's revisit the above problem. We want an object that represents a student, which has a number of *attributes*, such as `name`, `homework_grades`, etc. To do that, we define a `class`:

In [28]:
class Student(object):
    """Stores information about students."""
    
    def __init__(self, name):
        self.name = name
        self.homework_grades = []
        self.exam_grades = []
        self.extra_credit = None

    def homework_avg(self):
        """Average homework grade."""
        return sum(self.homework_grades)/float(len(self.homework_grades))
    
    def exam_avg(self):
        """Average exam grade."""
        return sum(self.exam_grades)/float(len(self.exam_grades))
    
    def gpa(self):
        """GPA (on a 4.0 scale)."""
        if self.extra_credit is not None:
            totalavg = (self.homework_avg() + self.exam_avg() + self.extra_credit) / 3.
        else:
            totalavg = (self.homework_avg() + self.exam_avg())/2.
        return 4.0 * totalavg / 100.

Now we create `Student` *instances*, one for Jane, and one for Susie:

In [29]:
jane = Student('Jane')
jane.homework_grades = [87., 90., 82., 75., 97.]
jane.exam_grades = [88., 80., 94.]
jane.extra_credit = 99.

susie = Student('Susie')
susie.homework_grades = [80., 73., 77., 50., 0.]
susie.exam_grades = [70., 63., 50.]

We can now get their GPAs:

In [30]:
jane.gpa()

3.6337777777777776

In [31]:
susie.gpa()

2.34

What did we do here? First, we defined the class `Student`, to which we added methods `__init__`, `homework_avg`, `exam_avg`, and `gpa`. None of the code inside these methods was executed until we created an *instance* of `Student` by calling:
```
jane = Student('Jane')
```
At this point, a local namespace called `jane` was created, which contained the mappings:
```
{name: 'Jane', homework_grades: [], exam_grades: [], extra_credit: None,
 homework_avg: <Student.homework_avg code>, exam_avg: <Student.exam_avg code>,
 gpa: <Student.gpa code>}
```
This is similar to what happens when a function is called. However, unlike a function, the namespace persisted after this line, allowing us to access its attributes with `.`; e.g., `jane.homework_grades`. Once we populated the relevant attributes, we were then able to call `gpa` to get Jane's GPA.
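
The persistent namespace can be inspected with the built-in `vars`. One caveat on the simplified picture above: in CPython the data attributes are stored on the instance, while the methods are stored once on the class and found through it; attribute access with `.` sees both. The trimmed-down `Student` below is a stand-in for the full class.

```python
class Student(object):
    """Trimmed-down stand-in for the Student class."""

    def __init__(self, name):
        self.name = name
        self.homework_grades = []

    def homework_avg(self):
        return sum(self.homework_grades) / float(len(self.homework_grades))


jane = Student('Jane')
jane.homework_grades = [87., 90.]

instance_names = sorted(vars(jane))
print(instance_names)                   # ['homework_grades', 'name']
print('homework_avg' in vars(jane))     # False: not stored on the instance
print('homework_avg' in vars(Student))  # True: stored on the class
print(jane.homework_avg())              # 88.5
```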

### What's the deal with `self`?

Functions that are defined in a class are called *methods*. Their purpose is to act on a class instance's attributes. In our `Student` example there were four methods: `__init__`, `homework_avg`, `exam_avg`, and `gpa`. All of these took `self` as their first argument. Why?

In order for a method to act on an instance's attributes, it must have some way to reference the class *instance*. For example, when we call `jane.gpa()`, we want the function `Student.gpa` to act on *Jane*'s grades. This is the purpose of `self`: it represents the class instance, so that we can access attributes of the instance within the function definition.

Note that when we called `jane.gpa()` we did not need to provide any arguments. This is because **Python automatically adds the class instance as the first argument when a method is called.** Thus:
<br>
<div class="alert alert-block alert-info">
    <b><big>1. All class method definitions must have <code>self</code> as their first argument.</big></b> (With two exceptions; see Appendix A, below.)
    <br>
    <br>
    <b><big>2. When calling a method of a class instance, you do not pass the instance in the function arguments.</big></b>
    </div>
 
So...

**Correct:**

In [32]:
jane.gpa()

3.6337777777777776

**Not correct:**

In [33]:
jane.gpa(jane)

TypeError: gpa() takes 1 positional argument but 2 were given

Although you can do:

In [34]:
Student.gpa(jane)

3.6337777777777776

*Note:* The name `self` in a method definition isn't special. What matters is the argument order: the *first* argument of the class method is assumed to be the class instance, regardless of what it is named. Using `self` for this is just a convention.
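
To illustrate the note, here is a hypothetical `Counter` class whose methods call their first argument `this`. It behaves exactly as it would with `self`; this is shown only to make the point, not as a recommended style.

```python
class Counter(object):
    """Hypothetical class whose methods name the instance `this`."""

    def __init__(this, start):
        this.value = start

    def increment(this):
        this.value += 1
        return this.value


counter = Counter(1)
first = counter.increment()          # instance passed automatically
second = Counter.increment(counter)  # instance passed by hand
print(first, second)                 # 2 3
```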

### The `__init__` method

For the most part, you can add any number of methods with various names to a class. However, there are a few special method names that Python recognizes (all of which begin and end with `__`). The most common of these is the  `__init__` method.

The `__init__` method is called when an instance of the class is created. The arguments of `__init__` determine what arguments need to be passed to the class when it is instantiated. In our example, `Student.__init__` was defined as needing `name` (in addition to `self`, which is always required). As a result, when we created each `Student` we had to pass in a string representing the student's name.

The `__init__` method is used to add attributes to a class and set their initial values. It is not necessary to have an `__init__` method, however. Nor must attributes be assigned in `__init__`. Attributes may be added to a class instance *after* the class is initialized. For example:

In [35]:
class Foo(object):
    pass

foo = Foo()
foo.bar = 10.
print(foo.bar)

10.0


The catch is that any attribute added to an instance *only exists for that instance*. This means that if we create another instance of `Foo`:

In [36]:
foo2 = Foo()

It will *not* have a `bar` attribute:

In [37]:
print(foo2.bar)

AttributeError: 'Foo' object has no attribute 'bar'

The benefit of `__init__` is that it ensures that all instances of a class have at least the attributes that are set within it.

## IV. Class inheritance

A major feature of classes is that they can *inherit* from other classes, allowing us to reuse code to build more complex objects. To do this, you pass the class (or classes) that you want to inherit from in the class definition.

For example, say we are teaching a course that includes a lab component, in addition to homework and exams. Now we want to be able to store grades for students' labs, and incorporate that into their GPA. Rather than recreating a `Student` class from scratch, let's inherit from our current `Student` class, adding in the new attributes we need:

In [38]:
class LabStudent(Student):
    """A student with lab grades."""
    
    def __init__(self, name):
        super(LabStudent, self).__init__(name)
        self.lab_grades = []

    def lab_avg(self):
        return sum(self.lab_grades)/float(len(self.lab_grades))
    
    def gpa(self):
        if self.extra_credit is not None:
            totalavg = (self.homework_avg() + self.exam_avg() + self.lab_avg()
                        + self.extra_credit)/4.
        else:
            totalavg = (self.homework_avg() + self.exam_avg() + self.lab_avg())/3.
        return 4.0 * totalavg / 100.

In [39]:
# now let's create a lab student, and enter their grades
kelly = LabStudent('Kelly')
kelly.homework_grades = [92., 88., 95., 96.]
kelly.exam_grades = [94., 97.]
kelly.lab_grades = [65., 70., 72., 73.]  # Kelly is clearly destined to be a theorist
kelly.gpa()

3.4433333333333334

Note that in the definition of `LabStudent`, we did not include definitions for `homework_avg` or `exam_avg`. Even so, they are attributes of `LabStudent`:

In [40]:
print(kelly.homework_avg())
print(kelly.exam_avg())

92.75
95.5


This is because we automatically inherited them from `Student`.

We did, however, redefine `__init__` and `gpa`. Whenever you define a method or attribute with the same name as something in the parent class, that new method/attribute takes precedence. This is because attribute lookup works much like the nested namespaces of Section II: the namespace of a class sits inside that of its parent (which sits inside its parent's, and so on, up to `object`, the base of every class). If no class in the chain has the name, an `AttributeError` is raised.

Recall that when resolving a method or attribute, Python will begin by checking the inner-most namespace for a name match. If none is found, it will step outward until it finds a name match. So, when we call `kelly.gpa()`, we end up executing the `gpa` function defined in `LabStudent` (with `kelly` passed as `self`). Within that function, a call to `self.homework_avg()` is made. Since `homework_avg` is not defined in `LabStudent`'s namespace, Python steps up to the namespace of the parent class, `Student`. There it finds a `homework_avg`, so it executes that function, passing it `self` (which maps to `kelly`).
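
The order in which classes are searched is recorded in the `__mro__` attribute of a class. A small sketch, using stand-in classes whose methods simply report where they were defined:

```python
class Student(object):
    def homework_avg(self):
        return 'Student.homework_avg'

    def gpa(self):
        return 'Student.gpa'


class LabStudent(Student):
    def gpa(self):
        return 'LabStudent.gpa'


kelly = LabStudent()
lookup_order = [cls.__name__ for cls in type(kelly).__mro__]
print(lookup_order)          # ['LabStudent', 'Student', 'object']
print(kelly.gpa())           # LabStudent.gpa
print(kelly.homework_avg())  # Student.homework_avg
```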

### Using `super`

We needed to override the entirety of `Student.gpa` so as to add in the lab grades. But in the case of `__init__`, we only needed to *add* functionality to `Student.__init__` so as to create an attribute for `lab_grades`. We still wanted to run the same code that was in `Student.__init__`. To do this, we could have copied the code in `Student.__init__`, but that would break our rule about not repeating code. It would be best if we could call the parent class's method. But how do we do that, if the method name has been overridden in the child class?

The `super` function provides a way of accessing and calling methods of the parent class. It takes two arguments: the child class whose parent you want to access, and the instance of the child class to act on. So,
```
super(LabStudent, self).__init__(name)
```
is equivalent to:
```
Student.__init__(self, name)
```
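
In Python 3 the two arguments may be omitted: inside a method, `super()` with no arguments is equivalent to `super(LabStudent, self)`. A minimal sketch, assuming Python 3 and trimmed-down versions of the two classes:

```python
class Student(object):
    def __init__(self, name):
        self.name = name
        self.homework_grades = []


class LabStudent(Student):
    def __init__(self, name):
        # Python 3 only; same effect as super(LabStudent, self).__init__(name)
        super().__init__(name)
        self.lab_grades = []


kelly = LabStudent('Kelly')
print(kelly.name, kelly.homework_grades, kelly.lab_grades)  # Kelly [] []
```

Code that must also run under Python 2 needs the explicit two-argument form.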

### Multiple inheritance

It is possible to inherit from multiple classes. This is useful if you want to *compose* a class out of smaller, simpler classes. For example:

In [38]:
class Foo(object):
    def foo(self):
        return "foo"

class Bar(object):
    def bar(self):
        return "bar"

class FooBar(Foo, Bar):
    def foobar(self):
        return self.foo() + self.bar()

In [39]:
foobar = FooBar()
print(foobar.foo(), foobar.bar(), foobar.foobar())

foo bar foobar


But what happens if both parent classes define the same function? Which one takes precedence? Let's add a common function, `cat`, to both `Foo` and `Bar` and see what happens:

In [40]:
class Foo(object):
    def foo(self):
        return "foo"
    
    def cat(self):
        return "I'm Foo's cat."

class Bar(object):
    def bar(self):
        return "bar"
    
    def cat(self):
        return "I'm Bar's cat."

class FooBar(Foo, Bar):
    def foobar(self):
        return self.foo() + self.bar()

In [41]:
foobar = FooBar()
print(foobar.cat())

I'm Foo's cat.


So `Foo`'s `cat` took precedence over `Bar`'s. What happens if we flip the order of the parent classes?

In [42]:
class BarFoo(Bar, Foo):
    def foobar(self):
        return self.foo() + self.bar()

In [43]:
barfoo = BarFoo()
print(barfoo.cat())

I'm Bar's cat.


Now `Bar`'s cat has taken precedence over `Foo`'s. We see that:

<div class="alert alert-block alert-info">
    <b><big>The <i>method resolution order</i> of a class works left to right in the list of its parent classes.</big></b>
    </div>

Building child classes from multiple, simple parent classes can be very useful for reusing core functionality over multiple classes. This is done extensively in the [pycbc.inference.sampler](https://pycbc.org/pycbc/latest/html/pycbc.inference.sampler.html) package, for example.

However, this can quickly lead to confusing code in which it is difficult to figure out what happens where. This is particularly true if you inherit from classes which themselves have multiple inheritance, and which share method names with the other parents. (For nitty-gritty details on how method resolution order works in these more complicated cases, see [this blog post](https://medium.com/technology-nineleaps/python-method-resolution-order-4fd41d2fcc).)
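
As a small taste of the more complicated cases, consider a "diamond", in which both parents share a common ancestor. The class `Base` is a hypothetical addition to the `Foo`/`Bar` example; printing `__mro__` shows the order Python actually searches.

```python
class Base(object):
    def cat(self):
        return "I'm Base's cat."


class Foo(Base):
    def cat(self):
        return "I'm Foo's cat."


class Bar(Base):
    def cat(self):
        return "I'm Bar's cat."


class FooBar(Foo, Bar):
    pass


order = [cls.__name__ for cls in FooBar.__mro__]
print(order)           # ['FooBar', 'Foo', 'Bar', 'Base', 'object']
print(FooBar().cat())  # I'm Foo's cat.
```

Note that `Bar` is searched before `Base`, even though `Base` is the direct parent of `Foo`: a shared ancestor is only visited after all of its children.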

It can also get to be very confusing if you make classes that have many generations of parents, i.e., you have something like:
```
class A(object):
    <...>

class B(A):
    <...>

class C(B):
    <...>

etc.
```
(I've certainly been guilty of this.)

**When using multiple inheritance, it's best to only inherit from parent classes that do not have multiple parents, and do not have conflicting method names. In general, even if you only inherit from a single parent, try to limit the number of generations of inheritance.**

## V. Abstract Base Classes

In PyCBC Inference we have support for several different samplers, such as `emcee`, `emcee_pt`, etc. Each of these samplers has different properties that need to be set in order for it to function correctly. For example, `emcee` needs to know how many walkers to use, whereas `emcee_pt` needs to have the number of walkers *and* the number of temperatures set. For this reason, we define a different class for each sampler; e.g., for `emcee` we use the [EmceeEnsembleSampler](https://pycbc.org/pycbc/latest/html/pycbc.inference.sampler.html#pycbc.inference.sampler.emcee.EmceeEnsembleSampler) class while for `emcee_pt` we use the [EmceePTSampler](https://pycbc.org/pycbc/latest/html/pycbc.inference.sampler.html#pycbc.inference.sampler.emcee_pt.EmceePTSampler) class.

The samplers do share some common properties though. Given a *model* that has a prior and a likelihood function, we expect all samplers to run for some time, producing posterior samples at the end. We might expect that the API for these common methods -- what the name of the method is and what arguments it takes -- should be the same for all of these classes. Then we can write a program like `pycbc_inference` that, regardless of the sampler that the user selects, can simply do `sampler.run()`.

It makes sense that all of the sampler classes should inherit from a common parent class. However, although all of the samplers have a `run` method, the code those methods need to execute may differ between samplers. What we need is a parent class that defines common methods and arguments, but does not have those methods actually do anything.

*Abstract base classes* are classes that do just this. They are used to define *abstract* methods. Abstract methods don't do anything, and in fact must be overridden by child classes. Their purpose is to establish a common API for all child classes.

A simple example:

In [44]:
from abc import ABCMeta, abstractmethod

In [45]:
class BaseCar(object):
    """Abstract base class representing a car."""
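    # Python 2 syntax; in Python 3, declare the metaclass in the class
    # statement instead: class BaseCar(metaclass=ABCMeta)
    __metaclass__ = ABCMeta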
    
    @abstractmethod
    def accelerate(self, pedal_position):
        """Accelerates the car.
        
        The amount of acceleration depends on the pedal position.
        
        Parameters
        ----------
        pedal_position : float
            A float between 0 and 100 indicating how far the
            gas pedal is depressed (0 = not at all,
            100 = all the way to the floor).
        """
        pass

Here we've created a class to represent cars. All cars must have a gas pedal that, when depressed, accelerates the car. We've therefore defined an `accelerate` method that depends on how much the pedal is pressed. But what happens under the hood (literally) when the pedal is pressed depends on the type of engine the car has. So we've made `accelerate` an abstract method by sticking the `@abstractmethod` *decorator* above it. (For more on decorators, see the [Appendix](#A.-Decorators).) The method doesn't actually do anything. In fact, if we try to instantiate `BaseCar`, we'll get an error:

In [46]:
BaseCar()

TypeError: Can't instantiate abstract class BaseCar with abstract methods accelerate
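
The `__metaclass__` attribute is the Python 2 way to attach `ABCMeta`; Python 3 ignores it, so there `BaseCar()` would be created without complaint. A sketch of the Python 3 spelling, assuming Python 3.4 or later (where the `abc.ABC` helper class exists). `UnfinishedCar` is a hypothetical class, added to show that a child which does not override the abstract method cannot be instantiated either.

```python
from abc import ABC, abstractmethod


class BaseCar(ABC):
    """Abstract base class representing a car (Python 3 spelling)."""

    @abstractmethod
    def accelerate(self, pedal_position):
        """Accelerates the car."""


class UnfinishedCar(BaseCar):
    """Hypothetical child class that forgets to define accelerate."""


errors = []
for cls in (BaseCar, UnfinishedCar):
    try:
        cls()
    except TypeError as err:
        # the exact wording of the message varies between Python versions
        errors.append(cls.__name__)
        print(cls.__name__, '->', err)
```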

`BaseCar`'s only purpose is to be inherited by other classes, which must define their own `accelerate`. Here are a couple of examples:

In [47]:
class GasCar(BaseCar):
    """A car that has a gas engine."""
    
    def accelerate(self, pedal_position):
        print("Increasing fuel flow to injectors by {} percent."
              .format(pedal_position))

        
class ElectricCar(BaseCar):
    """A car that has an electric engine."""
    
    def accelerate(self, pedal_position):
        print("Increasing power drain from batteries "
              "by {} percent.".format(pedal_position))

We can instantiate these classes:

In [48]:
gas_car = GasCar()
tesla = ElectricCar()

We can now call their `accelerate` methods using the same syntax even though they do different things:

In [49]:
gas_car.accelerate(50)

Increasing fuel flow to injectors by 50 percent.


In [50]:
tesla.accelerate(50)

Increasing power drain from batteries by 50 percent.


The end result - the car goes faster - would be the same for both.
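
The payoff of the shared API is that other code can be written against `BaseCar` alone. In the sketch below (Python 3 syntax), `floor_it` is a hypothetical helper, and the `accelerate` methods return their message rather than printing it so that the results can be collected.

```python
from abc import ABC, abstractmethod


class BaseCar(ABC):
    @abstractmethod
    def accelerate(self, pedal_position):
        """Return a description of what the engine does."""


class GasCar(BaseCar):
    def accelerate(self, pedal_position):
        return ("Increasing fuel flow to injectors by {} percent."
                .format(pedal_position))


class ElectricCar(BaseCar):
    def accelerate(self, pedal_position):
        return ("Increasing power drain from batteries by {} percent."
                .format(pedal_position))


def floor_it(car):
    """Works with any BaseCar child; it relies only on the shared API."""
    return car.accelerate(100)


messages = [floor_it(car) for car in (GasCar(), ElectricCar())]
for message in messages:
    print(message)
```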

## VI. Usage in `pycbc.inference`

This pattern of defining a base class for a common set of classes is used throughout the `pycbc.inference` package:
 * There is a unique class for each sampler supported by `pycbc_inference`. All of these sampler classes must inherit from the [BaseSampler](https://pycbc.org/pycbc/latest/html/pycbc.inference.sampler.html#pycbc.inference.sampler.base.BaseSampler) class. This allows the executable `pycbc_inference` to interact with all of these samplers using a common interface.
 * All samplers have an `io` attribute. This is another type of class that handles reading and writing results to HDF files. There is a unique `io` class for each sampler, all of which inherit from their own abstract base class, [BaseInferenceFile](https://pycbc.org/pycbc/latest/html/pycbc.inference.io.html#pycbc.inference.io.base_hdf.BaseInferenceFile).
 * Different types of models are supported, with each model having its own class. Again, all of these classes must inherit from the same abstract base class, in this case [BaseModel](https://pycbc.org/pycbc/latest/html/pycbc.inference.models.html#pycbc.inference.models.base.BaseModel).
 
The PyCBC Inference documentation pages have more details on how the [sampler](https://pycbc.org/pycbc/latest/html/inference/sampler_api.html) and [io](https://pycbc.org/pycbc/latest/html/inference/io.html) modules are organized. For more information about the `model` classes, see Alex Nitz's tutorials.

## Further reading

1. Python's tutorial on classes, which includes more details on namespaces and scope: https://docs.python.org/3.7/tutorial/classes.html
2. Another tutorial on object oriented programming in Python: https://jeffknupp.com/blog/2014/06/18/improve-your-python-python-classes-and-object-oriented-programming/
3. More on method resolution order, from the former "Benevolent Dictator for Life" [Guido van Rossum](https://en.wikipedia.org/wiki/Guido_van_Rossum): http://python-history.blogspot.com/2010/06/method-resolution-order.html
4. The following are doc pages for the `abc` module, which include more details on abstract base classes. Unfortunately, the syntax has changed a bit between Python 2.7 and 3, so I've included both here. Currently, PyCBC uses the 2.7 syntax, but we will be switching to 3 in the future.
   * [abc in Python 2.7](https://docs.python.org/2.7/library/abc.html)
   * [abc in Python 3.7](https://docs.python.org/3.7/library/abc.html)

# Appendix

## A. Decorators

Consider the following modification of our `Student` class:

In [51]:
class Student(object):
    """Stores information about students."""
    
    def __init__(self, name):
        self.name = name
        self.homework_grades = []
        self.exam_grades = []
        self.extra_credit = None

    @classmethod
    def from_dict(cls, sdict):
        """Initialize a Student using the given dictionary."""
        student = cls(sdict['name'])
        if 'homework_grades' in sdict:
            student.homework_grades = sdict['homework_grades']
        if 'exam_grades' in sdict:
            student.exam_grades = sdict['exam_grades']
        if 'extra_credit' in sdict:
            student.extra_credit = sdict['extra_credit']
        return student
    
    @staticmethod
    def avg(values):
        return sum(values)/float(len(values))
    
    @property
    def homework_avg(self):
        """Average homework grade."""
        return self.avg(self.homework_grades)
    
    @property
    def exam_avg(self):
        """Average exam grade."""
        return self.avg(self.exam_grades)
    
    @property
    def gpa(self):
        """GPA (on a 4.0 scale)."""
        scores = [self.homework_avg, self.exam_avg]
        if self.extra_credit is not None:
            scores.append(self.extra_credit)
        return 4.0 * self.avg(scores)/100.

What are all of those things with `@` before the function definitions? And why don't the `from_dict` and `avg` methods have `self` as their first arguments?!?

The names beginning with `@`, such as `@classmethod`, are called *decorators*. Essentially, decorators are wrappers around functions that modify the behavior of that function. They can be used on any Python function, but they are most often seen in classes.

Using and creating decorators is a large topic itself, the details of which we won't get into here (if you're interested, see [here](https://realpython.com/primer-on-python-decorators/) for an excellent tutorial). However, there are three predefined decorators in the Python `builtins` that often come up in class definitions (and which we make heavy use of in the `pycbc.inference` package) that I want to highlight here: `@classmethod`, `@staticmethod`, and `@property`.

### @classmethod

The `@classmethod` decorator modifies methods so that instead of taking an *instance* of a class as the first argument (what we normally call `self`) it takes the *class itself* (which we normally call `cls`). This is typically used to provide a way to instantiate a class from input arguments that differ from those defined in `__init__`. For example, say we had a dictionary that specifies all of the information about Susie:

In [52]:
susie_dict = {}
susie_dict['name'] = 'Susie'
susie_dict['homework_grades'] = [80., 73., 77., 50., 0.]
susie_dict['exam_grades'] = [70., 63., 50.]

We can now instantiate a `Student` representation of Susie using the `from_dict` method:

In [53]:
susie = Student.from_dict(susie_dict)

print(susie.name)
print(susie.homework_grades)
print(susie.exam_grades)

Susie
[80.0, 73.0, 77.0, 50.0, 0.0]
[70.0, 63.0, 50.0]


Notice that when we used `from_dict` we only provided the dictionary, even though the definition of `from_dict` had two arguments, `cls` and `sdict`. Just as a normal method automatically receives the instance as its first argument, a method wrapped with `@classmethod` automatically receives the class as its first argument.

In PyCBC Inference we most often use `@classmethod` to define a `from_config` function for a class. These functions extract the information needed to initialize the class from a configuration file. For example, [EmceeEnsembleSampler.from_config](http://pycbc.org/pycbc/latest/html/pycbc.inference.sampler.html#pycbc.inference.sampler.emcee.EmceeEnsembleSampler.from_config) creates an instance of an `emcee` sampler using information provided in the given configuration file.
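The general pattern can be sketched as follows. This is only an illustration under stated assumptions: `ToySampler`, its arguments, and the `[sampler]` section layout are all hypothetical, and this is not how PyCBC's own `from_config` methods are implemented. It uses the standard-library `configparser` module (Python 3) to stand in for a configuration file.

```python
from configparser import ConfigParser

class ToySampler(object):
    """Hypothetical sampler class; not part of PyCBC."""
    def __init__(self, nwalkers, niterations):
        self.nwalkers = nwalkers
        self.niterations = niterations

    @classmethod
    def from_config(cls, cp, section='sampler'):
        """Initialize using options read from a ConfigParser section."""
        nwalkers = cp.getint(section, 'nwalkers')
        niterations = cp.getint(section, 'niterations')
        # cls is the class itself, so this is the same as calling
        # ToySampler(nwalkers, niterations)
        return cls(nwalkers, niterations)

# stand-in for reading a configuration file from disk
cp = ConfigParser()
cp.read_string("""
[sampler]
nwalkers = 100
niterations = 500
""")

sampler = ToySampler.from_config(cp)
print(sampler.nwalkers, sampler.niterations)
```

Because `from_config` calls `cls(...)` rather than naming the class explicitly, a class that inherits from `ToySampler` would get back an instance of itself when calling `from_config`.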

### @staticmethod

The `@staticmethod` decorator modifies methods so that the class instance `self` is *not* automatically added when the method is called. As a result, we do not need to reserve the first argument in the definition of a `@staticmethod`. We see that in the above example: `avg` takes a single argument `values`, which is a list of values to calculate an average for.

Since methods wrapped with `@staticmethod` do not do any automatic substitutions, it's possible to use them without needing to create a class instance. For example:

In [54]:
Student.avg([60., 70.])

65.0

In fact, we could have just defined `avg` outside of `Student` in the global namespace. In that case we would have called `avg(...)` instead of `self.avg(...)` inside of `Student`.

So why use `@staticmethod`? Its main purpose is to provide a function that is heavily used by a class, but may have little meaning outside of the class. This can make code easier to read and understand ("[syntactic sugar](https://en.wikipedia.org/wiki/Syntactic_sugar)"). Basically, `@staticmethod` is a way of logically organizing functions.
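As a small sketch of the different ways a static method can be reached, consider the hypothetical `Grades` class below (a cut-down stand-in for `Student`, used here only for illustration). The same function is available from the class, from an instance, and from inside other methods, and in no case is `self` passed to it.

```python
class Grades(object):
    """Hypothetical, cut-down stand-in for the Student class."""
    def __init__(self, values):
        self.values = values

    @staticmethod
    def avg(values):
        return sum(values) / float(len(values))

    def mean(self):
        # reached through self, but self is not passed on to avg
        return self.avg(self.values)

grades = Grades([60., 70.])

from_class = Grades.avg([60., 70.])     # no instance needed
from_instance = grades.avg([60., 70.])  # also works on an instance
from_method = grades.mean()             # used internally by the class
print(from_class, from_instance, from_method)
```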

### @property

The `@property` decorator is another form of syntactic sugar that modifies how a method is called. Basically, it makes a method look like an attribute.

For example, in the class definition above, `homework_avg`, `exam_avg`, and `gpa` are all methods. Normally you would call them like `susie.gpa()` (as we did in the [Classes](#III.-Classes) section). However, because we stuck the `@property` decorator on each of these methods, we instead do:

In [55]:
susie.gpa

2.34

In other words, we no longer include the `()` after the name. Note that `@property` **only works with methods that take no arguments other than `self`.**
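To see why there is no room for extra arguments, here is a minimal sketch using a hypothetical `Circle` class (not part of the tutorial's `Student` example). Accessing the property already runs the method and returns its result, so adding `()` afterwards tries to call that result rather than the method.

```python
import math

class Circle(object):
    """Hypothetical example class."""
    def __init__(self, radius):
        self.radius = radius

    @property
    def area(self):
        return math.pi * self.radius**2

circle = Circle(2.)
area = circle.area  # no parentheses: the method runs on access
print(area)

# circle.area is already a float, so adding () tries to call a float
error = None
try:
    circle.area()
except TypeError as e:
    error = str(e)
    print('TypeError:', error)
```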

The `@property` decorator is used when we want to run some additional code under the hood when an attribute is accessed. It is typically paired with a "setter", which allows us to also run some code when an attribute is set. For example:

In [56]:
class Star(object):
    _mass = None
    
    @property
    def mass(self):
        if self._mass is None:
            raise ValueError("no mass set")
        return self._mass
    
    @mass.setter
    def mass(self, value):
        if value <= 0:
            raise ValueError("mass must be > 0")
        self._mass = value

In [57]:
star = Star()

In [58]:
print(star.mass)

ValueError: no mass set

In [59]:
star.mass = 10.
print(star.mass)

10.0


In [60]:
star2 = Star()
star2.mass = -3

ValueError: mass must be > 0
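For contrast, here is a minimal sketch of what happens when a property is defined *without* a setter. The `Planet` class is hypothetical and used only for illustration. Such a property is read-only: trying to assign to it raises an `AttributeError`.

```python
class Planet(object):
    """Hypothetical class with a read-only property."""
    def __init__(self, mass):
        self._mass = mass

    @property
    def mass(self):
        return self._mass

planet = Planet(5.)
print(planet.mass)

# there is no @mass.setter, so assignment is refused
set_failed = False
try:
    planet.mass = 10.
except AttributeError:
    set_failed = True
    print('mass cannot be set')
```

This can be a convenient way to expose a value that users of a class should be able to read but not change.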

## B. More on functions

### Multiple variables

Our `floor` function defined above is a simple function of one argument. We can build more complicated functions that act on multiple arguments. For example:

In [61]:
def floorplus(a, b):
    """Takes the floor of a and adds it to b.
    
    Parameters
    ----------
    a : float
        The value to take the floor of.
    b : float
        The value to add to.

    Returns
    -------
    float :
        Returns floor(a) + b.
    """
    return floor(a) + b

In [62]:
print(floorplus(a, b))
print(floorplus(b, c))

6.86
3.2


In this example, `floorplus` is a function of two arguments, both of which are required. We can also define functions that use default values for some arguments. If we don't provide the values for these arguments when calling the function, the default is used. For example:

In [63]:
def floorplus2(a, b, offset=0):
    return floor(a) + b + offset

In [64]:
print(floorplus2(a, b))
print(floorplus2(a, b, offset=1))

6.86
7.86


### Positional v. keyword arguments

When we called `floorplus2` in the second example, we just passed `a` and `b`, but explicitly set `offset=1`. But we did not need to set `offset` in this manner; this works just as well:

In [65]:
print(floorplus2(a, b, 1))

7.86


Alternatively, we could have been explicit with all three arguments:

In [66]:
print(floorplus2(a=a, b=b, offset=1))

7.86


When arguments are passed without explicitly setting their names, they are called **positional** arguments. When arguments are passed with their names explicitly set, they are called **keyword** arguments.

If we provide all arguments as keyword arguments, as we did in the last example, we do not need to respect the order of the arguments. To illustrate this, let's define a new function that simply prints what value it is given for each argument:

In [67]:
def printargs(a, b, c=3, d=4):
    print('a:', a, 'b:', b, 'c:', c, 'd:', d)

Note that we get the same result in all of the following ways:

In [68]:
printargs(1, 2)
printargs(1, 2, 3, 4)
printargs(a=1, b=2, c=3, d=4)
printargs(c=3, a=1, d=4, b=2)

a: 1 b: 2 c: 3 d: 4
a: 1 b: 2 c: 3 d: 4
a: 1 b: 2 c: 3 d: 4
a: 1 b: 2 c: 3 d: 4
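Positional and keyword arguments can also be mixed in a single call, as long as the positional ones come first. The sketch below repeats the `printargs` definition so that it runs on its own. It also shows what happens if the same argument is supplied twice, once by position and once by name.

```python
def printargs(a, b, c=3, d=4):
    print('a:', a, 'b:', b, 'c:', c, 'd:', d)

# a and b by position, d by keyword; c keeps its default
printargs(1, 2, d=10)

# here a is filled by position and then again by keyword
error_raised = False
try:
    printargs(1, 2, a=5)
except TypeError as e:
    error_raised = True
    print('TypeError:', e)
```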


### Calling functions with the `*` and `**` operators

Python provides another, even more flexible way to call functions. Say we have a list of values that we want our `printargs` function to operate on:

In [69]:
args = [5, 6, 7, 8]

We can then do:

In [70]:
printargs(*args)

a: 5 b: 6 c: 7 d: 8


Alternatively, if we have a dictionary of values:

In [71]:
kwargs = {'a': 5, 'b': 6, 'c': 7, 'd': 8}

In [72]:
printargs(**kwargs)

a: 5 b: 6 c: 7 d: 8


In other words, if we have a list of values, we can pass the elements of the list as positional arguments to a function using the `*` operator followed by the list. If we have a dictionary of values, we can pass the dictionary as keyword arguments to the function using the `**` operator.

<div class="alert alert-block alert-warning"><b>Note:</b> <code>*</code> and <code>**</code> are also used as multiplication and exponent operators, respectively. What they are being used for will always be obvious from the context.</div>
    
This functionality is useful when handling functions programmatically. For example, if you don't know ahead of time if a user is going to provide one or more default arguments, you can set up a dictionary and populate it accordingly.
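A minimal sketch of that pattern is shown below. The function `describe` and the `user_input` dictionary are hypothetical stand-ins: only the options the user actually provided are copied into `kwargs`, so any others fall back to the function's defaults.

```python
def describe(a, b, c=3, d=4):
    """Hypothetical function that returns its arguments as a tuple."""
    return (a, b, c, d)

# hypothetical user input: only d was overridden
user_input = {'d': 40}

# collect only the optional arguments that were actually provided
kwargs = {}
for name in ['c', 'd']:
    if name in user_input:
        kwargs[name] = user_input[name]

result = describe(1, 2, **kwargs)
print(result)
```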

### Defining functions with the `*` and `**` operators

You can also define functions using these operators. For example:

In [73]:
def printargs2(*args, **kwargs):
    print('args:', args)
    print('kwargs:', kwargs)

In [74]:
args = range(3)
kwargs = {'foo': 100, 'bar': 200, 'cat': 'hat'}
printargs2(*args, **kwargs)

args: (0, 1, 2)
kwargs: {'foo': 100, 'bar': 200, 'cat': 'hat'}


The advantage of this is that you can define functions for which you do not know how many arguments will be passed at run time. This can happen if the number and names of the arguments the function needs change depending on the context. This is made use of by PyCBC's [waveform.get_td_waveform](http://pycbc.org/pycbc/latest/html/pycbc.waveform.html#pycbc.waveform.waveform.get_td_waveform) and [waveform.get_fd_waveform](http://pycbc.org/pycbc/latest/html/pycbc.waveform.html#pycbc.waveform.waveform.get_fd_waveform) functions, for example.
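As a rough sketch of that situation, consider a single entry point that hands its keyword arguments on to one of several underlying functions, each needing different arguments. All of the names below (`model_a`, `model_b`, `generate`) are hypothetical, and this is not meant to represent how the PyCBC waveform functions are implemented.

```python
def model_a(mass, spin=0.):
    """Hypothetical model needing a mass and, optionally, a spin."""
    return ('a', mass, spin)

def model_b(mass, distance, inclination=0.):
    """Hypothetical model needing a different set of arguments."""
    return ('b', mass, distance, inclination)

MODELS = {'a': model_a, 'b': model_b}

def generate(name, **kwargs):
    """Call the named model, passing along whatever keywords were given."""
    return MODELS[name](**kwargs)

result_a = generate('a', mass=10.)
result_b = generate('b', mass=10., distance=100., inclination=0.5)
print(result_a)
print(result_b)
```

The signature of `generate` does not need to change when a new model with new arguments is added to `MODELS`.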

The disadvantage of defining functions in this way is that it obscures what the function does and what arguments are needed. For this reason, **functions should be defined in this manner only sparingly**.