# Coupling functions and data structures with classes

This unit introduces one of the most important aspects of modern programming languages, **classes** and **object-oriented programming**.
Classes provide a means for bundling data structures and functions into a convenient package where all parts fit together without any hiccups.
As our data structures and functions get increasingly complicated, classes will be a godsend for keeping things manageable.

## Why life without classes isn't fun

You already know that Python has a variety of methods for objects of different types.

In [None]:
print("uppercase".upper())
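# .append() changes the list in place and returns None, so we print the list afterwards
items = ["item1"]
items.append("item2")
print(items)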
print({"item1": 1, "item2": 2}.values())

The nice thing about methods is that they work for any arbitrary object as long as it has the right type.
The `.upper()` method works for every string, no matter what it looks like.

In [None]:
print("UPPERCASE".upper())
print("".upper())
print("@~&öööǿ".upper())

A type for which `.upper()` wouldn't work doesn't have the method to begin with.

In [None]:
# we can't uppercase a list
["uppercase"].upper()

This is a major advantage over functions.
Just about any object can be passed into a function, including objects that the function is not defined for.
This means that the programmer has to make sure that the function is only fed with things it doesn't choke on.
But this is surprisingly difficult as your code becomes more complex.
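
As a small illustration of that point, here is a sketch with a hypothetical function `double` (it is not part of the notebook's code). It is meant for numbers, but Python happily runs it on strings and lists too, so a wrong input can go unnoticed instead of raising an error.

```python
def double(x):
    """Intended for numbers: return twice the input."""
    return x * 2

# intended use
number_result = double(21)
# unintended uses: no error, just a result we probably did not want
string_result = double("21")
list_result = double([21])

print(number_result)
print(string_result)
print(list_result)
```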

Just consider our implementation of FSAs in the previous notebook.
Every change in the data structure also required a change in the function `accepts`.
FSAs also had to be created manually according to a specific template.
A minor typo could break the whole code:

In [None]:
# This doesn't run. Can you spot the mistake?

def accepts(sentence, fsa):
    """Test if FSA accepts sentence.
    
    The FSA must be a dictionary of the form
    
    {'I': initial_state,
     'F': {final_state1, final_state2, ...},
     'T': {current_state: {word: next_state}}}
    
    Arguments
    ---------
    sentence: list of strings
        tokenized sentence
    fsa: dict
        finite-state automaton
        
    Returns
    -------
    bool
    """
    # set current state
    cs = fsa["I"]
    # iterate over sentence and follow along in automaton
    for word in sentence:
        cs = fsa["T"].get(cs, {}).get(word)
        if cs is None:
            return False
    # did we make it to a final state?
    return cs in fsa["F"]

fsa = {"I": 1,
       "F": {2},
       "T": {1: {"this": 2},
             2: {"breaks", 1},
            },
      }

accepts(["this", "breaks"], fsa)

The code above contains a minor typo.
In the definition of the FSA's transitions, it says `{"breaks", 1}` instead of `{"breaks": 1}`.
This means the value for the key `2` is a set instead of a dictionary.
This is an easy typo to make, and it causes an error that is hard to find.
And if something is already difficult to spot in our short toy code, just imagine how hard it would be in a massive code base with thousands if not millions of lines of code.
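
One way to track down this kind of mistake is to inspect the types of the values in the transition table. The snippet below is only a sketch: the variable names `transitions` and `bad_states` are made up for illustration, and the table simply copies the faulty `"T"` entry from above.

```python
# copy of the faulty transition table from the example above
transitions = {1: {"this": 2},
               2: {"breaks", 1},
              }

# collect the states whose entry is not a dictionary
bad_states = [state for state, arcs in transitions.items()
              if not isinstance(arcs, dict)]

for state in bad_states:
    found = type(transitions[state]).__name__
    print(f"state {state}: expected dict, found {found}")
```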

This simply isn't a very maintainable approach.
It would be much nicer if there were a unified piece of code that would allow us to

1. define what FSAs look like, and
1. easily create new FSAs, and
1. implement functions like `accepts` as custom methods for FSAs.

This is exactly what classes are for.

## The basics of classes

Rather than starting out with a lot of theory, we'll just look at an example of a class right away.

In [None]:
class testclass:
    class_greeting = "Every member of this class has this greeting"
    
a_test_object = testclass()
a_test_object.class_greeting

The code above defines a new class with the name `testclass`.
The class doesn't provide much.
It only contains a variable `class_greeting` that is set to a specific string.
This variable is an **attribute** of the class.

We then use `testclass()` to create a new object of this class.
We store the object in the variable `a_test_object`.
We can then use `a_test_object.class_greeting` to access the attribute `class_greeting` for this object.
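
Since `class_greeting` is defined directly in the class body, all objects of the class share it. Here is a minimal sketch with a hypothetical class `Demo` (the class and its attribute are chosen just for this illustration):

```python
class Demo:
    shared_note = "same for every object"

first = Demo()
second = Demo()

# both objects see the attribute defined in the class body
print(first.shared_note)
print(second.shared_note)
# the attribute can also be reached through the class itself
print(Demo.shared_note)
# all of these names refer to the very same string object
print(first.shared_note is Demo.shared_note)
```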

We can also put function definitions inside a class.
Any function defined this way will be available as a **method**.
Methods defined this way work just like all the methods you have already encountered.

Functions that are defined inside a class take the object itself as their first argument, which by convention is called `self`.
This basically tells Python that the method call `some_object.some_method(x, y, z)` is short for the function call `SomeClass.some_method(some_object, x, y, z)`, where `SomeClass` is the class of `some_object`.
This should be familiar to you: as you know, `some_list.append("x")` is just a shorthand for `list.append(some_list, "x")`.
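
The equivalence between the two call styles can be checked directly with built-in lists. The variable names below are arbitrary.

```python
method_style = ["a"]
function_style = ["a"]

# usual method call
method_style.append("x")
# same operation written as a function call on the type
list.append(function_style, "x")

print(method_style)
print(function_style)
print(method_style == function_style)
```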

We can also use `self` to reference attributes of the object.
For instance, the attribute `class_greeting` is available as `self.class_greeting`.

In [None]:
class testclass:
    class_greeting = "Every member of this class has this greeting"
    
    def greet(self):
        print(self.class_greeting)
        
    def greet_user(self, username):
        print(f"{self.class_greeting}, {username}!")

    
a_test_object = testclass()
# call greet as method
a_test_object.greet()
# call greet as function
testclass.greet(a_test_object)

# call greet_user as method
a_test_object.greet_user("Thomas")
# call greet_user as function
testclass.greet_user(a_test_object, "Thomas")

Notice how we instantiate the test object with `a_test_object = testclass()` instead of `a_test_object = testclass`.
You might suspect that we can pass arguments to `testclass()`, and you'd be correct.
What exactly happens with those arguments is determined by the special function `__init__`.
Like all other functions defined inside a class, `__init__` must have `self` as the first argument.

In [None]:
class testclass:
    def __init__(self, greeting="A default greeting"):
        self.class_greeting = greeting
        
    def greet(self):
        print(self.class_greeting)
        
    def greet_user(self, username):
        print(f"{self.class_greeting}, {username}!")

    
a_test_object = testclass("Now we have a different greeting")
# call greet as method
a_test_object.greet()
# call greet as function
testclass.greet(a_test_object)

# call greet_user as method
a_test_object.greet_user("Thomas")
# call greet_user as function
testclass.greet_user(a_test_object, "Thomas")

# instantiate another object without a custom greeting
another_test_object = testclass()
another_test_object.greet()

And that's already enough to implement FSAs as a class.

## Defining an FSA class

An FSA class requires several components.
We need attributes for the initial state, the set of final states, and the transitions.
These should be passed in as arguments, which requires an appropriately defined `__init__` function.

In [None]:
class fsa:
    def __init__(self, initial=1, final=None, transitions=None):
        """Class for finite-state automata.
        
        Arguments
        ---------
        initial: int or string
            name of initial state
            default: 1
        final: set
            set of final states
            default: empty set
        transitions: dict
            dictionary of the form {current_state: {arc_label: new_state}}
            default: empty dictionary
        """
        self.I = initial
        # None stands in for "empty" so that objects never share a mutable default
        self.F = set() if final is None else final
        self.T = {} if transitions is None else transitions
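
A word of caution about defaults that are meant to be an empty set or dictionary: if `set()` or `{}` is written directly in the function signature, Python evaluates it only once, when the function is defined, and every call that relies on the default gets that same object. The sketch below uses two hypothetical classes, `SharedDefault` and `FreshDefault`, to show the effect and a common way around it, namely using `None` as the default and creating the container inside `__init__`.

```python
class SharedDefault:
    def __init__(self, final=set()):
        # the same set object is reused by every call without an argument
        self.F = final


class FreshDefault:
    def __init__(self, final=None):
        # create a new set for each object unless one is passed in
        self.F = set() if final is None else final


a = SharedDefault()
b = SharedDefault()
a.F.add(1)
print(b.F)  # b was changed too, because a.F and b.F are the same set

c = FreshDefault()
d = FreshDefault()
c.F.add(1)
print(d.F)  # d is unaffected
```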

Now we can already use our class to construct a new FSA.

In [None]:
fsa1 = fsa(initial=1, final={1}, transitions={1: {"awesome": 1}})
print("Initial state:", fsa1.I)
print("Final states:", fsa1.F)
print("Transitions:", fsa1.T)

But of course that's pretty useless without an `accepts` method, so we define a corresponding function for the class.
It is almost exactly the same as the version from the previous notebook, except that we now use `self.I`, `self.F`, and `self.T` for the initial state, the set of final states, and the transitions dictionary, respectively.

In [None]:
class fsa:
    def __init__(self, initial=1, final=None, transitions=None):
        """Class for finite-state automata.
        
        Arguments
        ---------
        initial: int or string
            name of initial state
            default: 1
        final: set
            set of final states
            default: empty set
        transitions: dict
            dictionary of the form {current_state: {arc_label: new_state}}
            default: empty dictionary
        """
        self.I = initial
        # None stands in for "empty" so that objects never share a mutable default
        self.F = set() if final is None else final
        self.T = {} if transitions is None else transitions
        

    def accepts(self, sentence):
        """Test if FSA accepts sentence.
    
        Arguments
        ---------
        sentence: list of strings
            tokenized sentence
        
        Returns
        -------
        bool
        """
        # set current state to initial state
        cs = self.I
        # iterate over sentence and follow along in automaton
        for word in sentence:
            cs = self.T.get(cs, {}).get(word)
            if cs is None:
                return False
        # did we make it to a final state?
        return cs in self.F

In [None]:
fsa1 = fsa(initial=1, final={1}, transitions={1: {"awesome": 1}})
fsa1.accepts(["awesome", "awesome", "awesome"])

In [None]:
fsa1 = fsa(initial=1, final={1}, transitions={1: {"awesome": 1}})
fsa1.accepts(["lame", "lame", "lame"])
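
For a slightly larger test, here is a sketch with a two-state automaton. To keep the example self-contained it uses a condensed copy of the class under the made-up name `MiniFsa`; the states and words are invented for illustration.

```python
class MiniFsa:
    """Condensed copy of the fsa class, for illustration only."""
    def __init__(self, initial, final, transitions):
        self.I = initial
        self.F = final
        self.T = transitions

    def accepts(self, sentence):
        cs = self.I
        for word in sentence:
            cs = self.T.get(cs, {}).get(word)
            if cs is None:
                return False
        return cs in self.F


# accepts any number of "very" followed by a single "good"
very_good = MiniFsa(initial=1,
                    final={2},
                    transitions={1: {"very": 1, "good": 2}})

print(very_good.accepts(["good"]))
print(very_good.accepts(["very", "very", "good"]))
print(very_good.accepts(["very"]))
print(very_good.accepts(["good", "good"]))
```

The first two sentences end in the final state `2` and are accepted. The third stops in state `1`, which is not final, and the fourth gets stuck because state `2` has no outgoing transitions.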

So there you have it.
A simple class for FSAs with a method for checking strings.
For now the benefit is pretty marginal since `fsa1.accepts(sentence)` isn't all that different from `accepts(sentence, fsa1)` as we had it in the previous notebook.
In particular, it's still easy to define incorrect automata.
In order to handle this properly, we need to implement some error checking mechanisms.
But this will be left for the next unit.
Getting the hang of classes will already keep you occupied for a while, so there's no need to pile on even more.
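
As a quick illustration of how easy it still is to define an incorrect automaton, the sketch below passes the faulty transition table from earlier to a stripped-down stand-in for the class (called `BareFsa` here, a made-up name). Creating the object works without complaint; the problem only surfaces later, when the transitions are actually used.

```python
class BareFsa:
    """Stand-in that only stores its arguments, like the fsa class does."""
    def __init__(self, initial, final, transitions):
        self.I = initial
        self.F = final
        self.T = transitions


# the typo from before: a set where a dictionary should be
broken = BareFsa(initial=1,
                 final={2},
                 transitions={1: {"this": 2}, 2: {"breaks", 1}})
print("Object created without any error")

try:
    # a set has no .get method, so following a transition from state 2 fails
    broken.T[2].get("breaks")
except AttributeError as error:
    print("Error when using the transitions:", error)
```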

## Bullet-point summary

- Classes allow us to couple together data structures and functions.
- An **object** is an instance of a specific class.
  For example, `"some string"` is an instance of the string class and hence a string object.
- Variables that are defined inside a class are called **attributes** and are available as `name_of_object.variable_name`.
- Functions defined inside a class are available as methods in the usual fashion.
- Every function defined this way takes the object itself as its first argument, which by convention is called `self`.
- The special function `__init__` determines what happens with the arguments that are passed to the class when an object is created.