<img src="https://github.com/Center-for-Health-Data-Science/PythonTsunami/blob/spring2022/figures/HeaDS_logo_large_withTitle.png?raw=1" width="300">

<img src="https://github.com/Center-for-Health-Data-Science/PythonTsunami/blob/spring2022/figures/tsunami_logo.PNG?raw=1" width="600">

# Classes in Python


We used a lot of class objects in the last two days (dataframes, PCA objects, linear regression models, decision tree models, etc.).

Today, we will hear more about how to define our own classes. Let's start with a quick recap.


## Exercise 1 (5 mins)

In your group, discuss:

* What is an attribute? Can you think of examples from the last two days?
* What is a method?
* What is an object instance?



## User-defined classes

Just as we can write custom, user-defined functions, we can also define our own types of objects, complete with attributes and methods that describe the object and its functionality.


### Example: A simple class of duck objects

<div>
<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/b/bf/Anas_platyrhynchos_male_female_quadrat.jpg/800px-Anas_platyrhynchos_male_female_quadrat.jpg" width="300"/>
</div>


In [None]:
class DuckClass:

    def say_quack(self):
        return 'Quack, quack!'

How does it work? Let's create an instance, i.e. a duck object.

In [None]:
#create an object instance
duck1 = DuckClass()

Now, an object with the name 'duck1' exists:

In [None]:
duck1

<__main__.DuckClass at 0x7c58d8820be0>

We can inspect it in the same way we did with the PCA and model objects, by using `vars()`.

In [None]:
vars(duck1)

{}

This object doesn't have any attributes (at least none we can publicly see).

It has a method though because we defined one:

In [None]:
duck1.say_quack()

'Quack, quack!'

Let's look at our class definition again.

The definition line `class [className]:` is required.

Next, we can see that the method `say_quack` is a user-defined function inside the class definition (it starts with the `def` keyword).

```python
class DuckClass:                              # def line with name of the class

    def say_quack(self):                      # a method
        return 'Quack, quack!'
```

It has a funny argument though. What exactly is `self`?

Just like functions, methods are executable, so a call to a method ends in parentheses.

Just like functions, methods can have arguments.

**Unlike** functions, methods are *called on the object*:

```python
# we call a method on the object
duck1.say_quack()

# we pass an object to a function
print(duck1)
```

Since a method is an operation we perform **on** the object, we also need to pass the object to it. This is what `self` means: the object itself.
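
One way to see what `self` stands for is to call the method through the class and hand over the object ourselves. The sketch below repeats the small class definition so that it runs on its own; the variable names `usual` and `explicit` are only there for illustration.

```python
class DuckClass:

    def say_quack(self):
        return 'Quack, quack!'

duck1 = DuckClass()

# the usual way: Python passes duck1 as 'self' for us
usual = duck1.say_quack()

# the explicit way: we take the function from the class and pass duck1 ourselves
explicit = DuckClass.say_quack(duck1)

print(usual == explicit)   # True
```

Both calls run the same code on the same object. In practice we nearly always use the first form.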


Let's add an attribute to our class.

In [None]:
class DuckClass:
    scientific_name = 'Anas platyrhynchos'

    def say_quack(self):
        return 'Quack, quack!'

In [None]:
#we need to remake the object for the changes in the definition to take effect
duck1 = DuckClass()

In [None]:
duck1.scientific_name

'Anas platyrhynchos'

## Class attributes

`scientific_name` is a **class attribute**. It is defined right after the class declaration:

```python
class DuckClass:
    scientific_name = 'Anas platyrhynchos'
```

Class attributes are shared by **all** instances of a class:

In [None]:
#create a second duck:
duck2 = DuckClass()

In [None]:
#check the sci. name of duck1 and duck2:
print(duck1.scientific_name)
print(duck2.scientific_name)

Anas platyrhynchos
Anas platyrhynchos


We will see why this matters later.
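
As a small sketch of what "shared" means here: the value is stored once, on the class, and every instance looks it up there. The replacement name used below is just an example value.

```python
class DuckClass:
    scientific_name = 'Anas platyrhynchos'

duck1 = DuckClass()
duck2 = DuckClass()

# both instances look the value up on the class, so they get the very same object
print(duck1.scientific_name is duck2.scientific_name)   # True

# changing the attribute on the class is visible through every instance
DuckClass.scientific_name = 'Anas platyrhynchos domesticus'
print(duck1.scientific_name)
print(duck2.scientific_name)
```

Both print statements at the end show the new name, even though we never touched `duck1` or `duck2` directly.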

## Exercise 2: Make a class (10 mins)

Time for an exercise. Define a class of dog objects that has at least 1 class attribute and 1 method.


## Instance attributes and the init

We have now seen how to create methods and class attributes inside our user-defined class. But there is another kind of attribute:

**Instance attribute**

Unlike class attributes, instance attributes belong to their instance and they can differ between two instances.

Instance attributes are declared inside a special method called `__init__`. This method is not typically called by the user (unlike other methods, like `say_quack`). It runs only once, when a new object instance is created.


### Example 2: A class of named ducks

In [None]:
class DuckClass:
    scientific_name = 'Anas platyrhynchos'     # A class attribute

    def __init__(self, name = 'Donald'):       # The init method
        self.name = name                       # An instance attribute

    def say_quack(self):                       # A method
        return 'Quack, quack!'

Just like methods, instance attributes reference the object through `self`.

Let's remake the duck1 object as an instance of this new class. `duck1` now has both a **class** attribute which is its scientific name and an **instance** attribute which is its personal name.

In [None]:
duck1 = DuckClass()
print(duck1.scientific_name)
print(duck1.name)

Anas platyrhynchos
Donald


You can see that the `name` of this duck is Donald. This is because I have defined it as the default value in the `__init__` method definition (remember default values?).

If we don't pass a name during creation of the instance object it will therefore get the default name, 'Donald'. We can also pass another name at creation:

In [None]:
duck2 = DuckClass(name = 'Daisy')
print(duck2.scientific_name)
print(duck2.name)

Anas platyrhynchos
Daisy


Now we have two different ducks named Donald and Daisy. They both have the same scientific name, i.e. class attribute, 'Anas platyrhynchos'. But their pet names, i.e. instance attributes, are individual.

If instance attributes have no default value inside the `__init__`, we are required to provide the value at creation.

In [None]:
class DuckClass:
    scientific_name = 'Anas platyrhynchos'     # A class attribute

    def __init__(self, age, name = 'Donald'):  # The init method
        self.name = name                       # An instance attribute
        self.age = age

    def say_quack(self):                       # A method
        return 'Quack, quack!'

In [None]:
duck1 = DuckClass()

TypeError: DuckClass.__init__() missing 1 required positional argument: 'age'

In [None]:
duck1 = DuckClass(age=1)

In [None]:
duck1.age

1

Instance attributes are the kind of attributes that are shown to us when we call `vars()`. Notice that the class attribute is not shown.

In [None]:
vars(duck1)

{'name': 'Donald', 'age': 1}
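
The class attribute has not disappeared; it is simply stored on the class rather than on the instance. A minimal sketch, repeating the class definition from above without the method so that it runs on its own:

```python
class DuckClass:
    scientific_name = 'Anas platyrhynchos'     # A class attribute

    def __init__(self, age, name = 'Donald'):
        self.name = name                       # Instance attributes
        self.age = age

duck1 = DuckClass(age=1)

# instance attributes are stored on the object itself
print(vars(duck1))                             # {'name': 'Donald', 'age': 1}

# the class attribute is stored on the class, not on the instance
print('scientific_name' in vars(duck1))        # False
print('scientific_name' in vars(DuckClass))    # True
print(vars(DuckClass)['scientific_name'])      # Anas platyrhynchos
```

When we write `duck1.scientific_name`, Python first looks on the instance and, not finding the attribute there, continues the lookup on the class.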

## Exercise 3: Update the dog class (10 mins)

Take the next 10 minutes to rewrite the definition of your dog class by including:

* an `__init__` method
* at least 1 **instance** attribute.

Then create at least one instance of the class, test the method and display the attributes.
  

## Updating attributes

So now we have a class definition with a method, a class attribute and two instance attributes. However, instance attributes do not generally remain the same for all of the object's life. Indeed, when we are training a model object, for example, we want its instance attributes to change to reflect what the model has learned from the data.

The proper way to update attributes is via a method.




In [None]:
class DuckClass:
    scientific_name = 'Anas platyrhynchos'     # A class attribute

    def __init__(self, name = 'Donald'):
        self.name = name                       # An instance attribute
        self.fav_food = []                     # Another instance attribute

    def say_quack(self):
        return 'Quack, quack!'

    def add_food(self, food):
        self.fav_food.append(food)

Ducks of this class have both a name and a list of their favorite foods. Did you know bread is bad for ducks?

Directly after creation, the duck object has no favorite foods because this attribute is initialized to an empty list.

In [None]:
duck1 = DuckClass()
duck1.fav_food

[]

Let's add a favorite food by using the `add_food` method we have made for this purpose:

In [None]:
duck1.add_food('corn')
duck1.fav_food

['corn']

And one more:

In [None]:
duck1.add_food('snails')
duck1.fav_food

['corn', 'snails']

Have another look at the method definition and notice all the places we use `self`:


```python
    def add_food(self, food):
        self.fav_food.append(food)
```
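
Creating the list inside `__init__` is also where the difference between class and instance attributes starts to matter. The sketch below contrasts the two; `SharedFoodDuck` is a hypothetical class made up for this comparison and not something you would normally want to write.

```python
class DuckClass:
    def __init__(self, name = 'Donald'):
        self.name = name
        self.fav_food = []              # a new list for every instance

    def add_food(self, food):
        self.fav_food.append(food)


class SharedFoodDuck:                   # hypothetical class, for contrast only
    fav_food = []                       # one single list, stored on the class

    def add_food(self, food):
        self.fav_food.append(food)      # appends to the shared class-level list


donald, daisy = DuckClass('Donald'), DuckClass('Daisy')
donald.add_food('corn')
print(daisy.fav_food)                   # [] : Daisy has her own list

a, b = SharedFoodDuck(), SharedFoodDuck()
a.add_food('corn')
print(b.fav_food)                       # ['corn'] : both ducks share one list
```

Anything that should differ between objects, like a list of favorite foods, belongs in `__init__` as an instance attribute.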

## Exercise 4: Updating instance attributes (10 mins)

Add a method to the DuckClass that lets us rename our ducks and test it.

In [None]:
class DuckClass:
    scientific_name = 'Anas platyrhynchos'     # A class attribute

    def __init__(self, name = 'Donald'):
        self.name = name                       # An instance attribute
        self.fav_food = []

    def say_quack(self):
        return 'Quack, quack!'

    def add_food(self, food):
        self.fav_food.append(food)

    # add your method here

In [None]:
#test it

## Other special methods: repr and str

You may have noticed that when we investigate a class instance, e.g. by writing `print(duck1)`, what we get back is not very useful.

In [None]:
print(duck1)

<__main__.DuckClass object at 0x7f7bb1ec6d30>


We do see that we have a DuckClass object but not much more. There is more information contained in the object though. Remember that this duck has a name and a list of favorite foods:


In [None]:
print(duck1.name)
print(duck1.fav_food)

Donald
['corn', 'snails']


To make the representation of the object instance more useful, we can add special methods to the class definition that control what we see when we display or print the object. In come `__repr__` and `__str__`.

In [None]:
class DuckClass:
    scientific_name = 'Anas platyrhynchos'

    def __init__(self, name = 'Donald'):
        self.name = name
        self.fav_food = []

    def say_quack(self):
        return 'Quack, quack!'

    def add_food(self, food):
        self.fav_food.append(food)

    def __repr__(self):
        return "DuckClass(name=%r,fav_food=%r)" % (self.name,self.fav_food)

    def __str__(self):
        return "This duck is called %s and likes %s." %(self.name,', '.join(self.fav_food))

In [None]:
#now we will have to re-build our duck1 since we have a new class definition

duck1 = DuckClass()
duck1.add_food('corn')
duck1.add_food('snails')

Now, writing only the name of the object instance will get us the repr:

In [None]:
duck1

DuckClass(name='Donald',fav_food=['corn', 'snails'])

Whereas a call to print will get us the str:

In [None]:
print(duck1)

This duck is called Donald and likes corn, snails.


We can also explicitly specify which one we want:

In [None]:
print(repr(duck1))
print(str(duck1))

DuckClass(name='Donald',fav_food=['corn', 'snails'])
This duck is called Donald and likes corn, snails.


Stack Overflow user moshez summarizes for us:

> In simple terms: almost every object you implement should have a functional `__repr__` that’s usable for understanding the object. Implementing `__str__` is optional: do that if you need a “pretty print” functionality (for example, used by a report generator).

For more details see the [full post](https://stackoverflow.com/questions/1436703/what-is-the-difference-between-str-and-repr).
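
One practical reason why `__repr__` is the one to implement first: if a class has no `__str__`, Python falls back to `__repr__` when printing, and containers such as lists always show the repr of their elements. A minimal sketch with a hypothetical `Goose` class (made up for this example):

```python
class Goose:                            # hypothetical class, for illustration only
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return "Goose(name=%r)" % self.name

goose1 = Goose('Gerda')

print(goose1)          # no __str__ defined, so print falls back to __repr__
print([goose1])        # a list shows the repr of each element
print(f"{goose1!r}")   # !r in an f-string asks for the repr explicitly
```

The first and third lines print `Goose(name='Gerda')`, and the second prints the same text inside square brackets.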

## Exercise 5: str and repr (5 mins)

Add a `__str__` and a `__repr__` method to your dog class from earlier and test that they work.

## Documenting classes

Like user-defined functions, user-defined classes are documented with the doc string following directly after the `class` statement:

In [1]:
class DuckClass:
    """A class of duck objects"""               #The class doc string
    scientific_name = 'Anas platyrhynchos'

    def __init__(self, name = 'Donald'):
        self.name = name
        self.fav_food = []

    def say_quack(self):
        return 'Quack, quack!'

    def add_food(self, food):
        self.fav_food.append(food)

Now we (or another user) can run `help` on the class to get a pretty extensive description:

In [2]:
help(DuckClass)

Help on class DuckClass in module __main__:

class DuckClass(builtins.object)
 |  DuckClass(name='Donald')
 |  
 |  A class of duck objects
 |  
 |  Methods defined here:
 |  
 |  __init__(self, name='Donald')
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  add_food(self, food)
 |  
 |  say_quack(self)
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak references to the object (if defined)
 |  
 |  ----------------------------------------------------------------------
 |  Data and other attributes defined here:
 |  
 |  scientific_name = 'Anas platyrhynchos'
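
If we only want the text of the doc string without the rest of the help page, it is also available as the `__doc__` attribute. A short sketch, using a cut-down version of the class so that it runs on its own:

```python
class DuckClass:
    """A class of duck objects"""
    scientific_name = 'Anas platyrhynchos'

duck1 = DuckClass()

# the doc string can be read from the class ...
print(DuckClass.__doc__)

# ... and also through any instance of it
print(duck1.__doc__)
```

Both lines print `A class of duck objects`.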



Methods inside the class definition are also user-defined functions, so we should add doc strings here too.

In [3]:
class DuckClass:
    """A class of duck objects"""
    scientific_name = 'Anas platyrhynchos'

    def __init__(self, name = 'Donald'):
        """The init method to create a new DuckClass instance

        Parameters
        ----------
        name : str, optional
            The name of the duck (default 'Donald')
        fav_food : list
            A list of favorite foods of this duck. To be added to with the add_food method.
        """
        self.name = name
        self.fav_food = []

    def say_quack(self):
        """A method to print 'Quack, quack!'"""
        return 'Quack, quack!'

    def add_food(self, food):
        """A method to add items to the fav_food instance attribute

        Parameters
        ----------
        food : str
            The food to add
        """
        self.fav_food.append(food)

In [4]:
help(DuckClass)

Help on class DuckClass in module __main__:

class DuckClass(builtins.object)
 |  DuckClass(name='Donald')
 |  
 |  A class of duck objects
 |  
 |  Methods defined here:
 |  
 |  __init__(self, name='Donald')
 |      The init method to create a new DuckClass instance
 |      
 |      Parameters
 |      ----------
 |      name : str, optional
 |          The name of the duck (default 'Donald')
 |      fav_food : list
 |          A list of favorite foods of this duck. To be added to with the add_food method.
 |  
 |  add_food(self, food)
 |      A method to add items to the fav_food instance attribute
 |      
 |      Parameters
 |      ----------
 |      food : str
 |          The food to add
 |  
 |  say_quack(self)
 |      A method to print 'Quack, quack!'
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak r