# Objects
Click [here](https://datahub.berkeley.edu/hub/user-redirect/git-pull?repo=https%3A%2F%2Fgithub.com%2Fberkeley-physics%2FPython-Tutorials&urlpath=tree%2FPython-Tutorials%2F2+-+Intermediate%2F2+-+Objects.ipynb&branch=master) to open this notebook in the DataHub.

## Learning objectives
By the end of this tutorial, you will be able to:
- Distinguish between functions and methods
- Recognise and avoid errors caused by copying pointers rather than objects
- Create your own class
- Use matplotlib's object-oriented approach

## Relevant documentation
- [Python data structures tutorial](https://docs.python.org/3/tutorial/datastructures.html)
- [NumPy `ndarray` class](https://numpy.org/doc/stable/reference/generated/numpy.ndarray.html)
- [Python `copy` module](https://docs.python.org/3/library/copy.html)
- [Python classes tutorial](https://docs.python.org/3/tutorial/classes.html)
- [Matplotlib `Figure` class](https://matplotlib.org/api/_as_gen/matplotlib.figure.Figure.html#matplotlib.figure.Figure)
- [Matplotlib `Axes` class](https://matplotlib.org/api/axes_api.html#matplotlib.axes.Axes)

## Primitives vs objects
An _object_ is an entity that bundles data with the functions that act on that data. A _primitive_, on the other hand, is simply a value. Consider the difference between a list and an integer. A list has many associated pieces of data, including its elements, and associated actions, like adding and removing values. It is convenient to reach all of these through a single reference to the list. (Strictly speaking, every value in Python is an object, integers included, but simple immutable values like numbers can usually be treated as plain values.)

The associated data values are called _attributes_ and the associated functions are called _methods_. Both are accessed using a dot after the reference to the object itself, as shown below.

In [None]:
x = [1,2,3]
x.remove(1) #method that removes first instance of given value
print(x)
x.append(4) #method to add element at end
print(x)

In [None]:
import numpy as np

y = np.arange(10)
print(type(y))
y.size, y.shape, y.dtype #some attributes of numpy array object

In [None]:
z = y.reshape((5,2)) #one method of numpy array object
z.shape
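
To make the distinction between functions and methods concrete, here is a small sketch using only built-in list operations (the variable names are illustrative). A function is called on its own and receives the object as an argument, while a method is looked up on the object with a dot.

```python
values = [3, 1, 2]

# A function is called on its own and takes the object as an argument.
n = len(values)            # built-in function
ordered = sorted(values)   # returns a new list, leaves `values` unchanged

print(n, ordered, values)

# A method is looked up on the object with a dot and acts on that object.
result = values.sort()     # list method: sorts `values` in place

print(result, values)      # the method itself returns None
```

Note that `sorted` leaves the original list alone, whereas the `sort` method modifies the list it is called on.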

A common error is to confuse the object itself with a reference to it. In the following example, the variable `x` stores a _pointer_ to a list. Then the variable assignment `y=x` assigns the same pointer to `y`, so we now have two references to the __same__ list.

In [None]:
x = [1,2,3,4]
y = x
x.append(5)
print(x)
print(y)
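
One way to check whether two variables refer to the same object is the `is` operator, which compares identity rather than contents. The following sketch uses illustrative names `a`, `b` and `c`.

```python
a = [1, 2, 3]
b = a            # second name for the same list
c = [1, 2, 3]    # separate list with equal contents

print(a is b)    # True: one object, two names
print(a is c)    # False: two distinct objects
print(a == c)    # True: the contents compare equal
print(id(a) == id(b))  # True: both names have the same identity
```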

If we instead wanted to copy the list, we could use the `copy` method. This ensures that `y` gets set to a pointer to a second object, which is a copy of the first object.

In [None]:
x = [1,2,3,4]
y = x.copy()
x.append(5)
print(x)
print(y)

However, this is only a _shallow_ copy. If an element of the list is itself a pointer to another object (such as a nested list), the same problem occurs: the pointer gets copied rather than the object, so we have multiple references to the same object, rather than multiple objects.

In [None]:
x = [[1,2],3,4]
y = x.copy()
x[0].append(5)
print(x)
print(y)

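The Python module `copy` contains a function `deepcopy` that recursively copies each object encountered, preventing such errors from occurring at any depth. This works with most objects, including instances of your own classes, although a few types (such as modules or open files) cannot be copied.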

In [None]:
from copy import deepcopy
x = [[1,2],3,4]
y = deepcopy(x)
x[0].append(5)
print(x)
print(y)
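
As a side-by-side comparison, the sketch below makes a shallow and a deep copy of the same nested list and uses `is` to check which inner lists are shared. The variable names are illustrative.

```python
from copy import copy, deepcopy

original = [[1, 2], 3, 4]
shallow = copy(original)      # for a list, equivalent to original.copy()
deep = deepcopy(original)

print(shallow[0] is original[0])  # True: the inner list is shared
print(deep[0] is original[0])     # False: the inner list was copied too

original[0].append(5)
print(shallow)  # [[1, 2, 5], 3, 4]
print(deep)     # [[1, 2], 3, 4]
```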

NumPy arrays have their own `copy` method. Since all elements of an array must share a single data type, arrays rarely contain references to other mutable objects, so a shallow copy is typically enough.

In [None]:
x = np.array([1,2,3])
y = x
x[0] = 0
print(x)
print(y)

In [None]:
x = np.array([1,2,3])
y = x.copy()
x[0] = 0
print(x)
print(y)
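
A related pitfall, sketched here as a supplement, is that basic slicing of a NumPy array returns a _view_ onto the same data rather than a copy. The function `np.shares_memory` can be used to check this. The variable names are illustrative.

```python
import numpy as np

arr = np.arange(5)
view = arr[1:4]                # basic slicing returns a view, not a copy
independent = arr[1:4].copy()  # explicit copy with its own data

arr[2] = 99
print(view)         # the view sees the change made through `arr`
print(independent)  # the copy still holds the original values

print(np.shares_memory(arr, view))         # True
print(np.shares_memory(arr, independent))  # False
```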

## Creating your own classes
You can define your own data structures by defining a `class`, similarly to how you define a function. The most important method is `__init__(self, *args)`: it is called whenever a new instance of the class is created and sets up that instance, so it is often referred to as the _constructor_. Inside a method, the special first argument `self` refers to the instance itself, and you use it to access or modify attributes and to call other methods. The `__repr__` method determines how the object is represented as a string, which is also what `print` shows unless a separate `__str__` method is defined.

The following class allows us to store any rational number exactly (since arbitrarily large integers can be stored exactly in Python). The `__float__` method defines how the object can be converted to a float. In general, names surrounded by double underscores have special meanings in Python.

In [None]:
from math import gcd

class Fraction:
    def __init__(self, num, den):
        r = gcd(num, den)
        self.num = num//r
        self.den = den//r
        
    def __repr__(self):
        return "%d/%d"%(self.num,self.den)
    
    def __float__(self):
        return self.num/self.den
    
half = Fraction(3,6) #calling constructor (__init__), passing num and den

In [None]:
type(half) #reference to class itself

In [None]:
print(half) #print uses __str__, which falls back to __repr__ here

In [None]:
float(half) #calling __float__
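
To see what `self` refers to, the sketch below uses a hypothetical `Counter` class (not part of the `Fraction` code above). Calling a method on an instance is equivalent to calling the function stored on the class and passing the instance as the first argument.

```python
class Counter:
    """Hypothetical example class, used only to illustrate `self`."""

    def __init__(self, start=0):
        self.count = start          # attribute stored on this instance

    def increment(self, step=1):
        self.count += step          # `self` is the instance the method was called on
        return self.count

c = Counter()
c.increment()            # usual method call; Python passes `c` as `self`
Counter.increment(c, 2)  # the same kind of call written as a plain function on the class
print(c.count)           # 3
```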

The following functions perform arithmetic with fractions. Convert them to methods of the `Fraction` class (replacing the first argument with `self`). This will allow us to write things like `half.invert()` or `half.add(Fraction(1,3))`, freeing up the global namespace.

In [None]:
def invert(frac):
    return Fraction(frac.den, frac.num)
    
def add(frac1,frac2):
    return Fraction(frac1.num*frac2.den+frac2.num*frac1.den, frac1.den*frac2.den)

def multiply(frac1,frac2):
    return Fraction(frac1.num*frac2.num, frac1.den*frac2.den)

You can even figure out how to name the methods to work with `+` and `*`, so you can write things like `Fraction(1,3)+Fraction(1,2)`.
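
As a hint that does not give away the exercise, here is a sketch of operator methods on a different, hypothetical class `Vec2`. Python calls `__add__` when it evaluates `a + b` and `__mul__` when it evaluates `a * b`.

```python
class Vec2:
    """Hypothetical 2D vector, used only to illustrate operator methods."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return "Vec2(%r, %r)" % (self.x, self.y)

    def __add__(self, other):
        # Called for `self + other`
        return Vec2(self.x + other.x, self.y + other.y)

    def __mul__(self, scalar):
        # Called for `self * scalar`
        return Vec2(self.x * scalar, self.y * scalar)

v = Vec2(1, 2) + Vec2(3, 4)
w = v * 2
print(v, w)  # Vec2(4, 6) Vec2(8, 12)
```

The same pattern should carry over to `Fraction`, with the arithmetic from the `add` and `multiply` functions above.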

## Matplotlib Axes objects
So far, you've probably used the global imperative `matplotlib` functions like `plt.plot`. However, this can become confusing, especially when we have multiple plots in the same figure. How do we specify which plot we want to modify? In such cases (while there are less elegant workarounds), the preferred approach is to work with the `Figure` and `Axes` objects, which is what matplotlib uses under the hood anyway.

If you're already plotting something, you can obtain references to the Figure or Axes objects that are currently in focus by calling `plt.gcf()` or `plt.gca()`. This allows you to set more specific options than you can with the imperative approach.

In [None]:
%matplotlib inline
from matplotlib import pyplot as plt

x = np.linspace(0,10,1000)
y = np.exp(-x)

plt.plot(x,y)
plt.xlabel("$x$")
plt.ylabel("$y$") 
#try applying a log-scale to the y-axis

In [None]:
plt.plot(x,y)
plt.xlabel("$x$")
plt.ylabel("$y$")

ax = plt.gca()
print(type(ax))
ax.set_yscale("log")

There are several ways to interact with matplotlib: you can explicitly create Axes objects and add them to a Figure object, you can use helper functions that create and return both at the same time, or you can avoid referring to the objects except where necessary. The imperative plotting commands you are used to are wrappers around methods of Axes objects, and those methods can be called directly. The following cell is equivalent to the one above.

In [None]:
ax = plt.gca()
ax.plot(x,y)
ax.set_xlabel("$x$")
ax.set_ylabel("$y$")
ax.set_yscale("log")
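
For the helper-function route, a minimal sketch is to call `plt.subplots()` with no arguments, which creates a Figure and a single Axes together. The `Agg` backend line is an assumption that lets the sketch run as a standalone script without a display; in a notebook it can be omitted.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend for running outside a notebook
import numpy as np
from matplotlib import pyplot as plt

x = np.linspace(0, 10, 1000)
y = np.exp(-x)

fig, ax = plt.subplots()   # creates a Figure and a single Axes together
ax.plot(x, y)
ax.set_xlabel("$x$")
ax.set_ylabel("$y$")
ax.set_yscale("log")

print(ax.figure is fig)    # True: the Axes knows which Figure it belongs to
```

Holding explicit references like `fig` and `ax` avoids relying on whichever plot is currently in focus.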

The `plt.subplots` function is useful for creating figures with multiple subplots. It returns a Figure object and an array of Axes objects.

In [None]:
f, ax = plt.subplots(2, 2, figsize=(16,9))
ax[0,0].plot(x, y)
ax[0,1].plot(x, -y)
ax[1,0].plot(-x, y)
ax[1,1].plot(-x, -y)
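
Because the Axes objects come back in a NumPy array, they can also be handled in a loop. The sketch below draws the same four panels by iterating over `axes.flat`. The titles and the `signs` list are illustrative additions, and the `Agg` backend line is an assumption for running outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend for running outside a notebook
import numpy as np
from matplotlib import pyplot as plt

x = np.linspace(0, 10, 1000)
y = np.exp(-x)

fig, axes = plt.subplots(2, 2, figsize=(8, 6))
print(type(axes), axes.shape)   # a NumPy array of shape (2, 2)

# `axes.flat` iterates over all four Axes in row-major order.
signs = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
for ax, (sx, sy) in zip(axes.flat, signs):
    ax.plot(sx * x, sy * y)
    ax.set_title("x sign %+d, y sign %+d" % (sx, sy))

fig.tight_layout()   # reduce overlap between subplot labels
```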