# WHEN TO USE OOP?

Identifying objects is a very important task in object-oriented analysis and programming.

Remember, objects are things that have both data and behaviour.

If we are working only with data, we are often better off storing it in a list, set, dictionary or other data structure.

On the other hand, if we are working only with behaviour, but no stored data, a simple function is more suitable.

*An object, however, has both data and behaviour.*

Proficient Python programmers use built-in data structures unless there is an obvious need to define a class.

There is no reason to add an extra level of complexity if it does not help organize our code.

On the other hand, the need is not always self-evident.

We can often start our programs by storing data in a few variables.

As the program expands, we will later find that we are passing the same set of related variables to a set of functions.

**This is the time to think about grouping both variables and functions into a class.**



For example, if we are designing a program to model polygons in two-dimensional space, we might start with each polygon represented as a list of points.

Each point would be modeled as a two-tuple (x, y) describing where that point is located.

This is all data, stored in a set of nested data structures.

We can (and often do) start hacking by:


In [1]:
square = [ (1,1), (1,2), (2,2), (2,1) ]

Now, if we want to calculate the distance around the perimeter of the polygon, we need to sum the distances between each point.

To do this, we need a function to calculate the distance between two points.

In [2]:
from math import hypot

def distance(p1, p2):
    return hypot(p1[0]-p2[0], p1[1]-p2[1])


In [8]:
distance(square[0], square[1])


1.0

In [9]:
def perimeter(polygon):
    
    pairs = zip(polygon, polygon[1:] + [polygon[0]])
    
    return sum(distance(p1, p2) for p1, p2 in pairs)

In [10]:
perimeter(square)

4.0

We can write an object-oriented version in record time. Let's compare them as follows:


In [13]:
from math import hypot
from typing import List, Tuple, Optional, Iterable


class Point:
    
    def __init__(self, x: float, y: float) -> None:
        self.x = x
        self.y = y
        
    def distance(self, other: 'Point') -> float:
        return hypot(self.x - other.x, self.y - other.y)
    
class Polygon:
    
    def __init__(self) -> None:
        self.vertices: List[Point] = []
        
    def add_point(self, point: Point) -> None:
        self.vertices.append(point)
        
    def perimeter(self) -> float:
        pairs = zip(self.vertices, self.vertices[1:] + [self.vertices[0]])
        
        return sum(p1.distance(p2) for p1, p2 in pairs)

In [14]:
square = Polygon()
square.add_point(Point(1,1))
square.add_point(Point(1,2))
square.add_point(Point(2,2))
square.add_point(Point(2,1))

In [15]:
square.perimeter()

4.0

That's fairly succinct and easy to read, you might think, but let's compare it to the function-based code:


In [17]:
square = [(1,1), (1,2), (2,2), (2,1)]
perimeter(square)



4.0

Hmm, maybe the object-oriented API isn't so compact! 


Code length is not a good indicator of code complexity. 

Some programmers get hung up on complicated one-liners that do an incredible amount of work in one line of code. 

This can be a fun exercise, but the result is often unreadable, even to the original author the following day. 

Minimizing the amount of code can often make a program easier to read, but do not blindly assume this is the case.

**No one wins at code golf. Minimizing the volume of code is rarely desirable.**

Luckily, this trade-off isn't necessary. We can make the object-oriented Polygon API as easy to use as the functional implementation. 

All we have to do is alter our Polygon class so that it can be constructed with multiple points.

Let's give it an initializer that accepts a list of Point objects:

In [18]:
class Polygon_2:
    def __init__(self, vertices: Optional[Iterable[Point]] = None) -> None:
        self.vertices = list(vertices) if vertices else []
    def perimeter(self) -> float:
        pairs = zip(
            self.vertices, self.vertices[1:] + self.vertices[:1])
        return sum(p1.distance(p2) for p1, p2 in pairs)


For the perimeter() method, we've used the zip() function to create pairs of vertices, with items drawn from two lists to create a sequence of pairs. 

One list provided to zip() is the complete sequence of vertices. 

The other list of vertices starts from vertex 1 (not 0) and ends with the vertex before 1 (that is, vertex 0). 

For a triangle, this will make three pairs:
 
(v[0], v[1]), (v[1], v[2]), and (v[2], v[0]). 

We can then compute the distance between the pairs using Point.distance(). 

Finally, we sum the sequence of distances. This seems to improve things considerably. 
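The pairing step can be checked in isolation. The following is a minimal sketch that uses plain strings as stand-ins for the vertices of a triangle; the labels and variable names are illustrative and not part of the classes above.

```python
# Stand-in labels for the three vertices of a triangle (illustrative only).
v = ["v0", "v1", "v2"]

# v[1:] drops the first vertex and v[:1] is a one-item list holding it,
# so the second list is the first one rotated left by one position.
rotated = v[1:] + v[:1]

pairs = list(zip(v, rotated))
print(pairs)
# [('v0', 'v1'), ('v1', 'v2'), ('v2', 'v0')]

# The slice v[:1] is safe on an empty list, whereas indexing with [0] would raise.
empty: list = []
print(list(zip(empty, empty[1:] + empty[:1])))
# []
```

Using the slice `self.vertices[:1]` instead of `[self.vertices[0]]` means a polygon with no vertices gets a perimeter of 0 rather than an `IndexError`.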

We can now use this class like the original hacked-in function definitions:

In [19]:
square = Polygon_2([Point(1,1), Point(1,2), Point(2,2), Point(2,1)])

In [20]:
square.perimeter()

4.0

It's handy to have the details of the individual method definitions. 

We've built an API that's close to the original, succinct set of definitions. 

We've added enough formality to be confident the code is likely to work before we even start putting test cases together.

Let's take one more step. Let's allow it to accept tuples too, and we can construct the Point objects ourselves, if needed:

In [21]:
from typing import Union

Pair = Tuple[float, float]

Point_or_Tuple = Union[Point, Pair]

class Polygon_3:
    def __init__(self, vertices: Optional[Iterable[Point_or_Tuple]] = None) -> None:
        self.vertices: List[Point] = []
        if vertices:
            for point_or_tuple in vertices:
                self.vertices.append(self.make_point(point_or_tuple))
    @staticmethod
    def make_point(item: Point_or_Tuple) -> Point:
        return item if isinstance(item, Point) else Point(*item)


This initializer goes through the list of items (either Point or Tuple[float, float]) and ensures that any non-Point objects are converted to Point instances.
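As a quick check of that conversion, here is a sketch that mixes Point objects and plain tuples in one call. The Point and Polygon_3 definitions are repeated (Point without its distance method) only so the block runs on its own; the names `corner` and `mixed` are illustrative.

```python
from typing import Iterable, List, Optional, Tuple, Union


class Point:
    def __init__(self, x: float, y: float) -> None:
        self.x = x
        self.y = y


Pair = Tuple[float, float]
Point_or_Tuple = Union[Point, Pair]


class Polygon_3:
    def __init__(self, vertices: Optional[Iterable[Point_or_Tuple]] = None) -> None:
        self.vertices: List[Point] = []
        if vertices:
            for point_or_tuple in vertices:
                self.vertices.append(self.make_point(point_or_tuple))

    @staticmethod
    def make_point(item: Point_or_Tuple) -> Point:
        return item if isinstance(item, Point) else Point(*item)


corner = Point(1, 1)
mixed = Polygon_3([corner, (1, 2), Point(2, 2), (2, 1)])

# Every stored vertex is a Point, whatever form it arrived in.
print(all(isinstance(v, Point) for v in mixed.vertices))  # True
# An existing Point is stored as-is rather than copied.
print(mixed.vertices[0] is corner)  # True
# A tuple has been unpacked into the x and y attributes.
print(mixed.vertices[1].x, mixed.vertices[1].y)  # 1 2
```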

For an example this small, there's no clear winner between the object-oriented and more data-oriented versions of this code. They all do the same thing. 

If we have new functions that accept a polygon argument, such as area(polygon) or point_in_polygon(polygon, x, y), the benefits of the object-oriented code become increasingly obvious.
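To make this concrete, here is one way a second behaviour could sit next to perimeter(). This is only a sketch: the class name `PolygonSketch` and the helper `_edges` are hypothetical, vertices are kept as plain tuples for brevity, and the area uses the shoelace formula, which assumes a simple (non-self-intersecting) polygon.

```python
from math import hypot
from typing import Iterable, Iterator, List, Tuple

Pair = Tuple[float, float]


class PolygonSketch:  # hypothetical name, for illustration only
    def __init__(self, vertices: Iterable[Pair]) -> None:
        self.vertices: List[Pair] = list(vertices)

    def _edges(self) -> Iterator[Tuple[Pair, Pair]]:
        # Pair each vertex with its successor, wrapping around at the end.
        return zip(self.vertices, self.vertices[1:] + self.vertices[:1])

    def perimeter(self) -> float:
        return sum(hypot(x1 - x2, y1 - y2) for (x1, y1), (x2, y2) in self._edges())

    def area(self) -> float:
        # Shoelace formula; assumes the polygon does not intersect itself.
        twice_area = sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in self._edges())
        return abs(twice_area) / 2


square = PolygonSketch([(1, 1), (1, 2), (2, 2), (2, 1)])
print(square.perimeter())  # 4.0
print(square.area())       # 1.0
```

Both methods share the vertex list and the edge-pairing helper, which is the kind of reuse that is harder to keep tidy with free functions over a bare list.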

Likewise, if we add other attributes to the polygon, such as color or texture, it makes more and more sense to encapsulate that data into a single class.

The distinction is a design decision, but in general, the more important a set of data is, the more likely it is to have multiple functions specific to that data, and the more useful it is to use a class with attributes and methods.

When making this decision, it also pays to consider how the class will be used. 

If we're only trying to calculate the perimeter of one polygon in the context of a much greater problem, using a function will probably be the quickest to code and the easiest to use for a one-off calculation. 

On the other hand, if our program needs to manipulate numerous polygons in a wide variety of ways (calculating the perimeter, area, and intersection with other polygons, moving or scaling them, and so on), we have almost certainly identified a class of related objects. 

The class definition becomes more important as the number of instances increases.

# ADDING BEHAVIOURS TO CLASS DATA WITH PROPERTIES

So far, we've focused on the separation of behavior and data. 

This is very important in object-oriented programming, but we're about to see that, in Python, the distinction is uncannily blurry. 

Python is very good at blurring distinctions; it doesn't exactly help us to think outside the box. 

Rather, it teaches us to stop thinking about the box.

Before we get into the details, let's discuss some bad object-oriented design principles. 

Many object-oriented developers teach us to never access attributes directly. 

They insist that we write attribute access like this:

In [22]:
class Color:
    def __init__(self, rgb_value: int, name: str) -> None:
        self._rgb_value = rgb_value
        self._name = name
    def set_name(self, name: str) -> None:
        self._name = name
    def get_name(self) -> str:
        return self._name
    def set_rgb_value(self, rgb_value: int) -> None:
        self._rgb_value = rgb_value
    def get_rgb_value(self) -> int:
        return self._rgb_value

*The instance variables are prefixed with an underscore to suggest that they are private (other languages would actually force them to be private)*. 

**Then, the get and set methods provide access to each variable.** 

This class would be used in practice as follows:

In [23]:
c = Color(0xff0000, 'bright red')

In [24]:
c.get_name()

'bright red'

In [25]:
c.set_name('reddish red')

In [26]:
c.get_name()

'reddish red'

The above example is not nearly as readable as the following:

In [27]:
class Color_Py:
    def __init__(self,rgb_value: int, name: str) -> None:
        self.rgb_value = rgb_value
        self.name = name

In [28]:
c = Color_Py(0xff0000, 'bright redPy')

In [29]:
c.name

'bright redPy'

In [30]:
c.name = 'reddish redPy'

In [31]:
c.name

'reddish redPy'

The idea of setters and getters seems helpful for encapsulating the class definition.

One ongoing justification for getters and setters is that someday, we may want to add extra code when a value is set or retrieved.

For example, we could decide to cache a value to avoid complex computations, or we might want to validate that a given value is suitable for input.

For example:

In [32]:
class Color_V:
    def __init__(self, rgb_value: int, name: str) -> None:
        self._rgb_value = rgb_value
        self._name = name
        if not name:
            raise ValueError("Invalid Name")
    def _set_name(self, name: str) -> None:
        if not name:
            raise ValueError("Invalid Name")
        self._name = name
    

If we had written our original code for direct attribute access and then later changed it to a method like the preceding one, we would have a problem:

Anyone who had written code that accessed the attribute directly would now have to change their code to call a method.

If they did not change the access style from attribute access to a function call, their code would be broken.
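The breakage is easy to reproduce. In this sketch, `ColorBefore`, `ColorAfter`, and `describe` are hypothetical names standing in for two versions of a class and for client code written against the first version.

```python
class ColorBefore:  # hypothetical "version 1": name is a plain attribute
    def __init__(self, name: str) -> None:
        self.name = name


class ColorAfter:  # hypothetical "version 2": name has become a method
    def __init__(self, name: str) -> None:
        self._name = name

    def name(self) -> str:
        return self._name


def describe(color) -> str:
    # Client code written against version 1: attribute access, no call.
    return "color: " + color.name


print(describe(ColorBefore("red")))  # color: red

try:
    describe(ColorAfter("red"))
except TypeError as exc:
    # color.name is now a bound method, and str + method is not allowed.
    print("client code broke:", type(exc).__name__)
```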

The mantra that we should make all attributes private, accessible through methods, does not make much sense in Python.

*The Python language lacks any real concept of private members!* 

**We are all adults!** 

We can make the syntax distinction between attribute and method less visible.

Python gives us the `property` function to make methods that look like attributes.

We can therefore write our code to use direct member access, and if we ever unexpectedly need to alter the implementation to do some calculation when getting or setting that attribute's value, we can do so without changing the interface.
 

In [33]:
class Color_VP:
    
    def __init__(self, rgb_value: int, name: str) -> None:
        self._rgb_value = rgb_value
        if not name:
            raise ValueError(f"Invalid Name: {name!r}")
        self._name = name
    
    def _set_name(self, name: str) -> None:
        if not name:
            raise ValueError(f"Invalid Name: {name!r}")
        self._name = name
    
    def _get_name(self) -> str:
        return self._name
    
    ## Now we can use the property function to create a property object
    name = property(_get_name, _set_name)

Compared to the earlier class, we first change the name attribute into a private attribute, `_name`. 

Then, we add two more private methods to get and set that variable, performing our validation when we set it.

Finally, we use the property construction at the bottom.

This creates a new attribute on the `Color_VP` class called `name`.

It sets this attribute to be a property.

Under the hood, a `property` attribute delegates the real work to the two methods we just created.

When used in an access context, the first function gets the value.

When used in an update context, the second function sets the value.
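The delegation can be made visible by recording each call. This is a minimal sketch: `TracedColor` and the module-level `calls` list are hypothetical and exist only to show which method runs in which context.

```python
from typing import List

calls: List[str] = []  # illustrative call log, not part of the original class


class TracedColor:  # hypothetical class for illustration
    def __init__(self, name: str) -> None:
        self._name = name  # writes the private attribute, so no setter call here

    def _get_name(self) -> str:
        calls.append("get")
        return self._name

    def _set_name(self, name: str) -> None:
        calls.append(f"set {name!r}")
        self._name = name

    name = property(_get_name, _set_name)


c = TracedColor("bright red")
c.name = "red3"   # update context: runs _set_name
value = c.name    # access context: runs _get_name
print(calls)      # ["set 'red3'", 'get']
print(value)      # red3
```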

This new version of the `Color` class can be used in exactly the same way as the earlier version, yet it now performs validation when we set the `name` attribute: 

In [34]:
c = Color_VP(0xff0000, 'bright red')

In [35]:
c.name

'bright red'

In [36]:
c.name = 'red3'

In [37]:
c.name

'red3'

In [38]:
c.name = ''

ValueError: Invalid Name: ''

So, if we had previously written code to access the name attribute directly, and then changed the class to use our property-based object, the previous code would still work.

The only exception is code that attempts to set an empty name, which now raises a ValueError. That is exactly the behaviour we wanted to forbid, so this is a success.

Bear in mind that, even with the name property, the previous code is not 100% safe.

People can still access the _name attribute directly and set it to an empty string if they want to.

But if they access a variable we have explicitly marked with an underscore to suggest it is private, they're the ones who have to deal with the consequences.
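The following sketch shows both paths side by side. The Color_VP definition is repeated from above only so the block runs on its own.

```python
class Color_VP:
    def __init__(self, rgb_value: int, name: str) -> None:
        self._rgb_value = rgb_value
        if not name:
            raise ValueError(f"Invalid Name: {name!r}")
        self._name = name

    def _set_name(self, name: str) -> None:
        if not name:
            raise ValueError(f"Invalid Name: {name!r}")
        self._name = name

    def _get_name(self) -> str:
        return self._name

    name = property(_get_name, _set_name)


c = Color_VP(0xff0000, "bright red")

# Going through the public property triggers validation.
try:
    c.name = ""
except ValueError as exc:
    print(exc)       # Invalid Name: ''

# Writing to the underscore attribute skips the setter entirely.
c._name = ""
print(repr(c.name))  # ''
```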

# PROPERTIES IN DETAIL


Think of the property function as returning an object that proxies any requests to get or set the attribute value through the method names we have specified.

The property built-in is like a constructor for such an object, and that object is set as the public facing member for the given attribute.
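One way to see this proxy object is to fetch it from the class and call its stored functions by hand. This is a sketch only: `ColorSketch` is a hypothetical, stripped-down class, while `fget` and `fset` are the standard attributes of a `property` object that hold the getter and setter.

```python
class ColorSketch:  # hypothetical class for illustration
    def __init__(self, name: str) -> None:
        self._name = name

    def _get_name(self) -> str:
        return self._name

    def _set_name(self, name: str) -> None:
        self._name = name

    name = property(_get_name, _set_name)


# The property object is stored on the class, not on each instance.
prop = ColorSketch.__dict__["name"]
print(type(prop) is property)                # True
print(prop.fget is ColorSketch._get_name)    # True
print(prop.fset is ColorSketch._set_name)    # True

c = ColorSketch("bright red")
prop.fset(c, "red3")       # does the same work as: c.name = "red3"
print(c.name)              # red3
print(prop.fget(c))        # red3
print("name" in vars(c))   # False: the instance only stores _name
```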

This property constur