# Item 37: Compose Classes Instead of Nesting Many Levels of Built-in Types

Python's dictionary type is very useful for maintaining dynamic internal state over the lifetime of an object.

In [1]:
# Say we want to record the grades of a set of students whose names aren't known in advance. We can define a 
# class to store the names in a dictionary instead of using a predefined attribute for each student
class SimpleGradebook:
    def __init__(self):
        self._grades = {}

    def add_student(self, name):
        self._grades[name] = []

    def report_grade(self, name, score):
        self._grades[name].append(score)

    def average_grade(self, name):
        grades = self._grades[name]
        return sum(grades) / len(grades)

In [2]:
# Using the class is simple
book = SimpleGradebook()
book.add_student('Isaac Newton')
book.report_grade('Isaac Newton', 90)
book.report_grade('Isaac Newton', 95)
book.report_grade('Isaac Newton', 85)

print(book.average_grade('Isaac Newton'))

90.0


Dictionaries and their related built-in types are so easy to use that there's a danger of overextending them to write brittle code.

Say, for example, we want to extend the `SimpleGradebook` class to keep a list of grades by subject, not just overall. We can do this by changing the `_grades` dictionary to map student names (its keys) to yet another dictionary (its values). The innermost dictionary will map subjects (its keys) to a `list` of grades (its values).

In [5]:
# Here, we do this by using a defaultdict instance for the inner dictionary to handle missing subjects
from collections import defaultdict

class BySubjectGradebook:
    def __init__(self):
        self._grades = {} # Outer dict

    def add_student(self, name):
        self._grades[name] = defaultdict(list) # Inner dict
    
    # Because of this change, the report_grade and average_grade
    # methods gain a bit of complexity, but they are still manageable

    def report_grade(self, name, subject, grade):
        by_subject = self._grades[name]
        grade_list = by_subject[subject]
        grade_list.append(grade)

    def average_grade(self, name):
        by_subject = self._grades[name]
        total, count = 0, 0
        for grades in by_subject.values():
            total += sum(grades)
            count += len(grades)
        return total / count

In [6]:
# Using the above class remains simple enough
book = BySubjectGradebook()
book.add_student('Albert Einstein')
book.report_grade('Albert Einstein', 'Math', 75)
book.report_grade('Albert Einstein', 'Math', 65)
book.report_grade('Albert Einstein', 'Gym', 90)
book.report_grade('Albert Einstein', 'Gym', 95)
print(book.average_grade('Albert Einstein'))

81.25


In [7]:
# Now let's say that we want to track the weight of each score toward the overall grade in the class so that
# midterms and final exams are more important than quizzes. We can implement this by storing a
# (score, weight) tuple in the values list
class WeightedGradeBook:
    def __init__(self):
        self._grades = {} # Outer dict

    def add_student(self, name):
        self._grades[name] = defaultdict(list) # Inner dict

    def report_grade(self, name, subject, score, weight):
        by_subject = self._grades[name]
        grade_list = by_subject[subject]
        grade_list.append((score, weight))

    # Although the changes to the report_grade method were simple, the average_grade method
    # now has a loop within a loop, making it difficult to read

    def average_grade(self, name):
        by_subject = self._grades[name]

        score_sum, score_count = 0, 0
        for subject, scores in by_subject.items():
            subject_avg, total_weight = 0, 0
            for score, weight in scores:
                subject_avg += score * weight
                total_weight += weight
            
            score_sum += subject_avg / total_weight
            score_count += 1
        
        return score_sum / score_count

In [8]:
# Using the class is now more difficult: the positional arguments give no hint as to which number
# is the score and which is the weight
book = WeightedGradeBook()
book.add_student('Albert Einstein')
book.report_grade('Albert Einstein', 'Math', 75, 0.05)
book.report_grade('Albert Einstein', 'Math', 65, 0.15)
book.report_grade('Albert Einstein', 'Math', 70, 0.80)
book.report_grade('Albert Einstein', 'Gym', 100, 0.40)
book.report_grade('Albert Einstein', 'Gym', 85, 0.60)
print(book.average_grade('Albert Einstein'))

80.25


Whenever we encounter complexity like this, it's time to make the leap from built-in types to a hierarchy of classes.

It's important to note that Python's built-in types make it easy to keep adding layer after layer to the internal bookkeeping. We should avoid doing this for more than one level of nesting; using dictionaries that contain other dictionaries makes our code hard to read and sets us up for a maintenance nightmare.
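
To make the readability cost concrete, here is a minimal sketch of the raw structure that `WeightedGradeBook` ends up holding internally: a dictionary of dictionaries of lists of tuples. The names `grades` and `first_math_weight` are illustrative only and do not appear in the classes above.

```python
from collections import defaultdict

# Same shape as WeightedGradeBook._grades:
# student name -> subject -> [(score, weight), ...]
grades = {}
grades['Albert Einstein'] = defaultdict(list)
grades['Albert Einstein']['Math'].append((75, 0.05))
grades['Albert Einstein']['Math'].append((65, 0.15))

# Reading one value back takes four lookups, and the final index
# gives no hint as to whether it refers to a score or a weight.
first_math_weight = grades['Albert Einstein']['Math'][0][1]
print(first_math_weight)
```

Every piece of code that touches this structure has to know its exact shape, which is why a change at any level ripples through all of the methods.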

As soon as we realize that our bookkeeping is getting out of hand, the author suggests breaking it all out into classes. We can then provide well-defined interfaces that better encapsulate our data. This approach also enables us to create a layer of abstraction between our interfaces and our concrete implementations.

## Refactoring Classes

We can start our refactoring process by moving to classes at the bottom of the dependency tree: a single grade. A class seems too heavyweight for such simple information. We can instead use a `tuple` of `(score, weight)` to track grades in a `list`:

In [9]:
grades = []
grades.append((95, 0.45))
grades.append((85, 0.55))
total = sum(score * weight for score, weight in grades)
total_weight = sum(weight for _, weight in grades)
average_grade = total / total_weight

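The problem with the code above is that `tuple` instances are positional: every piece of code that reads a grade has to know that the first item is the score and the second is the weight.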

In [10]:
# Say we want to associate more info with a particular grade. We would need to rewrite every usage of the
# two-tuple to account for three items instead of two, which means using _ in even more places to
# ignore the fields we don't need
grades = []
grades.append((95, 0.45, 'Great job!'))
grades.append((85, 0.55, 'Better next time'))
total = sum(score * weight for score, weight, _ in grades)
total_weight = sum(weight for _, weight, _ in grades)
average_grade = total / total_weight

This pattern of extending tuples longer and longer is similar to deepening layers of dictionaries. As soon as we find ourselves going deeper than a two-tuple, it's time to consider another approach.

In this case, the `namedtuple` in the `collections` module does exactly what we need: it lets us define tiny, immutable data classes:

In [23]:
from collections import namedtuple

Grade = namedtuple('Grade', ('score', 'weight'))

These classes can be constructed with positional or keyword arguments. The fields are accessible with named attributes. Having named attributes makes it easy to move from a `namedtuple` to a class later if the requirements change again.
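
As a small sketch of the two construction styles and of attribute access, using the same `Grade` definition as above (the variable names `by_position` and `by_keyword` are illustrative):

```python
from collections import namedtuple

Grade = namedtuple('Grade', ('score', 'weight'))

# Positional and keyword construction produce equal instances.
by_position = Grade(95, 0.45)
by_keyword = Grade(weight=0.45, score=95)
assert by_position == by_keyword

# Fields are read through named attributes.
print(by_position.score, by_position.weight)

# Instances are immutable: assigning to a field raises AttributeError.
try:
    by_position.score = 100
except AttributeError:
    print('cannot modify a namedtuple field')
```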

## Limitations of `namedtuple`
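- Default argument values are awkward for `namedtuple` classes. The `defaults` parameter (Python 3.7+) applies only to the rightmost fields, which becomes unwieldy when the data has many optional properties.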
- The attribute values of `namedtuple` instances are still accessible using numerical indexes and iteration.
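
The second limitation can be seen in a short sketch. Because a `Grade` is still a tuple underneath, callers can bypass the field names entirely, and code written that way breaks if the field order ever changes:

```python
from collections import namedtuple

Grade = namedtuple('Grade', ('score', 'weight'))
grade = Grade(score=85, weight=0.55)

# Numerical indexing still works, so callers can come to depend on field order.
assert grade[0] == grade.score

# Unpacking by iteration works too.
score, weight = grade
print(score, weight)
```

This matters most for externally visible APIs, where we can't control how instances are used.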

In [24]:
# Next, we can write a class to represent a single subject that contains a set of grades
class Subject:
    def __init__(self):
        self._grades = []

    def report_grade(self, score, weight):
        self._grades.append(Grade(score, weight))

    def average_grade(self):
        total, total_weight = 0, 0
        for grade in self._grades:
            total += grade.score * grade.weight
            total_weight += grade.weight
        return total / total_weight

In [25]:
# Next, we write a class to represent a set of subjects that are being studied by a single student
class Student:
    def __init__(self):
        self._subjects = defaultdict(Subject)

    def get_subject(self, name):
        return self._subjects[name]
    
    def average_grade(self):
        total, count = 0, 0
        for subject in self._subjects.values():
            total += subject.average_grade()
            count += 1
        return total / count

In [26]:
# Finally, we'd write a container for all students, keyed dynamically by their names
class Gradebook:
    def __init__(self):
        self._students = defaultdict(Student)

    def get_student(self, name):
        return self._students[name]

Although these classes contain more lines of code than the previous ones, they are much clearer and easier to read.

In [28]:
# Usage of these classes is also clearer and more extensible
book = Gradebook()
albert = book.get_student('Albert Einstein')
math = albert.get_subject('Math')
math.report_grade(75, 0.05)
math.report_grade(65, 0.15)
math.report_grade(70, 0.80)
gym = albert.get_subject('Gym')
gym.report_grade(100, 0.40)
gym.report_grade(85, 0.60)
print(albert.average_grade())

80.25
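
As a sketch of the "move to a class later" path, suppose the requirements change again and each grade needs an optional note. One possible approach, which is an assumption and not taken from the original item, is to swap the `namedtuple` for a frozen `dataclass`. The `notes` field and the extra `notes` parameter on `report_grade` are hypothetical additions; `average_grade` is unchanged because it only ever used named attributes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grade:
    score: float
    weight: float
    notes: str = ''  # hypothetical extra field with a default value


class Subject:
    def __init__(self):
        self._grades = []

    def report_grade(self, score, weight, notes=''):
        self._grades.append(Grade(score, weight, notes))

    def average_grade(self):
        # Identical to the namedtuple version: only named attributes are used.
        total, total_weight = 0, 0
        for grade in self._grades:
            total += grade.score * grade.weight
            total_weight += grade.weight
        return total / total_weight


math = Subject()
math.report_grade(75, 0.05)
math.report_grade(65, 0.15, notes='Better next time')
math.report_grade(70, 0.80)
print(math.average_grade())
```

Existing callers that pass only a score and a weight keep working, which is the kind of extensibility the nested-dictionary version could not offer.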
