# Class Design Exercise

<style>
section.present > section.present { 
    max-height: 90%; 
    overflow-y: scroll;
}
</style>

<small><a href="https://colab.research.google.com/github/brandeis-jdelfino/cosi-10a/blob/main/lectures/notebooks/13_class_design.ipynb">Link to interactive slides on Google Colab</a></small>

# Exercise

Create a program to model a library network:
* Our libraries deal in books only - no need to model magazines, video, etc.
* There are multiple library branches, each with its own book inventory.
* Library patrons have accounts in the library network, and can have books out on loan.
* Data on books, branches, and patrons will be loaded from files.

Our program should be able to:
* List the branches and their info.
* Print out all the books from a branch.
* Print out the books a patron has checked out.
* Provide very simple text search (substring matching) over book titles.

# File structure

We have 4 files. First 2:
* `branches.csv`
  * A comma-delimited file that contains one row per branch, with 3 string fields: 
     * `id`
     * `name`
     * `address`
* `patrons.csv`
  * A comma-delimited file that contains one row per patron, with 2 string fields: 
     * `id`
     * `name`
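
To make the row layout concrete, here is a minimal sketch that parses a few rows with Python's `csv` module. The sample IDs, names, and addresses are made up for illustration, an in-memory string stands in for each file, and we assume the files have no header row.

```python
import csv
import io

# Hypothetical sample contents; the real files use longer IDs.
sample_branches = "b1,Main Branch,1 Main St\nb2,North Branch,22 Oak Ave\n"
sample_patrons = "p1,Ada Lovelace\np2,Alan Turing\n"

# csv.reader yields each row as a list of strings.
branch_rows = list(csv.reader(io.StringIO(sample_branches)))
patron_rows = list(csv.reader(io.StringIO(sample_patrons)))

print(branch_rows[0])  # ['b1', 'Main Branch', '1 Main St']
print(patron_rows[1])  # ['p2', 'Alan Turing']
```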

# File structure

We have 4 files. Third file:
* `books.json`
  * A JSON file with a list of dictionaries. Each dictionary represents a unique title. They have the following fields:
    * `id` (str)
    * `title` (str)
    * `description` (str)
    * `copies`
      * A list of dictionaries, each of which has 2 keys: `id` (the copy's id) and `branch_id`.

# File structure

We have 4 files. Last file:
* `checkouts.json`
  * A JSON file with a list of dictionaries. Each dictionary represents a checkout of a book copy by a patron. They have the following fields:
    * `patron_id` (str)
    * `copy_id` (str)
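
As a quick illustration of that shape, this sketch parses a made-up checkout list with `json.loads`. The IDs are invented and much shorter than the ones in the real file.

```python
import json

# Hypothetical sample contents in the shape described above.
sample_checkouts = """
[
  {"patron_id": "p1", "copy_id": "c1"},
  {"patron_id": "p1", "copy_id": "c7"}
]
"""

# The top-level JSON list becomes a Python list of dictionaries.
checkouts = json.loads(sample_checkouts)
for checkout in checkouts:
    print(checkout["patron_id"], "has copy", checkout["copy_id"])
```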

# Where do we start??

This is a big problem; we can't tackle it all at once.

Let's focus first on a small piece: model the data with some simple classes, load it in from the files, and work on the first task (print out branches). 

There are many overly complex ways to model the relationships in this data. However, we don't know much yet. Let's keep it simple, then iterate. 

Each class we introduce will be self-contained, and will not refer directly to any other classes.

Our tasks:

* **List the branches and their info.**
* Print out all the books from a branch.
* Print out the books a patron has checked out.
* Provide very simple text search (substring matching) over book titles.

# A `Branch` class

A very simple class - one field for each column of data in `branches.csv`.

In [None]:
class Branch:
    def __init__(self, branch_id, name, address):
        self.id = branch_id
        self.name = name
        self.address = address

    def __str__(self):
        return f"{self.name} ({self.id})"

# Code to load branches.csv

In [None]:
import csv

def load_branches(filename):
    all_branches = []
    with open(filename, 'r') as f:
        reader = csv.reader(f, delimiter=',')
        for line in reader:
            branch_id = line[0]
            name = line[1]
            address = line[2]
            new_branch = Branch(branch_id, name, address)
            all_branches.append(new_branch)
    print(f"Loaded {len(all_branches)} branches from {filename}")
    return all_branches

We can shorten this a bit:

In [None]:
import csv

def load_branches(filename):
    all_branches = []
    with open(filename, 'r') as f:
        reader = csv.reader(f, delimiter=',')
        for line in reader:
            all_branches.append(Branch(line[0], line[1], line[2]))
    print(f"Loaded {len(all_branches)} branches from {filename}")
    return all_branches
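
If every row has exactly the fields the constructor expects, in the same order, argument unpacking can shorten the loop body further. A minimal sketch, assuming an in-memory string in place of the file and made-up sample rows:

```python
import csv
import io

class Branch:
    def __init__(self, branch_id, name, address):
        self.id = branch_id
        self.name = name
        self.address = address

# Hypothetical data standing in for branches.csv.
sample = "b1,Main Branch,1 Main St\nb2,North Branch,22 Oak Ave\n"

reader = csv.reader(io.StringIO(sample))
# Branch(*line) is equivalent to Branch(line[0], line[1], line[2])
# when each row has exactly three fields.
all_branches = [Branch(*line) for line in reader]

print([b.name for b in all_branches])  # ['Main Branch', 'North Branch']
```

The trade-off: a row with too few or too many fields raises a `TypeError` instead of being silently truncated.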

# Listing branches

We can already implement our first operation: listing all the branches.

Loop through the list of `Branch` instances, and print out info about the branch:

In [None]:
branches = load_branches('../../snippets/library/branches.csv')
for b in branches:
    print(b)

Our tasks:

* ~~List the branches and their info.~~
* **Print out all the books from a branch.**
* Print out the books a patron has checked out.
* Provide very simple text search (substring matching) over book titles.

# Loading books

Books should be very similar to branches, right? Well, let's look at the data:

```
{
    "title": "She-Devil",
    "description": "In hac habitasse platea dictumst.",
    "id": "39cfc454-a730-420e-b921-6ffb220705f0",
    "copies": [
      {
        "id": "c104dba4-1d13-4f71-8406-bc0f5a2ee47f",
        "branch_id": "87fd9eaa-53d6-4b13-87e3-af267a0626b5"
      },
      {
        "id": "8997c40c-1603-4649-93bc-6ce88f87ab49",
        "branch_id": "0026b74e-c4cf-45ba-954c-c1ce18710ade"
      },
...
```

A book has multiple copies. How do we model this?

# Keep it simple: a `BookCopy` class

We could model this in several complicated ways: e.g. separate `Book` and `BookCopy` classes, where each `Book` holds a list of `BookCopy` instances.

However, it's not clear we need to model the difference between a `Book` and `BookCopy`. Let's keep it really simple and just have a `BookCopy` class with all the data.

In [None]:
class BookCopy:
    def __init__(self, copy_id, book_id, branch_id, title, description):
        self.copy_id = copy_id
        self.book_id = book_id
        self.branch_id = branch_id
        self.title = title
        self.description = description

    def __str__(self):
        return f"{self.title} (Copy: {self.copy_id}, Book: {self.book_id})"

# Code to load books.json

In [None]:
import json

def load_books(filename):
    all_books = []
    with open(filename, 'r') as f:
        json_books = json.load(f)

    for json_book in json_books:
        for cp in json_book['copies']:
            bc = BookCopy(
                cp['id'], 
                json_book['id'], 
                cp['branch_id'], 
                json_book['title'],
                json_book['description'])
            all_books.append(bc)

    print(f"Loaded {len(all_books)} book copies from {filename}")
    return all_books
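
To see the flattening in isolation, this sketch applies the same nested loop to a small in-memory list shaped like the sample data. The titles and IDs are made up, and plain tuples stand in for `BookCopy` instances.

```python
# Hypothetical data in the same shape as books.json.
json_books = [
    {
        "id": "book-1",
        "title": "Book A",
        "description": "First description",
        "copies": [
            {"id": "c1", "branch_id": "b1"},
            {"id": "c2", "branch_id": "b2"},
        ],
    },
    {
        "id": "book-2",
        "title": "Book B",
        "description": "Second description",
        "copies": [{"id": "c3", "branch_id": "b1"}],
    },
]

# One output row per (book, copy) pair: the book-level fields are
# repeated on every copy of that book.
flat = []
for json_book in json_books:
    for cp in json_book["copies"]:
        flat.append((cp["id"], json_book["id"], cp["branch_id"], json_book["title"]))

for row in flat:
    print(row)
```

Two books with three copies between them produce three rows, which is why the loader reports a count of copies rather than titles.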

Great, now we can list all books for a branch:

In [None]:
book_copies = load_books('../../snippets/library/books.json')
branch_id = branches[0].id
for b in book_copies:
    if b.branch_id == branch_id:
        print(b)

Our tasks:

* ~~List the branches and their info.~~
* ~~Print out all the books from a branch.~~
* **Print out the books a patron has checked out.**
* Provide very simple text search (substring matching) over book titles.

# First step: loading patrons

This is similar to branches. Here's a `Patron` class:

In [None]:
class Patron:
    def __init__(self, id, name):
        self.id = id
        self.name = name

    def __str__(self):
        return f"{self.name} ({self.id})"

# Code to load patrons.csv

In [None]:
def load_patrons(filename):
    all_patrons = []
    with open(filename, 'r') as f:
        reader = csv.reader(f, delimiter=',')
        for line in reader:
            all_patrons.append(Patron(line[0], line[1]))
    print(f"Loaded {len(all_patrons)} patrons from {filename}")
    return all_patrons

In [None]:
patrons = load_patrons('../../snippets/library/patrons.csv')
for p in patrons:
    print(p)

## We need to deal with checkouts

In order to list a patron's checked out books, we need to process `checkouts.json`.

Checkouts are a relationship between a `Patron` and a `BookCopy`. 

It's not clear where to put the data for a checkout. We could:
* Give `Patron` a list of `BookCopy`s that are checked out
* Give `BookCopy` a reference to the `Patron` that has checked the copy out

It's a trap...

## Keep classes simple and focused

Both options are unwieldy, and violate best practices around class design. 

Classes should be "cohesive" - they should do **one thing**.

A book or patron class which tracks checkouts does more than one thing.

## Enter: LibraryNetwork

Let's introduce a new class to tie our different types of data together, and manage relationships between them.

In [None]:
class LibraryNetwork:
    def __init__(self, branches, book_copies, patrons):
        self.branches = branches
        self.copies = book_copies
        self.patrons = patrons
        self.checkouts = {}

We'll add the operations we've built already: listing branches, and listing books for a branch:

In [None]:
class LibraryNetwork:
    def __init__(self, branches, book_copies, patrons):
        self.branches = branches
        self.copies = book_copies
        self.patrons = patrons
        self.checkouts = {}

    def list_branches(self):
        return self.branches

    def list_books_for_branch(self, branch_id):
        results = []
        for cp in self.copies:
            if cp.branch_id == branch_id:
                results.append(cp)
        return results

Now we'll add a `checkout` method, which marks a book as checked out by a patron:

In [None]:
class LibraryNetwork:
    def __init__(self, branches, book_copies, patrons):
        self.branches = branches
        self.copies = book_copies
        self.patrons = patrons
        self.checkouts = {}

    def list_branches(self):
        return self.branches

    def list_books_for_branch(self, branch_id):
        results = []
        for cp in self.copies:
            if cp.branch_id == branch_id:
                results.append(cp)
        return results
    
    def checkout(self, copy_id, patron_id):
        self.checkouts[copy_id] = patron_id
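
Keying the dictionary by `copy_id` reflects the idea that a physical copy can be out with at most one patron at a time, while one patron can hold many copies. A small sketch of that behavior with made-up IDs:

```python
# Maps copy_id -> patron_id (hypothetical IDs).
checkouts = {}
checkouts["c1"] = "p1"
checkouts["c2"] = "p1"  # the same patron can hold several copies
checkouts["c1"] = "p2"  # assigning a copy again overwrites the earlier entry

print(checkouts)          # {'c1': 'p2', 'c2': 'p1'}
print("c3" in checkouts)  # False: copy c3 is not checked out
```

Note that the simple `checkout` method above does not check whether a copy is already out; it just overwrites.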

Now we load checkout data from `checkouts.json` into a `LibraryNetwork`:

In [None]:
def load_checkouts(filename, network):
    with open(filename, 'r') as f:
        checkouts = json.load(f)
        for checkout in checkouts:
            network.checkout(checkout['copy_id'], checkout['patron_id'])

We'll be reloading data for our network often as we update the LibraryNetwork class. Let's make a function that creates a network and loads all the data into it.

In [None]:
def load_network():
    branches = load_branches('../../snippets/library/branches.csv')
    book_copies = load_books('../../snippets/library/books.json')
    patrons = load_patrons('../../snippets/library/patrons.csv')

    network = LibraryNetwork(branches, book_copies, patrons)
    load_checkouts('../../snippets/library/checkouts.json', network)
    return network

In [None]:
network = load_network()
print(len(network.checkouts))

Back to our task, listing a patron's checked-out books:

In [None]:
class LibraryNetwork:
    # ...
        
    def list_patrons_books(self, patron_id):
        books = []
        for copy_id in self.checkouts:
            if self.checkouts[copy_id] == patron_id:
                # we need to find the copy by copy_id... how?
                pass
        return books

## We need a way to look up `BookCopy`s by `copy_id`

We could iterate through each one to find it. 

Or we could build an index: a dictionary mapping from `copy_id` to `BookCopy` instance.
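
A rough sketch of the two options side by side. The `BookCopy` here is a simplified stand-in with only two fields, and the data is made up:

```python
class BookCopy:
    # Simplified stand-in for illustration only.
    def __init__(self, copy_id, title):
        self.copy_id = copy_id
        self.title = title

# Hypothetical copies.
copies = [BookCopy("c1", "First"), BookCopy("c2", "Second"), BookCopy("c3", "Third")]

def find_by_scan(copies, copy_id):
    # Checks each copy until one matches: the work grows with the list.
    for cp in copies:
        if cp.copy_id == copy_id:
            return cp
    return None

# Build the index once...
index = {}
for cp in copies:
    index[cp.copy_id] = cp

# ...then each lookup is a single dictionary access.
print(find_by_scan(copies, "c2").title)  # Second
print(index["c2"].title)                 # Second
```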

Let's update LibraryNetwork's `__init__` method to build the index:

In [None]:
class LibraryNetwork:
    def __init__(self, branches, book_copies, patrons):
        self.branches = {}
        for b in branches:
            self.branches[b.id] = b

        self.copies = {}
        for cp in book_copies:
            self.copies[cp.copy_id] = cp

        self.patrons = {}
        for p in patrons:
            self.patrons[p.id] = p
            
        self.checkouts = {}

We'll shorten this with dictionary comprehensions:

In [None]:
class LibraryNetwork:
    def __init__(self, branches, book_copies, patrons):
        self.branches = {b.id: b for b in branches}
        self.copies = {cp.copy_id: cp for cp in book_copies}
        self.patrons = {p.id: p for p in patrons}
        self.checkouts = {}
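
A subtle point about comprehension syntax: braces around `key: value` build a dictionary, while braces around a single expression build a set. An index needs the `key: value` form. A quick sketch with made-up patrons:

```python
class Patron:
    def __init__(self, id, name):
        self.id = id
        self.name = name

# Hypothetical patrons.
patrons = [Patron("p1", "Ada"), Patron("p2", "Alan")]

by_id = {p.id: p for p in patrons}  # key: value -> dict
ids_only = {p.id for p in patrons}  # single expression -> set

print(type(by_id).__name__)     # dict
print(type(ids_only).__name__)  # set
print(by_id["p2"].name)         # Alan
```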

We have to update any places where we were using `self.branches`/`self.copies`/`self.patrons`:

In [None]:
class LibraryNetwork:
    def __init__(self, branches, book_copies, patrons):
        self.branches = {b.id: b for b in branches}
        self.copies = {cp.copy_id: cp for cp in book_copies}
        self.patrons = {p.id: p for p in patrons}
        self.checkouts = {}

    def list_branches(self):
        return list(self.branches.values())

    def list_books_for_branch(self, branch_id):
        results = []
        for cp in self.copies.values():
            if cp.branch_id == branch_id:
                results.append(cp)
        return results
    
    def checkout(self, copy_id, patron_id):
        self.checkouts[copy_id] = patron_id

Now the code for `list_patrons_books`:

In [None]:
class LibraryNetwork:
    # ...
    def list_patrons_books(self, patron_id):
        books = []
        for (copy_id, pid) in self.checkouts.items():
            if patron_id == pid:
                books.append(self.copies[copy_id])
        return books
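
This method scans every checkout on each call. If that ever became a bottleneck, one possible alternative (not part of this solution) would be to group copy IDs by patron up front. A sketch using `collections.defaultdict`, with made-up IDs:

```python
from collections import defaultdict

# copy_id -> patron_id, in the same shape as LibraryNetwork.checkouts
# (hypothetical IDs).
checkouts = {"c1": "p1", "c2": "p2", "c3": "p1"}

# Invert the mapping: patron_id -> list of copy_ids.
copies_by_patron = defaultdict(list)
for copy_id, patron_id in checkouts.items():
    copies_by_patron[patron_id].append(copy_id)

print(copies_by_patron["p1"])  # ['c1', 'c3']
```

In the spirit of keeping it simple, the plain scan is fine for this exercise; a second index is extra state to keep in sync.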

A quick test:

In [None]:
class LibraryNetwork:
    def __init__(self, branches, book_copies, patrons):
        self.branches = {b.id: b for b in branches}
        self.copies = {cp.copy_id: cp for cp in book_copies}
        self.patrons = {p.id: p for p in patrons}
        self.checkouts = {}

    def list_branches(self):
        return list(self.branches.values())

    def list_books_for_branch(self, branch_id):
        results = []
        for cp in self.copies.values():
            if cp.branch_id == branch_id:
                results.append(cp)
        return results
    
    def checkout(self, copy_id, patron_id):
        self.checkouts[copy_id] = patron_id
        
    def list_patrons_books(self, patron_id):
        books = []
        for (copy_id, pid) in self.checkouts.items():
            if patron_id == pid:
                books.append(self.copies[copy_id])
        return books

In [None]:
network = load_network()
for c in network.list_patrons_books(patrons[3].id):
    print(c)

Our tasks:

* ~~List the branches and their info.~~
* ~~Print out all the books from a branch.~~
* ~~Print out the books a patron has checked out.~~
* **Provide very simple text search (substring matching) over book titles.**

In [None]:
class LibraryNetwork:
    # ... 
    def book_search(self, value):
        matches = []
        for cp in self.copies.values():
            if value in cp.title:
                matches.append(cp)

        return matches

In [None]:
class LibraryNetwork:
    def __init__(self, branches, book_copies, patrons):
        self.branches = {b.id: b for b in branches}
        self.copies = {cp.copy_id: cp for cp in book_copies}
        self.patrons = {p.id: p for p in patrons}
        self.checkouts = {}

    def list_branches(self):
        return list(self.branches.values())

    def list_books_for_branch(self, branch_id):
        results = []
        for cp in self.copies.values():
            if cp.branch_id == branch_id:
                results.append(cp)
        return results
    
    def checkout(self, copy_id, patron_id):
        self.checkouts[copy_id] = patron_id
        
    def list_patrons_books(self, patron_id):
        books = []
        for (copy_id, pid) in self.checkouts.items():
            if patron_id == pid:
                books.append(self.copies[copy_id])
        return books
    
    def book_search(self, value):
        matches = []
        for cp in self.copies.values():
            if value in cp.title:
                matches.append(cp)

        return matches

In [None]:
network = load_network()
for m in network.book_search('love'):
    print(m)

Looks like we have to deal with duplicates. There are multiple copies of each book.

We can't easily use a set here, because different `BookCopy` instances will not be considered equal.

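We can leverage a dictionary keyed by book id, though: copies of the same book overwrite each other, leaving one entry per book. We'll also lowercase both strings so the search is case-insensitive.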

In [None]:
class LibraryNetwork:
    # ...
    
    def book_search(self, value):
        matches = {}
        for cp in self.copies.values():
            if value.lower() in cp.title.lower():
                matches[cp.book_id] = cp

        return list(matches.values())
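
To illustrate why a set doesn't help but a dictionary does, here is a small sketch. The `BookCopy` is a simplified stand-in and the titles and IDs are made up:

```python
class BookCopy:
    # Simplified stand-in for illustration only.
    def __init__(self, copy_id, book_id, title):
        self.copy_id = copy_id
        self.book_id = book_id
        self.title = title

# Two copies of the same (hypothetical) book, plus one other book.
copies = [
    BookCopy("c1", "book-1", "Vampire Tales"),
    BookCopy("c2", "book-1", "Vampire Tales"),
    BookCopy("c3", "book-2", "Vampire Gardening"),
]

# With no __eq__/__hash__ defined, instances compare by identity,
# so a set keeps all three.
print(len(set(copies)))  # 3

# A dict keyed by book_id keeps one copy per book (the last one seen).
matches = {}
for cp in copies:
    matches[cp.book_id] = cp
print(len(matches))  # 2
```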


In [None]:
class LibraryNetwork:
    def __init__(self, branches, book_copies, patrons):
        self.branches = {b.id: b for b in branches}
        self.copies = {cp.copy_id: cp for cp in book_copies}
        self.patrons = {p.id: p for p in patrons}
        self.checkouts = {}

    def list_branches(self):
        return list(self.branches.values())

    def list_books_for_branch(self, branch_id):
        results = []
        for cp in self.copies.values():
            if cp.branch_id == branch_id:
                results.append(cp)
        return results
    
    def checkout(self, copy_id, patron_id):
        self.checkouts[copy_id] = patron_id
        
    def list_patrons_books(self, patron_id):
        books = []
        for (copy_id, pid) in self.checkouts.items():
            if patron_id == pid:
                books.append(self.copies[copy_id])
        return books
    
    def book_search(self, value):
        matches = {}
        for cp in self.copies.values():
            if value.lower() in cp.title.lower():
                matches[cp.book_id] = cp

        return list(matches.values())


In [None]:
network = load_network()
for m in network.book_search('vampire'):
    print(m)

## Bonus

What if we want to list only available, or only checked out books from a branch?

In [None]:
class LibraryNetwork:
    # ...
    def list_available_books(self, branch_id):
        avails = []
        for cp in self.list_books_for_branch(branch_id):
            if cp.copy_id not in self.checkouts:
                avails.append(cp)
        return avails

    def list_checked_out_books(self, branch_id):
        checked_out = []
        for cp in self.list_books_for_branch(branch_id):
            if cp.copy_id in self.checkouts:
                checked_out.append(cp)
        return checked_out

`list_available_books` and `list_checked_out_books` are very similar - could we reduce duplication somehow?

In [None]:
class LibraryNetwork:
    # ...
    
    def list_books(self, branch_id, list_available):
        results = []
        for cp in self.list_books_for_branch(branch_id):
            if list_available:
                if cp.copy_id not in self.checkouts:
                    results.append(cp)
            else:
                if cp.copy_id in self.checkouts:
                    results.append(cp)
        return results

We can make our logic more concise (but also harder to read). Note that the flag is now `list_checked_out`, the inverse of `list_available`:

In [None]:
class LibraryNetwork:
    # ... 
    
    def list_books(self, branch_id, list_checked_out):
        results = []
        for cp in self.list_books_for_branch(branch_id):
            checked_out = cp.copy_id in self.checkouts
            if checked_out == list_checked_out:
                results.append(cp)
        return results
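
The trick is comparing two booleans: a copy is kept when its actual status equals the requested status. A stripped-down sketch of just that comparison, using a hypothetical `select` function and made-up IDs:

```python
# Hypothetical state: only copy c1 is checked out.
checkouts = {"c1": "p1"}
copy_ids = ["c1", "c2"]

def select(copy_ids, list_checked_out):
    results = []
    for copy_id in copy_ids:
        checked_out = copy_id in checkouts
        # Keep the copy when its status matches the requested status.
        if checked_out == list_checked_out:
            results.append(copy_id)
    return results

print(select(copy_ids, True))   # ['c1']
print(select(copy_ids, False))  # ['c2']
```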

# The whole thing

[repl.it link](https://replit.com/@cosi-10a-fall23/Library-Solution)

In [None]:
import csv
import json

class Patron:
    def __init__(self, id, name):
        self.id = id
        self.name = name

    def __str__(self):
        return f"{self.name} ({self.id})"


def load_patrons(filename):
    all_patrons = []
    with open(filename, 'r') as f:
        reader = csv.reader(f, delimiter=',')
        for line in reader:
            all_patrons.append(Patron(line[0], line[1]))
    print(f"Loaded {len(all_patrons)} patrons from {filename}")
    return all_patrons


class Branch:
    def __init__(self, id, name, address):
        self.id = id
        self.name = name
        self.address = address

    def __str__(self):
        return f"{self.name} ({self.id})"


def load_branches(filename):
    all_branches = []
    with open(filename, 'r') as f:
        reader = csv.reader(f, delimiter=',')
        for line in reader:
            all_branches.append(Branch(*line))
    print(f"Loaded {len(all_branches)} branches from {filename}")
    return all_branches


class BookCopy:
    def __init__(self, copy_id, book_id, branch_id, title, description):
        self.copy_id = copy_id
        self.book_id = book_id
        self.branch_id = branch_id
        self.title = title
        self.description = description

    def __str__(self):
        return f"{self.title} (Copy: {self.copy_id}, Book: {self.book_id})"

    
def load_books(filename):
    all_books = []
    with open(filename, 'r') as f:
        json_books = json.load(f)

    for json_book in json_books:
        for cp in json_book['copies']:
            bc = BookCopy(
                cp['id'], 
                json_book['id'], 
                cp['branch_id'], 
                json_book['title'],
                json_book['description'])
            all_books.append(bc)

    print(f"Loaded {len(all_books)} book copies from {filename}")
    return all_books


def load_checkouts(filename, network):
    with open(filename, 'r') as f:
        checkouts = json.load(f)
        for checkout in checkouts:
            network.checkout(checkout['copy_id'], checkout['patron_id'])

            
class LibraryNetwork:
    def __init__(self, branches, book_copies, patrons):
        self.branches = {b.id: b for b in branches}
        self.copies = {cp.copy_id: cp for cp in book_copies}
        self.patrons = {p.id: p for p in patrons}
        self.checkouts = {}

    def list_branches(self):
        return list(self.branches.values())

    def list_books_for_branch(self, branch_id):
        results = []
        for cp in self.copies.values():
            if cp.branch_id == branch_id:
                results.append(cp)
        return results
    
    def checkout(self, copy_id, patron_id):
        self.checkouts[copy_id] = patron_id
        
    def list_patrons_books(self, patron_id):
        books = []
        for (copy_id, pid) in self.checkouts.items():
            if patron_id == pid:
                books.append(self.copies[copy_id])
        return books
    
    def list_books(self, branch_id, list_checked_out):
        results = []
        for cp in self.list_books_for_branch(branch_id):
            checked_out = cp.copy_id in self.checkouts
            if checked_out == list_checked_out:
                results.append(cp)
        return results
    
    def list_available_books(self, branch_id):
        return self.list_books(branch_id, False)
    
    def list_checked_out_books(self, branch_id):
        return self.list_books(branch_id, True)
    
    def book_search(self, value):
        matches = {}
        for cp in self.copies.values():
            if value.lower() in cp.title.lower():
                matches[cp.book_id] = cp

        return list(matches.values())

    
def load_network():
    branches = load_branches('../../snippets/library/branches.csv')
    book_copies = load_books('../../snippets/library/books.json')
    patrons = load_patrons('../../snippets/library/patrons.csv')

    network = LibraryNetwork(branches, book_copies, patrons)
    load_checkouts('../../snippets/library/checkouts.json', network)
    return network

In [None]:
network = load_network()

print("List branches test")
print([str(x) for x in network.list_branches()])
print()
print("Available books test:")
print([str(x) for x in network.list_available_books('e1077ff6-cbc0-43ef-a791-93dbae01409e')])
print()
print("Checked out books test:")
print([str(x) for x in network.list_checked_out_books('e1077ff6-cbc0-43ef-a791-93dbae01409e')])
print()
print("Patron's books test:")
print([str(x) for x in network.list_patrons_books(patrons[3].id)])
print()
print("Book search test:")
print([str(x) for x in network.book_search("vampire")])

# Some reflections

# Keep it simple

We kept it relatively simple, but that's harder than it looks.

Managing complexity is the biggest challenge when writing code. 

> "YAGNI" (You Ain't Gonna Need it) - _Kent Beck_

> "Premature optimization is the root of all evil" - _Donald Knuth_

> "Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it." - _Brian W. Kernighan_

The best, most experienced programmers may build **complicated systems**, but they tend to write the **simplest code**.

## Keep classes and functions small and focused

Our classes ended up pretty focused. This is good class design.

Some principles to consider:
* Modularity - separate your code into discrete parts
* Cohesion - keep classes/functions focused on one thing
* Separation of concerns - keep unrelated parts of code apart
* Loose coupling - minimize strict dependencies between different code areas
   * If changing one area of your code results in changes everywhere, you have tight coupling

## Testing

We didn't write any tests as we went along, mostly due to time constraints. 

Write tests for individual pieces as you create them. Then you'll have confidence you haven't broken them as you make changes.
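
As a rough idea of what that could look like, here is a sketch using Python's built-in `unittest` module. The `title_matches` helper is hypothetical: it isolates the case-insensitive check from `book_search` so it can be tested without loading any files.

```python
import unittest

def title_matches(value, title):
    # Hypothetical helper mirroring the case-insensitive check in book_search.
    return value.lower() in title.lower()

class TitleMatchTests(unittest.TestCase):
    def test_match_ignores_case(self):
        self.assertTrue(title_matches("VAMP", "The Vampire"))

    def test_non_match(self):
        self.assertFalse(title_matches("love", "The Vampire"))

# Run the tests without exiting the interpreter (handy in a notebook).
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TitleMatchTests)
result = unittest.TextTestRunner().run(suite)
print(result.wasSuccessful())
```

Small, focused functions are the easiest to test this way, which is one more argument for keeping classes and functions cohesive.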

## This is hard

Modeling data relationships like this, and mapping them to classes in a clean way, is hard. 

It takes practice. 

Don't be afraid to make a mistake or a mess - that's how you'll learn.