In 1991, a group of Taiwanese researchers set out to determine the ideal length for chopsticks. More than 30 people participated in the experiment by trying out chopsticks of various lengths. The researchers' approach ensured that the participants' different skill levels and length preferences didn't skew the results.

After an exciting few days of picking up peanuts and placing them into cups, the researchers gathered enough data to determine which chopsticks are most efficient. Their findings form our data set.

The first column contains the "Food pinching efficiency" measurement, which is a decimal value. The higher the value, the better the chopstick.

The second column, "Individual," holds a unique identifier for the participant who used the chopsticks.

The third column records the "Chopstick length" measurement in millimeters.

Each row of our data set represents a trial in which a participant used a chopstick of a certain length. It records the food pinching efficiency for a specific individual and chopstick length.

In [2]:
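from csv import reader

# Read every row into a list of lists; the with block closes the file for us
with open("chopsticks.csv", newline="") as f:
    chopsticks = list(reader(f))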

In [3]:
headers = chopsticks[0]
chopsticks = chopsticks[1:]

Let's think a bit about how we should structure our code. We want to answer questions like:

* Which chopstick length is the most efficient?
* Which chopstick has the most consistent results?
* Other similar questions

All of our questions are specific to certain chopstick lengths. It would be useful to have a Chopstick class that has methods for computing these values, based on the lengths.

Before we can do that though, we need a way to store the data for each chopstick. While there are a few ways to go about this, we'll create an entire Trial class that stores information about each row of data.

In [4]:
class Trial(object):
    def __init__(self, datarow):
        self.efficiency = float(datarow[0])
        self.individual = int(datarow[1])
        self.chopstick_length = int(datarow[2])

In [5]:
first_trial = Trial(chopsticks[0]) # instance of Trial class

Let's also create a class named Chopstick whose instance properties contain information about each chopstick.

In [6]:
class Chopstick(object):
    def __init__(self, length):
        self.length = length
        self.trials = []
        for row in chopsticks:
            if int(row[2]) == self.length:
                self.trials.append(Trial(row))
    def avg_efficiency(self):
        efficiency_sum = 0
        for trial in self.trials:
            efficiency_sum += trial.efficiency
        return efficiency_sum/len(self.trials)

In [7]:
medium_chopstick = Chopstick(240)
avg_eff_210 = Chopstick(210).avg_efficiency()
print(len(medium_chopstick.trials))
print(avg_eff_210)

31
25.483870967741932


# Exception handling

When programming, we usually try to avoid writing code that will generate errors. Even so, errors can be quite useful to us because they tell us what went wrong with our code. We can use this information to improve our program's logic. If part of our code fails, we can check why it failed, and execute some other code instead.

However, we need a way to handle errors gracefully during code execution so that our program doesn't crash. This is where exception handling comes into play.

An exception is a broad characterization of what can go wrong with a program. When a statement is syntactically correct but something goes wrong during execution (a division by zero occurs or the program tries to read a non-existent file, for example), the interpreter raises an exception. An important distinction is that exceptions occur during the execution of the program, whereas syntax errors such as forgetting a colon or leaving a parenthesis unclosed don't, because our code won't run to begin with.
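The distinction can be seen in a small sketch. The snippet and the `divide` function below are hypothetical illustrations, not part of the chopsticks analysis, and the try-except syntax they rely on is explained in the paragraphs that follow.

```python
# A syntax error is detected when the source is compiled, before any line runs.
source_with_typo = "if True print('hi')"  # hypothetical snippet with a missing colon
try:
    compile(source_with_typo, "<example>", "exec")
    caught_at_compile_time = False
except SyntaxError:
    caught_at_compile_time = True

# A runtime exception only appears when the offending statement executes.
def divide(a, b):
    return a / b  # valid syntax; fails only when b is 0

try:
    divide(1, 0)
    raised_at_runtime = False
except ZeroDivisionError:
    raised_at_runtime = True

print(caught_at_compile_time, raised_at_runtime)
```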

When the interpreter raises an exception that our code doesn't handle, it stops executing the program (or, in a notebook, the current cell), reports the error, and skips the remaining statements. This is undesirable, because a single unexpected value can halt everything, and the rest of our code never gets a chance to run.

We want to handle exceptions by observing when they occur and reacting to them accordingly instead. This way, every piece of code that executes is deliberate, and we have complete control over what our program does. In Python, we use a **try-except** block to handle exceptions.

When the Python interpreter reaches a try-except block, it first attempts to execute the try section of the statement. If any statement within the try section raises an exception (if we hit some sort of error), execution jumps to a matching except section, which catches the exception and handles it gracefully with different code. For example, if we anticipate that converting a value could raise a ValueError, an except ValueError section can catch the exception and print a message instead of letting the program stop.
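A minimal sketch of that flow follows. The `parse_length` function and its fallback value of -1 are assumptions made for illustration; they are not part of the original notebook.

```python
def parse_length(text):
    """Convert text to an int, falling back to -1 when it is not a number."""
    try:
        # int() raises ValueError for strings such as "two hundred"
        return int(text)
    except ValueError:
        # We anticipated this error, so we report it and keep going
        print("Could not convert", repr(text), "to an integer")
        return -1

print(parse_length("240"))
print(parse_length("two hundred"))
```

The first call succeeds inside the try section, so the except section never runs. The second call raises a ValueError, which the except section catches.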

A single try section can also be followed by several except sections, which lets us catch a couple of different types of exceptions that we suspect could arise during the execution of the try block.
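As a rough illustration, the hypothetical `safe_ratio` function below handles two different exception types, each with its own except section. Returning `None` on failure is an arbitrary choice made for this sketch.

```python
def safe_ratio(numerator_text, denominator_text):
    """Divide two numbers given as text; return None if that is impossible."""
    try:
        return float(numerator_text) / float(denominator_text)
    except ValueError:
        # Raised by float() when the text is not numeric
        print("One of the inputs is not a number")
        return None
    except ZeroDivisionError:
        # Raised by the division when the denominator is zero
        print("Cannot divide by zero")
        return None

print(safe_ratio("10", "4"))
print(safe_ratio("ten", "4"))
print(safe_ratio("10", "0"))
```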

With Python, we have the ability to catch any exception by writing an except: section without specifying a particular error. This is a sort of "catch-all" that works like an else: section. Using a catch-all for exceptions is usually bad practice, however. Trying to catch every exception without being specific is dangerous because then we can't execute exception-specific logic, and it means we may not understand our code as fully as we should.

If we catch every exception in a single statement, we can't react to the exception that occurred because we have no idea what type it is. Instead, we should try catching as many specific exceptions as we possibly can. To do this, we need to think about the exceptions our code might cause, then catch and react to each one individually.

That being said, there are still times when implementing a catch-all after we've caught all of the expected exceptions is a good idea. We may want to catch the unknown exception, store it somewhere so we can find what went wrong later on, and then change our code to handle that particular exception.
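One way this pattern might look is sketched below. The `to_efficiency` function and the use of the standard `logging` module to record the unknown exception are assumptions for illustration. The sketch uses `except Exception` rather than a bare `except:` because a bare `except:` would also catch `KeyboardInterrupt` and `SystemExit`.

```python
import logging

logger = logging.getLogger(__name__)

def to_efficiency(value):
    """Parse an efficiency reading, returning -1.0 when it cannot be parsed."""
    try:
        return float(value)
    except ValueError:
        # Expected failure: the text is not numeric
        return -1.0
    except Exception:
        # Unexpected failure: record the traceback so we can add a
        # specific handler for this case later
        logger.exception("Unexpected error converting %r", value)
        return -1.0

print(to_efficiency("27.5"))
print(to_efficiency("n/a"))
print(to_efficiency(None))  # float(None) raises TypeError, caught by the catch-all
```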

We have a working program that can find the average efficiency for a length of chopstick. We need to account for what happens when we read in bad data, however. We'll handle this exception in our Trial class, because that's the class that reads in the values in our data set.

In [8]:
class Trial(object):
    def __init__(self, datarow):
        try:
            self.efficiency = float(datarow[0])
            self.individual = int(datarow[1])
            self.chopstick_length = int(datarow[2])
        except ValueError:
            self.efficiency = -1.0
            self.individual = -1
            self.chopstick_length = -1

bad_trial = Trial(chopsticks[-1])

In [9]:
bad_trial.efficiency

27.52

We added exception handling for bad data to the Trial class, but the Chopstick class doesn't yet account for the placeholder values that Trial assigns. Let's make Chopstick skip any trial that was flagged as bad.

In [10]:
class Chopstick(object):
    def __init__(self, length):
        self.length = length
        self.trials = []
        for row in chopsticks:
            # Let Trial do the parsing so a malformed row can't raise here
            trial = Trial(row)
            # Keep only well-formed trials recorded for this length
            if trial.chopstick_length == self.length and trial.efficiency != -1 and trial.individual != -1:
                self.trials.append(trial)
    def num_trials(self):
        return len(self.trials)
    def avg_efficiency(self):
        efficiency_sum = 0
        for trial in self.trials:
            efficiency_sum += trial.efficiency
        return efficiency_sum / self.num_trials()

When we try to find the average efficiency for a chopstick length that isn't in our data set, the trials list is empty, so our avg_efficiency method ends up dividing by zero. Fortunately, this raises a ZeroDivisionError, and we can catch it.

In [11]:
class Chopstick(object):
    def __init__(self, length):
        self.length = length
        self.trials = []
        for row in chopsticks:
            # Parse through Trial so a malformed row is flagged, not raised
            trial = Trial(row)
            if trial.individual >= 0 and trial.chopstick_length == self.length:
                self.trials.append(trial)
    def num_trials(self):
        return len(self.trials)
    def avg_efficiency(self):
        efficiency_sum = 0
        for trial in self.trials:
            efficiency_sum += trial.efficiency
        try:
            return efficiency_sum / self.num_trials()
        except ZeroDivisionError:
            return -1.0

In [12]:
bad_average = Chopstick(100).avg_efficiency()

Now it's time to answer our question. We want to determine which chopstick length is best by looking for the highest average food pinching efficiency. Because avg_efficiency returns -1.0 for any chopstick length that has no valid trials, those lengths won't interrupt our calculations. The lowest possible real average efficiency is 0, so a placeholder of -1.0 can never be mistaken for the maximum.

In [13]:
chopstick_lengths = [180, 195, 210, 225, 240, 255, 270, 285, 300, 315, 330]
chopstick_list = [Chopstick(length) for length in chopstick_lengths]

Now let's overload the comparison operators for the Chopstick class so we can take advantage of built-in Python functions.

In [14]:
class Chopstick(object):
    def __init__(self, length):
        self.length = length
        self.trials = []
        for row in chopsticks:
            # Parse through Trial so a malformed row is flagged, not raised
            trial = Trial(row)
            if trial.individual >= 0 and trial.chopstick_length == self.length:
                self.trials.append(trial)
    def num_trials(self):
        return len(self.trials)
    def avg_efficiency(self):
        efficiency_sum = 0
        for trial in self.trials:
            efficiency_sum += trial.efficiency
        try:
            return efficiency_sum / self.num_trials()
        except ZeroDivisionError:
            return -1.0
    def __lt__(self, other):
        return self.avg_efficiency() < other.avg_efficiency()
    def __gt__(self, other):
        return self.avg_efficiency() > other.avg_efficiency()
    def __le__(self, other):
        return self.avg_efficiency() <= other.avg_efficiency()
    def __ge__(self, other):
        return self.avg_efficiency() >= other.avg_efficiency()
    def __eq__(self, other):
        return self.avg_efficiency() == other.avg_efficiency()
    def __ne__(self, other):
        return self.avg_efficiency() != other.avg_efficiency()
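
Writing all six comparison methods by hand works, but it is repetitive. As an alternative sketch, the standard library's `functools.total_ordering` decorator can derive the remaining methods from `__eq__` plus one ordering method. The `Score` class below is a hypothetical stand-in used only to keep the example self-contained; it is not the notebook's Chopstick class.

```python
from functools import total_ordering

@total_ordering
class Score:
    """Hypothetical wrapper around a single number, compared by value."""
    def __init__(self, value):
        self.value = value
    def __eq__(self, other):
        return self.value == other.value
    def __lt__(self, other):
        return self.value < other.value

# __le__, __gt__, and __ge__ are supplied by total_ordering
scores = [Score(1), Score(5), Score(2)]
print(max(scores).value)
print(Score(3) >= Score(2))
```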

In [15]:
chopstick_list = [Chopstick(length) for length in chopstick_lengths]
most_efficient = max(chopstick_list)

In [16]:
most_efficient

<__main__.Chopstick at 0x1aaafc61c18>
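
The default output only shows the object's type and memory address. One possible way to make it more informative is to define a `__repr__` method. The `LabeledChopstick` class below is a hypothetical stand-in with made-up values, so it says nothing about the actual result of the analysis.

```python
class LabeledChopstick:
    """Hypothetical stand-in that stores a length and a precomputed average."""
    def __init__(self, length, avg):
        self.length = length
        self.avg = avg
    def __repr__(self):
        # Called by the notebook (and by repr()) to display the object
        return "LabeledChopstick(length={}, avg={:.2f})".format(self.length, self.avg)

example = LabeledChopstick(100, 1.5)  # made-up values for illustration
print(example)
```

With the notebook's own class, reading `most_efficient.length` is another way to see which length won without changing the class.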