# An animal classification task

**Learning goal:** This case will introduce conditional logic (i.e., statements of the form "if something, then something") and how to implement it in Python.

One of your friends, who studies biology at college, is writing an essay and needs to classify a large number of animals into three categories depending on some characteristics. She has been trying to carry out the classification using a popular spreadsheet package, but last night she accidentally deleted half of her formulas and now her file is unusable. To make things worse, she didn't keep any backups. You found out about this and offered to help her out.

This is the decision tree she is using (a decision tree is just a [flow chart](https://en.wikipedia.org/wiki/Flowchart) that represents decision rules):

![Decision tree - animals](data/images/animals_decision_tree.png)

Let's break down this diagram:

1. **Does this animal fly?** If this animal does fly, move on to the next stage. If this animal does not fly, then don't assign it to any category.
2. **Is this animal warm-blooded?** Now we only have animals that fly. If this animal is warm-blooded, move on to the next stage. If it isn't, classify it as an insect.
3. **Does this animal feed on blood?** Now we only have animals that fly *and* are warm-blooded. If this animal feeds on blood, then classify it as a bat. If it doesn't, classify it as a bird (we know that there are bats that don't feed on blood, but let's just assume that there aren't any of those in your friend's sample).

In other words,

* If an animal flies, is warm-blooded, and feeds on blood, it is a bat.
* If an animal flies, is warm-blooded, and doesn't feed on blood, it is a bird.
* If an animal flies and is not warm-blooded, it is an insect.
* If an animal doesn't fly, it should not be assigned to any category - we're only interested in animals that fly.

Now execute the following cell:

In [None]:
animal = {
    "flies": True,
    "warm_blooded": True,
    "feeds_blood": True
}

### Exercise 1

Which class should this animal be assigned to?

**Answer.** This animal flies, is warm-blooded, and feeds on blood, so the decision tree classifies it as a bat.

Let's now write some code to do the classification for us. For this, we use an **`if` statement**. The syntax is as follows:

~~~python
if some_expression_is_true:
    do_something
~~~

It is *very* important that you indent the line that comes after a colon (`:`), for example by pressing the Tab key (Jupyter typically inserts four spaces, which is the usual Python convention). As you have seen in previous cases, indentation tells Python how to group statements hierarchically. In this case, the indented lines after `if some_expression_is_true:` are the lines that should be executed if the condition is met.
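To make the grouping concrete, here is a minimal sketch. The variable `temperature` and the messages are made up for illustration and are not part of your friend's data.

```python
# Hypothetical value, used only to illustrate indentation
temperature = 30
advice = "No advice needed"

if temperature > 25:
    # Both indented lines belong to the `if` and run only when the condition is met
    advice = "Drink some water"
    print("It is warm")

# This line is not indented, so it runs whether or not the condition was met
print("Done checking the temperature")
print(advice)
```

If you change `temperature` to a value of 25 or lower, only the last two `print` calls produce output and `advice` keeps its initial value.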

You probably know already that the value of a key in a dictionary is accessed with this syntax:

~~~python
my_dictionary["my_key"]
~~~

So, if we want to find out if `animal` flies, we need to use this code:

~~~python
animal["flies"]
~~~

### Exercise 2

Run `animal["flies"]` in a code cell and verify that the returned value is `True`.

**Answer.**

In [None]:
animal["flies"]

To code the first stage of our decision tree, we need to evaluate the "Does this animal fly?" expression and then move on to the next stage. Something like this:

~~~python
if animal["flies"] == True:
    pass  # placeholder: move to the next stage
~~~

Notice that we used `==` and not `=` here. This is because `=` is used to *assign* a value to a variable, like in `four = 1 + 3`, while `==` is used to *evaluate* whether two objects have the same value or not, like in `4 == 4` (which should produce the output `True`).
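The difference between the two operators can be seen in a short sketch (the variable name `is_four` is just an illustration):

```python
four = 1 + 3          # `=` assigns the result of 1 + 3 to the variable `four`
is_four = four == 4   # `==` compares two values and produces True or False

print(four)     # 4
print(is_four)  # True
```

Using `=` where a comparison is expected, as in `if four = 4:`, makes Python stop with a `SyntaxError`, so this mistake is usually caught right away.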

Since we haven't written the code for the next stages yet, let's simply print a string describing the action to take if the condition is met. Execute the cell below. The printed output should be `Move to the next stage, because this animal flies`:

In [None]:
if animal["flies"] == True:
    print("Move to the next stage, because this animal flies")
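As a side note, because `animal["flies"]` already holds `True` or `False`, Python also accepts it directly as the condition. The sketch below redefines `animal` with the same values as before so that it can run on its own; the variable `message` is only there for illustration. The explicit `== True` used in the rest of this case behaves the same way for these values.

```python
animal = {"flies": True, "warm_blooded": True, "feeds_blood": True}

message = ""
# The value stored under "flies" is used directly as the condition
if animal["flies"]:
    message = "Move to the next stage, because this animal flies"

print(message)
```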

In order to tell Python what to do when the expression is not true (that is, in the case that the animal doesn't fly) we use the **`else`** keyword:

In [None]:
if animal["flies"] == True:
    print("Move to the next stage, because this animal flies")
else:
    print("Do not assign to any category")
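Our `animal` flies, so the cell above never reaches the `else` branch. To see that branch run, here is a small sketch with a made-up animal that does not fly (`penguin` and `result` are hypothetical names, not part of your friend's data):

```python
# Hypothetical animal that does not fly, invented for this example
penguin = {"flies": False, "warm_blooded": True, "feeds_blood": False}

if penguin["flies"] == True:
    result = "Move to the next stage, because this animal flies"
else:
    # This branch runs because penguin["flies"] is False
    result = "Do not assign to any category"

print(result)
```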

Now that we know how to code a conditional statement, let's add the other stages of the decision tree. The second stage is this:

In [None]:
if animal["warm_blooded"] == True:
    print("Move to the next stage, because this animal is warm-blooded")
else:
    print("The animal is an insect")

If we now put both conditional statements together, we get:

In [None]:
# The first conditional
if animal["flies"] == True:
    # The second conditional
    if animal["warm_blooded"] == True:
        print("Move to the next stage, because this animal flies and is warm-blooded")
    else:
        print("The animal is an insect")
else:
    print("Do not assign to any class, because this animal doesn't fly")

Can you see that we indented the `warm_blooded` conditional? That is because this conditional is *nested* inside the first one (similar to how we did nested indexing in previous cases).

### Exercise 3

Write the third conditional.

**Answer.**

In [None]:
if animal["feeds_blood"] == True:
    print("The animal is a bat")
else:
    print("The animal is a bird")

### Exercise 4

Integrate the third conditional into the code so that you have one code block that has all three conditionals. (Since it must be nested inside the second conditional, it should be doubly indented.)

**Answer.**

In [None]:
# The first conditional
if animal["flies"] == True:
    # The second conditional
    if animal["warm_blooded"] == True:
        # The third conditional
        if animal["feeds_blood"] == True:
            print("The animal is a bat")
        else:
            print("The animal is a bird")
    else:
        print("The animal is an insect")
else:
    print("Do not assign to any class, because this animal doesn't fly")

It seems we have successfully classified our animal as a bat. Let's test this code again, this time passing an insect. For this, we overwrite the `animal` variable with another dictionary that has the values that correspond to an insect:

In [None]:
animal = {
    "flies": True,
    "warm_blooded": False,
    "feeds_blood": True
}

### Exercise 5

Run the code again. Was the animal classified correctly?

**Answer.**

In [None]:
# The first conditional
if animal["flies"] == True:
    # The second conditional
    if animal["warm_blooded"] == True:
        # The third conditional
        if animal["feeds_blood"] == True:
            print("The animal is a bat")
        else:
            print("The animal is a bird")
    else:
        print("The animal is an insect")
else:
    print("Do not assign to any class, because this animal doesn't fly")

If everything ran correctly, the animal will have been classified as an insect. Notice that in the case of insects, the value of `feeds_blood` isn't relevant in our decision tree - what makes a flying animal an insect isn't whether or not it feeds on blood, but rather that it is not warm-blooded.
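So far we have exercised the bat and insect branches, but not the bird branch. The sketch below runs the same nested conditionals on a made-up bird. The names `bird` and `label` are hypothetical, and the result is stored in a variable before printing so that it is easy to inspect.

```python
# Hypothetical animal: flies, is warm-blooded, and does not feed on blood
bird = {"flies": True, "warm_blooded": True, "feeds_blood": False}

if bird["flies"] == True:
    if bird["warm_blooded"] == True:
        if bird["feeds_blood"] == True:
            label = "The animal is a bat"
        else:
            label = "The animal is a bird"
    else:
        label = "The animal is an insect"
else:
    label = "Do not assign to any class, because this animal doesn't fly"

print(label)
```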

## Using `if...elif`

Your friend is so impressed by your Python prowess that she asks you to help her out with another dataset. This is the decision tree:

![Decision tree - animal height](data/images/animals_elif.png)

Let's create a dummy animal:

In [None]:
animal = {
    "height": 5
}

Since this animal's height is 5 feet, it should be classified as a `LARGE` animal according to the decision tree:

1. This animal's height is *not* less than 1 foot, so it needs to pass through the next stage before being assigned a class.
2. This animal's height is *not* less than 2 feet, which means that the condition evaluates to `False` (that is, to "NO"). Therefore, this animal is `LARGE`.

You could easily code the first condition as:

In [None]:
# First condition
if animal["height"] < 1: # Is it less than 1 ft?
    print("SMALL")
else:
    print("Move on to the next stage, this animal is NOT SMALL")

### Exercise 6

Add the second condition.

**Answer.**

In [None]:
# First condition
if animal["height"] < 1:
    print("SMALL")
else:
    # Second condition
    if animal["height"] < 2: # It is at least 1 ft; is it less than 2 ft?
        print("MEDIUM")
    else:
        print("LARGE")

This works perfectly. However, if you'd like to make the code a bit more readable, you could replace the `else...if` with a handy shortcut, **`elif`**. As the name suggests, `elif`s can be used when you have an `if` immediately following an `else`. So, the simplified version would be:

In [None]:
if animal["height"] < 1: # First condition
    print("SMALL")
elif animal["height"] < 2: # Second condition
    print("MEDIUM")
else:
    print("LARGE")

This block is shorter and avoids one level of indentation. You can follow the logic with this reasoning:

1. If the animal's height is less than 1 foot, then it is `SMALL`.
2. If the animal is not `SMALL`, then check if it is less than 2 feet. If it is, then label as `MEDIUM`.
3. If the animal's height is something else (that is, it is neither `SMALL` nor `MEDIUM`), then it has to be `LARGE`.
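One consequence of this reasoning is that Python runs only the first branch whose condition is met and skips the rest. Here is a minimal sketch, assuming a made-up animal that is half a foot tall (`small_animal` and `size` are hypothetical names):

```python
# Hypothetical animal, half a foot tall
small_animal = {"height": 0.5}

if small_animal["height"] < 1:
    size = "SMALL"
elif small_animal["height"] < 2:
    # 0.5 is also less than 2, but this branch is skipped because the first one matched
    size = "MEDIUM"
else:
    size = "LARGE"

print(size)  # SMALL
```

This is why the order of the conditions matters: if the `< 2` check came first, this animal would be labeled `MEDIUM`.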

Now the only thing that you need to do is take your conditionals and run them on each animal in your friend's datasets. You could do this one animal at a time, but that would kind of defeat the purpose - it could be as cumbersome as using a spreadsheet, or even worse. For cases like this, you use Python **functions** and/or **loops**, which are extremely efficient ways to streamline repetitive tasks. You'll learn about these in a future case.
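As a preview only, here is a rough sketch of what that might look like. The function name `classify_by_height`, the list `animals`, and its heights are all invented for this example, and the details of functions and loops are left for the future case.

```python
def classify_by_height(animal):
    """Return the size label for one animal, using the same thresholds as above."""
    if animal["height"] < 1:
        return "SMALL"
    elif animal["height"] < 2:
        return "MEDIUM"
    else:
        return "LARGE"

# Hypothetical heights in feet, invented for this example
animals = [{"height": 0.5}, {"height": 1.5}, {"height": 5}]

labels = []
for current_animal in animals:
    # Apply the same conditionals to each animal in turn
    labels.append(classify_by_height(current_animal))

print(labels)  # ['SMALL', 'MEDIUM', 'LARGE']
```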

## A word on Boolean logic

So far our conditionals have evaluated a single condition to determine which lines of code get executed. Python can also evaluate multiple conditions at once, using what we call **[Boolean logic](https://www.pbs.org/video/boolean-logic-logic-gates-crash-course-computer-science-nobmpt/)**.

There are four Boolean operations that will suffice in the vast majority of cases:

* $AND$, which in Python is `and`,
* $OR$, which in Python is `or`,
* $XOR$ or **exclusive or**, which in Python can be written as `^` when both operands are Booleans (`^` is Python's bitwise XOR operator), and
* $NOT$, which in Python is `not`.

This is related to the set operations that we covered in Case 0.3. In fact, you can represent Boolean logic [using sets](https://en.wikipedia.org/wiki/Boolean_algebra#Diagrammatic_representations).
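To illustrate the connection, here is a small sketch with two made-up sets of animals (the set contents and variable names are invented for this example). Intersection plays the role of $AND$, union the role of $OR$, and symmetric difference the role of $XOR$.

```python
# Hypothetical sets of animals, invented for this example
flying = {"bat", "sparrow", "mosquito"}
warm_blooded = {"bat", "sparrow", "dog"}

both = flying & warm_blooded         # intersection: flying AND warm-blooded
either = flying | warm_blooded       # union: flying OR warm-blooded
exactly_one = flying ^ warm_blooded  # symmetric difference: in exactly one of the sets

print(both)
print(either)
print(exactly_one)
```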

### $AND$

Suppose you have two Boolean variables `A` and `B`. This is the **truth table** for the $AND$ operation when applied to them (a truth table is just a table that shows the output of a Boolean operation for every combination of values of `A` and `B`):

| Value of `A`	| Value of `B`  | Value of `A and B`|
|------------	|------------	|------------------	|
| False      	| False      	| False            	|
| False      	| True       	| False            	|
| True       	| False      	| False            	|
| True       	| True       	| True             	|

In other words, `A and B` is `True` only when both conditions are `True`.
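As an illustration, the rule "if an animal flies, is warm-blooded, and feeds on blood, it is a bat" from the first decision tree can be written as a single condition with `and`. This is only a sketch: `flying_animal` and `is_bat` are hypothetical names, and the dictionary repeats the values of our first animal.

```python
flying_animal = {"flies": True, "warm_blooded": True, "feeds_blood": True}

# True only when all three conditions are True
is_bat = (
    flying_animal["flies"]
    and flying_animal["warm_blooded"]
    and flying_animal["feeds_blood"]
)

if is_bat:
    print("The animal is a bat")
```

Setting any one of the three values to `False` makes `is_bat` evaluate to `False`.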

### $OR$

This is the truth table for the $OR$ operation:

| Value of `A`	| Value of `B`  | Value of `A or B` |
|------------	|------------	|------------------	|
| False      	| False      	| False            	|
| False      	| True       	| True            	|
| True       	| False      	| True            	|
| True       	| True       	| True             	|

In other words, `A or B` is always `True` except when both conditions are `False`.
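You can reproduce the four rows of this table directly in Python. The variable names below are only labels for the rows:

```python
both_false = False or False   # row 1
second_true = False or True   # row 2
first_true = True or False    # row 3
both_true = True or True      # row 4

print(both_false, second_true, first_true, both_true)  # False True True True
```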

### $XOR$

This is the truth table for the exclusive or operation:

| Value of `A`	| Value of `B`  | Value of `A ^ B`  |
|------------	|------------	|------------------	|
| False      	| False      	| False            	|
| False      	| True       	| True            	|
| True       	| False      	| True            	|
| True       	| True       	| False            	|

In other words, `A ^ B` is `True` only when the two conditions have different truth values.
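Here is a short sketch of `^` in action (the variable names are hypothetical). For two Booleans, `A != B` gives the same result as `A ^ B`. Because `^` is evaluated before comparisons such as `<`, conditions combined with `^` should be wrapped in parentheses.

```python
a = True
b = False

xor_operator = a ^ b      # True, because the two values differ
xor_inequality = a != b   # same result for Booleans

with_parentheses = (4 < 5) ^ (400 < 3)  # True ^ False
without_parentheses = 4 < 5 ^ 400 < 3   # read as 4 < (5 ^ 400) < 3, a different question

print(xor_operator, xor_inequality)            # True True
print(with_parentheses, without_parentheses)   # True False
```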

### $NOT$

Negation is the easiest of the operators. It just reverses the value of your Boolean. So, if `A` is `True`, then `not A` will be `False`. It takes only one argument, so you can't do `A not B` (but you *can* do `not (A and B)`, for instance).
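A quick sketch of both uses of `not`, with hypothetical variable names:

```python
a = True
b = False

not_a = not a             # reverses True into False
not_both = not (a and b)  # `a and b` is False, so negating it gives True

print(not_a, not_both)  # False True
```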

---

Now let's look at an example. Let's evaluate an $OR$ operation:

In [None]:
if (4 < 5) or (400 < 3): 
    print("The conditional evaluated to True")
else:
    print("The conditional evaluated to False")

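Notice that we wrapped each condition in parentheses. Python evaluates comparisons such as `<` before `and` and `or`, so the parentheses are optional here, but they make it easier to see where each condition starts and ends.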

**Question:** Why did this code evaluate to `True`?

And here's an example of an $AND$ operation:

In [None]:
if (4 < 5) and (400 < 3): 
    print("The conditional evaluated to True")
else:
    print("The conditional evaluated to False")

**Question:** Now why did it evaluate to `False`?
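One way to check your answers to both questions is to evaluate each condition on its own and then combine the results. The variable names `first` and `second` are just for illustration:

```python
first = 4 < 5      # True
second = 400 < 3   # False

or_result = first or second    # at least one condition is True
and_result = first and second  # not all conditions are True

print(first, second)   # True False
print(or_result)       # True
print(and_result)      # False
```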