## APS106 Lecture Notes - Week 12, Design Problem
# 20 Questions About Animals

## Problem Background

You've probably played the "20 Questions" game with your family. Someone thinks up an object, let's say an animal, and everyone else must guess the animal in at most 20 yes-or-no questions. A transcript might look like:

1. Is it a mammal? **YES**
1. Does it live on land? **NO**
1. Is it a type of seal or sea lion? **NO**
1. Is it a type of whale? **YES**
1. Is it black and white? **YES**
1. Is it a killer whale? **YES**

We want to write code so that the computer guesses the animal that you are thinking of. (Wow!)

There are a few challenges here:
- the computer needs to know about animals
- the computer needs to know how to differentiate between animals
- the computer needs to know what questions to ask to solve the problem within 20 questions. (Let's say the computer has a list of 10,000 animals. It won't work to just ask: "Is it a grizzly bear?", "Is it a bald eagle?", ..., 10,000 times!).

And let's make it a bit harder: whenever the computer guesses wrong (i.e., guesses a particular animal and it is incorrect), we want it to learn the new animal. And so, the program will ask you for a yes-or-no question that differentiates the animal you are thinking of from the one that it guessed. **And it will learn** the new animal so that the next time you play, if you choose the same animal, it will get it right.

## Define The Problem

You need to build a system where the program has a set of animals and a set of questions where each question partitions the set of animals in two: those animals that have that feature (e.g. "Does it swim?") and those that do not. Your program should ask a series of those distinguishing questions until it ends up with a single possible animal and then it guesses that animal. 

If it is right, you're done (and can ask to play again). 

If it is wrong, then this is an animal that is not in the known set. The program should then ask the user what question will differentiate between the animal that was guessed and the correct one. Then that question and the new animal should be incorporated into the known set and the user can be prompted to play again.

(In a real application we would also want to worry about reading and writing out the data so that the program doesn't have to learn from scratch each time, but we can leave that for future work.)

At the beginning, the program knows no animals and so it should prompt the user for two animals and a question that distinguishes them.

## Define Test Cases 

### Test Case 1

```
Think of a question and two animals that are distinguished by the question.
What is the question: Does it swim?
For which of your two animals is the answer "yes": fish
What is the animal for which the answer is "no": bird

Think of an animal.
Does it swim? no
Is your animal a bird? no
What is a question that differentiates your animal from a bird? Does it fly?
What is the answer to that question for a bird? yes
What is the animal that you were thinking of? ant
Would you like to play again? no
```

### Test Case 2

(Run after Test Case 1)

```
Think of an animal.
Does it swim? yes
Is your animal a fish? no
What is a question that differentiates your animal from a fish? Is it a mammal?
What is the answer to that question for a fish? no
What is the animal that you were thinking of? seal
Would you like to play again? yes
```

## Generate Many Creative Solutions

At this point, you may be thinking "how am I ever going to write this program?" It seems daunting.

As is often the case, coming up with ideas of how to represent the data (in this case the animals and the questions) and thinking about how to use that data structure is a good starting place.

For example, you might consider using a built-in data structure like a set for the animal names and a different one for the questions. OK, but then the problem is: how do we connect the animals and the questions? Remember that somehow the questions need to split the set of animals in two: those for which the answer is yes and those for which the answer is no. Hmmm ... or maybe you can use a dictionary with the key being the question and the value being the set of animals for which the answer is yes. Then through some set manipulations (intersection, set difference, etc.) you could keep filtering down the set of animals until you have one left. This could work, but then how do you know which order to ask the questions in? If you have more than 20 questions in your database, you may always lose unless you can figure out some efficient order to ask them in.
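To make the dictionary idea concrete, here is a minimal sketch. The animals, the questions, and the `filter_candidates` helper are all made up for illustration; they are not part of the lecture code.

```python
# Hypothetical data: each question maps to the set of animals whose answer is "yes".
animals = {"fish", "bird", "seal", "ant"}
yes_sets = {
    "Does it swim?": {"fish", "seal"},
    "Does it fly?": {"bird"},
    "Is it a mammal?": {"seal"},
}

def filter_candidates(candidates, question, answer_is_yes):
    """Return the candidates that are still consistent with one answer."""
    yes_animals = yes_sets[question]
    if answer_is_yes:
        return candidates & yes_animals   # intersection: keep the "yes" animals
    return candidates - yes_animals       # set difference: drop the "yes" animals

# The user is thinking of a fish: it swims, and it is not a mammal.
remaining = filter_candidates(animals, "Does it swim?", True)
remaining = filter_candidates(remaining, "Is it a mammal?", False)
print(remaining)
```

The filtering itself is easy, but nothing in this structure says which question to ask first. That is the ordering problem raised above.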

A good way to organize the data here is a binary tree. An inner node in the tree represents a question and a leaf node represents an animal. All the animals for which the answer is no are stored in one subtree and all those for which the answer is yes are stored in the other. So you get a picture like this.

![](img.png)

By arranging the tree so that it reflects the structure of the data, we can search it efficiently without having to visit every node in sequence.

(Here's a math question to think about. If you have 10,000 animals and you have a perfectly balanced tree - meaning that each question divides the remaining animals into two equal sets (plus or minus 1) - how many questions do you need to ask to find any animal? Or to put it another way, what is the maximum number of animals you can represent to always be able to identify one of the known animals with 20 questions? Hint: If you have a limit of 20 questions, you can handle over 1 million animals. That is why this data structure is a good choice.)
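If you want to check your answer, the arithmetic can be sketched in a few lines. The helper name `questions_needed` is an assumption for illustration, and it counts only the distinguishing questions in a perfectly balanced tree, not the final "Is it a ...?" guess.

```python
import math

def questions_needed(num_animals):
    """Depth of a perfectly balanced yes/no tree with num_animals leaves."""
    return math.ceil(math.log2(num_animals))

print(questions_needed(10_000))   # 14, because 2**13 < 10,000 <= 2**14
print(2 ** 20)                    # 1048576 animals can be separated by 20 questions
```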

At the beginning of each round, the program starts at the top of the tree and asks the first question. Depending on the answer, it moves to the left or right child and continues until it gets to a leaf node. 
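As a rough sketch of that walk, the example below replaces the user with a dictionary of scripted answers so that it runs without any typing. The tiny `Node` class, the `walk_to_leaf` helper, and the sample animals are assumptions for illustration only.

```python
class Node:
    """Tiny stand-in for a tree node: cargo plus a 'no' child and a 'yes' child."""
    def __init__(self, cargo, no_answer=None, yes_answer=None):
        self.cargo = cargo
        self.no_answer = no_answer
        self.yes_answer = yes_answer

def walk_to_leaf(root, answers):
    """Follow scripted answers (question -> True/False) from the root to a leaf."""
    current = root
    asked = []
    # An internal (question) node has both children; a leaf has neither.
    while current.no_answer is not None and current.yes_answer is not None:
        asked.append(current.cargo)
        if answers[current.cargo]:
            current = current.yes_answer
        else:
            current = current.no_answer
    return current, asked

tree = Node("Is it a mammal?",
            no_answer=Node("bald eagle"),
            yes_answer=Node("Does it live on land?",
                            no_answer=Node("killer whale"),
                            yes_answer=Node("grizzly bear")))

answers = {"Is it a mammal?": True, "Does it live on land?": False}
leaf, asked = walk_to_leaf(tree, answers)
print(leaf.cargo)   # killer whale
print(len(asked))   # 2
```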

### An Algorithm Plan

1. Ask the user for two animals and a question (see Test Case 1) and create a tree with three nodes.
1. While the user wants to keep playing, start at the first node and repeat until you reach a leaf node:
   1. Ask the question. If the answer is no, go right; if it is yes, go left.
1. Guess the animal in the leaf node.
1. If you are right, celebrate and ask the user to play again.
1. If you are wrong, ask the user for a distinguishing question and the new animal. Replace the animal at the current leaf with the question and create two child nodes, one containing each animal.
1. Ask the user to play again.
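The only step that changes the tree is the one taken after a wrong guess. Here is a minimal sketch of that step with the user's replies passed in as arguments instead of typed in. The compact `Node` class and the `split_leaf` helper are hypothetical names used only for this illustration.

```python
class Node:
    """Tiny stand-in for a tree node: cargo plus a 'no' child and a 'yes' child."""
    def __init__(self, cargo, no_answer=None, yes_answer=None):
        self.cargo = cargo
        self.no_answer = no_answer
        self.yes_answer = yes_answer

def split_leaf(leaf, question, old_animal_answer_is_yes, new_animal):
    """Turn an animal leaf into a question node with two animal children."""
    old_node = Node(leaf.cargo)      # the animal that was guessed
    new_node = Node(new_animal)      # the animal the user was thinking of
    leaf.cargo = question            # the leaf now holds the new question
    if old_animal_answer_is_yes:
        leaf.yes_answer, leaf.no_answer = old_node, new_node
    else:
        leaf.yes_answer, leaf.no_answer = new_node, old_node

# Replays the learning step from Test Case 1: bird was guessed, ant was correct.
root = Node("Does it swim?", no_answer=Node("bird"), yes_answer=Node("fish"))
split_leaf(root.no_answer, "Does it fly?", True, "ant")

print(root.no_answer.cargo)             # Does it fly?
print(root.no_answer.yes_answer.cargo)  # bird
print(root.no_answer.no_answer.cargo)   # ant
```

Because the leaf is modified in place, its parent does not need to be updated: the parent still points at the same node, which now holds a question instead of an animal.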

### Programming Plan

1. Create a `Node` class to represent the questions and animals.
1. Do Algorithm Step 1 and test.
1. Implement the tree traversal (Algorithm Step 2) that ends either with the right answer or the wrong one. 
1. Put the "play again" part inside a loop. At this point, the program can't learn anything. It can just guess the two animals that you inserted in Algorithm Step 1 by asking one question.
1. Add the creation of new nodes when the program guesses wrong (Algorithm Step 5).

## Programming Step 1: Create a Node class

In [2]:
class Node:
    '''A Node class used by a binary tree class'''
    
    def __init__(self, cargo, no_answer = None, yes_answer = None):
        '''
        (self) -> NoneType
        Create a Node with cargo and no_answer and yes_answer subtrees
        '''
        self.cargo = cargo
        self.no_answer = no_answer
        self.yes_answer = yes_answer
                
    def __str__(self):
        '''
        (self) -> str
        Return a str representation of cargo
        '''
        return str(self.cargo)

    def print_tree(self):
        '''
        (self) -> NoneType
        Prints tree level by level
        '''

        thislevel = [self]

        while thislevel:

            nextlevel = list()

            for n in thislevel:

                print (n.cargo, " ", end = "")

                if n.no_answer != None:
                    nextlevel.append(n.no_answer)
                if n.yes_answer != None:
                    nextlevel.append(n.yes_answer)

            print("\n")
            thislevel = nextlevel


We will probably want to do more, but for now, let's do as little as we need to do to get each programming step done.

## Programming Step 2: Add a question and two animals

In [3]:
print("Think of a question and two animals that are distinguished by the question.")

question = input("What is the question: ")
yes_animal = input("For which of your two animals is the answer \"yes\": ")
no_animal = input("What is the animal for which the answer is \"no\": ")

root = Node(question)
yes_node = Node(yes_animal)
root.yes_answer = yes_node

no_node = Node(no_animal)
root.no_answer = no_node

root.print_tree()

Think of a question and two animals that are distinguished by the question.
does it swim  

dog  fish  



The code above is a bit painful since it builds the tree by hand, creating each node and attaching it to the root separately. It is nicer to be able to use a method to add a new node. So let's add new methods to `Node`.

# Breakout session 1 - Adding a yes node and a no node to hold cargo depending on the user's answer

In [4]:
# Here we are using the code from the previous cell - nothing new!

class Node:
    '''A Node class used by a binary tree class'''

    def __init__(self, cargo, no_answer = None, yes_answer = None):
        '''
        (self) -> NoneType
        Create a Node with cargo and no_answer and yes_answer subtrees
        '''
        self.cargo = cargo
        self.no_answer = no_answer
        self.yes_answer = yes_answer

    def __str__(self):
        '''
        (self) -> str
        Return a str representation of cargo
        '''
        return str(self.cargo)

    def print_tree(self):
        '''
        (self) -> NoneType
        Prints tree level by level
        '''

        thislevel = [self]

        while thislevel:

            nextlevel = list()

            for n in thislevel:

                print (n.cargo, " ", end = "")

                if n.no_answer != None:
                    nextlevel.append(n.no_answer)
                if n.yes_answer != None:
                    nextlevel.append(n.yes_answer)

            print("\n")
            thislevel = nextlevel

    def add_yes_node(self,cargo):
        '''
        Create a new node holding cargo and adds it as the yes_answer child
        '''
        yes_node = Node(cargo)
        self.yes_answer = yes_node

    def add_no_node(self,cargo):
        '''
        Create a new node holding cargo and adds it as the no_answer child
        '''
        no_node = Node(cargo)
        self.no_answer = no_node


print("Think of a question and two animals that are distinguished by the question.")

question = input("What is the question: ")
yes_animal = input("For which of your two animals is the answer \"yes\": ")
no_animal = input("What is the animal for which the answer is \"no\": ")

root = Node(question)
root.add_yes_node(yes_animal)
root.add_no_node(no_animal)

root.print_tree()

Think of a question and two animals that are distinguished by the question.
does it swim  

dog  fish  



OK, where are we on the Programming Plan?

1. ~~Create a `Node` class to represent the questions and animals.~~
1. ~~Do Algorithm Step 1 and test.~~
1. Implement the tree traversal (Algorithm Step 2) that ends either with the right answer or the wrong one. 
1. Put the "play again" part inside a loop. At this point, the program can't learn anything. It can just guess the two animals that you inserted in Algorithm Step 1 by asking one question.
1. Add the creation of new nodes when the program guesses wrong (Algorithm Step 5).

## Programming Step 3: Tree traversal and question asking

# Breakout session 2 - Writing a tree traversal algorithm using Seb's notes from yesterday (12.2)

In [5]:
class Node:
    '''A Node class used by a binary tree class'''
    
    def __init__(self, cargo, no_answer = None, yes_answer = None):
        '''
        (self) -> NoneType
        Create a Node with cargo and no_answer and yes_answer subtrees
        '''
        self.cargo = cargo
        self.no_answer = no_answer
        self.yes_answer = yes_answer
                
    def __str__(self):
        '''
        (self) -> str
        Return a str representation of cargo
        '''
        return str(self.cargo)

    def print_tree(self):
        '''
        (self) -> NoneType
        Prints tree level by level
        '''

        thislevel = [self]

        while thislevel:

            nextlevel = list()

            for n in thislevel:

                print (n.cargo, " ", end = "")

                if n.no_answer != None:
                    nextlevel.append(n.no_answer)
                if n.yes_answer != None:
                    nextlevel.append(n.yes_answer)

            print("\n")
            thislevel = nextlevel

    def add_yes_node(self,cargo):
        '''
        Create a new node holding cargo and adds it as the yes_answer child
        '''
        yes_node = Node(cargo)
        self.yes_answer = yes_node

    def add_no_node(self,cargo):
        '''
        Create a new node holding cargo and adds it as the no_answer child
        '''
        no_node = Node(cargo)
        self.no_answer = no_node
        
    def traverse_to_leaf(self):
        # This is from Seb's 12.2 lecture!
        '''
        (self) -> Node
        Traverse to a leaf node by asking the user questions and going to the sub-tree
        based on the answer
        '''
        on = self
        while on.yes_answer is not None and on.no_answer is not None:  # keep going while on is an internal (question) node
            answer = input(on.cargo + " ")  # ask the question stored at this node
            if answer == "yes" or answer == "y":
                on = on.yes_answer
            else:
                on = on.no_answer
        return on
    
print("Think of a question and two animals that are distinguished by the question.")

question = input("What is the question: ")
yes_animal = input("For which of your two animals is the answer \"yes\": ")
no_animal = input("What is the animal for which the answer is \"no\": ")

root = Node(question)
root.add_yes_node(yes_animal)
root.add_no_node(no_animal)

root.print_tree()

leaf = root.traverse_to_leaf()
correct = input("Is your animal a " + str(leaf) +"? ")

if correct == "yes":
    print("I rule!")
else:
    print("Sad")

Think of a question and two animals that are distinguished by the question.
does it swim  

dog  fish  

Sad


## Programming Step 4: Add the loop

In [6]:
print("Think of a question and two animals that are distinguished by the question.")

question = input("What is the question: ")
yes_animal = input("For which of your two animals is the answer \"yes\": ")
no_animal = input("What is the animal for which the answer is \"no\": ")

root = Node(question)
root.add_yes_node(yes_animal)
root.add_no_node(no_animal)

root.print_tree()
play_again = "yes"
while play_again == "yes":
    print("Think of an animal.")
    
    leaf = root.traverse_to_leaf()
    correct = input("Is your animal a " + str(leaf) +"? ")

    if correct == "yes":
        print("I rule!")
    else:
        print("Sad")
    
    play_again = input("Would you like to play again? ")

print("OK - goodbye")

Think of a question and two animals that are distinguished by the question.
does it swim  

dog  fish  

Think of an animal.
Sad
OK - goodbye


1. ~~Create a `Node` class to represent the questions and animals.~~
1. ~~Do Algorithm Step 1 and test.~~
1. ~~Implement the tree traversal (Algorithm Step 2) that ends either with the right answer or the wrong one.~~ 
1. ~~Put the "play again" part inside a loop. At this point, the program can't learn anything. It can just guess the two animals that you inserted in Algorithm Step 1 by asking one question.~~
1. Add the creation of new nodes when the program guesses wrong (Algorithm Step 5).

## Programming Step 5: Add new nodes

In [7]:
class Node:
    '''A Node class used by a binary tree class'''
    
    def __init__(self, cargo, no_answer = None, yes_answer = None):
        '''
        (self) -> NoneType
        Create a Node with cargo and no_answer and yes_answer subtrees
        '''
        self.cargo = cargo
        self.no_answer = no_answer
        self.yes_answer = yes_answer
                
    def __str__(self):
        '''
        (self) -> str
        Return a str representation of cargo
        '''
        return str(self.cargo)

    def print_tree(self):
        '''
        (self) -> NoneType
        Prints tree level by level
        '''

        thislevel = [self]

        while thislevel:

            nextlevel = list()

            for n in thislevel:

                print (n.cargo, " ", end = "")

                if n.no_answer != None:
                    nextlevel.append(n.no_answer)
                if n.yes_answer != None:
                    nextlevel.append(n.yes_answer)

            print("\n")
            thislevel = nextlevel

    def add_yes_node(self,cargo):
        '''
        Create a new node holding cargo and adds it as the yes_answer child
        '''
        yes_node = Node(cargo)
        self.yes_answer = yes_node

    def add_no_node(self,cargo):
        '''
        Create a new node holding cargo and adds it as the no_answer child
        '''
        no_node = Node(cargo)
        self.no_answer = no_node

    def traverse_to_leaf(self):
        # This is from Seb's 12.2 lecture!
        '''
        (self) -> Node
        Traverse to a leaf node by asking the user questions and going to the sub-tree
        based on the answer
        '''
        on = self
        while on.yes_answer is not None and on.no_answer is not None:  # keep going while on is an internal (question) node
            answer = input(on.cargo + " ")  # ask the question stored at this node
            if answer == "yes" or answer == "y":
                on = on.yes_answer

            else:
                on = on.no_answer

        return on
    
    def learn_new_animal(self):
        '''
        (self) -> NoneType
        Split the node by prompting user for question and the new animal
        '''
        
        # get the user info
        question = input("What is a question that differentiates your animal from a " 
                         + self.cargo + "? ")
        on_answer = input("What is the answer to that question for a " + self.cargo + "? ")
        new_animal = input("What is the animal that you were thinking of? ")
        
        # create new nodes
        old_node = Node(self.cargo)
        new_node = Node(new_animal)
        self.cargo = question
        
        # add new nodes to tree
        if on_answer == "yes":
            self.yes_answer = old_node
            self.no_answer = new_node
        else:
            self.yes_answer = new_node
            self.no_answer = old_node
            

In [8]:
print("Think of a question and two animals that are distinguished by the question.")

question = input("What is the question: ")
yes_animal = input("For which of your two animals is the answer \"yes\": ")
no_animal = input("What is the animal for which the answer is \"no\": ")

root = Node(question)
root.add_yes_node(yes_animal)
root.add_no_node(no_animal)

root.print_tree()
play_again = "yes"
while play_again == "yes" or play_again == "y":
    print("Think of an animal.")
    
    leaf = root.traverse_to_leaf()
    correct = input("Is your animal a " + str(leaf) +"? ")

    if correct == "yes" or correct == "y":
        print("I rule!")
    else:
        leaf.learn_new_animal()
    
    play_again = input("Would you like to play again? ")
    root.print_tree()
    
print("OK - goodbye")


Think of a question and two animals that are distinguished by the question.
does it swim  

hes  no  

Think of an animal.
I rule!
does it swim  

hes  no  

OK - goodbye


## Do the Testing!
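
One way to run Test Case 1 and Test Case 2 without typing every answer is to feed scripted replies to `input`. The sketch below is only one possible approach: it uses `unittest.mock.patch` from the standard library, a trimmed copy of the `Node` class (only the two methods the game needs), and a hypothetical `play_round` helper that wraps one pass through the game loop.

```python
from unittest.mock import patch

class Node:
    """Trimmed copy of the lecture's Node: only what one round of the game needs."""
    def __init__(self, cargo, no_answer=None, yes_answer=None):
        self.cargo = cargo
        self.no_answer = no_answer
        self.yes_answer = yes_answer

    def traverse_to_leaf(self):
        on = self
        while on.yes_answer is not None and on.no_answer is not None:
            answer = input(on.cargo + " ")
            on = on.yes_answer if answer in ("yes", "y") else on.no_answer
        return on

    def learn_new_animal(self):
        question = input("What is a question that differentiates your animal from a "
                         + self.cargo + "? ")
        on_answer = input("What is the answer to that question for a " + self.cargo + "? ")
        new_animal = input("What is the animal that you were thinking of? ")
        old_node, new_node = Node(self.cargo), Node(new_animal)
        self.cargo = question
        if on_answer == "yes":
            self.yes_answer, self.no_answer = old_node, new_node
        else:
            self.yes_answer, self.no_answer = new_node, old_node

def play_round(root):
    """Hypothetical helper: play one round and return True if the guess was right."""
    leaf = root.traverse_to_leaf()
    correct = input("Is your animal a " + leaf.cargo + "? ")
    if correct == "yes":
        return True
    leaf.learn_new_animal()
    return False

# Starting tree from Test Case 1: "Does it swim?" with fish (yes) and bird (no).
root = Node("Does it swim?", no_answer=Node("bird"), yes_answer=Node("fish"))

# Test Case 1: thinking of an ant.
with patch("builtins.input", side_effect=["no", "no", "Does it fly?", "yes", "ant"]):
    assert play_round(root) is False
assert root.no_answer.cargo == "Does it fly?"
assert root.no_answer.no_answer.cargo == "ant"

# Test Case 2: thinking of a seal.
with patch("builtins.input", side_effect=["yes", "no", "Is it a mammal?", "no", "seal"]):
    assert play_round(root) is False
assert root.yes_answer.yes_answer.cargo == "seal"

# Play again with the seal: the program should now guess it correctly.
with patch("builtins.input", side_effect=["yes", "yes", "yes"]):
    assert play_round(root) is True

print("All scripted rounds passed")
```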

## Extensions

There are a bunch of extensions that you might think of here.

1. Read in the data from a file and write it out to a file.
1. Add a method to print all the animals that the program knows about.
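
As a starting point for both extensions, here is a rough sketch. It is one possible design, not the course solution: the helper names (`list_animals`, `to_dict`, `from_dict`) and the dictionary keys (`"cargo"`, `"no"`, `"yes"`) are assumptions, and the tree is converted to a JSON string rather than written to an actual file.

```python
import json

class Node:
    """Tiny stand-in for a tree node: cargo plus a 'no' child and a 'yes' child."""
    def __init__(self, cargo, no_answer=None, yes_answer=None):
        self.cargo = cargo
        self.no_answer = no_answer
        self.yes_answer = yes_answer

def list_animals(node):
    """Return the cargo of every leaf, visiting the 'no' branch first."""
    if node.no_answer is None and node.yes_answer is None:
        return [node.cargo]          # a leaf holds an animal
    animals = []
    for child in (node.no_answer, node.yes_answer):
        if child is not None:
            animals.extend(list_animals(child))
    return animals

def to_dict(node):
    """Convert a tree into nested dictionaries that json can serialize."""
    if node is None:
        return None
    return {"cargo": node.cargo,
            "no": to_dict(node.no_answer),
            "yes": to_dict(node.yes_answer)}

def from_dict(data):
    """Rebuild a tree from the nested dictionaries produced by to_dict."""
    if data is None:
        return None
    return Node(data["cargo"], from_dict(data["no"]), from_dict(data["yes"]))

root = Node("Does it swim?",
            no_answer=Node("Does it fly?", no_answer=Node("ant"), yes_answer=Node("bird")),
            yes_answer=Node("fish"))
print(list_animals(root))        # ['ant', 'bird', 'fish']

text = json.dumps(to_dict(root))             # this string could be written to a file
restored = from_dict(json.loads(text))       # and read back in a later session
print(list_animals(restored))    # ['ant', 'bird', 'fish']
```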