<h1 align = 'center'>Guessing Games</h1>
<h3 align = 'center'>machine learning, one step at a time</h3>
<h3 align = 'center'>Step 2. Learning from Mistakes</h3>

#### 2. How does a computer learn by making mistakes?

**Learning** involves making a series of ever-better-informed guesses. All of the information about a coin toss is available the second the coin hits the ground, so there is no way to learn from mistakes.

But what if someone picked a number from 1 to 10?

Like this:
<ul>
    <li><b>Bob</b>: pick a number from 1 to 10.</li>
    <li><b>Joe</b>: OK, I picked a number.</li>
    <li><b>Bob</b>: is it six?</li>
    <li><b>Joe</b>: no, it's not six.</li>
</ul><p>
Has Bob learned anything?

Bob has learned one of two things:
1. the number is not six
2. Joe is a liar

Let's assume that Joe is not a liar. In that case, by learning that the number is not six, Bob has narrowed his guessing down from ten numbers to nine numbers.

In Python, the set of numbers from one to ten looks like this:

In [1]:
s = [1,2,3,4,5,6,7,8,9,10]
print(s)

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]


If you guessed six, and the number is not six, you could pop six off the list by finding its index:

In [2]:
i = s.index(6)
s.pop(i)

6

...and now the six will be gone...

In [3]:
print(s)

[1, 2, 3, 4, 5, 7, 8, 9, 10]


And you could guess again, from the **remaining numbers** in the list.

By guessing wrong, you learned something: you learned what is not the answer. The answer is not six. You made the list of possible answers **shorter** and the remaining problem **easier** even though the guess was wrong.
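As a side note, Python lists also have a `remove` method that finds and deletes the first matching value in one step. Here is a minimal sketch, using a fresh list called `candidates` (a name made up for this example) so that it does not disturb `s`:

```python
# 'candidates' is a hypothetical name used only for this sketch
candidates = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# remove() deletes the first entry equal to 6; no index lookup is needed
candidates.remove(6)

print(candidates)
```

Unlike `pop`, `remove` does not hand back the value it deleted, and it raises a `ValueError` if the value is not in the list.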

Here is a Python program that tries to guess a number, learning as it goes:

In [4]:
import random

# first, write a function that pulls a random guess from a list of numbers

def guess(numbers):
    index = random.randrange(len(numbers))  # pick a random index from 0 to len(numbers) - 1
    return numbers.pop(index)               # pop that entry off of the list, so it can't be guessed again

guess(s)

9

In [5]:
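# next, try to guess the number 'x' from within the list of numbers 's'

def guess_until_correct(x, s):
    tries = 0
    while len(s) > 0:          # keep going while there are numbers left to guess
        tries += 1
        if x == guess(s):      # guess() removes its guess from s, right or wrong
            return tries
    return None                # the list ran out: x was never in it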

guess_until_correct(6, [1,2,3,4,5,6,7,8,9,10])

6
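To watch the list of possibilities shrink, it can help to record each guess along the way. The sketch below is one way to do that, not part of the program above: `guess_with_trace`, `history`, and the seed value `0` are names and choices made up for this example.

```python
import random

def guess_with_trace(x, s, rng):
    """Like guess_until_correct, but returns the list of guesses that were made."""
    remaining = list(s)      # work on a copy so the caller's list is untouched
    guesses = []
    while remaining:
        # pick a random entry and remove it from the pool of possibilities
        g = remaining.pop(rng.randrange(len(remaining)))
        guesses.append(g)
        if g == x:
            break            # found it: stop guessing
    return guesses

rng = random.Random(0)       # fixed seed (an arbitrary choice) so the run is repeatable
history = guess_with_trace(6, range(1, 11), rng)
print(history, 'tries:', len(history))
```

No number ever shows up twice in `history`, because every wrong guess is removed before the next guess is made.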

On average, how many guesses does it take to find the right number?

Let's try it 100,000 times and see what we get:

In [6]:
total = 0    # named 'total' so Python's built-in sum() is not overwritten

for i in range(100000):
    total += guess_until_correct(6, [1,2,3,4,5,6,7,8,9,10])

print('average', total/100000)

average 5.49195


It probably takes around 5 1/2 guesses, on average, to find the number.

**What's with the 1/2? Why doesn't it take an average of five guesses?**

...hmmmm...

What happens in this case?

In [7]:
total = 0    # named 'total' so Python's built-in sum() is not overwritten

for i in range(100000):
    total += guess_until_correct(1, [1,2])

print('average', total/100000)

average 1.50209


The average should be around 1 1/2. That's because it **always** takes at least one guess, but half the time it takes two.

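The average is (1+2)/2 = 1.5, and it works the same way for a list of any length: with *n* numbers, the right answer is equally likely to turn up on guess 1, 2, ... or *n*, so the average is (1 + 2 + ... + n)/n = (n+1)/2.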

The 'extra half' is just a way of saying 'you need between 1 and 10 guesses, and the average of the numbers 1 through 10 is 5.5', or:<p>
$$\sum_{i=1}^{10}i = 55$$<br>
$$\frac{55}{10}=5.5$$<br>
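The same arithmetic can be checked for other list lengths. This is a small sketch rather than part of the guessing program: `exact_average_with_learning` is a name made up here, and it assumes the right answer is equally likely to turn up on any guess from the first to the last.

```python
from fractions import Fraction

def exact_average_with_learning(n):
    """Exact average number of guesses for a list of n numbers
    when every wrong guess is removed from the list.

    Assumption: the answer is equally likely to be found on guess 1, 2, ..., n.
    """
    return Fraction(sum(range(1, n + 1)), n)

for n in (2, 10, 100):
    print(n, float(exact_average_with_learning(n)))
```

For a list of 2 this gives 1.5 and for a list of 10 it gives 5.5, matching the simulated averages above.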

What if we just guessed at random every time, without crossing off the wrong guesses? Guess what would happen?

In [8]:
# first, write a function to try guessing randomly, without learning from prior mistakes

def guess_randomly(x,s):
    tries = 0
    while True:
        tries += 1
        if x == s[random.randrange(len(s))]:
            return tries

guess_randomly(6, [1,2,3,4,5,6,7,8,9,10])

14

In [9]:
# then run the function 100,000 times

total = 0    # named 'total' so Python's built-in sum() is not overwritten

for i in range(100000):
    total += guess_randomly(6, [1,2,3,4,5,6,7,8,9,10])

print('average', total/100000)


average 10.01238


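(That should take around 10 guesses, on average, to get the right answer: every guess has the same 1-in-10 chance of being right, no matter how many wrong guesses came before it.)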
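Where does the 10 come from? If each guess is right with probability p = 1/10, the chance that the first correct guess is guess number k is (1 - p)^(k-1) * p, and the average of k weighted by those chances works out to 1/p. The sketch below adds up the first 1,000 terms to check this numerically; `average_without_learning` and the cutoff `max_terms` are names and choices made up for this example.

```python
def average_without_learning(n, max_terms=1000):
    """Approximate the average number of guesses when wrong guesses are NOT removed.

    Each guess is right with probability p = 1/n, so the first correct guess
    is guess k with probability (1 - p)**(k - 1) * p. Adding up k times that
    probability approaches 1/p = n. max_terms is an arbitrary cutoff.
    """
    p = 1 / n
    return sum(k * (1 - p) ** (k - 1) * p for k in range(1, max_terms + 1))

print(round(average_without_learning(10), 3))
```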

So **learning from mistakes** is almost <u>twice as efficient</u> as just guessing for numbers from one to ten... but what about a coin toss?

In [10]:
# run the random guesser 100,000 times on a two-item list

total = 0    # named 'total' so Python's built-in sum() is not overwritten

for i in range(100000):
    total += guess_randomly(1, [1,2])

print('average', total/100000)

average 2.00277


Yikes! It takes an average of around *two guesses* to guess from a total of two choices. How is <b><u>that</u></b> possible?

Let's add a variable to track the **maximum** number of guesses:

In [11]:
# run the function 100,000 times, tracking the largest number of tries

total = 0    # named 'total' so Python's built-in sum() is not overwritten
maximum = 0

for i in range(100000):
    tries = guess_randomly(1, [1,2])
    total += tries
    maximum = max(tries, maximum)

print('average', total/100000, 'maximum', maximum)

average 2.00574 maximum 17


Why does it sometimes take so many guesses, for a choice of two numbers (which is essentially the same as flipping a coin)? Hint: when the program guesses incorrectly, does it **learn** anything?
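Once you have thought about the hint, here is one way to check your answer. If a wrong guess teaches the program nothing, every guess is a fresh 50/50 shot, so needing at least k tries means getting k - 1 wrong guesses in a row. The sketch below estimates how many of the 100,000 runs should last that long; `chance_of_at_least` and `runs` are names made up for this example.

```python
def chance_of_at_least(k, p=0.5):
    """Probability that random guessing needs at least k tries,
    i.e. that the first k - 1 guesses are all wrong."""
    return (1 - p) ** (k - 1)

runs = 100000
for k in (5, 10, 17, 25):
    expected = runs * chance_of_at_least(k)
    print(k, 'or more tries: expect about', round(expected, 2), 'of', runs, 'runs')
```

Under these assumptions, only one or two runs out of 100,000 should be expected to reach 17 tries, which is in line with the maximum seen above.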