This example shows how to perform the discrete belief update discussed in section 6.2 of the course text.

Read over the description of the baby problem first; the cells below express its transition and observation models as Julia functions.

Let's start by defining the transition function T(s, a, sp), the probability of moving from state s to state sp after taking action a:

In [1]:
function T(s, a, sp)
    # if we feed the baby, probability that it becomes not hungry is 1.0
    if a == :feed
        return sp == :not_hungry ? 1.0 : 0.0
    end

    # if we don't feed the baby, a hungry baby remains hungry
    if s == :hungry
        return sp == :hungry ? 1.0 : 0.0
    end

    # 10% chance of baby becoming hungry given it is not hungry and unfed
    return sp == :hungry ? 0.1 : 0.9
end

T (generic function with 1 method)
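
As a cross-check, the same transition probabilities can be written as a lookup table, and each row can be verified to sum to one. This is only a minimal sketch: the names `states`, `actions`, `t_table`, and `T_lookup` are hypothetical and are not part of the original notebook.

```julia
# Hypothetical table form of the transition model above.
# Keys are (s, a); each value maps a next state sp to P(sp | s, a).
states = [:hungry, :not_hungry]
actions = [:feed, :not_feed]

t_table = Dict(
    (:hungry,     :feed)     => Dict(:hungry => 0.0, :not_hungry => 1.0),
    (:not_hungry, :feed)     => Dict(:hungry => 0.0, :not_hungry => 1.0),
    (:hungry,     :not_feed) => Dict(:hungry => 1.0, :not_hungry => 0.0),
    (:not_hungry, :not_feed) => Dict(:hungry => 0.1, :not_hungry => 0.9),
)

T_lookup(s, a, sp) = t_table[(s, a)][sp]

# Every (s, a) row must be a probability distribution over sp.
rows_ok = all(
    isapprox(sum(T_lookup(s, a, sp) for sp in states), 1.0)
    for s in states, a in actions
)
println(rows_ok)
```

A table like this makes the model easy to inspect at a glance, while the function form used in this notebook keeps the reasoning behind each number next to the code.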

Next, let's define the observation function O(a, sp, o), the probability of observing o after taking action a and arriving in state sp:

In [2]:
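function O(a, sp, o)
    # the action a is unused: crying depends only on the resulting state sp
    # a hungry baby cries with probability 0.8, a not-hungry baby with 0.1
    p_cry = sp == :hungry ? 0.8 : 0.1

    # any observation other than :cry is treated as "not crying"
    return o == :cry ? p_cry : 1.0 - p_cry
end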

O (generic function with 1 method)
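
A quick sanity check on the observation model is that, for each state, the probabilities of the two observations sum to one. The sketch below restates the crying probabilities in a hypothetical dictionary `p_cry_given` and helper `p_obs` (neither is in the original notebook) so that it runs on its own. It also computes how much more likely a cry is from a hungry baby, which is what drives the belief updates later on.

```julia
# Hypothetical restatement of the observation model so this cell runs alone.
# P(cry | hungry) = 0.8 and P(cry | not hungry) = 0.1, as in O above.
p_cry_given = Dict(:hungry => 0.8, :not_hungry => 0.1)
p_obs(sp, o) = o == :cry ? p_cry_given[sp] : 1.0 - p_cry_given[sp]

# For every state, the observation probabilities should sum to 1.
for sp in [:hungry, :not_hungry]
    println(sp, " => ", p_obs(sp, :cry) + p_obs(sp, :not_cry))
end

# How much more likely a cry is when the baby is hungry than when it is not.
cry_ratio = p_obs(:hungry, :cry) / p_obs(:not_hungry, :cry)
println(cry_ratio)
```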

The discrete belief update is defined in equations 6.7-6.11 of the course text:

In [3]:
function update_belief(b, a, o)
    bp = Dict()
    for sp in [:hungry, :not_hungry]
        # prediction: probability of landing in sp, summed over prior states s
        sum_over_s = 0.0
        for s in [:hungry, :not_hungry]
            sum_over_s += T(s, a, sp) * b[s]
        end
        # correction: weight by the probability of observing o in state sp
        bp[sp] = O(a, sp, o) * sum_over_s
    end

    # normalize so that probabilities sum to 1
    bp_sum = bp[:hungry] + bp[:not_hungry]
    bp[:hungry] = bp[:hungry] / bp_sum
    bp[:not_hungry] = bp[:not_hungry] / bp_sum

    return bp
end

update_belief (generic function with 1 method)
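
Written as a single formula, the loop and normalization in `update_belief` compute the expression below. Here $T(s' \mid s, a)$ stands for `T(s, a, sp)`, $O(o \mid a, s')$ stands for `O(a, sp, o)`, and $b$ and $b'$ are the beliefs before and after the update. This is a restatement of what the code does in assumed notation, not a quotation of the textbook's equations.

$$
b'(s') = \frac{O(o \mid a, s') \sum_{s} T(s' \mid s, a)\, b(s)}{\sum_{s''} O(o \mid a, s'') \sum_{s} T(s'' \mid s, a)\, b(s)}
$$

The numerator corresponds to the value stored in `bp[sp]` before normalization, and the denominator corresponds to `bp_sum`.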

Let's use our functions to follow the example in section 6.2.1 of the course text.

Step 1. We start with a uniform belief:

In [4]:
b1 = Dict()
b1[:hungry] = 0.5
b1[:not_hungry] = 0.5

0.5

Step 2. We do not feed the baby and the baby cries.

In [5]:
b2 = update_belief(b1, :not_feed, :cry)

Dict{Any,Any} with 2 entries:
  :not_hungry => 0.0927835
  :hungry     => 0.907216
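
To see where these numbers come from, the same update can be done by hand using only the probabilities defined in `T` and `O`. The sketch below is illustrative and its variable names are hypothetical. The unnormalized weights come out to roughly 0.44 and 0.045, which normalize to the values shown above.

```julia
# Step 2 by hand: uniform prior, action :not_feed, observation :cry.
b_hungry, b_not_hungry = 0.5, 0.5

# Prediction: push the prior through the transition model for :not_feed.
# A hungry baby stays hungry; a not-hungry baby becomes hungry with 0.1.
pred_hungry     = 1.0 * b_hungry + 0.1 * b_not_hungry
pred_not_hungry = 0.0 * b_hungry + 0.9 * b_not_hungry

# Correction: weight each prediction by the probability of crying there.
w_hungry     = 0.8 * pred_hungry
w_not_hungry = 0.1 * pred_not_hungry

# Normalize so the two posterior probabilities sum to 1.
z = w_hungry + w_not_hungry
post_hungry = w_hungry / z
post_not_hungry = w_not_hungry / z
println((post_hungry, post_not_hungry))
```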

Step 3. We feed the baby and it stops crying.

In [6]:
b3 = update_belief(b2, :feed, :not_cry)

Dict{Any,Any} with 2 entries:
  :not_hungry => 1.0
  :hungry     => 0.0

Step 4. We do not feed the baby and it does not cry.

In [7]:
b4 = update_belief(b3, :not_feed, :not_cry)

Dict{Any,Any} with 2 entries:
  :not_hungry => 0.975904
  :hungry     => 0.0240964

Step 5. Again, we do not feed the baby and it does not cry.

In [8]:
b5 = update_belief(b4, :not_feed, :not_cry)

Dict{Any,Any} with 2 entries:
  :not_hungry => 0.970132
  :hungry     => 0.0298684

Step 6. We do not feed the baby and the baby begins to cry.

In [9]:
b6 = update_belief(b5, :not_feed, :cry)

Dict{Any,Any} with 2 entries:
  :not_hungry => 0.462415
  :hungry     => 0.537585
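
The five updates above can also be run as one pass over a list of (action, observation) pairs. The sketch below is one possible way to do this, not the notebook's own method. To stay runnable on its own it restates the model in hypothetical helpers `trans`, `obs`, and `belief_step`, which mirror `T`, `O`, and `update_belief`.

```julia
# Hypothetical compact restatement of the model so this cell runs alone.
states = [:hungry, :not_hungry]

function trans(s, a, sp)
    if a == :feed
        return sp == :not_hungry ? 1.0 : 0.0
    elseif s == :hungry
        return sp == :hungry ? 1.0 : 0.0
    else
        return sp == :hungry ? 0.1 : 0.9
    end
end

function obs(sp, o)
    p_cry = sp == :hungry ? 0.8 : 0.1
    return o == :cry ? p_cry : 1.0 - p_cry
end

# One belief update: predict, weight by the observation, then normalize.
function belief_step(b, a, o)
    w = Dict(sp => obs(sp, o) * sum(trans(s, a, sp) * b[s] for s in states)
             for sp in states)
    z = sum(values(w))
    return Dict(sp => w[sp] / z for sp in states)
end

# The (action, observation) pairs from Steps 2 through 6.
history = [(:not_feed, :cry), (:feed, :not_cry), (:not_feed, :not_cry),
           (:not_feed, :not_cry), (:not_feed, :cry)]

# Start from the uniform belief of Step 1 and keep every intermediate belief.
beliefs = [Dict(:hungry => 0.5, :not_hungry => 0.5)]
for (a, o) in history
    push!(beliefs, belief_step(beliefs[end], a, o))
end
println(beliefs[end])
```

Keeping every intermediate belief in `beliefs` makes it easy to compare each entry against the step-by-step outputs above.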