# CS 182: Artificial Intelligence
# Assignment 4: Probability and Inference under Uncertainty

* Fall 2017
* Due: **Thursday**, November 2, 5:00pm

In this assignment, you will use probabilistic inference and particle filtering to guide Pacman through a tracking problem. Note: We will use the Pacman framework developed at Berkeley. This framework is used worldwide to teach AI, so it is very important that you DO NOT publish your solutions online.


## Pacman (17 points)

<img src="https://s3-us-west-2.amazonaws.com/cs188websitecontent/projects/release/tracking/v1/001/busters.png">

Follow the instructions at

> http://ai.berkeley.edu/tracking.html

and complete **Q1-Q5**. You are NOT required to do Q6 or Q7, but they are interesting problems and worth taking a look at if you have extra time.

The page includes questions requiring implementation of the probabilistic inference and particle filtering algorithms we studied in class. We will be using the Berkeley grading scheme to grade this part of the problem set.


### Notes:

This is one of the hardest of the Pacman assignments, but also one of the most interesting. Be sure you understand the theoretical aspects of tracking and hidden Markov models, as well as particle filters, before you get started.

## RRT Algorithm (8 points)

<img src="https://upload.wikimedia.org/wikipedia/en/thumb/f/f5/RRT_graph1.png/300px-RRT_graph1.png">

In this section, you'll be implementing the rapidly-exploring random tree (RRT) robotic planning algorithm. All code can be found in the `rrt/` directory. Your goal is to implement several functions in `util.py` so that RRT finds a path from a source point to the goal. This path must avoid collisions with obstacles.

**NOTE: All code should be written in `util.py`.**

**NOTE: You need to install pygame for this to run: `pip install pygame`**

Functions that should be filled in for full credit are:
* `winCondition`
* `nearestNode`
* `getNewPoint`
* `extend`
* `isCollisionFree`
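
For orientation, here is a minimal sketch of the basic RRT loop on an empty map. It is not the interface in `util.py`: the names `dist`, `nearest`, `steer`, and `rrt`, the constants, and the small goal bias (a common variant, not something the starter code is known to use) are all assumptions made for this illustration. Your functions should follow the signatures in the starter code.

```python
import math
import random

STEP = 10.0         # hypothetical maximum extension length
GOAL_RADIUS = 10.0  # hypothetical distance at which the goal counts as reached
GOAL_BIAS = 0.05    # hypothetical chance of sampling the goal itself


def dist(a, b):
    """Euclidean distance between two 2D points."""
    return math.hypot(b[0] - a[0], b[1] - a[1])


def nearest(nodes, point):
    """Return the node closest to `point` (linear scan)."""
    return min(nodes, key=lambda node: dist(node, point))


def steer(from_pt, to_pt, step=STEP):
    """Move from `from_pt` toward `to_pt` by at most `step`."""
    d = dist(from_pt, to_pt)
    if d <= step:
        return to_pt
    t = step / d
    return (from_pt[0] + t * (to_pt[0] - from_pt[0]),
            from_pt[1] + t * (to_pt[1] - from_pt[1]))


def rrt(start, goal, width, height, max_iters=5000, rng=random):
    """Grow a tree from `start`; return a list of points to `goal`, or None."""
    parent = {start: None}  # maps each node to its parent in the tree
    for _ in range(max_iters):
        if rng.random() < GOAL_BIAS:
            sample = goal
        else:
            sample = (rng.uniform(0, width), rng.uniform(0, height))
        near = nearest(parent, sample)
        new = steer(near, sample)
        if new in parent:
            continue  # skip duplicates so the parent map stays a tree
        parent[new] = near  # a collision check would gate this line
        if dist(new, goal) <= GOAL_RADIUS:
            path = [new]
            while parent[path[-1]] is not None:
                path.append(parent[path[-1]])
            return path[::-1]
    return None


path = rrt((20.0, 20.0), (180.0, 180.0), width=200.0, height=200.0)
print("nodes on path:", len(path) if path else None)
```

The tree is stored as a child-to-parent dictionary so that the path can be read off by walking backward from the node that reached the goal.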

To test your implementation, there are three maps that increase in difficulty:
* `python runNoObs.py` -- no obstacles (should take roughly 500 iterations). You can use this map to test your other functions before implementing `isCollisionFree`: just make `isCollisionFree` always return `True`.
* `python runBugtrap.py` -- start in a bugtrap (should take a couple thousand iterations)
* `python runDoubleBugtrap.py` -- start and end in a bugtrap (as we discussed in class, this could take some time)

Note: Sometimes the graphics will make it appear that a valid solution just barely crosses the tip of a diagonal line. Also, by construction, our tests will never take a step that is bigger than an obstacle, so you do not need to worry about collisions in the middle of your extensions. [You're welcome :-)](https://www.youtube.com/watch?v=79DijItQXMM)


## Written Assignment (15 points)

Answer the following questions individually, and submit your answers as a PDF to Canvas.

### The Coin Problem (3 points)
#### Question 1 (3 points)

Ankit and Aidi decide to play a coin game to show how we can use HMMs for sequence analysis problems. Aidi tosses first, then they take turns based on the rules described below. The game finishes when the subsequence "HTH" appears, and whoever flipped the coin last wins. Each player can flip the coin for multiple turns in a row, and the rules for stopping and switching to the other player are as follows:

1. Every time Aidi flips the coin, she also flips an extra unfair coin (P(H) = 0.3). She stops if the extra unfair coin lands heads. Otherwise, she keeps flipping the fair and extra biased coin (at the same time). The flips of the extra coin are not recorded.
2. Every time Ankit flips the coin, he only flips the fair coin until H appears (and all flips are recorded).
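
If you want sample sequences to reason about, the rules above can be simulated roughly as follows. This is a sketch under stated assumptions, not part of the assignment code: the function name `play_coin_game` is made up for illustration, and it assumes that "HTH" means three consecutive recorded flips, which may span flips by different players.

```python
import random


def play_coin_game(rng=None, p_stop=0.3):
    """Simulate one game.

    Returns (recorded flips as a string, who made each flip, winner).
    """
    rng = rng or random.Random()
    flips, players = [], []
    current = "Aidi"  # Aidi tosses first
    while True:
        flip = "H" if rng.random() < 0.5 else "T"  # the fair, recorded coin
        flips.append(flip)
        players.append(current)
        if "".join(flips[-3:]) == "HTH":
            # Game over: whoever made this flip wins.
            return "".join(flips), players, current
        if current == "Aidi":
            # Unrecorded biased coin: heads (probability p_stop) ends her turn.
            if rng.random() < p_stop:
                current = "Ankit"
        elif flip == "H":
            # Ankit flips until he records an H, then hands over.
            current = "Aidi"


flips, players, winner = play_coin_game(random.Random(182))
print(flips, winner)
```

The simulator exposes who made each flip, which is exactly the hidden information you are asked to infer from the recorded flips alone.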

You're given a sequence of recorded coin flips. You'd like to infer the winner and the flips of each player.

Describe an HMM to model this game (draw a diagram with nodes and edges/arrows).

---------

### Typing Simulation (9 points)
For Questions 2-5 you will play a typing simulation. Let the random variable $E$ represent the observed key press, and $X$ represent the hidden (intended) key press. We have a language with 4 letters (A, B, C, D), and a keyboard arranged as a circle, so each key has exactly two neighbors.

<table>
<tr> <td>A</td><td>B</td> </tr> 
<tr> <td>C</td><td>D</td>  </tr>
</table>

At any time, the probability of hitting the intended key is 50%, and the probability of hitting each of the two neighboring keys is 25%. For example:

$$ P(E | X = \mathrm{B}) $$

<table>
<tr> <td>0.25</td><td>0.5</td> </tr> 
<tr> <td>0</td><td>0.25</td>  </tr>
</table>
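
One way to write this observation model down in code is sketched below. The ring order A, B, D, C is inferred from the example above (B's neighbors are A and D, and C sits opposite), and the names `KEYS` and `observation_model` are made up for this illustration.

```python
KEYS = ["A", "B", "D", "C"]  # assumed order going around the circle


def observation_model(p_hit=0.5, p_neighbor=0.25):
    """Return P(E = e | X = x) as a nested dict: model[x][e]."""
    n = len(KEYS)
    model = {}
    for i, x in enumerate(KEYS):
        row = {e: 0.0 for e in KEYS}
        row[x] = p_hit                          # intended key
        row[KEYS[(i - 1) % n]] = p_neighbor     # neighbor on one side
        row[KEYS[(i + 1) % n]] = p_neighbor     # neighbor on the other side
        model[x] = row
    return model


model = observation_model()
print(model["B"])  # {'A': 0.25, 'B': 0.5, 'D': 0.25, 'C': 0.0}
```

The printed row for `X = B` matches the table above, which is a quick sanity check on the assumed ring order.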

We will construct a filtering model to compute the belief state for this problem.

#### Question 2 (1 point)

Assuming a uniform prior distribution, calculate the conditional probability table (CPT) of $P(X=x | E=e)$ for all $x$ and $e$.

#### Question 3 (2 points)

Now let the prior distribution be:


| x  |  P(X=x) |
|---|---|
|A | 0.4 |
|B | 0.2    |
|C | 0.1    |
|D | 0.3    | 

Calculate the CPT $P(X=x | E=e)$ for all $x$ and $e$. 
 
#### Question 4 (3 points)

Consider the following transition model:

$$P(X' | X)$$


|   | A' | B' | C' | D' |
|---|---|---|---|---|
|Begin | 1| 0| 0| 0| 
|A | 0.5 | 0.5 | 0 | 0 |
|B | 0.0 | 0.5| 0.5| 0|
|C | 0.5 | 0| 0| 0.5|
|D |  0.25   |0.25 |0.25 | 0.25|
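
If you would like to experiment with this model, the table can be transcribed as below. This is only a sketch: the names `TRANSITIONS` and `sample_sequence` are made up for illustration. It checks that every row is a valid distribution and draws random hidden sequences; it does not compute any of the quantities asked for in the questions.

```python
import random

# TRANSITIONS[x][x_next] = P(X' = x_next | X = x), copied from the table above.
TRANSITIONS = {
    "Begin": {"A": 1.0, "B": 0.0, "C": 0.0, "D": 0.0},
    "A": {"A": 0.5, "B": 0.5, "C": 0.0, "D": 0.0},
    "B": {"A": 0.0, "B": 0.5, "C": 0.5, "D": 0.0},
    "C": {"A": 0.5, "B": 0.0, "C": 0.0, "D": 0.5},
    "D": {"A": 0.25, "B": 0.25, "C": 0.25, "D": 0.25},
}

# Every row must sum to 1 to be a probability distribution.
for state, row in TRANSITIONS.items():
    assert abs(sum(row.values()) - 1.0) < 1e-12, state


def sample_sequence(length, rng=None):
    """Draw a hidden key sequence of the given length, starting from Begin."""
    rng = rng or random.Random()
    state, sequence = "Begin", []
    for _ in range(length):
        row = TRANSITIONS[state]
        state = rng.choices(list(row), weights=list(row.values()))[0]
        sequence.append(state)
    return sequence


print(" ".join(sample_sequence(5)))
```

Because `Begin` moves to A with probability 1, every sampled sequence starts with A.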

For this problem we are concerned with true (hidden) state sequences, as opposed to observations. What is the probability under this model of the sequence of letters "A B B C D"? How about "A A B A"? What is $P(X_3=x | X_1 = \mathrm{A}, X_2 = \mathrm{B})$ for all $x$?


#### Question 5 (3 points)

Finally, we consider the full filtering problem in which we compute $P(X_n | E_1, \ldots, E_n)$. Let "A B B C D" be the sequence of observed key strokes. What is the current belief state of the model? That is, compute $P(X_n = x | E_1 = \mathrm{A}, E_2=\mathrm{B}, E_3=\mathrm{B}, E_4 = \mathrm{C}, E_5=\mathrm{D})$ for all $x$ and $n = 2, 3, 4, 5$.
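
As a reminder of the mechanics, here is a sketch of one filtering update (predict with the transition model, weight by the evidence likelihood, normalize) on the two-state umbrella example from Russell and Norvig. The numbers are deliberately unrelated to the keyboard model, and the name `forward_step` is made up for this illustration.

```python
def forward_step(belief, transition, likelihood):
    """One filtering update.

    belief[s]          -- P(X_{t-1} = s | e_1..e_{t-1})
    transition[s][s2]  -- P(X_t = s2 | X_{t-1} = s)
    likelihood[s]      -- P(e_t | X_t = s) for the evidence seen at time t
    """
    states = list(belief)
    # Predict: push the old belief through the transition model.
    predicted = {
        s2: sum(belief[s1] * transition[s1][s2] for s1 in states)
        for s2 in states
    }
    # Update: weight by the evidence likelihood, then normalize.
    unnormalized = {s: likelihood[s] * predicted[s] for s in states}
    total = sum(unnormalized.values())
    return {s: value / total for s, value in unnormalized.items()}


# Toy umbrella world: these numbers are not part of this assignment.
transition = {"rain": {"rain": 0.7, "dry": 0.3},
              "dry": {"rain": 0.3, "dry": 0.7}}
umbrella_seen = {"rain": 0.9, "dry": 0.2}  # P(umbrella | state)

belief = {"rain": 0.5, "dry": 0.5}
for day in (1, 2):
    belief = forward_step(belief, transition, umbrella_seen)
    print(day, round(belief["rain"], 3))
# 1 0.818
# 2 0.883
```

Note that this sketch starts from a prior over the state before the first observation, whereas in the typing model the first hidden key is determined by the `Begin` row.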

---------

### Robotic Motion Planning (3 points)
#### Question 6 (3 points)
Describe, using pseudocode, an RRT-based planning algorithm that uses more than two trees. Make sure to consider issues such as the maximum number of allowable trees, when to start a tree, and when to attempt connections between trees.

What are the types of problems for which this algorithm would perform better than RRT or bi-directional RRT?