# Probabilistic Search and Destroy

Authors:

- Rohan Rele (rsr132)
- Aakash Raman (abr103)
- Alex Eng (ame136)
- Adarsh Patel (aap237)

This project was completed for Professor Wes Cowan's Fall 2019 offering of the CS 520: Intro to Artificial Intelligence course, taught at Rutgers University, New Brunswick.

# Problem Representation

In this project, we consider a two-dimensional map of cells in which one cell is randomly designated as the target. The location of the target is not known to any solving agent. Therefore, the problem is to devise an agent which can effectively query the landscape of cells, update its knowledge base from its observations, and ultimately find the target in the **minimal number of queries.**

The knowledge base itself will contain probabilistic knowledge, i.e. 

$$\text{Belief}[\text{Cell}_i] = P(\text{Target in Cell}_i|  \text{Observations through time } t)$$ 

For every cell, this is the probability that the cell contains the target given the existing knowledge base. Initially, as the agent has no prior knowledge about the map, the belief for each cell is $\frac{1}{dim^2}$, where $dim$ is the side length of the square map.

Each cell also contains a terrain type which corresponds to the probability that a query will return a false negative, i.e.

$$P(\text{Target not found in Cell}_i | \text{Target is in Cell}_i)$$

which is $0.1$ for **flat** terrain cells, $0.3$ for **hilly** terrain cells, $0.7$ for **forested** terrain cells, and $0.9$ for cells whose terrain is a maze of **caves.**
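To get a feel for these rates: since each query of the target cell succeeds independently with probability $1 - P(\text{false negative})$, the number of queries of that cell needed before a success follows a geometric distribution. The sketch below computes its mean per terrain type. The names `FALSE_NEGATIVE` and `expected_queries` are illustrative and not part of the project code.

```python
# False negative rates per terrain type, as listed above.
FALSE_NEGATIVE = {"flat": 0.1, "hilly": 0.3, "forest": 0.7, "caves": 0.9}

def expected_queries(false_negative):
    """Mean of a geometric distribution with success probability 1 - false_negative."""
    return 1.0 / (1.0 - false_negative)

for terrain, fn in FALSE_NEGATIVE.items():
    print(f"{terrain}: {expected_queries(fn):.2f} expected queries of the target cell")
```

Even when the agent is querying the correct cell, a target hidden in caves takes roughly nine times as many queries on average as one on flat terrain.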

We assume that for any given map, each cell is assigned the flat terrain type with probability $0.2$, the hilly terrain type with probability $0.3$, the forested terrain type with probability $0.3$, and the caves terrain type with probability $0.2$.
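As a rough illustration of this distribution, the following sketch samples a grid of terrain labels with `random.choices` and prints the observed frequencies. It is a simplified stand-in for the class-based implementation shown later. The names `TERRAINS`, `WEIGHTS`, and `sample_terrains` are assumptions made for this example.

```python
import random
from collections import Counter

TERRAINS = ["flat", "hilly", "forest", "caves"]
WEIGHTS = [0.2, 0.3, 0.3, 0.2]  # terrain probabilities stated above

def sample_terrains(dim, rng):
    """Return a dim x dim grid of terrain labels drawn independently per cell."""
    return [rng.choices(TERRAINS, weights=WEIGHTS, k=dim) for _ in range(dim)]

rng = random.Random(0)  # seeded only so the sketch is repeatable
grid = sample_terrains(50, rng)
counts = Counter(t for row in grid for t in row)
for terrain in TERRAINS:
    print(terrain, counts[terrain] / 50**2)
```

With $50^2 = 2500$ cells, the observed frequencies should land close to the stated probabilities, though not exactly on them.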

## Landscape

We implement the landscape as a class, which has the following fields:

- `dim` (int): the dimension of the $dim$ by $dim$ map
- `landscape` (2D list) of `landCell` objects, each of which tracks:
    - `target` (int): `PRESENT = 1` if this cell is the target, or `ABSENT = 0` otherwise
    - `terrain` (float): the false negative probability for this cell's terrain type, i.e. `FLAT = 0.1`, `HILLY = 0.3`, `FOREST = 0.7`, or `MAZE = 0.9`
- `target_x` (int): the x-coordinate of the target cell
- `target_y` (int): the y-coordinate of the target cell

A landscape is initialized with non-target cells that are assigned terrain types based on the probabilities previously described. It then randomly selects one cell to be the target.

In [None]:
import random  # used to place the target

class landscape:

    dim = 0
    landscape = [[]]
    target_x = 0
    target_y = 0

    def __init__(self, dim):
        self.dim = dim
        # every cell starts as a non-target landCell with a random terrain type
        self.landscape = [[landCell() for _ in range(self.dim)] for _ in range(self.dim)]

        # choose the target cell uniformly at random
        target_x = random.randint(0, dim - 1)
        target_y = random.randint(0, dim - 1)
        self.landscape[target_x][target_y].target = PRESENT
        self.target_x = target_x
        self.target_y = target_y

For more implementation details, see `Landscape.py`.

The `landCell` object is also defined in a class.

In [None]:
class landCell:

    def __init__(self):
        # draw x uniformly from 1..100 and map it onto the terrain distribution
        x = random.randint(1, 100)
        self.target = ABSENT
        if x <= 20:        # 20%: flat
            self.terrain = FLAT
        elif x <= 50:      # 30%: hilly
            self.terrain = HILLY
        elif x <= 80:      # 30%: forest
            self.terrain = FOREST
        else:              # 20%: caves
            self.terrain = MAZE

    def getTerrain(self):
        return self.terrain

    def isTarget(self):
        return self.target == PRESENT

For more implementation details, see `Cell.py`.

For example, an initialized $dim = 50$ landscape may look like this:

![Blank Landscape](./imgs/landscape_blankTest.png)

where the target is located at (40, 48).

## Agent

We also implement the agent as a class, which has the following fields:

- `knowledge` (2D list) of `agentCell` objects, each of which tracks:
    - `belief` (float): the probability that a given cell contains the target, as described above; initially $1/{dim}^2$
    - `status` (boolean): either `VISITED = True` or `UNVISITED = False` depending on whether or not the cell has been queried previously; initially `False`
- `rule` (int): either 1 or 2, corresponding to the two probability rules described below
- `num_actions` (int): the number of actions, whether queries or movements (in the later case of a movement-restricted agent), executed so far; initially 0

In [None]:
class agent:
    num_actions = 0

    def __init__(self, landscape, rule):
        self.ls = landscape
        d = self.ls.dim

        if rule == 1 or rule == 2:
            self.rule = rule
        else:
            print("Invalid rule, set to 1 by default")
            self.rule = 1

        self.knowledge = [[agentCell() for j in range(d)] for i in range(d)]
        for i in range(d):
            for j in range(d):
                self.knowledge[i][j].setBelief(1/(d**2))

For more implementation details, see `Agent.py`.

The `agentCell` object is also defined in a class.

In [None]:
class agentCell:

    def __init__(self):

        self.belief = 0
        self.status = UNVISITED

    def getBelief(self):
        return self.belief

    def getStatus(self):
        return self.status

    def setBelief(self,belief):
        self.belief = belief

    def setStatus(self,status):
        self.status = status

For more implementation details, see `Cell.py`.

For example, an initialized agent knowledge base with $dim = 50$ landscape may look like this:

![Blank Beliefs](./imgs/belief_blankTest.png)

where each cell has initial belief $\frac{1}{50^2} = 0.0004$.

The `agent` class also has the following methods (non-exhaustive list):

- `searchCell(cell)` (boolean): query a `landCell` object. If it is not the target, return `False`. If it is the target, then only return `True` with probability $p = 1 - P(\text{false negative})$, where the false negative probability depends on that cell's terrain type as described previously. Otherwise, return `False`. 

**Note:** In either case, the number of actions taken by the agent is incremented.

In [None]:
def searchCell(self,cell): #search a landCell
    self.num_actions += 1
    
    if not cell.isTarget():
        return False
    else:
        p = 1 - cell.getTerrain()
        if random.uniform(0, 1) < p:
            return True
        else:
            return False

- `getVisited()` (list): return a list of all (x,y) coordinates which the agent has already queried at a given point in time

In [None]:
def getVisited(self):
    n = self.ls.dim
    coords = []
    for x in range(n):
        for y in range(n):
            if self.knowledge[x][y].getStatus():
                coords.append((x,y))
    return coords

# Updating the belief state

We require a method to update the belief state given the results of a query. There are two cases: 

1. A query of a cell found the target, in which case the belief for this cell is set to 1, and the beliefs for all other cells are set to 0.

2. A query of a cell did not find the target, in which case the belief for this cell must be adjusted considering the probability that the query returned a false negative.

The latter case considers the probability 

$$P(\text{Target in Cell}_i | \text{Observations}_t \land \text{Failure in Cell}_j)$$

and relies on the probabilistic knowledge base.

### Probabilistic intuition

Let $H := \{\text{Target in Cell}_i\}$ and $E := \{\text{Target not found if we queried every cell}\}$. 

Then we want $P(H|E)$, or the probability that the target is in a cell given we have not found the target in any of our queries. We would like to compute this upon a failed query of a cell and accept this quantity as that cell's new belief.

By **Bayes' theorem,** this quantity is:

$$P(H|E) = \frac{P(E|H)P(H)}{P(E)}$$

Observe that $P(E|H)$ is the probability that the target is not found given the target is in the cell, which is exactly the false negative probability described above per terrain type. And $P(H)$ is exactly the agent's belief for that cell in the previous time step.

One can see that observing a failed query for a given cell will decrease our belief that this cell contains the target, but this decrease is scaled by the possibility of false negatives.

$P(E)$ is calculated with the following function: 

$$P(E) = \sum_{i \text{ visited}} P(\text{Target not found in Cell}_i | \text{Target is in Cell}_i) \cdot \text{Belief}_i + \sum_{j \text{ unvisited}} \frac{1}{dim^2}$$

That is, as implemented in `probNotFound`, the first sum accounts for the possibility that some already-queried cell contains the target but returned a false negative, weighted by the current belief for that cell, and the second sum adds the initial prior $\frac{1}{dim^2}$ of each cell that has not yet been queried.

Finally, once a queried cell is updated, we must update the rest of the knowledge base. 

Let $R_i = |{\text{new belief of Cell}_i} - {\text{old belief of Cell}_i}|$, or the difference between the new and old beliefs of the queried cell.

Then for all remaining cells $j$, use the following update formula: 

$$\text{Belief}^{t+1}_j = \text{Belief}^t_j + \frac{\text{Belief}^t_j \cdot R_i}{1 - R_i}$$

This, in a sense, scales the previous belief by how much our query impacted the belief of the queried cell. One can see how failed queries will increase the beliefs of all other cells per iteration, although this increase may be marginal.
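The update can be traced on a tiny example. The sketch below applies the two formulas above to a flat list of four equal beliefs after a failed query of a flat-terrain cell. The function name `apply_failed_query` is hypothetical, and $P(E)$ is passed in as a parameter and set to `1.0` purely for illustration.

```python
def apply_failed_query(beliefs, idx, false_negative, p_e):
    """Apply the update described above to a flat list of beliefs.

    beliefs: current beliefs, one per cell
    idx: index of the cell whose query failed
    false_negative: P(not found | target in cell) for that cell's terrain
    p_e: the normalizing quantity P(E), supplied by the caller
    """
    old = beliefs[idx]
    new = false_negative * old / p_e                   # Bayes' theorem for the queried cell
    r = abs(new - old)                                 # R_i: size of the change
    updated = [b + b * r / (1 - r) for b in beliefs]   # scale the other cells up
    updated[idx] = new                                 # queried cell takes its Bayes value
    return updated

beliefs = [0.25, 0.25, 0.25, 0.25]
updated = apply_failed_query(beliefs, idx=0, false_negative=0.1, p_e=1.0)
print(updated)
```

Here the queried cell drops from $0.25$ to $0.025$, so $R_i = 0.225$ and every other cell rises to $0.25 / 0.775 \approx 0.323$. In this example the updated beliefs sum to roughly $0.993$ rather than exactly $1$, so an explicit renormalization step may be worth considering.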

### Implementation

Based on the intuition above, the implementation of a belief update is:

In [None]:
def updateBelief(self,x,y):
    if self.searchCell(self.ls.landscape[x][y]):
        for i in range(self.ls.dim):
            for j in range(self.ls.dim):
                self.knowledge[i][j].setBelief(0)
        self.knowledge[x][y].setBelief(1)
    else:
        #P(H|E) = P(E|H)P(H)/P(E)
        #H: Target in cell
        #E: Target not found
        curr_belief = self.knowledge[x][y].getBelief()
        num = self.ls.landscape[x][y].getTerrain()*curr_belief
        denom = self.probNotFound()
        remainder = abs(curr_belief - (num/denom))
        self.knowledge[x][y].setBelief(num/denom)
        for i in range(self.ls.dim):
            for j in range(self.ls.dim):
                if i == x and j == y:
                    continue
                else:
                    temp = self.knowledge[i][j].getBelief()
                    self.knowledge[i][j].setBelief(temp + (temp*remainder)/(1-remainder))

    return self.knowledge

This method updates the belief for the queried cell and for the rest of the map, as described earlier, and returns the updated knowledge base. It relies on the function `probNotFound`, which computes $P(E)$ as described above.

In [None]:
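def probNotFound(self):
    n = self.ls.dim
    coords = self.getVisited()
    # contribution of the cells that have not been queried yet
    res = (n**2 - len(coords))/(n**2)
    # contribution of possible false negatives in already-queried cells
    for x, y in coords:
        res += (self.ls.landscape[x][y].getTerrain())*(self.knowledge[x][y].getBelief())
    return res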

For more implementation details, see `Agent.py`.

# Agent search strategies

Armed with this `updateBelief` function, we need to define how the agent will choose cells to query in order to search maps. We consider two rules for which cell the agent should query next:

1. Query the cell with the highest belief, i.e. the probability that **the target is in that cell.**

2. Query the cell with the highest probability that **the target will be found in such a query.**

We implement both probability rules, and then use either of them to implement the agent's search algorithm.

## Rule 1: Probability that the target is in a given cell

This is based on the same $P(H|E)$ computed above. Upon each failed query, we update the entire knowledge base of beliefs as described above, and then we visit the cell with the highest belief. Its implementation is described above via `updateBelief`.

## Rule 2: Probability that the target will be found, if a given cell is searched

### Probabilistic intuition

Observe that this probability is different from $\text{Belief}_i$. It must consider the impact of the terrain's interference with queries, i.e. potential false negatives. We want the following probability:

$$P(F) := P(\text{Target found in Cell}_i | \text{Observations}_t)$$

which is equal to:

$$(1 - P(\text{Target not found in Cell}_i | \text{Target is in Cell}_i)) \cdot P(\text{Target is in Cell}_i)$$

Note that $P(\text{Target is in Cell}_i)$ is exactly $P(H)$ from before, and $P(\text{Target not found in Cell}_i | \text{Target is in Cell}_i)$ is determined by the terrain type of $\text{Cell}_i$, so we can easily compute $P(F)$.
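A small worked example shows how the two rules can disagree. The two cells and their beliefs below are made up for illustration, and `prob_found` is a standalone helper written for this sketch rather than the project's method.

```python
# (belief, false negative rate) for two hypothetical cells
cells = {
    "cave cell": (0.30, 0.9),
    "flat cell": (0.20, 0.1),
}

def prob_found(belief, false_negative):
    """P(F): probability that a query of the cell finds the target."""
    return (1 - false_negative) * belief

# Rule 1 ranks by belief alone; Rule 2 ranks by P(F).
rule1_choice = max(cells, key=lambda name: cells[name][0])
rule2_choice = max(cells, key=lambda name: prob_found(*cells[name]))
print(rule1_choice, rule2_choice)
```

The cave cell has the higher belief, so Rule 1 selects it, but its $P(F)$ is only about $0.03$ compared with $0.18$ for the flat cell, so Rule 2 selects the flat cell instead.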

### Implementation

The probability that the target will be found at a given cell if it is searched is computed by:

In [None]:
def probFound(self,x,y):
    # (1 - false negative rate) * current belief for cell (x, y)
    return (1-self.ls.landscape[x][y].getTerrain())*self.knowledge[x][y].getBelief()

## Agent search implementation for both rules

We drive the above probability-calculating and belief update functions with the below methods in the `agent` class:

In [None]:
def getMaxLikCell(self):
    if self.rule == 1:
        #get max i for P(Target in Cell i)
        belief = np.array([[self.knowledge[i][j].getBelief() for j in range(self.ls.dim)] for i in range(self.ls.dim)])
        return np.unravel_index(belief.argmax(),belief.shape)
    else:
        #get max i for P(Target FOUND in Cell i)
        belief = [[self.knowledge[i][j].getBelief()*(1-self.ls.landscape[i][j].getTerrain()) for j in range(self.ls.dim)] for i in range(self.ls.dim)]
        belief = np.array(belief)
        return np.unravel_index(belief.argmax(),belief.shape)

This method will find the cell with maximum probability for either rule:

1. Return the cell coordinates with the highest Rule 1 probability, i.e. belief.
2. Return the cell coordinates with the highest Rule 2 probability, i.e. the equation in `probFound` above.

In either case, ties between multiple maximum-probability cells are broken by `argmax`, which returns the first maximal cell in row-major order.
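NumPy's `argmax` always returns the first maximal entry, so on a uniform belief grid it would pick cell (0, 0). If uniformly random tie-breaking is preferred, one possible sketch is shown below. The helper `argmax_random_tie` is hypothetical and not part of `Agent.py`.

```python
import numpy as np

def argmax_random_tie(scores, rng):
    """Return (row, col) of a maximal entry, chosen uniformly among ties."""
    scores = np.asarray(scores)
    # flat indices of every entry equal to the maximum
    flat_ties = np.flatnonzero(scores == scores.max())
    pick = rng.choice(flat_ties)
    return tuple(int(k) for k in np.unravel_index(pick, scores.shape))

rng = np.random.default_rng(0)  # seeded only so the sketch is repeatable
uniform = np.full((3, 3), 1 / 9)
print(np.unravel_index(uniform.argmax(), uniform.shape))  # first cell in row-major order
print(argmax_random_tie(uniform, rng))                    # any of the nine tied cells
```

Whether random tie-breaking changes search performance in practice would need to be checked empirically.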

Finally, the driver code which will repeat belief updates and getting the maximum-probability cell is:

In [None]:
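def findTarget(self):
    # start from a cell chosen uniformly at random
    i = random.randint(0, self.ls.dim-1)
    j = random.randint(0, self.ls.dim-1)
    while not self.searchCell(self.ls.landscape[i][j]):
        # note: updateBelief queries cell (i, j) again internally,
        # so each pass through this loop body counts two actions
        self.knowledge = self.updateBelief(i,j)
        i,j = self.getMaxLikCell()
    return (i,j)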

This method will randomly select a cell to query for the first iteration. Then, for subsequent iterations, it will query the cell with the highest probability (based on either rule), updating all beliefs along the way. It terminates once the target is found.

## Performance comparison between both rules

For all comparisons listed below between the two probability rules prioritized in `getMaxLikCell`, we run trials on map(s) of $dim = 50$.

### Repeated trials over the same landscape

For a fixed landscape, we run $n=200$ trials of Rule 1 agents and Rule 2 agents solving the same map, and we record the number of actions taken for each agent and trial. Note that for each trial, we also reset the location of the target to a new and different location, chosen uniformly at random via the following code:

In [None]:
def resetTarget(self):
    old_x = self.target_x; old_y = self.target_y
    self.landscape[old_x][old_y].target = ABSENT

    # every coordinate except the old target location
    new_targ_coords = [(i, j) for i in range(self.dim) for j in range(self.dim) if (i, j) != (old_x, old_y)]

    new_target = random.choice(new_targ_coords)
    self.target_x = new_target[0]
    self.target_y = new_target[1]
    self.landscape[self.target_x][self.target_y].target = PRESENT

Here, we use the following randomly-generated landscape, where the target is **initially** at (41, 46).

![Fixed Landscape Rule One v. Two Trials](./imgs/landscape_ruleOneTwoComparison.png)

The comparison data generated is as follows:

In [1]:
import pandas as pd
fixed_ruleOneTwoComp_df = pd.read_csv('./data/fixed_map_comparison_ruleOneTwo.csv')[['RuleOne', 'RuleTwo']]
fixed_ruleOneTwoComp_df.transpose()

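| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | ... | 190 | 191 | 192 | 193 | 194 | 195 | 196 | 197 | 198 | 199 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| RuleOne | 4193 | 1535 | 1617 | 4949 | 1865 | 583 | 647 | 1549 | 3891 | 2623 | ... | 4623 | 741 | 213 | 2001 | 3489 | 4207 | 2513 | 2145 | 2515 | 3265 |
| RuleTwo | 3825 | 297 | 1509 | 4045 | 3145 | 1173 | 121 | 6069 | 5247 | 545 | ... | 17261 | 145 | 1059 | 25745 | 3599 | 2319 | 517 | 439 | 1759 | 687 |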


The above trial-by-trial data on the number of searches required for the agents (using rules 1 or 2) to find the target can be visualized via the following scatter plot:

![Rules 1-2 Comparison by Trials, Scatter](./imgs/fixed_map_comparison_ruleOneTwoScatter.png)

It appears that the agent using Rule 1 requires a higher number of searches to find the target, but this pattern is not immediately clear. To clarify it, we consider the quantity:

$$\text{Diff} = \text{Number of searches}_{\text{Rule1}} - \text{Number of searches}_{\text{Rule2}}$$

![Rules 1-2 Difference by Trials, Plot](./imgs/fixed_map_comparison_ruleOneTwoDiff.png)

![Rules 1-2 Difference by Trials, Box](./imgs/fixed_map_comparison_ruleOneTwoDiffBox.png)

The above images show that the difference between the number of searches required following rules 1 and 2 has high spread. The following 1-variable statistics describe the distribution:

- **Mean:** 426.55
- **Standard Deviation:** 8418.86
- **Min:** -49320
- **Max:** 30422

All in all, for a fixed map, it appears that **the agent using Rule 2** outperforms the agent using Rule 1, requiring on average about **427 fewer searches to find the target.** However, the large variance visualized above (a standard deviation roughly twenty times the mean) strongly motivates repeated trials over multiple randomly-generated maps in order to see if this pattern holds in general.
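For reference, summary statistics of this kind can be computed along the following lines. The sketch uses only the first ten trials shown in the table above, so its output will not match the full 200-trial figures, and it assumes the sample standard deviation, which may or may not be the variant used for the reported value.

```python
import statistics

# First ten trials from the table above (number of searches per rule).
rule_one = [4193, 1535, 1617, 4949, 1865, 583, 647, 1549, 3891, 2623]
rule_two = [3825, 297, 1509, 4045, 3145, 1173, 121, 6069, 5247, 545]

# Diff = searches under Rule 1 minus searches under Rule 2, per trial.
diffs = [r1 - r2 for r1, r2 in zip(rule_one, rule_two)]
summary = {
    "mean": statistics.mean(diffs),
    "stdev": statistics.stdev(diffs),  # sample standard deviation
    "min": min(diffs),
    "max": max(diffs),
}
print(summary)
```

On such a small subset the sign of the mean can differ from the full-sample result, which is another illustration of how noisy the per-trial differences are.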

### Repeated trials over multiple landscapes

We conduct similar trials to compare both agents for $N = 50$ distinct and randomly generated maps. For each map and agent, we record the average number of actions taken to find the target over $n = 30$ trials. As above, we reset the target to a random new location in between each trial.

### Intuition

# Restricted agent movement

## Rule 3: Utility and decision-making strategy

## Implementation

## Performance comparison with unrestricted movement via rules 1 and 2

# A drunk man

# A moving target

Now, consider a target which is not static in that:

1. Upon every failed query, the target moves to one of its neighbors at random.
2. After the target moves, a sensor returns an observation of terrain type ("FLAT", "HILLY", etc.). However, the sensor is broken, so it returns, at random, a terrain type that the target's new cell does **not** have.

For example, if we query a cell and do not find the target, and we get the new observation "HILLY", then we know that the target is now in a cell which is either "FLAT", "FOREST", or "MAZE".
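One way an agent could use such an observation is to rule out every cell whose terrain matches the reported type and renormalize the remaining beliefs. The sketch below shows only this observation step. It does not model the target's movement between queries, and the function `filter_by_observation` is a hypothetical helper rather than part of the project code.

```python
def filter_by_observation(beliefs, terrains, reported):
    """Zero out cells whose terrain equals the reported type, then renormalize.

    beliefs and terrains are same-shaped 2D lists; reported is the terrain
    value that the broken sensor says the target's cell does NOT have.
    """
    filtered = [
        [0.0 if t == reported else b for b, t in zip(b_row, t_row)]
        for b_row, t_row in zip(beliefs, terrains)
    ]
    total = sum(sum(row) for row in filtered)
    if total == 0:
        return filtered  # no cell is consistent with the observation
    return [[b / total for b in row] for row in filtered]

FLAT, HILLY, FOREST, MAZE = 0.1, 0.3, 0.7, 0.9  # terrain constants as used above
terrains = [[FLAT, HILLY], [FOREST, HILLY]]
beliefs = [[0.25, 0.25], [0.25, 0.25]]
result = filter_by_observation(beliefs, terrains, reported=HILLY)
print(result)
```

In this 2 by 2 example, the two hilly cells are excluded and the remaining two cells each receive belief $0.5$.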

## Implementation

Due to the object-oriented nature of our `landscape` class, the target can easily be moved.

The below code moves the target to one of its neighbors at random, and returns, at random, a terrain type that the target's new cell does not have.

In [None]:
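def moveTarget(self):
    x = self.target_x; y = self.target_y
    self.landscape[x][y].target = ABSENT

    # collect the in-bounds neighbors of the current target cell;
    # `dirs` is assumed to be a list of (dx, dy) offsets defined elsewhere in the module
    nbrs = []
    for dx, dy in dirs:
        if 0 <= x + dx < self.dim and 0 <= y + dy < self.dim:
            nbrs.append((x + dx, y + dy))

    # move the target to a neighbor chosen uniformly at random
    new_nbr = random.choice(nbrs)
    self.target_x = new_nbr[0]; self.target_y = new_nbr[1]
    self.landscape[self.target_x][self.target_y].target = PRESENT

    # report, at random, a terrain type that the new target cell does NOT have
    terrains = {FLAT, HILLY, FOREST, MAZE}
    new_terrain = {(self.landscape[self.target_x][self.target_y]).terrain}
    return random.choice(list(terrains.difference(new_terrain)))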

## Strategy

## A moving target AND an agent with restricted movement