# Module 7 - Programming Assignment

## Directions

There are general instructions on Blackboard and in the Syllabus for Programming Assignments. This Notebook also has instructions specific to this assignment. Read all the instructions carefully and make sure you understand them. Please ask questions on the discussion boards or email me at `EN605.445@gmail.com` if you do not understand something.

<div style="background: mistyrose; color: firebrick; border: 2px solid darkred; padding: 5px; margin: 10px;">
You must follow the directions *exactly* or you will get a 0 on the assignment.
</div>

You must submit a zip file of your assignment and associated files (if there are any) to Blackboard. The zip file will be named after your JHED ID: `<jhed_id>.zip`. It will not include any other information. Inside this zip file should be the following directory structure:

```
<jhed_id>
    |
    +--module-01-programming.ipynb
    +--module-01-programming.html
    +--(any other files)
```

For example, do not name your directory `programming_assignment_01` and do not name your directory `smith122_pr1` or anything else. It must be only your JHED ID.

Imports here if needed.

In [1]:
import copy

# Forward Planner

## Unify

Use the accompanying `unification.py` file for unification. For this assignment, you're almost certainly going to want to be able to:

1. specify the problem in terms of S-expressions.
2. parse them.
3. work with the parsed versions.

`parse` and `unification` work exactly like the programming assignment for last time.

In [2]:
from unification import parse, unification

## Forward Planner

In this assignment, you're going to implement a Forward Planner. What does that mean? If you look in your book, you will not find pseudocode for a forward planner. It just says "use state space search", which is less than helpful because it's a bit more complicated than that. **(But please, please do not try to implement STRIPS or GraphPlan...that is wrong.)**

At a high level, a forward planner takes the current state of the world $S_0$ and attempts to derive a plan, basically by Depth First Search. We have all the ingredients we said we would need in Module 1: states, actions, a transition function and a goal test. We have a set of predicates that describe a state (and therefore all possible states), we have actions and we have, at least, an implicit transition function: applying an action in a state causes the state to change as described by the add and delete lists.

Let's say we have an item (a drill) and two places (home and the store). I'm at home, the drill is at the store, and I want to go buy the drill and end up back at home with it. We might represent that as:

<code>
start_state = [
    "(item Drill)",
    "(place Home)",
    "(place Store)",
    "(agent Me)",
    "(at Me Home)",
    "(at Drill Store)"
]
</code>

And we have a goal state:

<code>
goal = [
    "(item Drill)",
    "(place Home)",
    "(place Store)",
    "(agent Me)",
    "(at Me Home)",
    "(at Drill Me)"
]
</code>

The actions/operators are:

<code>
actions = {
    "drive": {
        "action": "(drive ?agent ?from ?to)",
        "conditions": [
            "(agent ?agent)",
            "(place ?from)",
            "(place ?to)",
            "(at ?agent ?from)"
        ],
        "add": [
            "(at ?agent ?to)"
        ],
        "delete": [
            "(at ?agent ?from)"
        ]
    },
    "buy": {
        "action": "(buy ?purchaser ?seller ?item)",
        "conditions": [
            "(item ?item)",
            "(place ?seller)",
            "(agent ?purchaser)",
            "(at ?item ?seller)",
            "(at ?purchaser ?seller)"
        ],
        "add": [
            "(at ?item ?purchaser)"
        ],
        "delete": [
            "(at ?item ?seller)"
        ]
    }
}
</code>

These will all need to be parsed from s-expressions to the underlying Python representation before you can use them. You might as well do it at the start of your algorithm, once. The order of the conditions is *not* arbitrary. It is much, much better for the unification and backtracking if you have the "type" predicates (item, place, agent) before the more complex ones. Trust me on this.
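To make "parse once, at the start" concrete, here is a minimal sketch. It assumes that `parse` turns an S-expression into nested lists of strings, which is what the helper code in this notebook appears to expect. `parse_sexpr` and `parse_problem` are hypothetical stand-ins written only for this illustration; they are not the functions from `unification.py`.

```python
def parse_sexpr(text):
    """Hypothetical stand-in for `parse`: "(at Me Home)" -> ['at', 'Me', 'Home']."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()

    def read(pos):
        # tokens[pos] is "("; collect items until the matching ")".
        items = []
        pos += 1
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                nested, pos = read(pos)
                items.append(nested)
            else:
                items.append(tokens[pos])
                pos += 1
        return items, pos + 1

    expr, _ = read(0)
    return expr


def parse_problem(obj):
    """Parse every string found inside nested lists and dictionaries."""
    if isinstance(obj, str):
        return parse_sexpr(obj)
    if isinstance(obj, list):
        return [parse_problem(item) for item in obj]
    if isinstance(obj, dict):
        return dict((key, parse_problem(value)) for key, value in obj.items())
    raise TypeError("Cannot parse type: %r" % type(obj))


# One action schema, parsed once up front.
drive = parse_problem({
    "action": "(drive ?agent ?from ?to)",
    "conditions": ["(agent ?agent)", "(place ?from)", "(place ?to)", "(at ?agent ?from)"],
    "add": ["(at ?agent ?to)"],
    "delete": ["(at ?agent ?from)"],
})
```

After this, `drive["conditions"][0]` is the list `['agent', '?agent']`, so the search never has to touch the string form again.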

As for the algorithm itself, there is going to be an *outer* level of search and an *inner* level of search.

The *outer* level of search is exactly what I describe here: you have a state, you generate successor states by applying actions to the current state, and you examine those successor states as we did in the first week of the semester. If one is the goal, you stop; if you see a repeat state, you skip it because it is already on the explored list (you should implement graph search, not tree search). What could be simpler?

It turns out the Devil is in the details. There is an *inner* level of search hidden in "you generate successor states by applying actions to the current state". Where?

How do you know if an action applies in a state? Only if the preconditions successfully unify with the current state. That seems easy enough...you check each predicate in the conditions to see if it unifies with the current state and if it does, you use the substitution list on the action, the add and delete lists and create the successor state based on them.
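Once you have a substitution, building the successor state is mechanical. The following is a minimal sketch, assuming expressions are already parsed into nested lists of strings and a substitution is a plain dictionary from variable names to constants. The names `substitute` and `apply_action` are hypothetical and chosen only for this illustration.

```python
def substitute(expr, sub):
    """Replace every variable in a parsed expression that has a binding in `sub`."""
    return [substitute(term, sub) if isinstance(term, list) else sub.get(term, term)
            for term in expr]


def apply_action(state, action, sub):
    """Return the successor state: drop the delete list, then append the add list."""
    adds = [substitute(expr, sub) for expr in action["add"]]
    dels = [substitute(expr, sub) for expr in action["delete"]]
    kept = [fact for fact in state if fact not in dels]
    return kept + [fact for fact in adds if fact not in kept]


drive = {
    "action": ["drive", "?agent", "?from", "?to"],
    "add": [["at", "?agent", "?to"]],
    "delete": [["at", "?agent", "?from"]],
}
state = [["agent", "Me"], ["place", "Home"], ["place", "Store"], ["at", "Me", "Home"]]
sub = {"?agent": "Me", "?from": "Home", "?to": "Store"}

next_state = apply_action(state, drive, sub)
# next_state == [['agent', 'Me'], ['place', 'Home'], ['place', 'Store'], ['at', 'Me', 'Store']]
```

The original `state` list is left untouched, which matters because the search may need to back up and try a different action from the same state.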

Except for one small problem...there may be more than one way to unify an action with the current state. You must essentially search for all successful unifications of the candidate action and the current state. This is where my question through the semester applies: "how would you modify state space search to return all the paths to the goal?"

Unification can be seen as state space search by trying to unify the first precondition with the current state, progressively working your way through the precondition list. If you fail at any point, you may need to backtrack because there might have been another unification of that predicate that would succeed. Similarly, as already mentioned, there may be more than one.
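The backtracking described above can be sketched as a recursive generator. This is only an illustration under simplifying assumptions: facts are flat lists of strings, variables start with `?`, and the hypothetical `match` function stands in for the real `unification` from `unification.py`. It also makes no attempt to stop two different variables from binding to the same constant.

```python
def match(pattern, fact, sub):
    """Try to extend `sub` so that `pattern` equals `fact`; return None on failure."""
    if len(pattern) != len(fact):
        return None
    extended = dict(sub)
    for p, f in zip(pattern, fact):
        if p.startswith("?"):
            if extended.setdefault(p, f) != f:
                return None  # variable is already bound to a different constant
        elif p != f:
            return None
    return extended


def all_bindings(conditions, state, sub=None):
    """Yield every substitution that satisfies all conditions in the state."""
    sub = {} if sub is None else sub
    if not conditions:
        yield sub  # every precondition has been matched
        return
    first, rest = conditions[0], conditions[1:]
    for fact in state:  # each matching fact opens a new branch
        extended = match(first, fact, sub)
        if extended is not None:
            for result in all_bindings(rest, state, extended):
                yield result


state = [["place", "Home"], ["place", "Store"], ["agent", "Me"], ["at", "Me", "Home"]]
conditions = [["agent", "?agent"], ["place", "?from"], ["place", "?to"],
              ["at", "?agent", "?from"]]

bindings = list(all_bindings(conditions, state))
# Two bindings survive: ?to = Home and ?to = Store, both with ?from = Home.
```

Branches where `?from` is bound to `Store` die at the last condition, which is the "fail and backtrack" case. The `?to = Home` result is a drive from Home to Home; if you do not want such instances, you need an extra consistency check on top of this.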

So...by using unification and a properly defined <code>successors</code> function, you should be able to apply graph-based search to the problem and return a "path" through the states from the initial state to the goal. You'll definitely want to use graph-based search, since <code>(drive Me Home Store), (drive Me Store Home), (drive Me Home Store), (drive Me Store Home), (drive Me Home Store), (buy Me Store Drill), (drive Me Store Home)</code> is a valid plan, and without repeated-state detection nothing stops the search from driving back and forth like this.
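To see how an explored set cuts off the back-and-forth driving, here is a small sketch of depth-first graph search. It is not the planner itself: `toy_successors` is a hand-written, hypothetical successor function for the drill problem, and states are frozensets of fact strings so that they can be stored in a set.

```python
def depth_first_plan(start, is_goal, successors):
    """Depth-first graph search; `successors(state)` yields (action, next_state)."""
    explored = set()
    frontier = [(start, [])]
    while frontier:
        state, plan = frontier.pop()
        if is_goal(state):
            return plan
        if state in explored:
            continue
        explored.add(state)
        for action, nxt in successors(state):
            if nxt not in explored:  # never walk back into a visited state
                frontier.append((nxt, plan + [action]))
    return None


def toy_successors(state):
    """Hand-coded successors for the drill problem (illustration only)."""
    if "(at Me Home)" in state:
        yield "(drive Me Home Store)", (state - {"(at Me Home)"}) | {"(at Me Store)"}
    if "(at Me Store)" in state:
        yield "(drive Me Store Home)", (state - {"(at Me Store)"}) | {"(at Me Home)"}
        if "(at Drill Store)" in state:
            yield "(buy Me Store Drill)", (state - {"(at Drill Store)"}) | {"(at Drill Me)"}


toy_start = frozenset(["(at Me Home)", "(at Drill Store)"])
toy_goal = frozenset(["(at Me Home)", "(at Drill Me)"])

toy_plan = depth_first_plan(toy_start, lambda s: s == toy_goal, toy_successors)
# toy_plan == ['(drive Me Home Store)', '(buy Me Store Drill)', '(drive Me Store Home)']
```

With the `explored` checks removed, the same loop would be free to drive between Home and Store indefinitely.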

Your function should return the plan...but if you pass an extra debug=True parameter, it should also return the intermediate *states* as well as the actions.

-----

### Helper Functions ###

**applySubs(expr, sub)**  
This function applies the substitution to the expression given as the first argument. It goes over each part of the expression and replaces any variable with a constant if the variable has a substitution. If a part of the expression is itself a nested expression, it recurses into it and applies the same substitution.

**parseAll(ls)**  
The function serves as a general-purpose parser. It recursively parses nested lists and dictionaries, stopping when it reaches a string. It is used to parse states, goals, and all sub-elements of actions (e.g. preconditions, adds, deletes).

**deParse(expr)**  
Recursive de-parser of expressions. The function returns the S-expression form of the input.

**setifyState(state)**  
Encodes the current state as a set. The function first de-parses every element of the state into an S-expression and stores these S-expressions in a set. This allows two states to be compared easily, since two states are equal if and only if their sets of facts are the same.

**multiVarSameSub(estab, new)**  
This function checks whether two substitutions assign two different variables to the same object. It first builds the inverse dictionary of the `estab` input (well defined because the substitutions are kept 1-to-1), then looks through every assignment in `new` to make sure that no object already used in `estab` is bound to a different variable. It is used to detect inconsistent assignments during action instantiation.

**actionInstances(inConds, state)**  
The function instantiates an action from the preconditions of its schema and the input state. It performs a depth-first search: it takes the first precondition, attempts to unify it with every fact in the state, and starts a branch for each fact that matches. This continues until every precondition has been matched with a fact in the state.

Once the search is over, it returns every instantiation found as a substitution that maps each variable to a specific constant.

**updateState(states, action, subs)**  
The function takes a state, an action given as a dictionary with the keys `'add'` and `'delete'`, and a substitution representing an instantiated action, and returns the updated state. It first applies the substitution to the `'add'` and `'delete'` lists, then removes the deleted facts from the state and appends the added ones.

**reprPlan(path, plan, debug)**  
The function returns a string that lists the instantiated actions and the final state; if `debug` is True, the string also includes the starting and intermediate states.
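The reason for comparing states as sets can be shown in a few lines. This is a sketch only: `to_sexpr` and `state_key` are hypothetical names, and the sketch uses `frozenset` rather than `set`, a swapped-in choice that makes the key hashable so it could be stored in a Python `set` of explored states.

```python
def to_sexpr(expr):
    """Turn a parsed expression back into its S-expression string."""
    parts = [p if isinstance(p, str) else to_sexpr(p) for p in expr]
    return "(" + " ".join(parts) + ")"


def state_key(state):
    """Order-independent, hashable encoding of a state."""
    return frozenset(to_sexpr(fact) for fact in state)


# The same facts in a different order: unequal as lists, equal as sets.
state_a = [["at", "Me", "Home"], ["at", "Drill", "Store"]]
state_b = [["at", "Drill", "Store"], ["at", "Me", "Home"]]

same_as_lists = (state_a == state_b)                      # False
same_as_sets = (state_key(state_a) == state_key(state_b))  # True
explored = {state_key(state_a)}
already_seen = state_key(state_b) in explored              # True
```

Because applying add and delete lists reorders the facts, comparing the raw lists would treat a revisited state as new.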

In [3]:
def applySubs(expr, sub):
    return [applySubs(t,sub) if type(t)==list \
            else sub[t] if t in sub else t for t in expr]

def parseAll(ls):
    if type(ls) == str:
        return parse(ls)
    elif type(ls) == list:
        return [parseAll(l) for l in ls]
    elif type(ls) == dict:
        return dict([(k,parseAll(ls[k])) for k in ls])
    else:
        raise TypeError('Cannot parse type: ' + repr(type(ls)))

def deParse(expr):
    tmp = [e if type(e)==str else deParse(e) for e in expr]
    return '(' + ' '.join(tmp) + ')'

def setifyState(state):
    return set([deParse(st) for st in state])

def multiVarSameSub(estab, new):
    invEstab = dict((v, k) for k, v in estab.items())
    return any([invEstab[v]!=k for k,v in new.items() if v in invEstab])

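def actionInstances(inConds, state):
    acts = list()
    if len(inConds) > len(state):
        return acts # fewer facts than preconditions: treated as no match
    # stack of partial matches: (substitution so far, conditions still to match)
    p = [( {}, copy.deepcopy(inConds))]
    while len(p) > 0:
        sub, conds = p.pop()
        # try the next condition against every fact in the state
        for s in state:
            tmp = unification(conds[0],s)
            if tmp != False and not multiVarSameSub(sub,tmp):
                tmp.update(sub)
                if len(conds) == 1:
                    acts.append(tmp) # every condition has been matched
                else:
                    # branch: bind the remaining conditions and keep searching
                    p.append((tmp, [applySubs(c,tmp) for c in conds[1:]]))
    return acts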

def updateState(states, action, subs):
    adds = applySubs(action['add'], subs)
    dels = applySubs(action['delete'], subs)
    return [st for st in states if st not in dels] + adds

def reprPlan(path, plan, debug):
    out = []
    for n,pl in enumerate(plan):
        if debug:
            out.append('State: %s'%[deParse(st) for st in path[n]])
        out.append('Action: %s'%deParse(pl))
    out.append('Final state: %s'%[deParse(st) for st in path[-1]])
    return '\n'.join(out)

(you can just overwrite that one and add as many others as you need). Remember to follow the **Guidelines**.


-----

So you need to implement `forward_planner` as described above. `start_state`, `goal` and `actions` should all have the layout above and be s-expressions.

Your implementation should return the plan as a **List of instantiated actions**. If `debug=True`, you should print out the intermediate states of the plan as well.

### Main Program ###

**successors(state, actions, explored)**  
The function returns the successor states of the current state under the `actions` input. It uses `actionInstances(...)` to instantiate each action and, for every instance that does not lead to an already explored state, returns the new state, the substitution, and the instantiated action (still in parsed form; `reprPlan(...)` later turns it into text).

**forward_planner( start_state, goal, actions, debug=False)**  
The main program. It performs a depth-first search: it starts with `start_state`, obtains the successors, and pushes them onto a stack. It repeats until it finds a plan that leads to the goal state, then calls `reprPlan(...)` and returns the result.

The program first parses the starting state and the actions dictionary into a usable form. It stores the goal as a set so that each state can be converted to a set and compared against it to check whether the goal has been reached. If the program fails to find a solution, it returns `None`.

In [4]:
def successors(state, actions, explored):
    out = list()
    for a in actions:
        subs = actionInstances(actions[a]['conditions'], state)
        # print subs
        if len(subs) > 0:
            newStates = [updateState(state, actions[a], s) for s in subs]
            actionTxts = [applySubs(actions[a]['action'], s) for s in subs]
            out.extend( zip(newStates, subs, actionTxts) )
    return [(st,u,a) for st,u,a in out if setifyState(st) not in explored]


def forward_planner( start_state, goal, actions, debug=False):
    start_state,actions = parseAll(start_state), parseAll(actions)
    goal = setifyState(parseAll(goal))
    
    explored = []
    s = [([start_state], [])]    
    while len(s) > 0:
        (path, plan) = s.pop()
        state = path[-1] # most recent element on path is the current state
        if goal == setifyState(state):
            return reprPlan(path, plan, debug)
        explored.append(setifyState(state))
        instances = successors(state, actions, explored)
        for st, sub, txt in instances:
            s.append((path+[st], plan+[txt]))
    
    return None # failed to find solution

You will be solving the problem from above. Here is the start state:

In [5]:
start_state = [
    "(item Drill)",
    "(place Home)",
    "(place Store)",
    "(agent Me)",
    "(at Me Home)",
    "(at Drill Store)"
]

The goal state:

In [6]:
goal = [
    "(item Drill)",
    "(place Home)",
    "(place Store)",
    "(agent Me)",
    "(at Me Home)",
    "(at Drill Me)"
]

and the actions/operators:

In [7]:
actions = {
    "drive": {
        "action": "(drive ?agent ?from ?to)",
        "conditions": [
            "(agent ?agent)",
            "(place ?from)",
            "(place ?to)",
            "(at ?agent ?from)"
        ],
        "add": [
            "(at ?agent ?to)"
        ],
        "delete": [
            "(at ?agent ?from)"
        ]
    },
    "buy": {
        "action": "(buy ?purchaser ?seller ?item)",
        "conditions": [
            "(item ?item)",
            "(place ?seller)",
            "(agent ?purchaser)",
            "(at ?item ?seller)",
            "(at ?purchaser ?seller)"
        ],
        "add": [
            "(at ?item ?purchaser)"
        ],
        "delete": [
            "(at ?item ?seller)"
        ]
    }
}

In [8]:
plan = forward_planner( start_state, goal, actions)
print(plan)

Action: (drive Me Home Store)
Action: (buy Me Store Drill)
Action: (drive Me Store Home)
Final state: ['(item Drill)', '(place Home)', '(place Store)', '(agent Me)', '(at Drill Me)', '(at Me Home)']


In [9]:
plan_with_states = forward_planner( start_state, goal, actions, debug=True)
print(plan_with_states)

State: ['(item Drill)', '(place Home)', '(place Store)', '(agent Me)', '(at Me Home)', '(at Drill Store)']
Action: (drive Me Home Store)
State: ['(item Drill)', '(place Home)', '(place Store)', '(agent Me)', '(at Drill Store)', '(at Me Store)']
Action: (buy Me Store Drill)
State: ['(item Drill)', '(place Home)', '(place Store)', '(agent Me)', '(at Me Store)', '(at Drill Me)']
Action: (drive Me Store Home)
Final state: ['(item Drill)', '(place Home)', '(place Store)', '(agent Me)', '(at Drill Me)', '(at Me Home)']


### Own test ###

Using the example from class, check whether the program can solve the problem.

In [10]:
actions2 = {
     "pick_up": {
        "action": "(pick_up ?x)",
        "conditions": [
            "(on_table ?x)",
            "(clear ?x)",
            "(handempty)"
        ],
        "add": [
            "(holding ?x)"
        ],
        "delete": [
            "(on_table ?x)",
            "(clear ?x)",
            "(handempty)"
        ]
    },
    "put_down": {
        "action": "(put_down ?x)",
        "conditions": [
            "(holding ?x)"
        ],
        "add": [
            "(on_table ?x)",
            "(clear ?x)",
            "(handempty)"
        ],
        "delete": [
            "(holding ?x)"
        ]
    },
    "stack": {
        "action": "(stack ?x ?y)",
        "conditions": [
            "(holding ?x)",
            "(clear ?y)"
        ],
        "add": [
            "(on ?x ?y)",
            "(clear ?x)",
            "(handempty)"
        ],
        "delete": [
            "(holding ?x)",
            "(clear ?y)"
        ]
    },
    "unstack": {
        "action": "(unstack ?x ?y)",
        "conditions": [
            "(on ?x ?y)",
            "(clear ?x)",
            "(handempty)"
        ],
        "add": [
            "(holding ?x)",
            "(clear ?y)"
        ],
        "delete": [
            "(clear ?x)",
            "(on ?x ?y)",
            "(handempty)"
        ]
    }
}

In [11]:
start2 = [
    "(on_table B)",
    "(on_table A)",
    "(on C A)",
    "(clear B)",
    "(clear C)",
    "(handempty)"
]

goal2 = [
    "(on A C)",
    "(on C B)",
    "(clear A)",
    "(handempty)",
    "(on_table B)",
]

In [12]:
plan2 = forward_planner( start2, goal2, actions2, True)
print(plan2)

State: ['(on_table B)', '(on_table A)', '(on C A)', '(clear B)', '(clear C)', '(handempty)']
Action: (unstack C A)
State: ['(on_table B)', '(on_table A)', '(clear B)', '(holding C)', '(clear A)']
Action: (stack C B)
State: ['(on_table B)', '(on_table A)', '(clear A)', '(on C B)', '(clear C)', '(handempty)']
Action: (pick_up A)
State: ['(on_table B)', '(on C B)', '(clear C)', '(holding A)']
Action: (stack A C)
Final state: ['(on_table B)', '(on C B)', '(on A C)', '(clear A)', '(handempty)']
