# **Automated Planning**
Automated planning is an important problem-solving activity that consists of synthesizing a **sequence of actions** which, performed by an agent, leads from an initial state of the world to a given target state (**goal**).
In general it is **semi-decidable**, which makes it a hard problem for computers.

Given:
* An initial state;
* A set of actions;
* A state to achieve.

Find:
* A **plan**: partially or totally ordered set of actions needed to achieve the goal from the initial state.

An **automated planner** is an intelligent agent that operates in a certain domain described by:
* A representation of the **initial state** and the **goal**;
* A formal description of the executable actions.

It dynamically defines the plan.

# Domain theory
A planner relies on a **formal description** of the executable actions.
Each action is identified by a name and is modeled through:
* **Preconditions**: conditions that must hold for the action to be executable;
* **Postconditions**: the effects of the action.

# Planning vs. Execution
Planning is the solving process that decides the steps needed to solve a planning problem. It is:
* **Non-decomposable**: there can be interaction between subgoals;
* **Reversible**: choices made during plan generation are backtrackable because the planning activity is *offline*;
* The effects of the actions are deterministic.

A planner is:
* **Complete**: if it always finds a plan when one exists;
* **Correct**: if every solution it finds leads from the initial state to the goal.

Execution is the implementation of the plan. It is:
* **Irreversible**: often the execution of an action can't be undone, because the activity is *online*;
* **Non-deterministic**: actions can have effects different from the expected ones. Working in the real world **entails uncertainty**. In principle we can find a recovery plan.

# Generative planning
It's an **offline** form of planning that **produces the whole plan before execution**.
It works on a snapshot of the current state.
It's based on some assumptions:
* **Atomic time**: actions are uninterruptible;
* **Determinism**: the effects of the actions are deterministic;
* **Closed world assumption**: the initial state is **fully known** a priori and plan execution is the **only cause of change** in the world (no other agents).

It's opposed to reactive planning.

# Planning as search
Search is a general-purpose technique that may be computationally very expensive.
Planning can be seen as a forward search activity (from the initial state to the goal).
There are many different views of planning as search, which differ in what the states and the operators are:

Type | **States** | **Operators**
--- | --- | ---
Theorem proving | Set of propositions | Deductive rules
Search in state space | Set of propositions | Actions
Search in plan space | Partial plans | Plan refinement or completion moves

# Linear planning 
A linear planner formulates the planning problem as a search in the state space using classical search strategies. It produces a totally ordered sequence of actions.
The algorithm can proceed:
* **Forward**: starts from the initial state and proceeds until it finds a state that is a **superset** of the goal (a state that contains every fluent of the goal). It's not really computationally affordable: pruning or heuristics are needed to guide the search.
* **Backward**: starts from the goal and proceeds until it finds a state that is a **subset** of the initial state. It's based on **goal regression**.
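
The forward direction can be sketched as a plain breadth-first search over states. This is only a minimal illustration, assuming propositional fluents stored in sets; the `Action` tuple, the `forward_search` function and the toy domain are hypothetical names, not part of any specific planner.

```python
from collections import deque
from typing import NamedTuple


class Action(NamedTuple):
    name: str
    preconditions: frozenset
    add_list: frozenset
    delete_list: frozenset


def forward_search(initial, goal, actions):
    """Breadth-first forward search; returns a list of action names or None."""
    start = frozenset(initial)
    goal = frozenset(goal)
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, plan = frontier.popleft()
        if goal <= state:  # the state is a superset of the goal
            return plan
        for act in actions:
            if act.preconditions <= state:
                successor = (state - act.delete_list) | act.add_list
                if successor not in visited:
                    visited.add(successor)
                    frontier.append((successor, plan + [act.name]))
    return None


# Hypothetical toy domain: move an object from the table to a shelf.
actions = [
    Action("pick_up", frozenset({"hand_empty", "on_table"}),
           frozenset({"holding"}), frozenset({"hand_empty", "on_table"})),
    Action("put_on_shelf", frozenset({"holding"}),
           frozenset({"on_shelf", "hand_empty"}), frozenset({"holding"})),
]
plan = forward_search({"hand_empty", "on_table"}, {"on_shelf"}, actions)
print(plan)
```

The `visited` set is the only pruning used here; on realistic domains a heuristic would be needed to keep the frontier manageable.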


**Goal regression**: used in backward search, it's a mechanism that reduces a goal to subgoals during search by applying rules (actions). Effects of actions can be positive or negative: they are listed in the add-list and in the delete-list respectively.
Given a goal $G$ and a rule $R$, the regression of $G$ through $R$ is:
* `Regr[G,R] = true` if $G \in \text{Add-list}(R)$;
* `Regr[G,R] = false` if $G \in \text{Delete-list}(R)$;
* `Regr[G,R] = G` otherwise.
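
The three cases above translate almost literally into code. The sketch below assumes ground (variable-free) goals represented as strings; `regress_conjunction` is an added assumption that applies the same rule to every goal of a conjunction and then adds the preconditions of the rule as new subgoals.

```python
def regress(goal, add_list, delete_list):
    """Regr[G, R] for a single ground goal G."""
    if goal in add_list:
        return True       # R achieves G
    if goal in delete_list:
        return False      # R destroys G
    return goal           # G is untouched and must hold before R


def regress_conjunction(goals, preconditions, add_list, delete_list):
    """Subgoals that must hold before R, or None if R destroys a goal."""
    subgoals = set(preconditions)
    for goal in goals:
        result = regress(goal, add_list, delete_list)
        if result is False:
            return None
        if result is not True:
            subgoals.add(result)
    return subgoals


# Hypothetical blocks-world rule stack(a, b).
pre = {"holding(a)", "clear(b)"}
add = {"on(a,b)", "clear(a)", "handempty"}
delete = {"holding(a)", "clear(b)"}

subgoals = regress_conjunction({"on(a,b)", "on(b,c)"}, pre, add, delete)
print(subgoals)
```

Here `on(a,b)` is achieved by the rule, `on(b,c)` is carried over unchanged, and the preconditions of `stack(a, b)` become the new subgoals.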

# Deductive planning
Deductive planning uses first-order logic to represent states, goals and actions (as clauses) and generates a plan as the proof of a theorem.
There exist two formulations:
* **Green's formulation**, easier and more general purpose;
* **Kowalski's formulation**, much stronger.

Note: an implication $a \to b$ can be rewritten as the clause $\neg a \lor b$ (this is how rules are expressed as **Horn clauses**).

# Situation calculus
* **Situation**: a snapshot of the world describing the properties (**fluents**) that hold in a given state $s$;
* **Actions**: define which fluents are true as a consequence of an action (expressed as **clauses**).

**Frame problem**: every property that is true before an action and is untouched by it must also be true after the action.

# Green formulation
Green uses situation calculus to build a planner based on **logic resolution and unification**.
The planner looks for a proof of a formula containing a state variable.
At the end of the proof the state variable is instantiated to the plan that reaches the objective.
It's highly expressive and can describe complex problems, but it suffers from the **frame problem**, since we have to **explicitly list all fluents that change and those that do not change after a state transition**. It's not very efficient.

# Kowalski formulation
We use:
* A predicate $\text{holds}(\text{rel}, s/a)$ to describe all the relations $\text{rel}$ that are true in a given state $s$ or made true by the execution of an action $a$;
* A predicate $\text{poss}(s)$ that indicates whether a state $s$ is possible (reachable);
* A predicate $\text{pact}(A,s)$ that indicates that it is possible to execute an action $A$ in a state $s$ (the preconditions of $A$ are true in $s$).

**If** a state $s$ is possible **and** the preconditions of an action $A$ are satisfied in that state, **then** the state produced by the execution of $A$ is also possible: $$\text{poss}(s) \land \text{pact}(A,s) \to \text{poss}(\text{do}(A,s))$$

In this way we need one **frame assertion** per action.
We can use Prolog resolution to create plans.

# STRIPS (Stanford Research Institute Problem Solver)
It's an algorithm for plan construction.
It has a specific language for actions, with an easier syntax than situation calculus, so it's less expressive yet more efficient.
* **State representation**: the fluents that are true in a given state;
* **Goal representation**: the fluents that are true in the goal state. They can contain variables;
* **Action representation**: comes with three lists:
> * **Preconditions**: fluents that must be true to apply the move;
  * **Delete-list**: fluents that become false after the move;
  * **Add-list**: fluents that become true after the move.
  * Sometimes the add-list and the delete-list are merged into a single effect-list with positive and negative literals.

The frame problem is solved with the **STRIPS assumption**: everything that is not in the add-list or in the delete-list is unchanged.
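
A STRIPS-style action and the STRIPS assumption can be illustrated with a few lines of code. This is a minimal sketch for ground actions only (no variables); the `StripsAction` class and the blocks-world fluent names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StripsAction:
    name: str
    preconditions: frozenset
    delete_list: frozenset
    add_list: frozenset

    def applicable(self, state):
        return self.preconditions <= state

    def apply(self, state):
        if not self.applicable(state):
            raise ValueError(f"{self.name}: preconditions not satisfied")
        # STRIPS assumption: fluents in neither list are copied unchanged.
        return (state - self.delete_list) | self.add_list


unstack_a_b = StripsAction(
    name="unstack(a,b)",
    preconditions=frozenset({"on(a,b)", "clear(a)", "handempty"}),
    delete_list=frozenset({"on(a,b)", "clear(a)", "handempty"}),
    add_list=frozenset({"holding(a)", "clear(b)"}),
)

state = frozenset({"on(a,b)", "clear(a)", "handempty", "ontable(b)"})
new_state = unstack_a_b.apply(state)
print(sorted(new_state))
```

The fluent `ontable(b)` appears in neither list, so it survives the transition without any explicit frame axiom.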

# The algorithm
It's a linear planner based on backward search (goal regression: from the goal to the initial state).
The initial state is fully known (closed world assumption).
There are two data structures:
* **Goal stack**: a LIFO stack, so only the top element can be removed. It proceeds backward;
* **Description of the current state**: it proceeds forward.
> **Algorithm**:
  1. Initialize the stack with the goal to reach;
  1. While the stack isn't empty:
  * If $\text{top}(\text{stack})=a$ and $a\theta \subseteq S$ (the goal can be unified with the current state $S$), then $\text{pop}(a)$ and apply the substitution $\theta$ (unification in general) to the stack;
  * Else, if $\text{top}(\text{stack})=a$, then select a rule $R$ with $a \in \text{Add-list}(R)$, $\text{pop}(a)$, $\text{push}(R)$, $\text{push}(\text{Precond}(R))$;
  * Else, if $\text{top}(\text{stack})=a_1 \land \dots \land a_n$, then $\text{push}(a_1)$, $\text{push}(a_2)$, $\dots$, $\text{push}(a_n)$;
  * Else, if $\text{top}(\text{stack})=R$ (an action emerges from the stack), then $\text{pop}(R)$ and apply $R$ to $S$.

The problem is divided into subgoals which might interact (they are not independent).
There are many possible goal orderings, and reaching one goal may destroy another.
At each step we select one subgoal from the stack.
When we have a set of actions that reaches a goal, we execute them on the state, which proceeds forward.
The process goes on until the stack is empty.
When we find a conjunction of goals at the top of the stack, we need to check that it is still satisfied in the current state before removing it; if it's not, we have to reinsert it and change the order.
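
The goal-stack mechanism can be sketched for the propositional case. The code below is a simplification under stated assumptions: no variables (so no unification), no backtracking, the first rule whose add-list contains the goal is always chosen, and a step bound replaces a real termination argument. `Rule`, `goal_stack_plan` and the coffee domain are hypothetical names.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    name: str
    preconditions: frozenset
    add_list: frozenset
    delete_list: frozenset


def goal_stack_plan(initial, goals, rules, max_steps=1000):
    state = set(initial)                      # proceeds forward
    stack = [("and", tuple(sorted(goals)))]   # proceeds backward
    plan = []
    for _ in range(max_steps):
        if not stack:
            return plan
        kind, item = stack.pop()
        if kind == "goal":
            if item in state:
                continue                      # goal already satisfied
            rule = next((r for r in rules if item in r.add_list), None)
            if rule is None:
                return None                   # no rule achieves the goal
            stack.append(("rule", rule))
            stack.append(("and", tuple(sorted(rule.preconditions))))
        elif kind == "and":
            unmet = [g for g in item if g not in state]
            if unmet:
                stack.append(("and", item))   # re-check the conjunction later
                stack.extend(("goal", g) for g in unmet)
        else:                                 # a rule emerges: apply it
            state = (state - item.delete_list) | item.add_list
            plan.append(item.name)
    return None                               # step bound exceeded


empty = frozenset()
rules = [
    Rule("fill_kettle", empty, frozenset({"water"}), empty),
    Rule("grind_beans", empty, frozenset({"ground"}), empty),
    Rule("brew", frozenset({"water", "ground"}),
         frozenset({"coffee"}), frozenset({"water", "ground"})),
]
plan = goal_stack_plan(set(), {"coffee"}, rules)
print(plan)
```

Because the conjunction is pushed back before its subgoals, it is re-checked once they have been processed, which mirrors the reinsertion step described above.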

There are some pitfalls:
* The **search space is very large**: the choice of ordering is non-deterministic and several actions may be applicable to reduce a goal. A solution is to use heuristic strategies to select the goal and the action (e.g. means-ends analysis: find the most significant difference between the state and the goal and reduce that first);
* The **goals interact**: a complete solution is to try all possible orderings of goals and subgoals. In practice they're solved independently and verified afterwards: if the conjunction isn't true, the ordering is changed;
* **Sussman anomaly**: a blocks-world problem in which, whichever of the two goals is achieved first, achieving the other one destroys it, so the planner has to backtrack or re-achieve goals (not very efficient).

# Non-linear planning (partial order planning)
Non-linear planners are search algorithms that generate a plan by searching in the space of plans. In the search tree:
* Each **node** is a partial plan;
* **Operators** are plan refinement operations.

A non-linear generative planner relies on the closed world assumption.

**Least commitment planning**: never impose more restrictions than strictly necessary. This avoids making decisions before they're required, and so avoids much of the backtracking that wrong decisions would cause.

A non-linear plan is represented as:
* A set of actions (instances of operators);
* A non-exhaustive set of **orderings** between actions;
* A set of **causal links**.

The initial plan is empty, with two fake actions:
* **Start**: It has **no preconditions**. Its **effects match the initial state**;
* **Stop**: It has **no effects**. Its **preconditions match the goal**;
* **Ordering**: Start $<$ Stop.
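
The initial plan can be written down as a small data structure. This is a sketch under assumptions: `Step`, `PartialPlan`, `initial_plan` and `open_goals` are hypothetical names, and fluents are plain strings.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Step:
    name: str
    preconditions: frozenset = frozenset()
    effects: frozenset = frozenset()


@dataclass
class PartialPlan:
    steps: set = field(default_factory=set)
    orderings: set = field(default_factory=set)      # pairs (before, after)
    causal_links: set = field(default_factory=set)   # (producer, consumer, condition)


def initial_plan(initial_state, goal):
    start = Step("Start", effects=frozenset(initial_state))  # no preconditions
    stop = Step("Stop", preconditions=frozenset(goal))       # no effects
    return PartialPlan(steps={start, stop}, orderings={(start, stop)})


def open_goals(plan):
    """Preconditions not yet supported by a causal link."""
    supported = {(consumer, cond) for _, consumer, cond in plan.causal_links}
    return {(step, cond)
            for step in plan.steps
            for cond in step.preconditions
            if (step, cond) not in supported}


plan = initial_plan({"at_home"}, {"at_work"})
pending = {(step.name, cond) for step, cond in open_goals(plan)}
print(pending)
```

At the beginning the only open goals are the preconditions of `Stop`, i.e. the goal itself.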

At each step the set of operators, the set of orderings or the set of causal links is extended until all goals are met (e.g. by adding an action or an ordering).

A **solution** is a set of partially specified and partially ordered operators.
To obtain an executable plan we must linearize it, i.e. choose a total order consistent with the ordering constraints.
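
Linearization is a topological sort of the ordering constraints. A minimal sketch using `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later); the `linearize` helper and the socks-and-shoes steps are hypothetical.

```python
from graphlib import TopologicalSorter


def linearize(steps, orderings):
    """Return one total order consistent with the pairs (before, after)."""
    sorter = TopologicalSorter()
    for step in steps:
        sorter.add(step)
    for before, after in orderings:
        sorter.add(after, before)       # 'before' is a predecessor of 'after'
    return list(sorter.static_order())  # raises CycleError if inconsistent


steps = ["Start", "left_sock", "right_sock", "left_shoe", "right_shoe", "Stop"]
orderings = {
    ("Start", "left_sock"), ("Start", "right_sock"),
    ("left_sock", "left_shoe"), ("right_sock", "right_shoe"),
    ("left_shoe", "Stop"), ("right_shoe", "Stop"),
}
order = linearize(steps, orderings)
print(order)
```

Several linearizations are valid here (the two socks are unordered with respect to each other), so the exact sequence printed is just one of them.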

> **High level algorithm**:
 * While the plan is not complete:
 1. Select an action $\text{SN}$ that has an unsatisfied precondition $\text{C}$ (**open goal**);
 1. Select an action $\text{S}$ (new or already in the plan) that has $\text{C}$ among its effects;
 1. Add the ordering constraint $\text{S}<\text{SN}$;
 1. If $\text{S}$ is a new action, add the constraint $\text{Start}<\text{S}<\text{Stop}$;
 1. Add the causal link $\langle\text{S},\text{SN},\text{C}\rangle$ (the plans are interacting with each other);
 1. Solve any threat on causal links;
 * End

# Causal links and threats
In case of failure, if a choice point exists the algorithm backtracks and explores alternatives.
 * A **causal link** is a triple (data structure) that consists of two operators $S_i$, $S_j$ and a subgoal $C$. $C$ must be a precondition of $S_j$ and an effect of $S_i$, written $S_i \overset{C}{\to} S_j$;
 * A causal link stores the causal relationship between actions and helps tackle the problem of interacting goals.

An action $\text{S3}$ is a **threat** for a causal link $\langle\text{S1},\text{S2},\text{C}\rangle$ if it has an effect that negates $\text{C}$ and no ordering constraint prevents $\text{S3}$ from being performed between $\text{S1}$ and $\text{S2}$.
Solutions can be:
* **Demotion**: the constraint $\text{S3}<\text{S1}$ is imposed;
* **Promotion**: the constraint $\text{S2}<\text{S3}$ is imposed.
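
Threat detection and the two repairs can be sketched on top of a set of ordering pairs. The helper names (`precedes`, `is_threat`, `resolve_threat`) are hypothetical, steps are identified by name, and the negative effects of the candidate threat are passed in as a set of deleted fluents.

```python
def precedes(orderings, a, b):
    """True if a < b follows (transitively) from the ordering constraints."""
    frontier, seen = [a], {a}
    while frontier:
        node = frontier.pop()
        for before, after in orderings:
            if before == node and after not in seen:
                if after == b:
                    return True
                seen.add(after)
                frontier.append(after)
    return False


def is_threat(step, deleted, link, orderings):
    producer, consumer, condition = link
    return (condition in deleted
            and not precedes(orderings, step, producer)
            and not precedes(orderings, consumer, step))


def resolve_threat(step, link, orderings):
    """Return the consistent ordering sets obtained by demotion or promotion."""
    producer, consumer, _ = link
    options = []
    for before, after in [(step, producer), (consumer, step)]:
        if not precedes(orderings, after, before):   # would not create a cycle
            options.append(orderings | {(before, after)})
    return options


orderings = {("Start", "S2"), ("S2", "Stop"), ("Start", "S3"), ("S3", "Stop")}
link = ("Start", "S2", "C")
threat = is_threat("S3", {"C"}, link, orderings)
options = resolve_threat("S3", link, orderings)
print(threat, len(options))
```

In this example the producer is `Start`, so demotion would require `S3 < Start` and is rejected as inconsistent; only promotion (`S2 < S3`) remains.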



# Partial order planning algorithm (POP)
> function `POP(InitialGoal, Operators)` return `plan`:
 * `plan` $:=$ `InitialPlan(Start, Stop, InitialGoal)`;
 * Loop:
 1. If `Solution(plan)` then return `plan`;
 1. $\text{SN}$, $\text{C}$ $:=$ `SelectSubgoal(plan)`;
 1. `ChooseOperator(plan, Operators, SN, C)`;
 1. `ResolveThreats(plan)`;
 * End.

 > function `SelectSubgoal(plan)` return $\text{SN}$, $\text{C}$:
 * Select $\text{SN}$ from `Steps(plan)` with an unsolved precondition $\text{C}$.

 > procedure `ChooseOperator(plan, Operators, SN, C)`:
 * Pick an $\text{S}$ with effect $\text{C}$ from $\text{Operators}$ or from `Steps(plan)`;
 * If $\text{S}$ doesn't exist then return a `fail`;
 * Add the causal link $\langle\text{S},\text{SN},\text{C}\rangle$;
 * Add the ordering constraint $\text{S}<\text{SN}$;
 * If $\text{S}$ is a new action added to the plan then add $\text{S}$ to `Steps(plan)` and add the constraint $\text{Start}<\text{S}<\text{Stop}$.

 > procedure `ResolveThreats(plan)`:
 * For each action $\text{S}$ that threatens a causal link between $S_i$ and $S_j$, choose either:
 * Demotion: add the constraint $\text{S}<S_i$;
 * Promotion: add the constraint $S_j<\text{S}$;
 * If `NotConsistent(plan)` then return a `fail`.

# Modal Truth Criterion (MTC)
As we have seen before, we have to protect every causal link with a proper ordering.
A partial order planning algorithm interleaves goal achievement steps with threat protection steps.
*Promotion* and *demotion* alone are not enough to ensure **completeness** (a complete planner always finds a solution if it exists).
The **Modal Truth Criterion** is a construction process that guarantees the planner's completeness.
It provides five operators to move in the space of *plan refinements*:
* **Establishment**: open goal achievement through:
> 1. A new action to be inserted;
  1. An ordering constraint with an action already in the plan;
  1. A variable assignment (i.e. unification).

* **Promotion**: an ordering constraint that places the threatening action **after** the causal link;
* **Demotion**: an ordering constraint that places the threatening action **before** the causal link;
* **White knight**: used when a threat can't be solved with an ordering. Insert a new operator, or use one already in the plan, between $S$ and $SN$ such that it **re-establishes** the precondition of $SN$ threatened by $S$;
* **Separation**: insert *non-codesignation* constraints between the variables of the negative effect and the threatened precondition so as to avoid unification. Rarely used.

It's always preferable to apply promotion and demotion in order to keep the number of actions limited.
Non-linear planners tend to generate very inefficient plans, even if correct.

Planning is **semi-decidable**: if there's a plan that solves a problem the planner finds it, but if there's not, the planner can work indefinitely.


# Hierarchical planning
Hierarchical planners are search algorithms that manage the creation of complex plans at **different levels of abstraction**, by considering the simplest details only after finding a solution for the most difficult ones.

Given a goal, the hierarchical planner performs a **meta-level search** to generate a **meta-level plan** which leads *from a state that is very close to the initial one to a state which is very close to the goal*.
The plan is then completed with a lower level search, taking account of details omitted at the previous level.

A hierarchical algorithm must be able to:
* Organize the meta-levels well;
* Expand abstract plans into concrete plans (planning abstract parts in terms of more specific actions and then expanding already prebuilt plans).

Note: at each level of the meta-search we may use different algorithms such as STRIPS or POP.

# ABSTRIPS
At every level of abstraction we consider only some of the preconditions, selected by comparing a threshold with the **criticality value** of each precondition, which is proportional to the complexity of achieving it.
The algorithm proceeds through different abstraction spaces, refining the plan by changing the threshold.
At each level, lower-level preconditions are ignored.

ABSTRIPS fully explores the space of a certain level of abstraction before moving on to a more detailed level (length-first search).

> **Algorithm**:
1. A threshold value is fixed;
1. All preconditions whose criticality is lower than the threshold are considered true;
1. STRIPS (or another planner: a different one may be used at each level) finds a plan that meets all the considered preconditions;
1. The plan obtained is then used as a guide and the value of the threshold is lowered;
1. The plan is extended with operators that meet the new preconditions;
1. The threshold is lowered again until all the preconditions are considered.
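
The effect of the threshold can be illustrated by listing which preconditions are considered at each level. This sketch only covers the filtering step, not the planning performed at each level; the criticality values and the fluent names are made-up examples.

```python
def considered_preconditions(criticality, threshold):
    """Preconditions below the threshold are assumed true and dropped."""
    return {p for p, value in criticality.items() if value >= threshold}


def abstraction_levels(criticality):
    """Yield (threshold, considered preconditions) from abstract to detailed."""
    for threshold in sorted(set(criticality.values()), reverse=True):
        yield threshold, considered_preconditions(criticality, threshold)


# Hypothetical preconditions of an action "go through the door".
criticality = {
    "connects(door,r1,r2)": 3,
    "in_room(robot,r1)": 2,
    "door_open": 1,
}
levels = list(abstraction_levels(criticality))
for threshold, preconditions in levels:
    print(threshold, sorted(preconditions))
```

Each level adds preconditions to the ones of the previous level, so a plan found at a higher threshold remains a skeleton for the more detailed ones.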

# Macro-operators
We have two kinds of operators:
* **Atomic operators**: represent elementary actions that can be directly performed by an agent;
* **Macro operators**: represent a set of elementary actions and can be decomposed into atomic operators.

Before execution, macro operators must be decomposed. The decomposition can be *precompiled* or *planned*:
* **Pre-compiled decomposition**: the description of the macro operator also contains its *decomposition*, that is the sequence of basic operators to be executed at run-time;
* **Planned decomposition**: the planner must perform a low-level search to synthesize the plan of atomic actions that implements the macro action.
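
A pre-compiled decomposition is essentially a lookup table from macro operators to sequences of operators. The sketch below assumes a totally ordered plan given as a list of names and an acyclic table; `expand` and the tea-making operators are hypothetical.

```python
def expand(plan, decompositions):
    """Recursively replace macro operators with their stored decomposition."""
    result = []
    for step in plan:
        if step in decompositions:       # macro operator
            result.extend(expand(decompositions[step], decompositions))
        else:                            # atomic operator
            result.append(step)
    return result


decompositions = {
    "make_tea": ["boil_water", "steep"],
    "boil_water": ["fill_kettle", "switch_on"],
}
expanded = expand(["make_tea", "drink"], decompositions)
print(expanded)
```

With a planned decomposition the table lookup would be replaced by a call to a low-level planner for each macro step.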

The planning algorithm can be either linear or non-linear.
A hierarchical non-linear algorithm is similar to POP where at each step one can choose between:
* Reach an open goal with an operator (either atomic or macro);
* Expand a macro step of the plan (the decomposition can be either precompiled or planned).

> function `HD_POP(InitialGoal, methods, operators)` return `plan`
* `plan` $\leftarrow$ `InitialPlan(Start, Stop, InitialGoal)`;
* Loop:
1. If `Solution(plan)` then return `plan`;
1. Else, choose between:
* $SN$, $C$ $\leftarrow$ `SelectSubgoal(plan)`;
* `ChooseOperator(plan, operators, SN, C)`.
2. Or:
* $S_{\text{nonPrim}}$ $\leftarrow$ `SelectMacroStep(plan)`;
* `ChooseDecomposition(SnonPrim, methods, plan)`.
3. `SolveThreats(plan)`.

# Decomposition
To ensure decomposition is safe, some properties must be guaranteed.
If the macro action $A$ has the effect $X$ and is expanded with the plan $P$:
1. $X$ must be the effect of at least one of the actions into which $A$ is decomposed and must be protected until the end of the plan $P$;
1. Each precondition of the actions in $P$ must be guaranteed by the previous actions in $P$ or must be a precondition of $A$;
1. The actions of $P$ must not threaten any causal link when $P$ is substituted for $A$ in the plan.

Under these conditions you can replace the macro action $A$ with the plan $P$. When replacing, orderings and causal links should be added:
1. **Orderings**
* For each $B$ such that $B<A$, then $B<\text{first}(P)$ is imposed (first action of $P$);
* For each $B$ such that $A<B$, then $\text{last}(P)<B$ is imposed (last action of $P$);

2. **Causal links**
* If $\langle S,A,C \rangle$ is a causal link in the initial plan, then it must be replaced by a set of causal links $\langle S,S_i,C \rangle$, where the $S_i$ are the actions of $P$ that have $C$ as a precondition and such that no earlier step of $P$ has $C$ as a precondition;
* If $\langle A,S,C \rangle$ is a causal link in the initial plan, then it must be replaced by a set of causal links $\langle S_i,S,C \rangle$, where the $S_i$ are the actions of $P$ that have $C$ as an effect and such that no later step of $P$ has $C$ as an effect.
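
The ordering part of the substitution is mechanical and can be sketched as follows. The function name and the example steps are hypothetical, and the orderings internal to $P$ are assumed to be added separately.

```python
def replace_orderings(orderings, macro, first, last):
    """Rewrite the constraints on `macro` onto the first/last action of P."""
    rewritten = set()
    for before, after in orderings:
        if after == macro:            # B < A becomes B < first(P)
            rewritten.add((before, first))
        elif before == macro:         # A < B becomes last(P) < B
            rewritten.add((last, after))
        else:
            rewritten.add((before, after))
    return rewritten


orderings = {("Start", "build_house"), ("build_house", "Stop"),
             ("Start", "buy_land")}
rewritten = replace_orderings(orderings, "build_house",
                              first="lay_foundation", last="add_roof")
print(sorted(rewritten))
```

Causal links would be rewritten in a similar pass, selecting the consuming and producing steps of $P$ as described in the two rules above.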

# Execution
Generative planners build plans that are then executed by an executing agent.
Different problems may arise, such as:
* An action should be executed but its preconditions are not satisfied, due to **incomplete/incorrect knowledge**, **unexpected conditions** or the relaxation of the **closed world assumption**;
* Action effects are not the expected ones, due to **errors of the agent** or **non-deterministic effects**.

While executing, the agent should *perceive* the changes in the world and act accordingly, using sensors to update the description of the world.

Some planners run under the **open world assumption**, where information that isn't explicitly stated in a state isn't false but **unknown**.
The unknown information can be retrieved via **sensing actions** added to the plan, modeled like causal actions:
* Preconditions are conditions that must be true to perform a certain observation;
* Postconditions are the result of the observation.

There are two possible approaches: integration between planning and execution, or **conditional planning**.


# Conditional planning
A conditional planner is a search algorithm that generates various alternative plans for each source of uncertainty of the plan.
It consists of:
* Causal actions;
* Sensing actions;
* Several alternative partial plans of which only one will be executed depending on the results of the observations.
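
The execution of a conditional plan can be sketched as a walk over a nested structure, where a sensing action selects one of two alternative sub-plans. The plan encoding (tuples for sensing steps) and the `sense` callback are assumptions made for this illustration.

```python
def execute(plan, sense):
    """Return the causal actions actually executed.

    A step is either an action name (causal action) or a tuple
    (observation, plan_if_true, plan_if_false) standing for a sensing action.
    """
    executed = []
    for step in plan:
        if isinstance(step, tuple):
            observation, if_true, if_false = step
            branch = if_true if sense(observation) else if_false
            executed.extend(execute(branch, sense))
        else:
            executed.append(step)
    return executed


plan = [
    "go_to_door",
    ("door_open", ["walk_through"], ["open_door", "walk_through"]),
    "enter_room",
]
world = {"door_open": False}          # hypothetical real state of the world
executed = execute(plan, lambda observation: world[observation])
print(executed)
```

Only one branch is ever executed, but both must be stored in the plan, which is where the growth of the search tree comes from.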

This process is extremely demanding: it causes an **explosion** of the search tree.
A comprehensive plan which takes account of every possible contingency might require a lot of memory, and not all alternatives are always known in advance.

Conditional planners are often associated with probabilistic planners that plan only for the most probable contexts. They're called **contingency planners**.

# Reactive planning
Reactive planners are non-generative online algorithms, capable of interacting with the world and dealing with **dynamicity** and **non-determinism**. They:
* Observe the world during the planning stage;
* Acquire unknown information;
* Monitor the implementation of actions and check the effects;
* Interleave planning and execution.

Pure reactive planners do not plan but react as triggers to world variations (closer to control systems, but based on rules).
They have access to a knowledge base that describes what actions must be carried out and under what circumstances.
They choose one action at a time without any lookahead.
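
A pure reactive planner reduces to a prioritized list of condition-action rules evaluated on the current percept. A minimal sketch; the rule base, the percept keys and the default action are made-up examples.

```python
def react(percept, rules, default="move_forward"):
    """Return the action of the first rule whose condition matches."""
    for condition, action in rules:
        if condition(percept):
            return action
    return default


# Hypothetical knowledge base of a cleaning robot, in priority order.
rules = [
    (lambda p: p["obstacle_ahead"], "turn_left"),
    (lambda p: p["dirty"], "clean"),
]

percepts = [
    {"obstacle_ahead": False, "dirty": True},
    {"obstacle_ahead": True, "dirty": True},
    {"obstacle_ahead": False, "dirty": False},
]
chosen = [react(p, rules) for p in percepts]
print(chosen)
```

No model of the world and no lookahead are involved: each action depends only on the latest percept.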

* **Pros**:
1. Able to interact with the real system. Robust in domains for which it's difficult to provide complete and accurate models;
1. Don't use models but perceive changes. Extremely fast responses.
* **Cons**:
1. Their performance in predictable domains is low.

# Hybrid systems
Modern responsive planners are **hybrids**.
They integrate **generative** and **reactive** approaches in order to exploit the computational capacity of the former and the ability to interact with the system of the latter, thus addressing the problem of execution.
They:
* Generate a plan to achieve the goal (offline);
* Check the preconditions of the action that is about to run and the effects of the previously executed action;
* Backtrack the effects and reschedule in case of failures (if possible: some actions aren't backtrackable);
* **Correct the plan if unforeseen events occur**.
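
The interleaving of offline planning and monitored execution can be sketched as a loop. This is an illustration under assumptions: the generative part is a simple breadth-first planner, failures are handled by replanning from the observed state (effects are not undone), and `act` is a hypothetical callback that returns the state actually observed after executing an action.

```python
from collections import deque


def plan_bfs(state, goal, actions):
    """Generative part: breadth-first search; actions map name -> (pre, add, delete)."""
    frontier, seen = deque([(state, [])]), {state}
    while frontier:
        current, plan = frontier.popleft()
        if goal <= current:
            return plan
        for name, (pre, add, delete) in actions.items():
            if pre <= current:
                nxt = (current - delete) | add
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, plan + [name]))
    return None


def run(initial, goal, actions, act, max_replans=3):
    """Reactive part: check preconditions and effects, replan on a mismatch."""
    state, goal = frozenset(initial), frozenset(goal)
    plan = plan_bfs(state, goal, actions)
    trace, replans = [], 0
    while plan:
        name = plan[0]
        pre, add, delete = actions[name]
        expected = (state - delete) | add
        ok = pre <= state                      # precondition check
        if ok:
            state = frozenset(act(name, state, expected))
            trace.append(name)
            ok = state == expected             # effect check
        if ok:
            plan = plan[1:]
        else:
            if replans == max_replans:
                return trace, False
            replans += 1
            plan = plan_bfs(state, goal, actions)
    return trace, goal <= state


actions = {
    "grasp": ({"hand_empty"}, {"holding"}, {"hand_empty"}),
    "place": ({"holding"}, {"placed", "hand_empty"}, {"holding"}),
}
failures = {"grasp": 1}                        # the first grasp slips


def flaky_act(name, state, expected):
    if failures.get(name, 0) > 0:
        failures[name] -= 1
        return state                           # nothing changed in the world
    return expected


trace, success = run({"hand_empty"}, {"placed"}, actions, flaky_act)
print(trace, success)
```

The failed `grasp` is detected by the effect check, a new plan is generated from the observed state, and execution resumes.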