# **Constraints**
Many AI problems can be seen as constraint satisfaction problems.
The objective is to find a state that meets a given set of constraints.

# A toy example: eight queens problem
Given an $8 \times 8$ chessboard, the problem consists of placing eight queens so that no two of them attack each other.
A queen attacks every square on its own row, column and diagonals.

This problem can be modelled with variables, domains and constraints, and then solved through search strategies (states and operators).

**Model 1**
* The $8 \times 8$ squares of the board are represented by $64$ variables $x_{ij}$;
* Assigning $x_{ij} = 1$ means that the square holds a queen, while $x_{ij} = 0$ means that the square is free.
The domain of each variable is therefore $\{0,1\}$;
* The constraints state that no two variables on the same row, column or diagonal can be $1$ at the same time.

**Model 2**
* The eight queens are represented by $8$ variables $x_1, x_2, \dots , x_8$, whose subscript refers to the column occupied by the corresponding queen;
* The assignment $x_i = k$, with $k \in [1,8]$, indicates that the corresponding queen is placed on the $k$-th row of the $i$-th column.

**Constraints**
1. Domain: $1 \le x_i \le 8$ for $1 \le i \le 8$;
1. Rows: $x_i \neq x_j$ for $1 \le i < j \le 8$;
1. Diagonals: $x_i \neq x_j + (j-i)$ and $x_i \neq x_j - (j-i)$ for $1 \le i < j \le 8$.

The first constraint is a unary constraint, since it involves just one variable; the others are binary constraints.
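The three constraints of Model 2 can be checked mechanically on a complete assignment. The following is a minimal sketch, not a solver: the function name `is_valid_placement` and the 0-based list layout (`x[i]` is the row of the queen in column `i + 1`) are assumptions made for illustration.

```python
from itertools import combinations


def is_valid_placement(x):
    """Check the Model 2 constraints; x[i] is the row of the queen in column i + 1."""
    n = len(x)
    # Domain constraint (unary): every row index lies in 1..n.
    if any(not 1 <= row <= n for row in x):
        return False
    for i, j in combinations(range(n), 2):  # all pairs with i < j
        # Row constraint (binary): two queens cannot share a row.
        if x[i] == x[j]:
            return False
        # Diagonal constraint (binary): row distance must differ from column distance.
        if abs(x[i] - x[j]) == j - i:
            return False
    return True


print(is_valid_placement([1, 5, 8, 6, 3, 7, 2, 4]))  # True
print(is_valid_placement([1, 2, 3, 4, 5, 6, 7, 8]))  # False: all on one diagonal
```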

# Scheduling as a CSP
**Scheduling**: assign tasks with a certain duration to resources at a given time. Resources can be shared;

**Variables**: start time of the activities;

**Domains**: possible start times of the activities;

**Constraints**:
1. Activities can be ordered: $\text{Start}_1 + \text{Duration}_1 \le \text{Start}_2$;
1. Activities that use same resources can't overlap: $\text{Start}_1 + \text{Duration}_1 \le \text{Start}_2 \lor \text{Start}_2 + \text{Duration}_2 \le \text{Start}_1$.
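The two scheduling constraints translate directly into predicates over start times and durations. This is a small sketch; the function names `ordered` and `no_overlap` are hypothetical, and times are assumed to be plain numbers.

```python
def ordered(start1, duration1, start2):
    """Precedence: activity 1 must finish before activity 2 starts."""
    return start1 + duration1 <= start2


def no_overlap(start1, duration1, start2, duration2):
    """Shared resource: one activity finishes before the other starts (disjunction)."""
    return ordered(start1, duration1, start2) or ordered(start2, duration2, start1)


print(no_overlap(0, 3, 3, 2))  # True: activity 2 starts exactly when activity 1 ends
print(no_overlap(0, 3, 2, 2))  # False: the two activities overlap on [2, 3)
```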

# Map colouring as CSP
**Map colouring**: colour the regions of a map in such a way that contiguous regions have different colours;

**Variables**: regions;

**Domains**: colours;

**Constraints**: adjacent regions must have different colours.

It has been proven that four colours are sufficient for every planar map.
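As an illustration of this model, the sketch below checks a colouring against an adjacency relation. The map used (the states of Australia, a common textbook example) and the names `adjacent` and `is_proper_colouring` are assumptions, not part of the original notes.

```python
# Variables: regions. Constraints: one "different colour" constraint per border.
adjacent = {
    ("WA", "NT"), ("WA", "SA"), ("NT", "SA"), ("NT", "Q"), ("SA", "Q"),
    ("SA", "NSW"), ("SA", "V"), ("Q", "NSW"), ("NSW", "V"),
}


def is_proper_colouring(colouring, adjacent):
    """True if every pair of adjacent regions has different colours."""
    return all(colouring[a] != colouring[b] for a, b in adjacent)


colouring = {"WA": "red", "NT": "green", "SA": "blue", "Q": "red",
             "NSW": "green", "V": "red", "T": "red"}
print(is_proper_colouring(colouring, adjacent))                   # True
print(is_proper_colouring({**colouring, "NT": "red"}, adjacent))  # False
```

Three colours are enough for this particular map; the four-colour bound is a worst case.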

# Other examples
Cryptarithmetic puzzles, Sudoku, etc.

# Constraint satisfaction problem
A constraint satisfaction problem is defined on a finite set of variables:
* $(x_1, x_2, \dots, x_n)$ decisions that we have to take;
* $(D_1, D_2, \dots, D_n)$ domains of possible values (grid domains);
* A set of constraints.

A constraint $c(x_{i1}, x_{i2}, \dots, x_{ik})$ between $k$ variables is a subset of the Cartesian product $D_{i1} \times D_{i2} \times \dots \times D_{ik}$ that specifies which values of the variables are compatible with each other. This subset doesn't have to be listed explicitly: it can be represented in terms of relations.

A **solution** provides an **assignment of all the variables that satisfies all the constraints**.
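The definition can be mirrored in a small data structure. The sketch below is one possible encoding, not a standard API: the class name `CSP`, the `(scope, predicate)` representation of constraints and the example problem are all assumptions. It also shows a constraint both as a relation ($x_1 < x_2$) and as the explicit subset of the Cartesian product it stands for.

```python
from dataclasses import dataclass, field
from itertools import product


@dataclass
class CSP:
    domains: dict                                    # variable name -> set of values
    constraints: list = field(default_factory=list)  # (scope, predicate) pairs

    def is_solution(self, assignment):
        """A solution assigns every variable a domain value satisfying all constraints."""
        if set(assignment) != set(self.domains):
            return False
        if any(assignment[v] not in self.domains[v] for v in self.domains):
            return False
        return all(pred(*(assignment[v] for v in scope))
                   for scope, pred in self.constraints)


csp = CSP(domains={"x1": {1, 2, 3}, "x2": {1, 2, 3}},
          constraints=[(("x1", "x2"), lambda a, b: a < b)])

# The same constraint written extensionally, as a subset of D1 x D2.
allowed = {(a, b) for a, b in product(csp.domains["x1"], csp.domains["x2"]) if a < b}
print(sorted(allowed))                      # [(1, 2), (1, 3), (2, 3)]
print(csp.is_solution({"x1": 1, "x2": 3}))  # True
print(csp.is_solution({"x1": 3, "x2": 1}))  # False
```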

# CSPs as search
CSPs can be solved through search:
* **Initial state**: empty assignment $\{\quad\}$;
* **Successor function**: assigns a value to a variable not yet assigned (fail if there's none);
* **Goal**: complete assignment.

This scheme is the same for all CSPs: a depth-first search with depth limited to $n$, where $n$ is the number of variables. The path to the solution is irrelevant, and the tree has $d^n$ leaves, where $d$ is the cardinality of the domains. Of course, this leads to a *combinatorial explosion*.

A possible search tree for a CSP is obtained after establishing an order for variables: each level of the tree corresponds to a variable and each node corresponds to a possible value assignment.
Each leaf of the tree then represents an assignment of values to all variables. If this assignment satisfies all constraints, the corresponding leaf represents a solution to the problem; otherwise it represents a failure.
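As a worked instance of the $d^n$ leaf count, consider the two models of the eight queens problem, with the domain sizes and numbers of variables given in their definitions:

$$
\text{Model 1: } d^n = 2^{64} \approx 1.8 \times 10^{19}, \qquad \text{Model 2: } d^n = 8^{8} = 16\,777\,216.
$$

The choice of the model alone changes the size of the search tree by about twelve orders of magnitude.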

# Two approaches for the search
1. **Propagation algorithms**: based on the propagation of constraints to eliminate a priori, **while** searching, the portions of the tree that would lead to a failure. Two techniques: forward checking (FC) and look ahead (LA).
1. **Consistency techniques**: based on the propagation of constraints in order to derive a problem simpler than the original one. Two techniques: generate and test (GT) and standard backtracking (SB).

Consider a depth-first search. It assigns a variable at a time. At each step we either:
* Find a solution;
* Discover a failure;
* Assign another variable.

The algorithm has three degrees of freedom:
* The choice of the variable ordering (doesn't affect completeness);
* The choice of the ordering of values to be assigned to the current variable;
* The propagation carried out in each node.

The first two relate to the search heuristic; the last one is what differentiates the approaches.


# Propagation algorithms
Propagation algorithms are smart search methods that exploit constraints to prevent failures rather than recover from them (after a failure the search must *backtrack*, which is computationally expensive).
Constraints provide an **a priori pruning** of the search tree: they reduce the search space before a failure is reached, eliminating the subtrees that would lead to it (an **assignment can prune an entire subtree**).

**Constrain and generate paradigm**: a module propagates the constraints as much as possible (*constrain*). At the end of the propagation we have reached a solution, a failure, or a state where new information is needed (*generate*). The two techniques below perform an increasing number of checks on the free variables.

1. **Forward checking**: after each assignment of a variable $x_i$, FC propagates all the constraints between $x_i$ and the variables that are not yet instantiated (it performs a *domain reduction*).

This method is very effective, especially when the domains of the free variables are reduced to a small set of allowed values (so they are easy to assign).
> * If the domain associated with a free variable has only one value left, the assignment can be performed without computational effort;
> * If a domain becomes empty, the FC algorithm fails and backtracks.

This method is based on the observation that assigning a value to a variable has an impact on the values available for the free variables; the constraints act forward and reduce the space of solutions before it is explored.
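Forward checking can be sketched as a depth-first search in which every assignment filters the domains of the free variables. The code below is a minimal illustration under stated assumptions: binary constraints are given as a hypothetical `conflicts(var1, value1, var2, value2)` function, `domains` contains only the free variables, and variables are assigned in dictionary order.

```python
def forward_check(domains, var, value, conflicts):
    """Reduce the domains of the free variables after var = value.

    Returns the reduced domains, or None if some domain becomes empty."""
    reduced = {}
    for other, values in domains.items():
        if other == var:
            continue
        remaining = {v for v in values if not conflicts(var, value, other, v)}
        if not remaining:  # domain wipe-out: fail before exploring the subtree
            return None
        reduced[other] = remaining
    return reduced


def solve_fc(domains, conflicts, assignment=None):
    """Depth-first search with forward checking; domains holds free variables only."""
    assignment = {} if assignment is None else assignment
    if not domains:  # no free variable left: complete assignment
        return assignment
    var = next(iter(domains))
    for value in sorted(domains[var]):
        reduced = forward_check(domains, var, value, conflicts)
        if reduced is not None:
            result = solve_fc(reduced, conflicts, {**assignment, var: value})
            if result is not None:
                return result
    return None  # every value failed: backtrack


def queens_conflict(i, a, j, b):
    """Queens in columns i and j, rows a and b, attack each other."""
    return a == b or abs(a - b) == abs(i - j)


n = 8
solution = solve_fc({i: set(range(1, n + 1)) for i in range(1, n + 1)}, queens_conflict)
print(solution)
```

Since every value left in a domain is already compatible with all assigned variables, no check against past assignments is needed when a variable is instantiated.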

2. **Look ahead**: besides checking the constraints that involve the variable just instantiated, look ahead also checks the constraints among the non-assigned variables (an additional step, not limited to singleton domains). It checks that the domains of the non-assigned variables contain values compatible with the constraints involving only non-instantiated variables. Basically, it **verifies the possibility of future consistent assignments**. In the two variants below, $x_k$ is the last assigned variable and $x_h$ is any free variable.

> * **PLA** (partial look ahead): for each value in the domain of $x_h$, checks whether the domain of each not yet assigned variable that follows it, $x_{h+1}, \dots , x_n$, contains a value compatible with it (**unidirectional**);
> * **FLA** (full look ahead): for each value in the domain of $x_h$, checks whether the domain of each other not yet assigned variable, $x_{k+1}, \dots, x_{h-1},x_{h+1}, \dots , x_n$, contains a value compatible with it (**bidirectional**).
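The difference between the two variants shows up on a small example. The sketch below performs a single filtering pass over the free variables only (it omits the forward-checking step against the assigned variable); the function name `look_ahead`, the `conflicts` interface and the example with constraints $x_1 < x_2$ and $x_2 < x_3$ are assumptions made for illustration.

```python
def look_ahead(domains, order, conflicts, full):
    """One pass of partial (full=False) or full (full=True) look ahead.

    domains and order contain only free variables. Returns the filtered
    domains, or None if some domain becomes empty."""
    domains = {v: set(vals) for v, vals in domains.items()}
    for pos, xh in enumerate(order):
        # PLA looks only at the following variables, FLA at all the other ones.
        others = [x for x in order if x != xh] if full else order[pos + 1:]
        for a in list(domains[xh]):
            # Drop a if some other free variable has no value compatible with it.
            if any(all(conflicts(xh, a, xj, b) for b in domains[xj]) for xj in others):
                domains[xh].discard(a)
        if not domains[xh]:
            return None
    return domains


less_than = {("x1", "x2"), ("x2", "x3")}  # constraints x1 < x2 and x2 < x3


def conflicts(u, a, v, b):
    if (u, v) in less_than:
        return not a < b
    if (v, u) in less_than:
        return not b < a
    return False  # no constraint between u and v


order = ["x1", "x2", "x3"]
domains = {x: {1, 2, 3} for x in order}
pla = look_ahead(domains, order, conflicts, full=False)
fla = look_ahead(domains, order, conflicts, full=True)
print(pla)  # {'x1': {1, 2}, 'x2': {1, 2}, 'x3': {1, 2, 3}}
print(fla)  # {'x1': {1, 2}, 'x2': {2}, 'x3': {3}}
```

FLA prunes more values than PLA on the same input, at the price of more compatibility checks per node.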

# Consistency techniques
1. **Generate and test**: the interpreter develops and visits a decision tree depth-first, assigning values to variables without verifying their consistency with any constraint. In practice, we *blindly assign all the variables and then check*.

This approach is inefficient: constraints are used to limit the space of solutions only **after** the search is performed (in an *a posteriori* fashion).
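A minimal sketch of generate and test, assuming the constraints are packed into a hypothetical `is_consistent(assignment)` predicate that is only called on complete assignments. The four queens instance is used to keep the enumeration small.

```python
from itertools import combinations, product


def generate_and_test(domains, is_consistent):
    """Enumerate complete assignments and test each one only once it is complete."""
    variables = list(domains)
    tested = 0
    for values in product(*(domains[v] for v in variables)):
        tested += 1
        assignment = dict(zip(variables, values))
        if is_consistent(assignment):  # constraints used a posteriori
            return assignment, tested
    return None, tested


def no_attacks(assignment):
    """Column -> row assignment in which no two queens share a row or a diagonal."""
    return all(assignment[i] != assignment[j]
               and abs(assignment[i] - assignment[j]) != j - i
               for i, j in combinations(sorted(assignment), 2))


n = 4
domains = {col: range(1, n + 1) for col in range(1, n + 1)}
solution, tested = generate_and_test(domains, no_attacks)
print(solution, tested)  # {1: 2, 2: 4, 3: 1, 4: 3} 115
```

Out of the $4^4 = 256$ leaves, 115 complete assignments are generated before the first solution is found, even though most of them already violate a constraint after the first two assignments.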

2. **Standard backtracking**: at each instantiation of a variable $x_i$, the constraints involving $x_i$ and the previously instantiated variables are checked. The constraints are used backward and lead to an effective reduction of the search space with respect to the previous approach, but this reduction takes place only after the assignment.

The use of constraints is more effective, as the search does not continue in branches that already contain a contradiction. It is better than the previous approach but still works in an *a posteriori* fashion.

The main difference is that in GT the constraints are checked once all the assignments have been made, while in SB they are checked after each assignment.
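For comparison, standard backtracking can be sketched as follows. The code assumes binary constraints exposed through a hypothetical `conflicts(var1, value1, var2, value2)` function and a fixed variable order; each new value is checked only against the variables assigned before it.

```python
def backtrack(variables, domains, conflicts, assignment=None):
    """Depth-first search that checks each new value against past assignments."""
    assignment = {} if assignment is None else assignment
    if len(assignment) == len(variables):
        return assignment  # complete and consistent assignment
    var = variables[len(assignment)]  # next variable in the fixed order
    for value in domains[var]:
        # Constraints are used backward: only already instantiated variables count.
        if all(not conflicts(var, value, prev, assignment[prev]) for prev in assignment):
            result = backtrack(variables, domains, conflicts, {**assignment, var: value})
            if result is not None:
                return result
    return None  # no value fits: go back to the previous variable


def queens_conflict(i, a, j, b):
    """Queens in columns i and j, rows a and b, attack each other."""
    return a == b or abs(a - b) == abs(i - j)


n = 4
columns = list(range(1, n + 1))
solution = backtrack(columns, {c: range(1, n + 1) for c in columns}, queens_conflict)
print(solution)  # {1: 2, 2: 4, 3: 1, 4: 3}
```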

# Search heuristics
Constraint propagation is performed by the solver, while the order of variable selection and value selection is left to the programmer. A good heuristic can act on these two degrees of freedom.

Heuristics can be classified into:
* **Variable selection heuristics**: determine which variable should be instantiated next. The most popular are first-fail and most-constrained (the most difficult variables are assigned first);
* **Value selection heuristics**: determine which value to assign to the selected variable, usually based on the **least-constraining principle**.

Heuristics can be static (computed once and for all) or dynamic (recomputed after each assignment). Dynamic heuristics are potentially better since they can avoid backtracks, but computing the hypothetical perfect heuristic would have the same complexity as the original problem, so there is a tradeoff.
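Both kinds of heuristic can be sketched as small functions over the current domains of the free variables, which makes them dynamic when they are called again after each assignment. The names `first_fail` and `least_constraining_order`, the `conflicts` interface and the all-different example are assumptions made for illustration.

```python
def first_fail(domains):
    """Variable selection: pick the free variable with the fewest remaining values."""
    return min(domains, key=lambda v: len(domains[v]))


def least_constraining_order(var, domains, conflicts):
    """Value selection: try first the values that rule out the fewest values elsewhere."""
    def ruled_out(value):
        return sum(conflicts(var, value, other, w)
                   for other in domains if other != var
                   for w in domains[other])
    return sorted(domains[var], key=ruled_out)


def different(u, a, v, b):
    """All-different constraint: equal values conflict."""
    return a == b


domains = {"a": [1, 2, 3], "b": [1, 2], "c": [1]}
print(first_fail(domains))                                # c
print(least_constraining_order("a", domains, different))  # [3, 2, 1]
```

In the example, value 1 for `a` would remove a value from both `b` and `c`, so it is tried last.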