# CSC 421 - Constraint Satisfaction Problems

### Instructor: Shengyao Lu

We have used an **atomic** representation to solve <mark>state-space search problems</mark> by searching in trees or graphs. We also saw that, in informed search algorithms, <mark>domain-specific heuristics can estimate the cost</mark> of reaching the goal from a given state.

In today's class, we will look at problems that use a **factored representation** for each state: a set of variables, each of which has a value. Because these problems expose some of the structure of the states, we can solve them with general rather than domain-specific heuristics.

### Readings
- Basic: Sections 6.1, 6.2, 6.3, 6.4, and Summary
- Expected: 6.5
- Advanced: The whole chapter, including the bibliographical and historical notes

## 1. Defining Constraint Satisfaction Problems (CSP)

A CSP consists of three components, $\mathcal{X}$, $\mathcal{D}$, and $\mathcal{C}$.
- $\mathcal{X}$ := a set of variables, $\{X_1, \dots, X_n\}$.
- $\mathcal{D}$ := a set of domains, $\{D_1, \dots, D_n\}$.
    - A domain $D_i$ consists of a set of allowable values $\{v_1, \dots, v_k\}$ for variable $X_i$, where $k$ := the size of $D_i$. Different variables can have different domains of different sizes.
    - e.g., a Boolean variable $X_{bool}$ has the domain $\{true, false\}$.
- $\mathcal{C}$ := a set of constraints that specify allowable combinations of values.
    - Each constraint $C_j$ consists of a pair $\left\langle scope,rel \right\rangle$.
        - $scope$ := a tuple of variables in $C_j$.
        - $rel$ := a relation that defines the values that $scope$ can take on. 
    - e.g., if $X_1$ and $X_2$ both have the domain $\{1,2,3\}$ and $C_m$ := $X_1$ must be greater than $X_2$, then two ways to represent $C_m$ are:
        - $\left\langle (X_1,X_2), \{(3,1), (3,2), (2,1)\}\right\rangle$
        - $\left\langle (X_1,X_2),X_1>X_2\right\rangle$
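
The two representations of $C_m$ can be compared directly in code. The following is a minimal sketch, assuming a constraint is stored as a `(scope, rel)` pair in which `rel` is either an explicit set of allowed tuples or a predicate. The names `explicit`, `implicit`, and `satisfies` are illustrative and not part of any library.

```python
from itertools import product

# Domains from the example: X1 and X2 both range over {1, 2, 3}.
domains = {"X1": {1, 2, 3}, "X2": {1, 2, 3}}

# Explicit form: the relation is a set of allowed value tuples.
explicit = (("X1", "X2"), {(3, 1), (3, 2), (2, 1)})

# Implicit form: the relation is a predicate over the values of the scope.
implicit = (("X1", "X2"), lambda x1, x2: x1 > x2)


def satisfies(constraint, assignment):
    """Return True if the values assigned to the scope are allowed by rel."""
    scope, rel = constraint
    values = tuple(assignment[var] for var in scope)
    return rel(*values) if callable(rel) else values in rel


# Enumerate every tuple the predicate allows and compare with the explicit set.
scope, _ = implicit
allowed = {
    values
    for values in product(*(sorted(domains[var]) for var in scope))
    if satisfies(implicit, dict(zip(scope, values)))
}
print(allowed == explicit[1])  # True
```

The explicit form grows with the size of the domains, while the predicate form stays the same size, which is why the second notation is usually preferred for large or infinite domains.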

### Terminology 

- **Assignment of values to variables:** e.g. $\{X_i=v_i, X_j=v_j,\dots\}$. 
- **Complete assignment:** each variable is assigned a value.  
- **Consistent (legal) assignment:** an assignment that does not violate any constraints.
- **Solution to a CSP:** a consistent, complete assignment. 
- **Partial assignment:** leaves some variables unassigned.
- **Partial solution:** a consistent partial assignment.
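
These definitions translate almost word for word into small checks. The sketch below assumes an assignment is a Python `dict` and that a constraint is only tested once every variable in its scope has a value, which is one common convention for partial assignments. The function names are illustrative.

```python
def is_complete(assignment, variables):
    """Complete: every variable has been assigned a value."""
    return all(var in assignment for var in variables)


def is_consistent(assignment, constraints):
    """Consistent: no constraint with a fully assigned scope is violated."""
    for scope, rel in constraints:
        if all(var in assignment for var in scope):
            if not rel(*(assignment[var] for var in scope)):
                return False
    return True


def is_solution(assignment, variables, constraints):
    """Solution: a consistent, complete assignment."""
    return is_complete(assignment, variables) and is_consistent(assignment, constraints)


variables = ["X1", "X2"]
constraints = [(("X1", "X2"), lambda x1, x2: x1 > x2)]

partial = {"X1": 2}                 # partial and consistent: a partial solution
complete_bad = {"X1": 1, "X2": 3}   # complete but inconsistent
complete_ok = {"X1": 3, "X2": 1}    # complete and consistent: a solution

for a in (partial, complete_bad, complete_ok):
    print(a, is_complete(a, variables), is_consistent(a, constraints))
```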


### Example problem: Midterm schedule

Suppose the midterm for this course takes place on Thursday, Feb 26, 2026, at the Computer-Based Testing Lab (CBTL). There will be 4 time slots, and you may choose whichever slot works best for you. However, ours is not the only course holding a midterm on February 26. The other courses include OS (Operating Systems), DB (Databases), ALG (Algorithms), and MATH; ours is the AI course.

Now let's **assume** the following:
- The four time slots are:
    - T1: 8:30 A.M. - 9:15 A.M.
    - T2: 9:30 A.M. - 10:15 A.M.
    - T3: 12:30 P.M. - 1:15 P.M.
    - T4: 2:00 P.M. - 2:45 P.M.
- Each time slot allows at most two students to take the midterm for a given course.
- There are 4 students to schedule, listed below with the courses whose midterms they take. None of them has other commitments on Feb 26.
    - A: OS, AI
    - B: DB, MATH
    - C: AI, ALG
    - D: AI, DB

How do we formalize this as a CSP? That is, what are the three components $\mathcal{X}$, $\mathcal{D}$, and $\mathcal{C}$?

1. $\mathcal{X}$: Let $X_{s,c}$ := the time slot in which student $s$ takes the exam for course $c$.
- $X_{A,OS}, X_{A,AI}$
- $X_{B,DB}, X_{B,MATH}$
- $X_{C,AI}, X_{C,ALG}$
- $X_{D,AI}, X_{D,DB}$

    Therefore, $\mathcal{X}=\{X_{A,OS}, X_{A,AI}, X_{B,DB}, X_{B,MATH}, X_{C,AI}, X_{C,ALG}, X_{D,AI}, X_{D,DB}\}$

2. $\mathcal{D}$: every variable has the same domain, the four time slots.   
$D(\cdot)\in \mathcal{D}$   
$D(X_{s,c})=\{T1,T2,T3,T4\}$

3. $\mathcal{C}$: Two main constraints:
- $C_1$: A student cannot take two different course exams in the same time slot.
    - This gives four binary constraints, one per student:
        - $\langle (X_{A,OS}, X_{A,AI}), X_{A,OS} \neq X_{A,AI}\rangle$
        - $\langle (X_{B,DB}, X_{B,MATH}), X_{B,DB} \neq X_{B,MATH}\rangle$
        - $\langle (X_{C,AI}, X_{C,ALG}), X_{C,AI} \neq X_{C,ALG}\rangle$
        - $\langle (X_{D,AI}, X_{D,DB}), X_{D,AI} \neq X_{D,DB}\rangle$
- $C_2$: At most two students can be scheduled in the same time slot.
    - scope: $\mathcal{X}$
    - rel: $\forall t\in \{T1,T2,T3,T4\}, \sum_{X \in \mathcal{X}} {\mathbf{1}[X=t]\leq 2}$
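
The formalization above can be checked by encoding it and searching for a consistent, complete assignment. This is a minimal sketch, not a definitive solver. It assumes $C_2$ exactly as written in its relation (at most two variables in $\mathcal{X}$ may take the same slot, counted across all courses) and uses plain backtracking. The names `ENROLMENT`, `consistent`, and `backtrack` are illustrative.

```python
from collections import Counter

SLOTS = ["T1", "T2", "T3", "T4"]
ENROLMENT = {"A": ["OS", "AI"], "B": ["DB", "MATH"], "C": ["AI", "ALG"], "D": ["AI", "DB"]}

# One variable X_{s,c} per (student, course) pair; each has domain SLOTS.
VARIABLES = [(s, c) for s, courses in ENROLMENT.items() for c in courses]


def consistent(assignment):
    """Check C1 and C2 on a (possibly partial) assignment."""
    # C1: a student's two exams must be in different slots.
    for s, (c1, c2) in ENROLMENT.items():
        if (s, c1) in assignment and (s, c2) in assignment:
            if assignment[(s, c1)] == assignment[(s, c2)]:
                return False
    # C2: at most two variables may be assigned to the same slot.
    return all(n <= 2 for n in Counter(assignment.values()).values())


def backtrack(assignment):
    """Depth-first search that assigns one variable per level."""
    if len(assignment) == len(VARIABLES):
        return assignment
    var = next(v for v in VARIABLES if v not in assignment)
    for slot in SLOTS:
        assignment[var] = slot
        if consistent(assignment):
            result = backtrack(assignment)
            if result is not None:
                return result
        del assignment[var]  # undo and try the next slot
    return None


solution = backtrack({})
for (student, course), slot in solution.items():
    print(student, course, slot)
```

With 8 variables and 4 slots of capacity 2, any solution under this reading of $C_2$ must fill every slot exactly.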

### Example problem: Map coloring

<img src="images/csp_australia.png" width="800px">

Consider a map of Australia showing each of its states and territories. The task is to color each region either **red, green, or blue** in such a way that **no two neighboring regions have the same color.**
* $\mathcal{X}=\{WA, NT, Q, NSW, V, SA, T\}$
* $\forall D_i \in \mathcal{D}, D_i=\{red, green,blue\}$
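
The remaining component, $\mathcal{C}$, is one binary "not equal" constraint per pair of neighboring regions. The sketch below encodes the problem and solves it with plain backtracking. The border list `BORDERS` is an assumption read off the standard map of Australia (Tasmania borders no other region), and the function name `color_map` is illustrative.

```python
COLORS = ["red", "green", "blue"]
REGIONS = ["WA", "NT", "Q", "NSW", "V", "SA", "T"]

# Pairs of regions assumed to share a border; each pair is one constraint.
BORDERS = [("WA", "NT"), ("WA", "SA"), ("NT", "SA"), ("NT", "Q"), ("SA", "Q"),
           ("SA", "NSW"), ("SA", "V"), ("Q", "NSW"), ("NSW", "V")]

NEIGHBORS = {region: set() for region in REGIONS}
for a, b in BORDERS:
    NEIGHBORS[a].add(b)
    NEIGHBORS[b].add(a)


def color_map(assignment=None):
    """Plain backtracking: assign one region per level of the search."""
    assignment = {} if assignment is None else assignment
    if len(assignment) == len(REGIONS):
        return assignment
    region = next(r for r in REGIONS if r not in assignment)
    for color in COLORS:
        # The color must differ from every neighbor that is already colored.
        if all(assignment.get(n) != color for n in NEIGHBORS[region]):
            result = color_map({**assignment, region: color})
            if result is not None:
                return result
    return None


coloring = color_map()
print(coloring)
```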

## Summary 


1. CSPs are a special kind of search problem that uses a factored representation of states.
2. States are defined by the values of a fixed set of variables.
3. The goal test is defined by constraints on the variable values.
4. Backtracking = depth-first search with one variable assigned per node.
5. Variable-ordering and value-selection heuristics can help significantly.
6. Forward checking prevents assignments that guarantee later failure.
7. Specific constraint types and structures (for example, tree-structured constraint graphs) can lead to more efficient solvers.
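
Items 4 to 6 can be sketched together in a few lines. This is a minimal illustration under stated assumptions, not a reference implementation: it handles binary constraints only, uses the minimum-remaining-values rule as the variable-ordering heuristic, and tries values in ascending order. The chain $X_1 > X_2 > X_3$ is a hypothetical example that extends the earlier constraint $X_1 > X_2$. The names `forward_check` and `solve` are illustrative.

```python
def forward_check(var, value, domains, constraints):
    """Prune neighbors' domains after var=value; return None on a wipe-out."""
    pruned = {v: set(d) for v, d in domains.items()}
    pruned[var] = {value}
    for (a, b), rel in constraints:
        if a == var:
            pruned[b] = {y for y in pruned[b] if rel(value, y)}
            if not pruned[b]:
                return None
        elif b == var:
            pruned[a] = {x for x in pruned[a] if rel(x, value)}
            if not pruned[a]:
                return None
    return pruned


def solve(domains, constraints, assignment=None):
    """Backtracking search: depth-first, one variable assigned per node."""
    assignment = assignment or {}
    if len(assignment) == len(domains):
        return assignment
    # Variable ordering: pick the unassigned variable with the fewest values left.
    var = min((v for v in domains if v not in assignment), key=lambda v: len(domains[v]))
    for value in sorted(domains[var]):
        pruned = forward_check(var, value, domains, constraints)
        if pruned is not None:  # skip values that empty some neighbor's domain
            result = solve(pruned, constraints, {**assignment, var: value})
            if result is not None:
                return result
    return None


domains = {"X1": {1, 2, 3}, "X2": {1, 2, 3}, "X3": {1, 2, 3}}
constraints = [(("X1", "X2"), lambda x, y: x > y), (("X2", "X3"), lambda x, y: x > y)]
print(solve(domains, constraints))  # {'X1': 3, 'X2': 2, 'X3': 1}
```

Here forward checking rejects $X_1 = 1$ immediately, because it leaves no value for $X_2$, instead of discovering the failure one level deeper.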