# 01 - Introduction

This first notebook answers four questions:

1. What do we mean by *Discrete* Optimization?
2. Why is it hard (and why does geometry matter)?
3. How is this tutorial structured?
4. What are examples of Discrete Optimization problems?

## What is Discrete Optimization?

In Discrete Optimization, some (or all) decision variables are **restricted to discrete values** (integers or binaries).

Many modeling problems in the field can be phrased as a **Mixed-Integer Linear Program (MIP)**:

$
\begin{equation}
    \begin{array}{ll@{}ll}
        \displaystyle \text{max} & c^\top x &\\
        \displaystyle \text{s.t.} & Ax \le b &\\
        \displaystyle            & x \in \mathbb{Z}^p \times \mathbb{R}^{n-p}
    \end{array}
\end{equation}
$

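where $c \in \mathbb{R}^{n}$ is the objective (cost) vector, and $A \in \mathbb{R}^{m \times n}$ and $b \in \mathbb{R}^{m}$ describe the $m$ linear constraints $Ax \le b$. The first $p$ variables must take integer values; the remaining $n-p$ are continuous.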

Note: If all variables are integral ($p=n$), such problems are called **Integer Programs (IPs)**, or **Binary Programs (BPs)** in the case of binary variables $(x\in\{0,1\}^n)$.

Many classical combinatorial problems (matching, knapsack, set cover, …) fit into this template.

## What makes integer problems hard?

Linear Programs (LPs) are "nice" because the feasible region $P(A,b) = \{x \in \mathbb{R}^n \mid Ax \le b\}$ is a **convex polyhedron**. \
Convexity enables efficient algorithms: the simplex method (fast in practice) as well as interior-point and ellipsoid methods (polynomial time).

However, once we require integrality, the feasible set (in the pure-integer case) becomes $P(A,b) \cap \mathbb{Z}^n$, which is **not convex** in general (it is a set of isolated lattice points).

A central geometric object is the **integer hull** $\operatorname{conv}\bigl(P(A,b) \cap \mathbb{Z}^n\bigr)$ (more on that later).
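To make the integer hull concrete, here is a minimal sketch on a hypothetical two-variable polygon (the four constraints are an illustrative assumption, not an instance from this tutorial). It lists the lattice points of $P(A,b)$ and computes their convex hull with Andrew's monotone chain algorithm, using only the standard library.

```python
from itertools import product

# Hypothetical polygon P = {(x, y) : a1*x + a2*y <= b}, one row (a1, a2, b) per constraint.
constraints = [
    (-1, 0, 0),   # x >= 0
    (0, -1, 0),   # y >= 0
    (-2, 2, 1),   # -2x + 2y <= 1
    (2, 2, 7),    #  2x + 2y <= 7
]


def feasible(p):
    """Check whether the point p = (x, y) satisfies every constraint."""
    return all(a1 * p[0] + a2 * p[1] <= b for a1, a2, b in constraints)


def cross(o, a, b):
    """Cross product of the vectors o->a and o->b (positive for a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


# P fits inside the box [0, 4]^2, so it is enough to scan that box for lattice points.
lattice_points = [p for p in product(range(5), repeat=2) if feasible(p)]
integer_hull = convex_hull(lattice_points)

print("lattice points:", lattice_points)
print("integer hull vertices:", integer_hull)
```

In this instance the polygon has the fractional vertex $(1.5, 2)$, while every lattice point satisfies $y \le 1$. That inequality is valid for the integer hull but cuts off part of the polygon.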

You can think of modern integer optimization as combining two ideas:
* **Relaxation**: Solve something convex first (typically the LP relaxation)
* **Refinement**: Tighten the relaxation (cuts) and/or split the problem (branch-and-bound) until the solution is integral.
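The gap between a relaxation and the integer optimum can be seen on a very small example. The sketch below assumes a hypothetical two-variable instance (constraints and objective are made up for illustration). It solves the LP relaxation by enumerating the vertices of the polygon, which works here only because the region is bounded and two-dimensional, and it finds the integer optimum by scanning lattice points. A real solver would do neither.

```python
from fractions import Fraction
from itertools import combinations, product

# Hypothetical instance: maximize x + 2y subject to a1*x + a2*y <= b.
constraints = [
    (-1, 0, 0),   # x >= 0
    (0, -1, 0),   # y >= 0
    (-2, 2, 1),   # -2x + 2y <= 1
    (2, 2, 7),    #  2x + 2y <= 7
]
c = (1, 2)


def feasible(p):
    """Check whether the point p = (x, y) satisfies every constraint."""
    return all(a1 * p[0] + a2 * p[1] <= b for a1, a2, b in constraints)


def objective(p):
    return c[0] * p[0] + c[1] * p[1]


# LP relaxation: every vertex is the feasible intersection of two constraint lines.
vertices = []
for (a1, a2, b), (d1, d2, e) in combinations(constraints, 2):
    det = a1 * d2 - a2 * d1
    if det == 0:
        continue  # parallel lines never intersect in a single point
    # Cramer's rule with exact rational arithmetic
    x = Fraction(b * d2 - a2 * e, det)
    y = Fraction(a1 * e - b * d1, det)
    if feasible((x, y)):
        vertices.append((x, y))
lp_point = max(vertices, key=objective)

# Integer problem: the region fits inside [0, 4]^2, so scan that box.
lattice_points = [p for p in product(range(5), repeat=2) if feasible(p)]
ip_point = max(lattice_points, key=objective)

print("LP relaxation:  ", lp_point, "value", objective(lp_point))
print("integer optimum:", ip_point, "value", objective(ip_point))
```

Here the relaxation attains $11/2$ at the fractional point $(3/2, 2)$, while the best integral point is $(2, 1)$ with value $4$. Note that rounding the LP solution does not produce the integer optimum, which is why refinement steps are needed.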

This tutorial builds up the polyhedral tools needed to reason about these objects and the algorithms that use them.

## Examples

To make the definitions above more concrete, here are three “standard” Discrete Optimization problems you’ll see again and again.
In each case, the key modeling pattern is: **binary variables + linear constraints**.

## The Assignment / Perfect Matching Problem

**Story:** Assign $n$ workers to $n$ tasks, minimizing total cost.

Let $c_{ij}$ be the cost of assigning worker $i$ to task $j$.

We can use binary variables $x \in \{0,1\}^{n \times n}$ to encode whether a worker is assigned to a task, i.e.

$
x_{ij} = \begin{cases}
            1 & \text{if worker } i \text{ is assigned to task } j,\\
            0 & \text{otherwise.}
         \end{cases}
$

A standard integer programming formulation of this problem is:

$
\begin{aligned}
    \min            & \sum_{i=1}^n\sum_{j=1}^n c_{ij} x_{ij} \\
    \text{s.t. }    & \sum_{j=1}^n x_{ij} = 1 \quad \forall i \qquad (\text{each worker gets one task}) \\
                    & \sum_{i=1}^n x_{ij} = 1 \quad \forall j \qquad (\text{each task gets one worker}) \\
                    & x_{ij}\in\{0,1\}.
\end{aligned}
$

**Note:** Although this is an IP, its constraint matrix is *totally unimodular*, so the LP relaxation already has integral extreme points. \
This connects nicely to Notebook [07 - Unimodularity and Total Dual Integrality](07-Unimodularity-and-Total-Dual-Integrality.ipynb)!
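The total unimodularity claim can be spot-checked by brute force on a small instance. The sketch below is an assumption-laden illustration: the helper names `assignment_matrix` and `is_totally_unimodular` are hypothetical, and the check enumerates every square submatrix, so it is only feasible for tiny matrices.

```python
from itertools import combinations


def det(M):
    """Exact integer determinant via Laplace expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, a in enumerate(M[0]):
        if a == 0:
            continue
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * a * det(minor)
    return total


def assignment_matrix(n):
    """Constraint matrix of the n x n assignment problem.

    Column i*n + j belongs to x_ij. The first n rows are the worker
    constraints, the last n rows are the task constraints.
    """
    rows = [[1 if k // n == i else 0 for k in range(n * n)] for i in range(n)]
    rows += [[1 if k % n == j else 0 for k in range(n * n)] for j in range(n)]
    return rows


def is_totally_unimodular(A):
    """Return True if every square submatrix has determinant -1, 0, or 1."""
    m, n = len(A), len(A[0])
    for k in range(1, min(m, n) + 1):
        for rows in combinations(range(m), k):
            for cols in combinations(range(n), k):
                sub = [[A[r][c] for c in cols] for r in rows]
                if det(sub) not in (-1, 0, 1):
                    return False
    return True


A = assignment_matrix(3)
print("assignment matrix (n=3) is TU:", is_totally_unimodular(A))

# Counterexample: the edge-vertex incidence matrix of a triangle has determinant 2.
triangle = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]
print("triangle matrix is TU:", is_totally_unimodular(triangle))
```

A check like this confirms the property only for the chosen $n$; the general statement needs a proof.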

For now, let's brute-force a solution by enumerating all $n!$ possible assignments:

In [None]:
import itertools
import numpy as np

# A tiny assignment instance (n=4) so we can brute-force by enumerating permutations.
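# C[i, j] is the cost of assigning worker i to task j.
# (A plain ndarray is used here; NumPy discourages np.matrix in new code.)
C = np.array(
    [
        [9, 2, 7, 8],
        [6, 4, 3, 7],
        [5, 8, 1, 8],
        [7, 6, 9, 4],
    ],
    dtype=float,
)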

n = C.shape[0]

best_cost = np.inf
best_perm = None

for perm in itertools.permutations(range(n)):  # perm[i] = task assigned to worker i
    cost = sum(C[i, perm[i]] for i in range(n))
    if cost < best_cost:
        best_cost = cost
        best_perm = perm

best_cost, best_perm

Translating the result: `best_perm[i]` is the task assigned to worker `i`, so the best assignment for this instance is:
* Worker 0 -> Task 1
* Worker 1 -> Task 0
* Worker 2 -> Task 2
* Worker 3 -> Task 3

The total cost is 2 + 6 + 1 + 4 = 13.

## The Knapsack Problem

**Story:** Pick a subset of items with maximum value, subject to a weight (or capacity) limit.

Each item $i$ has a value $v_i$ and a weight $w_i$. With binary decision variables $x_i \in \{0,1\}$ (do we select the item or not), the classic 0-1 knapsack is:

$
\begin{aligned}
    \max            & \sum_{i=1}^n v_i x_i \\
    \text{s.t. }    & \sum_{i=1}^n w_i x_i \le W \\
                    & x_i \in \{0,1\}.
\end{aligned}
$

Unlike the assignment problem, knapsack is **NP-hard**, although only weakly so: its structure admits a pseudo-polynomial dynamic program. This makes it a great playground for branch-and-bound, cutting planes, dynamic programming, and heuristics (more on that later).

Here is another small brute-force example, this time enumerating all $2^n$ subsets of items:

In [None]:
import itertools
import numpy as np

v = np.array([10, 7, 25, 24, 15], dtype=int)    # values
w = np.array([2, 3, 7, 6, 4], dtype=int)        # weights
W = 10                                          # capacity

n = len(v)

best_val = -1
best_x = None
for x in itertools.product([0, 1], repeat=n):
    x = np.array(x, dtype=int)
    if (w @ x) <= W:
        val = int(v @ x)
        if val > best_val:
            best_val = val
            best_x = x

best_val, best_x, int(w @ best_x)

This means we should pick the last two items (indices 3 and 4), which gives a maximum value of 39 at a total weight of 10, using the full capacity.
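The dynamic programming approach mentioned above can be sketched on the same instance. This is a minimal textbook-style table DP, assuming non-negative integer weights; the function name `knapsack_dp` is hypothetical. Its running time is $O(nW)$, which is polynomial in the numeric value of $W$ but not in its encoding length.

```python
def knapsack_dp(values, weights, capacity):
    """0-1 knapsack by dynamic programming over (item prefix, remaining capacity)."""
    n = len(values)
    # best[i][cap] = maximum value achievable with the first i items and capacity cap
    best = [[0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        v, w = values[i - 1], weights[i - 1]
        for cap in range(capacity + 1):
            best[i][cap] = best[i - 1][cap]            # skip item i-1
            if w <= cap:                               # or take it, if it fits
                best[i][cap] = max(best[i][cap], best[i - 1][cap - w] + v)

    # Walk the table backwards to recover one optimal selection.
    chosen = [0] * n
    cap = capacity
    for i in range(n, 0, -1):
        if best[i][cap] != best[i - 1][cap]:
            chosen[i - 1] = 1
            cap -= weights[i - 1]
    return best[n][capacity], chosen


values = [10, 7, 25, 24, 15]
weights = [2, 3, 7, 6, 4]
capacity = 10

best_value, selection = knapsack_dp(values, weights, capacity)
print(best_value, selection)
```

The result agrees with the brute-force enumeration: value 39 with the last two items selected.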

## The Set Packing / Partitioning / Covering Problem

These are “set system” problems that show up everywhere (crew scheduling, facility location subproblems, column generation, …). You have a ground set of elements and a collection of subsets.

Let $A \in \{0,1\}^{m \times n}$ be the incidence matrix of $m$ elements and $n$ subsets, where $a_{ij}=1$ if subset $j$ contains element $i$ and $a_{ij}=0$ otherwise.

With binary variables $x_j$ (“choose subset $j$ or not”), we now get:

- **Set Packing**: Choose subsets so that elements are used *at most once*: $Ax \le \mathbf{1},\quad x \in \{0,1\}^n$.
- **Set Partitioning**: Choose subsets such that elements are used *exactly once*: $Ax = \mathbf{1},\quad x \in \{0,1\}^n$.
- **Set Covering**: Pick subsets so that every element is covered *at least once*: $Ax \ge \mathbf{1},\quad x\in\{0,1\}^n$.

If we add costs $c_j$, we get linear objectives like $\min c^\top x$ or $\max c^\top x$.
The polyhedral structure behind these models is a major theme of the later notebooks (e.g., [10 - The Set-Packing Polytope](10-The-Set-Packing-Polytope.ipynb)).
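The three conditions differ only in how they compare $Ax$ with the all-ones vector, which a few lines of code make visible. The set system below is a hypothetical example (four elements, four subsets), and the helper name `classify` is an assumption for illustration.

```python
# Hypothetical set system on elements {0, 1, 2, 3}:
#   S0 = {0, 1}, S1 = {2, 3}, S2 = {1, 2}, S3 = {0}
# Rows are elements, columns are subsets: A[i][j] = 1 if subset j contains element i.
A = [
    [1, 0, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
]


def classify(A, x):
    """Report which of the three set-system conditions the selection x satisfies."""
    # counts[i] = (Ax)_i = number of chosen subsets that contain element i
    counts = [sum(a * xj for a, xj in zip(row, x)) for row in A]
    return {
        "packing": all(k <= 1 for k in counts),       # Ax <= 1
        "partitioning": all(k == 1 for k in counts),  # Ax  = 1
        "covering": all(k >= 1 for k in counts),      # Ax >= 1
    }


for x in ([1, 1, 0, 0], [0, 0, 1, 1], [1, 1, 1, 0]):
    print(x, classify(A, x))
```

Choosing S0 and S1 satisfies all three conditions, S2 and S3 form a packing that leaves element 3 uncovered, and S0, S1, S2 form a cover that uses elements 1 and 2 twice. A selection is a partition exactly when it is both a packing and a cover.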

## Next

If you’re new to the polyhedral viewpoint, the next notebook is a great start:

➡️ [02 - Fourier-Motzkin Elimination](02-Fourier-Motzkin-Elimination.ipynb)

Fourier–Motzkin elimination is a basic projection tool that we’ll reuse when moving between different representations of polyhedra.