# Algorithm: Revised Simplex Method
The original Simplex Method, invented by George Dantzig in 1947, is an iterative algorithm that solves a linear program by moving from one corner point (vertex) of the feasible _polytope_ to an adjacent one, improving the objective at each step until no further gain is possible.

> __Myth or fact?__ As a graduate student in 1939, Dantzig once arrived late to a statistics lecture, mistook two well-known unsolved problems on the blackboard for homework, and solved them over the next few days, before realizing they were open research questions! That story is true, but it happened years before he developed the simplex method and did not directly inspire the algorithm. But still, it's a fun story—being late isn't always bad!

__Simplex is a big deal__: Before simplex, LPs were mostly a theoretical curiosity; afterwards, they became essential tools, radically improving resource allocation and strategic planning in the second half of the twentieth century. Over seventy years later, the simplex method remains widely used in commercial optimization software. This is despite some not-so-great worst-case performance bounds!

### Algorithm
We are going to focus on the _revised simplex algorithm_, developed by Dantzig and coworkers in the 1950s at the RAND Corporation, which is a more efficient version of the original method.

> The __key idea__ behind the revised simplex algorithm is to partition the decision variables into a __basic set__ and a __non-basic set__. Then, at each iteration, we swap one variable between these sets and recompute the values of the basic variables, improving the objective function until we reach an optimal solution.
> * A __basic variable__ is one you're allowing to _turn on_, i.e., take a non-zero value. In the simplex method, you swap which variables are on or off to move from one corner of the feasible region to the next.
> * A __non-basic variable__ is one you keep _off_ (held at zero); think of non-basic variables as benchwarmers not in play. When one looks promising, you swap it in (make it basic) to move to a potentially better solution.
> * __How are the basic (and non-basic) variables related to corners?__ At any corner of the feasible polytope, $m$ variables are _turned on_ (the basic set): their columns of the constraint matrix are linearly independent, so the $m$ equality constraints determine their values uniquely, while all other variables (the non-basic set) are zero. Choosing which $m$ variables are basic (and solving for them) picks out one specific corner of the polytope, and letting a single non-basic variable increase from zero traces out one of the edges that meet at that corner.
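
To make the link between bases and corners concrete, here is a minimal sketch that enumerates every candidate basis of a tiny LP and keeps the feasible ones. The LP data (`cols`, `b`), the variable names, and the helper `basic_solution` are hypothetical choices made for illustration; the $2\times2$ systems are solved exactly with Cramer's rule.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical LP in equality form (m = 2 constraints, 4 variables):
#   x1 +   x2 + s1      = 4
#   x1 + 3*x2      + s2 = 6
names = ["x1", "x2", "s1", "s2"]
cols = [(1, 1), (1, 3), (1, 0), (0, 1)]   # columns of [A | I]
b = (4, 6)


def basic_solution(i, j):
    """Solve the 2x2 system for basic variables i and j by Cramer's rule."""
    (a11, a21), (a12, a22) = cols[i], cols[j]
    det = a11 * a22 - a12 * a21
    if det == 0:
        return None  # linearly dependent columns: not a valid basis
    zi = Fraction(b[0] * a22 - a12 * b[1], det)
    zj = Fraction(a11 * b[1] - b[0] * a21, det)
    return zi, zj


corners = set()
for i, j in combinations(range(4), 2):
    sol = basic_solution(i, j)
    if sol is None or min(sol) < 0:
        continue  # singular or infeasible basis: no corner here
    z = [Fraction(0)] * 4          # non-basic variables stay at zero
    z[i], z[j] = sol               # basic variables take the solved values
    corners.add((z[0], z[1]))      # project onto the original (x1, x2) plane
    print(f"basis ({names[i]}, {names[j]}) -> (x1, x2) = ({z[0]}, {z[1]})")
```

Of the six candidate bases, four are feasible, and they correspond to the four corners of the region $x_1 + x_2 \le 4$, $x_1 + 3x_2 \le 6$, $x \ge 0$.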

Let's sketch out the revised simplex algorithm:

__Initialization__: Given a linear program of the form: $\min\left\{c^\top x\mid Ax+s = b,\;x\ge0,\; s\ge0,\;b\ge0\right\}$ where $A\in\mathbb{R}^{m\times{n}}$, $b\in\mathbb{R}^{m}$, and $c\in\mathbb{R}^{n}$, we want to find the optimal solution $x^{\star}$ that minimizes the objective function $c^\top x$ subject to the constraints defined by $Ax + s = b$ and the non-negativity conditions on $x$ and $s$. Note: We assume the LP has been converted from maximization form by negating the objective coefficients.

Let $z = \left(x,s\right)\in\mathbb{R}^{n+m}$. Define the initial basic set $B=\left\{s_{1},s_{2},\dots,s_{m}\right\}$ and the non-basic set $N=\left\{x_{1},x_{2},\dots,x_{n}\right\}$, where $x^{(0)}=0$ and $s^{(0)}=b$; this starting point is feasible because $b\ge0$. Set the iteration counter $t\gets{0}$ and choose the maximum number of iterations $T$. Set $\texttt{converged}\gets\texttt{false}$.
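
As a concrete illustration of this setup (the numbers are hypothetical, chosen only to make the notation tangible), the maximization problem $\max\left\{3x_1+5x_2 \mid x_1\le 4,\; 2x_2\le 12,\; 3x_1+2x_2\le 18,\; x\ge 0\right\}$ becomes, after negating the objective and adding one slack variable per constraint:

$$
\begin{aligned}
\min\quad & -3x_1 - 5x_2 \\
\text{s.t.}\quad & x_1 + s_1 = 4 \\
& 2x_2 + s_2 = 12 \\
& 3x_1 + 2x_2 + s_3 = 18 \\
& x_1, x_2, s_1, s_2, s_3 \ge 0
\end{aligned}
$$

Here $m=3$ and $n=2$, the initial basic set is $B=\left\{s_1,s_2,s_3\right\}$, and the initial solution is $x^{(0)}=(0,0)$, $s^{(0)}=b=(4,12,18)$ with objective value $0$: the origin corner of the feasible region.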

While not $\texttt{converged}$ __do__:
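1. __Optimality test__. Compute the reduced cost $\mu_{i} = c_{i} - \lambda^{\top}A_{i}$ for each non-basic variable $i \in N$, where $\lambda = (A_B^{\top})^{-1}c_B$ are the dual multipliers, $A_B$ is the $m \times m$ basis matrix formed by the columns corresponding to the basic variables, and $c_B$ is the vector of objective coefficients for the basic variables. Because $z$ includes the slacks, columns and costs are taken from the augmented data: $A_{i}$ is the $i$th column of $\left[A\;\; I_{m}\right]$, and $c_{i}$ is the $i$th entry of $\left(c, 0\right)$.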
    - If $\mu_{i} \geq 0$ for all $i \in N$, then set $\texttt{converged}\gets\texttt{true}$ and return the current solution $z^{(t)}$.
    - If $\mu_{i} < 0$ for any $i\in{N}$, the current solution is __not optimal__; there is a non-basic variable that, if _turned on_, decreases the objective (strictly, whenever the step size $\alpha$ computed in step 2 is positive).
2. __Direction and ratio test__. Select $e \gets \arg\min_{i \in N} \mu_{i}$, the non-basic variable with the most negative reduced cost. This variable $e$ will enter the basis (i.e., be turned on) to improve the objective. Compute the direction $\mathbf{d} = A^{-1}_{B}A_{e}$, where $A_{e}$ is the column of the augmented constraint matrix $\left[A\;\; I_{m}\right]$ corresponding to variable $e$, and $A_{B}$ is the submatrix formed by the columns of the basic variables.
    - If $\left\{j \mid d_{j} > 0\right\} = \emptyset$, no basic variable decreases as $z_{e}$ grows, so the objective can be lowered without limit: the problem is __unbounded__. Exit the algorithm with an __error__.
    - Compute the step size $\alpha = \min\left\{\frac{(z_B)_{j}}{d_{j}}\mid d_{j} > 0\right\}$, where $(z_B)_j$ is the $j$th element of the basic variable vector.
    - Compute the index of the variable that will _leave_ the basic set: $l = \arg\min\left\{\frac{(z_B)_{j}}{d_{j}}\mid d_{j} > 0\right\}$.
3. __Pivot and update__: Take the step, then swap the entering and leaving variables.
    - Update the values of the current basic variables, $z_{B}^{(t+1)} \gets z_{B}^{(t)} - \alpha \cdot\mathbf{d}$, which drives the leaving variable $B_{l}$ to zero, and set $z_{e} \gets \alpha$.
    - Update the basic and non-basic sets: $B \gets \left(B \setminus \{B_{l}\}\right)\cup \{e\}$ and $N \gets \left(N \setminus \{e\}\right) \cup \{B_l\}$. This swaps the entering variable $e$ into the basic set and the leaving variable $B_l$ into the non-basic set.
    - Update the dual multipliers for the new basis, $\lambda\gets (A_{B}^{\top})^{-1}c_{B}$, and the iteration counter $t \gets t + 1$.
4. __Check iteration limit__: If $t \geq T$, the maximum number of iterations has been reached without convergence; exit the algorithm with an __error__. Otherwise, loop back to step 1.
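
The steps above can be sketched in plain Python. This is a minimal sketch under stated assumptions, not a production solver: the function names `revised_simplex` and `solve` are illustrative, exact `Fraction` arithmetic and a dense Gauss-Jordan solve stand in for the basis factorization updates a real implementation would use, and no anti-cycling rule is included.

```python
from fractions import Fraction


def solve(M, rhs):
    """Solve M y = rhs by Gauss-Jordan elimination in exact arithmetic."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        # Swap in a row with a non-zero pivot (M is assumed non-singular)
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * v for a, v in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]


def revised_simplex(A, b, c, max_iter=100):
    """Minimize c^T x subject to A x + s = b, x >= 0, s >= 0 (assumes b >= 0).

    Returns (x, objective value, number of pivots).
    """
    m, n = len(A), len(c)
    # Columns of the augmented matrix [A | I] and costs (c, 0)
    cols = [[A[i][j] for i in range(m)] for j in range(n)]
    cols += [[1 if i == k else 0 for i in range(m)] for k in range(m)]
    cost = list(c) + [0] * m
    basis = list(range(n, n + m))          # initial basis: the slack variables
    z_B = [Fraction(v) for v in b]         # initial basic values: s = b
    for pivots in range(max_iter):
        # Step 1: dual multipliers from A_B^T lam = c_B, then reduced costs
        A_B_T = [cols[j] for j in basis]   # row k of A_B^T is column basis[k]
        lam = solve(A_B_T, [cost[j] for j in basis])
        nonbasic = [j for j in range(n + m) if j not in basis]
        mu = {j: cost[j] - sum(l * a for l, a in zip(lam, cols[j]))
              for j in nonbasic}
        e = min(nonbasic, key=lambda j: mu[j])   # most negative reduced cost
        if mu[e] >= 0:                           # optimal: assemble z and stop
            z = [Fraction(0)] * (n + m)
            for k, j in enumerate(basis):
                z[j] = z_B[k]
            x = z[:n]
            return x, sum(ci * xi for ci, xi in zip(c, x)), pivots
        # Step 2: direction d solves A_B d = A_e, followed by the ratio test
        A_B = [[cols[j][i] for j in basis] for i in range(m)]
        d = solve(A_B, cols[e])
        ratios = [(z_B[k] / d[k], k) for k in range(m) if d[k] > 0]
        if not ratios:
            raise ValueError("LP is unbounded")
        alpha, leave = min(ratios)
        # Step 3: take the step, then swap entering and leaving variables
        z_B = [zb - alpha * dk for zb, dk in zip(z_B, d)]
        z_B[leave] = alpha
        basis[leave] = e
    raise RuntimeError("maximum number of iterations reached")


# Example with hypothetical data: maximize 3*x1 + 5*x2, posed as a minimization
x, value, pivots = revised_simplex(
    A=[[1, 0], [0, 2], [3, 2]], b=[4, 12, 18], c=[-3, -5])
print(x, value, pivots)
```

For this small example the sketch reaches the corner $x = (2, 6)$ with objective value $-36$ (that is, a maximum of $36$ for the original problem) after two pivots. Re-solving the linear systems from scratch at every iteration keeps the code short; it is the part a practical revised simplex implementation would replace with incremental updates.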

Wow! That seems intense. How efficient is the simplex algorithm?
* In the __worst case__, the simplex method can take _exponential time_ in the number of variables: Klee and Minty's 1972 example, a distorted $n$-dimensional cube, forces Dantzig's pivot rule to visit all $2^n$ vertices, requiring on the order of $2^n$ pivots. Thus, the worst-case number of pivots is at least exponential, $\Omega(2^n)$.
* However, __in practice__, the simplex method is often very efficient: on most real-world problems the number of pivots grows modestly with problem size, and the exponential worst case is rarely encountered. Theory supports this picture to a degree, since average-case and smoothed analyses show a polynomial expected number of pivots under certain probabilistic input models.
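
To make the worst case concrete, here is the two-variable instance of one commonly cited form of the Klee-Minty construction, written in this section's minimization form (the specific coefficients are an illustrative choice, not taken from the text above):

$$
\begin{aligned}
\min\quad & -2x_1 - x_2 \\
\text{s.t.}\quad & x_1 + s_1 = 5 \\
& 4x_1 + x_2 + s_2 = 25 \\
& x_1, x_2, s_1, s_2 \ge 0
\end{aligned}
$$

Starting from the slack basis and always entering the variable with the most negative reduced cost, the iterates are $(x_1,x_2) = (0,0)\to(5,0)\to(5,5)\to(0,25)$ with objective values $0, -10, -15, -25$. That is three pivots, $2^2-1$, visiting all four corners of the feasible region, even though the optimum $(0,25)$ is adjacent to the starting corner. The $n$-variable version of the construction is known to extend this pattern to $2^n-1$ pivots.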

Next, let's examine the second class of algorithms based on the KKT conditions: Interior Point methods.

___