# Convex Sets

### Affine Sets
A set $C$ is *affine* if the line through any two distinct points in $C$ lies in $C$, i.e., if for all $x, y \in C$ and $\theta \in \mathbf{R}$, we have $\theta x + (1 - \theta) y \in C$.

#### Example: Solution set for linear equations
$C = \{x | A x = b\}$, where $A \in \mathbf{R}^{m \times n}$ and $b \in \mathbf{R}^m$
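
The affine property of this solution set can be checked numerically. The sketch below is only an illustration under assumed data: the matrix `A`, the right-hand side `b`, and the helper `on_line` are hypothetical and not part of the notes. It builds two solutions of a wide linear system and confirms that points on the line through them, including points with $\theta$ outside $[0, 1]$, still solve the system.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: a wide system A x = b, which has many solutions.
A = rng.standard_normal((2, 4))
b = rng.standard_normal(2)

# One particular solution plus two different null-space offsets.
x_p = np.linalg.lstsq(A, b, rcond=None)[0]
_, _, Vt = np.linalg.svd(A)
null_basis = Vt[2:].T  # columns span the null space (A has rank 2 here)
x = x_p + null_basis @ rng.standard_normal(2)
y = x_p + null_basis @ rng.standard_normal(2)

def on_line(theta):
    """Point theta*x + (1 - theta)*y on the line through x and y."""
    return theta * x + (1 - theta) * y

# theta may be any real number, not only a value in [0, 1].
residuals = [np.linalg.norm(A @ on_line(t) - b) for t in (-3.0, 0.25, 1.0, 7.5)]
print(residuals)  # all close to zero, up to rounding error
```

The residuals stay at rounding level because $A(\theta x + (1 - \theta) y) = \theta b + (1 - \theta) b = b$ for every real $\theta$.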


### Convex Sets
A set $C$ is *convex* if the line segment between any two points in $C$ lies in $C$, i.e., if for all $x, y \in C$ and $\theta \in \mathbf{R}$ with $0 \leq \theta \leq 1$, we have $\theta x + (1 - \theta) y \in C$.

+ Every affine set is a convex set

![convex_set](images/convex_set.png)

#### Convex combination
A point of the form $\theta_1 x_1 + \theta_2 x_2 + \dots + \theta_k x_k$, where $\theta_1 + \theta_2 + \dots + \theta_k = 1$ and $\theta_i \geq 0, i = 1, \dots, k$, is called a *convex combination* of the points $x_1, \dots, x_k$.

+ weighted average

#### Convex Hull
$\mathbf{conv} C = \{ \theta_1 x_1 + \theta_2 x_2 + \dots + \theta_k x_k | x_i \in C, \theta_i \geq 0, i = 1, \dots, k, \theta_1 + \theta_2 + \dots + \theta_k = 1 \}$

+ the set of all convex combinations of points in $C$
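+ $\mathbf{conv} C$ is convex; it is the smallest convex set that contains $C$
+ $k$ is the number of points in the combination, not the dimension of the space; it ranges over all positive integers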
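
As a small illustration of convex combinations and hull membership, the sketch below handles the special case of a triangle in $\mathbf{R}^2$, where the weights $\theta_i$ are unique and can be found by solving a linear system. The function names `convex_weights` and `in_hull` are hypothetical helpers, and the approach assumes $n + 1$ affinely independent points in $\mathbf{R}^n$; a general point set would need a linear program instead.

```python
import numpy as np

def convex_weights(points, p):
    """Solve for theta with sum(theta) = 1 and sum(theta_i * x_i) = p.

    `points` holds n + 1 affinely independent points in R^n (one per row),
    so the weights are unique.
    """
    points = np.asarray(points, dtype=float)
    # Stack the equations  points^T theta = p  and  1^T theta = 1.
    M = np.vstack([points.T, np.ones(len(points))])
    rhs = np.append(np.asarray(p, dtype=float), 1.0)
    return np.linalg.solve(M, rhs)

def in_hull(points, p, tol=1e-12):
    """p lies in the convex hull iff all weights are nonnegative."""
    return bool(np.all(convex_weights(points, p) >= -tol))

triangle = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
inside = in_hull(triangle, (0.25, 0.25))   # weights (0.5, 0.25, 0.25)
outside = in_hull(triangle, (1.0, 1.0))    # weights (-1, 1, 1)
print(inside, outside)
```

A negative weight means the point is still an affine combination of the vertices, but not a convex one.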


#### Cones
A set $C$ is called a *cone* if for every $x \in C$ and $\theta \geq 0$, we have $\theta x \in C$. A set $C$ is called a *convex cone* if it is convex and a cone, which means that for any $x_1, x_2 \in C$ and $\theta_1, \theta_2 \geq 0$, we have $\theta_1 x_1 + \theta_2 x_2 \in C$.

+ Here the elements of a cone are vectors in $\mathbf{R}^n$

![cone](images/cone.png)

### Some important examples
+ a line segment is convex but not affine
+ any line is affine and thus convex

#### Hyperplane
$\{ x | a^T x = b \}$, where $a \in \mathbf{R}^n$, $a \neq 0$ and $b \in \mathbf{R}$

+ solution set for linear equation
+ affine



![hyperplane](images/hyperplane.png)

#### Halfspace
$\{ x | a^T x \leq b \}$, where $a \in \mathbf{R}^n$, $a \neq 0$ and $b \in \mathbf{R}$

+ one of the two regions into which a hyperplane divides $\mathbf{R}^n$
+ convex, but not affine

![halfplane](images/halfplane.png)

#### Euclidean balls
$B(x_c, r) = \{ x | \lVert x - x_c \rVert_2 \leq r \} = \{ x | (x - x_c)^T(x - x_c) \leq r^2 \}$


#### Polyhedra
A *polyhedron* is defined as the solution set of a finite number of linear equalities and inequalities:
$$
\{ x | A x \preceq b, C x = d \},
$$
where $A \in \mathbf{R}^{m \times n}$, $b \in \mathbf{R}^m$, $C \in \mathbf{R}^{p \times n}$, $d \in \mathbf{R}^p$, and the symbol $\preceq$ denotes *vector inequality*: $u \preceq v$ means $u_i \leq v_i$ for $i = 1, \dots, m$.

![](./images/polyhedron.png)
##### Example: nonnegative orthant
$$\mathbf{R}_+^n = \{x \in \mathbf{R}^n| x \succeq 0\}$$

#### Positive semidefinite cone
$\mathbf{S}^n$ denotes the set of symmetric $n \times n$ matrices:
$$\mathbf{S}^n = \{ X \in \mathbf{R}^{n \times n} | X = X^T  \}$$

$\mathbf{S}_+^n$ denotes the set of symmetric positive semidefinite matrices:
$$\mathbf{S}_+^n = \{ X \in \mathbf{S}^n | X \succeq 0 \}$$
+ convex: for $A, B \in \mathbf{S}_+^n$, $0 \leq \theta \leq 1$ and any $z \in \mathbf{R}^n$, $z^T (\theta A + (1 - \theta) B) z = \theta z^T A z + (1 - \theta) z^T B z \geq 0$


##### Example: $\mathbf{S}_+^2$
$$
X = \begin{bmatrix}
x & y \\
y & z
\end{bmatrix} \in \mathbf{S}_+^2 \iff x \geq 0, z \geq 0, xz \geq y^2
$$
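
The entrywise condition for a $2 \times 2$ symmetric matrix can be compared against the eigenvalue definition of positive semidefiniteness. The sketch below is a small sanity check on a few hand-picked matrices; the function names and the sample values are hypothetical and chosen only for illustration.

```python
import numpy as np

def psd_by_entries(x, y, z):
    """Entrywise test for [[x, y], [y, z]] being positive semidefinite."""
    return x >= 0 and z >= 0 and x * z >= y * y

def psd_by_eigenvalues(x, y, z, tol=1e-12):
    """Reference test: the smallest eigenvalue is nonnegative."""
    X = np.array([[x, y], [y, z]], dtype=float)
    return bool(np.linalg.eigvalsh(X).min() >= -tol)

samples = [
    (2.0, 1.0, 1.0),   # PSD: x*z = 2 >= y^2 = 1
    (1.0, 2.0, 1.0),   # not PSD: x*z = 1 < y^2 = 4
    (-1.0, 0.0, 1.0),  # not PSD: x < 0
    (0.0, 0.0, 0.0),   # PSD: the zero matrix
]
agreement = [psd_by_entries(*s) == psd_by_eigenvalues(*s) for s in samples]
print(agreement)
```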

![](./images/positive_semidefinite_cone.png)


## Operations that preserve convexity

### Intersection
If $S_1$ and $S_2$ are convex, then $S_1 \cap S_2$ is convex

#### Example: Positive Semidefinite Cone
Intersection gives another proof that the positive semidefinite cone is convex.

A positive semidefinite cone can be expressed as the intersection of an infinite number of halfspaces, i.e.: 
$$
\bigcap_{z \ne 0}\{ X \in \mathbf{S}^n | z^T X z \geq 0 \}.
$$

+ $z^T X z$ is a linear function of $X$, and thus $\{ X \in \mathbf{S}^n | z^T X z \geq 0 \}$ is a halfspace
+ Intersection preserves convexity even if the number of sets is infinite

### Affine function
A function $f: \mathbf{R}^n \rightarrow \mathbf{R}^m$ is called *affine* if it is the sum of linear function and a constant, i.e., it has the form $f(x) = A x + b$ where $A \in \mathbf{R}^{m \times n}$ and $b \in \mathbf{R}^m$.

If $S \subseteq \mathbf{R}^n$ is convex and $f: \mathbf{R}^n \rightarrow \mathbf{R}^m$ is an affine function, then the image of $S$ under $f$, $f(S) = \{f(x) | x \in S\}$, is convex. In other words, applying $f$ to every point of $S$ produces a new set in $\mathbf{R}^m$ that is also convex.

Similarly, if $f: \mathbf{R}^k \rightarrow \mathbf{R}^n$ is an affine function, the inverse image of $S$ under $f$, $f^{-1}(S) = \{x | f(x) \in S\}$, is convex. In other words, the points $x \in \mathbf{R}^k$ that $f$ maps into $S$ form a convex set; $f$ does not need to be invertible.

+ scaling
+ translation
+ projection: if $S \subseteq \mathbf{R}^m \times \mathbf{R}^n$ is convex, then $\{ x_1 \in \mathbf{R}^m | (x_1, x_2) \in S \text{ for some } x_2 \in \mathbf{R}^n \}$ is convex
+ partial sum: if $S_1, S_2 \subseteq \mathbf{R}^n \times \mathbf{R}^m$ are convex, then $S = \{ (x, y_1 + y_2) | (x, y_1) \in S_1, (x, y_2) \in S_2 \}$ is convex, where $x \in \mathbf{R}^n$ and $y_1, y_2 \in \mathbf{R}^m$. For $m = 0$, the partial sum gives the intersection of $S_1$ and $S_2$; for $n = 0$, it is set addition.


#### Example 1: Polyhedron
The polyhedron $\{x | A x \preceq b, C x = d\}$ can be expressed as the inverse image of the Cartesian product of the nonnegative orthant and the origin under the affine function $f(x) = (b - Ax, d - Cx)$:

1. $\{x | A x \preceq b, C x = d\} = \{ x | b - A x \succeq 0, d - C x = 0 \} = \{ x | f(x) \in \mathbf{R}_+^m \times \{ 0 \} \}$, where $\times$ represents the Cartesian product, $m$ is the number of rows of $A$, and $\{0\}$ is the origin of the space containing $d$
2. The inverse image of a convex set $S$ under an affine function $f$, $\{ x | f(x) \in S \}$, is convex. Since $\mathbf{R}_+^m \times \{ 0 \}$ is a convex set and $f(x) = (b - Ax, d - Cx)$ is an affine function, $\{ x | f(x) \in \mathbf{R}_+^m \times \{ 0 \} \}$ is a convex set
3. $\{x | A x \preceq b, C x = d\}$ is a convex set
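
The three steps above can be sketched numerically: membership in the polyhedron is tested by evaluating the affine map and checking that the result lies in the product of the nonnegative orthant and the origin. The data below (a line segment in $\mathbf{R}^2$) and the helper `in_polyhedron` are hypothetical choices made for illustration only.

```python
import numpy as np

# Hypothetical data: the segment {x in R^2 : x >= 0, x_1 + x_2 = 1}.
A = -np.eye(2)                 # -x <= 0 encodes x >= 0
b = np.zeros(2)
C = np.array([[1.0, 1.0]])
d = np.array([1.0])

def f(x):
    """Affine map f(x) = (b - A x, d - C x)."""
    x = np.asarray(x, dtype=float)
    return b - A @ x, d - C @ x

def in_polyhedron(x, tol=1e-12):
    """x is in the polyhedron iff f(x) lies in (nonnegative orthant) x {0}."""
    u, v = f(x)
    return bool(np.all(u >= -tol) and np.all(np.abs(v) <= tol))

p, q = np.array([0.25, 0.75]), np.array([1.0, 0.0])
checks = {
    "p": in_polyhedron(p),
    "q": in_polyhedron(q),
    "midpoint": in_polyhedron(0.5 * (p + q)),
    "violates inequality": in_polyhedron([1.5, -0.5]),
    "violates equality": in_polyhedron([0.5, 0.25]),
}
print(checks)
```

The midpoint of two members is again a member, which is one instance of the convexity established in step 3.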


#### Example 2: Solution set of linear matrix inequality
The solution set of a linear matrix inequality, $\{x \in \mathbf{R}^n | A(x) \preceq B\}$ with $A(x) = x_1 A_1 + \dots + x_n A_n$ and $B, A_i \in \mathbf{S}^m$, is the inverse image of the positive semidefinite cone $\{X \in \mathbf{S}^m | X \succeq 0 \}$ under the affine function $f: \mathbf{R}^n \rightarrow \mathbf{S}^m$, $f(x) = B - A(x)$ (here $\preceq$ denotes matrix inequality):
$$
\{x \in \mathbf{R}^n | A(x) \preceq B\} = \{x \in \mathbf{R}^n | f(x) \succeq 0 \}
$$


### Perspective function
We define the *perspective function* $P: \mathbf{R}^{n+1} \rightarrow \mathbf{R}^{n}$, with $\mathbf{dom} P = \mathbf{R}^n \times \mathbf{R}_{++}$, as $P(z, t) = z / t$. (Here $\mathbf{R}_{++}$ denotes the set of positive numbers, $\mathbf{R}_{++} = \{ x \in \mathbf{R} | x \gt 0 \}$, while $\mathbf{R}_+$ denotes the nonnegative numbers.) The perspective function scales (normalizes) a vector so that its last component is one, and then drops the last component.

$[4, 2, -3, 5]^T \rightarrow [0.8, 0.4, -0.6, 1]^T \rightarrow [0.8, 0.4, -0.6]^T$
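
The numeric example above can be reproduced with a few lines of NumPy. This is a minimal sketch; the function name `perspective` and the choice to raise an error outside the domain are assumptions made for illustration.

```python
import numpy as np

def perspective(v):
    """Perspective function P(z, t) = z / t for v = (z, t) with t > 0."""
    v = np.asarray(v, dtype=float)
    z, t = v[:-1], v[-1]
    if t <= 0:
        raise ValueError("last component must be positive (outside dom P)")
    return z / t

result = perspective([4, 2, -3, 5])
print(result)  # approximately [0.8, 0.4, -0.6]
```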

### Linear-fractional function
A linear-fractional function is formed by composing the perspective function with an affine function $g(x) = \begin{bmatrix}A \\ c^T\end{bmatrix} x + \begin{bmatrix}b\\ d\end{bmatrix}$:
\begin{equation*}
\begin{aligned}
f(x) = \frac{Ax + b}{c^T x + d}, & & \mathbf{ dom }{f} = \{ x | c^T x + d \gt 0 \}
\end{aligned}
\end{equation*}

+ $A \in \mathbf{R}^{m \times n}$, $c \in \mathbf{R}^n$, $b \in \mathbf{R}^m$ and $d \in \mathbf{R}$
+ $f$: $\mathbf{R}^n \rightarrow \mathbf{R}^m$

#### Projective interpretation
The linear-fractional function is conveniently represented by the matrix:
$$
Q = \begin{bmatrix}
A & b \\
c^T & d
\end{bmatrix} \in \mathbf{R}^{(m + 1) \times (n + 1)}
$$

For $z \in \mathbf{R}^n$, define the ray $\mathcal{P}(z) = \{ t(z, 1) | t \gt 0 \}$ in $\mathbf{R}^{n + 1}$. Every point on the ray has the form $y = t(z, 1) = (tz, t)$, and $z$ is recovered by dividing $y$ by its last component $t$ and dropping that component. In other words, $\mathcal{P}$ appends a $1$ to the vector and allows any positive scaling, while $\mathcal{P}^{-1}$ (i.e., the perspective function) normalizes the vector so that its last component is $1$ and then throws that component away.

Therefore, the linear-fractional function can be expressed as:
$$
f(x) = \mathcal{P}^{-1}(Q(\mathcal{P}(x)))
$$
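
The identity above can be checked numerically. The sketch below uses randomly generated, hypothetical values for $A$, $b$, $c$, $d$ (with $d$ chosen large so that the test point lies in $\mathbf{dom} f$) and the assumed helper names `lift` and `project` for one representative of $\mathcal{P}$ and for $\mathcal{P}^{-1}$.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 2, 3

# Hypothetical problem data; d is large so that c^T x + d > 0 near the origin.
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
c = rng.standard_normal(n)
d = 10.0

def f(x):
    """Linear-fractional function (A x + b) / (c^T x + d)."""
    return (A @ x + b) / (c @ x + d)

# Q = [[A, b], [c^T, d]] acts linearly on lifted points (x, 1).
Q = np.block([[A, b[:, None]], [c[None, :], np.array([[d]])]])

def lift(x):
    """One representative of the ray P(x): append a 1."""
    return np.append(x, 1.0)

def project(y):
    """Inverse of the lifting: divide by the last component, then drop it."""
    return y[:-1] / y[-1]

x = 0.1 * rng.standard_normal(n)
direct = f(x)
via_Q = project(Q @ lift(x))
via_scaled_ray = project(Q @ (3.0 * lift(x)))  # any point on the ray works
print(direct, via_Q, via_scaled_ray)
```

Using a scaled point on the ray gives the same result because the scale factor cancels in the final division.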

#### Example: Inverse image of a halfspace under a linear-fractional function
Let $f(x) = \frac{Ax + b}{c^T x + d}$ with $\mathbf{dom} f = \{ x | c^T x + d \gt 0 \}$, and let

$C = \{ y | g^T y \leq h \}$ with $g \ne 0$.

Then, we have:
\begin{equation*}
\begin{aligned}
f^{-1}(C) & = \{ x | g^T f(x) \leq h  \} \\
& = \{ x | g^T \frac{Ax+b}{c^T x + d} \leq h \} \\
& = \{ x | g^T (Ax+b) \leq h (c^T x + d), (c^T x + d) \gt 0 \} \\
& = \{ x | (A^T g - h c)^Tx \leq hd - g^T b, (c^T x + d) \gt 0 \},
\end{aligned}
\end{equation*}
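which is the intersection of a halfspace (provided $A^T g - h c \ne 0$) with $\mathbf{dom} f$, and hence convex. The normal vector can be written either way, since $(A^T g - h c)^T = g^T A - h c^T$.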
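
The derivation can be sanity-checked by sampling points in $\mathbf{dom} f$ and comparing the direct test $g^T f(x) \leq h$ with the derived linear inequality. All data in this sketch are hypothetical random values; the variable names `a_new` and `b_new` for the derived normal vector and offset are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 2, 3

# Hypothetical data for f(x) = (A x + b) / (c^T x + d) and C = {y : g^T y <= h}.
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
c = rng.standard_normal(n)
d = 10.0
g = rng.standard_normal(m)
h = 0.3

def f(x):
    return (A @ x + b) / (c @ x + d)

# Derived halfspace: (A^T g - h c)^T x <= h d - g^T b.
a_new = A.T @ g - h * c
b_new = h * d - g @ b

# Sample points and keep those inside dom f.
xs = rng.standard_normal((200, n))
xs = xs[xs @ c + d > 0]

direct = np.array([g @ f(x) <= h for x in xs])
via_halfspace = xs @ a_new <= b_new
same_normal = np.allclose(a_new, g @ A - h * c)  # both ways of writing the normal
print(np.array_equal(direct, via_halfspace), same_normal)
```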


### Generalized inequalities
#### Proper cone
A cone $K \subseteq \mathbf{R}^n$ is called a *proper cone* if it satisfies the following:
+ convex
+ closed: contains its boundary
+ solid: has nonempty interior
+ pointed: contains no line (equivalently, $x \in K$ and $-x \in K$ imply $x = 0$)

*Take care*: the following cone is not convex and not pointed:
<img src="./images/cone_eg1.png" width="25%"/>

##### Examples
+ nonnegative orthant 

#### Generalized inequalities
A proper cone can be used to define generalized inequalities.

$$x \preceq_K y \iff y - x \in K$$
$$x \prec_K y \iff y - x \in \mathbf{int }K$$

##### Examples
+ componentwise inequality ($K = \mathbf{R}_+^n$)
$$
x \preceq_{\mathbf{R}_+^n} y \iff x_i \leq y_i, i = 1, \dots, n
$$
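
The definition $x \preceq_K y \iff y - x \in K$ translates directly into a membership test on $y - x$. The sketch below shows it for $K = \mathbf{R}_+^n$ and, as an additional illustration using the same definition, for $K = \mathbf{S}_+^n$. The function names and tolerances are assumptions.

```python
import numpy as np

def leq_orthant(x, y, tol=1e-12):
    """x <=_K y for K the nonnegative orthant: y - x has no negative entry."""
    diff = np.asarray(y, dtype=float) - np.asarray(x, dtype=float)
    return bool(np.all(diff >= -tol))

def leq_psd(X, Y, tol=1e-12):
    """X <=_K Y for K the PSD cone: Y - X has no negative eigenvalue."""
    diff = np.asarray(Y, dtype=float) - np.asarray(X, dtype=float)
    return bool(np.linalg.eigvalsh(diff).min() >= -tol)

# Comparable pair in R^2.
comparable = leq_orthant([1, 2], [2, 3])

# Incomparable pair: neither direction holds.
incomparable = (leq_orthant([1, 3], [2, 2]), leq_orthant([2, 2], [1, 3]))

# Matrix examples.
I = np.eye(2)
psd_comparable = leq_psd(I, 2 * I)
psd_incomparable = (
    leq_psd(np.diag([1.0, 0.0]), np.diag([0.0, 1.0])),
    leq_psd(np.diag([0.0, 1.0]), np.diag([1.0, 0.0])),
)
print(comparable, incomparable, psd_comparable, psd_incomparable)
```

The incomparable pairs show that a generalized inequality is only a partial ordering.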

#### Minimum and minimal elements
$\preceq_K$ is not in general a linear ordering: we can have both $x \npreceq_K y$ and $y \npreceq_K x$.

A point $x \in S$ is the *minimum* element of $S$ if and only if $S \subseteq x + K$, where $x + K$ denotes all the points that are comparable to $x$ and greater than or equal to $x$ (according to $\preceq_K$). If a minimum element exists, it is unique.

A point $x \in S$ is a *minimal* element of $S$ if and only if $(x - K) \cap S = \{x\}$, where $x - K$ denotes all the points that are comparable to $x$ and less than or equal to $x$ (according to $\preceq_K$); the condition says that the only such point in $S$ is $x$ itself.

![](./images/minimum_minimal.png)
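
For a finite set $S$ and $K = \mathbf{R}_+^n$, both definitions can be evaluated by direct comparison. The sketch below is an illustration with hypothetical point sets `S1` and `S2`; the function names are assumptions.

```python
import numpy as np

def minimum_element(S):
    """Return x in S with x <= s componentwise for every s in S, or None."""
    S = np.asarray(S, dtype=float)
    for x in S:
        if np.all(S - x >= 0):      # S is contained in x + K
            return x
    return None

def minimal_elements(S):
    """Return all x in S such that (x - K) intersected with S is {x}."""
    S = np.asarray(S, dtype=float)
    keep = []
    for x in S:
        below = S[np.all(x - S >= 0, axis=1)]   # points of S inside x - K
        if np.all(below == x):
            keep.append(x)
    return np.array(keep)

S1 = [(1, 1), (2, 1), (1, 3)]            # has a minimum element
S2 = [(1, 3), (2, 2), (3, 1), (3, 3)]    # no minimum, several minimal elements

print(minimum_element(S1))    # the point (1, 1)
print(minimum_element(S2))    # None
print(minimal_elements(S2))   # the points (1, 3), (2, 2), (3, 1)
```

In `S2` the three minimal points are mutually incomparable, so none of them can be the minimum.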

### Separating and supporting hyperplanes

#### Separating Hyperplane Theorem
Suppose $C$ and $D$ are nonempty disjoint convex sets. Then there exist $a \neq 0$ and $b$ such that $a^T x \leq b$ for all $x \in C$ and $a^T x \geq b$ for all $x \in D$, i.e., there exists a hyperplane that separates them.

+ not unique

![](./images/separating_hyperplane.png)
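
For two disjoint Euclidean balls a separating hyperplane can be written down explicitly, which makes a convenient illustration of the theorem. The sketch below is one possible construction under that assumption (the theorem itself covers general convex sets); the function name `separate_balls` and the example data are hypothetical.

```python
import numpy as np

def separate_balls(c1, r1, c2, r2):
    """Return (a, b) with a^T x <= b on B(c1, r1) and a^T x >= b on B(c2, r2)."""
    c1, c2 = np.asarray(c1, dtype=float), np.asarray(c2, dtype=float)
    gap = np.linalg.norm(c2 - c1)
    if gap <= r1 + r2:
        raise ValueError("the balls intersect, so they are not disjoint")
    a = (c2 - c1) / gap                 # unit normal pointing from C to D
    upper_on_C = a @ c1 + r1            # max of a^T x over the first ball
    lower_on_D = a @ c2 - r2            # min of a^T x over the second ball
    b = 0.5 * (upper_on_C + lower_on_D)
    return a, b

c1, r1, c2, r2 = np.array([0.0, 0.0]), 1.0, np.array([4.0, 0.0]), 2.0
a, b = separate_balls(c1, r1, c2, r2)
print(a, b)  # a = [1, 0], b = 1.5
```

Any offset between `upper_on_C` and `lower_on_D` would also work, which illustrates that the separating hyperplane is not unique.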


#### Supporting hyperplanes

Suppose $C \subseteq \mathbf{R}^n$, and $x_0$ is a point in its boundary $\mathbf{bd} C$. If $a \neq 0$ satisfies $a^T x \leq a^T x_0$ for all $x \in C$, then the hyperplane $\{x | a^T x = a^T x_0\}$ is called a *supporting hyperplane* to $C$ at the point $x_0$.

+ not unique

![](./images/supporting_hyperplane.png)

## Dual cones and generalized inequalities


### Dual cones
Let $K$ be a cone. The set $K^* = \{ y | x^T y \geq 0 \text{ for all } x \in K \}$ is called the *dual cone* of $K$.

+ $K^*$ is convex, even when $K$ is not

*Proof.* Let $y_1, y_2 \in K^*$ and $0 \leq \theta \leq 1$. For every $x \in K$ we have $x^T y_1 \geq 0$ and $x^T y_2 \geq 0$, so $\theta x^T y_1 \geq 0$ and $(1-\theta) x^T y_2 \geq 0$. Adding the two gives $x^T(\theta y_1 + (1 - \theta)y_2) \geq 0$, hence $\theta y_1 + (1 - \theta)y_2 \in K^*$. Alternatively, $K^*$ is the intersection of the sets $\{ y | x^T y \geq 0 \}$, one for each $x \in K$; each is a halfspace (or all of $\mathbf{R}^n$ when $x = 0$), and intersection preserves convexity.
<img src="./images/dual_cone.png" width="25%">

+ $K^*$ is closed
+ If $K$ is a proper cone, then so is its dual $K^*$



#### Example 1: Nonnegative orthant is self-dual
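The nonnegative orthant is defined as $\mathbf{R}_+^n = \{x \in \mathbf{R}^n | x \succeq 0 \}$. Suppose $x^T y \geq 0$ for all $x \succeq 0$. Choosing $x = e_j$ (the vector with $1$ in position $j$ and $0$ elsewhere) gives $y_j \geq 0$ for every $j$, so $y \succeq 0$. Conversely, if $y \succeq 0$, then $x^T y = \sum_j x_j y_j \geq 0$ for all $x \succeq 0$. Therefore $(\mathbf{R}_+^n)^* = \mathbf{R}_+^n$.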

#### Example 2: Positive semidefinite cone is self-dual
On $\mathbf{S}^n$ the inner product is $\mathbf{tr}(XY)$, so $(\mathbf{S}_+^n)^* = \{ Y | \mathbf{tr}(XY) \geq 0 \text{ for all } X \in \mathbf{S}_+^n \}$.

+ $Y \notin \mathbf{S}_+^n \implies Y \notin (\mathbf{S}_+^n)^*$, which is the contrapositive of $Y \in (\mathbf{S}_+^n)^* \implies Y \in \mathbf{S}_+^n$

*Proof.* $Y \notin \mathbf{S}_+^n \implies \exists v \in \mathbf{R}^n$ with $v^T Y v = \mathbf{tr}(v v^T Y) \lt 0 \implies$ for $X = v v^T$, $\mathbf{tr}(X Y) \lt 0$. Since $X = v v^T$ is positive semidefinite ($z^T v v^T z = (v^T z)^2 \geq 0$ for all $z$), we have $X \in \mathbf{S}_+^n$, and therefore $Y \notin (\mathbf{S}_+^n)^*$.


+ $Y \in \mathbf{S}_+^n \implies \mathbf{tr}(XY) \geq 0$ for all $X \in \mathbf{S}_+^n$, i.e., $Y \in (\mathbf{S}_+^n)^*$

*Proof.* $X$ is symmetric, and thus has an eigenvalue decomposition $X = \sum_{i=1}^n{\lambda_i q_i q_i^T}$ with orthonormal $q_i$ $\implies$ $\mathbf{tr}(YX) = \mathbf{tr}(Y\sum_{i=1}^n{\lambda_i q_i q_i^T}) = \sum_{i=1}^n{\lambda_i q_i^T Y q_i}$. Since $X \in \mathbf{S}_+^n$, every $\lambda_i \geq 0$; since $Y \in \mathbf{S}_+^n$, every $q_i^T Y q_i \geq 0$. Hence $\mathbf{tr}(YX) \geq 0$ $\implies$ $Y \in (\mathbf{S}_+^n)^*$.
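
Both directions of the argument can be illustrated numerically. The sketch below draws random positive semidefinite matrices (the generator `random_psd` and the specific non-PSD matrix `Y` are hypothetical choices) and checks that $\mathbf{tr}(XY) \geq 0$ for PSD pairs, then builds the rank-one certificate $X = v v^T$ for a matrix that is not PSD.

```python
import numpy as np

rng = np.random.default_rng(3)

def random_psd(n):
    """Random positive semidefinite matrix of the form G G^T."""
    G = rng.standard_normal((n, n))
    return G @ G.T

n = 4

# Direction 1: the inner product of two PSD matrices is nonnegative.
inner_products = [np.trace(random_psd(n) @ random_psd(n)) for _ in range(100)]

# Direction 2: a symmetric matrix with a negative eigenvalue is not in the dual.
Y = np.diag([1.0, 2.0, -0.5, 3.0])
eigvals, eigvecs = np.linalg.eigh(Y)   # eigenvalues in ascending order
v = eigvecs[:, 0]                      # eigenvector of the smallest eigenvalue
X = np.outer(v, v)                     # rank-one PSD certificate
certificate = np.trace(X @ Y)          # equals v^T Y v = -0.5

print(min(inner_products) >= 0, certificate)
```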

