# Convexity

---

## Table of contents
1. **Convex set**  
1.1. Definition of a convex set  
1.2. Examples of convex sets  
1.3. Basic property  
2. **Convex function**  
2.1. Definition of a convex function  
2.2. First-order convexity condition  
2.3. Second-order convexity condition

---

## 1. Convex set

### 1.1. Definition of a convex set

A set $A \subseteq \mathbb{R}^d$ is a **convex set** when $t{\bf x}^{(1)} + (1-t){\bf x}^{(2)} \in A$ holds for every pair ${\bf x}^{(1)}, {\bf x}^{(2)} \in A$ and every $t \in [0,1]$. Geometrically, the line segment joining any two points of $A$ lies entirely inside $A$.
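The definition can be probed numerically. The sketch below is only an illustration: the helper names (`convex_combination`, `looks_convex`) and the two test sets (a unit disc and an annulus) are assumptions made for this example, and a randomised check can refute convexity but never prove it.

```python
import random

def convex_combination(x1, x2, t):
    """Return t*x1 + (1-t)*x2 for two points given as equal-length sequences."""
    return tuple(t * a + (1 - t) * b for a, b in zip(x1, x2))

def looks_convex(contains, sample_point, trials=2000, seed=0):
    """Draw x1, x2 from the set and t from [0, 1); return False as soon as
    a convex combination leaves the set. True is evidence, not a proof."""
    rng = random.Random(seed)
    for _ in range(trials):
        x1, x2 = sample_point(rng), sample_point(rng)
        t = rng.random()
        if not contains(convex_combination(x1, x2, t)):
            return False
    return True

# Hypothetical test set 1: closed unit disc in R^2 (convex).
def in_disc(p):
    return p[0] ** 2 + p[1] ** 2 <= 1.0 + 1e-12  # small tolerance for rounding

# Hypothetical test set 2: annulus 0.5 <= ||x|| <= 1 (not convex, it has a hole).
def in_annulus(p):
    return 0.25 <= p[0] ** 2 + p[1] ** 2 <= 1.0 + 1e-12

def make_sampler(contains):
    """Rejection sampling from the square [-1, 1]^2."""
    def sample(rng):
        while True:
            p = (rng.uniform(-1, 1), rng.uniform(-1, 1))
            if contains(p):
                return p
    return sample

disc_ok = looks_convex(in_disc, make_sampler(in_disc))
annulus_ok = looks_convex(in_annulus, make_sampler(in_annulus))
print(disc_ok, annulus_ok)
```

For the annulus, a segment between two points on opposite sides of the hole passes through the hole, which is what the check detects.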

---


### 1.2. Examples of convex sets

**Example 1**

$A = \{{\bf x}|{\bf m}\cdot{\bf x}+b = 0, {\bf x}\in \mathbb{R}^d\}$

**Proof**

Let: ${\bf x}^{(1)},{\bf x}^{(2)} \in A $

${\bf x}^{(1)},{\bf x}^{(2)} \in A \Longrightarrow {\bf m}\cdot{\bf x}^{(1)} = -b, \,\,\, {\bf m}\cdot{\bf x}^{(2)} = -b$

Let: $t \in [0,1]$

${\bf m}\cdot\{t{\bf x}^{(1)} + (1-t){\bf x}^{(2)}\} + b$

$= t{\bf m}\cdot{\bf x}^{(1)} + (1-t){\bf m}\cdot{\bf x}^{(2)} + b$

$= t(-b)+(1-t)(-b)+b = 0$

Hence $t{\bf x}^{(1)} + (1-t){\bf x}^{(2)} \in A$, so $A$ is a convex set.

**Example 2**

$A = \{{\bf x}|{\bf m}\cdot{\bf x}+b \ge 0, {\bf x}\in \mathbb{R}^d\}$

**Proof**

Let: ${\bf x}^{(1)},{\bf x}^{(2)} \in A $

${\bf x}^{(1)},{\bf x}^{(2)} \in A \Longrightarrow {\bf m}\cdot{\bf x}^{(1)} \ge -b, \,\,\, {\bf m}\cdot{\bf x}^{(2)} \ge -b$

Let: $t \in [0,1]$

$t \in [0,1] \Longrightarrow t \ge 0, \,\,\, (1-t) \ge 0$

${\bf m}\cdot\{t{\bf x}^{(1)} + (1-t){\bf x}^{(2)}\} + b$

$= t{\bf m}\cdot{\bf x}^{(1)} + (1-t){\bf m}\cdot{\bf x}^{(2)} + b$

$\ge t(-b)+(1-t)(-b)+b = 0$

Hence $t{\bf x}^{(1)} + (1-t){\bf x}^{(2)} \in A$, so $A$ is a convex set.
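Both proofs rest on the same linearity step: for $g({\bf x}) = {\bf m}\cdot{\bf x}+b$, the value at a convex combination equals the convex combination of the values. A minimal sketch of that step follows; the vector `m`, the offset `b`, and the sampling range are arbitrary values chosen for illustration.

```python
import random

m = (1.0, -2.0, 0.5)   # hypothetical normal vector
b = 1.0                # hypothetical offset

def g(x):
    """Affine function g(x) = m.x + b that defines both example sets."""
    return sum(mi * xi for mi, xi in zip(m, x)) + b

def sample_half_space(rng):
    """Rejection-sample a point of the half-space {x : g(x) >= 0}."""
    while True:
        x = tuple(rng.uniform(-5, 5) for _ in range(len(m)))
        if g(x) >= 0:
            return x

rng = random.Random(0)
max_identity_error = 0.0
min_value_on_segment = float("inf")

for _ in range(1000):
    x1, x2 = sample_half_space(rng), sample_half_space(rng)
    t = rng.random()
    z = tuple(t * a + (1 - t) * c for a, c in zip(x1, x2))
    # Linearity: g(z) = t*g(x1) + (1-t)*g(x2), up to floating-point rounding.
    error = abs(g(z) - (t * g(x1) + (1 - t) * g(x2)))
    max_identity_error = max(max_identity_error, error)
    min_value_on_segment = min(min_value_on_segment, g(z))

print(max_identity_error, min_value_on_segment)
```

Since $t \ge 0$ and $1-t \ge 0$, the combination of two non-negative values is non-negative, which is why `min_value_on_segment` never drops below zero beyond rounding error.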

---

### 1.3. Basic property

When each of $A$ and $B$ is a convex set, $A \cap B$ is a convex set.

**Proof**

Let: ${\bf x}^{(1)},{\bf x}^{(2)} \in A\cap B$ and $t \in [0,1]$

${\bf x}^{(1)},{\bf x}^{(2)} \in A\cap B \Longrightarrow {\bf x}^{(1)},{\bf x}^{(2)} \in A \Longrightarrow t{\bf x}^{(1)} + (1-t){\bf x}^{(2)} \in A$

${\bf x}^{(1)},{\bf x}^{(2)} \in A\cap B \Longrightarrow {\bf x}^{(1)},{\bf x}^{(2)} \in B \Longrightarrow t{\bf x}^{(1)} + (1-t){\bf x}^{(2)} \in B$

Hence, 

$t{\bf x}^{(1)} + (1-t){\bf x}^{(2)} \in A\cap B$

**Note**

$A \cup B$ is not always a convex set, even when each of $A$ and $B$ is a convex set.
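A one-dimensional counterexample for the union, sketched with two intervals picked purely for illustration ($A = [0,1]$ and $B = [2,3]$):

```python
def in_A(x):
    return 0.0 <= x <= 1.0      # A = [0, 1], a convex set

def in_B(x):
    return 2.0 <= x <= 3.0      # B = [2, 3], a convex set

def in_union(x):
    return in_A(x) or in_B(x)

x1, x2, t = 1.0, 2.0, 0.5       # x1 in A, x2 in B
midpoint = t * x1 + (1 - t) * x2

print(in_union(x1), in_union(x2), in_union(midpoint))
```

Both endpoints belong to $A \cup B$, but the midpoint $1.5$ falls in the gap between the intervals, so the union violates the definition.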

---

## 2. Convex function

### 2.1. Definition of a convex function

A function $f: \mathbb{R}^d \to \mathbb{R}$ is a **convex function** when $f(t{\bf x}^{(1)} + (1-t){\bf x}^{(2)}) \le tf({\bf x}^{(1)})+(1-t)f({\bf x}^{(2)})$ holds for every pair ${\bf x}^{(1)},{\bf x}^{(2)} \in \mathbb{R}^d$ and every $t \in [0,1]$.
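As a rough numerical illustration of the inequality, the sketch below evaluates the difference between its right-hand and left-hand sides. The helper name `convexity_gap` and the two test functions (the squared norm and the sine) are assumptions made for this example.

```python
import math
import random

def convexity_gap(f, x1, x2, t):
    """t*f(x1) + (1-t)*f(x2) - f(t*x1 + (1-t)*x2).
    For a convex f this is non-negative for every x1, x2 and t in [0, 1]."""
    z = tuple(t * a + (1 - t) * b for a, b in zip(x1, x2))
    return t * f(x1) + (1 - t) * f(x2) - f(z)

def sq_norm(x):
    """f(x) = ||x||^2, a convex function on R^d."""
    return sum(xi * xi for xi in x)

def sine(x):
    """f(x) = sin(x) on R, which is not convex."""
    return math.sin(x[0])

rng = random.Random(0)

def random_point():
    return (rng.uniform(-3, 3), rng.uniform(-3, 3))

min_gap_sq = min(
    convexity_gap(sq_norm, random_point(), random_point(), rng.random())
    for _ in range(1000)
)
gap_sine = convexity_gap(sine, (0.0,), (math.pi,), 0.5)

print(min_gap_sq, gap_sine)
```

For the sine, the chord between $0$ and $\pi$ lies below the graph at $\pi/2$, so the gap is negative (about $-1$) and the inequality fails.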

#### Another definition using epigraph

Given a function $f$, a set defined as below is called an epigraph.

$epi\,f = \{({\bf x},y) \in \mathbb{R}^d \times \mathbb{R} | y \ge f({\bf x})\}$

A function $f({\bf x})$ is a convex function if and only if $epi\,f$ is a convex set.
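The epigraph view can also be tried on small examples. In the sketch below, $f(x) = |x|$ and $f(x) = -x^2$ are illustrative choices, and `in_epigraph` and `combine` are hypothetical helper names.

```python
import random

def in_epigraph(f, point):
    """point = (x, y) with scalar x; membership test y >= f(x)."""
    x, y = point
    return y >= f(x)

def combine(p1, p2, t):
    return tuple(t * a + (1 - t) * b for a, b in zip(p1, p2))

# Convex case: f(x) = |x|. Sample epigraph points and test their combinations.
rng = random.Random(0)
abs_epigraph_ok = True
for _ in range(1000):
    pts = []
    for _k in range(2):
        x = rng.uniform(-2, 2)
        y = abs(x) + rng.uniform(0, 2)   # any y >= f(x) lies in the epigraph
        pts.append((x, y))
    z = combine(pts[0], pts[1], rng.random())
    if z[1] < abs(z[0]) - 1e-12:         # tolerance for floating-point rounding
        abs_epigraph_ok = False

# Non-convex case: f(x) = -x^2.
def neg_square(x):
    return -x * x

p1, p2 = (-1.0, -1.0), (1.0, -1.0)       # both lie on the graph of f
mid = combine(p1, p2, 0.5)

print(abs_epigraph_ok, in_epigraph(neg_square, mid))
```

For $-x^2$ the midpoint $(0, -1)$ lies below the graph value $f(0) = 0$, so it is outside the epigraph although both endpoints are inside.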

---

### 2.2. First-order convexity condition

Let: $\nabla f({\bf u})= (\frac{\partial f({\bf x})}{\partial x_1},\frac{\partial f({\bf x})}{\partial x_2},...,\frac{\partial f({\bf x})}{\partial x_d})^T |_{{\bf x}={\bf u}} \,\,\, , {\bf u} \in \mathbb{R}^d$

Assume $f$ is differentiable. Then:

$f({\bf x})$ is a convex function $\Longleftrightarrow f({\bf v}) \ge f({\bf u}) + \nabla f({\bf u})\cdot({\bf v}-{\bf u})$ for all ${\bf u}, {\bf v} \in \mathbb{R}^d$
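In words, the tangent plane at any point ${\bf u}$ lies below the graph of a convex function. A minimal sketch with hand-coded gradients follows; the functions $x_1^2 + x_1x_2 + x_2^2$ (convex) and $x_1x_2$ (a saddle, not convex) are assumptions chosen for illustration.

```python
import random

def f(x):
    return x[0] ** 2 + x[0] * x[1] + x[1] ** 2

def grad_f(x):
    return (2 * x[0] + x[1], x[0] + 2 * x[1])

def saddle(x):
    return x[0] * x[1]

def grad_saddle(x):
    return (x[1], x[0])

def first_order_gap(func, grad, u, v):
    """f(v) - [f(u) + grad f(u) . (v - u)]; non-negative for all u, v when f is convex."""
    linear = sum(g * (vi - ui) for g, ui, vi in zip(grad(u), u, v))
    return func(v) - (func(u) + linear)

rng = random.Random(0)

def random_point():
    return (rng.uniform(-3, 3), rng.uniform(-3, 3))

min_gap_convex = min(
    first_order_gap(f, grad_f, random_point(), random_point())
    for _ in range(1000)
)
gap_saddle = first_order_gap(saddle, grad_saddle, (0.0, 0.0), (1.0, -1.0))

print(min_gap_convex, gap_saddle)
```

For the saddle, the tangent plane at the origin is the zero function, while $f(1,-1) = -1$ lies below it, so the condition fails.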

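#### Proof (when d=1)

This proves the $\Longrightarrow$ direction.

Let: $f(x)$ be a convex function, $u,v \in \mathbb{R}$ with $u \ne v$ (the case $u = v$ is trivial), $t \in (0,1]$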

$f(tv + (1-t)u) \le tf(v)+(1-t)f(u)$

$\Longrightarrow f(u + t(v-u)) \le f(u) + t(f(v)-f(u))$

$\Longrightarrow \frac{f(u + t(v-u)) - f(u)}{t} \le f(v)-f(u)$

$\Longrightarrow \frac{f(u + t(v-u)) - f(u)}{t(v-u)}(v-u) \le f(v)-f(u)$

As this inequality is true for any $t \in (0,1]$ (dividing by $t$ requires $t > 0$), let $t \to 0$

Let: $h = t(v-u)$ then, $t \to 0 \Rightarrow h \to 0$

$\Longrightarrow \lim\limits_{h \to 0} \frac{f(u + h) - f(u)}{h}(v-u) \le f(v)-f(u)$

$\Longrightarrow f'(u)(v-u) \le f(v)-f(u)$

$\Longrightarrow f(v) \ge f(u) + f'(u)(v-u)$
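The key quantity in this proof is the quotient $\frac{f(u + t(v-u)) - f(u)}{t}$, which stays below $f(v)-f(u)$ for every $t \in (0,1]$ and approaches $f'(u)(v-u)$ as $t \to 0$. A small numerical sketch, assuming $f = \exp$, $u = 0$ and $v = 1$ as illustrative choices:

```python
import math

f = math.exp          # convex on R, with f'(x) = exp(x)
u, v = 0.0, 1.0       # illustrative points

def quotient(t):
    """[f(u + t(v-u)) - f(u)] / t for t in (0, 1]."""
    return (f(u + t * (v - u)) - f(u)) / t

ts = [1.0, 0.5, 0.1, 0.01, 0.001, 0.0001]
quotients = [quotient(t) for t in ts]
upper_bound = f(v) - f(u)            # right-hand side, independent of t
limit = math.exp(u) * (v - u)        # f'(u)(v-u), the value approached as t -> 0

for t, q in zip(ts, quotients):
    print(t, q)
print(upper_bound, limit)
```

Every printed quotient is at most `upper_bound`, and the values approach `limit` as `t` shrinks, which is the inequality $f'(u)(v-u) \le f(v)-f(u)$.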

#### Proof (in general - not only when d=1)

Let: $f({\bf x})$ be a convex function, ${\bf u}, {\bf v} \in \mathbb{R}^d$, $t \in (0,1]$

$tf({\bf v})+(1-t)f({\bf u}) \ge f(t{\bf v} + (1-t){\bf u}) = f({\bf u} + t({\bf v}-{\bf u}))$

Note: With Taylor series, $f({\bf u} + t({\bf v}-{\bf u})) = f({\bf u}) + \nabla f({\bf u})\cdot\{t({\bf v}-{\bf u})\} + o(t||{\bf v}-{\bf u}||) $

$\Longrightarrow tf({\bf v})+(1-t)f({\bf u}) \ge f({\bf u}) + \nabla f({\bf u})\cdot\{t({\bf v}-{\bf u})\} + o(t||{\bf v}-{\bf u}||)$

$\Longrightarrow tf({\bf v}) \ge tf({\bf u}) + \nabla f({\bf u})\cdot\{t({\bf v}-{\bf u})\} + o(t||{\bf v}-{\bf u}||)$

$\Longrightarrow f({\bf v}) \ge f({\bf u}) + \nabla f({\bf u})\cdot({\bf v}-{\bf u}) + \frac{o(t||{\bf v}-{\bf u}||)}{t}$

Let: $t \to 0$

Note: $\lim\limits_{t \to 0}\frac{o(t||{\bf v}-{\bf u}||)}{t} = 0$

$\Longrightarrow f({\bf v}) \ge f({\bf u}) + \nabla f({\bf u})\cdot({\bf v}-{\bf u})$

---

### 2.3. Second-order convexity condition

Let: the Hessian $H_{\bf x} = \nabla^2f({\bf x})$ be defined by $\nabla^2f({\bf x})_{ij} = \frac{\partial^2f({\bf x})}{\partial x_i \partial x_j}, \,\,\,\,\, i,j = 1,2,...,d$

Assume $f$ is twice differentiable. Then:

$f({\bf x})$ is a convex function $\Longleftrightarrow H_{\bf x}$ is positive semi-definite for every ${\bf x} \in \mathbb{R}^d$

In other words,

$f({\bf x})$ is a convex function $\Longleftrightarrow {\bf u}^T\nabla^2f({\bf x}){\bf u} \ge 0$ for any ${\bf x}, {\bf u} \in \mathbb{R}^d$
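A minimal sketch of the positive semi-definiteness test in the $2 \times 2$ case. The two constant Hessians are illustrative assumptions (they belong to $x_1^2 + x_1x_2 + x_2^2$ and $x_1x_2$), and `is_psd_2x2` relies on the fact that a symmetric $2 \times 2$ matrix is positive semi-definite exactly when both diagonal entries and the determinant are non-negative.

```python
import random

def quad_form(H, u):
    """u^T H u for a square matrix H given as nested tuples."""
    n = len(u)
    return sum(u[i] * H[i][j] * u[j] for i in range(n) for j in range(n))

def is_psd_2x2(H, tol=1e-12):
    """Symmetric 2x2 test: diagonal entries >= 0 and determinant >= 0."""
    (a, b), (_, c) = H
    return a >= -tol and c >= -tol and a * c - b * b >= -tol

H_convex = ((2.0, 1.0), (1.0, 2.0))   # Hessian of x1^2 + x1*x2 + x2^2
H_saddle = ((0.0, 1.0), (1.0, 0.0))   # Hessian of x1*x2

rng = random.Random(0)
min_form_convex = min(
    quad_form(H_convex, (rng.uniform(-1, 1), rng.uniform(-1, 1)))
    for _ in range(1000)
)
form_saddle = quad_form(H_saddle, (1.0, -1.0))   # a direction of negative curvature

print(is_psd_2x2(H_convex), min_form_convex)
print(is_psd_2x2(H_saddle), form_saddle)
```

The saddle fails because ${\bf u} = (1,-1)^T$ gives ${\bf u}^T H {\bf u} = -2 < 0$.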

#### Proof (when d=1)

Let: $f(x)$ be a convex function, $u,v \in \mathbb{R}$

By the first-order convexity condition (2.2):

$f(v) \ge f(u) + f'(u)(v-u)$

Let: $v = u + \delta u$ with $\delta u \ne 0$

$f(u + \delta u) \ge f(u) + f'(u)\delta u$

Note: with Taylor series, $f(u + \delta u) = f(u) + f'(u)\delta u + \frac{f''(u)}{2}(\delta u)^2 + o((\delta u)^2)$

$\Longrightarrow f(u) + f'(u)\delta u + \frac{f''(u)}{2}(\delta u)^2 + o((\delta u)^2) \ge f(u) + f'(u)\delta u$

$\Longrightarrow \frac{f''(u)}{2}(\delta u)^2 + o((\delta u)^2) \ge 0$

$\Longrightarrow \frac{f''(u)}{2} + \frac{o((\delta u)^2)}{(\delta u)^2} \ge 0$

Let: $\delta u \to 0$

Note: $\lim\limits_{\delta u \to 0}\frac{o((\delta u)^2)}{(\delta u)^2} = 0$

$\Longrightarrow f''(u) \ge 0$
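The ratio $\frac{f(u+\delta u) - f(u) - f'(u)\delta u}{(\delta u)^2}$ used above can be watched numerically as $\delta u$ shrinks. This is only a sketch, assuming $f = \exp$ and $u = 0$ as illustrative choices, so that $f''(u)/2 = 0.5$.

```python
import math

f = math.exp                 # convex; f'(x) = f''(x) = exp(x)
u = 0.0                      # illustrative point

def remainder_ratio(delta):
    """[f(u+delta) - f(u) - f'(u)*delta] / delta^2, which tends to f''(u)/2."""
    return (f(u + delta) - f(u) - math.exp(u) * delta) / delta ** 2

deltas = [0.5, -0.5, 0.1, -0.1, 0.01, -0.01, 0.001]
ratios = [remainder_ratio(d) for d in deltas]
half_second_derivative = math.exp(u) / 2

for d, r in zip(deltas, ratios):
    print(d, r)
print(half_second_derivative)
```

Every ratio is non-negative because the tangent line lies below the graph, and the values approach $f''(u)/2$, so $f''(u)$ cannot be negative.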

#### Proof (in general - not only when d=1)

Let: $f({\bf x})$ be a convex function, ${\bf u},{\bf d} \in \mathbb{R}^d, \lambda > 0$

By the first-order convexity condition (2.2) with ${\bf v} = {\bf u} + \lambda {\bf d}$:

$f({\bf u} + \lambda {\bf d}) \ge f({\bf u}) + \lambda \nabla f({\bf u})^T{\bf d}$

Note: with Taylor series, $f({\bf u} + \lambda {\bf d}) = f({\bf u}) + \lambda \nabla f({\bf u})^T{\bf d} + \frac{\lambda^2}{2} {\bf d}^T \nabla^2 f({\bf u}){\bf d} + o(\lambda^2)$

$\Longrightarrow f({\bf u}) + \lambda \nabla f({\bf u})^T{\bf d} + \frac{\lambda^2}{2} {\bf d}^T \nabla^2 f({\bf u}){\bf d} + o(\lambda^2) \ge f({\bf u}) + \lambda \nabla f({\bf u})^T{\bf d}$

$\Longrightarrow \frac{\lambda^2}{2} {\bf d}^T \nabla^2 f({\bf u}){\bf d} + o(\lambda^2) \ge 0$

$\Longrightarrow \frac{1}{2} {\bf d}^T \nabla^2 f({\bf u}){\bf d} + \frac{o(\lambda^2)}{\lambda^2} \ge 0$

Let: $\lambda \to 0$

Note: $\lim\limits_{\lambda \to 0}\frac{o(\lambda^2)}{\lambda^2} = 0$

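$\Longrightarrow {\bf d}^T \nabla^2 f({\bf u}){\bf d} \ge 0$

Since ${\bf u}$ and ${\bf d}$ are arbitrary, $\nabla^2 f({\bf u})$ is positive semi-definite for every ${\bf u} \in \mathbb{R}^d$.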
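The same limit can be sketched in two dimensions. The function $f({\bf x}) = e^{x_1 + 2x_2} + x_1^2$, the point ${\bf u} = (0,0)^T$ and the direction ${\bf d} = (1,-1)^T$ are assumptions picked for illustration; the gradient and Hessian are coded by hand for this particular $f$.

```python
import math

def f(x):
    return math.exp(x[0] + 2 * x[1]) + x[0] ** 2

def grad_f(x):
    e = math.exp(x[0] + 2 * x[1])
    return (e + 2 * x[0], 2 * e)

def hess_f(x):
    e = math.exp(x[0] + 2 * x[1])
    return ((e + 2, 2 * e), (2 * e, 4 * e))

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def quad_form(H, d):
    n = len(d)
    return sum(d[i] * H[i][j] * d[j] for i in range(n) for j in range(n))

u = (0.0, 0.0)
d = (1.0, -1.0)

def scaled_remainder(lam):
    """2 * [f(u + lam*d) - f(u) - lam * grad f(u).d] / lam^2, which tends to d^T H d."""
    shifted = tuple(ui + lam * di for ui, di in zip(u, d))
    return 2 * (f(shifted) - f(u) - lam * dot(grad_f(u), d)) / lam ** 2

lams = [1.0, 0.1, 0.01, 0.001]
values = [scaled_remainder(lam) for lam in lams]
target = quad_form(hess_f(u), d)     # d^T H d at u

for lam, val in zip(lams, values):
    print(lam, val)
print(target)
```

The scaled remainders stay non-negative for every $\lambda > 0$ and approach ${\bf d}^T \nabla^2 f({\bf u}){\bf d}$, which is the argument of the proof in numerical form.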