# Chapter 2 Operators and Fixed Points Reading Note

## Three subsections and seven subsubsections

### Stability

- Conjugate maps

- Convergence rates

- Newton's method

### Order

- Partial orders

- Order-preserving maps

- Fixed points and order

### Matrices and Operators

- Linear operators

## Definition

**Definition (Dynamical system)**

A **dynamical system** is a pair $(U,T)$ where

- $U$, the state space, is a subset of $\mathbb{R}^n$

- $T$ is a self-map on $U$. 

**Definition (Conjugate)**

Dynamical systems $(U,T)$ and $(\hat U, \hat T)$ are called **conjugate** under $\Phi$ if

1. $\Phi: U\to\hat U$ is a bijection
2. $T = \Phi^{-1} \circ \hat T\circ \Phi$ on $U$

**Definition (Homeomorphism)**

A map $\Phi:U\to \hat U$ is called a **homeomorphism** if it

- is continuous
- is bijective
- has a continuous inverse

**Definition (Topologically conjugate)**

Dynamical systems $(U,T)$ and $(\hat U,\hat T)$ are **topologically conjugate** under $\Phi$ if

- $(U,T)$ and $(\hat U,\hat T)$ are conjugate under $\Phi$

- $\Phi$ is a homeomorphism
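
To make the definition concrete, here is a minimal numerical sketch. The maps $Tu = Au^\alpha$ on $(0,\infty)$ and $\hat T x = \ln A + \alpha x$ on $\mathbb{R}$, the parameter values, and the choice $\Phi = \ln$ are assumptions made for illustration, not examples taken from these notes.

```python
import numpy as np

A, alpha = 2.0, 0.5   # hypothetical parameters, chosen for illustration

def T(u):
    """Self-map on U = (0, inf)."""
    return A * u**alpha

def T_hat(x):
    """Affine self-map on U_hat = R."""
    return np.log(A) + alpha * x

Phi = np.log       # continuous bijection from (0, inf) onto R
Phi_inv = np.exp   # its continuous inverse

u = np.linspace(0.1, 10.0, 50)

# Conjugacy: T = Phi^{-1} o T_hat o Phi on U
conjugacy_holds = np.allclose(T(u), Phi_inv(T_hat(Phi(u))))

# Fixed points correspond under Phi
u_star = A ** (1 / (1 - alpha))      # solves u = A u^alpha
x_star = np.log(A) / (1 - alpha)     # solves x = log(A) + alpha x
fixed_points_match = np.isclose(Phi(u_star), x_star)
```

Working with the affine map $\hat T$ is often easier than working with $T$ directly, which is the practical appeal of conjugacy.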

**Definition (locally stable)**

Let $U$ be a subset of $\mathbb{R}^n$  and let $T$ be a self-map on $U$. 

A fixed point $u^*$ of $T$ in $U$ is called **locally stable** for the dynamical system $(U,T)$ if there exists an open set $O\subset U$ such that $u^*\in O$ and $T^k u\to u^*$ as $k\to\infty$ for every $u\in O$.

In other words, the domain of attraction for $u^*$ contains an open neighborhood of $u^*$.

**Definition (at rate at least $q$)**

Let $(u_k)_{k\ge 0}$ be a sequence in $\mathbb{R}^n$ with $u_k\to u^*$, and let $e_k := \|u_k-u^*\|$ for all $k\ge 0$.

We say that $(u_k)_{k\ge 0}$ converges to $u^*$ **at rate at least $q$** if $q\ge 1$ and for some $\beta \in (0,\infty)$, $N\in\mathbb{N}$, we have

$$
e_{k+1}\le \beta e_k^q,\,\,\forall k\ge N
$$

**Definition (at rate $q$)**

We say that convergence occurs **at rate $q$** if, in addition, 

$$
\limsup_{k\to\infty} \frac{e_{k+1}}{e_k^q} = \beta
$$

-  If $q=2$, then we say that convergence is (at least) **quadratic**
-  If $q=1$ and $\beta <1$, then we say that convergence is (at least) **linear**
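
The gap between linear and quadratic convergence can be illustrated with a small sketch. The error recursion $e_{k+1} = \beta e_k^q$ is taken with equality, and the values of $\beta$, $e_0$ and the tolerance are assumptions chosen for illustration.

```python
def errors(beta, q, e0, n):
    """Return e_0, ..., e_n for the recursion e_{k+1} = beta * e_k**q."""
    e = [e0]
    for _ in range(n):
        e.append(beta * e[-1] ** q)
    return e

def steps_to_tolerance(e, tol=1e-10):
    """Index of the first error below tol, or None if no error is below tol."""
    return next((k for k, ek in enumerate(e) if ek < tol), None)

linear = errors(beta=0.5, q=1, e0=0.1, n=6)      # error halves each step
quadratic = errors(beta=0.5, q=2, e0=0.1, n=6)   # error is squared each step

k_linear = steps_to_tolerance(linear)
k_quadratic = steps_to_tolerance(quadratic)
```

With these values the quadratic sequence falls below the tolerance after three steps, while the linear sequence has not reached it after six.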

**Definition (Worst case complexity)**

The **worst-case complexity** of an algorithm measures the number of fundamental operations, such as addition and multiplication of floating point numbers, that the algorithm requires in the worst case.

**Definition (partial order)**

A **partial order** on a nonempty set $P$ is a relation $\precsim$ on $P\times P$ that, for any $p,q,r\in P$, satisfies

- Reflexivity: $p\precsim p$
- Anti-symmetry: $p\precsim q, q\precsim p \implies p=q$
- Transitivity: $p\precsim q, q\precsim r\implies p\precsim r$

**Definition (partially ordered set)**

The pair $(P,\precsim)$ is called a partially ordered set. 

**Definition (pointwise order)**

Fix an arbitrary nonempty set $X$. The **pointwise order** $\le$ on the set $\mathbb{R}^X$ of all functions from $X$ to $\mathbb{R}$ is defined as follows: given $u,v$ in $\mathbb{R}^X$, set $u\le v$ if $u(x)\le v(x)$ for all $x\in X$.


**Definition (total order)**

A partial order $\precsim$ on $P$ is called **total** if for all $p,q\in P$, either $p\precsim q$ or $q\precsim p$
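
As a quick illustration, the pointwise order on $\mathbb{R}^X$ is partial but not total once $X$ has at least two elements. The sketch below stores functions on a two-point set as NumPy arrays; the helper name `leq` and the vectors are hypothetical.

```python
import numpy as np

def leq(u, v):
    """Pointwise order on R^X, with functions stored as arrays indexed by X."""
    return bool(np.all(u <= v))

u = np.array([1.0, 2.0])
v = np.array([2.0, 1.0])
w = np.array([2.0, 3.0])

comparable = leq(u, w)                           # u <= w at every point
incomparable = not leq(u, v) and not leq(v, u)   # neither u <= v nor v <= u
```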

**Definition (greatest element)**

Given a partially ordered set $(P,\precsim)$ and $A\subset P$, we say that $g\in P$ is a **greatest element** of $A$ if $g\in A$ and

$$
a\in A\implies a\precsim g
$$

**Definition (least element)**

Given a partially ordered set $(P,\precsim)$ and $A\subset P$, we say that $l\in P$ is a **least element** of $A$ if $l\in A$ and

$$
a\in A\implies l\precsim a
$$

**Definition (maximum)**

If $A$ is totally ordered, then a greatest element $g$ of $A$ is called a maximum of $A$.

**Definition (minimum)**

If $A$ is totally ordered, then a least element $l$ of $A$ is called a minimum of $A$.

**Definition (upper bound)**

Given a partially ordered set $(P,\precsim)$ and a nonempty subset $A$ of $P$, we call $u\in P$ an **upper bound** of $A$ if $a\precsim u$ for all $a\in A$.

**Definition (supremum)**

Letting $U_p(A)$ be the set of all upper bounds of $A$ in $P$, we call $\bar u\in P$ a **supremum** of $A$ if

$$
\bar u \in U_p(A), \bar u\precsim u\,\,\forall u\in U_p(A)
$$

Thus, $\bar u$ is the least element of the set of upper bounds $U_p(A)$, whenever it exists.

The supremum of $A$ is typically denoted as $\bigvee A$.

**Definition (lower bound)**

We call $\ell\in P$  a **lower bound** of $A$ if $a\succsim\ell$ for all $a\in A$.

**Definition (infimum)**

An element $\bar\ell \in P$ is called an **infimum** of $A$ if $\bar\ell$ is a lower bound of $A$ and $\bar\ell\succsim \ell$ for every lower bound $\ell$ of $A$.

We write $\bar \ell = \bigwedge A$ to denote the infimum.

**Definition (sublattice)**

A subset $V$ of $\mathbb{R}^X$ is called a **sublattice** of $\mathbb{R}^X$ if

$$
u,v\in V\implies u\vee v\in V \text{ and } u\wedge v\in V
$$

In other words, $V$ is closed under pairwise supremum and infimum.

**Definition (order interval)**

Given a partially ordered set $(P,\precsim)$, and $a,b\in P$, the **order interval** $[a,b]$ is defined as all $p\in P$ such that $a\precsim p\precsim b$.

If $a\not\precsim b$, then $[a,b] = \emptyset$.



**Definition (upper envelope)**

Take $\{T_\sigma\}:= \{T_\sigma\}_{\sigma\in\Sigma}$ to be a finite family of self-maps on a sublattice $V\subset \mathbb{R}^X$. 

Define

$$
Tv = \bigvee_{\sigma\in\Sigma} T_\sigma v \quad (v\in V)
$$

From the sublattice property, $T$ is a self-map on $V$. 

$T$ is called the **upper envelope** of the functions $\{T_\sigma\}$.
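
The following sketch computes an upper envelope on $\mathbb{R}^X$ with $|X| = 2$. The affine family $T_\sigma v = r_\sigma + \beta P_\sigma v$, the labels `"a"` and `"b"`, and all numerical values are assumptions made for illustration.

```python
import numpy as np

beta = 0.9   # hypothetical discount factor

# Hypothetical family of self-maps: T_sigma v = r_sigma + beta * P_sigma v
r = {"a": np.array([1.0, 0.0]), "b": np.array([0.0, 2.0])}
P = {"a": np.array([[0.5, 0.5], [0.5, 0.5]]),
     "b": np.array([[1.0, 0.0], [0.0, 1.0]])}

def T_sigma(sigma, v):
    return r[sigma] + beta * P[sigma] @ v

def T(v):
    """Upper envelope: pointwise maximum of T_sigma v over sigma."""
    return np.max([T_sigma(s, v) for s in r], axis=0)

v = np.array([0.0, 1.0])
Tv = T(v)   # T_a v = [1.45, 0.45], T_b v = [0.0, 2.9]
```

Here $Tv$ takes its first coordinate from $T_a v$ and its second from $T_b v$, so the envelope need not coincide with any single $T_\sigma v$.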

**Definition (order-preserving)**

Given two partially ordered sets $(P,\precsim)$ and $(Q, \trianglelefteq)$, a map $T$ from $P$ to $Q$ is called **order-preserving** if given $p,p'\in P$, we have,

$$
p\precsim p' \implies Tp\trianglelefteq Tp'
$$

**Definition (order-reversing)**

$T$ is called **order-reversing** if,

$$
p\precsim p' \implies Tp'\trianglelefteq Tp
$$
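
A small sketch of both definitions under the pointwise order: the affine map $v \mapsto r + \beta P v$ with a nonnegative matrix $P$ is order-preserving, and its negation is order-reversing. The map and all numbers are assumptions chosen for illustration.

```python
import numpy as np

beta = 0.9
P = np.array([[0.7, 0.3],
              [0.2, 0.8]])    # nonnegative entries
r = np.array([1.0, -1.0])

def T(v):
    """Order-preserving because beta * P has nonnegative entries."""
    return r + beta * P @ v

def S(v):
    """Negation of T, which reverses the pointwise order."""
    return -T(v)

def leq(u, v):
    return bool(np.all(u <= v))

v = np.array([0.0, 1.0])
w = np.array([0.5, 1.0])      # v <= w pointwise

order_preserved = leq(v, w) and leq(T(v), T(w))
order_reversed = leq(v, w) and leq(S(w), S(v))
```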

**Definition (increasing)**

Given two partially ordered sets $(P,\precsim)$, $(\mathbb{R},\le)$, we call $h\in\mathbb{R}^P$ **increasing** if

$$
p\precsim p'\implies h(p)\le h(p')
$$

**We use the symbol $i\mathbb{R}^P$ for the set of increasing functions in $\mathbb{R}^P$**.

**Definition (decreasing)**

Given two partially ordered sets $(P,\precsim)$, $(\mathbb{R},\le)$, we call $h\in\mathbb{R}^P$ **decreasing** if

$$
p\precsim p'\implies h(p)\ge h(p')
$$

## Theorems and key takeaways

- A linear map is conjugate to its Jordan normal form, with the conjugacy given by the change of basis to a generalized eigenbasis (a linear bijection, hence a diffeomorphism).

- Under conjugacy, the fixed points of one system correspond to the fixed points of the other.

- The restriction of $\Phi$ to $fix(T)$ is a bijection from $fix(T)$ onto $fix(\hat T)$.

- Topological conjugacy preserves convergence.

- Topological conjugacy is an equivalence relation.

- Orders of convergence are studied for errors in a neighborhood of zero, where $e_k^q$ shrinks as $q$ grows, so higher orders are faster.

- Successive approximation typically converges at a **linear rate**.

- Successive approximation always converges when global stability holds

- Under mild conditions, there exists a neighborhood of the fixed point within which the Newton iterates converge **quadratically**.

- We can accelerate computation by exploiting special structure of the problem, such as differentiability, convexity, or monotonicity. But we face a **tradeoff between speed and robustness**: **more robust methods exploit less structure and are therefore slower**.

- Successive approximation can be partially parallelized, but the algorithm is inherently serial.

- Newton's method is less serial (it involves fewer iterations), but each step is more expensive (it requires inverting a high-dimensional matrix). Being less serial means more potential for parallelization.

- The objective in dynamic programming is to maximize or minimize a lifetime value or cost function, which is a function over a state space. Thus, the objective takes values in a particular ordered set and we seek greatest or least elements.

- If a subset $A$ of a partially ordered set has a greatest element, then that element is the supremum of $A$. Conversely, if the supremum of $A$ lies in $A$, then it is the greatest element of $A$.

- If $V$ is a sublattice, then the supremum and infimum of any nonempty finite subset of $V$ are in $V$.

- The set $i\mathbb{R}^P$ is a sublattice of $\mathbb{R}^P$.

**Proposition 2.1.1.**

If $(U,T)$ and $(\hat U, \hat T)$ are conjugate under $\Phi$, then

- $u \in fix (T)\iff \Phi u \in fix(\hat T)$

- $\Phi^{-1} \hat u \in fix (T)\iff \hat u \in fix(\hat T)$

- $|fix(T)| = |fix(\hat T)|$

**Proposition 2.1.2.**

If $(U,T)$ and $(\hat U,\hat T)$ are topologically conjugate, then

- $T$ is globally stable on $U$ if and only if $\hat T$ is globally stable on $\hat U$

- the unique fixed points $u^*\in U$ and $\hat u^*\in\hat U$ satisfy $\hat u^* = \Phi u^*$.

**Hartman-Grobman Theorem**

Let $u^*$ be a fixed point of $T$. If $J_T(u^*)$ is nonsingular and has no eigenvalues on the unit circle in $\mathbb{C}$, then there exists an open neighborhood $O$ of $u^*$ such that $(O,T)$ and $(O,\hat T)$ are topologically conjugate.

Here $\hat T$ is the linearization of $T$ at $u^*$, which drops the higher-order remainder in the Taylor expansion of $T$ around $u^*$, i.e.,

$$
\hat T u = u^* + J_T(u^*) (u-u^*)
$$

**Corollary of Hartman-Grobman Theorem**

Under the conditions of the Hartman-Grobman theorem, the fixed point $u^*$ is locally stable whenever $\rho(J_T(u^*))<1$.
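
A one-dimensional sketch of this local stability test follows. The logistic map $Tu = ru(1-u)$ with $r = 2.5$ is an assumption chosen for illustration; its interior fixed point is $u^* = 1 - 1/r$ and its Jacobian there is the scalar $r(1-2u^*)$.

```python
import numpy as np

r = 2.5   # hypothetical parameter

def T(u):
    return r * u * (1 - u)

def jacobian(u):
    """Jacobian of T at u, stored as a 1 x 1 matrix."""
    return np.array([[r * (1 - 2 * u)]])

def spectral_radius(M):
    return np.max(np.abs(np.linalg.eigvals(M)))

u_star = 1 - 1 / r                         # interior fixed point of T
rho = spectral_radius(jacobian(u_star))    # equals |r (1 - 2 u*)| = 0.5

# Iterate T from a point near u_star and record the final error
u = u_star + 0.05
for _ in range(100):
    u = T(u)
final_error = abs(u - u_star)
```

Since the Jacobian is nonzero and has modulus below one, the corollary applies, and the iterates started nearby return to $u^*$.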

**Lemma (Inequalities and identities related to pointwise partial order on $\mathbb{R}^X$, $X$ is finite)**

For $f,g,h \in\mathbb{R}^X$, we have

1. Triangle inequality: $|f+g|\le |f| + |g|$

2. Distributive laws with addition:

$$
(f\wedge g)+h = (f+h)\wedge (g+h)
$$

$$
(f\vee g)+h = (f+h)\vee (g+h)
$$

3. Distributive laws for wedge and vee:

$$
(f\vee g)\wedge h = (f\wedge h)\vee(g\wedge h)
$$

$$
(f\wedge g)\vee h = (f\vee h)\wedge (g\vee h)
$$

4. The difference of minima is at most the difference:

$$
|f\wedge h- g\wedge h| \le |f-g|
$$

5. The difference of maxima is at most the difference:

$$
|f\vee h - g\vee h| \le|f-g|
$$

If $f,g,h\in\mathbb{R}^X_+$, we also have that **the minimum with a sum is at most the sum of the two minima**:

$$
(f+g)\wedge h \le (f\wedge h)+(g\wedge h)
$$
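
The standard forms of these identities and inequalities can be spot-checked numerically. The sketch below draws random vectors in $\mathbb{R}^X$ with $|X| = 5$. The seed, the dimension, and the names `join` and `meet` are assumptions, and a random check is a sanity test rather than a proof.

```python
import numpy as np

rng = np.random.default_rng(0)
f, g, h = rng.normal(size=(3, 5))      # three elements of R^X with |X| = 5

join, meet = np.maximum, np.minimum    # pointwise supremum and infimum
eps = 1e-12                            # slack for floating point comparisons

checks = {
    "triangle": np.all(np.abs(f + g) <= np.abs(f) + np.abs(g) + eps),
    "meet_plus": np.allclose(meet(f, g) + h, meet(f + h, g + h)),
    "join_plus": np.allclose(join(f, g) + h, join(f + h, g + h)),
    "distrib_1": np.allclose(meet(join(f, g), h), join(meet(f, h), meet(g, h))),
    "distrib_2": np.allclose(join(meet(f, g), h), meet(join(f, h), join(g, h))),
    "meet_diff": np.all(np.abs(meet(f, h) - meet(g, h)) <= np.abs(f - g) + eps),
    "join_diff": np.all(np.abs(join(f, h) - join(g, h)) <= np.abs(f - g) + eps),
}

# The last inequality is stated for nonnegative functions only
fp, gp, hp = np.abs(f), np.abs(g), np.abs(h)
checks["meet_sum"] = np.all(meet(fp + gp, hp) <= meet(fp, hp) + meet(gp, hp) + eps)

all_hold = all(checks.values())
```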

**Lemma (useful to DP)**

Let $D$ be a finite set. If $f$ and $g$ are elements of $\mathbb{R}^D$, then,

$$
|\max_{z\in D} f(z)- \max_{z\in D} g(z)|\le \max_{z\in D}|f(z)-g(z)|
$$
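
A short derivation, included here as a sketch of why the inequality holds. For every $z\in D$,

$$
f(z) = g(z) + f(z) - g(z) \le \max_{z'\in D} g(z') + \max_{z'\in D}|f(z')-g(z')|
$$

Taking the maximum over $z$ on the left-hand side gives

$$
\max_{z\in D} f(z) - \max_{z\in D} g(z) \le \max_{z\in D}|f(z)-g(z)|
$$

Swapping the roles of $f$ and $g$ gives the same bound for $\max_{z\in D} g(z) - \max_{z\in D} f(z)$, and the two bounds together yield the claim.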

**Lemma (useful to DP)**

If, for each $\sigma\in \Sigma$, the operator $T_\sigma$ is a contraction of modulus $\lambda_\sigma$ under the supremum norm, then $T = \bigvee_\sigma T_\sigma$ is a contraction of modulus $\max_\sigma \lambda_\sigma$ under the same norm.
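
The contraction property of the upper envelope can be checked numerically. In the sketch below, each $T_\sigma v = r_\sigma + \lambda_\sigma P_\sigma v$ has a row-stochastic matrix $P_\sigma$, which makes it a contraction of modulus $\lambda_\sigma$ under the supremum norm. The family, the moduli, and the random seed are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
lam = {"a": 0.5, "b": 0.9}    # hypothetical contraction moduli

def random_stochastic(n):
    """Random n x n matrix with nonnegative rows summing to one."""
    M = rng.random((n, n))
    return M / M.sum(axis=1, keepdims=True)

P = {s: random_stochastic(n) for s in lam}
r = {s: rng.normal(size=n) for s in lam}

def T(v):
    """Upper envelope of the maps T_sigma v = r_sigma + lam_sigma * P_sigma v."""
    return np.max([r[s] + lam[s] * P[s] @ v for s in lam], axis=0)

def sup_norm(x):
    return np.max(np.abs(x))

# Largest observed ratio ||Tv - Tw|| / ||v - w|| over random pairs
ratios = []
for _ in range(1000):
    v, w = rng.normal(size=n), rng.normal(size=n)
    ratios.append(sup_norm(T(v) - T(w)) / sup_norm(v - w))
worst_ratio = max(ratios)
```

The observed ratio stays at or below $\max_\sigma \lambda_\sigma = 0.9$, consistent with the lemma.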

### Newton's fixed point method

Let $T$ be a differentiable self-map on an open set $U\subset \mathbb{R}^n$.

We want to find a fixed point of $T$ on $U$.

1. Start from an arbitrary guess $u_0\in U$.

2. Find the fixed point of the linearization $\hat T$ of $T$ around the guess, and let this fixed point be $u_1$.

The linearization around $u_0$ is $\hat T u = Tu_0 + J_T(u_0)(u-u_0)$. Since $u_1$ is a fixed point of $\hat T$,

$$
u_1 = \hat T u_1 = Tu_0 + J_T(u_0)(u_1-u_0) = Tu_0 + J_T(u_0)u_1-J_T(u_0)u_0
$$

Rearranging gives

$$
(I-J_T(u_0))u_1 = Tu_0 - J_T(u_0)u_0
$$

so, provided $I-J_T(u_0)$ is invertible,

$$
u_1 = (I-J_T(u_0))^{-1}(Tu_0 - J_T(u_0)u_0)
$$

3. Set $u_1$ as the new guess

4. Find the fixed point of the linearization of $T$ around $u_1$, and let this be $u_2$

5. Iterating this procedure yields the sequence of points

$$
u_{k+1} = Qu_k
$$

where

$$
Qu := (I-J_T(u))^{-1}(Tu - J_T(u)u)
$$
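
The iteration $u_{k+1} = Qu_k$ can be sketched as follows. The function names, the stopping rule, and the test map on $\mathbb{R}^2$ are assumptions made for illustration. The sketch solves the linear system $(I - J_T(u))u' = Tu - J_T(u)u$ rather than forming the inverse explicitly.

```python
import numpy as np

def newton_fixed_point(T, J, u0, tol=1e-12, max_iter=50):
    """Iterate u_{k+1} = Q u_k, where Q u = (I - J(u))^{-1} (T u - J(u) u).

    Returns the approximate fixed point and the number of iterations used.
    """
    u = np.asarray(u0, dtype=float)
    I = np.eye(u.size)
    for k in range(max_iter):
        Ju = J(u)
        # Solve (I - J(u)) u_new = T u - J(u) u instead of inverting
        u_new = np.linalg.solve(I - Ju, T(u) - Ju @ u)
        if np.max(np.abs(u_new - u)) < tol:
            return u_new, k + 1
        u = u_new
    raise RuntimeError("Newton iteration did not converge")

def successive_approximation(T, u0, tol=1e-12, max_iter=1000):
    """Iterate u_{k+1} = T u_k, for comparison."""
    u = np.asarray(u0, dtype=float)
    for k in range(max_iter):
        u_new = T(u)
        if np.max(np.abs(u_new - u)) < tol:
            return u_new, k + 1
        u = u_new
    raise RuntimeError("Successive approximation did not converge")

# Hypothetical smooth self-map on R^2 and its Jacobian
def T(u):
    return 0.5 * np.array([np.cos(u[1]), np.sin(u[0])])

def J(u):
    return 0.5 * np.array([[0.0, -np.sin(u[1])],
                           [np.cos(u[0]), 0.0]])

u0 = np.zeros(2)
u_newton, n_newton = newton_fixed_point(T, J, u0)
u_succ, n_succ = successive_approximation(T, u0)
```

In this example both methods reach the same fixed point, and Newton's method uses fewer iterations, although each of its iterations requires a Jacobian evaluation and a linear solve.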



### Suprema and Infima under a pointwise order

For a pair of functions $\{u,v\}$, the supremum in $(\mathbb{R}^X,\le)$ is the pointwise maximum, while the infimum in $(\mathbb{R}^X, \le)$ is the pointwise minimum.

The same principle holds for **finite collections of functions**.


Thus, if $\{v_i\} = \{v_i\}_{i\in I}$ is a finite subset of $\mathbb{R}^X$, then, for all $x\in X$,

$$
\left(\bigvee_{i} v_i\right)(x) := \max_{i\in I}v_i(x)
$$

$$
\left(\bigwedge_{i} v_i\right)(x) := \min_{i\in I}v_i(x)
$$
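
In code, with each function stored as a row of an array, the pointwise supremum and infimum are column-wise maxima and minima. The array `V` below holds hypothetical values on a three-point set $X$.

```python
import numpy as np

# Rows are functions v_i in R^X with X = {0, 1, 2}; values are hypothetical
V = np.array([[1.0, 5.0, 2.0],
              [3.0, 0.0, 2.5],
              [2.0, 4.0, 1.0]])

sup_V = V.max(axis=0)   # (sup_i v_i)(x) = max_i v_i(x)
inf_V = V.min(axis=0)   # (inf_i v_i)(x) = min_i v_i(x)

# The supremum is an upper bound of every v_i but need not equal any of them
is_upper_bound = bool(np.all(V <= sup_V))
equals_some_row = any(np.array_equal(row, sup_V) for row in V)
```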