<hr style="border:1px solid gray"> </hr>

# Lecture: What is Homotopy Continuation?

Homotopy Continuation is a computational framework for solving systems of polynomial equations over the complex numbers $\mathbb C$.

## Example:
Which of the following are polynomials in the variables $x,y$?

\begin{align*}
\text{❌} \quad g(x,y) &= 13 \exp(x) + y + 1\\[0.5em]
\text{❌} \quad h(x,y) &= \sqrt{x+y} - y^2\\[0.5em]
\text{✅} \quad f(x,y) &= 4x^2 + 2xy - (10 + 4i) (y-1)\\
\end{align*}



<hr style="border:1px solid gray"> </hr>

In the following, let

$$F(x_1,\ldots,x_n) = \begin{bmatrix} f_1(x_1,\ldots,x_n)\\ \vdots \\ f_m(x_1,\ldots,x_n)\end{bmatrix}$$

be a **system** of $m$ polynomials in $n$ variables.

* If $n=m$, we call $F$ a <u>square system</u>
* If $n<m$, we call $F$ <u>overdetermined</u>
* If $n>m$, we call $F$ <u>underdetermined</u>

<p style="border:3px; border-style:solid; padding: 0.5em; text-align:center">We assume $m=n$.
</p>

<hr style="border:1px solid gray"> </hr>

# What solving means

In the context of homotopy continuation, *solving* means computing **numerical approximations** of **isolated solutions**.

Let us look at an example: take 

$$F(x,y,z) = \begin{bmatrix} x^2 + y^2 + z^2 - 1\\ x^2 - y + z^2\\ x-z\end{bmatrix}.$$

A useful representation of the solutions of $F=0$ is a *Gröbner basis*:

$$ \mathcal G = \{x-z, \; y-2z^2, \; z^4+\tfrac{1}{2} z^2 - \tfrac{1}{4}\}.$$

<br>

* We can read off from $\mathcal G$ the number of solutions of $F=0$. 

* And we can solve $F=0$ by iteratively solving *univariate equations*. 
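
The back-substitution suggested by $\mathcal G$ can be sketched in plain Julia. This is only an illustration under the assumption that the quartic in $z$ is treated as a quadratic in $w=z^2$; the names `w_values`, `sols` and `residuals` are made up for this sketch.

```julia
# Solve the univariate quartic z^4 + z^2/2 - 1/4 = 0 as a quadratic in w = z^2,
# then back-substitute into x = z and y = 2z^2.
w_values = [(-1 + sqrt(5)) / 4, (-1 - sqrt(5)) / 4]   # roots of w^2 + w/2 - 1/4

sols = Vector{Vector{ComplexF64}}()
for w in w_values
    for z in (sqrt(complex(w)), -sqrt(complex(w)))
        push!(sols, [z, 2z^2, z])                     # stored as (x, y, z)
    end
end

# Residual of the original system F at a point v = (x, y, z)
F(v) = [v[1]^2 + v[2]^2 + v[3]^2 - 1, v[1]^2 - v[2] + v[3]^2, v[1] - v[3]]
residuals = [maximum(abs.(F(s))) for s in sols]
```

Each entry of `residuals` should be at the level of floating-point round-off, and `length(sols)` is the number of solutions read off from $\mathcal G$.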

<br>

But this is a reduction to the problem we started with: solving equations.

Homotopy Continuation takes a different approach. 

It keeps the multivariate structure and tries directly to 

<p style="border:3px; border-style:solid; padding: 0.5em; text-align:center">compute points $\xi\in\mathbb C^n$ such that the distance $\Vert \xi-\zeta\Vert$ is small,
</p>


where $\zeta$ is a true zero: $F(\zeta)=0$.

<br>



In [32]:
using HomotopyContinuation

@var x y z
F = [x^2 + y^2 + z^2 - 1; x^2 - y + z^2; x-z]

zeros = solve(F)

Result with 4 solutions
• 4 paths tracked
• 4 non-singular solutions (2 real)
• random_seed: 0x8b97d294
• start_system: :polyhedral


In [25]:
[round.(s, digits = 3) for s in solutions(zeros)]

4-element Array{Array{Complex{Float64},1},1}:
 [-0.0 - 0.899im, -1.618 - 0.0im, 0.0 - 0.899im]
 [0.556 + 0.0im, 0.618 - 0.0im, 0.556 + 0.0im]
 [-0.556 + 0.0im, 0.618 + 0.0im, -0.556 - 0.0im]
 [0.0 + 0.899im, -1.618 - 0.0im, -0.0 + 0.899im]

The true solutions of 
$$  x^2 + y^2 + z^2 - 1 = x^2 - y + z^2 = x-z = 0$$
are

\begin{align*}
(x,y,z) &= (\tfrac{1}{2}\sqrt{\sqrt{5} -1},\quad \tfrac{1}{2}(\sqrt{5} -1),\quad \tfrac{1}{2}\sqrt{\sqrt{5} -1}\,)\\[0.5em]
(x,y,z) &= (-\tfrac{1}{2}\sqrt{\sqrt{5} -1},\quad \tfrac{1}{2}(\sqrt{5} -1),\quad -\tfrac{1}{2}\sqrt{\sqrt{5} -1}\,)\\[0.5em]
(x,y,z) &= (\tfrac{i}{2}\sqrt{\sqrt{5} +1},\quad -\tfrac{1}{2}(\sqrt{5} +1),\quad\tfrac{i}{2}\sqrt{\sqrt{5} +1}\,)\\[0.5em]
(x,y,z) &= (-\tfrac{i}{2}\sqrt{\sqrt{5} +1},\quad -\tfrac{1}{2}(\sqrt{5} +1),\quad-\tfrac{i}{2}\sqrt{\sqrt{5} +1}\,)\\
\end{align*}
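
To make the phrase "the distance $\Vert \xi-\zeta\Vert$ is small" concrete, the following sketch compares the first exact zero in the list with the corresponding rounded solver output. The variable names are chosen for this illustration only.

```julia
using LinearAlgebra

# Exact real zero ζ, taken from the first line of the list of true solutions
ζ = [sqrt(sqrt(5) - 1) / 2, (sqrt(5) - 1) / 2, sqrt(sqrt(5) - 1) / 2]

# Numerical approximation ξ, copied from the rounded solver output (3 digits)
ξ = [0.556, 0.618, 0.556]

dist = norm(ξ - ζ)   # Euclidean distance ‖ξ - ζ‖
```

Because `ξ` was rounded to three digits, `dist` is limited by that rounding and not by the accuracy of the solver.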

<br>

Why do we even compute numerical approximations when we can have such exact results?

<br>

<u>Answer 1:</u> not all zeros can be written in terms of simple operations like $+,-,\cdot,/,\sqrt{\quad}$.



<u>Answer 2:</u> numerical methods are often faster than exact computations, especially when the problems become more complicated.

<hr style="border:1px solid gray"> </hr>


# The basic idea

We first discuss the basic idea underlying homotopy continuation. For this, we denote 

$$\mathbb C[x_1,\ldots,x_n]_{d} := \{\text{polynomials with coefficients in $\mathbb C$ of degree at most $d$ in the variables $x_1,\ldots,x_n$}\}.$$

Let $d_1,\ldots,d_n$ be fixed and define

$$\mathcal R := \mathbb C[x_1,\ldots,x_n]_{d_1} \times \cdots \times \mathbb C[x_1,\ldots,x_n]_{d_n}$$

to be the vector space of systems of $n$ polynomials in $n$ variables with complex coefficients.

Suppose that the <u>system we are interested in</u> is 

$$F(x) \in \mathcal R.$$

Suppose further that there is another system

$$G(x) \in \mathcal R,$$

of which we know or can easily compute a zero $\zeta$ with $G(\zeta)=0$. Let $H(x,t): \mathbb C^n \times [0,1] \to \mathbb C^n$
be a **homotopy** in $\mathcal R$ (that is, $H(\cdot,t)\in\mathcal R$ for every $t$) with 

$$H(x,1) = G(x)\quad \text{and}\quad H(x,0)=F(x).$$ 



<p style="border:3px; border-style:solid; padding: 0.5em; text-align:center">
The idea is to track $\zeta$ along the homotopy $H(x,t)$ from $t=1$ to $t=0$.
</p>

<br>

<img src="tracking.gif" width="500" style="float:right;">

<img src="geometry.pdf" width="400"> 

## Example

$n=m=1$ and $G(x)=x^8 -1$.

The zeros of $G$ are $\{\exp(2\pi i k/8)\mid 1\leq k\leq 8\}$.

(points on the unit circle)

As $t$ moves from $1$ to $0$ the blue points move in the complex plane $\mathbb C$.
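
The start zeros of this example, the eighth roots of unity, can be computed directly. A minimal sketch in plain Julia; `start_zeros`, `max_residual` and `on_circle` are names introduced here for illustration.

```julia
# The eight zeros of G(x) = x^8 - 1 are the 8th roots of unity exp(2πi·k/8)
G(x) = x^8 - 1
start_zeros = [exp(2π * im * k / 8) for k in 1:8]

# Each start zero satisfies G(ζ) ≈ 0 and lies on the unit circle
max_residual = maximum(abs(G(ζ)) for ζ in start_zeros)
on_circle = all(abs(abs(ζ) - 1) < 1e-12 for ζ in start_zeros)
```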

<hr style="border:1px solid gray"> </hr>

# What tracking means

Tracking means to compute a numerical approximation of the **solution curve** $x(t)$ with

$$H(x(t), t) = 0.$$

Differentiating with respect to $t$ gives:

$$\frac{\partial}{\partial t} H(x(t),t) + \frac{\partial}{\partial x} H(x(t),t) \,\frac{\mathrm d}{\mathrm d t} x(t) = 0.$$


Computing $x(t)$ is equivalent to solving an 
<bdi style="border:3px; border-style:solid; padding: 0.2em;">
ODE initial value problem!
</bdi>
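
Assuming that $\frac{\partial}{\partial x} H$ is invertible along the solution curve, the differentiated equation above can be solved for the derivative of $x(t)$. Together with the known zero $\zeta$ of $G$ at $t=1$ this makes the initial value problem explicit:

$$\frac{\mathrm d}{\mathrm d t} x(t) = -\Big(\frac{\partial}{\partial x} H(x(t),t)\Big)^{-1}\,\frac{\partial}{\partial t} H(x(t),t),\qquad x(1)=\zeta.$$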

<br>

Numerical approximation means that we compute discrete values $t_1,t_2,\ldots,t_k,\ldots,$ and points $\widetilde{x}_1,\widetilde{x}_2,\ldots,\widetilde{x}_k,\ldots$ such that $\Vert \widetilde{x}_k - x(t_k)\Vert$ is small for all $k$. Then, we can move $\widetilde{x}_k$ towards $x(t_k)$ using *Newton's method*. 

This is called the <bdi style="border:3px; border-style:solid; padding: 0.2em;">corrector</bdi>.
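
As a hedged illustration of the corrector, the sketch below applies Newton's method to the example system $F(x,y,z)$ from above, starting from a rounded approximation. The helper names `JF`, `newton_step` and `correct` are hypothetical and not part of HomotopyContinuation.jl.

```julia
using LinearAlgebra

# The example system F(x, y, z) from above and its Jacobian matrix
F(v) = [v[1]^2 + v[2]^2 + v[3]^2 - 1, v[1]^2 - v[2] + v[3]^2, v[1] - v[3]]
JF(v) = [2v[1]  2v[2]  2v[3];
         2v[1]  -1     2v[3];
         1      0      -1]

# One Newton step: v ← v - JF(v)⁻¹ F(v), computed by solving a linear system
newton_step(v) = v - JF(v) \ F(v)

# Apply a fixed number of Newton steps (no convergence test in this sketch)
function correct(v; steps = 5)
    for _ in 1:steps
        v = newton_step(v)
    end
    return v
end

v0 = [0.556, 0.618, 0.556]      # rounded approximation of a real zero
v1 = correct(v0)
residual_before = norm(F(v0))
residual_after = norm(F(v1))
```

A production tracker would stop on a tolerance instead of a fixed number of steps; the fixed count keeps the sketch short.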

<br>

<img src="predictor-corrector.png" width="500"> 

<br>

The <bdi style="border:3px; border-style:solid; padding: 0.2em;">predictor</bdi> produces $\widetilde{x}_{k}$ from an approximation of $x(t_{k-1})$. 


There are many predictor methods for ODEs. HomotopyContinuation.jl uses a variant of the so-called Padé predictor

(Sascha and Simon can tell more about this).
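
The interplay of predictor and corrector can be sketched on a univariate toy problem. This sketch deliberately swaps in an explicit Euler step as the predictor; it is not the Padé predictor and not how HomotopyContinuation.jl is implemented. The systems `G`, `F` and the helper `track` are assumptions made for the illustration.

```julia
# Start system G with known zero ζ = 1, and target system F (both chosen for illustration)
G(x) = x^2 - 1
F(x) = x^2 - 2

# Straight-line homotopy with H(x, 1) = G(x) and H(x, 0) = F(x)
H(x, t)  = t * G(x) + (1 - t) * F(x)
Hx(x, t) = 2x                  # ∂H/∂x, since H(x, t) = x^2 - (2 - t)
Ht(x, t) = G(x) - F(x)         # ∂H/∂t

# Hypothetical helper: track one zero from t = 1 to t = 0 in `nsteps` equal steps
function track(x; nsteps = 20, newton_iters = 3)
    for k in 1:nsteps
        t_prev = 1 - (k - 1) / nsteps
        t = 1 - k / nsteps
        # Predictor: one explicit Euler step for x'(t) = -Ht / Hx
        x += (t - t_prev) * (-Ht(x, t_prev) / Hx(x, t_prev))
        # Corrector: a few Newton iterations on x ↦ H(x, t) with t held fixed
        for _ in 1:newton_iters
            x -= H(x, t) / Hx(x, t)
        end
    end
    return x
end

x_end = track(1.0 + 0.0im)     # start at the zero ζ = 1 of G
```

In this toy problem the solution curve is $x(t)=\sqrt{2-t}$, so `x_end` can be compared against `sqrt(2)`.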

<hr style="border:1px solid gray"> </hr>

# Gotta catch them all

So far, the discussion revolved around tracking a single solution curve and hence computing a single solution.

Polynomial homotopy continuation can do more: it can compute <bdi style="border:3px; border-style:solid; padding: 0.2em;">all solutions</bdi>.

Recall the definition 

$$\mathbb C[x_1,\ldots,x_n]_{d} := \{\text{polynomials with coefficients in $\mathbb C$ of degree at most $d$ in the variables $x_1,\ldots,x_n$}\}.$$

and

$$\mathcal R := \mathbb C[x_1,\ldots,x_n]_{d_1} \times \cdots \times \mathbb C[x_1,\ldots,x_n]_{d_n}.$$

Suppose $F(x)\in\mathcal R$. Then, we can take the following as start system

$$G(x) = \begin{bmatrix} x_1^{d_1} - 1\\ \vdots \\ x_n^{d_n} - 1\end{bmatrix}.$$

$G(x)$ is called a **total degree start system**.

<u> **Theorem**</u> (Bézout's theorem): $F(x)=0$ has at most $d_1\cdots d_n$ isolated solutions.

<br>

Since $G(x)$ has exactly $d_1\cdots d_n$ isolated zeros, this means that $G(x)$ has at least as many isolated solutions as $F(x)$. 

If we can find for each zero $\zeta$ of $F$ a corresponding zero $\xi$ of $G$ such that $\xi$ gets tracked towards $\zeta$, we can compute all solutions of $F(x)=0$.

<br>


<bdi style="border:3px; border-style:solid; padding: 0.2em;">Problem 1:</bdi>   Many zeros of $G(x)$ do not get tracked towards a zero of $F$. Computation is wasteful!


In [35]:
@var x y
F = [x^2 - x^2 * y^2 + 2y^2 - 1;  x - 4x * y + 3]
solve(F, start_system = :total_degree)

Result with 4 solutions
• 8 paths tracked
• 4 non-singular solutions (0 real)
• random_seed: 0xaecaa79f
• start_system: :total_degree
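
The count of 8 tracked paths can be reproduced by hand: the two polynomials have degrees 4 and 2, so the total degree start system is $(x^4-1,\ y^2-1)$. The sketch below enumerates its zeros in plain Julia; `roots_of_unity`, `start_solutions` and `bezout_bound` are names introduced for this illustration and are not library functions.

```julia
# Degrees of the two polynomials in the example above: deg f₁ = 4, deg f₂ = 2
degrees = (4, 2)

# The zeros of x_j^d - 1 are the d-th roots of unity
roots_of_unity(d) = [exp(2π * im * k / d) for k in 0:d-1]

# Every combination of roots is a zero of the total degree start system
start_solutions = vec([collect(p) for p in Iterators.product(map(roots_of_unity, degrees)...)])

bezout_bound = prod(degrees)   # Bézout bound d₁ ⋯ dₙ
```

Here `length(start_solutions)` equals `bezout_bound`, while the result above reports only 4 solutions, so half of the paths do not end at a zero of $F$.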


<br>

<bdi style="border:3px; border-style:solid; padding: 0.2em;">
Problem 2:</bdi>   Which homotopy to take?

<br>
<br>

It is appealing to take the *straight-line homotopy* $H(x,t) = tG(x) + (1-t)F(x)$.

It is, however, better to take a <u>path in the complex numbers</u>, because predictor-corrector methods rely on  $\frac{\partial}{\partial x} H(x,t)$ being invertible. This is why one usually uses the homotopy

$$H(x,t) = \lambda tG(x) + (1-t)F(x),$$

where $\lambda\in\mathbb C\setminus \mathbb R$ is a randomly chosen complex number. Then $H(x,1)=\lambda G(x)$ has the same zeros as $G(x)$, and $H(x,0)=F(x)$ as before.
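
A small experiment illustrates why a non-real $\lambda$ helps. It uses the convention $H(x,0)=F(x)$ from above, and it assumes the toy systems $G(x)=x^2-1$ and $F(x)=x^2+1$ and a fixed (not random) value $\lambda=e^{0.7i}$ so that the result is reproducible. All names are chosen for this sketch.

```julia
G(x) = x^2 - 1                 # start system (illustrative choice)
F(x) = x^2 + 1                 # target system (illustrative choice)

# H(x, t) = λ·t·G(x) + (1 - t)·F(x) = (λt + 1 - t)·x² - (λt - (1 - t))
H(x, t, λ)  = λ * t * G(x) + (1 - t) * F(x)
Hx(x, t, λ) = 2 * (λ * t + (1 - t)) * x            # ∂H/∂x

# One zero of H(·, t), obtained by solving the quadratic by hand
zero_of_H(t, λ) = sqrt(complex((λ * t - (1 - t)) / (λ * t + (1 - t))))

ts = range(0, 1; length = 101)
smallest_derivative(λ) = minimum(abs(Hx(zero_of_H(t, λ), t, λ)) for t in ts)

min_real    = smallest_derivative(1.0)             # real straight line
min_complex = smallest_derivative(exp(0.7im))      # path through complex numbers
```

With $\lambda=1$ the two zeros collide at $t=\tfrac12$, where $H(x,\tfrac12)=x^2$ and the derivative vanishes; with the non-real $\lambda$ the derivative stays bounded away from zero on the sampled grid.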


The set $\Delta \subset \mathcal R$ of polynomial systems that have a zero at which the derivative is *not* invertible forms an **algebraic subvariety** of $\mathcal R$. 

$\Delta$ is called the **discriminant**.

* In $\mathcal R\cap \{\text{systems with real coefficients}\}$ it has real codimension $1$.

* In $\mathcal R$ it has real codimension $2$. We can go around it!

## Example

The space of degree-2 univariate polynomials is $\{px^2 + qx + r \mid p,q,r\in\mathbb C,\ p\neq 0\} \subset \mathbb C[x]_{2}$.

Put $a:=\frac{q}{p}$ and $b:=\frac{r}{p}$. Then $\Delta = \{a^2 - 4b = 0\}$.
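
As a worked check of this formula, write the normalized polynomial as $x^2+ax+b$ with $a,b$ as above. It lies in $\Delta$ exactly when it has a zero at which its derivative $2x+a$ also vanishes:

$$x^2+ax+b = 0 \;\text{ and }\; 2x+a=0 \quad\Longrightarrow\quad x=-\tfrac{a}{2},\quad \tfrac{a^2}{4}-\tfrac{a^2}{2}+b=0 \quad\Longleftrightarrow\quad a^2-4b=0.$$

This is one complex equation in $(a,b)$, that is, two real equations, which matches the real codimension $2$ stated above. For real $a,b$ it is a single real equation.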

<br>

<img src="poly_deg_2.pdf" width="500">  <span style="display:inline-block; width: 1.5cm;"></span> <img src="poly_deg_2_C.pdf" width="500"> 

<br>

<hr style="border:1px solid gray"> </hr>

# Start systems

The start system 
$$G(x) = \begin{bmatrix} x_1^{d_1} - 1\\ \vdots \\ x_n^{d_n} - 1\end{bmatrix}$$
is not always an optimal choice.

<br>

<bdi style="border:3px; border-style:solid; padding: 0.2em;">
Are there better choices?</bdi><br>

<br>


Yes, but the difficulty is constructing them.
<hr style="border:1px solid gray"> </hr>
<br>

At the end of this first lecture we want to show that good start systems at least exist.

**<u>Definition:</u>** Let $V\subset \mathcal R$ be an algebraic variety. For simplicity we assume it is smooth. We define

(1) $S:=\{(F,z) \in V\times \mathbb C^n \mid F(z)=0\}$.

(2) $\Sigma := \{(F,z)\in S \mid \det(JF(z))=0\}$, where $JF(x)=(\frac{\partial f_i}{\partial x_j})_{i,j}$ is the Jacobian matrix.

(3) $\pi:S \to V, (F,z)\mapsto F$.

(4) $\Delta := \pi(\Sigma)$ is the discriminant.

Observe that $\pi$ is surjective.

<br>
<bdi style="border:3px; border-style:solid; padding: 0.2em;">
We call $V$ a family of polynomial systems</bdi><br>
<br>

**<u>Theorem:</u>** Let $G\in V\setminus \Delta$ and $F\in V$. The start system is $G$. The target system is $F$.

Let the isolated zeros of $G$ and $F$ be

\begin{align*}
\pi^{-1}(G) \cap (S \setminus \Sigma) &= \{ (G,z_1),\ldots, (G, z_k)\} \;\;\text{ and }\\[0.2em]
\pi^{-1}(F) \cap (S \setminus \Sigma) &= \{ (F,y_1),\ldots, (F, y_\ell)\}
\end{align*}

<img src="geometry_2.pdf" width="400" style="float:right;">

Then 

$$k\geq \ell$$

and for almost all paths $\gamma: [0,1]\to V$ with $\gamma(1) = G$ and $\gamma(0)=F$ we find paths 

$$\Gamma_1(t),\ldots,\Gamma_\ell(t)\subset (S\setminus \Sigma)$$

and a permutation $\sigma\in \mathfrak S_{k}$ such that for all $i$:

$$\Gamma_i(1) = (G, z_{\sigma(i)}),\quad \Gamma_i(0)=(F, y_i),\quad \text{ and } \quad \pi\,(\Gamma_i(t))=\gamma(t)$$

<br>

<ul style="border:3px; border-style:solid; padding: 2em;">
  <li style="margin-bottom: 5px;">This theorem means that <u>every</u> $G\in V\setminus \Delta$ can be used as a start system.</li>
  <li style="margin-bottom: 5px;">All $G\in V\setminus \Delta$ have $k$ isolated zeros in $\mathbb C^n$ ($k$ is the <u>generic</u> number of zeros).</li>
  <li style="margin-bottom: 5px;">A good start system has a $k$, which is not much larger than $\ell$.</li>
  <li style="margin-bottom: 5px;">To find a good start system means that we must also find a good $V$.</li>
  <li>The "larger" $V$ is, the larger $k$ is.</li>
</ul> 
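
A minimal illustration of $k>\ell$, assuming the family $V$ of univariate polynomials of degree at most $2$, the start system $G(x)=x^2-1$ (so $k=2$), the target $F(x)=x-1$ (so $\ell=1$) and a fixed non-real $\lambda$. The homotopy factors as $(x-1)(\lambda t x + \lambda t + 1 - t)$, so both zeros are known in closed form. All names belong to this sketch only.

```julia
λ = exp(0.7im)                     # fixed non-real number, chosen arbitrarily
G(x) = x^2 - 1                     # start system, zeros +1 and -1  (k = 2)
F(x) = x - 1                       # target system, single zero +1  (ℓ = 1)
H(x, t) = λ * t * G(x) + (1 - t) * F(x)

# The two zeros of H(·, t) for t in (0, 1], found by factoring H by hand
zero_a(t) = 1.0 + 0.0im
zero_b(t) = -(λ * t + (1 - t)) / (λ * t)

ts = [1.0, 0.1, 0.01, 0.001]
residuals = [abs(H(zero_b(t), t)) for t in ts]
magnitudes = [abs(zero_b(t)) for t in ts]
```

The path `zero_a` ends at the zero of $F$, while `magnitudes` grows without bound as $t\to 0$: the second start zero is not connected to any zero of $F$.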


**<u>Proof:</u>**  The discriminant $\Delta$ is a subvariety of $V$. It has real codimension at least 2. This implies that almost all paths 

$$\gamma: [0,1]\to V \quad\text{ with } \quad \gamma(1) = G(x) \quad\text{ and } \quad \gamma(0)=F(x)$$

are such that 

$$\gamma(\,(0,1]\,) \cap \Delta = \emptyset.$$

Fix $0< t\leq 1$ and write

$$f = \gamma(t).$$

Since $\pi$ is surjective, $f$ has a zero $\zeta$. 

By assumption $f\not \in \Delta$.  This implies that $(f,\zeta)\not\in \Sigma$ and so $\det(Jf(\zeta))\neq 0$ <u>for all zeros</u> $\zeta$ of $f$.

By the implicit function theorem, there is a neighborhood $\hat{U}_{t} \subset S$ of $(f,\zeta)$ such that 

$$\pi|_{\hat{U}_{t}}: \hat{U}_{t}\to \pi(\hat{U}_{t})\subset V$$

is invertible. Let

$$U_t:=\pi(\,\hat{U}_{t}\,)\quad \text{ and let }\quad \psi_{t}: U_{t}\to \hat{U}_{t}$$

be the inverse of $\pi$ on $U_{t}$. 

<br>

We have constructed a family of pairs $(\psi_t, U_t)_{t\in (0,1]}$ indexed by $(0,1]$. 

The preimages $\gamma^{-1}(U_t)$ form an open cover of the interval $(0,1]$. For this cover we can find a partition of unity $(p_s)_{s\in (0,1]}$ on $(0,1]$ to define

$$\Gamma(t) := \sum_{s\in (0,1]}\, p_s(t)\cdot\psi_s(\,\gamma(t)\,).$$

For $t=1$ we must find a uniquely defined zero $z_j$ of $G$ such that $\Gamma(1) = (G, z_j)$.

<br>

We have now shown that we can pair each zero of $f=\gamma({t})$ with a uniquely determined zero of $G$.

<u>We can also go back</u> from $G$ to $f$: for each zero of $G$ there exists a uniquely determined zero of $\gamma(t)$ for $0<t\leq 1$.


We can also use the argument above for all isolated zeros $y_1,\ldots,y_\ell$ of $F$: 

<p style="border:3px; border-style:solid; padding: 0.2em; text-align:center;">
we can connect all zeros of $F$ with some zeros of $G$, but not the other way round.
</p>

This shows that $k\geq \ell$ and the existence of paths

$$\Gamma_1(t),\ldots,\Gamma_\ell(t)\subset (S\setminus \Sigma)$$

and a permutation $\sigma\in \mathfrak S_{k}$ such that for all $i$:

$$\Gamma_i(1) = (G, z_{\sigma(i)}),\quad \Gamma_i(0)=(F, y_i),\quad \text{ and } \quad \pi\,(\Gamma_i(t))=\gamma(t).$$

The proof is finished. $\square$
