<hr style="border:1px solid gray"> </hr>

# Lecture: What is Homotopy Continuation?

Homotopy Continuation is a method for solving systems of polynomial equations over the complex numbers $\mathbb C$.

## Example:
Of the following three functions, only $f$ is a polynomial in the variables $x,y$:

\begin{align*}
\text{❌} \quad g(x,y) &= 13 \exp(x) + y + 1\\[0.5em]
\text{❌} \quad h(x,y) &= \sqrt{x+y} - y^2\\[0.5em]
\text{✅} \quad f(x,y) &= 4x^2 + 2xy - (10 + 4i) (y-1)\\
\end{align*}



<hr style="border:1px solid gray"> </hr>

In the following, let

$$F(x_1,\ldots,x_n) = \begin{bmatrix} f_1(x_1,\ldots,x_n)\\ \vdots \\ f_m(x_1,\ldots,x_n)\end{bmatrix}$$

be a **system** of $m$ polynomials in $n$ variables.

* If $n=m$, we call $F$ a <u>square system</u>

* If $n<m$, we call $F$ <u>overdetermined</u>

* If $n>m$, we call $F$ <u>underdetermined</u>

<p style="border:3px; border-style:solid; padding: 0.5em; text-align:center">We assume $m=n$.
</p>

<hr style="border:1px solid gray"> </hr>

# What solving means

In the context of homotopy continuation, *solving* means computing **numerical approximations** of **isolated solutions**.

Let us look at an example: take 

$$F(x,y,z) = \begin{bmatrix} x^2 + y^2 + z^2 - 1\\ x^2 - y + z^2\\ x-z\end{bmatrix}.$$

A useful representation of $F=0$ is by expressing it with a *Gröbner basis*:

$$ \mathcal G = \{x-z, \; y-2z^2, \; z^4+\tfrac{1}{2} z^2 - \tfrac{1}{4}\}.$$

<br>

* We can read off from $\mathcal G$ the number of solutions of $F=0$. 

* And we can solve $F=0$ by iteratively solving *univariate equations*. 
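
A minimal sketch of this back-substitution in plain Julia (no packages; all variable names are illustrative). It relies on the observation that the last element of $\mathcal G$ is a quadratic in $w = z^2$, so the quadratic formula is enough for this particular example.

```julia
# Solve F = 0 through the Gröbner basis {x - z, y - 2z^2, z^4 + z^2/2 - 1/4}.

# Step 1: solve w^2 + w/2 - 1/4 = 0 for w = z^2 with the quadratic formula.
disc = sqrt(complex((1 / 2)^2 + 4 * (1 / 4)))
ws = [(-1 / 2 + disc) / 2, (-1 / 2 - disc) / 2]

# Step 2: z = ±sqrt(w), then back-substitute y = 2z^2 and x = z.
zs = [s * sqrt(w) for w in ws for s in (1, -1)]
sols = [[z, 2z^2, z] for z in zs]

# Step 3: check the residuals in the original system F.
F(v) = [v[1]^2 + v[2]^2 + v[3]^2 - 1, v[1]^2 - v[2] + v[3]^2, v[1] - v[3]]
residuals = [maximum(abs.(F(v))) for v in sols]
```

Each entry of `sols` is a vector $(x,y,z)$, and `residuals` measures how well it satisfies $F=0$ in floating point arithmetic.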

<br>

But this is a reduction to the problem we started with: solving equations.

Homotopy Continuation takes a different approach. 

It keeps the multivariate structure and tries directly to 

<p style="border:3px; border-style:solid; padding: 0.5em; text-align:center">compute points $\xi\in\mathbb C^n$ such that the distance $\Vert \xi-\zeta\Vert$ is small,
</p>


where $\zeta$ is a true zero: $F(\zeta)=0$.

<br>



<hr style="border:1px solid gray"> </hr>

# Software Option 1: Bertini

In [None]:
using Bertini
using HomotopyContinuation # provides the @var macro

@var x y z
F = [x^2 + y^2 + z^2 - 1; x^2 - y + z^2; x-z]

result = bertini(F)

In [None]:
zeros = result[:finite_solutions]
map(s -> round.(s, digits = 3), zeros)

<hr style="border:1px solid gray"> </hr>

# Software Option 2: PHCpack

In [2]:
using PHCpack
using HomotopyContinuation # provides the @var macro

@var x y z
F = [x^2 + y^2 + z^2 - 1; x^2 - y + z^2; x-z]

result = phc(F)

File path: /var/folders/dm/81rrvn3d6hxb31lf0brq019m0000gn/T/jl_sJNETf
  0.120984 seconds (894 allocations: 52.250 KiB)
A list of 4 solutions has been refined :
Number of regular solutions     : 4.
Number of singular solutions    : 0.
Number of real solutions        : 2.
Number of complex solutions     : 2.
Number of clustered solutions   : 0.
Number of solutions at infinity : 0.
Number of failures              : 0.
Frequency tables for correction, residual, and condition numbers :
FreqCorr :  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4 : 4
FreqResi :  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4 : 4
FreqCond :  4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 : 4
Small correction terms and residuals counted to the right.
Well conditioned and distinct roots counted to the left.

TIMING INFORMATION for Solving the polynomial system
The elapsed time in seconds was                  0.003644000 =  0h 0m 0s  4ms
User time in seconds was                         0.002070000 =  0h 0m 0s  2ms
System CPU time in seconds was              

<hr style="border:1px solid gray"> </hr>

# Software Option 3: HomotopyContinuation.jl

In [None]:
using HomotopyContinuation

@var x y z
F = [x^2 + y^2 + z^2 - 1; x^2 - y + z^2; x-z]

result2 = solve(F)

In [None]:
zeros = solutions(result2)
map(s -> round.(s, digits = 3), zeros)

<hr style="border:1px solid gray"> </hr>

# Option 4: Exact Solutions

The exact solutions of 
$$  x^2 + y^2 + z^2 - 1 = x^2 - y + z^2 = x-z = 0$$
are

\begin{align*}
(x,y,z) &= (\tfrac{1}{2}\sqrt{\sqrt{5} -1},\quad \tfrac{1}{2}(\sqrt{5} -1),\quad \tfrac{1}{2}\sqrt{\sqrt{5} -1}\,)\\[0.5em]
(x,y,z) &= (-\tfrac{1}{2}\sqrt{\sqrt{5} -1},\quad \tfrac{1}{2}(\sqrt{5} -1),\quad -\tfrac{1}{2}\sqrt{\sqrt{5} -1}\,)\\[0.5em]
(x,y,z) &= (\tfrac{i}{2}\sqrt{\sqrt{5} +1},\quad -\tfrac{1}{2}(\sqrt{5} +1),\quad\tfrac{i}{2}\sqrt{\sqrt{5} +1}\,)\\[0.5em]
(x,y,z) &= (-\tfrac{i}{2}\sqrt{\sqrt{5} +1},\quad -\tfrac{1}{2}(\sqrt{5} +1),\quad-\tfrac{i}{2}\sqrt{\sqrt{5} +1}\,)\\
\end{align*}

<br>

Why do we even compute numerical approximations when we can have such exact results?

<br>

<u>Answer 1:</u> not all zeros can be written in terms of simple operations like $+,-,\cdot,/,\sqrt{\quad}$.



<u>Answer 2:</u> numerical methods are often faster than exact computations, especially when the problems become more complicated.

<hr style="border:1px solid gray"> </hr>


# The basic idea

We first discuss the basic idea underlying homotopy continuation. 

We denote 

$$\mathbb C[x_1,\ldots,x_n]_{d} := \{\text{polynomials with coefficients in $\mathbb C$ of degree at most $d$ in the variables $x_1,\ldots,x_n$}\}.$$

Let $d_1,\ldots,d_n$ be positive integers and define

$$\mathcal R := \mathbb C[x_1,\ldots,x_n]_{d_1} \times \cdots \times \mathbb C[x_1,\ldots,x_n]_{d_n}$$

to be the vector space of systems of $n$ polynomials in $n$ variables with complex coefficients, where the $i$-th polynomial has degree at most $d_i$.

<br> 

Suppose that the <u>system we are interested in</u> is 

$$F(x) \in \mathcal R.$$

Suppose further that there is another system

$$G(x) \in \mathcal R,$$

of which we know or can easily compute 

$$\text{a zero $\zeta$ with $G(\zeta)=0$.}$$

Let $H(x,t): \mathbb C^n \times [0,1] \to \mathbb C^n$
be a **homotopy** in $\mathcal R$, that is, $H(\cdot,t)\in\mathcal R$ for every $t\in[0,1]$, with 

$$H(x,1) = G(x)\quad \text{and}\quad H(x,0)=F(x).$$ 



<p style="border:3px; border-style:solid; padding: 0.5em; text-align:center">
The idea is to track $\zeta$ along the homotopy $H(x,t)$ from $t=1$ to $t=0$.
</p>

<br>

<img src="tracking.gif" width="500" style="float:right;">

<img src="geometry.png" width="400"> 

## Example

$n=m=1$ and $G(x)=x^8 -1$.

The zeros of $G$ are $\{\exp(2\pi i k/8)\mid 1\leq k\leq 8\}$.

(points on the unit circle)

As $t$ moves from $1$ to $0$ the blue points move in the complex plane $\mathbb C$.

<hr style="border:1px solid gray"> </hr>

# What tracking means

Tracking means to compute a numerical approximation of the **solution curve** $x(t)$ with

$$H(x(t), t) = 0.$$

Differentiating with respect to $t$ gives:

$$\frac{\partial}{\partial t} H(x(t),t) + \frac{\partial}{\partial x} H(x(t),t) \,\frac{\mathrm d}{\mathrm d t} x(t) = 0.$$

<br>
Computing $x(t)$ is equivalent to solving an 
<bdi style="border:3px; border-style:solid; padding: 0.2em;">
ODE initial value problem!
</bdi>
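
To make the initial value problem explicit, one can solve the differentiated equation for the derivative of $x(t)$. This assumes that $\frac{\partial}{\partial x} H$ is invertible along the solution curve and that the curve starts at the known zero $\zeta$ of the start system:

$$\frac{\mathrm d}{\mathrm d t} x(t) = -\Big(\frac{\partial}{\partial x} H(x(t),t)\Big)^{-1}\,\frac{\partial}{\partial t} H(x(t),t),\qquad x(1)=\zeta.$$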

<br>
<br>

Numerical approximation means that we compute discrete values $t_1,t_2,\ldots,t_k,\ldots,$ and points $\widetilde{x}_1,\widetilde{x}_2,\ldots,\widetilde{x}_k,\ldots$ such that $\Vert \widetilde{x}_k - x(t_k)\Vert$ is small for all $k$. 

We can move $\widetilde{x}_k$ towards $x(t_k)$ using *Newton's method*. 

This is called the <bdi style="border:3px; border-style:solid; padding: 0.2em;">corrector</bdi>.
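
A minimal sketch of such a Newton corrector, applied to the example system $F(x,y,z)$ from above. The hand-written Jacobian `JF`, the starting point, and the tolerances are assumptions made for illustration; this is not the implementation used by any of the packages.

```julia
using LinearAlgebra

# The example system F(x, y, z) and its Jacobian, written out by hand.
F(v) = [v[1]^2 + v[2]^2 + v[3]^2 - 1, v[1]^2 - v[2] + v[3]^2, v[1] - v[3]]
JF(v) = [2v[1]  2v[2]  2v[3];
         2v[1]  -1     2v[3];
         1      0      -1]

# Newton's method: repeat v <- v - JF(v)^{-1} F(v) until the update is tiny.
function newton(F, JF, v; tol = 1e-12, maxiters = 20)
    for _ in 1:maxiters
        Δ = JF(v) \ F(v)          # solve the linear system JF(v) Δ = F(v)
        v = v - Δ
        norm(Δ) < tol && return v
    end
    return v
end

# Start from a rough guess near one of the real zeros.
ξ = newton(F, JF, ComplexF64[0.5, 0.5, 0.5])
```

In a path tracker the same iteration is applied to $x\mapsto H(x,t_k)$ for the fixed value $t_k$, with the predicted point as the initial guess.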

<br>

<img src="predictor-corrector.png" width="500"> 

<br>

The <bdi style="border:3px; border-style:solid; padding: 0.2em;">predictor</bdi> produces $\widetilde{x}_{k}$ from an approximation of $x(t_{k-1})$. 


There are many predictor methods for ODEs. HomotopyContinuation.jl uses a variant of the so-called Padé predictor.

(Sascha and Simon can tell more about this).
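
The predictor-corrector loop can be sketched in a few lines of plain Julia. This sketch uses the simplest possible predictor (one Euler step of the ODE) instead of a Padé predictor, a fixed step size, and the start system $G(x)=x^8-1$ from the example above. The target $F(x)=x^8+1$, the homotopy $H(x,t)=\lambda t\,G(x)+(1-t)F(x)$, and the fixed non-real factor $\lambda$ are assumptions chosen for illustration.

```julia
# Start system, illustrative target system, and their derivatives.
G(x)  = x^8 - 1
dG(x) = 8x^7
F(x)  = x^8 + 1
dF(x) = 8x^7

λ = exp(0.25im * pi)   # fixed non-real factor (an assumption of this sketch)

# Homotopy H(x, t) = λ t G(x) + (1 - t) F(x) with its partial derivatives.
H(x, t)  = λ * t * G(x) + (1 - t) * F(x)
Hx(x, t) = λ * t * dG(x) + (1 - t) * dF(x)
Ht(x, t) = λ * G(x) - F(x)

# Track one solution from t = 1 to t = 0 with a fixed step size.
function track(x; steps = 200, newton_iters = 3)
    for k in 0:steps-1
        t      = 1 - k / steps
        t_next = 1 - (k + 1) / steps
        # Predictor: one Euler step of dx/dt = -Hx^{-1} Ht.
        x += (t_next - t) * (-Ht(x, t) / Hx(x, t))
        # Corrector: a few Newton iterations for H(., t_next) = 0.
        for _ in 1:newton_iters
            x -= H(x, t_next) / Hx(x, t_next)
        end
    end
    return x
end

start_solutions = [exp(2pi * im * k / 8) for k in 1:8]
end_solutions = track.(start_solutions)
```

Production trackers choose the step size adaptively and use higher-order predictors; the fixed step here only keeps the sketch short.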

<hr style="border:1px solid gray"> </hr>

# Gotta catch them all

So far, the discussion revolved around tracking a single solution curve and hence computing a single solution.

Polynomial homotopy continuation can do more: it can compute <bdi style="border:3px; border-style:solid; padding: 0.2em;">all solutions</bdi>.

Recall the definition 

$$\mathbb C[x_1,\ldots,x_n]_{d} := \{\text{polynomials with coefficients in $\mathbb C$ of degree at most $d$ in the variables $x_1,\ldots,x_n$}\}.$$

and

$$\mathcal R := \mathbb C[x_1,\ldots,x_n]_{d_1} \times \cdots \times \mathbb C[x_1,\ldots,x_n]_{d_n}.$$

Suppose $F(x)\in\mathcal R$. Then, we can take the following as start system

$$G(x) = \begin{bmatrix} x_1^{d_1} - 1\\ \vdots \\ x_n^{d_n} - 1\end{bmatrix}.$$

$G(x)$ is called a **total degree start system**.

<u> **Theorem**</u> (Bézout's theorem): $F(x)=0$ has at most $d_1\cdots d_n$ isolated solutions.

<br>

The start system $G(x)$ has exactly $d_1\cdots d_n$ isolated solutions, so it has at least as many isolated solutions as $F(x)$. 

If we can find for each zero $F(\zeta)=0$ a corresponding zero $G(\xi)=0$, such that $\xi$ gets tracked towards $\zeta,$ we can compute all solutions of $F(x)=0$.
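
The zeros of a total degree start system are easy to enumerate: each coordinate $x_i$ ranges independently over the $d_i$-th roots of unity. A minimal sketch in plain Julia (the function names are hypothetical, and the degrees $(2,2,1)$ are those of the example system $F(x,y,z)$ from the beginning of the lecture):

```julia
# The d-th roots of unity.
roots_of_unity(d) = [exp(2pi * im * k / d) for k in 1:d]

# All zeros of G(x) = (x_1^{d_1} - 1, ..., x_n^{d_n} - 1) as vectors in C^n.
function total_degree_start_solutions(degrees)
    tuples = Iterators.product(map(roots_of_unity, degrees)...)
    return [collect(t) for t in vec(collect(tuples))]
end

# Degrees of x^2 + y^2 + z^2 - 1, x^2 - y + z^2 and x - z.
starts = total_degree_start_solutions((2, 2, 1))
```

Here `length(starts)` equals the Bézout number $2\cdot 2\cdot 1 = 4$, which is the number of paths a total degree homotopy has to track for this system.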

<br>


<bdi style="border:3px; border-style:solid; padding: 0.2em;">Problem 1:</bdi>   Many zeros of $G(x)$ do not get tracked towards a zero of $F$. Computation is wasteful!


In [None]:
@var x y
F = [x^2 - x^2 * y^2 + 2y^2 - 1;  x - 4x * y + 3]
solve(F, start_system = :total_degree)

<br>

<bdi style="border:3px; border-style:solid; padding: 0.2em;">
Problem 2:</bdi>   Which homotopy to take?

<br>
<br>

It is appealing to take the *straight-line homotopy* $H(x,t) = tG(x) + (1-t)F(x)$.

It is better to take a <u>path in the complex numbers</u>.

Predictor-corrector methods rely on $\frac{\partial}{\partial x} H(x,t)$ being invertible along the solution curve. This is why one usually uses the homotopy

$$H(x,t) = \lambda tG(x) + (1-t)F(x),$$

where $\lambda\in\mathbb C\setminus \mathbb R$ is a randomly chosen complex number. At $t=1$ this system has the same zeros as $G$, and at $t=0$ it equals $F$.
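
A small worked example, with systems chosen purely for illustration, shows what the factor $\lambda$ buys. Take $n=1$, $G(x)=x^2-1$, $F(x)=x^2+1$, and $H(x,t)=\lambda t\,G(x)+(1-t)F(x)$, so that $H(x,1)=\lambda G(x)$ and $H(x,0)=F(x)$. Then

$$H(x,t) = (\lambda t + 1 - t)\,x^2 - (\lambda t - (1-t)), \qquad \frac{\partial}{\partial x}H(x,t) = 2(\lambda t + 1 - t)\,x.$$

For $\lambda=1$ the zeros satisfy $x^2 = 2t-1$, so both solution paths meet at $x=0$ when $t=\tfrac{1}{2}$, and the derivative vanishes there. For $\lambda\in\mathbb C\setminus\mathbb R$ neither $\lambda t + 1 - t$ nor $\lambda t-(1-t)$ vanishes for $t\in[0,1]$, so $x(t)\neq 0$ and the derivative stays invertible along both paths.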

<br>

The set $\Sigma \subset \mathcal R$ of polynomial systems that have a zero at which the derivative is *not* invertible forms an <u>algebraic subvariety</u> of $\mathcal R$. 

$\Sigma$ is called the **discriminant**.

* In $\mathcal R\cap \{\text{systems with real coefficients}\}$ it has real codimension $1$.

* In $\mathcal R$ it has real codimension $2$. We can go around it!

## Example

The space of univariate polynomials of degree $2$ is $\{px^2 + qx + r \mid p,q,r\in\mathbb C,\; p\neq 0\}\subset \mathbb C[x]_{2}$.

Put $a:=\frac{q}{p}$ and $b:=\frac{r}{p}$. Then $\Sigma = \{a^2 - 4b = 0\}$.

<br>

<img src="poly_deg_2.png" width="500">  <span style="display:inline-block; width: 1.5cm;"></span> <img src="poly_deg_2_C.png" width="500"> 

<br>

<hr style="border:1px solid gray"> </hr>

# Start systems

The start system 
$$G(x) = \begin{bmatrix} x_1^{d_1} - 1\\ \vdots \\ x_n^{d_n} - 1\end{bmatrix}$$
is not always an optimal choice.

<br>

<bdi style="border:3px; border-style:solid; padding: 0.2em;">
Are there better choices?</bdi><br>

<br>


Yes, but it is difficult to construct them.
<hr style="border:1px solid gray"> </hr>
<br>

At the end of this first lecture we want to show that good start systems at least exist.

**<u>Definition:</u>** Let $V\subset \mathcal R$ be an algebraic variety. 

For simplicity we assume $V$ is smooth. 

<br>

<p style="border:3px; border-style:solid; padding: 0.5em; text-align:center">
We call $V$ a family of polynomial systems.<br>
In Sascha's talk $V$ will be given by a parametrization $F:\mathbb C^N\to V, q\mapsto F(x,q)$.
</p>

<br>

**<u>Definition:</u>**

(1) $S:=\{(F,z) \in V\times \mathbb C^n \mid F(z)=0\}$.

(2) $\Delta := \{(F,z)\in S \mid \det(JF(z))=0\}$, where $JF(x)=(\frac{\partial f_i}{\partial x_j})_{i,j}$ is the Jacobian matrix.

(3) $\pi:S \to V, (F,z)\mapsto F$.

(4) $\Sigma := \pi(\Delta)$ is the discriminant.

Observe that $\pi$ is surjective.

<br>

**<u>Theorem:</u>** Let $G\in V\setminus \Sigma$ and $F\in V$. The start system is $G$. The target system is $F$.

Let the isolated simple zeros of $G$ and $F$ be

\begin{align*}
\pi^{-1}(G) \cap (S \setminus \Delta) &= \{ (G,z_1),\ldots, (G, z_k)\} \;\;\text{ and }\\[0.2em]
\pi^{-1}(F) \cap (S \setminus \Delta) &= \{ (F,y_1),\ldots, (F, y_\ell)\}
\end{align*}

<img src="geometry_2.png" width="400" style="float:right;">

Then 

$$k\geq \ell$$

and for almost all paths $\gamma: [0,1]\to V$ with $\gamma(1) = G$ and $\gamma(0)=F$ we find paths 

$$\Gamma_1(t),\ldots,\Gamma_\ell(t)\subset (S\setminus \Delta)$$

and a permutation $\sigma\in \mathfrak S_{k}$ such that for all $i$:

$$\Gamma_i(1) = (G, z_{\sigma(i)}),\quad \Gamma_i(0)=(F, y_i),\quad \text{ and } \quad \pi\,(\Gamma_i(t))=\gamma(t)$$

<br>

<ul style="border:3px; border-style:solid; padding: 2em;">
  <li style="margin-bottom: 5px;">This theorem means that <u>every</u> $G\in V\setminus \Sigma$ can be used as a start system.</li>
  <li style="margin-bottom: 5px;">All $G\in V\setminus \Sigma$ have $k$ isolated zeros in $\mathbb C^n$ ($k$ is the <u>generic</u> number of zeros).</li>
  <li style="margin-bottom: 5px;">A good start system has a $k$ that is not much larger than $\ell$.</li>
  <li style="margin-bottom: 5px;">To find a good start system means that we must find a good $V$.</li>
  <li>The "larger" $V$ is, the larger $k$ is.</li>
</ul> 


**<u>Proof:</u>**  Let $F_0,F_1\in V\setminus \Sigma$.

The discriminant $\Sigma$ is a subvariety of $V$. It has real codimension at least 2. This implies that almost all paths 

$$\gamma: [0,1]\to V \quad\text{ with } \quad \gamma(1) = F_1(x) \quad\text{ and } \quad \gamma(0)=F_0(x)$$

are such that 

$$\gamma(\,(0,1]\,) \cap \Sigma = \emptyset.$$

Fix $0\leq t\leq 1$ and write

$$F_t = \gamma(t).$$

By assumption $F_t\not \in \Sigma$. This implies that $(F_t,\zeta)\not\in \Delta$ and so $\det(JF_t(\zeta))\neq 0$ <u>for all zeros</u> $\zeta$ of $F_t$.

Fix one of the zeros $\zeta_t$ of $F_t$.

By the implicit function theorem, there is a neighborhood $\hat{U}_{t} \subset S$ of $(F_t,\zeta_t)$ such that 

$$\pi|_{\hat{U}_{t}}: \hat{U}_{t}\to \pi(\hat{U}_{t})\subset V$$

is invertible. Let

$$U_t:=\pi(\,\hat{U}_{t}\,)\quad \text{ and let }\quad \psi_{t}: U_{t}\to \hat{U}_{t}$$

be the inverse of $\pi$ on $U_{t}$. 

<br>

We have constructed a family of pairs $(\psi_t, U_t)_{t\in [0,1]}$ indexed by $[0,1]$. 

The preimages $\gamma^{-1}(U_s)$, $s\in[0,1]$, form an open cover of the compact interval $[0,1]$. For this cover we can find a partition of unity $(p_s)_s$ on $[0,1]$ to define

$$\Gamma(t) := \sum_{s}\, p_s(t)\cdot\psi_s(\,\gamma(t)\,).$$

<br>

For each $t$ we have $\Gamma(t) = (F_t, \zeta_t)$, where $F_t(\zeta_t)=0$.

<br>

This implies that for each pair $0\leq t,s\leq 1$ there exists an injection of the zeros of $F_{t}(x)$ into the zeros of $F_{s}(x)$.

<p style="border:3px; border-style:solid; padding: 0.2em; text-align:center;">
This implies that the number of zeros is constant along $\gamma(t)$.
</p>

Let this number be $k$.

<br>

Now, we consider a path $\gamma(t)$ with $F_1(x)=G(x)$ and $F_0(x)=F(x)$.

We can choose $\gamma(t)$ such that $\gamma((0,1])\cap \Sigma =\emptyset$. 

But we cannot always choose $\gamma(t)$ such that $\gamma(0)\not\in \Sigma$, because $F(x)$ could be in $\Sigma$.

The argument above was based on local considerations. 

Therefore, we can use the same arguments for the isolated simple zeros of $F(x)$ and find that there is an injection of those zeros into the zeros of $G(x)$. 

This shows that 

$$k\geq \ell.$$

The proof is finished.