## 1-3. Number Theoretic Transform (NTT)
**GOAL:** faster multiplication between polynomials.

In the previous section, we saw that the following naive polynomial multiplication is slow.

In this section we speed it up using the Fourier transform, and very briefly introduce its integer variant, the NTT.

In [None]:
# Functions from the previous lecture note
import torch
import math
import cmath

stddev = 3.2
N = 2**10
Q = 2**27

def keygen(dim):
    return torch.randint(2, size = (dim,))

def errgen(stddev):
    e = torch.round(stddev*torch.randn(1))
    e = e.squeeze()
    return e.to(torch.int)

def uniform(dim, modulus):
    return torch.randint(modulus, size = (dim,))

def polymult(a, b, dim, modulus):
    res = torch.zeros(dim).to(torch.int)
    for i in range(dim):
        for j in range(dim):
            if i >= j:
                res[i] += a[j]*b[i-j]
                res[i] %= modulus
            else:
                res[i] -= a[j]*b[i-j] # Q - x mod Q = -x
                res[i] %= modulus

    res %= modulus
    return res

We can use the Fourier transform to perform a faster 'convolution'.
The [Convolution theorem](https://en.wikipedia.org/wiki/Convolution_theorem) says that "the Fourier transform of a convolution of two functions (or signals) is the pointwise product of their Fourier transforms."
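The theorem can be checked by hand on a tiny example. The sketch below is only an illustration in plain Python: `dft` and `cyclic_convolution` are helper names introduced here, and the transform is a naive $O(n^2)$ DFT rather than a real FFT, so that every step stays visible.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform, for illustration only."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[m] * cmath.exp(sign * 2j * cmath.pi * m * k / n) for m in range(n))
        out.append(s / n if inverse else s)
    return out

def cyclic_convolution(x, y):
    """Schoolbook cyclic convolution: indices wrap around modulo n."""
    n = len(x)
    return [sum(x[m] * y[(i - m) % n] for m in range(n)) for i in range(n)]

x = [1, 2, 3, 4]
y = [5, 6, 7, 8]

# Direct convolution in the "time" domain.
direct = cyclic_convolution(x, y)

# Transform, multiply pointwise, transform back.
X, Y = dft(x), dft(y)
via_dft = dft([u * v for u, v in zip(X, Y)], inverse=True)
via_dft = [round(c.real) for c in via_dft]

print(direct)   # [66, 68, 66, 60]
print(via_dft)  # [66, 68, 66, 60]
```

Note that the convolution here is cyclic: indices wrap around modulo the vector length.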

Complexity

A naive convolution has complexity $O(n^2)$.

The FFT (fast Fourier transform) has complexity $O(n \log n)$ and pointwise multiplication has complexity $O(n)$, so the total complexity is
$$
O(n \log n + n ) = O(n \log n).
$$
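To get a feeling for the gap at the dimension used in this note, the following sketch simply evaluates both growth terms for $N = 2^{10}$. Constant factors are ignored, so the numbers are orders of magnitude, not measured runtimes.

```python
import math

N = 2**10  # ring dimension used in this note

naive_steps = N * N                # one multiply-add per (i, j) pair
fft_steps = N * int(math.log2(N))  # n * log2(n), constant factors ignored

print(naive_steps)              # 1048576
print(fft_steps)                # 10240
print(naive_steps / fft_steps)  # 102.4
```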

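If we treat polynomial multiplication as a (cyclic) convolution of coefficient vectors of length $M$, we implicitly assume $X^M = 1$.
However, we work in the ring $\mathcal{R} = \mathbb{Z}[X]/\left< X^N+1 \right>$, where $X^N = -1$.

The easiest (but slightly inefficient) way to reuse an ordinary FFT is to pad each coefficient vector with $N$ zeros, since $X^{2N} = (-1)^2 = 1$. The length-$2N$ cyclic product then equals the unreduced product, and subtracting its upper half from its lower half applies $X^N = -1$.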
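The folding step can be verified with exact integers, without any FFT. In the sketch below, `full_product` and `negacyclic_product` are illustrative helpers introduced here: the first computes the unreduced product of length $2N$ (which is what a zero-padded cyclic convolution returns), the second reduces modulo $X^N + 1$ directly.

```python
def full_product(a, b):
    """Ordinary polynomial product without reduction; length 2N."""
    n = len(a)
    p = [0] * (2 * n)
    for i in range(n):
        for j in range(n):
            p[i + j] += a[i] * b[j]
    return p

def negacyclic_product(a, b):
    """Schoolbook product modulo X^N + 1."""
    n = len(a)
    c = [0] * n
    for i in range(n):
        for j in range(n):
            if i + j < n:
                c[i + j] += a[i] * b[j]
            else:
                c[i + j - n] -= a[i] * b[j]  # wrap around with X^N = -1
    return c

a = [1, 2, 3, 4]  # 1 + 2X + 3X^2 + 4X^3
b = [1, 0, 1, 1]  # 1 + X^2 + X^3
n = len(a)

p = full_product(a, b)
# Fold: the coefficient of X^(N+k) contributes -1 times itself to X^k.
folded = [p[k] - p[k + n] for k in range(n)]

print(p)       # [1, 2, 4, 7, 5, 7, 4, 0]
print(folded)  # [-4, -5, 0, 7]
```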

PyTorch supports the FFT natively through `torch.fft`.

As the FFT is defined over complex numbers, we map numbers in $\mathbb{Z}_Q$ to real numbers in $[0,1)$ before the FFT, simply by dividing by $Q$.
For secret keys, no such transformation is needed: as the secret key is binary, multiplication by $\boldsymbol{z}$ is a *subset sum* of coefficients (with signs coming from $X^N = -1$).
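The scaling works because reducing modulo $Q$ on integers corresponds to keeping only the fractional part after dividing by $Q$. The sketch below shows the round trip on a single number; `from_unit_interval` is a hypothetical helper name used only here. It also shows why the last step should round rather than truncate once floating-point noise is present.

```python
Q = 2**27

def from_unit_interval(t, q=Q):
    """Map a real number t (meaning t*q modulo q) back to an integer in [0, q)."""
    frac = t - round(t)         # fractional part, in [-0.5, 0.5]
    return round(frac * q) % q  # round to nearest, then reduce

x = 864197523  # an arbitrary integer larger than Q
t = x / Q      # scaled value; only its fractional part matters
recovered = from_unit_interval(t)
print(recovered == x % Q)  # True

# Floating-point results are rarely exact integers. Truncation and rounding
# then disagree:
noisy = 4.9999999
print(int(noisy), round(noisy))  # 4 5
```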

In [None]:

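# @param scale: whether or not to map Z_Q to [0,1).
def polyfft(a, N, Q, scale=True):
    # Pad with N zeros so the length-2N cyclic convolution does not wrap around.
    zeros = torch.zeros(N, dtype=torch.float64)

    # Cast explicitly so that both parts of the concatenation are float64.
    apad = torch.cat((a.to(torch.float64), zeros))
    if scale:
        apad /= Q

    return torch.fft.fft(apad)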

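def polyifft(afft, N, Q):
    a = torch.fft.ifft(afft)
    # Fold with X^N = -1: lower half minus upper half.
    aflip = torch.real(a[:N] - a[N:])
    # Keep only the fractional part, i.e. reduce modulo 1.
    aflip -= torch.round(aflip)

    aflip *= Q
    # Round before casting: a plain cast truncates, so k - 1e-9 would become k - 1.
    aint = torch.round(aflip).to(torch.int32)

    aint %= Q

    return aint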


In [None]:

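# Variant: instead of zero padding, extend a antiperiodically to (a, -a),
# which represents a(X) * (1 - X^N) modulo X^(2N) - 1.
# @param scale: whether or not to map Z_Q to [0,1).
def polyfft(a, N, Q, scale=True):
    a = a.to(torch.float64)
    apad = torch.cat((a, -a))
    if scale:
        apad /= Q

    return torch.fft.fft(apad)

def polyifft(afft, N, Q):
    a = torch.fft.ifft(afft)
    # The product of two (x, -x) extensions is (2c, -2c), where c is the
    # negacyclic product, so lower half minus upper half equals 4c.
    aflip = torch.real(a[:N] - a[N:]) / 4
    aflip -= torch.round(aflip)

    aflip *= Q
    # Round before casting; a plain cast would truncate toward zero.
    aint = torch.round(aflip).to(torch.int32)

    aint %= Q

    return aint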


Now we compare the results.
First, generate $\boldsymbol{a}$ and $\boldsymbol{z}$.

In [None]:
# secret key
z = keygen(N)

# random polynomial
a = uniform(N, Q)

a, z

In [None]:
# ordinary method
azslow = polymult(a, z, N, Q)
azslow

In [None]:
# using fft
A = polyfft(a, N, Q, scale=True)
Z = polyfft(z, N, Q, scale=False)

az = polyifft(A*Z, N, Q)

az

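The results are the same, but the runtimes differ a lot: the schoolbook method takes $O(N^2)$ steps in a Python loop, while the FFT method takes $O(N \log N)$.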

### 1.3.1. More efficient negacyclic FFT 

For a negacyclic FFT, we convert the given vector of coefficients $\boldsymbol{a}$ of length $N$ to a complex vector $\boldsymbol{b}$ of length $N/2$, where
$$
b_j = (a_j - i a_{N/2 + j}) w^j.
$$
Here, $w = e^{-\pi i/N}$ is a primitive $2N$-th root of unity.
Then, we apply the FFT to $\boldsymbol{b}$.

To multiply two polynomials, we perform pointwise multiplication of the FFTed values.

The inverse FFT is used to recover the product. Given the pointwise product $\boldsymbol{c}$, let $\boldsymbol{b} = \mathrm{IFFT}(\boldsymbol{c})$.
Inverting the definition of $b_j$ above, the product $\boldsymbol{a}$ is given as
$$
a_j = \mathrm{Re}(b_j w^{-j}) \text{ and } a_{N/2+j} = -\mathrm{Im}(b_j w^{-j}).
$$


See the [nuFHE document](https://nufhe.readthedocs.io/en/latest/implementation_details.html?highlight=ntt#polynomial-multiplication) for details.
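The folded transform can be checked on a tiny case with $N = 4$. The sketch below is an illustration in plain Python, not the nuFHE implementation: `fold`, `unfold`, `dft` and `negacyclic_product` are names introduced here, and a naive DFT of length $N/2$ stands in for the FFT. It uses the convention $b_j = (a_j - i a_{N/2+j}) w^j$ with $w = e^{-\pi i/N}$, so the inverse multiplies by $w^{-j}$ and negates the imaginary part.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(n^2) DFT standing in for the FFT."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[m] * cmath.exp(sign * 2j * cmath.pi * m * k / n) for m in range(n))
        out.append(s / n if inverse else s)
    return out

def negacyclic_product(a, b):
    """Schoolbook product modulo X^N + 1, used as the reference."""
    n = len(a)
    c = [0] * n
    for i in range(n):
        for j in range(n):
            if i + j < n:
                c[i + j] += a[i] * b[j]
            else:
                c[i + j - n] -= a[i] * b[j]
    return c

def fold(a):
    """Real length-N vector -> complex length-N/2 vector, twisted by w^j."""
    n = len(a)
    h = n // 2
    w = cmath.exp(-1j * cmath.pi / n)
    return [(a[j] - 1j * a[j + h]) * w**j for j in range(h)]

def unfold(b):
    """Inverse of fold: untwist by w^(-j), then split real and imaginary parts."""
    h = len(b)
    w = cmath.exp(-1j * cmath.pi / (2 * h))
    u = [b[j] * w**(-j) for j in range(h)]
    return [round(v.real) for v in u] + [round(-v.imag) for v in u]

a = [1, 2, 3, 4]
z = [1, 0, 1, 1]

A = dft(fold(a))
Z = dft(fold(z))
product = unfold(dft([x * y for x, y in zip(A, Z)], inverse=True))

print(product)                   # [-4, -5, 0, 7]
print(negacyclic_product(a, z))  # [-4, -5, 0, 7]
```

Only a transform of length $N/2$ is needed, compared with length $2N$ for the zero-padding approach.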

In [None]:
roots = torch.tensor(range(N//2), dtype=torch.complex128)
roots = torch.exp((-1j*math.pi/N)*roots)

In [None]:
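def negacyclic_fft(a, N, Q, scale=True):
    acomplex = a.to(torch.complex128)

    if scale:
        acomplex /= Q

    left = acomplex[:N//2]
    right = acomplex[N//2:]

    # b_j = (a_j - i a_{N/2+j}) w^j: the twist w^j multiplies the whole bracket.
    b = (left - 1j*right) * roots

    return torch.fft.fft(b)

def negacyclic_ifft(A, N, Q):
    b = torch.fft.ifft(A)
    # Undo the twist: multiply by w^{-j} = conj(w^j).
    b *= torch.conj(roots)

    # a_j = Re(b_j w^{-j}), a_{N/2+j} = -Im(b_j w^{-j})
    a = torch.cat((torch.real(b), -torch.imag(b)))
    a -= torch.round(a)

    a *= Q
    # Round before casting; a plain cast would truncate toward zero.
    aint = torch.round(a).to(torch.int32)
    aint %= Q

    return aint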

In [None]:
# using negacyclic fft
A = negacyclic_fft(a, N, Q, scale=True)
Z = negacyclic_fft(z, N, Q, scale=False)

az = negacyclic_ifft(A*Z, N, Q)

az

In [None]:
azslow

In [None]:
# using fft
A = polyfft(a, N, Q, scale=True)
Z = polyfft(z, N, Q, scale=False)

az = polyifft(A*Z, N, Q)

az

In [None]:
# secret key
z = keygen(N)

# random polynomial
a = uniform(N, Q)

m = torch.zeros(N).to(torch.int)
m[0] = 1

def errpolygen(dim, stddev):
    e = torch.round(stddev*torch.randn(dim))
    e = e.squeeze()
    return e.to(torch.int)

e = errpolygen(N, stddev)

In [None]:
polymult(a, z, N, Q)

In [None]:
A = negacyclic_fft(a, N, Q)

Z = negacyclic_fft(z, N, Q, scale=False)

negacyclic_ifft(A*Z, N, Q)

In [None]:
a

In [None]:
a.shape

In [None]:
zeros = torch.zeros(N, dtype=torch.float64)
zeros

In [None]:
apad = torch.cat((a, zeros))
apad /= Q
apad


In [None]:
Apad = torch.fft.fft(apad)
Apad

In [None]:
zpad = torch.cat((z, zeros))
Zpad = torch.fft.fft(zpad)
Zpad

In [None]:
AZ = Apad * Zpad

In [None]:
az = torch.fft.ifft(AZ)
az

In [None]:
az = az[:N] - az[N:]

az

In [None]:
az = torch.real(az)

az

In [None]:
az -= torch.round(az)
az *= Q
az = torch.round(az).to(torch.int32)  # round, then cast; a plain cast truncates
az

In [None]:
az %= Q
az

In [None]:
az.size()