# Accuracy of the Fast Fourier Transform (FFT)

## What is the Fast Fourier Transform (FFT)?
The Fast Fourier Transform (FFT) is a family of algorithms that compute the Discrete Fourier Transform (DFT) quickly.

### Discrete Fourier Transform (DFT)
Let $X_n$ denote the discrete Fourier transform of $N$ data points $x_0, \dots, x_{N-1}$. It can be written as follows.

$$
\left(\begin{array}{c}
X_{0} \\
X_{1} \\
X_{2} \\
\vdots \\
X_{N-1}
\end{array}\right)=\left(\begin{array}{ccccc}
1 & 1 & 1 & \cdots & 1 \\
1 & e^{-i \frac{2 \pi}{N}} & e^{-i \frac{4 \pi}{N}} & \cdots & e^{-i \frac{2 \pi(N-1)}{N}} \\
1 & e^{-i \frac{4 \pi}{N}} & e^{-i \frac{8 \pi}{N}} & \cdots & e^{-i \frac{4 \pi(N-1)}{N}} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & e^{-i \frac{2 \pi(N-1)}{N}} & e^{-i \frac{4 \pi(N-1)}{N}} & \cdots & e^{-i \frac{2 \pi(N-1)(N-1)}{N}}
\end{array}\right)\left(\begin{array}{c}
x_{0} \\
x_{1} \\
x_{2} \\
\vdots \\
x_{N-1}
\end{array}\right)
$$

$W_N^n$ is called the twiddle factor (rotation factor):

$$
W_N^n = e^{-i \frac{2 \pi n}{N}}
$$

Using this, the transform becomes

$$
\left(\begin{array}{c}
X_{0} \\
X_{1} \\
X_{2} \\
\vdots \\
X_{N-1}
\end{array}\right)=\left(\begin{array}{ccccc}
1 & 1 & 1 & \cdots & 1 \\
1 & W_{N} & W_{N}^{2} & \cdots & W_{N}^{N-1} \\
1 & W_{N}^{2} & W_{N}^{4} & \cdots & W_{N}^{2(N-1)} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & W_{N}^{(N-1)} & W_{N}^{2(N-1)} & \cdots & W_{N}^{(N-1)(N-1)}
\end{array}\right)\left(\begin{array}{c}
x_{0} \\
x_{1} \\
x_{2} \\
\vdots \\
x_{N-1}
\end{array}\right)
$$

$$
X_{n}=\sum_{k=0}^{N-1} x_{k} W_{N}^{n k}
$$
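
The sum above can be evaluated directly, which makes the $N^2$ cost visible. The following is a minimal sketch in plain Julia; the name `naive_dft` is a hypothetical helper introduced here for illustration and is not part of FFTW.jl.

```julia
# Naive O(N^2) DFT: X_n = sum_k x_k * W_N^(n*k), with W_N = exp(-2πi/N).
# `naive_dft` is a hypothetical helper name used only in this sketch.
function naive_dft(x::AbstractVector{<:Number})
    N = length(x)
    X = zeros(ComplexF64, N)
    for n in 0:N-1              # one output X_n per iteration
        for k in 0:N-1          # N complex multiply-adds per output
            X[n+1] += x[k+1] * cis(-2π * n * k / N)   # cis(t) = exp(i*t)
        end
    end
    return X
end

x = [1.0, 2.0, 3.0, 4.0]
X = naive_dft(x)
println(X)
```

The two nested loops each run $N$ times, so the number of complex multiplications grows as $N^2$.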

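With the DFT, computing all $N$ outputs $X_0$ through $X_{N-1}$ requires on the order of $N^2$ complex multiplications. The FFT reduces this large number of complex operations to about $\frac{N}{2} \log _{2} N$ complex multiplications.
The radix-2 FFT discussed here requires the number of data points to be a power of 2. (FFTW itself also accepts other lengths and is most efficient when the length is a product of small primes.)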
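
To illustrate where the $\frac{N}{2} \log_2 N$ count comes from, here is a rough sketch of a recursive radix-2 (decimation-in-time) FFT. It is not the algorithm FFTW uses internally, and `fft_radix2` is a hypothetical name chosen for this example. Each level of recursion performs $N/2$ twiddle-factor multiplications, and there are $\log_2 N$ levels.

```julia
# Recursive radix-2 decimation-in-time FFT (illustrative sketch only).
# `fft_radix2` is a hypothetical helper, not an FFTW.jl function.
function fft_radix2(x::AbstractVector{<:Number})
    N = length(x)
    ispow2(N) || throw(ArgumentError("length must be a power of 2"))
    N == 1 && return ComplexF64[x[1]]
    E = fft_radix2(x[1:2:end])   # transform of even-indexed samples x_0, x_2, ...
    O = fft_radix2(x[2:2:end])   # transform of odd-indexed samples x_1, x_3, ...
    half = N ÷ 2
    X = Vector{ComplexF64}(undef, N)
    for n in 0:half-1
        t = cis(-2π * n / N) * O[n+1]   # twiddle factor W_N^n times the odd part
        X[n+1]        = E[n+1] + t      # butterfly, upper output
        X[n+1+half]   = E[n+1] - t      # butterfly, lower output
    end
    return X
end

println(fft_radix2([1, 2, 3, 4]))
```

The pair of lines computing `E + t` and `E - t` is one butterfly operation: a single multiplication by the twiddle factor yields two outputs.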

(From here on I am still investigating and have not written this up yet: butterfly operations)

In Julia, FFTW.jl is available as the FFT library. (FFTW uses a code generator written in OCaml to automatically produce optimized C code.)

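### How to compute faster
- Configure parallel processing: specify the number of threads, for example with `export JULIA_NUM_THREADS=8`.
- Use `plan_fft` (or `plan_fft!`) and operate in place. (In computer science, an in-place algorithm transforms a data structure using almost no additional storage.)
- As with MATLAB, use MKL in Julia as well.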
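
The first two points can be sketched as follows, assuming FFTW.jl is installed. The thread count of 4 and the array size are arbitrary choices for illustration. Switching the backend to MKL is not exercised here; as far as I know, recent FFTW.jl versions do this with `FFTW.set_provider!("mkl")` followed by a restart of Julia, but check the FFTW.jl documentation for your version.

```julia
using FFTW

# Assumption: 4 threads is just an example value; adjust to your machine.
FFTW.set_num_threads(4)

N = 2^8
A = rand(ComplexF64, N, N)   # in-place plans require a complex floating-point array
expected = fft(A)            # out-of-place reference, computed before A is overwritten

p_inplace = plan_fft!(A)     # default flags=FFTW.ESTIMATE does not overwrite A
p_inplace * A                # applies the plan; A now holds its own transform

println(A ≈ expected)
```

Because the transform is written back into `A`, no second `N × N` output array is allocated.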

In [4]:
?fft

search: fft fft! fftfreq FFTW fftshift rfft ifft bfft ifft! bfft! rfftfreq



```
fft(A [, dims])
```

Performs a multidimensional FFT of the array `A`. The optional `dims` argument specifies an iterable subset of dimensions (e.g. an integer, range, tuple, or array) to transform along. Most efficient if the size of `A` along the transformed dimensions is a product of small primes; see `Base.nextprod`. See also [`plan_fft()`](@ref) for even greater efficiency.

A one-dimensional FFT computes the one-dimensional discrete Fourier transform (DFT) as defined by

$$
\operatorname{DFT}(A)[k] =
  \sum_{n=1}^{\operatorname{length}(A)}
  \exp\left(-i\frac{2\pi
  (n-1)(k-1)}{\operatorname{length}(A)} \right) A[n].
$$

A multidimensional FFT simply performs this operation along each transformed dimension of `A`.

!!! note
    This performs a multidimensional FFT by default. FFT libraries in other languages such as Python and Octave perform a one-dimensional FFT along the first non-singleton dimension of the array. This is worth noting while performing comparisons.



In [8]:
?plan_fft  # Does plan_fft simply return an optimized fft function (technically, a wrapper around a plan and fftw_execute_dft)?

search: plan_fft plan_fft! plan_rfft plan_ifft plan_bfft plan_ifft! plan_bfft!



```
plan_fft(A [, dims]; flags=FFTW.ESTIMATE, timelimit=Inf)
```

Pre-plan an optimized FFT along given dimensions (`dims`) of arrays matching the shape and type of `A`.  (The first two arguments have the same meaning as for [`fft`](@ref).) Returns an object `P` which represents the linear operator computed by the FFT, and which contains all of the information needed to compute `fft(A, dims)` quickly.

To apply `P` to an array `A`, use `P * A`; in general, the syntax for applying plans is much like that of matrices.  (A plan can only be applied to arrays of the same size as the `A` for which the plan was created.)  You can also apply a plan with a preallocated output array `Â` by calling `mul!(Â, plan, A)`.  (For `mul!`, however, the input array `A` must be a complex floating-point array like the output `Â`.) You can compute the inverse-transform plan by `inv(P)` and apply the inverse plan with `P \ Â` (the inverse plan is cached and reused for subsequent calls to `inv` or `\`), and apply the inverse plan to a pre-allocated output array `A` with `ldiv!(A, P, Â)`.

The `flags` argument is a bitwise-or of FFTW planner flags, defaulting to `FFTW.ESTIMATE`. e.g. passing `FFTW.MEASURE` or `FFTW.PATIENT` will instead spend several seconds (or more) benchmarking different possible FFT algorithms and picking the fastest one; see the FFTW manual for more information on planner flags.  The optional `timelimit` argument specifies a rough upper bound on the allowed planning time, in seconds. Passing `FFTW.MEASURE` or `FFTW.PATIENT` may cause the input array `A` to be overwritten with zeros during plan creation.

[`plan_fft!`](@ref) is the same as [`plan_fft`](@ref) but creates a plan that operates in-place on its argument (which must be an array of complex floating-point numbers). [`plan_ifft`](@ref) and so on are similar but produce plans that perform the equivalent of the inverse transforms [`ifft`](@ref) and so on.


In [6]:
using BenchmarkTools

N=2^11
A_real = rand(N, N)
FFT = plan_fft(A_real);

In [7]:
@benchmark B = FFT * A_real

BenchmarkTools.Trial: 
  memory estimate:  128.00 MiB
  allocs estimate:  4
  --------------
  minimum time:     141.370 ms (0.48% GC)
  median time:      155.255 ms (5.76% GC)
  mean time:        158.071 ms (7.96% GC)
  maximum time:     240.408 ms (37.25% GC)
  --------------
  samples:          32
  evals/sample:     1

In [10]:
N=2^11
C_real = rand(N, N)
FFT = fft(C_real);

In [11]:
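# Note: `FFT` here is the array returned by fft(C_real) in In [10], not a plan,
# so `FFT * C_real` times a dense matrix product; use `@benchmark fft(C_real)` to time the unplanned FFT.
@benchmark D = FFT * C_real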

BenchmarkTools.Trial: 
  memory estimate:  64.00 MiB
  allocs estimate:  2
  --------------
  minimum time:     344.108 ms (0.00% GC)
  median time:      356.412 ms (0.00% GC)
  mean time:        370.236 ms (2.96% GC)
  maximum time:     486.717 ms (20.39% GC)
  --------------
  samples:          14
  evals/sample:     1
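
For a like-for-like comparison of a planned and an unplanned transform, something along the following lines could be used. This is a sketch under assumptions: it uses `@elapsed` from Base instead of BenchmarkTools so that it has no extra dependency, a smaller array than above, and a single timed run, so the numbers are only indicative.

```julia
using FFTW

N = 2^9
A = rand(ComplexF64, N, N)
P = plan_fft(A)               # plan created once, outside the timed region

fft(A); P * A                 # warm-up calls so compilation is not timed

t_fft  = @elapsed fft(A)      # creates a plan internally on every call
t_plan = @elapsed P * A       # reuses the precomputed plan

println("fft(A): ", t_fft, " s   P * A: ", t_plan, " s")
```

Both calls compute the same transform, so `P * A ≈ fft(A)` should hold; only the planning overhead differs.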

In [1]:
using FFTW

len = 5
x = [2pi*k/len for k = 0:len-1]
cos_x = cos.(x)
println(fft(cos_x))

┌ Info: Precompiling FFTW [7a1cc6ca-52ef-59f5-83cd-3a7055c09341]
└ @ Base loading.jl:1278


Complex{Float64}[-2.220446049250313e-16 + 0.0im, 2.5 - 2.76434240485155e-16im, -2.220446049250313e-16 - 2.4926059914975068e-17im, -2.220446049250313e-16 + 2.4926059914975068e-17im, 2.5 + 2.76434240485155e-16im]


In [2]:
using FFTW

len=5
x = [2pi*k/len for k = 0:len-1]
sin_x = sin.(x)
println(fft(sin_x))

Complex{Float64}[1.1102230246251565e-16 + 0.0im, -2.139456371091744e-16 - 2.5im, 1.5843448587791657e-16 - 4.3164793190846535e-17im, 1.5843448587791657e-16 + 4.3164793190846535e-17im, -2.139456371091744e-16 + 2.5im]
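
In exact arithmetic, the two transforms above would be $(0, 2.5, 0, 0, 2.5)$ for the cosine and $(0, -2.5i, 0, 0, 2.5i)$ for the sine, so the entries of order $10^{-16}$ are floating-point rounding errors. A small sketch that measures the deviation from these exact values (the variable names are chosen here for illustration):

```julia
using FFTW

len = 5
x = [2pi * k / len for k in 0:len-1]

# Exact DFTs: one period of cos or sin puts all energy in bins 1 and len-1.
exact_cos = ComplexF64[0, 2.5, 0, 0, 2.5]
exact_sin = ComplexF64[0, -2.5im, 0, 0, 2.5im]

err_cos = maximum(abs.(fft(cos.(x)) .- exact_cos))
err_sin = maximum(abs.(fft(sin.(x)) .- exact_sin))

println("max error (cos): ", err_cos)
println("max error (sin): ", err_sin)
println("eps(Float64):    ", eps(Float64))
```

The errors are of the same order as `eps(Float64)`, which is small but not zero; a verified FFT bounds them rigorously instead of ignoring them.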


https://qiita.com/ageprocpp/items/0d63d4ed80de4a35fe79

https://cognicull.com/ja/f5q2jl62

The code below is the MATLAB code for `verifyfft` (from INTLAB).

function Z = verifyfft(z,sign)
%VERIFYFFT    Verified forward and backward 1-dimensional FFT
%
%   res = verifyfft(z,sign)
%
%   z     input vector or matrix
%         length of z must be a power of 2 
%   sign   1 forward FFT (default)
%         -1 inverse FFT
% 
%As in Matlab, the inverse FFT is scaled such that forward and inverse FFT
%are inverse operations.
%For matrix input, FFT is performed on each column; row vector input
%is converted into column vector. 
%For N-dimensional FFT apply verifyfft N times.
% 

% written  09/24/14     S.M. Rump  (based on Marcio Gameiro's code)
% modified 01/16/16     S.M. Rump  improved error estimates
%

% data generated by fft_data_gen
%

  global INTLAB_CONST
  
  [n,col] = size(z);
  if n==1
    if col==1
      Z = intval(z);
      return
    else
      isrow = 1;
      z = z(:);
      n = col;
      col = 1;
    end
  else
    isrow = 0;
  end
    
  if nargin==1
    sign = 1;       % default: forward
  end
  
  % check dimension
  log2n = round(log2(n));
  if 2^log2n~=n
    error('length must be power of 2')
  end
  
  % bit-reversal
  % v = bin2dec(fliplr(dec2bin(0:n-1,log2n))) + 1
  f = 2^(log2n-1);
  v = [0;f]; 
  for k=1:log2n-1
    f = 0.5*f;
    v = [ v ; f+v ];
  end
  z = z(v+1,:);
  
  % Danielson-Lanczos algorithm
  Z = intval(z);
  Index = reshape(1:n*col,n,col);
  nmax = INTLAB_CONST.FFTDATA_NMAX; % maximum in fft_data
  if n<=nmax
    r = INTLAB_CONST.FFTDATA_R;     % roots of unity in  r +/- d
    d = INTLAB_CONST.FFTDATA_D(log2n);
    Phi = midrad(r(1:nmax/n:nmax),d);
    if sign==-1
      Phi = (Phi.')';      
    end
  else
    % compute roots of unity, division exact because n is power of 2
    theta = intval('pi') * ( sign*(0:(n-1))'/n ); 
    Phi = cos(theta) + 1i*sin(theta);
  end
  v = 1:2:n;
  w = 2:2:n;
  t = Z(w,:);
  Z(w,:) = Z(v,:) - t;
  Z(v,:) = Z(v,:) + t;
  
  for index=1:(log2n-1)     % executed log2(n)-1 times (the first stage is done above)
    m = 2^index;
    m2 = 2*m;
    vw = reshape(1:n,m2,n/m2);
    v = vw(1:m,:);
    w = vw(m+1:m2,:);
%     t = bsxfun(@times,exp(1i*pi*(0:m-1)'/m),Z(w));  % doesn't work for intervals
%     theta = intval('pi') * (sign*(0:(m-1))'/m);     % division exact because m=2^p
%     t = exp(1i*theta) .* Z(w);
    indexv = reshape(Index(v(:),:),m,col*n/m2);
    indexw = reshape(Index(w(:),:),m,col*n/m2);
%     t = repmat(Phi(1:n/m:end),1,n/m2*col);
    t = Phi(1:n/m:end,ones(1,n/m2*col)) .* Z(indexw);   % Tony's trick
    Z(indexw) = Z(indexv) - t;
    Z(indexv) = Z(indexv) + t;
  end
  
  Z = [Z(1,:); flipud(Z(2:end,:))];
  if sign==-1
    Z = Z/n;        % error-free since n is a power of 2
  end
  
  if isrow          % change to row vector
    Z = transpose(Z);
  end
  
end
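
The bit-reversal step in `verifyfft` builds the permutation by repeatedly appending a shifted copy of the index vector. The following Julia sketch mirrors only that step, to make the construction easier to follow; `bitreversal_indices` is a hypothetical name, and the sketch assumes `log2n >= 1` (the MATLAB code returns early for a single element).

```julia
# Zero-based bit-reversal permutation for n = 2^log2n, built as in verifyfft:
# start from [0, n/2], then repeatedly halve f and append f .+ v.
# `bitreversal_indices` is a hypothetical helper; assumes log2n >= 1.
function bitreversal_indices(log2n::Integer)
    f = 2^(log2n - 1)
    v = [0, f]
    for _ in 1:log2n-1
        f ÷= 2                 # exact, since f is a power of 2
        v = vcat(v, f .+ v)    # doubles the length of v
    end
    return v
end

println(bitreversal_indices(3))   # permutation for n = 8
```

In the MATLAB code the input is then reordered with `z(v+1,:)`, the `+1` converting these zero-based indices to one-based ones.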
