# Discrete Fourier transform

## Intro

### Technical words

After reading this chapter, you should be familiar with all of these terms (if not, re-read the chapter!)

* Sampling, discretization
* Discrete Fourier Transform, discrete exponential waves
* Fast Fourier Transform
* Shannon theorem, Nyquist frequency
* Sampling rate
* Prefiltering
* upsampling, expansion, interpolation



### Pull some data from GitHub

In [None]:
import os

if not os.path.exists("assets_signal"):
    print("the directory assets_signal is created")
    !git clone https://github.com/vincentvigon/assets_signal
else:
    print("the directory assets_signal is updated")
    %cd assets_signal
    !git pull https://github.com/vincentvigon/assets_signal
    %cd ..


In [None]:
!pwd

### Import python

In [None]:
%reset -f
import numpy as np
import matplotlib.pyplot as plt
from IPython.display import Image

np.set_printoptions(linewidth=500,precision=3,suppress=True)
plt.style.use("default")

## Reminder on Fourier series

### A continuous interval

In [None]:
M=5           #max freq, initial value: 5
N=2*M+1       #nb element in the basis
nb_points=101 #for discretization, initial value: 101

"""we work on a non-symmetric interval (why not)"""
left = -1
right = 3
T = right - left

t = np.linspace(left, right,nb_points , endpoint=False)
plt.plot(t,np.zeros_like(t),'+');

### Stacking waves from $e_{-M}$ to $e_{+M}$ ($1\heartsuit\spadesuit$)

In [None]:
basis_exp=np.empty([N,len(t)],dtype=np.complex64)

for n in range(-M,+M+1):
    print("creation of the exponential wave corresponding to n=%d"%n)
    basis_exp[M+n,:]=np.exp(2*1j*np.pi*t*n/T)

basis_exp.shape

In [None]:
fig,axs=plt.subplots(N,2,figsize=(8,N),sharex=True, sharey=True)

for n in range(-M,M+1):
    m=n+M
    axs[m,0].plot(t,np.real(basis_exp[m,:]))
    axs[m,1].plot(t,np.imag(basis_exp[m,:]))

    axs[m,0].set_title("real part, n=%d"%n)
    axs[m,1].set_title("imag part, n=%d"%n)


fig.tight_layout()

Note the Hermitian symmetry above: $e_{-n}=\overline{e_n}$

***To you:*** $(1\heartsuit\spadesuit)$ Two of the previous plots are not so nice. Correct them.

---
***Answer:*** The plots of $e_0$ real and imaginary parts did not share their `y-axis` with the rest of the plots, which was less readable.

---

In [None]:
"we check the orthonormality by computing the hermitian products between exponential-waves"
all_dot_prod=basis_exp@basis_exp.T.conj()/len(t)
print(all_dot_prod)

### Decomposition of a signal ($1\heartsuit\spadesuit$)


In [None]:
"a signal we want to decompose"
f=(t-1)*(t-3)**2
plt.plot(t,f);

In [None]:
""" We compute the Fourier coefficients = the coordinates with respect to the expo-basis
alpha[i] =  her(f,basis[i,:]) ~ 1/len(t) sum_j  basis[i,j].conj() f[j] """
alpha = basis_exp.conj()@f /len(t)

In [None]:
"""we plot the amplitude-spectrum"""
fig,ax=plt.subplots()
ax.plot(range(-M,M+1),np.abs(alpha),".")
ax.set_xticks(range(-M,M+1));

In [None]:
"""the half amplitude spectrum"""
fig,ax=plt.subplots()
ax.plot(range(0,M+1),np.abs(alpha[M:]),".")
ax.set_xticks(range(0,M+1));

In [None]:
"""the half amplitude spectrum, with frequencies as x-labels"""
fig,ax=plt.subplots()
ax.plot(range(0,M+1),np.abs(alpha[M:]),".")
ax.set_xticks(range(0,M+1));

frequencies=np.arange(0,M+1)/T
ax.set_xticklabels(frequencies)
ax.set_xlabel("frequencies in Hertz");

In [None]:
""" f_approx[:] = sum_k  alpha[k] basis[k,:]   """
f_approx= alpha@basis_exp

plt.plot(t,f)
plt.plot(t,np.real(f_approx));

***To you:*** During the plot, a warning was thrown:
`ComplexWarning: Casting complex values to real discards the imaginary part`

Can you explain why $(1\heartsuit\spadesuit)$?

---
***Answer:*** `f_approx` is a complex vector. To be used as an argument of the `plot` function, it must be converted to a real vector. This is done by taking its real part and discarding its imaginary part. Hence the warning.

To avoid it, we can explicitly plot the real part of `f_approx`.

---

***To you:*** Rerun this whole section, changing the parameters as follows:

* first, set `nb_points=11` and keep `M=5`
* second, set `M=50` and reset `nb_points=101`. In this case, skip the drawing of all the waves: it takes too long.


In both cases, the matrix `basis_exp` is square, and has as many rows as the size of the signal, so `f_approx` is equal to `f`. This simply comes from the fact that the vector `f` can be expressed exactly in the basis `basis_exp`.
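The first case can be checked numerically. Here is a minimal sketch, assuming the same interval and the same signal as above with `M=5` and `nb_points=11`; the name `max_error` is introduced only for this illustration.

```python
import numpy as np

# Assumed parameters: the first suggested setting (M=5, nb_points=11)
M = 5
N = 2 * M + 1
nb_points = 11
left, right = -1, 3
T = right - left

t = np.linspace(left, right, nb_points, endpoint=False)
f = (t - 1) * (t - 3) ** 2

# Rows are the exponential waves e_{-M}, ..., e_{+M} sampled on t
n = np.arange(-M, M + 1)
basis_exp = np.exp(2j * np.pi * np.outer(n, t) / T)

# Fourier coefficients and reconstruction, as in the cells above
alpha = basis_exp.conj() @ f / len(t)
f_approx = alpha @ basis_exp

# Should be at the level of rounding errors, since the basis is square
max_error = np.max(np.abs(f_approx - f))
print("max reconstruction error:", max_error)
```

With fewer waves than points (for instance `M=5` and `nb_points=101`), the same code gives only an approximation of `f`.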


## Discrete-Fourier and FFT

Now we forget for a moment that signals are indexed by time. We take a purely discrete approach: a signal is simply a vector indexed by integers. We also define our waves directly as vectors.



### A discrete basis, from $0$ to $2M$


We define discrete-exponential-waves by:
$$
d_n(k)=e^{+2i\pi \frac {nk}N} \qquad k=0,1,...N-1
$$


***Remark:*** the vector $d_n$ is a discretisation of the expo-wave  $e_n(t)=e^{+2i\pi \frac {nt}T}$ but we keep this fact for later.

In [None]:
M=4
N=2*M+1
k=np.arange(0,N)

In [None]:
basis_dis=np.empty([N,N],dtype=np.complex64)
for n in range(N):
    basis_dis[n,:]=np.exp(2*1j*np.pi*n*k/N)

In [None]:
fig,axs=plt.subplots(N,2,figsize=(8,N),sharex=True,sharey=True)

for n in range(N):
    axs[n,0].plot(k,np.real(basis_dis[n,:]),".")
    axs[n,1].plot(k,np.imag(basis_dis[n,:]),".")

    axs[n,0].set_ylim(-1.1,1.1)
    axs[n,1].set_ylim(-1.1,1.1)

### Orthonormality ($1\heartsuit\spadesuit$)

The natural hermitian product for our purpose is:
$$
\mathtt{her}(u,v)= \frac 1 N \sum_{n=0}^{N-1} u_n\, \bar v_n
$$
which is the discrete equivalent of $\frac 1 T \int_0^T f \bar g$.


So the coordinates of $u$ on the discrete basis are:
$$
\beta_n= \mathtt{her}(u,d_n) = \frac 1 N \sum_{k=0}^{N-1} u_k e^{-2i\pi \frac{nk}N}
$$
(do not forget the minus sign, which comes from the conjugation). And so the reconstruction formula is:
$$
u_k = \sum_{n=0}^{N-1}  \beta_n  e^{+2i\pi \frac{nk}N}
$$

Vocabulary:


* $(\beta_n)$ are called the discrete Fourier coefficients
* They are often denoted by $\hat u_n$.
* The transformation $u\to \beta$ is called the discrete Fourier Transform.
* The transformation $\beta\to u$ is called the inverse discrete Fourier Transform.


We can also write the reconstruction formula without specifying the indices $k$:
$$
u = \sum_n \beta_n d_n
$$
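The transform and its inverse can be sketched with two matrix products. This is only an illustration on an arbitrary random vector (the size `N = 8` and the names `beta`, `u_back` are assumptions made for this sketch), using the hermitian product with its $\frac 1N$ factor.

```python
import numpy as np

N = 8
k = np.arange(N)
# Row n of the matrix is the discrete wave d_n
basis = np.exp(2j * np.pi * np.outer(k, k) / N)

rng = np.random.default_rng(0)   # arbitrary test vector
u = rng.standard_normal(N)

beta = basis.conj() @ u / N      # beta_n = her(u, d_n)
u_back = beta @ basis            # u = sum_n beta_n d_n

roundtrip_error = np.max(np.abs(u_back - u))
print("round-trip error:", roundtrip_error)
```

The vector `beta` computed this way should agree with `np.fft.fft(u) / N`.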

***To you:*** With python, check $(1\heartsuit\spadesuit)$ that this basis is orthonormal. Help: use the matrix multiplication.  

In [None]:
print(1/N * basis_dis @ np.conjugate(basis_dis).T)

### The same basis, from $-M$ to $+M$ ($1\heartsuit\spadesuit$) ($1\heartsuit\spadesuit$)

***To you:*** ($1\heartsuit\spadesuit$) Check that $d_n=d_{N+n}$ for all $n$. In particular, the family
$$
d_{-M}, ..., d_0,..., d_{+M}
$$
is a shifted version of
$$
d_0,...,d_{N-1}
$$
Moreover, from the fact that $e^{-ia}=\overline{e^{ia}}$ we have:
$$
d_{N-1} = d_{-1} = \overline{d_1}
$$

---
***Answer:*** This is a consequence of the $2i\pi$-periodicity of the exponential function : for all k, $d_{N+n}(k) = e^{2i\pi\frac{(N+n)k}{N}} = e^{2i\pi k + 2i\pi\frac {nk}{N} } = e^{2i\pi\frac{nk}{N}} = d_n(k)$.

---
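The same facts can be checked numerically. A small sketch, assuming `N = 9` as in the basis above; the helper `d` is introduced only for this check.

```python
import numpy as np

N = 9                      # same size as the basis above (M=4)
k = np.arange(N)

def d(n):
    """Discrete exponential wave d_n, for any integer n (also negative)."""
    return np.exp(2j * np.pi * n * k / N)

# d_n = d_{N+n} for a range of indices, including negative ones
periodic = all(np.allclose(d(n), d(n + N)) for n in range(-N, N))
# d_{N-1} = d_{-1} = conjugate of d_1
conjugate = np.allclose(d(N - 1), d(-1)) and np.allclose(d(-1), np.conj(d(1)))
print(periodic, conjugate)
```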

This shifted version $d_{-M}, ..., d_{+M}$ is closer to what we did with Fourier series, but from now on we will work with the natural version $d_0,...,d_{N-1}$.

You have to remember that, when you sort the waves from the lowest frequency to the highest frequency, you get:

* $d_0$ which is the constant wave
* $d_1$ and $d_{N-1}$ which are conjugate
* $d_2$ and $d_{N-2}$ which are conjugate
* ...
* $d_{M}$ and $d_{M+1}$ which are conjugate


Similarly, if we take an even $N=2M$, when you sort the waves from the lowest frequency to the highest frequency you get:

* $d_0$ which is the constant wave
* $d_1$ and $d_{N-1}$ which are conjugate
* ...
* $d_{M}$ which is its own conjugate, so what ($1\heartsuit\spadesuit$)?

---
***Answer:*** $d_M$ is real. In fact, $d_M(k) = e^{i\pi k} = (-1)^k$.

---


In [None]:
"""The same basis as before, but putting the constant wave in the middle (as for Fourier series)"""
basis_dis_dec=np.empty([N,N],dtype=np.complex64)
for n in range(-M,M+1):
    basis_dis_dec[n+M,:]=np.exp(2*1j*np.pi*n*k/N)

In [None]:
fig,axs=plt.subplots(N,2,figsize=(8,N),sharex=True,sharey=True)

for n in range(N):
    axs[n,0].plot(k,np.real(basis_dis_dec[n,:]),".")
    axs[n,1].plot(k,np.imag(basis_dis_dec[n,:]),".")

    axs[n,0].set_ylim(-1.1,1.1)
    axs[n,1].set_ylim(-1.1,1.1)

### Discrete decomposition with fft $(1\heartsuit\spadesuit)$

Somebody gives us a sampled (=discretized) signal. It is simply a vector: we do not even know the original duration in seconds. But we can decompose it in the discrete basis.



In [None]:
f2=np.loadtxt("assets_signal/signalToFilter.txt")
N=len(f2)

fig,ax=plt.subplots(figsize=(8,2))
ax.plot(range(N),f2);

In [None]:
"""We take the basis which has as many elements as the length of the discrete-signal"""
basis_dis=np.empty([N,N],dtype=np.complex128)
x=np.arange(0,N)
for k in range(N):
    basis_dis[k,:]=np.exp(x*2*1j*np.pi*k/N)

In [None]:
%%time
alpha = basis_dis.conj()@f2 / N

The FFT (Fast Fourier Transform) is a fast algorithm to compute the discrete Fourier transform. It is recursive: the transform of a signal of size $N$ is built from the transforms of two sub-signals of size $N/2$. So the complexity of the FFT is $N\log(N)$, while the natural algorithm, which is a matrix multiplication, has complexity ... $(1\heartsuit\spadesuit)$.

---
***Answer:*** $N^2$ because it is a matrix-vector multiplication.

---
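To make the recursion concrete, here is a textbook radix-2 sketch. It assumes that the size is a power of two and uses the convention without the $\frac 1N$ factor; it is not how NumPy is implemented, and the name `fft_recursive` is ours.

```python
import numpy as np

def fft_recursive(u):
    """Radix-2 Cooley-Tukey FFT (no 1/N factor); len(u) must be a power of 2."""
    u = np.asarray(u, dtype=complex)
    N = len(u)
    if N == 1:
        return u
    even = fft_recursive(u[0::2])   # transform of the even-indexed samples
    odd = fft_recursive(u[1::2])    # transform of the odd-indexed samples
    # Twiddle factors combine the two half-size transforms
    twiddle = np.exp(-2j * np.pi * np.arange(N // 2) / N)
    return np.concatenate([even + twiddle * odd, even - twiddle * odd])

rng = np.random.default_rng(0)      # arbitrary test signal
u = rng.standard_normal(64)
same_as_numpy = np.allclose(fft_recursive(u), np.fft.fft(u))
print(same_as_numpy)
```

Each level of the recursion costs about $N$ operations and there are $\log_2(N)$ levels, hence the $N\log(N)$ complexity.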

In [None]:
%%time
alpha_fft=np.fft.fft(f2)

In [None]:
fig,(ax0,ax1)=plt.subplots(2,1,figsize=(8,4))
ax0.plot(range(N),np.abs(alpha))
ax0.set_title("Fourier coef by ourselves")
ax1.plot(range(N),np.abs(alpha_fft))
ax1.set_title("Fourier coef by fft")
fig.tight_layout();

But note the difference: `np.fft.fft()` does not divide by `N` (see later on).

In [None]:
%%time
f2_recons= alpha@basis_dis

In [None]:
%%time
f2_recons_fft=np.fft.ifft(alpha_fft)

In [None]:
fig,(ax0,ax1)=plt.subplots(2,1,figsize=(8,4))
ax0.plot(range(N),np.real(f2_recons))
ax0.set_title("signal reconstructed by ourselves")
ax1.plot(range(N),np.real(f2_recons_fft))
ax1.set_title("signal reconstructed by fft")
fig.tight_layout();

### Be careful with the conventions

I choose to define the discrete Fourier coefficients as
$$
\beta_k=\frac 1 N \sum_{n=0}^{N-1} u_n e^{-2i\pi \frac{nk}N}
$$
$$
Which gives the reconstruction formula:
$$
u_n = \sum_{k=0}^{N-1}  \beta_k  e^{+2i\pi \frac{nk}N}
$$


But most people define
$$
\tilde \beta_k= \sum_{n=0}^{N-1} u_n e^{-2i\pi \frac{nk}N}
$$
So the $\tilde \beta_k$ are $N$ times the coordinates in the basis $(d_n)$, and the reconstruction formula becomes:
$$
u_n =\frac 1 N \sum_{k=0}^{N-1}  \tilde  \beta_k  e^{+2i\pi \frac{nk}N}
$$

Other people put $\frac 1 {\sqrt{N}}$ in front of both formulas.
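In NumPy these three conventions can be selected with the `norm` argument of `np.fft.fft` and `np.fft.ifft`. A small sketch on an arbitrary vector, assuming a NumPy version recent enough to accept `norm="forward"` (1.20 or later, as far as we know):

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary test vector
u = rng.standard_normal(16)
N = len(u)

beta_tilde = np.fft.fft(u)                 # default: no factor in the direct transform
beta = np.fft.fft(u, norm="forward")       # 1/N in the direct transform (our beta_k)
beta_ortho = np.fft.fft(u, norm="ortho")   # 1/sqrt(N) in both directions

# The inverse must be called with the same convention
u_back = np.fft.ifft(beta, norm="forward").real
print(np.allclose(u_back, u))
```

Whatever the choice, the direct and the inverse transforms must use the same convention, otherwise a factor $N$ appears in the reconstruction.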



### Be careful with the amplitude spectrum



The list of coefficients $(\beta_n)$ is also called the spectrum of the discrete signal and $(|\beta_n|)$ is called the amplitude spectrum. Note that the signal must be reconstructed from the spectrum, and not from the amplitude spectrum. A very common error is to think that

         ifft(abs(fft(signal))) == signal

Trick: to avoid bugs, I advise you to always give explicit names to your variables, e.g. `amplitude_spectrum`.
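A tiny illustration of this error, on an assumed test signal (a sine with 3 periods over 32 points): the amplitude spectrum forgets the phase, and the sine comes back as a cosine.

```python
import numpy as np

t = np.arange(32) / 32
signal = np.sin(2 * np.pi * 3 * t)

spectrum = np.fft.fft(signal)
amplitude_spectrum = np.abs(spectrum)

good = np.fft.ifft(spectrum).real             # reconstruction from the spectrum
bad = np.fft.ifft(amplitude_spectrum).real    # "reconstruction" from the amplitudes

print(np.allclose(good, signal), np.allclose(bad, signal))
```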


### Optimization for real signal

Because of the Hermitian symmetry, for a real signal one only needs to store half of the spectrum. Next, we will work with `np.fft.rfft` and `np.fft.irfft`, which produce and invert the half-spectrum. The letter `r` stands for `real`.
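Before using them on our data, the symmetry and the link between `np.fft.fft` and `np.fft.rfft` can be checked on a small vector. The random vector of size 10 below is only an assumption for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
u = rng.standard_normal(10)       # a real signal
N = len(u)

full = np.fft.fft(u)
half = np.fft.rfft(u)

# Hermitian symmetry: full[N-k] is the conjugate of full[k]
symmetric = np.allclose(full[1:][::-1], np.conj(full[1:]))
# rfft keeps the coefficients of index 0, ..., N//2
is_first_half = np.allclose(half, full[: N // 2 + 1])
# Passing n=N to irfft removes the ambiguity between even and odd sizes
u_back = np.fft.irfft(half, n=N)
print(symmetric, is_first_half, np.allclose(u_back, u))
```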

In [None]:
half_spectrum=np.fft.rfft(f2)
"""the size is divided by 2, but it is still complex"""
half_spectrum.shape,half_spectrum.dtype

In [None]:
fig,ax=plt.subplots(figsize=(8,2))
ax.plot(range(len(half_spectrum)),np.abs(half_spectrum));

In [None]:
f2_recons_rfft=np.fft.irfft(half_spectrum)
f2_recons_rfft.shape,f2_recons_rfft.dtype

In [None]:
fig,ax=plt.subplots(figsize=(8,2))
ax.plot(range(len(f2_recons_rfft)),f2_recons_rfft);

## Time and frequencies comeback $\hookleftarrow$ $(2\heartsuit\spadesuit)$

Somebody tells us that the previous signal has a duration of 2 seconds. So we can add more informative x-labels.

In [None]:
f2=np.loadtxt("assets_signal/signalToFilter.txt")
N=len(f2)
T=2

t=np.linspace(0,T,N)
fig,ax=plt.subplots(figsize=(8,2))
ax.plot(t,f2);

We would also like to add more informative x-labels on the spectrum. The natural choice is to put under the coefficient $\beta_n$ the frequency of the wave $d_n$ viewed as a signal of $T$ seconds.


Because of the Hermitian symmetry, this frequency is:

* $\frac n T$  when $n \leq \frac N2$.
* $\frac {N-n} T$  when $n> \frac N2$.


The waves are indexed from $0$ to $N-1$, so the frequencies go from $0$ to $\frac{(N-1)}{2}\frac 1 T$ (for odd $N$; for even $N$ the highest frequency is $\frac N2 \frac 1T$).
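NumPy provides helpers that compute these labels. A short sketch, with an assumed size `N = 9` and duration `T = 2`, comparing the rule above with `np.fft.fftfreq` (signed frequencies) and `np.fft.rfftfreq` (labels of the half-spectrum):

```python
import numpy as np

N = 9          # illustrative size (odd)
T = 2.0        # assumed duration in seconds
n = np.arange(N)

# Frequency of d_n by the rule above: n/T up to the middle, then (N-n)/T
by_hand = np.where(n <= N // 2, n, N - n) / T

# NumPy's helpers take the time step d = T/N
signed = np.fft.fftfreq(N, d=T / N)   # negative values for the second half
half = np.fft.rfftfreq(N, d=T / N)    # labels for the output of np.fft.rfft

print(by_hand)
print(signed)
print(half)
```

The absolute values of `signed` coincide with `by_hand`; the sign only recalls that the second half of the spectrum corresponds to the conjugate waves.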


In [None]:
basis_dis=np.empty([N,N],dtype=np.complex64)
k=np.arange(0,N)
for n in range(N):
    basis_dis[n,:]=np.exp(2*1j*np.pi*n*k/N)

""" the wave basis_dis[3,:], with the good xlabels """
fig,ax=plt.subplots(figsize=(8,2))
ax.plot(t,np.real(basis_dis[3,:]))
ax.set_title("this wave has frequency 3/T");

In [None]:
half_spectrum=np.fft.rfft(f2)
frequencies=np.linspace(0,(N-1)/2/T,len(half_spectrum))

fig,ax=plt.subplots(figsize=(8,2))
ax.plot(frequencies,np.abs(half_spectrum))
ax.set_xlabel("frequencies in Hz");

In [None]:
frequencies_zoom=frequencies[:120]
spectrum_zoom=half_spectrum[:120]

fig,ax=plt.subplots(figsize=(8,2))
ax.plot(frequencies_zoom,np.abs(spectrum_zoom))
ax.set_xlabel("frequencies in Hz");

***To you:*** $(2\heartsuit\spadesuit)$ Zoom in on the interval $[0,60Hz]$. Help: you have to find the right index with a formula.

---
***Answer:*** $\frac n T = 60 \Longleftrightarrow n = T\times60 = 120$

---

## Good and bad sampling $\hookleftarrow$

When you discretize (=sample) a signal, the sampling rate is the number of points you take per second. So the interval between two points is the inverse of the sampling rate.

The Shannon criterion states that, to make a good sampling, the sampling rate must be at least twice the highest frequency present in the signal (in its Fourier series decomposition).

This theorem is usually given with this other formulation: with a given sampling rate $\nu$ (ex: 44100Hz), the highest frequency that you can sample is $\frac \nu 2$. This frequency is called the Nyquist frequency.

Let's observe why.



### A smooth signal and its half-spectrum

In [None]:
"A periodic signal, but only plotted on a bounded interval of T seconds"
def signal(t):
    return np.sin(8*2*np.pi*t)+0.5*np.sin(20*2*np.pi*t)

"""we plot it smoothly"""
T=4    #duration
t_smooth=np.linspace(0,T,1000)
fig,ax=plt.subplots(figsize=(10,1))
signal_smooth=signal(t_smooth)
ax.plot(t_smooth,signal_smooth);

From its definition, the highest frequency present in this signal is $20 Hz$.

In [None]:
half_amplitude_spectrum=np.abs(np.fft.rfft(signal_smooth))
freqs=np.linspace(0,len(t_smooth)/(2*T),len(half_amplitude_spectrum))
plt.plot(freqs,half_amplitude_spectrum);
plt.xlabel("frequencies in Hz")

### Different samplings

We keep this signal, but we sample it with smaller and smaller sampling rates. It is clear that:

* with a very high sampling rate (ex: 200) we recover this signal just with our eyes.
* with a very low sampling rate (ex: 10) it is impossible to imagine the original signal.

It is less clear that something changes at the sampling rate $40$, which is twice the highest frequency present in the signal.

In [None]:
sampling_rates=[200,100,50,44,42,40,38,36,20,10]

nb=len(sampling_rates)
fig,axs=plt.subplots(nb,1,figsize=(8,nb),sharex=True)


for i,sampling_rate in enumerate(sampling_rates):
    t=np.linspace(0,T,sampling_rate*T)
    axs[i].plot(t,signal(t),".-")
    axs[i].set_title("sampling rate: %d"%(sampling_rate))

fig.tight_layout()

Let's now observe the amplitude spectrum. Observe how the peak of the highest frequency bounces back from the right when the sampling rate goes under 40.

When the sampling rate is really too small, it is not easy to see which of the two peaks of the deteriorated spectrum corresponds to which of the two peaks of the original spectrum. In particular, because of this rebound, the two peaks can add up, definitively deteriorating the information. This phenomenon is called 'aliasing' (in French: repliement or recouvrement de spectre).

In [None]:
fig,axs=plt.subplots(nb,1,figsize=(8,nb))

for i,sampling_rate in enumerate(sampling_rates):

    t=np.linspace(0,T,sampling_rate*T)
    spectrum=np.abs(np.fft.fft(signal(t)))/len(t)
    freqs=np.linspace(0,len(t)/T,len(spectrum))
    axs[i].plot(freqs,spectrum,".")
    axs[i].set_title("sampling rate: %d"%(sampling_rate))

fig.tight_layout()

### Theoric explanation $(1\heartsuit\spadesuit)$ $(2\heartsuit\spadesuit)$

Continuous-time signals $f$ on $[0,T]$ can be written:
$$
f(t) =   \sum_{j\in \mathbb Z} \alpha_j   e^{ 2i\pi  \frac {j t} T   }
$$
Consider $u=(u_0,...,u_{N-1})$ a vector which is a sampling of $f$, so:
$$
 u_n = f(n \frac {T}N)   \qquad \text{for } n=0,...,N-1
$$
Let us denote by $\beta_k$ its coordinates in the discrete exponential basis of size $N$:
$$
u_n = \sum_{k=0}^{N-1}  \beta_k  e^{2i\pi   \frac { k    n } N }
$$


Let us find the relation between the infinite spectrum $(\alpha_j)_{j\in \mathbb Z}$ and the finite spectrum $(\beta_k)_{k\in 0..N-1}$

\begin{alignat}{1}
  u_n = f(n \frac {T}N) &=  \sum_{j \in \mathbb Z} \alpha_j   \exp ( 2i\pi   \frac {j n } N ) \\
  &=  \sum_{q \in \mathbb Z}   \sum_{k=0  }^{N-1}  \alpha_{k+qN}   \exp ( 2i\pi   \frac { (k+qN  )  n } N ) \\
    &=  \sum_{q \in \mathbb Z}   \sum_{k=0  }^{N-1}  \alpha_{k+qN}   \exp ( 2i\pi   \frac { k    n } N ) \\
        &=    \sum_{k=0  }^{N-1}   \Big(  \sum_{q \in \mathbb Z}   \alpha_{k+qN}  \Big)    \exp ( 2i\pi   \frac { k    n } N ) \\
\end{alignat}

From the uniqueness of the decomposition, we get:
$$
  \beta_k =  \Big(  \sum_{q \in \mathbb Z}   \alpha_{k+qN}  \Big)
$$

Interpretation of the above formula: Imagine that the infinite spectrum $\alpha$ is a paper band where coefficients are printed. If you roll this band on itself, with a period $N$, and if you sum the coefficients which superpose because of the rolling, you get the finite spectrum $\beta$.

If all the $\alpha_j$ outside of $[-N/2, +N/2]$ are zero, this rolling and summing is not destructive. Otherwise, the summation can mix the frequencies, producing aliasing: the information present in $f$ cannot be recovered from $u$.
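The rolling formula can be checked on a toy example. In this sketch we assume a continuous spectrum with only three non-zero coefficients ($\alpha_3$, $\alpha_{13}$ and $\alpha_{-7}$, values chosen arbitrarily) and $N=10$ samples, so that the three coefficients land on the same index $k=3$.

```python
import numpy as np

N = 10         # number of samples
T = 1.0        # assumed duration
# Assumed continuous spectrum: only three non-zero coefficients alpha_j
alpha = {3: 1.0, 13: 0.5, -7: 0.25j}

def f(t):
    """The continuous signal built from its Fourier coefficients."""
    return sum(a * np.exp(2j * np.pi * j * t / T) for j, a in alpha.items())

n = np.arange(N)
u = f(n * T / N)                 # sampling of f
beta = np.fft.fft(u) / N         # discrete coefficients (1/N convention)

# Roll the infinite spectrum with period N and sum what overlaps
beta_rolled = np.zeros(N, dtype=complex)
for j, a in alpha.items():
    beta_rolled[j % N] += a

print(np.allclose(beta, beta_rolled))
```

Here $\beta_3 = \alpha_3+\alpha_{13}+\alpha_{-7}$: from $\beta_3$ alone, there is no way to separate the three contributions.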


Of course, all this depends on $N$. So, to avoid aliasing, we have to choose a sufficiently large sampling rate. For sound, sampling rates go from 8000 Hz (very, very low quality) to 192 000 Hz (very, very high quality). But the most usual one is 44100 Hz.


***To you:*** $(1\heartsuit\spadesuit)$ Explain why $44100 Hz$ is a reasonable choice, considering that audible sounds go from 20 Hz to 20 000 Hz.

---
***Answer:*** "With a given sampling rate $\nu$ (ex: 44100Hz), the highest frequency that you can sample is $\frac \nu 2$". If we want to sample frequencies up to 20 000 Hz, we have to choose $\nu > 2\times 20\ 000 = 40\ 000$, so 44 100 Hz is fine.

---

***To you:***  $(2\heartsuit\spadesuit)$ In the previous code, we see a peak of the spectrum that bounces back from the right. This is due to the Hermitian symmetry and to the fact that we plot only the half-amplitude spectrum. Redo these plots with the complete amplitude spectrum to see the rolling. Help: use `np.fft.fft` in place of `np.fft.rfft`.

---
***Bonus:*** ($5\star$)
The aliasing comes from the fact that when discretized, some sine/cosine functions will give the same vector. This is illustrated below for a sampling rate of 20.

As we can see, sine and cosine functions with frequencies 4, 24 and 44 have the exact same discretization.

There is also something interesting with the "symmetrical" frequencies 16, 36, etc.: the corresponding cosine functions have the same discretization as the previous ones (those with frequencies 4, 24, ...), while the sine functions have the opposite discretization. This is due to the Hermitian symmetry of the spectrum of real functions.
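These equalities can also be checked without any plot. A minimal sketch for a sampling rate of 20 over one second; the helper names `cos_samples` and `sin_samples` are ours.

```python
import numpy as np

nu = 20                      # sampling rate (Hz)
n = np.arange(nu) / nu       # sampling times over one second

def cos_samples(freq):
    return np.cos(2 * np.pi * freq * n)

def sin_samples(freq):
    return np.sin(2 * np.pi * freq * n)

# Frequencies 4, 24, 44: same discretization
same_cos = all(np.allclose(cos_samples(4), cos_samples(fr)) for fr in (24, 44))
same_sin = all(np.allclose(sin_samples(4), sin_samples(fr)) for fr in (24, 44))
# "Symmetrical" frequency 16: same cosine, opposite sine
mirror_cos = np.allclose(cos_samples(16), cos_samples(4))
mirror_sin = np.allclose(sin_samples(16), -sin_samples(4))
print(same_cos, same_sin, mirror_cos, mirror_sin)
```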


In [None]:
nu = 20
T = 1
N = nu*T
t = np.linspace(0,T, 1000*T)
n = np.linspace(0,T, N, endpoint=False)

fig, ax = plt.subplots(6,2,figsize=(12,12),sharex=True,sharey=True)
fig.suptitle("Sampling rate : "+str(nu))

for q in range(3):
  freq = 4 + q*N

  ax[2*q,0].plot(t, np.cos(2*np.pi*freq*t), alpha = 0.5, label="real signal")
  ax[2*q,0].plot(n, np.cos(2*np.pi*freq*n), '.-', label="sampled signal")
  ax[2*q,0].set_title("cos of freq " + str(freq))
  ax[2*q,0].legend()

  ax[2*q,1].plot(t, np.sin(2*np.pi*freq*t), alpha = 0.5, label="real signal")
  ax[2*q,1].plot(n, np.sin(2*np.pi*freq*n), '.-', label="sampled signal")
  ax[2*q,1].set_title("sin of freq " + str(freq))
  ax[2*q,1].legend()

  freq = (q+1)*N - 4
  ax[2*q+1,0].plot(t, np.cos(2*np.pi*freq*t), alpha = 0.5, label="real signal")
  ax[2*q+1,0].plot(n, np.cos(2*np.pi*freq*n), '.-', label="sampled signal")
  ax[2*q+1,0].set_title("cos of freq " + str(freq))
  ax[2*q+1,0].legend()

  ax[2*q+1,1].plot(t, np.sin(2*np.pi*freq*t), alpha = 0.5, label="real signal")
  ax[2*q+1,1].plot(n, np.sin(2*np.pi*freq*n), '.-', label="sampled signal")
  ax[2*q+1,1].set_title("sin of freq " + str(freq))
  ax[2*q+1,1].legend()

Discretization of sine and cosine functions with other frequencies can be seen below. Cases 0, 1, 10, 19, 20 are interesting to understand what is going on.

In [None]:
from matplotlib import animation, rc
from IPython.display import HTML

nu = 20
T = 1
N = nu*T
t = np.linspace(0,T, 1000*T)
n = np.linspace(0,T, N, endpoint=False)

# animation function: this is called sequentially
def animate(k):
  line1.set_data(t, np.cos(2*np.pi*k*t))
  line2.set_data(n, np.cos(2*np.pi*k*n))
  ax[0].set_title("cos of freq " + str(k))

  line3.set_data(t, np.sin(2*np.pi*k*t))
  line4.set_data(n, np.sin(2*np.pi*k*n))
  ax[1].set_title("sin of freq " + str(k))
  return (line1,line2,line3,line4,)

#Plot setup
fig, ax = plt.subplots(1,2,figsize=(10,4),sharey=True)
plt.close()
ax[0].set_xlim(( 0, 1))
ax[0].set_ylim(( -1, 1))
ax[1].set_xlim(( 0, 1))
ax[1].set_ylim(( -1, 1))

line1, = ax[0].plot([], [], alpha = 0.5)
line2, = ax[0].plot([], [], '.-')
line3, = ax[1].plot([], [], alpha = 0.5)
line4, = ax[1].plot([], [], '.-')

#Creation and display of the animation
anim = animation.FuncAnimation(fig, animate, frames=44, blit=True)

rc('animation', html='jshtml')
anim

Thus, with a sampling rate of 20, the signal $f : t \mapsto 2\cos(22\times2\pi t) + 3\sin(13\times2\pi t)$ will have the exact same discretization as the signal $g : t \mapsto 2\cos(2\times2\pi t) - 3\sin(7\times2\pi t)$ (see below).

In [None]:
fig,ax = plt.subplots(1,2,figsize=(10,4),sharey=True)

ax[0].set_title("f function and its discretization")
ax[0].plot(t, 2*np.cos(22*2*np.pi*t) + 3*np.sin(13*2*np.pi*t), alpha = 0.5, label="real signal")
ax[0].plot(n, 2*np.cos(22*2*np.pi*n) + 3*np.sin(13*2*np.pi*n), '.-', label="sampled signal")
ax[0].legend()

ax[1].set_title("g function and its discretization")
ax[1].plot(t, 2*np.cos(2*2*np.pi*t) - 3*np.sin(7*2*np.pi*t), alpha = 0.5, label="real signal")
ax[1].plot(n, 2*np.cos(2*2*np.pi*n) - 3*np.sin(7*2*np.pi*n), '.-', label="sampled signal")
ax[1].legend();

### To recover a signal well sampled $(2\heartsuit\spadesuit)$ $(3\heartsuit\spadesuit)$



In [None]:
sampling_rate=60
t=np.linspace(0,T,sampling_rate*T)
signal_sampled=signal(t)

fig,ax=plt.subplots(figsize=(10,1))
plt.plot(t,signal_sampled,".-");

Now, we will perform upsampling (=expansion=interpolation) on the sampled signal above, which means that you will create a smoother sampled signal, closer to the original signal.



You do not know the analytic expression of the signal, but you know:

*  That the sampling rate of the sample is 60Hz.
*  That the original signal does not contain frequencies greater than 30Hz.


***To you:***

* Plot its half-amplitude spectrum with the right x-labels $(2\heartsuit\spadesuit)$.

* $(3\heartsuit\spadesuit)$ Modify the half-spectrum, and use `np.fft.irfft` to obtain a signal with a sampling rate as big as you want.



In [None]:
freq = np.linspace(0, sampling_rate/2, (T*sampling_rate)//2 + 1)
half_spectrum = np.fft.rfft(signal_sampled)

sampling_rate_new = 240
N_old = len(signal_sampled)
N_new = N_old*sampling_rate_new//sampling_rate
# zero-padding: the new half-spectrum must have N_new//2+1 coefficients
half_spectrum_ext = np.concatenate([half_spectrum, np.zeros(N_new//2 + 1 - len(half_spectrum))])
# irfft divides by N_new instead of N_old, so we rescale the amplitude
signal_rec = np.fft.irfft(half_spectrum_ext, n=N_new)*N_new/N_old

fig, ax = plt.subplots(2,1,figsize=(12,10))

ax[0].set_title("Half amplitude spectrum")
ax[0].set_xlabel("Frequency (Hz)")
ax[0].plot(freq, np.abs(half_spectrum),'.')

ax[1].set_title("Signal")
ax[1].set_xlabel("Time (s)")
ax[1].plot(t[t<1], signal_sampled[t<1], label="original sampling")
# the upsampled points cover one period of the sampled signal, i.e. N_old time steps
t_new = t[0] + np.arange(N_new)*(t[1]-t[0])*N_old/N_new
ax[1].plot(t_new[t_new<1], signal_rec[t_new<1], '.-', label="upsampling")
ax[1].legend();
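The zero-padding idea can be isolated in a small function and validated against the analytic signal. This is only a sketch under two assumptions: the samples cover exactly one period of a periodic signal (times built with `np.arange`, so the endpoint is excluded), and there is no energy exactly at the Nyquist frequency. The name `upsample` is ours.

```python
import numpy as np

def upsample(samples, factor):
    """Upsample a real signal by an integer factor, by zero-padding its half-spectrum."""
    n_old = len(samples)
    n_new = n_old * factor
    half = np.fft.rfft(samples)
    padded = np.zeros(n_new // 2 + 1, dtype=complex)
    padded[: len(half)] = half
    # irfft divides by n_new instead of n_old, hence the rescaling by factor
    return np.fft.irfft(padded, n=n_new) * factor

def signal(t):
    return np.sin(8 * 2 * np.pi * t) + 0.5 * np.sin(20 * 2 * np.pi * t)

T, rate, factor = 1, 60, 4
t_coarse = np.arange(T * rate) / rate
t_fine = np.arange(T * rate * factor) / (rate * factor)

fine = upsample(signal(t_coarse), factor)
max_error = np.max(np.abs(fine - signal(t_fine)))
print("max error against the analytic signal:", max_error)
```

Since the signal contains only the frequencies 8 Hz and 20 Hz, below the Nyquist frequency of 30 Hz, the upsampled points fall on the analytic curve up to rounding errors.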

### Pre-filtering $(4\star)$


***To you:*** You are a sound engineer. You have recorded the famous band "The Beatles" in your studio. You have a very good digital (=numeric) recorder, which allows a sampling rate of 88200 Hz.

But you have to engrave this song on a vinyl with sampling rate 22050 Hz. Which precaution must you take?

---
***Answer:*** We have to set the high frequencies ( > 11 025 Hz) to zero to avoid aliasing.

---
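To see what happens without this precaution, here is a minimal sketch, assuming a recording of one second that contains only a parasite tone at 15000 Hz. We keep one sample out of four, with no prefiltering, and look for the peak of the spectrum.

```python
import numpy as np

sr_high, sr_low = 88200, 22050
step = sr_high // sr_low                    # keep one sample out of 4
t = np.arange(sr_high) / sr_high            # one second of "recording"
parasite = np.sin(2 * np.pi * 15000 * t)    # a tone above the new Nyquist frequency

naive = parasite[::step]                    # decimation without prefiltering
amplitude = np.abs(np.fft.rfft(naive))
freqs = np.fft.rfftfreq(len(naive), d=1 / sr_low)
alias_freq = freqs[np.argmax(amplitude)]
print("the parasite now appears at", alias_freq, "Hz")
```

The tone is folded to $22050-15000=7050$ Hz, in the middle of the audible band: an inaudible-ish parasite has become a clearly audible one.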

***Bonus:*** Write a little program that illustrates this:

* create a simple sound (with some audible frequencies, ex: 440Hz), as a python function.
* add to it some very high frequency at 15000 Hz (some parasite).
* make a discretization of this signal with sampling-rate 88200 Hz. Imagine that this is your original recording.
* Now prefilter this sound and from this, create a discretization with sampling-rate 22050 Hz.


Remark: In the era of The Beatles, recorders were not digital but analog (=analogique in French). So the prefiltering was also done with analog filters, made with electronics.




In [None]:
def sound(t):
  return np.sin(2*np.pi*440*t) + 2*np.sin(2*np.pi*800*t) + 0.1*np.sin(2*np.pi*15000*t)

In [None]:
T = 1
sr = 88200#sampling rate
t = np.linspace(0,1,sr*T)
recording = sound(t)

In [None]:
fig, ax = plt.subplots(2,1,figsize=(10,8))
ax[0].plot(t[t<0.005], recording[t<0.005])

half_spectrum = np.fft.rfft(recording)
frequence = np.linspace(0, sr/2, len(half_spectrum))
ax[1].plot(frequence, np.abs(half_spectrum));

In [None]:
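recording_compressed = np.fft.irfft(half_spectrum[frequence<11025])
# irfft divides by the new (smaller) length, so we rescale to keep the original amplitude
recording_compressed *= len(recording_compressed)/len(recording)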
t_comp = np.linspace(0,T, len(recording_compressed))
plt.plot(t_comp[t_comp<0.005], recording_compressed[t_comp<0.005]);

### To recover a signal badly sampled $(3\heartsuit\spadesuit)$ $(4\star)$


Imagine: you are doing a thesis in astronomy. Your advisor, Huber Rives, gives you a signal to analyze: it is a very low-energy signal, so it is difficult to detect. Instruments on board an observation satellite can give you a discretization of this signal, with a sampling rate of 100Hz (only). Here is a conversation:

* (you) But master, this sampling rate is really too low, there is probably aliasing.
* (Huber Rives) You're right, but you should not forget that, theoretically, the frequencies of such a signal range from 60Hz to 90Hz.
* (you) Oh yes master, I understand.



(perhaps you did not really understand immediately, but you will think about it)








In [None]:
signal_badly_sampled=np.loadtxt("assets_signal/signalFromSpace.txt")
"""we know that the duration is 1s"""
T=1
N=len(signal_badly_sampled)
freq_ech=N/T
fig,ax=plt.subplots(figsize=(10,1))
ax.plot(np.linspace(0,T,N),signal_badly_sampled,".-")
ax.set_title("signal_badly_sampled")
ax.set_xlabel("time in sec");

In [None]:
spec_rolled=np.fft.fft(signal_badly_sampled)
fig,ax=plt.subplots(figsize=(10,1))
freqs=np.linspace(0,freq_ech,len(spec_rolled))
ax.plot(freqs,np.abs(spec_rolled),".-")
ax.set_title("amplitude spectrum (rolled)")
ax.set_xlabel("freqs in Hz");

***To you:***

* $(3\heartsuit\spadesuit)$ On paper, try to draw the right infinite amplitude spectrum of the original signal: recall that it is an even function. The constraint is: once rolled with a period of 100, it gives you the above drawing. To validate, reproduce this drawing very schematically with Python (e.g. with a `plt.bar`).


* $(4\star)$ Modify the spectrum, then invert the transform, to recover the right signal.

In [None]:
fig,ax = plt.subplots(figsize=(10,1))
ax.set_title("full amplitude spectrum")
ax.bar([-100, -78, -71, 71, 78, 100], [0, 1, 2, 2, 1, 0]);

In [None]:
fig,ax = plt.subplots(figsize=(5,1))
ax.set_title("rolled amplitude spectrum")
ax.bar([0,22,29,71,78,100], [0, 1, 2, 2, 1, 0]);

In [None]:
new_sr = 1000
# a half-spectrum of size new_sr//2+1 gives back a signal of size new_sr
new_spec = np.concatenate([ np.zeros(50), spec_rolled[N//2:], np.zeros(new_sr//2+1-50-N//2) ])
original_signal = np.fft.irfft(new_spec)*new_sr/N

fig,ax=plt.subplots(2,1,figsize=(10,4))
fig.suptitle("original half amplitude spectrum and original signal (upsampled)")
ax[0].plot(np.linspace(0,len(new_spec),len(new_spec)), np.abs(new_spec))
ax[1].plot(np.linspace(0,T,len(original_signal)),original_signal);

Checking :

In [None]:
# recovered signal with sampling rate 100 :
downsampled_signal = original_signal[np.arange(0,new_sr,new_sr//N)]

spec_rolled_bis=np.fft.fft(downsampled_signal)
fig,ax=plt.subplots(figsize=(10,1))
freqs=np.linspace(0,freq_ech,len(spec_rolled_bis))
ax.plot(freqs,np.abs(spec_rolled_bis),".-")
ax.set_title("amplitude spectrum (rolled)")
ax.set_xlabel("freqs in Hz");

### To provoke aliasing

In many scientific domains, one plays with aliasing to observe signals (including images): some high-frequency waves, which would be invisible to our detectors, become detectable because of the rolling of the spectrum.

To create aliasing, one can provoke multiplicative or additive interference. One can also use several distant detectors, which makes it possible to create interference between a signal and a shifted version of itself.


***To you:***

* $1\heartsuit$ Why can "sampling" be seen as an interference?
* $1\heartsuit$ What is the purpose of the grid below? This photo, whose scale is $10\mu m \times 10\mu m$, was taken [here](https://www.nanosurf.com/en/application/photoresin-interference-grid).



In [None]:
import IPython
IPython.display.Image("assets_signal/nanoGrid.png")