<a target="_blank" href="https://colab.research.google.com/github/TUIlmenauAMS/Videocoding/blob/main/LecturesJupterNotebooks/Lecture5/Lecture5.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

<font size="8" color ="Brown"><center>
# Lecture 5, Video Coding Transform Coding, Synthesis
        
</center></font>
<br>

<p style="line-height:1.5">
<font size="6">
    
Last time we saw how we can apply transform coding to our image blocks, and how to invert the transform for the decoding process. We also saw that we can interpret the transform as a special case of a filter bank. For the analysis, the filters appear as the columns of the transform matrix, in **reverse order**. What does the equivalent filter bank look like for the synthesis?
<br><br>
For the synthesis we had the equation<br>
$$
\boldsymbol x =\boldsymbol T^{-T} \cdot \boldsymbol y \cdot \boldsymbol T^{-1}
$$
Now if we look at a synthesis filter bank, we first **upsample** our subband signals by N, and then **filter** them with our synthesis filters.
The following picture shows a 1-dimensional synthesis filter bank (in the decoder) with critical sampling ($m$ here is the block index, the y correspond to one row or one column):   
![Lecture5-1.PNG](https://github.com/TUIlmenauAMS/Videocoding/blob/main/LecturesJupterNotebooks/Lecture5/Img-Lecture5/Lecture5-1.PNG?raw=1)    
<br>
In this block diagram we can see that we obtain the impulse response of filter $h_k$ by inputting an impulse in subband $y_k$ and zeros in all other subbands,<br>
$y_k(0)=1$, and 0 else.<br>
Now we apply this impulse as vector $\boldsymbol y$ to our transform implementation, and we get<br>
$$\boldsymbol y \cdot \boldsymbol T^{-1}$$
We can view this synthesis transform as a matrix consisting of row vectors,<br>
$$
\boldsymbol T ^{-1}=\left[ \matrix{\boldsymbol t_0 \cr
\boldsymbol t_1 \cr
\vdots \cr
\boldsymbol t_{N-1}} \right]
$$
Hence, the result is the k'th row of the synthesis matrix,
<br>
$$
\boldsymbol y \cdot \boldsymbol T^{-1}=\boldsymbol t_k
$$
<br>
Now we see that the **impulse response** of the k'th subband filter of our transform is the **k'th row** of the synthesis matrix $\boldsymbol T^{-1}$, **not reversed in space** (or time), **unlike the analysis**.<br>
Again we find that we have filters of exactly the transform length N. Observe that for $\boldsymbol T^{-T}$ we obtain the equivalent impulse responses in the columns (since it is the transposed matrix).<br>
<br>
**In conclusion:** We obtain the equivalent synthesis filter bank impulse responses by reading out **each row** of the synthesis transform matrix, now **not reversed**.
<br><br>
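The equivalence stated above can be checked numerically. The following is a minimal sketch, assuming the orthonormal 4-point DCT2 as transform and two blocks of arbitrary test subband values (the names `Y`, `x_transform` and `x_filterbank` are chosen here for illustration only). It compares the block-wise product $\boldsymbol y \cdot \boldsymbol T^{-1}$ with a synthesis filter bank that upsamples each subband by N and filters it with the corresponding row of $\boldsymbol T^{-1}$:

```python
import numpy as np
import scipy.fftpack

N = 4
# Orthonormal DCT2 analysis matrix and its inverse (the synthesis matrix)
T = scipy.fftpack.dct(np.eye(N), norm='ortho')
Tinv = np.linalg.inv(T)

# Two blocks of subband samples (arbitrary test data, an assumption):
# row m holds y_0(m), ..., y_{N-1}(m)
rng = np.random.default_rng(0)
Y = rng.standard_normal((2, N))

# Block transform synthesis: each block is y(m) . T^{-1}
x_transform = (Y @ Tinv).reshape(-1)

# Filter bank synthesis: upsample each subband by N,
# filter with the k'th row of T^{-1} (not reversed), and sum over all subbands
x_filterbank = np.zeros(2 * N)
for k in range(N):
    up = np.zeros(2 * N)
    up[::N] = Y[:, k]  # upsampling: N-1 zeros after each subband sample
    x_filterbank += np.convolve(up, Tinv[k, :])[:2 * N]

print(np.allclose(x_transform, x_filterbank))
```

Because the filters have exactly length N, the response to each upsampled sample stays inside its own block, so both computations should agree.<br><br>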
Knowing the equivalent impulse responses has the advantage that we can **analyse them**, for instance by **looking at their frequency responses**, which we obtain by applying the Fourier transform (freqz) to them.
<br>
<br>    
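As a rough illustration of such an analysis, the following sketch applies `scipy.signal.freqz` to each row of the synthesis matrix. It assumes the orthonormal 4-point DCT2 as the transform, and it only prints the frequency of the largest magnitude instead of plotting the full responses:

```python
import numpy as np
import scipy.fftpack
import scipy.signal

N = 4
T = scipy.fftpack.dct(np.eye(N), norm='ortho')
Tinv = np.linalg.inv(T)

# Frequency response of each synthesis filter h_k(n) = k'th row of T^{-1}
for k in range(N):
    w, H = scipy.signal.freqz(Tinv[k, :], worN=512)
    peak = w[np.argmax(np.abs(H))]  # normalized frequency with the largest magnitude
    print(f"h_{k}: largest magnitude at {peak / np.pi:.2f} * pi")
```

To see the full responses, `np.abs(H)` can be plotted over `w` for each subband, for instance with matplotlib.<br>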
<center> **Python Example:** </center>
<br>
Take again the DCT2 as our transform matrix,
</font></p>    

In [None]:
import numpy as np
import scipy.fftpack
# Transform the rows of the 4x4 identity matrix to obtain the orthonormal DCT2 matrix
I=np.eye(4)
T=scipy.fftpack.dct(I,norm='ortho')
T=np.matrix(T)
T

matrix([[ 0.5       ,  0.65328148,  0.5       ,  0.27059805],
        [ 0.5       ,  0.27059805, -0.5       , -0.65328148],
        [ 0.5       , -0.27059805, -0.5       ,  0.65328148],
        [ 0.5       , -0.65328148,  0.5       , -0.27059805]])

<p style="line-height:1.5">
<font size="6">The inverse matrix then is used for the inverse transform (in the decoder),
</font></p>

In [None]:
Tinv=T.I
Tinv


matrix([[ 0.5       ,  0.5       ,  0.5       ,  0.5       ],
        [ 0.65328148,  0.27059805, -0.27059805, -0.65328148],
        [ 0.5       , -0.5       , -0.5       ,  0.5       ],
        [ 0.27059805, -0.65328148,  0.65328148, -0.27059805]])

<p style="line-height:1.5">
<font size="6"><br>
Observe that the inverse matrix is identical to the transpose of $\boldsymbol T$. This is because the DCT2 transform matrix, as computed here with `norm='ortho'`, is **orthonormal** (the transpose needs no factor to become the inverse); for a merely **orthogonal** matrix, the transpose would need a factor to become the inverse.<br><br>
Here we can see that, for instance, the first row of the inverse matrix contains the impulse response of the **first synthesis** filter, $h_0(n)$, which here is indeed **identical** to the **first analysis** filter.<br>
Or, take the $2^{nd}$ subband, $h_1(n)$, as an example. In the analysis part, we have the equivalent impulse response of the $2^{nd}$ analysis filter as
<br><br></font></p>

In [None]:
h1=np.flipud(T[:,1])
h1

matrix([[-0.65328148],
        [-0.27059805],
        [ 0.27059805],
        [ 0.65328148]])

<p style="line-height:1.5">
<font size="6"><br>
The $2^{nd}$ synthesis filter $h_1(n)$ is obtained with<br><br>
</font></p>

In [None]:
Tinv[1,:]

matrix([[ 0.65328148,  0.27059805, -0.27059805, -0.65328148]])

<p style="line-height:1.5">
<font size="6">
    <br>
We can see that this $2^{nd}$ **synthesis** impulse response (a row vector) is the **time-reversed** version of the $2^{nd}$ **analysis** impulse response (a column vector). Observe that this is generally true for **orthogonal transform** matrices (where the synthesis matrix is the (conjugate) transpose of the analysis matrix)!<br><br>
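The same comparison can be repeated for all subbands at once. This is a small sketch, assuming the orthonormal 4-point DCT2 and plain NumPy arrays instead of `np.matrix`; the names `analysis` and `synthesis` are illustrative:

```python
import numpy as np
import scipy.fftpack

N = 4
T = scipy.fftpack.dct(np.eye(N), norm='ortho')
Tinv = np.linalg.inv(T)

for k in range(N):
    analysis = T[::-1, k]    # k'th column of T, reversed (analysis filter)
    synthesis = Tinv[k, :]   # k'th row of T^{-1}, not reversed (synthesis filter)
    # The synthesis filter should equal the time-reversed analysis filter
    print(k, np.allclose(synthesis, analysis[::-1]))
```
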

Taking again our subband block $\boldsymbol y$ from our previous example,<br><br></font></p>

In [None]:
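# Constant 4x4 block; its 2D transform has only a DC coefficient
x=0.3*np.ones((4,4))
# 2D analysis transform: T^T . x . T
y=np.dot(np.dot(np.transpose(T),x),T)
y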

matrix([[1.2, 0. , 0. , 0. ],
        [0. , 0. , 0. , 0. ],
        [0. , 0. , 0. , 0. ],
        [0. , 0. , 0. , 0. ]])

<p style="line-height:1.5">
<font size="6">
    <br>we get the inverse transform with<br><br>
</font></p>

In [None]:
np.dot(np.dot(np.transpose(Tinv),y),Tinv)


matrix([[0.3, 0.3, 0.3, 0.3],
        [0.3, 0.3, 0.3, 0.3],
        [0.3, 0.3, 0.3, 0.3],
        [0.3, 0.3, 0.3, 0.3]])

<p style="line-height:1.5">
<font size="6">
<br>Here we see that we indeed get our original block back!<br>
(Observe that inv(T') = inv(T)' ).<br>
<br>
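The identity inv(T') = inv(T)' holds for any invertible matrix, not only for orthogonal ones. A quick numerical check, using an arbitrary invertible test matrix `A` (an assumption made for this illustration), could look like this:

```python
import numpy as np

# Arbitrary invertible, non-orthogonal 4x4 test matrix (chosen for illustration)
A = np.eye(4) + 0.1 * np.arange(16, dtype=float).reshape(4, 4)

lhs = np.linalg.inv(A.T)   # inverse of the transpose
rhs = np.linalg.inv(A).T   # transpose of the inverse
print(np.allclose(lhs, rhs))
```

<br><br>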
Also observe that the beauty of this approach is that we can deal with each block of our image **independently** of the other blocks, meaning there is no overlap of filters into neighbouring blocks!
<br><br>
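The block-wise processing can be sketched as follows, assuming a hypothetical 8x8 test image `img` whose dimensions are multiples of the block size N, and the orthonormal 4-point DCT2 as transform. Each 4x4 block is transformed and reconstructed without touching its neighbours:

```python
import numpy as np
import scipy.fftpack

N = 4
T = scipy.fftpack.dct(np.eye(N), norm='ortho')
Tinv = np.linalg.inv(T)

# Hypothetical 8x8 test image (assumption: dimensions are multiples of N)
img = np.arange(64, dtype=float).reshape(8, 8)
rec = np.zeros_like(img)

for r in range(0, img.shape[0], N):
    for c in range(0, img.shape[1], N):
        x = img[r:r+N, c:c+N]                   # one block, handled on its own
        y = T.T @ x @ T                         # analysis (encoder)
        rec[r:r+N, c:c+N] = Tinv.T @ y @ Tinv   # synthesis (decoder)

print(np.allclose(rec, img))
```
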

**Place of the Transform in Image/Video Coding**
<br>
<br>
The following two block diagrams show where in an image or video coder the transform appears. Observe that it is applied after the color transform and the low pass (LP) filtering and downsampling. DC$_r$ and DC$_b$ denote the downsampled color components, and the subscript "dec" denotes the decoder versions.
<br>
![Lecture5-2.JPG](https://github.com/TUIlmenauAMS/Videocoding/blob/main/LecturesJupterNotebooks/Lecture5/Img-Lecture5/Lecture5-2.jpg?raw=1)
<br>
<br>
![Lecture5-3.JPG](https://github.com/TUIlmenauAMS/Videocoding/blob/main/LecturesJupterNotebooks/Lecture5/Img-Lecture5/Lecture5-3.jpg?raw=1)
</font></p>