\label{quantmatrices}
This annex specifies the default quantisation matrices to be used
in the low-delay syntax and provides an informative description of quantisation
matrix design principles and of quantiser selection in both the core
and low-delay syntaxes.
\subsection{Quantisation matrices (low-delay syntax)}
\label{defaultquantmatrices}
This section defines default quantisation matrices to be used
for the quantisation of slice coefficients in the low-delay syntax.
The following tables define matrices for $\TransformDepth\leq 4$.
Values of $\TransformDepth$ not present in the tables
in this section shall require a custom matrix to be encoded,
as per Section \ref{sliceparams}. Informative advice for
constructing quantisation matrices based on noise power
conservation and perceptual weighting is given in
Annex \ref{custommatrices}.
\begin{table}[!ht]
\centering
\begin{tabular}{|c|c|c|c|c|c|c|}
\hline
\multicolumn{2}{|c|}{\cellcolor[gray]{0.75}}& \multicolumn{5}{|c|}{\cellcolor[gray]{0.75}{$\TransformDepth$}} \\
\hline
Level & Orientation & 0 & 1 & 2 & 3 & 4 \\
\hline
0 & LL & 0 & 5 & 5 & 5 & 5\\
\hline
1 & HL,LH, HH & - & 3, 3, 0 & 3, 3, 0 & 3, 3, 0 & 3, 3, 0 \\
\hline
2 & HL,LH, HH & - & - & 4, 4, 1 & 4, 4, 1 & 4, 4, 1 \\
\hline
3 & HL,LH, HH & - & - & - & 5, 5, 2 & 5, 5, 2 \\
\hline
4 & HL,LH, HH & - & - & - & - & 6, 6, 3 \\
\hline
\end{tabular}
\caption{Default quantisation matrices for $\WaveletIndex==0$ (Deslauriers-Dubuc (9,7))
\label{table:qm0}}
\end{table}
\begin{table}[!ht]
\centering
\begin{tabular}{|c|c|c|c|c|c|c|}
\hline
\multicolumn{2}{|c|}{\cellcolor[gray]{0.75}}& \multicolumn{5}{|c|}{\cellcolor[gray]{0.75}{$\TransformDepth$}} \\
\hline
Level & Orientation & 0 & 1 & 2 & 3 & 4 \\
\hline
0 & LL & 0 & 4 & 4 & 4 & 4\\
\hline
1 & HL,LH, HH & - & 2, 2, 0 & 2, 2, 0 & 2, 2, 0 & 2, 2, 0 \\
\hline
2 & HL,LH, HH & - & - & 4, 4, 2 & 4, 4, 2 & 4, 4, 2 \\
\hline
3 & HL,LH, HH & - & - & - & 5, 5, 3 & 5, 5, 3 \\
\hline
4 & HL,LH, HH & - & - & - & - & 7, 7, 5 \\
\hline
\end{tabular}
\caption{Default quantisation matrices for $\WaveletIndex==1$ (LeGall (5,3))
\label{table:qm1}}
\end{table}
\begin{table}[!ht]
\centering
\begin{tabular}{|c|c|c|c|c|c|c|}
\hline
\multicolumn{2}{|c|}{\cellcolor[gray]{0.75}}& \multicolumn{5}{|c|}{\cellcolor[gray]{0.75}{$\TransformDepth$}} \\
\hline
Level & Orientation & 0 & 1 & 2 & 3 & 4 \\
\hline
0 & LL & 0 & 5 & 5 & 5 & 5\\
\hline
1 & HL,LH, HH & - & 3, 3, 0 & 3, 3, 0 & 3, 3, 0 & 3, 3, 0 \\
\hline
2 & HL,LH, HH & - & - & 4, 4, 1 & 4, 4, 1 & 4, 4, 1 \\
\hline
3 & HL,LH, HH & - & - & - & 5, 5, 2 & 5, 5, 2 \\
\hline
4 & HL,LH, HH & - & - & - & - & 6, 6, 3 \\
\hline
\end{tabular}
\caption{Default quantisation matrices for $\WaveletIndex==2$ (Deslauriers-Dubuc (13,7))
\label{table:qm2}}
\end{table}
\begin{table}[!ht]
\centering
\begin{tabular}{|c|c|c|c|c|c|c|}
\hline
\multicolumn{2}{|c|}{\cellcolor[gray]{0.75}}& \multicolumn{5}{|c|}{\cellcolor[gray]{0.75}{$\TransformDepth$}} \\
\hline
Level & Orientation & 0 & 1 & 2 & 3 & 4 \\
\hline
0 & LL & 0 & 8 & 12 & 16 & 20\\
\hline
1 & HL,LH, HH & - & 4, 4, 0 & 8, 8, 4 & 12, 12, 8 & 16, 16, 12 \\
\hline
2 & HL,LH, HH & - & - & 4, 4, 0 & 8, 8, 4 & 12, 12, 8 \\
\hline
3 & HL,LH, HH & - & - & - & 4, 4, 0 & 8, 8, 4 \\
\hline
4 & HL,LH, HH & - & - & - & - & 4, 4, 0 \\
\hline
\end{tabular}
\caption{Default quantisation matrices for $\WaveletIndex==3$ (Haar with no shift)
\label{table:qm3}}
\end{table}
\begin{table}[!ht]
\centering
\begin{tabular}{|c|c|c|c|c|c|c|}
\hline
\multicolumn{2}{|c|}{\cellcolor[gray]{0.75}}& \multicolumn{5}{|c|}{\cellcolor[gray]{0.75}{$\TransformDepth$}} \\
\hline
Level & Orientation & 0 & 1 & 2 & 3 & 4 \\
\hline
0 & LL & 0 & 8 & 8 & 8 & 8\\
\hline
1 & HL,LH, HH & - & 4, 4, 0 & 4, 4, 0 & 4, 4, 0 & 4, 4, 0 \\
\hline
2 & HL,LH, HH & - & - & 4, 4, 0 & 4, 4, 0 & 4, 4, 0 \\
\hline
3 & HL,LH, HH & - & - & - & 4, 4, 0 & 4, 4, 0 \\
\hline
4 & HL,LH, HH & - & - & - & - & 4, 4, 0 \\
\hline
\end{tabular}
\caption{Default quantisation matrices for $\WaveletIndex==4$ (Haar with single shift per level)
\label{table:qm4}}
\end{table}
\begin{table}[!ht]
\centering
\begin{tabular}{|c|c|c|c|c|c|c|}
\hline
\multicolumn{2}{|c|}{\cellcolor[gray]{0.75}}& \multicolumn{5}{|c|}{\cellcolor[gray]{0.75}{$\TransformDepth$}} \\
\hline
Level & Orientation & 0 & 1 & 2 & 3 & 4 \\
\hline
0 & LL & 0 & 0 & 0 & 0 & 0\\
\hline
1 & HL,LH, HH & - & 4, 4, 8 & 4, 4, 8 & 4, 4, 8 & 4, 4, 8 \\
\hline
2 & HL,LH, HH & - & - & 8, 8, 12 & 8, 8, 12 & 8, 8, 12 \\
\hline
3 & HL,LH, HH & - & - & - & 13, 13, 17 & 13, 13, 17 \\
\hline
4 & HL,LH, HH & - & - & - & - & 17, 17, 21 \\
\hline
\end{tabular}
\caption{Default quantisation matrices for $\WaveletIndex==5$ (Fidelity)
\label{table:qm6}}
\end{table}
\begin{table}[!ht]
\centering
\begin{tabular}{|c|c|c|c|c|c|c|}
\hline
\multicolumn{2}{|c|}{\cellcolor[gray]{0.75}}& \multicolumn{5}{|c|}{\cellcolor[gray]{0.75}{$\TransformDepth$}} \\
\hline
Level & Orientation & 0 & 1 & 2 & 3 & 4 \\
\hline
0 & LL & 0 & 3 & 3 & 3 & 3\\
\hline
1 & HL,LH, HH & - & 1, 1, 0 & 1, 1, 0 & 1, 1, 0 & 1, 1, 0 \\
\hline
2 & HL,LH, HH & - & - & 4, 4, 2 & 4, 4, 2 & 4, 4, 2 \\
\hline
3 & HL,LH, HH & - & - & - & 6, 6, 5 & 6, 6, 5 \\
\hline
4 & HL,LH, HH & - & - & - & - & 9, 9, 7 \\
\hline
\end{tabular}
\caption{Default quantisation matrices for $\WaveletIndex==6$ (Daubechies (9,7))
\label{table:qm7}}
\end{table}
\clearpage
\begin{informative*}
\subsection{Quantisation matrix design and quantiser selection (Informative)}
\label{qmatrixdesign}
This section provides an informative guide to the principles used to design the default
quantisation matrices.
\subsubsection{Noise power normalisation}
\label{noisenorm}
The quantisation matrices defined in the preceding section are designed to counteract the
differential power gain of the various wavelet filters, so that quantisation noise from
each subband is weighted equally in terms of its contribution to noise power when transformed
back into the picture domain. Let $\alpha$ and $\beta$ represent the noise gain factors of
the low-pass and high-pass wavelet filters used in wavelet decomposition. In a single level of
wavelet decomposition, quantisation noise in each of the four subbands is therefore weighted by the factors shown in Figure \ref{fig:onelevelweight}.
\end{informative*}
\setlength{\unitlength}{1em}
\begin{figure}[!h]
\centering
\begin{picture}(20,27)
\put(0,5){\line(1,0){20}}
\put(0,5){\line(0,1){20}}
\put(20,5){\line(0,1){20}}
\put(20,25){\line(-1,0){20}}
\put(10,5){\line(0,1){20}}
\put(0,15){\line(1,0){20}}
\put(3,19.5){\text{\Large LL -- $\alpha^2$}}
\put(3,9.5){\text{\Large LH -- $\alpha\beta$}}
\put(13,19.5){\text{\Large HL -- $\alpha\beta$}}
\put(13,9.5){\text{\Large HH -- $\beta^2$}}
\end{picture}
\caption{Subband weights for a 1-level decomposition}\label{fig:onelevelweight}
\end{figure}
\begin{informative*}
For higher levels of decomposition, these subband weighting factors iterate
in the same manner as the wavelet transform itself. For example, in a two-level
decomposition, the first-level LL band, with weight $\alpha^2$, is further decomposed
to give four more bands whose weights are those of the 1-level decomposition multiplied
by $\alpha^2$. This yields the weights shown in Figure \ref{fig:twolevelweight}.
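As an illustration of this iteration, the weights for a general depth may be written in closed form.
The sketch below assumes a decomposition of depth $D$ whose levels are numbered $l=1,\ldots,D$ from the
coarsest to the finest, as the tables of the preceding section appear to do; the symbols $D$, $l$ and $w$
are introduced here for illustration only.
\begin{eqnarray*}
w_{LL} & = & \alpha^{2D} \\
w_{HL}(l) = w_{LH}(l) & = & \alpha^{2(D-l)+1}\beta \\
w_{HH}(l) & = & \alpha^{2(D-l)}\beta^2
\end{eqnarray*}
Setting $D=2$ gives $\alpha^4$ for the LL band, $\alpha^3\beta$ and $\alpha^2\beta^2$ at level 1, and
$\alpha\beta$ and $\beta^2$ at level 2, in agreement with Figure \ref{fig:twolevelweight}.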
\end{informative*}
\setlength{\unitlength}{1em}
\begin{figure}[!h]
\centering
\begin{picture}(30,40)
\put(0,5){\line(1,0){30}}
\put(0,5){\line(0,1){30}}
\put(30,5){\line(0,1){30}}
\put(30,35){\line(-1,0){30}}
\put(15,5){\line(0,1){30}}
\put(0,20){\line(1,0){30}}
\put(5.5,12){\text{\Large LH -- $\alpha\beta$}}
\put(20.5,27){\text{\Large HL -- $\alpha\beta$}}
\put(20.5,12){\text{\Large HH -- $\beta^2$}}
\put(7.5,20){\line(0,1){15}}
\put(0,27.5){\line(1,0){15}}
\put(2,31){\text{\Large LL -- $\alpha^4$}}
\put(2,23.5){\text{\Large LH -- $\alpha^3\beta$}}
\put(9,31){\text{\Large HL -- $\alpha^3\beta$}}
\put(9,23.5){\text{\Large HH -- $\alpha^2\beta^2$}}
\end{picture}
\caption{Subband weights for a 2-level decomposition}\label{fig:twolevelweight}
\end{figure}
\begin{informative*}
In this specification, wavelet synthesis filters have been defined in terms of lifting stages,
which are filters operating on subsampled data. Wavelet filters are more traditionally
represented in terms of an iterated binary polyphase filter bank: the relationship between
these representations is described in Annex \ref{lifting}. The factors $\alpha$ and $\beta$
are most easily computed from the filter bank representation. In this case $\alpha$ is either
the RMS power gain of the low-pass synthesis filter, or the {\em reciprocal} of the RMS power
gain of the low-pass analysis filter; and $\beta$ is the RMS power gain of the high-pass
synthesis filter or the reciprocal of the RMS power gain of the high-pass analysis filter.
Thus, in the terminology of Annex \ref{lifting},
$\alpha=\dfrac{1}{(\sum_n h(n)^2)^{\frac{1}{2}}}$ or
$\alpha=(\sum_n \tilde{h}(n)^2)^{\frac{1}{2}}$
and
$\beta=\dfrac{1}{(\sum_n g(n)^2)^{\frac{1}{2}}}$ or
$\beta=(\sum_n \tilde{g}(n)^2)^{\frac{1}{2}}$.
These alternative definitions arise because the wavelet filters defined in this specification
are not orthogonal but {\em biorthogonal}, so that, strictly speaking, the quantisation noise
powers from the individual subbands do not simply add. The values used for the quantisation
matrices have been computed from the analysis rather than the synthesis filters, as this yields
better compression results in practice.
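As a hedged illustration, the gains may be evaluated for a simple two-tap analysis filter pair. The taps
below are assumed for the purpose of the example (an unnormalised Haar-style pair) and are not quoted
from this specification:
\[ h = (\tfrac{1}{2}, \tfrac{1}{2}), \qquad g = (-1, 1) \]
Then
\begin{eqnarray*}
\sum_n h(n)^2 = \tfrac{1}{4}+\tfrac{1}{4} = \tfrac{1}{2} & \Rightarrow & \alpha = \dfrac{1}{(1/2)^{\frac{1}{2}}} = \sqrt{2} \\
\sum_n g(n)^2 = 1 + 1 = 2 & \Rightarrow & \beta = \dfrac{1}{2^{\frac{1}{2}}} = \dfrac{1}{\sqrt{2}}
\end{eqnarray*}
so that, for these assumed taps, the noise gain of the low-pass branch exceeds that of the high-pass
branch by a factor of $\alpha/\beta=2$ in each dimension.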
Note that these factors must also take into account the shift factors used to add accuracy
bits prior to each wavelet decomposition stage. For a filter shift of $d$, $\alpha$ and
$\beta$ are each multiplied by $2^{-d/2}$.
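For example, assuming unshifted gains of $\alpha=\sqrt{2}$ and $\beta=1/\sqrt{2}$ (illustrative values,
not quoted from this specification) and a filter shift of $d=1$, the adjusted gains are
\[ \alpha' = \sqrt{2}\cdot 2^{-1/2} = 1, \qquad \beta' = \dfrac{1}{\sqrt{2}}\cdot 2^{-1/2} = \dfrac{1}{2} \]
where the primes are notation introduced here. With $\alpha'=1$ the weight of the LL band is 1 whatever
the depth of the decomposition, so in this case the shift exactly cancels the growth of the low-pass gain
from level to level.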
Given a subband weighting factor $w$, a quantisation offset for that subband may be defined
as $4*\log_2(w)$ rounded to the nearest integer. These offsets are then normalised so as
to be non-negative, to produce the tables of the preceding section.
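The following worked example sketches this calculation for a 1-level decomposition, assuming
illustrative gains $\alpha=\sqrt{2}$ and $\beta=1/\sqrt{2}$ with no filter shift (assumed values, chosen
because they make the logarithms exact):
\begin{eqnarray*}
w_{LL} = \alpha^2 = 2 & \Rightarrow & 4*\log_2(2) = 4 \\
w_{HL} = w_{LH} = \alpha\beta = 1 & \Rightarrow & 4*\log_2(1) = 0 \\
w_{HH} = \beta^2 = \tfrac{1}{2} & \Rightarrow & 4*\log_2(\tfrac{1}{2}) = -4
\end{eqnarray*}
Adding 4 to each offset brings the smallest to zero, giving 8 for LL, 4 for HL and LH, and 0 for HH.
These values coincide with the $\TransformDepth=1$ column of Table \ref{table:qm3}.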
\subsubsection{Custom quantisation matrices}
\label{custommatrices}
Custom matrices may be defined that take into account not only noise power normalisation
but also perceptual weighting based on spatial frequency. Additional multiplicative factors
may be computed for each subband; these yield a matrix of quantisation offsets that may
then be added to the default unweighted quantisation matrices to produce a weighted quantisation
matrix.
An example perceptual weighting may be constructed from the CCIR 959 Contrast Sensitivity
Function (CSF). This is a function $csf(s)$ which produces a value representing the
sensitivity to detail at a given normalised spatial frequency $s$. For luminance, it is defined
by
\[csf(s)=0.255*(1+0.2561*s^2)^{-0.75}\]
Assuming an isotropic response, we may form a 2-d perceptual weighting function on
horizontal and vertical spatial frequencies $x_s,y_s$ by
\begin{eqnarray*}
c(x_s,y_s) & = & \dfrac{1}{csf((x_s^2+y_s^2)^{\frac{1}{2}})} \\
 & = & \dfrac{1}{0.255}*(1+0.2561*(x_s^2+y_s^2))^{0.75}
\end{eqnarray*}
Each subband in a wavelet decomposition represents a subset of spatial frequencies according
to level and orientation, partitioning the spatial frequency domain as per Figure \ref{fig:orientlevel}.
Note that this partitioning is un-normalised, since output pictures (and their compression artefacts) may
be viewed at a range of distances.
Accordingly we may pick a representative, un-normalised horizontal and vertical spatial frequency $(f_x(b),f_y(b))$ -- perhaps the middle frequency of the band. For example, an LH band $b$ at level 1 in a 1-level
decomposition will have mid frequency at $(pw/4,3*ph/4)$ where $pw$ and $ph$ are the padded
width and height of the picture (Section \ref{subbandwidthheight}). This may be turned into a true
spatial frequency by normalising by the number of horizontal and vertical cycles per degree the output
pictures will subtend at the target viewing distance and aspect ratio:
\[ (f_x(b)/cpd_x,f_y(b)/cpd_y)\]
and this value may be fed into the weighting function to get a value $c(b)$. The appropriate
quantisation offset for that subband is then $4*\log_2(c(b))$, which may be used to define a modified
quantisation matrix.
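As an illustration of the size of these offsets, consider two subbands whose representative normalised
radial frequencies are assumed to be $s_1=2$ and $s_2=4$ (hypothetical values chosen for the example),
writing $c(s)$ for the weighting at radial frequency $s=(x_s^2+y_s^2)^{\frac{1}{2}}$. In the difference
between the two offsets the constant factor in $c$ cancels, leaving only the frequency-dependent term:
\begin{eqnarray*}
4*\log_2\left(\dfrac{c(s_2)}{c(s_1)}\right) & = & 4*0.75*\log_2\left(\dfrac{1+0.2561*4^2}{1+0.2561*2^2}\right) \\
 & = & 3*\log_2\left(\dfrac{5.0976}{2.0244}\right) \\
 & \approx & 4.0
\end{eqnarray*}
Under these assumptions the higher-frequency band would receive an offset about 4 greater than the
lower-frequency band, corresponding to a weighting ratio of approximately 2.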
\end{informative*}