-
Notifications
You must be signed in to change notification settings - Fork 0
/
picture-dec.tex
192 lines (157 loc) · 7.21 KB
/
picture-dec.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% - This chapter defines the overall process - %
% - for decoding a picture - %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\label{picturedec}
This section defines the processes for decoding a picture from a Dirac stream.
Picture decoding depends upon correctly parsing the stream, and decoding operations
are dependent on decoding the sequence header and picture metadata
(Section \ref{sequenceheader} and \ref{picturesyntax}) and unpacking the coefficient
and motion data (Sections \ref{wltunpacking} and \ref{motiondec}).
\subsection{Overall picture decoding process}
\label{overallpicturedec}
Picture data from the current picture being decoded is stored in the $\CurrentPicture$ state
variable, which is a map with labels $\PicNum$, and $Y$, $C1$ and $C2$ representing
luma and chroma data.
After decoding the decoded picture is returned to the decoding application.
The $picture\_decode()$ process shall be invoked after parsing the $picture\_parse()$ process and shall be defined as follows:
\begin{pseudo}{picture\_decode}{}
\bsCODE{\CurrentPicture=\{\}}
\bsCODE{\CurrentPicture[\PicNum]=\PictureNumber}
\bsIF{is\_ref(()}
\bsCODE{ref\_picture\_remove()}{\ref{refbuffer}}
\bsEND
\bsIF{\ZeroResidual==\false}
\bsCODE{inverse\_wavelet\_transform()}{\ref{idwt}}
\bsELSE
\bsCODE{\CurrentPicture[Y]=\bf{0}}
\bsCODE{\CurrentPicture[C1]=\bf{0}}
\bsCODE{\CurrentPicture[C2]=\bf{0}}
\bsEND
\bsIF{is\_inter()}
\bsCODE{ref1=get\_ref(\RefOneNum)}{\ref{refbuffer}}
\bsIF{num\_refs()==2}{\ref{parsecodevalues}}
\bsCODE{ref2=get\_ref(\RefTwoNum)}{\ref{refbuffer}}
\bsEND
\bsCODE{motion\_compensate(ref1[Y], ref2[Y], \CurrentPicture[Y], Y)}{\ref{motioncompensate}}
\bsCODE{motion\_compensate(ref1[C1], ref2[C1], \CurrentPicture[C1], C1)}{\ref{motioncompensate}}
\bsCODE{motion\_compensate(ref1[C2], ref2[C2], \CurrentPicture[C2], C2)}{\ref{motioncompensate}}
\bsEND
\bsCODE{clip\_picture()}{\ref{pictureclip}}
\bsIF{is\_ref()}
\bsCODE{ref\_picture\_add()}{\ref{refbuffer}}
\bsEND
\bsCODE{offset\_output\_picture(\CurrentPicture)}{\ref{videooutput}}
\bsRET{\CurrentPicture}
\end{pseudo}
\subsection{Picture reordering}
Picture numbers within the stream may not be in numerical order, and
subsequent reordering may be required: the size
of the decoded picture buffer required to perform any such reordering may be
specified as part of the application profile and level (Annex~\ref{profilelevel}).
\subsection{Random access}
\label{randomaccess}
Sequence headers represent safe entry points for decoding a sequence.
An accessible picture (with reference to a given sequence header)
shall be defined as a picture decodeable without dependence on
to data prior to the sequence header in coded order.
Accessibility should normally imply that each accessible picture has
no reference picture prior to the sequence header, and no chain of
references leading to a reference picture prior to the sequence header.
A given level may allow this condition to
be relaxed (for example, in P-only coding where unavailable references
may be substituted for by zero pictures), but where no specific provision
to the contrary is specified in an applicable level or profile, it shall
apply.
The first picture data unit after a sequence header shall be called the
access picture and shall be accessible with respect to the sequence header.
It should normally be an intra picture. If the sequence contains inter pictures
it should normally be an intra reference picture.
All picture data units subsequent to the sequence header in coded order
shall also be accessible with respect to the sequence header
if their picture numbers are greater than
or equal to that of the access picture. The access picture therefore represents
a temporal access point into the sequence.
\begin{informative}
If a sequence satisfies a maximum reordering depth constraint
(Annex~\ref{picturereordering})
of size $N$ all pictures more than $N$ pictures later than the sequence header will
have larger picture numbers than the first picture after the sequence header, and hence will be accessible. A reordering depth constraint thus implies that after a
sequence header at most $N$ pictures will need to be discarded before all pictures are decodeable.
\end{informative}
\subsection{Reference picture buffer management}
\label{refbuffer}
This section specifies how the Dirac stream data shall be used to manage the reference
picture buffer $\RefBuffer$. The reference picture buffer has a maximum size of
$\RefBufferSize$ elements, as set in the applicable level (Annex~\ref{profilelevel}).
The $ref\_picture\_remove()$ process shall be defined as
follows:
\begin{pseudo}{ref\_picture\_remove}{}
\bsCODE{n=\RetiredPicture}
\bsFOR{k=0}{\RefBufferSize-1}
\bsIF{\RefBuffer[k][\PicNum]==n}
\bsFOR{j=k}{\RefBufferSize-2}
\bsCODE{\RefBuffer[j]=\RefBuffer[j+1]}
\bsEND
\bsCODE{\RefBufferSize -= 1}
\bsEND
\bsEND
\end{pseudo}
The $get\_ref(n)$ function shall returns the reference picture in the buffer with
picture number $n$. If there is no such picture it shall return an all-zero picture.
The $ref\_picture\_add()$ process for adding pictures to the reference picture
buffer shall proceed according to the following rules:
{\bf Case 1.} If the reference picture buffer is not full i.e. has fewer than $\MaxRefBufferSize$ elements,
then add $\CurrentPicture$ to the end of the buffer.
{\bf Case 2.} If the reference picture is full i.e. it has $\MaxRefBufferSize$ elements, then remove the
first (i.e. oldest) element of the buffer, $\RefBuffer[0]$, set
\[\RefBuffer[i] = \RefBuffer[i+1] \]
for $i=0$ to $\RefBufferSize-2$, and set the last element $\RefBuffer[\RefBufferSize-1]$ equal to
a copy of $\CurrentPicture$.
\input{idwt}
\subsection{Motion compensation}
\input{mc}
\subsection{Clipping}
\label{pictureclip}
Picture data must be clipped prior to being output or being
used as a reference:
\begin{pseudo}{clip\_picture}{}
\bsFOREACH{c}{Y,C1,C2}
\bsCODE{clip\_component(\CurrentPicture[c])}
\bsEND
\end{pseudo}
\begin{pseudo}{clip\_component}{comp\_data,c}
\bsIF{c==Y}
\bsCODE{bit\_depth=\LumaDepth}
\bsELSE
\bsCODE{bit\_depth=\ChromaDepth}
\bsEND
\bsFOR{y=0}{\height(comp\_data)-1}
\bsFOR{x=0}{\width(comp\_data)-1}
\bsCODE{data = \clip(comp\_data[y][x], -2^{bit\_depth-1}, 2^{bit\_depth-1}-1)}
\bsEND
\bsEND
\end{pseudo}
\begin{informative}
Note that clipping is incorporated into motion compensation, so that strictly speaking additional
clipping is only required for intra pictures.
\end{informative}
\subsection{Video output ranges}
\label{videooutput}
Video output data ranges are deemed to be non-negative, so that the offset and excursion
values may be applied by subsequent processing. Since decoded video data is bipolar, it must be suitably offset before output:
\begin{pseudo}{offset\_output\_data}{picture\_data}
\bsFOREACH{c}{Y, C1, C2}
\bsIF{c==Y}
\bsCODE{bit\_depth=\LumaDepth}
\bsELSE
\bsCODE{bit\_depth=\ChromaDepth}
\bsEND
\bsCODE{comp=picture\_data[c]}
\bsFOR{y=0}{\height(comp)-1}
\bsFOR{x=0}{\width(comp)-1}
\bsCODE{comp[y][x]+=2^{bit\_depth-1}}
\bsEND
\bsEND
\bsEND
\end{pseudo}