picture-dec.tex

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% - This chapter defines the overall process  - %
% - for decoding a picture                    - % 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\label{picturedec}

This section defines the processes for decoding a picture from a Dirac stream.

Picture decoding depends upon correctly parsing the stream, and decoding operations
 are dependent on decoding the sequence header and picture metadata 
(Section \ref{sequenceheader} and \ref{picturesyntax}) and unpacking the coefficient 
and motion data (Sections \ref{wltunpacking} and \ref{motiondec}).

\subsection{Overall picture decoding process}
\label{overallpicturedec}

Picture data from the current picture being decoded is stored in the $\CurrentPicture$ state
variable, which is a map with labels $\PicNum$, and $Y$, $C1$ and $C2$ representing
luma and chroma data.

After decoding the decoded picture is returned to the decoding application.

The $picture\_decode()$ process shall be invoked after parsing the $picture\_parse()$ process and shall be defined as follows:

\begin{pseudo}{picture\_decode}{}
\bsCODE{\CurrentPicture=\{\}}
\bsCODE{\CurrentPicture[\PicNum]=\PictureNumber}
\bsIF{is\_ref(()}
    \bsCODE{ref\_picture\_remove()}{\ref{refbuffer}}
\bsEND
\bsIF{\ZeroResidual==\false}
    \bsCODE{inverse\_wavelet\_transform()}{\ref{idwt}}
\bsELSE
    \bsCODE{\CurrentPicture[Y]=\bf{0}}
    \bsCODE{\CurrentPicture[C1]=\bf{0}}
    \bsCODE{\CurrentPicture[C2]=\bf{0}}
\bsEND
\bsIF{is\_inter()}
    \bsCODE{ref1=get\_ref(\RefOneNum)}{\ref{refbuffer}}
    \bsIF{num\_refs()==2}{\ref{parsecodevalues}}
        \bsCODE{ref2=get\_ref(\RefTwoNum)}{\ref{refbuffer}}
    \bsEND
    \bsCODE{motion\_compensate(ref1[Y], ref2[Y],  \CurrentPicture[Y], Y)}{\ref{motioncompensate}}
    \bsCODE{motion\_compensate(ref1[C1], ref2[C1],  \CurrentPicture[C1], C1)}{\ref{motioncompensate}}
    \bsCODE{motion\_compensate(ref1[C2], ref2[C2],  \CurrentPicture[C2], C2)}{\ref{motioncompensate}}
\bsEND
\bsCODE{clip\_picture()}{\ref{pictureclip}}
\bsIF{is\_ref()}
    \bsCODE{ref\_picture\_add()}{\ref{refbuffer}}
\bsEND
\bsCODE{offset\_output\_picture(\CurrentPicture)}{\ref{videooutput}}
\bsRET{\CurrentPicture}
\end{pseudo}


\subsection{Picture reordering}
Picture numbers within the stream may not be in numerical order, and 
subsequent reordering may be required: the size  
of the decoded picture buffer required to perform any such reordering may be 
specified as part of the application profile and level (Annex~\ref{profilelevel}).
 
\subsection{Random access}
\label{randomaccess}
Sequence headers represent safe entry points for decoding a sequence. 

An accessible picture (with reference to a given sequence header)
shall be defined as a picture decodeable without dependence on 
to data prior to the sequence header in coded order.

Accessibility should normally imply that each accessible picture has
no reference picture prior to the sequence header, and no chain of 
references leading to a reference picture prior to the sequence header. 
A given level may allow this condition to
be relaxed (for example, in P-only coding where unavailable references
may be substituted for by zero pictures), but where no specific provision
to the contrary is specified in an applicable level or profile, it shall 
apply.

The first picture data unit after a sequence header shall be called the 
access picture and shall be accessible with respect to the sequence header.
It should normally be an intra picture. If the sequence contains inter pictures
it should normally be an intra reference picture.

All picture data units subsequent to the sequence header in coded order 
shall also be accessible with respect to the sequence header
if their picture numbers are greater than 
or equal to that of the access picture. The access picture therefore represents 
a temporal access point into the sequence.

\begin{informative}

If a sequence satisfies a maximum reordering depth constraint 
(Annex~\ref{picturereordering})
of size $N$ all pictures more than $N$ pictures later than the sequence header will
have larger picture numbers than the first picture after the sequence header, and hence will be accessible. A reordering depth constraint thus implies that after a 
sequence header at most $N$ pictures will need to be discarded before all pictures are decodeable.
\end{informative}
\subsection{Reference picture buffer management}
\label{refbuffer}

This section specifies how the Dirac stream data shall be used to manage the reference 
picture buffer $\RefBuffer$. The reference picture buffer has a maximum size of
$\RefBufferSize$ elements, as set in the applicable level (Annex~\ref{profilelevel}).

The $ref\_picture\_remove()$ process shall be defined as
follows:

\begin{pseudo}{ref\_picture\_remove}{}
\bsCODE{n=\RetiredPicture}
\bsFOR{k=0}{\RefBufferSize-1}
   \bsIF{\RefBuffer[k][\PicNum]==n}
        \bsFOR{j=k}{\RefBufferSize-2}
            \bsCODE{\RefBuffer[j]=\RefBuffer[j+1]}
        \bsEND
        \bsCODE{\RefBufferSize -= 1}
    \bsEND
\bsEND
\end{pseudo}

The $get\_ref(n)$ function shall returns the reference picture in the buffer with 
picture number $n$.  If there is no such picture it shall return an all-zero picture.

The $ref\_picture\_add()$ process for adding pictures to the reference picture
buffer shall proceed according to the following rules:

{\bf Case 1.} If the reference picture buffer is not full i.e. has fewer than $\MaxRefBufferSize$ elements,
then add $\CurrentPicture$ to the end of the buffer. 

{\bf Case 2.} If the reference picture is full i.e. it has $\MaxRefBufferSize$ elements, then remove the
first (i.e. oldest) element of the buffer, $\RefBuffer[0]$, set
\[\RefBuffer[i] = \RefBuffer[i+1] \]
for $i=0$ to $\RefBufferSize-2$, and set the last element $\RefBuffer[\RefBufferSize-1]$ equal to
a copy of $\CurrentPicture$.

\input{idwt}

\subsection{Motion compensation}
\input{mc}

\subsection{Clipping}
\label{pictureclip}

Picture data must be clipped prior to being output or being
used as a reference:

\begin{pseudo}{clip\_picture}{}
\bsFOREACH{c}{Y,C1,C2}
    \bsCODE{clip\_component(\CurrentPicture[c])}
\bsEND
\end{pseudo}


\begin{pseudo}{clip\_component}{comp\_data,c}
\bsIF{c==Y}
    \bsCODE{bit\_depth=\LumaDepth}
\bsELSE
    \bsCODE{bit\_depth=\ChromaDepth}
\bsEND
\bsFOR{y=0}{\height(comp\_data)-1}
    \bsFOR{x=0}{\width(comp\_data)-1}
        \bsCODE{data = \clip(comp\_data[y][x], -2^{bit\_depth-1}, 2^{bit\_depth-1}-1)}
     \bsEND
\bsEND
\end{pseudo}

\begin{informative}
Note that clipping is incorporated into motion compensation, so that strictly speaking additional
clipping is only required for intra pictures.
\end{informative}

\subsection{Video output ranges}
\label{videooutput}

Video output data ranges are deemed to be non-negative, so that the offset and excursion
 values may be applied by subsequent processing. Since decoded video data is bipolar, it must be suitably offset before output:

\begin{pseudo}{offset\_output\_data}{picture\_data}
\bsFOREACH{c}{Y, C1, C2}
    \bsIF{c==Y}
        \bsCODE{bit\_depth=\LumaDepth}
    \bsELSE
        \bsCODE{bit\_depth=\ChromaDepth}
    \bsEND
    \bsCODE{comp=picture\_data[c]}
    \bsFOR{y=0}{\height(comp)-1}
        \bsFOR{x=0}{\width(comp)-1}
            \bsCODE{comp[y][x]+=2^{bit\_depth-1}}
        \bsEND
    \bsEND
\bsEND
\end{pseudo}