\title{Linear Dithering} %RUN_VERBATIMS sh & Consider the problem of converting a gray-scale image into a binary image, while keeping as much as possible of the visual information. The two standard techniques are {\bf thresholding} and {\bf dithering}. The goal of this note is to show that thresholding and dithering are just two points on a multidimensional continuum of binarization methods, that we call {\bf linear dithering}. %SCRIPT plambda i/weiro.png "87 > 255 *" -o weiro-bin.png %SCRIPT dither i/weiro.png weiro-dit.png %SCRIPT plambda i/weiro.png x,l|blur -z z 0.1|plambda - '0 < 255 *' -o weiro-lin.png \begin{tabular}{llll} \includegraphics{i/weiro.png} & \includegraphics{weiro-bin.png} & \includegraphics{weiro-dit.png} & \includegraphics{weiro-z01.png} \\ original image & thresholding & dithering & linear dithering \end{tabular} \section{Classic thresholding and dithering} Thresholding at the right level is surprisingly effective. It is often possible to find a threshold that captures most of the relevant information in the image, even in the case of extreme lighting conditions. Yet, it is clear than for some images it will be impossible to find a single satisfactory threshold. Dithering (also called error diffusion) is aimed at representing all the possible gray levels of the original image, at the price of a small loss of resolution. Look at this photo of a famous bongo player, for example. There is a strong spot light righto into his face, and the rest of the image is very dark. I would say that it is impossible to find a single threshold where the face and the hand are recognizable. Yet there is! And quite easy to find manually, around the value of 87, for example. We show also the results of Floyd-Sternberg dithering and the linear dithering described below. Notice that Floyd-Sternberg dithering destroys all the texture of the clothes, which is well-preserved by thresholding and linear dithering. (Too well-preserved, in the last case, as it enhances the jpeg compression artifacts of the original.) \begin{gallery} \galleryline{i/bongos.jpg} \galleryline{bongos-bin.png} \galleryline{bongos-dit.png} \galleryline{bongos-lin.png} \end{gallery} \begin{verbatim} plambda i/bongos.jpg "87 > 255 *" -o bongos-bin.png dither i/bongos.jpg bongos-dit.png plambda i/bongos.jpg x,l | blur z 0.25 | plambda - '0 < 255 *' -o bongos-lin.png \end{verbatim} %\begin{gallery} % \galleryline{i/cockatiels.png} % \galleryline{cockatiels-bin.png} % \galleryline{cockatiels-dit.png} %\end{gallery} %\begin{verbatim} %plambda i/cockatiels.png "87 > 255 *" -o cockatiels-bin.png %dither i/cockatiels.png cockatiels-dit.png %\end{verbatim} \section{Dithering with pre-processing} We can pre-process gray-scale images before dithering them. We consider two pre-processings: a {\bf contrast change} by a function of the form~$x\mapsto\tanh(\lambda x)/\tanh(\lambda)$, and a {\bf linear filtering} of the original image that enhances its contrast. For simplicity, here we assume that our input images are of zero mean and take values on~$[-1,1]$. For the contrast change, notice two things. First, the $\lambda$-scaled~$\tanh$ tends to a step function as~$\lambda\to\infty$, and to the identity on~$[-1,1]$ as~$\lambda\to0$. Second, dithering a binary image produces exactly the same image. Thus, just by composing the dithering with a contrast change, we obtain a one-parameter family of methods that contains pure dithering and pure binarization as particular cases. \includegraphics{tanhs.png} %SCRIPT gnuplot <tanhs.png %SCRIPT set term pngcairo lw 2 %SCRIPT set samples 1000 %SCRIPT set zeroaxis %SCRIPT set key top left %SCRIPT plot [-1:1] [-1:1] tanh(10*x)/tanh(10),tanh(1*x)/tanh(1) %SCRIPT END In the figures below, the first row contains the result of applying the contrast change, and the second row the dithering of each image. %SCRIPT ditanh() { plambda "127 - 127 / $1 * tanh$1 tanh / 127 * 127 +" ; } %SCRIPT cat i/weiro.png | ditanh 0.1 | iion - weiro-tanh0.png %SCRIPT cat i/weiro.png | ditanh 3 | iion - weiro-tanh3.png %SCRIPT cat i/weiro.png | ditanh 7 | iion - weiro-tanh7.png %SCRIPT cat i/weiro.png | ditanh 900 | iion - weiro-tanhi.png %SCRIPT cat i/weiro.png | ditanh 0.1 | dither - weiro-ditanh0.png %SCRIPT cat i/weiro.png | ditanh 3 | dither - weiro-ditanh3.png %SCRIPT cat i/weiro.png | ditanh 7 | dither - weiro-ditanh7.png %SCRIPT cat i/weiro.png | ditanh 900 | dither - weiro-ditanhi.png \begin{tabular}{llll} $\lambda=0$ & $\lambda=3$ & $\lambda=7$ & $\lambda=\infty$ \\ \includegraphics{weiro-tanh0.png} & \includegraphics{weiro-tanh3.png} & \includegraphics{weiro-tanh7.png} & \includegraphics{weiro-tanhi.png} \\ \includegraphics{weiro-ditanh0.png} & \includegraphics{weiro-ditanh3.png} & \includegraphics{weiro-ditanh7.png} & \includegraphics{weiro-ditanhi.png} \\ \end{tabular} For the linear filtering, we consider a family of filters~$k$ of the form~$\widehat{k}(\xi)=|\xi|^\mu$ for~$\mu\in[0,2]$, acting over images of zero mean, suitably normalized to conserve the second moment of the image. These filters interpolate continuously between the identity ($\mu=0$) and minus the Laplacian operator ($\mu=2$). The case~$\mu=1$ can be called~\emph{linear retinex}. For~$\mu<0$, this is called the Riesz scale space. %SCRIPT riesz() { periodize | fft | plambda ":R $1 ^ *" | ifft | imhalve ; } %SCRIPT cat i/weiro.png | riesz 0.1 | qauto - weiro-riesz01.png & %SCRIPT cat i/weiro.png | riesz 0.5 | qauto - weiro-riesz05.png & %SCRIPT cat i/weiro.png | riesz 1.0 | qauto - weiro-riesz10.png & %SCRIPT cat i/weiro.png | riesz 2.0 | qauto - weiro-riesz20.png & %SCRIPT cat i/weiro.png | riesz 0.1 | qauto | dither - weiro-diriesz01.png & %SCRIPT cat i/weiro.png | riesz 0.5 | qauto | dither - weiro-diriesz05.png & %SCRIPT cat i/weiro.png | riesz 1.0 | qauto | dither - weiro-diriesz10.png & %SCRIPT cat i/weiro.png | riesz 2.0 | qauto | dither - weiro-diriesz20.png & %SCRIPT wait \begin{tabular}{llll}$\mu=0.1$&$\mu=0.5$&$\mu=1$&$\mu=2$\\ \includegraphics{weiro-riesz01.png} & \includegraphics{weiro-riesz05.png} & \includegraphics{weiro-riesz10.png} & \includegraphics{weiro-riesz20.png} \\ \includegraphics{weiro-diriesz01.png} & \includegraphics{weiro-diriesz05.png} & \includegraphics{weiro-diriesz10.png} & \includegraphics{weiro-diriesz20.png} \\ \end{tabular} Notice that the Laplacian is locally constant nearly everywhere, except just around the edges. Thus, dithering this image results in a checkerboard pattern of density$50\%$, but with a characteristic bias around the edges which renders the structures visible. \section{Thresholding with linear pre-processing} Now, forget a moment about the dithering step and consider only linear filtering and thresholding. In modern parlance, thresholding with linear pre-processing would be called \emph{single layer convolutional neural network with Heaviside activation function}. More precisely, we take an image of zero mean, apply a linear filter, and threshold the result at 0. Let us see how changing shape of the kernel produces different effects. We use the following radial kernels (we omit the normalization factors in this table): \begin{tabular}{ll} name & formula \\ Gauss &$G_\sigma(r) =\delta-\exp\frac{-r^2}{2\sigma^2}$\\ Laplace &$L_\sigma(r) =\delta-\exp\frac{-r}{\sigma}$\\ Cauchy &$C_\sigma(r) =\delta-\frac{1}{\sigma^2+r^2}$\\ Riesz &$\widehat{R_\sigma}(\rho)=\rho^\sigma$\\ truncated inverse-log &$Z_\sigma(r)=T_\sigma\frac{1}{\log(r)}$\\ log-Cauchy &$Q_\sigma(r)=-\log(\sigma^2+r^2)$\\ \end{tabular} These filters are all positive, so the effect they produce is blurring the images. For this application we apply them to the Laplacian of the input image (or, equivalently, we filter the image by the Laplacian of these kernels). %%SCRIPT blur -s G 1 i/weiro.png|plambda - '0 > 255 *' -o weiro-G01.png %%SCRIPT blur -s G 5 i/weiro.png|plambda - '0 > 255 *' -o weiro-G05.png %%SCRIPT blur -s G 15 i/weiro.png|plambda - '0 > 255 *' -o weiro-G15.png %\begin{tabular}{lll} % \includegraphics{weiro-G01.png} & % \includegraphics{weiro-G05.png} & % \includegraphics{weiro-G15.png} \\ %$G_{1}$& %$G_{5}$& %$G_{15}$\\ %\end{tabular} % %%SCRIPT blur -s L 1 i/weiro.png|plambda - '0 > 255 *' -o weiro-L01.png %%SCRIPT blur -s L 5 i/weiro.png|plambda - '0 > 255 *' -o weiro-L05.png %%SCRIPT blur -s L 15 i/weiro.png|plambda - '0 > 255 *' -o weiro-L15.png %\begin{tabular}{lll} % \includegraphics{weiro-L01.png} & % \includegraphics{weiro-L05.png} & % \includegraphics{weiro-L15.png} \\ %$L_{1}$& %$L_{5}$& %$L_{15}$\\ %\end{tabular} % %%SCRIPT blur -s C 1e-3 i/weiro.png|plambda - '0 > 255 *' -o weiro-C01.png %%SCRIPT blur -s C 1 i/weiro.png|plambda - '0 > 255 *' -o weiro-C05.png %%SCRIPT blur -s C 1e5 i/weiro.png|plambda - '0 > 255 *' -o weiro-C15.png %\begin{tabular}{lll} % \includegraphics{weiro-C01.png} & % \includegraphics{weiro-C05.png} & % \includegraphics{weiro-C15.png} \\ %$C_{1}$& %$C_{5}$& %$C_{15}\$ \\ %\end{tabular} %SCRIPT plambda x,l 255 *' -o barb-bin.png \end{verbatim} \begin{gallery} \galleryline{i/barb.png} \galleryline{barb-dit.png} \galleryline{barb-lin.png} \galleryline{barb-bin.png} \end{gallery} \emph{ PS: all the experiments on this note are generated from comments extracted of the original .tex source file. } %\section{Two-layer binarization} % vim:set tw=77 filetype=tex spell spelllang=en: