We introduce a new approach inspired by the SLIC methodology that implicitly enforces connectivity and efficiently generates compact, connected, and nearly uniform superpixels. The new formulation introduces a different neighbourhood structure that implicitly encodes spatial proximity and contiguity.

#### Original SLIC segmentation

The SLIC algorithm is essentially a **_kmeans_ algorithm that partitions a color image into a desired number of regular, compact superpixels with a low computational overhead**. More precisely, it locally clusters pixels in the combined 5D space formed by color ($(L, a, b)$ values of the CIELAB color space) and image plane ($(x, y)$ pixel coordinates). The CIELAB color space is chosen because it is perceptually uniform for small color distances.

The SLIC algorithm takes as input a desired number of approximately equally-sized superpixels $K$ and begins by sampling $K$ regularly spaced cluster centers $C_k = [L_k, a_k, b_k, x_k, y_k]^T$ at regular grid intervals $S$ (depending on $K$ and the number $N$ of pixels in the image: $S = \sqrt{N/K}$). In practice, for roughly equally sized superpixels there would be a superpixel center at every grid interval $S$. Since the spatial extent of any superpixel is approximately $S^2$ (the approximate area of a superpixel), it is assumed that pixels that are associated with this cluster center lie within a $2S \times 2S$ area around the superpixel center on the $(x, y)$ plane. This becomes the search area for the pixels nearest to each cluster center. Instead of directly using the Euclidean distance in the 5D space, a _"perceptual closeness"_ measure $D_s$ that considers superpixel size is introduced to control the compactness of superpixels:
$$
D_s ({\mathbf x}_i, {C_k}) = d_{lab} ({\mathbf x}_i, {C}_k) + \frac{m}{S} \cdot {d}_{xy} ({\mathbf x}_i, {C}_k),
$$
defined as the (weighted) sum of the spectral distance $d_{lab}$ in the color space:
$$
d_{lab} ({\mathbf x}_i, {C}_k) = \sqrt{(L_k - L_i)^2 + (a_k - a_i)^2 + (b_k - b_i)^2} 
$$
and the spatial distance $d_{xy}$ in the plane (normalized by the grid interval $S$):
$$
d_{xy} ({\mathbf x}_i, {C_k}) = \sqrt{(x_k - x_i)^2 + (y_k - y_i)^2}
$$
of a pixel ${\mathbf x}_i=[ L_i, a_i, b_i, x_i, y_i ]^T$ to any cluster center $C_k$. Here, $m$ controls the compactness of the superpixels: the greater the value of $m$, the more spatial proximity is emphasized and the more compact the clusters.
This distance is used to cluster together perceptually close pixels in an iterative manner.
Namely, each pixel ${\mathbf x}_i$ is associated with the nearest cluster center $C_k$ (_w.r.t_ the distance $D_s$) whose search area overlaps this pixel. After all the pixels are associated with their nearest cluster center, new centers are computed as the barycenters of all the pixels belonging to the updated clusters. This process is iterated until convergence to produce approximately equally-sized superpixels.
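
The distance $D_s$ and the assignment step described above can be illustrated with a minimal Python sketch. The function names (`grid_step`, `slic_distance`, `nearest_center`) and the toy values are hypothetical and only serve the illustration; pixels and centers are assumed to be plain 5-tuples $(L, a, b, x, y)$.

```python
import math

def grid_step(n_pixels, k):
    """Grid interval S = sqrt(N / K)."""
    return math.sqrt(n_pixels / k)

def slic_distance(pixel, center, m, s):
    """D_s = d_lab + (m / S) * d_xy for 5D points (L, a, b, x, y)."""
    d_lab = math.dist(pixel[:3], center[:3])   # spectral distance
    d_xy = math.dist(pixel[3:], center[3:])    # spatial distance
    return d_lab + (m / s) * d_xy

def nearest_center(pixel, centers, m, s):
    """Index of the closest center whose 2S x 2S search area contains the pixel."""
    best_k, best_d = None, math.inf
    for k, c in enumerate(centers):
        if abs(pixel[3] - c[3]) > s or abs(pixel[4] - c[4]) > s:
            continue  # pixel lies outside the search area of this center
        d = slic_distance(pixel, c, m, s)
        if d < best_d:
            best_k, best_d = k, d
    return best_k

# Toy values (assumed): a 20 x 20 image split into K = 4 superpixels.
s = grid_step(n_pixels=400, k=4)
centers = [(50.0, 0.0, 0.0, 5.0, 5.0), (60.0, 0.0, 0.0, 15.0, 5.0)]
pixel = (59.0, 0.0, 0.0, 9.0, 5.0)

print(nearest_center(pixel, centers, m=1, s=s))    # color dominates
print(nearest_center(pixel, centers, m=100, s=s))  # spatial proximity dominates
```

With a small $m$ the pixel joins the center with the closest color, while a large $m$ assigns it to the spatially closest center, which is the compactness trade-off controlled by $m$.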

A post-processing of the clusters output by the _kmeans_ clustering is necessary to enforce connectivity. Indeed, the distance $D_s$ (based on a composite Euclidean distance), while being easy to calculate, **does not take into account the image intensity values between two image pixels and thus ignores connectivity**: a pixel belonging to an object distinct from that of the cluster center but with similar spectral values, while separated from it by other intermediate distinct structures, can still reach a low $D_s$ and be considered as 'perceptually close', hence the superpixels can be ripped apart.

#### Extended amoeba-based SLIC segmentation

Generally, making an appropriate choice of homogeneity criterion is critical to the success of any region-based segmentation procedure, especially for multispectral images, where the criterion is highly dependent on the choice of the (spectral and spatial) closeness measure. Our algorithm is based on the estimation of amoeba-like neighborhoods around the cluster centers [[LDM07]](LDM07), [[GS09a]](GS09a). The amoeba construction exploits the connections between successive image pixels along the so-called geodesic paths to define such a measure [[GS09a]](GS09a): it describes similarity relationships based on pairwise differences between neighboring pixels along these paths.

<table>
<tr>
<td><kbd><img src="img/excerpt1.png" alt="input excerpt#1" width="250"> </kbd></td>
<td><kbd><img src="img/exceprt1-amoeba-superpixels.png" alt="amoeba superpixels" width="250"> </kbd></td>
<td><kbd><img src="img/excerpt1-mean-amoeba-approximations.png" alt="mean amoeba approximations" width="250"> </kbd></td>
</tr>
<tr>
<td align="center"><code>input excerpt</code></td>
<td align="center"><code>amoeba superpixels (different scales displayed)</code></td>
<td align="center"><code>amoeba mean-averaged approximations (different scales displayed)</code></td>
</tr>
</table>


A new _"perceptual closeness"_ measure $D_a$ between ${\mathbf x}_i$ and $C_k$ is formulated as the length of the shortest path ${\cal P} = (p_0={\mathbf x}_i, p_1, \cdots, p_n=C_k)$ joining them [[GS09a]](GS09a). $D_a$ is estimated as the combined spatial/spectral cost of "traveling" from ${\mathbf x}_i$ to $C_k$ in the image:
$$
D_a ({\mathbf x}_i,C_k) = \min_{\cal P} \{ L_a({\cal P}) ({\mathbf x}_i,C_k) \}
$$
where
$$
L_a({\cal P}) ({\mathbf x}_i, C_k) = \sum_{j=0}^{n-1} \left[ d_{lab}(p_j,p_{j+1}) + \alpha \cdot d_{xy}(p_j,p_{j+1}) + \beta \cdot d_g(p_j,p_{j+1}) \right]
$$
with $d_{lab}$ and $d_{xy}$ as before. 
Similarly to $m$ previously, $\alpha$ quantifies the relative influence of the spatial proximity (simply computed as the local Euclidean distance) _w.r.t_ the spectral one in the estimation of the perceptual measure: the higher $\alpha$, the more spatial proximity is emphasized.
In addition, we incorporate local information about the meaningful structures present in the image using the cost $d_g(p_j,p_{j+1})$ induced by the gradient structure tensor of the image. This way, high gradient pixels act like barriers in the amoeba propagation: the lower the gradient values from $p_j$ to $p_{j+1}$, the higher the probability they belong to the same amoeba. Put differently, the higher $\beta$, the less likely amoeba superpixels are to cross regions of high gradient. Using more information raises the problem of how to weight and combine its different kinds: $\alpha$ and $\beta$ set the relative influences of the spatial and gradient terms _w.r.t_ the spectral one in the estimation of the perceptual measure, and the more the spectral and gradient terms dominate the spatial one (low $\alpha$, high $\beta$), the less compact the clusters are.
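
Since $D_a$ is a shortest-path cost on the pixel grid, it can be computed with Dijkstra's algorithm. The following Python sketch is only an illustration under several assumptions that are not part of the original formulation: the image is grayscale (so $d_{lab}$ reduces to an absolute intensity difference), pixels are 4-connected (so $d_{xy} = 1$ for every step), and $d_g$ is taken as the mean of a precomputed gradient-magnitude map `grad` at both ends of a step. The function name `amoeba_distance` is hypothetical. Because the step costs are symmetric, propagating from the center gives the same value as traveling from the pixel to the center.

```python
import heapq
import math

def amoeba_distance(img, seed, alpha=1.0, beta=0.0, grad=None):
    """Dijkstra map of D_a(., seed) on a 2D grayscale image (list of rows).

    Assumed step cost between 4-neighbours p and q:
        |img[p] - img[q]| + alpha * 1 + beta * (grad[p] + grad[q]) / 2
    """
    h, w = len(img), len(img[0])
    dist = [[math.inf] * w for _ in range(h)]
    dist[seed[0]][seed[1]] = 0.0
    heap = [(0.0, seed)]
    while heap:
        d, (y, x) = heapq.heappop(heap)
        if d > dist[y][x]:
            continue  # stale queue entry, a shorter path was already found
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if not (0 <= ny < h and 0 <= nx < w):
                continue
            step = abs(img[y][x] - img[ny][nx]) + alpha
            if grad is not None:
                step += beta * 0.5 * (grad[y][x] + grad[ny][nx])
            if d + step < dist[ny][nx]:
                dist[ny][nx] = d + step
                heapq.heappush(heap, (d + step, (ny, nx)))
    return dist

# Two dark regions (value 0) separated by a bright wall (value 9)
# that is open only along the bottom row.
img = [
    [0, 9, 0],
    [0, 9, 0],
    [0, 0, 0],
]
dist = amoeba_distance(img, seed=(0, 0), alpha=1.0)
print(dist[0][2])  # cost of the detour around the wall
```

In this toy image the pixel at `(0, 2)` has the same intensity as the seed and lies only two pixels away, so a composite Euclidean distance such as $D_s$ would rate it as very close. The amoeba distance instead has to go around the wall (six unit steps) because crossing it costs more.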

#### Algorithm implementation

Given a user-specified number of superpixels, the algorithm first places, like the original SLIC, some seeds roughly on a lattice structure over the image, with a small perturbation in order to avoid placing them on strong intensity boundaries. The seeds serve as initial estimates of the superpixel centers. The locations of the centers and the shape of each superpixel then keep changing in turn as the algorithm runs.
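
A possible seeding step is sketched below in Python. It assumes a precomputed gradient-magnitude map `grad` (a list of rows) and, as in the original SLIC, moves each lattice seed to the lowest-gradient pixel of its $3 \times 3$ neighbourhood; the function name `place_seeds` and the tie-breaking rule (stay as close as possible to the lattice position) are illustrative choices.

```python
import math

def place_seeds(grad, k):
    """Lattice seeds (row, col), each moved to the lowest-gradient pixel
    of its 3 x 3 neighbourhood."""
    h, w = len(grad), len(grad[0])
    s = max(1, int(round(math.sqrt(h * w / k))))  # grid step S = sqrt(N / K)
    seeds = []
    for y in range(s // 2, h, s):
        for x in range(s // 2, w, s):
            candidates = [
                (grad[ny][nx], ny, nx)
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
            ]
            # Lowest gradient first; on ties, stay closest to the lattice node.
            _, by, bx = min(
                candidates,
                key=lambda c: (c[0], abs(c[1] - y) + abs(c[2] - x)),
            )
            seeds.append((by, bx))
    return seeds

# Toy gradient map (assumed): a single strong response at pixel (1, 1).
grad = [[0] * 4 for _ in range(4)]
grad[1][1] = 5
print(place_seeds(grad, k=4))
```

Only the seed that would fall on the high-gradient pixel is displaced. Note that the number of seeds actually produced depends on how the lattice fits the image, so it only approximates the requested number.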


<img src="fig/algorithm-amoeba.png" alt="algorithm amoeba superpix" width="750">


\begin{algorithm}[h]
\SetAlgoLined 
\KwIn{image of~$N$ pixels, desired number of superpixels~$K$ and (arbitrary) residual threshold~$T$}
\KwOut{cluster centers $\{ C_k \}_{1 \leq k \leq L}$, $L \approx K$, and image $Q_a$ of superpixels' labels in $\{1, \dots, L\}$}
\BlankLine
initialize the distance map: $D_a \leftarrow \infty$ and the label image: $Q_a \leftarrow 0$ \;
approximate the grid step: $S \leftarrow \sqrt{N/K}$ \;
initialize the $L$ cluster centers $C_k = [l_k, a_k, b_k, x_k, y_k]^T$ at regular steps $S$ \;
initialize the residual error $e \leftarrow \infty$ \;
\While{$e > T$}{
\For{$k \leftarrow 1$ \KwTo $L$}{
compute the local amoeba distance $d_a$ from $C_k$ as $D_a(\cdot,C_k)$ in Eq.~(\ref{eq:amoeba-closeness}) in a $2S \times 2S$ square neighborhood ${\cal V}_k$ around $C_k$\;
update locally the label image: ${Q_a}_{\mid {\cal V}_k \cap d_a \leq {D_a}_{\mid {\cal V}_k}} \leftarrow k$ in ${\cal V}_k$ 
in order to assign the best matching pixels to $C_k$\;
update locally the distance map as the local pointwise minimum distance: ${D_a}_{\mid {\cal V}_k} \leftarrow {D_a}_{\mid {\cal V}_k} \wedge d_a$ in ${\cal V}_k$ \;
}
update cluster centers $C_k$ \;
compute residual error $e$ as the $L^1$ distance between old and new centers\;
}
\end{algorithm}
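
The loop above can be sketched in Python as follows. This is a simplified illustration, not the reference implementation. It assumes a grayscale image, 4-connectivity, no gradient term ($\beta = 0$), centers reduced to their (rounded) pixel position, and a distance map that is reset at every pass. The names `window_distance` and `amoeba_slic` and the `max_iter` safeguard are hypothetical.

```python
import heapq
import math

def window_distance(img, seed, s, alpha):
    """Amoeba distances from seed, restricted to the window of half-width S."""
    h, w = len(img), len(img[0])
    y0, y1 = max(0, seed[0] - s), min(h, seed[0] + s + 1)
    x0, x1 = max(0, seed[1] - s), min(w, seed[1] + s + 1)
    dist = {seed: 0.0}
    heap = [(0.0, seed)]
    while heap:
        d, (y, x) = heapq.heappop(heap)
        if d > dist[(y, x)]:
            continue  # stale queue entry
        for q in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if not (y0 <= q[0] < y1 and x0 <= q[1] < x1):
                continue
            nd = d + abs(img[y][x] - img[q[0]][q[1]]) + alpha
            if nd < dist.get(q, math.inf):
                dist[q] = nd
                heapq.heappush(heap, (nd, q))
    return dist

def amoeba_slic(img, seeds, s, alpha=1.0, tol=0.5, max_iter=10):
    """Iterate assignment and center update until the residual is <= tol."""
    h, w = len(img), len(img[0])
    centers = list(seeds)
    labels = [[0] * w for _ in range(h)]          # 0 means "unassigned"
    for _ in range(max_iter):
        best = [[math.inf] * w for _ in range(h)]  # distance map D_a
        for k, c in enumerate(centers, start=1):
            for (y, x), d in window_distance(img, c, s, alpha).items():
                if d < best[y][x]:                 # pointwise minimum
                    best[y][x], labels[y][x] = d, k
        # Accumulate coordinates to compute the barycenter of each cluster.
        sums = {}
        for y in range(h):
            for x in range(w):
                sy, sx, n = sums.get(labels[y][x], (0, 0, 0))
                sums[labels[y][x]] = (sy + y, sx + x, n + 1)
        new_centers = []
        for k, c in enumerate(centers, start=1):
            if k in sums:
                sy, sx, n = sums[k]
                c = (round(sy / n), round(sx / n))
            new_centers.append(c)
        # Residual error: L1 distance between old and new centers.
        err = sum(abs(a[0] - b[0]) + abs(a[1] - b[1])
                  for a, b in zip(centers, new_centers))
        centers = new_centers
        if err <= tol:
            break
    return labels, centers

# Toy image (assumed): a dark region (columns 0-3) next to a bright one.
img = [[0, 0, 0, 0, 9, 9] for _ in range(4)]
labels, centers = amoeba_slic(img, seeds=[(1, 1), (1, 4)], s=3)
print(labels[0], centers)
```

On this toy image the boundary between the two superpixels settles on the intensity edge rather than halfway between the two seeds, since any path crossing the edge accumulates a large spectral cost.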


#### Outputs

By computing distances along geodesic paths, the amoeba measure accounts for both the distance between points and the roughness of the surface. It penalizes pixels that belong to a different connected geodesic component. Altogether, the use of amoebas implicitly encodes spatial proximity in the neighbourhood structure and explicitly enforces connectivity, besides compactness and regularity, in the superpixel shapes. It seamlessly accommodates grayscale or color images.

The following properties of the amoeba-based superpixels are observed _w.r.t_ the five basic principles proposed in [[LSKFDS09]](LSKFDS09):
1. uniform size and coverage: the algorithm partitions an image into regions that are approximately uniform in size and shape, provided that the superpixel size is comparable to the size of the smallest target region; only the superpixels near object boundaries deform sharply while the others remain roughly the same during each iteration; in homogeneous regions, most resulting superpixels remain roughly square;
2. connectivity: this is automatically ensured using geodesic paths, so that each superpixel represents a simply connected set of pixels;
3. compactness: each superpixel has a regular shape and size with smooth boundaries, which better captures spatially coherent information; in the absence of local edge information, superpixels remain compact;
4. smooth, edge-preserving flow: the algorithm is not a geometric-flow based formulation, so difficulties that occur in the edge evolution process (such as boundary crossing and collision) are nonexistent;
5. no superpixel overlap: this is automatically ensured by the algorithm, _i.e._ every pixel is assigned to a single superpixel.
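
The connectivity property can be checked on a given label image with a simple flood fill. The Python sketch below is a hypothetical verification helper (`count_components` is not part of the algorithm) that counts the 4-connected components of each label; it tests connectivity only, not the absence of holes.

```python
def count_components(labels):
    """Number of 4-connected components per label in a 2D label image."""
    h, w = len(labels), len(labels[0])
    seen = [[False] * w for _ in range(h)]
    counts = {}
    for y in range(h):
        for x in range(w):
            if seen[y][x]:
                continue
            k = labels[y][x]
            counts[k] = counts.get(k, 0) + 1  # a new component of label k
            seen[y][x] = True
            stack = [(y, x)]
            while stack:  # flood fill over pixels sharing the same label
                cy, cx = stack.pop()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and not seen[ny][nx] and labels[ny][nx] == k):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
    return counts

connected = [[1, 1, 2],
             [1, 2, 2]]
broken = [[1, 2, 1],
          [1, 2, 1]]
print(count_components(connected))
print(count_components(broken))
```

A segmentation satisfies the connectivity principle when every label has exactly one component; in the second toy example, label `1` is split in two by label `2`.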


**<a name="References"></a>References** 

* <a name="ASSLFS12"></a>Achanta R., Shaji A., Smith K., Lucchi A., Fua P., and Susstrunk S. (2012): **SLIC superpixels compared to state-of-the-art superpixel methods**, _IEEE Transactions on Pattern Analysis and Machine Intelligence_, 34(11):2274–2282, doi:[]().
* <a name="BBI08"></a>Bagon S., Boiman O., and Irani M. (2008): **What is a good image segment? A unified approach to segment extraction**, in _Proc. ECCV_, Lecture Notes in Computer Science, vol. 5305, pp. 30–44, doi:[]().
* <a name="BM06"></a>Bertelli L. and Manjunath B. (2006): **Redundancy in all pairs fast marching method**, in _Proc. IEEE ICIP_, pp. 3033–3036, doi:[]().
* <a name="CMT0"></a>Coeurjolly D., Miguet D., and Tougne L. (2004): **2D and 3D visibility in discrete geometry: an application to discrete geodesic paths**, _Pattern Recognition Letters_, 25(5):561–570, doi:[]().
* <a name="DP06"></a>Debayle J. and Pinoli J. (2006): **General adaptive neighborhood image processing**, _Journal of Mathematical Imaging and Vision_; **Part I: Introduction and theoretical aspects**, 25(2):245–266, doi:[](); **Part II: practical application examples**, 25(2):267–284, doi:[]().
* <a name="FVS09"></a>Fulkerson B., Vedaldi A., and Soatto S. (2009): **Class segmentation and object localization with superpixel neighborhoods**, in _Proc. IEEE ICCV_, pp. 670–677, doi:[10.1109/ICCV.2009.5459175](http://dx.doi.org/10.1109/ICCV.2009.5459175).
* <a name="GS09"></a>Grazzini J. and Soille P. (2009): [**Edge-preserving smoothing using a similarity measure in adaptive geodesic neighbourhoods**](http://www.sciencedirect.com/science/article/pii/S003132030800469X), _Pattern Recognition_, 42(10):2306-2316, doi:[10.1016/j.patcog.2008.11.004](http://dx.doi.org/10.1016/j.patcog.2008.11.004).
* <a name="GDS10"></a>Grazzini J., Dillard S., and Soille P. (2010): [**Multichannel image regularisation using anisotropic geodesic filtering**](http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5596008), in _Proc. IEEE ICPR_, pp. 2664-2667, doi:[10.1109/ICPR.2010.653](http://dx.doi.org/10.1109/ICPR.2010.653).
* <a name="GS08"></a>Grazzini J.  and Soille P. (2008): [**Adaptive morphological filters using similarities based on geodesic time**](http://www.springerlink.com/content/f6v62233xqkklq72), in _Proc. DGCI_, Lecture Notes in Computer Science, vol. 4992, pp.519-528, doi:[10.1007/978-3-540-79126-3_46](http://dx.doi.org/10.1007/978-3-540-79126-3_46).
* <a name="Hanbury08"></a>Hanbury A. (2008): **How do superpixels affect image segmentation?**, in _Proc. IbCPR_, Lecture Notes in Computer Science, vol. 5197, pp. 178–186, doi:[]().
* <a name="LDM07"></a>Lerallut R., Decenciere E., and Meyer F. (2007): **Image filtering using morphological amoebas**, _Image and Vision Computing_, 25(4):395–404, doi:[10.1016/j.imavis.2006.04.018](http://dx.doi.org/10.1016/j.imavis.2006.04.018).
* <a name="LSKFDS09"></a>Levinshtein A., Stere A., Kutulakos K., Fleet D., Dickinson S., and Siddiqi K. (2009): **TurboPixels: Fast superpixels using geometric flows**, _IEEE Transactions on Pattern Analysis and Machine Intelligence_, 31(12):2290–2297, doi:[]().
* <a name="MPWMJ08"></a>Moore A.P., Prince S., Warrell J., Mohammed U., and Jones G. (2008): **Superpixel lattices**, in _Proc. IEEE CVPR_, pp. 1–8, doi:[10.1109/CVPR.2008.4587471](http://dx.doi.org/10.1109/CVPR.2008.4587471).
* <a name="FH04"></a>Felzenszwalb P. and Huttenlocher D. (2004): **Efficient graph-based image segmentation**, _International Journal of Computer Vision_, 59(2):167-181, doi:[10.1023/B:VISI.0000022288.19776.77](http://dx.doi.org/10.1023/B:VISI.0000022288.19776.77).
* <a name="SM00"></a>Shi J. and Malik J. (2000): **Normalized cuts and image segmentation**, _IEEE Transactions on Pattern Analysis and Machine Intelligence_, 22(8):888–905, doi:[]().
* <a name="Soille08"></a>Soille P. (2008): **Constrained connectivity for hierarchical image decomposition and simplification**, _IEEE Transactions on Pattern Analysis and Machine Intelligence_, 30(7):1132–1145, doi:[]().
