(always be aware of your imports and <b><u><i>preserve namespaces</i></u></b>!!!)

In [1]:
import os
import sys
import time
import numpy as np
import scipy.ndimage as nd
import matplotlib.pyplot as plt

%matplotlib tk

plt.ion()
plt.rcParams["image.cmap"] = "gist_gray"

---

## Filtering

Filtering is one of the most important techniques in data analysis, with applications ranging from image processing to time series analysis.  Let's start with 1D.  In the continuous (non-discrete) case, a "filter" $f$ is applied to a function $g$ as

$g^{\prime}(x) = \int_{-\infty}^{\infty} ~ f(x-x_0) ~ g(x_0) ~ dx_0$,

where $g^{\prime}$ is the filtered version of $g$.  The discrete case is a bit easier to understand:

$g^{\prime}_{i} = \sum_{j=i-w}^{i+w} ~ f_{i-j} ~ g_j$.

By convention, the filter is normalized so that

$\sum_{j=-w}^{w} f_j = 1$
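As a minimal sketch of the discrete definition (an illustration, not part of the original notebook), the sum can be evaluated with NumPy's `np.convolve`. The kernel `f`, half-width `w`, and signal `g` below are made-up values; the point is that a kernel summing to 1 leaves a constant level unchanged and only spreads out the outlier.

```python
import numpy as np

# hypothetical top-hat kernel with half-width w = 2 (2w + 1 = 5 taps)
w = 2
f = np.ones(2 * w + 1)
f /= f.sum()                      # enforce sum_j f_j = 1

# a constant signal plus one outlier
g = np.full(11, 3.0)
g[5] = 8.0

# mode="valid" keeps only the positions where the window fits entirely
g_filt = np.convolve(g, f, mode="valid")

# direct evaluation of g'_i = sum_j f_{i-j} g_j at i = 5
# (array index of f_{i-j} is (i - j) + w, since f_{-w} is stored at index 0)
i = 5
direct = sum(f[i - j + w] * g[j] for j in range(i - w, i + w + 1))

print(f.sum())        # kernel normalization
print(g_filt)         # outlier spread over 5 samples, level preserved
print(direct)
```

With `mode="valid"`, entry `m` of `g_filt` corresponds to the window centered on sample `m + w`, so `g_filt[3]` and `direct` describe the same point.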

The most important filters are the <b>top-hat</b> and <b>Gaussian</b> (the <b>median</b> filter is also widely used but works slightly differently than the above definition).

### Top-hat filter

The top-hat filter looks like this:

In [2]:
xxf, dx = np.linspace(-10, 10, 1000, retstep=True)

width = 2
tf    = 1.0 * (xxf > -width) * (xxf < width)
tf   /= tf.sum()

fig0, ax0 = plt.subplots(num=0)
lin0, = ax0.plot(xxf, tf, 'k')
ax0.text(-10, 0.01, 'Top-hat filter', va='bottom', ha='left', 
         fontsize=20)
ax0.set_ylim(-0.005, 0.01)
fig0.canvas.draw()

It replaces the value at $i$ with the mean of the values from $i-w$ to $i+w$, where $w$ is the width of the filter (here $w=2$).

Imagine we have an array of data values; let's use temperature as an example:

In [3]:
day = np.arange(20.)
T = np.array([38.6, 34.8, 36.3, 31.9, 34.2, 48.4, 46.4, 
              50.3, 51.6, 35.5, 51.3, 41.3, 38.9, 54.1, 
              40.5, 50.9, 43.5, 54.4, 44.5, 42.6])

In [4]:
fig1, ax1 = plt.subplots(num=1, figsize=[7, 5])
pntsa,    = ax1.plot(day, T, "o", color="dodgerblue", ms=10)
ax1.set_ylabel("temperature", fontsize=20)
ax1.set_xlabel("day", fontsize=20)
fig1.canvas.draw()

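If we have a top-hat filter of width 2, then we replace $T[i]$ with the mean of $T[i-2:i+2]$ (note that a Python slice excludes its end index, so this window holds the four values $T[i-2]$ through $T[i+1]$).  For example, $T[5]=48.4$ is replaced by the mean of $T[3:7]=[31.9, 34.2, 48.4, 46.4]$, which is $40.2$.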

In [5]:
rng = ax1.axvspan(day[5-2], day[5+2], facecolor="lime", alpha=0.1)
ax1.plot(day[5], np.mean(T[5-2:5+2]), "o", color="darkorange", ms=10)
fig1.canvas.draw()

In [6]:
spn = rng.get_xy()
spn[:, 0] = [day[6-2], day[6-2], day[6+2], day[6+2], day[6-2]]
rng.set_xy(spn)
ax1.plot(day[6], np.mean(T[6-2:6+2]), "o", color="darkorange", ms=10)
fig1.canvas.draw()

Similarly, $T[6]$ is replaced by the mean of $T[4:8] = [34.2, 48.4, 46.4, 50.3]$, which is $44.8$.
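The two worked values can be checked directly. This short sketch repeats the temperature array from above and also shows, as an aside, what a window symmetric about $i$ would give (the name `m5_sym` is introduced here only for illustration).

```python
import numpy as np

T = np.array([38.6, 34.8, 36.3, 31.9, 34.2, 48.4, 46.4,
              50.3, 51.6, 35.5, 51.3, 41.3, 38.9, 54.1,
              40.5, 50.9, 43.5, 54.4, 44.5, 42.6])

# windows used in the text: the slice T[i-2:i+2] stops before index i+2
m5 = T[5 - 2:5 + 2].mean()
m6 = T[6 - 2:6 + 2].mean()
print(round(m5, 1), round(m6, 1))

# a window symmetric about i needs one more element: T[i-2:i+3]
m5_sym = T[5 - 2:5 + 3].mean()
print(round(m5_sym, 1))
```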

In [7]:
pntsa.set_alpha(0.25)

for ii in range(2, len(day) - 2):
    spn[:, 0] = [day[ii-2], day[ii-2], day[ii+2], day[ii+2], day[ii-2]]
    rng.set_xy(spn)
    ax1.plot(day[ii], np.mean(T[ii-2:ii+2]), "o", color="darkorange", 
            ms=10)
    fig1.canvas.draw()
    time.sleep(1)

And if we double the width of the filter, the noise is reduced further:

In [8]:
for ii in range(4, len(day) - 4):
    spn[:, 0] = [day[ii-4], day[ii-4], day[ii+4], day[ii+4], day[ii-4]]
    rng.set_xy(spn)
    ax1.plot(day[ii], np.mean(T[ii-4:ii+4]), "o", color="darkred",ms=10)
    fig1.canvas.draw()
    time.sleep(1)

In [9]:
plt.close(1)

Let's make some noisy data and see how it works:

In [10]:
xx = np.linspace(-100, 100, int(200./dx))

seed, sigma = 314, 0.1
np.random.seed(seed)
noise = sigma * np.random.randn(len(xx))

yy = np.cos(1e-1 * xx) + noise

fig1, ax1 = plt.subplots(num=1)
lin1a, = ax1.plot(xx, yy, "darkred")
ax1.set_ylim(-1.5, 1.5)
fig1.canvas.draw()

In [11]:
yysm  = np.zeros(len(yy))
fsize = len(tf)

# integer division (//) is required: range() and slices reject floats in Python 3
for ii in range(fsize // 2, len(yysm) - fsize // 2):
    yysm[ii] = (yy[ii - fsize // 2:ii + fsize // 2] * tf).sum()
    
lin1b, = ax1.plot(xx, yysm, "dodgerblue")
fig1.canvas.draw()
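The explicit loop is easy to read but slow for long arrays. As a sketch of one possible vectorized equivalent, `np.correlate` computes the same windowed sums in a single call. The signal and the 40-tap kernel below are stand-ins chosen to keep the example small, not the arrays from the cell above.

```python
import numpy as np

rng = np.random.default_rng(0)
yy = np.cos(0.1 * np.arange(400)) + 0.1 * rng.standard_normal(400)

# hypothetical normalized top-hat kernel with an even number of taps
tf = np.ones(40)
tf /= tf.sum()
half = len(tf) // 2

# explicit loop, same structure as the notebook cell
yysm_loop = np.zeros(len(yy))
for ii in range(half, len(yy) - half):
    yysm_loop[ii] = (yy[ii - half:ii + half] * tf).sum()

# same sums in one call: correlate slides the kernel without flipping it
valid = np.correlate(yy, tf, mode="valid")      # length len(yy) - len(tf) + 1
yysm_vec = np.zeros(len(yy))
yysm_vec[half:len(yy) - half] = valid[:-1]      # the loop skips the last window

print(np.allclose(yysm_loop, yysm_vec))
```

For a symmetric kernel such as the top-hat or Gaussian, correlation and convolution give identical results, so `np.convolve` would work equally well here.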

In [12]:
plt.close(0)
plt.close(1)

### Gaussian filter

The Gaussian filter is precisely what it sounds like: a filter shaped like a Gaussian with standard deviation $\sigma$:

$f_{\rm Gaussian} = \frac{1}{\sqrt{2\pi\sigma^2}}e^{-\frac{x^2}{2\sigma^2}}$
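A minimal sketch of turning this formula into a discrete kernel, assuming $\sigma$ is measured in samples and the kernel is truncated at $5\sigma$ (both choices, and the names `gk` and `radius`, are illustrative). Because the kernel is renormalized by its sum, the prefactor $1/\sqrt{2\pi\sigma^2}$ cancels and can be omitted.

```python
import numpy as np

sigma = 3.0                                  # standard deviation, in samples
radius = int(5 * sigma)                      # truncate the kernel at 5 sigma
x = np.arange(-radius, radius + 1)

gk = np.exp(-x**2 / (2.0 * sigma**2))        # note: 2 sigma^2, no square root
gk /= gk.sum()                               # discrete normalization

# sanity checks: unit sum, and second moment close to sigma^2
var = (gk * x**2).sum()
print(gk.sum(), np.sqrt(var))
```

If the window is much narrower than a few $\sigma$, the truncated kernel no longer has the nominal width, which is worth keeping in mind when choosing the grid.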

On the same grid as the top-hat filter, it looks like this:

In [13]:
xxf, dx = np.linspace(-10., 10., 1000, retstep=True)

sigma = 10
gf    = np.exp(-xxf**2 / (2.0 * sigma**2))  # exponent is x^2 / (2 sigma^2), as in the formula
gf   /= gf.sum()

fig2, ax2 = plt.subplots(num=2)
lin2,     = ax2.plot(xxf, gf, "k")
ax2.text(-10, 0.01, "Gaussian filter", va="bottom", ha="left", 
         fontsize=20)
ax2.set_ylim(-0.005, 0.01)
fig2.canvas.draw()

It is easiest to see its effects on a step function:

In [14]:
xx = np.linspace(-100., 100., int(200./dx))
yy = 1.0 * (xx > 20.)

plt.close(3)
fig3, ax3 = plt.subplots(num=3)
lin3a, = ax3.plot(xx, yy, "darkred")
ax3.set_ylim(-0.1, 1.5)
fig3.canvas.draw()

In [15]:
yysm = np.zeros(len(yy))
fsize = len(gf)

# integer division (//) is required: range() and slices reject floats in Python 3
for ii in range(fsize // 2, len(yysm) - fsize // 2):
    yysm[ii] = (yy[ii - fsize // 2:ii + fsize // 2] * gf).sum()
    
lin3b, = ax3.plot(xx, yysm, "dodgerblue")
fig3.canvas.draw()

SciPy has several fast canned filtering routines.  For example,

In [16]:
gf = nd.gaussian_filter  # the nd.filters namespace is deprecated; note this rebinds the name gf

In [17]:
fig4, ax4 = plt.subplots(num=4)
lin4a, = ax4.plot(xx, yy)
lin4b, = ax4.plot(xx, gf(yy, 200)) # width of the filter is in sample units
lin4c, = ax4.plot(xx, gf(yy, 1000))
fig4.canvas.draw()

This particular filter also works in 2D <i>and</i> allows you to specify the width of the filter for each axis:

In [18]:
img_L = plt.imread('images/city_image.jpg').mean(2)  # nd.imread was removed from SciPy
nrow, ncol = img_L.shape[:2]
xsize = 5.5
ysize = xsize * float(nrow) / float(ncol)

fig5, ax5 = plt.subplots(3, 1, num=5, figsize=[xsize, 3 * ysize])
[i.axis("off") for i in ax5]
fig5.subplots_adjust(0, 0, 1, 1, 0, 0)
im5a = ax5[0].imshow(img_L)
im5b = ax5[1].imshow(gf(1.0 * img_L, 5))
im5c = ax5[2].imshow(gf(1.0 * img_L, [2, 20]))
fig5.canvas.draw()
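To see the per-axis widths without needing an image file, here is a small sketch that filters a single bright pixel and measures the spread of the result along each axis. The image size and the `[2, 20]` widths are chosen for illustration; the first entry applies to rows (axis 0) and the second to columns (axis 1).

```python
import numpy as np
import scipy.ndimage as nd

# synthetic test image: a single bright pixel on a dark background
img = np.zeros((41, 201))
img[20, 100] = 1.0

# sigma = 2 samples along rows (axis 0), 20 samples along columns (axis 1)
blob = nd.gaussian_filter(img, [2, 20])

# measure the spread along each axis from the marginal profiles
rows = np.arange(img.shape[0]) - 20
cols = np.arange(img.shape[1]) - 100
std_rows = np.sqrt((blob.sum(axis=1) * rows**2).sum() / blob.sum())
std_cols = np.sqrt((blob.sum(axis=0) * cols**2).sum() / blob.sum())
print(std_rows, std_cols)
```

The measured spreads should come out close to the requested widths, slightly smaller because the filter truncates its kernel at a few standard deviations.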

In [19]:
[plt.close(i) for i in range(5)]

[None, None, None, None, None]

### Edge detection revisited

Gaussian smoothing can be used to suppress noise when finding edges.  Recall how we tried to find edges before:

In [20]:
edges_simp = np.abs(img_L[5:, 5:] - img_L[:-5, 5:]) + \
    np.abs(img_L[5:, 5:] - img_L[5:, :-5])

im5b.set_data(edges_simp > 60)
im5b.set_clim(0, 1)
im5c.set_data(np.zeros(img_L.shape))
fig5.canvas.draw()

If we eliminate some noise with smoothing, our edges become a bit clearer:

In [23]:
imgsm_L = gf(img_L, 2)

gauss_der = np.abs(imgsm_L[1:, 1:] - imgsm_L[:-1, 1:]) + \
    np.abs(imgsm_L[1:, 1:] - imgsm_L[1:, :-1])

im5c.set_data(gauss_der)
im5c.set_clim(0, 10)
fig5.canvas.draw()

In [24]:
im5c.set_data(gauss_der > 7)
im5c.set_clim(0, 1)
fig5.canvas.draw()
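The benefit of smoothing first can be quantified on a synthetic image where the true edge is known. This is a sketch under assumed values (image size, noise level, and threshold are all made up): it counts threshold crossings in a flat region, where every detection is a false one.

```python
import numpy as np
import scipy.ndimage as nd

# synthetic image: dark left half, bright right half, plus noise
rng = np.random.default_rng(0)
img = np.zeros((64, 64))
img[:, 32:] = 100.0
img += 10.0 * rng.standard_normal(img.shape)

def grad_mag(a):
    """Sum of absolute one-pixel differences along rows and columns."""
    return np.abs(a[1:, 1:] - a[:-1, 1:]) + np.abs(a[1:, 1:] - a[1:, :-1])

thresh = 10.0
edges_raw = grad_mag(img) > thresh
edges_sm = grad_mag(nd.gaussian_filter(img, 2)) > thresh

# false detections in a flat region well away from the true edge
false_raw = edges_raw[:, :20].sum()
false_sm = edges_sm[:, :20].sum()
print(false_raw, false_sm)
```

Smoothing lowers the edge contrast as well as the noise, so the threshold has to be chosen with the smoothed gradient scale in mind.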

In [25]:
[plt.close(i) for i in range(6)]

[None, None, None, None, None, None]

---

### Sharpening in grayscale

1. Load an image and convert to grayscale
2. Smooth with a circular Gaussian of width 2
3. Subtract the result of 2. from 1.
4. Add half the result of 3. to 1. and display
5. Add twice the result of 3. to 1. and display
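The steps above can be sketched on a synthetic image, so the example runs without an image file. The step-edge array stands in for the loaded grayscale image of step 1, and the names `sharp_half` and `sharp_double` are introduced here for illustration.

```python
import numpy as np
import scipy.ndimage as nd

# 1. stand-in for a loaded grayscale image: a vertical step edge (floats)
img = np.zeros((32, 64))
img[:, 32:] = 1.0

# 2. smooth with a circular (isotropic) Gaussian of width 2
img_sm = nd.gaussian_filter(img, 2)

# 3. high-pass component: original minus smoothed
img_hpf = img - img_sm

# 4. and 5. add back half, then twice, the high-pass component
sharp_half = img + 0.5 * img_hpf
sharp_double = img + 2.0 * img_hpf

# sharpening overshoots on both sides of the edge; clip before display
print(sharp_half.min(), sharp_half.max())
print(sharp_double.min(), sharp_double.max())
```

The overshoot on either side of the edge is what makes the result look sharper, and it grows with the weight given to the high-pass component.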

In [26]:
imgsm_L = gf(img_L, 2) # nb, pass a float array: integer input gives integer-valued output.

img_hpf = img_L - imgsm_L

fig6, ax6 = plt.subplots(num=6, figsize=[xsize, ysize])
fig6.subplots_adjust(0, 0, 1, 1)
ax6.axis("off")
im6 = ax6.imshow(img_L)
fig6.canvas.draw()

In [27]:
plt.figure()
plt.imshow(img_hpf)

<matplotlib.image.AxesImage at 0x1145f3b50>

In [28]:
sflag = [False]  # False -> original (index 0), True -> sharpened (index 1)
imgs  = [img_L, img_L + img_hpf]

def toggle(event):
    """
    Toggle between original image and sharpened images.
    """
    
    # -- if the space bar is pressed
    if event.key == " ":

        # flip the display flag (use `not`: `~True` is -2, not False)
        sflag[0] = not sflag[0]
        
        # reset the data
        im6.set_data(imgs[int(sflag[0])])
        fig6.canvas.draw()
        
dum = fig6.canvas.mpl_connect("key_press_event", toggle)

We can do this in color as well:

In [29]:
imgc     = plt.imread('images/city_image.jpg')  # nd.imread was removed from SciPy
imgc_hpf = 1.0 * imgc - gf(1.0 * imgc, (2, 2, 0))

# reset imgs for toggle function
imgs = [imgc, (imgc + imgc_hpf).clip(0,255).astype(np.uint8)]

In [27]:
plt.close("all")

<i>Aside on <b>median filtering</b></i>:

Median filtering replaces each data point with the median of the measurements within some window around it.  Unlike Gaussian filtering, this process preserves sharp edges.
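As a rough numerical illustration of the edge-preserving behavior, the sketch below filters a noisy step both ways and counts how many samples end up partway between the two levels. The window size of 11, the noise level, and the helper `transition_width` are assumptions made for this example.

```python
import numpy as np
import scipy.ndimage as nd

# noisy step: 0 on the left, 1 on the right
rng = np.random.default_rng(0)
y = 1.0 * (np.arange(200) >= 100) + 0.05 * rng.standard_normal(200)

y_gauss = nd.gaussian_filter(y, 10)       # sigma = 10 samples
y_med = nd.median_filter(y, size=11)      # 11-sample window (assumed size)

def transition_width(a):
    """Number of samples whose value lies between 0.25 and 0.75."""
    return int(((a > 0.25) & (a < 0.75)).sum())

print(transition_width(y_gauss), transition_width(y_med))
```

The Gaussian-filtered step ramps gradually across many samples, while the median-filtered step jumps almost directly from one level to the other.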

In [30]:
xx = np.linspace(-100., 100., 200)
yy = 1.0 * (xx > 20.)

np.random.seed(103)
noise = np.random.randn(xx.size) * 0.1

mf = nd.median_filter  # the nd.filters namespace is deprecated

fig7, ax7 = plt.subplots(num=7)
lin7a, = ax7.plot(xx, yy + noise, lw=0.5)
lin7b, = ax7.plot(xx, gf(yy + noise, 10))
lin7c, = ax7.plot(xx, mf(yy + noise, 10))
fig7.canvas.draw()

Or in 2D:

In [31]:
fig8, ax8 = plt.subplots(3, 1, num=8, figsize=[xsize, 3*ysize])
fig8.subplots_adjust(0, 0, 1, 1, 0, 0)
[i.axis("off") for i in ax8]

ax8[0].imshow(imgc)
ax8[1].imshow(gf(1.0 * imgc, [5, 5, 0]).clip(0, 255).astype(np.uint8))
ax8[2].imshow(mf(1.0 * imgc, [5, 5, 1]).clip(0, 255).astype(np.uint8))
fig8.canvas.draw()

In [30]:
plt.close("all")