### Lineament Analysis

Let's take what we've learned about edge filters and start to do something more interesting...

Many of us have spent time trying to interpret, digitize, and analyze orientations of linear features that are visible in data of some sort.  There are very good reasons why it's often done by hand, but sometimes it's possible to automate.

Let's take a look at some aerial photography data from near Arches National Park in Utah, USA.  There are prominent linear features (joints/fractures, in this case) visible in the imagery.  Ideally, we want a rose diagram of them without manually digitizing each one:

In [None]:
%matplotlib notebook
import numpy as np
import matplotlib.pyplot as plt
import rasterio as rio
import scipy.ndimage

from context import data
from context import utils

with rio.open(data.naip.lineaments, 'r') as src:
    aerial_photo = src.read()

# Rasterio has a "bands-on-first-axis" convention, matplotlib/etc has
# a "bands-on-last-axis" convention. Use moveaxis to switch between.
aerial_photo = np.moveaxis(aerial_photo, 0, -1)

fig, ax = plt.subplots(constrained_layout=True)
ax.imshow(aerial_photo)
ax.set(xticks=[], yticks=[])
plt.show()

For this type of analysis, we often don't need color information.  It's easiest to leave it out.  We'll analyze grayscale data instead.  Let's use a simple average of RGB values:

In [None]:
gray_aerial = aerial_photo.astype(float).mean(axis=-1)

fig, ax = plt.subplots(constrained_layout=True)
ax.imshow(gray_aerial, cmap='gray')
ax.set(xticks=[], yticks=[])
plt.show()

Our lineaments are basically edges. Remember that we said gradient magnitude is a type of edge detector?  Let's go ahead and look at the raw gradient magnitude. We don't care about absolute values at all in this case, so let's use a Sobel filter.

Last time we used `scipy.ndimage.generic_gradient_magnitude` and `scipy.ndimage.sobel`.  However, that's a bit verbose and we don't care about the separate X and Y components, so let's use scikit-image's Sobel filter method instead, which is a bit simpler.  We'll use the "toggler" again so that we can easily compare it to the original imagery.

In [None]:
import skimage.filters

im_grad_mag = skimage.filters.sobel(gray_aerial)

fig, ax = plt.subplots(constrained_layout=True)
ax.imshow(aerial_photo)
im = ax.imshow(im_grad_mag, cmap='gray_r', label='Sobel', vmin=0, vmax=50)
ax.set(xticks=[], yticks=[])
utils.Toggler(im).show()

Oy... That's noisy... 

Thankfully, we just talked about Gaussian gradient magnitude as a way of producing less noisy gradients.  Let's apply it:

In [None]:
sigma = 3
gauss_grad_mag = scipy.ndimage.gaussian_gradient_magnitude(gray_aerial, sigma)

fig, ax = plt.subplots(constrained_layout=True)
ax.imshow(aerial_photo)
im1 = ax.imshow(im_grad_mag, cmap='gray_r', label='Sobel', vmin=0, vmax=50)
im2 = ax.imshow(gauss_grad_mag, cmap='gray_r', label='Gauss', vmin=0, vmax=15)
ax.set(xticks=[], yticks=[])
utils.Toggler(im1, im2).show()
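
If you'd like to see the noise difference as a number rather than by eye, here's a minimal sketch on a synthetic image. The step edge, the noise level, and the `edge_contrast` helper are all assumptions made up for illustration. They aren't part of the workshop data or utilities.

```python
import numpy as np
import scipy.ndimage as ndi

# Synthetic test image (an assumption for illustration): a vertical step
# edge of height 100 with Gaussian noise added.
rng = np.random.default_rng(0)
step = np.zeros((64, 64))
step[:, 32:] = 100.0
noisy = step + rng.normal(scale=10.0, size=step.shape)

# Sobel-based gradient magnitude, built from scipy's generic helper
sobel_mag = ndi.generic_gradient_magnitude(noisy, ndi.sobel)
# Gaussian gradient magnitude with sigma = 3 pixels
gauss_mag = ndi.gaussian_gradient_magnitude(noisy, 3)

def edge_contrast(mag):
    """Mean response on the edge divided by mean response in a flat area."""
    return mag[:, 31:33].mean() / mag[:, 5:20].mean()

sobel_contrast = edge_contrast(sobel_mag)
gauss_contrast = edge_contrast(gauss_mag)
print(f"Sobel: {sobel_contrast:.1f}  Gaussian: {gauss_contrast:.1f}")
```

A higher ratio means the edge stands out more clearly from the background noise. The smoothing is what buys that, at the cost of a wider, blurrier edge.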

Okay, we now have something that we could think about using directly.  Let's try thresholding the Gaussian gradient magnitude.

In [None]:
grad_thresh = gauss_grad_mag > 5

fig, ax = plt.subplots(constrained_layout=True)
ax.imshow(gray_aerial, cmap='gray')
im = ax.imshow(np.ma.masked_equal(grad_thresh, 0), vmin=0,
               interpolation='nearest')
ax.set(xticks=[], yticks=[])

def update(thresh):
    grad_thresh = gauss_grad_mag > thresh
    im.set_data(np.ma.masked_equal(grad_thresh, 0))

utils.Slider(ax, 1, 10, update, start=3).show()

You can fairly easily imagine skeletonizing the classification we just created to get the exact edges.  It's the same idea as what we used to get a nice line at the toe of slope around seamounts earlier.

You might even imagine trying to do a bit better job of skeletonizing so that nearby "ridges" linked up instead of being separate features.

That's the basic idea behind the [Canny filter](https://en.wikipedia.org/wiki/Canny_edge_detector). 

It's essentially skeletonizing a thresholded Gaussian gradient magnitude, but it tries to join up nearby features to give nice, continuous lines.  Let's give it a try on this data:

In [None]:
import skimage.feature

canny = skimage.feature.canny(gray_aerial, sigma=3)

fig, ax = plt.subplots(constrained_layout=True)
ax.imshow(gray_aerial, cmap='gray')
ax.imshow(np.ma.masked_equal(canny, 0), vmin=0, interpolation='nearest')
ax.set(xticks=[], yticks=[])
plt.show()
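
To make the "threshold, then skeletonize" idea concrete, here's a rough sketch on a synthetic image, with the Canny result alongside for comparison. The bright square and the threshold of 5 are assumptions chosen for illustration. This is only the conceptual recipe, not how Canny is implemented internally.

```python
import numpy as np
import scipy.ndimage as ndi
import skimage.feature
import skimage.morphology

# Synthetic image (an assumption for illustration): a bright square on a
# dark background, so the "true" edges are the outline of the square.
square = np.zeros((80, 80))
square[20:60, 20:60] = 100.0

# Step 1: smooth gradient magnitude
grad = ndi.gaussian_gradient_magnitude(square, 3)
# Step 2: threshold it, giving a thick band around each edge
band = grad > 5
# Step 3: thin the band down to a one-pixel-wide skeleton
skeleton = skimage.morphology.skeletonize(band)

# Canny handles the smoothing, thinning, and linking in one call
edges = skimage.feature.canny(square, sigma=3)

print("band pixels:", band.sum())
print("skeleton pixels:", skeleton.sum())
print("canny pixels:", edges.sum())
```

The thresholded band is many pixels wide, while both the skeleton and the Canny output reduce it to thin lines.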

Okay! Now we're getting closer to identifying our lineaments.  However, we've still got a lot of noise and a lot of small features we're not too interested in.

Not only that, but how are we going to turn these into actual lineaments we can get an orientation of?  If we vectorize this result, the line segments would be mostly curved and not pointing along the features we're interested in.

Basically we want to try to extract straight lines from these ridges.

A good way to detect straight lines in an image is to use a [Hough Transform](https://en.wikipedia.org/wiki/Hough_transform).  The basic idea is to progressively rotate the image and sum along rows of the rotated result, then add each summed version as a new column.  Straight lines will form a local peak in the resulting array.  I'm not going to go over this in too much detail, though.

The key thing to know about a Hough transform is that it gives an _infinite_ line. I.e. `y = Ax + B`.  You don't get start and end points, which is usually what we're interested in.
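
Here's a small sketch of what that looks like in practice, using a synthetic edge image that's an assumption for illustration. Note that scikit-image describes each line by an angle and a distance from the origin rather than a slope and intercept, but the point is the same.

```python
import numpy as np
from skimage.transform import hough_line, hough_line_peaks

# Synthetic edge image (an assumption for illustration): a single vertical
# line segment in column 20, spanning only rows 5 through 54.
edges = np.zeros((60, 60), dtype=bool)
edges[5:55, 20] = True

# Accumulate votes over a grid of (angle, distance) line parameters
theta = np.linspace(-np.pi / 2, np.pi / 2, 180, endpoint=False)
hspace, angles, dists = hough_line(edges, theta=theta)

# The strongest peak describes the best-fitting infinite line
_, peak_angles, peak_dists = hough_line_peaks(hspace, angles, dists,
                                              num_peaks=1)
angle, dist = peak_angles[0], peak_dists[0]
print(f"angle = {np.degrees(angle):.1f} deg, distance = {dist:.1f} px")
```

The result tells us which line the pixels fall on, but nothing about the fact that the feature only spans rows 5 through 54.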

Therefore, there's a variant called a [probabilistic Hough transform](https://scikit-image.org/docs/dev/auto_examples/edges/plot_line_hough_transform.html#probabilistic-hough-transform) that attempts to identify likely straight line segments of a given length.  It results in discrete line segments with specific locations.  The length can be used as a "knob" to tune whether you're finding lots of small linear features or fewer large linear features.

Sounds perfect for this task! Let's apply it to our Canny-filtered edges above:

In [None]:
from skimage.transform import probabilistic_hough_line
from matplotlib.collections import LineCollection

gap_ratio = 0.12

fig, ax = plt.subplots(constrained_layout=True)
ax.imshow(aerial_photo)
col = ax.add_collection(LineCollection([], color='yellow'))
ax.set(xticks=[], yticks=[])

def update(length):
    lines = probabilistic_hough_line(canny, line_length=length, 
                                     line_gap=int(length*gap_ratio))
    col.set_segments(lines)

utils.Slider(ax, 5, 50, update, start=10).show()
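
If the slider isn't available (e.g. in a static view of this notebook), the sketch below shows the same length "knob" on a synthetic edge image. The image and the parameter values are assumptions for illustration only.

```python
import numpy as np
from skimage.transform import probabilistic_hough_line

# Synthetic edge image (an assumption for illustration): one vertical
# line in column 20, spanning rows 5 through 54 (50 pixels long).
edges = np.zeros((60, 60), dtype=bool)
edges[5:55, 20] = True

# Point selection is random, so exact endpoints can differ between runs
segments = probabilistic_hough_line(edges, threshold=10, line_length=20,
                                    line_gap=3)
for (x0, y0), (x1, y1) in segments:
    length = np.hypot(x1 - x0, y1 - y0)
    print(f"({x0}, {y0}) -> ({x1}, {y1}), length {length:.0f} px")

# Asking for segments longer than the feature itself finds nothing
too_long = probabilistic_hough_line(edges, threshold=10, line_length=100,
                                    line_gap=3)
print("segments with line_length=100:", len(too_long))
```

Unlike the plain Hough transform, each result here is a pair of endpoints, so we get both a location and a length for every feature.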

No matter how we do this, we pick up some features we're not interested in, and leave out some that we are.  Overall, it's reasonable if we're mostly interested in orientations.  Let's show a rose diagram of the lineaments we identified:

In [None]:
lines = probabilistic_hough_line(canny, line_length=30, 
                                 line_gap=5)
 
# Calculate azimuth
lines = np.array(lines)
dx, dy = np.squeeze(np.diff(lines, axis=1)).T
# Negative dy due to image orientation, 90 - angle for azimuth
angles = np.pi / 2 - np.arctan2(-dy, dx)

fig = plt.figure(constrained_layout=True)
ax1 = fig.add_subplot(2, 1, 1)
ax2 = fig.add_subplot(2, 1, 2, projection='polar', theta_offset=np.pi/2,
                      theta_direction=-1)
ax1.imshow(aerial_photo)
ax1.add_collection(LineCollection(lines, color='yellow'))
ax2.hist(np.concatenate([angles, angles + np.pi]), bins=60)

ax1.set(xticks=[], yticks=[])
ax2.set(xticks=[], yticks=[], axisbelow=True)

plt.show()
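
The azimuth conversion is easy to get backwards, so here's a quick sanity check on a few hand-made segments. The segments are assumptions made up for illustration, and the final fold into 0-180 degrees is an extra step added just for this check.

```python
import numpy as np

# Hand-made segments in image coordinates (x to the right, y downward),
# in the ((x0, y0), (x1, y1)) form that probabilistic_hough_line returns.
segments = np.array([
    [(0, 10), (0, 0)],    # straight up the image: north
    [(0, 10), (10, 0)],   # up and to the right: northeast
    [(0, 0), (10, 0)],    # left to right: east
    [(10, 0), (0, 10)],   # the northeast segment, digitized in reverse
])

dx, dy = np.squeeze(np.diff(segments, axis=1)).T
# Flip dy because image rows increase downward, then convert the
# math-convention angle (counterclockwise from east) to an azimuth
azimuth = np.degrees(np.pi / 2 - np.arctan2(-dy, dx))

# Lineaments have no direction, so fold everything into [0, 180)
folded = azimuth % 180
print(folded)
```

The four segments should come out as roughly 0, 45, 90, and 45 degrees. The reversed segment lands on the same value as its forward twin, which is why the rose diagram above plots each angle twice, 180 degrees apart.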

Okay, this worked. However, it did a poor job of identifying many of the prominent lineaments.  It also identifies a lot of lineaments that are purely perpendicular to the sun direction. We should be able to do better.

One of the ways we can identify lineaments is to look for _consistent directions_ of image gradients.  Rather than just looking at the magnitudes, let's take direction into account as well.

A common and very useful technique is the [structure tensor](https://en.wikipedia.org/wiki/Structure_tensor). The idea should be familiar to most folks who've worked on orientation statistics.  We take the gradient vectors of the image within a moving window for each pixel.  Then we build a 2x2 covariance matrix from the dx, dy components of those gradient vectors.  This is the structure tensor -- it's a symmetric 2x2 matrix for each pixel in the image, based on the covariance of nearby gradients.

In other words, it tells us how aligned image gradients are within each region of the image as well as how large they are.
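
As a minimal sketch of that construction, here's a structure tensor built "by hand" with scipy and numpy on a synthetic striped image. The stripes, the choice of Sobel gradients, and the `coherence` measure are assumptions for illustration. Library implementations may differ in details such as the derivative filter and axis ordering.

```python
import numpy as np
import scipy.ndimage as ndi

# Synthetic image (an assumption for illustration): vertical stripes, so
# brightness changes only along x and the "lineaments" run along y.
y, x = np.mgrid[:64, :64]
stripes = np.sin(2 * np.pi * x / 8.0)

# Gradient components at every pixel
gx = ndi.sobel(stripes, axis=1)
gy = ndi.sobel(stripes, axis=0)

# Products of gradients, averaged over a Gaussian moving window
sigma = 2.5
axx = ndi.gaussian_filter(gx * gx, sigma)
axy = ndi.gaussian_filter(gx * gy, sigma)
ayy = ndi.gaussian_filter(gy * gy, sigma)

# Stack into a (rows, cols, 2, 2) array of symmetric matrices
tensor = np.stack([np.stack([axx, axy], axis=-1),
                   np.stack([axy, ayy], axis=-1)], axis=-2)

# eigh returns eigenvalues in ascending order, so the last eigenvector
# corresponds to the largest eigenvalue (the dominant gradient direction)
evals, evecs = np.linalg.eigh(tensor)
principal = evecs[..., :, -1]

# Near 1 when gradients are well aligned, near 0 when they point every way
lam_small, lam_large = evals[..., 0], evals[..., 1]
coherence = (lam_large - lam_small) / (lam_large + lam_small + 1e-12)

print("principal (x, y) at center:", principal[32, 32])
print("coherence at center:", coherence[32, 32])
```

For the stripes, the dominant gradient direction points across them (along x), and the features themselves run perpendicular to that.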

Let's dive right into a fairly large standalone example to understand what's going on.

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import rasterio as rio

from skimage.feature import structure_tensor, structure_tensor_eigvals

from context import data

with rio.open(data.naip.lineaments, 'r') as src:
    image = src.read()

# This assumes a grayscale image. For simplicity, we'll just use RGB mean.
data = image.astype(float).mean(axis=0)

# Compute the structure tensor. This is basically local gradient similarity.
# We're getting three components at each pixel that correspond to a 2x2
# symmetric matrix. i.e. [[axx, axy],[axy, ayy]]
axx, axy, ayy = structure_tensor(data, sigma=2.5, mode='mirror')

# Then we'll compute the eigenvalues of that matrix.
v1, v2 = structure_tensor_eigvals(axx, axy, ayy)

# And calculate the eigenvector corresponding to the largest eigenvalue.
dx, dy = v1 - axx, -axy

# We have a vector at each pixel now.  However, we don't really care about all
# of them, only those with a large magnitude.  Also, we don't need to worry
# about every pixel, as adjacent values are very highly correlated. Therefore,
# let's only consider every 10th pixel in each direction.

# Top 10th percentile of magnitude
mag = np.hypot(dx, dy)
selection = mag > np.percentile(mag, 90)

# Every 10th pixel (skipping left edge due to boundary effects)
ds = np.zeros_like(selection)
ds[::10, 10::10] = True
selection = ds & selection


# Now we'll visualize the selected (large) structure tensor directions both
# superimposed on the image and as a rose diagram...
fig = plt.figure(constrained_layout=True)
ax1 = fig.add_subplot(2, 1, 1)
ax2 = fig.add_subplot(2, 1, 2, projection='polar', theta_offset=np.pi/2,
                      theta_direction=-1)

ax1.imshow(np.moveaxis(image, 0, -1))

y, x = np.mgrid[:dx.shape[0], :dx.shape[1]]

no_arrow = dict(headwidth=0, headlength=0, headaxislength=0)
ax1.quiver(x[selection], y[selection], dx[selection], dy[selection],
           angles='xy', units='xy', pivot='middle', color='red', **no_arrow)


# We actually want to be perpendicular to the direction of change, i.e.
# we want to point _along_ the lineament. Therefore we'll subtract 90 degrees.
# (Could have just gotten the direction of the smaller eigenvector, but we
# need to base the magnitude on the largest eigenvector.)
angle = np.arctan2(dy[selection], dx[selection]) - np.pi/2
ax2.hist(np.concatenate([angle.ravel(), angle.ravel() + np.pi]), bins=120)

ax1.set(xticks=[], yticks=[])
ax2.set(xticks=[], yticks=[], axisbelow=True)
plt.show()

Thin Section Grain Analysis
-----------------------------------------

We've done a lot so far with large-scale data.  A key reason to know image processing methods, though, is that they apply to all scales.  Let's shift gears and spend a bit of time working with photomicrographs.  

In this case, we'll try to measure the shape preferred orientation of mineral grains in a rock.  We won't go too deeply into the specifics, so let's just try to get an idea of the distribution of grain orientations in our sample.  

If you're not familiar with optical petrology, we're looking at a slide with a very thin (~30 microns) slice of rock attached to it.  It's common to use both plane polarized light and cross polarized light.  You can compare the change in appearance in the figure below:

In [None]:
%matplotlib notebook
import numpy as np
import matplotlib.pyplot as plt
import rasterio as rio
import skimage.io

from context import data
from context import utils

xpl_rgb = skimage.io.imread(data.bgs_rock.amphibolite_xpl)
ppl_rgb = skimage.io.imread(data.bgs_rock.amphibolite_ppl)

fig, ax = plt.subplots(constrained_layout=True)
im_ppl = ax.imshow(ppl_rgb, label='Plane Polarized')
im_xpl = ax.imshow(xpl_rgb, label='Cross Polarized')
ax.set(xticks=[], yticks=[])
utils.Toggler(im_xpl).show()

Here's the (poor) 40,000 foot (yes, I know... Units...) overview: 

Minerals with a high degree of birefringence (controlled by anisotropy in the speed of light through the crystal) can appear in a range of "gaudy" colors under cross polars.  The color depends on the orientation of the crystal, so an individual color doesn't tell us much. The range of colors for a certain mineral is a useful indicator, though.  

In this case, we don't really care what the colors mean, we only want to use the color to distinguish one grain from another adjacent grain.  We're interested in being able to identify distinct grains because we're interested in the shape of each grain.

If you look closely, you'll see that the colors within a grain aren't constant. They vary a bit. There are also lots of small mineral grains and inclusions within larger grains that we don't need to worry as much about for this analysis. Ideally we'd try to separate them, but it's okay if we don't. We mostly want to look at the orientation of the largest grains.

Okay. So we want to identify distinct grains.  In image processing terms, we want to segment the image. We'll do this by finding regions with similar colors.  These "regions of a similar color" often are referred to as "superpixels" in image processing terms.  There are a _huge_ variety of methods to do this. Some use gradients to define "watersheds" of similar color, some use clustering, and you can even apply more flexible methods like a trained CNN to do this.

Let's apply a fairly well-known and widely used "superpixel" method: Simple Linear Iterative Clustering (SLIC).  It's based on [K-means clustering](https://en.wikipedia.org/wiki/K-means_clustering), which is a common method to find groups of similar data.  In this case,  it's clustering spatially as well as in color-space.  It's actually using X,Y coordinates directly in the clustering and working in 5 dimensions (RGB + XY).
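
To get a feel for what SLIC returns before running it on a large photomicrograph, here's a tiny sketch on a synthetic two-color image. The image and parameter values are assumptions for illustration. The `start_label` argument assumes a reasonably recent scikit-image release.

```python
import numpy as np
import skimage.segmentation

# Synthetic RGB image (an assumption for illustration): red on the left,
# blue on the right, as floats in the 0-1 range.
rgb = np.zeros((40, 40, 3))
rgb[:, :20, 0] = 1.0
rgb[:, 20:, 2] = 1.0

# One integer label per pixel; each label is one "superpixel"
labels = skimage.segmentation.slic(rgb, n_segments=16, compactness=10,
                                   start_label=1)
print(labels.shape, len(np.unique(labels)))

# Count superpixels that contain pixels from both sides of the boundary
straddling = [lab for lab in np.unique(labels)
              if (labels[:, :20] == lab).any() and (labels[:, 20:] == lab).any()]
print("superpixels crossing the color boundary:", len(straddling))
```

Because color and position are clustered together, superpixels tend to stay compact in space while also respecting strong color boundaries. Raising `compactness` weights position more heavily, and lowering it weights color more heavily.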

To make things a bit easier on ourselves, let's just work with the center of the image (avoids needing to separate out the background that's not part of the thin section):

In [None]:
xpl_rgb = xpl_rgb[500:3000, 1000:4000, :]

fig, ax = plt.subplots(constrained_layout=True)
ax.imshow(xpl_rgb)
ax.set(xticks=[], yticks=[])
plt.show()

Now let's segment the image using SLIC and play around with the parameters a bit...

In [None]:
import skimage.segmentation

grains = skimage.segmentation.slic(xpl_rgb, sigma=0.5, multichannel=True,
                                   n_segments=1500, compactness=0.1)

# It's hard to color each grain with a unique color, so we'll show boundaries
# in yellow instead of coloring them like we did before.
overlay = skimage.segmentation.mark_boundaries(xpl_rgb, grains)

fig, ax = plt.subplots(constrained_layout=True)
ax.imshow(overlay)
plt.show()

Here's a more complete example where we extract properties (such as long axis and short axis) of each segmented region to look at the distribution of grain orientations:

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import skimage.io
import skimage.segmentation
import skimage.measure
import scipy.ndimage

from context import data
from context import utils

# Amphibolite under cross polars (from BGS, see data/bgs_rock/README.md)
xpl_rgb = skimage.io.imread(data.bgs_rock.amphibolite_xpl)

# Let's use the center of the image to avoid needing to worry about the edges.
xpl_rgb = xpl_rgb[500:3000, 1000:4000, :]

# This attempts to group locally similar colors. It's kmeans in 5 dimensions
# (RGB + XY).  N_segments and compactness are the main "knobs" to turn.
grains = skimage.segmentation.slic(xpl_rgb, sigma=0.5, multichannel=True,
                                   n_segments=1500, compactness=0.1)

# It's hard to color each grain with a unique color, so we'll show boundaries
# in yellow instead of coloring them like we did before.
overlay = skimage.segmentation.mark_boundaries(xpl_rgb, grains)

# Now let's extract information about each individual grain we've classified.
# In this case, we're only interested in orientation, but there's a lot more
# we could extract.
info = skimage.measure.regionprops(grains)

# And calculate the orientation of the long axis of each grain...
angles = []
for item in info:
    cov = item['inertia_tensor']
    azi = np.degrees(np.arctan2((-2 * cov[0, 1]), (cov[0,0] - cov[1,1])))
    angles.append(azi)

# Make bidirectional (quick hack for plotting)
angles = angles + [x + 180 for x in angles]

# Now display the segmentation and a rose diagram
fig = plt.figure(constrained_layout=True)
ax1 = fig.add_subplot(1, 2, 1)
ax2 = fig.add_subplot(1, 2, 2, projection='polar', theta_offset=np.pi/2,
                      theta_direction=-1)
ax1.imshow(overlay)
ax2.hist(np.radians(angles), bins=60)

ax1.set(xticks=[], yticks=[])
ax2.set(xticklabels=[], yticklabels=[], axisbelow=True)
plt.show()
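
When working with orientations, it's worth checking conventions on a shape where you already know the answer. As a hedged sketch, `regionprops` also exposes an `orientation` property directly. In recent scikit-image releases (an assumption about your installed version) it's measured in radians from the row axis. The single rectangular "grain" below is made up for illustration.

```python
import numpy as np
import skimage.measure

# Synthetic label image (an assumption for illustration): one "grain" that
# is much longer along the columns (x) than along the rows (y).
labels = np.zeros((50, 100), dtype=int)
labels[20:30, 10:90] = 1

region = skimage.measure.regionprops(labels)[0]

# Orientation of the long axis of the best-fit ellipse
print("orientation (degrees):", np.degrees(region.orientation))
```

A grain elongated along the columns should report an angle of about plus or minus 90 degrees from the row axis. Running the same check on a grain elongated along the rows is a quick way to confirm which convention your version uses.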


### Wrapping it all up

That's all for now, folks!  Hopefully you can see ways to apply some of this to problems you're actively working on.  There are a ton of very powerful methods exposed in common python image processing libraries, and it's easy to get a bit lost in the huge variety of options.  Hopefully this has given you enough of an understanding of common methods to start exploring on your own.  There's a lot out there there's very 