Exploring ideas for new shoreline extraction routines for application on label outputs of 4-class segmentation models #168

dbuscombe-usgs · 2023-08-03T01:04:50Z

I wanted to see if I could come up with a solution for shoreline extraction that starts with the 4-class label output as the basis. I put together a script to test this workflow, using an RGB image, the segformer RGB model, and cloud and reference shoreline masks that I just made using GIMP for simplicity. These files are here:

inputs_for_shoreline_extarction.zip

This is what they look like

First, I propose we filter the label image with the reference shoreline mask and the cloud mask. If our 4-class greyscale label is the variable grey, the binary cloud mask is called cloud_mask, and the binary shoreline buffer mask is called ref_shoreline_mask, then:

import numpy as np

# cast to float so the array can hold NaN (integer arrays cannot)
grey = grey.astype(np.float16)
grey[cloud_mask==0] = np.nan
grey[ref_shoreline_mask==0] = np.nan
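To make the masking step concrete, here is a minimal sketch on a toy 3x5 label. The toy arrays are made up for illustration, and I am assuming the convention implied by the lines above: a mask value of 0 marks pixels to discard (cloudy, or outside the reference shoreline buffer). I use float32 here; the float16 cast above also holds NaN.

```python
import numpy as np

# toy 4-class label (values 0..3); purely illustrative
grey = np.array([[0, 0, 1, 2, 2],
                 [0, 1, 1, 2, 3],
                 [0, 0, 1, 2, 2]], dtype=np.uint8)

# assumption: 1 = keep, 0 = discard, for both masks
cloud_mask = np.ones_like(grey)
cloud_mask[0, 4] = 0                 # one "cloudy" pixel
ref_shoreline_mask = np.zeros_like(grey)
ref_shoreline_mask[:, 1:4] = 1       # buffer covers columns 1..3

# integer arrays cannot store NaN, so cast to float first
masked = grey.astype(np.float32)
masked[cloud_mask == 0] = np.nan
masked[ref_shoreline_mask == 0] = np.nan

n_discarded = int(np.isnan(masked).sum())
```

Only the pixels inside the buffer and outside the cloud keep their class value, so anything computed afterwards ignores the rest.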

From here, I have devised 3 new methods for shoreline extraction, each of which builds more complexity on top of the last

method 1: grab the land/water contour directly from this masked greyscale image

import matplotlib.pyplot as plt

def get_shoreline_points(grey):
    ## use matplotlib's contour function to get the contour of the value 1, 
    ## which is water/sand or whitewater/sand interface, depending on the presence of whitewater
    cs = plt.contour(grey,[1])
    plt.close()
    ## the above could be modified instead to get the sand contour, value = 2, e.g.
    ## cs = plt.contour(grey,[2])

    # the shoreline may or may not be segmented, so this next variable may be length 1 or more than 1
    # (note: ContourSet.collections is deprecated in recent matplotlib releases)
    shoreline_segments = cs.collections[0].get_paths()

    ## loop through each segment and get the X,Y coordinate values
    X = []; Y = []
    for p in shoreline_segments:
        v = p.vertices
        x = v[:,0]
        y = v[:,1]
        X.append(x)
        Y.append(y)

    # concatenate the per-segment arrays into single 1-D numpy arrays
    X = np.hstack(X)
    Y = np.hstack(Y)
    return X, Y

X, Y = get_shoreline_points(grey)

this is what we get:
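Newer matplotlib releases deprecate `ContourSet.collections`, so here is a hedged variant of the same idea that reads the segments from `cs.allsegs` instead. The function name `get_shoreline_points_allsegs` and the toy label are hypothetical, and this is only a sketch of the contouring step, not the script in the zip file.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend, so no window is opened
import matplotlib.pyplot as plt


def get_shoreline_points_allsegs(grey, level=1):
    """Return X, Y vertices of the contour at `level` (hypothetical helper)."""
    cs = plt.contour(grey, [level])
    plt.close()
    # allsegs[0] is the list of (N, 2) vertex arrays for the first (only) level
    segments = cs.allsegs[0]
    if len(segments) == 0:
        # no contour found at this level
        return np.array([]), np.array([])
    xy = np.vstack(segments)
    return xy[:, 0], xy[:, 1]


# toy label: class 0 (water) on the left, class 2 (sand) on the right
toy = np.zeros((5, 6))
toy[:, 3:] = 2
X, Y = get_shoreline_points_allsegs(toy, level=1)
```

On this toy label the level-1 contour falls halfway between the last water column (2) and the first sand column (3), i.e. at x = 2.5 for every row.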

method 2: same as method 1, but filter out found shorelines that are very close to the boundary of the shoreline mask

We could achieve this using a distance function

from scipy.ndimage import distance_transform_cdt
distances = distance_transform_cdt(ref_shoreline_mask, metric='chessboard', return_distances=True, return_indices=False)

Now we have the distance matrix: a matrix of pixel values that encode the distance (in pixels) to the boundary of the reference shoreline mask. We can further filter the label image based on a threshold distance in pixels

thres = 10
grey[distances<thres] = np.nan

## now when we get the shoreline, hopefully many shoreline segments very near the edge will disappear
X, Y = get_shoreline_points(grey)

Nice:
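As a small illustration of what the distance matrix contains, here is a sketch on a toy 7x7 buffer mask (the toy mask and the threshold of 2 are made up; the real script uses thres = 10). Each nonzero mask pixel gets its chessboard distance to the nearest zero pixel, so pixels hugging the buffer edge have small values.

```python
import numpy as np
from scipy.ndimage import distance_transform_cdt

# toy buffer mask: 1 inside the buffer, 0 outside (a one-pixel border)
ref_shoreline_mask = np.zeros((7, 7), dtype=np.uint8)
ref_shoreline_mask[1:-1, 1:-1] = 1

distances = distance_transform_cdt(ref_shoreline_mask, metric='chessboard')

# toy label, all class 1, cast to float so it can hold NaN
grey = np.ones((7, 7), dtype=np.float32)
grey[ref_shoreline_mask == 0] = np.nan

# discard everything closer than `thres` pixels to the buffer boundary
thres = 2
grey[distances < thres] = np.nan
n_kept = int(np.isfinite(grey).sum())
```

With thres = 2, the outermost ring of buffer pixels (distance 1) is discarded and only the inner 3x3 block survives.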

method 3: same as method 1 or 2, but use dynamic boundary tracing rather than contouring

Dynamic boundary tracing is a way to find the 'least-cost' path through an image, so it is good for finding interfaces

Here are the new libraries and functions we need


from numpy.matlib import repmat

# =========================================================
def dpboundary(imu):
   '''
   dynamic boundary tracing in an image
   (translated from matlab: CMP Vision Algorithms http://visionbook.felk.cvut.cz)
   '''
   # get the image dimensions
   m,n = np.shape(imu)
   #preallocate two arrays for collecting cost and position
   c = np.zeros((m,n))
   p = np.zeros((m,n))
   # initialize the cost with the first row
   c[0,:] = imu[0,:]
   # cycle through all rows
   for i in range(1,m):
      # this cost starts out as the previous
      c0 = c[i-1,:]
      # the next is vectorized code for computing least costly path through
      # by the place with most similar image intensity
      tmp1 = np.squeeze(ascol(np.hstack((c0[1:],c0[-1]))))
      tmp2 = np.squeeze(ascol(np.hstack((c0[0], c0[0:len(c0)-1]))))
      d = repmat( imu[i,:], 3, 1 ) + np.vstack( (c0,tmp1,tmp2) )
      del tmp1, tmp2
      # record where the minimum cost is
      p[i,:] =  np.argmin(d,axis=0)
      # record what the minimum cost is
      c[i,:] =  np.min(d,axis=0)

   p[p==0] = -1
   p = p+1
   # this next loop allocates pixel coordinates to the minimum cost path
   x = np.zeros((m,1))
   #cost = np.min(c[-1,:])
   xpos = np.argmin( c[-1,:] )
   for i in reversed(range(1,m)):
      x[i] = xpos
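      # p==2: the cheapest predecessor is the right-hand neighbour; p==3: the left-hand neighbour
      # (bounds use 0-based indexing: valid columns are 0..n-1)
      if p[i,xpos]==2 and xpos<n-1:
         xpos = xpos+1
      elif p[i,xpos]==3 and xpos>0:
         xpos = xpos-1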
   x[0] = xpos
   return x

# =========================================================
def ascol( arr ):
   '''
   reshapes row matrix to be a column matrix (N,1).
   '''
   if len( arr.shape ) == 1: arr = arr.reshape( ( arr.shape[0], 1 ) )
   return arr

Now we make a binary image where shoreline points are -1 and everything else is zero, and apply the dpboundary function to it

dp_input = np.zeros_like(grey)
for x,y in zip(X,Y):
   dp_input[int(y), int(x)] = -1

# call dpboundary on this image
shoreline = dpboundary(dp_input)

This one extrapolates to the edge of the image ... Nice!
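To show why the least-cost path bridges gaps, here is a compact sketch of the same dynamic-programming idea on a toy image. It is not a drop-in replacement for dpboundary: the name `least_cost_path` is hypothetical, and I pad the edges with infinity rather than replicating the edge column, which is a slightly different boundary treatment.

```python
import numpy as np


def least_cost_path(cost):
    """For each row, return the column of a minimum-cost top-to-bottom path.

    The path may shift by at most one column between consecutive rows.
    """
    m, n = cost.shape
    total = np.zeros((m, n))             # accumulated cost of the best path ending here
    step = np.zeros((m, n), dtype=int)   # column offset (0, +1 or -1) to the predecessor
    offsets = np.array([0, 1, -1])
    total[0] = cost[0]
    for i in range(1, m):
        prev = total[i - 1]
        right = np.append(prev[1:], np.inf)   # cost of the right-hand neighbour above
        left = np.append(np.inf, prev[:-1])   # cost of the left-hand neighbour above
        stacked = np.vstack((prev, right, left))
        choice = np.argmin(stacked, axis=0)   # ties resolve to "same column"
        total[i] = cost[i] + stacked[choice, np.arange(n)]
        step[i] = offsets[choice]
    # backtrack from the cheapest end point in the last row
    cols = np.zeros(m, dtype=int)
    cols[-1] = np.argmin(total[-1])
    for i in range(m - 1, 0, -1):
        cols[i - 1] = cols[i] + step[i, cols[i]]
    return cols


# toy input: shoreline pixels (-1) in column 3, with a two-row gap
toy = np.zeros((6, 7))
toy[[0, 1, 4, 5], 3] = -1
cols = least_cost_path(toy)
print(cols)  # [3 3 3 3 3 3]
```

The path has to connect the cheap (-1) pixels above and below the gap, so the rows with no shoreline pixels are filled in along column 3. This is the "extrapolation" behaviour, and it also hints at why the method assumes one roughly top-to-bottom interface.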

We can discuss all this tomorrow @2320sharon . We need to test on some more images ...

code here: shoreline_detect.zip


dbuscombe-usgs · 2023-08-03T03:31:58Z

Upon reflection, I guess it's just one method really, with "method 2" being just an additional logic-based filter, and "method 3" being a way to use those found shorelines to trace a "shoreline path" through the entire image

dbuscombe-usgs · 2023-08-03T03:58:10Z

I tidied up my script, and added two more examples. It's simple and fast and seems quite robust. I'm quite pleased with it.

all code and data here:

example_shoreline_detection_workflow.zip

dbuscombe-usgs · 2023-08-03T04:06:18Z

I think the advantages are:

it works purely in the image coordinate space, and exploits the fact that in coastseg you generate the cloud and shoreline buffers as rasters
it is therefore very fast, because it doesn't have to use vectors, convert between types or structures, or convert or reproject coordinates (nightmare!)
applying the masks to the 4-class label eliminates the need to binarize the image and, by extension, the need to deal with booleans and binary logic, which can be tricky for both us and the computer to wrap our heads around. Instead, we can find the line separating classes 0, 1 and 2. Using matplotlib's contour function, you can ask for a specific contour, so I ask for the contour of the value 1. It finds the edge of the whitewater, and if whitewater is missing, it finds the edge of the water instead.
the distance filtering is only helpful in situations where shoreline segments are found within a specified distance of the reference shoreline buffer boundary, so we may or may not want to implement that
the dynamic boundary tracing technique allows us to get a continuous shoreline from the shoreline extracted from the label. Like extrapolation. Genius!

2320sharon · 2023-08-03T16:59:15Z

This is seriously cool and this should make extracting shorelines much faster. Would we still want to offer the same settings as coastsat for extracting shorelines with our model or do you think we may have to adapt them a bit? I believe we should adapt the settings to meet our needs when extracting shorelines instead of adhering to what coastsat has already built.

dbuscombe-usgs · 2023-08-04T01:51:00Z

I tested a few more images, this time a few more tricky examples and the results were interesting

These ones worked great. The common theme is coastlines oriented from top to bottom ...

However, at this site, the dynamic boundary tracing caused a problem, extrapolating the shoreline into an impossible area

I think I know the solution to this problem and will work on it and update here later.

Finally, the dynamic boundary tracing creates issues for non-straight shorelines, as we predicted:

It feels so good to get to work on an image processing task!

dbuscombe-usgs · 2023-08-04T01:53:13Z

In the last two examples, it failed because the dynamic boundary tracing (DBT) is supposed to only work on straight(ish) interfaces, so I will try to think of a switch I can program in that determines when the DBT is/is not appropriate to apply ...

dbuscombe-usgs · 2023-08-04T02:34:48Z

Let's examine why the shoreline failed for this image:

If we examine the input to the DBT algorithm (blue/yellow mask) and the outputs, we realize how powerful this algorithm is at cutting through noise (an alternative we could explore here is the RANSAC algorithm)

I reasoned that the DBT algo needs more signal to work with because of the large gaps. I therefore dilated the input and that improved things a lot
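A minimal sketch of what dilating the DBT input could look like, assuming the input image uses -1 for shoreline pixels and 0 elsewhere as above. The use of scipy's `binary_dilation`, the default cross-shaped structuring element, and a single iteration are my assumptions; the actual script may dilate differently.

```python
import numpy as np
from scipy.ndimage import binary_dilation

# toy DBT input: two isolated shoreline pixels (-1) with a large gap between them
dp_input = np.zeros((5, 7))
dp_input[0, 3] = -1
dp_input[4, 3] = -1

# grow the shoreline pixels by one pixel in each direction (4-connectivity)
shoreline_pixels = dp_input < 0
dilated = binary_dilation(shoreline_pixels, iterations=1)

# rebuild the cost image: -1 on (dilated) shoreline pixels, 0 elsewhere
dp_input_dilated = np.where(dilated, -1.0, 0.0)
n_shoreline = int(dilated.sum())
```

Each original pixel now contributes its in-bounds neighbours as well, so the least-cost path has more low-cost pixels to lock onto across gaps.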

The next problem is trickier ... there are too many shorelines so DBT doesn't apply

... we need a way to tell how complex the shoreline is. One metric that comes to mind is the "standard distance" of the shoreline locations. This is the average distance away from the center of the point cloud. It does a reasonable job at separating the long, straight coastline from the complex coastline. In each of the plots below, the standard distance is the title, and the most complex coast has by far the largest standard distance
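For reference, a minimal sketch of the standard distance as it is usually defined in spatial statistics: the root-mean-square distance of the points from their mean center. The function name is hypothetical, and the script in the zip file may compute the average distance in a slightly different way.

```python
import numpy as np


def standard_distance(x, y):
    """Root-mean-square distance of points (x, y) from their mean center."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    return np.sqrt(np.mean(dx**2 + dy**2))


# toy example: four points on the corners of a square centered on the origin
sd = standard_distance([-1, 1, 1, -1], [-1, -1, 1, 1])
```

In use, X and Y from get_shoreline_points would be passed in, and the result compared against a threshold chosen from test images to decide whether DBT should be applied.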

dbuscombe-usgs · 2023-08-04T03:01:03Z

If I implement all of the above ideas, here are the new outputs for all 9 images. Not bad!

My files:
script_and_masks.zip
images.zip
figs.zip

2320sharon · 2023-08-04T18:32:51Z

Thanks for explaining your thought process here. I gotta say you got some massive improvements with those techniques.

Do you recommend any books, websites, or courses for learning more about image processing? I find it quite interesting and knowing more about it seems to be the key to solving these kinds of problems

dbuscombe-usgs · 2023-08-16T21:09:15Z

These routines should only be implemented for the 'zoo method', i.e. using the segformer models.

One nice thing about the approach is that it filters imagery based on label outputs, not image inputs. My logic is that it is probably a lot easier to identify bad images from model outputs (which are low-dimensional) than bad inputs (which are high dimensional). I'm making the case that it is harder to identify a bad image than a bad model output. The only real downside is the computational expense of pointing the model at each image, rather than only the 'good' ones.

However, I feel like the ideal approach is a two-pronged approach:

a non-aggressive filter on the input images (like the black pixel filter we already have, with a permissive threshold)
an aggressive filter on the model output label images
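As a rough sketch of the first prong, a permissive black-pixel filter on the input image might look like the following. The function names and the 0.5 threshold are hypothetical and this is not CoastSeg's existing filter; the criteria for the second, more aggressive filter on the label outputs are not specified here, so I leave that part out.

```python
import numpy as np


def black_pixel_fraction(image):
    """Fraction of pixels that are exactly zero in every band."""
    image = np.asarray(image)
    if image.ndim == 3:
        black = np.all(image == 0, axis=-1)
    else:
        black = image == 0
    return float(black.mean())


def passes_input_filter(image, max_black_fraction=0.5):
    """Permissive check: reject only images that are mostly black."""
    return black_pixel_fraction(image) <= max_black_fraction


# toy RGB image: left half is no-data (black), right half has values
toy_rgb = np.zeros((4, 4, 3), dtype=np.uint8)
toy_rgb[:, 2:, :] = 255
frac = black_pixel_fraction(toy_rgb)
keep = passes_input_filter(toy_rgb)
```

A permissive threshold like this lets most images through to the model, leaving the stricter decision to whatever check is applied to the label output.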

dbuscombe-usgs added the Testing (Test Case Scenarios, automated testing, etc.) and Research (Investigate if something is possible, experiment) labels Aug 3, 2023

dbuscombe-usgs assigned dbuscombe-usgs and 2320sharon Aug 3, 2023

dbuscombe-usgs changed the title ~~Exploring ideas more new shoreline extraction routines for label outputs of 4-class segmentation models~~ Exploring ideas for new shoreline extraction routines for application on label outputs of 4-class segmentation models Aug 3, 2023

dbuscombe-usgs mentioned this issue Aug 5, 2023

Processing imagery with dask/xarray: example application for identifying outlier imagery #154

Closed

This was referenced Aug 17, 2023

Feature Request: Automatic Good/Bad Image Filter #171

Open

minor feature request: rename SDS_unet_classifier.ipynb (again!) #177

Closed

2320sharon added the V2 for version 2 of coastseg label Aug 24, 2023

2320sharon mentioned this issue Oct 5, 2023

Zoo Classifier Workflow Update #197

Open

3 tasks

2320sharon closed this as completed Dec 22, 2023
