# Segmentation to Information: Edge Cases
Devoted to investigating and patching incorrect edge-case behavior in `seg2info.py`.

In [None]:
%load_ext autoreload
%autoreload 2

## `other` on horizon
The algorithm for detecting the boundary between sky and water/ice is currently implemented as detecting the boundary between anything that isn't water/ice and water/ice. However, what if a water drop on the lens obscures the horizon? The algorithm would count the `other` of the water drop as sky, making the detected boundary bend down around the drop and ultimately throwing off the line of best fit. Let's subclass `Seg2Info`, reproduce the existing algorithm here so it persists once we implement a fix, and demonstrate the problem.
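To make the failure mode concrete before looking at real data, here is a minimal synthetic sketch. It does not use `Seg2Info`; the image width, the slope, the position of the "drop", and the 25-row offset are all made-up values. It only shows how a localized dip in the detected boundary skews a least-squares line, and that leaving those columns out recovers the true line.

```python
import numpy as np

width = 200
x = np.arange(width) - width // 2          # column positions, centred on the image

# Hypothetical true horizon: row 50 at the centre, gentle tilt
true_edge = 50 + 0.05 * x

# A "drop" covering columns 120..149 pushes the detected boundary 25 rows down
observed = true_edge.copy()
observed[120:150] += 25

# Fit using every column, drop included
biased = np.polynomial.Polynomial.fit(x, observed, 1).convert()

# Fit using only the unobscured columns
keep = np.ones(width, dtype=bool)
keep[120:150] = False
clean = np.polynomial.Polynomial.fit(x[keep], observed[keep], 1).convert()

print("with drop:   ", biased.coef)   # intercept and slope are both pulled off
print("without drop:", clean.coef)    # close to the true [50, 0.05]
```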

In [None]:
from cv_experiments.seg2info import Seg2Info
import numpy as np
import cv2
import matplotlib.pyplot as plt

In [None]:
class Seg2Info_OldSkyEdge(Seg2Info):
    def sky_edge(self, img):  # Copypasted from Seg2Info before planned fix
        """Find the sky edge by searching down each column for the transition from no water/ice to some water/ice.\
            Do this with a bunch of different samples, throw out the bad ones if possible, and average
        """
        oceanness = np.max([img[:,:,self.proc_props["four_values"].index("water")], img[:,:,self.proc_props["four_values"].index("ice")]], axis=0)
        # Enforce monotonicity (took me a while to realize the lack of this was causing problems...)
        oceanness = np.maximum.accumulate(oceanness, axis=0)

        samples = np.stack([np.apply_along_axis(np.searchsorted, 0, oceanness, x, side="right")
            for x in self.proc_props["transition_sample_locs"]], axis=-1)
        filtered_samples = np.where(((samples > 0) & (samples < img.shape[0])), samples, np.nan)
        result = np.mean(filtered_samples, axis=-1)
        return np.where(np.isnan(result), np.min(samples, axis=-1), result)
s2i_oldsky = Seg2Info_OldSkyEdge()

In [None]:
raindrop = {"name": "simple_horizon_raindrop",
    "seginput": cv2.imread("../representatives/byname/simple_horizon_raindrop_seginput.jpg"),
    "segmap": cv2.imread("../representatives/byname/simple_horizon_raindrop_segmap.png", cv2.IMREAD_GRAYSCALE)}
fig, axs = plt.subplots(2, 3, figsize=(20, 10))
axs = axs.ravel()
axs[0].imshow(cv2.cvtColor(raindrop["seginput"], cv2.COLOR_BGR2RGB))  # cv2.imread returns BGR; matplotlib expects RGB
s2i_oldsky.simple_composite(axs[1], raindrop, title=raindrop["name"])
s2i_oldsky.plot_mask(axs[2], raindrop, title=raindrop["name"])

img = s2i_oldsky.one_hot_four(raindrop["segmap"])
img = s2i_oldsky.upscale(img)
img = s2i_oldsky.undistort(img)
detected = s2i_oldsky.sky_edge(img)
s2i_oldsky.plot_mask(axs[3], {"segmap": img}, title="Detected sky edge (old algorithm)")
axs[3].scatter(np.arange(img.shape[1]), detected, color="yellow", s=8)

horizon = s2i_oldsky.find_horizon(img)
s2i_oldsky.plot_line(axs[4], {"segmap": img, "line": horizon}, title="Inferred horizon (old algorithm)")

axs[5].set_xlim(0, 1)
axs[5].set_ylim(0, 1)
axs[5].text(0.5, 0.5, ":(", {"fontsize": 100, "ha": "center", "va": "center"});

As expected. (The detected edge doesn't perfectly track the edge of the drop because of the "enforce monotonicity" step.)
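One plausible way the monotonicity step produces this effect is a column where a thin strip of water sits above the drop. The toy column below uses made-up scores, not values from the real segmap. Once the running maximum is taken, the first water pixel from the top wins and everything below it is ignored, so the detection lands above the drop rather than at its lower edge.

```python
import numpy as np

# One image column, top to bottom (hypothetical oceanness scores):
# rows 0-2 sky, row 3 a thin strip of water, rows 4-6 the drop, rows 7-9 water
oceanness = np.array([0.0, 0.0, 0.0, 0.9, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0])

# np.searchsorted assumes sorted input, so take the running maximum first
monotone = np.maximum.accumulate(oceanness)

# First row whose accumulated oceanness exceeds the threshold
threshold = 0.5
edge_row = np.searchsorted(monotone, threshold, side="right")

print(monotone)
print(edge_row)  # the strip of water at row 3, not the bottom of the drop at row 7
```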

Now, a fix. Our four-value representation still differentiates between `sky` and `rest`. In cases where `rest` isn't getting in the way, there is a transition between `sky` and [`water` or `ice`], and we aim to detect it. In cases where `rest` _is_ getting in the way, there are zero such transitions. Let's detect these cases and just not detect anything for those columns. As long as we still have enough unobscured horizon, the line of best fit should still work.

Implementation: currently, we more or less treat `rest` the same as `sky`. That means we don't have to worry about detecting blobs of `rest` completely surrounded by sky. We also don't have to worry about blobs of `rest` completely surrounded by ocean: by the time we get to ocean we will have already detected a horizon, and the monotonicity enforcement will remove such blobs. The only blobs of `rest` we need to worry about are those that have `sky` above them and [`water` or `ice`] below them. An entire column should be invalidated iff this occurs. Thus, we can do our normal search with the various thresholds, then, before we average the results together, invalidate each result if the pixel just above it (one pre-upscale pixel, i.e. `upscale_factor` rows) is `rest`.
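The rule can be sketched on a tiny hard-labelled map. This is a simplification and not the `Seg2Info` code: it assumes integer labels instead of per-class scores, a single threshold, and a look-back of one row rather than `upscale_factor` rows. The label encoding (0 = sky, 1 = rest, 2 = water) is made up for the example.

```python
import numpy as np

# Rows run top to bottom; 0 = sky, 1 = rest, 2 = water (hypothetical encoding)
labels = np.array([
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
    [2, 1, 2, 2, 2],
    [2, 2, 2, 2, 2],
])
is_ocean = labels == 2
is_rest = labels == 1

# First ocean row in each column (every toy column contains ocean)
edge = is_ocean.argmax(axis=0)

# Look at the pixel directly above each detected edge
cols = np.arange(labels.shape[1])
above = np.clip(edge - 1, 0, None)
obscured = is_rest[above, cols]

# Columns whose edge sits directly under `rest` report NaN instead of a row
edge = np.where(obscured, np.nan, edge)
print(edge)
```

Column 3 has a blob of `rest` fully surrounded by sky and stays valid, while columns 1 and 2, where `rest` touches the water, are invalidated.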

In [None]:
class Seg2Info_NewSkyEdge(Seg2Info):
    def sky_edge(self, img, rest_thresh=0.5):
        """Find the sky edge by searching down each column for the transition from no water/ice to some water/ice.\
            Do this with a bunch of different samples, throw out the bad ones if possible, and average
        """
        oceanness = np.max([img[:,:,self.proc_props["four_values"].index("water")], img[:,:,self.proc_props["four_values"].index("ice")]], axis=0)
        # Enforce monotonicity (took me a while to realize the lack of this was causing problems...)
        oceanness = np.maximum.accumulate(oceanness, axis=0)
        isrest = img[:,:,self.proc_props["four_values"].index("rest")] > rest_thresh

        samples = np.stack([np.apply_along_axis(np.searchsorted, 0, oceanness, x, side="right")
            for x in self.proc_props["transition_sample_locs"]], axis=-1)
        # If the pixel immediately above us is `rest`, we haven't actually found anything
        y = np.clip(samples.T-self.proc_props["upscale_factor"], 0, None)
        samples = np.where(~isrest[y, np.arange(img.shape[1])].T, samples, np.nan)
        # If we hit a bounds, we haven't actually found anything
        filtered_samples = np.where(((samples > 0) & (samples < img.shape[0])), samples, np.nan)
        result = np.mean(filtered_samples, axis=-1)
        # Report np.nan if anything is bordering `rest`; report the bounds if anything is out of bounds
        return np.where(np.isnan(result), np.min(samples, axis=-1), result)
    
    # We need to filter out NaNs before passing to fit
    def find_horizon(self, img):
        """Find a horizon as the line of best fit of the sky_edge"""
        width = img.shape[1]
        x = np.arange(width)-width//2
        y = self.sky_edge(img)
        valids = ~np.isnan(x) & ~np.isnan(y)
        return np.polynomial.Polynomial.fit(x[valids], y[valids], 1).convert()

s2i_newsky1 = Seg2Info_NewSkyEdge()
detected = s2i_newsky1.sky_edge(img)
s2i_newsky1.plot_mask(plt.gca(), {"segmap": img}, title="Detected sky edge (new algorithm)")
plt.gca().scatter(np.arange(img.shape[1]), detected, color="yellow", s=8)
plt.show()
horizon = s2i_newsky1.find_horizon(img)
s2i_newsky1.plot_line(plt.gca(), {"segmap": img, "line": horizon}, title="Inferred horizon (new algorithm)")

Success!

One complication: we use `sky_edge` a second time, to detect and adjust for minor deviations from the line of best fit after rotation. One option would be to just not adjust the columns in which we cannot detect a sky edge. Another would be to interpolate across the gap to fabricate a sky edge. The second option seems more appealing, but what if the missing data is at the edge of the image? Then we don't have two sides to interpolate between. In these cases, we could take the value from the side we do know and hold it constant. Or we could assume a delta of `0` at the missing side, i.e. use the value of the line of best fit there. Let's do the latter. We'll work with a fabricated version of this segmap that has a bit more variation to adjust for:
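Here is a minimal NumPy-only sketch of the chosen option, using made-up edge values. The helper name `fill_edge_gaps` is hypothetical, and it uses `np.interp` instead of pandas. It pads the edge with one virtual sample on each side, set to the line of best fit at the image borders, so that gaps touching a border are interpolated toward the line. Without the padding, `np.interp` would hold the nearest known value constant past the ends, which is the first option described above.

```python
import numpy as np

def fill_edge_gaps(edge, intercept, slope):
    """Hypothetical helper: fill NaN gaps linearly, pinning both image borders to the fitted line."""
    n = len(edge)
    padded = np.empty(n + 2, dtype=float)
    padded[1:-1] = edge
    padded[0] = -slope * n / 2 + intercept    # line value at the left border
    padded[-1] = slope * n / 2 + intercept    # line value at the right border

    pos = np.arange(n + 2)
    known = ~np.isnan(padded)
    filled = np.interp(pos, pos[known], padded[known])
    return filled[1:-1]                       # drop the two virtual samples

# Gaps at the left border, in the middle, and at the right border
edge = np.array([np.nan, np.nan, 10.0, 12.0, np.nan, np.nan, 18.0, np.nan])
filled = fill_edge_gaps(edge, intercept=12.0, slope=0.5)
print(filled)
```

Known values pass through unchanged; only the NaN entries are replaced.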

In [None]:
fabricated = {"name": "fabricated_horizon",
    "seginput": cv2.imread("../representatives/byname/fabricated_rest_adjustment.jpg"),
    "segmap": cv2.imread("../representatives/byname/fabricated_rest_adjustment.png", cv2.IMREAD_GRAYSCALE)}
img = s2i_newsky1.one_hot_four(fabricated["segmap"])
img = s2i_newsky1.upscale(img)
img = s2i_newsky1.undistort(img)
detected = s2i_newsky1.sky_edge(img)
horizon = s2i_newsky1.find_horizon(img)
s2i_newsky1.plot_line(plt.gca(), {"segmap": img, "line": horizon}, title="Fabricated horizon with line of best fit")
plt.gca().scatter(np.arange(img.shape[1]), detected, color="yellow", s=8)
plt.show()

In [None]:
import pandas as pd
class Seg2Info_SkyEdgeInterpolation(Seg2Info_NewSkyEdge):
    @staticmethod
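    def interpolate_sky_edge(edge, line):
        """Fill NaN gaps in edge by linear interpolation, anchoring both ends of the image\
            to the value of the line of best fit (i.e. a delta of 0 at the borders)
        """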
        padded = np.zeros(len(edge)+2, dtype=edge.dtype)
        intercept, slope = line
        padded[1:-1] = edge
        padded[0] = -slope*len(edge)/2+intercept
        padded[-1] = slope*len(edge)/2+intercept
        interpolated = pd.Series(padded).interpolate()
        return interpolated.to_numpy()[1:-1]
    
    # Identical to original except added interpolation line
    def adjust_and_crop(self, img, line, height, interpolation_method=cv2.INTER_LINEAR):
        """After an image has been rotated so the horizon is roughly horizontal, this translates\
            each column of pixels to make the horizon perfectly flat, then moves the horizon so it is\
            buffer_px away from the top of the image
        """

        buffer_px = self.proc_props["upscale_factor"]*self.proc_props["sky_buffer"]
        intercept,_ = line
        intercept = int(intercept)
        # Search search_range pixels up and down of the intercept for the [ice and water] to sky edge
        search_range = int(0.05*height)
        new_edge = self.sky_edge(img[(intercept-search_range):(intercept+search_range), :])+(intercept-search_range)
        new_edge = self.interpolate_sky_edge(new_edge, line)
        delta = buffer_px-new_edge
        orig_height, width = img.shape[:2]
        mapx = np.tile(np.arange(width), (orig_height, 1))
        mapy = np.tile(np.arange(orig_height), (width, 1)).T
        translated = cv2.remap(img, mapx.astype(np.float32), (mapy-delta).astype(np.float32),
            interpolation_method, borderValue=self.proc_props["rest_value"])
        new_height = width+buffer_px
        return translated[:new_height,:]

s2i_newsky2 = Seg2Info_SkyEdgeInterpolation()
detected_interp = s2i_newsky2.interpolate_sky_edge(detected, horizon)
s2i_newsky2.plot_line(plt.gca(), {"segmap": img, "line": horizon}, title="Same line, points are now interpolated")
plt.gca().scatter(np.arange(img.shape[1]), detected_interp, color="yellow", s=8)
plt.show()

rot, scale, height = s2i_newsky2.rotate_image(img, horizon)
adj = s2i_newsky2.adjust_and_crop(rot, horizon, height)
s2i_newsky2.plot_mask(plt.gca(), {"segmap": adj}, title="Adjusted with interpolation")


Looks good! Now we'll paste these improvements back into the original class and make sure everything still works:

In [None]:
s2i_final = Seg2Info()
s2i_final.plot_log_polar(plt.gca(), {"logpolar": s2i_final.apply(s2i_final.whole_pipeline, [fabricated], "segmap")[0]}, title="Logpolar with finalized fix")

Excellent.