MODIS Interpolation Comparisons #39

djhoese · 2022-05-17T01:06:14Z

While working on my 3 PRs for optimizing MODIS interpolation (#35, #36, #38), I noticed that in real world cases my cython-based PRs were producing invalid/bad images where the bow-tie pixels of an image were clearly being mapped incorrectly from the default interpolation used by Satpy (the geotiepoints/modisinterpolator.py version that uses the CVIIRS algorithm). This happens even though the unit tests pass! This means that the tests aren't strict/rigid enough in their thresholds to the point that they let invalid/bad results through.

NOTE: All plots/images in this issue were made with the current main branch and don't use my updated code.

I've since been trying to do a deep dive on what is going on and how the "simple" interpolation compares to the CVIIRS interpolation. I've discovered at least one problem with the last column of the data. Here is a zoomed in view of the differences between the two methods' longitude arrays, specifically the upper right corner, when used on real Aqua data:

You may be able to see on the right most columns (where I've removed the black border line in the matplotlib plot) that the differences are suddenly much higher for the last column. If I look at the individual plots for these longitudes I can't see anything obvious in either method's results. I'll post the images I've generated so far to start the discussion, but we need to come up with ways to decide which method to use as "truth".

Note I refer to the CVIIRS algorithm as "angle-based" because it requires solar zenith angles. I refer to the other method as "simple" because it pretty close to basic bilinear interpolation from the basic understanding I have.

CC @simonrp84

The text was updated successfully, but these errors were encountered:

djhoese · 2022-05-17T01:11:26Z

Ah, the last "column" is actually 3 columns of data (here's a zoomed in view of the last 200 columns of the image, only showing the last 10 columns in this view):

Edit: I think given the angle-based/CVIIRS algorithm has this last line of this method I blame CVIIRS (where cols == 4):

    def _expand_tiepoint_array_1km(self, arr, lines, cols):
        arr = da.repeat(arr, lines, axis=1)
        arr = da.concatenate((arr[:, :lines//2, :], arr, arr[:, -(lines//2):, :]), axis=1)
        arr = da.repeat(arr.reshape((-1, self.cscan_full_width - 1)), cols, axis=1)
        return da.hstack((arr, arr[:, -cols:]))

mraspaud · 2022-05-17T06:38:20Z

Thanks a lot for looking into this! The last columns are extrapolated, right?

djhoese · 2022-05-17T14:13:16Z

So if I exclude the last 3 columns from the lon/lat difference limit we get much better differences. Considering all pixels we get (lon diff min, lon diff max, lat diff min, lat diff max):

-0.0030819955405547717 0.06203039353681561 -0.031740776286142136 0.00018770053469552295

Excluding 3 right-most columns:

-0.0030819955405547717 0.0018535406073283411 -0.0003218550081101057 0.00018770053469552295

Plotting the differences with these new limits shows:

A difference of -.00308199554 (the maximum longitude difference) at the equator is a difference of 343.086 meters in the mercator projection. Considering this geolocation interpolation is 1km to 250m, that's still a pretty significant difference.

djhoese · 2022-05-17T14:20:39Z

@mraspaud as far as extrapolation, I'd have to check that one link that @simonrp84 linked to on slack, but I don't think there is extrapolation in the column/cross-track direction but there is in the row/along-track direction. In the simple interpolation I use scipy's map_coordinates with these indexes:

    x = np.arange(res_factor * num_cols, dtype=np.float32) * (1. / res_factor)
    # 0.375 for 250m, 0.25 for 500m
    y = np.arange(res_factor * ROWS_PER_SCAN, dtype=np.float32) * \
        (1. / res_factor) - (res_factor * (1. / 16) + (1. / 8))

So the x/column indexes are all positive and less than or equal to the original indexes. The y/row indexes have the - (res_factor * (1. / 16) + (1. / 8)) term which should mean that the first couple rows of the output geolocation are extrapolated...I think. But also the "simple" algorithm has special handling for these (the first two rows and the last 2 rows) to perform a special linear extrapolation.

djhoese · 2022-05-17T14:58:39Z

Ok so here is the information that Simon had linked to: https://www.icare.univ-lille.fr/modis-geolocation/

We can see from this image that the last 3 250m columns are indeed "after" the last defined 1km column (in red):

There is a similar image for the first columns which show that the first 250m column is inline with the first 1km column's X coordinate. This matches what the simple interpolation is doing from what I can tell (first x index is 0, last index is 0.75 beyond the last column index (or 0.25 pixels beyond the edge of the last 1km pixel).

This does suggest that the code I showed above that is in the CVIIRS/angle-based interpolation where it appends the last 4 columns in the tiepoint array is wrong (again, from what I understand). It should at least be extrapolating those columns' tiepoints.

@mraspaud I propose we replace the "valid"/truth test data with the output of the simple interpolation and compare the CVIIRS/angle-based interpolation based on that. I also propose that for 1km->500m and 1km->250m interpolation, Satpy should use the simple interpolation by default. We can then work toward:

Fixing the interpolation of the angled-based interpolation for at least these last 3 columns.
Possibly add 5km->1km interpolation to the simple interpolation.

mraspaud · 2022-05-19T16:49:55Z

I will try to have a look at this tomorrow

djhoese · 2022-05-24T19:17:10Z

After some discussion on slack and in a call, @mraspaud has pointed out some major issues in the simple interpolation. If we plot the the geodetic distance between pixels cross track (this is actually the average of the same row in multiple scans) we see this:

The simple linear interpolation has very clear steps between control points. This isn't unexpected but isn't great either if we can do any better especially given the cviirs algorithm is able to produce much cleaner line overall. Initial tests of using order=2 in the map_coordinates call shows much better interpolation in the middle of the scan but the ends (due to the interpolation) go wacky.

Additionally this plot shows the last columns of the simple interpolation are actually just a repeat of the last input pixel. This is due to the "nearest" mode used for pixels outside the input region. There isn't a mode builtin to map_coordinates that will do anything better. This is overall a bug/failing in the simple interpolation that we will need to work on at a future date.

djhoese · 2022-05-24T20:08:07Z

I'll make a PR soon, but I've come up with a "good enough" hack to extrapolate the simple interpolation's right-most columns. Even with this though, I'm a little concerned by the different between simple and cviirs at the tops and bottoms of scans. Here's a zoom in on the word part of the longitude array (the upper-left in this particular data case).

mraspaud · 2022-05-25T11:15:29Z

I found a bug in the modisintepolator.py that is due to the different nature of the modis tiepoints compared to the cviirs one. The fix is pretty simple, and improve the interpolation a the scan edges a lot.
Here is the patch for main:

diff --git a/geotiepoints/modisinterpolator.py b/geotiepoints/modisinterpolator.py
index 6d70db2..5fb83a3 100644
--- a/geotiepoints/modisinterpolator.py
+++ b/geotiepoints/modisinterpolator.py
@@ -23,7 +23,14 @@
 """Interpolation of geographical tiepoints using the second order interpolation
 scheme implemented in the CVIIRS software, as described here:
 Compact VIIRS SDR Product Format User Guide (V1J)
-http://www.eumetsat.int/website/wcm/idc/idcplg?IdcService=GET_FILE&dDocName=PDF_DMT_708025&RevisionSelectionMethod=LatestReleased&Rendition=Web
+https://www.eumetsat.int/media/45988
+and
+Anders Meier Soerensen, Stephan Zinke,
+A tie-point zone group compaction schema for the geolocation data of S-NPP and NOAA-20 VIIRS SDRs to reduce file sizes
+in memory-sensitive environments,
+Applied Computing and Geosciences, Volume 6, 2020, 100025, ISSN 2590-1974,
+https://doi.org/10.1016/j.acags.2020.100025.
+(https://www.sciencedirect.com/science/article/pii/S2590197420300070)
 """
 
 import xarray as xr
@@ -34,9 +41,8 @@
 from .geointerpolator import lonlat2xyz, xyz2lonlat
 
 R = 6371.
-# Aqua scan width and altitude in km
-scan_width = 10.00017
-H = 705.
+# Aqua altitude in km
+H = 709.
 
 
 def compute_phi(zeta):
@@ -51,7 +57,7 @@ def compute_zeta(phi):
     return np.arcsin((R + H) * np.sin(phi) / R)
 
 
-def compute_expansion_alignment(satz_a, satz_b, satz_c, satz_d):
+def compute_expansion_alignment(satz_a, satz_b, satz_c, satz_d, scan_width):
     """All angles in radians."""
     zeta_a = satz_a
     zeta_b = satz_b
@@ -180,7 +186,7 @@ def interpolate(self, lon1, lat1, satz1):
 
         satz_a, satz_b, satz_c, satz_d = get_corners(da.deg2rad(satz1))
 
-        c_exp, c_ali = compute_expansion_alignment(satz_a, satz_b, satz_c, satz_d)
+        c_exp, c_ali = compute_expansion_alignment(satz_a, satz_b, satz_c, satz_d, self.cscan_width)
 
         x, y = self.get_coords(scans)
         i_rs, i_rt = da.meshgrid(x, y)

djhoese · 2022-05-25T12:15:52Z

@mraspaud with that change things are much better:

Note the 1e-5 on the latitude difference and the extra 0 in the longitude difference decimal.

mraspaud · 2022-05-25T12:33:35Z

Yeah, I think we're approaching numerical precision as the lons and lats are in float32

djhoese added bug question labels May 17, 2022

djhoese assigned mraspaud, adybbroe and djhoese May 17, 2022

djhoese mentioned this issue May 24, 2022

Add right-most column "extrapolation" to simple modis interpolation #40

Closed

5 tasks

djhoese mentioned this issue May 25, 2022

Fix MODIS cviirs-based interpolation #41

Merged

5 tasks

djhoese closed this as completed in #41 May 31, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

MODIS Interpolation Comparisons #39

MODIS Interpolation Comparisons #39

djhoese commented May 17, 2022

djhoese commented May 17, 2022 •

edited

Loading

mraspaud commented May 17, 2022

djhoese commented May 17, 2022

djhoese commented May 17, 2022

djhoese commented May 17, 2022

mraspaud commented May 19, 2022

djhoese commented May 24, 2022

djhoese commented May 24, 2022

mraspaud commented May 25, 2022 •

edited

Loading

djhoese commented May 25, 2022

mraspaud commented May 25, 2022

MODIS Interpolation Comparisons #39

MODIS Interpolation Comparisons #39

Comments

djhoese commented May 17, 2022

djhoese commented May 17, 2022 • edited Loading

mraspaud commented May 17, 2022

djhoese commented May 17, 2022

djhoese commented May 17, 2022

djhoese commented May 17, 2022

mraspaud commented May 19, 2022

djhoese commented May 24, 2022

djhoese commented May 24, 2022

mraspaud commented May 25, 2022 • edited Loading

djhoese commented May 25, 2022

mraspaud commented May 25, 2022

djhoese commented May 17, 2022 •

edited

Loading

mraspaud commented May 25, 2022 •

edited

Loading