Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AHI HSD true_color incorrect with cache_sensor_angles #2010

Closed
Plantain opened this issue Feb 6, 2022 · 20 comments · Fixed by #2013
Closed

AHI HSD true_color incorrect with cache_sensor_angles #2010

Plantain opened this issue Feb 6, 2022 · 20 comments · Fixed by #2013

Comments

@Plantain
Copy link

Plantain commented Feb 6, 2022

Expected behaviour:
Imagery should be bit-for-bit, or at least visually indistinguishable with cache_sensor_angles=True or False

Actual Behaviour:
Very different imagery, including a large seam line
True
sensor
False
no_sensor

Sample code

import os
os.environ['PYTROLL_CHUNK_SIZE'] = "1100"
os.environ['DASK_NUM_WORKERS'] = "4"
os.environ['OMP_NUM_THREADS'] = "8"

import satpy
satpy.config.set(cache_sensor_angles=True)
satpy.config.set(cache_lonlats=False)
satpy.config.set(cache_dir="/tmp/")

from satpy.resample import get_area_def
import glob
from satpy import Scene

def satpy(fnames, outname, reader, composite):
    scn = Scene(reader=reader, filenames = fnames)
    scn.load([composite], generate=False)
    scn = scn.resample(scn.finest_area(),resampler='native')
    scn.save_dataset(composite, filename=outname, write='geotiff',  compress="DEFLATE", num_threads="32", tiled='YES')

def himawari():
    filenames = glob.glob('*FLDK*')
    outname = 'himawari.tiff'
    satpy(filenames, outname, 'ahi_hsd', 'true_color')

himawari()

Environment Info:
satpy git
pyresample git
Ubuntu 20.04

@djhoese
Copy link
Member

djhoese commented Feb 6, 2022

I've started looking at this, but I have a couple questions. Is there a reason you don't cache lonlats while at the same time cache the sensor angles? Could you past the list of filenames your glob is producing?

@djhoese
Copy link
Member

djhoese commented Feb 6, 2022

Good news is I'm able to reproduce this:

image

@djhoese
Copy link
Member

djhoese commented Feb 6, 2022

So this line seems to be at exactly 140 degrees East longitude. I don't see a clear horizontal line and I also don't see this line go all the way up the image. This makes me wonder if it is some kind of divide by zero that is happening with the arguments being rounded when given to caching.

Edit: 🤦 140.25 is the reference longitude of the satellite

@djhoese
Copy link
Member

djhoese commented Feb 6, 2022

WTF, I changed the rounding when caching to be 2 decimal digits and changed the hardcoded start time to 2022 (instead of 2000). And I get this:

image

While doing this I did notice that the individual bands involved all have a slightly different satellite height and start time.

@djhoese
Copy link
Member

djhoese commented Feb 6, 2022

Further debugging where I also see the issue:

import os
os.environ["PYTROLL_CHUNK_SIZE"] = "1100"

import satpy
from satpy import Scene, DataQuery
from glob import glob
from dask.diagnostics import ProgressBar

product = DataQuery(name="B03", modifiers=("sunz_corrected", "rayleigh_corrected"))
with ProgressBar(), satpy.config.set(cache_lonlats=False, cache_sensor_angles=True):
    scn = Scene(reader="ahi_hsd", filenames=glob("/data/satellite/ahi/hsd/2330/HS_H08_20181112_2330_B??_FLDK*_S*"))
    scn.load([product])
    new_scn = scn.resample(resampler="native")
    new_scn.save_datasets(base_dir=".")

@djhoese
Copy link
Member

djhoese commented Feb 6, 2022

The above works for B01 also (works as in produces a bad image). I did another hack where I commented out line 229-230 in satpy/modifiers/angles.py so the start_time is not overwritten. This produces a normal valid image. So this means the problem isn't the satellite location information, but rather the datetime. A little worrying.

Edit: And setting the starting date to 2000-01-02 12:00:00 (one day later) seems to work just fine. It must be some divide by zero equivalent with using the initial Greenwich mean sidereal time.

@Plantain
Copy link
Author

Plantain commented Feb 6, 2022

I've started looking at this, but I have a couple questions. Is there a reason you don't cache lonlats while at the same time cache the sensor angles? Could you past the list of filenames your glob is producing?

I was just proving it was caching of sensor angles, as my first guess was it was lat-lon related given the obvious latitudinal deliniation.

Mentions of date/time and AHI smells a bit like #1384 - I wonder if there is some connection...

@pnuu
Copy link
Member

pnuu commented Feb 7, 2022

#1384 was purely for the HRIT variant of the data, and the ahi_hsd code doesn't call it.

@djhoese
Copy link
Member

djhoese commented Feb 7, 2022

As @pnuu said, #1384 was purely HRIT, but it seems HSD suffers from a similar problem. The start times defined in the HSD data are all slightly different (I don't remember exactly how far off right now) so regular uncache generation of the "sunz_corrected" modifier is actually computing the sun angles 3 separate times for each input band. When caching is used it replaces the start time with one static time when generating sensor zenith angles but not solar zenith angles. Without caching this effects both solar and sensor angle generations, but also includes things like the satellite height which were slightly different between bands. Looking at the HSD reader it looks like there is a scheduled_time and a start_time where the scheduled_time is the nominal time the data was supposed to be observed. Looks like I modified a lot of this in #473 so time fields are averaged between segments. Bottom line, this could be a huge optimization for AHI angle generation if we could normalize/synchronize the start time.

As for this issue, I will investigate further but I think this is mainly an issue with the static hardcoded start_time used in the cached angle generation. It results in 0 showing up in a few places that I think are causing issues. I will no more in the next couple hours.

@simonrp84
Copy link
Member

@djhoese How far different are the start times?

@djhoese
Copy link
Member

djhoese commented Feb 7, 2022

In [1]: from satpy import Scene; from glob import glob
scn
In [2]: scn = Scene(reader="ahi_hsd", filenames=glob("/data/satellite/ahi/hsd/2330/HS_H08_20181112_2330_B??*FLDK*.DAT"))

In [3]: scn.load(["B01", "B02", "B03"])

In [4]: for data_arr in scn.values(): print(data_arr.attrs["start_time"])
2018-11-12 23:30:20.447583
2018-11-12 23:30:20.696706
2018-11-12 23:30:20.186442

@pnuu
Copy link
Member

pnuu commented Feb 7, 2022

That seems pretty much the same as ahi_hrit.

@djhoese
Copy link
Member

djhoese commented Feb 7, 2022

So when it comes to sensor angles the main piece of code is the gmst function in pyorbital which has this:

def gmst(utc_time):
    """Greenwich mean sidereal utc_time, in radians.

    As defined in the AIAA 2006 implementation:
    http://www.celestrak.com/publications/AIAA/2006-6753/
    """
    ut1 = jdays2000(utc_time) / 36525.0
    theta = 67310.54841 + ut1 * (876600 * 3600 + 8640184.812866 + ut1 *
                                 (0.093104 - ut1 * 6.2 * 10e-6))
    return np.deg2rad(theta / 240.0) % (2 * np.pi)

The jdays2000 is number of fractional days since January 1st, 2000 12:00:00. In the satpy angles code during caching I hardcode this to that January 1st date. This means ut1 is 0. This also means theta is 67310.54841. From what I can tell this doesn't seem like it should set the return value to anything odd or cause any other issues in the later uses of gmst, but apparently it does. My simple solution unless someone is an expert on what the second half of the theta calculations above represent is to just bump the hardcoded year to 2002 instead of 2000.

@djhoese
Copy link
Member

djhoese commented Feb 7, 2022

And its not like the returned value from gmst is way different when you change the time:

In [1]: from pyorbital.astronomy import gmst

In [2]: from datetime import datetime, timedelta

In [3]: gmst(datetime(2000, 1, 1, 12, 0, 0))
Out[3]: 4.894961212823059

In [4]: gmst(datetime(2000, 1, 1, 12, 0, 1))  # one second difference
Out[4]: 4.895034133981612

In [5]: gmst(datetime(2000, 1, 1, 13, 0, 0))  # one hour difference
Out[5]: 5.157477383614096

In [6]: gmst(datetime(2000, 1, 2, 12, 0, 0))  # one day difference
Out[6]: 4.912164004628369

@djhoese
Copy link
Member

djhoese commented Feb 7, 2022

Ugh and now I tried hardcoding the start_time to the 2020-01-0112:00:00 but use the real sat position values and that worked fine.

@djhoese
Copy link
Member

djhoese commented Feb 7, 2022

I figured it out. The caching was swapping the zarr filenames so it was returning sensor azimuth and zenith and sensor zenith as azimuth. Son of a...

@djhoese
Copy link
Member

djhoese commented Feb 7, 2022

Fixed in #2013. Let me know if you can test it and how it goes.

@mraspaud
Copy link
Member

mraspaud commented Feb 8, 2022

@Plantain ^

@djhoese
Copy link
Member

djhoese commented Feb 8, 2022

Oops. For some reason I thought @Plantain had already tested this. I guess now that it is merged into main, let me know if it is still a problem for you.

@Plantain
Copy link
Author

I can confirm this appears to fix the issue! Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants