AHI HRIT reader has gotten slower #1384

Closed
pnuu opened this issue Oct 1, 2020 · 44 comments · Fixed by #1986

Comments


pnuu commented Oct 1, 2020

Describe the bug
Compared to older versions of Satpy, the ahi_hrit reader seems to have become much slower.

To Reproduce

  1. Create an environment with Satpy 0.19.1.
  2. Create an environment with Satpy 0.23.0.
  3. Compare the timings when creating the same set of composites, e.g. by running the script and composite definition below. The built-in composite needs to be overridden, as it isn't consistent across these versions.
#!/usr/bin/env python

import glob
import os
os.environ['PYTROLL_CHUNK_SIZE'] = "1024"
os.environ['DASK_NUM_WORKERS'] = "2"
os.environ['OMP_NUM_THREADS'] = "1"
import time

from satpy import Scene


def main():
    fnames = glob.glob(
        "/home/lahtinep/data/satellite/geo/himawari-8/IMG*20201001070*")
    comps = ["truecolor",]
    glbl = Scene(reader='ahi_hrit', filenames=fnames)
    glbl.load(comps)
    glbl.save_datasets(base_dir='/tmp/')


if __name__ == "__main__":
    tic = time.time()
    main()
    print("Elapsed time:", time.time() - tic)

in $PPP_CONFIG_DIR/composites/ahi.yaml

  truecolor:
    compositor: !!python/name:satpy.composites.SelfSharpenedRGB
    prerequisites:
    - name: B03
      modifiers: [sunz_corrected, rayleigh_corrected]
    - name: green
    - name: B01
      modifiers: [sunz_corrected, rayleigh_corrected]
    high_resolution_band: red
    standard_name: truecolor

in $PPP_CONFIG_DIR/enhancements/ahi.yaml

  truecolor:
    standard_name: truecolor
    operations:
    - name: cira_stretch
      method: !!python/name:satpy.enhancements.cira_stretch

Expected behavior
Similar timings for the old and new versions. The other chains I've upgraded across the same versions have actually gotten faster. For the full set of composites I got processing times (via Trollflow) of around 10 minutes. With the new version(s) the process crashes after around 28 minutes when it runs out of memory.

Actual results
For Satpy 0.19.1: 49.9 s
For Satpy 0.23.0: 59.2 s

Environment Info:

  • OS: Linux
  • Satpy Versions: 0.19.1 and 0.23.0
  • Pyspectral Versions: 0.9.2 and 0.10.0

pnuu commented Oct 1, 2020

The only differences I see in the hrit_jma.py file are the addition of the start_time and end_time properties and the addition of acq_time to the coordinates. The scanline time parsing itself looks quite fast, so it can't be that. But was there something about setting .coords that triggers computation of the resulting xr.DataArray?
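
A quick self-contained check (generic xarray/dask code, not the reader itself; values made up) suggests that assigning a coordinate to a dask-backed DataArray does not by itself trigger computation:

import dask.array as da
import numpy as np
import xarray as xr

# A lazy dask-backed DataArray, similar in spirit to what the file handler returns
arr = xr.DataArray(da.zeros((3, 4), chunks=2), dims=("y", "x"))
acq_time = np.array(
    ["2020-10-01T07:00:00", "2020-10-01T07:00:01", "2020-10-01T07:00:02"],
    dtype="datetime64[ns]")
# Attach the per-scanline acquisition times as a coordinate
arr = arr.assign_coords(acq_time=("y", acq_time))
print(type(arr.data))  # still a dask array, so nothing was computed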


pnuu commented Oct 1, 2020

No, no extra computations. I used a custom scheduler to check that, and both 0.19.1 and 0.23.0 call da.compute() 4 times.
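
For reference, a minimal sketch of that kind of check (my actual custom scheduler may have differed in details): wrap the threaded scheduler in a callable that counts its invocations.

import dask
import dask.threaded

compute_count = 0

def counting_scheduler(dsk, keys, **kwargs):
    """Count scheduler invocations, then delegate to the default threaded scheduler."""
    global compute_count
    compute_count += 1
    print(f"Compute() called {compute_count} times")
    return dask.threaded.get(dsk, keys, **kwargs)

# Activate for the whole run, e.g. at the top of the benchmark script:
# dask.config.set(scheduler=counting_scheduler)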


djhoese commented Oct 1, 2020

Are we sure the reader is slower? Maybe the compositing? What about dask versions? Same input storage device? Same timings after repeated executions?


pnuu commented Oct 1, 2020

I'm starting to lean towards compositing. With simple composites like natural_color the new version is somewhat faster, also for ahi_hrit, but when creating four composites using DayNightCompositor the computation time increases from 115 s to 160 s.

The weird thing is that the compositing is slow only with the ahi_hrit reader. For example seviri_l1b_hrit, which uses the same base class, has gotten faster between these two versions. The same goes for the other chains I've upgraded, using the viirs_compact, avhrr_l1b_aapp and abi_l1b readers.

With 0.19.1 the support library versions are whatever were current in early November last year.

@simonrp84

Maybe it's specific to the composite being used?
Your AHI composite has rayleigh_corrected as a modifier; does your SEVIRI composite? The default ones on GitHub don't have the Rayleigh correction.
Not sure how much that helps, as the Rayleigh correction has been around for many years, but maybe it'll point things in the right direction.


pnuu commented Oct 1, 2020

I ran the "four Day/Night composites" case again with the custom scheduler. The old version calls compute() four times (total time 136 s):

[INFO: 2020-10-01 15:53:30 : satpy.composites] Getting reflective part of B07
Compute() called 1 times
Compute() called 2 times
Compute() called 3 times

[INFO: 2020-10-01 15:53:50 : satpy.writers] Computing and writing results...
Compute() called 4 times

For the new version compute() is called only once (187 s):

[INFO: 2020-10-01 15:56:24 : satpy.writers] Computing and writing results...
Compute() called 1 times

So I'm now thinking it's something in Pyspectral. Fancy indexing or something like that?


pnuu commented Oct 1, 2020

Maybe it's specific to the composite being used?
Your AHI composite has rayleigh_corrected as a modifier, does your SEVIRI composite? The default ones on github don't have the Rayleigh correction.

The GOES/ABI truecolor has the same modifiers applied without a slowdown; the SEVIRI composites don't have Rayleigh.


pnuu commented Oct 2, 2020

Here are the profiles Dask diagnostics give.

With 0.19.1 (Pyspectral 0.9.2):
[dask profiler graph for Satpy 0.19.1]

With 0.23.0 (Pyspectral 0.10.0):
[dask profiler graph for Satpy 0.23.0]

Two things I notice with 0.23.0 are the roughly 1 GB higher memory usage and the much longer stretch of single-CPU usage after about 80 s has elapsed.


pnuu commented Oct 2, 2020

In the above graphs the colorful bits at the end are calls to store_chunk(). So for some reason the new installation (rasterio 0.1.6, gdal 3.1.2) takes much more time to perform these calls, and they happen in a single thread. The older installation has rasterio 1.1.0 and gdal 3.0.2.

In both cases the input data are read from the same location (NAS via NFS) and the results are also saved to the same location (local /tmp/).


mraspaud commented Oct 2, 2020

You mention gdal and rasterio; could you try taking them out of the loop by e.g. writing a PNG file instead?


mraspaud commented Oct 2, 2020

Also, you show that the reflective part of B07 is used; that's not for true color, right?


pnuu commented Oct 2, 2020

Also, you show that the reflective part of B07 is used; that's not for true color, right?

That is for the later case with the four Day/Night composites, which also exhibit the slowness.

I'm now testing saving to PNG, results in a bit.


pnuu commented Oct 2, 2020

And here are the profile graphs when saving with glbl.save_datasets(base_dir='/tmp/', writer='simple_image', filename='{name}_{start_time:%Y%m%d_%H%M%S}.png').

Satpy 0.19.1 (142 seconds):
[dask profiler graph for Satpy 0.19.1, simple_image writer]
Satpy 0.23.0 (132 seconds):
[dask profiler graph for Satpy 0.23.0, simple_image writer]


pnuu commented Oct 2, 2020

So: rasterio slows down saving with the new Satpy (for AHI HRIT only). The memory usage is the same for both saving methods, with the new Satpy using 1 GB more RAM.


mraspaud commented Oct 2, 2020

Another thing I'm thinking about: the size of the resulting images is the same, right? So it's not compression or something...


pnuu commented Oct 2, 2020

There are some minor differences in the filesizes, but nothing really pronounced:

$ ls -l /tmp/*20201001* | awk '{print $5, $8;}'
57151471 /tmp/cloud_phase_with_dust_20201001_080000.tif
58287673 /tmp/cloud_phase_with_dust_20201001_080020.tif
58448131 /tmp/convective_storms_with_24h_microphysics_20201001_080000.tif
58444157 /tmp/convective_storms_with_24h_microphysics_20201001_080020.tif
56867602 /tmp/day_microphysical_with_night_microphysical_20201001_080000.tif
55772658 /tmp/day_microphysical_with_night_microphysical_20201001_080020.tif
64940820 /tmp/day_solar_with_ash_20201001_080000.tif
64025212 /tmp/day_solar_with_ash_20201001_080020.tif

The first file of each pair is from Satpy 0.19.2, the second (with seconds in the filename) from Satpy 0.23.0.


pnuu commented Oct 2, 2020

Looking at the images, I think the file size differences are due to minute differences in the blending of the Day/Night composites, and to the change in NIR reflectance cutting (pytroll/pyspectral#112).


pnuu commented Oct 3, 2020

On Slack @mraspaud noticed that commenting out the self.start_time and self.end_time properties from hrit_jma.py::HRITJMAFileHandler reduced the processing time, and there were some thoughts that the differing times of the segments might cause problems in the metadata combination code.

I did another dask profiler test by commenting out only the start_time property. The resulting CPU/memory graphs are below.

Satpy 0.23.0:
[dask profiler graph with the start_time property in place]

Satpy 0.23.0 with the start_time property removed:
[dask profiler graph with the start_time property removed]

So indeed, the start_time property using acq_time is causing the increase in processing time and memory.

These graphs were made on my work laptop, which has more memory than the server all the previous graphs were made on, so apart from the peak memory use these results are not comparable with the earlier ones.


djhoese commented Oct 3, 2020

I haven't been completely following this, but are the start time and end time produced by the file handler single datetime objects, or are they a single (scalar) array of type object or type datetime?


pnuu commented Oct 3, 2020

The start_time and end_time properties of the base class return single datetime objects. The versions in hrit_jma.py also return single datetime objects, but those are obtained from an array of np.datetime64 values and converted to datetime.
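
For illustration, that conversion (with a made-up acquisition time value) looks roughly like this:

from datetime import datetime
import numpy as np

# Per-scanline acquisition times as stored by the reader (value made up)
acq_time = np.array(["2016-10-07T03:00:20.131200"], dtype="datetime64[us]")
start_time = acq_time[0].astype(datetime)
print(type(start_time), start_time)  # <class 'datetime.datetime'> 2016-10-07 03:00:20.131200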


pnuu commented Oct 7, 2020

Currently, with acq_time used for start_time, the file times for the IMG*001 segments of the different channels covering the same area are as follows (sorted by time):

2016-10-07 03:00:20.131200
2016-10-07 03:00:20.131200
2016-10-07 03:00:20.131200
2016-10-07 03:00:20.304000
2016-10-07 03:00:20.304000
2016-10-07 03:00:20.304000
2016-10-07 03:00:20.476800
2016-10-07 03:00:20.476800
2016-10-07 03:00:20.476800
2016-10-07 03:00:20.563200
2016-10-07 03:00:20.563200
2016-10-07 03:00:20.563200
2016-10-07 03:00:20.563200
2016-10-07 03:00:20.563200
2016-10-07 03:00:20.563200
2016-10-07 03:00:20.822400
2016-10-07 03:00:20.822400
2016-10-07 03:00:20.822400
2016-10-07 03:00:20.908800
2016-10-07 03:00:20.908800
2016-10-07 03:00:20.908800
2016-10-07 03:00:20.908800
2016-10-07 03:00:20.908800
2016-10-07 03:00:20.908800
2016-10-07 03:00:21.168000
2016-10-07 03:00:21.168000
2016-10-07 03:00:21.168000
2016-10-07 03:00:21.254400
2016-10-07 03:00:21.254400
2016-10-07 03:00:21.254400
2016-10-07 03:00:21.427200
2016-10-07 03:00:21.427200
2016-10-07 03:00:21.427200
2016-10-07 03:00:21.513600
2016-10-07 03:00:21.513600
2016-10-07 03:00:21.513600
2016-10-07 03:00:21.600000
2016-10-07 03:00:21.600000
2016-10-07 03:00:21.600000

With the older version where the nominal time is used, all of these had start_time 2016-10-07 03:00:00.


pnuu commented Oct 8, 2020

A couple more tests:

Remove fractional seconds:

start_time = self.acq_time[0].astype(datetime)
return start_time - dt.timedelta(microseconds=start_time.microsecond)

-> still slow and uses more memory

Remove seconds and fractional seconds:

start_time = self.acq_time[0].astype(datetime)
return start_time - dt.timedelta(seconds=start_time.second, microseconds=start_time.microsecond)

-> faster and lower memory usage.

This verifies that accessing the acquisition times isn't the culprit; it's something else.

I'll see what happens in the satpy.yaml_reader.FileYAMLReader.start_time property, which does return min(x[0].start_time for x in self.file_handlers.values()), with different versions.


pnuu commented Oct 8, 2020

Interesting. Doing (removing seconds and their fractions)

start_time = min(x[0].start_time for x in self.file_handlers.values())
start_time = start_time - dt.timedelta(seconds=start_time.second, microseconds=start_time.microsecond)
return start_time

in the satpy.yaml_reader.FileYAMLReader.start_time property doesn't help. So the slow and memory-hungry bit happens somewhere between the file handler (hrit_jma.py) code and this use of the property.


pnuu commented Oct 8, 2020

As far as I can see, all of this happens before the dependency tree is initialized. I'm running out of ideas to test.


pnuu commented Oct 8, 2020

Next test: memory profiling satpy.Scene.__init__(). No change in memory use at this point between self.acq_time[0].astype(datetime) and self._start_time.


pnuu commented Oct 8, 2020

Memory profiling Scene.save_datasets() didn't reveal anything that would hint at the root cause (delayed processing and all that), but shows nicely the increase in memory use:

Using self.acq_time[0].astype(datetime)

1369 5113.441 MiB 4928.590 MiB           return writer.save_datasets(datasets, compute=compute, **save_kwargs)

Using self._start_time

1369 3813.168 MiB 3628.602 MiB           return writer.save_datasets(datasets, compute=compute, **save_kwargs)

The number fields are: code line number / memory usage / increase in memory usage for that line.
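
That format matches the line-by-line report of the memory_profiler package (an assumption about the exact tool used here); a minimal way to produce such a report looks like this:

from memory_profiler import profile  # pip install memory-profiler

from satpy import Scene


@profile  # prints line number / memory usage / increment for each line of the function
def process(fnames):
    scn = Scene(reader="ahi_hrit", filenames=fnames)
    scn.load(["truecolor"])
    scn.save_datasets(base_dir="/tmp/")

# Running the script normally prints the decorated function's report when it returns.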


pnuu commented Oct 8, 2020

Interestingly, using writer='simple_image' reverses the memory consumption between the two cases.


pnuu commented Nov 12, 2020

Some more testing. If I load only plain channels without any modifiers, there is practically no difference in processing time. So now I'm suspecting that the time-dependent modifiers (sunz_corrected, rayleigh_corrected) have something to do with the slowdown.


pnuu commented Dec 13, 2021

Another test. Replacing

    @property
    def start_time(self):
        """Get start time of the scan."""
        return self.acq_time[0].astype(datetime)

with a static time identical to that of the Scene I'm testing:

    @property
    def start_time(self):
        """Get start time of the scan."""
        return datetime(2021, 12, 8, 6, 0, 20)

reduces the memory usage. So it isn't the "extra" seconds, compared to the start_time derived from the filename, that cause the problem.

Doing self._start_time = times64[0].astype(datetime) in _get_acq_time() and using that in the start_time property is again slower and uses more memory. The same slowness happens when using self._start_time = times64[0].copy().astype(datetime) to get rid of potential NumPy references in the Dask graph.


pnuu commented Jan 19, 2022

Ok. Next hypothesis: each channel has a slightly different start_time, sometimes differing only on the order of milliseconds. This might cause the SZA correction to be computed separately for each VIS channel.


pnuu commented Jan 19, 2022

Yes! Removing the composites where sunz_corrected is used also removes the increase in memory usage. In this case it doesn't matter whether the start_time property is commented out or left as it currently is.

satpy.modifiers.SunZenithCorrector calls satpy.modifiers.angles.get_cos_sza(), which uses the start_time of the given dataset. I'll test whether rounding the time to a full second/minute changes things.
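
A sketch of what that rounding could look like (hypothetical helper, not existing Satpy code):

import datetime as dt

def round_to_minute(t):
    """Round a datetime to the closest full minute."""
    discard = dt.timedelta(seconds=t.second, microseconds=t.microsecond)
    rounded = t - discard
    if discard >= dt.timedelta(seconds=30):
        rounded += dt.timedelta(minutes=1)
    return rounded

print(round_to_minute(dt.datetime(2016, 10, 7, 3, 0, 20, 131200)))  # 2016-10-07 03:00:00

# The cache key in SunZenithCorrector could then use the rounded time, e.g.
# key = (round_to_minute(vis.attrs["start_time"]), area_name)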


pnuu commented Jan 19, 2022

Adjusting the cache key creation in SunZenithCorrector from

        key = (vis.attrs["start_time"], area_name)

to

        key = (0, 1)

should always re-use the cached coszen array, but my debug printing shows it is always recomputed. Adjusting the start_time property still has the desired effect even though the prints show repeated computations. I'm again getting lost 😅


djhoese commented Jan 19, 2022

How are you determining that things are being recomputed?


pnuu commented Jan 19, 2022

Just by my debug prints in the if/elif/else blocks. The code should go to the else branch (my addition for debug printing) when the coszen data are available from the cache, but the value is always None and the code goes to the if block where satpy.modifiers.angles.get_cos_sza() is called. So the cache is always returning None, no matter what the cache key is.

I'm getting too tired now for further work, but I'll continue tomorrow if there's nothing more urgent happening.


djhoese commented Jan 19, 2022

I wonder if the WeakValueDictionary used for the cache is no longer doing anything useful after various satpy refactorings and new versions of python.

Either way, I asked about computing because there is a chance that if get_cos_sza() produces the same set of dask tasks then it should be alright. There is a chance that the default names dask generates for the dask arrays are too unique, and later on when it goes to optimize the dask graph it doesn't realize that the tasks are the same. For some of these tasks we may be able to generate the name manually to ensure that this doesn't happen. So what I wanted to make sure of by asking my question was whether the dask array is being generated multiple times (multiple calls to get_cos_sza) or whether it is being computed multiple times (multiple equal arrays being generated during .compute()). I see that you mean the former.
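
A toy illustration of the naming point (plain dask, not Satpy code, and not necessarily how get_cos_sza behaves):

import dask.array as da

a = da.ones((4,), chunks=2)
b = da.ones((4,), chunks=2)
print(a.name == b.name)  # True: the name is a hash of the inputs, so identical calls share tasks

c = da.random.random((4,), chunks=2)
d = da.random.random((4,), chunks=2)
print(c.name == d.name)  # False: each call gets a unique name, so nothing can be reused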


pnuu commented Jan 19, 2022

Printing the size of self.coszen_cache at the beginning of each call to SunZenithCorrector prints 0 and at the end 1. So the cache seems to get cleared between correcting different channels.


djhoese commented Jan 19, 2022

That will likely happen when there are no more handles (reference counts) on the value returned by the dictionary. If we are only using the dask array (which it looks like we are) then maybe when the DataArray version of the cza result is garbage collected it is also removed from the WeakValueDictionary cache.
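
A self-contained demonstration of that behaviour (plain Python; the class below is just a stand-in for the cached cos(SZA) array):

import gc
import weakref


class CosZen:
    """Stand-in for the cached cos(SZA) DataArray (hypothetical)."""


cache = weakref.WeakValueDictionary()
value = CosZen()
cache[("2020-10-01 08:00:20", "some_area")] = value
print(len(cache))  # 1 while a strong reference to `value` exists

del value          # drop the last strong reference
gc.collect()       # immediate in CPython anyway
print(len(cache))  # 0: the entry vanished, i.e. the cache "cleared itself"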


pnuu commented Jan 20, 2022

Yep, the WeakValueDictionary got cleared after use, as all the calls were considered unique (different start_times). Changing the caching to a normal dictionary shows the accumulation of cached cossza arrays, and further rounding the time used in the cache key shows that the cached cossza is re-used. The processing time and memory use are not affected, though 🤦‍♂️.


pnuu commented Jan 20, 2022

A simple workaround would be to add a reader kwarg use_exact_start_time=True to the file handler and either return the time with microsecond resolution (the current default) or self._start_time, which uses the datetime parsed from the filename. But I think the better choice would be to round the cached data time to the closest minute(?), similar to how the angles are rounded to the closest 0.5 degrees (or something) when caching on disk.
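
A minimal sketch of the kwarg idea (attribute and kwarg names taken from this comment; the actual implementation in the PR may differ):

    @property
    def start_time(self):
        """Get start time of the scan."""
        # self.use_exact_start_time would be set from the reader kwarg in __init__ (hypothetical)
        if self.use_exact_start_time:
            # Exact per-channel scanline time with microsecond resolution (current behaviour)
            return self.acq_time[0].astype(datetime)
        # Nominal time parsed from the filename, identical for all channels and segments
        return self._start_time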


pnuu commented Jan 20, 2022

I went the easy way and made a PR using the reader_kwarg approach.

@mraspaud

Rounding to the closest second would work here, right? We could even have a flag for the caching that says how much time difference is acceptable.


pnuu commented Jan 20, 2022

I tried that, but evidently didn't find all the places where data_arr.attrs['start_time'] is used, as adding rounding to those places didn't help. And those places were plentiful, so I couldn't think of a single place, other than the reader, where the rounding could have been done instead of doing it in several places.


djhoese commented Jan 20, 2022

So making sure that the cossza is reused didn't affect performance? And we're still not sure where/how things are going wrong?


pnuu commented Jan 20, 2022

No, it didn't affect performance. Only using the nominal time lowered the memory usage and made the whole processing go faster.
