
Optimise the FCI L1c/netcdf_utils by introducing on-demand variables collection and caching #2237

Merged — 20 commits merged into pytroll:main from ameraner:feature_optimise_fcil1c on Dec 1, 2022

Conversation

@ameraner (Member) commented Oct 18, 2022

A first attempt at optimising the NetCDF4FileHandler for faster FCI L1c loading.

Idea: skip the size-based caching at Scene initialisation (cache_var_size=0), and instead read and cache variables as xarray/numpy objects on demand, when they are first needed.

Also only required variables/attributes (listed in the reader YAML) are collected. This reduces the number of collected variables/attributes from 93880 to 2070.
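The on-demand pattern can be sketched like this (a minimal, hypothetical illustration — the class and method names are made up here and this is not the actual satpy implementation):

```python
import numpy as np


class OnDemandNetCDFCache:
    """Minimal sketch of on-demand variable caching; names are hypothetical."""

    def __init__(self):
        # Nothing is collected or cached at init time (the cache_var_size=0 idea).
        self.cached_file_content = {}

    def _read_variable(self, var_path):
        # Stand-in for actually reading `var_path` from the netCDF file.
        return np.arange(3)

    def get_and_cache(self, var_path):
        # Read the variable on first access only, then serve it from the cache.
        if var_path not in self.cached_file_content:
            self.cached_file_content[var_path] = self._read_variable(var_path)
        return self.cached_file_content[var_path]
```

The first access pays the file-read cost; every later access for the same path returns the cached object.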

This PR also included a number of experiments aimed at solving the single-threadedness reported in #2186, so we'll be closing that issue with this PR:

  • create file handlers in parallel: slower than single-threaded

  • read datasets (scn.load() call) in parallel: slower than single-threaded

  • Closes #2186 (FCI L1c reader is single threaded)

  • Tests added

  • Fully documented
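For reference, the parallel file-handler creation tested above was along these lines (a hedged sketch using concurrent.futures; create_file_handler is a hypothetical stand-in). It measured slower than the plain loop, plausibly because the underlying netCDF4/HDF5 file access is serialised by a library-level lock:

```python
from concurrent.futures import ThreadPoolExecutor


def create_file_handler(filename):
    # Stand-in for building one file handler; in the real reader this is
    # dominated by netCDF4.Dataset(filename, 'r').
    return {"filename": filename}


filenames = ["chunk_0001.nc", "chunk_0002.nc", "chunk_0003.nc"]

# Parallel variant of the per-file loop; in the PR measurements this came
# out slower than the single-threaded version.
with ThreadPoolExecutor(max_workers=4) as executor:
    handlers = list(executor.map(create_file_handler, filenames))
```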

@ameraner (Member, Author) commented Oct 18, 2022

Testing code snippet, using one repeat cycle (RC) of data from the latest test data (download link from the EUM sftp):

import time
import warnings
from glob import glob

import numpy as np
from satpy import Scene

warnings.filterwarnings("ignore", category=UserWarning, module=r'.*crs')
warnings.filterwarnings("ignore", category=RuntimeWarning, module=r'.*dask')

# debug_on()

path_to_testdata = 'path_to_test_data/RCRC0072/'

scn_time = []
load_time = []
save_time = []
total_time = []

channels = [
    'vis_06',
    'vis_08',
    'ir_105',
    'ir_87',
]

n = 5

for i in range(n):
    print(i)
    scene_start_time = time.time()
    scn = Scene(filenames=glob(path_to_testdata + "*BODY*.nc"), reader='fci_l1c_nc')
    scn_elapsed = time.time() - scene_start_time
    scn_time.append(scn_elapsed)
    print(f"Done. It took {scn_elapsed:.2f}s to start Scene.")

    load_start_time = time.time()
    scn.load(channels, upper_right_corner='NE')
    load_elapsed = time.time() - load_start_time
    load_time.append(load_elapsed)
    print(f"Done. It took {load_elapsed:.2f}s to load datasets.")

    save_start_time = time.time()
    scn.save_datasets()
    save_elapsed = time.time() - save_start_time
    save_time.append(save_elapsed)
    print(f"Done. It took {save_elapsed:.2f}s to save datasets.")

    total_elapsed = time.time() - scene_start_time
    total_time.append(total_elapsed)
    print(f"Done. It took {total_elapsed:.2f}s for all ops.")

    del scn

print("*" * 50)
print(f"Done. It took {np.mean(scn_time):.2f}s to start Scene.")
print(f"Done. It took {np.mean(load_time):.2f}s to load datasets.")
print(f"Done. It took {np.mean(save_time):.2f}s to save datasets.")
print(f"Done. It took {np.mean(total_time):.2f}s for all ops.")

@ameraner (Member, Author) commented Oct 18, 2022

From the code above, the average stats for loading and saving two VIS and two IR channels are:
Before this PR (reverting to cache_var_size=10000):

Done. It took 10.15s  to start Scene.
Done. It took 4.72s  to load datasets.
Done. It took 13.03s  to save datasets.
Done. It took 27.90s  for all ops.

after this PR:

Done. It took 6.21s  to start Scene.
Done. It took 6.19s  to load datasets.
Done. It took 13.08s  to save datasets.
Done. It took 25.49s  for all ops.

So: we shave some time off the Scene initialisation thanks to the skipped caching, loading the datasets takes a bit longer (the variables now have to be read on demand), and in total we save a couple of seconds.

@ameraner (Member, Author) commented Oct 18, 2022

Note that almost all of the remaining Scene initialisation time now comes from the file opening here:

file_handle = netCDF4.Dataset(self.filename, 'r')

@codecov bot commented Nov 15, 2022

Codecov Report

Merging #2237 (563e303) into main (8c1ccdd) will increase coverage by 0.21%.
The diff coverage is 97.79%.

@@            Coverage Diff             @@
##             main    #2237      +/-   ##
==========================================
+ Coverage   94.13%   94.35%   +0.21%     
==========================================
  Files         293      310      +17     
  Lines       45079    46554    +1475     
==========================================
+ Hits        42437    43926    +1489     
+ Misses       2642     2628      -14     
Flag            Coverage Δ
behaviourtests  4.59% <0.00%> (-0.09%) ⬇️
unittests       94.99% <97.79%> (+0.20%) ⬆️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
satpy/readers/netcdf_utils.py 98.32% <96.25%> (-1.68%) ⬇️
satpy/readers/fci_l1c_nc.py 98.19% <100.00%> (+0.02%) ⬆️
satpy/tests/reader_tests/test_fci_l1c_nc.py 100.00% <100.00%> (ø)
satpy/tests/reader_tests/test_netcdf_utils.py 96.12% <100.00%> (+1.22%) ⬆️
satpy/readers/seadas_l2.py 96.84% <0.00%> (-1.28%) ⬇️
satpy/tests/test_resample.py 88.90% <0.00%> (-0.37%) ⬇️
satpy/composites/ahi.py 100.00% <0.00%> (ø)
satpy/readers/hrit_jma.py 97.94% <0.00%> (ø)
satpy/tests/test_utils.py 100.00% <0.00%> (ø)
satpy/readers/yaml_reader.py 97.50% <0.00%> (ø)
... and 48 more


@coveralls commented Nov 15, 2022

Coverage Status

Coverage increased (+0.2%) to 94.946% when pulling 563e303 on ameraner:feature_optimise_fcil1c into e7d24a3 on pytroll:main.

@pnuu (Member) commented Nov 17, 2022

I'm not happy with how I implemented the channel name replacement/expansion, because it is completely based on the FCI reader and file structure. I think it should be done in a more generic way, maybe using a dictionary with a "replace the key with each item in the given list" kind of approach.

@djhoese (Member) commented Nov 17, 2022

Can you use Python format strings?

@pnuu pnuu marked this pull request as ready for review November 18, 2022 07:37
@pnuu (Member) commented Nov 18, 2022

Can you use python format strings?

Oh yeah, .format() is cleaner than string replacement. Done in f0cda99
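For illustration, the .format()-based expansion looks roughly like this (the template path below is hypothetical; the real YAML entries differ):

```python
# Hypothetical templated variable path as it might be listed in the reader
# YAML; one concrete path is produced per channel via str.format() instead
# of string replacement.
template = "data/{channel_name}/measured/effective_radiance"
channels = ["vis_06", "vis_08", "ir_87", "ir_105"]

variable_paths = [template.format(channel_name=channel) for channel in channels]
```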

@mraspaud (Member) left a comment

Looks good, thanks a lot for the speedup! The caching function doesn't seem to be well tested according to codecov? Otherwise just a couple of questions/comments.

Review threads on satpy/readers/fci_l1c_nc.py and satpy/readers/netcdf_utils.py (resolved)
@ameraner ameraner changed the title Optimise the FCI L1c/netcdf_utils by introducing on-demand caching Optimise the FCI L1c/netcdf_utils by introducing on-demand variables collection and caching Nov 23, 2022
@djhoese djhoese added enhancement code enhancements, features, improvements component:readers optimization labels Nov 23, 2022
@pnuu (Member) commented Nov 24, 2022

I upgraded my local netCDF4 to the same version as in CI (1.6.2), and the test_netcdf_utils tests pass.

ameraner added a commit to ameraner/satpy that referenced this pull request Nov 25, 2022
…ter during get_dataset

uses the caching trick as discussed in pytroll#2237 (comment)
@pnuu (Member) commented Nov 28, 2022

I got the tests failing locally as well when I run all the reader tests. If I go back to 17db0de, the tests pass.

@pnuu (Member) commented Nov 28, 2022

So:

  • decorating the method works in the tests, but causes a memory leak
  • using the trick in the video or here works when running the code, but causes tests to fail if all the reader tests are run; running only the tests in test_netcdf_utils.py works fine

@pnuu (Member) commented Nov 28, 2022

Going back to the original caching with dictionaries; I don't want to spend too much time wondering why the functools approach causes netCDF4 to break.
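On the memory-leak point: functools.lru_cache applied to a method is a known pitfall, because the cache holds a reference to self and so keeps every instance alive for the lifetime of the process. A per-instance dictionary is collected together with the instance. A minimal sketch (not the satpy code; the class names are made up):

```python
import functools


class LeakyHandler:
    # lru_cache on a method stores `self` as part of the cache key in a
    # cache that outlives the instance, so handlers are never freed.
    @functools.lru_cache(maxsize=None)
    def get_variable(self, name):
        return name.upper()


class DictCachedHandler:
    # A plain per-instance dictionary avoids the leak: the cache dies
    # together with the instance.
    def __init__(self):
        self._cache = {}

    def get_variable(self, name):
        if name not in self._cache:
            self._cache[name] = name.upper()
        return self._cache[name]
```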

@pnuu (Member) commented Nov 28, 2022

I added a couple of tests for get_and_cache_npxr() as @mraspaud requested.

If everything seems fine for merging, please let's do that. We can then continue the work in the next PRs and inspect the caching and possible test side effects on a more stable base (one not being changed all the time by me).

@ameraner (Member, Author) commented Nov 28, 2022

I've run a few more tests, and added a few more usages of get_and_cache_npxr in the code. Here are the results (for the test code above, now on the EWC machine):

                    main    first commit (40d1d67)   custom var collect (c7a4c3d)   expand get_and_cache usage (528ee31)
Scene (s)           6.90    4.08                     3.76                           3.81
load (s)            3.10    3.91                     3.80                           2.70
save_datasets (s)   12.38   12.29                    12.38                          12.34
Total (s)           22.38   20.28                    19.94                          18.85
Memory (MiB)        3070    3058                     2974                           2998

This is for the case of local data, with the maximum memory usage monitored with the profile decorator from memory_profiler.
In summary:

  • the first commit reduces the Scene init by deactivating the caching, and increases loading time as the variables need to be read from scratch. Memory is reduced due to less variable caching.
  • the custom variable collection further reduces the Scene init time by reducing the variables and metadata collection - this is particularly useful for the h5netcdf/remote reading case, as seen in other tests. Memory is reduced due to less variables and metadata collection.
  • the last commit adds more variables to the caching mechanism, most importantly the x/y geolocation 1-d arrays, reducing loading time. Memory is slightly increased again.
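As an aside, a dependency-free way to get a comparable peak-memory figure is the stdlib tracemalloc module (a sketch with a toy workload; this is not what produced the table above, which used memory_profiler):

```python
import tracemalloc


def workload():
    # Stand-in for the Scene creation / load / save_datasets sequence.
    return [bytearray(1024) for _ in range(100)]


tracemalloc.start()
workload()
current_bytes, peak_bytes = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"peak traced memory: {peak_bytes / 1024 ** 2:.2f} MiB")
```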

@mraspaud (Member) left a comment

LGTM, thanks a lot for optimising the reader!


Successfully merging this pull request may close these issues.

FCI L1c reader is single threaded
5 participants