Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make viirs-compact datasets compatible with dask distributed #1546

Merged
merged 5 commits into from Feb 16, 2021

Conversation

mraspaud
Copy link
Member

This PR makes viirs-compact datasets compatible with dask distributed.

The problem was coming from the map_blocks calls using a method of the filehandler as a callable, thus embedding all the filehandler attributes including an open h5py.File instance.

Feedback on how to test this would be very much appreciated!

Copy link
Member

@djhoese djhoese left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I think the most basic way of testing would be to do:

from dask.distributed import Client
client = Client()
# normal reader/Scene loading
scn['I01'].compute()

This should run it in a "local cluster" and I assume this fails with the current master branch otherwise this isn't much of a test. There may be issues with creating a Client in testing. I know I had some issues with this when working on the MultiScene. We might be able to steal some of dask's pytest "fixtures" but since these aren't public API we probably want to avoid this as much as possible.

@codecov
Copy link

codecov bot commented Feb 16, 2021

Codecov Report

Merging #1546 (55b4145) into master (50ddcc6) will decrease coverage by 1.10%.
The diff coverage is 88.40%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1546      +/-   ##
==========================================
- Coverage   92.09%   90.98%   -1.11%     
==========================================
  Files         251      251              
  Lines       36711    36698      -13     
==========================================
- Hits        33808    33390     -418     
- Misses       2903     3308     +405     
Flag Coverage Δ
behaviourtests 4.48% <0.00%> (+<0.01%) ⬆️
unittests 91.50% <88.40%> (-1.12%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
satpy/readers/viirs_compact.py 67.42% <88.40%> (+2.53%) ⬆️
satpy/tests/reader_tests/test_seviri_l2_grib.py 23.07% <0.00%> (-76.93%) ⬇️
satpy/tests/reader_tests/test_seviri_l2_bufr.py 30.35% <0.00%> (-69.65%) ⬇️
satpy/readers/iasi_l2_so2_bufr.py 26.66% <0.00%> (-67.96%) ⬇️
satpy/readers/ascat_l2_soilmoisture_bufr.py 30.66% <0.00%> (-66.74%) ⬇️
satpy/tests/reader_tests/test_iasi_l2_so2_bufr.py 30.98% <0.00%> (-66.20%) ⬇️
...ts/reader_tests/test_ascat_l2_soilmoisture_bufr.py 37.20% <0.00%> (-60.47%) ⬇️
satpy/readers/seviri_l2_grib.py 27.08% <0.00%> (-59.66%) ⬇️
satpy/readers/seviri_l2_bufr.py 35.36% <0.00%> (-58.69%) ⬇️
satpy/_compat.py 42.85% <0.00%> (-57.15%) ⬇️
... and 6 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 50ddcc6...c1171a3. Read the comment docs.

Copy link
Member

@pnuu pnuu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Just couple comments.

satpy/readers/viirs_compact.py Outdated Show resolved Hide resolved
satpy/readers/viirs_compact.py Outdated Show resolved Hide resolved
satpy/readers/viirs_compact.py Show resolved Hide resolved
@ghost
Copy link

ghost commented Feb 16, 2021

DeepCode failed to analyze this pull request

Something went wrong despite trying multiple times, sorry about that.
Please comment this pull request with "Retry DeepCode" to manually retry, or contact us so that a human can look into the issue.

@mraspaud
Copy link
Member Author

I added a simple distributed test. Seems to run fine, fails on master.

def test_distributed(self):
"""Check that distributed computations work."""
from dask.distributed import Client
self.client = Client()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be better to have a separate class that always creates the client in setup rather than assigning to self.client here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I end up having more tests, I will split the TestCase, yes. For now it's fine like it is I think

@mraspaud mraspaud merged commit 2245c9a into pytroll:master Feb 16, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants