Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add xarray/dask support to DayNightCompositor #217

Merged
merged 18 commits into from
Mar 19, 2018
Merged

Conversation

pnuu
Copy link
Member

@pnuu pnuu commented Mar 8, 2018

This PR adds xarray and dask support to DayNightCompositor.

  • Tests added
  • Tests passed
  • Passes git diff origin/develop **/*py | flake8 --diff

Copy link
Member

@djhoese djhoese left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I'd just like a small change to the chunk size passed to get_lonlats_dask so that lon/lat chunks are the same size as the data chunks.

except IndexError:
from pyorbital.astronomy import cos_zen
LOG.debug("Computing sun zenith angles.")
lons, lats = day_data.attrs["area"].get_lonlats_dask(CHUNK_SIZE)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should probably pass day_data.chunks as the chunk size for potentially better performance (assuming day_data is 2D).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried that, but day_data is usually, but not always, a 3D dataset, so the chunking doesn't necessarily match. I'll take a closer look when I'm back at office.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh ok. Makes sense. You might be able to get away with day_data.sel(bands=day_data['bands'][0]).chunks...maybe.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put another way, what if you got one 2D array from the 3D array and use the chunks from that. Dask might give you logical chunks. The point is if you can pass the actual chunks then calculations should be overall faster since the lons/lats will be at the same chunk size as the data.

@@ -673,7 +675,7 @@ def mask_source_lonlats(source_def, mask):
# the data may have additional masked pixels
# let's compare them to see if we can use the same area
# assume lons and lats mask are the same
if mask and isinstance(source_geo_def, SwathDefinition):
if mask is not None and isinstance(source_geo_def, SwathDefinition):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think @mraspaud merged his fix for this. You may want to rebase and take his changes which are slightly different than this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it's fixed and merged

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I thought I did a pull on develop, but maybe I was still on this feature branch. I'll rebase before I continue with the comments you left.

alpha = new_data[0].copy()
alpha.data = da.ones((data.sizes['y'],
data.sizes['x']),
chunks=new_data[0].chunks)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know you had questions about this. This is probably one of the easier ways to do this. You could have also taken all the dask arrays for each band, added an alpha with da.ones(...) and then did da.stack to concatenate them. You would then have to create a new xarray and pass all of the dims, attrs, and coords from the original data. At least that's the only way I know how to do it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, at least it works! I'll try if I can wrap my head around what you described. Sounds complicated.

Copy link
Member

@mraspaud mraspaud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM except for the small changes you need to make

new_data['bands'] = ['R', 'G', 'B']
data = new_data
# Add alpha band
if 'A' not in data.bands.data and 'A' in bands.data:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you user data['bands'] instead ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do!

data = new_data
# Add alpha band
if 'A' not in data.bands.data and 'A' in bands.data:
new_data = [data.sel(bands=band) for band in data.bands.data]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here too

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto.

@@ -673,7 +675,7 @@ def mask_source_lonlats(source_def, mask):
# the data may have additional masked pixels
# let's compare them to see if we can use the same area
# assume lons and lats mask are the same
if mask and isinstance(source_geo_def, SwathDefinition):
if mask is not None and isinstance(source_geo_def, SwathDefinition):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it's fixed and merged

@coveralls
Copy link

coveralls commented Mar 9, 2018

Coverage Status

Coverage increased (+0.8%) to 60.328% when pulling 184d74e on feature-xarray-DNC into cebde2b on develop.

@mraspaud mraspaud merged commit 5c7bc75 into develop Mar 19, 2018
@mraspaud mraspaud deleted the feature-xarray-DNC branch March 19, 2018 13:11
@mraspaud mraspaud added enhancement code enhancements, features, improvements component:compositors labels Mar 20, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component:compositors enhancement code enhancements, features, improvements
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants