MergeError when translating to xarray #29

Closed
davemlz opened this issue Oct 4, 2021 · 4 comments
Labels: bug (Something isn't working)

Comments


davemlz commented Oct 4, 2021

Hi, @aazuspan!

Just wanted to say that I love wxee! I'm using it to combine products from Earth Engine and Planetary Computer and that's amazing! I'm using it almost every day, but sometimes this error happens:

---------------------------------------------------------------------------
MergeError                                Traceback (most recent call last)
/tmp/ipykernel_1042/4012842980.py in <module>
      1 CLOUD_MASK = PCL_s2cloudless(S2_ee).map(PSL).map(PCSL).map(matchShadows).select("CLOUD_MASK")
----> 2 CLOUD_MASK_xarray = CLOUD_MASK.wx.to_xarray(scale = 20,crs = "EPSG:" + str(S2.epsg.data),region = ee_aoi)

/srv/conda/envs/notebook/lib/python3.8/site-packages/wxee/collection.py in to_xarray(self, path, region, scale, crs, masked, nodata, num_cores, progress, max_attempts)
    135             )
    136 
--> 137             ds = _dataset_from_files(files)
    138 
    139         # Mask the nodata values. This will convert int datasets to float.

/srv/conda/envs/notebook/lib/python3.8/site-packages/wxee/utils.py in _dataset_from_files(files)
    120     das = [_dataarray_from_file(file) for file in files]
    121 
--> 122     return xr.merge(das)
    123 
    124 

/srv/conda/envs/notebook/lib/python3.8/site-packages/xarray/core/merge.py in merge(objects, compat, join, fill_value, combine_attrs)
    898         dict_like_objects.append(obj)
    899 
--> 900     merge_result = merge_core(
    901         dict_like_objects,
    902         compat,

/srv/conda/envs/notebook/lib/python3.8/site-packages/xarray/core/merge.py in merge_core(objects, compat, join, combine_attrs, priority_arg, explicit_coords, indexes, fill_value)
    633 
    634     prioritized = _get_priority_vars_and_indexes(aligned, priority_arg, compat=compat)
--> 635     variables, out_indexes = merge_collected(
    636         collected, prioritized, compat=compat, combine_attrs=combine_attrs
    637     )

/srv/conda/envs/notebook/lib/python3.8/site-packages/xarray/core/merge.py in merge_collected(grouped, prioritized, compat, combine_attrs)
    238                 variables = [variable for variable, _ in elements_list]
    239                 try:
--> 240                     merged_vars[name] = unique_variable(name, variables, compat)
    241                 except MergeError:
    242                     if compat != "minimal":

/srv/conda/envs/notebook/lib/python3.8/site-packages/xarray/core/merge.py in unique_variable(name, variables, compat, equals)
    147 
    148     if not equals:
--> 149         raise MergeError(
    150             f"conflicting values for variable {name!r} on objects to be combined. "
    151             "You can skip this check by specifying compat='override'."

MergeError: conflicting values for variable 'CLOUD_MASK' on objects to be combined. You can skip this check by specifying compat='override'.

It is weird because it is not something that happens all the time, and most of the time I just have to re-run the code and it works. So, I don't know exactly what the problem is xD

Anyway, here is the error I got. I was trying to get a cloud mask in GEE and download it as an xarray object. I already tried it again and now it works, but, as I said, I don't know why. It also happens with other datasets. I was downloading some Sentinel-2 data (just as it is, without any processing steps) and sometimes it works, but sometimes it doesn't, and I can't reproduce the error because when I re-run it, most of the time it works xD

Ok, that was it!

Thank you!

aazuspan (Owner) commented Oct 4, 2021

Hey @davemlz, I'm psyched to hear you're using wxee! I wasn't sure how practical it would be for higher resolution data like Sentinel and Landsat with the size limitations, so that's great you've found some practical applications :)

I haven't run into that MergeError before, but I'm guessing it's getting multiple images with the same time coordinate and identical band names, and xarray doesn't like that. I suppose that would probably happen if two overlapping Sentinel scenes with the same system:time_start were downloaded together... Of course, that definitely doesn't explain why re-running would fix the problem! The only other thing I can think of is that there are some temp files getting left behind that are causing problems... I'll play around to see if I can re-create the problem and hopefully come up with a solution.
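A minimal sketch of the first hypothesis above, using plain xarray rather than wxee: two arrays with the same name, the same time/y/x coordinates, and different non-null values cannot be merged. The array values and the date are made up for illustration, and `compat="no_conflicts"` is passed explicitly so the behaviour does not depend on the xarray version's default.

```python
import numpy as np
import xarray as xr

# Two hypothetical "scenes" on the same grid and time step (values are invented).
coords = {
    "time": [np.datetime64("2021-10-04", "ns")],
    "y": [0.0, 1.0],
    "x": [0.0, 1.0],
}
dims = ("time", "y", "x")

scene_a = xr.DataArray(
    np.array([[[0, 1], [1, 0]]]), coords=coords, dims=dims, name="CLOUD_MASK"
)
scene_b = xr.DataArray(
    np.array([[[0, 1], [1, 1]]]), coords=coords, dims=dims, name="CLOUD_MASK"
)

# The last pixel differs (0 vs 1), so the merge is rejected.
try:
    xr.merge([scene_a, scene_b], compat="no_conflicts")
    raised = False
except xr.MergeError as err:
    raised = True
    print(err)
```

This only shows that conflicting values trigger the error; it does not explain why a re-run would succeed.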

Thanks for reporting the error! Keep me posted if you do find a reproducible example or run into more problems, and I'll let you know if I come up with anything.

P.S. Any tips on getting access to Planetary Computer? xD

aazuspan added the bug label Oct 4, 2021
aazuspan self-assigned this Oct 4, 2021
aazuspan added a commit that referenced this issue Oct 5, 2021
…null values. Raise a warning if conflicting non-null values are merged, but suppress the MergeError. Move null-masking into dataset creation util functions to ensure that nulls are set BEFORE merging to allow the above to work as designed. (#29)
aazuspan (Owner) commented Oct 5, 2021

Okay, I tracked down a bug that might have caused this!

Before, if two pixels at the same x, y, and time coordinate from the same band had different values (even if one was masked), it would raise a MergeError. Now, a masked and unmasked value can be merged without any issues. Merging two conflicting unmasked values will raise a warning instead of throwing an error.
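The behaviour described above can be sketched with plain xarray. This is not wxee's actual code: `mask_nodata`, `merge_tolerant`, and the `NODATA` value are hypothetical names chosen for illustration. The idea is to turn nodata pixels into NaN before merging, so that a masked and an unmasked value combine cleanly, and to fall back to a warning when two non-null values disagree.

```python
import warnings

import numpy as np
import xarray as xr

NODATA = -32768  # hypothetical nodata value, not taken from wxee


def mask_nodata(da, nodata=NODATA):
    """Replace nodata pixels with NaN (this converts integer data to float)."""
    return da.where(da != nodata)


def merge_tolerant(das):
    """Merge arrays; warn instead of failing on conflicting non-null values."""
    try:
        # NaN in one array is filled from the other; equal values pass through.
        return xr.merge(das, compat="no_conflicts", join="outer")
    except xr.MergeError:
        warnings.warn("Conflicting non-null values; keeping the first array's values.")
        # compat="override" skips the comparison and takes the first object.
        return xr.merge(das, compat="override", join="outer")


coords = {"time": [np.datetime64("2021-10-04", "ns")], "y": [0.0], "x": [0.0, 1.0]}
dims = ("time", "y", "x")


def make(values):
    return xr.DataArray(np.array([[values]]), coords=coords, dims=dims, name="CLOUD_MASK")


# Masked + unmasked: each array fills the other's gap.
filled = merge_tolerant([mask_nodata(make([5, NODATA])), mask_nodata(make([NODATA, 7]))])

# Two conflicting unmasked values: a warning is raised and the first array wins.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    overridden = merge_tolerant([mask_nodata(make([5, 9])), mask_nodata(make([5, 7]))])
```

Masking before the merge is what makes the first case work: if nodata were still an ordinary integer at merge time, it would count as a conflicting value.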

If you upgrade to v0.1.0, that will include the bug fix. It's up on PyPI now, but may take a day or two to get to conda-forge.

Hopefully that fixes your issue, but let me know if it happens again with the new version! :)

davemlz (Author) commented Oct 5, 2021

@aazuspan, you're the best! You solved the problem in like 4 hours or less! I upgraded to v0.1.0 and today every time I run the code it works perfectly and without errors! Thank you, and thanks for the explanation!

I will close the issue with this comment :)

P.S. I don't know what the deal is with Microsoft and getting access to PC xD They gave me access like a thousand months ago, but, for example, they gave access to César Aybar just yesterday xD There is still hope, don't lose it!

davemlz closed this as completed Oct 5, 2021

aazuspan (Owner) commented Oct 6, 2021

That's awesome @davemlz, glad it fixed the problem! Thanks for reporting it :)

Good to hear that other folks are still getting access to PC. Maybe my turn will come soon! 🤞
