
dc.load returning almost no scenes for a large polygon but many for a small polygon in the same area #276

BexDunn opened this issue Aug 4, 2017 · 3 comments



Expected behaviour

I expect a similar density of scenes to be retrieved per unit area, whether dc.load is called with a big polygon or with a small polygon in the same region.

Actual behaviour

If I plot the scenes retrieved, the big polygon yields a triangular region containing only about 2 observations, whereas the small polygon covering part of the same area returns 300–600 scenes (over a 30-year epoch, Landsat 5, 7 and 8).

Steps to reproduce the behaviour

The code above should run for either of the queries below. You may wish to change the paths for the output files.

queries used:

Big polygon

 'geopolygon': Geometry(POLYGON ((-83105.2314368732 -1451602.90924564,-33391.0599240792 -1455644.18659597,-37415.0420370399 -1505610.83402089,-87117.7910054179 -1501569.35056345,-83105.2314368732 -1451602.90924564)), PROJCS["GDA94_Australian_Albers",GEOGCS["GCS_GDA_1994",DATUM["Geocentric_Datum_of_Australia_1994",SPHEROID["GRS_1980",6378137,298.257222101]],PRIMEM["Greenwich",0],UNIT["Degree",0.017453292519943295]],PROJECTION["Albers_Conic_Equal_Area"],PARAMETER["standard_parallel_1",-18],PARAMETER["standard_parallel_2",-36],PARAMETER["latitude_of_center",0],PARAMETER["longitude_of_center",132],PARAMETER["false_easting",0],PARAMETER["false_northing",0],UNIT["Meter",1]]),
 'time': ('1987-01-01', '2016-12-31')}

Little polygon

 'geopolygon': Geometry(POLYGON ((131.554434234076 -13.8009653022795,131.547892015813 -13.8790036027955,131.496441846499 -13.8746177747694,131.503013176119 -13.7965809082819,131.554434234076 -13.8009653022795)), GEOGCS["GCS_WGS_1984",DATUM["WGS_1984",SPHEROID["WGS_84",6378137,298.257223563]],PRIMEM["Greenwich",0],UNIT["Degree",0.017453292519943295]]),
 'time': ('1987-01-01', '2016-12-31')}
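As a sanity check on the inputs, the planar extent of the big polygon can be computed with the shoelace formula. This is a pure-Python sketch (not part of the original report) using only the Albers coordinates copied from the query above; it confirms the big polygon is roughly a 50 km × 50 km square:

```python
# Shoelace formula for the planar area of a simple polygon.
# Vertices are the big-polygon coordinates (GDA94 Australian Albers,
# metres) copied from the query above; the closing vertex is omitted.

def shoelace_area(vertices):
    """Absolute planar area of a simple polygon given as (x, y) pairs."""
    area2 = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        area2 += x1 * y2 - x2 * y1
    return abs(area2) / 2.0

big_polygon = [
    (-83105.2314368732, -1451602.90924564),
    (-33391.0599240792, -1455644.18659597),
    (-37415.0420370399, -1505610.83402089),
    (-87117.7910054179, -1501569.35056345),
]

print(f"big polygon area: {shoelace_area(big_polygon) / 1e6:.0f} km^2")
```

The little polygon is in WGS 84 degrees, so the same formula does not apply directly to it, but at that latitude it spans only a few kilometres per side.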

Environment information

  • Which datacube --version are you using?
  • What datacube deployment/environment are you running against?
    Behaviour is the same on raijin and on the VDI.


commented Aug 8, 2017

After some analysis, the error was identified in Datacube.load when using the dask_chunks parameter: the custom fuser function and skip_broken_datasets do not flow through.

The missing fuse_func leads to the default fuser being used, which in the case of PQ masks is completely broken. This in turn leads to no data in the southern part of the tile after masking is applied.
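To see why the default fuser is wrong for PQ masks, here is a hypothetical pure-Python sketch (not the datacube implementation): fusers here take equal-length lists of pixel values, with 0 standing in for nodata, and return the fused list.

```python
NODATA = 0  # assumption for this sketch: 0 marks "no observation"

def default_fuse(dest, src):
    # Simplified default behaviour: fill from src only where dest is
    # nodata. Fine for imagery, wrong for PQ bitmasks: a pixel flagged
    # "all tests passed" in one slice hides a failure in the other.
    return [s if d == NODATA else d for d, s in zip(dest, src)]

def pq_fuse(dest, src):
    # Bitmask-aware sketch: where both slices have data, keep only the
    # bits set in both (bitwise AND); otherwise take whichever has data.
    out = []
    for d, s in zip(dest, src):
        if d != NODATA and s != NODATA:
            out.append(d & s)
        else:
            out.append(d if d != NODATA else s)
    return out
```

With dest = [5, 0] and src = [3, 7], default_fuse keeps the optimistic 5 in the first pixel, while pq_fuse ANDs the overlapping bitmasks down to 1.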


[screenshot] The bottom panel should match the top-right panel; instead it is an exact copy of the top-left.



commented Aug 8, 2017

The error was partially fixed by this commit, by explicitly naming the fuse_func parameter.

However, the call chain load -> make_dask_array -> fuse_lazy -> _fuse_measurement still drops the skip_broken_datasets=<False|True> parameter.
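The dropped-parameter pattern can be sketched in pure Python (hypothetical function bodies mirroring only the names in the call chain above): a keyword argument not forwarded at any hop silently reverts to its default at the bottom of the chain.

```python
def _fuse_measurement(datasets, skip_broken_datasets=False):
    # Bottom of the chain: the flag must arrive here to have any effect.
    if skip_broken_datasets:
        return [d for d in datasets if d != "broken"]
    if "broken" in datasets:
        raise IOError("broken dataset encountered")
    return datasets

def fuse_lazy(datasets, **kwargs):
    # Forwarding **kwargs keeps skip_broken_datasets alive at this hop.
    return _fuse_measurement(datasets, **kwargs)

def make_dask_array(datasets, skip_broken_datasets=False):
    # Buggy version would call fuse_lazy(datasets) and drop the flag;
    # forwarding it explicitly is the fix.
    return fuse_lazy(datasets, skip_broken_datasets=skip_broken_datasets)
```

If any intermediate call omits the keyword, a caller passing skip_broken_datasets=True still hits the IOError path.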



Kirill888 added a commit that referenced this issue Aug 28, 2017

Fix for issue #276
`fuse_func` parameter was not passed correctly, adding explicit name.
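Why naming the parameter matters can be shown with hypothetical signatures (these are illustrative, not the real datacube API): with several trailing optional parameters, a positional call that is one slot short binds the wrong value to fuse_func.

```python
def load_data(sources, geobox, measurements, fuse_func=None, dask_chunks=None):
    # Stand-in for the real call; just echoes what it received.
    return fuse_func, dask_chunks

def my_fuser(dest, src):
    return dest  # stand-in fuser

chunks = {"x": 1000, "y": 1000}

# Bug pattern: chunks silently binds to fuse_func, dask_chunks stays None.
wrong = load_data([], None, [], chunks)

# Fix: name the keywords explicitly so each value lands where intended.
right = load_data([], None, [], fuse_func=my_fuser, dask_chunks=chunks)
```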

@Kirill888 Kirill888 closed this Dec 4, 2017
