@Kirill888 Kirill888 released this May 16, 2019 · 42 commits to develop since this release

Not a lot of changes since rc1.

  • Early exit from dc.load on KeyboardInterrupt, allows partial loads inside notebook.
  • Some bug fixes in geometry related code
  • Some cleanups in tests
  • Pre-commit hooks configuration for easier testing
  • Re-enable multi-threaded reads for s3aio driver (set use_threads=True in dc.load(..))
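
The early-exit behaviour can be pictured with a small stand-alone sketch. This is not the dc.load implementation: load_slices and the reader callables are hypothetical stand-ins for per-dataset reads, used only to show how catching KeyboardInterrupt lets a loop hand back whatever it has loaded so far.

```python
def load_slices(readers):
    """Run each reader in turn, stopping early on KeyboardInterrupt.

    `readers` is a hypothetical list of zero-argument callables standing in
    for per-dataset reads. Returns (results_so_far, finished_flag).
    """
    results = []
    try:
        for read in readers:
            results.append(read())
    except KeyboardInterrupt:
        # Keep the slices already loaded instead of discarding them.
        return results, False
    return results, True


def _interrupt():
    # Simulates the user pressing "interrupt kernel" part-way through a load.
    raise KeyboardInterrupt


partial, finished = load_slices([lambda: 1, lambda: 2, _interrupt, lambda: 4])
print(partial, finished)
```

In a notebook, the same pattern means an interrupted load still leaves usable partial data in the session.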
@omad omad released this Apr 18, 2019 · 86 commits to develop since this release

1.7rc1 (18 April 2019)

Virtual Products

Add Virtual Products for multi-product loading.

(#522, #597, #601, #612, #644, #677, #699, #700)

Changes to Data Loading

The internal machinery used when loading and reprojecting data has been
completely rewritten. The new code has been tested, but this is a
complicated and fundamental part of the code and there is potential for
breakage.

When loading reprojected data, the new code will produce slightly
different results. We don't believe that it is any less accurate than
the old code, but you cannot expect exactly the same numeric results.

Non-reprojected loads should be identical.

This change has been made for two reasons:

  1. The reprojection is now core Data Cube, and is not the
    responsibility of the IO driver.
  2. When loading lower resolution data, DataCube can now take advantage
    of available overviews.
  • New futures based IO driver interface (#686)

Other Changes

  • Allow specifying different resampling methods for different data
    variables of the same Product. (#551)
  • Allow all resampling methods supported by rasterio. (#622)
  • Bug fix (Index out of bounds causing ingestion failures)
  • Support indexing data directly from HTTP/HTTPS/S3 URLs (#607)
  • Renamed the command line tool datacube metadata_type to datacube metadata (#692)
  • More useful output from the command line datacube {product|metadata} {show|list}
  • Add optional progress_cbk to dc.load(_data) (#702), allows user
    to monitor data loading progress.
  • Thread-safe netCDF access within dc.load (#705)
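
A progress callback of the kind mentioned above might be used roughly as follows. This sketch assumes the callback is called with two integers (items processed so far, total items). load_with_progress is a hypothetical stand-in for the loader, not part of datacube.

```python
def load_with_progress(tasks, progress_cbk=None):
    """Hypothetical loader: run each task and report progress after each one."""
    total = len(tasks)
    loaded = []
    for n_done, task in enumerate(tasks, start=1):
        loaded.append(task())
        if progress_cbk is not None:
            # Assumed callback shape: (processed_so_far, total).
            progress_cbk(n_done, total)
    return loaded


seen = []
data = load_with_progress(
    [lambda: "a", lambda: "b", lambda: "c"],
    progress_cbk=lambda done, total: seen.append((done, total)),
)
print(seen)
```

With the real library the callback would be passed as dc.load(..., progress_cbk=...), for example to drive a progress bar in a notebook.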

Performance Improvements

  • Use single pass over datasets when computing bounds (#660)
  • Bugfixes and improved performance of dask-backed arrays (#547,

Documentation Improvements


  • From the command line, the old query syntax for searching within
    vague time ranges, e.g. 2018-03 < time < 2018-04, has been removed.
    It was unclear exactly what that syntax should mean: whether to
    include or exclude the months specified. It is replaced by
    time in [2018-01, 2018-02], which has the same semantics as dc.load
    time queries. (#709)
Patch release to build a new Docker container, to resolve an upstream security bug.

See #631 for more details.

@omad omad released this Nov 27, 2018 · 566 commits to develop since this release

The real 1.6 release, not an accidental duplicate of the release candidate.

@omad omad released this Aug 23, 2018 · 116 commits to stable since this release

  • Enable use of aliases when specifying band names
  • Fix ingestion failing after the first run #510
  • Docker images now know which version of ODC they contain #523
  • Fix data loading when nodata is NaN #531
  • Allow querying based on python datetime.datetime objects. #499
  • Require rasterio 1.0.2 or higher, which fixes several critical bugs when loading and reprojecting from multi-band files.
  • Assume fixed paths for id and sources metadata fields #482
  • datacube.model.Measurement was put to use for loading in attributes and made to inherit from dict to preserve current behaviour. #502
  • Updates when indexing data with datacube dataset add (See #485, #451 and #480)
    • Allow indexing without lineage datacube dataset add --ignore-lineage
    • Removed the --sources-policy=skip|verify|ensure. Instead use --[no-]auto-add-lineage and --[no-]verify-lineage
    • New option datacube dataset add --exclude-product <name> allows excluding some products from auto-matching
  • Preliminary API for indexing datasets #511
  • Enable creation of MetadataTypes without having an active database connection #535
@omad omad released this Jun 30, 2018 · 685 commits to develop since this release

Backwards Incompatible Changes

  • The helpers.write_geotiff() function has been updated to support
    files smaller than 256x256. It also no longer supports specifying
    the time index. Before passing data in, use
    xarray_data.isel(time=<my_time_index>). (#277)
  • Removed product matching options from datacube dataset update
    (#445). No matching is needed in this case as all datasets are
    already in the database and are associated to products.
  • Removed --match-rules option from datacube dataset add (#447)
  • The seldom-used stack keyword argument has been removed from
    Datacube.load. (#461)
  • The behaviour of the time range queries has changed to be compatible
    with standard Python searches (eg. time slice an xarray). Now the
    time range selection is inclusive of any unspecified time units.
    • Example 1:
      time=('2008-01', '2008-03') previously would have returned all
      data from the start of 1st January, 2008 to the end of 1st of
      March, 2008. Now, this query will return all data from the start
      of 1st January, 2008 to 23:59:59.999 on 31st of March, 2008.

    • Example 2:
      To specify a search time between 1st of January and 29th of
      February, 2008 (inclusive), use a search query like
      time=('2008-01', '2008-02'). This query is equivalent to using
      any of the following in the second time element:

      ('2008-02-29 23')
      ('2008-02-29 23:59')
      ('2008-02-29 23:59:59')
      ('2008-02-29 23:59:59.999')
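
The inclusive semantics in the two examples above can be sketched with the standard library alone. This is an illustration, not the parser ODC uses: expand_time_range and its helpers are hypothetical names, and the millisecond resolution simply mirrors the 23:59:59.999 figure quoted in these notes.

```python
import calendar
from datetime import datetime, timedelta

# Formats tried from coarsest to finest; the unit is the precision of the string.
_FORMATS = [
    ("%Y", "year"),
    ("%Y-%m", "month"),
    ("%Y-%m-%d", "days"),
    ("%Y-%m-%d %H", "hours"),
    ("%Y-%m-%d %H:%M", "minutes"),
    ("%Y-%m-%d %H:%M:%S", "seconds"),
    ("%Y-%m-%d %H:%M:%S.%f", None),  # fully specified, nothing to expand
]


def _parse(text):
    """Return (start_of_period, unit) for a possibly partial time string."""
    for fmt, unit in _FORMATS:
        try:
            return datetime.strptime(text, fmt), unit
        except ValueError:
            continue
    raise ValueError(f"unrecognised time string: {text!r}")


def _period_end(start, unit):
    """Last millisecond of the period that begins at `start`."""
    if unit is None:
        return start
    if unit == "year":
        nxt = start.replace(year=start.year + 1)
    elif unit == "month":
        nxt = start + timedelta(days=calendar.monthrange(start.year, start.month)[1])
    else:
        nxt = start + timedelta(**{unit: 1})
    return nxt - timedelta(milliseconds=1)


def expand_time_range(time_range):
    """Expand ('2008-01', '2008-03') into inclusive start and end datetimes."""
    lo, hi = time_range
    start, _ = _parse(lo)
    end = _period_end(*_parse(hi))
    return start, end


print(expand_time_range(("2008-01", "2008-03")))
print(expand_time_range(("2008-01", "2008-02")))
```

Under these assumptions, every variant of the second element listed in Example 2 expands to the same end instant.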


  • A --location-policy option has been added to the datacube dataset
    update command. Previously this command would always add a new
    location to the list of URIs associated with a dataset. It's now
    possible to specify archive and forget options, which will mark
    previous locations as archived or remove them from the index
    altogether. The default behaviour is unchanged. (#469)

  • The masking related function describe_variable_flags() now returns
    a pandas DataFrame by default. This will display as a table in
    Jupyter Notebooks. (#422)

  • Usability improvements in datacube dataset [add|update] commands
    (#447, #448, #398)

    • Embedded documentation updates
    • Deprecated --auto-match (it was always on anyway)
    • Renamed --dtype to --product (the old name will still work,
      but with a warning)
    • Add option to skip lineage data when indexing (useful for saving
      time when testing) (#473)
  • Enable compression for metadata documents stored in NetCDFs
    generated by stacker and ingestor (#452)

  • Implement better handling of stacked NetCDF files (#415)

    • Record the slice index as part of the dataset location URI,
      using #part=<int> syntax, index is 0-based
    • Use this index when loading data instead of fuzzy searching by
    • Fall back to the old behaviour when #part=<int> is missing and
      the file is more than one time slice deep
  • Expose the following dataset fields and make them searchable:

    • indexed_time (when the dataset was indexed)
    • indexed_by (user who indexed the dataset)
    • creation_time (creation of dataset: when it was processed)
    • label (the label for a dataset)

    (See #432 for more details)
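
As an illustration of the #part=<int> location syntax for stacked NetCDF files described above, a URI can be split into its file part and its 0-based slice index with the standard library. split_part and the example path are hypothetical. ODC's own parsing may differ.

```python
from urllib.parse import parse_qs, urldefrag


def split_part(uri):
    """Split a location URI into (base_uri, slice_index or None).

    The index is read from a '#part=<int>' fragment; None means the
    fragment is absent and a caller would need some fallback strategy.
    """
    base, fragment = urldefrag(uri)
    values = parse_qs(fragment).get("part")
    index = int(values[0]) if values else None
    return base, index


print(split_part("file:///data/stack.nc#part=3"))
print(split_part("file:///data/stack.nc"))
```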

Bug Fixes

  • The .dimensions property of a product no longer crashes when
    product is missing a grid_spec. It instead defaults to time,y,x
  • Fix a regression in v1.6rc1 which made it impossible to run
    datacube ingest to create products which were defined in 1.5.5
    and earlier versions of ODC. (#423, #436)
  • Allow specifying the chunking for string variables when writing
    NetCDFs (#453)
@omad omad released this Apr 11, 2018 · 794 commits to develop since this release

v1.6rc1 Easter Bilby (10 April 2018)

This is the first release in a while, and so there are a lot of changes, including
some significant refactoring, with the potential for issues when upgrading.

Backwards Incompatible Fixes

  • Drop Support for Python 2. Python 3.5 is now the earliest supported Python version.
  • Removed the old ndexpr, analytics and execution engine code. There is work underway in the execution engine branch to replace these features.


  • Support for third party drivers, for custom data storage and custom index implementations

  • The correct way to get an Index connection in code is to use datacube.index.index_connect().

  • Changes in ingestion configuration

    • Must now specify the Data Write Plug-ins to use. For s3 ingestion there was a top level container specified, which has been renamed and moved under storage. The entire storage section is passed through to the Data Write Plug-ins, so drivers requiring other configuration can include them here. eg:

        driver: s3aio
        bucket: my_s3_bucket
  • Added a Dockerfile to enable automated builds for a reference Docker image.

  • Multiple environments can now be specified in one datacube config. See PR 298 and the Runtime Config

    • Allow specifying which index_driver should be used for an environment.
  • Command line tools can now output CSV or YAML. (Issue 206, PR 390)

  • Support for saving data to NetCDF using a Lambert Conformal Conic Projection (PR 329)

  • Lots of documentation updates:

    • Information about Bit Masking.
    • A description of how data is loaded.
    • Some higher level architecture documentation.
    • Updates on how to index new data.
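
To picture the multi-environment configuration mentioned in the list above, the sketch below reads an INI-style config with one section per environment. The section names, hostnames and database names are made up for illustration, and read_environment is a hypothetical helper, not a datacube API. See the Runtime Config documentation for the authoritative format.

```python
import configparser

# Hypothetical config: one section per environment.
CONFIG_TEXT = """
[default]
db_hostname: localhost
db_database: datacube

[staging]
db_hostname: staging.example.com
db_database: datacube_staging
index_driver: default
"""


def read_environment(text, env="default"):
    """Return the settings of one environment section as a plain dict."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    if not parser.has_section(env):
        raise KeyError(f"no such environment: {env}")
    return dict(parser[env])


print(read_environment(CONFIG_TEXT, "staging"))
```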

Bug Fixes

  • Allow creation of datacube.utils.geometry.Geometry objects from 3d representations. The Z axis is simply thrown away.
  • The datacube --config_file option has been renamed to datacube --config, which is shorter and more consistent with the other options. The old name can still be used for now.
  • Fix a severe performance regression when extracting and reprojecting a small region of data. (PR 393)
  • Fix a somewhat rare bug that caused read failures when attempting to read data from a negative index into a file. (PR 376)
  • Make CRS equality comparisons a little bit looser. Trust either a Proj.4 based comparison or a GDAL based comparison. (Closed issue 243)

New Data Support

  • Added example prepare script for Collection 1 USGS data; improved band handling and downloads.
  • Add a product specification and prepare script for indexing Landsat L2 Surface Reflectance Data (PR 375)
  • Add a product specification for Sentinel 2 ARD Data (PR 342)
@jeremyh jeremyh released this Jan 18, 2018

  • Fixes to package dependencies. No code changes.
@jeremyh jeremyh released this Dec 14, 2017

  • Minor features backported from 2.0:

    • Support for limit in searches

    • Alternative lazy search method find_lazy

  • Fixes:

    • Improve native field descriptions

    • Connection should not be held open between multi-product searches

    • Disable prefetch for celery workers

    • Support jsonify-ing decimals

@harshurampur harshurampur released this Oct 16, 2017

  • Use cloudpickle as the celery serialiser

  • Allow celery tests to run without installing it

  • Move datacube-worker inside the main datacube package

  • Write metadata_type from the ingest configuration if available

  • Support config parsing limitations of Python 2

  • Fix #303: resolve GDAL build dependencies on Travis

  • Upgrade rasterio to newer version

