Changelog

All notable changes to this project will be documented in this file.

This project relies on continuous integration for new features, so we do not yet have explicitly versioned releases. Releases are built continuously, tested automatically, deployed to a development environment and then to production.

Note that the openEO API provides a way to support stable and unstable versions in the same implementation: https://openeo.org/documentation/1.0/developers/api/reference.html#operation/connect

If needed, feature flags are used to allow testing unstable features in development/production, without compromising stable operations.

Unreleased

0.28.1

0.28.0

  • Export to JSON is now more robust, supports datetime objects returned by dimension_labels, and will default to the string representation.
  • GDAL upgraded to 3.8.4 and Orfeo Toolbox to 8.1.2. This mainly reduces the volume of bytes read from object storage by GDAL. (#571)
  • Size of incoming requests is now limited to 2MB by default (Open-EO/openeo-python-driver#254)
  • load_stac: support loading multiple netCDF items with a time dimension, as produced with the 'sample_by_feature' option
  • In batch result STAC metadata proj:shape is fixed to be in Y-X order, as prescribed by the standard. (#693)
  • Copy batch job output assets to a workspace with the export_workspace process (#676).
  • Support vector cubes loaded from load_url in "sample_by_feature" feature (#700)
  • Keep polygons and multipolygons sorted when calling aggregate_spatial (#60)
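As a minimal illustration of the proj:shape ordering mentioned for #693: the STAC projection extension defines proj:shape as the number of pixels in the Y and X directions, so the height comes first. The helper name and the sample raster size below are hypothetical.

```python
def proj_shape(width: int, height: int) -> list[int]:
    """Return a STAC `proj:shape` value: [rows (Y), columns (X)]."""
    return [height, width]


# Hypothetical raster: 10980 pixels wide, 5490 pixels high
asset_properties = {"proj:shape": proj_shape(width=10980, height=5490)}
print(asset_properties)
```

Clients that read the value as width-then-height would swap the two axes, which is why the order matters for non-square rasters.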

0.27.1

  • Add timeout to requests towards ETL API to unblock JobTracker (#690).

0.27.0

  • Expose "bbox" and "geometry" for spatial STAC Item with netCDF assets (#646)

0.26.2

  • MultiEtlApiConfig: don't fail-fast on missing env vars for credentials extraction, just skip with warnings for now

0.26.1

Bugfix

  • fix load_stac from unsigned job results URL in batch job (#644)

0.26.0

  • Introduce MultiEtlApiConfig to support multiple ETL API configurations (#531)

0.25.0

  • The default for the soft-errors job option is now set to 0.1 and made configurable at backend level. This better reflects the fact that many EO archives contain corrupt files that would otherwise break jobs (#617).
  • Support GeoParquet output format for aggregate_spatial (#623)
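A rough sketch of how a threshold such as soft-errors could be applied. This assumes the value is a tolerated fraction of failed reads, which the entry above does not state explicitly; the function name and the counts are hypothetical and not taken from the backend.

```python
def too_many_soft_errors(failed: int, total: int, threshold: float = 0.1) -> bool:
    """True if the failed fraction exceeds the tolerated threshold (assumed semantics)."""
    if total == 0:
        return False
    return failed / total > threshold


# Option name and default value as given in the changelog entry
job_options = {"soft-errors": 0.1}

print(too_many_soft_errors(failed=3, total=100, threshold=job_options["soft-errors"]))
print(too_many_soft_errors(failed=25, total=100, threshold=job_options["soft-errors"]))
```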

Improved datatype conversion

A rather big improvement in this release is the handling of datatypes. openEO does not have very explicit rules when it comes to datatypes, and in this implementation the datatype was in most cases simply preserved, as in most programming languages.

For most users, this resulted in unexpected behaviour, for instance when dividing integer datatypes, or when subtracting two unsigned 8-bit numbers and expecting to get negative values.

This implementation will now try to use wider datatypes when necessary, for instance by switching to floating point when performing a division. This change makes writing formulas more intuitive, and should save time debugging issues.

When there is still a need to get a smaller datatype, users can use the 'linear_scale_range' process. For instance, this process will convert to 8-bit unsigned integers if the target range uses integer values and fits in the [0,255] range.
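The pitfalls described above can be reproduced in plain Python. This is only an illustration, not the backend's implementation: `ctypes.c_uint8` stands in for a datatype-preserving 8-bit operation, and `linear_scale_range` below follows the formula and clipping from the openEO process definition. The final rounding step is an assumption used to show values landing in the 8-bit range.

```python
import ctypes


def sub_uint8(a: int, b: int) -> int:
    """Subtract as a datatype-preserving uint8 operation would (wraps modulo 256)."""
    return ctypes.c_uint8(a - b).value


def linear_scale_range(x, input_min, input_max, output_min=0.0, output_max=1.0):
    """Linear rescaling with clipping of x to [input_min, input_max]."""
    x = min(max(x, input_min), input_max)
    return (x - input_min) / (input_max - input_min) * (output_max - output_min) + output_min


print(sub_uint8(100, 200))  # preserved uint8: wraps around instead of going negative
print(100 - 200)            # wider datatype: the expected negative value
print(200 // 3, 200 / 3)    # integer division truncates, floating point does not

# Scale values in [0, 10000] to [0, 255], then round to whole numbers (assumed step).
print([round(linear_scale_range(v, 0, 10000, 0, 255)) for v in (0, 4000, 10000)])
```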

Relevant issues:

0.24.0

  • Start using DynamicEtlApiJobCostCalculator in job tracker. Effective ETL API selection strategy is to be configured through EtlApiConfig

Bugfix

  • Added max_processing_area_pixels custom option to sar_backscatter, to avoid running out of memory when processing chunks that are too large

0.23.1

Bugfix

  • Requests towards Job Registry Elastic API are unreliable; reconsider ZK as primary data store.

0.23.0

Added

  • Support disabling ZkJobRegistry (#632)

0.22.3

Bugfix

  • Restore batch job result metadata; this reverts the Zookeeper fix introduced in 0.22.2

0.22.2

Bugfix

  • Prevent Zookeeper from blocking requests (#639)

0.22.1

Bugfix

  • Prevent usage duplication in ETL API (#41)

0.22.0

Added

  • Added config use_zk_job_registry to disable ZkJobRegistry usage

Bugfix

  • apply_neighborhood: fix error if overlap is null/None (#519)

0.21.5

  • Initial implementation of DynamicEtlApiJobCostCalculator and added caching feature to get_etl_api() (#531)

0.21.4

  • Support for reading GeoPackage vector data
  • move legacy-vs-dynamic ETL selection logic to get_etl_api() (#531)

0.21.3

  • job tracker: move app state mapping to CostDetails construction time

0.21.2

  • job tracker: pass job_options to JobCostsCalculator through CostDetails (related to #531)

0.21.1

  • job tracker: do job info iteration in streaming fashion (instead of loading all job info in memory at once)
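The difference between streaming iteration and loading everything at once can be sketched with a generator. All names and the paged data below are hypothetical and do not reflect the actual job tracker code.

```python
from typing import Iterable, Iterator


def iter_job_info(pages: Iterable[list[dict]]) -> Iterator[dict]:
    """Yield job info records one at a time instead of building one big list."""
    for page in pages:
        yield from page


def count_running(jobs: Iterable[dict]) -> int:
    """Consume the stream; only the current record needs to be held in memory."""
    return sum(1 for job in jobs if job["status"] == "running")


# Hypothetical paged responses, produced lazily by a generator expression
pages = (
    [{"id": f"j-{p}-{i}", "status": "running" if i % 2 else "finished"} for i in range(4)]
    for p in range(3)
)
print(count_running(iter_job_info(pages)))
```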

0.21.0

  • Initial support for dynamic ETL API configuration (#531)

0.20.1

  • finetune zookeeper_set.py script and concurrent_pod_limit logic

0.20.0

  • Introduce GpsBackendConfig.zookeeper_hosts and GpsBackendConfig.zookeeper_root_path

0.19.4

  • eliminate use_etl_api arg from GeoPySparkBackendImplementation in favor of use_etl_api_on_sync_processing config field
  • Upgrade GDAL to 3.8.1 and Orfeo Toolbox to 8.1.2 #571
  • Performance improvement for apply_dimension with target='bands' (Open-EO/openeo-geotrellis-extensions#235, #595)

0.19.3

Feature

  • Experimental support for filter_labels: only works with catalog based collections, and when used close to the load_collection call #559
  • Support for UDF signature that works directly on XArray DataArray, avoiding the need for openEO specific wrapper class.
  • Support filtering on tileId with a wildcard. Open-EO/openeo-opensearch-client#25

Bugfix

  • Error fixed when doing aggregate_temporal + merge_cubes Open-EO/openeo-geotrellis-extensions#201
  • Avoid 'out-of-memory' errors when writing large netCDF files. Allows files of >600MB without custom memory settings. #199
  • netCDF output will generate a more useful warning in case of a mismatch with cube band metadata

0.19.2

  • (Temporarily) disable extensive /validation checks on production (related to #566, #575)

0.18.0a1

Removed

  • Remove old "v1" job_tracker script (#545)

0.17.0a1

Feature

Bugfix

2023-09-18 (0.9.5a1)

Important change: time intervals are now left-closed (the start is included, the end is excluded). Workflows that are sensitive to exact time intervals may need to be updated.
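A small sketch of what left-closed means in practice: the start of the interval is included and the end is excluded. The function name and dates are illustrative only.

```python
from datetime import datetime


def in_interval(t: datetime, start: datetime, end: datetime) -> bool:
    """Left-closed (half-open) interval test: start is included, end is excluded."""
    return start <= t < end


start = datetime(2023, 1, 1)
end = datetime(2023, 2, 1)

print(in_interval(datetime(2023, 1, 1), start, end))   # the start itself
print(in_interval(datetime(2023, 1, 31), start, end))  # a date within the interval
print(in_interval(datetime(2023, 2, 1), start, end))   # the end itself
```

A workflow that relied on an observation dated exactly at the end of the interval being included would need to move the end date forward.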

Feature

  • load_stac support, allowing to load STAC collections that conform to the mainstream approach for organizing metadata in STAC. (#402)
  • First support for UDF's that change resolution, in Python-Jep runtime. (Open-EO/openeo-geotrellis-extensions#197)
  • Improved support for running UDF's on vector cubes.
  • Support load_geojson and load_url processes to create vector cubes. (Open-EO/openeo-python-driver#211)
  • The 'partial' query parameter is now supported when requesting job results, load_stac supports loading unfinished results. (#489)
  • Support new (experimental) vector_to_raster process, allowing to combine data from a vector source with EO data. (#423)

Bugfix

Changed

2023-07-30 (0.9.5a1)

Feature

  • array_element: Support band selection by label (#43)
  • apply_neighborhood: Support applying function over time intervals (#415)

2023-06-30 (0.9.5a1)

Feature

Changed

2023-03-30 (0.9.5a1)

Bugfix

  • Fix "Permission denied" issue with run_udf usage on vector data cube (#367)
  • Fix: Extent in STAC result metadata should be lat lon (#321)
  • Single row/line results with SentinelHub (#375)
  • Fix: Creodias: Download asset from object storage (S3) before extracting projection metadata (#403)

Changed

  • /validation now detects if the amount of pixels that will be processed is too large (#320)
  • Add projection extension metadata to batch job results (openeo-geotrellis-extensions/#72)

2023-03-08 (0.9.3a1)

Build 20230307-1166, with components: openeo-geopyspark-0.9.3a1.dev20230307+1073, openeo_driver-0.37.0a1.dev20230307+441, openeo-0.15.0, geotrellis-extensions-static-2.3.0_2.12-SNAPSHOT-5822561

Feature

  • Add "filename_prefix" to format_options.

Bugfix

2023-02-27 (0.7.0a1)

Build 20230221-1118

Note: this deploy was rolled back to previous build 20230117-966 the same day.

  • GeoParquet support to allow loading large vector files
  • Improved specific log messages
  • Better support for multiple filter_spatial processes in same process graph (#147)
  • Bugfix for sampling sentinelhub based collections (#279)
  • vector_buffer: Throw an error when a negative buffer size results in invalid geometries (Open-EO/openeo-python-driver#164)
  • batch jobs now also report usage of credits (#272)
  • non-utm collections should now have a better alignment to the original rasters, if the process graph does not apply an explicit resampling (Open-EO/openeo-geotrellis-extensions#69)

2023-02-07 (0.6.7a1)

Build 20230117-966

  • Added initial support for the inspect process. It can be used on datacubes and in callbacks.
  • The size of a single chunk is now automatically increased for larger jobs, to improve IO performance.
  • resample_cube_spatial is no longer needed in all cases when using merge_cubes or mask
  • Better detection of duplicate products in source catalogs
  • The 'if' process will no longer evaluate the branch that is not taken (Open-EO/openeo-python-driver#109)

2023-01-20 (0.6.7a1)

  • Changed: Getting a job's logs now leaves out log lines that have no loglevel or a level that is not supported. openeo-python-driver/#160

2022-11-28 (0.6.3a1)

  • Added an experimental job option 'udf-dependency-archives' to pass on archives of UDF dependencies

2022-10-27 (0.6.3a1)

  • Reprojection is performed at load time whenever possible, by pushing down parameters from resample_spatial and resample_cube_spatial
  • PROBA-V collections can now be loaded at original resolution
  • Overlap between original products is now handled based on the footprint in STAC/Opensearch metadata
  • Logging for synchronous jobs is now more complete
  • First prototype for running vector data UDF's on Spark
  • Bugfix: allow large (multiple GB) CSV output
  • Try to avoid going out of memory by reducing default partition size

2022-09-21 (0.6.3a1)

  • Expose logging from UDF's
  • Feature id's from GeoJSON are used to name timeseries in netCDF export
  • NetCDF's are now cropped to provided extent
  • Support remote STAC collections
  • Sentinelhub usage is now recorded for batch jobs
  • "task-cpus" job option to control number of cpu's for a single Spark task. Mostly relevant for UDF's that use multi-threaded libraries such as Tensorflow.
  • New processes:
    • array_find
    • exp

2022-05-04 (0.6.3a1)

  • Enable JSON logging from batch_job.py (and inject user_id/job_id)
  • New processes:
    • predict_catboost (not-standard)
    • predict_random_forest
    • fit_class_random_forest
    • array_interpolate_linear
  • Faster sar_backscatter both for large areas and sparse sampling
  • STAC metadata for random forest models
  • Colormap support in PNG's
  • Support custom Sentinelhub collections, e.g. PlanetScope data
  • 'soft-errors' job option to allow failure of individual Sentinelhub requests

2022-04-07 (0.6.2a1)

  • EP-4012: implement collection source selection based on product availability (e.g. collection "SENTINEL2_L2A" "forwards" to "TERRASCOPE_S2_TOC_V2" when possible, but falls back to "SENTINEL2_L2A_SENTINELHUB" when there are missing products for the selected spatiotemporal extent).
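The forwarding idea can be sketched as follows. This is a simplified illustration under assumed inputs, not the backend's logic: the function name, the product identifiers and the way availability is looked up are all hypothetical. Only the collection names come from the entry above.

```python
def select_source(
    requested: set[str],
    available_by_source: dict[str, set[str]],
    preferred: str,
    fallback: str,
) -> str:
    """Pick `preferred` if it has every requested product, else `fallback`."""
    missing = requested - available_by_source.get(preferred, set())
    return fallback if missing else preferred


# Hypothetical product ids for the selected spatiotemporal extent
requested = {"tile31UFS_20220401", "tile31UFS_20220406"}
available = {"TERRASCOPE_S2_TOC_V2": {"tile31UFS_20220401"}}

print(select_source(requested, available, "TERRASCOPE_S2_TOC_V2", "SENTINEL2_L2A_SENTINELHUB"))
```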

2021-11-17

Feature

  • Support load_result
  • Allow raster masks to filter a collection before loading any data
  • Caching of Sentinelhub data
  • Streaming writing of netCDF files
  • Support filter_spatial
  • Support first and last processes
  • Jep based UDF implementation

2021-07-14

Changed

  • Add support for openeo.udf based UDFs and keep backward compatibility with openeo_udf based UDFs (EP-3856, #78, #93)

2021-04-08

Feature

  • Add support for (multiple) default OIDC clients (for EGI Check-in OIDC provider) (EP-3700, Open-EO/openeo-api#366)

2021-03-30

Feature

  • Add support for Sentinelhub layers on different endpoints (e.g. Landsat-8, MODIS)
  • In batch jobs, write one geotiff per date as opposed to reducing all dates into a single pixel
  • Improved CARD4L metadata generation for atmospheric_correction

2021-03-12

  • Fix support for UDPs in batch jobs (EP-3754)
  • Fix support for custom processes in batch jobs (EP-3771)

2021-01-26

Feature

Add an experimental resolution_merge for Sentinel-2 based on the implementation in FORCE.

Support reading Copernicus Global Land NetCDF files.

Support the Sentinelhub batch process API to generate Sentinel-1 backscatter data.

The atmospheric_correction process can now apply iCor on SentinelHub layers.

2021-01-25

Feature

  • Add implementation of on-the-fly Sentinel1 Backscatter (Sigma0) calculation using Orfeo Toolbox on Creodias (EP-3612)

2020-12-06

Performance

Performance improvement for requests with small spatial extents. The backend was loading too much tile metadata.

2020-11-11

Feature

Support the "if" process: https://processes.openeo.org/#if

Major performance improvements for SentinelHub layers. The UTM projection is now used by default when processing these layers. The datatype is no longer set to float by default.

2020-10-28

Internal

Refactored internal process graph parsing: first do a dry-run pass to extract information that can help loading initial data sources. (EP-3509)

2020-10-14

Feature

Support "PNG" output format (non-indexed only).

2020-10-06

Performance improvement

Geotiff (GTiff) output format is faster to generate, and works for larger areas.

Compatibility

Copernicus projections stored in UTM are now also processed and returned in UTM, as opposed to web mercator. This affects processing parameters that depend on a specific projection, like the size of pixels in map units.

This change also improves memory usage and performance for a number of layers.
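To illustrate why pixel size in map units depends on the projection: in spherical web mercator, map distances are stretched by 1/cos(latitude), whereas UTM map units stay close to true ground distance within a zone. The function name and sample values below are hypothetical.

```python
import math


def web_mercator_ground_size(pixel_size_map_units: float, latitude_deg: float) -> float:
    """Approximate ground size of a web mercator pixel at a given latitude."""
    return pixel_size_map_units * math.cos(math.radians(latitude_deg))


# A pixel of 10 map units covers less ground the further it is from the equator.
for lat in (0.0, 51.0, 60.0):
    print(lat, round(web_mercator_ground_size(10.0, lat), 2))
```

A parameter such as a buffer distance or kernel size expressed in map units therefore covers a different ground distance in web mercator than in UTM.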