Ingestor migration from 1.5.5 to 1.6rc1 causes product incompatibility warnings/errors #423

Closed
Kirill888 opened this issue Apr 19, 2018 · 8 comments

Comments

@Kirill888
Contributor

commented Apr 19, 2018

Expected behaviour

Should be able to update previously ingested product with the new datacube version.

Actual behaviour

Reported by @loicdtx on Slack: the ingestor aborts with this error:

ValueError: Ingest config differs from the existing output product, but allow_product_changes=False

Steps to reproduce the behaviour

Config used for ingestion is available here

https://github.com/CONABIO/antares3/blob/3edfecb46ea425ce67f518d3ab13cf08ca36718d/madmex/conf/ingestion/s2_l2a_20m_mexico.yaml

EDIT: link above

Environment information

  • Originally ingested with 1.5.5
  • Failing to update with 1.6rc1
@Kirill888

Contributor Author

commented Apr 19, 2018

Rolling back to 1.5.5 allows ingestion to proceed, so it's definitely not a config file change problem.

@loicdtx

Contributor

commented Apr 19, 2018

Here's the output of datacube product show s2_l2a_20m_granule

{
    "metadata_type": "eo",
    "name": "s2_l2a_20m_granule",
    "description": "Sentinel 2 data processed with sen2cor",
    "metadata": {
        "instrument": {
            "name": "MSI"
        },
        "platform": {
            "code": "sentinel2"
        },
        "product_type": "sen2cor",
        "format": {
            "name": "JPEG2000"
        }
    },
    "measurements": [
        {
            "nodata": 0,
            "name": "blue",
            "dtype": "uint16",
            "aliases": [
                "band_2",
                "blue"
            ],
            "units": "reflectance"
        },
        {
            "nodata": 0,
            "name": "green",
            "dtype": "uint16",
            "aliases": [
                "band_3",
                "green"
            ],
            "units": "reflectance"
        },
        {
            "nodata": 0,
            "name": "red",
            "dtype": "uint16",
            "aliases": [
                "band_4",
                "red"
            ],
            "units": "reflectance"
        },
        {
            "nodata": 0,
            "name": "re1",
            "dtype": "uint16",
            "aliases": [
                "band_5",
                "red_edge_1",
                "re1"
            ],
            "units": "reflectance"
        },
        {
            "nodata": 0,
            "name": "re2",
            "dtype": "uint16",
            "aliases": [
                "band_6",
                "red_edge_2",
                "re2"
            ],
            "units": "reflectance"
        },
        {
            "nodata": 0,
            "name": "re3",
            "dtype": "uint16",
            "aliases": [
                "band_7",
                "red_edge_3",
                "re3"
            ],
            "units": "reflectance"
        },
        {
            "nodata": 0,
            "name": "nir",
            "dtype": "uint16",
            "aliases": [
                "band_8A",
                "nir"
            ],
            "units": "reflectance"
        },
        {
            "nodata": 0,
            "name": "swir1",
            "dtype": "uint16",
            "aliases": [
                "band_11",
                "swir1"
            ],
            "units": "reflectance"
        },
        {
            "nodata": 0,
            "name": "swir2",
            "dtype": "uint16",
            "aliases": [
                "band_12",
                "swir2"
            ],
            "units": "reflectance"
        },
        {
            "name": "pixel_qa",
            "dtype": "uint16",
            "flags_definition": {
                "sca": {
                    "bits": [
                        0,
                        1,
                        2,
                        3,
                        4,
                        5,
                        6,
                        7,
                        8,
                        9,
                        10,
                        11,
                        12,
                        13,
                        14,
                        15
                    ],
                    "description": "Sen2Cor Scene Classification",
                    "values": {
                        "4": "Vegetation",
                        "5": "Not-vegetated",
                        "9": "Cloud high probability",
                        "0": "No Data",
                        "3": "Cloud shadows",
                        "11": "Snow or ice",
                        "10": "Thin cirrus",
                        "8": "Cloud medium probability",
                        "6": "Water",
                        "7": "Unclassified",
                        "2": "Dark features / Shadows",
                        "1": "Saturated or defective pixel"
                    }
                }
            },
            "nodata": 0,
            "aliases": [
                "slc",
                "qa"
            ],
            "units": "1"
        }
    ]
}
datacube --version
Open Data Cube core, version 1.5.5+5.gdd0aaa3
@loicdtx

Contributor

commented Apr 20, 2018

And here's the output of datacube product show s2_l2a_20m_mexico

{
    "storage": {
        "crs": "PROJCS[\"unnamed\",GEOGCS[\"WGS 84\",DATUM[\"unknown\",SPHEROID[\"WGS84\",6378137,6556752.3141]],PRIMEM[\"Greenwich\",0],UNIT[\"degree\",0.0174532925199433]],PROJECTION[\"Lambert_Conformal_Conic_2SP\"],PARAMETER[\"standard_parallel_1\",17.5],PARAMETER[\"standard_parallel_2\",29.5],PARAMETER[\"latitude_of_origin\",12],PARAMETER[\"central_meridian\",-102],PARAMETER[\"false_easting\",2500000],PARAMETER[\"false_northing\",0]]",
        "origin": {
            "y": 2426720,
            "x": 977160
        },
        "resolution": {
            "y": -20,
            "x": 20
        },
        "tile_size": {
            "y": 100020,
            "x": 100020
        }
    },
    "name": "s2_l2a_20m_mexico",
    "managed": true,
    "metadata": {
        "product_type": "sen2cor",
        "instrument": {
            "name": "MSI"
        },
        "platform": {
            "code": "sentinel2"
        },
        "format": {
            "name": "NetCDF"
        }
    },
    "measurements": [
        {
            "name": "blue",
            "aliases": [
                "band_2",
                "blue"
            ],
            "nodata": 0,
            "dtype": "uint16",
            "units": "reflectance"
        },
        {
            "name": "green",
            "aliases": [
                "band_3",
                "green"
            ],
            "nodata": 0,
            "dtype": "uint16",
            "units": "reflectance"
        },
        {
            "name": "red",
            "aliases": [
                "band_4",
                "red"
            ],
            "nodata": 0,
            "dtype": "uint16",
            "units": "reflectance"
        },
        {
            "name": "re1",
            "aliases": [
                "band_5",
                "red_edge_1",
                "re1"
            ],
            "nodata": 0,
            "dtype": "uint16",
            "units": "reflectance"
        },
        {
            "name": "re2",
            "aliases": [
                "band_6",
                "red_edge_2",
                "re2"
            ],
            "nodata": 0,
            "dtype": "uint16",
            "units": "reflectance"
        },
        {
            "name": "re3",
            "aliases": [
                "band_7",
                "red_edge_3",
                "re3"
            ],
            "nodata": 0,
            "dtype": "uint16",
            "units": "reflectance"
        },
        {
            "name": "nir",
            "aliases": [
                "band_8A",
                "nir"
            ],
            "nodata": 0,
            "dtype": "uint16",
            "units": "reflectance"
        },
        {
            "name": "swir1",
            "aliases": [
                "band_11",
                "swir1"
            ],
            "nodata": 0,
            "dtype": "uint16",
            "units": "reflectance"
        },
        {
            "name": "swir2",
            "aliases": [
                "band_12",
                "swir2"
            ],
            "nodata": 0,
            "dtype": "uint16",
            "units": "reflectance"
        },
        {
            "name": "pixel_qa",
            "flags_definition": {
                "sca": {
                    "bits": [
                        0,
                        1,
                        2,
                        3,
                        4,
                        5,
                        6,
                        7,
                        8,
                        9,
                        10,
                        11,
                        12,
                        13,
                        14,
                        15
                    ],
                    "values": {
                        "9": "Cloud high probability",
                        "5": "Not-vegetated",
                        "6": "Water",
                        "2": "Dark features / Shadows",
                        "3": "Cloud shadows",
                        "1": "Saturated or defective pixel",
                        "10": "Thin cirrus",
                        "4": "Vegetation",
                        "8": "Cloud medium probability",
                        "0": "No Data",
                        "7": "Unclassified",
                        "11": "Snow or ice"
                    },
                    "description": "Sen2Cor Scene Classification"
                }
            },
            "dtype": "uint16",
            "units": "1",
            "aliases": [
                "slc",
                "qa"
            ],
            "nodata": 0
        }
    ],
    "metadata_type": "eo",
    "description": "Sentinel 2 bottom of atmosphere processed with sen2cor. Resampled to 20m Mexico INEGI Lambert Conformal Conic projection with a 100 km tile size."
}
@Kirill888

Contributor Author

commented Apr 20, 2018

My guess so far is that format.name is to blame in this case

output_type.metadata_doc['format'] = {'name': storage_format}

In the past we copied the format name as is; now it comes from the "normalised" driver name. So my guess is that the difference is "NetCDF" != "NetCDF CF".

Ingest should report more about the differences, since it has access to that info.

@uchchwhash

Contributor

commented Apr 20, 2018

The normalised driver name issue has come up before, see #411 for example. Perhaps include this in the 'What's New' section of the release notes?

edit: apologies, #411 seems to be a separate issue altogether.

@omad

Member

commented May 1, 2018

The change that has broken things here is that storage.driver is now being stored in the Product in the DB. It was made in commit 63d721b when DriverManager was still a thing.

I think it was an oversight not to revert this, and the best option is to not store the storage.driver name in the database product. As far as I can tell, that would be more in line with the fields that are kept when morphing from an ingestion configuration to a product definition.
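To illustrate the idea of keeping only selected fields when morphing an ingestion configuration into a product definition, here is a minimal sketch. It is not code from the datacube codebase: the helper `storage_for_product` and the constant `KEPT_STORAGE_KEYS` are hypothetical names, and the list of kept keys is simply an assumption based on the storage section that `datacube product show s2_l2a_20m_mexico` printed under 1.5.5 (crs, origin, resolution, tile_size).

```python
# Hypothetical whitelist: the storage keys visible in the 1.5.5 product shown earlier.
KEPT_STORAGE_KEYS = ("crs", "origin", "resolution", "tile_size")


def storage_for_product(ingest_storage):
    """Copy only whitelisted storage keys; anything else (e.g. 'driver') is dropped."""
    return {key: ingest_storage[key] for key in KEPT_STORAGE_KEYS if key in ingest_storage}


# Illustrative ingestion-config storage section (crs shortened to a placeholder).
ingest_storage = {
    "driver": "NetCDF CF",
    "crs": "<WKT string>",
    "origin": {"y": 2426720, "x": 977160},
    "resolution": {"y": -20, "x": 20},
    "tile_size": {"y": 100020, "x": 100020},
}

product_storage = storage_for_product(ingest_storage)
print(sorted(product_storage))  # ['crs', 'origin', 'resolution', 'tile_size']
```

With a whitelist like this, adding a new key to the ingestion config would not by itself change the stored product definition.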

@Kirill888 Would this have any impact on how drivers are selected?

@jeremyh @andrewdhicks Does this sound okay to you two?

@omad

Member

commented May 1, 2018

@loicdtx
If you run the ingester in verbose mode, it will log which changes, if any, need to be made to the target product. I've run a test with your configuration files between 1.5.5 and 1.6rc1 and get the following output:

$ datacube -v ingest -c s2_l2a_20m_mexico.yaml
2018-05-01 10:30:28,396 68994 datacube INFO Running datacube command: /Users/omad/miniconda3/envs/py36/bin/datacube -v ingest -c s2_l2a_20m_mexico.yaml
2018-05-01 10:30:28,563 68994 datacube-ingest INFO Created DatasetType s2_l2a_20m_mexico
2018-05-01 10:30:28,575 68994 datacube.index._products INFO Unsafe change in storage.driver from missing to 'NetCDF CF'
2018-05-01 10:30:28,575 68994 datacube-ingest INFO Cannot update "s2_l2a_20m_mexico": 1 unsafe changes, 0 safe changes
2018-05-01 10:30:28,575 68994 datacube-ingest INFO Safe changes: []
2018-05-01 10:30:28,575 68994 datacube-ingest INFO Unsafe changes: [(('storage', 'driver'), missing, 'NetCDF CF')]

This shows how we're now attempting to store storage.driver whereas before we weren't. I think this is a bug and we'll fix it in the next release.
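The change tuples in the log above, of the form (path, old value, new value), can be reproduced with a small recursive comparison of two product documents. The sketch below is only an illustration and does not claim to be how datacube computes its changes; `diff_docs` and the `MISSING` sentinel are hypothetical names, and the two documents are cut down to a few fields.

```python
class _Missing:
    """Sentinel for a key that is absent on one side of the comparison."""

    def __repr__(self):
        return "missing"


MISSING = _Missing()


def diff_docs(old, new, path=()):
    """Return a list of (path, old_value, new_value) for every differing entry."""
    changes = []
    if isinstance(old, dict) and isinstance(new, dict):
        # Walk the union of keys so additions and removals are both reported.
        for key in sorted(set(old) | set(new)):
            changes.extend(
                diff_docs(old.get(key, MISSING), new.get(key, MISSING), path + (key,))
            )
    elif old != new:
        changes.append((path, old, new))
    return changes


# Cut-down stand-ins for the stored product and the one derived from the config.
existing = {
    "name": "s2_l2a_20m_mexico",
    "storage": {"resolution": {"x": 20, "y": -20}},
}
candidate = {
    "name": "s2_l2a_20m_mexico",
    "storage": {"resolution": {"x": 20, "y": -20}, "driver": "NetCDF CF"},
}

changes = diff_docs(existing, candidate)
for path, old, new in changes:
    print(f"{'.'.join(path)}: {old!r} -> {new!r}")
# prints: storage.driver: missing -> 'NetCDF CF'
```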

In the meantime, after reviewing the log output, it's possible to run:

datacube -v ingest --allow-product-changes -c s2_l2a_20m_mexico.yaml

This will update the product in the database. After we fix the bug, you would need to run it again with the same option to convert back to the old definition.

@Kirill888

Contributor Author

commented May 1, 2018

@omad only format and protocol are used to select the driver; nothing under storage is consulted. Not sure what the expectations in the s3aio driver are.

@omad omad referenced this issue May 1, 2018
3 of 5 tasks complete

omad added a commit that referenced this issue May 2, 2018

omad added a commit that referenced this issue May 2, 2018

@omad omad closed this in #436 May 2, 2018
