geotiff: _write_vrt_tiled drops nodatavals/gdal_metadata/extra_tags/resolution/raster_type #1606

@brendancol

Summary

to_geotiff(da, "out.vrt") (which dispatches to _write_vrt_tiled) silently drops several metadata attrs that to_geotiff(da, "out.tif") propagates from a rioxarray-style DataArray:

  • attrs['nodatavals'] / attrs['_FillValue'] -- the VRT path uses data.attrs.get('nodata') directly instead of _resolve_nodata_attr(data.attrs), the alias resolver added for the TIF / GPU writers in #1582 ("to_geotiff drops rioxarray nodatavals and CF _FillValue silently").
  • attrs['gdal_metadata'] / attrs['gdal_metadata_xml']
  • attrs['extra_tags']
  • attrs['image_description'] / attrs['extra_samples'] / attrs['colormap'] (folded in via _merge_friendly_extra_tags on the TIF path)
  • attrs['x_resolution'] / attrs['y_resolution'] / attrs['resolution_unit']
  • attrs['raster_type']

A rioxarray-sourced raster therefore round-trips correctly through .tif but loses its nodata sentinel, along with the other tags listed above, through .vrt. The two destinations should produce metadata-equivalent outputs.
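The nodata bullet comes down to alias resolution. The body of _resolve_nodata_attr is not shown in this issue, so the following is only a rough sketch of what such a resolver might do: the name resolve_nodata_sketch and the precedence order (nodata, then nodatavals, then _FillValue) are assumptions, not the library's actual behavior.

```python
def resolve_nodata_sketch(attrs):
    """Hypothetical stand-in for an alias resolver; precedence is assumed."""
    # 1. An explicit 'nodata' attr wins if it is set.
    value = attrs.get('nodata')
    if value is not None:
        return value
    # 2. rioxarray-style 'nodatavals' is a per-band tuple; take the first
    #    entry that is not None.
    for candidate in attrs.get('nodatavals') or ():
        if candidate is not None:
            return candidate
    # 3. Fall back to the CF-style '_FillValue' (None if absent).
    return attrs.get('_FillValue')


print(resolve_nodata_sketch({'nodatavals': (-9999.0,)}))
print(resolve_nodata_sketch({}))
```

Reading attrs.get('nodata') alone corresponds to stopping after step 1, which is why an input carrying only nodatavals or _FillValue ends up with no sentinel on the VRT path.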

Repro

import numpy as np, xarray as xr, tempfile, os, glob
from xrspatial.geotiff import to_geotiff, open_geotiff

arr = np.arange(64, dtype=np.float32).reshape(8, 8)
arr[0, 0] = -9999.0
da = xr.DataArray(arr, dims=('y', 'x'),
                  coords={'y': np.arange(8.0), 'x': np.arange(8.0)},
                  attrs={
                      'nodatavals': (-9999.0,),   # rioxarray convention
                      'crs': 4326,
                      'gdal_metadata': {'foo': 'bar'},
                      'raster_type': 'point',
                      'x_resolution': 96,
                      'y_resolution': 96,
                      'resolution_unit': 'inch',
                  })

with tempfile.TemporaryDirectory() as tmpd:
    tif = os.path.join(tmpd, 'out.tif')
    vrt = os.path.join(tmpd, 'out.vrt')
    to_geotiff(da, tif, tile_size=4)
    to_geotiff(da, vrt, tile_size=4)

    tif_da = open_geotiff(tif)
    tile = sorted(glob.glob(os.path.join(tmpd, 'out_tiles', '*.tif')))[0]
    tile_da = open_geotiff(tile)

    print('TIF:', tif_da.attrs.get('nodata'), tif_da.attrs.get('gdal_metadata'),
          tif_da.attrs.get('raster_type'), tif_da.attrs.get('x_resolution'))
    print('VRT tile:', tile_da.attrs.get('nodata'),
          tile_da.attrs.get('gdal_metadata'),
          tile_da.attrs.get('raster_type'),
          tile_da.attrs.get('x_resolution'))

Output:

TIF: -9999.0 {'foo': 'bar'} point 96
VRT tile: None None None None
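For a regression test, the comparison printed above could be turned into an assertion. The helper below is a hypothetical sketch (metadata_diff and KEYS are illustrative names, not part of xrspatial); the sample dicts simply mirror the values in the output above.

```python
def metadata_diff(reference_attrs, candidate_attrs, keys):
    """Return {key: (reference, candidate)} for every key whose values differ."""
    return {
        key: (reference_attrs.get(key), candidate_attrs.get(key))
        for key in keys
        if reference_attrs.get(key) != candidate_attrs.get(key)
    }


# Keys inspected in the repro above.
KEYS = ('nodata', 'gdal_metadata', 'raster_type', 'x_resolution')

# Attr values as reported in the output above.
tif_attrs = {'nodata': -9999.0, 'gdal_metadata': {'foo': 'bar'},
             'raster_type': 'point', 'x_resolution': 96}
tile_attrs = {}

diff = metadata_diff(tif_attrs, tile_attrs, KEYS)
print(sorted(diff))
```

In a test, `assert not metadata_diff(tif_da.attrs, tile_da.attrs, KEYS)` would fail today and pass once the two paths agree. A NaN nodata value would need special handling, since NaN never compares equal to itself.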

Fix

In _write_vrt_tiled:

  1. Use _resolve_nodata_attr(data.attrs) to honor nodatavals / _FillValue like the TIF and GPU writers do.
  2. Pull raster_type, gdal_metadata (XML or dict), extra_tags (folded with friendly tag attrs via _merge_friendly_extra_tags), x_resolution, y_resolution, resolution_unit from data.attrs and thread them to each per-tile write_single_tile call so every tile carries the same rich metadata as the equivalent TIF write would.
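The two steps above can be sketched as follows. This is an illustration of the threading pattern only: the real signatures of _write_vrt_tiled and write_single_tile are not shown in this issue, so collect_tile_metadata, write_tiles_sketch, and the write_tile / resolve_nodata callables are hypothetical, and the _merge_friendly_extra_tags folding step is left out.

```python
# Attribute names are taken from the issue; everything else is hypothetical.
METADATA_KEYS = (
    'raster_type', 'gdal_metadata', 'gdal_metadata_xml', 'extra_tags',
    'x_resolution', 'y_resolution', 'resolution_unit',
)


def collect_tile_metadata(attrs, resolve_nodata):
    """Gather the per-tile keyword arguments once, before the tile loop."""
    kwargs = {key: attrs[key] for key in METADATA_KEYS if key in attrs}
    # Step 1: resolve nodata through the alias resolver, not attrs.get('nodata').
    kwargs['nodata'] = resolve_nodata(attrs)
    return kwargs


def write_tiles_sketch(tiles, attrs, write_tile, resolve_nodata):
    """Step 2: pass identical metadata to the writer for every tile."""
    kwargs = collect_tile_metadata(attrs, resolve_nodata)
    for name, tile in tiles:
        write_tile(tile, name, **kwargs)


# Usage with a fake writer that records what it was given.
calls = []
attrs = {'nodatavals': (-9999.0,), 'raster_type': 'point',
         'x_resolution': 96, 'crs': 4326}
write_tiles_sketch(
    [('tile_0.tif', 'data0'), ('tile_1.tif', 'data1')],
    attrs,
    write_tile=lambda tile, name, **kw: calls.append((name, kw)),
    resolve_nodata=lambda a: a['nodatavals'][0],  # placeholder resolver
)
```

Collecting the keyword arguments once, outside the loop, keeps every tile's metadata identical by construction.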

Severity: MEDIUM (backend-inconsistent metadata between .tif and .vrt outputs for the same input).

Found by /sweep-metadata.

Labels: bug
