
Geotiff readers/writers reject file-like (BytesIO) sources despite advertising 'str | path-like' #1511

@brendancol


Problem

open_geotiff, read_geotiff, read_geotiff_dask, read_geotiff_gpu, read_vrt, and to_geotiff are all typed as taking source: str or path: str, and their docstrings describe a file path. The reader's _open_source (xrspatial/geotiff/_reader.py) only handles local paths, http(s) URLs, and fsspec URIs, so passing a BytesIO (or any other file-like object) raises TypeError or AttributeError.

This is a common ask for in-memory pipelines (test fixtures, fetch-then-decode, S3 multipart staging, network-attached buffers) and the obvious workaround (write to a temp file) defeats the purpose of having a fast streaming reader.

Repro

```python
import io

import numpy as np
import xarray as xr
from xrspatial.geotiff import to_geotiff, read_geotiff

buf = io.BytesIO()
# Each call below fails on its own; run them separately to see both errors.
to_geotiff(xr.DataArray(np.zeros((8, 8), np.float32)), buf)  # TypeError
read_geotiff(buf)  # AttributeError
```

Suggested fix

Accept a binary file-like (anything with read/seek/tell, plus write for the writer) wherever the API takes a path. Add a _BytesIOSource class alongside _FileSource / _HttpSource / _FsspecSource and route _open_source to it for non-string inputs. For the writer, when the destination is file-like, build the TIFF into a BytesIO (or directly into the caller's buffer) and skip the os.replace/atomic-rename code path.
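
A rough sketch of the reader side, to make the proposal concrete. The names BytesIOSourceSketch, open_source_sketch, and read_range are hypothetical, and I'm assuming the existing source classes expose some kind of ranged read; the real interface in _reader.py may look different.

```python
import io
import os


class BytesIOSourceSketch:
    """Hypothetical adapter giving ranged reads over a binary file-like."""

    def __init__(self, fileobj):
        # Duck-type check: anything with read/seek/tell is accepted.
        for name in ("read", "seek", "tell"):
            if not callable(getattr(fileobj, name, None)):
                raise TypeError(f"expected a binary file-like, missing {name}()")
        self._f = fileobj
        # Measure the total size without disturbing the caller's position.
        pos = fileobj.tell()
        fileobj.seek(0, os.SEEK_END)
        self.size = fileobj.tell()
        fileobj.seek(pos)

    def read_range(self, offset, length):
        # Read `length` bytes at `offset`, then restore the caller's position.
        pos = self._f.tell()
        try:
            self._f.seek(offset)
            return self._f.read(length)
        finally:
            self._f.seek(pos)


def open_source_sketch(source):
    """Route non-string inputs to the file-like adapter (assumed dispatch)."""
    if isinstance(source, (str, os.PathLike)):
        # Paths, http(s) URLs and fsspec URIs would keep their existing handling.
        raise NotImplementedError("path/URL sources are outside this sketch")
    return BytesIOSourceSketch(source)


src = open_source_sketch(io.BytesIO(b"II*\x00abcdefgh"))
header = src.read_range(0, 4)  # first four bytes of the buffer
```

Restoring the stream position after each read keeps the adapter from surprising callers who continue to use the same buffer afterwards.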

Out of scope:

  • COG output to BytesIO (still needs full materialisation; would work but should be documented).
  • VRT output to BytesIO (VRT is inherently a file-system construct that references sibling files).
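
For the writer half of the suggested fix, the destination handling might look roughly like the following. This is only a sketch: write_bytes_sketch is a hypothetical helper, it takes already-encoded bytes rather than a DataArray, and the temp-file-plus-os.replace branch is my guess at what the current atomic-rename path does.

```python
import io
import os
import tempfile


def write_bytes_sketch(payload, dest):
    """Write already-encoded TIFF bytes to a path or a binary file-like."""
    if isinstance(dest, (str, os.PathLike)):
        # Path destination: write a sibling temp file, then rename atomically.
        directory = os.path.dirname(os.path.abspath(dest))
        fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
        try:
            with os.fdopen(fd, "wb") as f:
                f.write(payload)
            os.replace(tmp, dest)
        except BaseException:
            # Clean up the temp file if the write or rename failed.
            if os.path.exists(tmp):
                os.remove(tmp)
            raise
    elif callable(getattr(dest, "write", None)):
        # File-like destination: skip the rename and write into the buffer.
        dest.write(payload)
    else:
        raise TypeError("dest must be a path or a binary file-like with write()")


buf = io.BytesIO()
write_bytes_sketch(b"II*\x00", buf)  # lands directly in the caller's buffer
```

The caller keeps ownership of the buffer, so the sketch neither closes it nor rewinds it.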
