Add support for float type raster datasets #379

tritolol · 2022-01-31T09:27:43Z

I'm trying to load a custom DSM raster dataset of type float32.

I'm implementing a new class based on RasterDataset like so:

from torchgeo.datasets import RasterDataset

class DsmData(RasterDataset):
    filename_glob = "*.tif"

When I call __getitem__() on a DsmData object, I get an array containing int32 data instead.
I traced this to the following line in the RasterDataset definition:

torchgeo/torchgeo/datasets/geo.py

Line 464 in 3f7e525

dest = dest.astype(np.int32)

All raster datasets are forced into the int32 type, which should not happen.

Having RasterDatasets with different types will probably cause problems when applying union or intersection operations.
But since a custom collate_fn can be defined, the user is able to provide a solution for this.

I achieved the desired behavior simply by removing the mentioned line.
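To illustrate the effect described above, here is a minimal NumPy-only sketch. The sample values and the helper `cast_if_unsupported` are hypothetical and not part of torchgeo. The helper shows one possible alternative to deleting the line outright: cast only dtypes that are assumed to be unsupported by `torch.from_numpy` in the PyTorch version in use (uint16 and uint32 here), and leave everything else, including float32, untouched. This is not necessarily what the eventual fix does.

```python
import numpy as np

# Hypothetical float32 DSM tile (elevations in metres); values are made up.
dsm = np.array([[101.25, 99.75]], dtype=np.float32)

# An unconditional cast discards the fractional part of every elevation.
forced = dsm.astype(np.int32)
print(forced.dtype, forced.tolist())  # int32 [[101, 99]]


def cast_if_unsupported(dest: np.ndarray) -> np.ndarray:
    """Hypothetical helper, not torchgeo API: widen only uint16/uint32.

    Assumes these two dtypes are the ones the installed PyTorch cannot
    convert; every other dtype is returned unchanged.
    """
    if dest.dtype == np.uint16:
        return dest.astype(np.int32)
    if dest.dtype == np.uint32:
        return dest.astype(np.int64)
    return dest


kept = cast_if_unsupported(dsm)
print(kept.dtype, kept.tolist())  # float32 [[101.25, 99.75]]
```

With a conditional cast like this, float rasters keep their values while unsigned integer rasters are still widened to a signed type large enough to hold them.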


calebrob6 · 2022-01-31T16:39:02Z

Oof, good catch. We should definitely change this. Do you want to open a PR to get this started?

adamjstewart · 2022-01-31T20:04:04Z

> Having RasterDatasets with different types will probably cause problems when applying union or intersection operations.

Nope, this won't be a problem. All of our collation functions use torch operators that can handle various dtypes:

$ python
>>> import torch
>>> a = torch.zeros(3, 3, dtype=torch.int32)
>>> b = torch.zeros(3, 3, dtype=torch.float32)
>>> torch.stack((a, b))
tensor([[[0., 0., 0.],
         [0., 0., 0.],
         [0., 0., 0.]],

        [[0., 0., 0.],
         [0., 0., 0.],
         [0., 0., 0.]]])
>>> torch.maximum(a, b)
tensor([[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]])
>>> torch.cat((a, b))
tensor([[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]])

tritolol · 2022-02-01T15:03:37Z

Created PR #384

adamjstewart added the "datasets" label (Geospatial or benchmark datasets) Jan 31, 2022

tritolol mentioned this issue Feb 1, 2022

Fix forced int32 type conversion in RasterDataset #384

Merged

calebrob6 closed this as completed in #384 Feb 17, 2022

adamjstewart added this to the 0.2.1 milestone Mar 19, 2022
