Error training model - band regex? #466

Closed
graceebc9 opened this issue Mar 16, 2022 · 5 comments

Labels: datasets (Geospatial or benchmark datasets)

Comments

graceebc9 commented Mar 16, 2022

Hello!

I've got a custom datamodule for Landcover / MODIS / Sentinel data. The datamodule works fine when called directly: sampling with a dataloader, I can plot the one mask and four channels.

The issue comes when I try to run this datamodule with a binary semantic segmentation task: the raster dataset in geo.py fails on the bands with 'no such group'. I've looked at the source code in datasets/geo.py, but I'm not clear how to solve it.

It seems to be some kind of issue with the band and the regex: the date appears to match fine, but the band group seems to fail. However, I copied the form of the band regex from the torchgeo Sentinel2 class.

from torchgeo.datasets import Sentinel2

class Sentinel2(Sentinel2):
    filename_glob = '*B03.tif'
    filename_regex = '^(?P<date>\d{6})\S{4}(?P<band>B[018][\dA]).tif$'
    date_format = '%Y%m'
    all_bands = ['B03', 'B08', 'B11']

def main():

    datamodule = MODISJDLandcoverSimpleDataModule(
        modis_root_dir="MODIS/",
        landcover_root_dir="landcover/Classified/",
        sentinel_root_dir="sentinel/",
        patch_size=250,
        batch_size=10,
        length=10,
        num_workers=0,
        one_hot_encode=False,
        balance_samples=False,
        burn_prop=0,
        grid_sampler=False,
        units=Units.PIXELS,
    )

    # ignore_zeros=True corresponds to ignoring the background class
    # in metrics evaluation
    model = BinarySemanticSegmentationTask(
        segmentation_model="unet",
        encoder_name="resnet18",
        encoder_weights=None, #"imagenet",
        in_channels=4,
        num_filters=64,
        num_classes=2,
        loss="jaccard",
        # tversky_alpha=0.7,
        # tversky_beta=0.3,
        # tversky_gamma=1.0,
        learning_rate=0.1,
        ignore_zeros=False,
        learning_rate_schedule_patience=5,
    )

    trainer = Trainer(gpus=1, fast_dev_run=True)


    # this is used when automatically finding the learning rate
    trainer.tune(
        model, datamodule
    )  
    trainer.fit(model, datamodule)


if __name__ == "__main__":

    # set random seed for reproducibility
    pl.seed_everything(0)

    # TRAIN
    main()
Global seed set to 0
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
Running in fast_dev_run mode: will run a full train, val, test and prediction loop using 1 batch(es).
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name          | Type             | Params
---------------------------------------------------
0 | model         | Unet             | 14.3 M
1 | loss          | JaccardLoss      | 0     
2 | train_metrics | MetricCollection | 0     
3 | val_metrics   | MetricCollection | 0     
4 | test_metrics  | MetricCollection | 0     
---------------------------------------------------
14.3 M    Trainable params
0         Non-trainable params
14.3 M    Total params
57.325    Total estimated model params size (MB)
/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/data_loading.py:433: UserWarning: The number of training samples (1) is smaller than the logging interval Trainer(log_every_n_steps=50). Set a lower value for log_every_n_steps if you want to see logs for the training epoch.
  f"The number of training samples ({self.num_training_batches}) is smaller than the logging interval"
Epoch 0: 0% 0/2 [00:00<?, ?it/s]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-33-74529b3a24c6> in <module>()
     72 
     73     # TRAIN
---> 74     main()

26 frames
/usr/local/lib/python3.7/dist-packages/torchgeo/datasets/geo.py in __getitem__(self, query)
    415                     if match:
    416                         if "date" in match.groupdict():
--> 417                             start = match.start("band")
    418                             end = match.end("band")
    419                             filename = filename[:start] + band + filename[end:]

IndexError: no such group
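
For context, the frame above splices each requested band into the span matched by the named group "band" in the filename. A simplified sketch of that logic, with a made-up pattern and filename for illustration (this is a rough reading of the traceback, not torchgeo's actual code):

import re

# Hypothetical pattern and filename, only to illustrate the substitution
# performed in geo.py's __getitem__ above.
filename_regex = r"^(?P<date>\d{6})_(?P<band>B\d{2})\.tif$"
filename = "201808_B03.tif"

match = re.match(filename_regex, filename)
if match and "date" in match.groupdict():
    # geo.py then assumes a "band" group as well; if the compiled pattern has
    # no (?P<band>...) group, match.start("band") raises IndexError: no such group
    start, end = match.start("band"), match.end("band")
    for band in ["B03", "B08", "B11"]:
        # splice each band into the matched span of the original filename
        print(filename[:start] + band + filename[end:])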

graceebc9 (Author)

The format of the file names is '201808_21_B03.tif'.

adamjstewart (Collaborator)

This one took me a long time to figure out. You need to use a raw string (filename_regex = r'...') or replace all \ with \\.
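
For example, a minimal sketch of that fix applied to the snippet above (the subclass is renamed here to avoid shadowing the imported Sentinel2; the pattern itself is unchanged apart from the raw-string prefix):

import re

from torchgeo.datasets import Sentinel2

class MySentinel2(Sentinel2):
    filename_glob = "*B03.tif"
    # raw string, so \d and \S reach the regex engine unchanged
    filename_regex = r"^(?P<date>\d{6})\S{4}(?P<band>B[018][\dA]).tif$"
    date_format = "%Y%m"
    all_bands = ["B03", "B08", "B11"]

# quick check against the reported filename format
m = re.match(MySentinel2.filename_regex, "201808_21_B03.tif")
print(m.group("date"), m.group("band"))  # 201808 B03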

graceebc9 commented Mar 16, 2022

Thank you very much! That solved it, but now I'm getting this error:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-62-d2a0032e8c00> in <module>()
     70 
     71     # TRAIN
---> 72     main()

39 frames
/usr/local/lib/python3.7/dist-packages/segmentation_models_pytorch/unet/decoder.py in forward(self, x, skip)
     36         x = F.interpolate(x, scale_factor=2, mode="nearest")
     37         if skip is not None:
---> 38             x = torch.cat([x, skip], dim=1)
     39             x = self.attention1(x)
     40         x = self.conv1(x)

RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 64 but got size 63 for tensor number 1 in the list.

adamjstewart (Collaborator)

This seems related to some previously reported bugs. Basically, the UNet that comes with SMP requires images with a patch_size divisible by 32. Can you try switching from 250 to 256 and see if that solves your issue?
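
A rough way to see why 250 trips this up (my own sketch, not from the thread; it assumes each of the encoder's five downsampling stages roughly halves the spatial size with ceil rounding, which matches the stride-2 layers in the ResNet-18 encoder):

def stage_sizes(patch_size, stages=5):
    # spatial size after each halving stage of the encoder
    sizes = [patch_size]
    for _ in range(stages):
        sizes.append((sizes[-1] + 1) // 2)  # ceil division by 2
    return sizes

print(stage_sizes(250))  # [250, 125, 63, 32, 16, 8]
print(stage_sizes(256))  # [256, 128, 64, 32, 16, 8]

With 250, the decoder upsamples 32 -> 64 while the corresponding skip connection is 63 wide, which is exactly the "Expected size 64 but got size 63" mismatch above; with 256 every stage doubles back cleanly, so setting patch_size=256 in the datamodule avoids it.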

graceebc9 (Author)

Yes, that worked, thanks so much!
