Getting Error of "No Image" When Trying to Train on Custom Dataset #9

Closed
yimingsu01 opened this issue Sep 2, 2022 · 3 comments

Comments

@yimingsu01

Hi. I was trying to replicate the experiment with my custom dataset of bedrooms. I've modified my local.yaml and set the base path to the image folder, which contains train, test, and validate subfolders. I'm sure that they all contain images. But when I execute this:

python src/run.py +experiment=[blobgan,local,jitter] wandb.name='10-blob BlobGAN on bedrooms'

It gives this error:

AssertionError: Cannot compute FID without name of statistics file to use.

After scrolling up, I found these warnings:

Warning: no images found in /home/PythonProjects/blobgan/resizedout/train. Using empty dataset for split train. Perhaps you set `dataset.path` incorrectly?
Warning: no images found in /home/PythonProjects/blobgan/resizedout/validate. Using empty dataset for split validate. Perhaps you set `dataset.path` incorrectly?
Warning: no images found in /home/PythonProjects/blobgan/resizedout/test. Using empty dataset for split test. Perhaps you set `dataset.path` incorrectly?

This is the entire error log:

resume:
  id: null
  step: null
  epoch: null
  last: true
  best: false
  clobber_hparams: false
  project: Blobgan investigation
  log_dir: ./logs
  model_only: false
logger: wandb
wandb:
  save_code: true
  offline: false
  log_dir: ./logs
  id: null
  name: 10-blob BlobGAN on bedrooms
  group: XXX Group
  project: Blobgan investigation
  entity: yimingsu
trainer:
  accelerator: ddp
  benchmark: false
  deterministic: false
  gpus: 1
  precision: 32
  plugins: null
  max_steps: 10000000
  profiler: simple
  num_sanity_val_steps: 0
  log_every_n_steps: 200
  limit_val_batches: 0
dataset:
  dataloader:
    prefetch_factor: 2
    pin_memory: true
    drop_last: true
    persistent_workers: true
    num_workers: 12
    batch_size: 24
  name: ImageFolderDataModule
  resolution: 256
  category: bedroom
  path: /home/yimingsu/PythonProjects/blobgan/resizedout/
mode: fit
seed: 0
checkpoint:
  every_n_train_steps: 1500
  save_top_k: -1
  mode: max
  monitor: step
model:
  name: BlobGAN
  lr: 0.002
  dim: 512
  noise_dim: 512
  resolution: 256
  lambda:
    D_real: 1
    D_fake: 1
    D_R1: 50
    G: 1
    G_path: 2
    G_feature_mean: 10
    G_feature_variance: 10
  discriminator:
    name: StyleGANDiscriminator
    size: 256
  generator:
    name: models.networks.layoutstylegan.LayoutStyleGANGenerator
    style_dim: 512
    n_mlp: 8
    size_in: 16
    c_model: 96
    spatial_style: true
    size: 256
  layout_net:
    name: models.networks.layoutnet.LayoutGenerator
    n_features_max: 10
    feature_dim: 768
    style_dim: 512
    noise_dim: 512
    norm_features: true
    mlp_lr_mul: 0.01
    mlp_hidden_dim: 1024
    spatial_style: true
  D_reg_every: 16
  G_reg_every: -1
  λ:
    D_real: 1
    D_fake: 1
    D_R1: 50
    G: 1
    G_path: 2
    G_feature_mean: 10
    G_feature_variance: 10
  log_images_every_n_steps: 1000
  n_features_min: 10
  n_features_max: 10
  n_features: 10
  spatial_style: true
  feature_jitter_xy: 0.04
  feature_jitter_shift: 0.5
  feature_jitter_angle: 0.1

Global seed set to 0
wandb: Currently logged in as: yimingsu (use `wandb login --relogin` to force relogin)
wandb: wandb version 0.13.2 is available!  To upgrade, please run:
wandb:  $ pip install wandb --upgrade
wandb: Tracking run with wandb version 0.12.11
wandb: Run data is saved locally in /home/yimingsu/PythonProjects/blobgan/logs/wandb/run-20220902_114754-xjimkxit
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run 10-blob BlobGAN on bedrooms
wandb: ⭐️ View project at https://wandb.ai/yimingsu/Blobgan%20investigation
wandb: 🚀 View run at https://wandb.ai/yimingsu/Blobgan%20investigation/runs/xjimkxit
[2022-09-02 11:47:58,898][torch.distributed.nn.jit.instantiator][INFO] - Created a temporary directory at /tmp/tmpr5cdgp32
[2022-09-02 11:47:58,898][torch.distributed.nn.jit.instantiator][INFO] - Writing /tmp/tmpr5cdgp32/_remote_module_non_sriptable.py
Froze 65 parameters - ['conv1.conv', 'conv1.noise', 'conv1.activate', 'to_rgb1', 'to_rgb1.conv', 'convs.0', 'convs.1', 'convs.2', 'convs.3', 'convs.4', 'convs.5', 'convs.6', 'convs.7', 'to_rgbs.0', 'to_rgbs.1', 'to_rgbs.2', 'to_rgbs.3'] - for model of type LayoutStyleGANGenerator
Froze 16 parameters - ['mlp.1', 'mlp.2', 'mlp.3', 'mlp.4', 'mlp.5', 'mlp.6', 'mlp.7', 'mlp.8'] - for model of type LayoutGenerator
[2022-09-02 11:47:59,637][py.warnings][WARNING] - /home/yimingsu/anaconda3/envs/blobgan/lib/python3.9/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py:286: LightningDeprecationWarning: Passing `Trainer(accelerator='ddp')` has been deprecated in v1.5 and will be removed in v1.7. Use `Trainer(strategy='ddp')` instead.
  rank_zero_deprecation(

[2022-09-02 11:47:59,637][py.warnings][WARNING] - /home/yimingsu/anaconda3/envs/blobgan/lib/python3.9/site-packages/pytorch_lightning/trainer/connectors/callback_connector.py:147: LightningDeprecationWarning: Setting `Trainer(checkpoint_callback=True)` is deprecated in v1.5 and will be removed in v1.7. Please consider using `Trainer(enable_checkpointing=True)`.
  rank_zero_deprecation(

GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
Global seed set to 0
initializing distributed: GLOBAL_RANK: 0, MEMBER: 1/1
[2022-09-02 11:47:59,641][torch.distributed.distributed_c10d][INFO] - Added key: store_based_barrier_key:1 to store for rank: 0
[2022-09-02 11:47:59,641][torch.distributed.distributed_c10d][INFO] - Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes.
----------------------------------------------------------------------------------------------------
distributed_backend=nccl
All distributed processes registered. Starting with 1 processes
----------------------------------------------------------------------------------------------------

Warning: no images found in /home/yimingsu/PythonProjects/blobgan/resizedout/train. Using empty dataset for split train. Perhaps you set `dataset.path` incorrectly?
Warning: no images found in /home/yimingsu/PythonProjects/blobgan/resizedout/validate. Using empty dataset for split validate. Perhaps you set `dataset.path` incorrectly?
Warning: no images found in /home/yimingsu/PythonProjects/blobgan/resizedout/test. Using empty dataset for split test. Perhaps you set `dataset.path` incorrectly?
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Optimizing 57.19M params for G and 28.86M params for D

  | Name           | Type                    | Params
-----------------------------------------------------------
0 | discriminator  | StyleGANDiscriminator   | 28.9 M
1 | generator_ema  | LayoutStyleGANGenerator | 35.9 M
2 | generator      | LayoutStyleGANGenerator | 35.9 M
3 | layout_net_ema | LayoutGenerator         | 21.3 M
4 | layout_net     | LayoutGenerator         | 21.3 M
-----------------------------------------------------------
86.1 M    Trainable params
57.2 M    Non-trainable params
143 M     Total params
573.008   Total estimated model params size (MB)
[2022-09-02 11:48:01,076][py.warnings][WARNING] - /home/yimingsu/anaconda3/envs/blobgan/lib/python3.9/site-packages/torch/utils/data/dataloader.py:487: UserWarning: This DataLoader will create 12 worker processes in total. Our suggested max number of worker in current system is 8, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
  warnings.warn(_create_warning_msg(

Training: 0it [00:00, ?it/s]Error executing job with overrides: ['+experiment=[blobgan,local,jitter]', 'wandb.name=10-blob BlobGAN on bedrooms', 'dataset=imagefolder', '++dataset.path=/home/yimingsu/PythonProjects/blobgan/resizedout/']
Traceback (most recent call last):
  File "/home/yimingsu/PythonProjects/blobgan/src/run.py", line 81, in run
    trainer.fit(model, datamodule=datamodule)
  File "/home/yimingsu/anaconda3/envs/blobgan/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 740, in fit
    self._call_and_handle_interrupt(
  File "/home/yimingsu/anaconda3/envs/blobgan/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 685, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/home/yimingsu/anaconda3/envs/blobgan/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 777, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "/home/yimingsu/anaconda3/envs/blobgan/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1199, in _run
    self._dispatch()
  File "/home/yimingsu/anaconda3/envs/blobgan/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1279, in _dispatch
    self.training_type_plugin.start_training(self)
  File "/home/yimingsu/anaconda3/envs/blobgan/lib/python3.9/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 202, in start_training
    self._results = trainer.run_stage()
  File "/home/yimingsu/anaconda3/envs/blobgan/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1289, in run_stage
    return self._run_train()
  File "/home/yimingsu/anaconda3/envs/blobgan/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1319, in _run_train
    self.fit_loop.run()
  File "/home/yimingsu/anaconda3/envs/blobgan/lib/python3.9/site-packages/pytorch_lightning/loops/base.py", line 140, in run
    self.on_run_start(*args, **kwargs)
  File "/home/yimingsu/anaconda3/envs/blobgan/lib/python3.9/site-packages/pytorch_lightning/loops/fit_loop.py", line 200, in on_run_start
    self.trainer.call_hook("on_train_start")
  File "/home/yimingsu/anaconda3/envs/blobgan/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1501, in call_hook
    output = model_fx(*args, **kwargs)
  File "/home/yimingsu/PythonProjects/blobgan/src/models/blobgan.py", line 131, in on_train_start
    assert not ((self.log_fid_every_n_steps > -1 or self.log_fid_every_epoch) and (not self.fid_stats_name)), \
AssertionError: Cannot compute FID without name of statistics file to use.

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

Any help would be greatly appreciated!

Best,
Yiming

@dave-epstein
Owner

Hi,

Please see the README for information on logging FID.
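
The assertion in the traceback can be read as a small truth table. The following is a minimal sketch, not the repository's code: `fid_config_is_valid` is a hypothetical helper that mirrors the condition shown in the traceback, and the values `5000` and `"my_stats"` are placeholders chosen for illustration. The README remains the reference for how FID logging is actually configured.

```python
def fid_config_is_valid(log_fid_every_n_steps, log_fid_every_epoch, fid_stats_name):
    """Hypothetical helper mirroring the condition asserted in on_train_start.

    Returns False for exactly the combination that raises
    "Cannot compute FID without name of statistics file to use."
    """
    fid_logging_enabled = log_fid_every_n_steps > -1 or log_fid_every_epoch
    return not (fid_logging_enabled and not fid_stats_name)


# FID logging enabled but no statistics name: the failing combination.
print(fid_config_is_valid(5000, False, None))

# FID logging fully disabled: no statistics name is needed.
print(fid_config_is_valid(-1, False, None))

# FID logging enabled with a statistics name (placeholder value).
print(fid_config_is_valid(5000, False, "my_stats"))
```

Read this way, the assertion passes either when both FID logging settings are off or when a statistics name is supplied.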

As for the other error, which is unrelated to the FID error, maybe you can print out the exception here for more detail: https://github.com/dave-epstein/blobgan/blob/main/src/data/imagefolder.py#L49

@alexander-novo

In case someone else finds this as I did: the issue is that torchvision's ImageFolder requires images to be separated into different folders by class. So each of the train/, test/, and validate/ subfolders in your dataset must itself contain at least one subfolder holding the images, rather than containing the images directly. As far as I know, this class information isn't used, so you can probably just put all of the images for each split into a single subfolder.
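
The layout change described above can be sketched with the standard library alone. This is a rough sketch under assumptions, not part of the repository: `nest_images` is a hypothetical helper, the subfolder name `images` is arbitrary, and the list of file extensions is an assumed subset that may need extending. The demo at the end runs against a temporary directory so that nothing real is moved.

```python
from pathlib import Path
import shutil
import tempfile

# Assumed set of image extensions; extend it if your files use others.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp", ".webp"}


def nest_images(dataset_root, splits=("train", "validate", "test"), class_dir="images"):
    """Move images lying directly in each split folder into one class subfolder.

    Returns a dict mapping each split name to the number of files moved.
    """
    moved = {}
    for split in splits:
        split_dir = Path(dataset_root) / split
        if not split_dir.is_dir():
            moved[split] = 0
            continue
        # Snapshot the loose image files before creating the target folder.
        loose = [
            p for p in split_dir.iterdir()
            if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
        ]
        target = split_dir / class_dir
        if loose:
            target.mkdir(exist_ok=True)
        for path in loose:
            shutil.move(str(path), str(target / path.name))
        moved[split] = len(loose)
    return moved


# Demo on a throwaway directory that imitates the flat layout from the issue.
with tempfile.TemporaryDirectory() as root:
    for split in ("train", "validate", "test"):
        (Path(root) / split).mkdir()
        (Path(root) / split / "0001.png").touch()
    print(nest_images(root))
    print(sorted(p.relative_to(root).as_posix() for p in Path(root).rglob("*.png")))
```

For the dataset in this issue, the call would be `nest_images("/home/yimingsu/PythonProjects/blobgan/resizedout/")`. Since the files are moved rather than copied, it is worth trying it on a copy first.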

@dave-epstein
Owner

Yes, that's exactly right.
