root@5285e769b96f:/home/imaginaire# python -m torch.distributed.launch --nproc_per_node=8 train.py --config configs/projects/vid2vid/kitti/ampO1.yaml --logdir /home/logs
*****************************************
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
*****************************************
LMDB file at datasets/kitti/lmdb/train/images opened.
LMDB file at datasets/kitti/lmdb/train/seg_maps opened.
LMDB file at datasets/kitti/lmdb/val/images opened.
LMDB file at datasets/kitti/lmdb/val/seg_maps opened.
(The four "LMDB file ... opened." lines repeat once per rank; repeats omitted.)
Make folder /home/logs
cudnn benchmark: True
cudnn deterministic: False
Num datasets: 1
Num sequences: 19
Max sequence length: 1065
Epoch length: 19
Num datasets: 1
Num sequences: 8
Max sequence length: 83
Epoch length: 8
Train dataset length: 19
Val dataset length: 8
Concatenate images: ext: png num_channels: 3 interpolator: BILINEAR normalize: True pre_aug_ops: None post_aug_ops: None use_dont_care: False computed_on_the_fly: False for input.
Concatenate seg_maps: ext: png num_channels: 35 interpolator: NEAREST normalize: False pre_aug_ops: None post_aug_ops: None use_dont_care: False computed_on_the_fly: False for input.
Num. of channels in the input label: 35
Concatenate images: ext: png num_channels: 3 interpolator: BILINEAR normalize: True pre_aug_ops: None post_aug_ops: None use_dont_care: False computed_on_the_fly: False for input.
Num. of channels in the input image: 3
Concatenate images: ext: png num_channels: 3 interpolator: BILINEAR normalize: True pre_aug_ops: None post_aug_ops: None use_dont_care: False computed_on_the_fly: False for input.
Num. of channels in the input image: 3
(The "Concatenate ... / Num. of channels ..." block repeats once per rank; repeats omitted.)
Initialize net_G and net_D weights using type: xavier gain: 0.02
net_G parameter count: 121,904,182
net_D parameter count: 1,422,658
Selected optimization level O1:  Insert automatic casts around Pytorch functions and Tensor methods.
Defaults for this optimization level are:
enabled                : True
opt_level              : O1
cast_model_type        : None
patch_torch_functions  : True
keep_batchnorm_fp32    : None
master_weights         : None
loss_scale             : dynamic
Processing user overrides (additional kwargs that are not None)...
After processing overrides, optimization options are:
enabled                : True
opt_level              : O1
cast_model_type        : None
patch_torch_functions  : True
keep_batchnorm_fp32    : None
master_weights         : None
loss_scale             : dynamic
Setup trainer.
GAN mode: hinge
Perceptual loss:
	Mode: vgg19
Perceptual loss is evaluated in the fp16 mode.
FlowNet2 is running in fp16 mode.
(The FlowNet2 line repeats once per rank; repeats omitted.)
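The O1 banner above ("Insert automatic casts around Pytorch functions and Tensor methods", `patch_torch_functions : True`) means Apex monkey-patches whitelisted ops so their inputs are cast to fp16 before the call. A toy stdlib-only illustration of that cast-wrapping idea, using plain Python floats in place of tensors (none of these names are Apex APIs; this is a sketch of the concept, not Apex's implementation):

```python
import functools
import struct


def to_fp16(x):
    """Round-trip a Python float through IEEE-754 half precision ('e'
    struct format), mimicking the precision loss of an fp16 cast."""
    return struct.unpack('e', struct.pack('e', x))[0]


def cast_inputs(fn, cast):
    """Return fn wrapped so its float arguments are cast first -- the
    same idea as patching a whitelisted torch function at opt_level O1."""
    @functools.wraps(fn)
    def wrapped(*args):
        return fn(*[cast(a) if isinstance(a, float) else a for a in args])
    return wrapped


# A stand-in "op" patched to run in reduced precision.
mul_fp16 = cast_inputs(lambda a, b: a * b, to_fp16)
```

Values exactly representable in half precision (like 2.0 and 3.0) survive the cast unchanged, while 0.1 gets rounded, which is exactly why dynamic loss scaling (seen later in this log) is needed alongside the casts.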
Loss GAN Weight 1.0
Loss FeatureMatching Weight 10.0
Loss Perceptual Weight 10.0
Loss Flow Weight 10.0
Loss Flow_L1 Weight 10.0
Loss Flow_Warp Weight 10.0
Loss Flow_Mask Weight 10.0
Load from: /home/logs/epoch_00025_iteration_000000075_checkpoint.pt
Done with loading the checkpoint.
Epoch 25 ...
Epoch length: 19
------- Updating sequence length to 8 -------
Gradient overflow. Skipping step, loss scaler 1 reducing loss scale to 32768.0
Gradient overflow. Skipping step, loss scaler 0 reducing loss scale to 32768.0
Gradient overflow. Skipping step, loss scaler 1 reducing loss scale to 16384.0
Gradient overflow. Skipping step, loss scaler 0 reducing loss scale to 16384.0
Gradient overflow. Skipping step, loss scaler 1 reducing loss scale to 8192.0
Gradient overflow. Skipping step, loss scaler 0 reducing loss scale to 8192.0
Gradient overflow. Skipping step, loss scaler 1 reducing loss scale to 4096.0
Gradient overflow. Skipping step, loss scaler 0 reducing loss scale to 4096.0
Gradient overflow. Skipping step, loss scaler 0 reducing loss scale to 2048.0
Gradient overflow. Skipping step, loss scaler 0 reducing loss scale to 1024.0
Gradient overflow. Skipping step, loss scaler 0 reducing loss scale to 512.0
Gradient overflow. Skipping step, loss scaler 0 reducing loss scale to 256.0
Gradient overflow. Skipping step, loss scaler 1 reducing loss scale to 2048.0
(Each overflow line appears eight times in the raw log, once per rank; repeats omitted.)
Epoch: 26, total time: 75.287018.
Epoch 26 ...
Epoch: 27, total time: 44.580074.
Epoch 27 ...
Epoch: 28, total time: 43.974533.
Epoch 28 ...
Epoch: 29, total time: 43.974683.
Epoch 29 ...
Epoch: 30, total time: 43.891855.
Save output images to /home/logs/images/epoch_00030_iteration_000000090.jpg
Save checkpoint to /home/logs/epoch_00030_iteration_000000090_checkpoint.pt
Computing FID.
Get FID mean and cov and save to /home/logs/regular_fid/epoch_00030_iteration_000000090.npy
Extract mean and covariance.
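The "Gradient overflow. Skipping step, loss scaler N reducing loss scale to ..." messages above come from Apex's dynamic loss scaling: when an inf/NaN gradient is detected, the optimizer step is skipped and the scale is halved (here 65536 down through 32768 ... 256); the two scalers presumably belong to the two optimizers, net_G's and net_D's. A toy stdlib-only sketch of that logic (not Apex's actual class; the growth interval below is an illustrative choice, not a claim about Apex's default):

```python
class DynamicLossScaler:
    """Minimal dynamic loss scaler: halve the scale and skip the step
    on overflow; cautiously double it after a run of good steps."""

    def __init__(self, init_scale=2.0 ** 16, factor=2.0, scale_window=2000):
        self.scale = init_scale          # current loss-scale multiplier
        self.factor = factor             # shrink/grow factor
        self.scale_window = scale_window # good steps before growing
        self._good_steps = 0

    def update(self, found_overflow):
        """Call once per iteration; returns True if the step was taken."""
        if found_overflow:
            # Skip this optimizer step and reduce the loss scale.
            self.scale /= self.factor
            self._good_steps = 0
            return False
        self._good_steps += 1
        if self._good_steps % self.scale_window == 0:
            self.scale *= self.factor
        return True
```

Eight consecutive overflows take the scale from 65536 down to 256, matching the cascade logged by loss scaler 0 above.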
Number of videos used for evaluation: 8
Number of frames per video used for evaluation: 10
Load FID mean and cov from /home/logs/regular_fid/real_mean_cov.npz
Traceback (most recent call last):
  File "train.py", line 93, in <module>
    main()
  File "train.py", line 87, in main
    trainer.end_of_epoch(data, current_epoch, current_iteration)
  File "/home/imaginaire/imaginaire/trainers/base.py", line 402, in end_of_epoch
    self.write_metrics()
  File "/home/imaginaire/imaginaire/trainers/vid2vid.py", line 699, in write_metrics
    regular_fid, average_fid = self._compute_fid()
  File "/home/imaginaire/imaginaire/trainers/vid2vid.py", line 745, in _compute_fid
    is_video=True, few_shot_video=few_shot)
  File "/home/imaginaire/imaginaire/evaluation/fid.py", line 53, in compute_fid
    is_video, few_shot_video)
  File "/home/imaginaire/imaginaire/evaluation/fid.py", line 133, in load_or_compute_stats
    is_video, few_shot_video)
  File "/home/imaginaire/imaginaire/evaluation/fid.py", line 165, in get_inception_mean_cov
    sample_size, preprocess, few_shot_video)
  File "/opt/conda/lib/python3.6/site-packages/torch/autograd/grad_mode.py", line 15, in decorate_context
    return func(*args, **kwargs)
  File "/home/imaginaire/imaginaire/evaluation/common.py", line 99, in get_video_activations
    inception = inception.to('cuda')
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 611, in to
    return self._apply(convert)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 358, in _apply
    module._apply(fn)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 358, in _apply
    module._apply(fn)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 380, in _apply
    param_applied = fn(param)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 609, in convert
    return t.to(device, dtype if t.is_floating_point() else None, non_blocking)
RuntimeError: CUDA error: the launch timed out and was terminated
terminate called after throwing an instance of 'c10::Error'
  what():  CUDA error: the launch timed out and was terminated
Exception raised from create_event_internal at ../c10/cuda/CUDACachingAllocator.cpp:687 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x6b (0x7fbf5ebed99b in /opt/conda/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #1: c10::cuda::CUDACachingAllocator::raw_delete(void*) + 0xc10 (0x7fbf5ee30280 in /opt/conda/lib/python3.6/site-packages/torch/lib/libc10_cuda.so)
frame #2: c10::TensorImpl::release_resources() + 0x4d (0x7fbf5ebd5dfd in /opt/conda/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #3: <unknown function> + 0x5414e2 (0x7fbfe2bf54e2 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
frame #4: <unknown function> + 0x19aaae (0x55c633390aae in /opt/conda/bin/python)
frame #5: <unknown function> + 0xf244f (0x55c6332e844f in /opt/conda/bin/python)
frame #6: <unknown function> + 0xf244f (0x55c6332e844f in /opt/conda/bin/python)
frame #7: <unknown function> + 0xf2828 (0x55c6332e8828 in /opt/conda/bin/python)
frame #8: <unknown function> + 0x19aa90 (0x55c633390a90 in /opt/conda/bin/python)
frame #9: <unknown function> + 0xf27f8 (0x55c6332e87f8 in /opt/conda/bin/python)
frame #10: <unknown function> + 0x19aa90 (0x55c633390a90 in /opt/conda/bin/python)
frame #11: <unknown function> + 0xf2247 (0x55c6332e8247 in /opt/conda/bin/python)
frame #12: <unknown function> + 0xf20d7 (0x55c6332e80d7 in /opt/conda/bin/python)
frame #13: <unknown function> + 0xf20ed (0x55c6332e80ed in /opt/conda/bin/python)
frame #14: PyDict_SetItem + 0x3da (0x55c63332ed7a in /opt/conda/bin/python)
frame #15: PyDict_SetItemString + 0x4f (0x55c633335c5f in /opt/conda/bin/python)
frame #16: PyImport_Cleanup + 0x99 (0x55c63339adc9 in /opt/conda/bin/python)
frame #17: Py_FinalizeEx + 0x61 (0x55c633405961 in /opt/conda/bin/python)
frame #18: Py_Main + 0x35e (0x55c63340fcae in /opt/conda/bin/python)
frame #19: main + 0xee (0x55c6332d9f2e in /opt/conda/bin/python)
frame #20: __libc_start_main + 0xe7 (0x7fc00c257b97 in /lib/x86_64-linux-gnu/libc.so.6)
frame #21: <unknown function> + 0x1c327f (0x55c6333b927f in /opt/conda/bin/python)
(The same traceback and C++ frame dump are printed by the other crashing ranks; repeats omitted.)
Epoch 00030, Iteration 000000090, Regular FID 373.8494401389866
Computing FID.
Get FID mean and cov and save to /home/logs/average_fid/epoch_00030_iteration_000000090.npy
Extract mean and covariance.
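The FID stage above extracts the mean and covariance of Inception activations for real and generated videos and then evaluates the Fréchet distance between the two Gaussians. A minimal stdlib-only sketch of that final formula, simplified to diagonal covariances for readability (the actual `fid.py` computation uses full covariance matrices, which requires a matrix square root):

```python
import math


def frechet_distance_diag(mu1, var1, mu2, var2):
    """Frechet (FID) distance between Gaussians with diagonal covariances:
    |mu1 - mu2|^2 + Tr(S1 + S2 - 2 * (S1 S2)^(1/2)).
    For diagonal S the trace term reduces to an elementwise sum."""
    mean_term = sum((a - b) ** 2 for a, b in zip(mu1, mu2))
    cov_term = sum(v1 + v2 - 2.0 * math.sqrt(v1 * v2)
                   for v1, v2 in zip(var1, var2))
    return mean_term + cov_term
```

Identical statistics give a distance of zero; the large value logged here (Regular FID 373.85) simply reflects statistics computed from only 8 validation videos early in training.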
Number of videos used for evaluation: 8
Number of frames per video used for evaluation: 10
Traceback (most recent call last):
  File "train.py", line 93, in <module>
    main()
  File "train.py", line 87, in main
    trainer.end_of_epoch(data, current_epoch, current_iteration)
  File "/home/imaginaire/imaginaire/trainers/base.py", line 402, in end_of_epoch
    self.write_metrics()
  File "/home/imaginaire/imaginaire/trainers/vid2vid.py", line 699, in write_metrics
    regular_fid, average_fid = self._compute_fid()
  File "/home/imaginaire/imaginaire/trainers/vid2vid.py", line 745, in _compute_fid
    is_video=True, few_shot_video=few_shot)
  File "/home/imaginaire/imaginaire/evaluation/fid.py", line 53, in compute_fid
    is_video, few_shot_video)
  File "/home/imaginaire/imaginaire/evaluation/fid.py", line 133, in load_or_compute_stats
    is_video, few_shot_video)
  File "/home/imaginaire/imaginaire/evaluation/fid.py", line 165, in get_inception_mean_cov
    sample_size, preprocess, few_shot_video)
  File "/opt/conda/lib/python3.6/site-packages/torch/autograd/grad_mode.py", line 15, in decorate_context
    return func(*args, **kwargs)
  File "/home/imaginaire/imaginaire/evaluation/common.py", line 99, in get_video_activations
    inception = inception.to('cuda')
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 611, in to
    return self._apply(convert)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 358, in _apply
    module._apply(fn)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 358, in _apply
    module._apply(fn)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 380, in _apply
    param_applied = fn(param)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 609, in convert
    return t.to(device, dtype if t.is_floating_point() else None, non_blocking)
RuntimeError: CUDA error: the launch timed out and was terminated
terminate called after throwing an instance of 'c10::Error'
  what():  CUDA error: the launch timed out and was terminated
Exception raised from create_event_internal at ../c10/cuda/CUDACachingAllocator.cpp:687 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x6b (0x7fc4a875899b in /opt/conda/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #1: c10::cuda::CUDACachingAllocator::raw_delete(void*) + 0xc10 (0x7fc4a899b280 in /opt/conda/lib/python3.6/site-packages/torch/lib/libc10_cuda.so)
frame #2: c10::TensorImpl::release_resources() + 0x4d (0x7fc4a8740dfd in /opt/conda/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #3: <unknown function> + 0x5414e2 (0x7fc52c7604e2 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
frame #4: <unknown function> + 0x19aaae (0x55da19891aae in /opt/conda/bin/python)
frame #5: <unknown function> + 0xf244f (0x55da197e944f in /opt/conda/bin/python)
frame #6: <unknown function> + 0xf244f (0x55da197e944f in /opt/conda/bin/python)
frame #7: <unknown function> + 0xf2828 (0x55da197e9828 in /opt/conda/bin/python)
frame #8: <unknown function> + 0x19aa90 (0x55da19891a90 in /opt/conda/bin/python)
frame #9: <unknown function> + 0xf27f8 (0x55da197e97f8 in /opt/conda/bin/python)
frame #10: <unknown function> + 0x19aa90 (0x55da19891a90 in /opt/conda/bin/python)
frame #11: <unknown function> + 0xf2247 (0x55da197e9247 in /opt/conda/bin/python)
frame #12: <unknown function> + 0xf20d7 (0x55da197e90d7 in /opt/conda/bin/python)
frame #13: <unknown function> + 0xf20ed (0x55da197e90ed in /opt/conda/bin/python)
frame #14: PyDict_SetItem + 0x3da (0x55da1982fd7a in /opt/conda/bin/python)
frame #15: PyDict_SetItemString + 0x4f (0x55da19836c5f in /opt/conda/bin/python)
frame #16: PyImport_Cleanup + 0x99 (0x55da1989bdc9 in /opt/conda/bin/python)
frame #17: Py_FinalizeEx + 0x61 (0x55da19906961 in /opt/conda/bin/python)
frame #18: Py_Main + 0x35e (0x55da19910cae in /opt/conda/bin/python)
frame #19: main + 0xee (0x55da197daf2e in /opt/conda/bin/python)
frame #20: __libc_start_main + 0xe7 (0x7fc555dc2b97 in /lib/x86_64-linux-gnu/libc.so.6)
frame #21: <unknown function> + 0x1c327f (0x55da198ba27f in /opt/conda/bin/python)
Traceback (most recent call last):
  File "train.py",
line 93, in main() File "train.py", line 87, in main trainer.end_of_epoch(data, current_epoch, current_iteration) File "/home/imaginaire/imaginaire/trainers/base.py", line 402, in end_of_epoch self.write_metrics() File "/home/imaginaire/imaginaire/trainers/vid2vid.py", line 699, in write_metrics regular_fid, average_fid = self._compute_fid() File "/home/imaginaire/imaginaire/trainers/vid2vid.py", line 745, in _compute_fid is_video=True, few_shot_video=few_shot) File "/home/imaginaire/imaginaire/evaluation/fid.py", line 53, in compute_fid is_video, few_shot_video) File "/home/imaginaire/imaginaire/evaluation/fid.py", line 133, in load_or_compute_stats is_video, few_shot_video) File "/home/imaginaire/imaginaire/evaluation/fid.py", line 165, in get_inception_mean_cov sample_size, preprocess, few_shot_video) File "/opt/conda/lib/python3.6/site-packages/torch/autograd/grad_mode.py", line 15, in decorate_context return func(*args, **kwargs) File "/home/imaginaire/imaginaire/evaluation/common.py", line 99, in get_video_activations inception = inception.to('cuda') File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 611, in to return self._apply(convert) File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 358, in _apply module._apply(fn) File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 358, in _apply module._apply(fn) File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 380, in _apply param_applied = fn(param) File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 609, in convert return t.to(device, dtype if t.is_floating_point() else None, non_blocking) RuntimeError: CUDA error: the launch timed out and was terminated terminate called after throwing an instance of 'c10::Error' what(): CUDA error: the launch timed out and was terminated Exception raised from create_event_internal at ../c10/cuda/CUDACachingAllocator.cpp:687 (most recent call 
first): frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string, std::allocator >) + 0x6b (0x7f2661e4c99b in /opt/conda/lib/python3.6/site-packages/torch/lib/libc10.so) frame #1: c10::cuda::CUDACachingAllocator::raw_delete(void*) + 0xc10 (0x7f266208f280 in /opt/conda/lib/python3.6/site-packages/torch/lib/libc10_cuda.so) frame #2: c10::TensorImpl::release_resources() + 0x4d (0x7f2661e34dfd in /opt/conda/lib/python3.6/site-packages/torch/lib/libc10.so) frame #3: + 0x5414e2 (0x7f26e5e544e2 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_python.so) frame #4: + 0x19aaae (0x55ee34244aae in /opt/conda/bin/python) frame #5: + 0xf244f (0x55ee3419c44f in /opt/conda/bin/python) frame #6: + 0xf244f (0x55ee3419c44f in /opt/conda/bin/python) frame #7: + 0xf2828 (0x55ee3419c828 in /opt/conda/bin/python) frame #8: + 0x19aa90 (0x55ee34244a90 in /opt/conda/bin/python) frame #9: + 0xf27f8 (0x55ee3419c7f8 in /opt/conda/bin/python) frame #10: + 0x19aa90 (0x55ee34244a90 in /opt/conda/bin/python) frame #11: + 0xf2247 (0x55ee3419c247 in /opt/conda/bin/python) frame #12: + 0xf20d7 (0x55ee3419c0d7 in /opt/conda/bin/python) frame #13: + 0xf20ed (0x55ee3419c0ed in /opt/conda/bin/python) frame #14: PyDict_SetItem + 0x3da (0x55ee341e2d7a in /opt/conda/bin/python) frame #15: PyDict_SetItemString + 0x4f (0x55ee341e9c5f in /opt/conda/bin/python) frame #16: PyImport_Cleanup + 0x99 (0x55ee3424edc9 in /opt/conda/bin/python) frame #17: Py_FinalizeEx + 0x61 (0x55ee342b9961 in /opt/conda/bin/python) frame #18: Py_Main + 0x35e (0x55ee342c3cae in /opt/conda/bin/python) frame #19: main + 0xee (0x55ee3418df2e in /opt/conda/bin/python) frame #20: __libc_start_main + 0xe7 (0x7f270f4b6b97 in /lib/x86_64-linux-gnu/libc.so.6) frame #21: + 0x1c327f (0x55ee3426d27f in /opt/conda/bin/python) Traceback (most recent call last): File "train.py", line 93, in main() File "train.py", line 87, in main trainer.end_of_epoch(data, current_epoch, current_iteration) File 
"/home/imaginaire/imaginaire/trainers/base.py", line 402, in end_of_epoch self.write_metrics() File "/home/imaginaire/imaginaire/trainers/vid2vid.py", line 699, in write_metrics regular_fid, average_fid = self._compute_fid() File "/home/imaginaire/imaginaire/trainers/vid2vid.py", line 745, in _compute_fid is_video=True, few_shot_video=few_shot) File "/home/imaginaire/imaginaire/evaluation/fid.py", line 53, in compute_fid is_video, few_shot_video) File "/home/imaginaire/imaginaire/evaluation/fid.py", line 133, in load_or_compute_stats is_video, few_shot_video) File "/home/imaginaire/imaginaire/evaluation/fid.py", line 165, in get_inception_mean_cov sample_size, preprocess, few_shot_video) File "/opt/conda/lib/python3.6/site-packages/torch/autograd/grad_mode.py", line 15, in decorate_context return func(*args, **kwargs) File "/home/imaginaire/imaginaire/evaluation/common.py", line 99, in get_video_activations inception = inception.to('cuda') File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 611, in to return self._apply(convert) File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 358, in _apply module._apply(fn) File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 358, in _apply module._apply(fn) File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 380, in _apply param_applied = fn(param) File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 609, in convert return t.to(device, dtype if t.is_floating_point() else None, non_blocking) RuntimeError: CUDA error: the launch timed out and was terminated terminate called after throwing an instance of 'c10::Error' what(): CUDA error: the launch timed out and was terminated Exception raised from create_event_internal at ../c10/cuda/CUDACachingAllocator.cpp:687 (most recent call first): frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string, std::allocator >) + 0x6b 
(0x7fdc9bd0199b in /opt/conda/lib/python3.6/site-packages/torch/lib/libc10.so) frame #1: c10::cuda::CUDACachingAllocator::raw_delete(void*) + 0xc10 (0x7fdc9bf44280 in /opt/conda/lib/python3.6/site-packages/torch/lib/libc10_cuda.so) frame #2: c10::TensorImpl::release_resources() + 0x4d (0x7fdc9bce9dfd in /opt/conda/lib/python3.6/site-packages/torch/lib/libc10.so) frame #3: + 0x5414e2 (0x7fdd1fd094e2 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_python.so) frame #4: + 0x19aaae (0x55f9d45e0aae in /opt/conda/bin/python) frame #5: + 0xf244f (0x55f9d453844f in /opt/conda/bin/python) frame #6: + 0xf244f (0x55f9d453844f in /opt/conda/bin/python) frame #7: + 0xf2828 (0x55f9d4538828 in /opt/conda/bin/python) frame #8: + 0x19aa90 (0x55f9d45e0a90 in /opt/conda/bin/python) frame #9: + 0xf27f8 (0x55f9d45387f8 in /opt/conda/bin/python) frame #10: + 0x19aa90 (0x55f9d45e0a90 in /opt/conda/bin/python) frame #11: + 0xf2247 (0x55f9d4538247 in /opt/conda/bin/python) frame #12: + 0xf20d7 (0x55f9d45380d7 in /opt/conda/bin/python) frame #13: + 0xf20ed (0x55f9d45380ed in /opt/conda/bin/python) frame #14: PyDict_SetItem + 0x3da (0x55f9d457ed7a in /opt/conda/bin/python) frame #15: PyDict_SetItemString + 0x4f (0x55f9d4585c5f in /opt/conda/bin/python) frame #16: PyImport_Cleanup + 0x99 (0x55f9d45eadc9 in /opt/conda/bin/python) frame #17: Py_FinalizeEx + 0x61 (0x55f9d4655961 in /opt/conda/bin/python) frame #18: Py_Main + 0x35e (0x55f9d465fcae in /opt/conda/bin/python) frame #19: main + 0xee (0x55f9d4529f2e in /opt/conda/bin/python) frame #20: __libc_start_main + 0xe7 (0x7fdd4936bb97 in /lib/x86_64-linux-gnu/libc.so.6) frame #21: + 0x1c327f (0x55f9d460927f in /opt/conda/bin/python) Traceback (most recent call last): File "train.py", line 93, in main() File "train.py", line 87, in main trainer.end_of_epoch(data, current_epoch, current_iteration) File "/home/imaginaire/imaginaire/trainers/base.py", line 402, in end_of_epoch self.write_metrics() File 
"/home/imaginaire/imaginaire/trainers/vid2vid.py", line 699, in write_metrics regular_fid, average_fid = self._compute_fid() File "/home/imaginaire/imaginaire/trainers/vid2vid.py", line 745, in _compute_fid is_video=True, few_shot_video=few_shot) File "/home/imaginaire/imaginaire/evaluation/fid.py", line 45, in compute_fid is_video, few_shot_video) File "/home/imaginaire/imaginaire/evaluation/fid.py", line 133, in load_or_compute_stats is_video, few_shot_video) File "/home/imaginaire/imaginaire/evaluation/fid.py", line 165, in get_inception_mean_cov sample_size, preprocess, few_shot_video) File "/opt/conda/lib/python3.6/site-packages/torch/autograd/grad_mode.py", line 15, in decorate_context return func(*args, **kwargs) File "/home/imaginaire/imaginaire/evaluation/common.py", line 157, in get_video_activations batch_y = torch.cat(batch_y).cpu().data.numpy() File "/opt/conda/lib/python3.6/site-packages/apex/amp/wrap.py", line 28, in wrapper return orig_fn(*new_args, **kwargs) RuntimeError: CUDA error: the launch timed out and was terminated terminate called after throwing an instance of 'c10::Error' what(): CUDA error: the launch timed out and was terminated Exception raised from create_event_internal at ../c10/cuda/CUDACachingAllocator.cpp:687 (most recent call first): frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string, std::allocator >) + 0x6b (0x7f5e37a6799b in /opt/conda/lib/python3.6/site-packages/torch/lib/libc10.so) frame #1: c10::cuda::CUDACachingAllocator::raw_delete(void*) + 0xc10 (0x7f5e37caa280 in /opt/conda/lib/python3.6/site-packages/torch/lib/libc10_cuda.so) frame #2: c10::TensorImpl::release_resources() + 0x4d (0x7f5e37a4fdfd in /opt/conda/lib/python3.6/site-packages/torch/lib/libc10.so) frame #3: + 0x5414e2 (0x7f5ebba6f4e2 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_python.so) frame #4: + 0x19aaae (0x55fd1871eaae in /opt/conda/bin/python) frame #5: + 0xf244f (0x55fd1867644f in /opt/conda/bin/python) frame 
#6: + 0xf2247 (0x55fd18676247 in /opt/conda/bin/python) frame #7: + 0xf20d7 (0x55fd186760d7 in /opt/conda/bin/python) frame #8: + 0xf20ed (0x55fd186760ed in /opt/conda/bin/python) frame #9: + 0xf20ed (0x55fd186760ed in /opt/conda/bin/python) frame #10: + 0xf20ed (0x55fd186760ed in /opt/conda/bin/python) frame #11: + 0xf20ed (0x55fd186760ed in /opt/conda/bin/python) frame #12: + 0xf20ed (0x55fd186760ed in /opt/conda/bin/python) frame #13: + 0xf20ed (0x55fd186760ed in /opt/conda/bin/python) frame #14: + 0xf20ed (0x55fd186760ed in /opt/conda/bin/python) frame #15: + 0xf20ed (0x55fd186760ed in /opt/conda/bin/python) frame #16: + 0xf20ed (0x55fd186760ed in /opt/conda/bin/python) frame #17: PyDict_SetItem + 0x3da (0x55fd186bcd7a in /opt/conda/bin/python) frame #18: PyDict_SetItemString + 0x4f (0x55fd186c3c5f in /opt/conda/bin/python) frame #19: PyImport_Cleanup + 0x99 (0x55fd18728dc9 in /opt/conda/bin/python) frame #20: Py_FinalizeEx + 0x61 (0x55fd18793961 in /opt/conda/bin/python) frame #21: Py_Main + 0x35e (0x55fd1879dcae in /opt/conda/bin/python) frame #22: main + 0xee (0x55fd18667f2e in /opt/conda/bin/python) frame #23: __libc_start_main + 0xe7 (0x7f5ee50d1b97 in /lib/x86_64-linux-gnu/libc.so.6) frame #24: + 0x1c327f (0x55fd1874727f in /opt/conda/bin/python) Traceback (most recent call last): File "/opt/conda/lib/python3.6/runpy.py", line 193, in _run_module_as_main "__main__", mod_spec) File "/opt/conda/lib/python3.6/runpy.py", line 85, in _run_code exec(code, run_globals) File "/opt/conda/lib/python3.6/site-packages/torch/distributed/launch.py", line 261, in main() File "/opt/conda/lib/python3.6/site-packages/torch/distributed/launch.py", line 257, in main cmd=cmd) subprocess.CalledProcessError: Command '['/opt/conda/bin/python', '-u', 'train.py', '--local_rank=7', '--config', 'configs/projects/vid2vid/kitti/ampO1.yaml', '--logdir', '/home/logs']' died with .
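For context, the final `subprocess.CalledProcessError` is raised by `torch.distributed.launch` itself: it spawns one `train.py` process per GPU and re-raises when any child exits abnormally, which is why the CUDA failure on the workers surfaces a second time at the launcher. A simplified, stdlib-only sketch of that supervision pattern (the real launcher also sets `RANK`/`WORLD_SIZE`/`MASTER_ADDR` environment variables; `launch_workers` is a hypothetical name, not the actual launcher API):

```python
import subprocess
import sys

def launch_workers(nproc, script_args):
    """Simplified sketch of torch.distributed.launch's supervision loop:
    spawn one worker process per slot, passing --local_rank, then raise
    CalledProcessError if any worker died -- mirroring the end of the log.
    Hypothetical helper, not the real launcher implementation."""
    procs = []
    for local_rank in range(nproc):
        # the real launcher builds a similar command line per rank
        cmd = [sys.executable, "-u"] + script_args + [f"--local_rank={local_rank}"]
        procs.append((cmd, subprocess.Popen(cmd)))
    for cmd, proc in procs:
        proc.wait()
        if proc.returncode != 0:
            # this is the error type seen at the very end of the log
            raise subprocess.CalledProcessError(returncode=proc.returncode, cmd=cmd)
```

With a worker that exits cleanly, `launch_workers(2, ["-c", "pass"])` returns without error; with a worker that dies, the first nonzero exit is re-raised as `CalledProcessError`, exactly the shape of the launcher traceback above.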