
How to change KP_detector and dense_motion parameters to train on Higher resolution? #81

Open
stark-akib opened this issue Apr 5, 2020 · 69 comments


@stark-akib

Hello @AliaksandrSiarohin . First of all, congratulations on the great work and thank you for sharing the repository.

I'm planning to train the model to generate higher resolution output (such as 512x512, 1024x1024). I would really appreciate your insight on my approach.

You mentioned here #14

Currently keypoint detector and dense-motion net operate on 64x64 images

Do I need to change this behavior for better motion transfer performance (while training on higher resolution)? How would you suggest doing it?

Looking forward to hearing from you. :)

@AliaksandrSiarohin
Owner

Hi @stark-akib,
I don't have a recipe here; you should try and see for yourself.
I would first keep the 64x64 resolution for the keypoint detector and dense motion (use scale_factor = 0.125 for 512 input and 0.0625 for 1024). If you see that the keypoints are not accurate, increase the resolution of the keypoint detector; if the deformations need to be more precise, increase the resolution of dense motion.
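For reference, a minimal sketch of the config keys involved (key names follow vox-256.yaml; the 512 values below are assumptions to be tuned, not a tested config):

dataset_params:
  frame_shape: [512, 512, 3]      # input resolution
model_params:
  kp_detector_params:
    scale_factor: 0.125           # 512 x 0.125 = 64; use 0.0625 for 1024 input
  generator_params:
    dense_motion_params:
      scale_factor: 0.125         # same reasoning for dense motion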

@stark-akib
Author

stark-akib commented Apr 6, 2020

Thank you @AliaksandrSiarohin for the direction. How can I increase the resolution of the keypoint detector and dense motion model?

Say I want to increase the keypoint detector's resolution to 256x256 for 512x512 input: do I only change scale_factor to 0.5, or do I need to change any other parameters, functions or files?

@AliaksandrSiarohin
Owner

Yes just change scale_factor
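For the specific case in the question, a sketch of the only change (512 x 0.5 = 256, so the keypoint detector would then run at 256x256):

model_params:
  kp_detector_params:
    scale_factor: 0.5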

@stark-akib
Author

Great. Thank you.

Another quick question: I want to preprocess both the VoxCeleb1 and VoxCeleb2 datasets. As you mentioned on the video preprocessing page,

Note .png format takes approximately 300GB.

Does VoxCeleb1 require approximately 300GB of space for preprocessing? And how much space will VoxCeleb2 require (as it has more data than VoxCeleb1)?

@AliaksandrSiarohin
Owner

No idea, I never downloaded it entirely.

@stark-akib
Author

Okay. Thank you again.

@stark-akib
Author

Hello @AliaksandrSiarohin

I'm going to start training on VoxCeleb1 at 512x512. As you mentioned here,
I'm expecting a similar training time since I have 4 NVIDIA Tesla V100 GPUs.

  1. Can you help me specify how much storage will be needed to complete the training process?
  2. Will 1-2TB storage suffice(considering the intermediate files generated while training)?

Also, when should the training terminate? Are 1000 epochs enough (as stated in the YAML file)?

@stark-akib reopened this Apr 13, 2020
@AliaksandrSiarohin
Owner

VoxCeleb in png format is 300GB, and 300GB x 4 is 1200GB. Intermediate files consume no more than a few GB. 1000 epochs? I guess it should be 100.

@stark-akib
Author

Great. I'll change the parameters accordingly. Thank you.

@newExplore-hash

@AliaksandrSiarohin
Hi, for the VoxCeleb dataset, if I want to replace your KP_detector with an existing keypoint detector such as dlib, what should I do? I have no idea how to handle the jacobian_map.

@stark-akib
Author

@AliaksandrSiarohin

Hello, just giving an update on the VoxCeleb1 preprocessing. The 512x512 preprocessing of VoxCeleb1 took around 870GB of space in .png format. So the required storage for training would be 900GB x 4 = 3.6 TB.

@AliaksandrSiarohin
Owner

Why? Training doesn't need additional space. The x4 was an estimate of the 512x512 space occupancy, because a 512 image is roughly 4 times larger.

@stark-akib
Author

Sorry, I mistook it for a parallel multiplier. The preprocessing I performed contains 18,671 folders in "train" and 510 folders in "test". The rest of the videos either showed a broken link or a skipped message in the console. I guess no additional space is needed for training then.
Thank you.

@AliaksandrSiarohin
Owner

I guess you may need to filter out low-resolution videos. To create vox-metadata.csv I used only the videos where the size of the bbox was greater than 256. You can infer the size of the bbox from the bbox parameter in vox-metadata.csv.
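For illustration, a rough pandas sketch of that filtering (the bbox column and its "left-top-right-bottom" string format are assumptions about vox-metadata.csv; check against the actual file):

import pandas as pd

df = pd.read_csv('vox-metadata.csv')

def bbox_size(bbox):
    # assumed format: "left-top-right-bottom"
    left, top, right, bottom = map(int, str(bbox).split('-'))
    return min(right - left, bottom - top)

df[df['bbox'].apply(bbox_size) >= 512].to_csv('vox-metadata-512.csv', index=False)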

@stark-akib
Author

Thank you for the tip. I'll have a look.

@stark-akib
Author

@AliaksandrSiarohin
Just a quick question: what's the difference between "vox-adv-256.yaml" and "vox-256.yaml"?
For example, what do the parameters
use_kp: True
and
sn: True
do?

Also, what's the use of epoch_milestones: [60, 90]?

@AliaksandrSiarohin
Owner

  1. vox-adv is with adversarial loss.
  2. use_kp adds key-point heatmaps to the discriminator.
  3. sn is spectral normalization.
  4. epoch_milestones are the epochs at which the learning rate is dropped.
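Roughly where these keys live in the config (the placement below is an assumption based on the public vox YAMLs; check the shipped files):

model_params:
  discriminator_params:
    sn: True          # spectral normalization
    use_kp: True      # feed key-point heatmaps to the discriminator
train_params:
  epoch_milestones: [60, 90]   # learning rate is dropped at these epochs
  loss_weights:
    generator_gan: 1           # non-zero only in the adversarial (vox-adv) config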

@stark-akib
Author

stark-akib commented Apr 16, 2020

Thank you. Which one would you suggest using as the config file, "vox-adv-256.yaml" or "vox-256.yaml"? (Considering that I will only change the frame_shape and scale factors for 512x512.)

@AliaksandrSiarohin
Owner

Without the adversarial loss it is more stable.

@stark-akib
Author

Thank you for your insight.

@stark-akib
Author

stark-akib commented Apr 16, 2020

Hello @AliaksandrSiarohin ,

I've started the training using the following command.
CUDA_VISIBLE_DEVICES=0,1,2,3 python run.py --config config/vox-adv-512.yaml --device_ids 0,1,2,3

After about 15 seconds, this error occurs. Can you help me find the problem?
I'm using batch size 40, but still getting an OOM error.

run.py:40: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
config = yaml.load(f)
Use predefined train-test split.
Training...
0%| | 0/150 [00:00<?, ?it/s]/home/ubuntu/anaconda3/envs/alethea/lib/python3.7/site-packages/torch/nn/functional.py:2423: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
"See the documentation of nn.Upsample for details.".format(mode))
/home/ubuntu/anaconda3/envs/alethea/lib/python3.7/site-packages/torch/nn/functional.py:1332: UserWarning: nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.
warnings.warn("nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.")
Traceback (most recent call last):
File "run.py", line 81, in <module>
train(config, generator, discriminator, kp_detector, opt.checkpoint, log_dir, dataset, opt.device_ids)
File "/home/ubuntu/Downloads/first-order-model/train.py", line 51, in train
losses_generator, generated = generator_full(x)
File "/home/ubuntu/anaconda3/envs/alethea/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/home/ubuntu/anaconda3/envs/alethea/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 143, in forward
outputs = self.parallel_apply(replicas, inputs, kwargs)
File "/home/ubuntu/anaconda3/envs/alethea/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 153, in parallel_apply
return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
File "/home/ubuntu/anaconda3/envs/alethea/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 83, in parallel_apply
raise output
File "/home/ubuntu/anaconda3/envs/alethea/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 59, in _worker
output = module(*input, **kwargs)
File "/home/ubuntu/anaconda3/envs/alethea/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/home/ubuntu/Downloads/first-order-model/modules/model.py", line 166, in forward
x_vgg = self.vgg(pyramide_generated['prediction_' + str(scale)])
File "/home/ubuntu/anaconda3/envs/alethea/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/home/ubuntu/Downloads/first-order-model/modules/model.py", line 45, in forward
h_relu2 = self.slice2(h_relu1)
File "/home/ubuntu/anaconda3/envs/alethea/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/home/ubuntu/anaconda3/envs/alethea/lib/python3.7/site-packages/torch/nn/modules/container.py", line 92, in forward
input = module(input)
File "/home/ubuntu/anaconda3/envs/alethea/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/home/ubuntu/anaconda3/envs/alethea/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 320, in forward
self.padding, self.dilation, self.groups)
RuntimeError: CUDA out of memory. Tried to allocate 160.00 MiB (GPU 0; 15.78 GiB total capacity; 14.45 GiB already allocated; 21.88 MiB free; 148.08 MiB cached)

@AliaksandrSiarohin
Owner

40 is too large. The batch size should be approximately 4 times smaller than for 256, e.g. 16 or 12.
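The reasoning: activation memory scales roughly with pixel count, and 512x512 has 4x the pixels of 256x256, so the per-GPU batch has to shrink by about the same factor. As a sketch, in the 512 config's train_params (value untested):

train_params:
  batch_size: 12    # roughly 4x smaller than the 256x256 setting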

@stark-akib
Author

Thank you. Lowering the batch size to 16 solved the issue; it's just showing those UserWarnings now.
What's the expected output in the command prompt? Is it supposed to stay like this? There is no output in log.txt.
[screenshot: console output stuck at 0%]

@stark-akib
Author

stark-akib commented Apr 16, 2020

@AliaksandrSiarohin Checked again after leaving it for an hour; it's still stuck here. When I close it with Ctrl + C, the console output looks like this.

[screenshot: console output after interrupting with Ctrl + C]

Also, I've set num_epochs: 100, num_repeats: 50 and brought the batch size down to 12, but the training is still stuck here. Are there any changes needed in the loss values or in model.py?

@AliaksandrSiarohin
Owner

Probably it is just slow. You can try changing num_repeats to 1 to see. Also, you may want to start from the pretrained 256 checkpoint to accelerate convergence.
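For example (a sketch: vox-cpk.pth.tar is the released 256 checkpoint, and --checkpoint is the flag run.py already exposes; the 512 config name is a placeholder):

CUDA_VISIBLE_DEVICES=0,1,2,3 python run.py --config config/vox-512.yaml --checkpoint vox-cpk.pth.tar --device_ids 0,1,2,3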

@stark-akib
Author

@AliaksandrSiarohin Thank you. Changing num_repeats to 1 seems to work; the log file is showing losses. As your YAML files suggest,
for VoxCeleb, 256x256: num_epochs: 100, num_repeats: 75
for VoxCeleb adv, 256x256: num_epochs: 150, num_repeats: 75
Then what should num_epochs and num_repeats be for 512x512?

@AliaksandrSiarohin
Owner

Depends on how much training you can afford. The more the better.

@stark-akib
Author

stark-akib commented Apr 17, 2020

I can train for up to 5 days on my setup; what num_epochs and num_repeats would you suggest? (Considering the 4 NVIDIA Tesla V100 GPUs I mentioned earlier; I can add another 4, so 8 V100 GPUs in total.)

@MitalPattani

@stark-akib can you please share the checkpoints or the config file? thanks

@alessiapacca

Hey @AliaksandrSiarohin

I re-trained the net for 512. The script https://github.com/AliaksandrSiarohin/video-preprocessing/blob/master/crop_vox.py was giving many errors in my case and not working, so I just took https://github.com/AliaksandrSiarohin/video-preprocessing/blob/master/vox-metadata.csv and selected the videos >= 512x512. This left me with 5827 mp4 videos in the train folder and 166 mp4 videos in the test folder.

I performed 100 epochs with num_repeats = 20, and the result is not extremely good:
[result clips: biden, biden2]

In training, I increased the resolution of the KP_detector and dense motion to 256 (by using scale_factor 0.5).
Do you think the cause of the flickering and of the artifacts is:

  • too few num_repeats
  • too small a dataset
  • an mp4 dataset (instead of png)?

the losses at the last epoch are:
00000099) perceptual - 95.23073; equivariance_value - 0.12323; equivariance_jacobian - 0.33881

@AliaksandrSiarohin
Owner

I guess the problem is the high resolution for the KP_detector and dense motion. Have you tried scale_factor: 0.125, and maybe even taking a pretrained dense motion and KP_detector?

@alessiapacca

@AliaksandrSiarohin Oh, I used 0.5 because I read in this issue that you were suggesting to increase the resolution of the keypoint detector and dense motion.
So you think 0.125 would help more?
What do you mean by taking a pretrained dense motion and KP detector?

@AliaksandrSiarohin
Owner

I don't know, you should try.
Initialize them with weights from my checkpoint.

@alessiapacca

alessiapacca commented Oct 30, 2020

If I try to start the training from your weights, it gives me the error:

RuntimeError: Error(s) in loading state_dict for OcclusionAwareGenerator:
        size mismatch for dense_motion_network.down.weight: copying a param with shape torch.Size([3, 1, 13, 13]) from checkpoint, the shape in current model is torch.Size([3, 1, 29, 29]).

probably because I am using scale_factor = 0.125 whereas you used 0.25 (you trained at resolution 256, while I am training at resolution 512)?
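The mismatch comes from the anti-aliasing downsampling kernel, whose size is derived from scale_factor. A hedged workaround sketch is to load only the tensors whose shapes still match and let the rest keep their fresh initialization (checkpoint key names assumed from the released checkpoints; generator and kp_detector are the already-constructed modules):

import torch

checkpoint = torch.load('vox-cpk.pth.tar', map_location='cpu')

def load_matching(module, state_dict):
    own = module.state_dict()
    # keep only tensors with matching shapes; the 'down' Gaussian kernels are
    # deterministic buffers rebuilt from scale_factor, so skipping them is harmless
    filtered = {k: v for k, v in state_dict.items() if k in own and v.shape == own[k].shape}
    module.load_state_dict(filtered, strict=False)

load_matching(generator, checkpoint['generator'])
load_matching(kp_detector, checkpoint['kp_detector'])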

@alessiapacca

100 epochs, 20 num_repeats, scale_factor 0.125 for both dense motion and kp_detector, and this is the result.

Do you think the dataset is too small? @AliaksandrSiarohin
Or does training with mp4 give worse results?
Or am I doing something else wrong?
It doesn't even move the mouth or close the eyes.

[image: result]

@AliaksandrSiarohin
Owner

Well, hard to say based on a single photo. Hard-set sigma in AntialiasingInterpolation and try with the pretrained checkpoint.
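A sketch of what hard-setting sigma could look like (this edits the anti-aliasing interpolation module in modules/util.py; the "original" formula in the comment is my reading of the code, so treat the exact values as assumptions):

# modules/util.py, anti-aliasing interpolation __init__ (sketch of the change)
# original (roughly): sigma = (1 / scale - 1) / 2   -> 3.5 for scale 0.125, giving a 29x29 kernel
sigma = 1.5                               # value implied by scale 0.25, i.e. the 256 checkpoint
kernel_size = 2 * round(sigma * 4) + 1    # stays 13, so the pretrained 'down' kernels still load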

@alessiapacca

If I use the pretrained model with hard-set sigma at 512x512, it works, but very badly, and that is why I was trying to retrain.
So your suggestion is to re-train with scale_factor = 0.125 but with hard-set sigma? Because my previous 2 experiments used the original sigma.
@AliaksandrSiarohin

@alessiapacca

I mean, I did essentially the same things that @stark-akib did:

  • changed the scale_factor for dense motion and the kp detector to 0.5, for 100 epochs and 20 num_repeats, and it looked very bad (I think this is how @stark-akib trained his model, as I read in this issue)
  • changed the scale_factor for dense motion and the kp detector to 0.125, for 100 epochs and 20 num_repeats, and again it couldn't animate the output.

So there are three possibilities:

  • I am using too small a dataset (approximately 6000 videos, taken from your metadata file, where I selected only the ones with a bbox bigger than 512x512)
  • I am training for too short a time
  • I should use png format to train

@AliaksandrSiarohin
Owner

I thought you were using png format; using mp4 is why it is so slow for you.
Yes, you should use .png format.

@alessiapacca

Hey @AliaksandrSiarohin, I downloaded the data in png format.
Now I have the test and train folders. They contain other folders inside, named after the corresponding videos.
Inside every folder are all the frames in png format.
However, when I try to start the training, I get this error:

TypeError: Traceback (most recent call last):
  File "/local/home/user/anaconda3/envs/first-order-model/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 138, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/local/home/user/anaconda3/envs/first-order-model/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 138, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/first-order-model/frames_dataset.py", line 155, in __getitem__
    return self.dataset[idx % self.dataset.__len__()]
  File "/first-order-model/frames_dataset.py", line 115, in __getitem__
    video_array = [img_as_float32(io.imread(os.path.join(path, frames[idx]))) for idx in frame_idx]
  File "/first-order-model/frames_dataset.py", line 115, in <listcomp>
    video_array = [img_as_float32(io.imread(os.path.join(path, frames[idx]))) for idx in frame_idx]
  File "/local/home/user/anaconda3/envs/first-order-model/lib/python3.7/posixpath.py", line 94, in join
    genericpath._check_arg_types('join', a, *p)
  File "/local/home/user/anaconda3/envs/first-order-model/lib/python3.7/genericpath.py", line 155, in _check_arg_types
    raise TypeError("Can't mix strings and bytes in path components") from None
TypeError: Can't mix strings and bytes in path components

what could be the reason for this?

@AliaksandrSiarohin
Owner

This I don't know.
I guess you can fix it by replacing frames[idx] with frames[idx].decode("utf-8")
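A slightly more defensive version of that line for frames_dataset.py (a sketch; it just normalizes bytes to str before joining):

def to_str(name):
    # frame names read back from a numpy array can be bytes; normalize to str
    return name.decode('utf-8') if isinstance(name, bytes) else name

video_array = [img_as_float32(io.imread(os.path.join(path, to_str(frames[idx]))))
               for idx in frame_idx]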

@alessiapacca

@AliaksandrSiarohin tried that. Now it gives this one:

ValueError: Traceback (most recent call last):
  File "/local/home/user/anaconda3/envs/first-order-model/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 138, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/local/home/user/anaconda3/envs/first-order-model/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 138, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/first-order-model/frames_dataset.py", line 155, in __getitem__
    return self.dataset[idx % self.dataset.__len__()]
  File "/first-order-model/frames_dataset.py", line 115, in __getitem__
    video_array = [img_as_float32(io.imread(os.path.join(path, (frames[idx]).decode("utf-8")))) for idx in frame_idx]
  File "/first-order-model/frames_dataset.py", line 115, in <listcomp>
    video_array = [img_as_float32(io.imread(os.path.join(path, (frames[idx]).decode("utf-8")))) for idx in frame_idx]
  File "/local/home/user/anaconda3/envs/first-order-model/lib/python3.7/site-packages/skimage/io/_io.py", line 48, in imread
    img = call_plugin('imread', fname, plugin=plugin, **plugin_args)
  File "/local/home/user/anaconda3/envs/first-order-model/lib/python3.7/site-packages/skimage/io/manage_plugins.py", line 210, in call_plugin
    return func(*args, **kwargs)
  File "/local/home/user/anaconda3/envs/first-order-model/lib/python3.7/site-packages/skimage/io/_plugins/imageio_plugin.py", line 10, in imread
    return np.asarray(imageio_imread(*args, **kwargs))
  File "/local/home/user/anaconda3/envs/first-order-model/lib/python3.7/site-packages/imageio/core/functions.py", line 265, in imread
    reader = read(uri, format, "i", **kwargs)
  File "/local/home/user/anaconda3/envs/first-order-model/lib/python3.7/site-packages/imageio/core/functions.py", line 182, in get_reader
    "Could not find a format to read the specified file in %s mode" % modename
ValueError: Could not find a format to read the specified file in single-image mode

Maybe it's a problem with the names of the folders? They are saved with the name of the video including the mp4 extension. For example, the name of one folder is
id10001#7w0IBEWc9Qw#000993#001143.mp4

@AliaksandrSiarohin
Owner

No, this should not be a problem if you are on Linux. Check what is inside the folder id10001#7w0IBEWc9Qw#000993#001143.mp4 and send the filenames and a few of the files from there.

@alessiapacca

alessiapacca commented Nov 6, 2020

Yes, I am on Linux. Inside that folder there are 150 png frames going from 0000000.png to 0000149.png.
[screenshot: folder listing]
An example frame is this one:
[image: example frame]

There are other folders with more frames inside, but they are always named either starting from 0000000.png or continuing the previous frame numbering (if they come from the same video as another folder).

@alessiapacca

now I substituted that line with
video_array = [img_as_float32(io.imread(path + '/' + frames[idx].decode('utf-8'))) for idx in frame_idx]

the training starts but after a while I get

Traceback (most recent call last):
  File "/local/home/user/anaconda3/envs/first-order-model/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 138, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/local/home/user/anaconda3/envs/first-order-model/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 138, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/first-order-model/frames_dataset.py", line 155, in __getitem__
    return self.dataset[idx % self.dataset.__len__()]
  File "/first-order-model/frames_dataset.py", line 115, in __getitem__
    video_array = [img_as_float32(io.imread(path + '/' + frames[idx].decode('utf-8')) )for idx in frame_idx]
  File "/first-oder-model/frames_dataset.py", line 115, in <listcomp>
    video_array = [img_as_float32(io.imread(path + '/' + frames[idx].decode('utf-8')) )for idx in frame_idx]
  File "/local/home/user/anaconda3/envs/first-order-model/lib/python3.7/site-packages/skimage/io/_io.py", line 48, in imread
    img = call_plugin('imread', fname, plugin=plugin, **plugin_args)
  File "/local/home/user/anaconda3/envs/first-order-model/lib/python3.7/site-packages/skimage/io/manage_plugins.py", line 210, in call_plugin
    return func(*args, **kwargs)
  File "/local/home/user/anaconda3/envs/first-order-model/lib/python3.7/site-packages/skimage/io/_plugins/imageio_plugin.py", line 10, in imread
    return np.asarray(imageio_imread(*args, **kwargs))
  File "/local/home/user/anaconda3/envs/first-order-model/lib/python3.7/site-packages/imageio/core/functions.py", line 265, in imread
    reader = read(uri, format, "i", **kwargs)
  File "/local/home/user/anaconda3/envs/first-order-model/lib/python3.7/site-packages/imageio/core/functions.py", line 186, in get_reader
    return format.get_reader(request)
  File "/local/home/user/anaconda3/envs/first-order-model/lib/python3.7/site-packages/imageio/core/format.py", line 170, in get_reader
    return self.Reader(self, request)
  File "/local/home/user/anaconda3/envs/first-order-model/lib/python3.7/site-packages/imageio/core/format.py", line 221, in __init__
    self._open(**self.request.kwargs.copy())
  File "/local/home/user/anaconda3/envs/first-order-model/lib/python3.7/site-packages/imageio/plugins/pillow.py", line 298, in _open
    return PillowFormat.Reader._open(self, pilmode=pilmode, as_gray=as_gray)
  File "/local/home/user/anaconda3/envs/first-order-model/lib/python3.7/site-packages/imageio/plugins/pillow.py", line 135, in _open
    pil_try_read(self._im)
  File "/local/home/user/anaconda3/envs/first-order-model/lib/python3.7/site-packages/imageio/plugins/pillow.py", line 680, in pil_try_read
    raise ValueError(error_message)
ValueError: Could not load "" 
Reason: "image file is truncated"
Please see documentation at: http://pillow.readthedocs.io/en/latest/installation.html#external-libraries

It seems like there is a problem with the images, like it's not reading them.

@AliaksandrSiarohin
Owner

Guess you are right; try to print the names of the images that cause the error and inspect them manually.

@alessiapacca

alessiapacca commented Nov 7, 2020

@AliaksandrSiarohin I did that.

It was printing them as bytes, so something like
b'0000000.png'

So I changed that by converting the paths to strings, and the names were correct; it could identify the correct number of frames and also the correct names for the frames. However, after the training had been running for a bit, I got this:

ValueError: Traceback (most recent call last):
  File "/local/home/user/anaconda3/envs/first-order-model/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 138, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/local/home/user/anaconda3/envs/first-order-model/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 138, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/first-order-model/frames_dataset.py", line 161, in __getitem__
    return self.dataset[idx % self.dataset.__len__()]
  File "/first-order-model/frames_dataset.py", line 121, in __getitem__
    video_array = [img_as_float32(io.imread(os.path.join(path, frames[idx]))) for idx in frame_idx]
  File "/first-order-model/frames_dataset.py", line 121, in <listcomp>
    video_array = [img_as_float32(io.imread(os.path.join(path, frames[idx]))) for idx in frame_idx]
  File "/local/home/user/anaconda3/envs/first-order-model/lib/python3.7/site-packages/skimage/io/_io.py", line 48, in imread
    img = call_plugin('imread', fname, plugin=plugin, **plugin_args)
  File "/local/home/user/anaconda3/envs/first-order-model/lib/python3.7/site-packages/skimage/io/manage_plugins.py", line 210, in call_plugin
    return func(*args, **kwargs)
  File "/local/home/user/anaconda3/envs/first-order-model/lib/python3.7/site-packages/skimage/io/_plugins/imageio_plugin.py", line 10, in imread
    return np.asarray(imageio_imread(*args, **kwargs))
  File "/local/home/user/anaconda3/envs/first-order-model/lib/python3.7/site-packages/imageio/core/functions.py", line 265, in imread
    reader = read(uri, format, "i", **kwargs)
  File "/local/home/user/anaconda3/envs/first-order-model/lib/python3.7/site-packages/imageio/core/functions.py", line 182, in get_reader
    "Could not find a format to read the specified file in %s mode" % modename
ValueError: Could not find a format to read the specified file in single-image mode

It is strange because I never had problems with the mp4 format. Maybe it's the "animation" format in the config file that should be .png?

If I print num_frames, frames and path in frames_dataset.py, I get the correct names:
path: /media/user/hdd/vox/train/id10068#5M2EGef.0f4#001945#002065.mp4
num frames: 140
frames: ['0000067.png', '0000000.png', '0000128.png', '0000042.png', '0000024.png', '0000080.png', '0000081.png', '0000087.png', '0000122.png', '0000139.png', '0000025.png', '0000079.png', '0000015.png', '0000021.png', '0000012.png', '0000010.png', '0000044.png', '0000022.png', '0000055.png', '0000125.png', '0000070.png', '0000033.png', '0000065.png', '0000101.png', '0000132.png', '0000103.png', '0000026.png', '0000085.png', '0000074.png', '0000089.png', '0000083.png', '0000001.png', '0000061.png', '0000088.png', '0000041.png', '0000131.png', '0000105.png', '0000097.png', '0000073.png', '0000077.png', '0000110.png', '0000082.png', '0000071.png', '0000109.png', '0000095.png', '0000058.png', '0000098.png', '0000049.png', '0000027.png', '0000048.png', '0000078.png', '0000059.png', '0000066.png', '0000126.png', '0000134.png', '0000005.png', '0000069.png', '0000037.png', '0000057.png', '0000115.png', '0000002.png', '0000031.png', '0000052.png', '0000060.png', '0000117.png', '0000034.png', '0000113.png', '0000006.png', '0000090.png', '0000068.png', '0000133.png', '0000072.png', '0000091.png', '0000019.png', '0000118.png', '0000028.png', '0000045.png', '0000040.png', '0000102.png', '0000023.png', '0000018.png', '0000130.png', '0000029.png', '0000137.png', '0000011.png', '0000035.png', '0000093.png', '0000111.png', '0000106.png', '0000036.png', '0000084.png', '0000053.png', '0000016.png', '0000032.png', '0000136.png', '0000124.png', '0000050.png', '0000020.png', '0000051.png', '0000064.png', '0000100.png', '0000123.png', '0000094.png', '0000039.png', '0000054.png', '0000116.png', '0000121.png', '0000008.png', '0000017.png', '0000099.png', '0000092.png', '0000076.png', '0000063.png', '0000104.png', '0000047.png', '0000138.png', '0000003.png', '0000043.png', '0000129.png', '0000127.png', '0000046.png', '0000108.png', '0000004.png', '0000014.png', '0000096.png', '0000007.png', '0000056.png', '0000114.png', '0000086.png', '0000120.png', '0000038.png', '0000075.png', '0000013.png', '0000135.png', '0000112.png', '0000107.png', '0000030.png', '0000119.png', '0000062.png', '0000009.png']

I even printed the os.path.join output for some of them, and it looks correct:
joining : /media/user/hdd/vox/train/id10098#8f2ReesQMrs#001291#001410.mp4/0000080.png
joining : /media/user/hdd/vox/train/id10098#8f2ReesQMrs#001291#001410.mp4/0000050.png
joining : /media/user/hdd/vox/train/id10748#XaQk7W-ySMo#005166#005379.mp4/0000033.png
joining : /media/user/hdd/vox/train/id10748#XaQk7W-ySMo#005166#005379.mp4/0000148.png
joining : /media/user/hdd/vox/train/id10909#M3rfGq1-lXg#008731#009013.mp4/0000212.png
joining : /media/user/hdd/vox/train/id10909#M3rfGq1-lXg#008731#009013.mp4/0000052.png

@AliaksandrSiarohin
Owner

Yes, yes, this I get. Can you check specifically which image is producing the error and verify whether it is a valid image?

@alessiapacca

Ok, I made it work. It was a problem with some corrupted files.
It will take a long time to train, but I will see whether the results are better this way than with the mp4 version.
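For anyone hitting the same "image file is truncated" error, a small standalone scan like this (not part of the repo; the path is an example) can locate bad frames before training:

import os
from skimage import io

root = '/media/user/hdd/vox/train'   # adjust to your dataset location
for folder in sorted(os.listdir(root)):
    for frame in sorted(os.listdir(os.path.join(root, folder))):
        full_path = os.path.join(root, folder, frame)
        try:
            io.imread(full_path)
        except Exception as err:
            print('corrupted:', full_path, err)   # re-extract or delete these frames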

@Aaron2286

@alessiapacca Hi, I also want to get higher resolution output; I just want to know how your results turned out. Can you tell me? Thank you.

@alessiapacca

@Aaron2286 The training is extremely slow, so I still don't know whether the results will be good or not. I am training it, though.

@Aaron2286

@alessiapacca yes I know, thank you very much. Actually, I am not very good at this, but I think this is very important to my grandmother, so I am studying hard. If there are results, can you provide some information? Thank you.

@sicilyliu

Thank you. I'll give that a try.

@stark-akib Can you share your checkpoints/model weights? Thanks.

@chloejihye

@stark-akib @alessiapacca Hello, can you share the results of your training? :) I'm really curious about the video quality after training at 512, since I'm trying to do the same. Your answer would be much appreciated. Thank you!

@celikmustafa89

TO SUM UP THE WHOLE ISSUE:

There is no model for higher resolution (e.g., 512x512). Am I right? If you have one, can you share it?
