Sample Dataset for usage #2 #20

shikhar-scs · 2020-04-09T11:29:27Z

Hi @Rudrabha,
Amazing work.
I was working on generating video from image + audio, and it would be very helpful if you could post a sample image and audio file. I get a different error every time I use a random image.

Thanks!

prajwalkr · 2020-04-09T14:32:00Z

Thank you very much for your interest.

Could you please share some of the errors? If it is a mistake in the code, then we can fix it for everyone.

shikhar-scs · 2020-04-09T17:40:59Z

Hey, I set this up locally (Mac) and it worked fine.

On the remote machine (GPUs) there was the following TensorFlow error

prajwalkr · 2020-04-09T18:19:21Z

The input size must be 96x96. Please ensure this.

If this doesn't solve it, please report the TensorFlow and Keras versions you have.

prajwalkr · 2020-04-09T18:32:04Z

The input size must be 96x96. Please ensure this.

I have also updated the repo code so that img_size is no longer a variable input parameter.
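
For anyone hitting the same error, here is a minimal sketch of a shape check to run before inference. It is not code from the repo: the helper name `is_valid_input_shape` and the constant `IMG_SIZE` are made up for illustration, and the only fact taken from this thread is that the input must be 96x96.

```python
IMG_SIZE = 96  # required input size stated in this thread


def is_valid_input_shape(shape, img_size=IMG_SIZE):
    """Return True if shape starts with (img_size, img_size).

    shape is expected to be (height, width) or (height, width, channels),
    e.g. the .shape of an image array. Hypothetical helper, not a repo API.
    """
    return len(shape) >= 2 and shape[0] == img_size and shape[1] == img_size


print(is_valid_input_shape((96, 96, 3)))    # True
print(is_valid_input_shape((128, 128, 3)))  # False
```

If the check fails, resize or re-crop the face image to 96x96 before passing it in.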

shikhar-scs · 2020-04-10T02:14:22Z

Oh okay, thanks for the help!

shikhar-scs · 2020-04-10T10:36:19Z

Hey @prajwalkr, another thing.

For training the model, could you please specify the dataset structure? It would be helpful for a lot of people if it were documented the way it is here: https://github.com/Hangz-nju-cuhk/Talking-Face-Generation-DAVS#preparing-training-data

prajwalkr · 2020-04-10T13:11:39Z

Done.

shikhar-scs · 2020-04-11T05:23:18Z

Hi @Rudrabha, I finally completed the end-to-end setup and, more importantly, everything is working now.
There are just a few minor bugs in preprocess.py. I'm pointing them out so that future users don't have to spend extra time.

The split in lines 116 and 118 should probably be args.split

LipGAN/preprocess.py

Lines 116 to 118 in 16ca935

    
            for line in open(path.join(args.filelists, '{}.txt'.format(split))).readlines()]

    jobs = [(vfile, args, split) for vfile in filelist]
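
A rough sketch of what the suggested fix could look like, assuming `split` is meant to come from the parsed command-line arguments. The attribute names `args.filelists` and `args.split` are from the snippet above; the flag spellings `--filelists` and `--split`, their defaults, and the helper names are assumptions for illustration, not the repo's actual code.

```python
import argparse
from os import path


def parse_args(argv=None):
    # Flag names and defaults are assumed; only the attribute names
    # (filelists, split) come from the quoted snippet.
    parser = argparse.ArgumentParser()
    parser.add_argument('--filelists', default='filelists')
    parser.add_argument('--split', default='train')
    return parser.parse_args(argv)


def read_filelist(args):
    # Use args.split instead of a bare `split`, which is undefined here.
    list_path = path.join(args.filelists, '{}.txt'.format(args.split))
    with open(list_path) as f:
        # Skip blank lines and strip trailing newlines.
        return [line.strip() for line in f if line.strip()]


def build_jobs(args):
    # One job tuple per video file, carrying the split name along.
    return [(vfile, args, args.split) for vfile in read_filelist(args)]
```

With a file such as `filelists/train.txt` containing one video path per line, `build_jobs(parse_args(['--split', 'train']))` would return one tuple per listed path.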

Also, sr (the sampling rate) is undefined here as well. Is there a specific value that should be used? (For now I've set it to None to preserve the file's native sampling rate.)

LipGAN/preprocess.py

Line 87 in 16ca935

wav = audio.load_wav(wavpath, sr)

Will let you know if I come across any others. Thanks again for the work!

prajwalkr · 2020-04-11T05:25:15Z

Please use sr=16000. I will correct these two in preprocess.py right away. Thank you very much.
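
For reference, a small sketch of how one might check that a WAV file is already at 16 kHz before preprocessing, using only the standard-library `wave` module. It only reads the header and does not resample. The helper names are hypothetical and not part of the repo, and it assumes uncompressed PCM WAV input.

```python
import wave

EXPECTED_SR = 16000  # sampling rate recommended in this thread


def wav_sample_rate(wav_path):
    """Return the sampling rate stored in a PCM WAV file's header."""
    with wave.open(wav_path, 'rb') as w:
        return w.getframerate()


def is_expected_rate(wav_path, expected=EXPECTED_SR):
    """True if the file's native sampling rate matches the expected one."""
    return wav_sample_rate(wav_path) == expected


def write_silence(wav_path, sr=EXPECTED_SR, seconds=1):
    """Write a mono 16-bit WAV of silence, handy for trying the check."""
    with wave.open(wav_path, 'wb') as w:
        w.setnchannels(1)   # mono
        w.setsampwidth(2)   # 16-bit samples
        w.setframerate(sr)
        w.writeframes(b'\x00\x00' * sr * seconds)
```

Files that fail the check would need to be resampled to 16 kHz rather than loaded at their native rate.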

shikhar-scs closed this as completed Apr 10, 2020

shikhar-scs reopened this Apr 10, 2020

prajwalkr closed this as completed Apr 10, 2020
