Sample Dataset for usage #2 #20

Closed
shikhar-scs opened this issue Apr 9, 2020 · 9 comments

Comments

@shikhar-scs

Hi @Rudrabha,
Amazing work.
I was working on generating video from an image plus audio, and it would be very helpful if you could post a sample image and audio file. I get a different error every time I use a random image.

Thanks!

@prajwalkr
Collaborator

Thank you very much for your interest.

Could you please share some of the errors? If it is a mistake in the code, then we can fix it for everyone.

@shikhar-scs
Author

Hey, I set this up locally (on a Mac) and it worked fine.

On the remote machine (GPUs), there was the following TensorFlow error:

[image: screenshot of the TensorFlow error]

@prajwalkr
Collaborator

The input size must be 96x96. Please ensure this.

If this does not solve it, please report the TensorFlow and Keras versions you have.
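One way to collect the requested version information is sketched below. It is a minimal sketch using only the standard library (Python 3.8+); `report_versions` is a hypothetical helper name, not part of the repository. It looks packages up by their installed distribution name, so a build installed under a different name (for example `tensorflow-gpu`) would have to be added to the list.

```python
from importlib import metadata


def report_versions(packages=("tensorflow", "keras")):
    """Return a {package: version} mapping; packages that are not installed map to None."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed under this exact name.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in report_versions().items():
        print(f"{name}: {version or 'not installed'}")
```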

@prajwalkr
Collaborator

The input size must be 96x96. Please ensure this.

I have also updated the repo code so that it no longer accepts img_size as a variable input parameter.
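For anyone who wants to catch a wrong input size before it reaches the model, a small guard along these lines may help. This is a sketch under assumptions: `check_input_shape` and `EXPECTED_SIZE` are hypothetical names, and it assumes the 96x96 requirement applies to the first two dimensions (height, width) of the array passed to the model.

```python
EXPECTED_SIZE = 96  # from the thread: the input size must be 96x96


def check_input_shape(shape, expected=EXPECTED_SIZE):
    """Raise ValueError unless the first two dimensions are expected x expected."""
    if len(shape) < 2:
        raise ValueError(f"need at least 2 dimensions, got shape {tuple(shape)}")
    height, width = shape[0], shape[1]
    if (height, width) != (expected, expected):
        raise ValueError(
            f"expected a {expected}x{expected} input, got {height}x{width}"
        )
    return True
```

With an image loaded as an array, this would be called as `check_input_shape(img.shape)` before running inference.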

@shikhar-scs
Author

Oh okay, thanks for the help!

@shikhar-scs
Author

shikhar-scs commented Apr 10, 2020

Hey @prajwalkr, another thing.

For training the model, could you please specify the dataset structure? It would be helpful for a lot of people if it were described the way it is here: https://github.com/Hangz-nju-cuhk/Talking-Face-Generation-DAVS#preparing-training-data

@shikhar-scs shikhar-scs reopened this Apr 10, 2020
@prajwalkr
Collaborator

Done.

@shikhar-scs
Author

Hi @Rudrabha, I finally completed the end-to-end setup and, more importantly, everything is working now.
There are just a few minor bugs in preprocess.py. I am pointing them out so that future users don't have to spend extra time.

The split on lines 116 and 118 should probably be args.split:

LipGAN/preprocess.py

Lines 116 to 118 in 16ca935

for line in open(path.join(args.filelists, '{}.txt'.format(split))).readlines()]
jobs = [(vfile, args, split) for vfile in filelist]
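A self-contained sketch of the suggested fix follows. It is not the repository's code: `read_filelist` and `build_jobs` are hypothetical helpers written for illustration, and it assumes `args` carries a `filelists` directory and a `split` name, as the snippet above implies. The point is only that the split name is read from `args.split` rather than from an undefined `split` variable.

```python
import argparse
from os import path


def read_filelist(filelists_dir, split):
    """Return the non-empty lines of <filelists_dir>/<split>.txt."""
    list_path = path.join(filelists_dir, '{}.txt'.format(split))
    with open(list_path) as f:
        return [line.strip() for line in f if line.strip()]


def build_jobs(args):
    """Build one (video file, args, split) tuple per entry in the filelist."""
    filelist = read_filelist(args.filelists, args.split)
    return [(vfile, args, args.split) for vfile in filelist]


if __name__ == "__main__":
    # Hypothetical command-line flags, named after the attributes used above.
    parser = argparse.ArgumentParser()
    parser.add_argument("--filelists", required=True)
    parser.add_argument("--split", required=True)
    print(len(build_jobs(parser.parse_args())), "jobs")
```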

Also, sr (the sampling rate) is undefined here as well. Is there a specific value that should be used? For now I've set it to None, to preserve the native sampling rate of the file.

wav = audio.load_wav(wavpath, sr)

Will let you know if I come across any others. Thanks again for the work!

@prajwalkr
Collaborator

Please use sr=16000. I will correct these two in preprocess.py right away. Thank you very much.
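To check whether an audio file is already at 16000 Hz before preprocessing, something like the following could be used. It is a minimal sketch using only the standard-library `wave` module, so it handles uncompressed PCM WAV files only and does not resample anything. `wav_sample_rate`, `is_target_rate`, and `TARGET_SR` are hypothetical names, not part of the repository's `audio` module.

```python
import wave

TARGET_SR = 16000  # sampling rate recommended in the thread


def wav_sample_rate(wav_path):
    """Return the sampling rate (Hz) stored in a PCM WAV file's header."""
    with wave.open(wav_path, "rb") as wav_file:
        return wav_file.getframerate()


def is_target_rate(wav_path, target_sr=TARGET_SR):
    """Return True if the file's sampling rate equals target_sr."""
    return wav_sample_rate(wav_path) == target_sr
```

A file that fails this check would still need to be resampled by whatever loader is used, for example by passing sr=16000 as suggested above.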
