Training with new dataset #10
Comments
I've used logloss = nn.BCEWithLogitsLoss() to solve this problem. But then I realized it tended toward the vanishing-gradient problem again, so I decided to apply a Wasserstein distance loss |
Many thanks, I'll try this loss. |
I recently trained this model, but the discriminator's eval loss is only 0.30 and cannot go down to ~0.25. |
Let me know your batch_size, learning_rate, and training logs. |
Can you give me some of your samples? |
Sorry for the late reply. One sample is: en1_88.mp4. It doesn't sync well. |
have you used Wasserstein distance or BCE loss? |
I used both ReLU and BCELoss. I also tried Wasserstein distance, but it was even worse. What do you mean by "scale up the audio block"? |
Why don't you use LeakyReLU instead of ReLU? |
When I used LeakyReLU, the loss did not converge, so I tried ReLU. Many thanks, I will scale up the audio block. |
OK, if you need some help, you can ask in this thread. |
Can you tell me how to compute the Wasserstein distance in SyncNet's cosine_loss(a, v, y)? There are options like torch.mean(y*d) or scipy.stats.wasserstein_distance, but neither seems right. |
you need to understand theory and implement from scratch |
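To make the discussion concrete, here is a minimal sketch of the two losses mentioned in this thread: the BCE-on-cosine-similarity loss and the torch.mean(y*d) Wasserstein-style surrogate. This is a hypothetical illustration, not the repo's confirmed implementation; the function names and tensor shapes are assumptions.

```python
import torch
import torch.nn.functional as F

def cosine_bce_loss(a, v, y):
    # Wav2Lip-style SyncNet loss: BCE on the cosine similarity.
    # a, v: (B, D) audio / video embeddings; y: (B, 1) labels in {0, 1}.
    d = F.cosine_similarity(a, v).unsqueeze(1)   # (B, 1), values in [-1, 1]
    # The with-logits form accepts any real-valued score (plain BCELoss
    # would reject negative similarities).
    return F.binary_cross_entropy_with_logits(d, y)

def wasserstein_surrogate(a, v, y):
    # Critic-style surrogate discussed in this thread: push the score up
    # for in-sync pairs (y=1) and down for off-sync pairs (y=0).
    d = F.cosine_similarity(a, v)                # (B,)
    sign = 2.0 * y.squeeze(1) - 1.0              # map {0, 1} -> {-1, +1}
    return -torch.mean(sign * d)
```

Note that for this surrogate to approximate a Wasserstein distance, the critic must be roughly 1-Lipschitz, which WGAN enforces with weight clipping and WGAN-GP with a gradient penalty; without such a constraint it is only a signed score margin, which may explain the poor results reported above.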
Hi, what is your infrastructure? 10 GPUs? Tesla V100? |
now I use 4 * V100 |
With batch size 128, how much memory does it take up? |
With batch size 64, it takes less than 13G. My V100 is 16GB. |
With color_syncnet_train.py, I trained with batch_size 128 and lr=1e-5; it has converged to 0.23. |
I'm checking my wasserstein distance loss |
thanks. |
can you show me your Wasserstein loss? |
I followed this blog. This example was generated using BCELoss and ReLU. result_voice.mp4 |
good job bro :) I can see your result is very good |
Hello, can you please suggest how to scale up the audio block, and does it make sense? |
Hello) |
I would like to buy this model; how much is it? Contact me at 1243137612@qq.com @MingZJU |
The loss from that blog is very simple: torch.mean(torch.mul(y_true, y_pred)), which doesn't seem right; I may reimplement it from scratch in my spare time. Do you train the model using the AVSpeech dataset? @MingZJU |
You can follow Primepake's code and obtain your results. I downloaded AVSpeech but have no time to train a model. |
Thanks. I use the AVSpeech dataset, and I did the following:
1. choose the 25fps videos
2. use syncnet_python to filter the dataset to the offset range [-1, 1]
Finally I use 32,000 videos to train the SyncNet. I also use BCELoss and ReLU, but the loss can only drop to 0.5. Can you give me some suggestions? And how long did it take you to drop the loss to 0.3? @MingZJU |
I used a Mandarin video dataset collected by my partners: around 10,000 video clips with fewer than 1,000 people. I think BCELoss and ReLU are OK; the SyncNet loss should drop below 0.5 within a few (or a few dozen) epochs. You may check your data again and try modifying the lr. You can also try scaling up the audio encoder, though I haven't found it to have a positive effect. |
Thanks. Also, how many epochs did it take you to drop the loss below 0.3? @MingZJU |
It takes 750-1250 epochs to get there, depending on your data, optimizer, and LR. |
Thanks for your suggestion. I use ReLU, BCELoss, and AdamW, and the train loss can drop below 0.3. But the eval loss is about 0.42, and it seems to increase as the train loss decreases, which looks like overfitting. I'm now using 32,000 videos to train the SyncNet. Did you run into this problem? Can you give me some suggestions? @MingZJU @NikitaKononov |
Hello, I was stopped at "using syncnet_python to filter dataset in range [-1,1]". Can you paste your processing code? def parse_args(): ... if __name__ == "__main__": ... |
I get an error I cannot solve: RuntimeError: mat1 dim 1 must match mat2 dim 0. I followed the issue. |
May I ask how you use syncnet_python to batch-filter datasets within the range [-1, 1]? I have thousands of videos. |
Do you know how to filter the dataset, and how to use syncnet_python? |
On Linux I run bash calculate_scores_real_videos.sh you_folder_name, which generates a results file all_scores.txt under the syncnet_python directory. The key code is: offset, conf, dist = s.evaluate(opt, videofile=fname) |
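The filtering step described above can be sketched as a small post-processing pass over all_scores.txt. This is a hypothetical illustration: the column layout depends on how the shell script writes the file, and the "&lt;offset&gt; &lt;conf&gt; &lt;dist&gt; &lt;videofile&gt;" layout assumed here (and the function name) should be adjusted to match your actual output.

```python
# Hypothetical sketch: keep only videos whose sync offset, as reported
# by syncnet_python, is within +/-1 frame of zero ("range [-1, 1]").
# Assumes each line looks like "<offset> <conf> <dist> <videofile>".
def filter_in_sync(score_lines, max_abs_offset=1.0):
    kept = []
    for line in score_lines:
        parts = line.split()
        if len(parts) < 4:
            continue  # skip blank or malformed lines
        offset = float(parts[0])
        if abs(offset) <= max_abs_offset:
            kept.append(parts[3])  # video filename column
    return kept
```

In practice you would read the lines with open("all_scores.txt") and move or symlink the kept videos into your training directory.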
Hi, Happy Christmas!
I'd like to train the model with hq images (320X320) and need some help.
There is a loss error at line 133 of "color_syncnet_train.py": the loss input is outside (0, 1).
It seems to work when I change the loss function
from
logloss = nn.BCELoss()
to
logloss = nn.BCEWithLogitsLoss()
Is it OK? Do I have to make other changes?
The training has lasted more than 4 days now, with loss around 0.54 after 940,000 steps.
Thanks in advance.
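For context on the error described in this issue: cosine similarity lies in [-1, 1], but nn.BCELoss requires inputs in [0, 1], so any negative similarity triggers the out-of-range error; nn.BCEWithLogitsLoss applies a sigmoid internally and accepts any real score. A minimal sketch (tensor shapes are illustrative, not the repo's actual ones):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

a = torch.randn(4, 512)                      # audio embeddings (example shape)
v = torch.randn(4, 512)                      # video embeddings (example shape)
y = torch.randint(0, 2, (4, 1)).float()      # 1 = in sync, 0 = off sync

d = F.cosine_similarity(a, v).unsqueeze(1)   # values in [-1, 1], may be negative

# nn.BCELoss()(d, y) raises as soon as any element of d is negative.
loss = nn.BCEWithLogitsLoss()(d, y)          # well-defined for any real d
```

One caveat of this swap: since d is bounded in [-1, 1], sigmoid(d) only spans roughly (0.27, 0.73), so the minimum achievable loss is about -ln(0.73) ≈ 0.31 rather than 0. A plateau near 0.3 can therefore be the floor of this formulation rather than a sign of underfitting.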