Fail to reproduce result from demo video #5

CompactSupport · 2022-11-01T21:57:19Z

I tried to reproduce the result of the Mandarin lip reading from the following demo video:

I made a clip, "demo_cn.mp4", covering 0:33-0:41 of this video.

My command:

python main.py --config-filename configs/CMLR_V_WER8.0.ini --data-filename inputs/demo_cn.mp4

The output:

load a pre-trained model from: models/CMLR/CMLR_V_WER8.0/model.pth
face tracking speed: 4.90 fps.
hyp: 有一种种的人俗话说的大家人才能真的一年里的一个行

This is different from the one shown in the demo: 中青祝愿大家在新的一年里新春愉快身体健康

I also extracted mouth ROIs from the clip (link).

Would you please let me know if I missed anything?

zhunge · 2022-11-03T09:26:36Z

Try setting the video's frame rate to 25 fps; that may give the correct result.
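
A minimal sketch of how the clip's frame rate could be checked and, if needed, resampled to 25 fps before running main.py. It assumes ffmpeg and ffprobe are installed and on the PATH. The helper names and the output filename are hypothetical and are not part of this repository.

```python
import subprocess
from fractions import Fraction
from typing import List


def parse_frame_rate(text: str) -> float:
    """Convert an ffprobe rate string such as '30000/1001' or '25/1' to fps."""
    return float(Fraction(text.strip()))


def probe_fps(path: str) -> float:
    """Ask ffprobe for the frame rate of the first video stream."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-select_streams", "v:0",
         "-show_entries", "stream=r_frame_rate",
         "-of", "default=noprint_wrappers=1:nokey=1", path],
        capture_output=True, text=True, check=True,
    )
    return parse_frame_rate(result.stdout)


def resample_command(src: str, dst: str, fps: int = 25) -> List[str]:
    """Build an ffmpeg command that re-encodes src at a constant frame rate."""
    return ["ffmpeg", "-y", "-i", src, "-vf", f"fps={fps}", dst]


def ensure_fps(src: str, dst: str, fps: int = 25) -> str:
    """Return a path to a clip at the target fps, re-encoding only if needed."""
    if abs(probe_fps(src) - fps) < 0.01:
        return src  # already at the target rate, nothing to do
    subprocess.run(resample_command(src, dst, fps), check=True)
    return dst


# Hypothetical usage (the output filename is an assumption):
# clip = ensure_fps("inputs/demo_cn.mp4", "inputs/demo_cn_25fps.mp4")
```

The returned path can then be passed to --data-filename. Whether a mismatched frame rate is the actual cause of the wrong hypothesis is not confirmed in this thread.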

fantasyfw · 2023-03-20T12:18:40Z

Same problem: I am not able to reproduce the result. Here is my output:

load a pre-trained model from: models/CMLR/CMLR_V_WER8.0/model.pth
face tracking speed: 5.98 fps.
hyp: 我们总结了祖国的大量餐饮新的一天也成了这样

The frame rate was set to 25 fps in "configs/CMLR_V_WER8.0.ini".

mostafamdy · 2023-04-19T22:44:44Z

Same here. The visual-only model does not work well, but the audio-visual model works perfectly.

mpc001 closed this as completed Nov 19, 2022

cooelf mentioned this issue Jun 1, 2023

Is there an audio-visual Chinese model? #15

Open
