
How to download and generate the data set #1

Open
miyamamoto opened this issue Mar 8, 2017 · 7 comments

Comments

@miyamamoto

That's a great task.
Please provide a sample dataset for running the code.

@klauscc
Owner

klauscc commented Mar 9, 2017

The model part is easy to understand, I suppose!
I have been trying to replicate this work recently, but it still performs very badly: on our small word-level dataset it only achieves 57% accuracy (with the GRU layers removed and log-loss instead of CTC).
If you are working on this too, I would be glad to discuss it with you.
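For context, the architecture being replicated is roughly three spatiotemporal conv blocks, two Bi-GRU layers, and a CTC-trained softmax. A minimal Keras sketch, with input shape and layer sizes assumed from the paper rather than taken from this repo:

```python
from keras.layers import (Input, Conv3D, MaxPooling3D, TimeDistributed,
                          Flatten, Bidirectional, GRU, Dense)
from keras.models import Model

# 75 frames of 50x100 RGB mouth crops (GRID-style input; shape assumed).
inputs = Input(shape=(75, 50, 100, 3))

# Three spatiotemporal conv blocks, as described in the paper.
x = Conv3D(32, (3, 5, 5), strides=(1, 2, 2), padding='same',
           activation='relu')(inputs)
x = MaxPooling3D(pool_size=(1, 2, 2))(x)
x = Conv3D(64, (3, 5, 5), padding='same', activation='relu')(x)
x = MaxPooling3D(pool_size=(1, 2, 2))(x)
x = Conv3D(96, (3, 3, 3), padding='same', activation='relu')(x)
x = MaxPooling3D(pool_size=(1, 2, 2))(x)

x = TimeDistributed(Flatten())(x)  # one feature vector per frame

# Two Bi-GRU layers over the frame sequence.
x = Bidirectional(GRU(256, return_sequences=True))(x)
x = Bidirectional(GRU(256, return_sequences=True))(x)

# Per-frame distribution over the characters plus the CTC blank (28 assumed).
y_pred = Dense(28, activation='softmax')(x)

model = Model(inputs=inputs, outputs=y_pred)
# Training attaches a CTC loss to y_pred (omitted here); the word-level
# variant mentioned above drops the GRUs and trains with plain log-loss.
```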

@rizkiarm

rizkiarm commented May 2, 2017

Stumbled upon your work!

I see that you're working on the same thing as what I did here: https://github.com/rizkiarm/LipNet
Using that model, I managed to achieve 10.2% CER, 15.0% WER, and an 84.4% BLEU score in 15 epochs, which is only about 3% more error than the original model. You can check it out and use it if you're interested.

It is still under development. It would be great if you could contribute to its development and share some results with me :)

@klauscc
Owner

klauscc commented May 4, 2017

My results are:
WER 13.2%, CER 2.44% on seen speakers (unlike the paper, the test set was not included in the training set);
WER 23.7%, CER 5.76% on unseen speakers (1, 2, 20, 21).
The CER is nearly the same as the paper reports, but the WER is far higher than the paper's.
In the paper they split sentences into words, but I got worse results when I did that.
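For anyone comparing these numbers: CER and WER here are just edit distance over characters and words respectively, normalized by the reference length. A minimal self-contained sketch, not the exact implementation used in either repo:

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences via dynamic programming."""
    m, n = len(ref), len(hyp)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[m][n]

def cer(ref, hyp):
    """Character error rate: character edit distance / reference length."""
    return edit_distance(list(ref), list(hyp)) / len(ref)

def wer(ref, hyp):
    """Word error rate: word edit distance / number of reference words."""
    return edit_distance(ref.split(), hyp.split()) / len(ref.split())

# GRID-style sentence: one substituted word, one substituted character.
print(wer("bin blue at f two now", "bin blue at f too now"))  # 1/6 ~ 0.167
print(cer("bin blue at f two now", "bin blue at f too now"))  # 1/21 ~ 0.048
```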

@rizkiarm

rizkiarm commented May 4, 2017

How did you manage to outperform the paper's CER on unseen speakers?
From what I can see, you haven't applied any postprocessing to the output, didn't implement a language model for decoding, and didn't use any special strategy.
May I know how many epochs your model was trained for?

@klauscc
Owner

klauscc commented May 4, 2017

I compute all the metrics in the callback lipnet-replication/model/lipnet.py/StatisticCallback, which evaluates the performance on the validation set at the end of each epoch. For decoding I only use greedy search at the moment.
The best model (selected by the loss on the seen-speaker test set) was trained for 176 epochs.
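Such an epoch-end callback looks roughly like the sketch below. This is an illustration, not the repo's actual StatisticCallback; `labels_to_text` and the `cer`/`wer` helpers sketched earlier are assumed:

```python
import numpy as np
import keras.backend as K
from keras.callbacks import Callback

class StatisticCallback(Callback):
    """Evaluate CER/WER on the validation set at the end of every epoch."""

    def __init__(self, predict_model, val_inputs, val_texts, input_length):
        self.predict_model = predict_model  # model emitting per-frame softmax
        self.val_inputs = val_inputs
        self.val_texts = val_texts          # ground-truth transcripts
        self.input_length = input_length    # number of frames per sample

    def on_epoch_end(self, epoch, logs=None):
        y_pred = self.predict_model.predict(self.val_inputs)
        lengths = np.full((y_pred.shape[0],), self.input_length)
        # Greedy (best-path) CTC decoding; no language model involved.
        decoded, _ = K.ctc_decode(y_pred, lengths, greedy=True)
        labels = K.get_value(decoded[0])  # padded with -1 at the end
        # labels_to_text is an assumed helper mapping label ids (skipping
        # the -1 padding) back to a string.
        hyps = [labels_to_text(seq) for seq in labels]
        mean_cer = np.mean([cer(r, h) for r, h in zip(self.val_texts, hyps)])
        mean_wer = np.mean([wer(r, h) for r, h in zip(self.val_texts, hyps)])
        print('epoch %d: CER %.4f WER %.4f' % (epoch, mean_cer, mean_wer))
```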

@klauscc
Owner

klauscc commented May 4, 2017

The CER is the same as the mean edit distance.
I noticed a difference between my code and the code released by the author: he adds a dropout layer after each Bi-GRU layer, but I do not. When I tried adding dropout after the GRUs, the CER on seen speakers matched the paper's accuracy, but on unseen speakers the CER and WER were very bad.
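In Keras terms the difference is roughly the following; the 0.5 rate, layer size, and input shape are assumptions for illustration, so check the author's released code for the exact values:

```python
from keras.layers import Bidirectional, Dropout, GRU, Input
from keras.models import Model

def bigru_stack(x, with_dropout, rate=0.5):
    """Two Bi-GRU layers, each optionally followed by dropout."""
    for _ in range(2):
        x = Bidirectional(GRU(256, return_sequences=True))(x)
        if with_dropout:
            x = Dropout(rate)(x)  # present in the author's code, absent here
    return x

inputs = Input(shape=(75, 1728))  # (frames, features); shape illustrative
outputs = bigru_stack(inputs, with_dropout=True)
model = Model(inputs, outputs)
```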

@rizkiarm

rizkiarm commented May 4, 2017

I see, that explains why your model achieved that much better accuracy than mine (176 epochs compared to 15 epochs).
Yeah, I know that CER is edit distance; that's why it surprised me.
How bad is it?
