This repository has been archived by the owner on Feb 9, 2023. It is now read-only.

Repeating Benchmark results #23

Open
st-tomic opened this issue Apr 6, 2020 · 2 comments
Labels
question Further information is requested

Comments


st-tomic commented Apr 6, 2020

Hi @rolczynski

I am experimenting with your code and would like to know how to reproduce the benchmark results from the table.

Is it the pipeline from the README, with 25 epochs and a batch size of 32? How many GPUs did you use (4x8, I guess)?

I assume the dataset is the full LibriSpeech.
Was data augmentation used?

Does the code support decoding on the whole dev-clean subset?

Owner

rolczynski commented Apr 8, 2020

Hey @st-tomic

My primary goal was slightly different: I just wanted to provide a good, open-source Polish ASR. I experimented with Mozilla DeepSpeech, Kaldi, etc. There have been several attempts, but, well... they are overcomplicated and too specific for further research. So I decided to build this little package from scratch.

OK, back to the question. To make this package more general, I had to adjust my aim and provide an English model. I plan to train a model from scratch, but for now I have adapted the English model from the Seq2Seq repository (here is the NVIDIA documentation, here is the configuration file where you can find detailed information, and here is my model adaptation file; it should be compatible with what we have here).

I do not want to get stuck with CTC-based models. In the coming months, I will build the second version of this package, where I will introduce a Transformer-based English ASR (I am quite fascinated by NLP in general; check out my new repo: Aspect Based Sentiment Analysis).

PS. The presented result is for the greedy decoder. In my opinion, sophisticated decoding algorithms are old-fashioned and crude, aren't they? ;)
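
For readers unfamiliar with the term, greedy decoding for a CTC-based model can be sketched roughly as below: take the best-scoring symbol in every frame, collapse consecutive repeats, and drop the blank symbol. This is only an illustration, not the package's actual implementation; the function name `ctc_greedy_decode`, the toy two-letter alphabet, and the choice of the last index as the blank are all assumptions made for the example.

```python
def ctc_greedy_decode(frame_scores, alphabet, blank_index):
    """Greedy (best-path) CTC decoding over per-frame scores.

    frame_scores: list of frames, each a list of scores per symbol
                  (probabilities or log-probabilities both work).
    alphabet:     list of output characters (hypothetical toy alphabet).
    blank_index:  index of the CTC blank symbol (assumed, not from the repo).
    """
    # Step 1: pick the highest-scoring symbol index in every frame.
    best_path = [max(range(len(frame)), key=frame.__getitem__)
                 for frame in frame_scores]

    # Step 2: collapse consecutive repeats, then drop blanks.
    decoded = []
    previous = None
    for index in best_path:
        if index != previous and index != blank_index:
            decoded.append(alphabet[index])
        previous = index
    return "".join(decoded)


# Toy input: 6 frames over the symbols ["a", "b", <blank>].
toy_alphabet = ["a", "b"]
toy_blank = 2
toy_scores = [
    [0.90, 0.05, 0.05],  # a
    [0.80, 0.10, 0.10],  # a (repeat, collapsed)
    [0.10, 0.10, 0.80],  # blank (separates the two a's)
    [0.70, 0.20, 0.10],  # a
    [0.10, 0.80, 0.10],  # b
    [0.20, 0.70, 0.10],  # b (repeat, collapsed)
]
result = ctc_greedy_decode(toy_scores, toy_alphabet, toy_blank)
print(result)
```

Unlike beam search with a language model, this looks at each frame independently, which is why it is fast but can lose accuracy.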

@rolczynski rolczynski added the question Further information is requested label Apr 8, 2020
@st-tomic
Author

Hi @rolczynski,

Thanks for the feedback and the interesting info. I agree that we should look at the bigger picture as well :)

I am looking forward to seeing your future work.

Best regards.
