
How to reproduce the results of LSTM on AN4 #4

Open
Jia-zb opened this issue Oct 5, 2022 · 2 comments

Comments


Jia-zb commented Oct 5, 2022

I was interested in your work at PPoPP'22; thank you for making the code open source. I tried to run the LSTM AN4 code, but I cannot achieve the results claimed in the paper (WER = 0.309 or 0.368); I only reach 0.46. I know I'm using a different environment. Perhaps you can give me some suggestions to improve the WER?

Here is the environment I use:

  • 8× A100 GPUs within one server
  • Horovod 0.22.1
  • Command: `horovodrun -np 8 python horovod_trainer.py --dnn lstman4 --dataset an4 --max-epochs 1000 --batch-size 2 --nworkers 8 --data-dir ./audio_data --lr 0.001 --nwpernode 8 --nsteps-update 1`

In addition, I also adjusted the learning rate decay rate (in dl_trainer.py, `_Adjust_Learning_Rate_LSTMan4()`). The original 1.01 may not be suitable for my environment, so I changed it to 1.005.
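To illustrate why this constant matters so much over a 1000-epoch run, here is a minimal sketch of epoch-wise learning-rate annealing. It assumes (this is an assumption, not the repo's actual code) that the trainer divides the learning rate by a constant anneal factor once per epoch; `annealed_lr` is a hypothetical helper name.

```python
def annealed_lr(base_lr: float, anneal: float, epoch: int) -> float:
    """Hypothetical sketch: LR after `epoch` epochs, assuming the trainer
    divides the learning rate by `anneal` once per epoch."""
    return base_lr / (anneal ** epoch)

# With --lr 0.001 over 1000 epochs, the anneal factor compounds heavily:
lr_factor_101 = annealed_lr(0.001, 1.01, 1000)    # LR shrinks by roughly 21000x
lr_factor_1005 = annealed_lr(0.001, 1.005, 1000)  # LR shrinks by roughly 150x
```

Under this assumption, moving from 1.01 to 1.005 leaves the final learning rate more than two orders of magnitude higher, which can easily change convergence behavior.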

Thank you for seeing this, do you have any suggestions?

Shigangli (Owner) commented Oct 5, 2022

Hi, could you try to use batch-size=8 instead of 2? I used global batch size = 64.
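The arithmetic behind this suggestion, under the usual Horovod data-parallel convention (an assumption; the trainer may count batches differently), is that each of the `np` workers processes its own `--batch-size` minibatch per step, optionally accumulating gradients over `--nsteps-update` steps:

```python
def global_batch_size(per_worker_batch: int, nworkers: int, nsteps_update: int = 1) -> int:
    """Effective global batch size under standard data parallelism:
    every worker contributes one per-worker minibatch per update."""
    return per_worker_batch * nworkers * nsteps_update

global_batch_size(2, 8)  # the issue reporter's setting: 16
global_batch_size(8, 8)  # the suggested setting, matching the paper's 64
```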


Jia-zb commented Oct 5, 2022

Thank you for your reply. I tried batch sizes 4 and 8, but the results were even worse (0.73 and 0.92, respectively). I don't know why, but the model seems to prefer the smaller batch size.
