Optimal hyper-parameters #11
Hi authors,
I couldn't find the optimal hyper-parameters to reproduce the results reported in the paper. In detail, I validated the provided model and obtained 84.3%, 72.8%, and 38.4% for slot F1, intent accuracy, and overall accuracy, respectively, which is a large margin below the results reported in the paper (88.3%, 76.3%, and 43.5%).
Could you provide more details and/or your pretrained model with the best result?
Thank you
@phoaiphuthinh Thanks for your interest in our work. Due to some stochastic factors, it is necessary to slightly tune the hyper-parameters using grid search. In our experiments, we carefully tuned the hyper-parameters, including dropout, batch size, and hidden size. In addition, for pre-trained models, we carefully tuned the learning rate (e.g., 1×10^-6 ~ 9×10^-6). Hope it is helpful for you.
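For readers trying to reproduce this, here is a minimal grid-search sketch along the lines described above. Only the learning-rate range (1×10^-6 ~ 9×10^-6) comes from the comment; the dropout, batch-size, and hidden-size candidate lists are assumptions, and `train_and_evaluate` is a hypothetical stand-in for the repo's actual training entry point:

```python
import itertools

# Hypothetical stand-in for the repo's training entry point; swap in the
# actual train/eval call. Should return the dev-set overall accuracy.
def train_and_evaluate(**config):
    raise NotImplementedError(f"train with {config} and return dev accuracy")

# Only the learning-rate range (1e-6 ~ 9e-6) comes from the comment above;
# the other candidate lists are assumed for illustration.
grid = {
    "dropout": [0.3, 0.4, 0.5],
    "batch_size": [16, 32, 64],
    "hidden_size": [64, 128, 256],
    "learning_rate": [i * 1e-6 for i in range(1, 10)],
}

best_score, best_config = float("-inf"), None
for values in itertools.product(*grid.values()):
    config = dict(zip(grid, values))
    score = train_and_evaluate(**config)
    if score > best_score:
        best_score, best_config = score, config

print(best_config, best_score)
```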
@yizhen20133868 Thank you for replying. Could you provide the ranges you used for the grid search so I can reproduce it, or suggest a combination of hyper-parameters that produces comparable results?
For the MixATIS dataset, the hyper-parameter settings we have tried include: word_embedding_dim: [64, 128]; intent_embedding_dim: [64, 128]; decoder_gat_hidden_dim: [64, 128]; slot_graph_window: [2, 3, 4].
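As a sketch, enumerating that grid (parameter names taken verbatim from the comment above) gives 24 candidate configurations to train and compare:

```python
import itertools

# Grid exactly as listed in the comment above.
mixatis_grid = {
    "word_embedding_dim": [64, 128],
    "intent_embedding_dim": [64, 128],
    "decoder_gat_hidden_dim": [64, 128],
    "slot_graph_window": [2, 3, 4],
}

configs = [dict(zip(mixatis_grid, values))
           for values in itertools.product(*mixatis_grid.values())]
print(len(configs))  # 2 * 2 * 2 * 3 = 24 configurations
```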
Thank you for your reply. |