Optimal hyper-parameters #11

Closed
thinhphp opened this issue Mar 10, 2023 · 4 comments

thinhphp commented Mar 10, 2023

Hi authors,

I couldn't find the optimal hyper-parameters to reproduce the results reported in the paper. Specifically, when I evaluate the model you provided, I get 84.3%, 72.8%, and 38.4% for Slot F1, Intent accuracy, and Overall accuracy, respectively. That is well below the results reported in the paper (88.3%, 76.3%, and 43.5%).
Could you provide more details and/or your pretrained model with the best result?

Thank you

@thinhphp thinhphp changed the title Optimal parameters Optimal hyper-parameters Mar 11, 2023
@yizhen20133868
Owner

@phoaiphuthinh Thanks for your interest in our work. Because of stochastic factors in training, the hyper-parameters need to be tuned slightly with grid search. In our experiments, we carefully tuned dropout, batch size, and hidden size. In addition, for pre-trained models, we carefully tuned the learning rate (such as 1×10^-16 ~ 9×10^-6). Hope this is helpful for you.

@thinhphp
Author

thinhphp commented Mar 13, 2023

@yizhen20133868 Thank you for replying. Could you provide the ranges you used for the grid search so that I can reproduce it, or suggest a combination of hyper-parameters that produces comparable results?

@awake020
Collaborator

For the MixATIS dataset, the hyperparameter settings we have tried include: word_embedding_dim: [64, 128]; intent_embedding_dim: [64, 128]; decoder_gat_hidden_dim: [64, 128]; slot_graph_window: [2, 3, 4].
For the MixSNIPS dataset, the main hyperparameter settings we have tried are: word_embedding_dim: [64, 128]; intent_embedding_dim: [64, 128]; decoder_gat_hidden_dim: [64, 128]; slot_graph_window: [1]. Thank you for your interest in this work; we hope this helps.
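
For anyone who wants to walk through these combinations, here is a minimal sketch that only enumerates the settings listed above. The parameter names and values come from this comment; `SEARCH_SPACES` and `iter_configs` are hypothetical helper names, and how each configuration gets passed to the training script is not shown here because it depends on the repository.

```python
from itertools import product

# Search spaces as listed in the comment above.
# The dictionary layout itself is an assumption made for illustration.
SEARCH_SPACES = {
    "MixATIS": {
        "word_embedding_dim": [64, 128],
        "intent_embedding_dim": [64, 128],
        "decoder_gat_hidden_dim": [64, 128],
        "slot_graph_window": [2, 3, 4],
    },
    "MixSNIPS": {
        "word_embedding_dim": [64, 128],
        "intent_embedding_dim": [64, 128],
        "decoder_gat_hidden_dim": [64, 128],
        "slot_graph_window": [1],
    },
}


def iter_configs(space):
    """Yield one dict per combination of values in the search space."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))


if __name__ == "__main__":
    for dataset, space in SEARCH_SPACES.items():
        configs = list(iter_configs(space))
        print(f"{dataset}: {len(configs)} configurations")
        # Each config would be handed to one training run (not shown here).
        print("  first:", configs[0])
```

This gives 24 combinations for MixATIS and 8 for MixSNIPS, before any additional tuning of dropout, batch size, or learning rate mentioned earlier in the thread.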

@thinhphp
Author

thinhphp commented Mar 15, 2023

Thank you for your reply.
