
Different results from allennlp tune and allennlp retrain with transformers #45

Closed
MagiaSN opened this issue Jun 1, 2021 · 4 comments
Labels
bug Something isn't working


@MagiaSN
Contributor

MagiaSN commented Jun 1, 2021

When I am tuning a transformer model, I get different results from allennlp tune and allennlp retrain with the same hyperparameters.

I found that this is caused by the allennlp.common.cached_transformers module: it constructs the model only in the first trial (which consumes some random numbers) and reuses the cached model in later trials (which consumes none), leaving the RNG in a different state and leading to inconsistent results between tune and retrain.
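The interaction can be illustrated without AllenNLP at all: a memoized constructor that draws random numbers only on a cache miss leaves the global RNG in a different state on subsequent calls, so anything seeded identically afterwards diverges. A minimal stdlib sketch (`build_model` and `_cache` are hypothetical stand-ins for `cached_transformers.get` and its internal cache):

```python
import random

_cache = {}

def build_model(name):
    """Stand-in for cached_transformers.get: construction consumes
    random numbers, but only on a cache miss."""
    if name not in _cache:
        _cache[name] = [random.random() for _ in range(3)]  # "weight init"
    return _cache[name]

def run_trial(name):
    random.seed(0)          # every trial seeds identically
    build_model(name)       # consumes RNG only on the first call
    return random.random()  # e.g. dropout / shuffling after model creation

first = run_trial("bert")   # cache miss: init drew 3 numbers first
second = run_trial("bert")  # cache hit: RNG untouched by init
print(first == second)      # False -> trials are not reproducible
```

The first trial returns the 4th number of the seed-0 stream, while every later trial returns the 1st, which is exactly the tune/retrain mismatch described above.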

@MagiaSN
Contributor Author

MagiaSN commented Jun 1, 2021

#46 fixes this for me, but it has to access AllenNLP's private interfaces; is that acceptable?

@himkt himkt self-assigned this Jun 1, 2021
@himkt
Owner

himkt commented Jun 1, 2021

@MagiaSN Thank you so much for the investigation! I agree we have to fix this.
However, it would be better to clear the cache in Optuna's AllenNLPExecutor, since allennlp-optuna is a thin wrapper around Optuna, and it is AllenNLPExecutor (in Optuna) that invokes the AllenNLP functionality.

Would you mind sending the PR to Optuna instead? I would review it if you could send it.
Could you also share a small reproducible configuration? It would be really helpful.

@MagiaSN
Contributor Author

MagiaSN commented Jun 1, 2021

@himkt Sure, I have opened an issue in Optuna with reproducible scripts (optuna/optuna#2716) and a PR (optuna/optuna#2717).

@MagiaSN
Contributor Author

MagiaSN commented Jun 6, 2021

Since this is fixed in Optuna, I am closing this now :)

@MagiaSN MagiaSN closed this as completed Jun 6, 2021