[BUG] #15
Comments
Thanks! We will check it in a few days.
Please check that you have re-installed ConvLab-2, since the code has changed.
Yeah, I have already changed the code; I'm sure about that.
I have tried these:
Oh, that is cool. For me, PPO is also good, but for GDPL there are still some problems.
Hi @zqwerty, thanks for the tips. This works for the commit 2422980!
Moved to #54
Hey guys, I tested PPO on the latest version of ConvLab-2 today and got a success rate of 84%, which is way higher than the reported 73%. I wonder if there is a mistake somewhere? If not, I think the performance record should be updated.
Describe the bug
Training from an MLE model does not work on this platform.
When I load the MLE model for GDPL, PPO, or PG, training runs with no problem, but it never reaches the optimal score (I ran evaluate.py to check the models). In fact, the score goes down after a few epochs. Here is a graph I made for GDPL; PPO looks very similar.
To Reproduce
Steps to reproduce the behavior:
Simply run train.py in PG/GDPL/PPO, and it will reproduce this issue. I wrote a script that evaluates all of the models in one directory; here is the graph I made for GDPL.
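For reference, a minimal sketch of such a batch-evaluation script. The `<epoch>_pol.mdl` checkpoint naming and the `evaluate_policy` helper are assumptions, not ConvLab-2 APIs; adjust both to match how train.py actually saves checkpoints and how evaluate.py scores a single model.

```python
import glob
import os
import re


def epoch_of(path):
    """Extract the epoch number from a checkpoint name like '12_pol.mdl'.

    The '<epoch>_pol.mdl' naming is an assumption; change the pattern
    to match however your checkpoints are actually saved.
    """
    m = re.search(r"(\d+)_pol\.mdl$", os.path.basename(path))
    return int(m.group(1)) if m else -1


def sorted_checkpoints(paths):
    """Return checkpoint paths ordered by training epoch."""
    return sorted(paths, key=epoch_of)


if __name__ == "__main__":
    # Hypothetical driver: walk every checkpoint in one directory in
    # epoch order, so drops in the score curve are easy to spot.
    for ckpt in sorted_checkpoints(glob.glob("save/*_pol.mdl")):
        # evaluate_policy is a stand-in for whatever evaluate.py does
        # for one model; it is not a real ConvLab-2 function.
        # score = evaluate_policy(ckpt)
        print(ckpt)
```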
Expected behavior
The evaluation score should increase after loading the MLE model.
Score for PPO:
[0.52, 0.53, 0.54, 0.49, 0.44, 0.49, 0.46, 0.47, 0.44, 0.43, 0.42, 0.44, 0.47, 0.48, 0.49, 0.46, 0.45, 0.46, 0.45, 0.48, 0.46, 0.48, 0.49, 0.49, 0.49, 0.48, 0.45, 0.47, 0.43, 0.43, 0.43, 0.42, 0.42, 0.41, 0.42, 0.43, 0.44, 0.47, 0.45, 0.43]
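As a quick sanity check, the peak of the score list above can be computed directly:

```python
# PPO evaluation scores copied from the run above
scores = [0.52, 0.53, 0.54, 0.49, 0.44, 0.49, 0.46, 0.47, 0.44, 0.43,
          0.42, 0.44, 0.47, 0.48, 0.49, 0.46, 0.45, 0.46, 0.45, 0.48,
          0.46, 0.48, 0.49, 0.49, 0.49, 0.48, 0.45, 0.47, 0.43, 0.43,
          0.43, 0.42, 0.42, 0.41, 0.42, 0.43, 0.44, 0.47, 0.45, 0.43]
print(max(scores))  # 0.54, well below the reported 0.74
```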
So the maximum PPO score only reaches 0.54, nowhere near 0.74.