Bad results from char-rnn after CPU training #12
Hello Robert, there are two issues in the sample function.
This, for me, gives much better text, but a boring one: it gets repetitive fast.
(You may have to rename something.)
I hope this helps. You write from Sweden, so maybe this comparison, after training on fairy tales by Hans Christian Andersen (in German) for just 60 epochs (one night on my laptop), can give you an impression.

The current sample function: "- gehtn sir distel wer aon die eeten wurde urin die distag werde, Dir wellst du mur diud aan sied iiem missen!“ -ring niedere aie eahi tüße eineridet werde Iie fährein helle deran sie deste surtehen “

Then the version that tells the net about the chosen character: "1reichen saß eine große Thüre aus dem Schlosse der Schneekönigin saß in dem andern besondert aus dem Schlosse der Schneekönigin saß in dem andern besondert aus dem Schlosse der Schneekönigin saß in dem andern bes

And the last version: "4en, hste zwei gürte, aber es setzte blassenblich in die\r\nKönig tanzen.\r\n\r\n„Das ist eingewehkte sie\r\nnicht mehr niemand mit Blumen genommen; wäre ihr auch auf und flocht es,\r\nalle Seele\r\nseiner Ofer_
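The "boring, repetitive" behaviour of the first fix has a simple cause: decoding with argmax is greedy and deterministic, so once the net revisits a state it loops forever, whereas drawing from the output distribution keeps the text varied. A minimal sketch, in Python rather than the repo's Julia, with a hypothetical toy transition table standing in for the trained net:

```python
import numpy as np

# Hypothetical next-character distributions, standing in for the trained RNN,
# just to illustrate greedy vs. stochastic decoding behaviour.
probs = {
    "a": {"b": 0.6, "c": 0.4},
    "b": {"a": 0.7, "c": 0.3},
    "c": {"a": 0.5, "b": 0.5},
}

def greedy_sample(start, steps):
    """Always pick the argmax next character: deterministic, loops quickly."""
    out, ch = [start], start
    for _ in range(steps):
        dist = probs[ch]
        ch = max(dist, key=dist.get)  # argmax over the distribution
        out.append(ch)
    return "".join(out)

def stochastic_sample(start, steps, rng):
    """Draw the next character from the distribution: varied output."""
    out, ch = [start], start
    for _ in range(steps):
        chars, p = zip(*probs[ch].items())
        ch = str(rng.choice(chars, p=p))
        out.append(ch)
    return "".join(out)

print(greedy_sample("a", 8))  # settles into the "ab" cycle: ababababa
```

The greedy decoder collapses into the `a`/`b` cycle immediately; the stochastic one produces a different string on each seed, which is why sampling-based decoding is the usual choice for char-rnn demos.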
Isn't this equivalent to the
Hi Mike! No. m(s) is a vector of floats between 0 and 1, i.e. a probability vector; onehot(c, ..) is a vector of zeros and a single one. c, the result of the argmax call, isn't given to the net. Say m(s) gives two cells with p=0.5, and all other cells are zero. The net cannot know how argmax is implemented, and thus cannot know which of the two values was used as output. Greetings, z.
@zenon good catch, thanks! This should be fixed now. I'm also trying to replicate my old results with this code, which should confirm it's all working as expected.
I might be doing things the wrong way, but after playing with the char-rnn example for some time, I wonder whether other people can get good performance from it. It is based on a blog post that shows impressive text-generation capabilities after training, and char-rnn.jl seems to be trying to replicate them, but my results are nowhere near those of the blog post. Are people tweaking the example code to get better results, or can anyone get good results from the "vanilla" settings?