A bug to solve #2
Dear i3esn0w, thanks for your input. However, this repo uses an older version of Keras and is outdated. Now it is only used for storing my previous experiment files. I changed the description to: ClockworkSGU is a clockwork version of my previous model SGU. I do not know what you would like to do with it. If you would like to port Clockwork to some other libraries, you can look at this class (link). I think anyone with experience with older Keras should know how to port it. If you would like to check the code of SGU and DSGU, please have a look at this link. Actually, both SGU and DSGU have some problems with multi-class classification (it only gives the best class but does not provide a probability distribution over all classes). Please do not hesitate to tell me what you would like to do with the code, so I can help you with it.
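(For illustration only, not code from this repo: a minimal NumPy sketch of the distinction above. A sigmoid output layer scores each class independently, so argmax can still pick the right class, but the scores do not sum to 1 and therefore are not a probability distribution; softmax normalizes over all classes.)

```python
import numpy as np

logits = np.array([2.0, -1.0, 0.5])  # hypothetical pre-activation outputs

# Sigmoid treats each class independently: the values do not sum to 1,
# so argmax still finds the best class, but there is no distribution.
sigmoid = 1.0 / (1.0 + np.exp(-logits))
print(sigmoid, sigmoid.sum())        # sum is generally != 1

# Softmax normalizes over all classes, yielding a proper distribution.
softmax = np.exp(logits) / np.exp(logits).sum()
print(softmax, softmax.sum())        # sums to 1
```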
I've been using DSGU with softmax l2_activation for multi-label classification in an RL setting, with stellar results. GRUs comprise the hidden state; DSGU does the output. Network inputs range over (-inf, inf). Do you see any reason this shouldn't be a viable approach? Thanks, and great work... I was sad to see the arXiv paper come down.
Dear white54503, did you get a good result? As mentioned in the paper, DSGU uses a sigmoid function as an activation function. Combining it with categorical cross-entropy can make the classification right, but the output does not follow a probability distribution. This is not useful in many cases (probably more experiments are needed if somebody wants to continue investigating the multiplicative gate). That is why I withdrew my paper. Using DSGU + softmax may not be a good choice, as DSGU + softmax did not work in my experiments (DSGU + sigmoid worked). Actually, I suggest you look at this paper. By using batch normalization on LSTM, they managed to reach 99% on the MNIST dataset (with a softmax output layer), which is at least higher than my result. Keras recently has a PR about it (link). However, I never implemented batch normalization in my model.
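(A rough sketch of the idea, assuming modern Keras rather than the old version this repo uses. Note that this only normalizes the LSTM's outputs before the softmax layer; the paper referenced above applies batch normalization inside the recurrent transitions, which plain Keras layers do not do out of the box.)

```python
from keras.models import Sequential
from keras.layers import LSTM, BatchNormalization, Dense

# Hypothetical setup for sequential MNIST: 784 timesteps, 1 feature each.
model = Sequential([
    LSTM(128, input_shape=(784, 1)),
    BatchNormalization(),              # normalize the LSTM output
    Dense(10, activation='softmax'),   # proper distribution over 10 digits
])
model.compile(loss='categorical_crossentropy', optimizer='adam')
```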
Performance is quite good where other architectures have struggled mightily... I'm using DSGU in an asynchronous actor-critic setup similar to http://arxiv.org/pdf/1602.01783v1.pdf; the loss is the negative of the actor's advantage versus the critic baseline. Classification by the max of the network's output yields control vectors; there is no probabilistic interpretation of the class labels. I'll test sigmoid vs. softmax and report results in a week or two.
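(My own sketch of the loss described above, not white54503's code: a minimal NumPy version of the advantage actor-critic policy loss, where the advantage is the return minus the critic's baseline value. All names and numbers here are hypothetical.)

```python
import numpy as np

def policy_gradient_loss(action_probs, actions, returns, values):
    """Policy loss: negative log-probability of the taken action,
    weighted by the advantage (return minus critic baseline)."""
    advantage = returns - values                                  # A = R - V(s)
    log_probs = np.log(action_probs[np.arange(len(actions)), actions])
    return -(log_probs * advantage).mean()

# Toy usage with made-up numbers:
probs = np.array([[0.2, 0.8], [0.6, 0.4]])   # network's softmax outputs
acts = np.array([1, 0])                       # actions actually taken
rets = np.array([1.0, 0.0])                   # observed returns
vals = np.array([0.5, 0.3])                   # critic's baseline values
print(policy_gradient_loss(probs, acts, rets, vals))
```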
First of all, thanks for your reply. I use it for sentiment analysis, and I need three-class classification. My old code is based on Keras 1.0, so I will consider porting it to Keras 1.0.
@white54503 That is quite interesting. My original purpose in designing this network was for control problems as well, which do not require any probability distribution. I am really interested in how you used softmax + DSGU for your problem. I will read the paper and try to understand your system. Let us continue the discussion later. By the way, maybe we can discuss here.
Alright... I want to try ClockworkRNN.
I used the code that you implemented in the example, but it cannot run. I checked the code and there is no error; how do I fix it?