
A bug to solve #2

Open
i3esn0w opened this issue May 16, 2016 · 9 comments

Comments

@i3esn0w

i3esn0w commented May 16, 2016

Traceback (most recent call last):
File "clock_gated_rnn.py", line 63, in
model.compile(loss='binary_crossentropy', optimizer='adam', class_mode="binary")
File "/usr/local/lib/python2.7/dist-packages/keras/models.py", line 287, in compile
self.y_train = self.get_output(train=True)
File "/usr/local/lib/python2.7/dist-packages/keras/layers/containers.py", line 51, in get_output
return self.layers[-1].get_output(train)
File "/usr/local/lib/python2.7/dist-packages/keras/layers/core.py", line 223, in get_output
X = self.get_input(train)
File "/usr/local/lib/python2.7/dist-packages/keras/layers/core.py", line 31, in get_input
return self.previous.get_output(train=train)
File "/usr/local/lib/python2.7/dist-packages/keras/layers/core.py", line 341, in get_output
X = self.get_input(train)
File "/usr/local/lib/python2.7/dist-packages/keras/layers/core.py", line 31, in get_input
return self.previous.get_output(train=train)
File "build/bdist.linux-x86_64/egg/einstein/layers/recurrent.py", line 714, in get_output
File "/usr/local/lib/python2.7/dist-packages/theano/scan_module/scan.py", line 745, in scan
condition, outputs, updates = scan_utils.get_updates_and_outputs(fn(*args))
File "build/bdist.linux-x86_64/egg/einstein/layers/recurrent.py", line 694, in _step
AttributeError: 'module' object has no attribute 'ifelse'

I used the example code you implemented, but it cannot run. I checked the code and there is no error in it. How can I fix this?
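For what it's worth, this AttributeError is usually not a bug in the model code but a Python import pitfall: in Theano versions of that era, `ifelse` lives in the submodule `theano.ifelse`, which a bare `import theano` does not load, so the likely fix in recurrent.py is adding `from theano.ifelse import ifelse` near the top (an assumption based on the traceback, not on the actual file). A Theano-free sketch of the mechanism, using a simulated package:

```python
import sys
import types

# Simulate a package whose __init__ does not import one of its submodules,
# which is how `theano.ifelse` behaves after a bare `import theano`.
pkg = types.ModuleType("fakepkg")
sys.modules["fakepkg"] = pkg

try:
    pkg.ifelse  # like using theano.ifelse without importing theano.ifelse
    raised = False
except AttributeError:
    raised = True
print("AttributeError raised:", raised)  # True

# The fix: importing the submodule binds it as an attribute on the package.
sub = types.ModuleType("fakepkg.ifelse")
sys.modules["fakepkg.ifelse"] = sub
setattr(pkg, "ifelse", sub)  # `import fakepkg.ifelse` performs this binding
print("bound after import:", pkg.ifelse is sub)  # True
```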

@gaoyuankidult
Owner

Dear i3esn0w

Thanks for your input. However, this repo uses an older version of Keras and is outdated. Now it is only used for storing my previous experiment files.

I changed the description to:

This repo depends on an older version of keras and is outdated. Now it only serves as a place to store my previous experiments. One may face a lot of errors trying to run these codes.

ClockworkSGU is a clockwork version of my previous model SGU. I do not know what you would like to do with it.

If you would like to port Clockwork to some other library, you can look at this class (link). I think anyone with experience with older Keras should know how to port it.

If you would like to check the code of SGU and DSGU, please have a look at this link. Actually, both SGU and DSGU have some problems with multi-class classification (they only give the best class but do not provide a probability distribution over all classes).

Please do not hesitate to tell me what you would like to do with the code, so I can help you with it.

@white54503

white54503 commented May 16, 2016

I've been using DSGU w/ softmax l2_activation for multi-label classification in an RL-setting, with stellar results. GRUs comprise the hidden state, DSGU does the output. Network inputs range (-inf, inf). Do you see any reason this shouldn't be a viable approach? Thanks, and great work... I was sad to see the arxiv paper come down.

@gaoyuankidult
Owner

gaoyuankidult commented May 16, 2016

Dear white54503

Did you get a good result?

As mentioned in the paper, DSGU uses a sigmoid function as the activation function. Combining it with categorical cross entropy can make the classification correct, but the output does not follow a probability distribution. This is not useful in many cases (probably more experiments are needed if somebody wants to continue investigating the multiplicative gate). That is why I withdrew my paper.
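To make the distribution point concrete, here is a small NumPy sketch: sigmoid scores each class independently, so the outputs need not sum to 1, while softmax normalizes across classes; both can still agree on the argmax, which is why classification accuracy can be fine even without a proper distribution.

```python
import numpy as np

# Logits for a 3-class problem (e.g. tri-class sentiment).
z = np.array([2.0, 1.0, 0.1])

# Sigmoid scores each class independently; outputs need not sum to 1,
# so they do not form a probability distribution over the classes.
sig = 1.0 / (1.0 + np.exp(-z))

# Softmax normalizes across the classes, giving a proper distribution.
soft = np.exp(z) / np.exp(z).sum()

print(sig.sum())   # > 1 here: not a distribution
print(soft.sum())  # 1.0 (up to floating point)
print(sig.argmax() == soft.argmax())  # True: the predicted class agrees
```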

Using DSGU + softmax may not be a good choice, as DSGU + softmax did not work in my experiments (DSGU + sigmoid worked).

Actually, I suggest you look at this paper. By using batch normalization on LSTM, they managed to reach 99% on the MNIST dataset (with a softmax output layer), which is at least higher than my result. Keras recently had a PR about it (link). However, I never implemented batch normalization in my model.
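As context, batch normalization itself is just a per-feature standardization over the batch followed by a learned scale and shift; where exactly it is inserted into the LSTM recurrence is what the paper and the Keras PR work out. A minimal NumPy sketch (function name and defaults are illustrative):

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    # Normalize each feature over the batch axis, then scale and shift.
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

# Each column (feature) ends up with roughly zero mean and unit variance.
batch = np.array([[1.0, 10.0], [3.0, 30.0], [5.0, 50.0]])
out = batch_norm(batch)
print(out.mean(axis=0))  # ~0 per feature
print(out.std(axis=0))   # ~1 per feature
```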

@white54503

white54503 commented May 16, 2016

Performance is quite good where other architectures have struggled mightily... I'm using DSGU in an asynchronous actor-critic setup similar to http://arxiv.org/pdf/1602.01783v1.pdf; loss is the negative of actor's advantage vs critic baseline. Classification by the max of the network's output yields control vectors; there is no probabilistic interpretation of the class labels. I'll test sigmoid vs. softmax and report results in a week or two.

@i3esn0w
Author

i3esn0w commented May 17, 2016

First of all, thanks for your reply. I am using it for sentiment analysis, and I need three-class classification. My old code is based on Keras 1.0, so I will consider porting it to Keras 1.0.

@gaoyuankidult
Owner

@white54503 That is quite interesting. My original purpose in designing this network was for control problems as well, which do not require a probability distribution. I am really interested in how you used softmax+DSGU for your problem. I will read the paper and try to understand your system. Let us continue the discussion later. By the way, maybe we can discuss it here.

@gaoyuankidult
Owner

gaoyuankidult commented May 17, 2016

@i3esn0w Which one would you like to port, DSGU or ClockworkRNN? If you would like to port DSGU, there is a ported version here. If you would like to port ClockworkRNN, you should look at this class.

If you understand Chinese, we can also talk in Chinese.

@i3esn0w
Author

i3esn0w commented May 17, 2016

OK... I would like to try ClockworkRNN.

@gaoyuankidult
Owner

gaoyuankidult commented May 17, 2016

@i3esn0w I have been quite busy recently and may not have time to finish the port to Keras 1.0, but my step function is already written in recurrent.py.

I suggest you write a ClockworkRNN that supports Keras 1.0, so that others can use it as well.

However, Clockwork RNN is very slow, so I do not really recommend using it for sentiment analysis.
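For anyone attempting the port, here is a minimal NumPy sketch of the Clockwork-RNN step (Koutnik et al., 2014), assuming evenly sized blocks and, for simplicity, a full recurrent matrix (the real model restricts connections so faster blocks read from slower ones). All names are illustrative, not the ones in recurrent.py:

```python
import numpy as np

def clockwork_step(t, x_t, h_prev, W_in, W_rec, periods):
    # Block i updates only when t % periods[i] == 0; otherwise it
    # carries its previous activation forward unchanged.
    block = h_prev.size // len(periods)
    h_cand = np.tanh(x_t @ W_in + h_prev @ W_rec)
    active = np.repeat([t % p == 0 for p in periods], block)
    return np.where(active, h_cand, h_prev)

# Tiny usage example: 3 inputs, 4 hidden units in two blocks (periods 1 and 2).
rng = np.random.default_rng(0)
W_in = rng.normal(size=(3, 4))
W_rec = rng.normal(size=(4, 4))
h = np.zeros(4)
for t in range(4):
    h = clockwork_step(t, rng.normal(size=3), h, W_in, W_rec, periods=[1, 2])
```

The per-timestep masking inside a symbolic scan is part of why Clockwork RNNs run slowly in practice, as noted above.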
