Deep Gate Recurrent Neural Network #2387
Conversation
Just curious, do you have new figures for figure 9/10/11 in your paper with dropout applied?
@xingdi-eric-yuan not yet. But if you are interested, I will try to run several experiments with dropout applied this week. In my experience, the dropout rate cannot be too high when using DSGU.
@gaoyuankidult Yes, please run experiments. What happens when the dropout rate is high (how high)? Sometimes it may be related to your init method 👻
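As background on why a high dropout rate is harmful, here is a minimal NumPy sketch of inverted dropout (the standard formulation; this is an illustration, not code from the PR). At a rate of 0.8, only about 20% of units survive each step, so a recurrent state that is masked this aggressively at every timestep loses most of its signal.

```python
import numpy as np

rng = np.random.default_rng(0)

def inverted_dropout(x, rate, rng):
    # Zero out each activation with probability `rate`, and rescale
    # the survivors by 1/keep so the expected value is unchanged.
    keep = 1.0 - rate
    mask = rng.random(x.shape) < keep
    return x * mask / keep

x = np.ones(10000)
low = inverted_dropout(x, 0.3, rng)   # ~70% of units survive
high = inverted_dropout(x, 0.8, rng)  # only ~20% survive
print((low != 0).mean(), (high != 0).mean())
```

This is why moderate rates (around 0.2-0.3) are the usual starting point for recurrent layers.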
Have you considered adding Batch Normalization to your layers as well? Given its success with LSTMs, it may be beneficial here too.
That's a good point!
@xingdi-eric-yuan I think it should not be more than 0.3. The dropout function was added to make sure DSGU is consistent with the other RNN classes in Keras.
@the-moliver Thanks for your input. I also think it could be very beneficial to this model.
@xingdi-eric-yuan These are the dropout results for the IMDB example. I don't see clear differences; maybe that is because the IMDB dataset is small.
This model uses the sigmoid activation function for both binary (link) and multi-class classification problems (link). While that is fine in the binary case, in the multi-class case it cannot provide a proper probability distribution over classes (it only identifies the best class). As a consequence, the applicability of this model is limited. After discussing with several people, I decided to close this pull request. If you are still interested in this model, some discussions are also here.
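To make the multi-class limitation concrete, here is a small NumPy illustration (not from the PR): independent sigmoids score each class in (0, 1), but the scores do not sum to 1, whereas softmax yields a proper probability distribution over the same logits.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # Subtract the max for numerical stability before exponentiating.
    e = np.exp(z - np.max(z))
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])  # hypothetical 3-class scores

sig = sigmoid(logits)   # per-class scores, each in (0, 1)
soft = softmax(logits)  # normalized probabilities

print(sig.sum())   # > 1: not a probability distribution
print(soft.sum())  # exactly 1 (up to floating point)
```

Both pick the same argmax here, but only the softmax output can be read as class probabilities, which matters for calibrated multi-class losses.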
Thanks for letting us know, and best of luck with future iterations of this research. In general we won't merge into Keras algorithms that aren't widely accepted or haven't been covered in a peer-reviewed paper. At the same time, we try to stay on top of things and incorporate the latest advances, as soon as we are confident in their viability.
Thanks for the information. It is good to know the principles for merging into the Keras library. I will be more careful when I make a pull request next time.
I designed a new structure called the Deep Simple Gated Unit.
The structure has shown some advantages compared with LSTM and GRU. (Details can be found in this paper: http://arxiv.org/abs/1604.02910)
Originally the experiments were done using an early version of Keras. (https://github.com/gaoyuankidult/einstein/blob/master/einstein/layers/recurrent.py#L345)
I have done several initial experiments with this model, but the paper is still under development. You are welcome to test the model. If it proves useful, then maybe we can add it to the Keras library.