
Add other neural net types #16

Closed
robertleeplummerjr opened this issue Jun 22, 2016 · 18 comments

Comments

@robertleeplummerjr
Contributor

It'd be really cool to have something like this: http://karpathy.github.io/2015/05/21/rnn-effectiveness/ in brain.js

@robertleeplummerjr
Contributor Author

robertleeplummerjr commented Jul 6, 2016

So I did several weeks of non-in-depth searching for a js implementation of a recurrent neural net that was general enough for reuse and that could output a single function, like some of the more popular js libraries do, and I couldn't find one. The above blog post gave me some serious insight, and I found (lo and behold) that @karpathy wrote a very nice javascript implementation that was still in beta, but very well proved the points of what the neural net could do.

After a couple of days of experimenting, I ended up creating the branch above, and as of later today I feel like I really got the ball rolling, here: https://github.com/harthur-org/brain.js/compare/recurrent?expand=1#diff-bc861d9ec4bdd00fbfec14233411cce4R1

I was able to clean up the way the network is instantiated, so it looks like the following:

new RNN({
  needsBackprop: true,
  inputSize: -1,
  hiddenSizes: -1,
  outputSize: -1,
  learningRate: -1,
  decayRate: 0.999,
  smoothEps: 1e-8,
  ratioClipped: null
});

Note: it doesn't yet work, but it will!
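For context, the option handling can be sketched in a few lines of plain js. The `RNN` below is a hypothetical stand-in (not the constructor in the branch); the defaults mirror the snippet above, with `-1` read as "must be supplied by the caller":

```javascript
// Hypothetical stand-in showing how the options above might merge with
// defaults; the real constructor lives in the branch linked above.
function RNN(options) {
  var defaults = {
    needsBackprop: true,
    inputSize: -1,      // -1 meaning: the caller must supply this
    hiddenSizes: -1,
    outputSize: -1,
    learningRate: -1,
    decayRate: 0.999,
    smoothEps: 1e-8,
    ratioClipped: null
  };
  options = options || {};
  for (var key in defaults) {
    this[key] = options.hasOwnProperty(key) ? options[key] : defaults[key];
  }
}

// Caller-supplied options override the defaults; everything else sticks.
var net = new RNN({ inputSize: 27, outputSize: 27 });
```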

Also, I merged the solver and the graph into RNN, so rather than creating new ones we can potentially reuse some of the instantiated objects/methods and there isn't as much memory leakage. I haven't really measured that yet (I know, I know, premature optimization...); I just didn't feel like creating a ton of different objects when I could do it in one place. I also saw a lot of usage of new Foo() where the function seemed like it should be a part of the neural network, and/or returned something, thus defeating the whole point of new.

The end goal, I feel, is a reusable recurrent neural network in js that matches very closely the initial brain concept @harthur started with: one that runs very efficiently, trains easily, and, when done, outputs a single function wrapping the entire functionality of the trained network.

@robertleeplummerjr
Contributor Author

Listing some notes as I find them. I did find some very nice implementations in:

You'll notice a theme here... no fully working reusable library in js yet

@robertleeplummerjr
Contributor Author

As for some of the network properties, I've been using http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43905.pdf which seems to be very straightforward on the mathematical properties.

@robertleeplummerjr
Contributor Author

Also, outstanding illustrations: http://colah.github.io/posts/2015-08-Understanding-LSTMs/

@robertleeplummerjr
Contributor Author

I did finally see this: https://github.com/cazala/synaptic/blob/master/src/architect.js#L52
Not specifically an rnn, but rather a type of rnn: a long short-term memory (lstm) neural network.

@robertleeplummerjr
Contributor Author

While I really like the above approach on abstracting into an architect, I don't think it exactly merges with the existing means used in brain.js, but I could be wrong.

@cburgdorf

cburgdorf commented Jul 12, 2016

Sorry if my question sounds totally off the planet, but I'm just starting with AI and would like to know if this issue would solve my use case.

I'd like to feed lots of text data to brain.js and then have it generate other text. The way I understand the capabilities of brain.js, that doesn't seem to be possible today, right?

@robertleeplummerjr
Contributor Author

There are a number of network types that would work for you. The current network type in brain.js, a feed forward neural net, wouldn't be ideal. I believe the exact network type you'd want is either an rnn or an lstm (both of which I'm working on), or a gru, but there are hordes of different network types that may be what you're looking for.

This looks like it might fulfil your exact use case: https://github.com/garywang/char-rnn.js
Ironically, it is a port of similar code I'm working on.

@cburgdorf

cburgdorf commented Jul 12, 2016

Thank you for your quick response. That's really interesting. Unfortunately there doesn't seem to be much documentation for the linked project. Do you have any pointers for the work you are currently doing that you mentioned? Will this be something based on brain.js?

Also, when you say the feed forward neural net wouldn't be ideal, can you elaborate on how it could work at all? The way I understand it, after I've trained the network, I can only get a score calculated from some other input text that I feed to the network. But I couldn't make it actually generate text, right?

@robertleeplummerjr
Contributor Author

robertleeplummerjr commented Jul 12, 2016

Thank you for your quick response. That's really interesting. Unfortunately there doesn't seem to be much documentation for the linked project.

I think that is what we're aiming at solving.

Do you have any pointers for the work you are currently doing that you mentioned?

Read this entire article, and look at the mentioned libraries/projects: http://karpathy.github.io/2015/05/21/rnn-effectiveness/
By the end, you may be the expert!

Will this be something based on brain.js?

Short answer: yes, that is the goal.
Long answer: I've started porting @karpathy's work to the style of brain.js, which should give a very simple, very minimal format so that these types of neural nets aren't a mystery but are easily understood (I have been writing code for more than half my life, and still feel the best kind of code is something I can easily explain to a four year old 😄, but that may be because I'm just not that smart 😛). It isn't even running yet, though, at least not without a runtime exception.

Also when you say the feed forward neural net wouldn't be ideal can you elaborate how it could work at all?

A recurrent neural network (or more specifically a long short-term memory network, generally called an lstm) keeps some of what it learns over the duration that it runs; it can even imagine, or generate, things similar to what it has learned. It can learn to spell, to write sentences, etc., because it can recall patterns it previously handled. A feed forward network has no memory; it is weighted math, and is more for classification or pattern recognition.
You could probably devise a means of using a feed forward network, but it would largely be a solution outside of what the network itself is able to produce.
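The difference can be sketched in a few lines of plain js; the weights here are made up purely for illustration:

```javascript
// A feedforward step is stateless: same input, same output, every time.
function feedForwardStep(weight, input) {
  return Math.tanh(weight * input);
}

// A recurrent step carries a hidden state forward, so earlier inputs
// influence later outputs -- a crude stand-in for "memory".
function makeRecurrentStep(inputWeight, hiddenWeight) {
  var hidden = 0;
  return function (input) {
    hidden = Math.tanh(inputWeight * input + hiddenWeight * hidden);
    return hidden;
  };
}

var step = makeRecurrentStep(0.5, 0.9);
var first = step(1);  // depends only on the current input
var second = step(1); // same input, different output: the state moved
```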

@cburgdorf

Thanks for this great answer! This made a lot of things much more clear to me 👍

@robertleeplummerjr
Contributor Author

Huge find here: https://github.com/Atyanar/neuraljs
Implementing work done by @karpathy, and apparently working in js, with the addition of grus.

@robertleeplummerjr
Contributor Author

Breakthrough! Just got the rnn seemingly online. There is still a lot of work to match the strategy outlined in brain.js, but here is what instantiation of a vanilla rnn looks like:

var rnn = new RNN({
  inputSize: vocabData.inputSize,
  outputSize: vocabData.outputSize
});
rnn.input(phraseToIndexes(randomPhrase()));
var prediction = rnn.predict();
console.log(indexesToPhrase(prediction));

as outlined here: cdfd348#diff-1c9e38cd442769faa688f4c4536b5073R90

@robertleeplummerjr
Contributor Author

Both lstm and rnn are now up and running, and should have gru running in a few hours.

@robertleeplummerjr
Contributor Author

All of them are up and running, and to/from json is tested.
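The to/from json round trip can be pictured roughly like this; the shape of the serialized object below is invented for illustration and won't match the branch's actual format:

```javascript
// Hypothetical sketch of weight serialization; the real toJSON/fromJSON
// in the branch will have their own format.
function toJSON(net) {
  return {
    inputSize: net.inputSize,
    outputSize: net.outputSize,
    weights: net.weights.slice()
  };
}

function fromJSON(json) {
  return {
    inputSize: json.inputSize,
    outputSize: json.outputSize,
    weights: json.weights.slice()
  };
}

var original = { inputSize: 27, outputSize: 27, weights: [0.1, -0.4, 0.7] };
// Stringify/parse to prove the object survives a real serialization pass.
var restored = fromJSON(JSON.parse(JSON.stringify(toJSON(original))));
```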

@riatzukiza

I've also been reading around @karpathy's work, specifically his ConvNetJS.
https://github.com/karpathy/convnetjs

This particular one states that it may not even work as-is; I am still reading around and playing with the demos they've got on their site.

I am not too new to the idea of neural networks; I've been studying them and the theories behind them for the last year or so, but I've never attempted to implement one, or even work with one.

I've got a bit of experience doing other probabilistic programming though, markov chains and so on.

I am wondering how feasible it would be to take the feedforward model used in the core of brain and build a convolutional network off of it.

Also, ConvNetJS had some other interesting ideas. For example, it was very easy to create different layer types, so one could create a convolutional layer, followed by a simpler feed forward layer, followed by some pooling, some regression neurons, and so on.
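That layer-composition idea can be sketched as a pipeline of functions; the layer names and shapes below are illustrative only, not ConvNetJS's actual API:

```javascript
// Illustrative only: each "layer" is a function from array to array,
// so heterogeneous layer types compose by simple chaining.
function fullyConnected(weights) {
  return function (input) {
    return weights.map(function (row) {
      return Math.tanh(row.reduce(function (sum, w, i) {
        return sum + w * input[i];
      }, 0));
    });
  };
}

function maxPool(size) {
  return function (input) {
    var out = [];
    for (var i = 0; i < input.length; i += size) {
      out.push(Math.max.apply(null, input.slice(i, i + size)));
    }
    return out;
  };
}

// A "network" is just the layers run left to right.
function network(layers) {
  return function (input) {
    return layers.reduce(function (signal, layer) {
      return layer(signal);
    }, input);
  };
}

var run = network([
  fullyConnected([[0.2, -0.1, 0.4, 0.3], [0.5, 0.5, -0.5, 0.1]]),
  maxPool(2)
]);
var output = run([1, 0, 1, 0]); // a single number after pooling
```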

Brain doesn't appear to have this ability so much; the layers are kind of locked into a single activation function (I think I read it was tanh?).

I will share my experiments, if I can get a convolutional network out of the brain framework.

@robertleeplummerjr
Contributor Author

I've also been reading around @karpathy's work, specifically his ConvNetJS.
https://github.com/karpathy/convnetjs

Yea, his mind is fantastic at this stuff; I wish I could follow his code better. For us mere mortals, I think brain.js's goal is ultimately simplicity, usefulness, and speed. For example, the toFunction method: when I first saw it, I was struck by how simple the implementation is: https://github.com/harthur-org/brain.js/blob/master/src/neural-network.js#L497
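The idea behind toFunction is to bake the trained weights into the source of a standalone function, so the result runs with no dependency on the library. A toy version, handling just a single sigmoid neuron (this sketch is mine, not the linked implementation):

```javascript
// Toy version of the toFunction idea: serialize weights into generated
// source so the returned function is fully standalone. One neuron only.
function toFunction(weights, bias) {
  var body =
    'var sum = ' + bias + ';\n' +
    weights.map(function (w, i) {
      return 'sum += ' + w + ' * input[' + i + '];';
    }).join('\n') +
    '\nreturn 1 / (1 + Math.exp(-sum));'; // sigmoid activation
  return new Function('input', body);
}

var run = toFunction([0.3, -0.2], 0.1);
var score = run([1, 1]); // sigmoid(0.1 + 0.3 - 0.2)
```

The generated function carries its weights as literals, so it can be stringified and shipped anywhere js runs.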

This particular one states that it may not even work as-is; I am still reading around and playing with the demos they've got on their site.

Welcome to the club 😄

I am not too new to the idea of neural networks; I've been studying them and the theories behind them for the last year or so, but I've never attempted to implement one, or even work with one.

I've got a bit of experience doing other probabilistic programming though, markov chains and so on.

I am wondering how feasible it would be to take the feedforward model used in the core of brain and build a convolutional network off of it.

Absolutely, and I would welcome the idea! The guideline would be that it needs to match very closely the existing api for the standard feedforward network.

Also, ConvNetJS had some other interesting ideas. For example, it was very easy to create different layer types, so one could create a convolutional layer, followed by a simpler feed forward layer, followed by some pooling, some regression neurons, and so on.

I don't mind this so much, but the further we bury the neural network in abstraction, the more it feels like it takes away from the original model: simplicity, usefulness, and speed. I would much rather see 100 lines of a pure convolutional network than see it abstracted into 50, for the simple reason that it may be less composable and less understandable. "Burying in abstraction" is popular; fine-tuning with simplicity, not so much. In the end, though, I'm sure just putting simplicity first would land us on something useful.

Brain doesn't appear to have this ability so much; the layers are kind of locked into a single activation function (I think I read it was tanh?).

tanh and sigmoid for recurrent neural nets, sigmoid for feed forward.
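For reference, those two activations written out (these are the standard definitions, not brain.js internals):

```javascript
// Standard activation definitions as mentioned above.
function sigmoid(x) {
  return 1 / (1 + Math.exp(-x)); // squashes any input into (0, 1)
}

function tanh(x) {
  return Math.tanh(x); // squashes any input into (-1, 1)
}
```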

I will share my experiments, if I can get a convolutional network out of the brain framework.

I look forward to working with you!

@robertleeplummerjr
Contributor Author

Going to go ahead and close this issue and continue on new issues:
ANN: #38
RNN, LSTM, GRU: #24
Convolutional: #39
Please post any other types that would be beneficial, but for the time being, closing.
