-
I assume you have 34 different characters including the space, and that's why you have 34 input/output neurons? The inner layer structure really depends on what you are trying to accomplish. If your goal is to predict the next character of a Russian text, this model should do the job (maybe tweak some values and experiment with different model configurations). I see you're talking about texts of 300-3000 characters, and I assume you feed the model letter by letter. That is really a job for an LSTM rather than a plain feed-forward network: an LSTM is designed to recognize patterns in sequences, and text is exactly that.
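To make the "tweak some values and experiment" part concrete, here is a minimal sketch of one possible starting configuration in Dann. The hidden sizes, activation, loss name, and learning rate are assumptions to experiment with, not values from the thread:

```js
// Hypothetical starting point for a 34-in / 34-out next-character net.
// Assumes the Dann build is already loaded (browser script tag or require).
const nn = new Dann(34, 34);         // one neuron per character, in and out
nn.addHiddenLayer(64, 'leakyReLU');  // hidden sizes are free choices;
nn.addHiddenLayer(64, 'leakyReLU');  // they do NOT have to be powers of 2
nn.outputActivation('sigmoid');      // each output ~ a score for one character
nn.setLossFunction('mce');           // cross-entropy-style loss, if your build has it
nn.lr = 0.01;                        // learning rate: tune experimentally
nn.makeWeights();                    // initialize weights once, before training
nn.log();                            // print the resulting structure
```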
-
I've started my experiments with Dann in the browser. I'm building on the "Neural Network learning to count with 4-bit binary digits" interactive demo, merging it with a Vue UI. Everything goes fine until I start changing the structure of the dataset and of the result.
Here's the example I'm trying to implement: I have a text in Russian. It has 33 unique letters plus the ' ' space, 34 characters in total. I want to use it to teach a net to predict the next symbol from a given one, and then use the net to generate random words. )
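For the generation step, a rough sketch of what the loop could look like once a net is trained. `nn` (a trained 34-in/34-out Dann net) and `alphabet` (the 33 letters plus the space, in a fixed order) are assumptions here:

```js
// Hypothetical greedy generator: repeatedly predict the next character
// until the net emits a space (word boundary) or a length cap is hit.
function generateWord(nn, alphabet, startChar, maxLen = 12) {
  let word = startChar;
  for (let i = 0; i < maxLen; i++) {
    const last = word[word.length - 1];
    // one-hot encode the last character: 34 zeros with a single 1
    const input = Array.from(alphabet, (c) => (c === last ? 1 : 0));
    const out = nn.feedForward(input);                    // 34 scores
    const next = alphabet[out.indexOf(Math.max(...out))]; // best score wins
    if (next === ' ') break;                              // space ends the word
    word += next;
  }
  return word;
}
```

Note that greedy argmax always yields the same word for a given start; to get genuinely random words you would sample an index from the output scores instead of always taking the maximum.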
So I need to feed data like `[{input: 'a', target: 'c'}]`, but in the form of a normalized array. That's fine, I've managed to do that. So it's a model with 34 values coming in and 34 coming out, which means the net should be `new Dann(34, 34)`, yes? What should the inner layer structure be? Should the layer node counts always be a power of 2? Which activation and loss functions are best suited for this? How large does the text have to be to make sufficient training data? And for how many epochs? For now I've created the net like that.
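For reference, a minimal sketch of one way such pairs can be one-hot encoded into normalized arrays and fed to `backpropagate`; the `alphabet` order, the `encode` helper, and the example pair are made up for illustration:

```js
// 33 Russian letters plus the space = 34 characters total (assumed order).
const alphabet = 'абвгдеёжзийклмнопрстуфхцчшщъыьэюя ';

// One-hot encode: a 34-long array that is all 0s except a 1 at the
// character's position. Already "normalized" to the 0..1 range.
const encode = (ch) => Array.from(alphabet, (c) => (c === ch ? 1 : 0));

const nn = new Dann(34, 34);
nn.addHiddenLayer(64, 'leakyReLU');
nn.makeWeights();

// data holds {input, target} pairs built from consecutive characters of the text
const data = [{ input: 'д', target: 'а' }]; // example pair
for (let epoch = 0; epoch < 10; epoch++) {
  for (const { input, target } of data) {
    nn.backpropagate(encode(input), encode(target));
  }
}
```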
I've trained it with texts of 300-3000 letters and for only 10 epochs. Strangely, the avgLoss stops dropping after just a couple of epochs on the same input text. And the longer the training, the longer the browser tab freezes, so for more complex nets it seems I'll need to move the training into a separate Web Worker thread.
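A Web Worker split could look roughly like the sketch below. The file names, the message shape, and the `encodedPairs` variable are hypothetical, and reading `nn.loss` after `backpropagate` is an assumption about the Dann build:

```js
// main.js — send encoded training pairs to the worker; the UI stays responsive
const worker = new Worker('train-worker.js');
worker.postMessage({ pairs: encodedPairs, epochs: 10 }); // plain arrays clone fine
worker.onmessage = (e) => console.log('loss per epoch:', e.data);

// train-worker.js — runs on its own thread
importScripts('dann.min.js'); // load the Dann build inside the worker
onmessage = (e) => {
  const { pairs, epochs } = e.data;
  const nn = new Dann(34, 34);
  nn.addHiddenLayer(64, 'leakyReLU');
  nn.makeWeights();
  const losses = [];
  for (let i = 0; i < epochs; i++) {
    for (const p of pairs) nn.backpropagate(p.input, p.target);
    losses.push(nn.loss); // assumption: Dann exposes the last loss here
  }
  postMessage(losses);
};
```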
Sorry if these questions are too simple. I'm just very curious about all the neural net stuff but don't have enough math intuition for it yet. A dozen YouTube tutorials gave me some basic understanding, but it's not enough, so I hope practicing with Dann will help me explore it all better. )