Exactly the same issue here. The author of this library is not maintaining this repo, though. I think it doesn't work anymore...
Things don't just stop working unless they are outdated, and this JavaScript still works fine. I got an example working, but I had to link the brain js library. It's not included here, but I found it in the source code of the example.
The basic example provided here does not seem to work: the output is always 0.
https://cs.stanford.edu/people/karpathy/convnetjs/docs.html
Proof:
I tried changing this line:
var reward = action === 0 ? 1.0 : 0.0;
into:
var reward = action === 1 ? 1.0 : 0.0;
and got the same result, which is 0.
Code Example:
/START CODE/
var brain = new deepqlearn.Brain(3, 2); // 3 inputs, 2 possible outputs (0,1)
var state = [Math.random(), Math.random(), Math.random()];
for (var k = 0; k < 10000; k++) {
    var action = brain.forward(state); // returns index of chosen action
    var reward = action === 0 ? 1.0 : 0.0;
    brain.backward([reward]); // <-- learning magic happens here
    state[Math.floor(Math.random() * 3)] += Math.random() * 2 - 0.5;
}
brain.epsilon_test_time = 0.0; // don't make any more random choices
brain.learning = false;
// get an optimal action from the learned policy
var action = brain.forward(state);
/END CODE/
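The final brain.forward(state) call above returns one action for one state, so a tally over many random states may give a clearer picture of whether the learned policy really always picks 0. Below is a minimal sketch of such a check. It assumes only that brain.forward(state) returns an action index, as in the example; tallyActions and stubBrain are hypothetical names, and the stub stands in for a trained deepqlearn.Brain so that the snippet runs without the library.

```javascript
// Hypothetical helper (not part of the library): count how often each
// action index is returned by brain.forward over many random states.
function tallyActions(brain, numActions, trials) {
  var counts = [];
  for (var i = 0; i < numActions; i++) {
    counts.push(0);
  }
  for (var t = 0; t < trials; t++) {
    // Same state shape as in the example: 3 random inputs.
    var state = [Math.random(), Math.random(), Math.random()];
    counts[brain.forward(state)] += 1;
  }
  return counts;
}

// Stand-in for a trained brain, used only so this sketch is runnable on
// its own. It ignores the state and always picks action 1.
var stubBrain = {
  forward: function (state) {
    return 1;
  }
};

var counts = tallyActions(stubBrain, 2, 1000);
// For the stub: action 0 is picked 0 times, action 1 is picked 1000 times.
console.log(counts);
```

With the real library, the same helper could be called as tallyActions(brain, 2, 1000) after setting brain.learning = false, once for each version of the reward line, to compare the two tallies.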