You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I'm trying to reuse this codebase for another game in which pieces can move freely on the board. At a certain point, during the first game, the method moveToLeaf() in the class MCTS starts looping for ever.
It seems like the condition while not currentNode.isLeaf(): is never satisfied.
Do you have any hint for finding why this issue occurs?
2018-04-07 13:57:44,052 INFO PLAYER TURN...-1
2018-04-07 13:57:44,052 INFO action: 192 (3)... N = 0, P = 0.235297, nu = 0.000000, adjP = 0.235297, W = 0.000000, Q = 0.000000, U = 0.000000, Q+U = 0.000000
2018-04-07 13:57:44,053 INFO action: 211 (1)... N = 0, P = 0.257799, nu = 0.000000, adjP = 0.257799, W = 0.000000, Q = 0.000000, U = 0.000000, Q+U = 0.000000
2018-04-07 13:57:44,053 INFO action: 122 (3)... N = 0, P = 0.252349, nu = 0.000000, adjP = 0.252349, W = 0.000000, Q = 0.000000, U = 0.000000, Q+U = 0.000000
2018-04-07 13:57:44,053 INFO action: 123 (4)... N = 0, P = 0.254554, nu = 0.000000, adjP = 0.254554, W = 0.000000, Q = 0.000000, U = 0.000000, Q+U = 0.000000
2018-04-07 13:57:44,053 INFO action with highest Q + U...192
2018-04-07 13:57:44,053 INFO PLAYER TURN...-1
2018-04-07 13:57:44,053 INFO action: 190 (1)... N = 0, P = 0.209210, nu = 0.000000, adjP = 0.209210, W = 0.000000, Q = 0.000000, U = 0.000000, Q+U = 0.000000
2018-04-07 13:57:44,053 INFO action: 102 (4)... N = 0, P = 0.196694, nu = 0.000000, adjP = 0.196694, W = 0.000000, Q = 0.000000, U = 0.000000, Q+U = 0.000000
2018-04-07 13:57:44,054 INFO action: 118 (6)... N = 0, P = 0.182340, nu = 0.000000, adjP = 0.182340, W = 0.000000, Q = 0.000000, U = 0.000000, Q+U = 0.000000
2018-04-07 13:57:44,054 INFO action: 122 (3)... N = 0, P = 0.205482, nu = 0.000000, adjP = 0.205482, W = 0.000000, Q = 0.000000, U = 0.000000, Q+U = 0.000000
2018-04-07 13:57:44,054 INFO action: 123 (4)... N = 0, P = 0.206274, nu = 0.000000, adjP = 0.206274, W = 0.000000, Q = 0.000000, U = 0.000000, Q+U = 0.000000
2018-04-07 13:57:44,054 INFO action with highest Q + U...190
2018-04-07 13:57:44,054 INFO PLAYER TURN...-1
2018-04-07 13:57:44,054 INFO action: 192 (3)... N = 0, P = 0.235297, nu = 0.000000, adjP = 0.235297, W = 0.000000, Q = 0.000000, U = 0.000000, Q+U = 0.000000
2018-04-07 13:57:44,054 INFO action: 211 (1)... N = 0, P = 0.257799, nu = 0.000000, adjP = 0.257799, W = 0.000000, Q = 0.000000, U = 0.000000, Q+U = 0.000000
2018-04-07 13:57:44,054 INFO action: 122 (3)... N = 0, P = 0.252349, nu = 0.000000, adjP = 0.252349, W = 0.000000, Q = 0.000000, U = 0.000000, Q+U = 0.000000
2018-04-07 13:57:44,055 INFO action: 123 (4)... N = 0, P = 0.254554, nu = 0.000000, adjP = 0.254554, W = 0.000000, Q = 0.000000, U = 0.000000, Q+U = 0.000000
2018-04-07 13:57:44,055 INFO action with highest Q + U...192
2018-04-07 13:57:44,055 INFO PLAYER TURN...-1
2018-04-07 13:57:44,055 INFO action: 190 (1)... N = 0, P = 0.209210, nu = 0.000000, adjP = 0.209210, W = 0.000000, Q = 0.000000, U = 0.000000, Q+U = 0.000000
2018-04-07 13:57:44,055 INFO action: 102 (4)... N = 0, P = 0.196694, nu = 0.000000, adjP = 0.196694, W = 0.000000, Q = 0.000000, U = 0.000000, Q+U = 0.000000
2018-04-07 13:57:44,055 INFO action: 118 (6)... N = 0, P = 0.182340, nu = 0.000000, adjP = 0.182340, W = 0.000000, Q = 0.000000, U = 0.000000, Q+U = 0.000000
2018-04-07 13:57:44,055 INFO action: 122 (3)... N = 0, P = 0.205482, nu = 0.000000, adjP = 0.205482, W = 0.000000, Q = 0.000000, U = 0.000000, Q+U = 0.000000
2018-04-07 13:57:44,056 INFO action: 123 (4)... N = 0, P = 0.206274, nu = 0.000000, adjP = 0.206274, W = 0.000000, Q = 0.000000, U = 0.000000, Q+U = 0.000000
2018-04-07 13:57:44,056 INFO action with highest Q + U...190
2018-04-07 13:57:44,056 INFO PLAYER TURN...-1
2018-04-07 13:57:44,056 INFO action: 192 (3)... N = 0, P = 0.235297, nu = 0.000000, adjP = 0.235297, W = 0.000000, Q = 0.000000, U = 0.000000, Q+U = 0.000000
2018-04-07 13:57:44,056 INFO action: 211 (1)... N = 0, P = 0.257799, nu = 0.000000, adjP = 0.257799, W = 0.000000, Q = 0.000000, U = 0.000000, Q+U = 0.000000
2018-04-07 13:57:44,056 INFO action: 122 (3)... N = 0, P = 0.252349, nu = 0.000000, adjP = 0.252349, W = 0.000000, Q = 0.000000, U = 0.000000, Q+U = 0.000000
2018-04-07 13:57:44,056 INFO action: 123 (4)... N = 0, P = 0.254554, nu = 0.000000, adjP = 0.254554, W = 0.000000, Q = 0.000000, U = 0.000000, Q+U = 0.000000
2018-04-07 13:57:44,056 INFO action with highest Q + U...192
Those two actions keep looping.
The text was updated successfully, but these errors were encountered:
Hello everybody,
I'm trying to reuse this codebase for another game in which pieces can move freely on the board. At a certain point, during the first game, the method
moveToLeaf()
in the classMCTS
starts looping for ever.It seems like the condition
while not currentNode.isLeaf():
is never satisfied.Do you have any hint for finding why this issue occurs?
Thank in advance,
Fabrizio
P.S.: see my fork for full code -> https://github.com/fmicheloni/DeepReinforcementLearning
EDIT:
Here a few logs of what's happening:
2018-04-07 13:57:44,052 INFO PLAYER TURN...-1
2018-04-07 13:57:44,052 INFO action: 192 (3)... N = 0, P = 0.235297, nu = 0.000000, adjP = 0.235297, W = 0.000000, Q = 0.000000, U = 0.000000, Q+U = 0.000000
2018-04-07 13:57:44,053 INFO action: 211 (1)... N = 0, P = 0.257799, nu = 0.000000, adjP = 0.257799, W = 0.000000, Q = 0.000000, U = 0.000000, Q+U = 0.000000
2018-04-07 13:57:44,053 INFO action: 122 (3)... N = 0, P = 0.252349, nu = 0.000000, adjP = 0.252349, W = 0.000000, Q = 0.000000, U = 0.000000, Q+U = 0.000000
2018-04-07 13:57:44,053 INFO action: 123 (4)... N = 0, P = 0.254554, nu = 0.000000, adjP = 0.254554, W = 0.000000, Q = 0.000000, U = 0.000000, Q+U = 0.000000
2018-04-07 13:57:44,053 INFO action with highest Q + U...192
2018-04-07 13:57:44,053 INFO PLAYER TURN...-1
2018-04-07 13:57:44,053 INFO action: 190 (1)... N = 0, P = 0.209210, nu = 0.000000, adjP = 0.209210, W = 0.000000, Q = 0.000000, U = 0.000000, Q+U = 0.000000
2018-04-07 13:57:44,053 INFO action: 102 (4)... N = 0, P = 0.196694, nu = 0.000000, adjP = 0.196694, W = 0.000000, Q = 0.000000, U = 0.000000, Q+U = 0.000000
2018-04-07 13:57:44,054 INFO action: 118 (6)... N = 0, P = 0.182340, nu = 0.000000, adjP = 0.182340, W = 0.000000, Q = 0.000000, U = 0.000000, Q+U = 0.000000
2018-04-07 13:57:44,054 INFO action: 122 (3)... N = 0, P = 0.205482, nu = 0.000000, adjP = 0.205482, W = 0.000000, Q = 0.000000, U = 0.000000, Q+U = 0.000000
2018-04-07 13:57:44,054 INFO action: 123 (4)... N = 0, P = 0.206274, nu = 0.000000, adjP = 0.206274, W = 0.000000, Q = 0.000000, U = 0.000000, Q+U = 0.000000
2018-04-07 13:57:44,054 INFO action with highest Q + U...190
2018-04-07 13:57:44,054 INFO PLAYER TURN...-1
2018-04-07 13:57:44,054 INFO action: 192 (3)... N = 0, P = 0.235297, nu = 0.000000, adjP = 0.235297, W = 0.000000, Q = 0.000000, U = 0.000000, Q+U = 0.000000
2018-04-07 13:57:44,054 INFO action: 211 (1)... N = 0, P = 0.257799, nu = 0.000000, adjP = 0.257799, W = 0.000000, Q = 0.000000, U = 0.000000, Q+U = 0.000000
2018-04-07 13:57:44,054 INFO action: 122 (3)... N = 0, P = 0.252349, nu = 0.000000, adjP = 0.252349, W = 0.000000, Q = 0.000000, U = 0.000000, Q+U = 0.000000
2018-04-07 13:57:44,055 INFO action: 123 (4)... N = 0, P = 0.254554, nu = 0.000000, adjP = 0.254554, W = 0.000000, Q = 0.000000, U = 0.000000, Q+U = 0.000000
2018-04-07 13:57:44,055 INFO action with highest Q + U...192
2018-04-07 13:57:44,055 INFO PLAYER TURN...-1
2018-04-07 13:57:44,055 INFO action: 190 (1)... N = 0, P = 0.209210, nu = 0.000000, adjP = 0.209210, W = 0.000000, Q = 0.000000, U = 0.000000, Q+U = 0.000000
2018-04-07 13:57:44,055 INFO action: 102 (4)... N = 0, P = 0.196694, nu = 0.000000, adjP = 0.196694, W = 0.000000, Q = 0.000000, U = 0.000000, Q+U = 0.000000
2018-04-07 13:57:44,055 INFO action: 118 (6)... N = 0, P = 0.182340, nu = 0.000000, adjP = 0.182340, W = 0.000000, Q = 0.000000, U = 0.000000, Q+U = 0.000000
2018-04-07 13:57:44,055 INFO action: 122 (3)... N = 0, P = 0.205482, nu = 0.000000, adjP = 0.205482, W = 0.000000, Q = 0.000000, U = 0.000000, Q+U = 0.000000
2018-04-07 13:57:44,056 INFO action: 123 (4)... N = 0, P = 0.206274, nu = 0.000000, adjP = 0.206274, W = 0.000000, Q = 0.000000, U = 0.000000, Q+U = 0.000000
2018-04-07 13:57:44,056 INFO action with highest Q + U...190
2018-04-07 13:57:44,056 INFO PLAYER TURN...-1
2018-04-07 13:57:44,056 INFO action: 192 (3)... N = 0, P = 0.235297, nu = 0.000000, adjP = 0.235297, W = 0.000000, Q = 0.000000, U = 0.000000, Q+U = 0.000000
2018-04-07 13:57:44,056 INFO action: 211 (1)... N = 0, P = 0.257799, nu = 0.000000, adjP = 0.257799, W = 0.000000, Q = 0.000000, U = 0.000000, Q+U = 0.000000
2018-04-07 13:57:44,056 INFO action: 122 (3)... N = 0, P = 0.252349, nu = 0.000000, adjP = 0.252349, W = 0.000000, Q = 0.000000, U = 0.000000, Q+U = 0.000000
2018-04-07 13:57:44,056 INFO action: 123 (4)... N = 0, P = 0.254554, nu = 0.000000, adjP = 0.254554, W = 0.000000, Q = 0.000000, U = 0.000000, Q+U = 0.000000
2018-04-07 13:57:44,056 INFO action with highest Q + U...192
Those two actions keep looping.
The text was updated successfully, but these errors were encountered: