Hi Surg,

I found a bug related to the usage of TensorFlow's batch normalization in ../othello/tensorflow/OthelloNNet.py.

Check the API of tf.layers.batch_normalization (https://www.tensorflow.org/api_docs/python/tf/layers/batch_normalization).
See the `training` parameter: we need a bool placeholder that tells each batch-norm layer whether we are in the training phase. That is, after line 25 in tensorflow/OthelloNNet.py, update:
```python
self.isTraining = tf.placeholder(tf.bool, name="is_training")

h_conv1 = Relu(BatchNormalization(self.conv2d(x_image, args.num_channels, 'same'), axis=3, training=self.isTraining))   # batch_size x board_x x board_y x num_channels
h_conv2 = Relu(BatchNormalization(self.conv2d(h_conv1, args.num_channels, 'same'), axis=3, training=self.isTraining))   # batch_size x board_x x board_y x num_channels
h_conv3 = Relu(BatchNormalization(self.conv2d(h_conv2, args.num_channels, 'valid'), axis=3, training=self.isTraining))  # batch_size x (board_x-2) x (board_y-2) x num_channels
h_conv4 = Relu(BatchNormalization(self.conv2d(h_conv3, args.num_channels, 'valid'), axis=3, training=self.isTraining))  # batch_size x (board_x-4) x (board_y-4) x num_channels
s_fc1 = Dropout(Relu(BatchNormalization(Dense(h_conv4_flat, 1024), axis=1, training=self.isTraining)), rate=self.dropout)
s_fc2 = Dropout(Relu(BatchNormalization(Dense(s_fc1, 512), axis=1, training=self.isTraining)), rate=self.dropout)
```
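For context, here is a minimal NumPy sketch (illustrative only, not the project's code) of why the flag matters: in training mode batch norm normalizes with the current batch's statistics and updates the moving averages, while in inference mode it normalizes with the accumulated moving averages.

```python
import numpy as np

def batch_norm(x, moving_mean, moving_var, training, momentum=0.99, eps=1e-5):
    """Simplified batch norm (no learned gamma/beta), normalizing over axis 0."""
    if training:
        mean = x.mean(axis=0)
        var = x.var(axis=0)
        # These moving-average updates are what TF registers in UPDATE_OPS.
        moving_mean = momentum * moving_mean + (1 - momentum) * mean
        moving_var = momentum * moving_var + (1 - momentum) * var
    else:
        # Inference: use the accumulated statistics, not the batch's.
        mean, var = moving_mean, moving_var
    return (x - mean) / np.sqrt(var + eps), moving_mean, moving_var

x = np.random.randn(32, 4)
mm, mv = np.zeros(4), np.ones(4)
y_train, mm, mv = batch_norm(x, mm, mv, training=True)
y_infer, _, _ = batch_norm(x, mm, mv, training=False)
```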
After line 50, add the following so that the moving-average update ops run with every training step:

```python
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    self.train_step = tf.train.AdamOptimizer(self.args.lr).minimize(self.total_loss)
```
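Without that control dependency the moving averages are never updated, so inference-time batch norm normalizes with the initial mean=0/var=1 instead of the data's statistics. A NumPy sketch of the consequence (illustrative values, not the project's code):

```python
import numpy as np

# Data with a strong shift: batch statistics differ a lot from the
# init values (mean=0, var=1) that the moving averages start at.
rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=(256, 4))

eps = 1e-5
# Buggy inference: moving averages were never updated, so batch norm
# normalizes with mean=0 / var=1 -- essentially a no-op on this data.
y_buggy = (x - 0.0) / np.sqrt(1.0 + eps)
# Correct inference: moving averages have converged to the data statistics.
y_fixed = (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)
```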
Also, generate the action probabilities directly in the graph. After line 37, add:

```python
self.prob = tf.nn.softmax(self.pi)
```
In tensorflow/NNet.py, in the `train` member function, update:

```python
input_dict = {self.nnet.input_boards: boards, self.nnet.target_pis: pis, self.nnet.target_vs: vs, self.nnet.dropout: args.dropout, self.nnet.isTraining: True}
```
In the `predict` function, update:

```python
prob, v = self.sess.run([self.nnet.prob, self.nnet.v], feed_dict={self.nnet.input_boards: board, self.nnet.dropout: 0, self.nnet.isTraining: False})
# pi = np.exp(pi) / np.sum(np.exp(pi))  # comment out: softmax is now done in-graph
return prob[0], v[0]
```
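For reference, `tf.nn.softmax` computes the same normalization as the commented-out NumPy line, just in-graph (and with the usual max-shift trick for numerical stability), so removing the manual line does not change the result. A quick NumPy check with hypothetical logits:

```python
import numpy as np

pi = np.array([2.0, 1.0, 0.1])  # hypothetical raw logits

# What the commented-out line computed:
manual = np.exp(pi) / np.sum(np.exp(pi))

# The numerically stable form (as used inside softmax implementations):
shifted = pi - pi.max()
stable = np.exp(shifted) / np.sum(np.exp(shifted))
```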
If you like, you can give me permission: I can create a bugfix branch with the above changes (and a few others) so that you can review them.

Thanks,
Jianxiong