Hi, from the [code](https://github.com/JoshVarty/AlphaZeroSimple/blob/master/main.py), I see that [trainer.learn()](https://github.com/JoshVarty/AlphaZeroSimple/blob/master/main.py#L26) calls [self.exceute_episode()](https://github.com/JoshVarty/AlphaZeroSimple/blob/master/trainer.py#L57), which in turn calls [self.mcts.run](https://github.com/JoshVarty/AlphaZeroSimple/blob/master/trainer.py#L28), which then calls [model.predict(state)](https://github.com/JoshVarty/AlphaZeroSimple/blob/master/monte_carlo_tree_search.py#L102). Is it intentional to predict the policy and value before the network has been trained?
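
To make the call order I mean concrete, here is a minimal sketch using stub classes. The class names (`StubModel`, `StubMCTS`, `StubTrainer`), the method bodies, and the placement of the `train` call are my own assumptions for illustration, not the repository's actual code; only the method names in the chain come from the links above.

```python
# Hypothetical stand-ins that only record the order of calls;
# none of these bodies are taken from the repository.
calls = []


class StubModel:
    def predict(self, state):
        calls.append("predict")
        # Placeholder output: a uniform policy and a zero value stand in
        # for whatever an untrained network would return.
        return [0.5, 0.5], 0.0

    def train(self, examples):
        # Assumption: training happens after the episode has been played.
        calls.append("train")


class StubMCTS:
    def __init__(self, model):
        self.model = model

    def run(self, state):
        calls.append("mcts.run")
        return self.model.predict(state)


class StubTrainer:
    def __init__(self, model):
        self.model = model
        self.mcts = StubMCTS(model)

    def exceute_episode(self):  # spelling matches the method name linked above
        calls.append("exceute_episode")
        policy, value = self.mcts.run(state=0)
        return [(0, policy, value)]

    def learn(self):
        calls.append("learn")
        examples = self.exceute_episode()
        self.model.train(examples)


StubTrainer(StubModel()).learn()
print(calls)
```

In this sketch, `predict` is recorded before `train`, which is the ordering I am asking about.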