During training, the game threads spend most of their time idle, waiting for an inference from the TF/model worker thread, which has to divide its time between learning and inference. This creates a bottleneck that slows down training. A more distributed, asynchronous approach to serving inferences should improve utilization.
Have each game worker keep its own copy of the model for inference during evaluation games (and possibly also rollouts), synced periodically from the learner (see the sketch below). These copies should use the CPU version of TensorFlow; the per-worker batch and model sizes are small enough that this shouldn't cost much performance, and it could even be a net gain if it removes the current bottleneck.
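A minimal sketch of what a per-worker model copy could look like. The class, its methods, and the `build_model_fn` / `fetch_weights_fn` callables are assumptions for illustration, not existing names in the repo; the actual sync transport (queue, pipe, checkpoint file) is left open.

```python
import os
# Force this worker process onto CPU before TensorFlow initializes any devices
# (assumption: workers run as separate processes, so this is safe to set here).
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"

import tensorflow as tf


class LocalInferenceModel:
    """Per-worker copy of the model, used only for inference and periodically
    refreshed with weights from the learner."""

    def __init__(self, build_model_fn, sync_every_n_games=20):
        # build_model_fn is assumed to build the same architecture the learner trains.
        self.model = build_model_fn()
        self.sync_every_n_games = sync_every_n_games
        self.games_since_sync = 0

    def maybe_sync(self, fetch_weights_fn):
        """Call once per finished game; pulls fresh weights when the counter rolls over."""
        self.games_since_sync += 1
        if self.games_since_sync >= self.sync_every_n_games:
            self.model.set_weights(fetch_weights_fn())
            self.games_since_sync = 0

    def predict(self, observation):
        # Per-worker batches are tiny, so plain CPU forward passes should be fine here.
        return self.model(observation[tf.newaxis, ...], training=False)
```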
If the rollout workers also run local models, they should send experience data to the learner themselves, doing more of the preprocessing on the worker side (e.g. computing n-step returns, as sketched below), which fits the goal of offloading work from the learner.
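A rough sketch of the kind of preprocessing a rollout worker could do before shipping experience to the learner. The function name, signature, and defaults are assumptions; it just computes truncated n-step return targets for one rollout segment.

```python
import numpy as np


def nstep_returns(rewards, values, gamma=0.99, n=5):
    """Compute n-step return targets on the rollout worker.

    rewards: rewards r_0 .. r_{T-1} from one rollout segment
    values:  value estimates V(s_0) .. V(s_T), where V(s_T) is the bootstrap
             value for the state after the last step (use 0 if terminal)
    Returns G_t = sum_{k<n} gamma^k * r_{t+k} + gamma^n * V(s_{t+n}),
    with the window truncated at the end of the segment.
    """
    rewards = np.asarray(rewards, dtype=np.float32)
    values = np.asarray(values, dtype=np.float32)
    T = len(rewards)
    returns = np.zeros(T, dtype=np.float32)
    for t in range(T):
        end = min(t + n, T)       # n-step window, clipped at the segment end
        g = values[end]           # bootstrap from V(s_{t+n}) or V(s_T)
        for k in reversed(range(t, end)):
            g = rewards[k] + gamma * g
        returns[t] = g
    return returns
```

The worker would then send `(observations, actions, returns)` batches instead of raw transitions, so the learner only has to run gradient updates.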
The same could likely be done for the model comparison and/or psbot scripts, but some profiling might be needed first.