We are adding support for multi-step RL. This diff updates the PyTorch code with several new features:
1. read the multi-step RL config
2. read Hive tables containing multi-step data
3. Q-learning with multiple steps
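
For reviewers less familiar with item 3, here is a minimal sketch of the standard n-step Q-learning target: the discounted sum of the next n rewards plus a bootstrapped max-Q value at the state reached after n steps. It is an illustration under stated assumptions, not the code in this diff: the function name `multi_step_target` and its parameters are hypothetical, and it uses plain Python floats rather than PyTorch tensors to stay self-contained.

```python
from typing import Sequence


def multi_step_target(
    rewards: Sequence[float],
    next_q_values: Sequence[float],
    gamma: float,
    terminal: bool,
) -> float:
    """Return the n-step Q-learning target for one transition.

    rewards:       r_t, ..., r_{t+n-1} observed over the n steps (hypothetical layout).
    next_q_values: Q(s_{t+n}, a) for every action a.
    gamma:         per-step discount factor.
    terminal:      True if the episode ended within the n steps.
    """
    n = len(rewards)
    # Discounted sum of the n observed rewards.
    discounted_return = sum((gamma ** k) * r for k, r in enumerate(rewards))
    if terminal:
        # Nothing to bootstrap from once the episode has ended.
        return discounted_return
    # Bootstrap from the greedy action value, discounted by gamma^n.
    return discounted_return + (gamma ** n) * max(next_q_values)


if __name__ == "__main__":
    print(multi_step_target([1.0, 1.0, 1.0], [2.0, 4.0], gamma=0.5, terminal=False))
```

With a single reward this reduces to the usual one-step target; the only change for n steps is the longer reward sum and the `gamma ** n` factor on the bootstrapped value.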

Two previous diffs are related: D13049646 added a Spark pipeline for reading multi-step data, and D13125518 tested that pipeline in Dataswarm.

Applied Reinforcement Learning @ Facebook

Horizon is an open-source, end-to-end platform for applied reinforcement learning (RL) developed and used at Facebook. Horizon is built in Python and uses PyTorch for modeling and training and Caffe2 for model serving. The platform contains workflows to train popular deep RL algorithms and includes data preprocessing, feature transformation, distributed training, counterfactual policy evaluation, and optimized serving. For more detailed information about Horizon, see the white paper.

