Deep Deterministic Policy Gradient and Geometric Brownian Motion for Simulated Portfolio Optimization
Given N independently evolving price series, each following geometric Brownian motion with identical parameters, train a DDPG agent that maximizes the logarithmic portfolio value.
This project is the culmination of four years of quantitative trading algorithm projects.
Suppose an asset's price $S_t$ follows the stochastic differential equation

$$dS_t = \mu S_t \, dt + \sigma S_t \, dW_t,$$

where $\mu$ is the drift, $\sigma$ is the volatility, and $W_t$ is a standard Wiener process. The solution to this equation is geometric Brownian motion,

$$S_t = S_0 \exp\left(\left(\mu - \frac{\sigma^2}{2}\right)t + \sigma W_t\right).$$
Given a fixed time step $\Delta t$, each price path can be simulated exactly by drawing i.i.d. $Z \sim \mathcal{N}(0, 1)$ and setting

$$S_{t+\Delta t} = S_t \exp\left(\left(\mu - \frac{\sigma^2}{2}\right)\Delta t + \sigma\sqrt{\Delta t}\, Z\right).$$
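As a concrete illustration, the sketch below simulates N independent GBM price paths using the discretization above. It is a minimal, standalone example; the parameter values are placeholders rather than the repository's settings.

```cpp
#include <cmath>
#include <iostream>
#include <random>
#include <vector>

int main() {
    const int n_assets = 4;        // N independent price series (placeholder value)
    const int n_steps  = 252;      // steps per simulated path
    const double s0    = 100.0;    // initial price
    const double mu    = 0.05;     // drift
    const double sigma = 0.20;     // volatility
    const double dt    = 1.0 / 252.0;

    std::mt19937 rng(42);
    std::normal_distribution<double> z(0.0, 1.0);

    // prices[i][t] is the price of asset i at step t
    std::vector<std::vector<double>> prices(n_assets, std::vector<double>(n_steps + 1, s0));
    for (int i = 0; i < n_assets; ++i)
        for (int t = 0; t < n_steps; ++t)
            prices[i][t + 1] = prices[i][t] *
                std::exp((mu - 0.5 * sigma * sigma) * dt + sigma * std::sqrt(dt) * z(rng));

    for (int i = 0; i < n_assets; ++i)
        std::cout << "asset " << i << " terminal price: " << prices[i][n_steps] << "\n";
    return 0;
}
```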
Given some state $s_t$, the actor network $\mu(s_t \mid \theta^\mu)$ deterministically outputs an action $a_t$, and the critic network $Q(s_t, a_t \mid \theta^Q)$ estimates the expected return of taking that action in that state.
Since the critic network directly maps the state-action space to expected return, the action gradient $\nabla_a Q(s, a \mid \theta^Q)$ can be backpropagated through the actor, giving the deterministic policy gradient

$$\nabla_{\theta^\mu} J \approx \mathbb{E}\left[\nabla_a Q(s, a \mid \theta^Q)\big|_{a=\mu(s \mid \theta^\mu)} \, \nabla_{\theta^\mu} \mu(s \mid \theta^\mu)\right],$$

which is used to update the actor by gradient ascent.
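To make the chain rule concrete, the following minimal sketch uses a hypothetical linear critic and linear actor (not the repository's networks) and applies the action gradient to update the actor for a single sampled state.

```cpp
// Linear critic:  Q(s, a) = w_s . s + w_a * a
// Linear actor:   mu(s)   = theta . s          (scalar action)
// Chain rule:     dJ/dtheta_i = (dQ/da) * (dmu/dtheta_i) = w_a * s_i
#include <cstddef>
#include <iostream>
#include <vector>

int main() {
    const std::size_t dim = 3;
    std::vector<double> s     = {1.0, 0.5, -0.2};   // sampled state
    std::vector<double> w_s   = {0.1, 0.2, 0.3};    // critic state weights
    double w_a                = 0.7;                // critic action weight
    std::vector<double> theta = {0.05, -0.1, 0.2};  // actor weights
    double lr = 1e-2;                               // actor learning rate

    // Actor output a = theta . s
    double a = 0.0;
    for (std::size_t i = 0; i < dim; ++i) a += theta[i] * s[i];

    // Critic value Q(s, a) and its action gradient dQ/da (constant w_a for a linear critic)
    double q = w_a * a;
    for (std::size_t i = 0; i < dim; ++i) q += w_s[i] * s[i];
    double dQ_da = w_a;

    // Gradient-ascent step on the actor parameters using the deterministic policy gradient
    for (std::size_t i = 0; i < dim; ++i) theta[i] += lr * dQ_da * s[i];

    std::cout << "Q(s, mu(s)) = " << q << "\nupdated theta:";
    for (double t : theta) std::cout << ' ' << t;
    std::cout << "\n";
    return 0;
}
```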
Let the state space consist of the current prices of the $N$ assets together with the agent's current portfolio weights, and let the action be the target weight vector for the next step. The per-step reward is the change in log portfolio value, $r_t = \log V_{t+1} - \log V_t$, so that maximizing cumulative reward maximizes the logarithmic portfolio value.
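The sketch below computes one step of this reward for a fully invested portfolio rebalanced to the chosen weights at the start of the step; the weights, prices, and the rebalancing assumption are illustrative rather than the repository's exact formulation.

```cpp
#include <cmath>
#include <cstddef>
#include <iostream>
#include <vector>

// Reward r_t = log(V_{t+1} / V_t) for holding weights w while prices move from p0 to p1.
double log_value_reward(const std::vector<double>& w,
                        const std::vector<double>& p0,
                        const std::vector<double>& p1) {
    double growth = 0.0;  // V_{t+1} / V_t under weights w
    for (std::size_t i = 0; i < w.size(); ++i)
        growth += w[i] * (p1[i] / p0[i]);
    return std::log(growth);
}

int main() {
    std::vector<double> w  = {0.5, 0.3, 0.2};      // portfolio weights (sum to 1)
    std::vector<double> p0 = {100.0, 50.0, 20.0};  // prices at t
    std::vector<double> p1 = {101.0, 49.5, 20.4};  // prices at t+1
    std::cout << "reward: " << log_value_reward(w, p0, p1) << "\n";
    return 0;
}
```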
All hyperparameters can be found in ./lib/param.hpp.
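For illustration only, a header of this kind might centralize the environment and training constants; the names and values below are hypothetical and are not taken from the actual ./lib/param.hpp.

```cpp
// Hypothetical example: NOT the contents of the repository's ./lib/param.hpp.
#ifndef PARAM_HPP
#define PARAM_HPP

namespace param {
    // environment / GBM simulation
    constexpr int    N_ASSETS   = 4;           // number of independent price series
    constexpr double DRIFT      = 0.05;        // GBM drift mu
    constexpr double VOLATILITY = 0.20;        // GBM volatility sigma
    constexpr double DT         = 1.0 / 252.0; // simulation time step

    // DDPG training
    constexpr double ACTOR_LR   = 1e-4;  // actor learning rate
    constexpr double CRITIC_LR  = 1e-3;  // critic learning rate
    constexpr double GAMMA      = 0.99;  // discount factor
    constexpr double TAU        = 1e-3;  // soft target-update rate
    constexpr int    BATCH_SIZE = 64;    // replay minibatch size
}

#endif  // PARAM_HPP
```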
Note that the results above are merely experimental, as the model's convergence and behavior may vary with the random seed used to simulate the environment. Further hyperparameter tuning and more rigorous optimization objectives would be needed for practical application in real market environments.
- Geometric Brownian Motion (Columbia FE notes): https://www.columbia.edu/~ks20/FE-Notes/4700-07-Notes-GBM.pdf
- Lillicrap et al., Continuous Control with Deep Reinforcement Learning (DDPG): https://arxiv.org/abs/1509.02971
- OpenAI Spinning Up, Deep Deterministic Policy Gradient: https://spinningup.openai.com/en/latest/algorithms/ddpg.html